Original Link: https://www.anandtech.com/show/1080
NVIDIA's GeForce FX 5600/5200 Ultra Performance Review
by Anand Lal Shimpi on March 10, 2003 6:36 PM EST- Posted in
- GPUs
"NVIDIA's focus at this point is NV31 and NV34, after all, that's where the money is. The small percentage of the market that will go after the NV30 will not make or break NVIDIA, but should ATI compete like this in other market segments then there will be cause for worry."
And thus we concluded our performance review of the GeForce FX. Today we're able to bring you the first benchmarks of NVIDIA's NV31 and NV34, which should hopefully put to rest the question of whether NVIDIA still has what it takes to be competitive.
Just under one year ago, NVIDIA briefed us on NV30 as well as their DirectX 9 roadmap. What was most intriguing to us was that after NV30's release, NVIDIA was going to quickly transition DirectX 9 compliance from the very high end down to the entry-level segments.
What NVIDIA was going to do was learn from the mistakes they made with NV17 (GeForce4 MX), something that we definitely appreciated when we heard it. If you'll remember, the biggest problem we had with the GeForce4 MX was that it carried the GeForce4 name but was in no way, shape or form, a DirectX 8 part. In fact, the GeForce4 MX was much more like a GeForce2 MX than it was a GeForce4, despite its name.
With the successor to the GeForce4 MX, NVIDIA was promising a direct derivative of the NV30. From this point forward, all market segments would have identical feature sets and would only differ based on performance - the way it should have been to begin with. NVIDIA's GPUs are actually designed a bit differently these days, which is what allows NVIDIA to launch a plethora of GPUs covering all market segments in the same time frame with identical feature sets. NVIDIA componentized their Verilog code much more with the NV30 design, which is why, even despite NV30's delays, the derivatives of the core (NV31/34) are still on track.
Before we dive into the cards themselves, let's have a few words about what NV31 and NV34 actually are. As confusing as this may be, NV34 is the slower of the two and NV31 is the faster GPU. Consider the NV34 to be the successor to the GeForce4 MX and the NV31 to be the follow-up to the GeForce4 Ti 4200. The reason that the higher number is actually given to the slower part has nothing at all to do with the capabilities of either GPU; instead, it is a limitation of a tool NVIDIA developed in-house to catalog all of their GPUs. This tool generates a codename based on the features of the GPU (pipe configuration, manufacturing process, etc.), and because of the way the tool works, the low-end part ended up carrying a higher codename than the high-end part. This obviously doesn't matter in the end, since you won't find NV31 or NV34 on a box anywhere, but it's a bit of behind-the-scenes trivia that you might enjoy.
With that out of the way, let's first dive into NVIDIA's new mainstream GPU - the NV31.
GeForce FX 5600 Ultra (NV31)
The NV31 is a 0.13-micron GPU, and as we mentioned earlier, is a direct descendant of the NV30. The GPU itself can render a maximum of four pixels per clock (four single textured pixels, or two dual textured pixels), making it effectively half of a NV30. This is reflected in its transistor count, which is 60% of the NV30's and still greater than the GeForce4's (125M vs. 75M vs. 57M). Since NV31 is feature-identical to the NV30, it is fully DirectX 9 compliant, meaning anything you can run on NV30 you can also run on NV31 (and NV34 for that matter). This is the beauty of NVIDIA's more componentized design process, and you'll see its benefits even more as we look at the NV34. For an idea of the features supported by NV31/34, be sure to read our GeForce FX Technology Preview.
The NV31 also retains the GeForce FX's 128-bit memory interface, although only with DDR-I support; thus memory clocks will not be as high as what we're used to seeing on the FX. The other downside to not supporting DDR-II is that memory power consumption is going to be higher on NV31 boards, but with the GPU's 75 million transistors already eating up quite a bit of power, the dent that DDR-I puts in the board's power envelope compared to DDR-II isn't significant.
What's interesting is that the NV31 actually has a superior memory controller to the NV30, akin to what we saw with the Radeon 9600 vs. Radeon 9800, where the 9600 had a more efficient memory controller. You can consider the memory controller in the NV31 to be midway between the NV30 (GeForce FX) and the NV35, with the NV35 having a significantly improved memory controller. There's no worry at NVIDIA about the NV31 outperforming the GeForce FX, however; remember that fill rate and actual clock speeds also matter, and in both areas the NV30 has the NV31 beat.
NVIDIA's Color & Z Compression algorithms have also been improved on the NV31, once again, not as improved as what we'll see with NV35 but definitely superior to NV30.
The RAMDACs from NV30 also make their way down to the NV31, with dual 400MHz units finding their way onto the GPU. Unlike the NV30 however, the NV31 does have an integrated TMDS transmitter for DVI output. NVIDIA has historically left TMDS transmitters out of their highest end parts because of the difficulties in including such a high speed transmitter in a very noise-sensitive GPU; ATI is actually of a differing opinion and has no qualms about outfitting even their Radeon 9800 with an integrated TMDS.
The NV31 has the same AA and Anisotropic filtering engines as the NV30, meaning it has much more efficient AA than the GeForce4 and suffers from the same anisotropic filtering issues as the NV30.
The first incarnation of the NV31 GPU is the GeForce FX 5600 Ultra, which is shipping with a 350MHz core clock and 350MHz DDR memory clock. The GeForce FX 5600 Ultra thus has more memory bandwidth than a GeForce4 Ti 4600, and makes more efficient use of the bandwidth thanks to a much improved memory controller and color/Z compression. Although the 5600 Ultra has a higher core clock than the Ti 4600, it has a lower multitextured fill rate (the NV31 has four pipes with one texture unit per pipe vs. GeForce4's four pipes with two texture units per pipe) which will hurt performance significantly in situations that aren't overly memory bandwidth limited.
The 5600 Ultra will retail for $199 and will be competing directly with ATI's Radeon 9600. Remember that the Radeon 9600 will be similarly equipped (4 pipes, 1 texture per pipe), but will feature a higher core clock (400MHz) and a lower memory clock (300MHz DDR). Keep that in mind when you're looking through the benchmarks of the 5600 Ultra later on.
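The theoretical numbers behind these comparisons fall straight out of the specs quoted above. Here is a quick sketch (in Python, purely for illustration) of the peak multitextured fill rates and memory bandwidth; remember that these are peak figures, and real-world efficiency (compression, memory controller design) is exactly what the benchmarks later in this review measure:

```python
# Back-of-the-envelope peak numbers from the clock speeds and pipe
# configurations given in this article. All buses are 128-bit DDR.

def fillrate_mtexels(pipes, tmus_per_pipe, core_mhz):
    """Peak multitextured fill rate in MTexels/s."""
    return pipes * tmus_per_pipe * core_mhz

def bandwidth_gbps(mem_clock_mhz, bus_bits=128, ddr=True):
    """Peak memory bandwidth in GB/s for a DDR bus."""
    pumps = 2 if ddr else 1
    return mem_clock_mhz * 1e6 * pumps * (bus_bits // 8) / 1e9

cards = {
    "GeForce FX 5600 Ultra": dict(pipes=4, tmus_per_pipe=1, core=350, mem=350),
    "GeForce4 Ti 4600":      dict(pipes=4, tmus_per_pipe=2, core=300, mem=325),
    "Radeon 9600":           dict(pipes=4, tmus_per_pipe=1, core=400, mem=300),
}

for name, c in cards.items():
    print(f"{name}: "
          f"{fillrate_mtexels(c['pipes'], c['tmus_per_pipe'], c['core'])} MTexels/s, "
          f"{bandwidth_gbps(c['mem']):.1f} GB/s")
```

This is how the 5600 Ultra ends up with more bandwidth than a Ti 4600 (11.2 vs. 10.4 GB/s) yet a lower multitextured fill rate (1400 vs. 2400 MTexels/s).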
GeForce FX 5200 Ultra (NV34)
Although the rest of their Spring line (sounds almost too fashionable to be computer hardware, no?) is built on TSMC's 0.13-micron process, NVIDIA went with the more mature 0.15-micron process for their entry-level part - the NV34.
As we explained with the codenames in the intro, the NV34 is the successor to the GeForce4 MX, but unlike the MX, it is not an overly castrated version of its older siblings. In fact, the architecture of the NV34 is, according to NVIDIA, virtually identical to the NV31; the only official differences are the memory controller and clock speeds.
The NV34's 128-bit memory controller is devoid of any of the NV30/31's compression algorithms, mostly to keep die size down on the larger 0.15-micron process. The NV34 is aimed at the mass market, and thus must be as cheap to manufacture as possible. Featuring only 47 million transistors (compared to NV31's 75 million), the NV34 is missing more than just the NV31's compression engine. Because the NV31 is geared for higher clock speeds, its pipelines, although functionally equivalent to the NV34's, do have extra stages to help gear the part for higher clocks. Around 15 - 20% of the NV31's die is reserved for the compression engines, with another 10 - 20% of the die difference being taken up by those extra pipeline stages. The combination of these two differences makes up the vast majority of the 28 million transistor gap between the NV31 and NV34.
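Taking NVIDIA's percentages at face value, the transistor accounting works out roughly as follows. The split itself is NVIDIA's estimate, not a measured number, so treat these as ballpark figures:

```python
# Rough accounting of the transistor gap between NV31 and NV34,
# using the figures quoted in this article.
nv31_m, nv34_m = 75, 47          # transistor counts in millions
gap = nv31_m - nv34_m            # 28M transistors

# ~15-20% of the NV31 die is the color/Z compression engines
comp_low, comp_high = 0.15 * nv31_m, 0.20 * nv31_m   # 11.25M - 15M

# most of the remainder of the gap is the extra pipeline stages
stages_low, stages_high = gap - comp_high, gap - comp_low

print(f"gap: {gap}M, compression: {comp_low:.2f}-{comp_high:.2f}M, "
      f"extra pipeline stages: {stages_low:.2f}-{stages_high:.2f}M")
```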
Obviously, on a 0.15-micron process the NV34 can't reach clock speeds as high as the NV31's, which is why it shouldn't be a surprise that one of its first incarnations - the GeForce FX 5200 Ultra - will be shipping at a 325MHz core clock. This is still quite high; you'll note that it is higher than that of the GeForce4 Ti 4600, which is built on the same 0.15-micron process. The difference is that the 5200 Ultra's NV34 GPU is slightly less complicated than the GeForce4, and thus can run at higher speeds without producing as much heat or adversely impacting the overall manufacturing yield of the GPUs.
The GeForce FX 5200 Ultra features a 325MHz DDR memory clock, and although that's only 7% lower than the 5600 Ultra's, performance will be significantly lower because of the lack of any compression engine.
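For the curious, the 7% figure comes straight from the two memory clocks:

```python
# Sanity check on the "only 7% lower" memory clock comparison above.
mem_5600_mhz, mem_5200_mhz = 350, 325
delta = (mem_5600_mhz - mem_5200_mhz) / mem_5600_mhz
print(f"{delta:.1%}")   # roughly 7%
```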
The 5200 Ultra will retail for $149, and at that price, it will deliver the exact same feature set as the GeForce FX 5800 Ultra - namely, DirectX 9 compliance. There will also be a regular GeForce FX 5200 that will sell for under $100, again with the same feature set as the 5200 Ultra, 5600 Ultra and 5800 Ultra cards. NVIDIA wasn't kidding when they said that they would bring DirectX 9 support to their entire line of GPUs, and they have delivered.
ATI has no competitor for the 5200 series, simply because their Radeon 9200/9100/9000 do not offer DirectX 9 support; they are based on a DirectX 8 class of GPUs. Granted, there won't be any use of DirectX 9-specific features in games for quite a while, but the tables have definitely turned: ATI is now the one that isn't delivering every possible feature to the entry level, and NVIDIA is clearly learning from its own past mistakes.
Anisotropic Filtering - Shame on NVIDIA
When we reviewed the GeForce FX 5800 Ultra, we came to the conclusion that NVIDIA's anisotropic filtering quality was significantly degraded in the FX line of GPUs. What worried us, however, was that the derivatives of the NV30 would suffer the same fate because, after all, they all use the same anisotropic filtering and AA engines. When we went to test the NV31/34, we quickly realized that our worst nightmares had come true: anisotropic filtering quality was still as poor as what we saw on the first FX, and it continued to not work properly in a number of games.
We asked NVIDIA about the issues, and they insist that the problems are driver related and will be fixed by the launch of NV35. Unfortunately, until then we are left with a half-working anisotropic filtering setting on all FX-derived cards, and a performance-aggressive mode that is completely useless (see our GeForce FX 5800 Ultra review for more information on what we're talking about).
Until NVIDIA fixes the driver issues, we're forced to use the settings we agreed upon in our GeForce FX 5800 Ultra review:
For the GeForce FX 5x00 we ran with the following settings enabled:
- 8X Performance - Balanced Anisotropic Filtering
- 4X Anti-Aliasing
For the GeForce4/4MX we ran with the following settings enabled:
- 2X Performance - Balanced Anisotropic Filtering
- 4X Anti-Aliasing
The Radeon 9x00 Pro cards were run with the following enabled:
- 8X Performance Anisotropic Filtering
- 4X Anti-Aliasing
The Test

Windows XP Professional Test Bed - Hardware Configuration
- CPU: Intel Pentium 4 3.06GHz (Hyper-Threading Enabled)
- Motherboard: Intel D850EMV2 (Intel 850E Chipset)
- RAM: 2 x 256MB PC1066 Kingston RIMMs
- Sound: None
- Hard Drive: 120GB Western Digital Special Edition 8MB Cache HDD
- Video Cards (Drivers): ATI Radeon 9500 Pro (128MB) - CATALYST 3.1
Unreal Tournament 2003
With this review we continue to use the final retail version of Unreal Tournament 2003 as a benchmark tool. The benchmark works similarly to the demo, except there are higher detail settings that can be chosen. As we've mentioned before, in order to make sure that all numbers are comparable you need to be sure to do the following:
By default the game will detect your video card and assign its internal defaults based on the capabilities of your video card to optimize the game for performance. In order to fairly compare different video cards you have to tell the engine to always use the same set of defaults which is accomplished by editing the .bat files in the X:\UT2003\Benchmark\ directory.
Add the following parameters to the statements in every one of the .bat files located in that directory:
-ini=..\\Benchmark\\Stuff\\MaxDetail.ini -userini=..\\Benchmark\\Stuff\\MaxDetailUser.ini
For example, botmatch-antalus.bat will look like this after the additions:
..\System\ut2003 dm-antalus?spectatoronly=true?numbots=12?quickstart=true -benchmark -seconds=77 -exec=..\Benchmark\Stuff\botmatchexec.txt -ini=..\\Benchmark\\Stuff\\MaxDetail.ini -userini=..\\Benchmark\\Stuff\\MaxDetailUser.ini -nosound
Remember to do this to all of the .bat files in that directory before running Benchmark.exe.
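If you'd rather not edit every .bat file by hand, the change can be scripted. The following is a hypothetical Python helper (the Benchmark directory layout is assumed from a default UT2003 install, and the function name is our own) that appends the parameters to each line launching the game:

```python
# Hypothetical convenience script: append the forced-detail .ini parameters
# to every ut2003 launch line in the Benchmark directory's .bat files.
from pathlib import Path

# Raw string so the doubled backslashes survive exactly as shown above.
EXTRA = r' -ini=..\\Benchmark\\Stuff\\MaxDetail.ini -userini=..\\Benchmark\\Stuff\\MaxDetailUser.ini'

def patch_bat_files(benchmark_dir):
    """Append EXTRA to each ut2003 launch line in every .bat file."""
    for bat in Path(benchmark_dir).glob('*.bat'):
        lines = []
        for line in bat.read_text().splitlines():
            # Only touch lines that actually launch the game, and skip
            # files that have already been patched.
            if 'ut2003' in line and 'MaxDetail.ini' not in line:
                line += EXTRA
            lines.append(line)
        bat.write_text('\n'.join(lines) + '\n')
```

Running `patch_bat_files(r'X:\UT2003\Benchmark')` once is enough; the check against `MaxDetail.ini` makes it safe to re-run.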
Starting out, we see some interesting things; the FX 5600 Ultra is right on the heels of the GeForce4 Ti 4200, but unable to outperform it. Remember that the 5600 Ultra has a lower fill rate, but more (and more efficient) memory bandwidth. At this low of a resolution, memory bandwidth isn't the limiting factor, thus the Ti 4200 comes out on top.
The 5200 Ultra manages to outperform its competition fairly well, although it will be interesting to see where the regular 5200 would fall into this chart.
You can also see a clear division between the 5600 Ultra and the 5200 Ultra, although the two cores may be similar, they are clearly not on equal performance levels.
The trend continues all the way up to 1600x1200, where the FX 5600 Ultra is finally able to outperform the Ti 4200 - once again because memory bandwidth becomes the limiting factor at this high of a resolution, which plays to the 5600 Ultra's strengths.
What's interesting to note is that the Radeon 9500 Pro does incredibly well in all of these tests, if ATI is able to make the Radeon 9600 Pro perform just as well as the 9500 Pro then NVIDIA could be in trouble. But considering that the Radeon 9500 Pro has twice as many pipelines as the Radeon 9600 Pro, it is unlikely that we will see that sort of a showing from the 9600 core.
Unreal Tournament 2003 (AA & Anisotropic Performance)
Turning on AA & Anisotropic filtering shows the advantage of the NV3x architecture over the previous generation, where the GeForce FX 5200 Ultra is now almost as fast as a GeForce4 Ti 4200. The FX 5600 is significantly faster than a Ti 4200, but still not close to a 9500 Pro.
Serious Sam 2
Serious Sam 2 starts out by clearly not favoring the NV3x cards; even the 9000 Pro does a better job here.
Things don't improve much at 1024x768; let's see how the higher resolutions go.
Serious Sam 2 (continued)
As memory bandwidth becomes an issue, the two NV3x cards do a bit better, but still end up at the bottom of the charts. It looks like it will take AA + Aniso in order to get these GPUs to flex their muscle.
Serious Sam 2 (AA & Anisotropic Performance)
As expected, enabling AA and Anisotropic filtering changes the situation tremendously, although the Ti 4200 still performs very close to the FX 5600 Ultra.
Jedi Knight II
Jedi Knight is very CPU/platform bound at lower resolutions, thus we see no real performance difference between the cards here.
At 1024x768 the slower cards start to fall behind, although the low-end NV34 is doing just fine.
Jedi Knight II (AA & Anisotropic Performance)
With AA & Anisotropic filtering enabled, the trend we've been seeing all along continues under Jedi Knight. It isn't until we turn on those features at 1024x768 that the men are truly separated from the boys though:
Here the 5200 Ultra is actually faster than the GeForce4 Ti 4200, which goes to show you the benefits of NVIDIA's compression engine as well as their improved AA algorithms.
Comanche 4
[Comanche 4 benchmark charts]
Comanche 4 (AA & Anisotropic Performance)
[Comanche 4 AA & Anisotropic benchmark charts]
Final Words
NVIDIA's DirectX 9 strategy is to be commended; they are delivering DX9 compatibility on cards ranging in price from under $100 all the way up to their $400 flagship. This is a move NVIDIA should have made back in the DirectX 8 days with the GeForce4 MX but failed to; at least they are now listening to the demands of developers and end users alike.
When looking at the actual products themselves, you can make a couple of generalizations about their performance. For starters, the GeForce FX 5600 Ultra performs much like a GeForce4 Ti 4200 in situations where no anti-aliasing or anisotropic filtering is used. Enabling either or both of those features allows the 5600 Ultra to significantly outperform the GeForce4 Ti 4200, mostly thanks to the NV31's superior memory controller, compression and AA engines.
The GeForce FX 5200 Ultra performs slightly above the level of a GeForce4 MX 460 in situations where no AA or aniso is enabled; but once again, enabling those features causes the 5200 Ultra to perform more like a GeForce4 Ti 4200 than a GeForce4 MX. The reason behind this is obviously not because of any compression algorithms (as there is no color/Z compression in the NV34 GPU), but rather because of the AA and anisotropic filtering engines, as well as a vastly improved memory controller when compared to the GeForce4 MX (not to mention the higher clock speeds).
The NV31 itself has a great deal of headroom left in the part; although NVIDIA is currently only announcing DDR-I based solutions (e.g. GeForce FX 5600 Ultra), the chip should be able to work in a higher speed configuration with faster DDR-II memory. When NV35 hits, NVIDIA could theoretically fill the gap between NV31 and NV35 with a higher clocked NV31 with DDR-II memory, something that could become a reality depending on competition from ATI.
Speaking of competition, factoring in the performance of ATI's Radeon 9500 Pro makes NVIDIA's offerings seem far less attractive. The Radeon 9500 Pro with its 8-pixel-pipe design is the clear winner here, but because the Radeon 9600 isn't a direct derivative of the 9500 Pro, we cannot immediately assume that the Radeon 9600 Pro will do just as well as the 9500 Pro in these tests. As we mentioned before, the Radeon 9600 Pro has a higher core clock than the GeForce FX 5600 Ultra, but a lower memory clock; this most likely means that in non-AA/aniso situations, the Radeon 9600 Pro should outperform the 5600 Ultra. If NVIDIA can fix their anisotropic filtering issues to the point where their performance-aggressive mode is identical to ATI's performance aniso mode (which NVIDIA claims it should be), then the comparison under aniso/AA modes should be quite interesting. One thing is for sure: current Radeon 9500 Pro owners should stick with their cards, and if you can find one, it's not a bad buy.
We'll have to wait on two things before we can truly crown a winner in the mainstream segment. For starters, we'll need cards from ATI, which for whatever reason are still not available (strange, considering that they are supposedly done). We will also need fixed aniso drivers from NVIDIA, which we're hoping the team out in Santa Clara is hard at work on. It is disappointing that we're not able to give you a solid recommendation on what to buy just yet, but holding off on any purchases until both ATI and NVIDIA can deliver what we've asked for will help you make a much more educated purchasing decision in the end.