Original Link: https://www.anandtech.com/show/567

NVIDIA GeForce 2 GTS 64MB

by Matthew Witheiler on June 21, 2000 9:51 AM EST


Introduction

Until rather recently, video cards played a minor role in computer speed. The fact of the matter is that the display of 2D text on a black screen really does not require extensive video card power. With the advent of 3D computer gaming, a whole new weight was placed on the video card. No longer did the video card serve to only display text, but it was called upon to perform complex 3D calculations and draw a realistic 3D world on the screen. With this advancement, the 256 kb of on card video memory found on early video cards quickly filled up with the images being drawn. In order for the cards to keep up with the games, RAM had to be added at an almost exponential rate. Just how fast has this growth been occurring?

By examining the time periods during which a high performance card's memory amount increased, a trend may be found. Let us begin with the Voodoo2, a card which brought the transition from 8 MB to 12 MB of card RAM. Popular in the summer of 1998, 8 MB configurations began to disappear rather quickly with the higher performing 12 MB configuration taking charge. The next transition occurred during the summer of 1999 with the release of the TNT2. This card, which sold in both 16 MB and 32 MB configurations, drew the following conclusion as seen in our TNT2 Roundup: "The better overall decision, from a performance perspective, is to go with a 32MB card." That takes us to the summer of 2000, present day, where, lo and behold, we find the latest and greatest graphics processor, NVIDIA's GeForce 2 GTS, being sold in both 32 MB and 64 MB configurations. Historically, now is the time for RAM doubling to occur once again. However, is now the right time for a new standard to be set? Does the additional 32 MB of memory help in current games? How does the longevity of 32 MB cards look? By taking a comprehensive look at both the GeForce 2 GTS as well as the alternate RAM configurations, AnandTech attempts to determine if now is the time for change.



Without question, the GeForce 2 GTS GPU contains the most raw power of any video card processor out on the market. With 25 million transistors manufactured on a 0.18-micron process, the GeForce 2 GTS comes clocked at 200 MHz. This power, coupled with the GeForce 2 GTS's quad pipelines, results in an unheard of pixel fill rate of 800 megapixels per second and a texel fill rate of 1.6 gigatexels per second. NVIDIA proclaims that the GeForce 2 GTS will allow the user to play games never imagined before, with fully realistic 3D graphics. The problem is that this fill rate of 800 megapixels per second is only one side of a double edged sword. This edge is known as "theoretical fill rate."

The theoretical fill rate is based on a video card's clock speed and its number of pixel pipelines. From these characteristics of any card, the theoretical fill rate can be calculated. In the case of the GeForce 2 GTS, we have a 200 MHz core clock speed powering 4 pixel pipelines and thus a resulting fill rate of 800 megapixels per second. The theoretical texel fill rate of 1.6 gigatexels per second comes from the fact that the GeForce 2 GTS GPU can process 2 texels per pipeline per clock, resulting in 800 x 2 megatexels per second, or 1.6 gigatexels per second. The problem lies in the fact that, while the GPU may be able to push out 800 megapixels per second, the memory bus is almost never fast enough to keep up with the processor.
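The arithmetic above can be written out as a short sketch. The function names are ours and purely illustrative; the inputs are the clock speed, pipeline count, and texels per pipeline quoted in this article.

```python
def pixel_fill_rate_mpix(core_mhz, pipelines):
    """Theoretical pixel fill rate in megapixels per second."""
    return core_mhz * pipelines


def texel_fill_rate_mtex(core_mhz, pipelines, texels_per_pipeline):
    """Theoretical texel fill rate in megatexels per second."""
    return pixel_fill_rate_mpix(core_mhz, pipelines) * texels_per_pipeline


# GeForce 2 GTS at stock clock: 200 MHz core, 4 pipelines, 2 texels per pipeline.
print(pixel_fill_rate_mpix(200, 4))      # megapixels per second
print(texel_fill_rate_mtex(200, 4, 2))   # megatexels per second
```

The same formula shows that a theoretical fill rate of 960 megapixels per second on four pipelines corresponds to a 240 MHz core clock.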

Herein lies the other edge of our sword. The GeForce 2 GTS GPU is often able to put out more information than the card's memory can receive at one time. Since all the data sent out from the processor must find its way into and out of the memory at one time or another, the slowness of the memory bus creates a bottleneck in the video card, creating a traffic jam of sorts. As a result of this, the theoretical fill rate of a card is almost never achieved. For this reason it is necessary to introduce a new term into the video card equation: "effective fill rate." Effective fill rate can be defined as the real world speed of a card. Since the video card processor must always make use of the memory bus and the true rendering ability of a card is only as fast as its slowest component, we can conclude that any video card that passes information through the memory bus will almost never achieve its theoretical fill rate and will instead deliver a lower effective fill rate. This bottleneck is seen to the greatest extent when in 32-bit color mode, as twice as much data must pass from processor to memory. Unlike theoretical fill rate, which is completely dependent on core speed and pixel pipelines, effective fill rate takes into account that the GPU is not working by itself. Therefore, the effective fill rate of a card is dependent on the bandwidth of the memory bus.
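One rough way to picture the "slowest component" argument is to cap the theoretical fill rate by what the memory bus can carry. The sketch below is a simplified model, not a measurement: the bytes of memory traffic per pixel (8 for 16-bit color, 16 for 32-bit color) are hypothetical values chosen only to reflect the "twice as much data" relationship, and the 5333 MB/s figure is the peak memory bandwidth derived later in this article.

```python
def effective_fill_rate_mpix(theoretical_mpix, bandwidth_mb_s, bytes_per_pixel):
    """Upper bound on fill rate (megapixels/s) under a simple bottleneck model.

    bytes_per_pixel is an assumed amount of memory traffic per rendered pixel.
    """
    memory_bound_mpix = bandwidth_mb_s / bytes_per_pixel
    return min(theoretical_mpix, memory_bound_mpix)


THEORETICAL_MPIX = 800      # 200 MHz x 4 pipelines
PEAK_BANDWIDTH_MB_S = 5333  # roughly 5.3 GB/s

# Hypothetical per-pixel traffic: 32-bit color moves twice the data of 16-bit.
print(effective_fill_rate_mpix(THEORETICAL_MPIX, PEAK_BANDWIDTH_MB_S, 8))
print(effective_fill_rate_mpix(THEORETICAL_MPIX, PEAK_BANDWIDTH_MB_S, 16))
```

Under these assumed numbers the memory-bound rate falls below 800 megapixels per second in both modes, and halves when the per-pixel traffic doubles.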

Just how dependent are GeForce 2 GTS based cards on their memory bus? Our Overclocking the GeForce 2 GTS guide answers this question by observing the effects of overclocking the memory and core speeds independently. Let's take a look at what the guide found.



Results of GeForce 2 GTS Limitations - The Core

In order to observe the true effect that the core and memory speeds have on the GeForce 2 GTS, we raised each clock speed independently. With a Pentium III 733E running Quake III Arena Demo001, we collected the following data by raising the core speed of the GeForce 2 GTS alone.

As the graphs show, overclocking the core of the GeForce 2 GTS does not result in the same effect as we saw in TNT2 days. True, theoretical fill rate has jumped as high as 960 million pixels per second, 160 million pixels above the stock clocked card; however, the immense power displayed resulted in almost no real world performance increase. This is shown by the nearly horizontal lines on the graph, displaying that an increase in core frequency results in almost no performance increase. What is the cause of this lack of response to overclocking? To answer this question, we need to observe what happens when the memory clock of the GeForce 2 GTS is raised while the core stays stock. For this, we turn to graphs once again.



Results of GeForce 2 GTS Limitations - The Memory

This time we see that the trend is rather different. As opposed to the results encountered when overclocking the core, overclocking the memory bus resulted in a seemingly linear increase in speeds across the board. The performance line is no longer horizontal, showing that card performance increases to a noticeable extent when just the memory clock is pushed up. This time, as opposed to before, we are actually increasing the effective fill rate of the card while leaving the theoretical fill rate unchanged. With this information, we can clearly see that the effective fill rate of the GeForce 2 GTS when at default clock is nowhere near the theoretical fill rate. By increasing the memory clock speed, we come closer and closer to eliminating this bottleneck and approaching the theoretical fill rate.

This shows the demand for higher speed RAM, as the 166 MHz DDR SGRAM chips that are shipping with 32 MB GeForce 2 GTS cards are quite frankly not up to par with the rest of the card. There is no question that raising the memory speed increased performance to a greater extent than raises in the core speed. When at default clock, the DDR RAM chips found on the GeForce 2 GTS possess 5.3 GB/s peak bandwidth. This number is found by taking the memory clock speed, 166 MHz, and multiplying it by 2 since the RAM is being written to twice per clock. Next, this product of 333 is multiplied by the memory bus width of 128 bits. This number must finally be converted from bits to bytes by using the fact that there are 8 bits in a byte, thus we divide the product found in the last step by 8 to get 5.3 GB/s peak bandwidth. As the core overclocking graph shows, the bottleneck formed by this "minimal" 5.3 GB/s bandwidth is enough to prevent the GeForce 2 GTS from achieving its full speed.

By raising the memory speed of the card, we were able to more closely simulate the theoretical fill rate of the GeForce 2 GTS GPU. At our maximum overclocked memory speed of 203 MHz DDR (406 MHz effective), the peak bandwidth of the card lay at an impressive 6.5 GB/s. Even at this bandwidth level our graphs did not begin to go horizontal, suggesting that at the default core clock the GeForce 2 GTS actually requires more bandwidth than this for theoretical fill rate to be realized.
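The peak bandwidth figures quoted above follow from one formula: clock speed, times transfers per clock, times bus width, divided by 8 bits per byte. A minimal sketch, with a function name of our own choosing:

```python
def peak_bandwidth_mb_s(clock_mhz, transfers_per_clock, bus_width_bits):
    """Peak memory bandwidth in MB/s (decimal megabytes)."""
    return clock_mhz * transfers_per_clock * bus_width_bits / 8


# Stock GeForce 2 GTS memory: 166 MHz DDR (2 transfers per clock), 128-bit bus.
print(peak_bandwidth_mb_s(166, 2, 128))   # about 5.3 GB/s

# Our maximum memory overclock: 203 MHz DDR on the same 128-bit bus.
print(peak_bandwidth_mb_s(203, 2, 128))   # about 6.5 GB/s
```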

Although memory bandwidth may seem like a plague on theoretical fill rate, there are actually worse evils in the world of rendering. We have yet to consider what happens when the data attempting to enter the card's memory can not be stored because of lack of space. Since the computer can not throw this information away or display only half of a scene, additional memory needs to be called upon in a process known as AGP texture swapping.



Texture Swapping - A forgotten problem, for now

Texture swapping is not nearly as big a problem now as it was as recently as two months ago. To understand why, we first must examine what exactly texture swapping is.

Texture swapping occurs when the amount of memory found on the video card is not large enough to fit a whole scene inside of it. In order to properly display the scene, it is necessary to find a storage area somewhere else. To do this, the video card calls upon the system memory as a storage area. This method of rendering is usually seen at high resolutions and 32-bit color, for it is in these modes that the amount of data necessary to render a scene is tremendous. The overflowing textures are pushed from the video card to the system memory, as needed, and fetched back when necessary. Ideally, this method of rendering would result in no performance loss; however, we all know that the computer world is far from ideal. Similar to the bottleneck faced when passing data from the GPU to the on card memory, passing data to system memory also requires a comparatively slow data path. The problem is that the path taken to pass information to the system, the AGP path, is much slower than the card's own memory path.
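To get a feel for why high resolutions and 32-bit color fill the card so quickly, it helps to estimate how much on card memory the frame buffers alone consume. The sketch below assumes a front buffer, a back buffer, and a Z-buffer that all use the same bit depth; that buffer layout is our assumption for illustration, and actual drivers may allocate memory differently.

```python
def buffer_bytes(width, height, bits_per_pixel, buffers=3):
    """Bytes used by the assumed front, back, and Z buffers."""
    return width * height * (bits_per_pixel // 8) * buffers


def texture_room_bytes(card_mb, width, height, bits_per_pixel):
    """Bytes left on the card for textures after the buffers are allocated."""
    return card_mb * 1024 * 1024 - buffer_bytes(width, height, bits_per_pixel)


for card_mb in (32, 64):
    for bpp in (16, 32):
        room_mb = texture_room_bytes(card_mb, 1600, 1200, bpp) / (1024 * 1024)
        print(f"{card_mb} MB card at 1600x1200x{bpp}: {room_mb:.1f} MB left for textures")
```

Under this assumption, the buffers at 1600x1200x32 take twice the space they do at 1600x1200x16, leaving a 32 MB card with far less room for textures than a 64 MB card.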

Just how much slower? Let's take a look at the peak bandwidth of the different AGP speed standards. At AGP 1x mode, we can calculate the peak memory bandwidth in the same manner we did before. We take the AGP clock speed, which is 66 MHz, and multiply this by the AGP mode (1x). Next, multiply this by the AGP bus width (32 bits) and divide by 8 to convert to bytes. Calculating this out, we get a bandwidth of 266 MB/s, lagging way behind the already slow 5.3 GB/s peak bandwidth found in the on card memory system. Performing the same set of operations and replacing the mode numbers, we calculate the peak memory bandwidth of the AGP 2x system to be 533 MB/s and the AGP 4x system to be 1.06 GB/s. Even at the highest AGP speed rating, we see that the AGP bus possesses only 20% of the power found in the GeForce 2 GTS's memory bus. We saw above how limiting the 5.3 GB/s peak bandwidth was under normal game play. Imagine if the system memory was needed on a regular basis for rendering. Gameplay would all but cease.
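The AGP figures above come from the same peak bandwidth formula used for the on card memory. In this sketch we use 200/3 MHz (about 66.67 MHz) for the nominal 66 MHz AGP clock so that the results line up with the quoted numbers; the function name is ours.

```python
AGP_CLOCK_MHZ = 200 / 3     # nominal "66 MHz" AGP clock, about 66.67 MHz
AGP_BUS_WIDTH_BITS = 32


def agp_bandwidth_mb_s(mode):
    """Peak AGP bandwidth in MB/s for a given AGP mode (1, 2, or 4)."""
    return AGP_CLOCK_MHZ * mode * AGP_BUS_WIDTH_BITS / 8


ON_CARD_MB_S = 5333         # roughly 5.3 GB/s on card memory bandwidth

for mode in (1, 2, 4):
    bandwidth = agp_bandwidth_mb_s(mode)
    share = bandwidth / ON_CARD_MB_S * 100
    print(f"AGP {mode}x: {bandwidth:.0f} MB/s ({share:.0f}% of on card bandwidth)")
```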

Luckily for us, NVIDIA has found a way around this problem, at least for the moment. The GeForce 2 GTS did not only bring higher memory bus speeds to the GeForce line, but it also brought the release of a new driver set: the 5.xx series drivers. Besides support for the new processor and its advanced features, the 5.xx series drivers also introduced a type of texture compression known as S3TC compression. Using a hardware based algorithm, S3TC compression takes the information to be passed to memory and compresses it and expands it as needed. What this does is not only decrease the amount of information that must be passed to and from the memory, it also decreases the footprint of a given scene. By implementing this feature, NVIDIA was able to weasel their way around AGP texturing by essentially ensuring that the 32 MB of on card RAM would be sufficient to store the compressed information. This feature, in fact, does turn out to reverse the effects of AGP texturing seen in our 64 MB GeForce review.
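To illustrate how much a compressed texture can shrink, the sketch below assumes the DXT1 variant of S3TC, which stores each 4x4 block of texels in 8 bytes. Which S3TC format a given game or driver actually uses is not stated here, so treat the format choice and the 256x256 texture size as illustrative assumptions.

```python
def uncompressed_bytes(width, height, bits_per_texel):
    """Size of an uncompressed texture in bytes."""
    return width * height * bits_per_texel // 8


def dxt1_bytes(width, height):
    """Size of a DXT1-compressed texture: 8 bytes per 4x4 texel block."""
    blocks_x = (width + 3) // 4   # round up to whole blocks
    blocks_y = (height + 3) // 4
    return blocks_x * blocks_y * 8


width, height = 256, 256          # hypothetical texture size
compressed = dxt1_bytes(width, height)
for bits in (16, 32):
    raw = uncompressed_bytes(width, height, bits)
    print(f"{bits}-bit texture: {raw} bytes raw, {compressed} bytes DXT1, ratio {raw // compressed}:1")
```

With a ratio of this size, a scene whose textures once overflowed a 32 MB card can fit on the card again, which is consistent with the behavior described above.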



Problems With Compression

There are two major problems that compression will have to face. First off, game developers must be willing to write games that take advantage of this algorithm, meaning that some games that use it may be almost unplayable on cards where S3TC is not supported. The second matter of concern is that texture sizes will just keep growing. As games become more realistic and game play becomes more interactive, texture and scene sizes are bound to grow. This was seen with the introduction of Quake III Arena. The original GeForce 256 DDR running 3.xx series drivers would choke under heavy loads in Quake III. This was due to the fact that the large, complex scenes being rendered were using the system's RAM as a storage area, meaning that data had to travel from the GPU down the slow AGP bus path. The introduction of S3TC compression eliminated this problem in Quake III Arena; however, it is very likely that the next generation of games will once again push the limits of a 32 MB memory storage system. Even with S3TC compression in future games, common sense tells us that texture and scene sizes will continue to grow and push the limits of even a compressed 32 MB memory system.

When this occurs, and it will, texture swapping will become a problem once again. When the compressed data overflows from the card's memory, the slow AGP path will have to be traveled just as it was in the past. And, just as it was in the past, the path will remain extremely slow compared to the on card memory bus path.

With this in mind, there is only one way to overcome the problem for a longer period of time. That is by increasing on card memory size.



Why 64 MB

By increasing the amount of memory that is present on a video card, card manufacturers are able to prevent the video memory from filling up completely, which is what results in AGP texture swapping. By providing more storage space for the needed textures, additional memory prevents data from traveling the AGP route. By eliminating this mode of texturing, additional memory results in data only having to travel on the "fast" video card memory bus, going along at 5.3 GB/s.

In the future, there is almost no question that the additional 32 MB of on card memory will aid in speed. The result should be similar to what we found in our 64 MB GeForce review, where, basically, Quake III Arena became unplayable at high color depth and resolutions when rendering complex scenes. Texture compression alleviated this for now, however future games should see AGP texturing pop up again in cards with even 32 MB of memory as scene and texture size outgrows memory size. So, we know that 64 MB of on card memory will no doubt help in future games to some extent, however we are left to wonder what the additional RAM does in the current generation of games. For this, we turn to tests, performed on our first 64 MB GeForce 2 GTS card, the Guillemot/Hercules 3D Prophet II GTS 64MB.



The Card

The first 64 MB GeForce 2 GTS card to enter the AnandTech lab also appears to be the first 64 MB GeForce 2 GTS card on the market. Not long after the Guillemot/Hercules 3D Prophet II GTS 64MB entered the lab, it was well on its way to store shelves across America. While we will feature a review of this card in the coming week, this section serves as an introduction to the card in order to familiarize the reader with the test system.

Besides all the visible eye candy created with the RAM heatsinks and blue PCB, the 3D Prophet II GTS closely resembles the reference design for the GeForce 2 GTS with one noteworthy exception: the use of DDR SDRAM. Every single 32 MB DDR GeForce line card we have seen in the lab has used Infineon SGRAM chips for storage. However, just as we saw on the 64 MB GeForce, the 3D Prophet II GTS 64MB uses Hyundai 6 ns DDR SDRAM chips to power the memory subsystem. It is well known that SGRAM chips perform slightly better in video cards, so why do the 64 MB cards opt for SDRAM? Well, it seems that 64 MB card designers have no other choice. The DDR SGRAM chips currently made by Infineon, the only manufacturer of DDR SGRAM for use in video cards, are not dense enough to pack 64 MB onto a single card. With this limitation, Guillemot/Hercules was forced to choose the slower performing DDR SDRAM chips to power the 3D Prophet II GTS 64MB.

Outfitted with 64MB of DDR SDRAM, the 3D Prophet II GTS was ready to take its position in the testbed system and serve as a model for all 64 MB GeForce 2 GTS cards to come. It should be noted that the first batch of 3D Prophet II GTS 64MB cards were clocked at 220/366 (core/memory), including our evaluation sample. Later versions of the card feature a new BIOS that puts the clock back at 200/333 (core/memory) like all other GeForce 2 GTS cards and is the speed that we tested at for this review.



Windows 98 SE Test System

Hardware

CPU(s): Intel Pentium III 550E, AMD Athlon 750
Motherboard(s): AOpen AX6BC Pro Gold, AOpen AK72
Memory: 128MB Corsair PC133 SDRAM
Hard Drive: Quantum Fireball CR 8.4 GB UDMA 33
CDROM: Acer 24x
Video Card(s):
NVIDIA GeForce256 32MB DDR
NVIDIA GeForce256 64MB DDR
NVIDIA GeForce 2 GTS 32 MB
NVIDIA GeForce 2 GTS 64 MB
Voodoo5 5500 64 MB
ATI Rage Fury MAXX

Software

Operating System: Windows 98 SE
Video Drivers:
NVIDIA Detonator2 v5.22
3dfx Voodoo5 5500 1.00.01
ATI Rage Fury MAXX A6.40CD06
VIA 4-in-1 Service Pack version 4.22
VIA AGP GART 4.03

Benchmarking Applications

Gaming:
GT Interactive Unreal Tournament 420 AnandTech.dem
idSoftware Quake III Arena v1.16n demo001.dm3
idSoftware Quake III Arena v1.16n Quaver.dm3



Athlon 750 - Quake III Arena demo001.dm3

For a higher end benchmarking system, we chose the Athlon 750. Using Quake III Arena's built in demo001, we were able to simulate performance in a real world situation. The demo001 benchmark is not known for its complex textures or scene quality and is generally thought to be an accurate representation of the majority of Quake III Arena levels.

In 16-bit color mode, we find that the GeForce 2 GTS 64 MB stays about neck and neck with the lower costing 32 MB configuration. In fact, the 64MB GeForce 2 GTS actually takes a back seat to the 32 MB GeForce 2 GTS when at 1280x1024x16. The extra speed gained by the 32 MB GeForce 2 GTS is most likely a result of the faster SGRAM chips being used on this card compared to the slower SDRAM chips on the 64 MB version. The one place in 16-bit color mode where the extra 32 MB of memory proves to be an advantage is at 1600x1200x16. It seems that the extra memory finally begins to help at this high resolution, as the extra RAM most likely prevents a very mild form of AGP texturing. At this resolution, we find that the 64 MB GeForce 2 GTS performs 8.4% faster than the 32 MB GeForce 2 GTS.

In 32-bit color we finally begin to see the advantage that the extra 32 MB of RAM provides, however the performance gain experienced does not set in until the high resolution of 1280x1024 and above. Below this resolution, the extra RAM goes unused, providing no performance increase. At 1280x1024x32, it seems that Quake III Arena is finally able to take advantage of the extra memory, providing for a 7.8% performance increase. The speed only increases when at 1600x1200x32, with the 64 MB GeForce 2 GTS performing 9% faster than its closest competition: the 32 MB GeForce 2 GTS.
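The percentage gains quoted throughout these benchmarks are relative to the 32 MB card's frame rate. A minimal sketch of that calculation follows; the frame rates passed in are hypothetical placeholders, not our measured results.

```python
def percent_faster(new_fps, old_fps):
    """Percentage by which new_fps exceeds old_fps."""
    return (new_fps - old_fps) / old_fps * 100.0


# Hypothetical frame rates, used only to illustrate the formula.
fps_64mb = 54.5
fps_32mb = 50.0
print(f"{percent_faster(fps_64mb, fps_32mb):.1f}% faster")
```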



Athlon 750 - Quake III Arena Quaver.dm3

In our 64 MB GeForce review, we introduced the use of Quaver as a system stressing benchmark. Recorded in a Quake III level where textures are known to be large, Quaver was able to cripple many cards when the texture quality slider in Quake III Arena was set all the way to the right. With the release of the 5.xx series drivers, which included the addition of S3TC compression, it seems that the Quaver benchmark can be toppled by almost any card using S3TC compression. This leaves only the ATI Rage Fury MAXX in the dust due to its lack of this compression method.

Quaver remains a good benchmark to show how cards perform in situations that previously were thought to be the highest stress possible. We see once again that the 64 MB GeForce 2 GTS holds the top spot in 16-bit color rendering in all modes except for 1280x1024. At all resolutions below 1600x1200, it seems that the extra memory is not helping in 16-bit color mode, with the 64 MB GeForce performing nearly identically to its 32 MB brother. This lack of performance is due to the fact that at 16-bit color, the card is simply not crunching out enough data to fill the 32 MB of memory, let alone 64 MB of it. In these situations, the extra memory goes unused and seemingly wasted. It is not until the resolution is cranked all the way up to 1600x1200 in 16-bit mode that we notice any difference between the 64 MB and 32 MB versions. With the large amount of data that must be stored in the memory at this high resolution, the 64 MB card prevents any minuscule amount of AGP texturing and results in a 6.7% speed increase when compared to the 32 MB version.

In 32-bit color mode, the memory system becomes taxed earlier, with performance of the 64 MB card becoming noticeably better at resolutions of 1280x1024 and above. This resolution and color depth appears to act in a similar manner to the 1600x1200x16 resolution, with the additional memory accounting for a 7.4% speed increase. This margin is expanded when at 1600x1200x32, with the large amount of data passing to the memory being sped up by the extra 32 MB of storage space. This results in a 9.3% performance increase.



Pentium III 550E - Quake III Arena demo001.dm3

A Pentium III 550E was used to simulate card performance on lower end systems.

While in general the trend of the 16-bit color performance appears to be the same as experienced on the Athlon 750, there is one noteworthy exception. Unlike the Athlon 750 test system, the Pentium III 550E system did not show any performance increase in 16-bit color until 1600x1200. If you recall, we found that on the Athlon, performance jumped a marginal amount of 1%. This increase is gone in the Pentium III system, most likely due to the fact that the processor is acting as a bottleneck. The performance trend found on the Athlon 750 remains the same on the Pentium III 550E when at 1600x1200x16, with the 64 MB GeForce 2 GTS card performing 8.3% faster (very close to the 8.4% speed increase found on the Athlon 750).

When in 32-bit color, the performance difference does not show again until 1280x1024, a resolution where the memory system begins to be taxed. At this resolution, the additional 32 MB of memory provided a 9% speed increase. At 1600x1200x32, where it appears that the 64 MB GeForce 2 GTS is most able to stand out from its 32 MB counterpart, the card performs 8.7% faster.



Pentium III 550E - Quake III Arena Quaver.dm3

Once again, we find that Quaver does not stress the cards to any great amount, with the exception of the S3TC-lacking Rage Fury MAXX. In 16-bit color mode, we note that performance is nearly equal across the board for the GeForce 2 GTS based cards. This time we find that the 64 MB GeForce 2 GTS is routinely a bit slower than the 32 MB GeForce 2 GTS, a product of the use of slower SDRAM on the 64 MB board. At 1600x1200x16, where performance begins to separate between the two cards, the GeForce 2 GTS outfitted with 64 MB of RAM performs 6% faster.

In 32-bit color, performance begins to favor the 64 MB GeForce 2 GTS at 1280x1024 and above. Below this resolution, the 64 MB card performs a bit slower than the 32 MB card, once again due to the slower SDRAM chips used. When at 1280x1024, the 64 MB GeForce 2 GTS performs 7% faster, with this performance gain rising to 7.4% when at 1600x1200x32.



Athlon 750 - Unreal Tournament

The Rage Fury MAXX would not complete Unreal Tournament runs on our KX133 motherboard.

At the 16-bit color setting, we find that the 64 MB GeForce 2 GTS performs lower than expected, with the Voodoo5 5500 beating it out in 3 of 5 resolutions. The extra memory in this case does not result in any performance increase until the high resolution of 1600x1200x16 is reached. At this resolution, the extra memory seems to have eliminated a small amount of texture swapping, with performance gaining 2.8%.

In 32-bit color mode, performance remains nearly identical across the board until 1280x1024x32. Due to the high amount of textures used in the Unreal Tournament game, it is no surprise that we see a very large 45.9% performance increase over the 32 MB GeForce 2 GTS. Since Unreal will not run at 1600x1200x32, we could not calculate scores for this resolution but it is very likely that the performance gain from the 32 MB of additional memory would be even more amazing.



Pentium III 550E - Unreal Tournament

The results of the benchmarks run on the Pentium III 550E are nearly identical to the trends seen in the Athlon 750 benchmarks. We find that the GeForce 2 GTS with 64 MB of memory takes a back seat to the Voodoo5 5500 in 16-bit color mode until 1600x1200. At this resolution, the 64 MB card performs 3% faster than the 32 MB card.

When in 32-bit color mode, the 64 MB GeForce 2 GTS does not begin to stand out until 1280x1024, where it gains a large performance increase over its 32 MB counterpart. Due to the large textures that Unreal Tournament uses to render rooms and objects, the additional 32 MB of memory prevents AGP texture swapping and results in a 52.6% performance increase compared to the 32 MB GeForce 2 GTS.



Conclusion

Running anywhere from $70-$100 more than an equally configured 32 MB GeForce 2 GTS card, the extra 32 MB of memory comes at quite a cost. On the up side is the fact that many retailers are now carrying the Guillemot/Hercules 3D Prophet II GTS 64MB, however the cost will most certainly deter many potential buyers. With such a large premium to pay for owning the "most powerful" video card out there today, is the performance gain from a 64 MB GeForce 2 GTS worth it? Shall we continue with the trend of memory increases every summer? To answer these questions, we need to examine the 64 MB GeForce 2 GTS's current performance as well as its long term value.

As the performance numbers show, the extra 32 MB of memory (as well as $70-$100) results in a very nominal performance increase of about 8% only at very high colors and resolutions when in Quake III Arena. The Unreal Tournament numbers show that the additional 32 MB of memory really does not do anything until 1280x1024x32, where performance increased up to 52%. $70-$100 seems a high price increase to pay for a large performance gain in one specific color depth and resolution and minor performance gains in all others.

In order to determine if the 64 MB GeForce 2 GTS is right for you, you need to consider how often you wish to walk down the upgrade path. Many users may find it more economically sound to purchase a 32 MB GeForce 2 GTS card now and upgrade around the time of the NV25, slated to be released a little less than a year from now. For those not interested in having top of the line performance no matter the cost, this path seems to make the most sense, as it seems that the 32 MB GeForce 2 GTS, with its S3TC texture compression, will be fine for games in the not so distant future.

A different path, however, should be taken by those who either want the best of the best now and are still willing to upgrade in a year or want to upgrade now and hold onto their investment for a longer period of time. For the user who wants the best of the best and is going to upgrade in about a year regardless, go with the 64 MB GeForce. While it does not make that large of a difference now, future games will only increase the performance gain experienced with the additional 32 MB of memory. For the user who wishes to upgrade now to the best and hold on to their investment the longest, the 64 MB GeForce 2 GTS makes the most sense once prices fall within obtainable ranges. Once again, a user who wishes to retain his/her card for a long period of time will not regret the additional memory that the 64 MB card has to offer.

It seems that the best advice for one looking to upgrade now is to wait until 64 MB GeForce 2 GTS cards fall more in price. With the 32 MB cards dropping in price almost every few weeks, it seems that it is only a matter of time before the 64 MB cards are within the price range of current 32 MB cards. The performance gain from the extra memory may not be there now, but as games that do not take advantage of S3TC compression are released and texture sizes grow once again, the additional memory will most definitely equate to an increase in performance.
