Original Link: https://www.anandtech.com/show/375
Although NVIDIA and S3 have already made their announcements regarding next generation parts featuring hardware accelerated transform and lighting, 3dfx is letting the world know about their next generation product piece by piece. It started nearly two months ago when 3dfx announced their T-Buffer Technology. Just a couple of days ago, they made another announcement - this time about their FXT1 Texture Compression. 3dfx is being extremely tight-lipped about this next generation product and has let very little slip out.
We decided to try our best to squeeze a little bit of info out of 3dfx's Chief Technology Officer, Scott Sellers, a little early, as well as hear their thoughts on NVIDIA and S3's latest products.
The cat is finally out of the bag for NVIDIA's GeForce 256 and S3's Savage 2000. What is your take on these announcements and how does 3dfx plan to compete?
In the case of Savage 2000, I can’t really give my opinions on the product because I have not seen it in action (nor, to my knowledge, has anyone written about seeing it run). So, it looks like a classic S3 part – a good feature set, but performance that will likely be very average, placing it in the "value" segment of the gaming market. The Savage 2000's bottleneck appears to be its memory interface, which is only now catching up to the rest of the industry with a 128-bit single data rate memory interface. It appears S3 is about one generation behind everyone else in getting support for high speed memory interfaces with significant memory bandwidth, which is obviously required for sustaining high fill-rates. The Savage 2000 looks like a good mainstream OEM product, though.
A lot more information is now available about the GeForce, so we’ve been able to form much stronger opinions on it. In summary, the GeForce appears to be a couple of TNT2 raster engines coupled together with a new geometry engine bolted on top. We give nvidia a lot of credit for being first to market with a geometry accelerator, but we do disagree with their decision to emphasize geometry over fill-rate. The reality is that the announced fill-rate for GeForce is not going to allow it to remain competitive with our next generation products for games which run at high resolution and high color depth. With every new 3D product cycle, gamers have been able to increase the resolution and pixel depth at which they enjoy their games, yet nvidia appears to have made no substantial improvements in being able to run games at high resolution and 32bpp color. In contrast, we have listened to our customers and have focused on delivering a compelling gaming experience at 1024x768 resolution and beyond in 32bpp color. In addition to being able to run at high resolutions and color depths, we will absolutely stun people when they see the quality of our rendering in the next generation product with the T-Buffer and its full-scene spatial anti-aliasing capability.

The other odd aspect of nvidia's decision to emphasize geometry acceleration in GeForce is that the geometry acceleration does relatively little for any games in the near future. Even many games which are written for OpenGL, which theoretically can be accelerated by the GeForce, may not be written properly to be able to extract much additional performance. So, consumers who buy GeForce boards are going to have to wait quite some time before any games actually support geometry acceleration in any compelling way. Our strategy with the T-Buffer technology, however, was to give users an immediate benefit right when they plug in their new hardware accelerator, being able to immediately upgrade all of today’s games. Being able to play a ton of games immediately with substantially enhanced visual effects is a compelling reason to choose a product over one which only promises games in the future. We believe consumers will choose our technology in spades.
3dfx has always been the speed king in the market - always pushing speed over quality. With the Voodoo3 vs. TNT2 debate, it seemed to come down to the fact that the Voodoo3 was better for playing current games on the market whereas the TNT2 was a better forward looking investment thanks to its more complete feature set (32-bit rendering, etc.). With NVIDIA going the route of T&L and rumors pointing towards 3dfx going with massive fill rate, it seems like the same scenario is upon us again with 3dfx the leader for current games but NVIDIA a bit more forward looking. Is this the trend 3dfx is looking to pursue, or is this just how things have turned out?
Well, I would say that our strategy is always to be both the feature leader and the performance leader, but you always have to make trade-offs. Nvidia, with their announcement of GeForce, clearly has made the decision that geometry acceleration is more important than fill-rate. What is important to recognize, however, is that with the level of fill-rate that the GeForce is capable of, we don’t think anyone will ever see substantial benefits of their geometry capabilities (unless running at very low resolution). So, while our belief is that fill-rate must be raised substantially before it makes sense to increase geometry rates, nvidia has taken the opposite approach – we believe fundamentally the market will judge their decision to ignore substantially increasing fill-rate to be a big mistake.
Both the GeForce 256 and Savage 2000 are offering T&L support. Is this the next big step in 3D rendering?
Well, first off I think it’s important that some people actually see the Savage 2000 in action supporting hardware-based T&L before any decisions or commentary can be made about their particular implementation. While we believe GeForce has some significant shortcomings on the rendering feature set and fill-rate side of things, there is no doubt that they have made substantial progress in geometry acceleration (witness some very nice demos they were showing on the GeForce hardware). So, until S3 shows off the Savage 2000 running demos, and more importantly real applications, utilizing their hardware T&L solution, I remain doubtful that their implementation is actually any faster than a good CPU. Maybe I’ll be surprised, but I’m not holding my breath. So that being said, I really think only one vendor (nvidia) has a compelling geometry solution. But as I said previously, you really cannot take advantage of significant geometry acceleration unless it is coupled with equally impressive fill-rate (otherwise, even if you can generate the triangles quickly, how are you going to fill them just as quickly?). So for the particular implementation of geometry acceleration on GeForce, I do not believe this to be the "next big step" in 3D rendering.
If not, what are the big upcoming advances for 3D?
Well, clearly this is a very unbounded question. But, relative to the next cycle of 3D accelerators, we believe full-scene spatial anti-aliasing to be the next big advance in 3D. There are very few instances in the history of 3D acceleration when a feature has been offered that does not require any software development support and yet can dramatically improve the overall visual quality of a game or application. The people who buy a GeForce product will by-and-large see relatively small levels of performance and visual quality improvement for the overwhelming majority of games available in the near future. In contrast, our customers who buy T-Buffer enabled hardware with our real-time full-scene spatial anti-aliasing capability will be immersed in an incredibly powerful "out of box experience" that has rarely been seen in the history of PC 3D graphics – immediately a person’s library of games will be substantially improved the moment they plug in the T-Buffer enabled hardware. This is really quite exciting for the industry, which historically has forced customers to wait some time before games actually take advantage of a new hardware capability.
John Carmack said the following about the GeForce 256 in a recent .plan update:
It is fast. Very, very fast. It has the highest fill rate of any card we have ever tested, has improved image quality over TNT2, and it gives timedemo scores 40% faster than the next closest score with extremely raw beta drivers.
The throughput will definitely improve even more as their drivers mature.
For max framerates in OpenGL games, this card is going to be very hard to beat.
Once again, 3dfx has always been the speed king, but this sounds pretty tough to beat and John certainly knows his stuff. Do you think you'll be able to do it again? How and why?
Well, John C. has not tested our next generation product yet, so the fact that GeForce beats anything that’s currently on the market is really no big surprise. The GeForce has an advantage with its onboard geometry capability at lower resolutions and color depths; however, we believe that our next generation product will substantially outperform the GeForce when running at the resolutions and color depths that gamers demand. We have always placed a big emphasis on being the premiere hardware platform for the Quake games, and expect our next generation of products to continue to live up to that high standard.
The announced 480 MP/s fill rate seems a bit lower than many had expected for the GeForce 256. Doing the math, this seems to point towards a 120MHz core clock, which is lower than their existing TNT2 parts. Since 3dfx uses the same fab plant, do you anticipate any clock speed issues?
Clock frequency is much more a function of the logic design and physical implementation than fab, so the fact that the GeForce runs at a low clock frequency certainly does not mean our next generation product will also.
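(For reference, the arithmetic behind that estimate: NVIDIA's specifications describe the GeForce 256 as rendering four single-textured pixels per clock, so 480 MP/s divided by 4 pixels per clock works out to a 120MHz core clock. The four-pixels-per-clock figure is NVIDIA's; the division is ours.)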
Do you think NVIDIA might be downplaying their specifications for now until a product is ready or until after 3dfx has made an announcement?
Well, anything is possible I guess in this hyper-competitive market. But I think everyone who follows this market knows that over time nvidia has always promised more than they actually deliver, and we don’t expect that to be any different this time around either. If you consider that nvidia has touted to "revolutionize the world" with GeForce, I think you’ll agree that nvidia is once again over-hyping a product.
The Voodoo rendering architecture hasn't changed significantly from the original Voodoo Graphics (aside from adding a TMU, increasing clock speeds, etc.). Will the "Voodoo 4" (or whatever it's eventually called) offer a completely new core?
Again we are not here to discuss the features of our next products, however clearly we plan to offer substantial benefits which go above and beyond the current Voodoo Graphics architecture (T-Buffer is a good example of something we’ve already disclosed). What is important, however, when we add new features is that we do so in a completely backward compatible manner so that our customers can continue to run all the games already available for 3dfx products.
When can we expect an announcement and/or shipment of such a product from 3dfx?
Unfortunately we can’t really get into the details of future products at this time. We have detailed some of the new functionality of those parts in the form of the T-Buffer, which we believe is very exciting for the industry. We have said that the first product to incorporate the T-Buffer technology will be available for Christmas.
Will we finally see a 3dfx chip with full AGP texturing support?
We will support full AGP texturing in forthcoming products. However, it really is a moot point at this stage because AGP texturing is not being embraced (nor has it ever been) by the development community. There simply is not enough bandwidth on the AGP bus, even with AGP 4x, to sustain high fill-rates when texturing from AGP system memory. What has happened is that as polygonal complexity increases, the additional AGP bus bandwidth is used up by polygon traffic. And the continual decline in memory prices has made having to store textures in system memory less and less of an issue (witness that you can now buy 32MB graphics boards for under $100). Even Intel is backing off from its strong support of AGP texturing for the same reasons I outline, so I think you’ll see it become a very unimportant "checklist" feature moving forward.
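To put rough numbers on that bandwidth argument (our figures, not Sellers'): AGP 4x peaks at roughly 1.06 GB/s (66MHz x 4 transfers per clock x 4 bytes). Ignoring texture caching, sustaining even 300 megapixels per second with bilinear filtering from 16-bit textures would require on the order of 300 million x 4 texel fetches x 2 bytes, or about 2.4 GB/s, in texture reads alone, before any polygon or command traffic is counted. The figures are illustrative, but they show why texturing out of AGP memory cannot keep a high fill-rate part fed.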
How about AGP 4x with fast writes?
By and large, anything which improves the rate at which data can be transferred from the CPU to the graphics device is a very good thing. AGP 4x Fast Writes certainly fall into this category. What I am still confused about from the GeForce announcement is how they actually utilize AGP 4x Fast Writes when they’re running a game. Remember that a large part of 3D graphics performance is being able to build up large command streams which can then be processed by the 3D accelerator as fast as it can. Traditionally in nvidia’s architectures, these large command streams are created in AGP system memory and then automatically read by the 3D accelerator from AGP system memory and then executed. In that scenario, AGP 4x Fast Writes never come into play because the CPU is writing directly to AGP system memory, and then the graphics device is simply reading from AGP memory using traditional AGP read commands. So, unless nvidia has radically changed their command interface mechanism, which we do not believe they did, they actually might not even use AGP 4x Fast Writes during a real-world game scenario (i.e. not a canned benchmark). We’re looking forward to finding out more about this.
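To make the distinction concrete, below is a generic sketch of the pull-model command stream Sellers describes. It is our illustration of the general pattern, with invented names, not NVIDIA's actual command interface: the CPU only ever writes commands into system (AGP) memory and the graphics chip fetches them itself, so CPU-to-device Fast Writes never come into play.

    /* Generic illustration only: a command ring buffer placed in AGP system
       memory. The CPU appends commands here; the accelerator later reads
       (DMAs) them on its own, so no CPU-to-device Fast Writes occur. */
    #include <stdint.h>

    #define RING_SIZE 4096

    struct cmd_ring {
        volatile uint32_t *base;  /* buffer allocated in AGP system memory */
        uint32_t head;            /* CPU write offset, in 32-bit words */
    };

    static void emit(struct cmd_ring *r, uint32_t cmd)
    {
        r->base[r->head % RING_SIZE] = cmd;  /* plain write to system memory */
        r->head++;
        /* the driver later tells the chip the new head pointer, and the chip
           fetches the commands across the AGP bus itself */
    }

    /* Fast Writes would only matter if the CPU instead wrote commands or data
       directly into a device-mapped FIFO, e.g. *device_fifo = cmd; */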
What about support for the much touted, but little used, Environment Mapped Bump Mapping that Matrox is pushing?
We have seen very little developer interest in any of the bump mapping techniques that are supported in DirectX 6. Quite frankly, we thought there would be more interest, but developers don’t seem to be embracing it. We believe that the difficulty of implementing bump-mapping properly in an application is likely the reason why there are so few titles which take advantage of the capability, in addition to a host of visual artifact problems which all the currently-announced bump-mapping techniques exhibit.
A number of people have claimed that the T-buffer is little more than the accumulation buffer that OpenGL has had for quite some time. What makes 3dfx's solution different and/or better?
Well, first off you have to realize that much of the 3D technology available for consumers today was first productized in very expensive workstation products in the past. That being said, the T-Buffer is a much more cost effective way to implement many of the effects which previously required a very costly Accumulation Buffer, which was only found in high end professional graphics solutions (note that even though OpenGL has supported the Accumulation Buffer for some time, no mainstream PC graphics hardware has ever had hardware support for it). The T-Buffer is much more efficient than an Accumulation Buffer, however, both in terms of the amount of memory and the amount of bandwidth required, so it lends itself much more readily to consumer applications. But just because something was introduced in the workstation market first doesn’t diminish in any way how exciting that technology can be for consumers who are now able to experience it for the first time.
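For readers who have not seen the workstation-era technique Sellers refers to, here is a minimal sketch of how full-scene anti-aliasing has traditionally been done with the OpenGL Accumulation Buffer: render the scene several times with sub-pixel jitter and average the passes. The jitter table and draw_scene_jittered routine are placeholders of our own, and this illustrates the classic approach the T-Buffer is compared against, not 3dfx's hardware implementation.

    /* Classic accumulation-buffer full-scene anti-aliasing (sketch).
       Requires a rendering context created with an accumulation buffer. */
    #include <GL/gl.h>

    #define SAMPLES 4

    /* example sub-pixel jitter offsets, in fractions of a pixel */
    static const float jitter[SAMPLES][2] = {
        {0.25f, 0.25f}, {0.75f, 0.25f}, {0.25f, 0.75f}, {0.75f, 0.75f}
    };

    /* placeholder: renders the scene with the projection shifted by (dx, dy) */
    void draw_scene_jittered(float dx, float dy);

    void render_antialiased_frame(void)
    {
        int i;
        glClear(GL_ACCUM_BUFFER_BIT);
        for (i = 0; i < SAMPLES; i++) {
            glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
            draw_scene_jittered(jitter[i][0], jitter[i][1]);
            glAccum(GL_ACCUM, 1.0f / SAMPLES);  /* add this pass, weighted 1/N */
        }
        glAccum(GL_RETURN, 1.0f);  /* write the averaged result to the color buffer */
    }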
The same point can be said about nvidia’s claims of "changing the world" with their hardware transform and lighting capabilities – hardware transform and lighting has been done in professional workstations for years so the capability of doing it in hardware is certainly not revolutionary by any stretch. Nvidia has simply migrated the technology into the consumer space.
The T-Buffer effects all seem to require massive fill rate, especially full scene anti-aliasing. Meanwhile, 3dfx still claims that 1024x768x32 at 60fps is the minimum acceptable for a next-gen card for real gamers. Will frame rates dip below the magic 60fps at high resolutions such as 1024x768?
Well, as I think most people recognize, 3dfx has always been the king of fill-rate. There is no doubt that utilizing the novel T-Buffer capabilities requires additional fill-rate. As we said during the T-Buffer technology disclosure, enabling the T-Buffer capabilities will lower overall fill-rate. However, we believe we have sufficient fill-rate in all products which include the T-Buffer technology such that, even with the T-Buffer enabled, games will be able to sustain 60 fps at 1024x768 resolution. Of course, the fill-rate requirement is very much dependent on the game itself, so while we can’t guarantee that every single game will be able to achieve this level of performance, we certainly think that the great majority will be able to.
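A rough calculation (ours, using assumed workloads) shows what that target implies: 1024x768 is about 786,000 pixels, so 60 fps corresponds to roughly 47 megapixels per second per rendered pass. With a typical depth complexity of around 3 and four T-Buffer sub-samples for full-scene anti-aliasing, that becomes roughly 47 x 3 x 4, or about 570 megapixels per second of sustained fill, well beyond the announced fill rate of any shipping part.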
Since the announcement of the T-Buffer earlier this summer, 3dfx has been careful to point out that the "Voodoo 4" will support a number of other new features. What can you tell us about these features?
Sorry, mum’s the word until the product announcement. We’re very excited to tell the world though when the time is right!
What makes 3dfx's FXT1 better than S3TC/DXTC?
There are numerous ways FXT1 is superior to S3TC. From a technology perspective, FXT1 utilizes four separate compression algorithms that result in substantially better overall image quality than S3TC. Also, S3TC does not support a 4 bits per texel compression format for 32-bit textures which use the alpha channel for translucency, a capability supported in the FXT1 compression format. From a development perspective, S3TC is not an Open Source technology and is thus restricted by S3 licensing terms and conditions, thereby limiting its use as a cross-platform compression standard. With a cross-platform texture compression technology like FXT1, a developer can use the same high quality artwork and high resolution textures for DirectX as well as for other platforms like the Mac and Linux; the texture compression technology transfers, too.
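As a concrete illustration of what 4 bits per texel buys (our arithmetic, not from the interview): a 256x256 texture stored at 32 bits per texel occupies 256KB, while the same texture compressed at 4 bits per texel takes 32KB, an 8:1 reduction, and in FXT1's case that rate can still carry an alpha channel for translucency.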
Will 3dfx products be able to support S3TC/DXTC?
3dfx is committed to offering the best DirectX 3D accelerators, so all future 3dfx products will support both the FXT1 and S3TC texture compression standards. However, we believe that, long-term, the freedom of using FXT1 under the Open Source model will result in substantially broader acceptance by the development community.
Can a card that supports S3TC/DXTC take advantage of FXT1 through a driver update or other method?
FXT1, while similar to S3TC, is a different format. So, some FXT1-compressed textures could, with software tricks and some slowdown, be modified to be decompressed on S3TC decompression hardware. However, it would be very inefficient, so we don't expect this to be done. As we offer all the tools and source code to all hardware vendors, we expect they will offer real FXT1 decompression in their future products.
Will FXT1 be incorporated into DirectX or OpenGL?
We would love to see that happen, but Microsoft has made it clear that S3TC is the texture compression technology in DirectX. This is not a big setback, because they were very forward-thinking when they developed the API: it has extensions, so it is easy to add support for new technology. The same is true of OpenGL. S3TC is only available on a single API on a single platform. Not only will FXT1 be supported in DirectX on the Windows platform, it will also be supported in OpenGL, Glide, and whatever other 3D APIs developers are interested in using across the Windows, Linux, Macintosh and BeOS environments. As a result, developers utilizing FXT1 compression are not limited to only the DirectX API or only the Windows platform.
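As a small illustration of what exposing a new format through API extensions looks like in practice, the sketch below checks the OpenGL extension string at runtime and chooses a compression path. The extension names are the identifiers registered for FXT1 and S3TC; the fallback logic is our own example, not anything 3dfx ships.

    /* Sketch: select a texture compression path from the OpenGL extension string. */
    #include <GL/gl.h>
    #include <string.h>

    static int has_extension(const char *name)
    {
        const char *ext = (const char *)glGetString(GL_EXTENSIONS);
        return ext != NULL && strstr(ext, name) != NULL;
    }

    void pick_texture_compression(void)
    {
        if (has_extension("GL_3DFX_texture_compression_FXT1")) {
            /* upload FXT1-compressed textures */
        } else if (has_extension("GL_EXT_texture_compression_s3tc")) {
            /* fall back to S3TC/DXTC-compressed textures */
        } else {
            /* fall back to uncompressed textures */
        }
    }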
It's great to see that 3dfx has followed the open source model with FXT1. Are there any plans to officially support the other platforms mentioned (BeOS, Linux, etc.) with 3dfx written drivers? Does 3dfx plan to make anything else open source?
We have already begun the process of supporting non-Windows APIs. We have already released Macintosh and Linux drivers to developers and have been getting great feedback. GLIDE is also a cross-platform technology. We also like the open source concept: it opens the technology to the creativity of people outside the company and can serve to really accelerate the advancement of technology.
It seems that OpenGL and Direct3D have garnered sufficient enhancements and support that there's no reason to go with Glide any more. What do you see as the future of the Glide API?
We try not to get "religious" about 3D APIs. Basically, our philosophy is simply to support whatever developers are asking for. That is the reason that we’re the only company which offers support for the 3 most popular APIs – no other hardware vendor can make this claim. So, as long as developers are asking for Glide, we’ll continue to deliver it to them. That being said, however, there is no doubt that both OpenGL and Direct3D have made tremendous strides in the last several years in terms of improving capabilities for gaming. This is a very good thing for us and for the industry overall, as it has raised the performance and functionality bar substantially for everyone.
Can and will Glide be extended to fully support the features of the "Voodoo 4"?
Since Glide is 3dfx proprietary, we can do anything we want with it regarding adding new features for new pieces of hardware. I can’t comment on Glide support for unannounced features, however.
Any chance we'll ever see 3dfx turn Glide into an open API?
We are currently looking at GLIDE to determine how it will serve 3dfx and developers best in the future.
3dfx recently released the Voodoo 3 3500TV that offered a complete video capture and TV/FM tuner solution on a single board, not to mention one of the fastest 3D cards on the market. Has the 3500TV sold well and does 3dfx plan to do it again with the "Voodoo 4"?
The 3500TV is selling very, very well for us right now, and we are very excited about the product. What the 3500TV has proved to us is that there is definitely a strong market which cares about both 3D acceleration and excellent video capabilities. We will undoubtedly be continuing the line of fully featured combined 3D/video products moving forward with our future products.
3dfx is one of the first big 3D graphics makers with drivers (albeit beta) for the MacOS in addition to announced support for the upcoming G4 Mac's. Do you see the Mac becoming a viable gaming platform in the near future?
We have been very impressed by the excitement in the Mac community about 3dfx offering our technology for that market. And, we sense a continued momentum building amongst game developers in supporting the Mac as a gaming platform. So, we are excited about dominating the Mac gaming market just as we have done so successfully on the PC, and about helping to make the Mac a premiere gaming platform.
Conclusion
So after all that, what do we know? We tried our best to get some additional information on their next generation product, but 3dfx is being very secretive on this. However, we can infer a fair amount about such a product from the information above.
3dfx is still saying speed is king, but has finally come around to supporting more advanced rendering features like 32-bit rendering, larger texture sizes, and the T-Buffer effects. We also know that a sigh of relief was undoubtedly breathed at 3dfx when NVIDIA announced the GeForce 256 and its meager 480 MP/s fill rate. It looks like 3dfx's next generation product is going to be quite solid, with an immense fill rate that should make it the fastest for current games. They expect 1024x768x32 to be the resolution gamers will be playing at, with 60fps still considered the minimum acceptable frame rate.
It's also great to see 3dfx expanding their support to include alternative OSes like BeOS, Linux, MacOS, etc. Hopefully this trend will be picked up by other manufacturers. Otherwise, 3dfx may end up dominating those small, but important, segments.
Of course, they could get into trouble if polygon rates in games rise as quickly as NVIDIA would like. Either way, it will definitely be an interesting holiday season, with consumers being the big winners.
Special thanks go out to Scott Sellers at 3dfx for taking the time to answer all our questions.