Original Link: https://www.anandtech.com/show/931



Tomorrow AMD will release their Athlon MP 2100+ processor; we've already reviewed the desktop variant of the chip, so how can we possibly get excited about a processor we've had on the desktop for months? Before you get too confused, this sort of situation happens quite often in a number of the markets that we cover at AnandTech. Every now and then there are relatively small but important announcements that don't warrant a full 30 hours of testing and 10,000 words of editorial, but are too significant to go unmentioned.

Once in a blue moon, a handful of those relatively small but important announcements happen to fall within days of one another and we're left in the pleasant situation of being able to lump a bunch of little things into one decent sized article. It's all about efficiency: instead of three separate news stories, you get one article that encompasses everything that's happening in a particular market. In this case we're dealing with the graphics market, and the players involved are ATI, NVIDIA and Matrox.



ATI Launches CATALYST

Last Thursday ATI released their new line of drivers under the CATALYST brand. CATALYST is essentially an umbrella name for ATI's entire software suite that ships with their graphics cards, including the display drivers, Multimedia Center, HydraVision and Remote Wonder software. There will be between 8 and 10 updates per year to the display driver portion of CATALYST, and you will be able to download just the updated part of the suite whenever an update is available. ATI is hoping to have an auto-update utility built into CATALYST before the end of the year (à la Windows Update).

In terms of actual features and capabilities built into the first CATALYST release, which is now available on the web, there are a handful of things to talk about. In an effort to reduce tech support calls, ATI has introduced SMARTGART as a part of the CATALYST display drivers. During any install/upgrade of your display drivers the SMARTGART tool will run a number of tests transparently to the user; based on the results of these tests the utility will then adjust your system's AGP operation mode (e.g. 1X, 2X, 4X, Fast Writes, SBA, etc…). This is mainly a way of dealing with poor AGP controller implementations, especially on legacy chipsets (remember the problems with the ALi Aladdin V?). ATI insists that SMARTGART will be user-overridable and is mainly for the non-techie end user that may run into problems with their system. ATI is also implementing this sort of functionality today with the hope of sorting out many compatibility issues before Microsoft's launch of their Longhorn (3D UI) OS about two years from now.

The new CATALYST software release also bundles HydraVision 3.1, which now has an nView-like installation wizard. All known problems since the February release of the software have been addressed; there is also support for quad-head setups using any ATI cards (e.g. two Radeon VEs) in the new version of the software.

The Multimedia Center gets a new skin, but there's no new functionality built into the first CATALYST release. Remote Wonder 1.2 is bundled with predefined plug-ins for both Winamp and PowerPoint that customize the remote for use in those two applications. ATI was supposed to be posting the Remote Wonder development kit online so that end users could write their own plug-ins for any software package, but after a bit of scavenging on ATI's site we couldn't find it.

Finally, with the CATALYST drivers you also get a new set of OpenGL/Direct3D tools that let you control all of your card's features (e.g. Anti-Aliasing, Anisotropic Filtering, Texture Details, etc…). ATI has truly raised the bar in terms of driver configurability, and now we turn to NVIDIA for an adequate response.



The Most Exciting Part of CATALYST

You can download the CATALYST drivers for yourself and evaluate their performance, which is slightly higher than that of the previously available releases. But that's not what interested us the most about ATI's latest driver release; what got us drooling was that with the launch of CATALYST, ATI finally delivered on a promise they made months ago: a DVI-to-Component Video Adapter is finally ready and supported in the new CATALYST drivers.

HDTV owners have been waiting for such an adapter from ATI ever since the company originally told us they'd be making one last November. The excitement surrounding a component video adapter is huge for a number of reasons:

- Several HDTVs line-double their non-component inputs, assuming that anything coming in over a Composite or S-Video connection is an interlaced signal. The end result is that your non-interlaced/progressive PC video gets interlaced and then line-doubled, which ends up looking horrible. Without a way of disabling this behavior, TV output on PC graphics cards is useless on a HDTV that line-doubles all non-component inputs.

- You can't get HD resolutions (480p, 720p, 1080i) over non-component inputs, which completely defeats the purpose of having a HDTV. Once again you're left with a TV that is capable of displaying great pictures but is forced to display horribly blurry text.

- Converting DVI/VGA to Component (Y/Pb/Pr) is very simple; all you need is a device that reprograms the RAMDACs to output Y/Pb/Pr signals instead of R/G/B, and you've got a very cheap-to-manufacture component output dongle (see the sketch below for the color-space math involved). Most VGA-to-Component transcoders that are available for sale are priced at over $100.
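To give a sense of why this is so cheap to do in the RAMDACs rather than in an external transcoder box, here is the color-space math involved, written out as a minimal C sketch. The coefficients are the standard ITU-R BT.601 ones; which matrix ATI actually programs into the Radeon's DACs is an assumption on our part, not something ATI has published.

/* RGB -> YPbPr conversion using the standard BT.601 luma weights.
   Inputs are assumed to be normalized to the [0,1] range; Pb and Pr
   come out in [-0.5, 0.5]. Whether ATI uses these exact coefficients
   for every HD mode is our assumption, not a documented fact. */
typedef struct { float y, pb, pr; } ypbpr_t;

ypbpr_t rgb_to_ypbpr(float r, float g, float b)
{
    ypbpr_t out;
    out.y  = 0.299f * r + 0.587f * g + 0.114f * b;  /* luma            */
    out.pb = (b - out.y) / 1.772f;                   /* blue color diff */
    out.pr = (r - out.y) / 1.402f;                   /* red color diff  */
    return out;
}

In other words, the component wires carry nothing more than a different weighted mix of the same R, G and B values the DACs were already producing, which is why a dongle plus a driver change is all it takes.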

The dongles are finally here and they are going to be made available in two versions - a DVI to Component adapter and a VGA to Component adapter.



The DVI adapter will only work on the All-in-Wonder Radeon 8500 and 8500DV cards, while the VGA adapter will work on the Radeon 8500 (64/128MB) and the Radeon 8500LE. While the adapters could theoretically work on any Radeon cards, according to ATI there are other factors that limit support to the 8500 line for now.

The two adapters are the same in terms of functionality; they only differ in size and one side of the connector (VGA/DVI).

A row of DIP switches on the face of the connector controls the HD mode the adapter will output in. The current list of supported modes is 480i, 480p, 540p, 720p and 1080i, with a separate toggle for 16:9 aspect ratios. Currently 540p does not work; however, ATI expects to have that cleared up in a future software release.



The adapters only work under Windows (you won't get any POST screens, etc…), and in order to get your system set up for use with them you need to perform the initial setup with a conventional monitor attached (CRT/LCD). After you've installed the CATALYST drivers and properly set your resolution/refresh rate (this is critical for making sure your system works, e.g. 640x480-60Hz or 720x480-60Hz for 480p, or 1920x1080-30Hz for 1080i), you power down your system and turn it back on with the component adapter installed.
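For those who would rather script that mode change than dig through display properties every time, the switch can also be done programmatically through the stock Win32 display APIs. Below is a minimal sketch for the 720x480 @ 60Hz (480p) case mentioned above; treat it as an illustration of the general approach rather than anything ATI supplies.

#include <windows.h>

/* Switch the primary display to 720x480 @ 60Hz (the 480p timing noted
   above). Returns 0 on success, -1 if the driver rejects the mode. */
int set_480p_mode(void)
{
    DEVMODE dm;
    ZeroMemory(&dm, sizeof(dm));
    dm.dmSize             = sizeof(dm);
    dm.dmPelsWidth        = 720;
    dm.dmPelsHeight       = 480;
    dm.dmDisplayFrequency = 60;
    dm.dmFields = DM_PELSWIDTH | DM_PELSHEIGHT | DM_DISPLAYFREQUENCY;

    /* Ask the driver to validate the mode before actually switching. */
    if (ChangeDisplaySettings(&dm, CDS_TEST) != DISP_CHANGE_SUCCESSFUL)
        return -1;
    return (ChangeDisplaySettings(&dm, 0) == DISP_CHANGE_SUCCESSFUL) ? 0 : -1;
}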

This being ATI's first attempt at such an adapter, the implementation is understandably rough. The biggest complaint we had was a considerable amount of overscan, with portions of the desktop being displayed outside the viewable area of our TV; ATI is aware of the problem and will offer a software fix later on.

We would also like to see ATI offer a HDTV control panel that lets you select from the standard HDTV resolutions and have that automatically configure the resolution/refresh rate of your display. This would be most helpful for users who aren't familiar with the actual resolutions that 480p or 720p refer to.
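Such a control panel wouldn't need to be anything elaborate; at its core it boils down to a lookup table from the HDTV mode names to the desktop timings they imply, as in the hypothetical C sketch below. The 480p and 1080i timings are the ones mentioned above; the 720p entry assumes the usual 1280x720 @ 60Hz.

/* Hypothetical mapping of HDTV mode names to desktop timings; the
   control panel would hand the selected entry to the display driver. */
struct hdtv_mode {
    const char *name;
    int width, height;
    int refresh;                  /* Hz; 30 for 1080i = 60 interlaced fields */
};

static const struct hdtv_mode hdtv_modes[] = {
    { "480p",   720,  480, 60 },
    { "720p",  1280,  720, 60 },  /* assumed standard 720p timing */
    { "1080i", 1920, 1080, 30 },
};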


Windows XP in 480p - Note that the Start Menu & Recycle Bin icon appear off the screen

Then there's the issue of games supporting HDTV resolutions; basically, most games don't. We found ourselves running at 640x480 in 480p mode in the vast majority of games we tested. Our test TV (a Toshiba TheaterWide HD 65H80) upconverts all 720p signals to 1080i and unfortunately would not work when we tried to force a 720p resolution, and most games fail to offer 1920 x 1080 as a selectable resolution.


TV output done right - nothing matches the quality of component outputs.

At 640 x 480 with 4X AA and 16X Anisotropic filtering enabled, the picture quality of the 480p output was quite impressive; we were looking at the best PC output we had ever seen on a TV. Since the adapters do nothing more than reprogram the RAMDACs, the output quality is governed by the Radeon's RAMDACs and not an external transcoder.

You can order the adapters from ATI's website for $29.00 USD.



NVIDIA does Cg

Before we get into the next topic of discussion let's have a quick lesson in the benefits of a high level programming language vs. hardware-centric assembly code. Remember that assembly language is the human-readable form of the machine code a particular processor actually executes; whether it is a GeForce4 GPU or a Pentium 4, each processor operates on its own architecture-specific instructions. When you compile a program in a high level programming language like C++, the compiler is merely translating the code that you wrote in the C++ language into assembly, which is then fed to the processor in binary form. Before high level programming languages became prevalent on the PC, almost all coding was done by hand in assembly.

In order to illustrate how much more tedious writing in assembly can be, let's take a simple operation such as adding two integers together and storing the result in a location in memory. In a high level programming language (e.g. C, C++, Java, etc…) the process goes like this:

int result = 2 + 2;

The syntax obviously varies from one language to the next, but in that one line we defined an integer variable, reserved storage for it in memory and gave it the value of 2 + 2. Now let's do the same in assembly; again, this is a very general example and is not specific to any particular architecture:

ADD 2,2,R1
STORE R1,RESULT
RESULT: x133B

Once again, the syntax will vary from one architecture to the next but the basic idea remains the same. The first line adds the two numbers and stores the result in register R1. The second line stores the contents of R1 at the memory address pointed to by the label RESULT. The third line tells the assembler to point the label RESULT at the appropriate memory address. Which one looks simpler to you?

Keep in mind that we're dealing with a relatively simple example here; once you start dealing with branches, loops and especially more complicated forms of memory addressing and allocation, assembly quickly becomes tedious.
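As a small illustration (our own, not taken from any particular toolchain): even the trivial loop below already implies an explicit counter register, a compare, a conditional branch back to the top of the loop, and address arithmetic for the array access on every iteration once you write it out by hand in assembly.

/* Summing an array: one line of loop logic in C, but hand-written
   assembly needs a counter register, a compare, a conditional branch
   and explicit address arithmetic for values[i] on every pass. */
int sum(const int *values, int count)
{
    int i, total = 0;
    for (i = 0; i < count; i++)
        total += values[i];
    return total;
}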

You also have to be relatively familiar with the particular architecture you're coding for when using assembly as the opcodes and instruction formats do vary from one architecture to the next. This is both a pro and a con since it gives the programmer the opportunity to highly optimize their code for execution on a particular architecture but at the same time it makes their code virtually useless on any other platform.

Although NVIDIA always talked about how easy implementing DX8 pixel and vertex shader programs would be, they didn't really play up the fact that all of the coding was still done by hand in assembly. In order for more developers to actually take advantage of the shader capabilities of their next-generation GPUs, NVIDIA would have to offer a higher level language for them to write code in. A good compiler can generate code very close in performance to hand written assembly (and sometimes even faster); even more importantly, a compiler can target multiple architectures and platforms, making code reuse much more programmer-friendly.

With all of that said, it wasn't a surprise that NVIDIA launched their high level programming language 'Cg' a few days ago. The name stands for 'C for graphics', and the language itself is very C-like, but with an obvious skew towards writing shader programs. The syntax of the language is nearly identical to Microsoft's own high level graphics programming language called D3DX, but the main difference between the two efforts is in NVIDIA's compiler development.

Here's an example of the reduction in code when going from raw assembly to Cg for a Phong shader program:

Assembly Code for a Phong Shader

...
RSQR R0.x, R0.x;
MULR R0.xyz, R0.xxxx, R4.xyzz;
MOVR R5.xyz, -R0.xyzz;
MOVR R3.xyz, -R3.xyzz;
DP3R R3.x, R0.xyzz, R3.xyzz;
SLTR R4.x, R3.x, {0.000000}.x;
ADDR R3.x, {1.000000}.x, -R4.x;
MULR R3.xyz, R3.xxxx, R5.xyzz;
MULR R0.xyz, R0.xyzz, R4.xxxx;
ADDR R0.xyz, R0.xyzz, R3.xyzz;
DP3R R1.x, R0.xyzz, R1.xyzz;
MAXR R1.x, {0.000000}.x, R1.x;
LG2R R1.x, R1.x;
MULR R1.x, {10.000000}.x, R1.x;
EX2R R1.x, R1.x;
MOVR R1.xyz, R1.xxxx;
MULR R1.xyz, {0.900000, 0.800000, 1.000000}.xyzz, R1.xyzz;
DP3R R0.x, R0.xyzz, R2.xyzz;
MAXR R0.x, {0.000000}.x, R0.x;
MOVR R0.xyz, R0.xxxx;
ADDR R0.xyz, {0.100000, 0.100000, 0.100000}.xyzz, R0.xyzz;
MULR R0.xyz, {1.000000, 0.800000, 0.800000}.xyzz, R0.xyzz;
ADDR R1.xyz, R0.xyzz, R1.xyzz;
...

Cg Code for the Same Phong Shader

...
COLOR cSpec = pow(max(0, dot(Nf, H)), phongExp).xxx;
COLOR cPlastic = Cd * (cAmbi + cDiff) + Cs * cSpec;
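For readers wondering what those two Cg lines (and the pile of assembly above them) actually compute, the underlying Phong lighting model is simple enough to write out in plain C. The sketch below is our own, with our own variable names, and is not NVIDIA's sample code; it just mirrors the ambient + diffuse + specular sum the shader evaluates per pixel.

#include <math.h>

/* Per-pixel Phong lighting: ambient + diffuse + specular.
   n = surface normal, l = direction to the light, h = half vector
   between the light and view directions; all assumed normalized. */
float phong(const float n[3], const float l[3], const float h[3],
            float ambient, float diffuse, float specular, float shininess)
{
    float n_dot_l = n[0]*l[0] + n[1]*l[1] + n[2]*l[2];
    float n_dot_h = n[0]*h[0] + n[1]*h[1] + n[2]*h[2];

    float diff = (n_dot_l > 0.0f) ? n_dot_l : 0.0f;        /* max(0, N.L) */
    float spec = powf((n_dot_h > 0.0f) ? n_dot_h : 0.0f,   /* max(0, N.H) */
                      shininess);

    return ambient + diffuse * diff + specular * spec;
}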

Microsoft's compiler will obviously only compile for Direct3D while NVIDIA's will be able to compile for both Direct3D and OpenGL. NVIDIA will also open-source the majority of the compiler, with the exception of the backend that contains hardware-specific optimizations for their GPUs. NVIDIA's compiler will receive regular updates (at least once for every major GPU release) and it will support all currently available competitor GPUs; however, NVIDIA will be encouraging their competitors to develop their own compilers for Cg.

Currently Cg is offered in a beta state, but it will go gold in the fall alongside the release of NV30. NVIDIA claims that although the current version isn't faster than raw assembly code, the fall release will be much better optimized and faster than hand coded assembly.



Parhelia - It's 1 Week Away

We were starting to get worried there; when we first talked to Matrox about Parhelia they mentioned that they would try to have boards in our hands before the end of May. We didn't hear anything for a while and just yesterday received an update, and today we got our card.

Our performance analysis of the card will be published on June 25th, exactly one week from today. The reason Matrox set an NDA date of the 25th is so that when reviews are published, you should actually be able to purchase a card. Parhelia boards will be available sometime next week according to Matrox.

The hardware we were sent is final hardware running RC1 drivers; we expect final drivers to be in our hands by next week. Here are the final specifications of the hardware:

Matrox Parhelia (128MB Retail Version)

- 220MHz core clock
- 275MHz DDR memory clock (17.6GB/s of memory bandwidth)
- $399 Estimated Street Price

Matrox Parhelia (128MB OEM Version)

- 200MHz core clock
- 250MHz DDR memory clock (16GB/s of memory bandwidth)
- ~$300 Estimated Street Price?
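For reference, both bandwidth figures follow directly from Parhelia's 256-bit DDR memory interface: 275MHz x 2 (DDR) x 32 bytes per transfer works out to 17.6GB/s for the retail card, and 250MHz x 2 x 32 bytes comes to 16GB/s for the OEM part.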

The first thing you'll notice is that the retail Parhelia has a fairly low core clock. This means that unless you're really stressing its memory bandwidth or texture unit advantages, the Parhelia's performance relative to the GeForce4 will largely track the clock speed difference between the two chips (since they both have four rendering pipelines).

The core clock is a bit lower than what Matrox expected, with 220MHz being even lower than our original 250MHz estimate. You have to remember that this is a very large chip and thus clock speeds will be limited. The price is a tad lower than originally anticipated, as is the memory clock (presumably to keep the price under $400).

The OEM version will differ from the retail version in clock speed; unfortunately, at this time it won't be branded any differently. We have already talked to Matrox and recommended that they add a suffix onto the name of the OEM cards to avoid confusion.

There's much more to talk about but we'll save that for next Tuesday; until then, be sure to beef up your knowledge of Parhelia by reading our overview of the technology.
