Original Link: https://www.anandtech.com/show/2726



Earlier this week we posted the first article in a series of articles on multiGPU performance, scaling and value. The first article focused on two GPU configurations in both single card and dual card flavors. This is the next installment and today we will cover 3-way performance, scaling and value in much the same way as our first article.

The way we will look at scaling and value is mostly unchanged, with a slight exception in the value department. While we will still be ranking solutions by FPS / $100US (how much performance do you get for every 100 USD spent), we are also taking into account another value factor. As was suggested in our comments on the original article, we are zeroing out the value of solutions that don't provide playable framerates. We give ourselves a wide berth and put the cutoff at 25 fps as some people do get by with lower framerates. For instances where a configuration comes in at less than 25 fps, we assign a value of zero. Changing the way we look at value should help us get a better picture of how both absolute performance and performance per dollar play into the value of a given setup.
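As a rough sketch of the value metric described above, the calculation might look like the following. The function name, parameter names and the example figures are hypothetical and are not taken from our results; only the FPS per 100 USD ratio and the 25 fps cutoff come from the text.

```python
def value_per_100_usd(avg_fps, price_usd, min_playable_fps=25.0):
    """Return FPS per 100 USD, or 0.0 when the framerate is below the cutoff."""
    if price_usd <= 0:
        raise ValueError("price_usd must be positive")
    if avg_fps < min_playable_fps:
        # Unplayable configurations get no value, however cheap they are.
        return 0.0
    return avg_fps * 100.0 / price_usd


# Hypothetical numbers for illustration only.
print(value_per_100_usd(60.0, 300.0))  # 20.0
print(value_per_100_usd(20.0, 150.0))  # 0.0
```

Zeroing the value rather than dropping the configuration keeps every solution on the chart while making it obvious which ones fail to deliver a playable experience.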

While scaling is calculated the same, we are looking at two different metrics. Rather than look again at 1 to 2 GPU scaling, we are looking at performance scaling from 1 to 3 GPUs and from 2 to 3 GPUs. There will be one more set of bar graphs on every page this time, but we hope to give a well rounded picture of the performance improvement with three cards. Unlike the move from 1 to 2 cards, we aren't looking at a theoretical max of 2x performance in non-CPU limited situations. With the increase from 1 to 3 cards, we could see as much as 200% performance improvement (3x the performance) in theory. We don't get anywhere near this in practice though.

Moving from 2 to 3 cards, the maximum performance improvement we would expect to see with perfect scaling and no CPU or system limitation is 50%. While we might see good scaling from 1 to 3 cards, moving from 2 to 3 cards might show a much less significant improvement. Looking at both metrics will help us get a feel for scaling in general and scaling/value of 3-way as compared to 2-way multiGPU solutions.
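The scaling arithmetic above can be sketched in a few lines. This is a minimal illustration that assumes scaling is the percent improvement in average framerate; the function names and the example framerates are hypothetical.

```python
def scaling_percent(fps_before, fps_after):
    """Percent improvement in framerate when moving to more GPUs."""
    if fps_before <= 0:
        raise ValueError("fps_before must be positive")
    return (fps_after - fps_before) * 100.0 / fps_before


def theoretical_max_percent(gpus_before, gpus_after):
    """Best case improvement with perfect scaling and no CPU or system limit."""
    if gpus_before <= 0 or gpus_after <= gpus_before:
        raise ValueError("expected 0 < gpus_before < gpus_after")
    return (gpus_after - gpus_before) * 100.0 / gpus_before


def scaling_efficiency(fps_before, fps_after, gpus_before, gpus_after):
    """Fraction of the theoretical maximum that was actually achieved."""
    achieved = scaling_percent(fps_before, fps_after)
    return achieved / theoretical_max_percent(gpus_before, gpus_after)


print(theoretical_max_percent(1, 3))  # 200.0
print(theoretical_max_percent(2, 3))  # 50.0

# Hypothetical example: 40 fps on two GPUs, 58 fps on three.
print(scaling_percent(40.0, 58.0))           # 45.0
print(scaling_efficiency(40.0, 58.0, 2, 3))  # 0.9
```

Dividing by the theoretical maximum makes the 1 to 3 and 2 to 3 figures comparable, since a 45% gain means something very different against a 200% ceiling than against a 50% one.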

For color coding, we find that more than 4 colors in a bar graph can get distracting, so we tried to strike a balance in color use and readability by coloring all the configurations we already looked at in the first installment blue. 3-way AMD solutions are orange and 3-way NVIDIA solutions are green. Representing this much data in a clear fashion is always a balance. Hopefully this does a good job of getting things across.

As with last time, we'll look at how often games scale with 3 cards. This will be based on scaling from one to three as well as from two to three, and we will see more diminishing returns on 3 cards than on two. This is to be expected, but theoretically those who spring for three cards are not interested in thrift anyway. Our value graphs will tie together the performance scaling and price data. What we expect to see is that, even more than 2-way solutions, 3-way multiGPU options require a much higher premium for the performance they deliver and are only really viable options for owners of 2560x1600 monitors.



Who Scales

As we mentioned, we'll be looking at scaling from both 1 to 3 and 2 to 3 GPUs. This gives us a few more metrics than last time to look at both overall and on a per game basis. From the overall standpoint, we'll first look at scaling from 1 to 3 GPUs. We'll again define general success as >33.3% scaling and complete failure as <5% scaling. This will give us information on how many titles seem to be only CPU limited and how many are of zero or negative value as compared to a single card.

Before we get to the numbers, it is important to note that all of this data is out of 21 tests for AMD cards (like the previous article) but out of 20 for NVIDIA hardware. We had an issue with FRAPS running at 2560x1600 with 3-way NVIDIA solutions. We do want to be clear that the game ran fine, and this seems to be a high res high memory usage issue in Race Driver GRID when FRAPS is combined with 3-way and higher SLI. Let's make it clear that this isn't an issue with the game or the hardware per se, but an issue in combination with FRAPS. 3-way SLI runs really well on Race Driver GRID: we just can't tell you how well unless and until this issue is resolved. We leave the game in this article because there is AMD data at 2560x1600 and because there's still usable data at 1680x1050.

First up in our look at who scales is general success (>33.3% scaling) in moving from 1 to 3 GPUs:

NVIDIA GeForce GTX 285    16
NVIDIA GeForce GTX 280    18
NVIDIA GeForce GTX 260    19
NVIDIA GeForce 9800 GTX+    19
ATI Radeon HD 4870 1GB    18
ATI Radeon HD 4870 512MB    16
ATI Radeon HD 4850    18

The cards that see the least success in moving from a single GPU to 3-way are the 4870 512MB and the GTX 285. With the 4870 512MB, this is a combination of failures and CPU/system limited situations while the GTX 285 is purely CPU/system limited here. 16 out of 21 tests isn't hugely different than the 18 or 19 out of 21 (20 for NVIDIA cards), but we do see less "success" in general as compared to our two card situation. And keep in mind that this is 33.3% out of a possible performance improvement of 200%. We are being less strict and seeing less success.
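Because the NVIDIA counts are out of 20 tests and the AMD counts are out of 21, it can help to normalize them before comparing. The short sketch below does that with the 1 to 3 GPU success counts listed above; the dictionary and function names are just illustrative.

```python
# (successes, tests run) per card, from the 1 to 3 GPU success counts above.
# NVIDIA cards have 20 tests because of the FRAPS issue in Race Driver GRID.
SUCCESSES_1_TO_3 = {
    "NVIDIA GeForce GTX 285": (16, 20),
    "NVIDIA GeForce GTX 280": (18, 20),
    "NVIDIA GeForce GTX 260": (19, 20),
    "NVIDIA GeForce 9800 GTX+": (19, 20),
    "ATI Radeon HD 4870 1GB": (18, 21),
    "ATI Radeon HD 4870 512MB": (16, 21),
    "ATI Radeon HD 4850": (18, 21),
}


def success_rate_percent(successes, tests):
    """Share of tests that met the success threshold, as a percentage."""
    return successes * 100.0 / tests


for card, (successes, tests) in SUCCESSES_1_TO_3.items():
    rate = success_rate_percent(successes, tests)
    print(f"{card}: {successes}/{tests} = {rate:.1f}%")
```

The same helper works for the failure counts and for the 2 to 3 GPU counts, as long as the right test total is used for each vendor.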

Now let's look at complete failure of scaling from 1 to 3 GPUs. This is based on scaling of <5% and ends up catching the cases of negative scaling. While we did this on a per game basis for 2-way scaling, this time we are looking at the results out of the total number of tests (out of 21 tests for AMD cards and out of 20 tests for NVIDIA).

NVIDIA GeForce GTX 285    0
NVIDIA GeForce GTX 280    0
NVIDIA GeForce GTX 260    0
NVIDIA GeForce 9800 GTX+    1
ATI Radeon HD 4870 1GB    0
ATI Radeon HD 4870 512MB    2
ATI Radeon HD 4850    3

Again, this is generally failure (negative scaling) at 2560x1600 and is generally an issue we can attribute to 512MB of RAM not being enough at high resolution. There are some differences here as compared to 2-way scaling, but generally this isn't that many cases of abject failure to contend with. We'd still love to see AMD and NVIDIA implement something that caught multiGPU failure and reverted to running on a single card in those cases rather than producing a negative experience. But since we can manually disable both SLI and CrossFire, this isn't a deal breaker (it's just an annoyance).

When we look at scaling up from 2 to 3 cards, things get a bit more dim. Since the maximum scaling percent is 50%, we decided to lower our bar for calling a configuration "successful" by reducing our threshold to 10% (which is very generous at only 1/5th of the theoretical maximum). Our results show much reduced improvement when moving from two to three cards. Here's the data on the number of "successes" (>10% improvement) we saw when going from 2 to 3 cards:

NVIDIA GeForce GTX 285    11
NVIDIA GeForce GTX 280    12
NVIDIA GeForce GTX 260    14
NVIDIA GeForce 9800 GTX+    12
ATI Radeon HD 4870 1GB    13
ATI Radeon HD 4870 512MB    8
ATI Radeon HD 4850    14

With the best success rate coming in at 14 out of 20 with the GeForce GTX 260 3-way SLI, and our threshold for success so low, 3-way isn't looking so great out of the gate. Let's take a look at failure to round that out. We'll consider failure to be <2.5% scaling.

NVIDIA GeForce GTX 285    6
NVIDIA GeForce GTX 280    5
NVIDIA GeForce GTX 260    7
NVIDIA GeForce 9800 GTX+    7
ATI Radeon HD 4870 1GB    5
ATI Radeon HD 4870 512MB    9
ATI Radeon HD 4850    4

These numbers show that the majority of the time that cards don't scale up well from 2 to 3 GPUs, they either make no statistical difference or they degrade performance. This is in contrast to scaling from 1 to 2 GPUs. So despite the fact that 3 GPUs can offer good improvement over 1 GPU, it doesn't seem that 3 GPUs consistently offer good improvement over 2 GPUs.
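The success and failure counts in this section come down to comparing each test's scaling percentage against a pair of thresholds. Here is a minimal sketch of that tally; the thresholds are the ones stated above, while the function names and the sample scaling values are hypothetical.

```python
from collections import Counter

# (success if scaling is above, failure if scaling is below), in percent.
THRESHOLDS = {
    (1, 3): (33.3, 5.0),
    (2, 3): (10.0, 2.5),
}


def classify(scaling_pct, gpus_before, gpus_after):
    """Label one test result as 'success', 'failure' or 'neither'."""
    success_above, failure_below = THRESHOLDS[(gpus_before, gpus_after)]
    if scaling_pct > success_above:
        return "success"
    if scaling_pct < failure_below:
        # Also catches negative scaling.
        return "failure"
    return "neither"


def tally(scalings_pct, gpus_before, gpus_after):
    """Count how many tests land in each category."""
    counts = Counter(classify(s, gpus_before, gpus_after) for s in scalings_pct)
    return {label: counts[label] for label in ("success", "failure", "neither")}


# Hypothetical 2 to 3 GPU scaling results for one card, in percent.
sample = [45.0, 12.0, 6.0, 1.0, -8.0]
print(tally(sample, 2, 3))  # {'success': 2, 'failure': 2, 'neither': 1}
```

Results that land in "neither" are the ones that scale a little but not enough to clear the bar, which is why the success and failure counts for a card don't add up to the total number of tests.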

But this is the high level overview. Let's take a look at each game test to get a better idea of what's going on. First we'll recap prices and the test setup and then we'll get to the analysis.



Prices, Stutter, and The Test

We are using the same prices we listed earlier this week, as not much has changed since then. This price data does include our 3-way solutions and we've put the numbers into a graph rather than a table to help see the relative pricing of these parts more easily.

Cost of Graphics Solution

We often hear users wondering about microstutter when we talk about multiGPU solutions. We do play all these games with all these solutions in addition to running the benchmarks. Our benchmark data is only a way to quantitatively show relative performance; it's not a substitute for experience. When we run into issues that disrupt our experience, we note them and report them. One such issue is microstutter.

Microstutter is what happens when we see a high average framerate but experience frequent choppiness in the form of lower framerates interspersed with much higher framerates. The experience can be hard to notice but at times frustrating. Typically, we don't notice this as a problem with current drivers on modern games with 2-way solutions. Looking at 3-way and 4-way configurations is a different story though.

With lower memory 3-way 3 card solutions, we do notice some microstutter sometimes. Pairing a single card dual GPU AMD card with a single card single GPU option to get 3-way CrossFireX also seems to have a positive impact on microstutter. The worst offenders are the 3 card 4870 512MB and the 3 card 9800 GTX+. The 4850 tends to fail all around rather than stutter when the 4870 512MB runs into problems.

It's very difficult to really collect high quality quantitative data that shows microstutter, as the only way to really get a good idea of what's going on is to analyze raw frame data on a per frame basis (which you can't get with FRAPS). At the same time, while sometimes we can catch a whiff of microstutter, we don't feel it is a significant problem here. Actually, there are other problems (like return on your investment) that do more to get in the way of us recommending a 3-way solution. But we'll talk about all that later.
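To illustrate why an average framerate can hide microstutter, here is a small sketch working from per frame render times. This is not how we collected any data in this article: the frame times are made up, and the mean frame to frame delta is just one possible way to quantify the effect.

```python
from statistics import mean


def average_fps(frame_times_ms):
    """Average framerate over a run, from per frame render times in ms."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def mean_frame_to_frame_delta_ms(frame_times_ms):
    """Mean absolute change in frame time between consecutive frames."""
    if len(frame_times_ms) < 2:
        raise ValueError("need at least two frames")
    deltas = [abs(b - a) for a, b in zip(frame_times_ms, frame_times_ms[1:])]
    return mean(deltas)


# Hypothetical runs: evenly paced frames versus alternating short and long frames.
smooth = [20.0] * 6
stutter = [10.0, 30.0] * 3

print(average_fps(smooth), mean_frame_to_frame_delta_ms(smooth))    # 50.0 0.0
print(average_fps(stutter), mean_frame_to_frame_delta_ms(stutter))  # 50.0 20.0
```

Both runs average 50 fps, but the second alternates between 10 ms and 30 ms frames, which is the kind of uneven pacing that an average framerate alone cannot show.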

Our test system is unchanged from the previous installment except that we've also tested 3-way solutions. For our 3-way 4870 1GB, we used one 4870 X2 and one 4870 1GB card, but for all other 3-way solutions we used 3 cards. Because we could only get ahold of 2 GTX 285 cards, our 3-way GTX 285 setup is done with 3x overclocked GTX 280 cards (which will perform the same).

Test Setup
CPU: Intel Core i7-965 3.2GHz
Motherboard: ASUS Rampage II Extreme X58
Video Cards:
    ATI Radeon HD 4870 X2
    Sapphire ATI Radeon HD 4850 X2 2GB
    ATI Radeon HD 4870 512MB CrossFire
    ATI Radeon HD 4850 CrossFire
    ATI Radeon HD 4870 1GB
    ATI Radeon HD 4870 512MB
    ATI Radeon HD 4850
    NVIDIA GeForce GTX 295
    NVIDIA GeForce GTX 285 SLI
    NVIDIA GeForce GTX 280 SLI
    NVIDIA GeForce GTX 260 SLI
    NVIDIA GeForce 9800 GTX+ SLI
    NVIDIA GeForce 9800 GX2
    NVIDIA GeForce GTX 285
    NVIDIA GeForce GTX 280
    NVIDIA GeForce GTX 260 core 216
    NVIDIA GeForce GTX 260
    NVIDIA GeForce 9800 GTX+
Video Drivers:
    Catalyst 8.12 hotfix
    ForceWare 181.22
Hard Drive: Intel X25-M 80GB SSD
RAM: 6 x 1GB DDR3-1066 7-7-7-20
Operating System: Windows Vista Ultimate 64-bit SP1
PSU: PC Power & Cooling Turbo Cool 1200W



Age of Conan Analysis

The first thing we note here is that AMD leads the benchmark.




1680x1050    1920x1200    2560x1600


Both 3-way SLI and 3-way CrossFire perform very well in this game all the way up to 2560x1600. It looks like the GTX 280 and GTX 285 3-way tests are CPU limited here, but it is odd that the 4870 512MB 3-way solution scores so much lower than the 4850 3-way at 2560x1600 when they've got the same amount of RAM onboard.




1680x1050    1920x1200    2560x1600


Scaling from 1 to 3 GPUs comes in pretty well at 2560x1600 with the 9800 GTX+ and the GTX 260 netting around 140% improvement. The Radeon HD 4870 1GB does pretty well with about 128% improvement. Below the maximum resolution, we don't see scaling much better than 100% (though the GTX 260 and Radeon HD 4850 do manage it).




1680x1050    1920x1200    2560x1600


Moving from 2 to 3 GPUs shows a less impressive picture until we move to 2560x1600 where we see most of the hardware gain some good performance. The Radeon HD 4850 gains near its theoretical maximum at almost 45%. The oddity here is the GTX 260 which shows better performance scaling at lower resolutions peaking at 41% at 1920x1200.




1680x1050    1920x1200    2560x1600


As for value, the two most expensive solutions (the GTX 280 and 285) are pegged at the bottom of the chart. AMD's 4850 3-way shows good performance and good value, especially for a 3 card solution, in Age of Conan. The 4850 3-way doesn't quite reach the 4870 X2 which posts both better performance and better value.

We have 6 configurations that can't post 25 fps at 2560x1600, but some AMD single cards are able to squeak by and it looks as though two card solutions are a better balance of performance and cost than any 3-way option.



Call of Duty World at War Analysis




1680x1050    1920x1200    2560x1600


NVIDIA's GT200 parts in 3-way SLI rule this benchmark, while the GeForce 9800 GTX+ holds its own. This test definitely shows NVIDIA in a good light across all three resolutions.




1680x1050    1920x1200    2560x1600


The apparent system limitation on single cards does inflate the scaling a little bit here, but there's no doubt that lots of scaling is happening. This time we don't see scaling above the theoretical maximum of 200%, but we do see some big gains. While NVIDIA hardware does do well at low resolution, the best total 1 to 3 GPU scaling at 1920x1200 and 2560x1600 goes to AMD.




1680x1050    1920x1200    2560x1600


The story doesn't change much when comparing scaling from 2 to 3 GPUs. We get very decent scaling rates with the 4870 1GB getting dangerously close to the theoretical maximum of 50% at 2560x1600. In general though, CoD:WaW does a good job taking advantage of available GPU horsepower whether it's 1, 2 or 3 GPUs.




1680x1050    1920x1200    2560x1600


Below 2560x1600 the 9800 GTX+ 3-way option actually has decent value (comparatively) in this test. At the highest resolution, the 9800 GTX+ falls a few spots. Again, of the 3-way options, NVIDIA's two highest end options have the lowest value of cards that get something near playable framerates.

We still see lots of two way and single card options that get decent performance leading the way in value over the much more expensive 3-way configurations even at the highest resolution.

In general, with any of these setups, performance won't be an issue below 2560x1600, and even then only the slowest couple might warrant skipping.



Crysis Warhead Analysis




1680x1050    1920x1200    2560x1600


Like in CoD, we are seeing NVIDIA dominated performance. AMD's 3-way solutions can hold their own against the rest of the hardware out there, but NVIDIA sets the bar here in terms of raw performance.

While other tests don't show any real need for 3-way graphics, Crysis isn't playable at 2560x1600 with single GPU options under these settings. Even at lower resolutions Crysis just seems to absorb what's thrown at it. The 512MB parts also take a bigger hit at the highest resolution than the hardware with more memory.




1680x1050    1920x1200    2560x1600


Scaling is decent, but we would prefer it to be higher: to get what you pay for, we would need to see close to 200% scaling with 3 cards. NVIDIA hardware seems to scale much better than AMD hardware in this test.




1680x1050    1920x1200    2560x1600


Looking at how the hardware scales from 2 to 3 GPUs, we can see that AMD's 4870 1GB shows good improvement despite the fact that it trails in the 1 to 3 GPU scaling chart. 512MB hardware struggles a lot in this as well. The other hardware does scale really well up at 2560x1600 where it counts.




1680x1050    1920x1200    2560x1600


Because this game is very graphically intense even when not set to the maximum settings, not all the cards can score any "value" even at the lowest resolution. The ATI Radeon HD 4850 fails to break the 25fps barrier in any test, so it gets a value of zero across the board. The 9800 GTX+ 2-way option isn't that shabby until 2560x1600, and the 4850 X2 posts some good value. We still have single cards at the top of the value chart until we hit 2560x1600 where single cards fail to offer any value (as well as some 2-way solutions). At that point 2-way solutions that offer some level of passable performance give more for the money with the GTX 260 3-way option leading the pack of our recently added tests.

This game shows the most casualties in terms of our value threshold, but hopefully it will help show which cards are actually worth comparing here.



Fallout 3 Analysis

Even with iPresentInterval set to 0 and vsync disabled, Fallout 3 has an LOD system that limits performance to certain averages. The frame rate can and does go beyond 60, but when it gets to a certain point it drops back down to below 60 in a sort of sawtooth pattern. This can make getting useful comparison data tough especially when we start adding higher and higher performance options to the mix.




1680x1050    1920x1200    2560x1600


As we can see, some two GPU solutions do better than the 3-way options, especially at lower resolutions. Again, at 2560x1600, the 512MB per GPU 3-way configurations struggle to keep up. With Fallout's performance profile, it's a safer bet to run a single card with a good amount of RAM or at most a two way setup as we just don't need more power for this one.




1680x1050    1920x1200    2560x1600


Because of the LOD limited performance, the speedup from 1 to 3 GPUs is only about 50% out of a possible 200%. In the worst case, at high res with a low memory part, we can even see performance degradation. The GeForce GTX 260 stands out at high resolution, while the 9800 GTX+ and 4850 put in a good showing at lower resolutions.




1680x1050    1920x1200    2560x1600


Failure is the word that comes to mind when looking at scaling up from 2 to 3 GPUs. There isn't a single case where the extra GPU helps improve performance in a significant way at any resolution. There is just no reason for 3 GPUs with Fallout 3.




1680x1050    1920x1200    2560x1600


Our value data bears this out with the 3-way solutions clustered at the bottom of our list. The GeForce GTX 280 and 285 SLI options are still pretty abysmal in delivering on framerate for your money as well. The only boost we see in value comes in at 2560x1600 for 3-way options that have more than 512MB of RAM. The only reason these cards show better value though is because some configurations fail to perform adequately to play the game at this resolution.

Bottom line is that 3-way is unnecessary for Fallout 3 from a performance standpoint, doesn't produce any significant benefit over 2-way options, and is a horrible investment all around for this game.



FarCry 2 Analysis

This benchmark pushes the hardware really hard, so it's a good candidate for 3-way SLI. At high resolutions, it's also tough on lower memory graphics options.




1680x1050    1920x1200    2560x1600


The GeForce GTX 280 and 285 do very well, but the 4870 1GB does a good job keeping up with the NVIDIA hardware at lower resolutions and even leads the GTX 260 by a good margin at 2560x1600. While the 512MB 9800 GTX+ and 4850 both drop way off in performance at 1920x1200, the 512MB 4870 holds on to good performance at this resolution. All the 512MB parts tank in 3-way at 2560x1600 though.




1680x1050    1920x1200    2560x1600


Scaling is really high for high end NVIDIA parts, which contributes to their lead in this benchmark. The GTX 260 falls off a little on scaling at the high end, and none of the 512MB parts scale that well even at lower resolutions.




1680x1050    1920x1200    2560x1600


Looking deeper at scaling, it's clear that moving from 2 to 3 GPUs really doesn't add much for the Radeon HD 4850 and 512MB 4870. The 9800 GTX+ does well at 1680x1050, but tanks at higher resolutions. The 4870 1GB improves in scaling as resolution increases, and NVIDIA hardware scales fairly consistently in this test from two to three GPUs.




1680x1050    1920x1200    2560x1600


The GTX 260 shows the best value of the 3-way solutions until we hit 2560x1600 where the 4870 1GB takes the lead in 3-way value. The 512MB 3-way options, some of the 2 GPU solutions, and slower single GPUs can't post playable framerates at the highest resolution, so they fall off our value chart.



Left 4 Dead Analysis

This game runs really well on all our hardware, so 3-way isn't necessary, but the Source engine also scales well with multiple GPUs despite its already high performance.




1680x1050    1920x1200    2560x1600


In this case, AMD hardware outperforms the rest of the lineup. Both the 512MB and 1GB versions of the 4870 outperform the highest end NVIDIA hardware. This also shows that Left 4 Dead doesn't thrash graphics memory and does a good job making things look pretty without requiring cards to have more RAM onboard.




1680x1050    1920x1200    2560x1600


The Radeon HD 4850 always scales pretty well, while other cards hit a bit of a system limitation at lower resolutions. Moving up to 2560x1600, the AMD cards all show that they can scale better than NVIDIA hardware in this game.




1680x1050    1920x1200    2560x1600


At 1680x1050, only the GTX 280 and 285 really scale up from 2 GPUs to 3. 2560x1600 shows another side of the coin where all the ATI hardware scales pretty well while NVIDIA hardware can't push past a 10% improvement from 2 to 3 GPUs. Two GPUs do pretty well on their own, but AMD comes through better on this one.




1680x1050    1920x1200    2560x1600


Despite the high performance and good scaling, our value data still shows the 3-way solutions on the lower end of the spectrum. AMD's cards do tend to lead in value among the 3-way options and even best the single GTX 280 and 285 at 2560x1600.



Race Driver GRID Analysis

Keep in mind that we had an issue with FRAPS here that didn't allow us to capture framerate at 2560x1600 with 3-way NVIDIA solutions. We decided to include the game because we've still got all the data for 1680x1050 and 1920x1200 and AMD data at 2560x1600. The problems we had didn't affect playing the game at all, and we only had an issue when we tried to record framerate data.




1680x1050    1920x1200    2560x1600


NVIDIA 3-way solutions lead at the lower resolutions with the 2-way GTX 285 SLI at the top of the heap. From the look of the framerate data, though we couldn't capture it, 3-way NVIDIA hardware performed between the GTX 285 SLI and 4870 1GB 3-way CrossFire. The 512MB hardware once again has trouble performing at high resolution in 3-way configurations. GRID is tough on memory, and performance in the menu screens on 512MB hardware at 2560x1600 is incredibly painful.




1680x1050    1920x1200    2560x1600


NVIDIA hardware shows good scaling from 1 to 3 GPUs in the tests from which we could collect data. AMD hardware scales better as resolution increases, until performance tanks with 512MB cards at 2560x1600.




1680x1050    1920x1200    2560x1600


Scaling up from 2 to 3 GPUs is more of a mixed bag. At lower resolution, some options are system limited. While the lower end NVIDIA options have more room to improve in performance, both the 4850 and 4870 1GB scale pretty well at 1920x1200. Only the 4870 1GB scales well at 2560x1600 though.




1680x1050    1920x1200    2560x1600


The 9800 GTX+ 3-way shows good value at lower resolutions. Though the 4870 1GB 3-way rises up the list at 2560x1600, there are some options that lose all value because they don't make 25fps.



Power Consumption

No surprises here really. 3-way hardware requires more power at idle and load. Remember that these numbers are total system power consumption, but that they were measured with the 3DMark Vantage POM shader test. This means the rest of the system isn't pushed very hard at all. Depending on the game, you could expect real load power draw to be 50 to 100W higher.

While NVIDIA hardware seems to handle idle power a bit better, at load 3-way SLI tends to require much more than 3-way CrossFire. But this is in line with what we saw with 2-way solutions as well.



Final Words

So ... wow. And this isn't even all the data. We've still got a four GPU update to bring out as soon as we finish aggregating and analyzing everything.

First things first though: let's sort through our 3-way tests and try to make some sense of all this.

The first thing to take away is that 3-way graphics systems are absolutely not something anyone without a 30" display needs or should even want. There are so many instances where 3-way fails to improve on 2-way, and because of the high price barrier to entry and the diminishing returns on adding more hardware, 3-way is just not for everyone and has questionable value for those who choose to go that route as well.

Honestly, NVIDIA and AMD trade blows, but the one clear thing to note is that 3-way GTX 280 and 285 are just way overpriced. You don't get what you pay for and you can still get huge performance from the GTX 260 which offers consistently better value. On the AMD front, 3-way 4870 1GB pairing a 4870 X2 and single 4870 1GB is the way to go if you want 3-way. The 4850 can scale in some games that don't crush it with too much data and it offers better value in some situations.

But even more than single or two card solutions, 3-way is a case by case basis. You've really got to look at the games you like and pick the solution that performs best there. This is in large part because it's just such an investment and these options are just not "worth" the cost unless you are really picking your graphics hardware for a purpose. There isn't an easy way to make a general recommendation.

Other than to stay away from 3-way GTX 280 and 285, we've got to split our recommendation between 3-way GTX 260 and 3-way 4870 1GB. If you've got to have a 3-way solution that is. And much of our data reinforces our recommendations from our look at 2-way multiGPU solutions which tend to offer a better balance of performance and value if more power than a single GPU can offer is required.

But really, 3-way, and especially with 3 cards, is just not for everyone and not something we put a lot of stake into as a good option. Saving the money to upgrade to a new architecture is going to get you a lot more for the money than trying to squeeze longevity out of hardware by stacking more than 2 of the same thing in one system.
