Original Link: https://www.anandtech.com/show/7195/amd-frame-pacing-explorer-cat138
AMD Frame Pacing Explored: Catalyst 13.8 Brings Consistency to Crossfire
by Ryan Smith on August 1, 2013 2:00 PM EST
In an off year that hasn’t seen too many new product releases thus far, this has been anything but a dull time. For the better part of a year now the technology journalist community – spearheaded by The Tech Report’s Scott Wasson – has been investigating the matter of frame pacing and frame timing on GPUs. In applying new techniques and new levels of rigor, Scott found that frames were not being rendered as consistently as we had always assumed they were, and that cards that were equal in performance as measured by frame rates were not necessarily equal in performance as measured by frame intervals. It was AMD in particular who was battered by all of this work, with the discovery that both their single-GPU and multi-GPU products were experiencing poor frame pacing at times. AMD could meet (and beat) NVIDIA on frame rates, only to lose out on smoothness as a result of poor frame pacing.
Since then we have seen both some progress and some new revelations on these matters. AMD was very quick to start working on resolving their single-GPU issues, and by March when they were willing and able to fully engage the tech community, they had already solved the bulk of those single-GPU issues. With those issues behind them, they also laid out a plan to tackle the more complex issue of multi-GPU frame pacing, which would involve spending a few months to write a new frame pacing mechanism for their cards.
At the same time NVIDIA also dropped a small bombshell with the public release of FCAT, their long-in-development frame interval benchmarking tool. FCAT could do what FRAPS alone could not, capturing and analyzing the very output of video cards to determine frame rates, frame times, and frame intervals. Though FRAPS was generally sufficient to find and diagnose single-GPU issues, FCAT shed new light onto AMD’s multi-GPU issues, painting a far more accurate – and unfortunately for AMD, more dire – picture of Crossfire frame pacing.
Perhaps as proof that there’s no such thing as coincidence, since then we have seen the release of AMD’s latest multi-GPU monster, the Radeon HD 7990. Packing a pair of high clocked Tahiti GPUs, the 7990 was AMD’s traditional entry into the realm of $1000 multi-GPU super cards. A capable card on paper, the 7990 has been at the mercy of AMD’s drivers and lack of a frame pacing mechanism, with the previous revelations and FCAT results causing the 7990 to suffer what can only be described as a rough launch.
Ultimately when AMD engaged the community back in March they had a clear plan for addressing their multi-GPU frame pacing issues, developing a new frame pacing mechanism for their cards. AMD stated outright that this work would take a few months, something of an arduous wait for existing Crossfire users, setting a goal that the new frame pacing mechanism would “come in or around a July driver drop.” July has since come and gone by a day, but at long last AMD has completed their initial work on their new frame pacing mechanism and is releasing the first public driver today at 2pm ET as Catalyst 13.8 Beta 1.
As part of today’s launch activities, AMD seeded the beta driver to the press a week in advance to give us a chance to put it through the necessary paces, give AMD feedback, and write up our experiences with the new driver. Over the next several pages we’ll be going over what changes AMD has made to their drivers, how they impact the 6 games we do frame interval testing with, and ultimately whether AMD has made sufficient progress in resolving their frame pacing issues. Make no mistake: AMD wants to get past these frame pacing issues as quickly as possible and remove the cloud of doubt that has surrounded the 7990 since its launch, making this driver launch an extremely important event for the company.
In Summary: The Frame Pacing Problem
Before we dive into the technical details of AMD’s frame pacing mechanism and our results, we’re going to spend a moment recapping the basis of the frame pacing problem. So if you haven’t been keeping up with this issue, please read on, otherwise feel free to jump a page.
In brief, in multi-GPU setups, be it single-card products like the GTX 690 or multiple cards such as a pair of 7970s, the primary mode of splitting up work is a process called Alternate Frame Rendering (AFR). In AFR, rather than have multiple GPUs working on a single frame, each GPU gets its own frame. This method has over time proven to be the most reliable method, as attempting to split up a single frame over multiple GPUs (with their relatively awful interconnect) has proven to be unreliable and difficult to get working. AFR in contrast is by no means perfect and has to deal with inter-frame dependency issues – where the next frame relies in part on the previous frame – but this is still easier to implement and more consistent than previous efforts at splitting frames.
However due to the mechanisms of AFR, left unattended it can significantly impact the intervals between frames and consequently whether stuttering is perceived. To do AFR well it’s necessary to pace the output of each GPU such that each GPU is delivering a rendered frame at as even a rate as possible; not too soon after the previous frame, and not too late such that the following frame comes up quickly. In a 2 GPU setup, which is going to be the most common, this means the second GPU needs to produce a finished frame when the first GPU is roughly half-way done with its current frame. Should this fail to happen then we have poorly paced frames that will result in perceived micro-stuttering.
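To make the half-way point concrete, here is a minimal sketch of a simplified AFR timing model. The function names, the fixed per-frame render time, and the fixed start offset between GPUs are all our own assumptions for illustration; real workloads vary from frame to frame.

```python
def afr_present_times(render_ms, num_gpus, offset_ms, num_frames):
    """Completion times (ms) for frames handed out round-robin to GPUs.

    Assumes every frame takes render_ms on its GPU and that GPU k
    starts k * offset_ms after GPU 0 (a hypothetical, idealized model).
    """
    times = []
    for i in range(num_frames):
        gpu = i % num_gpus             # which GPU renders frame i
        frame_on_gpu = i // num_gpus   # how many frames that GPU finished before
        times.append(gpu * offset_ms + (frame_on_gpu + 1) * render_ms)
    return times


def intervals(times):
    """Differences between consecutive completion times."""
    return [b - a for a, b in zip(times, times[1:])]


# Well paced: the second GPU is offset by half of the 40 ms render time.
print(intervals(afr_present_times(40, 2, 20, 4)))
# Poorly paced: the second GPU finishes only 4 ms after the first.
print(intervals(afr_present_times(40, 2, 4, 4)))
```

In this model both cases deliver the same average frame rate, but the second alternates between very short and very long intervals, which is the pattern perceived as micro-stuttering.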
Micro-stuttering has been a longstanding issue on multi-GPU setups. Both NVIDIA and AMD have worked on the issue to various degrees, but at the end of the day multi-GPU setups have never proven to be as reliable as single-GPU setups, which is why our editorial position on the matter has been to always favor single powerful GPUs over multiple GPUs when at all possible. Consequently it’s impractical to fully solve micro-stuttering and achieve frame pacing consistency on level with single-GPU setups, but it’s still possible to improve on previous methods and achieve a level of frame pacing that is reasonably effective and “good enough” for most needs. This is what AMD has been focusing on for the past few months.
Moving on, how AMD ended up in this situation is effectively the combination of three factors. The first of course being the innate technical challenge posed by AFR, while the second and third factors have been a poorly realized position on lag vs. consistency and a failure of competitive analysis respectively.
On the matter of lag vs. consistency, AMD’s position up until now has been that they’ve favored minimizing input lag in their designs. If you need to hold back a frame to better pace it, then you are by definition introducing some input lag, a quality that is generally undesirable to a user base that usually avoids mechanisms like v-sync for that reason. AMD’s position hasn’t been wrong of course, but it has come at the exclusion of allowing a bit of input lag to better manage frame pacing. AMD’s decision then has been to lighten up on this position and dedicate the resources to deal with both approaches. AMD would introduce advanced frame pacing as an optional control, while leaving the simpler, less laggy approach as another option.
Meanwhile the story with competitive analysis is far less complex. Simply put, AMD wasn’t testing for frame pacing as part of their standard competitive analysis, so when these results first broke AMD was caught flat-footed. This is a business failure rather than a technical failure, which makes it easy enough to resolve. But it’s also the reason why AMD needed time to develop an advanced frame pacing mechanism, as they had never seen the need to develop one before.
Ultimately this is a problem that should have never happened, and it is unfortunate that AMD let it come to this. At the same time however we believe it’s never too late for redemption, and AMD has been making all of the right moves to try to achieve that. They have been clear about their failures and shortcomings, including their frustrations that they’ve left performance on the table by not looking for these issues, and they have been equally clear in laying out a plan for how they would go about fixing all of this. So today we will finally get to see first-hand whether AMD’s initial efforts for resolving frame pacing in multi-GPU setups have paid off.
Catalyst 13.8 Beta 1: The First Multi-GPU Frame Pacing Driver
The culmination of AMD’s first wave of efforts to manage frame pacing is the Catalyst 13.8 driver (driver branch 13.200). Being released in beta form today, the marquee feature for this driver is the new frame pacing mechanism for Crossfire setups. As with any major new driver branch this also includes some other improvements, and while we don’t have the complete release notes, AMD has mentioned that these drivers will bring about full OpenGL 4.3 compliance (apparently they were missing a couple of items before).
AMD is calling this driver “phase 1” of their frame pacing solution, and for good reason. In implementing frame pacing AMD has tackled the issue in what’s very obviously a triage-like manner, focusing on the most important/significant problems and working out from there. So what’s addressed by this first driver resolves AMD’s biggest issues, but not all of them.
So what’s being addressed in phase 1? Phase 1 is being dedicated to Direct3D 10+ games running on a single display. What’s not being addressed in the first driver are the Direct3D 9 and OpenGL rendering paths, along with Eyefinity in any scenario.
It goes without saying that in an ideal world we would have liked to see AMD hit everything at once, but if they couldn’t do it all at once then choosing to tackle D3D10+ games first was the next best move they could make. This covers virtually all of the games present and future that are graphically challenging enough to weigh down a high-end Crossfire setup. D3D9 games by and large are not that demanding on this class of hardware – we’d have to resort to Skyrim mods to find a D3D9-exclusive title that isn’t CPU limited and/or gets less than 90fps off of a single GPU. OpenGL has even less traction, the last OpenGL game of note being 2011’s Rage which is capped at 60fps and easily hits that at 1080p on even 7800 series hardware.
Catalyst 13.8 Frame Pacing
API | Single Display | Eyefinity |
D3D11 | Y | N |
D3D10 | Y | N |
D3D9 | N | N |
OpenGL | N | N |
It’s Eyefinity users who will be the most unfortunate bunch at the moment. Eyefinity is one of the premier usage scenarios for Crossfire because of the amount of GPU horsepower required, however it’s also the most complex scenario to tackle – splitting work across multiple GPUs and then multiple display controllers – and one with fairly low user uptake relative to that complexity. More so than with D3D9 and OpenGL AMD does need to get Eyefinity sorted and quickly, but for the moment single display setups are it. On that note, 4K displays are technically also out, since the current 60Hz 4K displays actually present themselves as two displays, with video cards addressing them via Eyefinity and other multi-monitor surround modes.
On the plus side, since this is a purely driver based solution, AMD is rolling out frame pacing to all of their currently supported products, and not just the 7000/8000 series based GCN parts. This means 5000 and 6000 series Crossfire setups, including multi-GPU cards like the 5970 and 6990, are also having their pacing issues resolved in this driver. Given the limited scope of this driver we were afraid it would be GCN-only, so this ended up being a relief.
Moving on, let’s dive into the new driver. True to their word, AMD has made the new frame pacing mechanism a user controllable option available in the Catalyst Control Center. Located in the CrossfireX section of the 3D Application Settings page and simply titled “Frame Pacing,” it defaults to on. Turn it off and AMD’s rendering behavior reverts to the low-lag behavior in previous drivers.
As far as technical details go, AMD has not offered up any significant details on how their new frame pacing mechanism works. Traditionally neither AMD nor NVIDIA have offered a ton of detail into how they implement AFR under the hood, so while unfortunate from an editorial standpoint it’s not unexpected. Hopefully once AMD finishes the other phases and enables the new frame pacing mechanism elsewhere, we’ll be able to get some solid details on what AMD is doing to implement frame pacing. So for the moment we only have the barest of details: AMD is delaying frames so as to prevent any frame from being shown too early, presumably relying on backpressure in the rendering queue to stabilize and keep future frames coming at a reasonable pace.
With that said, based on just the frame time measurements from our benchmark suite we can deduce a bit more about what AMD is doing. Unlike NVIDIA’s “organic” approach, which results in frame times that follow a similar pattern as single-GPU setups but with far wider variation, the frame times we’re seeing on 13.8 have a very distinct, very mechanical metered approach.
Accounting for some slight variation due to how back buffer swapping works, what we see are some very distinct minimum frame time plateaus in our results. Our best guess is that AMD is running some kind of adaptive algorithm which is looking at a window of rendering times and based on that is enforcing a minimum frame time, ultimately adjusting itself every few seconds as necessary. NVIDIA doesn’t implement something quite like this, but beyond that we don’t know how the two compare algorithmically at this time. However regardless of their differences what we’re ultimately interested in is how well each mechanism works.
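To illustrate the kind of behavior we are guessing at, here is a toy sketch of a pacer that enforces a minimum interval derived from a sliding window of recent frame completion intervals. To be clear, this is not AMD’s algorithm, which has not been disclosed; the class name, the window size, and the `fraction` parameter are all hypothetical choices of ours.

```python
from collections import deque


class MinIntervalPacer:
    """Toy pacer: delays a frame so it is never shown sooner than a
    minimum interval after the previous one. The minimum is a fraction
    of the average interval between recent frame completions.
    (Illustrative only; all parameters are assumptions.)"""

    def __init__(self, window=4, fraction=1.0):
        self.recent = deque(maxlen=window)  # recent completion intervals (ms)
        self.fraction = fraction
        self.last_ready = None
        self.last_present = None

    def present_time(self, ready_ms):
        """Return when a frame that finished rendering at ready_ms is shown."""
        if self.last_present is None:
            # First frame: nothing to pace against, show it immediately.
            self.last_ready = self.last_present = ready_ms
            return ready_ms
        self.recent.append(ready_ms - self.last_ready)
        self.last_ready = ready_ms
        min_interval = self.fraction * sum(self.recent) / len(self.recent)
        # A frame can be held back, but never shown before it is ready.
        shown = max(ready_ms, self.last_present + min_interval)
        self.last_present = shown
        return shown


# Frames completing in an uneven 4 ms / 36 ms AFR pattern.
pacer = MinIntervalPacer()
ready = [0, 4, 40, 44, 80, 84, 120, 124]
shown = [pacer.present_time(t) for t in ready]
print([round(b - a, 1) for a, b in zip(shown, shown[1:])])
```

After the window fills, the uneven completion pattern settles into evenly spaced presents in this toy model. The cost is that every other frame is held back after it is ready, which is exactly the lag versus consistency trade-off discussed earlier.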
The Test
For the purposes of our testing we’ll be looking at the 6 games we’ve adopted for use with FCAT due to their proven reliability. These are Total War: Shogun 2, Hitman: Absolution, Sleeping Dogs, Battlefield 3, Bioshock Infinite, and Crysis 3. All of our results unless otherwise noted are using Catalyst 13.8b1 for the AMD cards, and NVIDIA’s 326.19 beta drivers for the GeForce cards.
Our metric of choice for measuring frame times and frame pacing is a metric we’re calling Delta Percentages. With delta percentages we’re collecting the deltas (differences) between frame times, averaging that out, and then dividing delta average by the average frame time of the entire run. The end result of this process is that we can measure whether sequential frames are rendering in roughly the same amount of time, while controlling for performance differences by looking at the data relative to the average frame time (rather than as absolute time). This gives us the average frame-to-frame time difference as a percentage.
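As a rough sketch of that calculation, the function below computes a delta percentage from a list of frame times. The function name is ours, and we are assuming the deltas are taken as absolute differences between consecutive frames (otherwise positive and negative differences would largely cancel out); the sample data is made up for illustration.

```python
def delta_percentage(frame_times_ms):
    """Average frame-to-frame time difference as a percentage of the
    average frame time. Assumes absolute differences between
    consecutive frames."""
    if len(frame_times_ms) < 2:
        raise ValueError("need at least two frame times")
    deltas = [abs(b - a) for a, b in zip(frame_times_ms, frame_times_ms[1:])]
    mean_delta = sum(deltas) / len(deltas)
    mean_frame_time = sum(frame_times_ms) / len(frame_times_ms)
    return 100.0 * mean_delta / mean_frame_time


# Perfectly even frames versus an alternating short/long pattern.
# Both runs average 20 ms per frame, i.e. the same frame rate.
print(delta_percentage([20, 20, 20, 20]))
print(delta_percentage([4, 36, 4, 36]))
```

The two runs have identical average frame rates, yet the first scores 0% and the second well over 100%, which is why the metric separates cards that look equal on a frame rate chart.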
In general, a properly behaving single-GPU card should have a delta average of under 3%, with the specific value depending in part on how variable the workload is throughout any given game benchmark. 3% may sound small, but since we’re talking about an average it’s weighted against the entire run; the higher the percentage, the more unevenly frames are arriving. For a multi-GPU setup we’d ideally like to see the delta percentages be equal to our single-GPU setups, but this is for the most part unreasonable. There is no hard number for what is or isn’t right here, but based on play testing we’d say 15%-20% is a reasonable threshold for acceptable variance, with anything under 10% being very good for a multi-GPU setup.
Finally, in our testing we encountered a bug in Catalyst 13.8 that required us to make some slight adjustments to FCAT to compensate, so we need to make note of it. For reasons we can’t sufficiently explain at this time, but which AMD has confirmed, in some cases in Crossfire mode AMD’s latest drivers periodically draw small slices of old frame buffers at the top of the screen. The gameplay impact is minimal-to-nonexistent, but this problem throws off FCAT badly.
To quickly demonstrate the problem, below we have two consecutive frames from one of our Battlefield 3 runs. The correct FCAT color order here is dark blue, green, light blue, and olive. The frames corresponding to dark blue and green occur on frame one, and light blue and olive on frame two. Yet looking at frame two, we see a small 6 pixel high stripe of dark blue at the very top of the image. At this point the dark blue frame should have already been discarded, as the cards have moved on to the green and later light blue frames. Instead we’re getting a very small slice of a frame that is essentially 2 frames old.
The gameplay impact from this is trivial to none; the issue never exceeds a 6 pixel slice, only occurs at the top of the frame (which is generally skybox territory), and is periodic to the point where it occurs at most a few times per minute. And based on our experience this primarily occurs when a buffer swap should be occurring during or right after the start of a new refresh cycle, which is why it’s so periodic.
However the larger issue is that FCAT detects this as a frame drop, believing that over a dozen frames have been dropped. This isn’t actually possible of course – the context queue isn’t large enough to hold that many frames – and analysis shows that it’s actually part of the old frame as we’ve explained earlier. As such we’ve had to modify FCAT to ignore this issue so that it doesn’t find these slices and count them as dropped frames. The issue is real enough (this isn’t a capture error) and AMD will be fixing it, but it’s not evidence of a dropped frame as the stock implementation of FCAT would assume.
Ultimately our best guess here is that AMD is somehow mistiming their buffer swaps, as the 2 frame old aspect of this correlates nicely to the fact that the dark blue and light blue frames would both be generated by the same GPU in a two-GPU setup.
CPU: | Intel Core i7-3960X @ 4.3GHz |
Motherboard: | EVGA X79 SLI |
Power Supply: | Antec True Power Quattro 1200 |
Hard Disk: | Samsung 470 (256GB) |
Memory: | G.Skill Ripjaws DDR3-1867 4 x 4GB (8-10-9-26) |
Case: | Thermaltake Spedo Advance |
Monitor: | Samsung 305T |
Video Cards: | AMD Radeon HD 6990, AMD Radeon HD 7970GE, AMD Radeon HD 7990, NVIDIA GeForce GTX 590, NVIDIA GeForce GTX 680, NVIDIA GeForce GTX 690 |
Video Drivers: | NVIDIA ForceWare 326.19, AMD Catalyst 13.5 Beta 2, AMD Catalyst 13.6 Beta 2, AMD Catalyst 13.8 Beta 1 |
OS: | Windows 8 Pro |
Catalyst 13.8 Results in Summary
For this article we’ve decided to do things a bit differently and lead in with a summary of our results, rather than starting with detailed results and then going to a summary. Based on past feedback most of you want to quickly know whether this works at all and how well it works, which is something we can quickly cover first before diving into individual games.
We’ll start with the graph that is of the most importance: delta percentages on a 7990, comparing Catalyst 13.6b2 to Catalyst 13.8b1 with frame pacing enabled.
The results, quite frankly, speak for themselves. In roughly half of our 6 games AMD had absolutely absurd frame pacing on Catalyst 13.6. Total War, Sleeping Dogs, and Battlefield 3 all had massive pacing issues that were the result of second frames coming far too soon after first frames, leading to a high instance of “runt” frames – that is frames that are only shown for an incredibly short period of time before being replaced with a newer frame. These are the games where micro-stuttering and/or the feeling of lower frame rates would be the most apparent.
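For readers unfamiliar with the term, here is a minimal sketch of how frames might be bucketed as runts, drops, or full frames based on how many scanlines of the captured output each one occupies. The function name and the 20-scanline cutoff are hypothetical values of ours chosen for illustration, not necessarily what FCAT itself uses.

```python
def classify_frames(scanlines_per_frame, runt_threshold=20):
    """Count full, runt, and dropped frames.

    scanlines_per_frame: scanlines each rendered frame occupied on screen.
    runt_threshold: hypothetical cutoff; frames shown for fewer scanlines
    than this count as runts. A frame with 0 scanlines was never shown
    and is counted as dropped.
    """
    counts = {"full": 0, "runt": 0, "dropped": 0}
    for lines in scanlines_per_frame:
        if lines == 0:
            counts["dropped"] += 1
        elif lines < runt_threshold:
            counts["runt"] += 1
        else:
            counts["full"] += 1
    return counts


# Made-up capture: one frame shown for only 6 scanlines, one never shown.
print(classify_frames([540, 6, 534, 0, 1080]))
```

A runt still counts towards a frame rate measured in software, even though it contributes almost nothing to what the user actually sees.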
Earlier we decided that our cutoff would be 15%-20% for an “acceptable” range for delta percentages on a multi-GPU setup, and with the exception of Total War: Shogun 2 (the only non-action game in this collection), AMD has just managed to hit that. How smooth this is going to be perceived is going to vary on a person-by-person basis, but this is right where we’d say micro-stuttering and other issues become generally unnoticeable.
For the more visually inclined, we’ve also quickly cooked up frame time graphs in FCAT showing the two 7990s. The full series is below, but we’ll print the Total War: Shogun 2 graph in full since it was one of the bigger problem cases for AMD’s cards without frame pacing. Shogun doesn’t have any scene transitions, but it does have some snap camera movements that lead to a clear separation between scenes. In each scene we can clearly see the much lower variability with Catalyst 13.8 with frame pacing turned on, as opposed to 13.6 with frame pacing turned off.
Similarly, turning off frame pacing results in Catalyst 13.6-like behavior, with much higher variability compared to having frame pacing turned on.
Moving on, the next question on most readers’ minds will probably be performance. What’s the performance sacrifice for using this new frame pacing mechanism? AMD said that the performance hit should be non-existent, and strictly speaking within Catalyst 13.8 that’s true, as we get identical frame rates with it on or off. However compared to Catalyst 13.6 we are seeing a performance regression.
With the exception of Hitman: Absolution, performance is down across the board on 13.8 versus 13.6. The specific performance losses vary by game, but we’re looking at 5-10%. However compared to the 13.5 launch drivers, and again with the exception of Hitman, AMD’s performance has held constant or increased. So at the very least when it comes to frame rates AMD is no worse off than they were at the launch of the 7990.
Our next summary graph is plotting the 7970GE against a pair of 7970GEs in Crossfire, to take a fresh look at AFR (Crossfire) versus a single GPU. Our editorial position has been and remains that we favor a single larger GPU over a pair of smaller GPUs when this approach is practical, and this chart demonstrates exactly why.
The delta percentages on the single 7970GE are all under 2%, versus 12%+ for the Crossfire setup. AFR simply cannot match the consistency of a single GPU at this time, which is why AFR is best pursued only after single-GPU performance has been exhausted.
Catalyst 13.8 Results in Summary, Cont
Up next, let’s take a quick look at how the 7990 with frame pacing compares to NVIDIA’s GTX 690. NVIDIA’s frame pacing has been the gold standard thus far, so let’s see how close AMD has come to NVIDIA on their first shot.
Frankly the results aren’t flattering for AMD here, although keeping things in perspective they’re not terrible. In every last game GTX 690 has much lower frame time variability than 7990. NVIDIA has been working on this problem a lot longer than AMD has and it shows. Ultimately while it’s true this is an absolute metric when it comes to comparing results (AMD experiences more than two times the frame time variation in 5 of the 6 games), keep in mind we’re looking at the variation in frame times, in effect a first-order derivative, rather than the frame times themselves. What it means is that AMD clearly still has room for improvement, but AMD’s approximately 20% results are not a poor showing in this metric; for every individual there exists a point below which the frame time variations cease to be perceptible.
While we’re on the matter of this comparison, it’s very much worth pointing out that while AMD can’t match NVIDIA’s delta percentages at this time the same cannot be said for runt and dropped frames. Throughout our tests on Catalyst 13.8 AMD delivered 0 runt frames and dropped 0 frames. This is a massive improvement over Catalyst 13.6, which would regularly deliver runt frames and drop frames at times too. In fact even NVIDIA can’t do this well; the GTX 690 doesn’t drop any frames but does deliver a small number of runt frames (particularly towards the start of certain benchmarks). So in their very first shot AMD is already beating NVIDIA on runt frames, a concept pioneered by NVIDIA in the first place.
We’ve also posted the FCAT graphs for the 7990 versus the GTX 690 below. We can clearly see the higher variation of the 7990, while we see a few more instances of late frames on GTX 690 than we do 7990.
Moving on, we wanted to quickly compare D3D9 to D3D11 performance on the 7990. As a reminder AMD’s frame pacing mechanism isn’t enabled for D3D9, so this gives us a quick chance to look at the difference. The only title in our collection that is D3D9 capable is Total War: Shogun 2, so we’ll use that.
And there you go. Frame pacing is not available on D3D9, leading to much more variable results for the 7990 when using the D3D9 path, even though it’s otherwise faster due to the simpler effects. AMD will ultimately address D3D9 in a further phase, but in the meantime this reinforces the need for a switch to turn off Crossfire on dual-GPU cards like the 7990. NVIDIA allows this, and AMD lets you do it on multi-card setups, but with the 6990 and 7990 you are unfortunately locked into Crossfire mode at all times.
Finally, while it’s not something we can properly measure, we did want to touch upon the matter of input lag. AMD’s earlier position that frame pacing and input lag are inversely related was not wrong. At some level adding frame pacing is going to increase the input lag due to frames being held back. The question is, to what extent and is it acceptable?
The short answer is that while we can’t really give the issue the full attention it deserves without a high speed camera (something we don’t have), our subjective testing results are quite good. If there is a difference in input lag from enabling frame pacing, it’s not something we’re able to perceive. Despite AMD’s concerns about input lag, from what usage testing we’ve done we have no problem saying that enabling frame pacing by default was the right move. In our experience there’s simply no reason not to enable it.
Total War: Shogun 2
Our first detailed benchmark is Shogun 2, which is a continuing favorite in our benchmark suite. Total War: Shogun 2 is the latest installment of the long-running Total War series of turn-based strategy games, and alongside Civilization V is notable for just how many units it can put on a screen at once. Even 2 years after its release it’s still a very punishing game at its highest settings due to the amount of shading and memory those units require.
For the sake of completeness we’re posting our frame rate charts for each of our individual games, but in general there’s nothing here we haven’t seen before in the 7990 review, in other reviews, or in Bench. The 7990 and GTX 690 still swap places fairly regularly.
Looking at our expanded delta percentages for Shogun, we can see how the 7990 and other Crossfire solutions stack up to the GTX 690 and other SLI solutions. For all AFR configurations the results match what we saw in our summary, with NVIDIA’s solutions offering lower deltas than AMD’s even with the new drivers.
This is actually AMD’s weakest game, with both the 7970GECF and 7990 exceeding 20% variability on this game. However it’s also the only non-action game in this collection, so it’s the game least affected by higher levels of variation and consequently the game AMD can afford to do the worst at. Nevertheless the improvement over Catalyst 13.6 without frame pacing is nothing short of amazing.
Meanwhile we’ll hit upon this a few times, but as a reminder AMD’s frame pacing improvements apply to older cards too, so the 6990 has its frame pacing problems resolved like the rest of AMD’s multi-GPU cards. It actually does better than the rest, we believe due to the fact that the lower framerate and higher frame times give AMD’s drivers more time to analyze and schedule frames.
Looking at the FCAT graphs, we can see that the higher variability of the 7990’s frame times is represented well, though NVIDIA’s frame time spikes are more extreme than AMD’s.
Finally we have our 95th percentile frame times. Despite the fact that AMD’s framerates are down slightly versus Catalyst 13.6, their 95th percentile frame times are greatly improved. Simply by instituting frame pacing they’ve dropped from 36.2ms to 21.5ms per frame.
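For reference, a 95th percentile frame time is the frame time that 95% of the frames in a run come in at or under. The sketch below uses the simple nearest-rank method; the function name is ours, and the article’s tooling may well interpolate differently, so treat this as an illustration of the idea rather than the exact calculation behind our charts.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    pct percent of the samples are less than or equal to it."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up run: 90 frames at 16 ms plus 10 slow frames at 40 ms.
frame_times = [16] * 90 + [40] * 10
print(percentile_nearest_rank(frame_times, 95))
```

Because the metric looks at the slow tail rather than the average, a run with frequent long frames scores poorly here even when its average frame rate is high.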
Hitman: Absolution
The second game in our lineup is Hitman: Absolution. The latest game in Square Enix’s stealth-action series, Hitman: Absolution is a DirectX 11 based title that, though a bit heavy on the CPU, can give most GPUs a run for their money. Furthermore it has a built-in benchmark, which gives it a level of standardization that fewer and fewer benchmarks possess.
Again there’s nothing really new here. In the previous game the GTX 690 led, while in Hitman the 7990 leads.
Moving on to our delta percentages, we again see the gains AMD has made with their frame pacing implementation, though Hitman was one of their better games in the first place. Consequently they’ve shaved off 11 percentage points, from 31 to 20, just enough to reach our 20% threshold but no more. This is still 13 percentage points worse than the GTX 690 however, so much like Shogun there’s still clear room for further improvement.
Meanwhile the FCAT graph data shows the same thing we saw with our numerical analysis, with the 7990 showing a much greater degree of variability. AMD’s frame pacing mechanism seems to be much less present here than in other games, with the adaptive window less noticeable here than elsewhere, perhaps due to the relatively high variability.
The smaller gains with regards to deltas are reflected in a much smaller improvement on 95th percentile times. Though in this case at the 95th percentile the 7990 is competitive with the GTX 690.
Sleeping Dogs
Another Square Enix game, Sleeping Dogs is one of the few open world games to be released with any kind of benchmark, giving us a unique opportunity to benchmark an open world game. Like most console ports, Sleeping Dogs’ base assets are not extremely demanding, but it makes up for it with its interesting anti-aliasing implementation, a mix of FXAA and SSAA that at its highest settings does an impeccable job of removing jaggies. However by effectively rendering the game world multiple times over, it can also require a very powerful video card to drive these high AA modes.
Once more the tables turn with frame rates, and we see the 7990 hold on to a very small lead over the GTX 690.
Sleeping Dogs is another title where AMD has greatly improved on their frame consistency. With Catalyst 13.6 the 7990 had deltas over 100% (the variability was larger than the average frame time), which has since been heavily reduced with frame pacing. The end result is that AMD’s runt frames have been eliminated in this benchmark and their deltas pushed down to the 20% range. Still, this pales in comparison to the GTX 690. Though we do see the 6990 doing rather well for itself due to its lower frame rate, even with this metric compensating for that factor. It’s clearly a lot easier to schedule frames when you have more time to do it.
This is another case where AMD’s massive improvements on frame pacing have led to an improvement in frame times at the 95th percentile. Here they shave off 7ms, and not unlike their frame rate are slightly ahead of the GTX 690.
Battlefield 3
The multiplayer action game in our benchmark suite is Battlefield 3, DICE’s 2011 multiplayer military shooter. Its ability to pose a significant challenge to GPUs has been dulled some by time and drivers, but it’s still a challenge if you want to hit the highest settings at the highest resolutions at the highest anti-aliasing levels. Furthermore while we can crack 60fps in single player mode, our rule of thumb here is that multiplayer framerates will dip to half our single player framerates, so hitting high framerates here may not be high enough.
BF3 typically favors NVIDIA cards, so it comes as no great shock that performance has once again flipped in advantage of the GTX 690.
Looking at our delta percentages this is another game where AMD has made massive gains; previously they’d be so unbalanced that every other frame for a long stretch of the benchmark would be a runt frame. 12.6% is the single best showing for the 7990 and much closer to where we would like AMD to be. They still have more than twice the variability of the GTX 690, but if every game were like this AMD would in a better position than where they’re going to be with this first phase of frame pacing.
The graphical representation of our FCAT data neatly matches our numeric analysis, once again showcasing AMD’s improved frame pacing, while showing how much farther they have to go to catch NVIDIA.
Finally, this is another case where improving the frame pacing situation by so much has greatly improved 95th percentile times, though the 7990 still trails the GTX 690, as you’d expect given the latter’s general performance lead.
Bioshock Infinite
Bioshock Infinite is Irrational Games’ latest entry in the Bioshock franchise. Though it’s based on Unreal Engine 3 (making it our obligatory UE3 game), Irrational has added a number of effects that make the game rather GPU-intensive on its highest settings. As an added bonus it includes a built-in benchmark composed of several scenes, a rarity for UE3 games, so we can easily get a good representation of what Bioshock’s performance is like.
AMD and NVIDIA exchange places once more, with the 7990 taking a small lead over the GTX 690.
AMD’s initial situation with Bioshock was not as dire as it was in, say, Battlefield 3, but with deltas approaching 60% it wasn’t pretty either. Once more they’ve managed to get their delta percentages to around 20%, a level that is acceptable for now while leaving clear room for improvement, especially as the 7990 once again delivers deltas more than twice those of the GTX 690.
Though on a side note, this game is a great reminder of just how much better single-GPU cards are at consistency. The best multi-GPU setup is at 8.2%; the worst single-GPU setup is 2.6%.
Graphically things are roughly as expected. It’s interesting to note that NVIDIA has some significant frame time spikes that AMD doesn’t encounter, though a single-GPU setup would shortcut the issue entirely.
AMD’s 95th percentile improvement isn’t nearly as pronounced in Bioshock. Meanwhile the higher variability does cost them just enough to have the 7990 fall behind the GTX 690 here.
Crysis 3
Our final benchmark in our suite needs no introduction. With Crysis 3, Crytek has gone back to trying to kill computers, taking back the “most punishing game” title in our benchmark suite. Only in a handful of setups can we even run Crysis 3 at its highest (Very High) settings, and that’s still without AA. Crysis 1 was an excellent template for the kind of performance required to drive games for the next few years, and Crysis 3 looks to be much the same for 2013.
Our last game and our last flip. AMD and NVIDIA exchange places one final time, with the GTX 690 and 7990 swapping out so that the GTX 690 takes the lead.
Crysis 3 is another game where AMD’s initial position wasn’t quite as bad, and consequently the improvements aren’t as great. 19% gets them to the acceptable range on the 7990, and with only 4 percentage points separating the 7990 and GTX 690, this is the closest the two cards have ever come to matching each other in frame time consistency.
Graphically we can see that both AMD and NVIDIA still struggle with consistency to some extent. GTX 690 in particular has a short run of very high variability about 10 seconds in that AMD doesn’t experience, likely due to AMD’s hard cap on minimum frame times.
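As a rough illustration of what a cap on minimum frame times does, the Python sketch below delays any frame that would otherwise be presented too soon after the previous one. This is a simplified model under our own assumptions, not AMD’s actual driver logic; the function name, the 10ms cap, and the sample timestamps are all hypothetical.

```python
def pace_frames(ready_times_ms, min_interval_ms):
    """Return presentation times such that no frame is shown sooner
    than min_interval_ms after the previous frame.

    Simplified, hypothetical model of a minimum frame time cap.
    """
    presented = []
    for ready in ready_times_ms:
        if presented:
            earliest_allowed = presented[-1] + min_interval_ms
            presented.append(max(ready, earliest_allowed))
        else:
            presented.append(ready)
    return presented


def intervals(times_ms):
    """Gaps between consecutive timestamps."""
    return [b - a for a, b in zip(times_ms, times_ms[1:])]


# Hypothetical AFR pattern: frames finish in closely spaced pairs.
ready = [0, 2, 30, 32, 60, 62]
paced = pace_frames(ready, min_interval_ms=10)

print("unpaced intervals:", intervals(ready))
print("paced intervals:  ", intervals(paced))
```

In this toy model the 2ms gaps disappear and the long gaps shrink to match, at the cost of holding some frames back briefly before they are shown.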
Finally on the matter of 95th percentile times, our data here mirrors what we’ve seen earlier. AMD shows a smaller gain, with their final value of 20.7ms still leaving them a couple of milliseconds behind the faster GTX 690.
Final Words
Bringing things to a close, today’s driver release isn’t about any grand revelations for AMD, but rather about AMD following through on their plans and commitments to improve frame time consistency on their products. We’ve seen AMD get their house in order with respect to single-GPU cards earlier this year, and now the same is starting to happen for multi-GPU setups.
To be clear here AMD’s initial situation should never have happened. AMD should have been doing the appropriate competitive analysis from the start, never letting themselves fall behind like this. But we aren’t in the business of berating companies that make an honest effort to improve their products like AMD is doing, so while AMD could have done better in the past they are finally making the right moves in the present, and it’s the present that’s going to be the most important for AMD’s customers.
So what does AMD’s present look like? Quite frankly, it looks a lot better than it did yesterday. AMD set out to greatly improve on their frame pacing situation on their Crossfire setups and they have delivered just that. With just one driver revision we have seen the Radeon HD 7990’s frame pacing go from laughable to acceptable; delta percentages of over 100% have been reduced to 20% or lower in 5 of the 6 games we’ve tested. For those sensitive to micro-stutter and other matters of consistency the difference is at times going to be staggering. At the most basic level, AMD has achieved their objectives.
With that said, there’s still room for improvement, and this goes for both functionality and further improvements in frame consistency. AMD’s triage-like approach means that D3D9, OpenGL, and most importantly Eyefinity are still not capable of using frame metering. These will be covered in future phases of AMD’s rollout of their frame pacing technology, and they can’t come soon enough, but for the time being these are limitations that need to be kept in mind.
Similarly while AMD’s frame pacing has improved to the point where we find it acceptable, 20% deltas are still generally twice that of NVIDIA’s GeForce GTX 690, never mind the extreme consistency single-GPU setups offer. We never realistically expected AMD to match NVIDIA’s frame interval consistency overnight, but in time it would be nice to see them get close, and for both parties to further improve beyond that.
Moving on, while today’s driver release is primarily one part of AMD’s longer term plan to deal with frame interval consistency, AMD is trying hard to also use this moment as something of a second launch for the Radeon HD 7990. With the 7990 launching in April it had the poor timing of arriving shortly after the multi-GPU frame pacing issue came to a head, which is something that has hobbled the card since its launch. In terms of absolute performance (average frame rates) things have changed very little since the launch of the 7990 so we’re not going to get into the matter of performance.
What has changed since then for the 7990 is first and foremost its frame pacing improvements as we’ve seen today. To be very clear here the GTX 690 is still the better card for those users heavily concerned about consistency, but AMD’s improvements have brought the 7990 to the point where we find its frame consistency generally acceptable. This isn’t a rousing recommendation nor is it meant to be, but it’s a reflection of the fact that AMD has brought their consistency to the point where pairing up multiple Tahiti GPUs as is done in the 7990 is no longer fraught with the frame pacing problems it once was. For most users I believe we’re at the point where the consistency differences are greater on paper than they are on the eyes, but of course that is going to depend on the visual acuity of the user.
Moving on, the other thing that has changed for AMD is pricing and competitive positioning. Officially AMD hasn’t cut the price on the 7990, but the fact that XFX is now offering a reference 7990 for $799 after rebate is not a coincidence. With AMD’s Level Up with Never Settle Reloaded promotion still running, AMD is making a clear play for the value segment right now. I don’t believe it’s where AMD would like to be, but there’s no arguing that it’s effective. For users who have a reasonable level of faith in Crossfire scaling and are satisfied with AMD’s frame pacing improvements, a $799 7990 is a very good deal at the moment.
With that in mind, we do want to reiterate that our editorial position here on AFR setups isn’t changing. We still favor strong single-GPU setups over weaker multi-GPU setups; this is a matter of valuing the lack of AFR profile requirements, coupled with the tendency for newly launched games to have immature AFR profiling, and of course the general consistency issues we’ve covered today. AFR is still the only way to further improve performance once the single-GPU route has been exhausted, and in AMD’s case it’s the only way to exceed the performance of a 7970 GHz Edition, so it does have its place.
Ultimately we have to give AMD the kudos they deserve. They have come forward about their issues, set out a plan to fix them, and have begun delivering on those plans. There’s still room for further improvement within AMD’s drivers, so AMD’s job is far from done, but today they have taken the first step needed to settle the frame pacing problems that have been dogging their products.