Original Link: https://www.anandtech.com/show/11717/the-amd-radeon-rx-vega-64-and-56-review



We’ve seen the architecture. We’ve seen the teasers. We’ve seen the Frontier. And we’ve seen the specifications. Now the end game for AMD’s Radeon RX Vega release is finally upon us: the actual launch of the hardware. Today is AMD’s moment to shine, as for the first time in over a year, they are back in the high-end video card market. And whether their drip feeding marketing strategy has ultimately succeeded in building up consumer hype or burnt everyone out prematurely, I think it’s safe to say that everyone is eager to see what AMD can do with their best foot forward on the GPU front.

Launching today is the AMD Radeon RX Vega 64, or just Vega 64 for short. Based on a fully enabled Vega 10 GPU, the Vega 64 will come in two physical variants: air cooled and liquid cooled. The air cooled card is your traditional blower-based design, and depending on the specific SKU, is either available in AMD’s traditional RX-style shroud, or a brushed-aluminum shroud for the aptly named Limited Edition.

Meanwhile the Vega 64 Liquid Cooled card is larger, more powerful, and more power hungry, utilizing a Radeon R9 Fury X-style external radiator as part of a closed loop liquid cooling setup in order to maximize cooling performance, and in turn clockspeeds. You actually won’t see AMD playing this card up too much – AMD considers the air cooled Vega 64 to be their baseline – but for gamers who seek the best Vega possible, AMD has put together quite a stunner.

Also having its embargo lifted today, but not launching until August 28th, is the cut-down AMD Radeon RX Vega 56. This card features lower clockspeeds and fewer enabled CUs – 56 out of 64, appropriately enough – however it also features lower power consumption and a lower price to match. Interestingly enough, going into today’s release of the Vega 64, it’s the Vega 56 that AMD has put the bulk of their marketing muscle behind.

AMD Radeon RX Series Specification Comparison
                        RX Vega 64 Liquid   RX Vega 64      RX Vega 56      R9 Fury X
Stream Processors       4096 (64 CUs)       4096 (64 CUs)   3584 (56 CUs)   4096 (64 CUs)
Texture Units           256                 256             224             256
ROPs                    64                  64              64              64
Base Clock              1406MHz             1247MHz         1156MHz         N/A
Boost Clock             1677MHz             1546MHz         1471MHz         1050MHz
Memory Clock            1.89Gbps HBM2       1.89Gbps HBM2   1.6Gbps HBM2    1Gbps HBM
Memory Bus Width        2048-bit            2048-bit        2048-bit        4096-bit
VRAM                    8GB                 8GB             8GB             4GB
Transistor Count        12.5B               12.5B           12.5B           8.9B
Board Power (Typical)   345W                295W            210W            275W
Manufacturing Process   GloFo 14nm          GloFo 14nm      GloFo 14nm      TSMC 28nm
Architecture            Vega (GCN 5)        Vega (GCN 5)    Vega (GCN 5)    GCN 3
GPU                     Vega 10             Vega 10         Vega 10         Fiji
Launch Date             08/14/2017          08/14/2017      08/28/2017      06/24/2015
Launch Price            $699*               $499/599*       $399/499*       $649

Between these SKUs, AMD is looking to take on NVIDIA’s longstanding gaming champions, the GeForce GTX 1080 and the GeForce GTX 1070. In both performance and pricing, AMD expects to be able to bring NVIDIA’s cards to a draw, if not pulling out a victory for Team Red. This means we’ll see the $500 Vega 64 set against the GTX 1080, while the $400 Vega 56 goes up against the GTX 1070. At the same time however, the dark specter of cryptocurrency mining hangs over the gaming video card market, threatening to disrupt pricing, availability, and the best-laid plans of vendors and consumers alike. Suffice it to say, this is a launch like no other in a time like no other.

Overall it has been an interesting past year and a half to say the least. With a finite capacity to design chips, AMD’s decision to focus on the mid-range market with the Polaris series meant that the company effectively ceded the high-end video card market to NVIDIA once the latter’s GeForce GTX 1080 and GTX 1070 launched. This has meant that for the past 15 months, NVIDIA has had free run of the high-end market. Meanwhile AMD’s focus on the mid-range market to win back market share meant that they initially got the jump on NVIDIA there, releasing Polaris ahead of NVIDIA’s answer, and their market share has recovered somewhat. However it’s a constant fight against a dominant NVIDIA, and one that’s been made harder by being essentially invisible to the few high-end buyers and the many window shoppers. That is a problem that ends today with the launch of the Vega 64.

I’d like to say that today’s launch is AMD landing a decisive blow in the video card marketplace, but the truth of the matter is that while AMD PR puts on their best face, there are signs that behind the scenes things are more chaotic than anyone would care for. Vega video cards were originally supposed to be out in the first half of this year, and while AMD technically made that window with the launch of the Vega Frontier Edition cards, it’s just that: a technicality. It was certainly not the launch that anyone was expecting at the start of 2017, especially since some of Vega’s new architectural functionality wasn’t even enabled at the time.

More recently, AMD’s focus on product promotion and on product sampling has been erratic. We’ve only had the Vega 64 since Thursday, giving us less than 4 days to completely evaluate the thing. Adding to the chaos, Thursday evening AMD informed us that we’d receive the Vega 56 on Friday, encouraging us to focus on that instead. The reasoning behind this is complex – I don’t think AMD knew if it could have Vega 56 samples ready, for a start – but ultimately boils down to AMD wanting to put their best foot forward. And right now, the company believes that the Vega 56 will do better against the GTX 1070 than the Vega 64 will do against the GTX 1080.

Regardless, it means that we’ve only had a very limited amount of time to evaluate the performance and architectural aspects of AMD’s new cards, and even less time to write about them. Never mind chasing down interesting odds & ends. So while this is a full review of the Vega 64 and Vega 56, there’s some further investigating left to do once we recover from this blitz of a weekend and get our bearings back.

So without further ado, let’s dig into AMD’s return to the high-end market with their Vega architecture, Vega 10 GPU, and the Vega 64 & Vega 56 video cards.



Vega 10: Fiji of the Stars

Before we dive into the Vega architecture itself, I want to start with the Vega 10 GPU proper, and as we look at its features you’ll soon understand why.

Vega 10 is for most practical purposes the successor to the Fiji GPU used in the Radeon R9 Fury and Nano products. And at face value this may seem a bit obvious – after all, it’s AMD’s first high-end GPU since then – but digging down a bit deeper, it’s interesting just how much like Fiji it is.

At a high level, Vega 10’s compute core is configured almost exactly like Fiji. This means we’re looking at 64 CUs spread out over 4 shader engines. Or as AMD is now calling them, compute engines. Each compute engine in turn is further allocated a portion of Vega 10’s graphics resources, amounting to one geometry engine and rasterizer bundle at the front end, and 16 ROPs (or rather 4 actual ROP units with a 4 pix/clock throughput rate) at the back end. Not assigned to any compute engine, but closely aligned with the compute engines is the command processor frontend, which like Fiji before it, is a single command processor paired with 4 ACEs and another 2 Hardware Schedulers.
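As a quick sanity check, the headline unit counts in the spec table fall straight out of this 4-engine layout. Here is a back-of-the-envelope Python sketch; the per-CU figures are the usual GCN values (64 stream processors and 4 texture units per CU), not something AMD restates in the Vega material:

```python
# Rough check of Vega 10's headline unit counts from the 4-compute-engine layout.
# Per-CU figures are the standard GCN values: 64 SPs and 4 texture units per CU.
compute_engines = 4
cus_per_engine = 16          # 64 CUs total / 4 engines
rops_per_engine = 16         # 4 ROP units x 4 pix/clock each

cus = compute_engines * cus_per_engine
stream_processors = cus * 64
texture_units = cus * 4
rops = compute_engines * rops_per_engine

print(f"CUs: {cus}, SPs: {stream_processors}, TMUs: {texture_units}, ROPs: {rops}")
# -> CUs: 64, SPs: 4096, TMUs: 256, ROPs: 64, matching the spec table
```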

On a brief aside, the number of compute engines has been an unexpectedly interesting point of discussion over the years. Back in 2013 we learned that the then-current iteration of GCN had a maximum compute engine count of 4, which AMD has stuck to ever since, including the new Vega 10.  Which in turn has fostered discussions about scalability in AMD’s designs, and compute/texture-to-ROP ratios.

Talking to AMD’s engineers about the matter, they haven’t taken any steps with Vega to change this. They have made it clear that 4 compute engines is not a fundamental limitation – they know how to build a design with more engines – however to do so would require additional work. In other words, the usual engineering trade-offs apply, with AMD’s engineers focusing on addressing things like HBCC and rasterization as opposed to doing the replumbing necessary for additional compute engines in Vega 10.

Not shown on AMD’s diagram, but confirmed in the specifications, is how the CUs are clustered together within a compute engine. On all iterations of GCN, AMD has bundled CUs together in a shader array, with up to 4 CUs sharing a single L1 instruction cache and a constant cache. For Vega 10, that granularity has gone up a bit, and now only 3 CUs share any one of these cache sets. As a result there are now 6 CU arrays per compute engine, up from 4 on Fiji.

It’s only once we get away from Vega 10’s compute core that we finally start to see some greater differences from Fiji. Besides being rewired to backstop the ROPs, the L2 cache has also been enlarged from 2MB on Fiji to 4MB on Vega 10. This growth not only gives Vega 10's L2 cache the room to serve the ROPs, but follows a general trend of ever-increasing cache sizes in GPUs.

But easily the biggest shift here is that AMD has moved from HBM to HBM2, and as a result they’ve halved the number of memory controllers from 4 to 2. As we’ll see in the card specifications, this costs Vega 10 just a bit of memory bandwidth since HBM2 hasn’t reached its intended speeds, but it saves AMD some die space, not to mention cuts down on the number of signal lines that need to be run off of the die and onto the silicon interposer.

Connecting the memory controllers to the rest of the GPU – and the various fixed function blocks as well – is AMD’s Infinity Fabric. The company’s home-grown technology for low-latency/low-power/high-bandwidth connections, this replaces Fiji’s unnamed interconnect method. Using the Infinity Fabric on Vega 10 is part of AMD’s efforts to develop a solid fabric and then use it across the company; we’ve already seen IF in use on Ryzen and Threadripper, and overall it’s a lot more visible in AMD’s CPUs than their GPUs. But it’s there, tying everything together.

On a related note, the Infinity Fabric on Vega 10 runs on its own clock domain. It’s tied to neither the GPU clock domain nor the memory clock domain. As a result, it’s not entirely clear how memory overclocking will fare on Vega 10. On AMD’s CPUs a faster IF is needed to carry overclocked memory. But since Vega 10’s IF connects a whole lot of other blocks – and AMD outright adjusts the IF’s clockspeed based on workload needs (e.g. video transcoding requires a fast VCE-to-PCIe link) – it’s not as straightforward as just overclocking the HBM2. Though similarly, HBM1 overclocking wasn’t very straightforward either, so Vega 10 is not a great improvement in this regard.

Otherwise, while all of the various fixed function units and engines have been updated over Fiji, their roles remain unchanged. So the multimedia engine, display engine, and XDMA engine are still present and accounted for.

Meanwhile it’s interesting to note that while Vega 10 is a replacement for Fiji, it is not a complete replacement for Hawaii. 2013’s Hawaii GPU was the last AMD GPU to be designed for HPC duties. Which is to say that it featured high FP64 performance (1/2 the FP32 rate) and ECC was available on the GPU’s internal pathways, offering a high reliability mode from GPU to DRAM and back again. Vega 10, on the other hand, only offers the same 1/16th FP64 rate found on all other recent AMD GPUs, and similarly doesn’t have internal ECC. Vega 10 does do better than Fiji in one regard though, and that’s that it has “free” ECC, since the feature is built into the HBM2 memory that AMD uses. So while it doesn’t offer end-to-end ECC, it does offer it within the DRAM itself. Which for AMD’s consumer, professional, and deep learning needs, is satisfactory.

All told then, Vega 10 measures in at 486mm2 (ed: a nice number if I ever saw one), and like Polaris and the Ryzen CPUs, it’s built on partner GlobalFoundries’ 14nm LPP process. Within AMD’s historical pantheon of GPUs, this makes it 48mm2 larger than Hawaii and 110mm2 smaller than the late-generation Fiji. AMD has been producing GPUs at GlobalFoundries for a while now, so in a sense this is a logical progression from Polaris 10. On the other hand, as AMD’s first high-end chip for the 14nm generation, this is the biggest chip they’ve ever started a generation with.

That space is put to good use however, as it contains a staggering 12.5 billion transistors. This is 3.6B more than Fiji, and still 500M more than NVIDIA’s GP102 GPU. So outside of NVIDIA’s dedicated compute GPUs, the GP100 and GV100, Vega 10 is now the largest consumer & professional GPU on the market.

Given the overall design similarities between Vega 10 and Fiji, this gives us a very rare opportunity to look at the cost of Vega’s architectural features in terms of transistors. Without additional functional units, the vast majority of the difference in transistor counts comes down to enabling new features.

Talking to AMD’s engineers, what especially surprised me is where the bulk of those transistors went; the single largest consumer of the additional 3.6B transistors was the work done to let the chip clock much higher than Fiji. Vega 10 can reach 1.7GHz, whereas Fiji couldn’t do much more than 1.05GHz. Additional transistors are needed to add pipeline stages at various points or build in latency hiding mechanisms, as electrons can only move so far on a single (ever shortening) clock cycle; this is something we’ve seen in NVIDIA’s Pascal, not to mention countless CPU designs. Still, what it means is that those extra transistors are serving a very important performance purpose: allowing AMD to clock the card high enough to see significant performance gains over Fiji.

Overall Vega 10 is a very important chip for AMD because it’s going to be pulling double (if not triple) duty for AMD. It’s their flagship consumer GPU, but it’s also their flagship professional GPU, and it’s their flagship server GPU. This goes for both deep learning (Vega Instinct) and potential other future server products, such as virtualization cards. As AMD likes to boast, they had to do it all with one chip rather than NVIDIA’s hyper-segmented stack. Of course the reality is that AMD doesn’t have the resources to mirror NVIDIA’s efforts 1-to-1, so it means they have to be smarter about what they do in order to make the most of Vega 10.

Vega 10 won’t be alone however. As early as last year AMD reps confirmed that there’s a Vega 11 in the works, though at this time AMD isn’t saying anything about the chip. Given that Vega 10 is already a fairly large chip, and that Polaris chips decreased in size with their number, I’d expect Vega 11 to be a smaller version of Vega. Though where that fits into the Vega 10/Polaris 10 stack is anyone’s guess at this point.



The Vega Architecture: AMD’s Brightest Day

From an architectural standpoint, AMD’s engineers consider the Vega architecture to be their most sweeping architectural change in five years. And looking over everything that has been added to the architecture, it’s easy to see why. In terms of core graphics/compute features, Vega introduces more than any other iteration of GCN before it.

Speaking of GCN, before getting too deep here, it’s interesting to note that at least publicly, AMD is shying away from the Graphics Core Next name. GCN doesn’t appear anywhere in AMD’s whitepaper, while in programmers’ documents such as the shader ISA, the name is still present. But at least for the purposes of public discussion, rather than using the term GCN 5, AMD is consistently calling it the Vega architecture. Though make no mistake, this is still very much GCN, so AMD’s basic GPU execution model remains.

So what does Vega bring to the table? Back in January we got what has turned out to be a fairly extensive high-level overview of Vega’s main architectural improvements. In a nutshell, Vega is:

  • Higher clocks
  • Double rate FP16 math (Rapid Packed Math)
  • HBM2
  • New memory page management for the high-bandwidth cache controller
  • Tiled rasterization (Draw Stream Binning Rasterizer)
  • Increased ROP efficiency via L2 cache
  • Improved geometry engine
  • Primitive shading for even faster triangle culling
  • Direct3D feature level 12_1 graphics features
  • Improved display controllers

The interesting thing is that even with this significant number of changes, the Vega ISA is not a complete departure from the GCN4 ISA. AMD has added a number of new instructions – mostly for FP16 operations – along with some additional instructions that they expect to improve performance for video processing and some 8-bit integer operations, but nothing that radically upends Vega from earlier ISAs. So in terms of compute, Vega is still very comparable to Polaris and Fiji in terms of how data moves through the GPU.

Consequently, the burning question I think many will ask is whether the effective compute IPC is significantly higher than Fiji, and the answer is no. AMD has actually taken significant pains to keep the throughput latency of a CU at 4 cycles (4 stages deep), however strictly speaking, existing code isn’t going to run any faster on Vega than on earlier architectures. In order to wring the most out of Vega’s new CUs, you need to take advantage of the new compute features. Note that this doesn’t mean that compilers can’t take advantage of them on their own, but especially where datatypes are concerned, it’s important that code be designed for lower precision datatypes to begin with.



Rapid Packed Math: Fast FP16 Comes to Consumer Cards (& INT16 Too!)

Arguably AMD’s marquee feature from a compute standpoint for Vega is Rapid Packed Math. Which is AMD’s name for packing two FP16 operations inside of a single FP32 operation in a vec2 style. This is similar to what NVIDIA has done with their high-end Pascal GP100 GPU (and Tegra X1 SoC), which allows for potentially massive improvements in FP16 throughput. If a pair of instructions are compatible – and by compatible, vendors usually mean instruction-type identical – then those instructions can be packed together on a single FP32 ALU, increasing the number of lower-precision operations that can be performed in a single clock cycle. This is an extension of AMD’s FP16 support in GCN 3 & GCN 4, where the company supported FP16 data types for the memory/register space savings, but FP16 operations themselves were processed no faster than FP32 operations.
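To put rough numbers on what vec2-style packing means for peak throughput, here is a quick Python sketch using the Vega 64 boost clock from the spec table; these are theoretical peaks, not measured performance, and the 2 FLOPs-per-FMA convention is the usual way such figures are quoted:

```python
# Rough peak-throughput arithmetic for Rapid Packed Math on Vega 64.
stream_processors = 4096
boost_clock_ghz = 1.546
flops_per_sp_per_clock = 2        # one FMA counts as 2 FLOPs

fp32_tflops = stream_processors * boost_clock_ghz * flops_per_sp_per_clock / 1000
fp16_tflops = fp32_tflops * 2     # two FP16 ops packed per FP32 ALU, vec2 style

print(f"Peak FP32: {fp32_tflops:.1f} TFLOPS")   # ~12.7 TFLOPS
print(f"Peak FP16: {fp16_tflops:.1f} TFLOPS")   # ~25.3 TFLOPS
```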

The purpose of integrating fast FP16 and INT16 math is all about power efficiency. Processing data at a higher precision than is necessary unnecessarily burns power, as the extra work required for the increased precision accomplishes nothing of value. In this respect fast FP16 math is another step in GPU designs becoming increasingly min-maxed; the ceiling for GPU performance is power consumption, so the more energy efficient a GPU can be, the more performant it can be.

Taking advantage of this feature, in turn, requires several things. It requires API support and it requires compiler support, but above all it requires code that explicitly asks for FP16 data types. The reason why that matters is two-fold: virtually no existing programs use FP16s, and not everything that is FP32 is suitable for FP16. In the compute world especially, precisions are picked for a reason, and compute users can be quite fussy on the matter. Which is why fast FP64-capable GPUs are a whole market unto themselves. That said, there are whole categories of compute tasks where the high precision isn’t necessary; deep learning is the poster child right now, and for Vega Instinct AMD is practically banking on it.

As for gaming, the situation is more complex still. While FP16 operations can be used for games (and in fact are somewhat common in the mobile space), in the PC space they are virtually never used. When PC GPUs made the jump to unified shaders in 2006/2007, the decision was made to do everything at FP32 since that’s what vertex shaders typically required to begin with, and it’s only recently that anyone has bothered to look back. So while there is some long-term potential here for Vega’s fast FP16 math to become relevant for gaming, at the moment it doesn’t do much outside of a couple of benchmarks and some AMD developer relations enhanced software. Vega will, for the present, live and die in the gaming space primarily based on its FP32 performance.

The biggest obstacle for AMD here in the long-term is in fact NVIDIA. NVIDIA also supports native FP16 operations, however unlike AMD, they restrict it to their dedicated compute GPUs (GP100 & GV100). GP104, by comparison, offers a painful 1/64th native FP16 rate, making it just useful enough for compatibility/development purposes, but not fast enough for real-world use. So for AMD there’s a real risk of developers not bothering with FP16 support when 70% of all GPUs sold similarly don’t support it. It will be an uphill battle, but one that can significantly improve AMD’s performance if they can win it, and even more so if NVIDIA chooses not to budge on their position.

Though overall it’s important to keep in mind here that even in the best case scenario, only some operations in a game are suitable for FP16. So while FP16 execution is twice as fast as FP32 execution on paper specifically for a compute task, the percentage of such calculations in a game will be lower. In AMD’s own slide deck, they illustrate this, pointing out that using 16-bit functions makes specific rendering steps of 3DMark Serra 20-25% faster, and those are just parts of a whole.
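An Amdahl's-law style sketch makes the point concrete: doubling the speed of the FP16-friendly portion of a frame only buys a modest overall gain. The fractions below are purely illustrative, not measurements from any game:

```python
# Amdahl's-law style sketch: if only part of a frame's GPU work can move to
# packed FP16, the overall speedup is much smaller than 2x.
def frame_speedup(fp16_fraction, fp16_speedup=2.0):
    """Overall speedup when fp16_fraction of the frame runs fp16_speedup faster."""
    return 1.0 / ((1.0 - fp16_fraction) + fp16_fraction / fp16_speedup)

for fraction in (0.1, 0.25, 0.5):
    print(f"{fraction:.0%} of frame in FP16 -> {frame_speedup(fraction):.2f}x overall")
# 10% -> 1.05x, 25% -> 1.14x, 50% -> 1.33x
```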

Moving on, AMD is also offering limited native 8-bit support via a pair of specific instructions. The Quad Sum of Absolute Differences (QSAD) and its masked variant can be executed on Vega in a highly packed form using 8-bit integers. SADs are a rather common image processing operation, and are particularly relevant for AMD’s Instinct efforts since they are used in image recognition (a major deep learning task).

Finally, let’s talk about API support for FP16 operations. The situation isn’t crystal-clear across the board, but for certain types of programs, it’s possible to use native FP16 operations right now.

Surprisingly, native FP16 operations are not currently exposed to OpenCL, according to AIDA64. So within a traditional AMD compute context, it doesn’t appear to be possible to use them. This obviously isn’t going to remain the case, and while AMD hasn’t been able to offer more details by press time, I expect that they’ll expose FP16 operations under OpenCL (and ROCm) soon enough.

Meanwhile, HLSL Shader Model 5.x, which is used in DirectX 11 and 12, does support native FP16 operations. And so does Vulkan, for that matter. So it is possible to use FP16 right now, even in games. Running SiSoftware’s Sandra GP GPU benchmark with a DX compute shader shows a clear performance advantage, albeit not a complete 2x advantage, with the switch to FP16 improving compute throughput by 70%.

However based on some other testing, I suspect that native FP16 support may only be enabled/working for compute shaders at this time, and not for pixel shaders. In which case AMD may still have some work to do. But for developers, the message is clear: you can take advantage of fast FP16 performance today.



Sizing Up Today’s Launch: RX Vega 64 & RX Vega 56

Now that we’ve gone through the architectural details of Vega, let’s size up today’s launch with the Radeon RX Vega 64 and Radeon RX Vega 56. While this isn’t quite a traditional one-two launch since Vega 56 doesn’t come out for another two weeks, it otherwise follows the usual pattern. That means a high-performance, fully-enabled card at the top, with its salvaged, lower-clocked, lower-priced counterpart below it.

AMD Radeon RX Series Specification Comparison
                        RX Vega 64 Liquid   RX Vega 64      RX Vega 56      R9 Fury X
Stream Processors       4096 (64 CUs)       4096 (64 CUs)   3584 (56 CUs)   4096 (64 CUs)
Texture Units           256                 256             224             256
ROPs                    64                  64              64              64
Base Clock              1406MHz             1247MHz         1156MHz         N/A
Boost Clock             1677MHz             1546MHz         1471MHz         1050MHz
Memory Clock            1.89Gbps HBM2       1.89Gbps HBM2   1.6Gbps HBM2    1Gbps HBM
Memory Bus Width        2048-bit            2048-bit        2048-bit        4096-bit
VRAM                    8GB                 8GB             8GB             4GB
Transistor Count        12.5B               12.5B           12.5B           8.9B
Board Power (Typical)   345W                295W            210W            275W
Manufacturing Process   GloFo 14nm          GloFo 14nm      GloFo 14nm      TSMC 28nm
Architecture            Vega (GCN 5)        Vega (GCN 5)    Vega (GCN 5)    GCN 3
GPU                     Vega 10             Vega 10         Vega 10         Fiji
Launch Date             08/14/2017          08/14/2017      08/28/2017      06/24/2015
Launch Price            $699*               $499/599*       $399/499*       $649

First off of course is the Radeon RX Vega 64. Based on a fully enabled Vega 10 GPU, this card is AMD’s best foot forward on Vega 10 performance. Living up to its name, the Vega 64 ships with all 64 of Vega 10’s CUs enabled, giving the card 4096 SPs and 256 texture units. These CUs are in turn paired with AMD’s now L2 cache-backed ROPs, with a complete set of 64 of them.

In terms of clockspeed then, the Vega 64 can reach speeds significantly higher than AMD’s Polaris cards, never mind the 28nm Fury X. In fact clockspeed is the single greatest resource AMD has for improving performance in existing games relative to Fury X, as while the architecture optimizations we talked about earlier do help performance in specific situations, nothing else has the raw potency and consistency of Vega’s much greater clockspeeds.

To this end the base clock is admittedly a bit low at 1247MHz, however the boost clock is at 1546MHz. As a reminder, AMD is taking a more NVIDIA-like stance with clockspeed advertising, so whereas Fury X’s boost clock was its highest attainable clockspeed (throttling when it couldn’t sustain it), RX Vega’s boost clock is the average clockspeed AMD expects the card to be able to sustain under gaming workloads. The card itself can actually boost even higher than this given a low-powered workload (e.g. something compute-heavy), with both of our sample cards boosting up to 1630MHz.

All told then, relative to its Fury X predecessor, on paper Vega 64 offers 47% higher shader, texture, geometry, and ROP throughput. These numbers don’t factor in Vega’s architecture enhancements, and in practice the actual performance gain will depend on what clockspeeds the Vega 64 can attain, and conversely how much a given workload benefits from those aforementioned architectural improvements. So as with past architecture launches, the specs are only half of the story and the benchmarks will tell the rest.
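That 47% figure is purely a clockspeed ratio, since the two GPUs have identical unit counts (64 CUs, 256 texture units, 64 ROPs). A one-line check:

```python
# Paper throughput gain over Fury X follows directly from the boost clocks,
# since the unit counts are identical.
fury_x_boost_mhz = 1050
vega_64_boost_mhz = 1546

gain = vega_64_boost_mhz / fury_x_boost_mhz - 1
print(f"Paper throughput gain: {gain:.0%}")   # ~47%
```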

Paired with the Vega 10 GPU is 8GB of HBM2 memory, in the form of a pair of 4-Hi 4GB stacks. The two stacks are connected to Vega 10 via a 2048-bit wide memory bus running through the silicon interposer, and coupled with HBM2’s improved clockspeeds, this gives the Vega 10 GPU a lot of memory bandwidth in little space. AMD has clocked the memory at 1.89Gbps, giving the Vega 64 484GB/sec of memory bandwidth. With the original goal for the flagship card being a 2Gbps data rate (for a total of 512GB/sec), this does put Vega 64 at a slight disadvantage versus the previous Fury X. Which means that from an architectural perspective, AMD has to do more with slightly less.
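For reference, the bandwidth figures above come straight from the bus width and per-pin data rate; a quick sketch of the arithmetic:

```python
# Memory bandwidth from the HBM2 configuration: 2 stacks x 1024-bit = 2048-bit
# bus at a 1.89Gbps per-pin data rate.
bus_width_bits = 2048
data_rate_gbps = 1.89

bandwidth_gbs = bus_width_bits * data_rate_gbps / 8
print(f"Vega 64: {bandwidth_gbs:.0f} GB/s")                            # ~484 GB/s
print(f"Fury X:  {4096 * 1.0 / 8:.0f} GB/s")                           # 512 GB/s
print(f"At the original 2Gbps goal: {2048 * 2.0 / 8:.0f} GB/s")        # 512 GB/s
```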

Moving on, for better or worse the Vega 64 is a high-powered card in all senses of the word. AMD’s official board power specification for the card is 295W. Board power is a bit of a new metric for AMD; in previous generations they have published the typical board power of a card, whereas the 295W value here is no longer considered typical. Truthfully I don’t have a full grasp of the difference, but given our data and the fact that this value is higher than the Fury X’s typical board power of 275W, this seems to function closer to a maximum for AMD. Which means that depending on the scenario a card can draw less than 295W, however as virtually all GPU-bound scenarios are also power-bound scenarios, it’s as good a number as any for a full load power specification.

For AMD, a high board power specification isn’t especially new. Fury X, Fury, and the R9 290X all shipped with high TBPs as well. But it means buyers should set their expectations accordingly for how the card will compare to past cards and the competition in terms of power consumption, and what acoustics might be like on this blower-based card.

Finally, while we’re not reviewing it today, AMD also has the RX Vega 64 Liquid Cooled edition. AMD is wisely not treating this card as the “baseline” performance of the Vega 64, but rather as an enhanced enthusiast edition. This card has a boost clock rating of 1677MHz – 131MHz above the air cooled card – and should perform a bit better than its non-liquid counterpart. However power consumption has more than gone up to match, with a 345W board power rating. This is going to be a low-volume halo part for enthusiasts who want the fastest Vega 64 possible, regardless of what it means for pricing or power consumption, and AMD is treating it accordingly.

Radeon RX Vega 56

Second on deck for today’s review embargo and arguably the focus of AMD’s promotional efforts is the Radeon RX Vega 56. The lower-tier counterpart to the RX Vega 64, the RX Vega 56 is the obligatory cut-down version of the RX Vega family. This features the same Vega 10 GPU as the Vega 64, however as accurately described in the name, only 56 of the 64 CUs are enabled on this card, leaving it with 3584 stream processors and 224 texture units.

Clockspeeds have also been cut down for the Vega 56, leading to the card shipping with a 1156MHz base clock and 1471MHz boost clock. On paper then, it offers 83% of the RX Vega 64’s compute and texturing performance, and 95% of the Vega 64’s ROP and geometry performance. Consequently, how the Vega 56 will perform in games has the potential to swing anywhere between a solid step below AMD’s flagship card, and something that gets a bit too close for comfort.

Meanwhile like its high-tier counterpart, the Vega 56 gets 8GB of HBM2 memory. Like its GPU clockspeed, memory clockspeeds have also been reduced to a less aggressive frequency here, leading to AMD shipping the card with a 1.6Gbps data rate. All told then, this gives the Vega 56 410GB/sec of memory bandwidth to work with, about 85% of Vega 64’s. Given that Vega 56’s compute/shading throughput is also about 85% of Vega 64’s, you can see how this memory configuration is a good match for Vega 56’s GPU configuration.

The upside of pulling back on performance is that AMD has also been able to pull back on power consumption. Vega 56 shaves 85W off of its board power rating, bringing it down to 210W. Which considering that on paper Vega 56 should deliver 85% of the performance of Vega 64, doing so at 71% of the power consumption looks very tantalizing in its own way.
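To make the relative positioning concrete, here is the arithmetic behind the ratios quoted over the last few paragraphs, pulling the numbers from the spec table; a small Python sketch rather than anything official from AMD:

```python
# Where the Vega 56 ratios quoted above come from, using the spec table values.
v64 = {"sps": 4096, "boost": 1546, "mem_gbps": 1.89, "power": 295}
v56 = {"sps": 3584, "boost": 1471, "mem_gbps": 1.60, "power": 210}

compute_ratio = (v56["sps"] * v56["boost"]) / (v64["sps"] * v64["boost"])
rop_geo_ratio = v56["boost"] / v64["boost"]      # same ROP/geometry counts, only clocks differ
bw_ratio      = v56["mem_gbps"] / v64["mem_gbps"]
power_ratio   = v56["power"] / v64["power"]

print(f"Compute/texturing: {compute_ratio:.0%}")  # ~83%
print(f"ROP/geometry:      {rop_geo_ratio:.0%}")  # ~95%
print(f"Memory bandwidth:  {bw_ratio:.0%}")       # ~85%
print(f"Board power:       {power_ratio:.0%}")    # ~71%
```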

The one catch specific to the Vega 56’s launch, however, is that it isn’t actually launching today. Only the Vega 64 is going on sale now, while the Vega 56 will not be for sale until August 28th, making this a bit of a paper launch.



The Potential Cloud over Vega: Cryptocurrency Demand

While AMD has a solid game plan in place in terms of products, getting those cards to gamers may be the trickier proposition. For the last few months the company has been selling every last Polaris 10 GPU they can produce, which is not a bad situation to be in for the underdog GPU manufacturer. However those great sales have not come courtesy of gamers, but rather of cryptocurrency miners, who have been riding a wave of high coin values.

As a result, for gamers this has been the summer of waiting. Miner demand has rapidly depleted the video card market, sending prices spiking on those cards that remain. Unfortunately this has come at the cost of gamers, who have been left with few options besides waiting for more cards to become available, or paying sometimes vastly inflated prices for mid-range and enthusiast cards. Worse still, it has reached a point where it’s impacting NVIDIA card pricing and availability as well, so any value-priced GeForce cards are going just as quickly.

Consequently, for the Vega launch AMD has significant concerns about whether they’ll be able to keep cards from getting scooped up by miners. The problem is one of pure economics – video cards are downright cheap for the amount of silicon they come with – which limits AMD’s options here. If Vega cards are good at mining, then miners will buy them until the market reaches a new equilibrium, one where everyone pays more. AMD has a counterplan in place with their Radeon Pack bundles (more on this in a minute), but ultimately AMD is not in control of Vega demand.

The good news is that as it stands right now, it’s looking like Vega-based cards aren’t going to be especially good at mining, dodging the problem Polaris continues to face. However AMD also recognizes that the biggest groups engaging in cryptocurrency mining have significant resources to throw into optimizations – including firmware modifications – which means that miners can’t be counted out quite yet. Even if Vega doesn’t perform well with current coins, all it would take is for another Vega-friendly coin to spike in price to start the whole process over again.

Competitive Positioning & Radeon Packs

Moving on, let’s talk about the competitive positioning of these cards and how AMD will be making them available to the public. With the RX Vega series, AMD is taking a direct shot at the heart and soul of NVIDIA’s enthusiast range of video cards, the GeForce GTX 1080 and GTX 1070.

We’ve previously been told to expect the Vega 64 to trade blows with the GTX 1080, and AMD is pricing it accordingly. The bare card will have an official MSRP of $499, which is the same as the official MSRP of the GTX 1080. In practice, the GTX 1080 has been running for around $30 more than that due to the effects of miner demand. So if the Vega 64 launches at $499 and stays at that price, it would undercut the GTX 1080 in price by a small amount.

Meanwhile AMD is a lot more gung-ho on the Vega 56, pricing it aggressively and stating that they expect it to take a solid lead over the GeForce GTX 1070. Complicating matters here significantly is that the GTX 1070 has been particularly popular with miners, and as a result market prices are well off of its official $379 MSRP. MSRP-to-MSRP, Vega 56 versus GTX 1070 is a very interesting fight, but at the GTX 1070’s market prices it’s a different matter. If AMD actually gets the Vega 56 out at $399 and holds it there, they’d have a significant price advantage and decent performance advantage over the GTX 1070, giving them a strong position as the value choice. However with Vega 56 not set to launch for another two weeks, there are a lot of “ifs” in the above statements and we’ll have to wait to see where retail prices actually land.

Throwing one last complication into matters, as part of their efforts to stymie cryptocurrency miners, AMD is offering RX Vega cards both with and without bundles. The bundles are going to be especially important, because these are no ordinary bundles. But if you just want a card – full stop – then AMD will be selling both the Vega 56 and Vega 64 stand-alone. These cards will go for $399 and $499 respectively.

As for AMD’s fancier Vega 64 Limited Edition and Vega 64 Liquid Cooled Edition cards, these will not be available on a stand-alone basis at all. If you want these cards, they will only be available as part of a bundle.

However what’s not being said by AMD is how much of the RX Vega launch supply will be allocated to stand-alone cards. For various reasons the bundles are more lucrative to AMD and its partners, and as a result it’s difficult to imagine AMD sending the bulk of the cards to stand-alone configurations, or even a 50-50 split. Instead I expect that the bulk of the cards will go towards the bundles. So while we’re going to have to see what launch day brings next month, I’m not convinced that stand-alone cards will be available in great numbers.

Instead, if you want an RX Vega on launch day, odds are you’ll be looking at one of AMD’s bundles, and this is where things take an interesting turn. AMD has offered hardware bundles and game bundles before, but for their latest Radeon family, they have never offered a bundle quite like this.

For what amounts to a $100 premium, AMD is doing a combined software and hardware bundle. AMD calls these the Radeon Packs, and they include bundled games and hardware discounts. In short, if you commit to buying an RX Vega card for $100 more, you get access to AMD’s best cards with additional games and some sizable hardware discounts.

Overall the bundle contains the following items: an RX Vega video card, a $200 discount on Samsung’s CF791 34-inch WQHD Curved Freesync monitor, a $100 discount on a Ryzen 7 + motherboard bundle, and 2 AAA games (for North America this is Wolfenstein II and Prey, coming from AMD’s close partner Bethesda).

Bundle Contents

  1.  AMD Radeon RX Vega Card
  2.  2 Bundled Games (For US: Wolfenstein II and Prey)
  3.  $200 Discount: Samsung CF791 34-inch Widescreen Freesync Monitor
  4.  $100 Discount: Ryzen 7 + Motherboard Combo

Perhaps the most important aspect of the deal is how the bundles work. Unlike traditional retailer bundles, these AMD bundles require purchasing the Radeon Pack version of the card. If you buy the stand-alone RX Vega cards, then you won’t get the games or qualify for the hardware discounts. And these Radeon Pack SKUs are, in turn, priced $100 over the stand-alone cards.

As for the contents of the bundle, the bundled games are rather straightforward: the bundle simply comes with the games (presumably in voucher form). For the hardware however, the discounts come in the form of an instant rebate on the hardware, not a voucher or other “bankable” form. This means that if you buy a Radeon Pack SKU, you must buy the discounted hardware at the same time to get the discount. You are not required to buy the hardware – instead paying the $100 premium just for the card selection and included games – but the discounts cannot be saved until later. It’s now or never.

All told then, the total cost to take full advantage of a Radeon Pack will not be cheap. The packs themselves are priced at $499 for the Vega 56 pack (Red Pack), $599 for a Vega 64 pack (Black Pack), or $699 for a Vega 64 Liquid Cooled Edition pack (Aqua Pack). Meanwhile, turning to our favorite retailer Newegg, the Samsung CF791 currently retails for $936, a Ryzen 7 1700X is $339, and a good X370 motherboard is $129 (or more). This would bring the total cost, after the $300 in rebates, to $1700.
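For the curious, the tally behind that figure works out as follows, using the Vega 64 Black Pack and the Newegg prices quoted above (the exact total will obviously move with retail pricing):

```python
# Tallying the full Radeon Pack buy-in (Vega 64 Black Pack example) using the
# Newegg prices quoted in the text.
black_pack  = 599   # Vega 64 Radeon Pack
cf791       = 936   # Samsung CF791 monitor
ryzen_1700x = 339
x370_board  = 129
rebates     = 300   # $200 monitor + $100 CPU/motherboard combo

total = black_pack + cf791 + ryzen_1700x + x370_board - rebates
print(f"Total outlay: ${total}")   # ~$1,700
```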

Update: As of 10am ET, Newegg's entire allocation of Vega 64 cards has sold through. This includes the stand-alone cards and the bundled cards. Indications are that Newegg's stock of cards sold through in under 15 minutes.

Summer 2017 GPU Pricing Comparison (Crypto-Crazy Edition)
AMD                     Price    NVIDIA
Radeon RX Vega 64       $499     GeForce GTX 1080
                        $449     GeForce GTX 1070
Radeon RX Vega 56       $399
Radeon RX 580 (8GB)     $299     GeForce GTX 1060 (6GB)


The 2017 GPU Benchmark Suite & the Test

Paired with our RX Vega 64 and 56 review is a new benchmark suite and new testbed. The 2017 GPU suite features new games, as well as new compute and synthetic benchmarks.

Games-wise, we have kept Grand Theft Auto V and Ashes of the Singularity: Escalation. Joining them is Battlefield 1, DOOM, Tom Clancy’s Ghost Recon Wildlands, Warhammer 40,000: Dawn of War III, Deus Ex: Mankind Divided, F1 2016, and Total War: Warhammer. All-in-all, these games span multiple genres, differing graphics workloads, and contemporary APIs, with a nod towards modern and relatively intensive games. Additionally, we have retired the venerable Crysis 3 as our mainline power-testing game in favor of Battlefield 1.

AnandTech GPU Bench 2017 Game List
Game                                    Genre                    Release Date   API(s)
Battlefield 1                           FPS                      Oct. 2016      DX11 (DX12)
Ashes of the Singularity: Escalation    RTS                      Mar. 2016      DX12 (DX11)
DOOM (2016)                             FPS                      May 2016       Vulkan (OpenGL 4.5)
Ghost Recon Wildlands                   FPS/3PS                  Mar. 2017      DX11
Dawn of War III                         RTS                      Apr. 2017      DX11
Deus Ex: Mankind Divided                RPG/Action/Stealth       Aug. 2016      DX11 (DX12)
Grand Theft Auto V                      Action/Open world        Apr. 2015      DX11
F1 2016                                 Racing                   Aug. 2016      DX11
Total War: Warhammer                    TBS/Real-time tactics    May 2016       DX11 + DX12

In terms of data collection, measurements were gathered either using built-in benchmark tools or with AMD's open-source Open Capture and Analytics Tool (OCAT), which is itself powered by Intel's PresentMon. 99th percentiles were obtained or calculated in a similar fashion: OCAT natively obtains 99th percentiles, GTA V's built-in benchmark includes 99th percentiles, and both Ashes: Escalation and Total War: Warhammer's built-in benchmarks output raw frame time data. Dawn of War III continues to suffer from its misconfigured built-in benchmark calculations and so its native data cannot be used. In general, we prefer 99th percentiles over minimums, as they more accurately represent the gaming experience and filter out outliers that may not even be true results of the graphics card.
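For readers unfamiliar with the metric, here is a minimal sketch (using numpy) of how a 99th-percentile framerate can be derived from raw frame time data like what the Ashes and Total War benchmarks output. It mirrors the general approach rather than OCAT's exact implementation, and the sample frame times are hypothetical:

```python
import numpy as np

def summarize(frame_times_ms):
    """Turn raw per-frame times (ms) into average and 99th-percentile framerates."""
    frame_times = np.asarray(frame_times_ms, dtype=float)
    avg_fps = 1000.0 / frame_times.mean()
    # The 99th-percentile frame time (slowest 1% of frames) converts to the
    # "99th percentile framerate" figure used in the graphs.
    p99_frame_time = np.percentile(frame_times, 99)
    p99_fps = 1000.0 / p99_frame_time
    return avg_fps, p99_fps

# Hypothetical run: mostly ~16.7ms frames with a couple of 30ms+ hitches.
times = [16.7] * 98 + [30.0, 33.0]
avg, p99 = summarize(times)
print(f"Average: {avg:.1f} fps, 99th percentile: {p99:.1f} fps")
```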

We are continuing to use the best API for a given card when given a choice. As before, we use DirectX 12 for Ashes of the Singularity: Escalation, being natively designed for that API. For DOOM (2016), using Vulkan is an improvement over OpenGL 4.5 across the board, and for those not in-the-know, Vulkan is to OpenGL roughly what DX12 is to DX11. We also stick to DX11 for Battlefield 1, with the persistent DX12 performance issues in mind, and similar reasoning follows for Deus Ex: Mankind Divided, where DX12 did not appear to give the best performance for RX Vega.

In the same vein, we have used DX12 for Total War: Warhammer when testing AMD cards, but we are still sussing out the exact effects on the Vega cards. With Vega running Total War: Warhammer, neither API seems to be absolutely better performing than the other, and we are continuing to investigate.

2017 GPU Compute and Synthetics

We have also updated our compute and synthetics suites, which are now as follows:

  • Compute: Blender 2.79 - BlenchMark
  • Compute: CompuBench 2.0 – Level Set Segmentation 256
  • Compute: CompuBench 2.0 – N-Body Simulation 1024K
  • Compute: CompuBench 2.0 – Optical Flow
  • Compute: Folding @ Home Single Precision
  • Compute: Geekbench 4 – GPU Compute – Total Score
  • Synthetics: TessMark, Image Set 4, 64x Tessellation
  • Synthetics: VRMark Orange
  • Synthetics: Beyond3D Suite – Pixel Fillrate
  • Synthetics: Beyond3D Suite – Integer Texture Fillrate (INT8)
  • Synthetics: Beyond3D Suite – Floating Point Texture Fillrate (FP32)

Testing with Vega

Testing was done with default configurations with respect to the High-Bandwidth Cache Controller (HBCC) and BIOS/power profiles. By default, HBCC is disabled in Radeon Software. As for power profiles, both Vega 64 and 56 come with primary and secondary VBIOS modes, each having three profiles in WattMan: Power Saver, Balanced, and Turbo. By default, both cards use the primary VBIOS' Balanced power profile.

GPU Power Limits for RX Vega Power Profiles
              Radeon RX Vega 64 Air                Radeon RX Vega 56
              Primary VBIOS    Secondary VBIOS     Primary VBIOS    Secondary VBIOS
Power Saver   165W             150W                150W             135W
Balanced      220W             200W                165W             150W
Turbo         253W             230W                190W             173W

A small switch on the cards can be toggled away from the PCIe bracket for the lower power secondary VBIOS. In Radeon WattMan, a slider permits switching between Power Saver, Balanced, Turbo, and Custom performance profiles. In total, each card has six different power profiles to choose from. RX Vega 64 Liquid has its own set of six profiles as well, ranging from 165W to 303W. We don't expect Turbo mode to significantly change results: for Turbo vs. Balanced, AMD themselves cited a performance increase of about 2% at 4K.

The New 2017 GPU Skylake-X Testbed

Last, but certainly not least, we have a new testbed running these benchmarks and games. For that reason, historical results cannot be directly compared with the results in this review.

CPU: Intel Core i7-7820X @ 4.3GHz
Motherboard: Gigabyte X299 AORUS Gaming 7
Power Supply: Corsair AX860i
Hard Disk: OCZ Toshiba RD400 (1TB)
Memory: G.Skill TridentZ DDR4-3200 4 x 8GB (16-18-18-38)
Case: NZXT Phantom 630 Windowed Edition
Monitor: LG 27UD68P-B
Video Cards: AMD Radeon RX Vega 64 (Air Cooled)
AMD Radeon RX Vega 56
AMD Radeon RX 580
AMD Radeon R9 Fury X
NVIDIA GeForce GTX 1080 Ti Founders Edition
NVIDIA GeForce GTX 1080 Founders Edition
NVIDIA GeForce GTX 1070 Founders Edition
Video Drivers: NVIDIA Release 384.65
AMD Radeon Software Crimson ReLive Edition 17.7.2 (for non-Vega cards)
AMD Radeon Software Crimson Press Beta 17.30.1051
OS: Windows 10 Pro (Creators Update)


Battlefield 1

Battlefield 1 leads off our new 2017 benchmark suite with a bang as DICE brings gamers the long-awaited AAA World War 1 shooter. As mentioned earlier, we used DX11 for all cards, knowing that DX12 still has performance issues in this title. The Ultra preset is used with no alterations. As these benchmarks are from single player mode, our rule of thumb with multiplayer performance still applies: multiplayer framerates generally dip to half our single player framerates.

[Graphs: Battlefield 1 - Ultra Quality - 3840x2160 / 2560x1440 / 1920x1080]

[Graphs: Battlefield 1 - 99th Percentile - Ultra Quality - 3840x2160 / 2560x1440 / 1920x1080]



Ashes of the Singularity: Escalation

A veteran from our 2016 game list, Ashes of the Singularity: Escalation continues to be the DirectX 12 trailblazer, with developer Oxide Games tailoring and designing the Nitrous Engine around such low-level APIs. Ashes remains fresh for us in many ways: Escalation was released as a standalone expansion in November 2016 and was eventually merged into the base game in February 2017, while August 2017's v2.4 brought Vulkan support. Of all of the games in our benchmark suite, this is the game making the best use of DirectX 12’s various features, from asynchronous compute to multi-threaded work submission and high batch counts. While what we see can’t be extrapolated to all DirectX 12 games, it gives us a very interesting look at what we might expect in the future.

Settings and methodology remain identical from its usage in the 2016 GPU suite.

[Graphs: Ashes of the Singularity: Escalation - Extreme Quality - 3840x2160 / 2560x1440 / 1920x1080]

 

[Graphs: Ashes: Escalation - 99th Percentile - Extreme Quality - 3840x2160 / 2560x1440 / 1920x1080]



Doom

By now, we all know the legacy of the original DOOM (1993), and DOOM (2016) seeks to fulfill its birthright of shooting things until they die. The fast-paced arena-shooter-style gameplay relies heavily on high framerates for an optimal experience. DOOM can be considered the current flagship Vulkan game, showcasing what Vulkan has to offer.

The Ultra preset was used without alterations.

[Graphs: Doom - Ultra Quality - 3840x2160 / 2560x1440 / 1920x1080]

 

[Graphs: Doom - 99th Percentile - Ultra Quality - 3840x2160 / 2560x1440 / 1920x1080]



Ghost Recon Wildlands

Ghost Recon Wildlands brings a modern, highly demanding DX11 title to our suite, with the lush open world style FPS/3PS requiring heavy graphics horsepower to run at the highest settings. On that note, we turned down settings to Very High, which also had the effect of turning off NVIDIA GameWorks settings: HBAO+, Enhanced Volumetric Lighting (Godrays), and Turf Effects. This keeps performance apples-to-apples, and makes direct cross-vendor comparisons easier to make.

[Graphs: Ghost Recon Wildlands - Very High Quality - 3840x2160 / 2560x1440 / 1920x1080]



Dawn of War III

A Dawn of War game finally returns to our benchmark suite, with its predecessor last appearing in 2010. With Dawn of War III, Relic offers a demanding RTS with a built-in benchmark; however, the benchmark is still bugged, something noticed by Ian, as well as by other publications. The built-in benchmark for Dawn of War III collects frametime data for the loading screen before and the black screen after the benchmark scene, rendering the calculated averages and minimums/maximums useless. While we used the benchmark scene for consistency, we used OCAT to collect the performance data instead. Ultra settings were used without alterations.

A note on the 1080p results: further testing revealed that Dawn of War III at 1080p was rather CPU-bound on our testbed, resulting in anomalous performance. Due to the extreme time constraints, we discovered and determined this very late in the process. For the sake of transparency, the graphs will remain as they were at the time of the original posting.

[Graphs: Dawn of War III - Ultra Quality - 3840x2160 / 2560x1440 / 1920x1080]

 

[Graphs: Dawn of War III - 99th Percentile - Ultra Quality - 3840x2160 / 2560x1440 / 1920x1080]



Deus Ex: Mankind Divided

The sequel to Deus Ex: Human Revolution (2011), Mankind Divided is a genre-straddling, graphically-demanding DX11 title that received DX12 support a month after launch. DX12 mode still remains ambiguous in terms of overall performance gains, including on the RX Vega cards, and so we've opted to stick with DX11. Running through the built-in benchmark, we used Ultra settings without alterations.

[Graphs: Deus Ex: Mankind Divided - Ultra Quality - 3840x2160 / 2560x1440 / 1920x1080]



Grand Theft Auto V

The other veteran from our 2016 GPU game suite, GTA V is still as graphically demanding as they come. As an older DX11 title, it provides a glimpse into the graphically intensive games of yesteryear. Originally released for consoles in 2013, the PC port came with a slew of graphical enhancements and options. Just as importantly, GTA V includes a rather intensive and informative built-in benchmark.

As in its previous appearances, we use the same settings, as GTA V does not have presets. To recap, for "Very High" quality we have all of the primary graphics settings turned up to their highest setting, with the exception of grass, which is at its own very high setting. Meanwhile 4x MSAA is enabled for direct views and reflections. This setting also involves turning on some of the advanced rendering features - the game's long shadows, high resolution shadows, and high definition flight streaming - but not increasing the view distance any further.

[Graphs: Grand Theft Auto V - Very High Quality - 3840x2160 / 2560x1440 / 1920x1080]

 

[Graphs: Grand Theft Auto V - 99th Percentile - Very High Quality - 3840x2160 / 2560x1440 / 1920x1080]



F1 2016

The spiritual successor to the 2016 suite's DiRT Rally, F1 2016 is Codemasters' latest installment of the F1 franchise. It features Codemasters' traditional built-in benchmarking tools and scripts, something that is surprisingly absent in the latest iteration of DiRT. Graphically demanding in its own right, F1 2016 adds a useful racing-type graphics workload to our suite.

Ultra settings were used without alterations.

[Graphs: F1 2016 - Ultra Quality - 3840x2160 / 2560x1440 / 1920x1080]



Total War: Warhammer

The last game in our lineup is Total War: Warhammer, a DX11 game that received official DX12 support a couple months after launch. While DX12 is still marked as beta, Warhammer was to some extent developed with DX12 in mind, with preview builds showcasing DX12 performance.

The built-in benchmark was used with Ultra settings without alterations.

While the DX12 render path was used for AMD cards, there appear to be some oddities with 1080p performance. As mentioned earlier, we'd like to use the best performing API for a given card; in this case, while there was improved performance at higher resolutions, we noticed a potential regression in 1080p performance. Unfortunately, due to time constraints, we weren't able to investigate further; like Dawn of War III, it's possible Warhammer at 1080p was CPU-bound as well.

[Graphs: Total War: Warhammer - Ultra Quality - 3840x2160 / 2560x1440 / 1920x1080]

[Graphs: Total War: Warhammer - 99th Percentile - Ultra Quality - 3840x2160 / 2560x1440 / 1920x1080]



Compute Performance

[Graphs: Blender 2.79 BlenchMark; CompuBench 2.0 Level Set Segmentation 256, N-Body Simulation 1024K, Optical Flow; Folding @ Home Single Precision; Geekbench 4 GPU Compute Total Score]



Synthetics

[Graphs: TessMark Image Set 4, 64x Tessellation; VRMark Orange; Beyond3D Suite Pixel Fillrate, Integer Texture Fillrate (INT8), Floating Point Texture Fillrate (FP32)]



Power, Temperature, & Noise

Moving on from performance metrics, we’ll touch upon power, temperature, and noise. This is also normally where we’d discuss voltages, but as Vega is a new chip on a new architecture, none of our tools seem to read voltages on the Vega 64 and 56 correctly.

In terms of average game clockspeeds, neither card maintains its boost specification at 100% with prolonged usage. Vega 64 tends to stay closer to its boost clocks, which is in line with its additional power overhead and higher temperature target over Vega 56.

Radeon RX Vega Average Clockspeeds
                           Radeon RX Vega 64 Air        Radeon RX Vega 56
Boost Clocks               1546MHz                      1471MHz
Max Boost (DPM7)           1630MHz                      1590MHz
Battlefield 1              1512MHz                      1337MHz
Ashes: Escalation          1542MHz                      1354MHz
DOOM                       1479MHz                      1334MHz
Ghost Recon: Wildlands     1547MHz                      1388MHz
Dawn of War III            1526MHz                      1335MHz
Deus Ex: Mankind Divided   1498MHz                      1348MHz
GTA V                      1557MHz                      1404MHz
F1 2016                    1526MHz                      1394MHz
FurMark                    1230MHz (HBM2: 868MHz)       1099MHz (HBM2: 773MHz)

With games, the HBM2 clocks ramp up and stay at their highest clock state. Expectedly, the strains of FurMark cause the cards to oscillate memory clocks: between 945MHz and 800MHz for Vega 64, and between 800MHz and 700MHz for Vega 56. On that note, HBM2 comes with an idle power state (167MHz), an improvement on Fiji's HBM1 single power state. Unfortunately, the direct power savings are a little obscured since, as we will soon see, Vega 10 is a particularly power hungry chip.

As mentioned earlier, we used the default out-of-the-box configuration for power: Balanced, with the corresponding 220W GPU power limit. And under load, Vega needs power badly.

[Graphs: Idle Power Consumption; Load Power Consumption - Battlefield 1; Load Power Consumption - FurMark]

The performance of both Vega cards comes at a significant power cost. For the RX 500 series, we mused that load consumption is where AMD paid the piper. Here, the piper has taken AMD to the cleaners. In Battlefield 1, Vega 64 consumes 150W more system-wide power than the GTX 1080, its direct competitor. To be clear, additional power draw is expected, since Vega 64 is larger than the GTX 1080 in both shader count (4096 vs. 2560) and die size (486mm2 vs. 314mm2). But even then, when compared with the 1080 Ti and its 471mm2 GP102, Vega 64 still consumes more power.

As for Vega 64's cut-down sibling, Vega 56's lower temperature target, lower clocks, and lower board power make its consumption look much more reasonable, although it is still well above the 1070.

In any case, the cooling solutions are able to do the job without severe effects on temperature and noise. As far as blowers go, RX Vega 64 and 56 are comparable to the 1080 Ti FE blower.

[Graphs: Idle GPU Temperature; Load GPU Temperature - Battlefield 1; Load GPU Temperature - FurMark]
Not Graphed: Temperature of the actual Vega (the star): 9,329°C

Noise-testing equipment and methodology differ from past results, with a more sensitive noise meter and closer distance to the graphics card. Readings were also taken with an open case. As such, the noise levels may appear higher than expected.

[Charts: Idle Noise Levels; Load Noise Levels - Battlefield 1; Load Noise Levels - FurMark]

 



Final Words

Bringing this review to a close, if it feels like Vega has been a long time coming, it’s not just your imagination. AMD first unveiled the Vega name 17 months ago, and it was 7 months ago when we got our first peek into the Vega architecture. So today’s launch has in fact been a long time coming, especially by GPU standards. Ultimately what it means is that AMD has had plenty of time to make noise and build up demand for the Radeon RX Vega lineup, and today they get to show their cards, both figuratively and literally.

Vega comes at an interesting (if not critical) time for AMD. They’ve been absent from the enthusiast and high-end video card markets for this entire generation of cards thus far, and while this has allowed them to recapture some of their market share in the mid-range market, it has hurt their visibility to some extent. Furthermore the recent spike in demand for GPUs from cryptocurrency miners has completely turned the video card market on its head, obliterating supplies of AMD’s thus-far leading video cards, the Radeon RX 580 and RX 570.

So for AMD, the RX Vega launch is a chance to re-enter a market they’ve been gone from for far too long. It’s a chance to re-establish a regular supply of video cards to at least some portion of the market. And it’s an opportunity to push their GPU architecture and performance forward, closing the performance and efficiency gaps with NVIDIA. In short, the Vega launch has the potential to be AMD’s brightest day.

So how does AMD fare? The answer to that is ultimately going to hinge on your opinion on power efficiency. But before we get too far, let’s start with the Radeon RX Vega 64, AMD’s flagship card. Previously we’ve been told that it would trade blows with NVIDIA’s GeForce GTX 1080, and indeed it does just that. At 3840x2160, the Vega 64 is on average neck-and-neck with the GeForce GTX 1080 in gaming performance, with the two cards routinely trading the lead, and AMD holding it more often. Of course, averages hide the details: while the cards are equal on average, they can sometimes be quite far apart on individual games.

Unfortunately for AMD, their GTX 1080-like performance doesn’t come cheap from a power perspective. The Vega 64 has a board power rating of 295W, and it lives up to that rating. Relative to the GeForce GTX 1080, we’ve measured power draw at the wall anywhere between 110W and 150W higher, all for the same performance. Thankfully for AMD, buyers are focused on price and performance first and foremost (and in that order), so if all you’re looking for is a fast AMD card at a reasonable price, the Vega 64 delivers where it needs to: it is a solid AMD counterpart to the GeForce GTX 1080. However if you care about the power consumption and the heat generated by your GPU, the Vega 64 is in a very rough spot.

On the other hand, the Radeon RX Vega 56 looks better for AMD, so it’s easy to see why in recent days they have shifted their promotional efforts to the cheaper member of the RX Vega family. Though a step down from the RX Vega 64, the Vega 56 delivers around 90% of Vega 64’s performance for 80% of the price. Furthermore, when compared head-to-head with the GeForce GTX 1070, its closest competition, the Vega 56 enjoys a small but nonetheless significant 8% performance advantage over its NVIDIA counterpart. Whereas the Vega 64 could only draw to a tie, the Vega 56 can win in its market segment.

Vega 56’s power consumption also looks better than Vega 64’s, thanks to binning and its lower clockspeeds. Its power consumption is still notably worse than the GTX 1070’s by anywhere between 45W and 75W at the wall, but on both a relative basis and an absolute basis, it’s at least closer. Consequently, just how well the Vega 56 fares depends on your views on power consumption. It’s faster than the GTX 1070, and even if retail prices are just similar to the GTX 1070 rather than cheaper, then for some buyers looking to maximize performance for their dollar, that will be enough. But it’s certainly not a very well-rounded card if power consumption and noise are factored in.

The one wildcard here with the RX Vega 56 is going to be where retail prices actually end up. AMD’s $399 MSRP is rather aggressive, especially when GTX 1070 cards are retailing for closer to $449 due to cryptocurrency miner demand. If they can sustain that price, then Vega 56 is going to be real hot stuff, besting GTX 1070 in price and performance. Otherwise at GTX 1070-like prices it still has the performance advantage, but not the initiative on pricing. At any rate, this is a question we can’t answer today; the Vega 56 won’t be launching for another two weeks.
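Putting rough numbers to that, here's a quick price/performance sketch using the figures above: an ~8% average performance lead for Vega 56, its $399 MSRP, and GTX 1070 street pricing near $449. These are back-of-the-envelope ratios, not a substitute for per-game data:

```python
# Back-of-the-envelope price/performance for Vega 56 vs. GTX 1070, using the
# figures discussed in this review: ~8% average performance lead, $399 MSRP,
# and GTX 1070 street prices around $449 amid mining demand.

def perf_per_dollar(relative_perf: float, price_usd: float) -> float:
    return relative_perf / price_usd

gtx1070_baseline = perf_per_dollar(1.00, 449.0)
vega56_at_msrp = perf_per_dollar(1.08, 399.0)   # Vega 56 holds its $399 MSRP
vega56_at_449 = perf_per_dollar(1.08, 449.0)    # street prices drift up to GTX 1070 levels

print(f"Vega 56 @ $399: {vega56_at_msrp / gtx1070_baseline:.2f}x the GTX 1070's perf-per-dollar")
print(f"Vega 56 @ $449: {vega56_at_449 / gtx1070_baseline:.2f}x the GTX 1070's perf-per-dollar")
```

At its MSRP the Vega 56 works out to roughly a 20% perf-per-dollar lead; at GTX 1070-like street prices, that lead shrinks to its raw 8% performance advantage.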

Closing things on a broader architectural note, Vega has been a rather conflicting launch. A maxim we embrace here at AnandTech is that “there’s no such thing as a bad card, only bad prices”, reflecting the fact that how good or bad a product is depends heavily on how it’s priced. A fantastic product can be priced so high as to be unaffordably expensive, and a mediocre product can be priced so low as to be a bargain for the masses. And AMD seems to embrace this as well, having priced the RX Vega cards aggressively enough that at least on a price/performance basis, they’re competitive with NVIDIA if not enjoying a small lead.

The catch for AMD is that what they need to price RX Vega at to be competitive and what it should be able to do are at odds with each other. The Vega 10 is a large, power-hungry GPU. Much larger and much more power hungry than NVIDIA’s competing GP104 GPU. And while this isn’t an immediate consumer concern – we pay what the market will bear, not what it costs AMD to make a chip with a nice gross margin on the side – from a technology and architectural perspective it indicates that AMD has fallen further behind NVIDIA over the last couple of years. Whereas the Radeon R9 Fury X was almost a win that AMD didn’t get, the RX Vega 64 doesn’t appear to even be fighting in the weight class it was designed for. Instead the power efficiency gap between AMD and NVIDIA has grown since 2015, and apparently by quite a bit.

The good news for AMD is that this doesn’t come anywhere close to dooming the RX Vega line or the company. RX Vega still has a place in the market – albeit as the bargain option – and AMD can still sell every Polaris 10 GPU they can make. All the while AMD is still ramping up their Radeon Instinct lineup and general assault into deep learning servers, where Vega’s compute-heavy design should fare better.

In the meantime, from an architectural perspective there’s a lot I like about Vega. I’m interested in seeing where fast FP16 performance goes in the gaming market, and the primitive shader looks like it could really upend how geometry is processed at the GPU level given sufficient developer buy-in. Meanwhile the high-bandwidth cache controller concept is a very forward looking one, and one that in time may prove a major evolutionary step for stand-alone GPUs. However I can’t help but wish that these advancements came with a more competitive video card stack. As great as Vega is architecturally, it’s clear that performance and power consumption aren’t where they need to be for AMD to take on a surging NVIDIA.

So here’s to hoping for a better fight in the next round of the GPU wars.
