Original Link: https://www.anandtech.com/show/13249/nvidia-announces-geforce-rtx-20-series-rtx-2080-ti-2080-2070
NVIDIA Announces the GeForce RTX 20 Series: RTX 2080 Ti & 2080 on Sept. 20th, RTX 2070 in October
by Ryan Smith on August 20, 2018 4:00 PM EST

NVIDIA's Gamescom 2018 keynote just wrapped up, and as many have been expecting since it was announced last month, NVIDIA is getting ready to launch their next generation of GeForce hardware. Announced at the event and going on sale starting September 20th is NVIDIA's GeForce RTX 20 series, which is succeeding the current Pascal-powered GeForce GTX 10 series. Based on NVIDIA's new Turing GPU architecture and built on TSMC's 12nm "FFN" process, NVIDIA has lofty goals, looking to drive an entire paradigm shift in how games are rendered and how PC video cards are evaluated. CEO Jensen Huang has called Turing NVIDIA's most important GPU architecture since 2006's Tesla GPU architecture (G80 GPU), and from a features standpoint it's clear that he's not overstating matters.
As is traditionally the case, the first cards out of the NVIDIA stable are the high-end cards. But in a rather sizable break from tradition, we're not only getting the x80 and x70 cards at launch, but the x80 Ti card as well, meaning the GeForce RTX 2080 Ti, RTX 2080, and RTX 2070 will all be hitting the streets within a month of each other. NVIDIA's product stack otherwise remains unchanged: the RTX 2080 Ti is their flagship card, the RTX 2080 is their high-end card, and the RTX 2070 is the slightly cheaper card to entice enthusiasts without breaking the bank.
All three cards will be launching over the next two months. First off will be the RTX 2080 Ti and RTX 2080, which will launch September 20th. The RTX 2080 Ti will start at $999 for partner cards, while the RTX 2080 will start at $699. Meanwhile the RTX 2070 will launch at some point in October, with partner cards starting at $499. On a historical basis, all of these prices are higher than the last generation by anywhere between $120 and $300. Meanwhile NVIDIA’s own reference-quality Founders Edition cards are once again back, and those will carry a $100 to $200 premium over the baseline pricing.
Unfortunately, NVIDIA is already taking pre-orders here, so consumers are essentially required to make a "blind buy" if they want to snag a card from the first batch. NVIDIA has offered surprisingly little information on performance, so we'd suggest waiting for trustworthy third-party reviews (i.e. us). However, I have to admit that I don't imagine there's going to be much stock available by the time reviews hit the streets.
NVIDIA GeForce Specification Comparison

| | RTX 2080 Ti | RTX 2080 | RTX 2070 | GTX 1080 |
|---|---|---|---|---|
| CUDA Cores | 4352 | 2944 | 2304 | 2560 |
| Core Clock | 1350MHz | 1515MHz | 1410MHz | 1607MHz |
| Boost Clock | 1545MHz | 1710MHz | 1620MHz | 1733MHz |
| Memory Clock | 14Gbps GDDR6 | 14Gbps GDDR6 | 14Gbps GDDR6 | 10Gbps GDDR5X |
| Memory Bus Width | 352-bit | 256-bit | 256-bit | 256-bit |
| VRAM | 11GB | 8GB | 8GB | 8GB |
| Single Precision Perf. | 13.4 TFLOPs | 10.1 TFLOPs | 7.5 TFLOPs | 8.9 TFLOPs |
| Tensor Perf. | 440T OPs (INT4) | ? | ? | N/A |
| Ray Perf. | 10 GRays/s | 8 GRays/s | 6 GRays/s | N/A |
| "RTX-OPS" | 78T | 60T | 45T | N/A |
| TDP | 250W | 215W | 175W | 180W |
| GPU | Big Turing | Unnamed Turing | Unnamed Turing | GP104 |
| Transistor Count | 18.6B | ? | ? | 7.2B |
| Architecture | Turing | Turing | Turing | Pascal |
| Manufacturing Process | TSMC 12nm "FFN" | TSMC 12nm "FFN" | TSMC 12nm "FFN" | TSMC 16nm |
| Launch Date | 09/20/2018 | 09/20/2018 | 10/2018 | 05/27/2016 |
| Launch Price | MSRP: $999, Founders: $1199 | MSRP: $699, Founders: $799 | MSRP: $499, Founders: $599 | MSRP: $599, Founders: $699 |
NVIDIA's Turing Architecture: RT & Tensor Cores
So what does Turing bring to the table? The marquee feature across the board is hybrid rendering, which combines ray tracing with traditional rasterization to exploit the strengths of both technologies. This announcement is essentially a continuation of NVIDIA’s RTX announcement from earlier this year, so if you thought that announcement was a little sparse, well then here is the rest of the story.
The big change here is that NVIDIA is going to be including even more ray tracing hardware with Turing in order to offer faster and more efficient hardware ray tracing acceleration. New to the Turing architecture is what NVIDIA is calling an RT core; we aren't fully informed on its underpinnings at this time, but these blocks serve as dedicated ray tracing processors. They accelerate both ray-triangle intersection checks and bounding volume hierarchy (BVH) manipulation, the latter being a very popular data structure for organizing objects for ray tracing.
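To put those two operations in more concrete terms, below is a minimal CPU-side sketch in plain C++ of the math involved – a ray/bounding-box test of the sort used when walking a BVH, and a classic Möller–Trumbore ray-triangle test. This is purely illustrative of the work being offloaded; it is not NVIDIA's implementation, whose details we don't yet know.

```cpp
// Illustrative only: the two per-ray operations NVIDIA says the RT cores
// accelerate, written as plain C++. Real GPU implementations differ.
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };
static Vec3  sub(Vec3 a, Vec3 b)   { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3  cross(Vec3 a, Vec3 b) { return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x}; }
static float dot(Vec3 a, Vec3 b)   { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Ray vs. axis-aligned bounding box (i.e. a BVH node) using the "slab" test.
// BVH traversal descends only into children whose boxes the ray actually hits.
bool RayHitsAABB(Vec3 origin, Vec3 invDir, Vec3 boxMin, Vec3 boxMax) {
    float t1 = (boxMin.x - origin.x) * invDir.x, t2 = (boxMax.x - origin.x) * invDir.x;
    float tmin = std::min(t1, t2), tmax = std::max(t1, t2);
    t1 = (boxMin.y - origin.y) * invDir.y; t2 = (boxMax.y - origin.y) * invDir.y;
    tmin = std::max(tmin, std::min(t1, t2)); tmax = std::min(tmax, std::max(t1, t2));
    t1 = (boxMin.z - origin.z) * invDir.z; t2 = (boxMax.z - origin.z) * invDir.z;
    tmin = std::max(tmin, std::min(t1, t2)); tmax = std::min(tmax, std::max(t1, t2));
    return tmax >= std::max(tmin, 0.0f);
}

// Ray vs. triangle (Moller-Trumbore), run against the geometry in BVH leaf nodes.
bool RayHitsTriangle(Vec3 origin, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2, float& tOut) {
    const float kEps = 1e-7f;
    Vec3 e1 = sub(v1, v0), e2 = sub(v2, v0);
    Vec3 p  = cross(dir, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < kEps) return false;   // ray is parallel to the triangle
    float invDet = 1.0f / det;
    Vec3 s = sub(origin, v0);
    float u = dot(s, p) * invDet;
    if (u < 0.0f || u > 1.0f) return false;    // outside the triangle
    Vec3 q = cross(s, e1);
    float v = dot(dir, q) * invDet;
    if (v < 0.0f || u + v > 1.0f) return false;
    tOut = dot(e2, q) * invDet;                // hit distance along the ray
    return tOut > kEps;
}
```

Every ray cast into a scene runs many of these box and triangle tests while descending the BVH, which is why pulling them out of the CUDA cores and into dedicated hardware pays such large dividends.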
NVIDIA is stating that the fastest GeForce RTX part can cast 10 billion (giga) rays per second, a 25x improvement in ray tracing performance over unaccelerated Pascal.
The Turing architecture also carries over the tensor cores from Volta, and indeed these have even been enhanced over Volta. The tensor cores are an important aspect of multiple NVIDIA initiatives. Along with speeding up ray tracing itself, NVIDIA’s other tool in their Turing bag of tricks is to reduce the amount of rays required in a scene by using AI denoising to clean up an image, which is something the tensor cores excel at. Of course that’s not the only feature tensor cores are for – NVIDIA’s entire AI/neural networking empire is all but built on them – so while not a primary focus for the Gamescom crowd, this also confirms that NVIDIA’s most powerful neural networking hardware will be coming to a wider range of GPUs.
Looking at hybrid rendering in general, it's interesting that despite these individual speed-ups, NVIDIA's overall performance promises aren't quite as extreme. All told, the company is promising a 6x performance boost versus Pascal, though it hasn't specified which parts are being compared. Time will tell if even this is a realistic assessment, as even with the RT cores, ray tracing in general is still quite the resource hog.
As for gaming matters in particular, the benefits of hybrid rendering are potentially significant, but it's going to depend heavily on how developers choose to use it. From a performance standpoint I'm not sure there's much to say here, and that's because ray tracing & hybrid rendering are ultimately features to improve rendering quality, not the performance of today's algorithms. Granted, if you tried to do ray tracing on today's GPUs it would be extremely slow – and Turing would offer an incredible speedup as a result – but for that same reason no one uses slow path tracing systems on current hardware. So hybrid rendering is instead about replacing the approximations and hacks of current rasterization technology with more accurate rendering methods. In other words, less "faking it" and more "making it."
Those quality benefits, in turn, are typically clustered around lighting, shadows, and reflections. All three are inherently based on the properties of light, which in simplistic terms moves as a ray, and which up until now various algorithms have either had to fake or "pre-bake" into scenes in advance. And while current algorithms are quite good, they still aren't close to accurate. So there is clear room for improvement.
NVIDIA for their part is particularly throwing around global illumination, which is one of the harder tasks. However there are other lighting methods that benefit as well, not to mention reflections and shadows of those lit objects. And truthfully this is where words are a poor tool; it’s difficult to describe how a ray traced shadow looks better than a fake shadow with PCSS, or real-time lighting over pre-baked lighting. Which is why NVIDIA, the video card company, is going to be pushing the visual aspects of all of this harder than ever.
Overall then, hybrid rendering is the lynchpin feature of the GeForce RTX 20 series. Going by their Gamescom and SIGGRAPH presentations, it’s clear that NVIDIA has invested heavily into the field, and that they have bet the success of the GeForce brand over the coming years on this technology. RT cores and tensor cores are semi-fixed function hardware; they can’t be used for rasterization, and the transistors allocated to them are transistors that could have been dedicated to more rasterization hardware otherwise. So NVIDIA has made an incredibly significant move here in terms of opportunity cost by going the hybrid rendering route rather than building a bigger Pascal.
As a result, NVIDIA is attempting a paradigm shift in consumer rendering, one that we've really only seen before with the introduction of pixel and vertex shaders (DX8 & DX9 era tech) all the way back in 2001 & 2002. Which is why Microsoft's DirectX Raytracing (DXR) initiative is so important, as are NVIDIA's other developer and consumer initiatives. NVIDIA needs to sell consumers and developers alike on this vision of mixing rasterization with ray tracing to provide better image quality. And more so than that, they need to ease developers into the idea of working with more specialized, fixed function units as Moore's Law continues to slow down and fixed function hardware becomes a means to achieve greater efficiency.
NVIDIA hasn't quite bet the farm on hybrid rendering, but they've never before attempted to move the market in this fashion. So if it seems like NVIDIA is hyper-focused on hybrid rendering and ray tracing, that's because they are. It's their vision of the future, and now they need to get everyone else on board.
Turing SM: Dedicated INT Cores, Unified Cache, Variable Rate Shading
Alongside the dedicated RT and tensor cores, the Turing architecture Streaming Multiprocessor (SM) itself is also learning some new tricks. In particular here, it’s inheriting one of Volta’s more novel changes, which saw the Integer cores separated out into their own blocks, as opposed to being a facet of the Floating Point CUDA cores. The advantage here – at least as much as we saw in Volta – is that it speeds up address generation and Fused Multiply Add (FMA) performance, though as with a lot of aspects of Turing, there’s likely more to it (and what it can be used for) than we’re seeing today.
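To illustrate why that separation matters, consider the kind of inner loop GPU code executes constantly, where integer address arithmetic is interleaved with floating point FMAs. The sketch below is plain C++ standing in for shader code, purely as an illustration: on Pascal the integer work occupied the same CUDA cores as the floating point work, whereas with dedicated INT32 units the two streams can be issued concurrently.

```cpp
// Illustration only: a loop where integer and floating point work interleave.
// Dedicated INT32 units let the address generation for upcoming elements
// proceed alongside the floating point multiply-adds instead of competing
// with them for the same execution slots.
float WeightedGather(const float* data, const int* indices, const float* weights,
                     int count, int stride) {
    float acc = 0.0f;
    for (int i = 0; i < count; ++i) {
        int addr = indices[i] * stride;   // integer pipe: address generation
        acc += data[addr] * weights[i];   // floating point pipe: fused multiply-add
    }
    return acc;
}
```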
The Turing SM also includes what NVIDIA is calling a "unified cache architecture." As I'm still awaiting official SM diagrams from NVIDIA, it's not clear if this is the same kind of unification we saw with Volta – where the L1 cache was merged with shared memory – or if NVIDIA has gone one step further. At any rate, NVIDIA is saying that it offers twice the bandwidth of the "previous generation," though it's unclear whether that means Pascal or Volta (with the latter being more likely).
Finally, also tucked away in the SIGGRAPH Turing press release is a mention of support for variable rate shading. This is a relatively young and upcoming rendering technique about which there's limited information (especially as to how exactly NVIDIA is implementing it). But at a very high level it sounds like the next generation of NVIDIA's multi-res shading technology, which allows developers to render different areas of the screen at different effective resolutions, in order to concentrate quality (and rendering time) in the areas where it's the most beneficial.
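With so little disclosed, the following is only a conceptual sketch of the general idea behind variable rate shading, using hypothetical names rather than NVIDIA's actual API: the screen is divided into tiles, each tile is assigned a shading rate, and a coarse tile runs one pixel shader invocation per block of pixels rather than one per pixel.

```cpp
// Conceptual sketch of variable rate shading; names and policy are hypothetical,
// not NVIDIA's API. Tiles far from the area of interest are shaded more coarsely,
// concentrating pixel shader work where it is most visible.
#include <cmath>
#include <vector>

enum class ShadingRate { Rate1x1, Rate2x2, Rate4x4 };  // 1 shader invocation per NxN pixels

// Hypothetical policy: full rate near a focal point (screen center, or a VR gaze
// point), progressively coarser toward the periphery.
ShadingRate RateForTile(int tileX, int tileY, int tilesX, int tilesY) {
    float dx = (tileX + 0.5f) / tilesX - 0.5f;
    float dy = (tileY + 0.5f) / tilesY - 0.5f;
    float dist = std::sqrt(dx * dx + dy * dy);          // 0 at center, ~0.7 in the corners
    if (dist < 0.25f) return ShadingRate::Rate1x1;
    if (dist < 0.45f) return ShadingRate::Rate2x2;
    return ShadingRate::Rate4x4;
}

// Build the per-tile rate map a rasterizer would consult when launching pixel shaders.
std::vector<ShadingRate> BuildRateMap(int tilesX, int tilesY) {
    std::vector<ShadingRate> map(tilesX * tilesY);
    for (int y = 0; y < tilesY; ++y)
        for (int x = 0; x < tilesX; ++x)
            map[y * tilesX + x] = RateForTile(x, y, tilesX, tilesY);
    return map;
}
```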
Feeding the Beast: GDDR6 Support
As the memory used by GPUs is developed by outside companies, there are no big secrets here. JEDEC and its big 3 members – Samsung, SK Hynix, and Micron – have all been developing GDDR6 memory as the successor to both GDDR5 and GDDR5X, and NVIDIA has confirmed that Turing will support it. Depending on the manufacturer, first-generation GDDR6 is generally promoted as offering up to 16Gbps per pin of memory bandwidth, which is 2x that of NVIDIA's late-generation GDDR5 cards, and 40% faster than NVIDIA's most recent GDDR5X cards.
GPU Memory Math: GDDR6 vs. HBM2 vs. GDDR5X

| | NVIDIA GeForce RTX 2080 Ti (GDDR6) | NVIDIA GeForce RTX 2080 (GDDR6) | NVIDIA Titan V (HBM2) | NVIDIA Titan Xp | NVIDIA GeForce GTX 1080 Ti | NVIDIA GeForce GTX 1080 |
|---|---|---|---|---|---|---|
| Total Capacity | 11 GB | 8 GB | 12 GB | 12 GB | 11 GB | 8 GB |
| B/W Per Pin | 14 Gb/s | 14 Gb/s | 1.7 Gb/s | 11.4 Gb/s | 11 Gb/s | 11 Gb/s |
| Chip Capacity | 1 GB (8 Gb) | 1 GB (8 Gb) | 4 GB (32 Gb) | 1 GB (8 Gb) | 1 GB (8 Gb) | 1 GB (8 Gb) |
| No. Chips/KGSDs | 11 | 8 | 3 | 12 | 11 | 8 |
| B/W Per Chip/Stack | 56 GB/s | 56 GB/s | 217.6 GB/s | 45.6 GB/s | 44 GB/s | 44 GB/s |
| Bus Width | 352-bit | 256-bit | 3072-bit | 384-bit | 352-bit | 256-bit |
| Total B/W | 616 GB/s | 448 GB/s | 652.8 GB/s | 547.7 GB/s | 484 GB/s | 352 GB/s |
| DRAM Voltage | 1.35 V | 1.35 V | 1.2 V (?) | 1.35 V | 1.35 V | 1.35 V |
Relative to GDDR5X, GDDR6 is not quite as big of a step up as some past memory generations, as many of GDDR6's innovations were already baked into GDDR5X. None the less, alongside HBM2 for very high end use cases, it is expected to become the backbone memory of the GPU industry. The principal changes here include lower operating voltages (1.35v), and internally the memory is now divided into two memory channels per chip. For a standard 32-bit wide chip then, this means a pair of 16-bit memory channels, for a total of 16 such channels on a 256-bit card. While this in turn means there is a very large number of channels, GPUs are also well-positioned to take advantage of it since they are massively parallel devices to begin with.
NVIDIA for their part has confirmed that the first GeForce RTX cards will run their GDDR6 at 14Gbps, which happens to be the fastest speed grade offered by all of the Big 3 members. We know that NVIDIA is exclusively using Samsung's GDDR6 for their Quadro RTX cards – presumably because they need the density – however for the GeForce RTX cards the field should be open to all of the memory manufacturers. Though in the long run this leaves two avenues open to higher capacity cards: either moving up to 16Gb density chips, or going clamshell with the 8Gb chips they’re using now.
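The underlying bandwidth math is simple enough to check by hand: total bandwidth is the per-pin data rate multiplied by the bus width, divided by 8 to convert bits to bytes. A quick sketch using the officially quoted speeds:

```cpp
// Memory bandwidth sanity check: GB/s = per-pin rate (Gbps) x bus width (bits) / 8.
#include <cstdio>

double TotalBandwidthGBps(double gbpsPerPin, int busWidthBits) {
    return gbpsPerPin * busWidthBits / 8.0;
}

int main() {
    printf("RTX 2080 Ti (14Gbps GDDR6, 352-bit):   %.0f GB/s\n", TotalBandwidthGBps(14.0, 352)); // 616
    printf("RTX 2080/2070 (14Gbps GDDR6, 256-bit): %.0f GB/s\n", TotalBandwidthGBps(14.0, 256)); // 448
    printf("GTX 1080 Ti (11Gbps GDDR5X, 352-bit):  %.0f GB/s\n", TotalBandwidthGBps(11.0, 352)); // 484
    printf("GTX 1070 (8Gbps GDDR5, 256-bit):       %.0f GB/s\n", TotalBandwidthGBps(8.0, 256));  // 256
    // 14Gbps vs. 11Gbps GDDR5X is the ~27% uplift the RTX 2080 Ti sees, while
    // 14Gbps vs. 8Gbps GDDR5 is the 75% uplift the RTX 2070 gets over the GTX 1070.
    return 0;
}
```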
Odds & Ends: NVLink SLI, VirtualLink, & 8K HEVC
While this wasn't mentioned in NVIDIA's Gamescom presentation itself, NVIDIA's GeForce 20 Series website confirms that SLI will once again be available for some high-end GeForce RTX cards. Specifically, both the RTX 2080 Ti and RTX 2080 will support SLI. Meanwhile the RTX 2070 will not support SLI, a departure from the GTX 1070, which did offer it.
However the bigger aspect of that news is that NVIDIA's proprietary cache coherent GPU interconnect, NVLink, will be coming to consumer cards. The GeForce RTX cards will be implementing SLI over NVLink, with 2 NVLink channels running between each card. At a combined 50GB/sec of full-duplex bandwidth – meaning there's 50GB/sec of bandwidth available in each direction – this is a major upgrade over NVIDIA's previous HB-SLI link. This is on top of NVLink's other feature benefits, particularly cache coherence. And all of this comes at an important time, as inter-GPU bandwidth requirements keep rising with each generation.
Now the big question is whether this will reverse the ongoing decline of SLI, and at the moment I’m taking a somewhat pessimistic approach, but I’m eager to hear more from NVIDIA. 50GB/sec is a big improvement over HB-SLI, however it’s still only a fraction of the 448GB/sec (or more) of local memory bandwidth available to a GPU. So on its own it doesn’t fix the problems that have dogged multi-GPU rendering, either with AFR synchronization or effective workload splitting. In that respect it’s likely telling that NVIDIA doesn’t support NVLink SLI on the RTX 2070.
Meanwhile gamers have something new to look forward to for VR, with the addition of VirtualLink support. The USB Type-C alternate mode was announced last month, and supports 15W+ of power, 10Gbps of USB 3.1 Gen 2 data, and 4 lanes of DisplayPort HBR3 video, all over a single cable. In other words, it's a DisplayPort 1.4 connection with extra data and power, intended to allow a video card to directly drive a VR headset. The standard is backed by NVIDIA, AMD, Oculus, Valve, and Microsoft, so GeForce RTX cards will be the first of what we expect will ultimately be a number of products supporting it.
USB Type-C Alternate Modes

| | VirtualLink | DisplayPort (4 Lanes) | DisplayPort (2 Lanes) | Base USB-C |
|---|---|---|---|---|
| Video Bandwidth (Raw) | 32.4Gbps | 32.4Gbps | 16.2Gbps | N/A |
| USB 3.x Data Bandwidth | 10Gbps | N/A | 10Gbps | 10Gbps + 10Gbps |
| High Speed Lane Pairs | 6 | 4 | 4 | 4 |
| Max Power | Mandatory: 15W, Optional: 27W | Optional: Up To 100W | Optional: Up To 100W | Optional: Up To 100W |
Finally, while NVIDIA only briefly touched upon the subject, we do know that their video encoder block, NVENC, has been updated for Turing. The latest iteration of NVENC specifically adds support for 8K HEVC encoding. Meanwhile NVIDIA has also been able to further tune the quality of their encoder, allowing them to achieve similar quality as before with a 25% lower video bitrate.
Previewing GeForce RTX 2080 Ti
Turing and NVIDIA’s focus on hybrid rendering aside, let’s take a look at the individual GeForce RTX cards.
Before getting too far here, it’s important to point out that NVIDIA has offered little in the way of information on the cards’ performance besides their formal specifications. Essentially the entirety of the NVIDIA Gamescom presentation – and even most of the SIGGRAPH presentation – was focused on ray tracing/hybrid rendering and the Turing architecture’s unique hardware capabilities to support those features. As a result we don’t have a good frame of reference for how these specifications will translate into real-world performance. Which is also why we’re disappointed that NVIDIA has already started pre-orders, as it pushes consumers into blindly buying cards.
At any rate, with NVIDIA having changed the SM for Turing as much as they have versus Pascal, I don’t believe FLOPS alone is an accurate proxy for performance in current games. It’s almost certain that NVIDIA has been able to improve their SM efficiency, especially judging from what we’ve seen thus far with the Titan V. So in that respect this launch is similar to the Maxwell launch in that the raw specifications can be deceiving, and that it’s possible to lose FLOPS and still gain performance.
In any case, at the top of the GeForce RTX 20 series stack will be the GeForce RTX 2080 Ti. In a major departure from the GeForce 700/900/10 series, NVIDIA is not holding the Ti card back as a mid-generation kicker; instead they're launching with it right away. This means that the high-end of the RTX family is a 3 card stack from the start, instead of a 2 card stack as has previously been the case.
NVIDIA has not commented on this change in particular, and this is one of those things that I expect we’ll know more about once we reach the actual hardware launch. But there’s good reason to suspect that since NVIDIA is using the relatively mature TSMC 12nm “FFN” process – itself an optimized version of 16nm – that yields are in a better place than usual at this time. Normally NVIDIA would be using a more bleeding-edge process, where it would make sense to hold back the largest chip another year or so to let yields improve.
NVIDIA GeForce x80 Ti Specification Comparison

| | RTX 2080 Ti Founder's Edition | RTX 2080 Ti | GTX 1080 Ti | GTX 980 Ti |
|---|---|---|---|---|
| CUDA Cores | 4352 | 4352 | 3584 | 2816 |
| ROPs | 88? | 88? | 88 | 96 |
| Core Clock | 1350MHz | 1350MHz | 1481MHz | 1000MHz |
| Boost Clock | 1635MHz | 1545MHz | 1582MHz | 1075MHz |
| Memory Clock | 14Gbps GDDR6 | 14Gbps GDDR6 | 11Gbps GDDR5X | 7Gbps GDDR5 |
| Memory Bus Width | 352-bit | 352-bit | 352-bit | 384-bit |
| VRAM | 11GB | 11GB | 11GB | 6GB |
| Single Precision Perf. | 14.2 TFLOPs | 13.4 TFLOPs | 11.3 TFLOPs | 6.1 TFLOPs |
| "RTX-OPS" | 78T | 78T | N/A | N/A |
| TDP | 260W | 250W | 250W | 250W |
| GPU | Big Turing | Big Turing | GP102 | GM200 |
| Architecture | Turing | Turing | Pascal | Maxwell |
| Manufacturing Process | TSMC 12nm "FFN" | TSMC 12nm "FFN" | TSMC 16nm | TSMC 28nm |
| Launch Date | 09/20/2018 | 09/20/2018 | 03/10/2017 | 06/01/2015 |
| Launch Price | $1199 | $999 | MSRP: $699, Founders: $699 | $649 |
The king of NVIDIA's new product stack, the GeForce RTX 2080 Ti is without a doubt an interesting card. And if we're being honest, it's not a card I was expecting. Based on these specifications, it's clearly built around a cut-down version of NVIDIA's "Big Turing" GPU, which the company just unveiled last week at SIGGRAPH. And like the name suggests, Big Turing is big: 18.6B transistors on a 754mm2 die. This is closer in size to GV100 (Volta/Titan V) than it is to any past x80 Ti card, so I am surprised that, even as a cut-down chip, NVIDIA can economically offer it for sale. None the less, here we are, with Big Turing coming to consumer cards.
Even though it’s a cut-down part, RTX 2080 Ti is still a beast, with 4352 Turing CUDA cores and what I estimate to be 544 tensor cores. Like its Quadro counterpart, this card is rated for 10 GigaRays/second, and for traditional compute we’re looking at 13.4 TFLOPS based on these specifications. Note that this is only 19% higher than GTX 1080 Ti, which is all the more reason why I want to learn more about Turing’s architectural changes before predicting what this means for performance in current-generation rasterization games.
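For reference, those single precision figures fall straight out of the usual formula – 2 FLOPs (one fused multiply-add) per CUDA core per clock, multiplied by the boost clock – which is worth keeping in mind when comparing the cards on paper:

```cpp
// FP32 throughput: CUDA cores x boost clock x 2 FLOPs per clock (one FMA).
#include <cstdio>

double TFlops(int cudaCores, double boostGHz) {
    return 2.0 * cudaCores * boostGHz / 1000.0;  // cores x GHz x 2 = GFLOPS; /1000 = TFLOPS
}

int main() {
    printf("RTX 2080 Ti:    %.1f TFLOPS\n", TFlops(4352, 1.545)); // ~13.4
    printf("RTX 2080 Ti FE: %.1f TFLOPS\n", TFlops(4352, 1.635)); // ~14.2
    printf("GTX 1080 Ti:    %.1f TFLOPS\n", TFlops(3584, 1.582)); // ~11.3
    // 13.4 vs. 11.3 TFLOPS is the ~19% on-paper gain over the GTX 1080 Ti noted above.
    return 0;
}
```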
Clockspeeds have actually dropped from generation to generation here. Whereas the GTX 1080 Ti started at 1.48GHz and had an official boost clock rating of 1.58GHz (and in practice boosted higher still), the RTX 2080 Ti starts at 1.35GHz and boosts to 1.55GHz, and we don't yet know anything about its practical boost limits. Assuming NVIDIA is being just as conservative as the last generation, this means that average clockspeeds have dropped slightly, which in turn means that whatever performance gains we see from the RTX 2080 Ti are going to ride entirely on the increased CUDA core count and any architectural efficiency improvements.
Meanwhile the ROP count is unknown, but as it needs to match the memory bus width, we're almost certainly looking at 88 ROPs. Even more so than with the core compute architecture, I'm curious as to whether there are any architectural improvements here; otherwise, with an identical ROP count and slightly lower clocks, the maximum pixel throughput (on paper) is actually ever so slightly lower than it was on the GTX 1080 Ti.
Speaking of the memory bus, this is another area that is seeing a significant improvement. NVIDIA has moved from GDDR5X to GDDR6, so memory clockspeeds have increased accordingly, from 11Gbps to 14Gbps, a 27% increase. And since the memory bus width itself remains identical at 352-bits wide, this means the final memory bandwidth increase is also 27%. Memory bandwidth has long been the Achilles heel of GPUs, so even if NVIDIA’s theoretical ROP throughput has not changed this generation, the fact of the matter is that having more memory bandwidth is going to remove bottlenecks and improve performance throughout the rendering pipeline, from the texture units and CUDA cores straight out to the ROPs. Of course, the tensor cores and RT cores are going to be prolific bandwidth consumers as well, so in workloads where they’re in play, NVIDIA is once again going to have to do more with (relatively) less.
Past this, things start diverging a bit. NVIDIA is once again offering their reference-grade Founders Edition cards, and unlike with the GeForce 10 series, the 20 series FE cards have slightly different specifications than their base specification compatriots. Specifically, NVIDIA has cranked up the clockspeed and the resulting TDP a bit, giving the 2080 Ti FE an on-paper 6% performance advantage, and also a 10W higher TDP. For the standard cards then, the TDP is the x80 Ti-traditional 250W, while the FE card moves to 260W.
Meanwhile, starting with the GeForce 20 series cards, NVIDIA is rolling out a new design for their reference/Founders Edition cards, the first such redesign since the original GeForce GTX Titan back in 2013. Up until now NVIDIA has focused on a conservative but highly effective blower design, pairing the best blower in the industry with a grey & black metal shroud. The end result is that these reference/FE cards could be dropped into virtually any system and work, thanks to the self-exhausting nature of blowers.
However for the GeForce 20 series, NVIDIA has blown off the blower, instead opting to design their cards around the industry's other favorite cooler design: the dual-fan open air cooler. Combined with NVIDIA's metallic aesthetics, which they have retained, the resulting product looks pretty much exactly like you'd expect a high-end open air cooled NVIDIA card to look: two fans buried inside a meticulous metal shroud. And while we'll see where performance stands once we review the card, it's clear that NVIDIA is at the very least aiming to lead the pack in industrial design once again.
The switch to an open air cooler has three particular ramifications versus NVIDIA’s traditional blower, which for regular AnandTech readers you’ll know we’ve discussed before.
- Cooling capacity goes up
- Noise levels go down
- A card can no longer guarantee that it can cool itself
In an open air design, hot air is circulated back into the chassis via the fans, as the shroud is not fully closed and the design doesn’t force hot air out of the back of the case. Essentially in an open air design a card will push the hottest air away from itself, but it’s up to the chassis to actually get rid of that hot air. Which a well-designed case will do, but not without first circulating it through the CPU cooler, which is typically located above the GPU.
GPU cooler design is such that there is no one right answer. Because open air designs can rely on large axial fans with little air resistance, they can be very quiet, but overall cooling becomes the chassis' job. Blowers, on the other hand, are fully exhausting and work in practically any chassis – no matter how bad the chassis cooling is – but they are noisier thanks to their high-RPM radial fans. NVIDIA for their part has long favored blowers, but this appears to be at an end. It does make me wonder what this means for their OEM customers (whose designs often count on the video card being a blower), but that's a deeper discussion for another time.
At any rate, from NVIDIA's press release we know that each fan features 13 blades, and that the shroud itself is once again made out of die-cast aluminum. Also buried in the press release is information that NVIDIA is once again using a vapor chamber to transfer heat between the GPU and the heatsink, and that it's being called a "full length" vapor chamber, which would mean it's notably larger than the vapor chamber in NVIDIA's past cards. Unfortunately this is the limit of what we know right now about the cooler, and I expect there's more to find out in the coming days and weeks. In the meantime NVIDIA has disclosed that the resulting card is the standard size for a high-end NVIDIA reference card: dual slot width, 10.5 inches long.
Diving down, we also have a few tidbits about the reference PCB, including the power delivery system. NVIDIA’s press release specifically calls out a 13 phase power delivery system, which matches the low-resolution PCB render they’ve posted to their site. NVIDIA has always been somewhat frugal on VRMs – their cards have more than enough capacity for stock operation, but not much excess capacity for power-intensive overclocking – so it sounds like they are trying to meet overclockers half-way here. Though once we get to fully custom partner cards, I still expect the MSIs and ASUSes of the world to go nuts and try to outdo NVIDIA.
NVIDIA's photos also make it clear that in order to meet that 250W+ TDP, we're looking at an 8pin + 8pin configuration for the PCIe power connectors. On paper such a setup is good for 375W, and while I don't expect NVIDIA to go quite that far, it is a step up from the 300W 6pin + 8pin setup we'd typically see. So NVIDIA is clearly planning on drawing more power, and they're using the connectors to match. Thankfully 8pin power connectors are fairly common on 500W+ PSUs these days, however it's possible that owners of older PSUs may get pinched by the need for dual 8pin cables.
Finally, for display outputs, NVIDIA has confirmed that their latest generation flagship once again supports up to 4 displays. However there are actually 5 display outputs on the card: the traditional 3 DisplayPorts and a sole HDMI port, but now there’s also a singular USB Type-C port, offering VirtualLink support for VR headsets. As a result, users can pick any 4 of the 5 ports, with the Type-C port serving as a DisplayPort when not hooked up to a VR headset. Though this does mean that the final DisplayPort has been somewhat oddly shoved into the second row, in order to make room for the USB Type-C port.
Wrapping up the GeForce RTX 2080 Ti, NVIDIA’s new flagship has been priced to match. In fact it is seeing the greatest price hike of them all. Stock cards will start at $999, $300 above the GTX 1080 Ti. Meanwhile NVIDIA’s own Founders Edition card carries a $200 premium on top of that, retailing for $1199, the same price as the last-generation Titan Xp. The Ti/Titan dichotomy has always been a bit odd in recent years, so it would seem that NVIDIA has simply replaced the Titan with the Ti, and priced it to match.
GeForce RTX 2080
Moving down the stack, we have the GeForce RTX 2080. Normally the king of an NVIDIA launch product stack, this cycle’s x80 card is instead playing second-seat to the RTX 2080 Ti. However in terms of pricing and the general market position, it remains in a similar position.
NVIDIA GeForce x80 Specification Comparison

| | RTX 2080 Founder's Edition | RTX 2080 | GTX 1080 | GTX 980 |
|---|---|---|---|---|
| CUDA Cores | 2944 | 2944 | 2560 | 2048 |
| ROPs | 64? | 64? | 64 | 64 |
| Core Clock | 1515MHz | 1515MHz | 1607MHz | 1126MHz |
| Boost Clock | 1800MHz | 1710MHz | 1733MHz | 1216MHz |
| Memory Clock | 14Gbps GDDR6 | 14Gbps GDDR6 | 10Gbps GDDR5X | 7Gbps GDDR5 |
| Memory Bus Width | 256-bit | 256-bit | 256-bit | 256-bit |
| VRAM | 8GB | 8GB | 8GB | 4GB |
| Single Precision Perf. | 10.6 TFLOPs | 10.1 TFLOPs | 8.9 TFLOPs | 5.0 TFLOPs |
| "RTX-OPS" | 60T | 60T | N/A | N/A |
| TDP | 225W | 215W | 180W | 165W |
| GPU | Unknown Turing | Unknown Turing | GP104 | GM204 |
| Architecture | Turing | Turing | Pascal | Maxwell |
| Manufacturing Process | TSMC 12nm "FFN" | TSMC 12nm "FFN" | TSMC 16nm | TSMC 28nm |
| Launch Date | 09/20/2018 | 09/20/2018 | 05/27/2016 | 09/18/2014 |
| Launch Price | $799 | $699 | MSRP: $599, Founders: $699 | $549 |
NVIDIA has not disclosed anything about the GPU used here. As I highly doubt it’s using Big Turing, I expect we’re looking at the next Turing down, which would presumably be TU104. And although we don’t have die size or transistor counts, in terms of functionality it appears to be a straight smaller version of Big Turing, featuring a proportional number of CUDA, tensor, and RT cores.
Altogether we're looking at 2944 CUDA cores paired with what should be 368 tensor cores. I expect that like the RTX 2080 Ti, the RTX 2080 is not a fully-enabled TU104 part, especially as the similar Quadro RTX 5000 has slightly better specifications. Also like the RTX 2080 Ti, clockspeeds have dropped a bit from generation to generation, as the base clock is down to 1515MHz and the boost clock to 1710MHz. All told we're looking at a pure CUDA core compute throughput of 10.1 TFLOPs, about 13% higher than the GTX 1080. Or if we compare it to the RTX 2080 Ti, we'd see around 75% of the expected compute/tensor performance, a gap only a bit larger than the one we saw between the GTX 1080 and GTX 1080 Ti.
Meanwhile it’s clear that the card will come with a fully enabled memory bus, meaning we’re looking at 8GB of GDDR6 running at 14Gbps, on top of a 256-bit memory bus. Relative to the GTX 1080 this is an even more significant 40% increase in memory bandwidth.
As for TDPs, they’ve gone up for this band of cards. The stock RTX 2080 will have a 215W TDP, up 30W from the GTX 1080’s 180W TDP, and an even bigger increase if we look at GTX 980’s 165W TDP. It’s no secret that NVIDIA is fighting a losing battle with Moore’s Law here, and barring massive efficiency improvements, there is a need to increase TDPs to keep up overall performance. TU104 is undoubtedly also a big chip, and without a full node shrink, it would seem that NVIDIA has to pay a power penalty instead. In the meantime this higher TDP also negates some of the RTX 2080 Ti’s power disadvantage, as now that gap is just 35W instead of 65W.
Past that, NVIDIA is offering an overclocked Founders Edition card here as they are with the RTX 2080 Ti. The RTX 2080 Founders Edition will ship with an 1800MHz boost clock (up 5%), and the TDP will be 10W higher at 225W.
Power delivery for the 2080 FE and similar reference cards will be a 6pin + 8pin configuration, which on paper is good for 300W. NVIDIA's official PCB photo for the card shows a similar VRM configuration as the 2080 Ti, so that means we're looking at what appears to be 10 phases driving the GPU among 13 phases in total. NVIDIA has also disclosed that the RTX 2080 Founders Edition in particular has an overclocking power limit of 280W, confirming earlier claims about being more overclocking-friendly than past reference designs.
The Founders Edition card will also retain the same dual fan reference cooling design as its bigger sibling, right down to its 10.5-inch long physical dimensions. This also includes the card-length vapor chamber that's used on the RTX 2080 Ti as well. So the only difference between the two cards, other than the power connectors, will be on the inside. Which also means that NVIDIA’s 2080 reference PCB will use the same 3x DisplayPort + 1x HDMI + 1x USB Type-C configuration as the other card.
Finally, in terms of pricing, the RTX 2080 will also be seeing a price hike over the last generation. Though not as significantly as the RTX 2080 Ti. The MSRP for a stock card will be $699, $100 over the launch price of the GTX 1080 and $200 over its current list price. Meanwhile the Founders Edition card will carry a further $100 premium, pushing it to $799. Finally, like the RTX 2080 Ti, the 2080 will be shipping in the first wave of cards on September 20th.
GeForce RTX 2070
The final member of the new GeForce RTX family is the GeForce RTX 2070. True to its roots, this is the "value" enthusiast card, giving up some of the RTX 2080's performance in exchange for a lower price. Though with prices starting at $499, "value" and "cheap" are not the same thing.
NVIDIA GeForce x70 Specification Comparison

| | RTX 2070 Founder's Edition | RTX 2070 | GTX 1070 | GTX 970 |
|---|---|---|---|---|
| CUDA Cores | 2304 | 2304 | 1920 | 1664 |
| ROPs | 64? | 64? | 64 | 64 |
| Core Clock | 1410MHz | 1410MHz | 1506MHz | 1050MHz |
| Boost Clock | 1710MHz | 1620MHz | 1683MHz | 1178MHz |
| Memory Clock | 14Gbps GDDR6 | 14Gbps GDDR6 | 8Gbps GDDR5 | 7Gbps GDDR5 |
| Memory Bus Width | 256-bit | 256-bit | 256-bit | 256-bit |
| VRAM | 8GB | 8GB | 8GB | 4GB |
| Single Precision Perf. | 7.9 TFLOPs | 7.5 TFLOPs | 6.5 TFLOPs | 3.9 TFLOPs |
| "RTX-OPS" | 45T | 45T | N/A | N/A |
| TDP | 185W | 175W | 150W | 145W |
| GPU | Unknown Turing | Unknown Turing | GP104 | GM204 |
| Architecture | Turing | Turing | Pascal | Maxwell |
| Manufacturing Process | TSMC 12nm "FFN" | TSMC 12nm "FFN" | TSMC 16nm | TSMC 28nm |
| Launch Date | 10/2018 | 10/2018 | 06/10/2016 | 09/18/2014 |
| Launch Price | $599 | $499 | MSRP: $379, Founders: $449 | $329 |
Like the 2080, we don't know the specific GPU being used here, but it's almost certainly a more heavily cut-down TU104. NVIDIA has shaved off several SMs here, resulting in 2304 CUDA cores and what should be 288 tensor cores. Meanwhile ray tracing performance is rated at 6 GigaRays/second.
Like the other RTX cards, clockspeeds have dropped a bit versus the previous generation; the base clock now starts at 1410MHz, and the boost clock is 1620MHz. On paper then, compute throughput works out to 7.5 TFLOPs, up 15% from GTX 1070. Or if we compare it to the 2080, the 2070 should deliver around 75% of its bigger sibling’s compute performance, which makes the jumps between the 2070, 2080, and 2080 Ti all very symmetrical. This is also an ever so slightly smaller gap than what was between the GTX 1080 and GTX 1070.
Also like the RTX 2080, this appears to be a fully enabled memory configuration. Meaning we’re looking at 8GB of GDDR6 running at 14Gbps, on top of a 256-bit memory bus. Relative to the GTX 1070 this is the single greatest bandwidth increase of all of the RTX cards; the 2070 will enjoy a 75% increase in memory bandwidth over its Pascal predecessor, as the GTX 1070 never did use GDDR5X.
TDPs have gone up here as well. The RTX 2070 is rated for 175W, up from 150W for the GTX 1070, and 145W for the GTX 970. As with the RTX 2080, it looks like NVIDIA is paying for their performance and new features via higher power consumption in lieu of a full node shrink.
Past that, NVIDIA is offering an overclocked Founders Edition card here as well. The RTX 2070 Founders Edition will ship with a 1710MHz boost clock (up 6%), and the TDP will be 10W higher at 185W.
Power delivery for the 2070 FE and similar reference cards will be a single 8pin connector, which on paper is good for 225W. Notably, thanks in part to cutting out a power connector, this card is shorter than the RTX 2080 – as is often the case for x70 cards – coming in at just 9 inches long. Consequently, while the Founders Edition card retains the same basic industrial design as the other RTX cards, it won't use the exact same shroud, owing to the shorter size and lack of an NVLink connector.
Otherwise in terms of display outputs, NVIDIA’s reference PCB is the same as the other cards: 3x DisplayPort + 1x HDMI + 1x USB Type-C.
Finally, in terms of pricing, like all the other RTX cards, the RTX 2070 will be seeing a price hike over the last generation. The MSRP for a stock card will be $499, $120 over the launch price of the GTX 1070 and around $100 over its current list price. Meanwhile the Founders Edition card will carry a further $100 premium, pushing it to $599.
However unlike the top two cards, the RTX 2070 will not be shipping on September 20th. While NVIDIA has announced the card, both their Founders Edition card as well as partner cards will not be shipping until some point in October. Consequently, pre-orders are not available at this time. Which coincidentally brings us to our final point…
Pre-Orders Starting Now
As mentioned in the introduction to this article, NVIDIA and its board partners are already taking pre-orders on the new GeForce RTX cards. As a general piece of advice, we caution against pre-ordering new hardware until it can be properly reviewed by a reputable third party (i.e. us). However in the case of the GeForce RTX series I feel that this is especially important, as NVIDIA has provided no meaningful guidance when it comes to performance expectations. So we don’t even know how well these cards will perform relative to their Pascal predecessors on current-generation games.
None the less, unless NVIDIA is sitting on a massive launch inventory, I have to admit that I have trouble believing that retailers will remain in stock ahead of the September 20th launch. Launch-day sell-outs have become common for video cards, and even with the unfortunate use of pre-orders here, it wouldn’t be surprising to see pre-orders run out before we reach the 20th. So potential buyers will have some hard choices to make if they want to ensure they have a card on launch day.
Meanwhile as far as pricing is concerned, while NVIDIA does have their $999/$699/$499 baseline MSRPs, the reality is that you won't be able to find cards at these prices, and likely won't be able to for weeks or months. The launch-day sell-out phenomenon means that board partners have prioritized their higher-end card designs, with pricing to match. As of this afternoon, I have not seen an RTX 2080 card listed below $749, and most are at or over the $799 price tag of the Founders Edition card. The gap is even bigger for RTX 2080 Ti cards, where I haven't seen any for less than $1150. So there's certainly merit to the argument that, at least for the launch, prices are effectively closer to $750 and $1150 respectively.
None the less, if you are looking to pre-order a card, Newegg, Amazon, and NVIDIA are all taking pre-orders while they last. So at the very least there is a reasonable choice of retailers to choose from. We have not had a chance to dig into any of the board partners’ cards yet, but it would appear that this is going to be a semi-custom launch: that is, everyone is shipping a card using NVIDIA’s reference PCB and then their own custom cooler on top of it. So cooling aside, all of the launch-day cards should offer similar features.