Name: The NVIDIA GeForce GTX 1660 Ti Review, Feat. EVGA XC GAMING: Turing Sheds RTX for the Mainstream Market

Original Link: https://www.anandtech.com/show/13973/nvidia-gtx-1660-ti-review-feat-evga-xc-gaming

by Ryan Smith & Nate Oh on February 22, 2019 9:00 AM EST

When NVIDIA put their plans for their consumer Turing video cards into motion, the company bet big, and in more ways than one. In the first sense, NVIDIA dedicated whole logical blocks to brand-new graphics and compute features – ray tracing and tensor core compute – and they would need to sell developers and consumers alike on the value of these features, something that is no easy task. In the second sense however, NVIDIA also bet big on GPU die size: these new features would take up a lot of space on the 12nm FinFET process they’d be using.

The end result is that all of the Turing chips we’ve seen thus far, from TU102 to TU106, are monsters in size; even TU106 is 445mm², never mind the flagship TU102. And while the full economic consequences that go with that decision are NVIDIA’s to bear, for the first year or so of Turing’s life, all of that die space that is driving up NVIDIA’s costs isn’t going to contribute to improving NVIDIA’s performance in traditional games; it’s a value-added feature. Which is all workable for NVIDIA in the high-end market where they are unchallenged and can essentially dictate video card prices, but it’s another matter entirely once you start approaching the mid-range, where the AMD competition is alive and well.

Consequently, in preparing for their cheaper, sub-$300 Turing cards, NVIDIA had to make a decision: do they keep the RT and tensor cores in order to offer these features across the line – at a literal cost to both consumers and NVIDIA – or do they drop these features in order to make a leaner, more competitive chip? As it turns out, NVIDIA has opted for the latter, producing a new Turing GPU that is leaner and meaner than anything that’s come before it, but also very different from its predecessors for this reason.

That GPU is TU116, and it’s part of what will undoubtedly become a new sub-family of Turing GPUs for NVIDIA as the company starts rolling out Turing into the lower half of the video card market. Kicking things off in turn for this new GPU is NVIDIA’s latest video card, the GeForce GTX 1660 Ti. Launching today at $279, it’s destined to replace NVIDIA’s GTX 1060 6GB in the market and is NVIDIA’s new challenger for the mainstream video card market.

NVIDIA GeForce Specification Comparison
	GTX 1660 Ti	RTX 2060 Founders Edition	GTX 1060 6GB (GDDR5)	RTX 2070
CUDA Cores	1536	1920	1280	2304
ROPs	48	48	48	64
Core Clock	1500MHz	1365MHz	1506MHz	1410MHz
Boost Clock	1770MHz	1680MHz	1708MHz	1620MHz FE: 1710MHz
Memory Clock	12Gbps GDDR6	14Gbps GDDR6	8Gbps GDDR5	14Gbps GDDR6
Memory Bus Width	192-bit	192-bit	192-bit	256-bit
VRAM	6GB	6GB	6GB	8GB
Single Precision Perf.	5.5 TFLOPS	6.5 TFLOPS	4.4 TFLOPS	7.5 TFLOPS FE: 7.9 TFLOPS
"RTX-OPS"	N/A	37T	N/A	45T
SLI Support	No	No	No	No
TDP	120W	160W	120W	175W FE: 185W
GPU	TU116 (284mm²)	TU106 (445mm²)	GP106 (200mm²)	TU106
Transistor Count	6.6B	10.8B	4.4B	10.8B
Architecture	Turing	Turing	Pascal	Turing
Manufacturing Process	TSMC 12nm "FFN"	TSMC 12nm "FFN"	TSMC 16nm	TSMC 12nm "FFN"
Launch Date	2/22/2019	1/15/2019	7/19/2016	10/17/2018
Launch Price	$279	$349	MSRP: $249 FE: $299	MSRP: $499 FE: $599

We’ll go into the full ramifications of what NVIDIA has (and hasn’t) taken out of TU116 on the next page, but at a high level it’s still every bit a Turing GPU, save the RTX functionality (RT and tensor cores). This means that it has the same core architecture in its SMs, and is directly comparable to the likes of the RTX 2060. Or to flip things around the other direction, versus the older Pascal and Maxwell-based video cards, it comes with all of Turing’s performance and efficiency benefits for traditional graphics workloads.

Compared to RTX 2060 then, the GTX 1660 Ti is actually rather similar. For this fully-enabled TU116 card, NVIDIA has dialed back on the number of SMs a bit, going from 30 to 24, and memory clockspeeds have dropped as well, from 14Gbps to 12Gbps. But past that, the two cards are closer in specifications than we might expect to see for a $70 price tag difference, especially as NVIDIA has kept the 6GB of GDDR6 on a 192-bit memory bus. In an added quirk, the GTX 1660 Ti actually has a slightly higher average boost clockspeed than the RTX 2060, with its 1770MHz clockspeed giving it a 5% edge here.

The end result is that, on paper, the GTX 1660 Ti actually has a bit more ROP pixel pushing power than its bigger sibling thanks to that 5% boost clock advantage. However the drop in the SM count definitely hits compute and texture performance, where GTX 1660 Ti is going to deliver around 85% of RTX 2060’s compute and shading throughput. Or to frame things in reference to the GTX 1060 6GB it replaces, the new card offers around 24% more compute/shader throughput (before taking architecture into account), a much smaller 4% increase in ROP throughput, and a very sizable 50% increase in memory bandwidth.
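
For readers who want to reproduce these on-paper ratios, here is a minimal Python sketch built from the specification table above. It assumes each CUDA core retires one fused multiply-add (2 FLOPs) per clock and each ROP writes one pixel per clock at the rated boost clock; the dictionary layout and function names are purely illustrative.

```python
# Spec-table values from this article; the dict layout is illustrative only.
CARDS = {
    "GTX 1660 Ti":  {"cores": 1536, "rops": 48, "boost_mhz": 1770, "mem_gbps": 12, "bus_bits": 192},
    "RTX 2060":     {"cores": 1920, "rops": 48, "boost_mhz": 1680, "mem_gbps": 14, "bus_bits": 192},
    "GTX 1060 6GB": {"cores": 1280, "rops": 48, "boost_mhz": 1708, "mem_gbps": 8,  "bus_bits": 192},
}

def fp32_tflops(card):
    # Assumption: one FMA (2 FLOPs) per CUDA core per clock at the boost clock.
    return 2 * card["cores"] * card["boost_mhz"] / 1e6

def pixel_rate_gpix(card):
    # Assumption: one pixel per ROP per clock at the boost clock.
    return card["rops"] * card["boost_mhz"] / 1e3

def mem_bandwidth_gbs(card):
    # Per-pin data rate (Gbps) times bus width (bits), converted to bytes.
    return card["mem_gbps"] * card["bus_bits"] / 8

def ratio(metric, a, b):
    # How card `a` compares to card `b` on the given metric.
    return metric(CARDS[a]) / metric(CARDS[b])

if __name__ == "__main__":
    for label, metric in [("FP32", fp32_tflops), ("ROP", pixel_rate_gpix), ("Memory", mem_bandwidth_gbs)]:
        vs_2060 = ratio(metric, "GTX 1660 Ti", "RTX 2060")
        vs_1060 = ratio(metric, "GTX 1660 Ti", "GTX 1060 6GB")
        print(f"{label}: {vs_2060:.2f}x of RTX 2060, {vs_1060:.2f}x of GTX 1060 6GB")
```

These are peak figures only; as noted, they ignore the architectural differences between Turing and Pascal.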

Speaking of memory bandwidth, NVIDIA’s continued use of a 192-bit memory bus in this segment continues to be a somewhat vexing choice since it leads to such odd memory amounts. I’ll fully admit I would have liked to have seen 8GB here, but then that was the case for RTX 2060 as well. The flip side being that at least they aren’t trying to ship a card with just a 128-bit memory bus, as was the case for GTX 960. This puts GTX 1660 Ti in an interesting spot in terms of memory bandwidth, since it’s benefitting from the jump to GDDR6; if you thought the GTX 1060 could use a little more memory bandwidth, GTX 1660 Ti gets it in spades. This has also allowed NVIDIA to opt for cheaper 12Gbps GDDR6 VRAM, marking the first time we’ve seen this in any video card.
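
To illustrate why bus width dictates the "odd" capacities, here is a rough Python sketch. It assumes one memory chip per 32-bit channel and chip densities of 1GB or 2GB; both are simplifying assumptions on my part rather than anything NVIDIA has specified for this card, and the function name is hypothetical.

```python
def vram_options_gb(bus_bits, chip_gb_options=(1, 2), chip_width_bits=32):
    # Assumption: the bus is populated with one chip per 32-bit channel,
    # and every chip on the card has the same density.
    chips = bus_bits // chip_width_bits
    return [chips * gb for gb in chip_gb_options]

if __name__ == "__main__":
    for bus in (128, 192, 256):
        print(f"{bus}-bit bus: {vram_options_gb(bus)} GB")
```

Under those assumptions a 192-bit bus can offer 6GB or 12GB, while 8GB would require a 256-bit (or 128-bit) bus.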

Finally, taking a look at power consumption, we see that NVIDIA is going to be holding the line at 120W, which is the same TDP as the GTX 1060 6GB. This is notable because all of the other Turing cards to date have had higher TDPs than the cards they replace, leading to a broad case of generational TDP inflation. Of course we’ll see what actual power consumption is like in our testing, but right off the bat NVIDIA is setting up GTX 1660 Ti to be noticeably more power efficient than the RTX 20 series cards.

Wait, It's a GTX Card?

Along with the new TU11x family of GPUs, for this launch NVIDIA is also creating a new family of video cards: the GeForce GTX 16 series. With GTX 1660 Ti and its obligatory siblings lacking support for NVIDIA’s RTX family of features, the company has decided to clarify their product naming in a way that only NVIDIA can. The end result is that along with keeping the GTX prefix rather than RTX – since these parts obviously lack RTX functionality – the company is also giving them a lower series number. Overall it’s probably for the best that NVIDIA didn’t include these cards with the 20 series, lest we get another GeForce 4 situation.

But on the flip side, the number “16” also doesn’t have any great meaning to it; other than not being “20” the number is somewhat arbitrary. According to NVIDIA, they essentially picked it because they wanted a number close to 20 to indicate that the new GPU is very close in functionality and performance to TU10x, and thus “16” instead of “11” or the like. Of course I’m not sure calling it the GTX 1660 Ti is doing anyone any favors when the next card up is the RTX 2060 (sans Ti), but there’s nonetheless a somewhat clear numerical progression here – and at least for the moment, one not based on memory capacity.

Price, Product Positioning, & The Competition

Moving on, unlike NVIDIA’s other Turing card launches up until now – and unlike the GTX 1060 6GB – the GTX 1660 Ti is not getting a reference card release. Instead this is a pure virtual launch, as NVIDIA calls it, meaning all the cards hitting the shelves are customized vendor cards. Traditionally these launches tend to be closer to semi-custom cards – partners tend to use NVIDIA’s internal reference board design for their first cards – so we’ll have to see what pops up over the coming weeks and months. For now then, this means we’re going to see a lot of single and dual-fan cards, similar to the kinds of designs used for a lot of the GTX 1060 cards and some of the RTX 2070 cards.

Another constant across the Turing family has been price inflation, and the GTX 1660 Ti is no exception. With a launch price of $279, the new card is launching at $30 above the GTX 1060 6GB it replaces. This is a lot better than the $349 that NVIDIA wants for the RTX 2060, but in case anyone thought that the $250 price tag of the GTX 1060 was a fluke, then it’s clear that sub-$300 is the new norm for xx60 cards, and not sub-$200 as the GTX 960 flirted with. It’s also worth noting that NVIDIA won’t be launching with any bundles here; neither the RTX Game On bundle nor the GTX 1060 Fortnite bundles will be in play here, so what you see is what you get.

In terms of positioning against their own cards, NVIDIA is rolling out the GTX 1660 Ti as the successor to the GTX 1060 6GB, the latter of which are becoming increasingly rare in the market as NVIDIA’s unplanned Pascal stockpile is finally drawn down. So the GTX 1660 Ti and GTX 1060 won’t be sharing space on store shelves for long. However like the other Turing cards, the GTX 1660 Ti is not a true generational successor to the GTX 1060; at roughly 36% faster, NVIDIA is not expecting anyone to upgrade from their mid-range Pascal card to this. Instead, NVIDIA’s marketing efforts are going to be heavily focused on enticing GTX 960 users, who are a further generation back, to finally upgrade. In that respect the GTX 1660 Ti has a very large performance advantage, but this may be a tough sell since the GTX 960 launched at a much cheaper $199 price point.
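
As a back-of-the-envelope check on that value proposition, the following Python sketch combines the rough 36% performance figure with the two launch prices from the table. It uses launch MSRPs only, not street prices, and the function name is just for illustration.

```python
def perf_per_dollar_gain(perf_ratio, new_price, old_price):
    # Relative change in performance per dollar between two cards.
    return perf_ratio / (new_price / old_price) - 1

if __name__ == "__main__":
    price_increase = 279 / 249 - 1
    gain = perf_per_dollar_gain(1.36, new_price=279, old_price=249)
    print(f"Price increase: {price_increase:.0%}, perf-per-dollar gain: {gain:.0%}")
```

In other words, roughly a third more performance for about 12% more money works out to a little over 20% better performance per dollar at launch pricing.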

As for AMD, the launch of the GTX 1660 Ti finally puts a Turing card in competition with their Polaris cards, particularly the $279 Radeon RX 590, a fight that the Radeon cannot win. While AMD hasn’t announced any price changes for the RX 590 at this time, AMD will have little choice but to bring it down in price.

Instead, AMD’s competitor for the GTX 1660 Ti looks like it will be the Radeon RX Vega 56. The company sent word last night that they are continuing to work with partners to offer lower promotional prices on the card, including a single model that was available for $279, but as of press time has since sold out. Notably, AMD is asserting that this is not a price drop, so there’s an unusual bit of fence sitting here; the company may be waiting to see what actual, retail GTX 1660 Ti card prices end up like. So I’m not wholly convinced we’re going to see too many $279 Vega 56 cards, but we’ll see. If nothing else, AMD’s Raise the Game Bundle is being offered, giving them an edge over NVIDIA in terms of pack-in games.

Q1 2019 GPU Pricing Comparison
AMD	Price	NVIDIA
Radeon RX Vega 64	$499	GeForce RTX 2070
	$349	GeForce RTX 2060
	$329	GeForce GTX 1070
Radeon RX Vega 56* / Radeon RX 590	$279	GeForce GTX 1660 Ti
	$249	GeForce GTX 1060 6GB (1280 cores)
Radeon RX 580 (8GB)	$179/$189	GeForce GTX 1060 3GB (1152 cores)

TU116: When Turing Is Turing… And When It Isn’t

Diving a bit deeper into matters, we have the new TU116 GPU at the heart of the GTX 1660 Ti. While NVIDIA does not announce future products in advance, you can expect that this will be the first of at least a couple of GPUs in what’s now the TU11x family, as NVIDIA is going to want to follow the same streamlined strategy for the eventual successors to GP107 and possibly GP108.

TU116 is an interesting piece of kit, both because of the decisions that led to this point and because of both the drawbacks and advantages of excising some of Turing’s functionality. As mentioned earlier in this article, NVIDIA has made a very deliberate decision to cut out their RTX functionality – the ray tracing cores and tensor cores – in order to produce a GPU that’s better suited for traditional rendering. The end result is a smaller, cheaper to produce GPU. But it also means that NVIDIA has to change how they go about promoting cards based on this GPU.

With a die size of 284mm², TU116 tells a story in and of itself. This makes it 36% smaller than the next-smallest Turing GPU, TU106. Similarly, the transistor count has come down from 10.8 billion to 6.6 billion. This greatly improves the manufacturability of the GPU and drives down its costs, especially since NVIDIA will be going into more competitive markets with it than the other TU10x GPUs. Still, TU116 is some 42% bigger than the 200mm² GP106 die that it replaces, so even though it’s more efficient, NVIDIA is still dealing with a significant increase in die size on a generation-by-generation basis.

Unfortunately, TU116 doesn’t give us a terribly good baseline for determining how much of a TU10x SM was composed of RTX hardware. TU116 doesn’t just drop the RTX hardware in its SMs, but it’s a smaller design overall; fewer SMs, fewer memory channels, and fewer ROPs. So we can’t fully separate the savings of dropping RTX from the savings of making a lighter GPU in general. However it’s interesting to note that on a relative basis, the transistor count difference between TU116 and TU106 is almost exactly the same as GP106 and GP104: there are 39% fewer transistors when stepping down. So later on it will give us an opportunity to look at performance and see if the performance gap between the GTX 1660 Ti and RTX 2070 – the full-fat cards of their respective GPUs – is anything like the sizable gap between the GTX 1060 6GB and the GTX 1080.
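
The transistor step-down comparison is easy to verify with a few lines of Python. The TU116, TU106, and GP106 counts come from the specification table; the 7.2 billion figure for GP104 does not appear in this article and is the commonly cited number, so treat it as an outside assumption.

```python
# Transistor counts in billions. GP104's 7.2B is not from this article.
TRANSISTORS_B = {"TU116": 6.6, "TU106": 10.8, "GP106": 4.4, "GP104": 7.2}

def step_down_pct(small, big):
    # Percentage of transistors shed when stepping down from `big` to `small`.
    return (1 - TRANSISTORS_B[small] / TRANSISTORS_B[big]) * 100

if __name__ == "__main__":
    print(f"TU106 -> TU116: {step_down_pct('TU116', 'TU106'):.1f}% fewer")
    print(f"GP104 -> GP106: {step_down_pct('GP106', 'GP104'):.1f}% fewer")
```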

NVIDIA Turing GPU Comparison
	TU102	TU104	TU106	TU116
CUDA Cores	4608	3072	2304	1536
SMs	72	48	36	24
Texture Units	288	192	144	96
RT Cores	72	48	36	N/A
Tensor Cores	576	384	288	N/A
ROPs	96	64	64	48
Memory Bus Width	384-bit	256-bit	256-bit	192-bit
L2 Cache	6MB	4MB	4MB	1.5MB
Register File (Total)	18MB	12MB	9MB	6MB
Architecture	Turing	Turing	Turing	Turing
Manufacturing Process	TSMC 12nm "FFN"	TSMC 12nm "FFN"	TSMC 12nm "FFN"	TSMC 12nm "FFN"
Die Size	754mm²	545mm²	445mm²	284mm²
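
Dividing the table's totals by SM count shows how uniform the SM building block is across the family. This is a small Python sketch using only the figures above; the whole-die area per SM is a deliberately crude number, since it lumps in memory controllers, ROPs, and every other non-SM block.

```python
# Values from the Turing comparison table; the dict layout is illustrative.
TURING = {
    "TU102": {"cores": 4608, "sms": 72, "tex": 288, "regfile_mb": 18, "die_mm2": 754},
    "TU104": {"cores": 3072, "sms": 48, "tex": 192, "regfile_mb": 12, "die_mm2": 545},
    "TU106": {"cores": 2304, "sms": 36, "tex": 144, "regfile_mb": 9,  "die_mm2": 445},
    "TU116": {"cores": 1536, "sms": 24, "tex": 96,  "regfile_mb": 6,  "die_mm2": 284},
}

def per_sm(gpu):
    g = TURING[gpu]
    return {
        "cores": g["cores"] // g["sms"],
        "tex": g["tex"] // g["sms"],
        "regfile_kb": g["regfile_mb"] * 1024 / g["sms"],
        # Crude: whole-die area divided by SM count, not actual SM area.
        "die_mm2": g["die_mm2"] / g["sms"],
    }

if __name__ == "__main__":
    for name in TURING:
        print(name, per_sm(name))
```

Every chip comes out to the same CUDA core, texture unit, and register file allotment per SM, which is consistent with TU116 keeping the same core SM design.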

But getting back to architecture, this launch is one of a handful of times we’ve seen NVIDIA use dissimilar GPUs in their consumer cards, and it’s a situation without a good parallel. NVIDIA has done plenty of non-homogeneous families in the past, but typically the black sheep of the family is the high-end server GPU, e.g. GP100, where it gets additional features not found in the consumer lineup. Instead the Turing family ends up having a split right down the middle.

The good news for consumers is that, outside of RTX functionality, TU116 and its ilk – which for the sake of simplicity I’m going to call Turing Minor from here on out – is functionally equivalent to TU102/TU104/TU106 (Turing Major). Turing Minor has the exact same DirectX feature set, the exact same core compute architecture (right on down to cache sizes), the exact same video and display blocks, etc. The RT and tensor cores really are the only thing that’s changed.

The situation looks much the same for programmers & developers as well: on the current press drivers the GTX 1660 Ti reports itself as a Compute Capability 7.5 card – the same CC version as all of the Turing Major cards – so developers won’t have to even compile separate code for Turing Minor cards. So long as their code can handle a lack of tensor cores, at least.

(As a brief aside, as a performance exercise we ran the tensor version of our HGEMM benchmark on the GTX 1660 Ti. And it completed?! Performance was a bit lower, at 10.8 TFLOPS versus 11 TFLOPS with tensors disabled, but it did complete. Which indicates that either NVIDIA has been less than forthcoming on TU116, or in order to keep all Turing parts on CC 7.5, they are sending tensor ops through the CUDA cores on Turing Minor cards.)
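
For context on how a GEMM benchmark arrives at a TFLOPS figure, here is a minimal Python sketch of the usual convention: an M×N×K matrix multiply is counted as 2·M·N·K floating point operations, divided by the run time. The matrix size and timing below are hypothetical placeholders, not the parameters of our actual benchmark.

```python
def gemm_flops(m, n, k):
    # One multiply and one add per term of each inner product.
    return 2 * m * n * k

def gemm_tflops(m, n, k, seconds):
    return gemm_flops(m, n, k) / seconds / 1e12

if __name__ == "__main__":
    # Hypothetical example: an 8192 x 8192 x 8192 multiply finishing in 0.1 s.
    print(f"{gemm_tflops(8192, 8192, 8192, 0.1):.2f} TFLOPS")
```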

Looking at the TU116 SM, what we find is something almost identical to the SM diagrams used for Turing Major, with the SM arranged into 4 partitions, each with their own warp scheduler and set of CUDA cores, while all 4 partitions share the L1 cache and texture units. Cache sizes and register file sizes are all unchanged here, so average throughput and register pressure are similarly unchanged as well. The one standout is that in replacing the tensor cores in their diagram, NVIDIA has opted to draw in FP16 cores, which is a bit of a stretch given what we know about the Turing architecture. NVIDIA only sent out this diagram yesterday, so I’m still checking with them to see if this is the company taking a creative liberty to highlight Turing’s other functionality, or if there’s more to it that NVIDIA is downplaying to keep things simple (à la Kepler and GK104).

Update: NVIDIA has gotten back to me this morning. As it turns out, the FP16 cores in the diagram are quite literal. For more information, please see below.

The Curious Case of FP16: Tensor Cores vs. Dedicated Cores

Even though Turing-based video cards have been out for over 5 months now, every now and then I’m still learning something new about the architecture. And today is one of those days.

Something that escaped my attention with the original TU102 GPU and the RTX 2080 Ti was that for Turing, NVIDIA changed how standard FP16 operations were handled. Rather than processing it through their FP32 CUDA cores, as was the case for GP100 Pascal and GV100 Volta, NVIDIA instead started routing FP16 operations through their tensor cores.

The tensor cores are of course FP16 specialists, and while sending standard (non-tensor) FP16 operations through them is major overkill, it’s certainly a valid route to take with the architecture. In the case of the Turing architecture, this route offers a very specific perk: it means that NVIDIA can dual-issue FP16 operations with either FP32 operations or INT32 operations, essentially giving the warp scheduler a third option for keeping the SM partition busy. Note that this doesn’t really do anything extra for FP16 performance – it’s still 2x FP32 performance – but it gives NVIDIA some additional flexibility.

Of course, as we just discussed, the Turing Minor does away with the tensor cores in order to allow for a leaner GPU. So what happens to FP16 operations? As it turns out, NVIDIA has introduced dedicated FP16 cores!

These FP16 cores are brand new to Turing Minor, and have not appeared in any past NVIDIA GPU architecture. Their purpose is functionally the same as running FP16 operations through the tensor cores on Turing Major: to allow NVIDIA to dual-issue FP16 operations alongside FP32 or INT32 operations within each SM partition. And because they are just FP16 cores, they are quite small. NVIDIA isn’t giving specifics, but going by throughput alone they should be a fraction of the size of the tensor cores they replace.

To users and developers this shouldn’t make a difference – CUDA and other APIs abstract this and FP16 operations are simply executed wherever the GPU architecture intends for them to go – so this is all very transparent. But it’s a neat insight into how NVIDIA has optimized Turing Minor for die size while retaining the basic execution flow of the architecture.

Now the bigger question in my mind: why is it so important to NVIDIA to be able to dual-issue FP32 and FP16 operations, such that they’re willing to dedicate die space to fixed FP16 cores? Are they expecting these operations to be frequently used together within a thread? Or is it just a matter of execution ports and routing? But that is a question we’ll have to save for another day.

Turing Minor: Turing Sans RTX

For better or worse, the launch of GTX 1660 Ti and Turing Minor means that NVIDIA has needed to adjust how they go about promoting the new cards and the Turing architecture. While Turing launched with a laundry list of features, most of which had nothing to do with RTX, the broader consumer zeitgeist definitely focused on RTX and for good reason: compared to all of the low-level architectural changes under the hood, ray tracing, DLSS, and other RTX features are a lot more visible, and for NVIDIA they were easier to promote. This means that for Turing Minor NVIDIA instead has to focus on the low-level architectural improvements in Turing, which I think is great since these were largely overlooked at the Turing launch.

While I won’t recap our entire Turing deep dive here, relative to Pascal the big difference here is the numerous steps NVIDIA has taken to improve their IPC and overall efficiency. For example, Turing made the surprising move to ditch regular forms of Instruction Level Parallelism (ILP) by dropping the second warp scheduler dispatch port. Instead, each warp scheduler fires off a single set of instructions on each clock, taking advantage of the fact that it takes 2 (or more) clocks to issue a full warp in order to interleave a second instruction in.

This ILP change goes hand-in-hand with partitioning the SM into 4 blocks instead of 2, which serves to help better control resource contention among the warps and CUDA cores. In fact at a high level, a Turing SM looks a lot more like some of NVIDIA’s server-focused GPUs than their consumer-focused GPUs; there’s a lot more plumbing here in various forms to support the CUDA cores and to help them achieve better performance, rather than just throwing more CUDA cores at the problem. The net result is that while we don’t have metrics from NVIDIA, I fully expect that the ratio of supporting hardware and glue logic to CUDA cores is significantly higher on Turing than it was on GP106 Pascal. Though by the same token, I expect the SMs as a whole are larger than Pascal’s as well, which is certainly reflected in the die size.

A big part of this change, in turn, is the fact that NVIDIA broke out their Integer cores into their own block. Previously a separate branch of the FP32 CUDA cores, the INT32 cores can now be addressed separately from the FP32 cores, which combined with instruction interleaving allows NVIDIA to keep both occupied at the same time. Now make no mistake: floating point math is still the heart and soul of shading and GPU compute, however integer performance has been slowly increasing in importance over time as well, especially as shaders get more complex and there’s increased usage of address generation and other INT32 functions. This change is a big part of the IPC gains NVIDIA is claiming for Turing architecture.
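
To get a feel for why separately addressable INT32 cores plus instruction interleaving matter, consider the toy Python model below. It is emphatically not a model of the real hardware: it assumes a scheduler that issues at most one instruction per clock, with each instruction occupying its execution pipe for 2 clocks, and it ignores dependencies, latency, and memory entirely. The instruction mixes are hypothetical.

```python
def cycles_shared_pipe(n_fp, n_int, issue_clocks=2):
    # Toy baseline: FP32 and INT32 work queue up behind one shared pipe.
    return (n_fp + n_int) * issue_clocks

def cycles_split_pipes(n_fp, n_int, issue_clocks=2):
    # Toy model: one issue per clock, but FP32 and INT32 pipes are separate,
    # so the scheduler can feed one pipe while the other is still busy.
    remaining = {"fp": n_fp, "int": n_int}
    free_at = {"fp": 0, "int": 0}
    clock = 0
    finish = 0
    while remaining["fp"] or remaining["int"]:
        ready = [p for p in ("fp", "int") if remaining[p] and free_at[p] <= clock]
        if ready:
            # Prefer whichever ready pipe has more work left.
            pipe = max(ready, key=lambda p: remaining[p])
            remaining[pipe] -= 1
            free_at[pipe] = clock + issue_clocks
            finish = max(finish, free_at[pipe])
        clock += 1
    return finish

if __name__ == "__main__":
    for n_fp, n_int in [(100, 0), (100, 36), (100, 100)]:
        shared = cycles_shared_pipe(n_fp, n_int)
        split = cycles_split_pipes(n_fp, n_int)
        print(f"{n_fp} FP32 + {n_int} INT32: shared {shared} clocks, split {split} clocks")
```

In this simplified picture the integer work is effectively hidden behind the floating point work whenever there is no more of it than FP32 work, which is the intuition behind the IPC gains described above.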

Speaking of CUDA cores, like all other Turing parts, TU116 and Turing Minor get NVIDIA’s fast FP16 functionality. This means that these GPUs can process FP16 operations at twice the rate of FP32 operations – via the GPU’s dedicated FP16 cores – which for GTX 1660 Ti works out to 11 TFLOPS of performance. Using FP16 shaders in PC games is still relatively new – the baseline 8th gen consoles don’t support it and NVIDIA previously limited this feature to server parts – but it’s more widely used in mobile games where FP16 support is common. There, as it will be in the PC space, FP16 shaders allow for developers to trade off between performance and shader precision by using a lower precision format; not all shader programs require a full FP32’s worth of precision, and when done right it can improve performance and reduce memory bandwidth needs without any real image quality impact.
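
The precision side of that tradeoff can be seen without a GPU at all. The Python sketch below round-trips values through the IEEE 754 half-precision format using the standard library's struct module; it only illustrates the number format, not how any particular shader compiler or GPU handles FP16. The helper names are my own.

```python
import struct

def to_half(x):
    # Round-trip a Python float through IEEE 754 binary16 (struct format 'e').
    return struct.unpack("<e", struct.pack("<e", x))[0]

def relative_error(x):
    return abs(to_half(x) - x) / abs(x)

FP32_TFLOPS = 5.5  # GTX 1660 Ti, from the specification table
FP16_RATE = 2      # FP16 executes at twice the FP32 rate
FP16_TFLOPS = FP32_TFLOPS * FP16_RATE

if __name__ == "__main__":
    print(f"FP16 peak: {FP16_TFLOPS} TFLOPS")
    for value in (0.1, 1 / 3, 2049.0):
        print(f"{value!r} -> {to_half(value)!r} (relative error {relative_error(value):.2e})")
```

With only 11 bits of significand, half precision is plenty for many color and lighting values but runs out of resolution quickly for large coordinates, which is why FP16 has to be applied selectively.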

Meanwhile, looking at the rest of the GPU, the memory and cache system is a bit of a grab bag. On the one hand, Turing implements NVIDIA’s latest lossless memory compression technology. This has proven to be one of NVIDIA’s bigger advantages over AMD, and continues to allow them to get away with less memory bandwidth than we’d otherwise expect some of their GPUs to need. The actual savings vary from game to game, but for the RTX 2080 Ti launch, NVIDIA reported that they were seeing reductions in traffic between 18% and 33%.

From the RTX 2080 Ti Launch

However, distinct to TU116 versus its Turing Major siblings, the latest GPU has less L2 cache per ROP partition. Turing Major GPUs all have 512KB of L2 cache per partition, giving TU106 a total of 4MB of L2, for example. TU116 on the other hand has just 256KB of L2 per partition for a total of 1.5MB of L2, which happens to be the same amount of cache and cache ratios as on GP106. The performance impact of this is hard to measure given all of the other changes in the GPU, but clearly NVIDIA has saved some die size at the cost of some increase in cache misses. The wildcard in all of this being how much the additional bandwidth of GDDR6 helps to offset those misses.
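
The cache totals fall out of simple arithmetic once you assume one ROP/L2 partition per 32 bits of memory bus with 8 ROPs each. That mapping is my assumption, chosen because it reproduces the bus widths, ROP counts, and L2 sizes listed in the Turing comparison table; here is a short Python sketch of it.

```python
def memory_partitions(bus_bits, bits_per_partition=32):
    # Assumption: one ROP/L2 partition per 32-bit memory controller.
    return bus_bits // bits_per_partition

def l2_total_kb(bus_bits, l2_kb_per_partition):
    return memory_partitions(bus_bits) * l2_kb_per_partition

def rop_count(bus_bits, rops_per_partition=8):
    # Assumption: 8 ROPs per partition.
    return memory_partitions(bus_bits) * rops_per_partition

# (bus width in bits, L2 KB per partition) per GPU, from the article.
GPUS = {"TU102": (384, 512), "TU104": (256, 512), "TU106": (256, 512), "TU116": (192, 256)}

if __name__ == "__main__":
    for name, (bus, l2_kb) in GPUS.items():
        print(f"{name}: {l2_total_kb(bus, l2_kb) / 1024} MB L2, {rop_count(bus)} ROPs")
```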

Finally on the graphics front, Turing Minor also retains Turing’s adaptive shading capabilities. Not unlike RTX, this is a new feature that is going to take some time to get adopted, so we’ve only seen a handful of games (such as Wolfenstein II) implement it thus far. But by reducing the pixel shader granularity/rate used at various points in a scene, the technology makes it possible to improve performance by reducing the overall shading workload.

The trick with adaptive shading – and why it’s a feature rather than an immediate and transparent means of improving performance – is that it’s making a very direct quality/speed tradeoff with pixel shaders; the reduced shading rate can reduce the overall image quality by reducing clarity and creating aliasing artifacts. So developers are still in their infancy playing with the technology to figure out where they can use it without noticeably hurting image quality. In practice I expect we’re going to see it more widely deployed in VR games at first, as the tech is much easier to use there (reduce the rate anywhere the user isn’t looking), as opposed to traditional games.
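
As a rough illustration of the potential savings, the Python sketch below estimates the relative pixel shader workload when different parts of the screen are shaded at coarser rates, with one shader invocation covering each w×h block of pixels. The screen coverage numbers are entirely hypothetical, and the model ignores every cost other than shader invocations.

```python
def relative_shading_cost(coverage):
    """coverage maps a (w, h) shading rate to the fraction of screen pixels using it."""
    total = sum(coverage.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("coverage fractions must sum to 1")
    # One pixel shader invocation per w x h block of pixels.
    return sum(frac / (w * h) for (w, h), frac in coverage.items())

if __name__ == "__main__":
    # Hypothetical scene: half the screen at full rate, the rest coarsened.
    scene = {(1, 1): 0.5, (2, 2): 0.3, (4, 4): 0.2}
    print(f"Relative shading workload: {relative_shading_cost(scene):.4f}")
```

The catch, of course, is that every pixel moved to a coarser rate is a pixel where clarity may suffer, so the coverage map is where all of the developer effort goes.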

The end result of all of this is that while Turing Minor has some very important feature differences from Turing Major, at the end of the day it’s still Turing. NVIDIA for their part is going to have to grapple with the fact that not all of their current-generation cards feature RTX functionality, but that’s going to be marketing’s problem. As for consumers, unless you’re specifically seeking out NVIDIA’s ray tracing and tensor core functionality, GTX 1660 Ti is just another Turing.

Meet The EVGA GeForce GTX 1660 Ti XC Black GAMING

As a pure virtual launch, the release of the GeForce GTX 1660 Ti does not bring any Founders Edition model, and so everything is in the hands of NVIDIA’s add-in board partners. For today, we look at EVGA’s GeForce GTX 1660 Ti XC Black, a 2.75-slot single-fan card with reference clocks and a slightly increased TDP of 130W.

GeForce GTX 1660 Ti Card Comparison
	GTX 1660 Ti Ref Spec	EVGA GTX 1660 Ti XC Black GAMING
Base Clock	1500MHz	1500MHz
Boost Clock	1770MHz	1770MHz
Memory Clock	12Gbps GDDR6	12Gbps GDDR6
VRAM	6GB	6GB
TDP	120W	130W
Length	N/A	7.48"
Width	N/A	2.75-Slot
Cooler Type	N/A	Open Air
Price	$279	$279

Seeing as the GTX 1660 Ti is intended to replace the GTX 1060 6GB, EVGA’s cooler and card design is new and improved compared to their Pascal cards, and was first introduced with the RTX 20-series as they rolled out the iCX2 cooling design and new “XC” card branding, complementing their existing SC and Gaming series. As we’ve seen before, the iCX platform is comprised of a medley of features, and some of the core technology is utilized even when the full iCX suite isn’t. For one, EVGA reworked their cooler design with hydraulic dynamic bearing (HDB) fans, offering lower noise and higher lifespan than sleeve and ball bearing types, and this is present in the EVGA GTX 1660 Ti XC Black.

In general, the card essentially shares the design of the RTX 2060 XC, complete with those new raised EVGA ‘E’s on the fans, intended to improve slipstream. The single-fan RTX 2060 XC was paired with a thinner dual-fan XC Ultra variant, and in the same vein the GTX 1660 Ti XC Black is a one-fan design that essentially occupies three slots due to the thick heatsink and correspondingly taller fan hub. Being so short, though, makes the size a natural fit for mini-ITX form factors.

As one of the cards lower down the RTX 20 and now GTX 16 series stack, the GTX 1660 Ti XC Black also lacks LEDs and zero-dB fan capability, where fans turn off completely at low idle temperatures. The former is an eternal matter of taste, as opposed to the practicality of the latter, but both tend to be perks of premium models and/or higher-end GPUs. Putting price aside for the moment, the reference-clocked GTX 1660 Ti and RTX 2060 XC Black editions are the more mainstream variants anyhow.

Otherwise, the GTX 1660 Ti XC Black unsurprisingly lacks a USB-C/VirtualLink output, offering up the mainstream-friendly 1x DisplayPort/1x HDMI/1x DVI setup. Although the TU116 GPU still supports VirtualLink, the decision to implement it is up to partners; the feature is less applicable for cards further down the stack, where cards are more sensitive to cost and are less likely to be used for VR. Additionally, the 30W USB-C controller power budget could be a significant amount relative to the overall TDP.

And on the topic of power, the GTX 1660 Ti XC Black’s power limit is actually capped at the default 130W, though theoretically the card’s single 8-pin PCIe power connector could supply 150W on its own.
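
To put the power numbers in perspective, here is a quick Python sketch of the card's delivery headroom and of what a VirtualLink port would cost relative to the reference TDP. The 75W PCIe x16 slot allowance is the standard specification figure rather than something stated in this article, and the function names are illustrative.

```python
PCIE_SLOT_W = 75   # x16 slot allowance per the PCIe spec (not from this article)
EIGHT_PIN_W = 150  # 8-pin PCIe power connector

def delivery_headroom_w(power_limit_w, aux_connectors_w=(EIGHT_PIN_W,)):
    # Watts available from the slot plus auxiliary connectors beyond the card's limit.
    return PCIE_SLOT_W + sum(aux_connectors_w) - power_limit_w

def share_of_tdp(component_w, tdp_w):
    return component_w / tdp_w

if __name__ == "__main__":
    print(f"Headroom at the 130W limit: {delivery_headroom_w(130)}W")
    print(f"30W USB-C budget vs. 120W reference TDP: {share_of_tdp(30, 120):.0%}")
```

So the 130W cap is a firmware decision rather than an electrical one, and a 30W USB-C allowance would amount to a quarter of the reference TDP.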

The rest of the GPU-tweaking knobs are there for your overclocking needs, and for EVGA this goes hand-in-hand with Precision, their overclocking utility. For NVIDIA’s Turing cards, EVGA released Precision X1, which allows modifying the voltage-frequency curve and scanning for auto-overclocking as part of Turing’s GPU Boost 4. Of course, NVIDIA’s restriction of actual overvolting is still in place, and for Turing there is a cap at 1.068V.

The Test

Without any Founders Edition, NVIDIA is pushing out the GTX 1660 Ti as a fully custom launch, and while the EVGA GeForce GTX 1660 Ti XC Black has reference clocks, the TDP is set at 130W rather than the reference 120W. To keep testing and analysis as apples-to-apples as possible, as usual we've emulated reference GTX 1660 Ti specifications. While not perfect, this should be reasonably accurate for a virtual reference card as we look at reference-to-reference comparisons.

Test Setup
CPU	Intel Core i7-7820X @ 4.3GHz
Motherboard	Gigabyte X299 AORUS Gaming 7 (F9g)
PSU	Corsair AX860i
Storage	OCZ Toshiba RD400 (1TB)
Memory	G.Skill TridentZ DDR4-3200 4 x 8GB (16-18-18-38)
Case	NZXT Phantom 630 Windowed Edition
Monitor	LG 27UD68P-B
Video Cards	EVGA GeForce GTX 1660 Ti XC Black; NVIDIA GeForce GTX 1660 Ti; AMD Radeon RX Vega 64 (Air); AMD Radeon RX 590; AMD Radeon RX 580; NVIDIA GeForce RTX 2060 Founders Edition; NVIDIA GeForce GTX 1070 Founders Edition; NVIDIA GeForce GTX 1060 6GB Founders Edition; NVIDIA GeForce GTX 960 (2GB)
Video Drivers	NVIDIA Release 418.91; AMD Radeon Software Adrenalin 2019 Edition 19.2.2
OS	Windows 10 x64 Pro (1803), Spectre and Meltdown Patched

Thanks to Corsair, we were able to get a replacement for our AX860i, and so power consumption figures will differ from earlier GPU 2018 Bench data.

In the same vein, for Ashes, GTA V, F1 2018, and Shadow of War, we've updated some of the benchmark automation and data processing steps, so results may vary at the 1080p mark compared to previous GPU 2018 data.

Battlefield 1 (DX11)

Battlefield 1 returns from the 2017 benchmark suite with a bang, as DICE brought gamers the long-awaited AAA World War 1 shooter a little over a year ago. With detailed maps, environmental effects, and pacy combat, Battlefield 1 provides a generally well-optimized yet demanding graphics workload. The next Battlefield game from DICE, Battlefield V, completes the nostalgia circuit with a return to World War 2, but more importantly for us, is one of the flagship titles for GeForce RTX real time ray tracing.

The Ultra preset is used with no alterations. As these benchmarks are from single player mode, our rule of thumb with multiplayer performance still applies: multiplayer framerates generally dip to half our single player framerates. Battlefield 1 also supports HDR (HDR10, Dolby Vision).

Battlefield 1 - 2560x1440 - Ultra Quality

Battlefield 1 - 1920x1080 - Ultra Quality

Battlefield 1 - 99th Percentile - 2560x1440 - Ultra Quality

Battlefield 1 - 99th Percentile - 1920x1080 - Ultra Quality

Right from the get-go, the GTX 1660 Ti stakes out its territory in between the RTX 2060 FE and RX 590, leaving the latter by the wayside. And as a result, it technically edges out the GTX 1070 FE, though for all intents and purposes it is a dead heat. The RX Vega 56, however, keeps ahead by a decent amount; Battlefield 1 runs well on many GPUs, but Vega cards have always had a strong showing in this title.

The mild +10W TDP of the EVGA XC Black makes an equally mild difference, more so with the 99th percentiles.

Far Cry 5 (DX11)

The latest title in Ubisoft's Far Cry series lands us right into the unwelcoming arms of an armed militant cult in Montana, one of the many middles-of-nowhere in the United States. With a charismatic and enigmatic adversary, gorgeous landscapes of the northwestern American flavor, and lots of violence, it is classic Far Cry fare. Graphically intensive in an open-world environment, the game mixes in action and exploration.

Far Cry 5 does support Vega-centric features with Rapid Packed Math and Shader Intrinsics. Far Cry 5 also supports HDR (HDR10, scRGB, and FreeSync 2). This testing was done without HD Textures enabled, an option that was recently patched in.

Far Cry 5 - 2560x1440 - Ultra Quality

Far Cry 5 - 1920x1080 - Ultra Quality

For Far Cry 5, the lineup is straightforward: the GTX 1660 Ti is but the slightest shade faster than the GTX 1070 FE. The GTX 1660 Ti isn't fast enough to overtake the RX Vega 56 here, but it's well ahead of the RX 590.

As for how the EVGA and reference GTX 1660 Ti cards compare, Far Cry 5 is not one to show granular differences, due to how the developers implemented the built-in benchmark reporting. But this does reiterate how slight a difference the additional 10W of power limit makes.

Ashes of the Singularity: Escalation (DX12)

A veteran from both our 2016 and 2017 game lists, Ashes of the Singularity: Escalation remains the DirectX 12 trailblazer, with developer Oxide Games tailoring and designing the Nitrous Engine around such low-level APIs. The game makes the most of DX12's key features, from asynchronous compute to multi-threaded work submission and high batch counts. And with full Vulkan support, Ashes provides a good common ground between the forward-looking APIs of today. Its built-in benchmark tool is still one of the most versatile ways of measuring in-game workloads in terms of output data, automation, and analysis; by offering such a tool publicly and as part-and-parcel of the game, it's an example that other developers should take note of.

Settings and methodology remain identical to their usage in the 2016 GPU suite. To note, we are utilizing the original Ashes Extreme graphical preset, which compares to the current one with MSAA dialed down from x4 to x2, as well as adjusting Texture Rank (MipsToRemove in settings.ini).

We've updated some of the benchmark automation and data processing steps, so results may vary at the 1080p mark compared to previous data.

Ashes of the Singularity: Escalation - 2560x1440 - Extreme Quality

Ashes of the Singularity: Escalation - 1920x1080 - Extreme Quality

Ashes: Escalation - 99th Percentile - 2560x1440 - Extreme Quality

Ashes: Escalation - 99th Percentile - 1920x1080 - Extreme Quality

Interestingly, Ashes offers the least amount of improvement in the suite for the GTX 1660 Ti over the GTX 1060 6GB. Similarly, the GTX 1660 Ti lags behind the GTX 1070, which is already close to the older Turing sibling. With the GTX 1070 FE and RX Vega 56 neck-and-neck, the GTX 1660 Ti splits the RX 590/RX Vega 56 gap.

Wolfenstein II: The New Colossus (Vulkan)

id Software is popularly known for a few games involving shooting stuff until it dies, just with different 'stuff' for each one: Nazis, demons, or other players while scorning the laws of physics. Wolfenstein II is the latest of the first, the sequel of a modern reboot series developed by MachineGames and built on id Tech 6. While the tone is significantly less pulpy nowadays, the game is still a frenetic FPS at heart, succeeding DOOM as a modern Vulkan flagship title and arriving as a pure Vulkan implementation rather than the originally OpenGL DOOM.

Featuring a Nazi-occupied America of 1961, Wolfenstein II is lushly designed yet not oppressively intensive on the hardware, something that goes well with its pace of action that emerges suddenly from a level design flush with alternate historical details.

The highest quality preset, "Mein leben!", was used. Wolfenstein II also features Vega-centric GPU Culling and Rapid Packed Math, as well as Radeon-centric Deferred Rendering; in accordance with the preset, neither GPU Culling nor Deferred Rendering was enabled.

Wolfenstein II - 2560x1440 -

Wolfenstein II - 1920x1080 -

Wolfenstein II - 99th Percentile - 2560x1440 -

Wolfenstein II - 99th Percentile - 1920x1080 -

As we've seen before, Turing and Vega tend to run well on Wolfenstein II. For our games, these results are actually the closest the RX 590 can get to the GTX 1660 Ti, and even here the GTX 1660 Ti is a solid 13-14% ahead. Here, the GTX 1660 Ti also pulls the biggest lead over the GTX 1060 6GB, coming in at more than 1.5X faster, but also loses to the RX Vega 56 by more than in other games.

The 6GB of framebuffer doesn't seem to be holding the GTX 1660 Ti back. The GTX 960's 2GB framebuffer, on the other hand, is asphyxiating.

Final Fantasy XV (DX11)

Upon arriving on PC in early 2018, Final Fantasy XV: Windows Edition was given a graphical overhaul as it was ported over from console, the fruits of Square Enix's successful partnership with NVIDIA, with hardly any hint of the troubles during Final Fantasy XV's original production and development.

In preparation for the launch, Square Enix opted to release a standalone benchmark that they have since updated. Using the Final Fantasy XV standalone benchmark gives us a lengthy standardized sequence to utilize OCAT. Upon release, the standalone benchmark received criticism for performance issues and general bugginess, as well as confusing graphical presets and performance measurement by 'score'. In its original iteration, the graphical settings could not be adjusted, leaving the user to the presets that were tied to resolution and hidden settings such as GameWorks features.

Since then, Square Enix has patched the benchmark with custom graphics settings and bugfixes to be more accurate in profiling in-game performance and graphical options, though leaving the 'score' measurement. For our testing, we enable or adjust settings to the highest except for NVIDIA-specific features and 'Model LOD', the latter of which is left at standard. Final Fantasy XV also supports HDR, and it will support DLSS at some later date.

Final Fantasy XV - 2560x1440 - Ultra Quality

Final Fantasy XV - 1920x1080 - Ultra Quality

Final Fantasy XV - 99th Percentile - 2560x1440 - Ultra Quality

Final Fantasy XV - 99th Percentile - 1920x1080 - Ultra Quality

Final Fantasy XV is another strong title for NVIDIA across the board, and the GTX 1660 Ti comes very close to the RX Vega 64, to say nothing of surpassing the RX 590 and RX Vega 56.

The GTX 960 is clearly out of its element, and given the 99th percentiles it's fair to say that the 2GB framebuffer shoulders a good amount of the blame. By comparison, this makes the GTX 1660 Ti look exceedingly good at offering basically triple the performance (and amusingly, triple the VRAM).

Grand Theft Auto V (DX11)

Now a truly venerable title, GTA V is a veteran of past game suites that is still as graphically demanding as they come. As an older DX11 title, it provides a glimpse into the graphically intensive games of yesteryear that don't incorporate the latest features. Originally released for consoles in 2013, the PC port came with a slew of graphical enhancements and options. Just as importantly, GTA V includes a rather intensive and informative built-in benchmark, somewhat uncommon in open-world games.

The settings are identical to its previous appearances, which are custom as GTA V does not have presets. To recap, a "Very High" quality is used, where all primary graphics settings are turned up to their highest setting, except grass, which is at its own very high setting. Meanwhile 4x MSAA is enabled for direct views and reflections. This setting also involves turning on some of the advanced rendering features - the game's long shadows, high resolution shadows, and high definition flight streaming - but not increasing the view distance any further.

We've updated some of the benchmark automation and data processing steps, so results may vary at the 1080p mark compared to previous data.

Grand Theft Auto V - 2560x1440 - Very High Quality

Grand Theft Auto V - 1920x1080 - Very High Quality

Grand Theft Auto V - 99th Percentile - 2560x1440 - Very High Quality

Grand Theft Auto V - 99th Percentile - 1920x1080 - Very High Quality

For the GTX 1660 Ti, it's becoming clear that it is decisively faster than the RX 590, its nominal competition at the $279 price point. The card pips the RX Vega 64, putting it in the realm of 1.4X to 1.5X faster than the RX 590, and around 10% faster than the RX Vega 56.

There's no mincing words here; while NVIDIA hardware may run better on GTA V in general, the size of GTX 1660 Ti's lead over the RX 590 is just crushing for the same MSRP, and equally so against the generally pricier RX Vega 56.

Middle-earth: Shadow of War (DX11)

Next up is Middle-earth: Shadow of War, the sequel to Shadow of Mordor. Developed by Monolith, whose last hit was arguably F.E.A.R., Shadow of Mordor returned the studio to the spotlight with an innovative NPC rival generation and interaction system called the Nemesis System, along with a storyline based on J.R.R. Tolkien's legendarium, all while making it work on a highly modified engine that originally powered F.E.A.R. in 2005.

Using the new LithTech Firebird engine, Shadow of War improves on the detail and complexity, and with free add-on high resolution texture packs, offers itself as a good example of getting the most graphics out of an engine that may not be bleeding edge. Shadow of War also supports HDR (HDR10).

We've updated some of the benchmark automation and data processing steps, so results may vary at the 1080p mark compared to previous data.

Shadow of War - 2560x1440 - Ultra Quality

Shadow of War - 1920x1080 - Ultra Quality

Shadow of War is known to be a bit of a video memory hog, to which the GTX 960 acquiesces. Like Final Fantasy XV, the GTX 1660 Ti again finds itself bringing triple the performance. The GTX 1660 Ti opens a healthy lead here over the GTX 1060 6GB; if framebuffer is indeed a significant factor, it's important to note that the GTX 1660 Ti brings substantially more memory bandwidth to the table.

F1 2018 (DX11)

Succeeding F1 2016 is F1 2018, Codemasters' latest iteration in their official Formula One racing games. It features a slimmed down version of Codemasters' traditional built-in benchmarking tools and scripts, something that is surprisingly absent in DiRT 4.

Aside from keeping up-to-date on the Formula One world, F1 2017 added HDR support, which F1 2018 has maintained; otherwise, we should see any newer versions of Codemasters' EGO engine find their way into F1. Graphically demanding in its own right, F1 2018 keeps a useful racing-type graphics workload in our benchmarks.

We've updated some of the benchmark automation and data processing steps, so results may vary at the 1080p mark compared to previous data. Notably, for F1 2018 this includes calculating 99th percentiles from raw frame time output.
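As a rough illustration of what calculating a 99th percentile from raw frame times involves, here is a minimal Python sketch. It assumes a nearest-rank percentile over a list of per-frame times in milliseconds; the exact method and capture format used for this review are not stated, and the sample data below is hypothetical.

```python
import math


def percentile_frame_time(frame_times_ms, pct=99.0):
    """Nearest-rank percentile of a list of frame times (milliseconds)."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    ordered = sorted(frame_times_ms)
    # Nearest-rank: the smallest sample with at least pct% of frames at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def to_fps(frame_time_ms):
    """Convert a frame time in milliseconds to frames per second."""
    return 1000.0 / frame_time_ms


# Hypothetical capture: 196 smooth frames at 10 ms plus four 25 ms stutters.
sample = [10.0] * 196 + [25.0] * 4

p99_ms = percentile_frame_time(sample, 99.0)
p99_fps = to_fps(p99_ms)
avg_fps = len(sample) / (sum(sample) / 1000.0)

print(f"average: {avg_fps:.1f} fps, 99th percentile: {p99_fps:.1f} fps")
```

The point of the metric is visible in the toy data: a handful of slow frames barely moves the average, but it drags the 99th percentile figure down sharply.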

F1 2018 - 2560x1440 - Ultra Quality

F1 2018 - 1920x1080 - Ultra Quality

F1 2018 - 99th Percentile - 2560x1440 - Ultra Quality

F1 2018 - 99th Percentile - 1920x1080 - Ultra Quality

F1 is another solid showing for the GTX 1660 Ti, and while the lead over the RX 590 is less than in most other games in the suite, it performs past the GTX 1070 FE mark to come closer to the RX Vega 56.

Total War: Warhammer II (DX11)

Last in our 2018 game suite is Total War: Warhammer II, built on the same engine as Total War: Warhammer. While there is a more recent Total War title, Total War Saga: Thrones of Britannia, that game was built on the 32-bit version of the engine. The first TW: Warhammer was a DX11 game that was to some extent developed with DX12 in mind, with preview builds showcasing DX12 performance. In Warhammer II, however, the matter appears to have been dropped, with DX12 mode still marked as beta and also featuring performance regression for both vendors.

It's unfortunate because Creative Assembly themselves have acknowledged the CPU-bound nature of their games, and with re-use of game engines as spin-offs, DX12 optimization would have continued to provide benefits, especially if the future of graphics in RTS-type games will lean towards low-level APIs.

There are now three benchmarks with varying graphics and processor loads; we've opted for the Battle benchmark, which appears to be the most graphics-bound.

Total War: Warhammer II - 2560x1440 - Ultra Quality

Total War: Warhammer II - 1920x1080 - Ultra Quality

Rounding out our look at game performance is Total War: Warhammer II.

Here, the GTX 1660 Ti lags behind the RTX 2060 and GTX 1070 FE more than in the other games, offering only somewhere around 80% of the RTX 2060's speed and 90% of the GTX 1070's. In turn, it doesn't improve as much upon the GTX 1060 6GB and GTX 960, though practically speaking it has rendered its RX 590 competition as last-generation performance, given that the RX 590 is neck-and-neck with the GTX 1060 6GB FE.

Compute & Synthetics

Shifting gears, we'll look at the compute and synthetic aspects of the GTX 1660 Ti.

Beginning with CompuBench 2.0, the latest iteration of Kishonti's GPU compute benchmark suite offers a wide array of different practical compute workloads, and we’ve decided to focus on level set segmentation, optical flow modeling, and N-Body physics simulations.

Compute: CompuBench 2.0 - Level Set Segmentation 256

Compute: CompuBench 2.0 - N-Body Simulation 1024K

Compute: CompuBench 2.0 - Optical Flow

On paper, the GTX 1660 Ti looks to provide around 85% of the RTX 2060's compute and shading throughput; for CompuBench, we see it achieving around 82% of the latter's performance.
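For reference, the on-paper figure can be approximated from core counts and boost clocks. The sketch below is an assumption-laden back-of-the-envelope calculation, not NVIDIA's own math: the CUDA core counts (1536 and 1920) come from NVIDIA's published specifications rather than this page, the boost clocks are the official values listed in the clockspeed table, and one fused multiply-add (two FLOPs) per core per clock is assumed.

```python
def fp32_tflops(cuda_cores, boost_mhz):
    """Theoretical peak FP32 throughput in TFLOPS.

    Assumes each CUDA core retires one fused multiply-add
    (counted as 2 floating point operations) per clock.
    """
    return cuda_cores * 2 * boost_mhz * 1e6 / 1e12


# Core counts are taken from NVIDIA's published specs (not stated in this section).
gtx_1660_ti = fp32_tflops(cuda_cores=1536, boost_mhz=1770)
rtx_2060 = fp32_tflops(cuda_cores=1920, boost_mhz=1680)

ratio = gtx_1660_ti / rtx_2060

print(f"GTX 1660 Ti: {gtx_1660_ti:.2f} TFLOPS")
print(f"RTX 2060:    {rtx_2060:.2f} TFLOPS")
print(f"ratio:       {ratio:.1%}")
```

At official boost clocks this works out to roughly 84%, which is in line with the "around 85%" paper figure and a little above the roughly 82% seen in CompuBench.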

Moving on, we'll also look at single precision floating point performance with FAHBench, the official Folding @ Home benchmark. Folding @ Home is the popular Stanford-backed research and distributed computing initiative that has work distributed to millions of volunteer computers over the internet, each of which is responsible for a tiny slice of a protein folding simulation. FAHBench can test both single precision and double precision floating point performance, with single precision being the most useful metric for most consumer cards due to their low double precision performance.

Compute: Folding @ Home Single Precision

Next is Geekbench 4's GPU compute suite. A multi-faceted test suite, Geekbench 4 runs seven different GPU sub-tests, ranging from face detection to FFTs, and then averages out their scores via their geometric mean. As a result Geekbench 4 isn't testing any one workload, but rather is an average of many different basic workloads.
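To make the geometric mean concrete, here is a small Python sketch. The sub-test scores are hypothetical, and this is only meant to show how a geometric mean behaves, not how Geekbench itself is implemented.

```python
import math


def geometric_mean(scores):
    """Geometric mean of positive scores, computed via logs for stability."""
    if not scores or any(s <= 0 for s in scores):
        raise ValueError("scores must be a non-empty list of positive numbers")
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


# Hypothetical sub-test scores on very different scales.
baseline = [100.0, 200.0, 400.0]

# Double the smallest score, then separately double the largest score.
small_doubled = [200.0, 200.0, 400.0]
large_doubled = [100.0, 200.0, 800.0]

base_score = geometric_mean(baseline)
gain_small = geometric_mean(small_doubled) / base_score
gain_large = geometric_mean(large_doubled) / base_score

print(f"baseline score: {base_score:.1f}")
print(f"gain from doubling the smallest sub-test: {gain_small:.3f}x")
print(f"gain from doubling the largest sub-test:  {gain_large:.3f}x")
```

Unlike an arithmetic mean, doubling any one sub-test moves the total by the same factor regardless of how large that sub-test's raw score is, which is why no single workload dominates the result.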

Compute: Geekbench 4 - GPU Compute - Total Score

In lieu of Blender, which has yet to officially release a stable version with CUDA 10 support, we have the LuxRender-based LuxMark (OpenCL) and V-Ray (OpenCL and CUDA).

Compute/ProViz: LuxMark 3.1 - LuxBall and Hotel

Compute/ProViz: V-Ray Benchmark 1.0.8

We'll also take a quick look at tessellation performance.

Synthetic: TessMark, Image Set 4, 64x Tessellation

Finally, for looking at texel and pixel fillrate, we have the Beyond3D Test Suite. This test offers a slew of additional tests – many of which we use behind the scenes or in our earlier architectural analysis – but for now we’ll stick to simple pixel and texel fillrates.

Synthetic: Beyond3D Suite - Pixel Fillrate

Synthetic: Beyond3D Suite - Integer Texture Fillrate (INT8)

Synthetic: Beyond3D Suite - Floating Point Texture Fillrate (FP32)

The practically identical pixel fill rates for the GTX 1660 Ti and RTX 2060 might seem odd at first blush, but it is an entirely expected result as both GPUs have the same number of ROPs, similar clockspeeds, same GPC/TPC setup, and similar memory configurations. And being the same generation/architecture, there aren't any changes or improvements to DCC. In the same vein, the RTX 2060 puts up a 25% higher texture fillrate over the GTX 1660 Ti as a consequence of having 25% more TMUs (120 vs 96).
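The reasoning above reduces to units multiplied by clockspeed. The following Python sketch shows the theoretical peaks under stated assumptions: the TMU counts come from the text, the ROP count of 48 per card comes from NVIDIA's published specifications rather than this page, and the shared 1850MHz clock is a hypothetical stand-in for the similar real-world clocks of the two cards.

```python
def peak_fillrate_gops(units, clock_mhz):
    """Theoretical peak fillrate in giga-operations per second.

    Assumes each unit (ROP or TMU) processes one pixel or texel per clock.
    """
    return units * clock_mhz / 1000.0


# ROP counts are from NVIDIA's published specs; TMU counts are from the text.
cards = {
    "GTX 1660 Ti": {"rops": 48, "tmus": 96},
    "RTX 2060": {"rops": 48, "tmus": 120},
}

# Hypothetical common clock, since both cards run at similar speeds in practice.
clock_mhz = 1850

pixel = {name: peak_fillrate_gops(c["rops"], clock_mhz) for name, c in cards.items()}
texel = {name: peak_fillrate_gops(c["tmus"], clock_mhz) for name, c in cards.items()}

pixel_ratio = pixel["RTX 2060"] / pixel["GTX 1660 Ti"]
texel_ratio = texel["RTX 2060"] / texel["GTX 1660 Ti"]

print(f"pixel fillrate ratio (RTX 2060 / GTX 1660 Ti): {pixel_ratio:.2f}")
print(f"texel fillrate ratio (RTX 2060 / GTX 1660 Ti): {texel_ratio:.2f}")
```

With the clock held equal, the pixel fillrate ratio is 1.00 and the texel fillrate ratio is 1.25, matching the pattern in the Beyond3D results.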

Power, Temperature, and Noise

As always, we'll take a look at power, temperature, and noise of the GTX 1660 Ti, though as a pure custom launch we aren't expecting anything out of the ordinary. As mentioned earlier, the XC Black board has already revealed itself in its RTX 2060 guise.

As this is a new GPU, we will quickly review the GeForce GTX 1660 Ti's stock voltages and clockspeeds as well.

NVIDIA GeForce Video Card Voltages
Model	Boost	Idle
GeForce GTX 1660 Ti	1.037V	0.656V
GeForce RTX 2060	1.025V	0.725V
GeForce GTX 1060 6GB	1.043V	0.625V

The voltages are naturally similar to the 16nm GTX 1060, and in comparison to pre-FinFET generations, these voltages are substantially lower because of the FinFET process used, something we went over in detail in our GTX 1080 and 1070 Founders Edition review. As we said then, the 16nm FinFET process requires said low voltages as opposed to previous planar nodes, so this can be limiting in scenarios where a lot of power and voltage are needed, i.e. high clockspeeds and overclocking. For Turing (along with Volta, Xavier, and NVSwitch), NVIDIA moved to 12nm "FFN" rather than 16nm, and capped the voltage at 1.063V.

GeForce Video Card Average Clockspeeds
Game	GTX 1660 Ti	EVGA GTX 1660 Ti XC	RTX 2060	GTX 1060 6GB
Max Boost Clock	2160MHz	2160MHz	2160MHz	1898MHz
Boost Clock	1770MHz	1770MHz	1680MHz	1708MHz
Battlefield 1	1888MHz	1901MHz	1877MHz	1855MHz
Far Cry 5	1903MHz	1912MHz	1878MHz	1855MHz
Ashes: Escalation	1871MHz	1880MHz	1848MHz	1837MHz
Wolfenstein II	1825MHz	1861MHz	1796MHz	1835MHz
Final Fantasy XV	1855MHz	1882MHz	1843MHz	1850MHz
GTA V	1901MHz	1903MHz	1898MHz	1872MHz
Shadow of War	1860MHz	1880MHz	1832MHz	1861MHz
F1 2018	1877MHz	1884MHz	1866MHz	1865MHz
Total War: Warhammer II	1908MHz	1911MHz	1879MHz	1875MHz
FurMark	1594MHz	1655MHz	1565MHz	1626MHz

Looking at clockspeeds, a few things are clear. The obvious point is that the very similar results of the reference-clocked GTX 1660 Ti and EVGA GTX 1660 Ti XC are reflected in the virtually identical clockspeeds. The GeForce cards boost higher than the advertised boost clock, as is typically the case in our testing. All told, NVIDIA's formal estimates still run a bit low, especially in our properly ventilated testing chassis, so we won't complain about the extra performance.

But on that note, it's interesting to see that while the GTX 1660 Ti should have a roughly 60MHz average boost advantage over the GTX 1060 6GB when going by the official specs, in practice the cards end up within half that span. Which hints that NVIDIA's official average boost clock is a little more correctly grounded here than with the GTX 1060.
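The gap described here can be checked directly against the clockspeed table. The short Python sketch below averages the nine in-game clockspeeds for each card (FurMark is excluded as a pathological load) and compares the measured gap to the gap in official boost clocks; the simple unweighted average is our own choice for illustration.

```python
# Average in-game clockspeeds from the table above, in MHz (FurMark excluded).
game_clocks_mhz = {
    "GTX 1660 Ti": [1888, 1903, 1871, 1825, 1855, 1901, 1860, 1877, 1908],
    "GTX 1060 6GB": [1855, 1855, 1837, 1835, 1850, 1872, 1861, 1865, 1875],
}

# Official boost clocks from the same table, in MHz.
official_boost_mhz = {"GTX 1660 Ti": 1770, "GTX 1060 6GB": 1708}

# Unweighted mean across the nine games.
averages = {card: sum(c) / len(c) for card, c in game_clocks_mhz.items()}

spec_gap = official_boost_mhz["GTX 1660 Ti"] - official_boost_mhz["GTX 1060 6GB"]
measured_gap = averages["GTX 1660 Ti"] - averages["GTX 1060 6GB"]

for card, avg in averages.items():
    uplift = avg - official_boost_mhz[card]
    print(f"{card}: average {avg:.0f}MHz, {uplift:.0f}MHz above official boost")

print(f"official gap: {spec_gap}MHz, measured gap: {measured_gap:.0f}MHz")
```

Across the suite the GTX 1660 Ti averages about 1876MHz against about 1856MHz for the GTX 1060 6GB, a gap of roughly 20MHz versus the 62MHz implied by the official boost clocks.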

Power Consumption

Idle Power Consumption

Load Power Consumption - Battlefield 1

Load Power Consumption - FurMark

Even though NVIDIA's video card prices for the xx60 cards have drifted up over the years, the same cannot be said for their power consumption. NVIDIA has set the reference specs for the card at 120W, and relative to their other cards this is exactly what we see. Looking at FurMark, our favorite pathological workload that's guaranteed to bring a video card to its maximum TDP, the GTX 960, GTX 1060, and GTX 1660 Ti are all within 4 Watts of each other, exactly what we'd expect to see from the trio of 120W cards. It's only in Battlefield 1 that these cards pull apart in terms of total system load, and this is due to the greater CPU workload from the higher framerates afforded by the GTX 1660 Ti, rather than a difference at the card level itself.

Meanwhile when it comes to idle power consumption, the GTX 1660 Ti falls in line with everything else at 83W. With contemporary desktop cards, idle power has reached the point where nothing short of low-level testing can expose what these cards are drawing.

As for the EVGA card in its natural state, we see it draw almost exactly 10W more. I'm actually a bit surprised to see this under Battlefield 1 as well since the framerate difference between it and the reference-clocked card is barely 1%, but as higher clockspeeds get increasingly expensive in terms of power consumption, it's not far-fetched to see a small power difference translate into an even smaller performance difference.

All told, NVIDIA has very good and very consistent power control here, and it remains one of their key advantages over AMD, and a key strength in keeping their OEM customers happy.

Temperature

Idle GPU Temperature

Load GPU Temperature - Battlefield 1

Load GPU Temperature - FurMark

Looking at temperatures, there are no big surprises here. EVGA seems to have tuned their card for high performance cooling, and as a result the large, 2.75-slot card reports some of the lowest numbers in our charts, including a 67C under FurMark when the card is capped at the reference spec GTX 1660 Ti's 120W limit.

Noise

Idle Noise Levels

Load Noise Levels - Battlefield 1

Load Noise Levels - FurMark

Turning again to EVGA's card, despite being a custom open air design, the GTX 1660 Ti XC Black doesn't come with 0dB idle capabilities and features a single smaller but higher-RPM fan. The default fan curve puts the minimum at 33%, which is indicative that EVGA has tuned the card for cooling over acoustics. That's not an unreasonable tradeoff to make, but it's something I'd consider more appropriate for a factory overclocked card. For their reference-clocked XC card, EVGA could have very well gone with a less aggressive fan curve and still have easily maintained sub-80C temperatures while reducing their noise levels as well.

Final Words

We’re now four GPUs into the NVIDIA Turing architecture product stack, and while NVIDIA’s latest processor has pitched us a bit of a curve ball in terms of feature support, by and large NVIDIA is holding to a pretty consistent pattern with regards to product performance, positioning, and pricing. Which is to say that the company has a very specific product stack in mind for this generation, and thus far they’ve been delivering on it with the kind of clockwork efficiency that NVIDIA has come to be known for.

With the launch of the GeForce GTX 1660 Ti and the TU116 GPU underpinning it, we’re finally seeing NVIDIA shift gears a bit in how they’re building their cards. Whereas the four RTX 20 series cards are all loosely collected under the umbrella of “premium features for a premium price”, the GTX 1660 Ti goes in the other direction, dropping NVIDIA’s shiny RTX suite of effects for a product that is leaner and cheaper to produce. As a result, the new card offers a bigger improvement on a price/performance basis (in current games) than any of the other Turing cards, and with a sub-$300 price tag, is likely to be more warmly received than the other cards.

Looking at the numbers, the GeForce GTX 1660 Ti delivers around 37% more performance than the GTX 1060 6GB at 1440p, and a very similar 36% gain at 1080p. So consistent with the other Turing cards, this is not quite a major generational leap in performance; and to be fair to NVIDIA they aren’t really claiming otherwise. Instead, NVIDIA is mostly looking to sell this card to current GTX 960 and R9 380 users; people who skipped the Pascal generation and are still on 28nm parts. In which case, the GTX 1660 Ti offers well over 2x the performance of these cards, with performance frequently ending up neck-and-neck with what was the GTX 1070.

Meanwhile, taking a look at power efficiency, it’s interesting to note that for the GTX 1660 Ti NVIDIA has been able to hold the line on power consumption: performance has gone up versus the GTX 1060 6GB, but card power consumption hasn’t. Thanks to this, the GTX 1660 Ti is not just 36% faster, it’s 36% more efficient as well. The other Turing cards have seen their own efficiency gains as well, but with their TDPs all drifting up, this is the largest (and purest) efficiency gain we’ve seen to date, and probably the best metric thus far for evaluating Turing’s power efficiency against Pascal’s.
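The arithmetic behind that efficiency claim is simply relative performance divided by relative power. A minimal Python sketch follows, using the 36% speedup and the shared 120W specification from the text; the 160W contrast case is purely hypothetical and does not describe any real card.

```python
def relative_efficiency(perf_ratio, new_power_w, old_power_w):
    """Performance-per-watt of the new card relative to the old one."""
    return perf_ratio / (new_power_w / old_power_w)


# From the text: about 36% faster, with both cards specified at 120W.
flat_power = relative_efficiency(1.36, new_power_w=120.0, old_power_w=120.0)

# Hypothetical contrast: the same speedup if the TDP had risen to 160W.
higher_tdp = relative_efficiency(1.36, new_power_w=160.0, old_power_w=120.0)

print(f"same TDP:   {flat_power - 1:+.0%} performance per watt")
print(f"higher TDP: {higher_tdp - 1:+.0%} performance per watt")
```

When power is held constant the efficiency gain equals the performance gain, which is why this comparison is a cleaner read on Turing versus Pascal than cards whose TDPs also went up.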

The end result of these improvements in performance and power efficiency is that NVIDIA has once again put together a very solid Turing-based video card. And while its performance gains don’t make the likes of the GTX 1060 6GB and Radeon RX 590 obsolete overnight, it’s a clear case of out with the old and in with the new for the mainstream video card market. The GTX 1060 is well on its way out, and meanwhile AMD is going to have to significantly reposition the $279 RX 590. The GTX 1660 Ti cleanly beats it in performance and power efficiency, delivering 25% better performance for a bit over half the power consumption.

If anything, having cleared its immediate competitors with superior technology, the only real challenge NVIDIA will face is convincing consumers to pay $279 for a xx60 class card, albeit one which performs like a $379 card from two years ago. In this respect the GTX 1660 Ti is a much better value proposition than the RTX 2060 above it, but it’s also more expensive than the GTX 1060 6GB it replaces, so it runs the risk of drifting out of the mainstream market entirely. Thankfully pricing here is a lot more grounded than the RTX 20 series cards, but the mainstream market is admittedly more price sensitive to begin with.

This also means that AMD remains a wildcard factor; they have the option of playing the value spoiler with cheap RX 590 cards, and I’m curious to see how serious they really are about bringing the RX Vega 56 in to compete with NVIDIA’s newest card. Our testing shows that RX Vega 56 is still around 5% faster on average, so AMD could still play a new version of the RX 590 gambit (fight on performance and price, damn the power consumption).

Perhaps the most surprising part about any of this is that despite the fact that the GTX 1660 Ti very notably omits NVIDIA’s RTX functionality, I’m not convinced RTX alone is going to sway any buyers one way or another. Since the RTX 2060 is both a faster and more expensive card, I quickly tabulated the performance and price increases for all of the Turing cards launched thus far.

GeForce: Turing versus Pascal
Comparison	List Price (Turing)	Relative Performance	Relative Price	Relative Perf-Per-Dollar
RTX 2080 Ti vs GTX 1080 Ti	$999	+32%	+42%	-7%
RTX 2080 vs GTX 1080	$699	+35%	+40%	-4%
RTX 2070 vs GTX 1070	$499	+35%	+32%	+2%
RTX 2060 vs GTX 1060 6GB	$349	+59%	+40%	+14%
GTX 1660 Ti vs GTX 1060 6GB	$279	+36%	+12%	+21%
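The last column of the table follows from the two before it. As a sketch of the arithmetic, assuming the column is simply the relative performance divided by the relative price, the following Python snippet recomputes it from the percentages listed above:

```python
# (comparison, relative performance %, relative price %) from the table above.
rows = [
    ("RTX 2080 Ti vs GTX 1080 Ti", 32, 42),
    ("RTX 2080 vs GTX 1080", 35, 40),
    ("RTX 2070 vs GTX 1070", 35, 32),
    ("RTX 2060 vs GTX 1060 6GB", 59, 40),
    ("GTX 1660 Ti vs GTX 1060 6GB", 36, 12),
]


def perf_per_dollar_change(perf_pct, price_pct):
    """Percent change in performance-per-dollar versus the predecessor."""
    return ((1 + perf_pct / 100.0) / (1 + price_pct / 100.0) - 1) * 100.0


# Round to whole percentages, as the table does.
results = {name: round(perf_per_dollar_change(perf, price)) for name, perf, price in rows}

for name, change in results.items():
    print(f"{name}: {change:+d}%")
```

Rounded to whole percentages, this reproduces the table's -7%, -4%, +2%, +14%, and +21%, and it makes plain why the GTX 1660 Ti comes out on top: it pairs a typical Turing performance gain with by far the smallest price increase.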

The long and short of matters is that with the cheapest RTX card costing an additional $80, there’s a much stronger rationale to act based on pricing than feature sets. In fact considering just how amazingly consistent the performance gains are on a generation-by-generation basis, there’s ample evidence that NVIDIA has always planned it this way. Earlier I mentioned that NVIDIA acts with clockwork efficiency, and with nearly every Turing card improving over its predecessor by roughly 35% (save the RTX 2060 with no direct predecessor), it’s amazing just how consistent NVIDIA’s product positioning is here. If the next GTX 16 series card isn’t also 35% faster than its predecessor, then I’m going to be amazed.

In any case, this makes a potentially complex situation for card buyers pretty simple: buy the card you can afford – or at least, the card with the performance you’re after – and don’t worry about whether it’s RTX or GTX. And while it’s unfortunate that NVIDIA didn’t include their RTX functionality top-to-bottom in the Turing family, there’s also a good argument to be had that the high-performance cost means that it wouldn’t make sense on a mainstream card anyhow. At least, not for this generation.

Last, but not least, we have the matter of EVGA’s GeForce GTX 1660 Ti XC Black GAMING. As this is a launch without reference cards, we’re going to see NVIDIA’s board partners hit the ground running with their custom cards. And in true EVGA tradition, their XC Black GAMING is a solid example of what to expect for a $279 baseline GTX 1660 Ti card.

Since this isn’t a factory overclocked card, I’m a bit surprised that EVGA bothered to ship it with an increased 130W TDP. But I’m also glad they did, as the fact that it only improves performance by around 1% versus the same card at 120W is a very clear indicator that the GTX 1660 Ti is not meaningfully TDP limited. Overclocking will be another matter of course, but at stock this means that NVIDIA hasn’t had to significantly clamp down on power consumption to hit their power targets.

As for EVGA’s card design, I have to admit a triple-slot cooler is an odd choice for a 130W card – a standard double-wide card would have been more than sufficient for that kind of TDP – but in a market that’s going to be full of single and dual fan cards it definitely stands out from the crowd; and quite literally so, in the case of NVIDIA’s own promotional photos. Meanwhile I’m not sure there’s much to be said about EVGA’s software that we haven’t said a dozen times before: EVGA Precision remains some of the best overclocking software on the market. And with such a beefy cooler on this card, it’s certainly begging to be overclocked.
