Agreed, this is a good review, as the video card reviews here usually are... Agreed about rushing as well. A lot of sites have less thorough stuff out in 1-2 days... I am guessing that Ryan and the others at AnandTech have regular day jobs and that these reviews and articles are done on their own time. If that is the case, 2 months seems right. If I am incorrect in that assumption and this is a full-time job, then they should be coming out with articles a lot faster.
Hi All, I was just wondering if it's worth it to get the FE 1080 or just go with the regular one. Does the stock fan setup offer better thermals than the blower setup?
It depends on whether your custom board has any actual changes; it may just be the reference board with a custom cooler, in which case it would make no difference. Of course, it would also be cheaper to boot.
"this change in the prefetch size is why the memory controller organization of GP104 is 8x32b instead of 4x64b like GM204, as each memory controller can now read and write 64B segments of data via a single memory channel.*
Shouldn't it be the opposite?
"Overall when it comes to HDR on NVIDIA’s display controller, not unlike AMD’s Pascal architecture"
I hope you'll add to this list a new revisit of the hardware decoding+encoding capabilities of these new GPUs, in comparison to Intel QuickSync and pure software solutions...
My favorite era of being a nerd!!! Poppin' Opterons into s939 and pumpin' the OC to Athlon FX levels for a fraction of the price, all while stompin' on Pentium. It was a good (although expensive) time to be a nerd... besides paying 100 dollars for 1GB of DDR500. 6800GS budget-friendly cards, and ATi X1800/1900 super beasts... how I miss the days.
AMD was instead forced to increase the unit count and chip size; the 480 is bigger than the 1060 chip and uses a larger bus. Both increase the chip cost.
AMD loses because they are selling a more expensive chip for less money. That squeezes their unit profit on both ends.
"This echoes what I have been saying about this generation. It is really all about clock speed increases. IPC is essentially the same." - This is a good thing. Stuck on 28nm for 4 years, moving to 16nm is exactly what Nvidias architecture needed.
Fury X still loses to the 980 Ti until 4K, at which point the avg for both cards is under 30fps, and the mins are both below 20fps. IE, neither is playable. Even in AMD's case here we're looking at a 7% gain (75.3 to 80.9). Looking at NV's new cards shows dx12 netting NV cards ~6% while AMD gets ~12% (Time Spy). This is pretty much a sneeze and, as noted here and elsewhere, it will depend on the game and how the gpu works. It won't be a blanket win for either side. Async won't be saving AMD, they'll have to actually make faster stuff. There is no point in even reporting victory at under 30fps...LOL.
Also note in that link, while they are saying Maxwell gained nothing, it's not exactly true. Only avg gained nothing (suggesting maybe limited by something else?), while min fps jumped pretty much exactly what AMD did. IE, the NV 980 Ti min went from 56fps to 65fps. So while avg didn't jump, the min went way up, giving a much smoother experience (AMD gained 11fps on mins, from 51 to 62). I'm more worried about mins than avgs. Tomb on AMD still loses by more than 10% so who cares? Sort of blows a hole in the theory that AMD will be faster in all dx12 stuff...LOL. Well, maybe when you force the cards into territory nobody can play at (4K in Tomb Raider's case).
It would appear NV isn't spending much time yet on dx12, and they shouldn't. Even with 10-20% on Windows 10 (I don't believe netmarketshare's numbers as they are a msft partner), most of those are NOT gamers. You can count dx12 games on ONE hand. Most of those OS's are either forced upgrades due to incorrect update settings (waking up to win10...LOL), or FREE on machines under $200 etc. Even if 1/4 of them are dx12 capable gpus, that would be NV programming for 2.5%-5% of the PC market. Unlike AMD they were not forced to move on to dx12 due to lack of funding. AMD placed a bet that we'd move on, be forced by MSFT or get console help from xbox1 (didn't work, ps4 winning 2-1) so they could ignore dx11. Nvidia will move when needed; until then they're dominating where most of us are, which is 1080p or less, and DX11. It's comic when people point to AMD winning at 4K when it is usually a case where both sides can't hit 30fps even before maxing details. AMD management keeps aiming at stuff we are either not doing at all (4K, less than 2%), or won't be doing for ages, such as dx12 games outnumbering dx11 ones while your OS and your GPU are both dx12 capable.
What is more important? Testing the use case that describes 99.9% of the current games (dx11 or below, win7/8/vista/xp/etc), or games that can be counted on ONE hand and run in an OS most of us hate. No, hate isn't a strong word here when the OS has been FREE for a freaking year and still can't hit 20% even by a microsoft partner's likely BS numbers...LOL. Testing dx12 is a waste of time. I'd rather see 3-4 more dx11 games tested for a wider variety, although I just read a dozen reviews to see 30+ games tested anyway.
@Ryan & team: What was your reasoning for not including the new Doom in your 2016 GPU Bench game list? AFAIK it's the first indication of Vulkan performance for graphics cards.
We cooked up the list and locked in the games before Doom came out. It wasn't out until May 13th. GTX 1080 came out May 14th, by which point we had already started this article (and had published the preview).
OK, thank you. Any chance of adding it to the list please?
I'm a Windows gamer, so my personal interest in the cross-platform Vulkan is pretty meh right now (only one title right now, hooray! /s), but there are probably going to be some devs who choose it over DX12 for that very reason, plus I'm sure that you have readers who are quite interested in it.
Then you're woefully behind the times since other sites can do this better. If you're not able to re-run a benchmark for a game with a pretty significant patch like Tomb Raider, or a high profile game like Doom with a significant performance patch like Vulkan that's been out for over a week, then your workflow is flawed and this site won't stand a chance against the other crop. I'm pretty sure you're seeing this already if you have any sort of metrics tracking in place.
Seems like an official addendum is necessary at some point. Doom on Vulkan is amazing. Dota 2 on Vulkan is great, too (and would be useful in reviews of low end to mainstream GPUs especially). Talos... not so much.
The table with the native FP throughput rates isn't correct on page 5. Either it's in terms of flops, then gp104 fp16 would be 1:64. Or it's in terms of hw instruction throughput - then gp100 would be 1:1. (Interestingly, the sandra numbers for half-float are indeed 1:128 - suggesting it didn't make any use of fp16 packing at all.)
Ahh, right you are. I was going for the FLOPs rate, but wrote down the wrong value. Thanks!
As for the Sandra numbers, they're not super precise. But it's an obvious indication of what's going on under the hood. When the same CUDA 7.5 code path gives you wildly different results on Pascal, then you know something has changed...
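To make the two accounting schemes above concrete, here is a minimal sketch assuming the commonly quoted GP104 SM layout of 128 FP32 cores plus a single FP16x2 unit (the layout is an assumption drawn from the review, not from this thread):

    fp32_cores = 128
    fp16x2_units = 1
    fp32_flops_per_clock = fp32_cores * 2        # one FMA per core = 2 FLOPs
    fp16_flops_per_clock = fp16x2_units * 2 * 2  # one packed FMA on 2 values = 4 FLOPs
    print(fp16_flops_per_clock / fp32_flops_per_clock)  # 0.015625  -> 1:64 in FLOPs
    print(fp16x2_units / fp32_cores)                    # 0.0078125 -> 1:128 in instruction rate

So the same hardware can legitimately be described as 1:64 (FLOPs) or 1:128 (instruction issue), which is exactly the discrepancy being discussed.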
Did nVidia somehow limit the ability to promote FP16 operations to FP32? If not, I don't see the point in creating such a slow performing FP16 mode in the first place. Why waste die space when an intelligent designer can just promote the commands to get normal speeds out of the chip anyways? Sure you miss out on speed doubling through packing, but that is still much better than the 1/128 (1/64) rate you get using the provided FP16 mode.
I think they can just do that in the shader compiler. Any FP16 operation gets replaced by an FP32 one. Only reading from buffers and writing to buffers with FP16 content should remain FP16. Then again, if their driver is smart enough, it can even promote all buffers to FP32 as well (as long as the GPU is the only one accessing the data, the actual representation doesn't matter. Only when the CPU also accesses the data, does it actually need to be FP16).
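A minimal sketch of the promotion idea described above, written as a hypothetical compiler pass over a toy IR. None of these structures correspond to NVIDIA's actual compiler or driver; it only illustrates "ALU ops run at FP32, buffer accesses stay FP16".

    # Toy pass: promote half-precision ALU ops to single precision,
    # leaving FP16 loads/stores untouched so buffer layouts are preserved.
    def promote_fp16_ops(ir):
        promoted = []
        for op in ir:
            if op["type"] == "alu" and op["precision"] == "fp16":
                promoted.append({**op, "precision": "fp32"})  # run the math at FP32
            else:
                promoted.append(op)                           # keep FP16 storage accesses
        return promoted

    shader = [{"type": "load",  "precision": "fp16"},
              {"type": "alu",   "precision": "fp16"},
              {"type": "store", "precision": "fp16"}]
    print(promote_fp16_ops(shader))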
Of course these Nvidia cards kick some major butt in games that have always favored Nvidia, but I noticed that in games not specifically coded to take advantage of Nvidia, and furthermore in games with DX12, these cards' performance advantage is minimal at best vs an old Fury X with half the video RAM.
Then when you take into account the Vulkan API and newer DX12 games (which can be found elsewhere), you see that the prices for these cards are a tad ridiculous and the performance advantage starts to melt away.
I am waiting for AMD to release their next "big gun" before I make a purchase decision. I'm rocking a 4k monitor right now and 60fps at that resolution is my target.
Great review - one of the few that highlights the fact that Pascal's async compute is only half as good as AMD's version. Async compute is a key feature for increasing performance in DX12 and Vulkan, and that's going to allow the RX 480 to perform well against the GTX 1060.
The old Maxwell was so optimized it was always full and didn't even need Async Compute. The new Pascal is so much more optimized that it even has time to create the "holes" in execution (not counting the ones in your pocket) that were "missing" in the old architecture, to be able to benefit from Async Compute. Expect Volta to create even more holes (with hardware support) for Async Compute to fill.
Did I read a different article? Because the article that I read said that the 'holes' would be pretty similar on Maxwell v2 and Pascal, given that they have very similar architectures. However, Pascal is more efficient at filling the holes with its dynamic repartitioning.
" NVIDIA tells us that it can be done in under 100us (0.1ms), or about 170,000 clock cycles."
Is my understanding right that Polaris, and I think even earlier with late GCN parts, could seamlessly interleave per-clock? So 170,000 times faster than Pascal in clock cycles (less in total time, but still above 100,000 times faster)?
That seems highly unlikely. Switching to another task is going to take some time, because you also need to switch all the registers, buffers, caches need to be re-filled etc. The only way to avoid most of that is to duplicate the whole register file, like HyperThreading does. That's doable on an x86 CPU, but a GPU has way more registers. Besides, as we can see, nVidia's approach is fast enough in practice. Why throw tons of silicon on making context switching faster than it needs to be? You want to avoid context switches as much as possible anyway.
Sadly AMD doesn't seem to go into any detail, but I'm pretty sure it's going to be in the same ballpark. My guess is that what AMD calls an 'ACE' is actually very similar to the SMs and their command queues on the Pascal side.
After re-reading AMD's asynchronous shader PDF, it seems that AMD also speaks of 'interleaving' when they switch a graphics CU to a compute task after the graphics task has completed. So 'interleaving' at task level, rather than at instruction level. Which would be pretty much the same as NVidia's Dynamic Load Balancing in Pascal.
The more I read about async computing in Polaris and Pascal, the more I realize that the implementations are not much different.
As Ryan pointed out, it seems that the reason that Polaris, and GCN as a whole, benefit more from async is the architecture of the GPU itself, being wider and having more ALUs.
Nonetheless, I'm sure we're still going to see comments like "Polaris does async in hardware. Pascal is hopeless with its software async hack".
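As a quick sanity check on the preemption figure quoted earlier in this thread (under 100us, or about 170,000 clock cycles), the implied clock works out to roughly GP104's boost clock:

    cycles = 170_000
    window_s = 100e-6
    print(cycles / window_s / 1e9)  # ~1.7 GHz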
You're joking, or a troll, or a clown. I complained about the time it took to get the full article (to Ryan's credit, for the impatient ones of us just looking for numbers, he noted a while back that GPU Bench was updated to include benchmarks for these cards), but this is exactly the kind of review that has often separated AT from numerous other sites. The description of the relatively crummy FP16 performance was solid and on point. From NV themselves teasing us with ads that half precision would rock our world, well, this review covers in great detail the rest of the story.
Yeah, I know guys, I shouldn't dignify it with a response.
Anandtech have always been nvidia shills. Sad they can't make a living without getting paid by nvidia, but they're not alone. Arstechnica is even worse and Tomshardware is way worse.
The HDR discussion of this review was super interesting, but as always, there's one key piece of information missing: WHEN are we going to see HDR monitors that take advantage of these new GPU abilities?
I myself am stuck at 1080p IPS because more resolution doesn't entice me, and there's nothing better than IPS. I'm waiting for HDR to buy my next monitor, but at 5 years old my Dell ST2220T is getting long in the tooth...
I think the results are quite interesting, and the games chosen really help show the advantages and limitations of the different architectures. When you compare the GTX 1080 to its price predecessor, the 980 Ti, you are getting an almost universal ~25%-30% increase in performance. Against rival AMD's R9 Fury X, there is more of a mixed bag. As the resolutions increase, the bandwidth provided by the HBM memory on the Fury X really narrows the gap, sometimes trimming the margin to less than 10%, specifically in games optimized more for DX12 ("Hitman", "AotS"). But in other games, specifically "Rise of the Tomb Raider" which boasts extremely high res textures, the 4GB memory size on the Fury X starts to limit its performance in a big way. On average, there is again a ~25%-30% performance increase with much higher game to game variability. This data lets a little bit of air out of the argument I hear a lot that AMD makes more "future proof" cards. While many Nvidia 900 series users may have to upgrade as more and more games switch to DX12 based programming, AMD Fury users will be in the same boat as those same games come with higher and higher res textures, due to the smaller amount of memory on board. While Pascal still doesn't show the jump in DX12 versus DX11 that AMD's GPUs enjoy, it does at least show an increase or at least remain at parity. So what you have is a card that wins in every single game tested, at every resolution, over the price predecessors from both companies, all while consuming less power. That is a win pretty much any way you slice it. But there are elements of Nvidia's strategy and the card I personally find disappointing. I understand Nvidia wants to keep features specific to the higher margin professional cards, but avoiding HBM2 altogether in the consumer space seems to be a missed opportunity. I am a huge fan of the mini ITX gaming machines. And the Fury Nano, at the $450 price point, is a great card. With an NVMe motherboard and NAS storage the need for drive bays in the case is eliminated; the Fury Nano, at only 6", leads to some great forward thinking, and tiny designs. I was hoping to see an explosion of cases that cut out the need for supporting 10-11" cards and tons of drive bays if both Nvidia and AMD put out GPUs in the Nano space, but it seems not to be. HBM2 seems destined to remain on professional cards, as Nvidia won't take the risk of adding it to a consumer Titan or GTX 1080 Ti card and potentially again cannibalize the higher margin professional card market. Now case makers don't really have the same incentive to build smaller cases if the Fury Nano will still be the only card at that size. It's just unfortunate that it had to happen because NVidia decided HBM2 was something they could slap on a pro card and sell for thousands extra. But what is also disappointing about Pascal stems from the GTX 1080 vs GTX 1070 data Ryan has shown. The GTX 1070 drops off far more than one would expect based off CUDA core numbers as the resolution increases. The GDDR5 memory versus the GDDR5X is probably at fault here, leading me to believe that Pascal can gain even further if the memory bandwidth is increased more, again with HBM2. So not only does the card limit you to the current mini-ITX monstrosities (I'm looking at you, Bulldog) by avoiding HBM2, it also very likely is costing us performance. Now for the rank speculation. The data does present some interesting scenarios for the future.
With the Fury X able to approach the GTX 1080 at high resolutions, most specifically in DX12 optimized games, it seems extremely likely that the Vega GPU will be able to surpass the GTX 1080, especially if the greatest limitation (4GB of HBM) is removed with the supposed 8GB of HBM2 and games move more and more to DX12. I imagine when it launches it will be the 4K card to get, as the Fury X already acquits itself very well there. For me personally, I will have to wait for the Vega Nano to realize my Mini-ITX dreams, unless of course AMD doesn't make another Nano edition card and the dream is dead. A possibility I dare not think about.
The gap getting narrower at higher resolutions probably has more to do with chips' designs rather than bandwidth. After all, Fury is the big GCN chip optimized for high resolutions. Even though GP104 does well, it's still the middle Pascal chip.
P.S. Please separate the paragraphs. It's a pain, reading your comment.
The GTX 1070 is really just a way for Nvidia to sell GP104's that didn't pass all of their tests. Don't expect them to put expensive memory on a card where they're only looking to make their money back. Keeping the card cost down, hoping it sells, is more important to them.
If there's a defect anywhere within one of the GPCs, the entire GPC is disabled and the chip is sold at a discount instead of being thrown out. I would not buy a 1070, which is really just a crippled 1080.
I'll be buying a 1080 for my 2560x1600 desktop, and an EVGA 1060 for my Mini-ITX build, which has a limited power supply.
Very good review. One minor comment to the article writers - do a final check on grammar - granted we are technical folks, but it was noticeable, especially on the final words page.
That puts a lid on the comments that Pascal is basically a Maxwell die-shrink. It's obviously based on Maxwell but the addition of dynamic load balancing and preemption clearly elevates it to a higher level.
Still, seeing that using async with Pascal doesn't seem to be as effective as on GCN, the question is how much of a role it will play in DX12 games in the next 2 years. Obviously async isn't the be-all and end-all when it comes to performance, but can Pascal keep up as a whole going forward or not?
I suppose we won't know until more DX12 games are out that are also optimized properly for Pascal.
Except that it really is designed as an e-sport style game, and can run very well with low-end hardware, so isn't really needed for reviewing flagship cards. In other words, if your primary desire is to find a card that will run Overwatch well, you won't be looking at spending $200-$700 for the new video cards coming out.
And this is why I really wish Overwatch was more demanding on GPUs. I'd love to use it and DOTA 2, but 100fps at 4K doesn't tell us much of use about the architecture of these high-end cards.
Thanks for the excellent write-up, Ryan! Especially the parts on asynchronous compute and pre-emption were very thorough. A lot of nonsense was being spread about nVidia's alleged inability to do async compute in DX12, especially after Time Spy was released, and actually showed gains from using multiple queues. Your article answers all the criticism, and proves the nay-sayers wrong. Some of them went so far in their claims that they said nVidia could not even do graphics and compute at the same time. Even Maxwell v2 could do that. I would say you have written the definitive article on this matter.
Sadly that won't stop the clueless AMD fanboys from continuing to harp on that NVIDIA "doesn't have async compute" or that it "doesn't work". You've gotta feel for them though, NVIDIA's poor performance in a single tech demo... written with assistance from AMD... is really all the red camp has to go on. Because they sure as hell can't compete in terms of performance, or power usage, or cooler design, or adhering to electrical specifications...
Pretty sure the critique was of Maxwell. Pascal's async was widely advertised. It's them saying "don't worry, Maxwell can do it" to questions about it not having it, and then, when Pascal is released, saying "oh yeah, performance would have tanked with it on Maxwell", that bugs people, as it should.
Nope, a lot of critique on Time Spy was specifically *because* Pascal got gains from the async render path. People said nVidia couldn't do it, so FutureMark must be cheating/bribed.
It won't matter much though because they won't read anything in this article or Futuremark's statement on Async use in Time Spy. And they will keep linking some forum posts that claim nvidia does not support Async Compute.
Nothing will change their minds that it is a rigged benchmark and the developers got bribed by nvidia.
I agree with the previous reviewers. It's fine and dandy to be a "day one" breakthrough reviewer and believe me I read and enjoyed 20 of those other day 1 reviews as well. But... IMO no one writes such an in depth, technical, and layman-enjoyable review like Anandtech. Excellent review fellas!
This is coming from a GTX 1070 FE owner, and I am also the author of the original Battleship Mtron SSD article.
I don't understand why in your benchmarks the framerates are so low. For example, I have a 1070 and am able to play GTAV at very high settings and achieve a constant 60fps at 4K. (No MSAA, obviously.)
Even other reviewers have noted much higher framerates. Listing the 1080 as a true 4K card and the 1070 as a capable 4K card too.
The way I craft these tests is settings centric. I pick the settings and see where the cards fall. Some other sites have said that they see what settings a card performs well at (e.g. where it gets 60fps) and then calibrate around that.
The end result is that the quality settings I test are most likely higher than the sites you're comparing this article to.
Ah now I see why, you have the advanced graphics settings turned up also. These are not turned on by default in GTAV since they cause great performance loss.
Extended Distance Scaling, Extended Shadows Distance, Long Shadows, High Resolution Shadows, High Detail Streaming While Flying.
They eat a lot of VRAM and perform terribly at higher resolutions.
I know the article was posted today but when was it actually written? Newegg has had the dual fan Gigabyte GTX 1070 at $399 for a couple of weeks now. Yes, it's still $20 over the MSRP and frequently sells out as quickly as it shows up, but it's still a fair deal cheaper than $429.
You have a funny typo on the fast sync page: " Fast Sync doesn’t replace either Adaptive Sync or G-Sync" I think you meant Adaptive Vsync, not that VESA standard which nvidia does not support.
Cool! Thanks Ryan! Did you reach your stretch goals on your Kickstarter campaign so that you can afford the scanning electron microscope for great photos?
Thanks Ryan. I was on about audio. Not for games, but for my music playback (SACD, DVDA) with different sampling rates that includes 88.2 and 176.4Khz. I can't find FULL specifications for the 100 series GTX cards that include this information, or any decoding capabilities. I was looking for the GTX 1060 for HTPC use with madVR.
Where can I find out about audio and sampling rates? MadVR capabilities?
That's the problem. I know nothing about the 900 series audio capabilities (which I suppose is the same as the 800 series ;) ) and no one publishes them in review. All reviews are incomplete.
Does anyone here know at least the supported audio sampling rates? If not, I think my best bet is going with AMD (which I'm sure supports 88.2 & 176.4 kHz).
I really appreciate all the work that went into this in depth review.
I especially am very glad that you included the GTX 680 in the benchmarks along with all the other cards after it. It's often really hard to get an overview of performance over a couple years.
I'm looking at upgrading 2 systems from GTX680 to either GTX 1070 or GTX 1060 and Titan (original one) to GTX 1080, so this helps see what the performance would be like. Hopefully you tested the 1060 the same way so I can just plug the numbers for it into the same graph.
2nd page 3rd paragraph: "generational increate in performance". ;increase? 2nd page 2nd section: "Pascal in an architecture that I’m not sure has any real parallel on a historical basis". ;is?
I'm an Nvidia guy all the way. For now. I am disappointed in the midrange RX 480 and its power consumption compared to the competition, especially after they had said that Polaris was going to primarily be an efficiency improvement. Outside of my bias, I truly hope AMD provides a very competitive flagship in the near future. Everyone wins. But with the 1060 now announced, it just makes AMD's GPU prospects and profitability questionable.
So basically, after all the hype about FinFET, we get a standard, if not disappointing, jump this generation, also with a price hike. I'm so relieved that I didn't wait for this generation and can just enjoy my current 970 SLI/Nano CrossFire rigs. AMD easily has the opportunity to blow these cards out of the water with big GPUs.
Agreed. They'll likely be much more power-hungry, but I believe it's definitely doable. At the very least it'll probably be similar to Fury X Vs. GTX 980
The 1070 is as fast as the 980 ti. The 1060 is as fast as a 980. The 1080 is much faster than a 980 ti. Every card jumped up two tiers in performance from the previous gen. That's "standard" to you?
This review was a L O N G time coming, but gotta admit, excellent as always. This was the ONLY Pascal review to acknowledge and significantly include Kepler cards in the benchmarks and some comments. It makes sense to bench GK104 and analyze generational improvements since Kepler debuted 28nm and Pascal has finally ushered in the first node shrink since then. I guessed Anandtech would be the only site to do so, and looks like that's exactly what happened. Looking forward to the upcoming Polaris review!
I do still wonder if Kepler's poor performance nowadays is largely due to neglected driver optimizations or just plain old/inefficient architecture. If it's the latter, it's really pretty bad with modern game workloads.
It may be a little of the latter, but Kepler was pretty amazing at launch. I suspect driver neglect though, seeing as how Kepler performance got notably WORSE soon after Maxwell. It's also interesting to see how the comparable GCN cards of that time, which were often slower than the Kepler competition, are now significantly faster.
This is the one issue that has me wavering for the next card. My AMD cards, the last one being a 5850, have always lasted longer than my NV cards; of course at the expense of slower game fixes/ready drivers.
So far so good with a 1.5-year-old 970, but I'm keeping a close eye on it. I'm looking forward to what VEGA brings.
Yeah I'd keep an eye on it. My 770 can still play new games, albeit at lowered quality settings. The one hope for the 970 and other Maxwell cards is that Pascal is so similar. The only times I see performance taking a big hit would be newer games using asynchronous workloads, since Maxwell is poorly prepared to handle that. Otherwise maybe Maxwell cards will last much longer than Kepler. That said, I'm having second thoughts on the 1070 and curious to see what AMD can offer in the $300-$400 price range.
This is exactly the same situation as me. I got a 770 sitting in my rig, and am looking hard at the 1070, maybe soon. Although my 770 is still up to the task in most games, I really only play Blizzard games these days and they are not hard on your hardware.
My biggest issue is really that it is rather noisy, so I will be looking for a solution with the lowest dB.
Great article, it was totally worth waiting for.. I only read this sort of stuff here so have been waiting till now for any 1080 review.
If you sorted the framerate from highest to lowest, this would be the framerate of the slowest 1%. It's basically a more accurate/meaningful metric for minimum frame rates.
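A minimal sketch of that metric, assuming per-frame times in milliseconds as input. Definitions vary between sites; this version averages the slowest 1% of frames and reports it as FPS.

    def one_percent_low_fps(frame_times_ms):
        worst = sorted(frame_times_ms, reverse=True)   # longest frame times first
        n = max(1, len(worst) // 100)                  # slowest 1% of frames
        avg_worst_ms = sum(worst[:n]) / n
        return 1000.0 / avg_worst_ms

    print(one_percent_low_fps([16.7] * 990 + [33.3] * 10))  # ~30 FPS 1% low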
This is why I love Anandtech. Real depth in the reviews. I'd even want to be one of your editors if you ever plan to create a Chinese-translated version of these reviews.
Great detailed review, as always. But I have to ask once again: why didn't you do some kind of VR benchmarks? That's what drives my choices now, to be honest.
After over 2 months of reading GTX1080 reviews I felt a distinct lack of excitement as I read Anandtech kicking off their review of the finfet generation. Could it prove to be anything but an anticlimax?
Sadly and unsurprisingly...NOT.
It was, however, amusing to see the faithful positively gushing praises for Anandtech now that the "greatly anticipated" review is finally out.
Yes folks, 20 or so pages of (well written) information, mostly already covered by other tech sites, finally published, it's as if a magic wand has been waved, the information has been presented with that special Anandtech sauce, new insights have been illuminated and all is well in Anandtechland again.
(AT LEAST UNTIL THE NEXT 2 MONTH DELAY.) LOL.
I do like the way Anandtech presents the FPS charts.
The info which is included within the article is indeed mostly already covered by other tech sites.
Emphasis on the "mostly" and the plural "sites".
Those of us who have jobs which keep us busy and have an interest in this sort of thing often don't have the time to trawl round many different sites to get reviews and pertinent technical data so we rely upon those sites which we trust to produce in-depth articles, even if they take a bit longer.
As an IT Manager for (most recently) a manufacturing firm and then a school, I don't care about bleeding edge, get-the-new-stuff-as-soon-as-it-comes-out; I care about getting the right stuff, and a two month delay to get a proper review is absolutely fine. If I need quick benchmarks I'll use someone like Hexus or HardOCP, but a deep dive into the architecture is essential so I can justify purchases to the Art and Media departments, or the programmers. You don't get that anywhere else.
Your unwavering support for Anandtech is impressive.
I too have a job that keeps me busy, yet oddly enough I find the time to browse (I prefer that word to "trawl") a number of sites.
I find it helps to form objective opinions.
I don't believe in early adoption, but I do believe in getting the job done on time, however if you are comfortable with a 2 month delay, so be it :)
Interesting to note that architectural deep dives concern your art and media departments so closely in their purchasing decisions. Who would have guessed?
It's true (God knows it's been stated here often enough) that Anandtech goes into detail like no other, I don't dispute that. But is it worth the wait? A significant number seem to think not.
Allow me to leave one last issue for you to ponder (assuming you have the time in your extremely busy schedule).
Impatient as I was at first for benchmarks (yes, I'm a numbers junkie), since it's evident precious few of us will have had a chance to buy one of these cards yet (or the 480), I doubt the delay has caused anyone to buy the wrong card. Can't speak for the smartphone review that folks are complaining about being absent, but as it turns out, what I'm initially looking for is usually done early on in Bench. The rest of this, yeah, it can wait.
Job, house, kids, church... more than enough to keep me sufficiently busy that I don't have the time to browse more than a few sites. I pick them quite carefully.
Given the lifespan of a typical system is >5 years I think that a 2 month delay is perfectly reasonable. It can often take that long to get purchasing signoff once I've decided what they need to purchase anyway (one of the many reasons that architectural deep dives are useful - so I can explain why the purchase is worthwhile). Do you actually spend someone else's money at any point or are you just having to justify it to yourself?
Whether or not it's worth the wait to you is one thing - but it's clearly worth the wait to both Anandtech and to Purch.
While this is a very thorough and well written review, it makes me wonder about sponsored content and product placement. The PG279Q is the only monitor mentioned, making sure the brand appears, and nothing about competing products. It felt unnecessary. I hope it's just a coincidence, but considering there has been quite a lot of coverage about Asus in the last few months, I'm starting to doubt some of the stuff I read here.
"The PG279Q is the only monitor mentioned, making sure the brand appears, and nothing about competing products."
There's no product placement or the like (and if there was, it would be disclosed). I just wanted to name a popular 1440p G-Sync monitor to give some real-world connection to the results. We've had cards for a bit that can drive 1440p monitors at around 60fps, but GTX 1080 is really the first card that is going to make good use of higher refresh rate monitors.
Great article. It is pleasant to read more about technology instead of testing results. Some questions though:
1. higher frequency: I am kind of skeptical that the overall higher frequency is mostly enabled by FinFET. Maybe it is the case, but for example when Intel moved to FinFET we did not see such improvement. RX480 is not showing that either. It seems pretty evident the situation is different from 8800GTX where we first get frequency doubling/tripling only in the shader domain though. (Wow DX10 is 10 years ago... and computation throughput is improved by 20x)
2. The fastsync comparison graph looks pretty suspicious. How can Vsync have such high latency? The most latency I can see in a double buffer scenario with vsync is when the screen refresh happens just a tiny bit earlier than the completion of a buffer. That will give a delay of two frame times, which is like 33 ms (remember, we are talking about a case where GPU fps > 60). This is unless, of course, they are testing vsync at 20Hz or something.
1) It is a big part of it. Remember how bad 20nm was? The leakage was really high, so Nvidia/AMD decided to skip it. FinFETs helped reduce the leakage for the "14/16"nm node.
That's apples to oranges. CPUs are already 3-4GHz out of the box.
The RX 480 isn't showing it because the 14nm LPP node is a lemon for GPUs. You know what's the optimal frequency for Polaris 10? 1GHz. After that the required voltage shoots up. You know, LPP, where the LP stands for Low Power. Great for SoCs, but GPUs? Not so much. "But the SoCs clock higher than 2GHz blabla". Yeah, well a) that's the CPU and b) it's freaking tiny.
How are we getting 2GHz+ frequencies with Pascal, which so closely resembles Maxwell? Because of the smaller manufacturing node. How's that possible? It's because of FinFETs, which reduced the leakage that killed the 20nm node. Why couldn't we have higher clockspeeds without FinFETs at 28nm? Because power. 28nm GPUs capped around the 1.2-1.4GHz mark. 20nm was a no go, too high leakage current. 16nm gives you FinFETs, which reduce the leakage current dramatically. What does that enable you to do? Increase the clockspeed. Here's a good article: http://www.anandtech.com/show/8223/an-introduction...
Another question is about boost 3.0: given that we see 150-200 Mhz gpu offset very common across boards, wouldn't it be beneficial to undervolt (i.e. disallow the highest voltage bins corresponding to this extra 150-200 Mhz) and offset at the same time to maintain performance at lower power consumption? Why did Nvidia not do this in the first place? (This is coming from reading Tom's saying that 1060 can be a 60w card having 80% of its performance...)
NVIDIA, get with the program and support VESA Adaptive-Sync already!!! When your $700 card can't support the VESA standard that's in my monitor, and as a result I have to live with more lag and lower framerate, something is seriously wrong. And why wouldn't you want to make your product more flexible?? I'm looking squarely at you, Tom Petersen. Don't get hung up on your G-sync patent and support VESA!
If the stock cards reach the 83C throttle point, I don't see what benefit an OC gives (won't you just reach that sooner?). It seems like raising the TDP or undervolting would boost continuous performance. Your thoughts?
Thanks for the in depth FP16 section! I've been looking forward to the full review. I have to say this is puzzling. Why put it on there at all? Emulation would be faster. But anyway, NVIDIA announced a new Titan X just now! Does this one have FP16 for $1200? Instant buy for me if so.
Emulation would be faster, but it would not be the same as running it on a real FP16x2 unit. It's the same purpose as FP64 units: for binary compatibility so that developers can write and debug Tesla applications on their GeForce GPU.
Especially the info on preemption and async/scheduling.
I expected the preemption might be expensive in some circumstances, but I didn't quite expect it to push the L2 cache, though! Still, this is a marked improvement for nVidia.
It seems like the preemption is implemented in the driver, though? Are there actual h/w instructions to, as it were, "swap stack pointer", "push LDT", "swap instruction pointer"?
There is hardware to quickly swap task contexts to/from VRAM. The driver can signal when a task needs to be pre-empted, which it can now do at any pixel/instruction. If I understand Dynamic Load Balancing correctly, you can queue up tasks from the compute partition on the graphics partition, which will start running automatically once the graphics task has completed. It sounds like this is actually done without any interference from the driver.
I swear the whole 1080 vs 480X thing reminds me of the old fight between the 8800 and the 2900XT, which somewhat improved in the 3870 and ended with a winner with the 4870. I really hope AMD stops messing with the ATI division and lets them drop a winner. AMD has been sinking ATI and making ATI carry the goddarn load of AMD's processor division failure.
Excellent article, Ryan. I have been reading for several days whenever I can catch five minutes, and it has been quite the read! I look forward to the Polaris review.
I feel like you should bench these cards day 1, so that the whingers get it out of their system. Then label these reviews the "GP104" review, etc. It really was about the chip and board more than the specific cards....
After reading the page about Simultaneous Multi Projection, I had a question of whether this feature could be used for more efficiently rendering reflections, like on a mirror or the surface of water. Does anyone know?
Great review guys, in-depth and unbiased as always.
On that note, the anger from a few AMD fanboys is hilarious, almost as funny as how pissed off the Google fanboys get whenever Anandtech dares say anything positive about an Apple product.
Love my EVGA GTX 1080 SC, blistering performance, couldn't be happier with it
Here's my Time Spy result in 3DMark for anyone interested in what an X5690 Mac Pro can do with a 1080 running in PCIe 1.1 in Windows 10. http://www.3dmark.com/3dm/13607976?
The venn diagram is wrong -- for GP104 it says 1:64 speed for FP16 -- it is actually 1:1 for FP16 (ie same speed as FP32) (NOTE: GP100 has 2:1 FP16 -- meaning FP16 is twice as fast as FP32)
Have I understood correctly that Pascal offers a 20% increase in memory bandwidth from delta color compression over Maxwell? As in a total average of 45% over Kepler just from color compression?
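Taking the figures in that question at face value (roughly 20% from Maxwell's compression and a further ~20% from Pascal's, both assumptions rather than numbers confirmed in this thread), the cumulative gain does land in that ballpark:

    maxwell_over_kepler = 1.20
    pascal_over_maxwell = 1.20
    print((maxwell_over_kepler * pascal_over_maxwell - 1) * 100)  # ~44% over Kepler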
Sorry, late comment. I just read about GPU Boost 3.0 and this is AWESOME. What they did is expose what previously was only doable with BIOS modding - e.g. assigning the CLK bins different voltages. The problem with overclocking Kepler/Maxwell was NOT so much that you got stuck with the "lowest" overclock as the article says, but that you simply added a FIXED amount of clocks across the entire range of clocks, as you would do with Afterburner etc., where you simply add, say, +120 to the core. What happened here is that you may be "stable" at the max overclock (CLK bin), but since you added more CLKs to EVERY clock bin, the assigned voltages (in the BIOS) for each bin might not be sufficient. Say you have CLK bin 63, which is set to 1304MHz in a stock BIOS. Now you use Afterburner and add 150MHz; all of a sudden this bin amounts to 1454MHz BUT STILL at the same voltage as before, which is too low for 1454MHz. You had to manually edit the table in the BIOS to shift clocks around, especially since not all Maxwell cards allowed adding voltage via software.
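A toy illustration of the difference being described, using an invented voltage/frequency table (the values are made up, not from any real BIOS):

    vf_table = [(0.80, 1304), (0.85, 1354), (0.90, 1404), (0.95, 1454)]  # (volts, MHz), invented

    # Old-style fixed offset: every point gets +150MHz, so a clock like 1454MHz
    # can end up sitting on a voltage bin that was only validated for 1304MHz.
    fixed = [(v, mhz + 150) for v, mhz in vf_table]

    # GPU Boost 3.0-style per-point offset: each bin gets its own tested offset.
    per_bin = [(v, mhz + off) for (v, mhz), off in zip(vf_table, (150, 120, 80, 40))]

    print(fixed)
    print(per_bin)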
Yeah, looking at the bottom here, the GTX 1070 is on the same level as a single 480 4GB card. So that graph is wrong. http://www.hwcompare.com/30889/geforce-gtx-1070-vs... Remember, this is from GPU-Z, based on hardware specs. No amount of configuration in the drivers changes this. Either they screwed up or I am calling shenanigans.
Nice Ryan Smith! But, my question is, is it truly possible to share the GPU with different workloads in the P100? I've read in the NVIDIA manual that "The GPU has a time sliced scheduler to schedule work from work queues belonging to different CUDA contexts. Work launched to the compute engine from work queues belonging to different CUDA contexts cannot execute concurrently."
Ryan Smith - Wednesday, July 20, 2016 - link
To follow: GTX 1060 Review (hopefully Friday), RX 480 Architecture Writeup/Review, and at some point RX 470 and RX 460 are still due.Chillin1248 - Wednesday, July 20, 2016 - link
Nice, don't worry about the rushers. There are plenty of day one reviewers, but few go into depth the way that makes it interesting.retrospooty - Wednesday, July 20, 2016 - link
Agreed, this is a good review, as the video card reviews here usually are... Agreed about rushing as well. A lot of sites have less thorough stuff out in 1-2 days... I am guessing that Ryan and the others at Anandtech have regular day jobs and doing these reviews and articles is done on their own time. If that is the case, 2 months seems right. If I am incorrect in that assumption and this is a full time job, then they should be coming out with articles alot faster.JoshHo - Wednesday, July 20, 2016 - link
Currently for mobile the only full time editor is Matt Humrick.AndrewJacksonZA - Wednesday, July 20, 2016 - link
Thank you Ryan. I look forward to more and reliable information about the 470 and especially the 460.prophet001 - Wednesday, July 20, 2016 - link
Hi All, I was just wondering if it's worth it to get the FE 1080 or just go with the regular one. Does the stock fan setup offer better thermals than the blower setup?Teknobug - Wednesday, July 20, 2016 - link
FE is a ripoffImSpartacus - Wednesday, July 20, 2016 - link
It's literally just the reference card. It's not a bad reference design, but it's generally considered a poor value for enthusiasts.HomeworldFound - Wednesday, July 20, 2016 - link
A reference design is very useful if you're watercooling though.trab - Wednesday, July 20, 2016 - link
Depends if your custom board has any actual changes, it may just be the reference board with a custom cooler, so it would make no difference. Of course it would also be cheaper to boot.Flunk - Wednesday, July 20, 2016 - link
The "stock" fan setup is the blower. The "founders edition" cards are the base reference cards.prophet001 - Wednesday, July 20, 2016 - link
Alrighty then. Thanks for the info.bill44 - Wednesday, July 20, 2016 - link
Where can I find Audio specification, sampling rates etc.? Decoding capabilities?ImSpartacus - Wednesday, July 20, 2016 - link
The timeline is appreciated.tipoo - Wednesday, July 20, 2016 - link
Sounds like a good few weeks here, well doneChaotic42 - Wednesday, July 20, 2016 - link
Thanks for the review. Some things are worth the wait. Turn your phone and computer off and go take a nap. Sounds like you've earned it.zeeBomb - Wednesday, July 20, 2016 - link
An anandtech review takes all the pain away! How am I going to read this casually though? Without all the detailed whachinlmicallitssna1970 - Wednesday, July 20, 2016 - link
Can you please add Cross Fire benchmarks in the RX480 review ?close - Wednesday, July 20, 2016 - link
But where's the GTX 1070/1080 review? Oh wait... Scratch that.blanarahul - Wednesday, July 20, 2016 - link
"this change in the prefetch size is why the memory controller organization of GP104 is 8x32b instead of 4x64b like GM204, as each memory controller can now read and write 64B segments of data via a single memory channel.*Shouldn't it be the opposite?
"Overall when it comes to HDR on NVIDIA’s display controller, not unlike AMD’s Pascal architecture"
What?!!!
TallestJon96 - Wednesday, July 20, 2016 - link
Still Waiting on that 960 review :)All kidding aside I'm I'm sure Il'll enjoy this review. Probably buy a 1080 in 2-3 weeks
Sandcat - Thursday, July 21, 2016 - link
Mea culpa.Armus19 - Thursday, July 21, 2016 - link
Could you pretty please do the usual HTPC credentials test for 1070, 1080 and 1060 and RX 480 as well pretty please? Otherwise a great review.Xajel - Thursday, July 21, 2016 - link
I hope to add to this list a new revisit for Hardware Decoding+Encoding capabilities for these new GPU's in comparison to Intel QuickSync and pure Software solutions...JBVertexx - Friday, July 22, 2016 - link
Still waiting..... I've been checking all afternoon.....JBVertexx - Friday, July 22, 2016 - link
For the 1060 review, that is.....colinisation - Wednesday, July 20, 2016 - link
Lovely stuff as always chaps, to the complaints quality takes time.There goes my evening.
i4mt3hwin - Wednesday, July 20, 2016 - link
"As of the time this paragraph was written, Newegg only has a single GTX 1080 in stock, a Founders Edition card at $499."Should be $699
osxandwindows - Wednesday, July 20, 2016 - link
Finally!jsntech - Wednesday, July 20, 2016 - link
Nice review.On page /30 (Power, Temp, etc.):
"...there is a real increate in power..."
Ryan Smith - Wednesday, July 20, 2016 - link
Thanks.Eden-K121D - Wednesday, July 20, 2016 - link
Finally the GTX 1080 reviewguidryp - Wednesday, July 20, 2016 - link
This echoes what I have been saying about this generation. It is really all about clock speed increases. IPC is essentially the same.This is where AMD lost out. Possibly in part the issue was going with GloFo instead of TSMC like NVidia.
Maybe AMD will move Vega to TSMC...
nathanddrews - Wednesday, July 20, 2016 - link
Curious... how did AMD lose out? Have you seen Vega benchmarks?TheinsanegamerN - Wednesday, July 20, 2016 - link
It's all about clock speed for Nvidia, but not for AMD. AMD focused more on IPC, according to them.
tarqsharq - Wednesday, July 20, 2016 - link
It feels a lot like the P4 vs Athlon XP days almost.stereopticon - Wednesday, July 20, 2016 - link
My favorite era of being a nerd!!! Poppin' opterons into s939 and pumpin the OC the athlon FX levels for a fraction of the price all while stompin' on pentium. It was a good (although expensive) time to a be a nerd... Besides paying 100 dollars for 1gb of DDR500. 6800gs budget friendly cards, and ATi x1800/1900 super beasts.. how i miss the dayseddman - Thursday, July 21, 2016 - link
Not really. Pascal has pretty much the same IPC as Maxwell and its performance increases accordingly with the clockspeed.Pentium 4, on the other hand, had a terrible IPC compared to Athlon and even Pentium 3 and even jacking its clockspeed to the sky didn't help it.
guidryp - Wednesday, July 20, 2016 - link
No one really improved IPC of their units.AMD was instead forced increase the unit count and chip size for 480 is bigger than the 1060 chip, and is using a larger bus. Both increase the chip cost.
AMD loses because they are selling a more expensive chip for less money. That squeezes their unit profit on both ends.
retrospooty - Wednesday, July 20, 2016 - link
"This echoes what I have been saying about this generation. It is really all about clock speed increases. IPC is essentially the same."- This is a good thing. Stuck on 28nm for 4 years, moving to 16nm is exactly what Nvidias architecture needed.
TestKing123 - Wednesday, July 20, 2016 - link
Sorry, too little too late. Waited this long, and the first review was Tomb Raider DX11?! Not 12?This review is both late AND rushed at the same time.
Mat3 - Wednesday, July 20, 2016 - link
Testing Tomb Raider in DX11 is inexcusable.http://www.extremetech.com/gaming/231481-rise-of-t...
TheJian - Friday, July 22, 2016 - link
Furyx still loses to 980ti until 4K at which point the avg for both cards is under 30fps, and the mins are both below 20fps. IE, neither is playable. Even in AMD's case here we're looking at 7% gain (75.3 to 80.9). Looking at NV's new cards shows dx12 netting NV cards ~6% while AMD gets ~12% (time spy). This is pretty much a sneeze and will as noted here and elsewhere, it will depend on the game and how the gpu works. It won't be a blanket win for either side. Async won't be saving AMD, they'll have to actually make faster stuff. There is no point in even reporting victory at under 30fps...LOL.Also note in that link, while they are saying maxwell gained nothing, it's not exactly true. Only avg gained nothing (suggesting maybe limited by something else?), while min fps jumped pretty much exactly what AMD did. IE Nv 980ti min went from 56fps to 65fps. So while avg didn't jump, the min went way up giving a much smoother experience (amd gained 11fps on mins from 51 to 62). I'm more worried about mins than avgs. Tomb on AMD still loses by more than 10% so who cares? Sort of blows a hole in the theory that AMD will be faster in all dx12 stuff...LOL. Well maybe when you force the cards into territory nobody can play at (4k in Tomb Raiders case).
It would appear NV isn't spending much time yet on dx12, and they shouldn't. Even with 10-20% on windows 10 (I don't believe netmarketshare's numbers as they are a msft partner), most of those are NOT gamers. You can count dx12 games on ONE hand. Most of those OS's are either forced upgrades due to incorrect update settings (waking up to win10...LOL), or FREE on machine's under $200 etc. Even if 1/4 of them are dx12 capable gpus, that would be NV programming for 2.5%-5% of the PC market. Unlike AMD they were not forced to move on to dx12 due to lack of funding. AMD placed a bet that we'd move on, be forced by MSFT or get console help from xbox1 (didn't work, ps4 winning 2-1) so they could ignore dx11. Nvidia will move when needed, until then they're dominating where most of us are, which is 1080p or less, and DX11. It's comic when people point to AMD winning at 4k when it is usually a case where both sides can't hit 30fps even before maxing details. AMD management keeps aiming at stuff we are either not doing at all (4k less than 2%), or won't be doing for ages such as dx12 games being more than dx11 in your OS+your GPU being dx12 capable.
What is more important? Testing the use case that describes 99.9% of the current games (dx11 or below, win7/8/vista/xp/etc), or games that can be counted on ONE hand and run in an OS most of us hate. No hate isn't a strong word here when the OS has been FREE for a freaking year and still can't hit 20% even by a microsoft partner's likely BS numbers...LOL. Testing dx12 is a waste of time. I'd rather see 3-4 more dx11 games tested for a wider variety although I just read a dozen reviews to see 30+ games tested anyway.
ajlueke - Friday, July 22, 2016 - link
That would be fine if it was only dx12. Doesn't look like Nvidia is investing much time in Vulkan either, especially not on older hardware.http://www.pcgamer.com/doom-benchmarks-return-vulk...
Cygni - Wednesday, July 20, 2016 - link
Cool attention troll. Nobody cares what free reviews you choose to read or why.AndrewJacksonZA - Wednesday, July 20, 2016 - link
Typo on page 18: "The Test""Core i7-4960X hosed in an NZXT Phantom 630 Windowed Edition" Hosed -> Housed
Michael Bay - Thursday, July 21, 2016 - link
I`d sure hose me a Core i7-4960X.AndrewJacksonZA - Wednesday, July 20, 2016 - link
@Ryan & team: What was your reasoning for not including the new Doom in your 2016 GPU Bench game list? AFAIK it's the first indication of Vulkan performance for graphics cards.Thank you! :-)
Ryan Smith - Wednesday, July 20, 2016 - link
We cooked up the list and locked in the games before Doom came out. It wasn't out until May 13th. GTX 1080 came out May 14th, by which point we had already started this article (and had published the preview).AndrewJacksonZA - Wednesday, July 20, 2016 - link
OK, thank you. Any chance of adding it to the list please?I'm a Windows gamer, so my personal interest in the cross-platform Vulkan is pretty meh right now (only one title right now, hooray! /s) but there are probably going to be some devs are going to choose it over DX12 for that very reason, plus I'm sure that you have readers who are quite interested in it.
TestKing123 - Wednesday, July 20, 2016 - link
Then you're woefully behind the times since other sites can do this better. If you're not able to re-run a benchmark for a game with a pretty significant patch like Tomb Raider, or a high profile game like Doom with a significant performance patch like Vulcan that's been out for over a week, then you're workflow is flawed and this site won't stand a chance against the other crop. I'm pretty sure you're seeing this already if you have any sort of metrics tracking in place.TheinsanegamerN - Wednesday, July 20, 2016 - link
So question, if you started this article on may 14th, was their no time in the over 2 months to add one game to that benchmark list?nathanddrews - Wednesday, July 20, 2016 - link
Seems like an official addendum is necessary at some point. Doom on Vulkan is amazing. Dota 2 on Vulkan is great, too (and would be useful in reviews of low end to mainstream GPUs especially). Talos... not so much.Eden-K121D - Thursday, July 21, 2016 - link
Talos Principle was a proof of conceptajlueke - Friday, July 22, 2016 - link
http://www.pcgamer.com/doom-benchmarks-return-vulk...Addendum complete.
mczak - Wednesday, July 20, 2016 - link
The table with the native FP throughput rates isn't correct on page 5. Either it's in terms of flops, then gp104 fp16 would be 1:64. Or it's in terms of hw instruction throughput - then gp100 would be 1:1. (Interestingly, the sandra numbers for half-float are indeed 1:128 - suggesting it didn't make any use of fp16 packing at all.)Ryan Smith - Wednesday, July 20, 2016 - link
Ahh, right you are. I was going for the FLOPs rate, but wrote down the wrong value. Thanks!As for the Sandra numbers, they're not super precise. But it's an obvious indication of what's going on under the hood. When the same CUDA 7.5 code path gives you wildly different results on Pascal, then you know something has changed...
BurntMyBacon - Thursday, July 21, 2016 - link
Did nVidia somehow limit the ability to promote FP16 operations to FP32? If not, I don't see the point in creating such a slow performing FP16 mode in the first place. Why waste die space when an intelligent designer can just promote the commands to get normal speeds out of the chip anyways? Sure you miss out on speed doubling through packing, but that is still much better than the 1/128 (1/64) rate you get using the provided FP16 mode.Scali - Thursday, July 21, 2016 - link
I think they can just do that in the shader compiler. Any FP16 operation gets replaced by an FP32 one.Only reading from buffers and writing to buffers with FP16 content should remain FP16. Then again, if their driver is smart enough, it can even promote all buffers to FP32 as well (as long as the GPU is the only one accessing the data, the actual representation doesn't matter. Only when the CPU also accesses the data, does it actually need to be FP16).
owan - Wednesday, July 20, 2016 - link
Only 2 months late and published the day after a different major GPU release. What happened to this place?Ninhalem - Wednesday, July 20, 2016 - link
There's 32 freaking pages in this review. Maybe people have other jobs instead of writing all day long. Did you ever think of that?I'll take quality and a long publishing time over crap and rushing out the door.
Stuka87 - Wednesday, July 20, 2016 - link
Thanks for the extremely in depth review Ryan!cknobman - Wednesday, July 20, 2016 - link
I cannot help but feel just a bit underwhelmed. Of course these Nvidia cards kick some major butt in games that have always favored Nvidia, but I noticed that in games not specifically coded to take advantage of Nvidia, and furthermore in games with DX12, these cards' performance advantage is minimal at best vs. an old Fury X with half the video RAM.
Then when you take into account the Vulkan API and newer DX12 games (which can be found elsewhere), you see that the prices for these cards are a tad ridiculous and the performance advantage starts to melt away.
I am waiting for AMD to release their next "big gun" before I make a purchase decision.
I'm rocking a 4k monitor right now and 60fps at that resolution is my target.
nathanddrews - Wednesday, July 20, 2016 - link
1080 is close to being that 4K60 card, but can't quite cut it. I'm waiting for "Big Vega" vs 1080Ti before dropping any money.lefty2 - Wednesday, July 20, 2016 - link
Great review - one of the few that highlights the fact that Pascal's async compute is only half as good as AMD's version. Async compute is a key feature for increasing performance in DX12 and Vulkan, and that's going to allow the RX 480 to perform well against the GTX 1060.Daniel Egger - Wednesday, July 20, 2016 - link
"... why the memory controller organization of GP104 is 8x32b instead of 4x64b like GM204"Sounds like it's the other way around.
Ryan Smith - Wednesday, July 20, 2016 - link
No, that's correct. 8 32bit wide controllers rather than 4 64bit wide controllers.http://images.anandtech.com/doci/10325/GeForce_GTX...
http://images.anandtech.com/doci/8526/GeForce_GTX_...
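A quick back-of-the-envelope check of why the narrower controllers line up with the larger prefetch, using the prefetch sizes quoted in the review - treat it as an illustration of the arithmetic rather than a statement of the exact controller design:

    // GDDR5 (GM204): 8n prefetch on a 32-bit channel -> 8 * 32 / 8 = 32 B per channel,
    // so producing a 64 B segment took a wider, 64-bit controller spanning two channels.
    // GDDR5X (GP104): 16n prefetch on a 32-bit channel -> 16 * 32 / 8 = 64 B per channel,
    // so a single 32-bit channel already delivers a full 64 B segment, hence 8x32b instead of 4x64b.
    constexpr int gddr5_bytes_per_channel_access  = 8  * 32 / 8;   // 32 B
    constexpr int gddr5x_bytes_per_channel_access = 16 * 32 / 8;   // 64 B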
DominionSeraph - Wednesday, July 20, 2016 - link
>It has taken about 2 years longer than we’d normally see... for a review of a flagship card to come out
sgeocla - Wednesday, July 20, 2016 - link
The old Maxwell was so optimized it was always full and didn't even need Async Compute. The new Pascal is so much more optimized that it even has time to create the "holes" in execution (not counting the ones in your pocket) that were "missing" in the old architecture to be able to benefit from Async Compute. Expect Volta to create even more holes (with hardware support) for Async Compute to fill.tipoo - Wednesday, July 20, 2016 - link
That's demonstrably untrue.http://www.futuremark.com/pressreleases/a-closer-l...
Plenty of holes that could have been filled in Maxwell.
patrickjp93 - Wednesday, July 20, 2016 - link
That doesn't actually support your point...Scali - Wednesday, July 20, 2016 - link
Did I read a different article?Because the article that I read said that the 'holes' would be pretty similar on Maxwell v2 and Pascal, given that they have very similar architectures. However, Pascal is more efficient at filling the holes with its dynamic repartitioning.
mr.techguru - Wednesday, July 20, 2016 - link
Just ordered the MSI GeForce GTX 1070 Gaming X, way better than the 1060/480. Nvidia nailed it :)tipoo - Wednesday, July 20, 2016 - link
" NVIDIA tells us that it can be done in under 100us (0.1ms), or about 170,000 clock cycles."Is my understanding right that Polaris, and I think even earlier with late GCN parts, could seamlessly interleave per-clock? So 170,000 times faster than Pascal in clock cycles (less in total time, but still above 100,000 times faster)?
Scali - Wednesday, July 20, 2016 - link
That seems highly unlikely. Switching to another task is going to take some time, because you also need to switch all the registers and buffers, caches need to be re-filled, etc. The only way to avoid most of that is to duplicate the whole register file, like HyperThreading does. That's doable on an x86 CPU, but a GPU has way more registers.
Besides, as we can see, nVidia's approach is fast enough in practice. Why throw tons of silicon on making context switching faster than it needs to be? You want to avoid context switches as much as possible anyway.
Sadly AMD doesn't seem to go into any detail, but I'm pretty sure it's going to be in the same ballpark.
My guess is that what AMD calls an 'ACE' is actually very similar to the SMs and their command queues on the Pascal side.
Ryan Smith - Wednesday, July 20, 2016 - link
Task switching is separate from interleaving. Interleaving takes place on all GPUs as a basic form of latency hiding (GPUs are very high latency).The big difference is that interleaving uses different threads from the same task; task switching by its very nature loads up another task entirely.
Scali - Thursday, July 21, 2016 - link
After re-reading AMD's asynchronous shader PDF, it seems that AMD also speaks of 'interleaving' when they switch a graphics CU to a compute task after the graphics task has completed. So 'interleaving' at task level, rather than at instruction level.Which would be pretty much the same as NVidia's Dynamic Load Balancing in Pascal.
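To put that ~100µs figure in some perspective, a rough calculation - the clock speed here is just the GTX 1080's rated boost clock and the 60Hz frame budget is an assumption for the example, so take it as a sketch:

    // Scale of a worst-case Pascal pixel-level preemption, per the numbers quoted above.
    constexpr double preempt_s      = 100e-6;        // ~100 us quoted by NVIDIA
    constexpr double boost_clock_hz = 1.733e9;       // GTX 1080 rated boost clock
    constexpr double frame_budget_s = 1.0 / 60.0;    // 16.7 ms at an assumed 60 Hz target

    constexpr double cycles   = preempt_s * boost_clock_hz;   // ~173,000 cycles
    constexpr double fraction = preempt_s / frame_budget_s;   // ~0.6% of one frame

So the "about 170,000 clock cycles" figure is simply 100µs at boost clock, and even a worst-case preemption costs well under 1% of a 60Hz frame, which is the sense in which it is "fast enough in practice".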
eddman - Thursday, July 21, 2016 - link
The more I read about async computing in Polaris and Pascal, the more I realize that the implementations are not much different.As Ryan pointed out, it seems that the reason that Polaris, and GCN as a whole, benefit more from async is the architecture of the GPU itself, being wider and having more ALUs.
Nonetheless, I'm sure we're still going to see comments like "Polaris does async in hardware. Pascal is hopeless with its software async hack".
Matt Doyle - Wednesday, July 20, 2016 - link
Typo in the lead sentence of HPC vs. Consumer: Divergence paragraph: "Pascal in an architecture that...""is" instead of "in"
Matt Doyle - Wednesday, July 20, 2016 - link
Feeding Pascal page, "GDDR5X uses a 16n prefetch, which is twice the size of GDDR5’s 8n prefect."Prefect = prefetch
Matt Doyle - Wednesday, July 20, 2016 - link
Same page, "The latter is also a change from GTX 980, as NVIDIA has done from a digital + analog DVI port to a pure digital DVI port.""NVIDIA has gone"?
Matt Doyle - Wednesday, July 20, 2016 - link
Rather, "Meet the GTX 1080" page, second to last paragraph.Matt Doyle - Wednesday, July 20, 2016 - link
"Meet the GTX 1080..." page, "...demand first slow down to a point where board partners can make some informed decisions about what cards to produce."I believe you're missing the word "must" (or alternatively, "needs to") between "demand" and "first" in this sentence.
Ryan Smith - Wednesday, July 20, 2016 - link
Thanks!supdawgwtfd - Wednesday, July 20, 2016 - link
Didn't even finish reading the first page. The bias is overwhelming... So much emotional language... Goodbye Anandtech. It had been a nice 14 years of reading, but it's obvious now, so I have moved on to the many other sites.
Shills who can't restrain their bias and review something without the love of a brand springing forth like a fountain.
Yes, I created an account just for this sole reason...
The fucking 2 month wait is also not on.
But what to expect from children.
BMNify - Wednesday, July 20, 2016 - link
Then just GTFO, you idiot. On second thought, crying your heart out may also help in this fanboy mental breakdown situation of yours.catavalon21 - Wednesday, July 20, 2016 - link
You're joking, or a troll, or a clown. I complained about the time it took to get the full article (to Ryan's credit, for the impatient ones of us just looking for numbers, he noted a while back that GPU Bench was updated to include benchmarks for these cards), but this is exactly the kind of review that has often separated AT from numerous other sites. The description of the relatively crummy FP16 performance was solid and on point. From NV themselves teasing us with ads that half precision would rock our world, well, this review covers in great detail the rest of the story. Yeah, I know guys, I shouldn't dignify it with a response.
atlantico - Thursday, July 21, 2016 - link
Anandtech have always been nvidia shills. Sad they can't make a living without getting paid by nvidia, but they're not alone. Arstechnica is even worse and Tomshardware is way worse.brookheather - Wednesday, July 20, 2016 - link
Typo page 12 - "not unlike AMD’s Pascal architecture" - think you mean Polaris?brookheather - Wednesday, July 20, 2016 - link
And another one on the last page: it keep the GPU industry - should be kept.grrrgrrr - Wednesday, July 20, 2016 - link
Solid review! Some nice architecture introductions.euskalzabe - Wednesday, July 20, 2016 - link
The HDR discussion of this review was super interesting, but as always, there's one key piece of information missing: WHEN are we going to see HDR monitors that take advantage of these new GPU abilities? I myself am stuck at 1080p IPS because more resolution doesn't entice me, and there's nothing better than IPS. I'm waiting for HDR to buy my next monitor, but at 5 years old my Dell ST2220T is getting long in the tooth...
ajlueke - Wednesday, July 20, 2016 - link
Thanks for the review Ryan,I think the results are quite interesting, and the games chosen really help show the advantages and limitations of the different architectures. When you compare the GTX 1080 to its price predecessor, the 980 Ti, you are getting an almost universal ~25%-30% increase in performance.
Against rival AMD's R9 Fury X, there is more of a mixed bag. As the resolutions increase, the bandwidth provided by the HBM memory on the Fury X really narrows the gap, sometimes trimming the margin to less than 10%, specifically in games optimized more for DX12 ("Hitman", "AotS"). But in other games, specifically "Rise of the Tomb Raider", which boasts extremely high-res textures, the 4GB memory size on the Fury X starts to limit its performance in a big way. On average, there is again a ~25%-30% performance increase, with much higher game-to-game variability.
This data lets a little bit of air out of the argument I hear a lot that AMD makes more "future proof" cards. While many Nvidia 900 series users may have to upgrade as more and more games switch to DX12-based programming, AMD Fury users will be in the same boat as those same games ship with higher and higher res textures, due to the smaller amount of memory on board.
While Pascal still doesn't show the jump in DX12 versus DX11 that AMD's GPUs enjoy, it does at least show an increase, or remains at parity.
So what you have is a card that wins in every single game tested, at every resolution over the price predecessors from both companies, all while consuming less power. That is a win pretty much any way you slice it. But there are elements of Nvidia’s strategy and the card I personally find disappointing.
I understand Nvidia wants to keep features specific to the higher margin professional cards, but avoiding HBM2 altogether in the consumer space seems to be a missed opportunity. I am a huge fan of the mini ITX gaming machines. And the Fury Nano, at the $450 price point is a great card. With an NVMe motherboard and NAS storage the need for drive bays in the case is eliminated, the Fury Nano at only 6” leads to some great forward thinking, and tiny designs. I was hoping to see an explosion of cases that cut out the need for supporting 10-11” cards and tons of drive bays if both Nvidia and AMD put out GPUs in the Nano space, but it seems not to be. HBM2 seems destined to remain on professional cards, as Nvidia won’t take the risk of adding it to a consumer Titan or GTX 1080 Ti card and potentially again cannibalize the higher margin professional card market. Now case makers don’t really have the same incentive to build smaller cases if the Fury Nano will still be the only card at that size. It’s just unfortunate that it had to happen because NVidia decided HBM2 was something they could slap on a pro card and sell for thousands extra.
What is also disappointing about Pascal stems from the GTX 1080 vs GTX 1070 data Ryan has shown. The GTX 1070 drops off far more than one would expect based on CUDA core counts as the resolution increases. The GDDR5 memory versus the GDDR5X is probably at fault here, leading me to believe that Pascal could gain even further if the memory bandwidth were increased more, again with HBM2. So not only does the card limit you to the current mini-ITX monstrosities (I'm looking at you, Bulldog) by avoiding HBM2, it also very likely is costing us performance.
Now for the rank speculation. The data does present some interesting scenarios for the future. With the Fury X able to approach the GTX 1080 at high resolutions, most specifically in DX12-optimized games, it seems extremely likely that the Vega GPU will be able to surpass the GTX 1080, especially if the greatest limitation (4GB of HBM) is removed with the supposed 8GB of HBM2 and games move more and more to DX12. I imagine when it launches it will be the 4K card to get, as the Fury X already acquits itself very well there. For me personally, I will have to wait for the Vega Nano to realize my mini-ITX dreams, unless of course AMD doesn't make another Nano edition card and the dream is dead. A possibility I dare not think about.
eddman - Wednesday, July 20, 2016 - link
The gap getting narrower at higher resolutions probably has more to do with the chips' designs than with bandwidth. After all, Fury is the big GCN chip, optimized for high resolutions. Even though GP104 does well, it's still the middle Pascal chip. P.S. Please separate your paragraphs. It's a pain reading your comment.
Eidigean - Wednesday, July 20, 2016 - link
The GTX 1070 is really just a way for Nvidia to sell GP104's that didn't pass all of their tests. Don't expect them to put expensive memory on a card where they're only looking to make their money back. Keeping the card cost down, hoping it sells, is more important to them.If there's a defect anywhere within one of the GPC's, the entire GPC is disabled and the chip is sold at a discount instead of being thrown out. I would not buy a 1070 which is really just a crippled 1080.
I'll be buying a 1080 for my 2560x1600 desktop, and an EVGA 1060 for my Mini-ITX build; which has a limited power supply.
mikael.skytter - Wednesday, July 20, 2016 - link
Thanks Ryan! Much appreciated.[email protected] - Wednesday, July 20, 2016 - link
Very good review. One minor comment to the article writers - do a final check on grammer - granted we are technical folks, but it was noticeable especially on the final words page.madwolfa - Wednesday, July 20, 2016 - link
It's "grammar", though. :)Eden-K121D - Thursday, July 21, 2016 - link
Oh the irony[email protected] - Thursday, July 21, 2016 - link
Oh snap, that is some funny stuff right there.eddman - Wednesday, July 20, 2016 - link
That puts a lid on the comments that Pascal is basically a Maxwell die-shrink. It's obviously based on Maxwell, but the addition of dynamic load balancing and preemption clearly elevates it to a higher level. Still, seeing that async on Pascal doesn't seem to be as effective as on GCN, the question is how much of a role it will play in DX12 games in the next 2 years. Obviously async isn't the be-all and end-all when it comes to performance, but can Pascal keep up as a whole going forward or not?
I suppose we won't know until more DX12 games are out that are also optimized properly for Pascal.
javishd - Wednesday, July 20, 2016 - link
Overwatch is extremely popular right now, it deserves to be a staple in gaming benchmarks.jardows2 - Wednesday, July 20, 2016 - link
Except that it really is designed as an e-sport style game, and can run very well with low-end hardware, so isn't really needed for reviewing flagship cards. In other words, if your primary desire is to find a card that will run Overwatch well, you won't be looking at spending $200-$700 for the new video cards coming out.Ryan Smith - Wednesday, July 20, 2016 - link
And this is why I really wish Overwatch was more demanding on GPUs. I'd love to use it and DOTA 2, but 100fps at 4K doesn't tell us much of use about the architecture of these high-end cards.Scali - Wednesday, July 20, 2016 - link
Thanks for the excellent write-up, Ryan!Especially the parts on asynchronous compute and pre-emption were very thorough.
A lot of nonsense was being spread about nVidia's alleged inability to do async compute in DX12, especially after Time Spy was released, and actually showed gains from using multiple queues.
Your article answers all the criticism, and proves the nay-sayers wrong.
Some of them went so far in their claims that they said nVidia could not even do graphics and compute at the same time. Even Maxwell v2 could do that.
I would say you have written the definitive article on this matter.
The_Assimilator - Wednesday, July 20, 2016 - link
Sadly that won't stop the clueless AMD fanboys from continuing to harp on that NVIDIA "doesn't have async compute" or that it "doesn't work". You've gotta feel for them though, NVIDIA's poor performance in a single tech demo... written with assistance from AMD... is really all the red camp has to go on. Because they sure as hell can't compete in terms of performance, or power usage, or cooler design, or adhering to electrical specifications...tipoo - Wednesday, July 20, 2016 - link
Pretty sure the critique was of Maxwell; Pascal's async was widely advertised. It's them saying "don't worry, Maxwell can do it" to questions about it not having it, and then when Pascal is released, saying "oh yeah, performance would have tanked with it on Maxwell", that bugs people, as it should.Scali - Wednesday, July 20, 2016 - link
Nope, a lot of critique on Time Spy was specifically *because* Pascal got gains from the async render path. People said nVidia couldn't do it, so FutureMark must be cheating/bribed.darkchazz - Thursday, July 21, 2016 - link
It won't matter much though because they won't read anything in this article or Futuremark's statement on Async use in Time Spy.And they will keep linking some forum posts that claim nvidia does not support Async Compute.
Nothing will change their minds that it is a rigged benchmark and the developers got bribed by nvidia.
Scali - Friday, July 22, 2016 - link
Yea, not even this official AMD post will: http://radeon.com/radeon-wins-3dmark-dx12/Eidigean - Wednesday, July 20, 2016 - link
Excellent article Ryan.Will you be writing a followup article with tests of two GTX 1080's in SLI with the new high-bandwidth dual bridge?
Looking specifically for these tests in SLI:
3840x2160
2560x1440
3x 2560x1440 (7680x1440)
3x 1920x1080 (5760x1080)
Hoping the latter two tests would include with and without multi-projection optimizations.
Thanks!
Ryan Smith - Wednesday, July 20, 2016 - link
"Will you be writing a followup article with tests of two GTX 1080's in SLI with the new high-bandwidth dual bridge?"It's on the schedule. But not in the near term, unfortunately. GPU Silly Season is in full swing right now.
big dom - Wednesday, July 20, 2016 - link
I agree with the previous reviewers. It's fine and dandy to be a "day one" breakthrough reviewer and believe me I read and enjoyed 20 of those other day 1 reviews as well. But... IMO no one writes such an in depth, technical, and layman-enjoyable review like Anandtech. Excellent review fellas!This is coming from a GTX 1070 FE owner, and I am also the other of the original Battleship Mtron SSD article.
Regards,
Dominick
big dom - Wednesday, July 20, 2016 - link
*authorjase240 - Wednesday, July 20, 2016 - link
I don't understand why the framerates in your benchmarks are so low. For example, I have a 1070 and am able to play GTAV at very high settings and achieve a constant 60fps at 4K (no MSAA, obviously). Even other reviewers have noted much higher framerates, listing the 1080 as a true 4K card and the 1070 as a capable 4K card too.
Ryan Smith - Wednesday, July 20, 2016 - link
It depends on the settings.The way I craft these tests is settings centric. I pick the settings and see where the cards fall. Some other sites have said that they see what settings a card performs well at (e.g. where it gets 60fps) and then calibrate around that.
The end result is that the quality settings I test are most likely higher than the sites you're comparing this article to.
jase240 - Wednesday, July 20, 2016 - link
Ah, now I see why: you have the advanced graphics settings turned up as well. These are not turned on by default in GTAV since they cause a large performance loss. Extended Distance Scaling, Extended Shadows Distance, Long Shadows, High Resolution Shadows, High Detail Streaming While Flying.
They eat a lot of VRAM and perform terribly at higher resolutions.
sonicmerlin - Thursday, July 21, 2016 - link
And I'm guessing add very little to overall picture quality.Mat3 - Wednesday, July 20, 2016 - link
I think there's something wrong with your Fury X in a couple of your compute tests. How's it losing to a Nano?masouth - Wednesday, July 20, 2016 - link
I know the article was posted today but when was it actually written? NewEgg has been having the dual fan Gigabyte GTX 1070 at $399 for a couple weeks now. Yes, it's still $20 over the MSRP and frequently sells out as quick as they show up but it's still a fair deal cheaper than $429.Ryan Smith - Wednesday, July 20, 2016 - link
The prices listed are based on checking Newegg daily for the past week. The sub-$429 cards have never been in stock when I've checked.powerarmour - Wednesday, July 20, 2016 - link
Lulz how late?jabbadap - Wednesday, July 20, 2016 - link
Great thorough review.You have a funny typo on the fast sync page:
" Fast Sync doesn’t replace either Adaptive Sync or G-Sync"
I think you meant Adaptive Vsync, not that VESA standard which nvidia does not support.
Ryan Smith - Wednesday, July 20, 2016 - link
Right you are. Thanks!bill44 - Wednesday, July 20, 2016 - link
Where can I find a FULL review please?Thanks
Ryan Smith - Wednesday, July 20, 2016 - link
"Where can I find a FULL review please?"This article was an excerpt of the full review. The complete, 5 volume set on GTX 1080 will be available at your local book store in October. ;-)
catavalon21 - Wednesday, July 20, 2016 - link
Awesome.AndrewJacksonZA - Thursday, July 21, 2016 - link
Cool! Thanks Ryan! Did you reach your stretch goals on your Kickstarter campaign so that you can afford the scanning electron microscope for great photos?bill44 - Thursday, July 21, 2016 - link
Thanks Ryan. I was asking about audio - not for games, but for my music playback (SACD, DVD-A) with different sampling rates that include 88.2 and 176.4kHz. I can't find FULL specifications for the GTX 10-series cards that include this information, or any decoding capabilities. I was looking at the GTX 1060 for HTPC use with madVR.
Where can I find out about audio and sampling rates? MadVR capabilities?
Ryan Smith - Friday, July 22, 2016 - link
As far as I know, audio capabilities are the same as on the 900 series. Unfortunately I don't have more detail than that right now.bill44 - Friday, July 22, 2016 - link
That's the problem. I know nothing about the 900 series audio capabilities (which I suppose are the same as the 800 series ;) ) and no one publishes them in reviews. All reviews are incomplete. Does anyone here know at least the supported audio sampling rates? If not, I think my best bet is going with AMD (which I'm sure supports 88.2 & 176.4 kHz).
bill44 - Saturday, July 23, 2016 - link
Anyone?poohbear - Wednesday, July 20, 2016 - link
Thank you for the review; late as it is, it's still an excellent review, and I love the details!junky77 - Wednesday, July 20, 2016 - link
In other reviews, even a Haswell-E is limited for GPUs like GTX 1070JamesAnthony - Wednesday, July 20, 2016 - link
I really appreciate all the work that went into this in depth review.I especially am very glad that you included the GTX 680 in the benchmarks along with all the other cards after it.
It's often really hard to get an overview of performance over a couple years.
I'm looking at upgrading 2 systems from GTX680 to either GTX 1070 or GTX 1060 and Titan (original one) to GTX 1080, so this helps see what the performance would be like.
Hopefully you tested the 1060 the same way so I can just plug the numbers for it into the same graph.
Thanks again!
Ryan Smith - Wednesday, July 20, 2016 - link
Be sure to check Bench. The 1060 results are already there, so you can see those comparisons right now.fivefeet8 - Wednesday, July 20, 2016 - link
2nd page 3rd paragraph: "generational increate in performance". ;increase?2nd page 2nd section: "Pascal in an architecture that I’m not sure has any real parallel on a historical basis". ;is?
hansmuff - Wednesday, July 20, 2016 - link
Great review, I like that you went into all the hardware details. Worth the wait.Chaser - Wednesday, July 20, 2016 - link
I'm an Nvidia guy all the way. For now. I am disappointed in the midrange RX 480 and its power consumption compared to the competition, especially after they had said that Polaris was going to primarily be an efficiency improvement. Outside of my bias, I truly hope AMD provides a very competitive flagship in the near future. Everyone wins. But with the 1060 now announced, it just makes AMD's GPU prospects and profitability questionable.
MarkieGcolor - Wednesday, July 20, 2016 - link
So basically, after all the hype about FinFET, we get a standard, if not disappointing, jump this generation, along with a price hike. I'm so relieved that I didn't wait for this generation and can just enjoy my current 970 SLI/Nano CrossFire rigs. AMD easily has the opportunity to blow these cards out of the water with big GPUs.DonMiguel85 - Wednesday, July 20, 2016 - link
Agreed. They'll likely be much more power-hungry, but I believe it's definitely doable. At the very least it'll probably be similar to Fury X Vs. GTX 980sonicmerlin - Thursday, July 21, 2016 - link
The 1070 is as fast as the 980 ti. The 1060 is as fast as a 980. The 1080 is much faster than a 980 ti. Every card jumped up two tiers in performance from the previous gen. That's "standard" to you?Kvaern1 - Sunday, July 24, 2016 - link
I don't think there's much evidence pointing in the direction of GCN 4 blowing Pascal out of the water.Sadly, AMD needs a win but I don't see it coming. Budgets matter.
watzupken - Wednesday, July 20, 2016 - link
Brilliant review. Thanks for the in-depth analysis. This is late, but the analysis is its strength and added value, and it was worth waiting for.ptown16 - Wednesday, July 20, 2016 - link
This review was a L O N G time coming, but gotta admit, excellent as always. This was the ONLY Pascal review to acknowledge and significantly include Kepler cards in the benchmarks and some comments. It makes sense to bench GK104 and analyze generational improvements since Kepler debuted 28nm and Pascal has finally ushered in the first node shrink since then. I guessed Anandtech would be the only site to do so, and looks like that's exactly what happened. Looking forward to the upcoming Polaris review!DonMiguel85 - Wednesday, July 20, 2016 - link
I do still wonder if Kepler's poor performance nowadays is largely due to neglected driver optimizations or just plain old/inefficient architecture. If it's the latter, it's really pretty bad with modern game workloads.ptown16 - Wednesday, July 20, 2016 - link
It may be a little of the latter, but Kepler was pretty amazing at launch. I suspect driver neglect though, seeing as how Kepler performance got notably WORSE soon after Maxwell. It's also interesting to see how the comparable GCN cards of that time, which were often slower than the Kepler competition, are now significantly faster.DonMiguel85 - Thursday, July 21, 2016 - link
Yeah, and a GTX 960 often beats a GTX 680 or 770 in many newer games. Sometimes it's even pretty close to a 780.hansmuff - Thursday, July 21, 2016 - link
This is the one issue that has me wavering for the next card. My AMD cards, the last one being a 5850, have always lasted longer than my NV cards; of course at the expense of slower game fixes/ready drivers.So far so good with a 1.5yrs old 970, but I'm keeping a close eye on it. I'm looking forward to what VEGA brings.
ptown16 - Thursday, July 21, 2016 - link
Yeah I'd keep an eye on it. My 770 can still play new games, albeit at lowered quality settings. The one hope for the 970 and other Maxwell cards is that Pascal is so similar. The only times I see performance taking a big hit would be newer games using asynchronous workloads, since Maxwell is poorly prepared to handle that. Otherwise maybe Maxwell cards will last much longer than Kepler. That said, I'm having second thoughts on the 1070 and curious to see what AMD can offer in the $300-$400 price range.jcardel - Wednesday, July 27, 2016 - link
This is exactly the same situation as me. I've got a 770 sitting in my rig, and am looking hard at the 1070, maybe soon. Although my 770 is still up to the task in most games, I really only play Blizzard games these days and they are not hard on your hardware. My biggest issue is really that it is rather noisy, so I will be looking for a solution with the lowest dB.
Great article, it was totally worth waiting for.. I only read this sort of stuff here so have been waiting till now for any 1080 review.
Thanks!
D. Lister - Thursday, July 21, 2016 - link
Nice job, Ryan. Good comeback. Keep it up.Saeid92 - Thursday, July 21, 2016 - link
What is 99th percentile framerate?Ryan Smith - Thursday, July 21, 2016 - link
If you sorted the framerate from highest to lowest, this would be the framerate of the slowest 1%. It's basically a more accurate/meaningful metric for minimum frame rates.
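For anyone who wants to compute it themselves from a frame-time log, a minimal sketch (the log values and function name here are made up for the example):

    #include <algorithm>
    #include <cstdio>
    #include <functional>
    #include <vector>

    // 99th percentile framerate from per-frame render times in milliseconds:
    // sort from slowest to fastest, take the frame time at the 1% mark, convert to fps.
    double percentile99_fps(std::vector<double> frame_ms) {
        std::sort(frame_ms.begin(), frame_ms.end(), std::greater<double>()); // slowest first
        auto idx = frame_ms.size() / 100;                                    // 1% boundary
        return 1000.0 / frame_ms[std::min(idx, frame_ms.size() - 1)];
    }

    int main() {
        // Hypothetical log: mostly 10 ms frames (100 fps) with a dozen 25 ms hitches.
        std::vector<double> log(1000, 10.0);
        for (int i = 0; i < 12; ++i) log[i * 83] = 25.0;
        std::printf("99th percentile: %.1f fps\n", percentile99_fps(log)); // ~40 fps
        return 0;
    }

The average of that log is still close to 100fps, but the 99th percentile comes out around 40fps - exactly the kind of hitching an average hides and an absolute minimum over-weights.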
Eris_Floralia - Thursday, July 21, 2016 - link
This is why I love Anandtech: depth in reviews. I'd even want to be one of your editors if you ever plan to create a Chinese-translated version of these reviews.daku123 - Thursday, July 21, 2016 - link
Typo on FP16 Throughput page. In second paragraph, it should be Tegra X1 (not Tesla X1?).Ryan Smith - Thursday, July 21, 2016 - link
Eyup. Thanks!Badelhas - Thursday, July 21, 2016 - link
Great detailed review, as always. But I have to ask once again: why didn't you do some kind of VR benchmarks? That's what drives my choices now, to be honest.
Cheers
Ranger1065 - Thursday, July 21, 2016 - link
After over 2 months of reading GTX 1080 reviews, I felt a distinct lack of excitement as I read Anandtech kicking off their review of the FinFET generation. Could it prove to be anything but an anticlimax?
Sadly and unsurprisingly... NOT.
It was, however, amusing to see the faithful positively gushing praise for Anandtech now that the "greatly anticipated" review is finally out.
Yes folks, 20 or so pages of (well written) information, mostly already covered by other tech sites, finally published; it's as if a magic wand has been waved, the information has been presented with that special Anandtech sauce, new insights have been illuminated and all is well in Anandtechland again. (AT LEAST UNTIL THE NEXT 2 MONTH DELAY.) LOL.
I do like the way Anandtech presents the FPS charts.
Back to sleep now, Anandtech :)
mkaibear - Thursday, July 21, 2016 - link
You've hit the nail on the head here Ranger.The info which is included within the article is indeed mostly already covered by other tech sites.
Emphasis on the "mostly" and the plural "sites".
Those of us who have jobs which keep us busy and have an interest in this sort of thing often don't have the time to trawl round many different sites to get reviews and pertinent technical data so we rely upon those sites which we trust to produce in-depth articles, even if they take a bit longer.
As an IT Manager for (most recently) a manufacturing firm and then a school, I don't care about bleeding edge, get-the-new-stuff-as-soon-as-it-comes-out purchasing; I care about getting the right stuff, and a two month delay to get a proper review is absolutely fine. If I need quick benchmarks I'll use someone like Hexus or HardOCP, but a deep dive into the architecture, so I can justify purchases to the Art and Media departments or the programmers, is essential. You don't get that anywhere else.
Ranger1065 - Thursday, July 21, 2016 - link
Your unwavering support for Anandtech is impressive.I too have a job that keeps me busy, yet oddly enough I find the time to browse (I prefer that word to "trawl") a number of sites.
I find it helps to form objective opinions.
I don't believe in early adoption, but I do believe in getting the job done on time, however if you are comfortable with a 2 month delay, so be it :)
Interesting to note that architectural deep dives concern your art and media departments so closely in their purchasing decisions. Who would have guessed?
It's true (God knows it's been stated here often enough) that Anandtech goes into detail like no other; I don't dispute that.
But is it worth the wait? A significant number seem to think not.
Allow me to leave one last issue for you to ponder (assuming you have the time in your extremely busy schedule).
Is it good for Anandtech?
catavalon21 - Thursday, July 21, 2016 - link
Impatient as I was at first for benchmarks (yes, I'm a numbers junkie), since it's evident precious few of us will have had a chance to buy one of these cards yet (or the 480), I doubt the delay has caused anyone to buy the wrong card. Can't speak for the smartphone reviews folks are complaining about being absent, but as it turns out, what I'm initially looking for is usually available early on in Bench. The rest of this, yeah, it can wait.mkaibear - Saturday, July 23, 2016 - link
Job, house, kids, church... more than enough to keep me sufficiently busy that I don't have the time to browse more than a few sites. I pick them quite carefully.Given the lifespan of a typical system is >5 years I think that a 2 month delay is perfectly reasonable. It can often take that long to get purchasing signoff once I've decided what they need to purchase anyway (one of the many reasons that architectural deep dives are useful - so I can explain why the purchase is worthwhile). Do you actually spend someone else's money at any point or are you just having to justify it to yourself?
Whether or not it's worth the wait to you is one thing - but it's clearly worth the wait to both Anandtech and to Purch.
[email protected] - Thursday, July 21, 2016 - link
Excellent article, well worth the wait!giggs - Thursday, July 21, 2016 - link
While this is a very thorough and well-written review, it makes me wonder about sponsored content and product placement. The PG279Q is the only monitor mentioned, making sure the brand appears, with nothing about competing products. It felt unnecessary.
I hope it's just a coincidence, but considering there has been quite a lot of coverage about Asus in the last few months, I'm starting to doubt some of the stuff I read here.
Ryan Smith - Thursday, July 21, 2016 - link
"The PG279Q is the only monitor mentionned, making sure the brand appears, and nothing about competing products."There's no product placement or the like (and if there was, it would be disclosed). I just wanted to name a popular 1440p G-Sync monitor to give some real-world connection to the results. We've had cards for a bit that can drive 1440p monitors at around 60fps, but GTX 1080 is really the first card that is going to make good use of higher refresh rate monitors.
giggs - Thursday, July 21, 2016 - link
Fair enough, thank you for responding promptly. Keep up the good work!arh2o - Thursday, July 21, 2016 - link
This is really the gold standard of reviews. More in-depth than any site on the internet. Great job Ryan, keep up the good work.Ranger1065 - Thursday, July 21, 2016 - link
This is a quality article.timchen - Thursday, July 21, 2016 - link
Great article. It is pleasant to read more about technology instead of testing results. Some questions though:1. higher frequency: I am kind of skeptical that the overall higher frequency is mostly enabled by FinFET. Maybe it is the case, but for example when Intel moved to FinFET we did not see such improvement. RX480 is not showing that either. It seems pretty evident the situation is different from 8800GTX where we first get frequency doubling/tripling only in the shader domain though. (Wow DX10 is 10 years ago... and computation throughput is improved by 20x)
2. The Fast Sync comparison graph looks pretty suspicious. How can Vsync have such high latency? The most latency I can see in a double-buffer scenario with vsync is that the screen refresh happens just a tiny bit earlier than the completion of a buffer. That would give a delay of two frame times, which is about 33 ms (remember we are talking about a case where GPU fps > 60). This is unless, of course, they are testing vsync at 20Hz or something.
Ryan Smith - Friday, July 22, 2016 - link
2) I suspect the v-sync comparison is a 3-deep buffer at a very high framerate.
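As a rough worked example of why a deep pre-rendered queue inflates latency even when the GPU itself is fast (the queue depth and refresh rate are assumptions for the example, not the actual test configuration):

    // Classic double-buffered v-sync at 60 Hz: a finished frame that just misses a
    // refresh waits up to one full interval before scanout begins.
    constexpr double refresh_ms        = 1000.0 / 60.0;    // 16.7 ms
    constexpr double double_buffer_max = 2.0 * refresh_ms; // ~33 ms, timchen's bound

    // With a 3-deep queue and a render rate far above 60 fps, frames still leave the
    // queue at only 60 Hz, so a new frame can sit behind two already-queued ones:
    constexpr double queued_3_deep_max = 4.0 * refresh_ms; // roughly 3-4 intervals, up to ~67 ms

which would reconcile a much-higher-than-33ms v-sync number with timchen's double-buffer reasoning; Fast Sync instead always scans out the most recently completed frame, so the queue never builds up.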
lagittaja - Sunday, July 24, 2016 - link
1) It is a big part of it. Remember how bad 20nm was? The leakage was really high, so Nvidia/AMD decided to skip it. FinFETs helped reduce the leakage for the "14/16"nm node.
That's apples to oranges. CPUs are already 3-4GHz out of the box.
The RX 480 isn't showing it because the 14nm LPP node is a lemon for GPUs.
You know what the optimal frequency for Polaris 10 is? 1GHz. After that the required voltage shoots up.
You know, LPP, where the LP stands for Low Power. Great for SoCs, but GPUs? Not so much.
"But the SoCs clock higher than 2GHz blabla". Yeah, well, a) that's the CPU and b) it's freaking tiny.
How are we getting 2GHz+ frequencies with Pascal, which so closely resembles Maxwell?
Because of the smaller manufacturing node. How's that possible? It's because of the FinFETs, which reduced the leakage of the 20nm node.
Why couldn't we have higher clockspeeds without FinFETs at 28nm? Because of power.
28nm GPUs capped out around the 1.2-1.4GHz mark.
20nm was a no-go; the leakage current was too high.
16nm gives you FinFETs, which reduce the leakage current dramatically.
What does that enable you to do? Increase the clockspeed.
Here's a good article
http://www.anandtech.com/show/8223/an-introduction...
lagittaja - Sunday, July 24, 2016 - link
As an addition to the RX 480 / Polaris 10 clockspeed discussion: GCN2-GCN4 VDD vs Fmax at avg ASIC quality
http://i.imgur.com/Hdgkv0F.png
timchen - Thursday, July 21, 2016 - link
Another question is about Boost 3.0: given that we see a 150-200 MHz GPU offset very commonly across boards, wouldn't it be beneficial to undervolt (i.e. disallow the highest voltage bins corresponding to this extra 150-200 MHz) and offset at the same time, to maintain performance at lower power consumption? Why did Nvidia not do this in the first place? (This is coming from reading Tom's saying that the 1060 can be a 60W card with 80% of its performance...)AnnonymousCoward - Thursday, July 21, 2016 - link
NVIDIA, get with the program and support VESA Adaptive-Sync already!!! When your $700 card can't support the VESA standard that's in my monitor, and as a result I have to live with more lag and lower framerate, something is seriously wrong. And why wouldn't you want to make your product more flexible?? I'm looking squarely at you, Tom Petersen. Don't get hung up on your G-sync patent and support VESA!AnnonymousCoward - Thursday, July 21, 2016 - link
If the stock cards reach the 83C throttle point, I don't see what benefit an OC gives (won't you just reach that sooner?). It seems like raising the TDP or undervolting would boost continuous performance. Your thoughts?modeless - Friday, July 22, 2016 - link
Thanks for the in depth FP16 section! I've been looking forward to the full review. I have to say this is puzzling. Why put it on there at all? Emulation would be faster. But anyway, NVIDIA announced a new Titan X just now! Does this one have FP16 for $1200? Instant buy for me if so.Ryan Smith - Friday, July 22, 2016 - link
Emulation would be faster, but it would not be the same as running it on a real FP16x2 unit. It's the same purpose as FP64 units: for binary compatibility so that developers can write and debug Tesla applications on their GeForce GPU.hoohoo - Friday, July 22, 2016 - link
Excellent article, Ryan, thank you!Especially the info on preemption and async/scheduling.
I expected the preemption might be expensive in some circumstances, but I didn't quite expect it to push the L2 cache! Still, this is a marked improvement for nVidia.
hoohoo - Friday, July 22, 2016 - link
It seems like the preemption is implemented in the driver, though? Are there actual h/w instructions to, as it were, "swap stack pointer", "push LDT", "swap instruction pointer"?Scali - Wednesday, July 27, 2016 - link
There is hardware to quickly swap task contexts to/from VRAM.The driver can signal when a task needs to be pre-empted, which it can now do at any pixel/instruction.
If I understand Dynamic Load Balancing correctly, you can queue up tasks from the compute partition on the graphics partition, which will start running automatically once the graphics task has completed. It sounds like this is actually done without any interference from the driver.
tamalero - Friday, July 22, 2016 - link
I swear the whole 1080 vs 480X thing reminds me of the old fight between the 8800 and the 2900XT, which somewhat improved in the 3870 and ended with a winner with the 4870.
I really hope AMD stops messing with the ATI division and lets them drop a winner.
AMD has been sinking ATI and making ATI carry the goddarn load of AMD's processor division failure.
doggface - Friday, July 22, 2016 - link
Excellent article Ryan. I have been reading for several days whenever I can catch five minutes, and it has been quite the read! I look forward to the Polaris review. I feel like you should bench these cards on day 1, so that the whingers get it out of their system, then label these reviews the "GP104" review, etc. It really was about the chip and board more than the specific cards...
PolarisOrbit - Saturday, July 23, 2016 - link
After reading the page about Simultaneous Multi Projection, I had a question of whether this feature could be used for more efficiently rendering reflections, like on a mirror or the surface of water. Does anyone know?KoolAidMan1 - Saturday, July 23, 2016 - link
Great review guys, in-depth and unbiased as always.On that note, the anger from a few AMD fanboys is hilarious, almost as funny as how pissed off the Google fanboys get whenever Anandtech dares say anything positive about an Apple product.
Love my EVGA GTX 1080 SC, blistering performance, couldn't be happier with it
prisonerX - Sunday, July 24, 2016 - link
Be careful, you might smug yourself to death.KoolAidMan1 - Monday, July 25, 2016 - link
Spotted the fanboy apologistbill44 - Monday, July 25, 2016 - link
Does anyone here know at least the supported audio sampling rates? If not, I think my best bet is going with AMD (which I'm sure supports 88.2 & 176.4 kHz).Anato - Monday, July 25, 2016 - link
Thanks for the review! I waited a long time for it and read others in the meantime, but then this came along and it was the best!Squuiid - Tuesday, July 26, 2016 - link
Here's my Time Spy result in 3DMark for anyone interested in what an X5690 Mac Pro can do with a 1080 running in PCIe 1.1 in Windows 10.http://www.3dmark.com/3dm/13607976?
Robalov - Tuesday, July 26, 2016 - link
Feels like it took 2 years longer than normal for this review :Dextide - Wednesday, July 27, 2016 - link
The Venn diagram is wrong -- for GP104 it says 1:64 speed for FP16 -- it is actually 1:1 for FP16 (i.e. the same speed as FP32). (NOTE: GP100 has 2:1 FP16 -- meaning FP16 is twice as fast as FP32.)extide - Wednesday, July 27, 2016 - link
EDIT: I might actually be incorrect about this, as I have seen information claiming both... weird.mxthunder - Friday, July 29, 2016 - link
It's really driving me nuts that a 780 was used instead of a 780 Ti.yhselp - Monday, August 8, 2016 - link
Have I understood correctly that Pascal offers a 20% increase in memory bandwidth from delta color compression over Maxwell? As in a total average of 45% over Kepler just from color compression?flexy - Sunday, September 4, 2016 - link
Sorry, late comment. I just read about GPU Boost 3.0 and this is AWESOME. What they did is expose what previously was only doable with BIOS modding, e.g. assigning the clock bins different voltages. The problem with overclocking Kepler/Maxwell was NOT so much that you got stuck with the "lowest" overclock, as the article says, but that you simply added a FIXED amount of clock across the entire range of clocks, as you would do with Afterburner etc. where you simply add, say, +120 to the core. What happened there is that you might be "stable" at the max overclock (CLK bin), but since you added more MHz to EVERY clock bin, the assigned voltages (in the BIOS) for each bin might not be sufficient. Say you have CLK bin 63, which is set to 1304MHz in a stock BIOS. Now you use Afterburner and add 150MHz; all of a sudden this bin amounts to 1454MHz, BUT STILL at the same voltage as before, which is too low for 1454MHz. You had to manually edit the table in the BIOS to shift clocks around, especially since not all Maxwell cards allowed adding voltage via software.
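A toy illustration of that per-bin point (the structure, clocks, and voltages here are invented for the example, not values from any real BIOS or from NVIDIA's tools):

    struct ClkBin { int mhz; double mv; };   // one entry of a hypothetical boost table

    // Kepler/Maxwell-era overclocking: one fixed offset applied to every bin.
    // A bin keeps the voltage that was calibrated for its stock clock (say 1304 MHz)
    // but is now asked to run 150 MHz higher at that same voltage.
    void apply_fixed_offset(ClkBin bins[], int n, int offset_mhz) {
        for (int i = 0; i < n; ++i) bins[i].mhz += offset_mhz;   // voltages untouched
    }

    // GPU Boost 3.0-style tuning: a per-bin offset curve, so each voltage point only
    // gets as much extra clock as that voltage has actually been shown to sustain.
    void apply_per_bin_offset(ClkBin bins[], const int offset_mhz[], int n) {
        for (int i = 0; i < n; ++i) bins[i].mhz += offset_mhz[i];
    }

which is essentially what the per-point frequency curve editors built on Boost 3.0 now expose without any BIOS editing.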
Ether.86 - Tuesday, November 1, 2016 - link
Astonishing review. That's the way Anandtech should be, not like the mobile section, which sucks...Warsun - Tuesday, January 17, 2017 - link
Yeah, looking at the bottom here: the GTX 1070 is on the same level as a single 480 4GB card, so that graph is wrong. http://www.hwcompare.com/30889/geforce-gtx-1070-vs...
Remember, this is from GPU-Z, based on hardware specs. No amount of configuration in the drivers changes this. They either screwed up; I am calling shenanigans.
marceloamaral - Thursday, April 13, 2017 - link
Nice Ryan Smith! But, my question is, is it truly possible to share the GPU with different workloads in the P100? I've read in the NVIDIA manual that "The GPU has a time sliced scheduler to schedule work from work queues belonging to different CUDA contexts. Work launched to the compute engine from work queues belonging to different CUDA contexts cannot execute concurrently."