iwod - Thursday, May 17, 2012 - link
Would any upcoming article be analyzing that as well? I was thinking it would be extremely useful for gaming cafes, LAN parties, and arcade centers.
Ryan Smith - Thursday, May 17, 2012 - link
We'll have more info on GPU virtualization later today. It's quite a bit to cover.
CeriseCogburn - Wednesday, May 23, 2012 - link
Massive win. Hyper-Q and Dynamic Parallelism mean Tesla can achieve near-100% utilization 24/7, with far less CPU power required to drive it; a huge value-oriented upgrade.
That's what I call the best hardware design, the best engineering, surpassing expectations, and delivering bang for the buck.
AMD is crying itself to sleep - and I'm starting to fear for them.
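For anyone wondering what Dynamic Parallelism actually looks like, here is a minimal sketch of a kernel launching a child kernel from the device, which is the feature being described above. This assumes GK110-class hardware (sm_35) and relocatable device code; the kernel and buffer names are made up for illustration.

    #include <cuda_runtime.h>

    __global__ void childKernel(float *data, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            data[i] *= 2.0f;                    // trivial per-element work
    }

    __global__ void parentKernel(float *data, int n)
    {
        // On GK110 the GPU can queue follow-up kernels itself,
        // without a round trip to the CPU for every step.
        if (blockIdx.x == 0 && threadIdx.x == 0)
            childKernel<<<(n + 255) / 256, 256>>>(data, n);
    }

    int main()
    {
        const int n = 1 << 20;
        float *d_data;
        cudaMalloc(&d_data, n * sizeof(float));
        parentKernel<<<1, 1>>>(d_data, n);      // one tiny host launch seeds the rest
        cudaDeviceSynchronize();
        cudaFree(d_data);
        return 0;
    }

Built with something like nvcc -arch=sm_35 -rdc=true; on pre-GK110 parts only the host can launch kernels, which is part of why CPU overhead matters so much there.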
r3loaded - Thursday, May 17, 2012 - link
Is there a chance that Nvidia will release a consumer-grade GPU based on GK110? Something like a GTX 685 perhaps?
tviceman - Thursday, May 17, 2012 - link
GeForce-based GK110 cards will likely come in Q1 2013. Nvidia will prioritize this product for the pro market first, and will move to retail after that to pit it against AMD's Sea Islands.
chizow - Thursday, May 17, 2012 - link
I wouldn't be surprised if we see GK110 GeForce parts sooner than Tesla, as historically Nvidia has launched the GeForce parts first. I believe the last round of rumors said September.
This gives them more time to bin GK110 ASICs for functional units and, more importantly, TDP, and also gives them time to polish the drivers expected of enterprise/professional parts.
PsiAmp - Saturday, May 19, 2012 - link
Nvidia can't solve the problems with their design on the 28nm process to improve yields until the end of the year, as Huang stated.
GK110 is much bigger than GK104, so the yield problem is even more pronounced. Knowing that they will have limited production for some time, it is wiser to release it to the market where it will get higher profits, and that is the HPC market. So no, we won't see GK110 in consumer cards this year.
CeriseCogburn - Saturday, May 19, 2012 - link
You mean TSMC doesn't have enough wafer starts, and that the several extra billion dollars of infrastructure investment injected recently doesn't get built out overnight.
With demand as great as we've already seen on the 680 and now the 670, it will sell out while AMD's 79xx and 78xx sit on the shelves.
CeriseCogburn - Saturday, May 19, 2012 - link
chizow's theory of massive pre-overclocking on the 680 is now laughable, as we see 745MHz for the "new core" speed.
Maybe a few of you could instead start claiming nVidia "held back the slow, unable-to-achieve, HARVESTED" cores for their new K10s, since those cores only hit 745MHz stable.
Quite unlike the "defective and harvested" 670 cores that hit 915, 980, or 1006MHz on release, as so many of you bleated.
I think it's great nVidia is using defective "harvested" cores for their new $2,000-plus K10 Tesla line - what a way to make huge profit $$$ on bad, slow, leaky parts, huh!
(rolls eyes again at the idiotic attack bleating)
elajt_1 - Friday, May 18, 2012 - link
I'll be surprised if we ever see a GK110 part in a GeForce card. Why would Nvidia sell these as graphics cards? They're harder to manufacture and much more expensive because of that (lower yields and fewer GPUs per wafer), with a big probability of horrible margins (for obvious reasons). And lastly, there's a big chance the chip won't be able to clock very high, since it's likely even bigger than GF100.
There is also a big possibility that GK110 won't improve much (graphics-wise) over the existing GK104. If I understood anything from the above article, the chip is not optimised for gaming, which makes it even more unlikely they'll go in that direction.
Just my two cents.
Malphas - Thursday, May 17, 2012 - link
They'll most likely just release them as the 700 series at the tail end of this year.
Musafir_86 - Thursday, May 17, 2012 - link
They go directly with GK110? Where's GK100? Shouldn't it be released first, as with Fermi (GF100 -> GF110)?
Regards.
gplnpsb - Thursday, May 17, 2012 - link
Presumably there was a GK100 design that was cancelled for whatever reason. If it ever taped out (presumably around late summer of last year, around the same time as the GK104 tape-out), nVidia may have realized it would be unmanufacturable on TSMC's 28nm process as it was.
Rumor has it that GK110 taped out in January of this year. Perhaps nVidia had to go back and optimize the chip for yields and clock speeds once they got enough experience with the idiosyncrasies of the 28nm process.
tipoo - Thursday, May 17, 2012 - link
From Nvidia's own website, it seems these cards are far weaker than Fermi at DP operations; even with two chips, the K10 doesn't match the single-chip older Fermi ones:
http://www.nvidia.com/object/tesla-servers.html
190 GFLOPS for the K10 with two GPUs, versus 665 and 515 for the older Fermi cards. Hmm. I thought only the consumer GeForce versions would have such cut-down compute performance, but I guess it's inherent in the architecture.
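(As a rough sanity check, assuming the 745MHz clock quoted elsewhere in this thread and GK104's 8 dedicated FP64 units per SMX: 2 GPUs x 8 SMX x 8 FP64 units x 2 FLOPs per FMA x 0.745 GHz ≈ 190 GFLOPS, which lines up with the spec sheet. The shortfall comes from how few FP64 units GK104 carries, not from clock speed.)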
Ryan Smith - Thursday, May 17, 2012 - link
To be precise, it's GK104 that has weak DP (1/24 rate). GK110 is much better at 1/3, which is actually a bit lower than GF110's 1/2.
tipoo - Thursday, May 17, 2012 - link
So the full GK110 part would be around 1500 GFLOPS DP if it's 1/3, or is there more to it than that?
tipoo - Thursday, May 17, 2012 - link
Wait, never mind, that would be if it performed the same as the dual-chip GK104 card, which it probably won't. Do we have any idea what the DP performance of GK110 would be like, then?
tipoo - Thursday, May 17, 2012 - link
We need an edit button just for me, heh. Nvidia's slide says the K10 is 3x the single-precision performance of today's cards, and the K20 is 3x the DP. So 1500 GFLOPS is pretty close, and the rumours say 1.5 TFLOPS too, so that's a good bet I guess.
Ryan Smith - Thursday, May 17, 2012 - link
NVIDIA's official guidance is >1 TFLOPS real performance, which is a combination of the increased number of functional units and increased efficiency on GK110.
PsiAmp - Saturday, May 19, 2012 - link
If GK110 has the same clocks as GK104 (presumably the 745MHz the K10 runs at, i.e. 960 FP64 units x 2 FLOPs x 0.745 GHz), then it is ~1.43 TFLOPS DP. Nvidia stated that it has ~1 TFLOPS DP performance. They can't know it yet, because the chip is too far from mass-production state.
PsiAmp - Saturday, May 19, 2012 - link
K10 and GTX 680 share the same chip. So DP in the K10 is terrible.
belmare - Friday, August 3, 2012 - link
Mmmm, I thought they said you'd be getting 3x the power efficiency of Fermi. The M2090 had 660 GFLOPS, so it should follow that the K20 gets somewhere around 1.9 TFLOPS.
Also, it wouldn't be much competition for the 7970 GE, which puts out about 1 TFLOPS of DP, especially for a die this large. 550 square mm is enormous, and the K20 should pay it back in huge performance gains.
We might be able to get a 780 next year at close to the same FLOPS, just so it's able to compete with next year's GCN.
Kepler might be the first not to cap SP or DP. However, I think this is just part of the plan to drop Maxwell tortuously and rip GPGPU apart at the seams.
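(For what it's worth, assuming Tahiti's 1/4 DP rate and the GHz Edition's 1050MHz clock: 2048 SPs / 4 x 2 FLOPs per FMA x 1.05 GHz ≈ 1.08 TFLOPS, so the roughly 1 TFLOPS figure for the 7970 GE checks out. The 1.9 TFLOPS guess for the K20 only follows if NVIDIA's "3x Fermi" line refers to absolute DP throughput rather than performance per watt.)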
wiyosaya - Thursday, May 17, 2012 - link
IMHO, this might be a losing bet for NVIDIA, as Kepler is taking DP performance away from everyone, and HPC computing power away from enthusiasts on a budget. IMHO, they are walking the Compaq, HP, and Apple line now, with overpriced products in the compute area. As DP is important to me, I just bought a 580 rather than a 680. I'll wait for benchmarks on this card; however, as an enthusiast on a budget looking for the best value for the money, I'll be passing on it.
Perhaps NVIDIA is trying to counter the last-gen situation, where Tesla cards performed about as well as, and sometimes not even as well as, the equivalent gaming card, and the Tesla 'performance' came at an extreme premium. The gamer cards were generally a far better value than the Tesla cards for compute performance.
I wish NVIDIA luck in this venture. There are not many public distributed-computing projects out there that need DP support; however, for those projects, NVIDIA may be driving business to AMD, which presently has far superior DP performance. I think this is an instance where NVIDIA is definitely making a business decision to move in this direction; I hope it works out for them, or, if it fails, I hope they come to their senses. $2,500 is an expensive card no matter how you look at it, and the fact that they are courting oil and gas exploration is an indication that they are after the $$$.
Parhel - Thursday, May 17, 2012 - link
There are plenty of products on the market that are for the professional rather than the "enthusiast on a budget."
I have a professional photographer friend who recently showed me a lens that he paid over $5,000 for. Now, I like cameras a lot, and have since I was a kid. And, I'm willing to spend more on them than most people do. But I'm an enthusiast on a budget when it comes to cameras. As much as I'd love to own that lens, that product clearly wasn't designed for me.
Good for nVidia if they can design a card that oil and gas companies are willing to pay crazy money for. In the end, nVidia being "after the $$$" is the best thing for us mere enthusiasts.
SamuelFigueroa - Thursday, May 17, 2012 - link
Unlike gamer cards, Tesla cards have to work reliably 24x7 for months on end, and companies that buy them need to be able to trust the results of their computations without resorting to running their weeks-long computation again just in case.
By the way, did you notice that K10 does not have an integrated fan? So even if you had the money to buy one, it wouldn't work in your enthusiast computer case.
Dribble - Thursday, May 17, 2012 - link
Last gen was exactly the same - the GF104 had weak double-precision performance too. The only differences so far are that Nvidia hasn't released the high-end GK110 as a gaming chip yet, and that, because GK104 is so fast compared to ATI's cards, they called it the 680 rather than the 660 (which they would have done if ATI had been faster).
I'm sure they will release a GK110 to gamers in the end - probably call it the 780, and rebrand the 680 as a 760.
chizow - Thursday, May 17, 2012 - link
Exactly, the poor DP performance of GK104 just further drives home the point that it was never meant to be the flagship ASIC for this generation of Kepler GPUs. It's obvious now that GK110 was meant to be the true successor to GF100/110 and the rest of Nvidia's lineage of true flagship ASICs (G80, GT200, GT200b, etc.).
CeriseCogburn - Saturday, May 19, 2012 - link
What is obvious, and what drives the point home without question, is that AMD cards suck so badly that nVidia has a whole other top tier they could have added long ago, right?
ROFL - you've said so many times that nVidia is a massive king on performance right now, and has been since before the 680 release, with so much extra gaming performance up their sleeve already; AMD should be ashamed of itself.
This is what you've delivered, while you little attacking know-it-alls pretend AMD is absolutely coreless and cardless for the near future, so you can blame the release level on nVidia while ignoring the failure company AMD, right?
As AMD "silently guards" whatever chip it may have designed or has in test production right now, all your energies can be focused on attacking nVidia, while all of you carelessly pretend AMD's gaming chips for tomorrow don't exist at all.
I find that very interesting.
I find the glaring omission that is now standard practice quite telling.
Impulses - Thursday, May 17, 2012 - link
I'll happily buy a 680/760 for $300 or less when/if that happens... :p Upgrading my dual 6950s in CF (which cost me like $450 for both) to anything but an $800+ SLI/CF setup right now would just be a sidegrade at best, so I'll probably just skip this gen.
CeriseCogburn - Wednesday, May 23, 2012 - link
All that, yet you state you just bought an nVidia GTX 580.
ROFL
BTW, AMD is losing its proprietary OpenCL WinZip compute benchmarks to Ivy Bridge CPUs.
Kepler GK110 is now displaying true innovation and the kind of engineering AMD can only dream of - with 32 CPU work queues available instead of 1, and the ability to launch additional work on its own to keep the GPU busy.
Obviously AMD just lost their entire top end in compute - it's over.
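To put the "32 queues instead of 1" point in concrete terms, here is a minimal sketch of many independent CUDA streams feeding one GPU; this is the kind of workload Hyper-Q is meant to keep from serializing behind a single hardware work queue. Stream count, sizes, and kernel names are illustrative only.

    #include <cuda_runtime.h>

    __global__ void smallKernel(float *buf, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            buf[i] += 1.0f;                     // small independent task
    }

    int main()
    {
        const int kStreams = 32, n = 4096;
        cudaStream_t streams[kStreams];
        float *buffers[kStreams];

        for (int s = 0; s < kStreams; ++s) {
            cudaStreamCreate(&streams[s]);
            cudaMalloc(&buffers[s], n * sizeof(float));
            // Each stream submits its own small kernel; the claim for GK110
            // is that up to 32 such connections can be serviced concurrently
            // instead of funneling through Fermi's single queue.
            smallKernel<<<(n + 255) / 256, 256, 0, streams[s]>>>(buffers[s], n);
        }
        cudaDeviceSynchronize();
        for (int s = 0; s < kStreams; ++s) {
            cudaFree(buffers[s]);
            cudaStreamDestroy(streams[s]);
        }
        return 0;
    }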
abhishek6893 - Thursday, May 17, 2012 - link
@Ryan Smith
Hello. Where is this NVIDIA GTC taking place? Can't NVIDIA conduct a GTC in India?
Please keep me more updated about NVIDIA's GTC. I am interested in these lectures.
Thanks in advance.
Abhishek Patil,
India
Ryan Smith - Thursday, May 17, 2012 - link
GTC (prime) takes place in San Jose, California, which is NVIDIA's home city. They do hold smaller international GTC events that are focused on training, but AFAIK those have only been held in Japan and China so far.
PEJUman - Thursday, May 17, 2012 - link
It's a bit unusual to see 15 SMXes instead of the usual 2^x count (16?).
Seems to me they are harvesting the GK110 chips as in the GTX 480 days, i.e. the chip is too big (relative to what the 28nm process can manage) to get enough yield for a complete 16 SMXes?
It's very funny to see a company as smart as them fall into the same pitfall twice (GTX 480 & now).
It made me wonder if Nvidia actually made a design decision to use this approach, i.e. they were planning to disable 1 or 2 SMXes right from the start; at which point, why didn't they come up with a 17-SMX design?
I assumed there is interest in 2^x SMX counts since they fit nicely with bus/buffer widths (32, 64, 128, etc.).
Kevin G - Thursday, May 17, 2012 - link
The die shot that is floating around appears to show only 15 SMX clusters instead of 16.
chizow - Thursday, May 17, 2012 - link
The early leaks of the GeForce GK110 part specified 2304 SPs, which at 192 SPs per SMX indicates only 12 of 15 SMX active. It could be that Nvidia is already factoring the harvesting and TDP targets into their realized yields for GK110 wafers.
We may never see a full 2880 SP Kepler, and that may have been Nvidia's intention when designing an odd-numbered 15-SMX chip.
Truth of the matter is, with so many SPs, losing a few SMX would easily be the lesser of the evils if it came down to cutting functional units (see GTX 670 and GTX 680).
RussianSensation - Thursday, May 17, 2012 - link
Even with 2304 SPs and a 384-bit memory bus, a GTX 780 will be very fast.
Right now the GTX 680 is faster than the HD 7970:
1536 SP vs. 2048 SP
192 GB/sec vs. 264 GB/sec bandwidth
32 ROP vs. 32 ROP
128 TMUs vs. 128 TMUs
Based on that, Kepler needs 25% fewer SPs and 27% less memory bandwidth to compete. Even if the HD 8000 series has 2560 SPs, a 2304 SP GTX 780 will still be plenty fast.
Of course, if it has the full-blown 2,880 SPs, it will be insanely fast.
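(Working those percentages out from the numbers above: (2048 - 1536) / 2048 = 25% fewer SPs, and (264 - 192) / 264 ≈ 27% less bandwidth.)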
clbench - Friday, May 18, 2012 - link
HD 7970 seems faster than GTX 680 in most compute benchmarks:
http://clbenchmark.com/result.jsp
CeriseCogburn - Saturday, May 19, 2012 - link
Too bad AMD has little to no compute software to actually put their card to use in compute - they may still "win" in those benchmarks, but nVidia has a huge and well-supported base of the same.
You gotta love that paper AMD phantom compute.
Ryan Smith - Thursday, May 17, 2012 - link
K20 may ship with disabled functional units - this is part of what NVIDIA needs to figure out as they finish bringing up the chip - but GK110 as presented is complete. There are no hidden units (okay, the PolyMorph Engines aren't on the diagram because of the audience); every functional unit is accounted for. So it's a design decision, specifically one that gives an equal number of SMXes (5) for each pair of memory controllers.
thebluephoenix - Thursday, May 17, 2012 - link
Can those FP64 shaders do FP32? Or when a GK110 GeForce comes out, will they just do nothing in games?
Also, isn't the better way to build a compute-efficient GPU to make all shaders FP64-capable at 1/1 or 1/2 of the FP32 rate, like Fermi or GCN?
Ananke - Thursday, May 17, 2012 - link
I would bet 1/1 or 1/2 is better, but we will see the outcome of the battle when Handbrake gets OpenCL acceleration on both NVidia and AMD hardware.
I think Radeons are better for heavy computations, but the sheer user base of CUDA is larger - it is just cheaper to replace Fermi with Kepler accelerators.
Ryan Smith - Thursday, May 17, 2012 - link
These are the same FP64 CUDA Cores that we saw on GK104. So they cannot do FP32. They can only be used for FP64 operations.
kilkennycat - Thursday, May 17, 2012 - link
I have been working closely with a professional developer of CUDA applications targeting real-time processing of broadcast-quality HD video signals (frame-rate conversion, re-sizing, slow-motion, de-interlacing, etc.) using multiple GTX 5xx cards operating in tandem (no need for SLI). None of the underlying algorithms require any double-precision computation; single-precision is more than adequate.
After much waiting, last week he finally managed to acquire a GTX 680 board and ran benchmarks for that board against a GTX 580. He found that the average GTX 680 conversion frame-rate on an extensive test suite dropped to less than 2/3 of that available with a single GTX 580, actually only a little better than the frame-rate available from a previously-tested 900MHz overclocked GTX 560 Ti. Further technical analysis pointed conclusively to the 256-bit memory interface of the GK104 as the performance killer.
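A quick way to sanity-check whether a workload like this is memory-bound is to time a large device-to-device copy and compare the achieved GB/s against what the kernels actually move per frame. This is only a rough sketch; the buffer size is arbitrary and the frame-rate converter itself is not shown.

    #include <cstdio>
    #include <cuda_runtime.h>

    int main()
    {
        const size_t bytes = 512ull << 20;      // 512 MB test buffer
        void *src, *dst;
        cudaMalloc(&src, bytes);
        cudaMalloc(&dst, bytes);

        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        cudaEventRecord(start);
        cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        // The copy reads and writes every byte, so count 2 * bytes moved.
        printf("Effective bandwidth: %.1f GB/s\n", 2.0 * bytes / (ms / 1000.0) / 1e9);

        cudaFree(src);
        cudaFree(dst);
        return 0;
    }

If the measured figure is close to what the conversion kernels already demand, extra shader throughput alone won't help; only more memory bandwidth will.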
krslavin - Sunday, May 20, 2012 - link
I have been doing the exact same thing! I ran tests on my frame-rate converter and the GTX 680 got only 70% of a GTX 580! I also thought it was the reduced cache bandwidth, because there are only 8 SMs instead of 16. I also do not use double-precision FP, but it seems that specmanship and marketing win out over common sense ;^(
g101 - Saturday, July 14, 2012 - link
Indeed it does...
Ursus.se - Saturday, May 19, 2012 - link
From what I can understand, this unit fits with remarks made to me from inside Apple about a genuinely up-to-date new Mac Pro for FCPX users... Let's hope...
Ursus
g101 - Saturday, July 14, 2012 - link
You can also shut the fuck up.
Professionals don't use "FCPX!!!". Someone needs to grab you by the neck and shake your obviously dormant brain into at least a state of partial activity.
garyab33 - Tuesday, May 22, 2012 - link
I really hope two of these cards in SLI will finally be able to run Metro 2033 at 1920x1080 with everything maxed out (DoF & tessellation on) and Crysis 1 (16x AA and 16x AF), with everything also set to highest in the NV control panel, at a stable 100-120 fps. I don't care about the heat or the power consumption of GK110 as long as it produces 60+ fps in the most demanding games, and above 100 fps in SLI. That's how games are supposed to be played. The GTX 680 is a joke; it is only slightly better than the GTX 580 in Metro 2033. It wasn't worth the wait or the price.
CeriseCogburn - Wednesday, May 23, 2012 - link
Like I said. Even after 680s are present and sold, people like you say they can't be built.
I do wonder if you're wearing your shoes after you attempt to tie the shoelaces.
No looking down before you answer.
g101 - Saturday, July 14, 2012 - link
Shut the fuck up, dumbass.
You never have anything worthwhile to say; it's always some unintelligible pro-nvidia rubbish that doesn't even address the previous comments.
There's no question in our minds; we know nvidia can make a hobbled bit of rubbish and sell it to little dipshits who spend all their time commenting about their favorite computer hardware brand.
Every. Single. Article.