28 Comments
at80eighty - Monday, November 17, 2014 - link
Any clue what kind of analytics platforms are used for the above benchmarks?
at80eighty - Monday, November 17, 2014 - link
To clarify: I'd imagine most big data platforms are running on Xeons, not consumer-class CPUs.
Shadowmaster625 - Monday, November 17, 2014 - link
I guess it should be called M80 instead of K80. But I can see why they might not have wanted to go with M-80...
JimRamK - Monday, November 17, 2014 - link
It isn't a Maxwell GPU though. It's still Kepler.
MANOL VOJKA - Monday, January 26, 2015 - link
Right !
nevertell - Monday, November 17, 2014 - link
Had they gone with M80, they'd have the whole of Northern England as eager customers just waiting to get their hands on something to r8.
Kevin G - Monday, November 17, 2014 - link
Typo in the 6th paragraph:
"Whereas Tesla K80 and K20X were 235W cards, Tesla K80 is a 300W card."
Memo.Ray - Monday, November 17, 2014 - link
"This puts the total memory pool between the two GPUs at 12GB, with 480GB/sec of bandwidth among them."
12GB/24GB?
nevertell - Monday, November 17, 2014 - link
Each GPU has access to 12GB of memory; the memory is not really shared.
Whereas in games both GPUs would have essentially the same assets stored in video memory, in GPGPU applications each GPU could be more of a "standalone accelerator card", operating on its own dataset.
Memo.Ray - Monday, November 17, 2014 - link
Correct, memory is not shared, but the article says "total memory pool" here.
Ryan Smith - Monday, November 17, 2014 - link
Whoops. Thanks.
Kevin G - Monday, November 17, 2014 - link
I'm really curious what the die size is on GK210. The GK110B is 550 mm^2 already. The larger register and L1 caches are just going to push it close to 600 mm^2.
I also think we'll see a Tesla K30 and K50 at some point using binned versions of these chips. The K50 would be fully enabled at higher clock speeds than the K40, with the K30 carrying a configuration and clock speed in between the K40 and K20X. I'd also expect a Quadro refresh to start utilizing this new chip as well (K6200).
This also means that the big Maxwell chip, GM110/210, is clearly going to be a 2015 product. GK210 is just a stopgap solution until they're ready.
Mushkins - Monday, November 17, 2014 - link
I feel like it's about time the industry came up with a more accurate term for these cards. GPU is a little disingenuous, as they're not being used to process graphics at all in this implementation; it's all compute performance for data crunching, regardless of what the chip was originally designed to do.
TiGr1982 - Monday, November 17, 2014 - link
Two suggestions:
1) Use GPGPU instead
2) Understand GPU as "General Processing Unit" in this context (though it's still confusing; probably not a good idea)
So, this is still a Kepler-based accelerator, even at a $$$$ (which it will certainly have) or even $$$$$ price. That simply means there is no "big Maxwell" (aka GM200) coming in the next few months, at least, because in the past (2 years ago), if I'm not mistaken, the first implementation of "big Kepler" (aka GK110) happened to be the Tesla K20X, shipped for the "Titan" supercomputer (still 2nd in the Top500; not to be confused with the "Titan" graphics cards).
RussianSensation - Monday, November 17, 2014 - link
There appears to be an inconsistency in the specs for K40. The K80 part is listed as more than 2x as fast in SP and DP performance over K40. The Boost Clock is used for K80 to arrive at its performance data, but for K40 the Base/Core Clock is used. It should really be:
K40 @ 810 MHz = 4.67 TFLOPS SP, 1.56 TFLOPS FP64 / DP
K40 @ 875 MHz = 5.04 TFLOPS SP, 1.68 TFLOPS FP64 / DP
Otherwise, it seems only fair to use the Core Clock of the K80 too instead of its Boost Clock.
Interesting how K80 isn't a full-fledged Titan Z Tesla derivative.
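As a rough check on the K40 figures in this comment, theoretical peak throughput can be estimated as cores x FLOPs per core per clock x clock speed. The sketch below assumes 2880 CUDA cores for K40, 2 single-precision FLOPs per core per clock (one fused multiply-add), and a 1/3 double-precision rate; none of these values is stated in the comment itself, and `peak_tflops` is a hypothetical helper name.

```python
def peak_tflops(cuda_cores, clock_mhz, flops_per_core_per_clock=2, dp_ratio=1 / 3):
    """Return (SP, DP) theoretical peak throughput in TFLOPS.

    Assumes every core retires one FMA (2 FLOPs) per clock and that
    double precision runs at dp_ratio of the single-precision rate.
    """
    sp = cuda_cores * flops_per_core_per_clock * clock_mhz * 1e6 / 1e12
    dp = sp * dp_ratio
    return sp, dp


K40_CORES = 2880  # assumption: core count of a fully enabled K40

# The two K40 clock speeds quoted in the comment.
for clock_mhz in (810, 875):
    sp, dp = peak_tflops(K40_CORES, clock_mhz)
    print(f"K40 @ {clock_mhz} MHz: {sp:.2f} TFLOPS SP, {dp:.2f} TFLOPS DP")
```

Under those assumptions the helper gives 4.67/1.56 TFLOPS at 810 MHz and 5.04/1.68 TFLOPS at 875 MHz, matching the numbers in the comment. The same function can be applied to any other card by passing its enabled core count and whichever clock is being compared.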
darckhart - Monday, November 17, 2014 - link
Something's not right with the tables. In one, the K80 is listed as 2x 2496, and the paragraph says based on GK210. In another table, the GK210 is listed with 2880 cores. Also, I am assuming it's still 1/3 double precision?
Ryan Smith - Monday, November 17, 2014 - link
The tables are correct. The GPUs used in K80 do not ship with all CUDA cores enabled, likely for power reasons.
CiccioB - Tuesday, November 18, 2014 - link
I doubt that. It is more efficient to use more silicon at a lower frequency to get the same performance (yet it costs more to do that).
I think the disabled cores are for yield. I doubt that a monster of more than 7 billion transistors that occupies about 600mm^2 has a high yield. They are just using the best they can (and making the user pay for that).
Seeing that GK110 has been replaced by GM204 on gaming cards and by this GK210 on Tesla, I think NVIDIA has stopped making the GK110 completely. It is possible that it was not that profitable (mainly on gaming cards). If this is true, we'll soon see a new Quadro using this chip as well.
SirKnobsworth - Monday, November 17, 2014 - link
Passive cooling? I'm guessing this is intended for a server chassis with strong directional airflow?
Ryan Smith - Monday, November 17, 2014 - link
Correct.
seankurth - Tuesday, November 18, 2014 - link
So, how does this thing actually work? I don't see a VGA port on the back. Is it just completely distributed, processing only, with the only working monitor port card being on the supercomputer's terminal workstation?
vred - Tuesday, November 18, 2014 - link
It's for compute only. Same as every other Tesla card.
just4U - Tuesday, November 18, 2014 - link
Hi Ryan, might want to change the spelling of Nvidia on that last paragraph before the word police get wind of it or worse... a fanboy! (chuckle) :)
JlHADJOE - Wednesday, November 19, 2014 - link
Is the price real? That's a pretty huge drop!
Dual GK210 cheaper than single GK110b.
RKREDDY - Saturday, November 22, 2014 - link
For Tesla K80, NVIDIA has produced a new GPU, GK210, and GK210 has 2880 Stream Processors.
Tesla K80 has two GK210s.
How come Tesla K80 has 2 x 2496 Stream Processors?
Can somebody clarify?
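The two core counts in this question can be reconciled with simple arithmetic. The sketch below assumes 192 CUDA cores per Kepler SMX, so a full GK210 at 2880 cores works out to 15 SMX units, and it assumes two of those units are disabled on each K80 GPU. The constants and the helper name `active_cores` are illustrative assumptions, not values taken from a spec sheet.

```python
CORES_PER_SMX = 192    # assumption: CUDA cores per Kepler SMX
GK210_TOTAL_SMX = 15   # assumption: 2880 cores / 192 cores per SMX
K80_DISABLED_SMX = 2   # assumption: SMX units disabled on each K80 GPU


def active_cores(enabled_smx, cores_per_smx=CORES_PER_SMX):
    """CUDA cores available when only enabled_smx units are switched on."""
    return enabled_smx * cores_per_smx


full_gk210 = active_cores(GK210_TOTAL_SMX)                    # fully enabled die
k80_per_gpu = active_cores(GK210_TOTAL_SMX - K80_DISABLED_SMX)  # one K80 GPU
k80_total = 2 * k80_per_gpu                                    # both GPUs on the card

print(f"Full GK210: {full_gk210} cores")
print(f"K80: 2 x {k80_per_gpu} = {k80_total} cores")
```

With these assumptions a full die has 2880 cores and each K80 GPU has 2496, so both numbers in the tables describe the same chip in different configurations.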
TheinsanegamerN - Wednesday, November 26, 2014 - link
It says right in the article that two SMX units are disabled, so 2496 is the number of active cores.
Abhijit - Tuesday, June 27, 2017 - link
My deviceproperty.cu reports that shared memory is 48 KB/SM and register memory is around 65 KB/SM for each GK210, which does not match the data given above. Why?