He could not even show a game title that uses raytracing... As I said, in 2020 it will be a thing. This is the first generation with no content. People will buy it because of clever Nvidia marketing, but in the end, for gaming, it is a marketing ploy. Gamers only need to care about RT in two years or more.
I know SIGGRAPH is a pro event, but there are a ton of game developers attending SIGGRAPH. So if he had shown some AAA game title with raytracing it would have resonated well... but nothing.
How could he show a game that uses ray-tracing (i.e. real time ray-tracing during gameplay, not prerendered material for cinematic cuts, intros etc, that has long been a thing) when these are the first video cards able to support (partial) real time ray-tracing?
Ray-tracing support was added to the Vulkan and DirectX 12 APIs only months ago, and the largest game engines are still in the process of implementing ray-tracing. Game studios and developers need all three: fast hardware, API support, and game engine support to start implementing real time ray-tracing, initially in a "hybrid" fashion, of course, with the majority of the material still being rasterized.
It looks like a chicken and egg scenario, but the true roadblock has been the lack of fast hardware. After Nvidia, at first, announced ray tracing acceleration in their next-gen graphics cards (incl. their consumer ones), it was the turn of the gaming APIs to add ray-tracing support. And, finally, it was time for the big game engines to get to work on it.
There is no such game yet, only a few demos, but I predict that in early 2019, mid 2019 tops, the first crop of AAA games with partial ray tracing support will be introduced. Some top level games might be able to be patched with ray tracing support as well.
Pretty much any new tech for game engines runs poorly on the first hardware. Tessellation wasn't that fast on the first GPUs that could do it, and DX10/11/12 were all not that fast on first-gen GPUs either. I think this will be no different.
DX9 ran like stink on the Radeon 9700 Pro and DX12 showed significant gains on AMD's existing GCN hardware, so that comment only holds if you're limiting it to Nvidia chips. It's a trivial observation to say that second-gen performs better - it'd be an automatic failure if it didn't ;)
The new Metro title has it and demonstrates an advantage for Titan V. FF15 also demonstrates a mild lead for Titan V vs 1080 ti which could possibly be related to the ray-tracing or another architectural advancement. They also showed off a live render Star Wars animation a while back. The hardware exists already for the higher end segments of the market and the software is coming along faster than it really should when considering that there's still no consumer market for the stuff, at least until nVidia stops holding out on us.
Basically with Turing, what we know of as a GPU is changing quite a bit.
Downside is so much money and die area spent without increasing performance in existing titles and engines. I'm not sure I want to pay double without seeing double performance in my existing titles, but today is marking the beginning of the transition to ray-tracing it seems.
These say Quadro, not GeForce. They're meant for professionals. I wouldn't be surprised if they released the GeForce variants in a week or two with less VRAM, disabled tensor cores, no ECC (not sure if the Quadros have it, though), much lower double precision (these might support it at 1/2 or 1/3 rates), and perhaps fewer RT cores. Based on the die size of this monster, it most likely does have a lot of double precision hardware. In which case I'd expect Nvidia to keep this massive 754 mm^2 die for Quadros and Teslas (and perhaps one Titan) and make a new die for GeForce that should be a fair bit smaller.
You seem to think these prices are what the geforce cards will be. Just looking at the past these prices for quadros aren't completely out of the ordinary.
You didn't understand my comment. I meant if nVidia has a choice of including Tensor and RT cores or not, adding them won't improve performance with current titles so it's a lot of extra money without payoff. I was hoping to see Quadro cards without RT capabilities announced. Memory and standard CUDA cores only.
We can expect the Geforce cards to be almost identical but 4-6 times cheaper, just like with P5000 and P6000 vs GTX 1080 and Titan XP.
You said "Downside is so much money and die area spent without increasing performance in existing titles and engines. I'm not sure I want to pay double without seeing double performance in my existing titles, but today is marking the beginning of the transition to ray-tracing it seems."
This implies to me that you're seeing these as gaming cards. Especially around parts like "I'm not sure I want to pay double without seeing double performance in my existing titles."
The RT stuff is meant for workstations. That's the whole point. If you want stuff without it... then you've got yourself a GeForce card, lol. Or maybe even the regular GeForce cards will have RT hardware on them, though if they do I'd expect there to be less of it.
That's historically true, but maybe we'll see a shift here?
Either way, I agree. That's money the company spent on areas that aren't gaming, but the GPU market as a whole is growing in several directions, so I can understand their choices. They have to appeal to as many clients as possible, which means any one market segment likely won't get full resource utilization.
Yes, but you are talking about these GPUs being multiple times more expensive than current GeForce GPUs for much less than a 2x gain in performance in games. Yes, that's true, but the GeForce cards will be released at similar price points to current GPUs, making the "too expensive to justify the extra gaming performance" point invalid. These prices are for Quadros, not GeForce cards.
They are getting fewer transistors per unit area than with Volta, so I imagine it's on the 12 FFN process (a quick back-of-the-envelope check follows this comment). The density increase of 7 nm should dominate any architectural changes that would reduce transistor density.
I'm surprised they are coming out with a generation of 12 FFN GPUs now when it seems 7 nm should be ready by mid 2019.
Judging by Volta, Turing probably gets its performance advantage over Pascal mostly by being more energy efficient. So a larger die size (and/or a higher clock) is necessary to obtain a performance gain.
Interestingly, trying to estimate by the number of CUDA cores (assuming the top end Quadro part isn't using a significantly cut down chip), it seems like there are FP64 units on this GPU, unless the RT cores and other architectural improvements over Volta are taking up a whole lot of transistors.
Sales to hyperscalers should take precedence over sales for proviz, so the fact that these Quadros will be available soon seems to suggest that a V100 replacement isn't forthcoming based on this GPU, unless NVIDIA has been selling such a card to hyperscalers in stealth mode. Of course, backing this up is the fact that this is a GDDR6 part with lower memory bandwidth than the GV100, and the fact that other than the greater precision options in the tensor cores, this GPU doesn't offer much of an advantage over the V100 for data center workloads. Well, it's interesting then that it SEEMS to have double precision. Maybe it doesn't, and those RT cores really do take up a large number of transistors.
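A quick back-of-the-envelope check of the density point above, using the die size from the announcement and commonly cited transistor counts (treat the exact figures as assumptions rather than confirmed specs):

```python
# Rough transistor-density comparison; all figures approximate/assumed.
turing_transistors, turing_area = 18.6e9, 754   # top Quadro RTX die, mm^2
volta_transistors, volta_area = 21.1e9, 815     # GV100 (Quadro GV100 / Tesla V100), mm^2

print(f"Turing: {turing_transistors / turing_area / 1e6:.1f} MTr/mm^2")  # ~24.7
print(f"Volta : {volta_transistors / volta_area / 1e6:.1f} MTr/mm^2")    # ~25.9
```

Slightly lower density than Volta's 12 FFN die, which is consistent with a 12/16 nm-class process rather than the jump a 7 nm shrink would bring.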
You've answered your own question. Launching on 12nm now gets them a new product now, a year after their previous refresh, and with major feature enhancements on the pro-side of the house an entire year sooner than if they waited for 7 nm. If the rumor mill is to be trusted, they did delay this generation a few months to empty out their retail channel of last gen parts after the crypto bubble popped; but going an entire year without a product refresh when they've got one would be insane.
I believe it costs a lot to prepare a design for high volume manufacturing. They aren't sitting on such a thing if they never make the preparations to begin with. Right now they have little competition, and they won't have competition until 2019. Even though Pascal has been out over two years, gamer market share uptake of Pascal GPUs has been low due to cryptocurrency. Those are reasons why NVIDIA could get away with not releasing a new generation until 2019.
In 2019, however, NVIDIA will have competition and it will be on 7 nm. It's not cheap to do a die shrink (or to come out with a new generation) so if 12 FFN Turing is only at the top of the market for a year that will eat into their margins. Those are reasons why NVIDIA would want to wait until 2019 to release their new generation on 7 nm.
Possibly NVIDIA believe that people won't be willing to purchase 2+ year old GPUs even though those GPUs are significantly faster than what the people currently have and there isn't anything better available than those 2+ yo GPUs on the market. Another possibility is that NVIDIA want to get the RTX hardware into the market as soon as possible in order to push the technology forward for future margins, giving up margins in the near term. The greater the percentage of RTX-capable cards in the market and the sooner they are released the more and sooner developers will design and build games using these technologies, and so the greater the demand for GPUs with these technologies will be in the future.
I think Nvidia is more concerned about keeping concept leadership for GPUs.
Essentially the GPU is somewhere between a commodity (like RAM) and proprietary (like a CPU), and AMD is just uncomfortably close, with Intel soon threatening to do something similar.
So Nvidia cannot wait to have the next process size ready. They need to push a product to establish their "brand mixed bag of functionalities hitherto known as premium GPU", or risk losing that to someone else. AMD beefed up the compute part of their design because they understood the war will no longer be fought in games alone--perhaps even too much to cut it in games--even if the miners appreciated it for a while.
But that also highlights the risk: If you get the mixture wrong, if some specialty market like VR doesn't develop the way you anticipated it, you have a dud or lower margin product.
So they are seeding ray-tracing and putting stakes into the ground to claim the future GPU field. And you could argue that cheaper production of almost realistic animations could be as big or bigger than blockchain mining.
I am convinced they are thinking several generations ahead and it's becoming ever more challenging to find white spots on the map of compute that will last long enough for ROI.
NVIDIA's gross margins are better than Intel's now. There's nothing commodity-like about GPUs. Maybe what you are referring to is that Intel and AMD control the x86 instruction set, whereas from a gaming point of view GPUs operate through APIs. But still, NVIDIA achieve great margins on their gaming products because of superior research and development, which is the antithesis of a commodity market.
As far as finding "white spots" in the market overall, there is a lot more room for innovation in GPUs than in CPUs. That is why CPUs have had very little increase in performance or abilities over the past 10 years while GPU performance and abilities have been bounding upwards. That trend will continue, there are plenty of legs left in GPU architectural innovation.
FP and Int hardware are fully decoupled, which explains some of the differences. Turing also supports various data types specific to ray tracing, so I'd expect transistors are spent towards that as well.
There should be some FP64 hardware, but probably at 1/32 the single precision rate (the same penalty GP102 had); a rough sketch of what that rate would mean follows this comment. Volta is still an HPC-focused part for that workload. Additionally, Volta had six NVLink ports, though they were only exposed on the SXM2 mezzanine cards.
The change in transistor density likely stems from GDDR6 where space on the die has to be allocated for off package IO. HBM pads are far smaller.
One thing missing was any mention of HDMI 2.1 support. Not surprising as this is a professional card where DP is standard but worthy of an architectural note. Perhaps when this generation reaches consumers?
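To put the 1/32-rate FP64 guess above into numbers, here is a rough sketch; the FP32 figures are assumptions, rounded from the announcement and the Quadro GV100 spec sheet:

```python
# Hypothetical FP64 throughput if Turing ships token 1/32-rate units (like GP102),
# versus Volta's half-rate FP64. FP32 figures are approximate.
turing_fp32 = 16.0   # TFLOPS, roughly the quoted top Quadro RTX figure
volta_fp32 = 14.8    # TFLOPS, Quadro GV100

print(f"Turing FP64 at 1/32 rate: {turing_fp32 / 32:.1f} TFLOPS")  # ~0.5, compatibility only
print(f"Volta FP64 at 1/2 rate  : {volta_fp32 / 2:.1f} TFLOPS")    # ~7.4, real HPC throughput
```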
When I talk about whether it has FP64 units or not, I mean a number of them useful for FP64 computation. Of course it will maintain at least one per SM for compatibility.
I'm not sure what you are arguing as far as transistor areal density and as far as transistors per CUDA core. They were two separate points with two separate conclusions. One is that the chips are 16 FFN and not 7 nm, based on transistor density. I don't see different types of memory controllers or more or fewer NVLink ports (you don't know how many are on the Turing die, anyway) dominating a shrink from 16 FFN to 7 nm.
As far as the transistors per CUDA core and FP64, the FP and integer were already decoupled on Volta. It seems that major work went into the SM for Volta and Turing changes the SM very little from Volta. Since the transistors per unit area comparison I did was between the Quadro RTX 8000 and the Volta-based Quadro GV100, the independent integer pipeline and other SM changes cannot account for much change in the number of transistors per FP16 CUDA core. The encode/decode block has been updated, the memory controller is different, and there are potentially fewer NVLink ports, but I doubt these things would balloon the number of transistors the way full FP64 support would (besides, fewer NVLink ports would reduce the number of transistors, not increase it). FP64 support accounts for a fair chunk of transistors. My guess is the GPUs these new Quadro RTXs are based on don't have FP64 units (except for compatibility) and the RT cores take a significant number of transistors.
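For what it's worth, the same kind of rough arithmetic on a per-CUDA-core basis, again treating the transistor and core counts as assumed, commonly cited figures:

```python
# Approximate transistors per CUDA core (all figures assumed for illustration).
turing_tr, turing_cores = 18.6e9, 4608   # top Quadro RTX die
volta_tr, volta_cores = 21.1e9, 5120     # Quadro GV100

print(f"Turing: {turing_tr / turing_cores / 1e6:.2f}M transistors per CUDA core")  # ~4.04M
print(f"Volta : {volta_tr / volta_cores / 1e6:.2f}M transistors per CUDA core")    # ~4.12M
```

The two come out in the same ballpark even though GV100 carries full-rate FP64, which is at least consistent with the guess above: FP64 units out, RT cores and other additions in.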
Err, sorry for all the typos. Also I should have written "Since the transistors per CUDA core calculation..." in the third paragraph.
One last thing, DP is not standard in proviz cards. Other than the Quadro GV100, NVIDIA have not had DP in their proviz cards since Kepler. That means the top of the line (on their introductions) M6000 and P6000 only have compatibility support for FP64 instructions.
I think you misunderstood me. DP as in the context of DisplayPort, not double precision.
I think it is safe to say that we agree that Turing isn't 7 nm. However, with the decrease in transistor density from Volta to Turing, it could have been either 12 nm or 16 nm. The density differences between them aren't that much to start with, but there are also some changes in the design that also affect transistor density, mainly IO pads. Hence the mention.
I agree. We don't have the resolution to see whether it is 12 FFN or 16. But I figure that if NVIDIA can order a run of a custom node for a relatively small volume like what's necessary for the V100, then they can order it for a much higher volume run like for TU104 (GT already stands for the Tesla architecture, so the rumored TU prefix makes sense). There is just the matter of cost/benefit. The design parameters of the two GPUs are rather similar, I assume. Even though they charge a lot more for the GV100, I doubt they spent a lot of money just to push it a little bit more. So I am guessing the node is still advantageous for TU104. But, who knows? I just put 12 FFN because I didn't want to keep typing 12/16 and I thought 12 was most likely.
> this GPU doesn't offer much of an advantage over the V100 for data center workloads.
One word: inferencing. INT8 is pretty much equivalent to FP16 for this, so 250 INT8 TOPS is a doubling of what you could do with FP16-based inferencing. And if you can make use of INT4 without at least doubling your model size, even better.
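Setting the accuracy question aside, the throughput scaling being referred to is roughly the following, using the tensor figures from the announcement as assumptions:

```python
# Advertised tensor throughput scaling on the top Quadro RTX part (approximate).
fp16_tensor_tflops = 125       # ~FP16 tensor TFLOPS
print(fp16_tensor_tflops * 2)  # ~250 TOPS at INT8
print(fp16_tensor_tflops * 4)  # ~500 TOPS at INT4
```

Whether a given network can actually drop to INT8 or INT4 without losing accuracy is a separate question, as the reply below points out.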
No, INT8 is not equivalent to FP16. That is only true for some networks. The greater precision flexibility of the tensor cores is a very small difference for an entirely new data center generation. I doubt companies are going to go through the expensive process of product verification just for that.
NVIDIA might come out with data center parts based on this GPU, such as an inference-oriented card to replace the P40 or P4, but I don't think they will be intended to succeed the Tesla V100.
> NVIDIA might come out with data center parts based on this GPU, such as an inference-oriented card to replace the P40 or P4
Yes, that's what I was saying.
I was responding to the statement I quoted. I didn't mean an overall advantage - I was just citing a specific and significant area where it *did* have an advantage.
You might want to confirm the definition of "variable rate shading" because I don't think it has to do with faster throughput using lower precision data types. Microsoft has a patent for it: http://www.freepatentsonline.com/y2018/0047203.htm...
It sounds like it lets you render to a render target at different granularities (e.g. maybe per-pixel for some parts, per-sample for others and in 2x2 pixel granularities for others) in a single draw call. Lets you reduce pixel costs for lower frequency/less important parts of the output image (like the edges of the screen or for foveated rendering for VR)
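A toy model of that idea (the tile scheme and all numbers here are made up purely for illustration; this is not how any particular API exposes it):

```python
# Toy variable rate shading model: the screen is split into tiles and each tile
# gets a granularity n, meaning one shading result is reused for an n x n block
# of pixels. Coarser tiles => fewer pixel-shader invocations.
WIDTH, HEIGHT, TILE = 2560, 1440, 16  # 1440p screen, 16x16 pixel tiles

def invocations(rate_map):
    """rate_map[(tx, ty)] -> granularity n for that tile (default 1 = per pixel)."""
    total = 0
    for ty in range(HEIGHT // TILE):
        for tx in range(WIDTH // TILE):
            n = rate_map.get((tx, ty), 1)
            total += (TILE // n) ** 2
    return total

# Hypothetical foveated/VR-style case: shade the left and right quarters of the
# screen at 2x2 granularity, keep the centre at full rate.
cols, rows = WIDTH // TILE, HEIGHT // TILE
coarse = {(tx, ty): 2 for ty in range(rows) for tx in range(cols)
          if tx < cols // 4 or tx >= 3 * cols // 4}

print(invocations({}), invocations(coarse))  # ~3.69M vs ~2.30M invocations
```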
Display Stream Compression is an optional feature of the DisplayPort 1.4 spec, not a normative requirement, and support likely requires the addition of fixed function blocks to the display controller.
AFAIK, none of the current GPU architectures from either NVIDIA or AMD support DSC. Everyone assumes that they do because the outputs are listed as DP 1.4 in the spec sheets. However, 8K at 60 Hz is only supported for dual-cable displays, and I can find zero documentation to suggest that any of these GPUs are DSC capable.
> I am assuming based on this that we are looking at two NVLinks per board – similar to the Quadro GV100 – however I’m waiting on confirmation of that given NVIDIA’s 100GB/sec transfer number.
Yes, it must be 2. NVLink is bidir, and Nvidia always quotes the bidir numbers. NVLink 2.0 is good for 25 GB/sec per link per direction. So, there's your 100 GB/sec.
AMD could accomplish the same throughput with roughly x25 lanes of PCIe 4.0 connectivity between a pair of Vega 20 dies. Of course, it's a bit silly to even mention AMD, given how far Turing outclasses even Vega 20.
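The bandwidth arithmetic behind both of those statements, spelled out (the PCIe 4.0 per-lane rate is approximate, after 128b/130b encoding overhead):

```python
# NVLink 2.0: 25 GB/s per link, per direction; NVIDIA quotes bidirectional totals.
links = 2
print(links * 25 * 2)                  # 100 GB/s bidirectional

# PCIe 4.0: 16 GT/s with 128b/130b encoding ~= 1.97 GB/s per lane, per direction.
pcie4_per_lane = 16 * 128 / 130 / 8    # GB/s
print(round(25 * pcie4_per_lane * 2))  # ~98 GB/s bidirectional over 25 lanes
```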
> NVIDIA is touting 500 trillion tensor operations per second (500T TOPs).
No, just 500 trillion ops per second.
> it's so large that NVIDIA may intend to keep it for their more profitable Quadro and Tesla GPUs. In which case this specific GPU won't be what eventually hits consumer cards.
I foresee them holding back the tensor and RT cores, or at least drastically reducing their numbers. I think tensor cores, at least, are part of their market segmentation strategy, in order to keep people from using GTX models for serious deep learning or inferencing. Perhaps they have a similar strategy with reserving the full RT core-count for their Quadro GPUs.
Also, why no Quadro GP100 in the comparison table? That would be more interesting than M6000.
I don't see NVIDIA holding back tensor or RT cores from their 1080 class GPUs. What they will hold back compared to various higher-priced parts are memory capacity, memory bandwidth, NVLink, FP64, ECC, and proviz certification. Each of those things is important to proviz, HPC, or AI applications, and I think at least one of them is critical to a good portion of applications in those areas. Plus NVIDIA have been putting pressure directly on their channel partners not to sell GeForce cards to data center customers.
NVIDIA probably see raytracing and AI as ways they can differentiate their GPUs compared to the competition. That means not holding back the AI or raytracing capabilities. They need to give developers an amount of compute that makes the methods effective and worth pursuing.
You've got to cut die size, somehow. I don't believe RT-equipped GPUs are going to target HPC, in which case these wouldn't have significant FP64 you could remove. So, if not FP64, then what? Just tensor cores? Maybe. But why not go on to extend that same strategy to RT cores?
Well from the rumors, RT cores will only be on 2080 and 2070 GPUs. I think the die sizes will be large and the prices will reflect that.
But if/when they do cut die sizes, they will cut them with a mind toward optimizing for the intended usage of the GPU, not toward protecting higher-priced parts. Gaming is still the biggest revenue and profit driver for NVIDIA and is growing very quickly; NVIDIA are going to take it seriously. And I think they have a good enough way of protecting those higher-priced parts through the methods I listed previously that they don't have to jeopardize the optimal configuration for RTX gaming in order to do it.
Those methods are: memory capacity (large neural networks and serious rendering require large onboard memory capacity), memory bandwidth (compute applications require more memory bandwidth than graphics applications, though I don't know how RT cores play into that yet), ECC (not important for machine learning, but proviz and HPC parts have it), professional software certification and support (professional users are often on production schedules and lose a lot more than the cost of a graphics card if things go wrong and they can't get back on track quickly; maybe there are other reasons to pay for it, too, but I don't know too much about it), and NVLink (to train larger models and render larger scenes, this helps to scale up and pool memory resources in a way not possible without it).
Basically, one could ask why anyone is spending over $4,000 on a Quadro P6000 when a GeForce GTX 1080 Ti costs only $650, or why anyone is spending over $8,000 on a Tesla V100 when a Titan V costs $3,000. I believe the answers to those questions are in my list above, and they also answer how NVIDIA can maintain differentiation between product segments in the Turing generation without choosing suboptimal numbers of tensor cores or RT cores for gaming.
> Basically, one could ask why is anyone spending over $4,000 on a Quadro P6000 when a GeForce GTX 1080 Ti costs only $650?
Driver optimizations for workstation apps. Also, the card is certified for apps where a single license probably costs more than the workstation. So, why risk using an unsupported configuration and then not being able to get support, if you have an issue?
But, I think many Quadro customers don't pay list price. Big OEMs surely get a discount, as do big customers.
> Or why is anyone spending over $8,000 on a Tesla V100 when a Titan V costs $3,000?
NVLink, 32 GB of RAM instead of 12 GB, faster memory (due to a 4096-bit vs. 3072-bit data bus), ECC memory, plus the general points I just mentioned.
> I believe the answers to those questions are in my list above
But this whole thing is a digression. It's answering the wrong question. The right question isn't how Nvidia can charge so much for Quadro, but rather how are they going to sell these enormous dies at prices gamers can afford.
No, not driver optimization for workstation apps. You can get that without spending $3000 for it. As for your second point there, exactly. They may get a discount, but not from $4,000 down to $650, so that is irrelevant. And if they order a volume of 1080 Tis they will get a discount as well.
As for Tesla, again, exactly.
" But this whole thing is a digression. It's answering the wrong question. The right question isn't how Nvidia can charge so much for Quadro, but rather how are they going to sell these enormous dies at prices gamers can afford."
No. I was specifically responding to the following that you had written: "I foresee them holding back the tensor and RT cores, or at least drastically reducing their numbers. I think tensor cores, at least, are part of their market segmentation strategy, in order to keep people from using GTX models for serious deep learning or inferencing. Perhaps they have a similar strategy with reserving the full RT core-count for their Quadro GPUs."
They do not need to use tensor cores or RT cores as market segmentation strategies between GeForce cards and Quadro/Tesla cards. They do, however, need tensor cores and RT cores to attack the gaming market, their largest and most lucrative market, and still a rapidly growing one. How they can afford to include those technologies on gaming dies is beside the point, because it was not what my post was in response to.
Well, Pascal was an exception, with Nvidia enabling Quadro driver optimizations for Titan X (Pascal), after AMD enabled their workstation driver optimizations for Vega Frontier.
> How they can afford to include those technologies on gaming dies is beside the point
Okay, maybe you're not concerned with that, but I am.
It doesn't matter if Pascal was an exception. Pascal demonstrated that there's no problem there. They didn't seem to lose margins by doing it.
It's not that I'm not concerned with how they can afford to put the technologies on gaming dies; it's that that wasn't the thread of this particular discussion. My point is that if they can afford to do it, I believe they will include the optimum number of RT cores and Tensor Cores to target the gaming market rather than holding back in order to ensure product differentiation. But it is predicated on two things: 1) that there is functional utility to putting these cores on lower performing parts; if the algorithms don't really work because there simply isn't enough horsepower then they won't go out of their way to put the cores on the parts (it may still be cheaper to keep them there, rather than designing a new Turing-like GPU without them), and 2) that they can hit their price points with reasonable margins when including the cores on the lower-performing parts.
The answers to those two questions are interesting, but I can't claim to know anything about them.
Games that are only ray traced are not that great looking. Textures will continue to be used, and lighting/shadows will be off-loaded to the more efficient ray casting engine.
Ah, ray tracing, the new new VR. Just like VR, wake me when it's finally good enough to provide a high quality experience with plenty of software to support it, at a price that's attainable.
The ray trace engine will be for offloading the global illumination and shadowing to its more efficient engines, and games will continue to use textures.
"Ah, ray tracing, the new new VR. Just like VR wake me when it's finally good enough to provide a high quality experience with plenty of software to support it, at price that's attainable."
...because 2 years is such a long, long time for a fundamental shift in technology. Sheesh. Do you realize that the technologies the internet was based on were invented in the 1970s, the internet didn't reach consumers until the early 1990s, and Google, the origin of the word we use for generically searching for something on the internet, wasn't founded until 1998? We already had lots of Amazon.com ads on the radio by that point and Yahoo was a Wall Street darling. These things take time.
Given my video card accelerates my games, my video playback and encoding, 75% of what I use my main computer for, I should probably pay more attention to the GPU side of things than the CPU.
I'd love to see a more in-depth study of NVENC vs the various options. I've gotten very good quality using Staxrip; though it did slow transcoding speeds considerably, it's still faster than the Ryzen and i7 with good quality and file size.
I miss the days when GPU reviews actually talked about new 3D features for *games*, does anyone remember? New antialiasing techniques, shaders, all sorts. It's been a long time. Alas, too many have bought into the modern PR hype that the only thing which matters now is fps, something that's making it all worse, i.e. pressure on devs not to release a new game so complex that it would give "old style" fps rates in the 30 to 60 range on good hardware, as that would annoy the vocal users of gsync/freesync monitors, and tech reviewers would doubtless be critical because they too have gotten used to crazy-high frame rates which most people don't need. At this rate we'll never have another Crysis moment; no company would take the risk, no matter how glorious a new game looked or functioned. Also notice how review sites do not perform image quality comparisons anymore.
People will say there's still money in the gaming market, but it's not where the big money is, not when one can sell exactly the same process tech to industry, research, defense, etc. for 10x more. As TGOG put it, gamers currently get the scraps off the table. Turing looks like another hefty move in the same direction.
I just wish tech sites were more critical of all this. Start doing quality comparisons again, talk about how actual gameplay in many ways hasn't changed in years: there have been no new modelling or rendering methods for 3D gaming for ages, still no ability to correctly model fire, water (accumulation, flood, damage, erosion, temperature response, electrical behaviour, etc.), mud, lava, i.e. fluids in general. Imagine a Tomb Raider game where a storm, volcanic eruption or tornado actually did to the surrounding environment what it should do.
Look back to reviews of the 8800 GTX, that sort of thing, they were very different in focus back then. The modern obsession with fps is utterly boring, doubly so since as soon as one moves up to a significant resolution then the choice of CPU frequently becomes far less relevant, rendering articles about CPU performance rather moot (to coin a phrase). Instead, Intel launches a new CPU and the tech press goes mental about what fps it does at 1080p for game whatever; who cares??
Gaming needs to return to gameplay, with tech advances pushing the boundaries of what can be modelled with respect to immersive worlds, both visually and functionally. This is why, despite the less advanced looking visuals, my favourite game atm is Subnautica.
It's ironic because the card he references as the good old days, the 8800 GTX, is where NVIDIA introduced their first unified shader architecture, which was the last big technological change in GPU design for games until Turing and its raytracing capabilities. He just doesn't understand the fundamental shift that is potentially possible with real time raytracing, due to the time and cost savings in game development and the potential increase in visual quality from new algorithms.
Also, he doesn't understand that as an industry matures the easy problems get solved and further improvements take more work and ingenuity, so the pace is slower. Gamers aren't being given scraps, the pace of innovation is just bound to slow. NVIDIA is spending more money and manpower on gaming technologies development, both hardware and software, than ever before.
Another irony is that you just *know* someone in the future will be nostalgic for "the good 'ol days, like when Nvidia broke the mold with Turing and its groundbreaking RT cores".
Does it matter? They say it uses Volta's SM architecture, but it has notable improvements in the Tensor cores (not to mention the addition of the RT cores). So, I guess it has a similar relation to Volta as Pascal did to Maxwell.
I think the relation of the V100 to Turing is probably close to that of the P100 to the other Pascal members. It came first, has different features, different design parameters, and a different CUDA compute capability level (P100 was 6.0, while the others were 6.1).
"NVIDIA for their part has confirmed that the first Turing Quadro cards will run their GDDR6 at 14Gbps, which happens to be the fastest speed grade offered by all of the Big 3 members." According to a Samsung press release from a few days ago regarding the memory they provided to Nvidia, their GDDR6 supports up to 18 Gbps at the same voltage (1.35V), and optionally an overclocked mode up to 20 Gbps when slightly overvolted (I cannot recall by how much, probably in the 1.40 - 1.45V range).
The press release did not refer to a future version of their memory, but to their current memory. So my guess is that Nvidia either underclocked the memory to keep the TDP in check or, maybe, they asked for loosely binned, lower-clocking and thus cheaper GDDR6 chips from Samsung.
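For reference, the bandwidth math at the announced speed grade, assuming the top Turing Quadros use a full 384-bit bus (the bus width is an assumption here, not something confirmed in the announcement):

```python
# GDDR6 bandwidth = per-pin data rate (Gb/s) * bus width (bits) / 8 bits per byte.
bus_width_bits = 384            # assumed full bus on the top parts
print(14 * bus_width_bits / 8)  # 672.0 GB/s at the announced 14 Gbps
print(18 * bus_width_bits / 8)  # 864.0 GB/s if run at Samsung's 18 Gbps
```

Even at 18 Gbps, a 384-bit GDDR6 bus would still land below the roughly 870-900 GB/s of the HBM2-based GV100 boards, for what it's worth.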
Or they want to double/triple source and not be dependent on Samsung and their pricing. That's what Ryan implies by stating that this is the highest speed offered by all 3 manufacturers.
Small request: Could you change the colour you use when emphasising words? I keep trying to click anything in blue assuming that they are hyperlinks, but am disappointed when they are not
Gothmoth - Monday, August 13, 2018 - link
he could not even show a game title that uses raytracing..... as i said 2020 it will be a thing. this is the first generation with no content. people will buy it because of clever nvidia marketing but in the end for gaming it is a marketing plot. gamers need only to care about RT in two years +Gothmoth - Monday, August 13, 2018 - link
i know siggraph is a pro event. but there are a ton of game developers attending siggraph. so if he had to shon some AAA game title with raytracing it would have resonated well.... but nothing.Gothmoth - Monday, August 13, 2018 - link
still no editing typos at anands in 2018.... maybe next century.JoeyJoJo123 - Tuesday, August 14, 2018 - link
the technology's just not there yet.Santoval - Wednesday, August 15, 2018 - link
How could he show a game that uses ray-tracing (i.e. real time ray-tracing during gameplay, not prerendered material for cinematic cuts, intros etc, that has long been a thing) when these are the first video cards able to support (partial) real time ray-tracing?Ray-tracing support was added to the Vulcan and DirectX 12 APIs only months ago, and the largest game engines are still in the process of implementing ray-tracing. Game studios and developers need all three, fast hardware, API support and game engine support to start implementing real time ray-tracing, initially in a "hybrid" fashion, of course, with the majority of the material still being rasterized.
It looks like a chicken and egg scenario, but the true roadblock has been the lack of fast hardware. After Nvidia, at first, announced ray tracing acceleration in their next-gen graphics cards (incl. their consumer ones), it was the turn of the gaming APIs to add ray-tracing support. And, finally, it was time for the big game engines to get to work on it.
There is no such game yet, only a few demos, but I predict that in early 2019, mid 2019 tops, the first crop of AAA games with partial ray tracing support will be introduced. Some top level games might be able to be patched with ray tracing support as well.
niva - Tuesday, August 21, 2018 - link
People are going to complain about everything.jokifan - Monday, August 13, 2018 - link
Quadros aren't gaming cards. They're for content creation. Why do you think everything has to be about gaming?MadManMark - Tuesday, August 14, 2018 - link
Presumably because that's all he can think (or cares) about.RK7 - Monday, August 13, 2018 - link
Wow, just wow. Dude, this is about technological progress, no one gives a fcuk about your games.Oxford Guy - Wednesday, August 15, 2018 - link
Not quite. Ray tracing is likely to become more important for gaming as time progresses. Remember how "no one cares" about tessellation and physics?tyger11 - Monday, August 13, 2018 - link
This is Turing; why would you even bring up gaming?mdriftmeyer - Monday, August 13, 2018 - link
He most likely brought it up as Nvidia has been touting the next advancement in gaming is a Real-Time Ray Tracing Graphics GPUs.inighthawki - Tuesday, August 14, 2018 - link
Turing is just the architecture name. I presume you mean that these are Quadro cards?tipoo - Monday, August 13, 2018 - link
Gaming event is next week, with the Geforce version maybe launching there.cmdrdredd - Monday, August 13, 2018 - link
Pretty much any new tech for game engines runs poorly on the first hardware. Tessellation wasn't that fast on the first GPUs that could do it, DX10/11/12 were all not that fast on first gen GPUs too.I think this will be no different.
Spunjji - Tuesday, August 14, 2018 - link
DX9 ran like stink on the Radeon 9700 Pro and DX12 showed significant gains on AMD's existing GCN hardware, so that comment only holds if you're limiting it to Nvidia chips. It's a trivial observation to say that second-gen performs better - it'd be an automatic failure if it didn't ;)somejerkwad - Tuesday, August 14, 2018 - link
The new Metro title has it and demonstrates an advantage for Titan V. FF15 also demonstrates a mild lead for Titan V vs 1080 ti which could possibly be related to the ray-tracing or another architectural advancement. They also showed off a live render Star Wars animation a while back. The hardware exists already for the higher end segments of the market and the software is coming along faster than it really should when considering that there's still no consumer market for the stuff, at least until nVidia stops holding out on us.Oxford Guy - Wednesday, August 15, 2018 - link
1) Sell to people with deep pockets. 2) Use e-peen features to get mindshare via reviewersAlistair - Monday, August 13, 2018 - link
Basically with Turing, what we know of as a GPU is changing quite a bit.Downside is so much money and die area spent without increasing performance in existing titles and engines. I'm not sure I want to pay double without seeing double performance in my existing titles, but today is marking the beginning of the transition to ray-tracing it seems.
Dr. Swag - Tuesday, August 14, 2018 - link
>implying these are meant for gamers, lolThese say quadro, not geforce. They're meant for professionals. I wouldn't be surprised if they released the geforce variants in a week or two without so much vram, disabled tensor cores, no ecc (not sure if the quadros do have it though), much lower double precision (these might support it at 1/2 or 1/3 rates), and perhaps less RT cores. Based on the die size of this monster, it most likely does have a lot of double precision hardware. In which case I'd expect nvidia to keep this massive 754 mm^2 die for quadros and teslas (and perhaps one Titan) and make a new die for geforce that should be a fair bit smaller.
You seem to think these prices are what the geforce cards will be. Just looking at the past these prices for quadros aren't completely out of the ordinary.
Alistair - Tuesday, August 14, 2018 - link
You didn't understand my comment. I meant if nVidia has a choice of including Tensor and RT cores or not, adding them won't improve performance with current titles so it's a lot of extra money without payoff. I was hoping to see Quadro cards without RT capabilities announced. Memory and standard CUDA cores only.We can expect the Geforce cards to be almost identical but 4-6 times cheaper, just like with P5000 and P6000 vs GTX 1080 and Titan XP.
Dr. Swag - Tuesday, August 14, 2018 - link
You said "Downside is so much money and die area spent without increasing performance in existing titles and engines. I'm not sure I want to pay double without seeing double performance in my existing titles, but today is marking the beginning of the transition to ray-tracing it seems."This implies to me that you're seeing these as gaming cards. Especially around parts like "I'm not sure I want to pay double without seeing double performance in my existing titles."
The RT stuff is meant for workstations. That's the whole point. If you want stuff without it... Then you've got yourself a geforce card, lol. Or maybe even the regular geforce cards will have RT tracing hardware on it, though if they do I'd expect for there to be less of it.
Alistair - Tuesday, August 14, 2018 - link
The Quadro cards are usually exactly like the gaming cards. If you don't have reading comprehension, can't have a conversation...Trackster11230 - Tuesday, August 14, 2018 - link
That's historically true, but maybe we'll see a shift here?Either way, I agree. That's money the company spent on areas that aren't gaming, but the GPU market as a whole is growing in several directions so I can understand their choices.They have to appeal to as many clients as possible, which means any one market segment likely won't get full resource utilization.
Dr. Swag - Thursday, August 16, 2018 - link
Yes but you are talking about these gpus being multiple times more expensive than current geforce gpus for much less than 2x gain in performance in games. Yes, that's true, but geforce cards will be released at similar price points to current gpus, making the "too expensive to justify the extra gaming perfomance point" invalid. These prices are for quadros, not geforce cards.Yojimbo - Monday, August 13, 2018 - link
They are getting less transistors per unit area than for Volta, so I imagine it's on the 12 FFN process. The density increase of 7 nm should dominate any architectural changes that would reduce transistor density.I'm surprised they are coming out with a generation of 12 FFN GPUs now when it seems 7 nm should be ready by mid 2019.
Judging by Volta, Turing probably gets its performance advantage over Pascal mostly by being more energy efficient. So a larger die size (and/or a higher clock) is necessary to obtain a performance gain.
Interestingly, trying to estimate by the number of CUDA cores (assuming the top end Quadro part isn't using a significantly cut down chip), it seems like there are FP64 units on this GPU, unless the RT cores and other architectural improvements over Volta are taking up a whole lot of transistors.
Sales to hyperscalers should take precedence over sales for proviz, so the fact that these Quadros will be available soon seems to suggest that a V100 replacement isn't forthcoming based on this GPU, unless NVIDIA has been selling such a card to hyperscalers in stealth mode. Of course, backing this up is that fact that this is a GDDR6 part with lower memory bandwidth than the GV100 and the fact that other than the greater precision options in the tensor cores, this GPU doesn't offer much of an advantage over the V100 for data center workloads. Well, it's interesting then that it SEEMS to have double precision. Maybe it doesn't and those RT cores really do take up a large number of transistors.
DanNeely - Monday, August 13, 2018 - link
You've answered your own question. Launching on 12nm now gets them a new product now, a year after their previous refresh, and with major feature enhancements on the pro-side of the house an entire year sooner than if they waited for 7 nm. If the rumor mill is to be trusted, they did delay this generation a few months to empty out their retail channel of last gen parts after the crypto bubble popped; but going an entire year without a product refresh when they've got one would be insane.Yojimbo - Tuesday, August 14, 2018 - link
DanNeely,I believe it costs a lot to prepare a design for high volume manufacturing. They aren't sitting on such a thing if they never make the preparations to begin with. Right now they have little competition, and they won't have competition until 2019. Even though Pascal has been out over two years, gamer market share uptake of Pascal GPUs has been low due to cryptocurrency. Those are reasons why NVIDIA could get away with not releasing a new generation until 2019.
In 2019, however, NVIDIA will have competition and it will be on 7 nm. It's not cheap to do a die shrink (or to come out with a new generation) so if 12 FFN Turing is only at the top of the market for a year that will eat into their margins. Those are reasons why NVIDIA would want to wait until 2019 to release their new generation on 7 nm.
Possibly NVIDIA believe that people won't be willing to purchase 2+ year old GPUs even though those GPUs are significantly faster than what the people currently have and there isn't anything better available than those 2+ yo GPUs on the market. Another possibility is that NVIDIA want to get the RTX hardware into the market as soon as possible in order to push the technology forward for future margins, giving up margins in the near term. The greater the percentage of RTX-capable cards in the market and the sooner they are released the more and sooner developers will design and build games using these technologies, and so the greater the demand for GPUs with these technologies will be in the future.
abufrejoval - Wednesday, August 15, 2018 - link
I think Nvidia is more concerned about keeping concept leadership for GPUs.Essentially GPU is somewhere between commodity (like RAM) and proprietary (CPU) and AMD is just too uncomfortably close and Intel soon threatening to do something similar.
So Nvidia cannot wait to have the next process size ready. They need to push a product to establish their "brand mixed bag of functionalities hitherto known as premium GPU", or risk loosing that to someone else. AMD beefed up the compute part of their design because they understood the war will no longer be fought in games alone--perhaps even too much to cut it in games--even if the miners appreciated it for a while.
But that also highlights the risk: If you get the mixture wrong, if some specialty market like VR doesn't develop the way you anticipated it, you have a dud or lower margin product.
So they are seeding ray-tracing and putting stakes into the ground to claim the future GPU field. And you could argue that cheaper production of almost realistic animations could be as big or bigger than blockchain mining.
I am convinced they are thinking several generations ahead and it's becoming ever more challenging to find white spots on the map of compute that will last long enough for ROI.
Yojimbo - Wednesday, August 15, 2018 - link
NVIDIA's gross margins are better than Intel's now. There's nothing commodity-like about GPUs. Maybe what you are referring to is that Intel and AMD control the x86 instruction set whereas from a gaming point of view GPUs operate through APIs. But still NVIDIA achieve great margins on their going products because of superior research and development, which is an antithetical situation to a commodity market.As far as finding "white spots" in the market overall, there is a lot more room for innovation in GPUs than in CPUs. That is why CPUs have had very little increase in performance or abilities over the past 10 years while GPU performance and abilities have been bounding upwards. That trend will continue, there are plenty of legs left in GPU architectural innovation.
Yojimbo - Wednesday, August 15, 2018 - link
I meant to type "...NVIDIA achieve great margins on their gaming products..."Kevin G - Tuesday, August 14, 2018 - link
FP and Int hardware are fully decoupled, which explains some of the differences. Turing also supports various data types specific to ray tracing, so I'd expect transistors are spent towards that as well.There should be some FP64 hardware but probably 1/32 the single precision rate (same penalty GP102 had). Volta is still a HPC focused part for that workload. Additionally Volta had six nvLink ports, though only exposed on the SMX2 mezzanine cards.
The change in transistor density likely stems from GDDR6 where space on the die has to be allocated for off package IO. HBM pads are far smaller.
One thing missing was any mention of HDMI 2.1 support. Not surprising as this is a professional card where DP is standard but worthy of an architectural note. Perhaps when this generation reaches consumers?
Yojimbo - Tuesday, August 14, 2018 - link
When I tall about whether ut gas FP64 units or not I mean a number of them useful for FP64 computation. Of course it will maintain at least one per SM for compatibility.I'm not sure what you are arguing as far as transistor areal density and as far as transistors per CUDA core. They were two separate points with two separate conclusions. One is that the chips are 16 FFN and not 7 nm, based on transistor density. I don't see different types of memory controllers or more or less NVLink ports (you don't know how many are on the Turing die, anyway) dominating a shrink from 16 FFN to 7 nm.
As far as the transistors we CUDA core and FP64, the FP and integer were already decoupled on Volta. It seems that major work went into the SM for Volta and Turing changes the SM very little from Volta. Since the transistors per unit area comparison I did was between the Quadro RTX 8000 and the Volta-based Quadro GV100, the independent integer pipeline and other SM changes cannot account for much change in the number of transistors per FP16 CUDA core. The encode/decode block has been updated, the memory controller is different, and there are potentially less NVLink ports, but I doubt these things would balloon the number of transistors the way full FP64 support would (besides, less NVLink ports would reduce the number of transistors, not increase it). FP64 support accounts for a fair chunk of transistors. My guess is the GPUs these new Quadro RTXs are based on don't have FP64 units (except for compatibility) and the RT cores take a significant number of transistors.
Yojimbo - Tuesday, August 14, 2018 - link
Err, sorry for all the typos. Also I should have written "Since the transistors per CUDA core calculation..." in the third paragraph.One last thing, DP is not standard in proviz cards. Other than the Quadro GV100, NVIDIA have not had DP in their proviz cards since Kepler. That means the top of the line (on their introductions) M6000 and P6000 only have compatibility support for FP64 instructions.
Kevin G - Tuesday, August 14, 2018 - link
I think you misunderstood me. DP as in the context of DisplayPort, not double precision.Kevin G - Tuesday, August 14, 2018 - link
I think it is safe to say that we agree that Turing isn't 7 nm. However, with the decrease in transistor density from Volta to Turing, it could have been either 12 nm and 16 nm. The density differences aren't that much to start between them, but there are also some changes in the design that also affect transistor density, mainly IO pads. Hence the mention.Yojimbo - Wednesday, August 15, 2018 - link
I agree. We don't have the resolution to see whether it is 12 FFN or 16. But I figure that if NVIDIA can order a run of a custom node for a relatively small volume like what's necessary for V100, then they can order it for a much higher volume run like for TU104 (GT already standards for Tesla architecture so the rumor of TU makes sense). There is just the matter of cost/benefit. The design parameters of the two GPUs are rather similar, I assume. Even though they charge a lot more for the GV100, I doubt they spent a lot of money just to push it a little bit more. So I am guessing the node is still advantageous for TU104. But, who knows? I just put 12 FFN because I didn't want to keep typing 12/16 and I thought 12 was most likely.mode_13h - Tuesday, August 14, 2018 - link
> this GPU doesn't offer much of an advantage over the V100 for data center workloads.One word: inferencing. int8 is pretty equivalent to fp, so 250 int8 TOPS is a doubling of what you could do with fp16-based inferencing. And if you can make use of int4 without at least doubling your model size, even better.
Yojimbo - Tuesday, August 14, 2018 - link
No, INT8 is not equivalent to FP16. That is only true for some networks. The greater precision flexibility of the tensor cores is a very small difference for an entirely new data center generation. I doubt companies are going to go through the expensive process of product verification just for that.NVIDIA might come out with data center parts based on this GPU, such as an inference-oriented card to replace the P40 or P4, but I don't think they will be intended to succeed the Tesla V100.
mode_13h - Tuesday, August 14, 2018 - link
> NVIDIA might come out with data center parts based on this GPU, such as an inference-oriented card to replace the P40 or P4Yes, that's what I was saying.
I was responding to the statement I quoted. I didn't mean an overall advantage - I was just citing a specific and significant area where it *did* have an advantage.
grizzle - Monday, August 13, 2018 - link
You might want to confirm the definition of "variable rate shading" because I don't think it has to do with faster throughput using lower precision data types. Microsoft has a patent for it: http://www.freepatentsonline.com/y2018/0047203.htm...It sounds like it lets you render to a render target at different granularities (e.g. maybe per-pixel for some parts, per-sample for others and in 2x2 pixel granularities for others) in a single draw call. Lets you reduce pixel costs for lower frequency/less important parts of the output image (like the edges of the screen or for foveated rendering for VR)
Ryan Smith - Tuesday, August 14, 2018 - link
Thanks! Now that I've had a bit more time to sit down and do some research, I've updated the article accordingly.repoman27 - Monday, August 13, 2018 - link
Hey Ryan, I see 8K DisplayPort listed as a feature on one of those slides. I'm guessing that means Turing is the first GPU to support DSC then?Ryan Smith - Tuesday, August 14, 2018 - link
Unfortunately I don't have any information on that.DigitalFreak - Tuesday, August 14, 2018 - link
Pascal supported DP 1.4, which includes DSC.repoman27 - Tuesday, August 14, 2018 - link
Display Stream Compression is an optional feature of the DisplayPort 1.4 spec, not a normative requirement, and support likely requires the addition of fixed function blocks to the display controller.AFAIK, none of the current GPU architectures from either NVIDIA or AMD support DSC. Everyone assumes that they do because the outputs are listed as DP 1.4 in the spec sheets. However, 8K at 60 Hz is only supported for dual-cable displays, and I can find zero documentation to suggest that any of these GPUs are DSC capable.
mode_13h - Tuesday, August 14, 2018 - link
I want that die shot on a carpet.Spunjji - Tuesday, August 14, 2018 - link
2nd thismode_13h - Tuesday, August 14, 2018 - link
> I am assuming based on this that we are looking at two NVLinks per board – similar to the Quadro GV100 – however I’m waiting on confirmation of that given NVIDIA’s 100GB/sec transfer number.Yes, it must be 2. NVLink is bidir, and Nvidia always quotes the bidir numbers. NVLink 2.0 is good for 25 GB/sec per link per direction. So, there's your 100 GB/sec.
AMD could accomplish the same throughput with roughly x25 lanes of PCIe 4.0 connectivity between a pair of Vega 20 dies. Of course, it's a bit silly to even mention AMD, given how far Turing outclasses even Vega 20.
mode_13h - Tuesday, August 14, 2018 - link
> NVIDIA is touting 500 trillion tensor operations per second (500T TOPs).No, just 500 trillion ops per second.
> it's so large that NVIDIA may intend to keep it for their more profitable Quadro and Tesla GPUs. In which case this specific GPU won't be what eventually hits consumer cards.
I foresee them holding back the tensor and RT cores, or at least drastically reducing their numbers. I think tensor cores, at least, are part of their market segmentation strategy, in order to keep people from using GTX models for serious deep learning or inferencing. Perhaps they have a similar strategy with reserving the full RT core-count for their Quadro GPUs.
Also, why no Quadro GP100 in the comparison table? That would be more interesting than M6000.
Yojimbo - Tuesday, August 14, 2018 - link
I don't see NVIDIA holding back tensor or RT cores from their 1080 class GPUs. What they will hold back compared to various higher-priced parts are memory capacity, memory bandwidth, NVLink, FP64, ECC, and proviz certification. Each of those things is important to proviz, HPC, or AI applications, and I think at least one of them is critical to a good portion of applications in those areas. Plus NVIDIA have been putting pressure directly on their channel partners not to sell GeForce cards to data center customers.NVIDIA probably see raytracing and AI as ways they can differentiate their GPUs compared to the competition. That means not holding back the AI or raytracing capabilities. They need to give developers an amount of compute that make the methods effective and worth pursuing.
Yojimbo - Tuesday, August 14, 2018 - link
Not to mention that NVIDIA is strongly implying that these raytracing and AI methods are the future of real-time graphics.mode_13h - Tuesday, August 14, 2018 - link
That doesn't mean they can necessarily afford to put so many RT cores in consumer-focused models.smilingcrow - Tuesday, August 14, 2018 - link
Not at 1xnm but at 7nm they can afford to offer a lot more.mode_13h - Tuesday, August 14, 2018 - link
You've got to cut die size, somehow. I don't believe RT-equipped GPUs are going to target HPC, in which case these wouldn't have significant FP64 you could remove. So, if not FP64, then what? Just tensor cores? Maybe. But why not go on to extend that same strategy to RT cores?Yojimbo - Tuesday, August 14, 2018 - link
Well, from the rumors, RT cores will only be on 2080 and 2070 GPUs. I think the die sizes will be large and the prices will reflect that.
But if/when they do cut die sizes, they will cut them with a mind toward optimizing for the intended usage of the GPU, and not with a mind toward protecting higher-priced parts. Gaming is still the biggest revenue and profit driver for NVIDIA and is growing very quickly. NVIDIA are going to take it seriously. And I think they have a good enough way of protecting those higher-priced parts through the methods I mentioned previously that they don't have to jeopardize an optimal configuration for RTX gaming in order to do it. They are: memory capacity (large neural networks and serious rendering require large onboard memory capacity), memory bandwidth (compute applications require more memory bandwidth than graphics applications, though I don't know how RT cores play into that yet), ECC (not important for machine learning, but proviz and HPC parts have it), professional software certification and support (professional users are often on production schedules and lose a lot more than the cost of a graphics card if things go wrong and they can't get back on track quickly; maybe there are other reasons to pay for it, too, but I don't know too much about it), and NVLink (to train larger models and render larger scenes, this helps to scale up and pool memory resources in a way not possible without it).
Basically, one could ask why is anyone spending over $4,000 on a Quadro P6000 when a GeForce GTX 1080 Ti costs only $650? Or why is anyone spending over $8,000 on a Tesla V100 when a Titan V costs $3,000? I believe the answers to those questions are in my list above, and they also answer how NVIDIA can maintain differentiation between different product segments in the Turing generation without choosing suboptimal numbers of tensor cores or RT cores for gaming.
mode_13h - Wednesday, August 15, 2018 - link
> Basically, one could ask why is anyone spending over $4,000 on a Quadro P6000 when a GeForce GTX 1080 Ti costs only $650?
Driver optimizations for workstation apps. Also, the card is certified for apps where a single license probably costs more than the workstation. So, why risk using an unsupported configuration and then not being able to get support if you have an issue?
But, I think many Quadro customers don't pay list price. Big OEMs surely get a discount, as do big customers.
> Or why is anyone spending over $8,000 on a Tesla V100 when a Titan V costs $3,000?
NVLink, 32 GB of RAM instead of 12 GB, faster memory (due to a 4096-bit vs. 3072-bit data bus), ECC memory, plus the general points I just mentioned.
> I believe the answers to those questions are in my list above
But this whole thing is a digression. It's answering the wrong question. The right question isn't how Nvidia can charge so much for Quadro, but rather how are they going to sell these enormous dies at prices gamers can afford.
Yojimbo - Wednesday, August 15, 2018 - link
No, not driver optimization for workstation apps. You can get that without spending $3,000 for it. As for your second point there, exactly. They may get a discount, but not from $4,000 down to $650, so that is irrelevant. And if they order a volume of 1080 Tis, they will get a discount as well.
As for Tesla, again, exactly.
"
But this whole thing is a digression. It's answering the wrong question. The right question isn't how Nvidia can charge so much for Quadro, but rather how are they going to sell these enormous dies at prices gamers can afford."
No. I was specifically responding to the following that you had written:
"I foresee them holding back the tensor and RT cores, or at least drastically reducing their numbers. I think tensor cores, at least, are part of their market segmentation strategy, in order to keep people from using GTX models for serious deep learning or inferencing. Perhaps they have a similar strategy with reserving the full RT core-count for their Quadro GPUs."
They do not need to use tensor cores or RT cores as market segmentation strategies between GeForce cards and Quadro/Tesla cards. They do, however, need tensor cores and RT cores to attack the gaming market, their largest and most lucrative market, and still a rapidly growing market. How they can afford to include those technologies on gaming dies is beside the point, because it was not what my post was in response to.
mode_13h - Wednesday, August 15, 2018 - link
> You can get that without spending $3000 for it.
Well, Pascal was an exception, with Nvidia enabling Quadro driver optimizations for the Titan X (Pascal) after AMD enabled their workstation driver optimizations for Vega Frontier.
> How they can afford to include those technologies on gaming dies is beside the point
Okay, maybe you're not concerned with that, but I am.
Yojimbo - Friday, August 17, 2018 - link
It doesn't matter if Pascal was an exception. Pascal demonstrated that there's no problem there. They didn't seem to lose margins by doing it.
It's not that I'm not concerned with how they can afford to put the technologies on gaming dies, it's that that wasn't the thread of this particular discussion. My point is that if they can afford to do it, I believe they will include the optimum number of RT cores and Tensor Cores to target the gaming market rather than holding back in order to ensure product differentiation. But it is predicated on two things: 1) that there is functional utility to putting these cores on lower-performing parts; if the algorithms don't really work because there simply isn't enough horsepower, then they won't go out of their way to put the cores on the parts (it may still be cheaper to keep them there, rather than designing a new Turing-like GPU without them), and 2) that they can hit their price points with reasonable margins when including the cores on the lower-performing parts.
The answers to those two questions are interesting, but I can't claim to know anything about them.
yeeeeman - Tuesday, August 14, 2018 - link
He didn't show any game with ray tracing since there was no card until now to do this at a normal pace of development. Hard to think these days...
AphaEdge - Wednesday, August 15, 2018 - link
Games that are only ray traced are not that great looking. Textures will continue to be used, and lighting/shadows will be off-loaded to the more efficient ray-casting engine.
SquarePeg - Tuesday, August 14, 2018 - link
Ah, ray tracing, the new new VR. Just like VR, wake me when it's finally good enough to provide a high-quality experience with plenty of software to support it, at a price that's attainable.
Oxford Guy - Wednesday, August 15, 2018 - link
Wake me when video games aren't all the same violence and brutality boredom.
Icehawk - Wednesday, August 15, 2018 - link
Some good hyperbole there, guy. Yup, all games are brutal violence fests. Oh, wait...
AphaEdge - Wednesday, August 15, 2018 - link
The ray-trace engine will be for offloading the global illumination and shadowing to its more efficient engines, and games will continue to use textures. That way, you get the best of both worlds.
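As a purely illustrative toy (nothing here reflects any real engine or the RTX/DXR APIs; the scene, the checkerboard texture, and the tiny resolution are all made up), this sketch shows the hybrid idea in miniature: the surface colour still comes from a sampled texture, and only the shadow term is answered by casting a ray against an occluder.

```python
import math

# Toy "hybrid" frame: albedo from a textured raster-style pass,
# shadow term from casting rays against a single sphere occluder.
WIDTH, HEIGHT = 16, 8
LIGHT_DIR = (0.0, 1.0, -1.0)                            # direction toward the light
SPHERE_CENTER, SPHERE_RADIUS = (0.0, 1.5, -2.0), 1.0    # the only occluder in the scene

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def checker_texture(u, v):
    """Stand-in for a sampled material texture (the rasterized part)."""
    return 0.9 if (int(u * 8) + int(v * 4)) % 2 == 0 else 0.2

def shadow_ray_hits_sphere(origin, direction):
    """Ray/sphere intersection test (the ray-traced part of this toy hybrid)."""
    oc = tuple(o - c for o, c in zip(origin, SPHERE_CENTER))
    b = 2.0 * sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - SPHERE_RADIUS ** 2
    disc = b * b - 4.0 * c
    if disc < 0.0:
        return False
    t = (-b - math.sqrt(disc)) / 2.0
    return t > 1e-4

light_dir = normalize(LIGHT_DIR)
for y in range(HEIGHT):
    row = ""
    for x in range(WIDTH):
        u, v = x / WIDTH, y / HEIGHT
        point = (u * 4.0 - 2.0, 0.0, -v * 4.0)          # "rasterized" ground-plane point
        albedo = checker_texture(u, v)
        shadow = 0.3 if shadow_ray_hits_sphere(point, light_dir) else 1.0
        row += " .:*#"[min(4, int(albedo * shadow * 5))]
    print(row)
```

Swapping that shadow test for a hardware-accelerated visibility query is exactly the kind of narrow job the RT cores are meant for, while the texturing and shading stay in the existing raster pipeline.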
Yojimbo - Friday, August 17, 2018 - link
"Ah, ray tracing, the new new VR. Just like VR wake me when it's finally good enough to provide a high quality experience with plenty of software to support it, at price that's attainable."...because 2 years is such a long, long time for a fundamental shift in technology. Sheesh. Do you realize that the technologies the internet were based on were invented in the 1970s, the internet didn't reach consumers until the early 1990s, and Google, the origin of the word we use for the generic searching for something on the internet, wasn't founded until 1998? We already had lots of Amazon.com ads on the radio by that point and Yahoo was a Wall Street darling. These things take time.
0ldman79 - Tuesday, August 14, 2018 - link
Improvements in NVENC, very interesting.
Given that my video card accelerates my games, video playback, and encoding (75% of what I use my main computer for), I should probably pay more attention to the GPU side of things than the CPU.
I'd love to see a more in-depth study of NVENC vs. the various options. I've gotten very good quality using StaxRip; though it did slow transcoding speeds considerably, it's still faster than the Ryzen and the i7 while delivering good quality and file size.
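For anyone wanting to run a similar comparison themselves, here's a rough sketch driving ffmpeg from Python. It assumes an ffmpeg build with NVENC support and a local input.mp4 (both placeholders), and it's only a timing smoke test, not a proper quality benchmark.

```python
import subprocess, time

# Quick-and-dirty NVENC vs. software x264 timing via ffmpeg.
# Assumes ffmpeg was built with NVENC enabled and "input.mp4" exists;
# file names and settings are placeholders, not tuned benchmark parameters.
ENCODERS = {
    "h264_nvenc": ["-c:v", "h264_nvenc", "-b:v", "6M"],                    # GPU (NVENC)
    "libx264":    ["-c:v", "libx264", "-preset", "medium", "-crf", "20"],  # CPU
}

for name, args in ENCODERS.items():
    start = time.time()
    subprocess.run(
        ["ffmpeg", "-y", "-hide_banner", "-i", "input.mp4", *args, f"out_{name}.mp4"],
        check=True,
    )
    print(f"{name}: {time.time() - start:.1f} s")
```

Comparing the two outputs at matched bitrates (with something like VMAF or just eyeballing them) is where the real quality question gets answered.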
mapesdhs - Tuesday, August 14, 2018 - link
Conclusion of NVIDIA's info: nothing relevant to gaming whatsoever. The Good Old Gamer summed all this up nicely a while ago:
https://www.youtube.com/watch?v=PkeKx-L_E-o
I miss the days when GPU reviews actually talked about new 3D features for *games*; does anyone remember? New antialiasing techniques, shaders, all sorts. It's been a long time. Alas, too many have bought into the modern PR hype that the only thing which matters now is fps, something that's making it all worse, i.e. pressure on devs not to release a new game so complex that it would give "old style" fps rates in the 30 to 60 range on good hardware, as that would annoy the vocal users of G-Sync/FreeSync monitors, and tech reviewers would doubtless be critical because they too have gotten used to crazy high frame rates which most people don't need (at this rate we'll never have another Crysis moment; no company would take the risk, no matter how glorious a new game looked or functioned). Also notice how review sites don't perform image quality comparisons anymore.
People will say there's still money in the gaming market, but it's not where the big money is, not when one can sell exactly the same process tech to industry, research, defense, etc. for 10x more. As TGOG put it, gamers currently get the scraps off the table. Turing looks like another hefty move in the same direction.
I just wish tech sites were more critical of all this. Start doing quality comparisons again; talk about how actual gameplay in many ways hasn't changed in years, how there have been no new modelling or rendering methods for 3D gaming in ages, and how there's still no ability to correctly model fire, water (accumulation, flood, damage, erosion, temperature response, electrical behaviour, etc.), mud, lava, i.e. fluids in general. Imagine a Tomb Raider game where a storm, volcanic eruption or tornado actually did to the surrounding environment what it should do.
Look back at reviews of the 8800 GTX, that sort of thing; they were very different in focus back then. The modern obsession with fps is utterly boring, doubly so since, as soon as one moves up to a significant resolution, the choice of CPU frequently becomes far less relevant, rendering articles about CPU performance rather moot (to coin a phrase). Instead, Intel launches a new CPU and the tech press goes mental about what fps it does at 1080p in game whatever; who cares?
Gaming needs to return to gameplay, with tech advances pushing the boundaries of what can be modelled with respect to immersive worlds, both visually and functionally. This is why, despite the less advanced looking visuals, my favourite game atm is Subnautica.
Ian.
mode_13h - Tuesday, August 14, 2018 - link
Ah, but it *does* support a new AA mode:
http://research.nvidia.com/publication/2018-08_Ada...
And more efficient SMs and GDDR6 directly benefit gaming performance.
Lastly, their RT cores represent a *huge* investment into gaming and VR. So, I honestly don't know what you're on about.
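For readers unfamiliar with that family of techniques, here's a minimal sketch of plain temporal accumulation with neighbourhood clamping. It is emphatically not the adaptive method from the linked paper, just the baseline idea that TAA-style techniques build on, with made-up frame data.

```python
import numpy as np

# Generic temporal-accumulation step of the kind TAA-style techniques build on:
# blend the new frame into a history buffer, clamping the history to the local
# neighborhood of the current frame to limit ghosting.
def taa_accumulate(history, current, alpha=0.1):
    # 3x3 neighborhood min/max of the current frame (simple and unoptimized).
    pad = np.pad(current, 1, mode="edge")
    neighborhood = np.stack([pad[dy:dy + current.shape[0], dx:dx + current.shape[1]]
                             for dy in range(3) for dx in range(3)])
    clamped = np.clip(history, neighborhood.min(axis=0), neighborhood.max(axis=0))
    return (1.0 - alpha) * clamped + alpha * current

# Usage: feed successive (noisy/aliased) frames through the accumulator.
history = np.zeros((4, 4))
for _ in range(10):
    frame = np.random.rand(4, 4)   # stand-in for a rendered frame
    history = taa_accumulate(history, frame)
print(history.round(2))
```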
Yojimbo - Wednesday, August 15, 2018 - link
It's ironic because he's referencing as the good old days the card with which NVIDIA introduced their first universal shader architecture GPU, which was the last big technological change in GPU design for games until Turing and its raytracing capabilities. He just doesn't understand the fundamental shift that is potentially possible using real-time raytracing, due to the time- and cost-savings in game development and the potential increase in visual quality from new algorithms.
Also, he doesn't understand that as an industry matures, the easy problems get solved and further improvements take more work and ingenuity, so the pace is slower. Gamers aren't being given scraps; the pace of innovation is just bound to slow. NVIDIA is spending more money and manpower on gaming technology development, both hardware and software, than ever before.
Yojimbo - Wednesday, August 15, 2018 - link
I should have typed "unified shader architecture", not "universal shader architecture".
mode_13h - Wednesday, August 15, 2018 - link
Another irony is that you just *know* someone in the future will be nostalgic for "the good 'ol days, like when Nvidia broke the mold with Turing and its groundbreaking RT cores".
PaoDeTech - Tuesday, August 14, 2018 - link
Will Micron / Samsung produce enough GDDR6? It seems like a good choice if the price is comparable to GDDR5 and performance is ~80% of HBM2.
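A quick sanity check of that ~80% figure, assuming a 384-bit GDDR6 bus at 14 Gbps against a full 4096-bit HBM2 stack (both treated as peak theoretical numbers, so real-world results will differ):

```python
# Peak theoretical bandwidth = bus width (bits) / 8 * per-pin data rate (Gb/s)
gddr6 = 384 / 8 * 14.0     # e.g. a 384-bit card at 14 Gbps -> 672 GB/s
hbm2  = 4096 / 8 * 1.75    # e.g. a V100-style 4096-bit HBM2 stack -> ~900 GB/s
print(f"GDDR6: {gddr6:.0f} GB/s, HBM2: {hbm2:.0f} GB/s, ratio: {gddr6 / hbm2:.0%}")
```

That works out to roughly 75%, so the ~80% estimate is in the right ballpark.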
shabby - Tuesday, August 14, 2018 - link
Is this a Volta derivative or a new GPU?
mode_13h - Tuesday, August 14, 2018 - link
Does it matter? They say it uses Volta's SM architecture, but it has notable improvements in the Tensor cores (not to mention the addition of the RT cores). So, I guess it has a similar relation to Volta as Pascal did to Maxwell.
shabby - Wednesday, August 15, 2018 - link
Not really, it's just strange that Nvidia has two different GPUs come out so close to each other.
mode_13h - Wednesday, August 15, 2018 - link
I think the relation of the V100 to Turing is probably close to that of the P100 to the other Pascal members. It came first, has different features, different design parameters, and a different CUDA compute capability level (the P100 was 6.0, while the others were 6.1).
https://docs.nvidia.com/cuda/cuda-c-programming-gu...
V100 has 7.0, so it'll be interesting to see what Turing gets.
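For reference, here's a quick way to read a GPU's compute capability from Python. It assumes PyTorch with CUDA support is installed; numba or pycuda would expose the same information.

```python
import torch

# Print the CUDA compute capability of the first visible GPU, if any.
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print(f"{torch.cuda.get_device_name(0)}: compute capability {major}.{minor}")
else:
    print("No CUDA device visible")
```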
Santoval - Wednesday, August 15, 2018 - link
"NVIDIA for their part has confirmed that the first Turing Quadro cards will run their GDDR6 at 14Gbps, which happens to be the fastest speed grade offered by all of the Big 3 members."According to a Samsung press release released a few days ago regarding the memory they provided to Nvidia, their GDDR6 memory supports up to 18 Gbps at the same voltage (1.35V), and optionally an overclocked mode to 20 Gbps when slightly overvoltaged (cannot recall how much, probably in the 1.40 - 1.45V range).
The press release did not refer to a future version of their memory, but to their current memory. So my guess is that Nvidia either underclocked the memory to keep the TDP in check or, maybe, asked for loosely binned, lower-clocking and thus cheaper GDDR6 chips from Samsung.
MrSpadge - Thursday, August 16, 2018 - link
Or they want to double/triple source and not be dependent on Samsung and their pricing. That's what Ryan implies by stating that this is the highest speed offered by all 3 manufacturers.
keg504 - Thursday, August 16, 2018 - link
Small request: could you change the colour you use when emphasising words? I keep trying to click anything in blue assuming that they are hyperlinks, but am disappointed when they are not.
Chad - Thursday, August 16, 2018 - link
Really glad to see NVENC getting some love, though the quality is quite bad as it is now.
Yojimbo - Friday, August 17, 2018 - link
What's wrong with it?