Terry_Craig - Thursday, June 13, 2024 - link
Almost a year and all Qualcomm throws at the public is more slides and marketing material? Where are the third-party reviews?
Ryan Smith - Thursday, June 13, 2024 - link
Next week.
mukiex - Thursday, June 13, 2024 - link
Ryan, I legit appreciate the "A Note on x86 Emulation" section. Literally answered the biggest questions I had on this that nobody else has covered, as far as I can tell. 10/10.
Terry_Craig - Thursday, June 13, 2024 - link
Finally... Finally... I was also tired of just seeing marketing stuff.
abufrejoval - Thursday, June 20, 2024 - link
I think the usual benchmark stuff has been covered sufficiently by now.
The only remaining question for me is battery life, and specifically where it might really differ from existing x86 laptops.
From what I remember reading, the SoC itself has become a relatively minor energy consumer on light desktop loads, which is how most professional computer users spend their time in front of a screen. So I don't even know if there is enough wiggle room for a 2-3x battery-life improvement without going to passive displays and really slow low-power storage.
What it comes down to is mostly this question: do I need to take the charger into my all-day meeting, or can I even leave it at home when I go on a week-long business trip?
And can I forget about my power button actually being a power button, like on my phone, where most of the time it's used for things like camera control?
Pretty sure it won't be the week-long trip just yet, but not bothering with chargers for all-day meetings seems to have been a major win for the fruity cult.
And somewhere in between those two is the question of how low the Snapdragons will go in lid-down mode, which could be suspend to RAM or even some usable kind of modern standby where e.g. voice commands or messenger actions might still be processed with something ultra-low power. And there the main quality benchmark would be how quickly you can resume normal operations without e.g. draining 10 minutes of idle battery for one resume.
From what I read between the lines of your deep dive, the granularity at which large swaths of the SoC's transistors can go and remain dark might be vastly better than with x86 designs and their iGPUs. And it's the ability to coax adequate responsiveness out of phone wattages that would make a Snapdragon PC worth having, not its ability to beat x86 on Cinebench.
Cinebench on an RTX 4090 makes it rather clear that any investment into CPU power for that use case is plain folly.
TheProv - Monday, July 1, 2024 - link
Hey Ryan, are you guys doing a review?
yeeeeman - Thursday, June 13, 2024 - link
yeah, also getting a bit tired of this hype hype hype.
they kinda did this to their disadvantage tbh.
meacupla - Thursday, June 13, 2024 - link
They are launching on June 18th, so the review embargo probably lifts on the 17th or 18th.
Dante Verizon - Tuesday, June 18, 2024 - link
Apparently not.
shabby - Thursday, June 13, 2024 - link
Some benchmarks are leaking out, prepare to be disappointed: https://www.tomshardware.com/laptops/snapdragon-x-...
meacupla - Friday, June 14, 2024 - link
It looks like the laptop is stuck at 2.5GHz, and can't boost to 4.0GHz for whatever reason.
shabby - Friday, June 14, 2024 - link
Reason is probably heat, so it's neutered.
meacupla - Friday, June 14, 2024 - link
I think that article is there just for clicks. You are aware the source of those benches got the Galaxy Book working at 4GHz, right?
dada_dave - Friday, June 14, 2024 - link
It's not "yellow journalism" or "clickbait" to report on a real phenomenon affecting Lenovo, Asus, and Samsung models (https://browser.geekbench.com/search?utf8=✓&q=snapdragon+x+elite) while telling people to hold their pitchforks because it's probably fixable. In fact it's good to report on it so if the problem persists during launch and a user does get an affected model they can look up what's happening and instead of the first thing they find is "Qualcomm LIED!" they see a tech outlet saying this may be a fixable problem. At this point we don't even know the cause, except that it is widespread but temperamental, pointing to a likely firmware culprit.FWhitTrampoline - Thursday, June 13, 2024 - link
I'm impressed that Qualcomm was this forthcoming with their CPU core and iGPU information, as most vendors do not provide even a fraction of the CPU core details, like instruction decoder width and micro-op issue and retirement rates per cycle. And the Adreno X1's render configuration (Shaders:TMUs:ROPs) can be established from the material Qualcomm provided. That's a proper information reveal, just like Intel did for its Lunar Lake SoCs. AMD should be ashamed of themselves for that Computex event; AMD was more like Apple there, with no proper Zen 5 CPU core block diagrams or much of any other relevant information of the kind AMD has provided in the past! I guess AMD is saving all the Zen 5 details for Hot Chips, but AMD has really gone downhill on proper release information and whitepaper releases.
There is very little marketing material here and some very salient technical details, as far as I'm concerned. And marketing does tend to get too much in the way of what should be a technically focused presentation of CPUs and GPUs!
Dante Verizon - Thursday, June 13, 2024 - link
I prefer benchmarks a million times more than a bunch of marketing slides from Intel or Qualcomm... jeez
FWhitTrampoline - Friday, June 14, 2024 - link
CPU block diagrams, GPU block diagrams, and other in-depth descriptions like the ones presented here are not marketing slides in my world. And all benchmarks can be gamed to some degree, including by marketing departments, so it's all the same there. Qualcomm's and Intel's latest materials were more like Hot Chips-style academic presentations, rather than the AMD/Apple sort of magic-black-box presentation without any proper whitepaper publication at all. As for benchmarks, that means a lot of different review sites, because a single website will invariably not cover all the bases.
And if you do not like the "marketing slides", then skip them and wait for the benchmarks. Why read, and comment on, what you have stated you are not interested in to begin with?
tkSteveFOX - Sunday, June 16, 2024 - link
Same here. Nothing brings me confidence except the battery life claims; those are legit. The performance and compatibility claims look very exaggerated to me. It's going to take a full iteration of Windows (12) to bring the promises closer to reality.
kpb321 - Thursday, June 13, 2024 - link
On the end of page one: "Microsoft, for its part, has continued to work on their x86/x86 emulation layer, which now goes by the name Prism"
I assume this should be x86/x64 or something like that, as x86 twice doesn't make sense.
GeoffreyA - Thursday, June 13, 2024 - link
The Oryon core's microarchitecture sounds quite meaty indeed.
id4andrei - Thursday, June 13, 2024 - link
If Qualcomm can support OpenCL and Vulkan, there is no excuse for Apple not to.
Dolda2000 - Thursday, June 13, 2024 - link
I think we already knew there's no excuse for Apple not to support OpenCL and Vulkan. It's funny how Apple turned from being a supporter and inventor of open standards in the 2000s to "METAL ONLY" as soon as the iPhone became big.
FWhitTrampoline - Thursday, June 13, 2024 - link
Imagine this: just as Linux/Mesa gets a proper, up-to-date OpenCL implementation (Rusticl, written in Rust) to replace the out-of-date and long-ignored Mesa Clover OpenCL implementation, the Blender Foundation, about a year before that, goes and drops OpenCL as the GPU compute API in favor of CUDA/PTX. So Radeon GPU compute support moves over to ROCm/HIP, which is needed to take that CUDA (PTX intermediate-language) representation and translate it into a form that can execute on Radeon GPUs. And ROCm/HIP has never really been aimed at consumer dGPUs or iGPUs: Polaris graphics was dropped from the ROCm/HIP support matrix years ago, and Vega graphics is about to be dropped as well! That has really fragmented the GPU compute API landscape, since Blender 3.0 and later only have native back-end support for Nvidia CUDA/PTX and Apple Metal. So AMD has ROCm/HIP and Intel has oneAPI, which has similar functionality to ROCm/HIP. But Intel has oneAPI working well with Blender for Arc dGPUs and Arc/Xe iGPUs on Linux, while AMD's ROCm/HIP is not an easy thing for the non-neckbeard Linux user to get installed and working properly, and only on a limited set of Linux workstation distros, unlike Intel's oneAPI and Level Zero.
But I'm on Zen+ with a Vega 8 iGPU and a Polaris dGPU on one laptop, and on Zen+ with a Vega 11 iGPU on my ASRock X300 Desk Mini! So my only hope for Blender dGPU- and iGPU-accelerated Cycles rendering is Blender 2.93 and earlier, which are legacy editions but still use OpenCL as the GPU compute API. And I'm still waiting for the Ubuntu folks to enable Mesa Rusticl instead of keeping it hidden behind an environment variable because it's still unstable, and I'm downstream of Ubuntu on Linux Mint 21.3.
So I'm waiting for Mint 22 to get released to see if I will ever be able to get Blender iGPU- or dGPU-accelerated Cycles rendering enabled, because I do not want to fall back to Blender's default CPU Cycles rendering; that's just too slow and too stressful on the laptop and the Desk Mini (I'm using the ASRock-provided cooler for that).
name99 - Saturday, June 15, 2024 - link
"It's funny how Apple turned from being a supporter and inventor of open standards"You mean how Apple saw the small minds at other companies refuse to advance OpenCL and turn OpenGL into a godawful mess and concluded that trying to do things by committee was a complete waste of time?
And your solution for this is what? Every person who actually understands the issues is well aware of what a clusterfsck Vulkan is, eg https://xol.io/blah/death-to-shading-languages/
There's a reason the two GPU APIs/shading languages that don't suck (Metal and CUDA) both come from a single company, not a committee.
Dante Verizon - Sunday, June 16, 2024 - link
The reason is that there are few great programmers.
dan82 - Thursday, June 13, 2024 - link
Thanks for the write-up. I'm very much looking forward to the extra competition.
I assume AVX2 emulation would be too slow with NEON. While it's possible to make it work, it would perform worse than SSE, which isn't what any application would expect. And the number of programs that outright require AVX2 is probably very small. I'm assuming Microsoft is waiting for SVE to appear on these chips before implementing AVX2 emulation.
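[Editor's note: to illustrate the point, here is a minimal sketch of why 256-bit AVX2 is awkward on 128-bit NEON. This is my own illustration, not Microsoft's Prism code; the `ymm_f32` type and `emu_vaddps_256` helper are hypothetical names.]
```c
// Hypothetical illustration: emulating a 256-bit AVX2 VADDPS with 128-bit NEON.
// NEON registers are only 128 bits wide, so every 256-bit AVX2 operation costs
// at least two NEON instructions, before counting cross-lane shuffles, gathers,
// or masked operations.
#include <arm_neon.h>

typedef struct {
    float32x4_t lo;   // emulated lower 128 bits of a YMM register
    float32x4_t hi;   // emulated upper 128 bits of a YMM register
} ymm_f32;

static inline ymm_f32 emu_vaddps_256(ymm_f32 a, ymm_f32 b) {
    ymm_f32 r;
    r.lo = vaddq_f32(a.lo, b.lo);   // first NEON add for the low half
    r.hi = vaddq_f32(a.hi, b.hi);   // second NEON add for the high half
    return r;
}
```
Doubling the instruction count for every 256-bit operation is why emulated AVX2 would tend to land below emulated SSE per instruction, which matches the comment's reasoning; 128-bit SSE maps onto NEON far more directly.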
drajitshnew - Thursday, June 13, 2024 - link
Thank you, Ryan and AT, for a good CPU architecture update. It is a rare treat these days.
Hulk - Thursday, June 13, 2024 - link
I think this might have been important if Lunar Lake wasn't around the corner. But after examining Lunar Lake, I think this chip is overmatched. Good try though.
SIDtech - Friday, June 14, 2024 - link
😂😂😂😂
FWhitTrampoline - Thursday, June 13, 2024 - link
"Meanwhile the back-end is made from 6 render output units (ROPs), which can process 8 pixels per cycle each, for a total of 48 pixels/clock rendered. The render back-ends are plugged in to a local cache, as well as an important scratchpad memory that Qualcomm calls GMEM (more on this in a bit)."No that's 6 Render Back Ends of 8 ROPs each for a total of 48 ROPs and 16 more ROPs than either the Radeon 680M/780M(32 ROPs) or the Meteor Lake Xe-LPG iGPU that is 32 ROPs max. And so the G-Pixel Fill Rates there are on one slide and that is stated as 72 G-Pixels/S and really I'm impressed there with that raster performance!
Do you have the entire Slide Deck for this release as the slide I'm referencing with the Pixel fill rates as in another article or another website ?
Ryan Smith - Thursday, June 13, 2024 - link
So the industry as a whole has always played a little fast and loose with how the term ROPs is thrown around. In all modern architectures, what you have is not X number of single units, but rather a smaller number of units that can each render multiple pixels per cycle. In this case, 6 units, each of which can spit out 8 pixels.
For historical reasons, we often just say ROPs = pixel count and move on from there. It doesn't really harm anyone, even if it's not quite correct.
But since this is our first deep dive into the Adreno GPU architecture, I wanted to get this a bit more technically correct. Hence the wording I used in the article.
"Do you have the entire Slide Deck for this release as the slide I'm referencing with the Pixel fill rates as in another article or another website ? "
Yes, the complete slide deck is posted here: https://www.anandtech.com/Gallery/Album/9488
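[Editor's note: as a quick sanity check of the 72 GPixels/s figure quoted above, assuming a roughly 1.5 GHz peak GPU clock for the top Adreno X1 bin (my assumption, not something stated in this thread), the arithmetic works out as follows.]
```c
// Back-of-the-envelope theoretical pixel fill rate for the Adreno X1 back-end.
// The 1.5 GHz peak GPU clock is an assumption for the top SKU; scale for other bins.
#include <stdio.h>

int main(void) {
    const int    render_backends = 6;    // render back-end units
    const int    pixels_per_unit = 8;    // pixels each unit emits per clock
    const double gpu_clock_ghz   = 1.5;  // assumed peak GPU clock

    const int    pixels_per_clock = render_backends * pixels_per_unit;  // 48
    const double gpixels_per_s    = pixels_per_clock * gpu_clock_ghz;   // 72

    printf("%d pixels/clock * %.2f GHz = %.0f GPixels/s theoretical fill rate\n",
           pixels_per_clock, gpu_clock_ghz, gpixels_per_s);
    return 0;
}
```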
FWhitTrampoline - Thursday, June 13, 2024 - link
For the sake of TechPowerUp's GPU database, which lists dGPU/iGPU render configurations as Shaders:TMUs:ROPs (plus tensor cores/matrix math units and RT units), please, technology press, adopt some common nomenclature so the hardware can be quantified as consistently as possible. Yes, there are different ways of stating it, but what about the online GPU and CPU information databases? Some standardized taxonomy for CPUs, GPUs, and other processor hardware is needed.
And I did find that slide in your link, but please, tech press, get together on that for CPU cores and iGPUs/dGPUs, or maybe see if the ACM has some glossary of terms for CPU core parts and GPUs as well. Without standardized nomenclature, processors (CPUs, GPUs, NPUs, and others) from different makers cannot be compared and contrasted at even a basic level.
I was very impressed that you referenced an article using Imagination Technologies' ray tracing hardware "levels" classification system, as that's a great, scholarly way to standardize the classification of the various hardware ray tracing implementations that have appeared since 2014, when the PowerVR Wizard GPU IP debuted the first hardware-based ray tracing implementation!
Jonny_H - Thursday, June 13, 2024 - link
The problem with demanding a single consistent comparison number is that hardware isn't consistently comparable between different architectures.
Even the RT "levels" you quoted have issues. For example, AMD GPUs *do* have some hardware acceleration for BVH processing, just a single node test rather than a hardware tree walker, so it's more than a level 2 but less than a level 3. There's also the issue that the levels imply a linear progression, that level 4 follows level 3, but there's nothing about ray coherency sorting that *requires* a hardware BVH tree walker.
Categorizing hardware is complex because hardware implementation details are complex. At some point every abstraction breaks down.
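[Editor's note: to make the level 2 vs. level 3 distinction above concrete, here is a rough sketch, with all names invented for illustration rather than taken from any vendor's ISA or driver, of traversal where only the per-node intersection test is hardware-assisted while the loop stays in shader software; a "level 3" design would move the whole while-loop into fixed-function hardware.]
```c
// Illustration only: software BVH traversal with a hardware-assisted node test.
#include <stddef.h>

typedef struct BvhNode {
    int                   is_leaf;
    int                   child_count;
    const struct BvhNode *children[4];
} BvhNode;

// Stand-in for a per-node box/triangle intersection instruction; a real GPU
// would do this step in dedicated hardware. Returns a hit mask over children.
static unsigned hw_intersect_node(const BvhNode *node, const float ray[6]) {
    (void)node; (void)ray;
    return 0xFu;   // dummy result: pretend every child is hit
}

// "Level 2"-style traversal: the loop and stack live in shader code, and only
// hw_intersect_node stands in for the fixed-function part.
int trace_software_walk(const BvhNode *root, const float ray[6]) {
    const BvhNode *stack[64];
    int sp = 0, leaf_hits = 0;
    stack[sp++] = root;
    while (sp > 0) {
        const BvhNode *n = stack[--sp];
        unsigned mask = hw_intersect_node(n, ray);
        for (int i = 0; i < n->child_count; i++) {
            if (!(mask & (1u << i)))
                continue;
            if (n->children[i]->is_leaf)
                leaf_hits++;
            else if (sp < 64)
                stack[sp++] = n->children[i];
        }
    }
    return leaf_hits;
}
```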
FWhitTrampoline - Thursday, June 13, 2024 - link
Yes, things are not directly comparable, but TPU's GPU database does list ROP counts, and those "ROPs" usually process 1 pixel per clock per "ROP", so the G-pixel fill rates can be estimated from that and used as a metric to compare different makers' GPU hardware on pixel throughput (theoretical maximum numbers).
And the RDNA2 GPU microarchitecture received no proper whitepaper at release, so folks interested in ray tracing on RDNA2 only got some minimal slides without any in-depth whitepaper explanation, other than a link to the RDNA1 whitepaper, which lacked any hardware ray tracing at all! As far as I can tell, a formal RDNA2 whitepaper was never released.
But I do value your input with regard to the levels number for RDNA2's ray tracing. Is there any reading material out there that you know of, not behind some NDA, that goes into a whitepaper-like deep dive on RDNA2's actual RT pipeline, maybe with some flow charts as well?
The hardest thing for me is paywalled publications and the difficulty of getting access to college libraries in the large urban areas of the NE US, where access is closed down to students only. So the Microprocessor Report and all the other trade journals that I used to have access to when I was in college are not accessible to me now.
GeoffreyA - Friday, June 14, 2024 - link
What about your city's municipal reference library, rather than the lending and university ones? Often, one can get access to different journals there.
FWhitTrampoline - Friday, June 14, 2024 - link
My public library is not subscribed to the usual academic trade and computing-science journals and lacks the funding. And even though some college libraries are federal depository libraries, they are not as open to non-students as the CFR/USC requires. So that makes things harder in the NE US. Now, if I lived on the West Coast of the US in some large urban area, things would be different, and even in Southern US cities, surprisingly.
GeoffreyA - Friday, June 14, 2024 - link
I understand. It's a sad state of affairs for information to be inaccessible.
mode_13h - Saturday, June 15, 2024 - link
> is there any reading material out there ... that goes into some whitepaper like deep dive
> into RDNA2's actual RT Pipeline and maybe with some flow charts as well.
Have you seen this?
https://chipsandcheese.com/2023/03/22/raytracing-o...
name99 - Saturday, June 15, 2024 - link
https://sci-hub.se
Soulkeeper - Friday, June 14, 2024 - link
To the few comments complaining about "marketing slides" or no benchmarks...
I appreciate this article; it's a pre-release technical overview of a new CPU design.
This kind of technical stuff is what made AnandTech great.
We are smart enough to spot the marketing, consume the author's input, and judge for ourselves (and should be patient enough to wait for benchmarks).
Keep up the good work.
AntonErtl - Friday, June 14, 2024 - link
Spectre is not at all an inherent consequence of speculative execution.
Speculative execution does not reveal information through architectural state (registers, memory), because CPU designers have been careful to reset the architectural state when detecting a branch misprediction. They have not done this for microarchitectural state, because microarchitecture is not architecturally visible. But microarchitectural state can be revealed through side channels, and that's Spectre.
So the first part of the Spectre fix is to treat microarchitectural state (e.g., loaded cache lines) like architectural state: buffer it in some place that is abandoned when the speculation turns out to be wrong, or promoted to longer-term microarchitectural state (e.g., a cache) when the instruction commits (look for papers about "invisible speculation" to see some ideas in that direction). There are also a few other side channels that can reveal information about speculatively processed data that need to be closed, but it's all doable without excessive slowdowns.
Intel and AMD were informed of Spectre 7 years ago. If they had started working on fixes at the time, they would have been done long ago. But apparently Intel and AMD decided that they don't want to invest in that, and instead promote software mitigations, which either have an extreme performance cost or require extreme development effort (and there is still the possibility that the developer missed one of the ways in which Spectre can be exploited), so most software does not go there. Apparently they think that their customers don't value Spectre immunity, and of course they love the myth that Spectre is inherent in speculation, because that means that few customers will ask them why they still have not fixed Spectre.
It's great that the Oryon team attacks the problem. I hope that they produced a proper fix; the term "mitigation" does not sound proper to me, but I'll have to learn more about what they did before I judge it. I hope there will be more information about that forthcoming.
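[Editor's note: for readers who want to see the mechanism described above made concrete, here is the classic Spectre v1 (bounds-check bypass) gadget as a minimal C sketch; the array names follow the convention of the original Spectre paper and are not tied to any particular CPU.]
```c
// Spectre v1 sketch: the branch is trained to predict "in bounds", then called
// with an out-of-bounds x. The architectural effects of the wrong path are
// rolled back, but the array2 cache line touched below is not, and a later
// timing probe over array2 reveals the speculatively read byte.
#include <stdint.h>
#include <stddef.h>

uint8_t array1[16];
size_t  array1_size = 16;
uint8_t array2[256 * 512];   // one cache line per possible secret byte value

void victim_function(size_t x) {
    if (x < array1_size) {                            // predicted taken even when false
        uint8_t secret = array1[x];                   // speculative out-of-bounds load
        volatile uint8_t tmp = array2[secret * 512];  // leaves a secret-dependent
        (void)tmp;                                    // footprint in the cache
    }
}
```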
skavi - Friday, June 14, 2024 - link
great article. it’s nice to see quality stuff like this.
nandnandnand - Friday, June 14, 2024 - link
"Officially, Qualcomm isn’t assigning any TDP ratings to these chip SKUs, as, in principle, any given SKU can be used across the entire spectrum of power levels."
A Qualcope, since they are differentiating the SKUs by max turbo clocks.
eastcoast_pete - Friday, June 14, 2024 - link
First, thanks Ryan! Glad to see you doing deep dives again.
Questions:
1. Anything known about if and how well the Snapdragon Extreme would pair up with a dGPU? The iGPU's performance is (apparently) in the same ballpark as the 780M and the Arc in Meteor Lake, but gaming or workstation use would require a dGPU like a 4080 mobile Ada or the pro variant. So, any word from Qualcomm on playing nice with dGPUs?
2. The elephant in the room on the ARM side is the unresolved legal dispute between Qualcomm and Arm over whether Qualcomm has the right to use the cores Nuvia developed (under an ALA) for ARM-based server CPUs in what are now client SoCs. Any news on that? Some writers have speculated that this uncertainty is one reason, maybe the key reason, for Microsoft to also encourage Nvidia and MediaTek to develop client SoCs based on stock Arm architecture. MS might be hedging its bets here, so they don't put all the work (and PR) into developing Windows-on-ARM and "AI" everywhere, only to find themselves with no ARM laptops available to customers if Arm prevails in court.
Ryan Smith - Friday, June 14, 2024 - link
1) That question isn't really being entertained right now, since the required software does not exist. If and when NVIDIA has an ARMv8 Windows driver set, then maybe we can get some answers.
2) The Arm vs. Qualcomm legal dispute is ongoing. The court case itself doesn't start until late this year. In the meantime, any negotiations between QC and Arm would be taking place in private. There's not really much to say until that case either reaches its conclusion - at which point Arm could ask for various forms of injunctive relief - or the two companies come to an out-of-court settlement.
eastcoast_pete - Saturday, June 15, 2024 - link
Thanks Ryan! Looking forward to the first tests.
continuum - Saturday, June 15, 2024 - link
Great article, can't wait til actual reviews next week. Thanks Ryan!
MooseMuffin - Saturday, June 15, 2024 - link
When should we expect to see Oryon cores in Android phones?
Ryan Smith - Sunday, June 16, 2024 - link
Snapdragon 8 Gen 4, late this year.
abufrejoval - Thursday, June 20, 2024 - link
So the embargoes have lifted...
...and the silence is deafening?
Is this Microsoft's Vision Pro moment?
eastcoast_pete - Monday, June 24, 2024 - link
Sure looks like it. Copilot and Recall are a potential privacy and confidentiality nightmare (MS has promised/threatened that it can remember pretty much anything), and the integration of "AI" into something that is actually useful remains elusive. I am a bit surprised that Qualcomm hasn't taken up some of that slack. For an example of useful AI-supported functionality: on at least some smartphones (Pixel, Samsung's Galaxy S24 models) one can get pretty good instant translation and transcription, and do so offline (!); no phoning home to the mothership required. But that uses Google's software, so it's not coming to a Microsoft device near you or me anytime soon. Something like that in Windows, but with demonstrably lower power consumption than doing it on the CPU or GPU, would show some value an NPU could bring. Absent that, the silence out of Redmond is indeed deafening.
James5mith - Saturday, June 22, 2024 - link
I know I'm late to this article, but what is BE 5.4? Is BE somehow the new shorthand for Bluetooth, shown as BT in all the other product definitions in the table on page 1?