meacupla - Wednesday, October 25, 2023 - link
nvidia has been making ARM based chips for years, so I can see them offering a design that would work well.

AMD... Have they even hired an expert ARM chip designer?
mode_13h - Wednesday, October 25, 2023 - link
> nvidia has been making ARM based chips for years

How long since their last Denver core? All of their current SoCs have cores designed by ARM, themselves. I had even assumed Nvidia dissolved its ARM core design group, especially after its plans to acquire ARM came to light.
meacupla - Wednesday, October 25, 2023 - link
It's been a while with Denver, but their latest one is most likely going to be an Orin in the Nintendo Switch 2.

iphonebestgamephone - Thursday, October 26, 2023 - link
Those are said to be a78 cores, yet another stock arm design.

mode_13h - Friday, October 27, 2023 - link
Denver? No way. Project Denver cores are VLIW, with a JIT compiler to translate the code. It's not easy to find details on them, but here's a choice sentence from a developer blog about Jetson TX2:

"TX2’s CPU Complex includes a dual-core 7-way superscalar NVIDIA Denver 2 for high single-thread performance with dynamic code optimization"
Source: https://developer.nvidia.com/blog/jetson-tx2-deliv...
iphonebestgamephone - Wednesday, November 1, 2023 - link
Orin.

Desierz - Wednesday, October 25, 2023 - link
If you ask Jim Keller, a CPU is a CPU. You don't need an 'expert ARM chip designer'. You just need an 'expert CPU designer'. AMD already has those.

meacupla - Wednesday, October 25, 2023 - link
Okay, well if that holds true, I hope AMD can slap together a strong candidate.

FWhitTrampoline - Wednesday, October 25, 2023 - link
Yes, but an ARM instruction decoder takes less die area and fewer transistors to implement than an x86 instruction decoder, so custom ARM chips like Apple's A14 Cyclone can fit 8 instruction decoders on the front end of the core, whereas Golden Cove has only 6, of which one is a complex x86 decoder and the remainder are simplified x86 decoders. Arm Holdings has a core coming with 10 instruction decoders, but that design lacks any micro-op cache to store already-decoded instructions.

iphonebestgamephone - Thursday, October 26, 2023 - link
The X4? It's out now.

FWhitTrampoline - Thursday, October 26, 2023 - link
Do you mean the ARM Cortex X4 is out now? In what shipping device, if so? Hopefully there will be some micro-benchmark work done to see how the X4 fares with no micro-op cache. The X4 can only issue 10 micro-ops, so all of those are coming from the decoders, and that may have latency and power-usage implications compared with pulling micro-ops from a micro-op cache. And that's because instruction decoders are power hungry relative to other parts of the core when enabled, and they'll have to be enabled most of the time on the Cortex X4.

I found an interesting research paper titled "I See Dead μops: Leaking Secrets via Intel/AMD Micro-Op Caches" that has a nice primer on AMD's and Intel's micro-op caches and how they work. Even though the paper is looking at SMT-related side-channel vulnerabilities, its discussion of x86 micro-op cache design is worth keeping a copy of for reference, as a deeper dive into the micro-op cache designs of the time for AMD (Zen) and Intel (Skylake). There are some nice micro-benchmark code samples in that research paper that may be useful to others as well.
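The role a micro-op cache plays in that trade-off can be sketched as a toy model. This is purely illustrative, and the LRU replacement policy is an assumption on my part; real micro-op caches index groups of decoded micro-ops by fetch address rather than caching single instructions like this.

```python
from collections import OrderedDict

def decode_count(trace, uop_cache_size=0):
    """Count decode operations for a trace of instruction addresses,
    given an LRU micro-op cache holding uop_cache_size entries."""
    cache = OrderedDict()
    decodes = 0
    for addr in trace:
        if addr in cache:
            cache.move_to_end(addr)        # hit: reuse already-decoded micro-ops
            continue
        decodes += 1                       # miss: the decoders do the work
        if uop_cache_size > 0:
            cache[addr] = True
            if len(cache) > uop_cache_size:
                cache.popitem(last=False)  # evict the least-recently-used entry
    return decodes

hot_loop = list(range(8)) * 1000  # an 8-instruction loop executed 1000 times
print(decode_count(hot_loop, uop_cache_size=0))   # 8000: every pass re-decodes
print(decode_count(hot_loop, uop_cache_size=64))  # 8: each instruction decoded once
```

The point of the sketch: for hot loops, a micro-op cache lets the power-hungry decoders sit idle almost all of the time, which is exactly what a core without one gives up.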
mode_13h - Friday, October 27, 2023 - link
> hopefully there will be some Micro-Benchmark work done there

No, I think Anandtech hasn't benchmarked any phones since Andrei left.
> The X4 can only issue 10 Micro-OPs
Only? That's like the widest in the biz. Also, ARM seems to call them Macro Ops.
> so that's all coming from the decoders and that's maybe going to have some latency and
> power usage considerations there instead of pulling in Micro-OPs from a Micro-op cache!
How dumb do you think ARM is? Do you think they'd really drop the mOP cache if doing so came at such costs? And what do you mean "all the way", when the decoders are right where the mOP cache would be?
> that's because Instruction Decoders are power hungry relative to other parts of the core
No, you must be thinking of a bad ISA, like x86. AArch64 decoders are cheap. ARM got rid of the mOP cache, since the X4 dropped support for AArch32.
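One way to see why AArch64 decoders are cheap: with fixed 4-byte instructions, every instruction boundary is known up front, so N decoders can work on N instructions in parallel, while variable-length x86 can't know where instruction N starts until it has length-decoded instructions 0 through N-1. A toy sketch of that difference (the x86 byte lengths below are made up for illustration):

```python
def boundaries_fixed(code_len, width=4):
    """Fixed-width ISA: instruction start offsets are independent of one
    another, so they can all be computed (and decoded) in parallel."""
    return list(range(0, code_len, width))

def boundaries_variable(lengths):
    """Variable-length ISA: each start offset depends on the lengths of
    all previous instructions, making boundary-finding inherently serial."""
    offsets, pos = [], 0
    for n in lengths:
        offsets.append(pos)
        pos += n
    return offsets

print(boundaries_fixed(16))               # [0, 4, 8, 12]
print(boundaries_variable([1, 3, 7, 2]))  # [0, 1, 4, 11]
```

x86 decoders mitigate this with predecode/length-marking hardware and micro-op caches, but that's exactly the extra area and power being discussed above.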
iphonebestgamephone - Wednesday, November 1, 2023 - link
Yes. In the hands of reviewers in the form of the SD 8G3 reference design.

iphonebestgamephone - Wednesday, November 1, 2023 - link
Now the xiaomi 14 is also out.

Kevin G - Thursday, October 26, 2023 - link
The other benefit ARM has isn't just simplified decoders that consume less power and less die area than their x86 counterparts. Post-decode, Apple has a massive instruction window for re-ordering operations. Again, the ARM benefit isn't in the feature itself but rather in the increased capacity given the same die area.

mode_13h - Friday, October 27, 2023 - link
The reordering window isn't really related to ISA, it's just an Apple flex. The X-series reorder buffers have the following sizes, according to wikichip: 224, 288, 320, and 384. So, the X4 has one not much more than half the size of the Firestorm cores in Apple's M1.

mode_13h - Friday, October 27, 2023 - link
By contrast, Golden Cove's reorder buffer is 512 entries, while Zen 4's is only 320.

FWhitTrampoline - Wednesday, October 25, 2023 - link
Edit: A14 Cyclone to A14 Firestorm!

jamesindevon - Wednesday, October 25, 2023 - link
I expect AMD to start with AMD graphics and Arm Ltd cores: much cheaper to put together, and a fairly attractive proposition. They've said for years they'd do it if an OEM asked.

It might not be too dissimilar to their pre-Zen position: good-enough CPU power with much better graphics and drivers than Qualcomm (now) or Intel (then).
iphonebestgamephone - Thursday, October 26, 2023 - link
So the exynos 2100

iphonebestgamephone - Thursday, October 26, 2023 - link
2200*

meacupla - Friday, October 27, 2023 - link
Doesn't that one have some severe GPU performance issues? IDK if it's the hardware or the driver, but what I've read is that it's "underwhelming" for a Radeon product.

iphonebestgamephone - Wednesday, November 1, 2023 - link
It has no OpenGL support, so it translates OpenGL to Vulkan with ANGLE, which caused some bugs in OpenGL Android games and the Citra emulator back then. Perf/efficiency was similar to the Qualcomm alternative.

Sahrin - Wednesday, October 25, 2023 - link
I mean...yes, they had publicly announced an ARM design in 2017. It was cancelled in favor of Zen, but they have an ARM Architecture license, and likely have had engineers working on it the whole time.

https://en.wikipedia.org/wiki/AMD_K12
Kevin G - Thursday, October 26, 2023 - link
It wasn't in favor of Zen as if it was one or the other; rather, the initial goal was for it to run in parallel with Zen. AMD was cash-strapped and didn't finish it. The real question is whether AMD restarted K12 or is simply licensing high-power ARM cores.

AMD still has an ARM license and uses ARM cores as part of the security processor inside the IO dies. Further ARM core usage was inherited with AMD's Xilinx and Pensando acquisitions. AMD's current design philosophy isn't to migrate the Xilinx or Pensando designs to x86 (which is something Intel attempted with Altera before spinning them off again). Rather, AMD's strategy is simply chiplets: separating the hard CPU cores from their FPGA SoCs and putting in a chiplet Infinity Fabric link. Conceptually, AMD would only have to provide an ARM-based CCD to carry forward with the status quo on the development side. Similarly, swapping the ARM-based CCD for a Zen x86 CCD would be straightforward. This cuts down on development resources and gives customers options.
Xajel - Thursday, October 26, 2023 - link
I've seen an interview with Jim Keller, I think (or some other higher-up who worked on AMD's CPU team, but I'm like 80% sure it was Jim).

He said that he told AMD to work on both x86 and ARM, and they started the project like that. He said something to the effect that the main difference between an x86 and an ARM CPU is just a few parts; both can share most of the design if they're designed similarly (that is, not with ARM optimized more for low power and higher efficiency).

AMD was convinced at first, but due to time, limited resources, and the fact that they absolutely had to succeed with the project, they decided to focus on the x86 design only. That was how the original Zen was born; there was supposedly an ARM brother design as well, but it was paused.

Technically, they can reopen that project. Of course it won't be a simple copy-and-paste, but they already have the hardest work done; they'd need to port the design to ARM first, then do some ARM-specific optimizations.
vip2 - Thursday, November 2, 2023 - link
AMD has already built ARM based processors starting in 2014: the AMD Opteron A1100 64-bit processor.

Dodozoid - Friday, November 10, 2023 - link
Building an ARM "processor" is something completely different from building an ARM "core".
Oh man, the next few years are going to be really interesting with a multi-way throwdown between Qualcomm, Nvidia, Apple and AMD all having ARM SoCs, hopefully Intel joins the fray too. Going to be way different than decades of comparing the x86 duo.

Threska - Wednesday, October 25, 2023 - link
Have Apple to thank for the incentive to move away from that duopoly.mode_13h - Wednesday, October 25, 2023 - link
Intel could go straight to RISC-V. That's where we'll probably all end up, eventually.

tipoo - Wednesday, October 25, 2023 - link
There are so many toolchains and things that are on ARM but not RISC-V; it may become the ARM-creeping-up-on-x86 version of itself in the future, but that may be many years off.

Intel would be well served to have ARM designs in the pipeline today.
Yojimbo - Wednesday, October 25, 2023 - link
Wouldn't they need a Windows for RISC-V first?

FWhitTrampoline - Wednesday, October 25, 2023 - link
Why Windows, and not Linux/Linux-kernel-based OS options? Linux has been used extensively on ARM devices for ages, whereas any Windows-on-ARM OS is going to be very bloated and performance-limited!
Well, much of the world is using Windows and it's not going to change any time soon.mode_13h - Friday, October 27, 2023 - link
Android is adding support for RISC-V.PeachNCream - Wednesday, October 25, 2023 - link
Not holding my breath over this, but I do believe we are in need of a pathway out of x86 given how inefficient it is from a power perspective. ARM may offer the path, but it certainly isn't a magic bullet. NV is in a similar position as Intel in regards to obnoxious power demand though if not a fair bit worse these days.

mode_13h - Wednesday, October 25, 2023 - link
> NV is in a similar position as Intel in regards to obnoxious power demand though
> if not a fair bit worse these days.
I was going to disagree... but, if we look at two data points, their power scaling seems comparable. I know it's a more complex picture than that, but the following data shows the RTX 4090 delivering 77.0% as much 4K RT Ultra performance at 50% power, while the i9-13900K delivers 78.1% as much Cinebench R23 multithreaded performance when limited to 125 W.
* https://www.tomshardware.com/news/improving-nvidia...
* https://www.anandtech.com/show/17641/lighter-touch...
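Those two data points reduce to a common metric: the fraction of performance retained divided by the fraction of power used. A quick check (the 253 W figure used as the 13900K's 100% reference is its stock PL2, an assumption on my part rather than something stated in the comment above):

```python
def scaling_efficiency(perf_fraction, power_fraction):
    """Performance-per-watt multiplier gained by power-limiting a part:
    fraction of full performance retained / fraction of full power drawn."""
    return perf_fraction / power_fraction

# RTX 4090: 77.0% of its 4K RT Ultra performance at 50% of stock power.
gpu = scaling_efficiency(0.770, 0.50)

# i9-13900K: 78.1% of Cinebench MT performance at 125 W; the stock 253 W
# PL2 is assumed here as the 100% power reference.
cpu = scaling_efficiency(0.781, 125 / 253)

print(f"RTX 4090 perf/W multiplier:  {gpu:.2f}x")   # ~1.54x
print(f"i9-13900K perf/W multiplier: {cpu:.2f}x")   # ~1.58x
```

Under that assumption both parts gain roughly the same perf/W from being power-limited, which matches the "their power scaling seems comparable" reading.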
PeachNCream - Wednesday, October 25, 2023 - link
That's less of a difference than I thought it'd be actually. At any rate, most of my complaint with PC hardware has been solving the performance problem (acknowledged - if Intel or NV didn't do it they would lose the performance "crown" to someone else that did) by tossing more power at it. ARM-based solutions may move the goalposts to a different place on the proverbial field for a time at least until the various competitors resort to using power to win again. Eh, I'm kidding myself, it's probably already happening behind closed doors.

domboy - Wednesday, October 25, 2023 - link
I really hope this is true. While Qualcomm is ok in the current Surface Pro line, competition would be great. Plus, nVidia has been doing Windows GPU drivers for a long time, so I'd expect theirs to be better right off the bat.

PeachNCream - Wednesday, October 25, 2023 - link
Windows on ARM itself is kind of a flaky experience, so that needs to get sorted as well.

Blastdoor - Friday, October 27, 2023 - link
How did x86 go extinct? Two ways: gradually, then suddenly.

We might be nearing the end of the gradually part.
mode_13h - Friday, October 27, 2023 - link
Yeah, there will definitely be an inflection point. If you're not paying attention, the transition will seem to come out of nowhere.

Along with their E-cores, I think Intel's APX is their plan to string x86 along for a while longer. It's just delaying the inevitable, though. Maybe their hope is they can keep x86 viable until the RISC-V ecosystem is ready for it to be a viable cloud alternative. Perhaps what Intel is *really* trying to avoid is specifically getting in bed with ARM.