67 Comments
ChaosFenix - Tuesday, November 6, 2018 - link
So is it just me, or does that look like a GPU in the middle there? It looks too big to be HBM2, and you can clearly see the placement of the eight 8-core CCXs on the side. Is that a massive GPU in the middle of that thing?
kalgriffen - Tuesday, November 6, 2018 - link
It's an I/O chip for the processor dies.
SaturnusDK - Tuesday, November 6, 2018 - link
It even says so in the two-paragraph article. Short attention span or what?
ChaosFenix - Tuesday, November 6, 2018 - link
Oh, I didn't realize that die was dedicated to IO. Interesting to break it off like that, then.
phoenix_rizzen - Tuesday, November 6, 2018 - link
Read the linked "chiplet design approach" article. :)
That's the I/O hub, which comprises the memory controllers, the Infinity Fabric, and so on.
The CPU chiplets are fabbed on a 7nm process; the I/O hub is fabbed on a 14nm process.
Arbie - Tuesday, November 6, 2018 - link
Can't you tell from the photo? The forehead shows it's an AI module.
SaturnusDK - Tuesday, November 6, 2018 - link
Intel gotta turn up the AC now as the sweating under the brow across the Bayshore Freeway is going to be relentless in the coming years.
philehidiot - Saturday, November 10, 2018 - link
So she's got AMD properly back into the game, got an AMD GPU into my system for the first time in ages and, as the proverbial cherry, she's caning Intel's electricity bill with increased AC requirements.
That is one formidable woman.
outsideloop - Tuesday, November 6, 2018 - link
So, is Ryzen 2 going to have one or two of those little 8-core chiplets? Plus the I/O chip? Or maybe a smaller I/O chip than that? :)
SaturnusDK - Tuesday, November 6, 2018 - link
The Ryzen 3000 series doesn't need 8 DDR4 channels or 8 dual Infinity Fabric controllers, so yes, the I/O chip for that will be about 1/4 the size.
Tuna-Fish - Tuesday, November 6, 2018 - link
The memory controllers and other IO take up the perimeter of the chip. An IO chip with 2 IF controllers for 2 core chiplets and 2 channels of DDR could be 1/16th of the size (as a die grows, the perimeter scales linearly with the side length while the area scales with its square).
Of course, most of the area of that IO chip will actually be cache. It will be interesting to see whether the Ryzen IO chip will have an L4 cache and, if it does, how large it will be.
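A quick way to see where a figure like 1/16 could come from, assuming a roughly square die whose edges are fully occupied by interface PHYs: the interface count scales with the side length s while the die area scales with s^2, so quartering the perimeter-bound I/O would ideally allow

$$\frac{A_{\text{small IO}}}{A_{\text{EPYC IO}}} \approx \left(\frac{s/4}{s}\right)^{2} = \frac{1}{16},$$

though any non-perimeter blocks (such as cache) would keep the real shrink short of that.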
cheshirster - Wednesday, November 7, 2018 - link
Cache is not planned on the IO chip.
eva02langley - Tuesday, November 6, 2018 - link
If AMD can double the core count on EPYC, they can do the same for the rest of their CPU range.
Hul8 - Tuesday, November 6, 2018 - link
More than 8 cores doesn't make that much sense for mainstream, which will still be limited to dual-channel memory and 24 PCIe lanes. AMD would rather sell you Threadripper if you want more cores.
Where I can see the benefit is cramming one 8-core chiplet (maybe with 6 cores enabled), a larg(ish) GPU chiplet, an I/O die - and maybe a stack of HBM - onto a single package and making thin-and-light laptops with good enough performance for mainstream gaming. Separating the CPU and GPU will allow AMD to create a lot more APU variants from essentially the same resources.
SaturnusDK - Tuesday, November 6, 2018 - link
I think it's more than likely they'll do 16 cores for the next-gen R7s, but without SMT, or with hybrid SMT where the two fastest cores are left without SMT and the two slowest run with SMT, for 16c/16t or 16c/24t.
And then naturally a high-graphics-performance variant with an HBM2 chiplet, a Vega Mobile chiplet, one 8-core chiplet, and an IO controller.
Martin_Schou - Tuesday, November 6, 2018 - link
Well ... AMD could light a fire under Intel's behind if they wanted to. Take a slightly defective I/O die with only four RAM channels and 64 CPU PCIe 4.0 lanes, hook up four 4-core CCXs and a Vega 20 GPU, and call it their high-end desktop processor (below Threadripper). Limit the chipset to handling only 4 RAM slots to avoid eating into older Threadrippers too much. Sadly, though, I suspect you're right.
But it's fun to imagine. Then again, I suppose you could do something like this to make a high-end mobile CPU instead? Maybe you could get a MacBook with 6+ cores that doesn't end up melting itself?
Hul8 - Tuesday, November 6, 2018 - link
AMD doesn't have a socket with 4 memory channels, 64 PCIe lanes and video connectivity.
The whole idea behind the AM4/TR4-plus-chipsets divide is to cut costs; AM4 provides enough connectivity for most people and TR4 serves the rest.
The intersection of people interested in high performance but not discrete graphics is so small it really doesn't pay to create a special socket for just them. And motherboard vendors wouldn't adopt it anyway, because they couldn't expect good sales.
Hul8 - Tuesday, November 6, 2018 - link
It doesn't mean that such a CPU could never see the light of day, but it would be soldered onto thin-and-light laptops and maybe AIOs.
Namisecond - Tuesday, November 6, 2018 - link
64 PCIe 4.0 lanes, a Vega 20 GPU and 4 memory slots in a sealed thin-and-light laptop? Especially a Mac?
Sounds more like a proprietary workstation or server board.
abufrejoval - Wednesday, November 7, 2018 - link
I think they call them gaming consoles.
Xajel - Wednesday, November 7, 2018 - link
The TR4 socket is just an SP3 socket physically, but electrically it's different.
AMD used it because Threadripper was a last-minute idea that Papermaster/Lisa liked and gave the green light for. They weren't sure whether it would be enough of a success to justify putting resources into a new platform, nor did they have time to build one at the start.
So they just reused the SP3 socket, made a few changes to the SP3 platform to make it consumer-friendly, and voila.
I expect it to last until the next major socket change (DDR5); then they will have a new platform and socket dedicated to HEDT, using a much smaller socket, which should also bring benefits like lower cost and better compatibility with current/popular cooling solutions due to its smaller size.
liquidaim - Tuesday, November 6, 2018 - link
You make an intriguing observation. Given that the MI60 card has Infinity Fabric allowing 200 GB/s transfers between GPUs, I can imagine these geniuses using that fabric to do exactly what you said for a 24- to 36-core GPU with good compression etc. and high-speed DDR4.
It looks as though the I/O block is ~400 mm², which seems freakishly large for just some I/O. If it also has some cache on it, it would be the perfect pairing with a GPU...
Hul8 - Tuesday, November 6, 2018 - link
Well, it has to have 8 Infinity Fabric interfaces for the core chiplets, plus controllers for 8 memory channels, plus 128 PCIe lanes' worth of PCIe4/IF2 connectivity, plus the SATA and USB ports.
Compare 8/8/128 to 2/2/24 and you quickly see that it makes absolutely no sense to use the same I/O die for mainstream Ryzen as for EPYC. If AMD even uses such a distributed design for regular Ryzen, its I/O die will be much smaller.
SaturnusDK - Wednesday, November 7, 2018 - link
The IO block is probably larger than it specifically needs to be in order to make a modular design where you can take the EPYC2 IO controller, cut it exactly in half and have 2 Threadripper 3rd gen IO controllers, or cut it in 4 and have 4 Ryzen 3rd gen IO controllers.
nandnandnand - Tuesday, November 6, 2018 - link
At first Ryzen was 4-8 cores, Threadripper was 8-16 cores. Then Threadripper moved to 12-32 cores, which is a broad range. And that was Zen+, before Zen 2 and the new "chiplets".
If 16-core Ryzen doesn't happen immediately, then I would be surprised if they didn't at least put out a 10- or 12-core Ryzen, maybe holding off 16 cores for a little later.
abufrejoval - Wednesday, November 7, 2018 - link
That's what I have been dreaming about since the first Zen launch!
Santoval - Tuesday, November 6, 2018 - link
A 16-core Ryzen 2 would be highly memory starved with only two channels of DRAM. AMD could do this but it would be unwise. On the other hand, they *did* double the cores of Threadripper, and 32 cores with quad-channel memory is the same (in DRAM channels per core) as 16 cores with dual-channel memory.
It appears that they moved to an 8-core CCX for Epyc, so I wonder if they'll do the same for Ryzen 2 (with the option of having 8 cores with one CCX and 10 to 16 cores with two of them) or design an additional 6-core CCX for 12 cores max.
CodingRays - Wednesday, November 7, 2018 - link
Why would they have to create an additional 6-core chiplet? I'd expect them to use the chiplets with one or more defective cores. That way they could easily cover the entire 4- to 12-core range with one design. Also, I hope they are going to ditch the 4-core CPU. That would put a bit more pressure on the industry.
outsideloop - Tuesday, November 6, 2018 - link
And what is the 7nm Picasso APU gonna look like? Integrate the Vega (nay Navi) on the I/O chip, I would assume? But that would mean the GPU would be on 14/12nm? Hmmmm.
SaturnusDK - Wednesday, November 7, 2018 - link
The recently revealed Vega Mobile (or Radeon Pro Vega 16/20, to be exact) for the MacBook Pro lineup is on 14nm produced by TSMC, so it would be logical to use that as the base for the APU part, either separately or integrated in the IO controllers. Most likely the former.
shabby - Tuesday, November 6, 2018 - link
Those 8-core dies look so small and inexpensive; how can Intel compete with those massive 28-core 14nm dies?
FriendlyUser - Tuesday, November 6, 2018 - link
That is exactly the point.
rahvin - Wednesday, November 7, 2018 - link
Those little chiplets probably cost a few dollars each; that small, they will have fantastic yields. Add in the slightly bigger IO chip on the older process and likely low power use. I do have to wonder what assembling the package costs, though. AMD made a good choice; Rome may pull them into some really good cloud contracts with a TCO that's much lower than Intel's.
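A rough sketch of that yield argument, using a simple Poisson defect model; the die areas and the defect density below are illustrative assumptions, not AMD or TSMC figures:

    import math

    def poisson_yield(area_mm2: float, d0_per_cm2: float) -> float:
        """Fraction of dice with zero killer defects: Y = exp(-A * D0)."""
        return math.exp(-(area_mm2 / 100.0) * d0_per_cm2)

    D0 = 0.2  # assumed defect density, defects per cm^2

    print(f"~75 mm^2 chiplet yield:         {poisson_yield(75, D0):.0%}")   # ~86%
    print(f"~700 mm^2 monolithic die yield: {poisson_yield(700, D0):.0%}")  # ~25%

And a chiplet with a dead core can still be salvaged as a lower-core-count part, so the effective cost per usable chiplet drops further.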
edzieba - Wednesday, November 7, 2018 - link
On the flipside, that means instead of one die that needs to pass binning, you need 5 dies. If - like with HBM-on-substrate devices - that requires binning /after/ packaging, you have 5 chances for a defect to fail a package, and a 1-in-5 chance of that dead die killing the entire CPU assembly.
And post-packaging binning appears to be what is used for current Threadripper (for example), so there's a good chance that if individual dual-CCX dies cannot be independently binned pre-packaging, a bare northbridge-less CCX die won't be either.
Topweasel - Wednesday, November 7, 2018 - link
With enough testing and control on wafer production you can be confident of the capabilities of each die in each location.
When AMD first started disabling cores after testing, with Phenom, Intel made a statement along the lines of: if we tested a die and it had any issues, we would trash it. Part of the reason was that they were much more limited - they didn't really have a product for disabled cores on a 2c Core 2 CPU. The other reason was that they had recently made a big push to make sure all fabs ran every manufacturing step on the same process exactly the same way. They had gotten everything down to a perfect science and could pre-bin chips before testing, so chips wouldn't "fail" after being lasered; when they did, those chips could be trashed. But they were basically recognizing a chip's limits just like AMD was - only before testing instead of after.
QC at other fabs has gone waaaay up. It has to, to get this small. My guess is they know the bad chips before they do the packaging.
psychobriggsy - Friday, November 9, 2018 - link
psychobriggsy - Friday, November 9, 2018 - link
On the flip-flipside, you can bin chiplets by speed as usual, and then match them in assembly.
This improves the clock speed distribution of the final product. Instead of needing a lucky 64-core die where all 64 cores can run at speed X, you only need eight 8-core dies that bin at that speed (which is far more common). The chances of all 64 cores on one die reaching X are low, but the chances of 8 cores on a die reaching X are fairly reasonable. There's a good video on YouTube by AdoredTV that explains this; it has graphs and everything :)
I would also presume that assembly yields are very high - it's hardly as if MCM technology is new.
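A back-of-the-envelope version of that binning argument; p is an assumed per-core probability of reaching some target clock X, purely for illustration:

    p = 0.95              # assumed chance that one core bins at speed X
    p_chiplet = p ** 8    # all 8 cores on a chiplet reach X  -> ~66%
    p_monolith = p ** 64  # all 64 cores on one big die reach X -> ~3.7%

    print(f"good chiplet: {p_chiplet:.1%}, good monolithic die: {p_monolith:.1%}")
    # With chiplets you just collect 8 good binners from the pile;
    # a monolithic design needs the rare die where all 64 cores cooperate.

So a chiplet-based product can ship far more parts at the top speed bin than a monolithic die with the same core count could.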
abufrejoval - Wednesday, November 7, 2018 - link
Physical size can be misleading: these are EUV chips, and there are wonderful articles on this site on just how crazy difficult and expensive that technology is... Use the search box, and make sure to collect your jaw before you get up after reading that.
psychobriggsy - Friday, November 9, 2018 - link
These are not EUV chips; that's next year with TSMC's N7+ process. This is the plain N7 process.
jospoortvliet - Sunday, November 11, 2018 - link
Plain still means 4x patterning, itself very expensive and low-yield. Samsung even claims to do EUV on 7nm to SAVE on cost.
colinstu - Tuesday, November 6, 2018 - link
something something burning Rome
DigitalFreak - Tuesday, November 6, 2018 - link
I'm assuming that's 8 chiplets with 8 cores per, and not 4 x 16?
yeeeeman - Tuesday, November 6, 2018 - link
Yep, looks like it.
yeeeeman - Tuesday, November 6, 2018 - link
This is great stuff from AMD. I love the fact that they like to take risks and try new stuff all the time. They always try to find a better/cheaper solution to do what others (read Intel) do. If well executed, this CPU has the chance to completely leave the competition in the dust.
0ldman79 - Tuesday, November 6, 2018 - link
Looking forward to it.
Betcha we're going to have a 16-core 3700 or something along those lines on AM4.
That'll work; by the time I'm ready to upgrade, 16 cores should be the high end of mainstream. I'm good with that.
Dug - Tuesday, November 6, 2018 - link
Too bad Windows Server is now licensed per core instead of per socket. For good ol' 2012 R2 this CPU would make sense to upgrade to.
Dug - Tuesday, November 6, 2018 - link
Looks like $24,640 per CPU.
John_M - Tuesday, November 6, 2018 - link
Then the sensible thing would be to use Linux instead.
C@mM! - Tuesday, November 6, 2018 - link
Oh my dear god, my server licensing costs since we moved to Server 2016 \ EPYC.
jospoortvliet - Wednesday, November 7, 2018 - link
Tell MS you're moving to Linux and see what discount you get. If it's not enough, move and spend the €£¥ on a party ;-)
jjj - Tuesday, November 6, 2018 - link
Are the memory channel count and the number of PCIe lanes your assumption? It's much more likely that the old chipset and mobos do not take full advantage of what Rome has to offer.
Plus, since they have the IO die, they can go IO crazy with this one, and it would be a pity to bottleneck 64 cores with a lack of IO.
SaturnusDK - Wednesday, November 7, 2018 - link
What chipset? EPYC never had a chipset, nor will EPYC2.
twtech - Tuesday, November 6, 2018 - link
I wonder what clock rate this will run at with all cores loaded. And whether they'll have a high-clockspeed 32-core variant designed for workstation use.
stephenbrooks - Tuesday, November 6, 2018 - link
Photo #2 makes it look like the I/O chip is made by Apple... and who's that being reflected in photo #1?
CheapSushi - Tuesday, November 6, 2018 - link
This looks awesome. I'm digging the IBM-style package. They took a big chance on this approach and it's working out well.
Ej24 - Tuesday, November 6, 2018 - link
128 PCIe 4.0 lanes. Holy crap. That's insane. That's 256 GB/s of I/O. That's huge. Is there a bottleneck on that? As in, if you populated an EPYC system with ~60 theoretical PCIe 4.0 x2 NVMe drives and tried to access them simultaneously, would you have full throughput?
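Rough numbers for that scenario, assuming ~1.97 GB/s of usable bandwidth per PCIe 4.0 lane per direction and a purely hypothetical population of x2 NVMe drives:

    LANE_GBPS = 1.97            # approx. usable GB/s per PCIe 4.0 lane, per direction
    total_lanes = 128
    drives, lanes_per_drive = 60, 2

    drive_bw = drives * lanes_per_drive * LANE_GBPS  # ~236 GB/s if every drive runs flat out
    socket_bw = total_lanes * LANE_GBPS              # ~252 GB/s theoretical lane budget

    print(f"drives want ~{drive_bw:.0f} GB/s out of a ~{socket_bw:.0f} GB/s lane budget")

The lane count itself isn't the limit; whether you could actually sustain that depends on the I/O die's internal fabric and on memory bandwidth, which AMD hasn't detailed here.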
SaturnusDK - Wednesday, November 7, 2018 - link
The die shot confirms the old saying: "Rome wasn't built on one die".
jospoortvliet - Wednesday, November 7, 2018 - link
Might very well be the background of the name...
Zizy - Wednesday, November 7, 2018 - link
I find something strange in the slide deck - 4TB max for the single socket??? Huhm? How does that work, did AMD find higher capacity sticks or what?
Ryan Smith - Wednesday, November 7, 2018 - link
Yes. 16Gbit DRAM chips are now in the pipeline.
https://www.anandtech.com/show/13500/samsung-shows...
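For reference, the arithmetic behind the 4 TB figure, assuming the usual EPYC layout of 8 channels with 2 DIMMs per channel and 256 GB LRDIMMs built from those 16 Gbit (2 GB) dies:

$$8~\text{channels} \times 2~\tfrac{\text{DIMMs}}{\text{channel}} \times 256~\tfrac{\text{GB}}{\text{DIMM}} = 4096~\text{GB} = 4~\text{TB}$$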
Rοb - Wednesday, November 7, 2018 - link
So if I want a 16-24 core ‘Rome’ processor it will be low cost, due to 5 or 6 dead cores on each chiplet ... ?
It might be an idea to produce 2- and 4-chiplet versions, with 6 to 8 working cores per chiplet - workstation laptop.
I'd rather have 32 cores at double the clocks, running under 200W - more useful than 64 cores at half the speed (not in _every_ case, I know!).
abufrejoval - Wednesday, November 7, 2018 - link
The way these modern CPUs work, that's what you get automatically if you *don't* use every other core: faster clock rates on the remaining ones.
Intel doesn't even sell these high-clock/low-core chips any cheaper, so here you get the same behavior.
Just hope you're not on Oracle-style licensing... But perhaps BIOS deactivation would help there.
Kazu-kun - Wednesday, November 7, 2018 - link
"So if I want a 16-24 core ‘Rome’ processor it will be low cost, due to 5 or 6 dead cores on each chiplet ... ?"No, they would just use less chiplets. For 32 cores, they would use 4 chiplets. For 16 cores 2 chiplets. And so on.
The reason they couldn't do this with the first generation Epyc is because the IO and memory controller were on the chiplets, so in order to keep the full IO and memory bandwidth they needed keep all the chiplets and disable cores instead. This isn't a problem anymore thanks to moving all the uncore to the IO die. Now instead of disabling cores, they can just put less chiplets.
jospoortvliet - Monday, November 12, 2018 - link
"I'd rather have 32 cores at double the clocks, running under 200W - more useful than 64 cores at half the speed (not in _every_ case, I know!)."That is faster in nearly every case but 32 powerfull cores at 8ghz will not be possible under 2000 watt anytime soon, let alone 200...
jospoortvliet - Monday, November 12, 2018 - link
(Obviously right now it isn't possible at all, period. 7nm might allow 5-6 GHz, at crazy power draw. Maybe. Clock speed doubling is just not possible anymore; if it were, it would be done - much nicer than doubling cores, as that costs far more money in terms of die space!)
Samus - Thursday, November 8, 2018 - link
128 PCIe 4.0 lanes per SOCKET.
Your move, Intel.
RogerAndOut - Thursday, November 8, 2018 - link
Well, 128 PCIe 4.0 lanes per SYSTEM, as both 1-processor and 2-processor systems will have 128 PCIe lanes free. On a 2-processor system, 64 lanes from each processor are used as the interconnect.
monglerbongler - Thursday, January 24, 2019 - link
The question is whether this will support persistent memory/storage (e.g. Optane), since that is going to be significant in the near-term evolution of server/data center/cluster hardware design.
*especially* for computational clusters.
No hard drives. Period. Just memory and persistent storage, with maybe a storage server somewhere back in the corner of the room to store the output data sets of whatever scientific or engineering computation is being performed.