76 Comments

  • iwod - Tuesday, November 6, 2018 - link

    It is interesting that we have come full circle: DDR memory used to hang off the chipset (the northbridge), then we moved the memory controller onto the CPU, and now we are back again.
  • ravyne - Tuesday, November 6, 2018 - link

    Kind of. It's still in the CPU socket, so the bus width can be a lot larger, and the latency lower, than olden-style northbridges could ever dream of. The advantages of on-die memory controllers can start to be swamped when those dies start needing to communicate with each other to access non-local memory and IO.

    Ideas don't always fade into history; sometimes echoes of an old idea make sense again in light of other engineering decisions made in modern times. Back when memory controllers moved on-die, it made sense because off-chip controllers were a bottleneck. Now, the scalability of IO interfaces eats die area that could be better spent on compute.
  • mapesdhs - Tuesday, November 6, 2018 - link

    I see echoes of SGI's old Crossbar chip:

    http://www.sgidepot.co.uk/chipsdiagram.gif

    As Mashey used to say, it's all about the bandwidth. 8)
  • Alexvrb - Tuesday, November 6, 2018 - link

    El blasto from el pasto.
  • lefty2 - Wednesday, November 7, 2018 - link

    Most software is NUMA aware, so there is very little non-local memory access in reality
  • NICK1358 - Tuesday, November 13, 2018 - link

    And what about compatibility with "older" chipsets such as the X470? Will it support the new processors?
  • PeachNCream - Tuesday, November 6, 2018 - link

    The Infinity Fabric...doesn't it already account for a healthy chunk of the total power consumption of Zen-based products? Leaving it on a 14nm node and migrating the memory interface to it while putting the CPUs on 7nm may further increase the ratio of power consumed by the non-CPU portion of the processor package. I get that the decision is probably made in light of a lot of factors beyond just power consumption, but it looks like we're losing some efficiency gains offered by newer and smaller nodes and that speaks to how much closer we've gotten to the physical limits of the materials we currently use to produce processors.
  • Alistair - Tuesday, November 6, 2018 - link

    article:

    physical interfaces (such as DRAM and Infinity Fabric) do not scale that well with shrinks of process technology
  • PeachNCream - Tuesday, November 6, 2018 - link

    Scaling could mean other things besides just a reduction in power. I read that as a possible problem with reaching good yields (say throwing out a die with good CPUs because of a faulty DRAM controller or some problem with the IF - which might explain why recently released TR2s have only two dies with DRAM controllers enabled as the other two could be defective parts) or ramping up IF and DRAM controller clock speed. Power is, of course, a concern as well in the scaling discussion but we lack enough information in the article to, with such certainty, rule out other factors.
  • SleepyFE - Wednesday, November 7, 2018 - link

    When you scale things down you use less space and send fewer electrons through. That's how you get power savings. When sending signals through a comm line you have to be careful about crosstalk. So you either can't send more electrons (which would ensure a good read), or you need to space the comm lines further apart (even if they are smaller), negating the node shrink. With newer SSD controllers, QLC needs to distinguish 16 voltage states, so the controller needs to be able to tell apart very small differences in voltage. The controller also has a lot more processing power (compared to older ones) and can use better error correction. Error correction in a CPU would take up compute resources.

    In conclusion: If it works, don't break it.
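    A rough sketch of SleepyFE's point about sensing margins. The 3.2 V read window below is an assumed, illustrative number (real NAND threshold windows vary by generation); the point is just that the margin between adjacent states shrinks geometrically with bits per cell:

    ```python
    # Rough illustration: per-state voltage margin shrinks as bits per cell grow.
    # WINDOW_V is an assumed, illustrative figure, not a datasheet value.
    WINDOW_V = 3.2  # assumed usable threshold-voltage window, in volts

    def margin_per_state(bits_per_cell: int) -> float:
        states = 2 ** bits_per_cell   # SLC=2, MLC=4, TLC=8, QLC=16 states
        return WINDOW_V / states      # rough volts separating adjacent states

    for name, bits in [("SLC", 1), ("MLC", 2), ("TLC", 3), ("QLC", 4)]:
        print(f"{name}: {2**bits} states, ~{margin_per_state(bits)*1000:.0f} mV per state")
    ```

    QLC ends up with roughly an eighth of the per-state margin SLC enjoys, which is why it leans so heavily on stronger error correction.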
  • Thorgil - Thursday, November 8, 2018 - link

    Plus, let's not forget that AMD have an agreement with GloFo. For every wafer they get from TSMC, they pay a commission to GloFo. With the I/O die on GloFo's 14 nm process, there's also a financial factor that comes into play: why pay even more (more than just the cost of the wafer/process) for TSMC's 7 nm when GloFo's 14 nm will do just as well? It's a very logical decision from a company standpoint.
  • SleepyFE - Thursday, November 8, 2018 - link

    I think they are no longer as tight. GloFo gave up on chasing super new tech and is focusing on being a more general chip maker, leaving AMD to use TSMC more anyway.
    Now that I read my statement, I realize you are right. Whatever is left of the contract that requires AMD to use GloFo will be used up on 14nm, likely with a discount.
  • levizx - Saturday, November 10, 2018 - link

    They only pay for "certain" wafers, not all. We don't know the details, but AMD would be extremely stupid to agree to pay GloFo a commission on outsourced wafers GloFo can't even produce, especially now that GloFo doesn't even have a 7nm process anymore.
    I'd say AMD only has to pay GloFo if they source 16/14/12/11/10 nm wafers outside GloFo.
  • lefty2 - Wednesday, November 7, 2018 - link

    I wonder if that's true, because there was no problem scaling from 28nm to 14nm.
  • Alistair - Tuesday, November 6, 2018 - link

    so the answer to your comment is no, it doesn't make much difference
  • axfelix - Tuesday, November 6, 2018 - link

    A lot of the reporting on AMD's recent comeback has effectively omitted the fact that they're still really bad at low TDPs (and frankly even Intel has been getting crushed here by ARM chips, specifically Apple's, for the past couple years), so as long as they aren't trying to compete outside of desktops and servers, this isn't a huge concern.
  • PeachNCream - Tuesday, November 6, 2018 - link

    Yup, that's what bothers me about the change. Anandtech has done power measurements and the IF in TR2 processors seems to account for a good 50+W TDP of the total chip package. Not moving IF to 7nm means the 50+W TDP will remain constant (adding DRAM controllers to the mix as well so the IF "die" will eat up more TDP) while the new chiplets containing the CPUs draw less power for a given amount of work thanks to the node shrink.
  • namechamps - Tuesday, November 6, 2018 - link

    AMD can always shrink the IO hub to 7nm later, either as a mid-cycle refresh or as part of Zen 3. Right now yields on 7nm are going to be riskier, and defects in the IO hub portion would axe the processor, so put that on the "safe" high-yield 14nm process. This also makes the cores tiny (well under 100 mm2), so even with higher defect rates on 7nm AMD should be able to crank them out by the truckload.
  • Lolimaster - Wednesday, November 7, 2018 - link

    It's gonna be funny to see prices of Ryzen 3000 next year; they could easily sell 8 cores for $200-250 now.
  • Lolimaster - Wednesday, November 7, 2018 - link

    Reminds me of the Xbox 360 SoC after the 1st revision: updated in parts.
  • drunkenmaster - Wednesday, November 7, 2018 - link

    I/O goes both ways: there are two ends to a connection and two signals being made. Also, that assumes no improvements to the process, no optimisations to the IF controller design, and no optimisations in how communication is routed and how far the signals travel, all of which individually change the power usage of the I/O. Even PCIe 4 should improve efficiency and thus power usage for IF.

    More importantly maybe, for a given number of wafer starts, if you can move over half the die to 14nm, then even if it increases power, it can double the number of chips you can produce. Seeing as the coming chip looks like it will handily trash anything Intel can produce in servers, with vastly higher performance per watt, and that will be true through at least early 2020 (minimum), AMD's biggest issue will in fact be volume, not power.

    This will allow AMD both to produce more EPYCs now and, as capacity increases and they get more wafer starts, to get more 7nm GPUs and desktop CPUs/APUs done on 7nm sooner if those wafers aren't required for EPYC.

    Not to mention that the massively increased yields from splitting off the cores from uncore and the i/o once again increases potential chips being produced.
  • GreenReaper - Wednesday, November 7, 2018 - link

    On the plus side, if it's in the center it'll presumably get the most benefit from the cooling, and it'll make good use of the rest of the area if each core has its own space (and as you say, they may not need as much power, so use less cooling capacity).
  • EasyListening - Thursday, November 15, 2018 - link

    I heard Vega scales down to 8W quite nicely
  • MrSpadge - Tuesday, November 6, 2018 - link

    Besides the limited scaling of interfaces with process nodes, you could use different processes tuned for lower power in a dedicated I/O chip. Not sure what is done and needed here, but there may actually be power benefits to this.
  • PeachNCream - Tuesday, November 6, 2018 - link

    I hope so. I'm also hoping that a centralized IF results in lower inter-core latency and lower memory latency in high core count products. Honestly, this design change looks a bit like a star topology and less like a mesh. I'd love to see some testing done to see what portion of the total TDP will get spent by the IF/DRAM die.
  • jospoortvliet - Wednesday, November 7, 2018 - link

    I would expect that the entire reason they have done this is precisely to decrease the amount of energy spent on the fabric. They are surely very much aware of that problem, and this star topology and big RAM cache might help a lot... At least I hope for them that that is the case ;-)
  • jjj - Tuesday, November 6, 2018 - link

    Having an IO die allows one to optimize for IO so they will have lower power than with integrated on 7nm - when integrated, you can't afford to optimize for IO as other blocks are much more important.
  • looncraz - Wednesday, November 7, 2018 - link

    Power usage is my main concern with this design.

    I believe AMD will use dynamic IF frequency like Vega does. This will allow links to slow down to, say, 800MHz or even shut down when not in use or when load is not high. Otherwise idle power will be insane since the IF also needs to be running MUCH faster than on Summit/Pinnacle Ridge to supply the bandwidth required.
  • Spunjji - Friday, November 9, 2018 - link

    This is a server chip, so while idle power isn't entirely irrelevant it's still not their biggest priority. After all, if you've paid to deploy and run a server that's not doing anything then you're pouring money down the drain.
  • abufrejoval - Wednesday, November 7, 2018 - link

    I/O amplifiers require surface area, or actually volume I guess, so all those amps that drive external off-chip lines to DRAM and PCIe slots are big by necessity, somewhat like high-voltage long-distance power lines vs. the wiring inside your house. It's one of the reasons they do/did HBM stack base dies and silicon interposers on 65nm nodes: there is no benefit to going smaller, and the equipment was still around.

    And even if the surface area is larger, that makes for better cooling, too.

    Two hundred watts on the surface of you pinky's fingernail is a bit of a bother, same wattage on the size of a credit card, much less of a problem.
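    abufrejoval's analogy can be put in numbers. The fingernail and credit-card areas below are assumed, round figures (a credit card is standardized at 85.6 mm x 54.0 mm, roughly 46 cm^2):

    ```python
    # Same 200 W, very different power density depending on the area it covers.
    PINKY_NAIL_CM2 = 1.0    # assumed: a pinky fingernail is roughly 1 cm^2
    CREDIT_CARD_CM2 = 46.0  # ID-1 card: 8.56 cm x 5.40 cm ~= 46 cm^2

    def power_density(watts: float, area_cm2: float) -> float:
        return watts / area_cm2  # W per cm^2

    print(power_density(200, PINKY_NAIL_CM2))   # 200.0 W/cm^2
    print(power_density(200, CREDIT_CARD_CM2))  # ~4.3 W/cm^2
    ```

    Spreading the same heat over ~46x the area cuts the flux the cooler has to handle by the same factor.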
  • latentexistence - Tuesday, November 6, 2018 - link

    IBM called, they want their Z196 mainframe multichip module with CPU dies around a storage controller die back
  • FreckledTrout - Tuesday, November 6, 2018 - link

    Imitation is the sincerest form of flattery.
  • hamiltenor - Tuesday, November 6, 2018 - link

    If anyone reads this and wants more info, chapter 1.4.3 http://www.redbooks.ibm.com/abstracts/sg247833.htm...
  • CheapSushi - Tuesday, November 6, 2018 - link

    IBM helped AMD with their HT implementation on Zen. I bet they helped them with this MCM package too. What do you have against that? Why does it matter if it's similar? What's your agenda with this comment?
  • nandnandnand - Wednesday, November 7, 2018 - link

    "IBM helped AMD with their HT implementation on Zen."

    Do you have a source? I couldn't find this.
  • latentexistence - Wednesday, November 7, 2018 - link

    I don’t have anything against that, I was making a joke because I think it’s cool.
  • eastcoast_pete - Wednesday, November 7, 2018 - link

    Nothing wrong with copying or imitating a successful design approach. Mainframe-on-a-chip is not a bad nickname to have.
  • ravyne - Tuesday, November 6, 2018 - link

    I'm not so sure PCIe is on the chiplets; I think it remains to be seen. There are advantages and disadvantages to either approach. It's a physical interface as well, so it scales poorly, but the pin count is way lower than even a single DRAM controller's.

    The chiplet/IO-die design opens lots of interesting questions though -- Is the L3 cache on the chiplet? If not, how much L2; if so, how much L3, and is there an L4 cache on the IO chiplet? Will the IO controller allow individual memory controllers to be partitioned into their own NUMA nodes? Perhaps finer-grained bandwidth allocation via QoS mechanisms? Cloud providers / virtualization would love that.

    Opportunities as well. Market-specific IO dies for EPYC, Threadripper, and Ryzen? What about APUs? I could see the GPU compute/control/cache being located on a 7nm chiplet of its own, with a 14/12nm IO chiplet having the display interfaces and framebuffer (maybe an eDRAM large enough for multiple render targets, perhaps configured as a general victim cache as Intel's eDRAM is, or maybe with a single HBM2 interface).
  • ravyne - Tuesday, November 6, 2018 - link

    I do also wonder if manufacturing the IO die at 14nm helps them keep their contractual commitments with GloFo, now that they're not moving their production lines to 7nm. Looking at the size of the IO die, it looks roughly equal in size to 3-4 Zen1 dies in 14nm.
  • DigitalFreak - Tuesday, November 6, 2018 - link

    It would be interesting if they put PCI-E on the I/O chip and had an extremely high bandwidth interconnect to each chiplet. I'd love to see an 8 core Ryzen 3xxx with 40+ PCI-E lanes.
  • Hul8 - Tuesday, November 6, 2018 - link

    AMD will be using the AM4 socket until 2020, so I doubt there will be more lanes for mainstream Ryzen. Never mind the market for such solutions is so tiny that AMD are better off offering Threadripper for those customers, instead of saddling even low-end with more cost.
  • Spunjji - Friday, November 9, 2018 - link

    This. Also they're moving up to PCIe 4.0 so the bandwidth available from fewer lanes on the desktop isn't really a concern.
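    Spunjji's point checks out on paper. A quick sketch of per-direction link bandwidth, using the standard PCIe rates (8 GT/s for 3.0, 16 GT/s for 4.0, both with 128b/130b encoding):

    ```python
    # Per-direction PCIe bandwidth: lanes * transfer rate * 128b/130b efficiency,
    # divided by 8 to convert gigatransfers (one bit each) to gigabytes.
    def link_gbytes_per_s(lanes: int, gt_per_s: float) -> float:
        return lanes * gt_per_s * (128 / 130) / 8  # GB/s, one direction

    gen3_x16 = link_gbytes_per_s(16, 8.0)   # ~15.75 GB/s
    gen4_x8  = link_gbytes_per_s(8, 16.0)   # ~15.75 GB/s: half the lanes, same bandwidth
    print(gen3_x16, gen4_x8)
    ```

    So a PCIe 4.0 x8 slot matches a 3.0 x16 slot, which is why fewer lanes on the desktop matters less than it sounds.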
  • Targon - Wednesday, November 7, 2018 - link

    Gen-Z, which AMD has been involved with since its inception, is what you are looking at. Look up Gen-Z consortium. I expect 2020 or 2021 when AMD will incorporate Gen-Z support into its processors/chipsets.
  • marsdeat - Tuesday, November 6, 2018 - link

    What does the "IO" area on the IO die actually mean? Each chiplet seems to have an IF link (8 chiplets at 8 cores each, there are 8 ∞ links on the IO die). I would not be surprised if the PCIe is what is being communicated by "IO", along with other controllers on-board.

    The controller die is huge by the looks of it, bigger than the full Zen 1 Zeppelin die by a large margin.

    I'd be very interested to see how this comes to pass on Matisse, Renoir, and Castle Peak, but for Rome I still feel like the PCIe is on that controller die.
  • Hul8 - Tuesday, November 6, 2018 - link

    I agree that the I/O on the I/O die is PCIe 4.

    As to why they didn't just call them "PCIe 4": The lanes can probably work in either PCIe 4 *or* 2nd Gen Infinity Fabric mode. After all, that's the way Socket-Socket communication was arranged on 1st Gen EPYC.
  • marsdeat - Tuesday, November 6, 2018 - link

    RIGHT! I knew there was something I was forgetting, and it was how dual-socket EPYC support works. Thanks for reminding me!

    Yeah, IO makes sense when you think that's referring to dual-purpose ∞/PCIe bandwidth options. (Plus if there's an onboard SATA controller and all that jazz, that'll be in part of the IO section, I'm sure.)

    Last-level cache could easily be on the IO die too, but I feel like calling it the "IO die" rather than "controller" or some other more vague term implies that it's also the PCIe node.
  • ravyne - Tuesday, November 6, 2018 - link

    I'm thinking along the same lines as you two. The IO blocks are probably configurable as either PCIe 4.0 or off-chip Infinity Fabric.

    Thinking about it some more, I think it's likely that there's some kind of large cache in the IO chip, but also L3 retained in the chiplet. They claim to get 2x density on 7nm, but they doubled core count and doubled the SIMD compute width, register file, and data path. Those 7nm chiplets somehow still look smaller than Zen1 dies. It's possible the DRAM controllers/pins account for the difference, but it might be that the L3 cache didn't stay at 2MB/core and that's what makes the chiplet < 2x area -- maybe it's 8MB still, or 10, or 12, but not 16; it was a victim cache in Zen1, so it can be any size, really.

    A lot of enterprise workloads, like databases, respond *really* well to huge caches, that's why IBM puts them in Power, and they also support/have-supported really large memory by having off-chip memory controllers with 64MB caches, and multiple of these controllers in a system. But your typical workstation workloads (CAD, Modelling, Rendering) don't respond so dramatically, much less consumer workloads. A 128MB cache in the IO die makes sense for the server market, and at the same time, an 8-core chiplet with 8MB L3 wouldn't need additional cache on the IO die in an APU (today's APUs already only have 1MB/core L3).

    A potential worry here, is whether AMD will continue supporting things like ECC in IO dies bound for consumer SKUs (I hope so), or whether things like PCIe 4.0 will come to consumer SKUs as quickly as in enterprise SKUs (again, I hope so).

    Overall though, this is a great strategy -- on the enterprise side, so much of board/system validation has to do with IO/board interfaces. Moving compute onto chiplets means the computational cores can evolve on their own without invoking the full validation burden they normally would. In the future, we might only see one set of IO dies per socket for its entire lifetime, but new micro-architectures every 12-18 months, instead of alternating between process shrinks and new archs. The same goes for GPU chiplets on the consumer side -- no need to tie a particular CPU micro-architecture to a particular GPU micro-architecture; they can evolve at their own pace.
  • mczak - Tuesday, November 6, 2018 - link

    I agree with others that the I/O on the I/O die most likely refers to PCIe.
    As for the cache, good questions... I would assume L1/L2 are still per-core as in Zen, and as such I'd suspect there's value in having the L3 shared by the cores on each chiplet, without having to go to the I/O die (cache also shrinks extremely well, so no need to really skimp on it there).
    Although I wouldn't be surprised if there's an L4 cache in the I/O die. This thing appears rather huge, and while it's definitely very much non-square (increasing the circumference relative to area), apart from the interface logic and the actual PHYs (which need to be toward the edges of the die) its basic function is that of a router, and I think it could easily have enough area for a rather large L4 (say, 128MB or so). But it remains to be seen...
  • abufrejoval - Wednesday, November 7, 2018 - link

    From what I understand about the IF, PCIe is simply one of the protocols spoken over the fabric: you switch a bit and it talks PCIe. But that doesn't mean you can drive PCIe-level signals off the chiplet's IF pins; that's where the amplifiers in the I/O hub come in handy.

    Whether that I/O die actually acts as a configurable switch or is more of a hub is one of the fun things I am looking forward to reading about here.
  • FreckledTrout - Tuesday, November 6, 2018 - link

    So does anyone think we will see a single CCX / chiplet design in the Ryzen lineup? Seems doable now the IO / memory is disconnected from the CCX / chiplet.
  • nandnandnand - Tuesday, November 6, 2018 - link

    I think we could. Ryzen 4-8 core parts could become very cheap. The cheapest 8-core launch price was $300 for Ryzen 2700. Maybe we could see that dip down to $150-200, with 12 and 16-core options at the high end.

    Also hoping for 8-core laptop APUs.
  • 0ldman79 - Tuesday, November 6, 2018 - link

    Agreed.

    I really want an 8 core laptop, even if it throttles when all 8 are in play.

    I could take 4GHz single or dual and 2GHz with all 8. A lot of apps need threads, a lot need single core. It's certainly possible.

    Hell, my 45W i5 6300HQ typically only pulls 27W at full load, takes Prime95 or something similar to actually bump the 45W TDP and that's on 14nm. Surely they can get an 8 core under the same power limits on 7nm.
  • FreckledTrout - Tuesday, November 6, 2018 - link

    Good point on cutting costs but I was thinking it would help with performance by eliminating any cross chiplet communications like we see with cross CCX communications in Zen 1.
  • TheJian - Wednesday, November 7, 2018 - link

    I really hope they decide to KEEP the profits, not give them away to users. At this point AMD stock needs MARGINS and PROFIT, not this crap where they keep having basically break even quarters or losses. I mean multiple generations of great cpu/gpu and they keep screwing up pricing and trying to win pricing wars that are IMPOSSIBLE to win vs. NV/Intel. Do NOT keep pricing below the competition if you are competitive at their price range. There is no reason to keep discounting yourself to death.

    I'm shocked most people don't realize AMD has lost about $8B over the life of the company. They literally have NOT made a dime over the life of AMD. Do people understand this? When is it OK for AMD to charge enough to make real profits for REAL competition to last more than a gen or two? I want MORE R&D, yet people keep asking for cheap chips. Well don't be surprised when NV/Intel kick the crap out of them again and price sky high.

    I'd gladly pay a little more to AMD today, for better chips tomorrow and so on. Otherwise, we'll see AMD out of the race again in 3yrs when Intel gets their crap together, and probably sooner with NV as they haven't even topped them yet. We don't need discounts, get a better job! If people can't afford a $500 chip or gpu every few years, they are running their lives WRONG. Time to cut cable, cell phone or something so you can afford your toys, or like I said get a better job.

    Most people could just join TING and save a ton of cash yearly over ATT etc. My mother has been paying $13-15 for years vs. $76 from ATT for the same thing...LOL. A simple change of cell providers and she's got a free PC upgrade yearly (which of course I have to build as the tech guy in the family...LOL). $720 saved yearly just on a dumb phone. Me, I just cancelled it completely, free PC yearly now...LOL. Cut the cable cord too for faster internet and CRTV instead ($20 off CRTV18 coupon for year sub right now, not sure how long it lasts). It's easy to re-arrange your bills if you really want something, or in my case are just tired of paying people for crap I don't use anyway.

    Agenda TV and cell phone rape bills are over for me ;) Now the family only pays for what we actually want/use. ATT can kiss our butts, cable tv too! All the free crap on roku, youtube etc and hallmark (~$30/yr)/crtv($80 xmas year subs for the family) and we have all we could watch and more. So much news/content on CRTV we don't even miss fake news or even FOX news now, which is being flushed down the toilet by the sons of rupert murdoch who are NOT fox people and want to turn fox into CNN. I hope CRTV sinks fox at this point and gets hannity etc one day :)

    Save your money people. So many places to cut your bills easily (or simply get a better job). I digress.
  • PeachNCream - Wednesday, November 7, 2018 - link

    You make a few good points, but need to work on your presentation in order to get your main message across. As-is, you're just coming off as a bit crazy.
  • Targon - Wednesday, November 7, 2018 - link

    AMD needs money, so $330 for third generation would make more sense at the high end. We might see 16 core for $500 and if it can hit 5GHz, that would really hurt Intel.
  • nandnandnand - Wednesday, November 7, 2018 - link

    OK, I'll admit $150-200 is a low estimate for 8 cores. I'll up it to $250, with 16-core Ryzen at $500.
  • Der Keyser - Tuesday, November 6, 2018 - link

    Very interesting development. I cannot quite figure out how inter-socket communication will work from that IO chiplet (assuming all 8 IF links are for CPU chiplets).
    But this should lower production cost noticeably, not to mention increase yield from wafers considerably. Also, now they can create different IO chips for Ryzen, Threadripper and EPYC to keep production cost down but reuse the chiplets across all CPUs.
    Now a major question arises: I assume this will make the CPU seem monolithic from an OS perspective (no sub-NUMA, as access times to memory will likely be the same for all chiplets)?
    What kind of general latency penalty will they be looking at compared to the monolithic Intel design?
  • eastcoast_pete - Tuesday, November 6, 2018 - link

    @Anton: thanks for the coverage, really looking forward to the updates. As mentioned below, I am especially curious whether AMD mentioned if and how they improved Infinity Fabric's problems: power consumption and memory access.
    Some of the news is quite good: the new 256-bit-wide floating point capability should help it play catch-up with Intel's supremacy in AVX, although the AVX-512 capabilities remain an Intel exclusive, at least for now.
    The chiplet approach is exciting, although much of it lives and dies with the updated and (hopefully) improved Infinity Fabric. The reason is simple: a key reason why a 28-core Xeon will always be more expensive than a 32-core Threadripper is that Intel likely loses a lot of dies to fatal flaws during manufacturing. Now, Intel is actually really, really good at making its 14 nm FinFET chips, but once you're scaling to many, many billion transistors (28 cores!), the likelihood that a die has a fatal flaw somewhere just increases, and you end up thrashing a lot of expensive silicon. A chiplet approach reduces that waste enormously: if one chiplet or the I/O die is faulty, you only have to discard that one, not the entire chip. The Achilles heel of any multi-part chip is the fabric, so that's the big question for me: did AMD improve Infinity Fabric's power consumption, and fix the issues related to heterogeneous memory access?
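    The yield argument above can be sketched with the classic Poisson die-yield model, Y = exp(-A * D0). The die areas and defect density below are assumed, illustrative numbers, not real foundry data:

    ```python
    import math

    D0 = 0.002  # assumed defect density, defects per mm^2 (illustrative)

    def die_yield(area_mm2: float, d0: float = D0) -> float:
        # Poisson model: probability of zero defects landing on the die
        return math.exp(-area_mm2 * d0)

    mono = die_yield(700.0)    # one big monolithic server die (assumed area)
    chiplet = die_yield(75.0)  # one small 8-core chiplet (assumed area)
    print(f"monolithic: {mono:.1%}, chiplet: {chiplet:.1%}")
    # Even needing eight good chiplets beats one good monolithic die:
    print(f"8 good chiplets: {chiplet**8:.1%}")
    ```

    With these assumed numbers, the monolithic die yields roughly a quarter of the time, a single chiplet well over 80%, and even the combined probability of eight good chiplets comes out ahead of the monolithic die, before counting that the 14nm I/O die is on a mature, high-yield node.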
  • eastcoast_pete - Tuesday, November 6, 2018 - link

    correction: ..trashing a lot of silicon. Not thrashing, although they may do that also (:
  • Der Keyser - Tuesday, November 6, 2018 - link

    Yes, exactly, because IF needs to be much better now that there is no local low-latency memory controller included in the CCX/chiplet (for NUMA).
  • ravyne - Tuesday, November 6, 2018 - link

    It hasn't been said whether they'll support the AVX-512 ISA, but that's not a huge deal, really. In Zen1 the FPU was 128 bits wide, but there are two and they could be paired to perform 256-bit AVX ops. AMD might do the same for AVX-512 support in Zen2, but even if not, 2x 256-bit FPUs are almost as good as 1x 512-bit FPU in terms of throughput. The kinds of workloads that work well on wide SIMD mostly don't care how groups of lanes are partitioned; they mostly care about the total width available. Some of Intel's high-end CPUs (Xeon Gold/Platinum, the newest Xeon D, and I *think* some of the HEDT i9s) have 2x 512-bit FPUs, but none of them can run both at full clock speed because power draw becomes too high.

    I wonder if 128-bit SIMD performance will flatline/stagnate, though. Not that anyone's serious workloads rely on that anymore.
  • Rudde - Friday, November 9, 2018 - link

    Another reason EPYC is cheap is manufacturing at scale. Intel manufactures a fraction of the XCC (extreme core count) dies compared to their mainstream processor dies. Using one small CPU die for every processor simplifies manufacturing greatly, reducing costs. A separate IO hub simplifies things and reduces costs further.
  • Alexvrb - Tuesday, November 6, 2018 - link

    Hmm... I'll be interested to see what this approach could mean for future APUs. You open up a lot more possibilities with the CPU and GPU decoupled from I/O. Throw in future HBM variants and you have some interesting options, especially for mobile.
  • CheapSushi - Tuesday, November 6, 2018 - link

    I really like the look of the chiplet design package. I wonder if AMD got some help from IBM. I know they helped them with their HT implementation. Here's IBM's system controller & cores package: https://assets.pcmag.com/media/images/397587-ibm-z...
  • catavalon21 - Tuesday, June 18, 2019 - link

    First impression was how it reminded me of the L2 cache layout on my first AMD CPU, an Athlon Slot-1 K7
  • Lolimaster - Tuesday, November 6, 2018 - link

    So AMD took the modular architecture to the next level, improving yields even more by focusing on making the part that can actually be cheaper to produce, the now-"chiplets".

    AMD is redefining CPUs: going from full SoC to anti-SoC for high-performance chips, but not back to the obsolete monolithic design.
  • Lolimaster - Wednesday, November 7, 2018 - link

    This could give AMD an edge in ultraportables, or the return of 10" tablets with full Windows inside.

    A small single 4-core-CCX + GPU chiplet plus basic IO, since the form factor of mobile devices doesn't need as much complexity.
  • Lolimaster - Wednesday, November 7, 2018 - link

    This redesign goes beyond anyone's imagination for Zen2 and beyond. No one expected that they would tweak to such an extreme a way of making CPUs that was already supposed to be impressive yield-wise. Not that the 1st option went into the toilet of obsolescence.
  • Flying Aardvark - Wednesday, November 7, 2018 - link

    AMD nailed this. Far more impressive than I thought Zen2 would be. Though Zen over delivered as well, and that's why I've had a 1700, 1800X and 2700X. For consumer chips, the future is going to continue to move towards mobile, where further GPU integration and big core / little core arrangements are going to continue to be the future.
  • MagpieSVK - Wednesday, November 7, 2018 - link

    Maybe they will create a 2-channel, 2-Infinity-Fabric-link IO chip and use it with 2 chiplets in the R7 series, and with one chiplet plus one GPU in the APU lineup.
  • Targon - Wednesday, November 7, 2018 - link

    The real beauty of the approach AMD is taking is that they can scale the design and evolve things well. Those Zen2 cores tie into the IO "module", but the IO module itself can be updated and given different configurations based on the target processor, socket, or whatever. Since AMD has been part of the Gen-Z consortium, I expect that all of this will come together in 2020 or 2021, when dedicated PCI Express lanes on the CPU won't be needed because those devices will hang off the Gen-Z bus, and all of the pins currently used for PCI Express can then be used to tie into Gen-Z for a LOT of bandwidth.
  • jmnewton - Wednesday, November 7, 2018 - link

    I'm pretty sure the IO pins are just generic high-speed SERDES that can internally be routed to control logic for IF 2.0, PCIe 4, Gen-Z, OpenCAPI, NVMe direct, USB 3.2, Ethernet, etc. If you look at many chipsets from Intel and AMD, the motherboard implementation dictates which control logic gets connected to a given physical port. Sometimes it can be bifurcated (think: one 16x PCIe 4.0 port, or one 8x PCIe 4.0 port + 2 USB 3.2 ports, etc.).

    It will be very interesting to see what options they do have for the IO die and its configurability.

    I also think they might use some similar kinds of MCM setups for a set of Zen2 dies with a set of Radeon dies. Imagine 2 Zen2 dies + a decent Radeon die (or 2) and you've got a pretty nice workstation part. Your "APU" now becomes a mix-n-match game between the 2 sets of dies.
  • Azix - Thursday, January 10, 2019 - link

    Does not listing PCI-e mean that each chiplet will have its own lanes? Isn't PCI-e I/O? Isn't that almost all the I/O in PCs nowadays?
