31 Comments
shabby - Tuesday, January 26, 2021 - link
7nm + 7 advanced silicon technologies... the yield rate must be in the single digits no?
Ian Cutress - Tuesday, January 26, 2021 - link
Each one of the small die can be binned for yield, so when you're doing advanced packaging, you know that everything coming together is using known good die (KGD). Then it's a factor of the packaging yield process.
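A back-of-the-envelope illustration of why KGD matters (the 95% per-tile figure is invented for the example, not an Intel number): if all 41 tiles had to come out defect-free together on one monolithic die, the compound yield would be roughly

$$0.95^{41} \approx 12\%,$$

whereas with pre-binned known good die, the package-level yield is set almost entirely by the assembly and bonding steps.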
"Each one of the small die can be binned for yield, so when you're doing advanced packaging, you know that everything coming together is using known good die (KGD). "may be they could/should bin the Big Banks into banklets?
bananaforscale - Wednesday, January 27, 2021 - link
7%?
JayNor - Tuesday, January 26, 2021 - link
That's some amazingly tight stitching. I was expecting something more spread out, similar to the proportions of Raja's slide presentations.
JayNor - Wednesday, February 10, 2021 - link
Looks like all those internal chiplets are stacked on top of seven XeMF chiplets in a Foveros stacking, rather than being stitched together by EMIB.
brucethemoose - Tuesday, January 26, 2021 - link
So... 8 HBM stacks. 2 IO dies? Possibly a bunch of separate compute and cache tiles?
*Whistles* That's quite a package.
brucethemoose - Tuesday, January 26, 2021 - link
*Now I realize that's probably 2 big single dies in the middle.
edzieba - Tuesday, January 26, 2021 - link
The two big dies are using Foveros, so two large 'lower' dies with that patchwork of 'upper' dies above them, then EMIB underneath that to connect to the HBM stacks on the periphery and the link I/O tiles, as well as another EMIB bridge or two between the two Foveros stacks.
Exotica - Tuesday, January 26, 2021 - link
Can someone in layman’s terms describe a power-on test and its purpose, as well as where it falls in the chip design process? When the chip comes out of the fab, hasn’t it already been powered on? Just trying to understand. Thanks.
JKflipflop98 - Tuesday, January 26, 2021 - link
There's a variety of electrical testing (E-test) that goes on during production, but those are only smaller test circuits. The only way to know if the whole thing is going to work after packaging is to actually fire it up and see what happens.
nem0 - Tuesday, January 26, 2021 - link
From an engineer with some rudimentary grasp of what happens in the chip world, to be taken with a lot of salt.
A chip is "tested" in the fab after manufacturing in the sense of checking whether the "blueprint" that was supposed to be projected onto its portion of the wafer is correct. I think this is where the term yield comes from: what percentage of the wafer contains properly "projected" chips.
However, this chip is nothing but a collection of transistors and the wires connecting them, and is, as such, an inanimate object. In order to do any meaningful work it needs to interact with the rest of the world, which it does over HW lines or programming registers, essentially its inputs and outputs. It also needs power to do all of this.
It is the task of packaging (the assembly step in the process) to wrap the chip in an environment that makes it usable.
In the case of Ponte Vecchio, it seems that not one but many chips are being integrated into one package, likely with high-speed interconnects. These chips are likely manufactured in different fabs (including external ones), so they need to be shipped to a central location (likely an Intel fab) to be integrated into a package. As a result you get a piece of hardware ready to be plugged in somewhere and tested.
Testing usually happens in an R&D center (California?) after the GPU has been assembled.
What I believe they refer to as a power-on test is the very first sanity test, where the chip is provided with input power and some (many) checks are performed to validate that components are operating properly.
Once this power-on test is done they would likely proceed to simple generic tests (feeding very simple sequences into the GPU and checking the output) and then slowly work towards (much) more complex use cases.
Hope you get something out of this.
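To make the "first sanity test" idea concrete, here is a minimal sketch of what such a check could look like in spirit (every register name, offset, and value below is invented for illustration; real bring-up uses vendor-internal tools and hardware, not a self-contained program):

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical MMIO register offsets and values -- invented for illustration,
// not real Intel registers.
constexpr uint32_t REG_DEVICE_ID  = 0x0000; // should hold a known ID
constexpr uint32_t REG_POWER_GOOD = 0x0010; // one bit per power domain

constexpr uint32_t EXPECTED_ID      = 0x4150; // made-up device ID
constexpr uint32_t ALL_DOMAINS_GOOD = 0x1F;   // made-up: 5 domains, all up

uint32_t mmio_read(const volatile uint32_t *bar, uint32_t byte_offset) {
    return bar[byte_offset / sizeof(uint32_t)];
}

bool power_on_sanity(const volatile uint32_t *bar) {
    // 1. Does the device respond at all? A dead or unpowered part
    //    typically reads back as all-ones.
    uint32_t id = mmio_read(bar, REG_DEVICE_ID);
    if (id == 0xFFFFFFFFu) { std::puts("no response - check power/reset"); return false; }

    // 2. Is it the chip we think it is?
    if (id != EXPECTED_ID) { std::printf("unexpected ID 0x%x\n", id); return false; }

    // 3. Are all power domains up before attempting real work?
    uint32_t pg = mmio_read(bar, REG_POWER_GOOD);
    if (pg != ALL_DOMAINS_GOOD) { std::printf("power domains: 0x%x\n", pg); return false; }

    std::puts("power-on sanity passed; proceed to simple functional tests");
    return true;
}

int main() {
    // Simulated register file standing in for a real MMIO BAR.
    uint32_t fake_bar[8] = {EXPECTED_ID, 0, 0, 0, ALL_DOMAINS_GOOD};
    return power_on_sanity(fake_bar) ? 0 : 1;
}
```

The real sequence is of course far longer, but the shape is the same: confirm the device responds at all, confirm its identity, confirm power delivery, and only then move on to functional tests.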
ikjadoon - Tuesday, January 26, 2021 - link
Thanks for the insight.yeeeeman - Wednesday, January 27, 2021 - link
Another thing I would add is that they most certainly already powered it on; they are most certainly well ahead of what they say here...
Spunjji - Wednesday, February 3, 2021 - link
Yes, but whether they actually got a fully working chip out of it remains to be seen.
JKflipflop98 - Sunday, February 7, 2021 - link
Yield rates are tracked at both the die level and the line level. Die level is usually broken down by how many defects per die you run, and line level is how many wafers you got out at the end vs. how many virgin wafers you started with. Those are optical-level inspections.
There are also e-test inspections, where test circuits built into the part are energized to test various metrics in-line as well as post-process.
We do almost all our testing in Oregon at the Ronler Acres facility.
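For the die-level metric, the standard textbook form (not necessarily how any particular fab reports it internally) is the Poisson yield model:

$$Y_{\text{die}} = e^{-D_0 A}, \qquad Y_{\text{line}} = \frac{\text{wafers out}}{\text{wafers started}}$$

where $D_0$ is the defect density and $A$ the die area; e.g. $D_0 = 0.2$ defects/cm² on a 6 cm² die gives $e^{-1.2} \approx 30\%$.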
JayNor - Tuesday, January 26, 2021 - link
Are the 12 chiplets outside the GPUs the Rambo Cache?repoman27 - Tuesday, January 26, 2021 - link
I believe this is a Co-EMIB design with two reticle-limit sized Foveros stacks. Each Foveros module contains 8 XeCU dies (the more or less square ones). The RAMBO caches are the 8 smaller dies in-between the 16 XeCU dies. The 12 dies flanking the XeCU dies on the outside of the modules are the XeMF dies. The four small dies where the two modules abut are probably for the on-package CXL links or whatever I/O they're using between modules. Then there are 8 HBM2E stacks and two transceiver tiles sharing the organic substrate which are connected to the Foveros stacks via EMIB.
It seems like the lower die in the Foveros stack may be more like a traditional silicon interposer.
JayNor - Tuesday, January 26, 2021 - link
A previous AnandTech article has a diagram indicating those middle 8 chiplets are the Xe-MF:
https://www.anandtech.com/show/15188/analyzing-int...
repoman27 - Tuesday, January 26, 2021 - link
Yeah, but Ian made that diagram, and I commented at the time that I thought he got that part wrong. All of Intel’s materials depict the Rambo cache as between the XeCU dies. I’m not sure how much we can really draw from those slides though.
JayNor - Wednesday, January 27, 2021 - link
wccftech has an article, "Exclusive: Here Is Intel’s First 7nm GPU Xe HPC Diagram With Correct Annotations", which claims to have all the tiles correctly labeled.
Their labels don't identify any of the visible tiles as being Xe-MF. They do mention that there is a base tile below; the implication is that the Xe-MF logic is on that base tile. The 12 tiles just outside the compute tiles are said to be passive stiffener tiles.
repoman27 - Wednesday, January 27, 2021 - link
Ahh, thanks for the link. I had speculated in the comments section of the previous AnandTech article you cited that the XeMF might be the base dies of the Foveros stack. I don’t for a minute buy that those are Intel 10nm though.
olivaw - Tuesday, January 26, 2021 - link
I wonder how many video streams it can encode at the same time! A million? No! Make that "a billion"!
abufrejoval - Tuesday, January 26, 2021 - link
I wildly guess that I see 2 large CPU/SoC dies, 4 GPU dies and 2 HBM stacks.
Dug - Tuesday, January 26, 2021 - link
So this is to take over Ampere in performance in late '21 or early '22?
Don't they need a whole new software system too if they are starting from scratch? Never mind the training and learning a new system, which I'm not sure how many candidates are capable of.
I suppose when you are spending billions of dollars, you are bound to find someone.
mode_13h - Thursday, January 28, 2021 - link
Intel built their oneAPI on mostly existing standards and foundational blocks, so it's not from scratch. Also, it's a safe bet that things like their media SDK and OpenVINO support it as well. So, I wouldn't say the software portion is especially high-risk.
olafgarten - Thursday, February 4, 2021 - link
oneAPI also has a CUDA backend; not sure how close it is to being production ready, but theoretically you can write oneAPI code and run it on a number of different systems.
I imagine there will be a performance hit, but in some use cases the portability may well be worth it.
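For a flavor of what that portability looks like in practice, here is a minimal standard SYCL 2020 kernel of the sort oneAPI's DPC++ compiler builds; which backend picks it up at runtime (Level Zero, OpenCL, or the CUDA one) is exactly the production-readiness question above:

```cpp
#include <sycl/sycl.hpp>
#include <iostream>

int main() {
    // default_selector_v picks whatever device is available at runtime,
    // so the same source can target an Intel GPU, an NVIDIA GPU, or a CPU.
    sycl::queue q{sycl::default_selector_v};
    std::cout << "Running on: "
              << q.get_device().get_info<sycl::info::device::name>() << "\n";

    constexpr size_t N = 1024;
    int *data = sycl::malloc_shared<int>(N, q); // unified shared memory
    q.parallel_for(sycl::range<1>{N}, [=](sycl::id<1> i) {
        data[i] = static_cast<int>(i[0]) * 2;
    }).wait();
    std::cout << "data[" << N - 1 << "] = " << data[N - 1] << "\n"; // expect 2046
    sycl::free(data, q);
    return 0;
}
```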
repoman27 - Tuesday, January 26, 2021 - link
Crikey, even if those are only 8-Hi HBM2E stacks, that package still has at least 116 dies perched on it.
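For what it's worth, under that labeling the count pencils out as follows (all of it inferred from the package shot, so treat it as speculation):

$$\underbrace{16}_{\text{XeCU}} + \underbrace{8}_{\text{RAMBO}} + \underbrace{12}_{\text{flanking}} + \underbrace{4}_{\text{link}} + \underbrace{2}_{\text{base}} + \underbrace{2}_{\text{xcvr}} + \underbrace{8 \times (8+1)}_{\text{8-Hi HBM2E + logic die}} = 44 + 72 = 116$$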
mode_13h - Thursday, January 28, 2021 - link
The best tease involving this GPU would be for AMD to tease Intel by showing the lights dimming when Intel powers it on.
You're gonna need a desk-side chiller *and* a fusion reactor for it.
Oberoth - Thursday, February 4, 2021 - link
Realistically we are looking at 2023 for this product, and that's if everything runs super smoothly for them. Surely then there is no way the rumours of Intel moving to TSMC can be true, as it's claimed they are getting 5nm CPUs this year and most of Intel's CPUs will be 3nm the year after! It's going to look a little odd if the following year after that they suddenly go back to 7nm for something that is meant to be cutting edge!
https://wccftech.com/report-intel-signs-contract-t...
https://www.techpowerup.com/277229/trendforce-tsmc...
JayNor - Saturday, February 6, 2021 - link
Looks like some progress with power-on today: Raja is displaying a "hello world" test result on his Twitter page, saying it exercises all units of the 41-chiplet Xe-HPC GPU.