That is a curious-looking wafer. I thought it was fake at first but then I noticed the alignment notch. Actually, I'm still not convinced it's real because I have seen lots and lots of wafers in various stages of production and I have never seen one where partial chips go all the way out to the edges. It's a waste of time to deal with those in the steppers so no one does that.
Periphery defects? I used to deal with those... buildup of material that would break down during wet processing and stream particles all over the wafer. Running partials as far out as possible helped. Nowadays... do they still use big wet benches? I have been out a while...
Yes, they do. That's one of the systems I spent lots of time working on. Those don't look like defects to me. They are just a continuation of the chip pattern.
Still the most chemically efficient tools for some etch processes. It is odd to see die prints out to the edge all around; usually at least the 'corners' are inked out/not patterned by the time it hits the copper layers, because printing features out that far can increase the chances of film delamination, which just leads to more defectivity. I suppose on DUV tools the extra few seconds to run those shots isn't THAT bad on non-immersion layers, but it adds up over time.
I spent my entire career working in the semiconductor industry, although in IT, and I have seen many wafers from 4" to 12" and printing partial die off the edge of the wafer is quite common.
They can already do that for simple LEDs, but trying to bring Hexagonal IC Dies into existence is going to be exciting because there is a theoretical 62.5% increase in manufactured dies for a given wafer diameter when using Hexagonal IC Dies of a similar/identical area.
>So when are we going to hit 450 mm / 18" wafers?
For logic, never, since there is little to no advantage to larger wafers. Possibly NAND might use it, but we'll see if it's even worth it there.
>Are we ever going to get Hexagonal Dies to maximize possible Yields?
Probably not for logic. With reticle sizes getting smaller in the coming nodes, it makes even less sense going forward than it did in the past, and it didn't make much sense then to begin with.
The bigger the silicon ingot, the more expensive it is to produce. Though the problems with EUV may have delayed development, my guess would be that the additional cost of the larger ingots negates the cost savings on the other side, particularly with the EUV tools being 10X more expensive.
There's another curious thing about the wafer. There are a lot of dies with just clipped corners. If they shifted the entire pattern to the left or right by a quarter of a die, they would get 6 more good ones. That's 7% more dies for free (90 instead of 84).
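For anyone who wants to play with the numbers, here is a rough sketch of that kind of count in C; the die size and edge exclusion are made-up illustrative values (not the actual Ice Lake SP dimensions), but it shows how the whole-die count moves as the grid is shifted sideways. Whether a real stepper job can be re-centred that freely is another question.

#include <math.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy gross-die counter: a die is counted only if all four corners sit
 * inside the usable wafer radius. Die size and edge exclusion are
 * made-up illustrative numbers, not real Ice Lake SP dimensions. */
static int count_whole_dies(double radius, double dw, double dh, double x_off)
{
    int count = 0;
    for (double x = -radius + x_off - dw; x < radius; x += dw) {
        for (double y = -radius - dh; y < radius; y += dh) {
            bool inside = true;
            double cx[2] = { x, x + dw }, cy[2] = { y, y + dh };
            for (int i = 0; i < 2 && inside; i++)
                for (int j = 0; j < 2 && inside; j++)
                    if (hypot(cx[i], cy[j]) > radius)
                        inside = false;
            if (inside)
                count++;
        }
    }
    return count;
}

int main(void)
{
    double radius = 150.0 - 3.0;  /* 300 mm wafer minus a 3 mm edge exclusion */
    double dw = 25.0, dh = 23.0;  /* hypothetical die size in mm */
    for (double off = 0.0; off < dw; off += dw / 4.0)
        printf("x-offset %5.2f mm -> %d whole dies\n",
               off, count_whole_dies(radius, dw, dh, off));
    return 0;
}

(Compile with gcc -O2 and -lm.)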
Exactly. What's with these nonsense comments, anyway? It is like bragging about how I can now run a 10-minute mile instead of a 20-minute mile while the star players are breaking world records and running 4-minute miles. *facepalm*
I'm inclined to believe him-- I think yields are still an issue (which is why they have so many dark cores) and that getting enough chips to meet demand on the 40 core parts will be tough.
"As impressive as the new Xeon 8380 is from a generational and technical stand-point, what really matters at the end of the day is how it fares up to the competition. I’ll be blunt here; nobody really expected the new ICL-SP parts to beat AMD or the new Arm competition – and it didn’t."
this ain't an Intel marketing presentation. This is a laid-back, relaxed, non-biased and professional review. Not everyone hates Intel with their whole heart and not every reviewer hunts for clicks just to say that the new Intel server chips are shit. In the grand scheme of things, sure, they are not competitive, BUT Intel still has a few advantages over AMD that for some customers might matter more than absolute performance. In the server space, price, dependability, upgradeability, quality and support are the name of the game. AMD, as we know even from consumer products, isn't that amazing when it comes to drivers, BIOS quality and fixing bugs, whereas Intel is much more reliable in this regard. Sure, sure, you might say I am a fanboy, but first check what I say and then call me that if you want. Nevertheless, Intel needs Sapphire Rapids badly because even with all these advantages, they will keep losing market share.
Intel is currently slower, buggier and overpriced, with horrific security issues, meaning you can have: slow and insecure, or even slower and barely secure
and who ever thought servers would regularly need watercooling
also, what on Earth are you talking about with upgrades? this entire platform is getting chucked shortly, while AMD has offered multiple generations on the same platform for years, with an upgrade bringing DDR5 and PCIe 5
> AMD, as we know even from consumer products isn't that amazing when it comes to drivers, BIOS quality and fixing bugs, whereas Intel is much more reliable in this regard.
What drivel even is this? Have you actually worked in the industry? Clearly, you have not. I have already seen machine learning nodes move to Epyc, my web host has since moved to Epyc, and even a lot of recommendations for home lab equipment (see ServeTheHome) have since been moving heavily towards AMD. You have no clue. So go eat a pound of sand. At least it will put out better crap than Intel’s 10nm.
You obviously don't know what the term "Machine Learning Node" actually means. It doesn't mean the accelerators for machine learning are FirePro or Epyc, just that the servers that house them are running Epyc.
This is a release of server chips, which are mainly distributed through OEMs with their own specific drivers and BIOS releases, designed in close cooperation with AMD... Do you honestly believe that you get unstable BIOS and drivers for server releases? It's not a consumer mobo product for $50-150 that goes on sale for the masses with a generic BIOS that needs to look fancy, have OC potential, looks and feel, fan management, etc....
Second, price and upgradeability are still in favor of the AMD product, and quality and support are delivered by the same OEMs that ship both Intel and AMD systems... and performance, well, we know that answer already... which only leaves the aging ICT members who still believe in some mythical vendors... well, I hope they still like all the Spectre and Meltdown patches, feel very confident in their infinite support of a dominant monopoly, and like to pay $10-15k for a server CPU to allow a bit more RAM support.
Oh, and also don't forget the usual Intel fused-off features, just because... Compare this to AMD, where you get all features in all SKUs. Anyone who recommends this crap is simply an Intel fanboi.
Depends on the workload. If you need massive per-core bandwidth, there is only one street: Intel. If you need very low cache latency, there is only one street: Intel. Moreover, consider that AMD is actually selling a small number of 64-core SKUs; the focus of the market is on 32-core parts. So again in this arena Intel is absolutely the best for bandwidth, latencies and idle power (AMD's main defect). Not much has changed: AMD is best for many-core apps, Intel for apps using a medium/low number of cores. Something will change at the end of this year.
So the 8380 number is being used for both Ice-SP and Cooper-SP, which are totally unrelated designs, on different platforms, with different microarchitectures?
Depends on whether you agree to say that Optane is "system memory". It is mapped into the address space and is directly attached to the CPU, so it probably can be said to be system memory.
Y'all are measuring power from the socket sensors, not the wall, right?
I think the latter would be more interesting, even with the apples-to-oranges factor of different hardware surrounding the platform. After all, whole-system power consumption is what one gets charged for.
We are testing non-production server configurations, all with varying hardware, PSUs, and other setup differences. Socket comparisons remain relatively static between systems.
Would be interesting to pit the 4309 (or 5315) against the Rocket Lake octacores. Yes, it's a very different platform aimed at a different market, but it would be interesting to see what a hypothetical '10nm Sunny Cove consumer desktop' could have resembled compared to what Rocket Lake's Sunny Cove delivered on 14nm.
These tests are not entirely representative of real world use cases. For open source software, the icc compiler should always be the first choice for Intel chips. The fact that Intel provides such a compiler for free and AMD doesn’t is a perk that you get with owning Intel. It would be foolish not to take advantage of it.
AMD provides AOCC and there's nothing stopping you from running ICC on AMD either. The relative positioning in that scenario doesn't change, and GCC is the industry standard in that regard in the real world.
Thanks for your reply. I was speaking from my experience in HPC: I’ve never compiled code that I intended to run on Intel architectures with anything but icc, except when the environment did not provide me such liberty, which was rare.
If I were to run the benchmarks, I would build them with the most optimal settings for each architecture using their respective optimizing compilers. I would also make sure that I am using optimized libraries, e.g. Intel MKL and not OpenBLAS for Intel architectures, etc.
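For what it's worth, swapping the BLAS library is cheap to do because the CBLAS interface is identical; a minimal sketch (OpenBLAS header shown; MKL exposes the same cblas_dgemm via mkl.h and links with e.g. -lmkl_rt):

#include <stdio.h>
#include <cblas.h>   /* OpenBLAS header; with MKL you would include mkl.h instead */

int main(void)
{
    /* C = A * B for two 2x2 row-major matrices; the call is the same
     * regardless of which optimized BLAS you link against. */
    double A[4] = { 1, 2, 3, 4 };
    double B[4] = { 5, 6, 7, 8 };
    double C[4] = { 0, 0, 0, 0 };

    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                2, 2, 2, 1.0, A, 2, B, 2, 0.0, C, 2);

    printf("%g %g\n%g %g\n", C[0], C[1], C[2], C[3]);  /* 19 22 / 43 50 */
    return 0;
}

Only the link line (-lopenblas vs. MKL's) and whatever tuning each library does under the hood differ, which is exactly why testing with the vendor-optimized one matters for this argument.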
And I could optimize benchmarks using hand crafted optimal inner loops in assembler. It's possible to double the SPEC score that way. By using such optimized code on a slow CPU, it can *appear* to beat a much faster CPU. And what does that prove exactly? How good one is at cheating?
If we want to compare different CPUs then the only fair option is to use identical compilers and options like AnandTech does.
See it like this: the benchmark is a racing track, the CPU is a car and the compiler is the driver. If I want to get the best time for each car on a given track, I will not have them driven by the same driver. Rather, I will get the best driver for each car. A single driver will repeat the same mistakes in both cars, but one car may be more forgiving than the other.
Then you are comparing drivers and not cars. A good driver can win a race with a slightly slower car. And I know a much faster driver that can beat your best driver. And he will win even with a much slower car. So does the car really matter as long as you have a really good driver?
In the real world we compare cars by subjecting them to identical standardized tests rather than having a grandma drive one car and Lewis Hamilton drive another when comparing their performance/efficiency/acceleration/safety etc.
Based on the compiler options that Anandtech used, we already have the situation that Intel and AMD CPUs are executing different code for the same benchmark. From there it’s only a small step further to use the best compiler for each CPU.
This is a dumb analogy. CPUs are not like race cars. They're more like family sedans or maybe 18-wheeler semi trucks (in the case of server CPUs). As such, they should be tested the way most people are going to use them.
And almost NOBODY is compiling all their software with ICC. I almost never even hear about ICC, any more.
I'm even working with an Intel applications engineer on a CPU performance problem, and even HE doesn't tell me to build their own Intel-developed software with ICC!
Using identical compilers is the most unfair option there is to compare CPUs. Hardware and software on a modern system is tightly connected so it only makes sense to use those compilers on each platform that also are best optimised for that particular platform. Using a compiler that is underdeveloped for one platform is what makes an unfair comparison.
I think that using one unoptimized compiler for both is the best way to judge their performance. Such a compiler rules out bias and concentrates on pure hardware capabilities
You do realize that even the same gcc compiler with the settings that Anandtech used will generate different machine code for Intel and AMD architectures, let alone for ARM? To really make it "apples-to-apples" on Linux x86 they should've used "--with-tune=generic" option: then both CPUs will execute the exact same code.
But personally, I would prefer that they generated several binaries for each test, built them with optimal settings for each of the commonly used compilers: gcc, icc, aocc on Linux and perhaps even msvc on Windows. It's a lot more work I know, but I would appreciate it :)
Benchmarks, in articles like this, should strive to be *relevant*. And for that, they ought to focus on representing the performance of the CPUs as the bulk of readers are likely to experience it.
So, even if using some vendor-supplied compiler with trick settings might not fit your definition of "cheating", that doesn't mean it's a service to the readers. Maybe save that sort of thing for articles that specifically focus on some aspect of the CPU, rather than the *main* review.
There is nothing more relevant than being able to see all facets of a part's performance. This makes it possible to discern its actual performance capability.
Some think all a CPU comparison needs are gaming benchmarks. There is more to look at than subsets of commercial software. Synthetic benchmarks also are valid data points.
It's kind of like whether an automobile reviewer tests a car with racing tyres and 100-octane fuel. That would show you its maximum capabilities, but it's not how most people are going to experience it. While a racing enthusiast might be interested in knowing this, it's not a good proxy for the experience most people are likely to have with it.
All I'm proposing is to prioritize accordingly. Yes, we want to know how many lateral g's it can pull on a skid pad, once you remove the limiting factor of the all-season tyres, but that's secondary.
It's still cheating if you compare highly tuned benchmark scores with untuned scores. If you use it to trick users into believing CPU A is faster than CPU B even though CPU A is really slower, you are basically doing deceptive marketing. Mentioning it in the small print (which nobody reads) does not make it any less cheating.
It's cheating to use software that's very unoptimized to claim that that's as much performance as the CPU has.
For example... let's say we'll just skip all software that has AVX-512 support — on the basis that it's just not worth testing because so many CPUs don't support it.
Running not fully optimized software is what we do all the time, so that's exactly what we should be benchmarking. The -Ofast option used here is actually too optimized since most code is built with -O2. Some browsers use -Os/-Oz for much of their code!
Not to disagree with you, but always take Phoronix' benchmarks with a grain of salt.
First, he tested one 14 nm CPU model that only has one AVX-512 unit per core. Ice Lake has 2, and therefore might've shown more benefit.
Second, PTS is enormous (more than 1 month typical runtime) and I haven't seen Michael being very transparent about his criteria for selecting which benchmarks to feature in his articles. He can easily bias perception through picking benchmarks that respond well or poorly to the feature or product in question.
There are also some questions raised about his methodology, such as whether he effectively controlled for AVX-512 usage in some packages that contain hand-written asm. However, by looking at the power utilization graphs, I doubt that's an issue in this case. But, if he excluded such packages for that very reason, then it could unintentionally bias the results.
Completely agree that Phoronix benchmarks are dubious - it's not only the selection but also the lack of analysis of odd results and the incorrect way he does cross-ISA comparisons. It's far better to show a few standard benchmarks with well-known characteristics than a random sample of unknown microbenchmarks.
Ignoring all that, there are sometimes useful results in all the noise. The power results show that for the selected benchmarks AVX-512 really is being used. Whether this is typical across a wider range of code is indeed the question...
With regard specifically to testing AVX-512, perhaps the best method is to include results both with and without it. This serves the dual-role of informing customers of the likely performance for software compiled with more typical options, as well as showing how much further performance is to be gained by using an AVX-512 optimized build.
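As a rough illustration of what "with and without" means in practice, the same toy kernel can be built twice, once with AVX-512 enabled and once without; this is only a sketch, since real benchmarks mostly rely on the compiler's auto-vectorisation rather than hand-written intrinsics:

#include <stddef.h>
#include <stdio.h>

/* Toy triad-like kernel: dst += k * src. Built with -mavx512f the
 * intrinsic path is compiled in; built without it, the plain C loop
 * is used. Comparing the two binaries gives the AVX-512 delta here. */
#ifdef __AVX512F__
#include <immintrin.h>
static void scale_add(double *dst, const double *src, double k, size_t n)
{
    size_t i = 0;
    __m512d vk = _mm512_set1_pd(k);
    for (; i + 8 <= n; i += 8) {
        __m512d s = _mm512_loadu_pd(src + i);
        __m512d d = _mm512_loadu_pd(dst + i);
        _mm512_storeu_pd(dst + i, _mm512_fmadd_pd(vk, s, d));  /* d += k*s */
    }
    for (; i < n; i++)
        dst[i] += k * src[i];
}
#else
static void scale_add(double *dst, const double *src, double k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += k * src[i];
}
#endif

int main(void)
{
    enum { N = 1000 };
    double a[N], b[N];
    for (int i = 0; i < N; i++) { a[i] = 1.0; b[i] = 2.0; }
    scale_add(a, b, 3.0, N);
    printf("a[0] = %g\n", a[0]);   /* 7 either way; only the speed differs */
    return 0;
}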
GCC the industry standard in the real world? Maybe in the part of the world where you live, but not everywhere. HPC centres have relied on icc for ages for much of the performance-critical code, though GCC is slowly catching up, at least for C and C++ but not at all for Fortran, an important language in HPC (I just read it made it back into the top 20 of most used languages after falling back to position 34 a year or so ago). In embedded systems and the non-x86 world in general, LLVM-derived compilers have long been the norm. Commercial compiler vendors and CPU manufacturers are all moving to LLVM-based compilers or have been there for years already.
Yes GCC is the industry standard for Linux. That's a simple fact, not something you can dispute.
In HPC people are willing to use various compilers to get best performance, so it's certainly not purely ICC. And HPC isn't exclusively Intel or x86 based either. LLVM is increasing in popularity in the wider industry but it still needs to catch up to GCC in performance.
GCC is the only supported compiler for building the Linux kernel, although Google is working hard to make it build with LLVM. They seem to believe it's better for security.
From the benchmarks that Phoronix routinely publishes, each has its strengths and weaknesses. I think neither is a clear winner.
ICC and AMD's AOCC are SPEC trick compilers. Neither is used much in the real world since for real code they are typically slower than GCC or LLVM.
Btw are you equally happy if I propose to use a compiler which replaces critical inner loops of the benchmarks with hand-optimized assembler code? It would be foolish not to take advantage of the extra performance you get only on those benchmarks...
They are not SPEC tricks. You can use these compilers for any compliant C++ code that you have. In the last 10 years, the only time I didn’t use icc with Intel chips was on systems where I had no control over the sw ecosystem.
Maybe compared to old GCC/LLVM versions, but things have changed. There is now little difference between ICC and GCC when running SPEC in terms of vectorized performance. Note the amount of code that can benefit from AVX-512 is absolutely tiny, and the speedups in the real world are smaller than expected (see eg. SIMDJson results with hand-optimized AVX-512).
And please read the article - the setup is clearly explained in every review: "We compile the binaries with GCC 10.2 on their respective platforms, with simple -Ofast optimisation flags and relevant architecture and machine tuning flags (-march/-mtune=Neoverse-n1 ; -march/-mtune=skylake-avx512 ; -march/-mtune=znver2 (for Zen3 as well due to GCC 10.2 not having znver3). "
I remember that custom builds of Blender done with ICC scored better on Piledriver as well as on Intel hardware. So, even an architecture that was very different was faster with ICC.
Instead of whingeing, why not investigate the issue if you're actually interested?
Bottom line is that, just before the time of Zen's release, I tested three builds of Blender done with ICC and all were faster on both Intel and Piledriver (a very different architecture from Haswell).
I asked why the Blender team wasn't releasing its builds with ICC since performance was being left on the table but only heard vague suggestions about code stability.
This is absolutely untrue. There is not much special about AOCC; it is just an AMD-packaged Clang/LLVM with a few extras, so it is not a SPEC compiler at all. Neither is it true for Intel. Sites that are concerned about getting the most performance out of their investments often use the Intel compilers. It is a very good compiler for any code with good potential for vectorization, and I have seen it do miracles on badly written code that no version of GCC could do.
And those closed-source "extras" in AOCC magically improve the SPEC score compared to standard LLVM. How is it not a SPEC compiler just like ICC has been for decades?
It's strange to tell people who use the Intel compiler that it's not used much in the real world, as though that carries some substantive point.
The Intel compiler has always been better than gcc in terms of the performance of compiled code. You asserted that that is no longer true, but I'm not clear on what evidence you're basing that on. ICC is moving to clang and LLVM, so we'll see what happens there. clang and gcc appear to be a wash at this point.
It's true that lots of open source Linux-world projects use gcc, but I wouldn't know the percentage. Those projects tend to be lazy or untrained when it comes to optimization. They hardly use any compiler flags relevant to performance, like those stipulating modern CPU baselines, or link time optimization / whole program optimization. Nor do they exploit SIMD and vectorization much, or PGO, or parallelization. So they leave a lot of performance on the table. More rigorous environments like HPC or just performance-aware teams are more likely to use ICC or at least lots of good flags and testing.
And yes, I would definitely support using optimized assembly in benchmarks, especially if it surfaced significant differences in CPU performance. And probably, if the workload was realistic or broadly applicable. Anything that's going to execute thousands, millions, or billions of times is worth optimizing. Inner loops are a common focus, so I don't know what you're objecting to there. Benchmarks should be about realizable optimal performance, and optimization in general should be a much bigger priority for serious software developers – today's software and OSes are absurdly slow, and in many cases desktop applications are slower in user-time than their late 1980s counterparts. Servers are also far too slow to do simple things like parse an HTTP request header.
"today's software and OSes are absurdly slow, and in many cases desktop applications are slower in user-time than their late 1980s counterparts." a late 1980's desktop could not even play a video let alone edit one, your average mid range smartphone is much more capable. My four year old can do basic computing with just her voice. People like you forget how far software and hardware has come.
Sure, computers and devices are far more capable these days, from a hardware point of view, but applications, relying too much on GUI frameworks and modern languages, are more sluggish today than, say, a bare Win32 application of yore.
You're arguing apples (latency) and oranges (capability).
An Apple II has better latency than an Apple Lisa, even though the latter is vastly more powerful in most respects. The sluggishness of the UI was one of the big problems with that system from a consumer point of view. Many self-described power users equated a snappy interface with capability, so they believed their CLI machines (like the IBM PC) were a lot better.
"today's software and OSes are absurdly slow, and in many cases desktop applications are slower in user-time than their late 1980s counterparts"
Oh yes. One builds a computer nowadays and it's fast for a year. But then applications, being updated, grow sluggish over time. And it starts to feel like one's old computer again. So what exactly did we gain, I sometimes wonder. Take a simple suite like LibreOffice, which was never fast to begin with. I feel version 7 opens even slower than 6. Firefox was quite all right, but as of 85 or 86, when they introduced some new security feature, it seems to open a lot slower, at least on my computer. At any rate, I do appreciate all the free software.
> It's strange to tell people who use the Intel compiler that it's not used much in the real world, as though that carries some substantive point.
To use the automotive analogy, it's as if a car is being reviewed using 100-octane fuel, even though most people can only get 93 or 91 octane (and many will just use the cheap 87 octane, anyhow).
The point of these reviews isn't to milk the most performance from the product that's theoretically possible, but rather to inform readers about how they're likely to experience it. THAT is why it's relevant that almost nobody uses ICC in practice.
And, in fact, BECAUSE so few people are using ICC, Intel puts a lot of work into GCC and LLVM.
I think that a common compiler like GCC should be used (like Andrei is doing), along with a generic x86-64 -march (in the case of Intel/AMD) and generic -mtune. The idea would be to get the CPUs on as equal a footing as possible, even with code that might not be optimal, and reveal relative rather than absolute performance.
Using generic (-march=x86-64) means you are building for ancient SSE2... If you want a common baseline then use something like -march=x86-64-v3. You'll then get people claiming that excluding AVX-512 is unfair even though there is little difference on most benchmarks except for higher power consumption ( https://www.phoronix.com/scan.php?page=article&... ).
If I may offer an analogy, I would say: the benchmark is like an exam in school but here we test time to finish the paper (and with the constraint of complete accuracy). Each pupil should be given the identical paper, and that's it.
Using optimised binaries for different CPUs is a bit like knowing each child's brain beforehand (one has thicker circuitry in Brodmann area 10, etc.) and giving each a paper with peculiar layout and formatting but the same questions (in essence). Which system is better, who can say, but I'd go with the first.
Well, whatever tricks were used made Blender faster with the ICC builds I tested — both on AMD's Piledriver and on several Intel releases (Lynnfield and Haswell).
Please tell me you did this test with an ICC released only a couple years ago, or else I feel embarrassed for you polluting this discussion with such irrelevant facts.
This Xeon generation exists primarily because Intel had to come through and deliver something in 10 nm, after announcing the heck out of it for years. As actual processors, they are not bad as far as Xeons are concerned, but clearly inferior to AMD's current EPYC line, especially on price/performance. Plus, we and the world know that the real update is around the corner within a year: Sapphire Rapids. That one promises a lot of performance uplift, not least by having PCIe 5 and at least the option of directly linked HBM for RAM. Lastly, if Intel had managed to make this line compatible with the older socket (it's not), one could at least have used these Ice Lake Xeons to update Cooper Lake systems via a CPU swap. As it stands, I don't quite see the value proposition, unless you're in an Intel shop and need capacity very badly right now.
Agreed. Both Ice Lake and Rocket Lake are just placeholders, trying to put something out before the real improvements come with Sapphire Rapids and Alder Lake respectively... I'm one who says that AMD really needs the competition right now so it doesn't get sloppy and become "2017-2020 Intel". I want to see both competing hard in the years ahead.
Rocket Lake is a stopgap. Ice Lake (and Ice Lake SP) were just late; they would have been unquestioned market leaders if launched on time and even now mostly just run into problems when the competition is throwing way more cores at the problem.
No, Ice Lake Server cores have a much lower clock frequency and a much smaller L3 cache than Epyc 7xx3, so they are much slower core per core than AMD Milan for any general purpose application, e.g. software compilation.
The Ice Lake Server cores have double the number of floating-point multipliers that can be used by AVX-512 programs, so they are faster (despite their clock frequency deficit) for applications that are limited by FP multiplication throughput or that can use other special AVX-512 features, e.g. the instructions useful for machine learning.
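The "instructions useful for machine learning" presumably refers to the AVX-512 VNNI / DL Boost dot products. A minimal sketch of what one of them does, assuming a VNNI-capable CPU and something like -march=icelake-server at compile time:

#include <immintrin.h>
#include <stdio.h>

/* One vpdpbusd accumulates 64 u8*s8 products into 16 int32 lanes in a
 * single instruction; this is the int8 dot-product building block
 * behind Intel's DL Boost marketing. */
int main(void)
{
    __m512i acc = _mm512_setzero_si512();
    __m512i act = _mm512_set1_epi8(2);    /* unsigned 8-bit "activations" */
    __m512i wgt = _mm512_set1_epi8(3);    /* signed 8-bit "weights" */

    acc = _mm512_dpbusd_epi32(acc, act, wgt);   /* each lane += 4 * (2*3) */

    int out[16];
    _mm512_storeu_si512(out, acc);
    printf("per-lane sum: %d\n", out[0]);       /* prints 24 */
    return 0;
}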
'limited by FP multiplication throughput or that can use other special AVX-512 features, e.g. the instructions useful for machine learning.'
How do they compare with Power?
How do they compare with GPUs? (I realize that a GPU is very good at a much more limited palette of work types versus a general-purpose CPU. However... how much overlap there is between a GPU and AVX-512 is something at least non-experts will wonder about.)
The best GPUs from NVIDIA and AMD can provide between 3 and 4 times more performance per watt than the best Intel Xeons with AVX-512.
However most GPUs are usable only in applications where low precision is appropriate, i.e. graphics and machine learning.
The few GPUs that can be used for applications that need higher precision (e.g. NVIDIA A100 or Radeon Instinct) are extremely expensive, much more than Xeons or Epycs, and individuals or small businesses have very little chance of being able to buy them.
Please re-check the price list. The top-end A100 does sell for a bit more than the $8K list price of the top Xeon and EPYC, however MI100 seems to be pretty close. perf/$ is still wildly in favor of GPUs.
Unfortunately, if you're only looking at the GPUs' ordinary compute specs, you're missing their real point of differentiation, which is their low-precision tensor performance. That's far beyond what the CPUs can dream of!
Trust there are good reasons why Intel scrapped Xeon Phi, after flogging it for 2 generations (plus a few prior unreleased iterations), and adopted a pure GPU approach to compute!
Reading the conclusion I’m confused by how it’s possible for the product to be a success and for it to be both slower and more expensive.
‘But Intel has 10nm in a place where it is economically viable’
Is that the full-fat 10nm or a simplified/weaker version? I can’t remember but vaguely recall something about Intel having had to back away from some of the tech improvements it had said would be in its 10nm.
Because there is more to making decisions when buying servers than the benchmarks in this review. Intel's entire ecosystem is an advantage much bigger than AMD's lead in benchmarks, as is Intel's ability to deliver high volume. The product will be a success because it will sell a lot of hardware. It will, however, allow a certain amount of market share to be lost to AMD, but less than would be lost without it. It will also cut into profit margins compared to if the Intel chips were even with the AMD ones in the benchmarks, or if Intel's 10 nm was as cost-effective as they'd like it to be (but TSMC's 7 nm is not as cost-effective as Intel would like their processes to be, either).
I never made any argument or made any suggestions for the article, I only tried to clear up your confusion: "Reading the conclusion I’m confused by how it’s possible for the product to be a success and for it to be both slower and more expensive." Perhaps the author should have been more explicit as to why he made his conclusion. To me, the publishing of server processor benchmarks on a hardware enthusiast site like this is mostly for entertainment purposes, although it might influence some retail investors. They are just trying to pit the processor itself against its competitors. "How does Intel's server chip stack up against AMD's server chip?" It's like watching the ball game at the bar.
> To me, the publishing of server processor benchmarks on a hardware enthusiast site like this is mostly for entertainment purposes, although it might influence some retail investors.
You might be surprised. I'm a software developer at a hardware company and we use benchmarks on sites like this to give us a sense of the hardware landscape. Of course, we do our own, internal testing, with our own software, before making any final decisions.
I'd guess that you'll find systems admins of SMEs that still use on-prem server hardware are probably also looking at reviews like these.
It's impossible to post a rebuttal (i.e. 'clear up your confusion') without making one or more arguments.
I rebutted your rebuttal.
You argued against the benchmarks being seen as important. I pointed out that that means the article shouldn't have been pages of benchmarks. You had nothing.
I wish there were tests done with 2nd-gen Optane memory. Isn't that one of the selling points of Intel Xeon that isn't there in Epyc or Arm servers? Also, please do benchmarks with the P5800X Optane SSD, as that is supposedly the fastest SSD around.
Bunch of desktop enthusiasts failing to understand that as long as Intel provides a new part to augment an existing VM pool that isn't so awful as to justify replacing all existing systems, they're going to retain 90% of their existing customers.
It's good to know power usage is about the same as the specifications. It shows the madness of an 8-core i9 using more than a 54-core Xeon. And even those 54 cores are not delivering decent performance per watt. If AMD had made that I/O die on 7nm, then this Ice Lake CPU would be in even deeper trouble.
Wow. The conclusion is quite shocking. Massive improvement - still not good enough. Wow. Imagine how massively behind the current generation is compared to AMD in the server market. Wow.
The biggest sign would be if a 3nm fab in the US started hiring engaged but undervalued engineers from a 10nm fab who now found good reasons not to move to Asia.
Andrei, thanks for the review, but please consider augmenting your memory benchmarks with something besides STREAM Triad.
Particularly in light of the way that Altra benefits from their "non-temporal" optimization, it gives a false impression of the memory performance of these CPUs in typical, real-world use cases. I would suggest a benchmark that performs a mix of reads and writes of various sizes.
Another interesting benchmark to look at would be some sort of stress test involving atomic memory operations.
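Even something as blunt as the sketch below would be interesting: every thread hammers one shared counter, which stresses the core-to-core coherency fabric rather than raw DRAM bandwidth. A toy example rather than a proper benchmark; build with gcc -O2 -fopenmp.

#include <omp.h>
#include <stdatomic.h>
#include <stdio.h>

int main(void)
{
    atomic_long counter = 0;
    const long iters_per_thread = 10 * 1000 * 1000;

    double t0 = omp_get_wtime();
    #pragma omp parallel
    {
        /* All threads contend on the same cache line. */
        for (long i = 0; i < iters_per_thread; i++)
            atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
    }
    double t1 = omp_get_wtime();

    long total = atomic_load(&counter);
    printf("%d threads, %ld atomic adds, %.1f M adds/s\n",
           omp_get_max_threads(), total, total / (t1 - t0) / 1e6);
    return 0;
}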
Is it known whether there will be an IceLake-X this time round? The list of single-Xeon motherboard launches suggests possibly not; it would obviously be appealing to have a 24-core HEDT without paying the Xeon premium.
Boeings and Airbuses are never actually sold at their nominal prices; they cost far less, a non-disclosed number, for big buyers after gruesome haggling, sometimes less than half the “catalogue” price. I think this is exactly what Intel is doing now: set the catalogue price high to avoid losing face, and give huge discounts to avoid losing market share.
EPYC 75F3 is the clear winner SKU and the must have for most of the workloads. This is based on price - performance - cores and its related 3rd party sw licensing...
I wonder when Intel will be able to convince VMware to move from a 32-core licensing scheme to a 40-core one :) They used to get all the dev favor when Pat was still in the house; I had several senior engineers in escalation calls stating that the hypervisor was optimised for Intel... guess what, even "under-optimised", if you're looking for a VM farm in 2020-2021 you are way better off with an AMD build.
If you can't beat the competition, then what? Ian seems to be impressed that Intel was finally able to launch a Xeon that's a little faster than its previous Xeon, but not fast enough to justify the price tag in relation to what AMD has been offering for a while. So here we are congratulating Intel on burning through wads more cash to produce yet-another-non-competitive result. It really seems as if Intel *requires* AMD to set its goals and to tell it where it needs to go--and that is sad. It all began with x86-64 and SDRAM from AMD beating out Itanium and RDRAM years ago. And when you look at what Intel has done since it's just not all that impressive. Well, at least we can dispense with the notion that "Intel's 10nm is TSMC's 7nm" as that clearly is not the case.
What about the networking applications of this new chip? Dan Rodriguez's presentation showed gains of 1.4x to 1.8x for various networking benchmarks. Intel's entry into 5G infrastructure, NFV, vRAN, ORAN, hybrid cloud is growing faster than they originally predicted. They are able to bundle Optane, SmartNICs, FPGAs, eASIC chips, XeonD, P5900 family Atom chips... I don't believe they have a competitor that can provide that level of solution.
There is some faulty logic at work in many of the comments, with claims like it's cheating to use a more optimized compiler.
It's not cheating unless:
• the compiler produces code that's so much more unstable/buggy that it's quite a bit more untrustworthy than the less-optimized compiler
• you don't make it clear to readers that the compiler may make the architecture look more performant simply because the other architectures may not have had compiler optimizations on the same level
• you use the same compiler for every architecture when a different compiler would produce more optimized code for one or more of the other architectures as well
• the compiler sabotages the competition, via things like 'genuine Intel'
Fact is that if a CPU can accomplish a certain amount of work in a certain amount of time, using a certain amount of watts under a certain level of cooling — that is the part's actual performance capability.
If that means writing machine code directly (not even assembly) to get to that performance level, so what? That's an entirely different matter, which is how practical/economical/profitable/effortful it is to get enough code to measure all of the different aspects of the part's maximum performance capability. The only time one can really cite that as a deal-breaker is if one has hard data to demonstrate that by the time the hand-tuned/optimized code is written changes to the architecture (and/or support chips/hardware) will obsolete the advantage — making the effort utterly fruitless, beyond intellectual curiosity concerning the part's ability. For instance, if one knows that Intel, for instance, is going to integrate new instructions (very soon) that will make various types of hand-tuned assembly obsolete in short order, it can be argued that it's not worth the effort to write the code. People made this argument with some of AMD's Bulldozer/Piledriver instructions, on the basis that enough industry adoption wasn't going to happen. But, frankly... if you're going to make claims about the part's performance, you really should do what you can to find out what it is.
One can, though, of course... include a disclaimer that 'it seems clear enough that, regardless of how much hand-tuned code is done, the CPU isn't going to deliver enough to beat the competition, if the competition's code is similarly hand-tuned' — if that's the case. Even if a certain task is tuned to run twice as fast, is it going to be twice as fast as tuned code for the competition's stuff? Is its performance per watt deficit going to be erased? Will its pricing no longer be a drag on its perceived competitiveness?
For example, one could have wrung every last drop of performance out of Bulldozer but it wasn't going to beat Sandy Bridge E — a chip with the same number of transistors. Piledriver could beat at least the desktop version of Sandy in certain workloads when clocked well outside of the optimal (for the node's performance per watt) range but that's where it's very helpful to have tests at the same clock. It was discovered, for instance, that the Fury X and Vega had basically identical performance at the same clock. Since desktop Sandy could easily clock at the same 4.0 GHz Piledriver initially shipped with it could be tested at that rate, too.
Ideally, CPU makers would release benchmarks that demonstrate every facet of their chip's maximum performance. The concern about those being best-case and synthetic is less of a problem in that scenario because all aspects of the chip's performance would be tested and published. That makes cherry-picking impossible.
The faulty logic I see is that you seem to believe it's the review's job to showcase the product in the best possible light. No, that's Intel's job, and you can find plenty of that material at intel.com, if that's what you want.
Articles like this should focus on representing the performance of the CPUs as the bulk of readers are likely to experience it. So, even if using some vendor-supplied compiler with trick settings might not fit your definition of "cheating", that doesn't mean it's a service to the readers.
I think it could be appropriate to do that sort of thing, in articles that specifically analyze some narrow aspect of a CPU, for instance to determine the hardware's true capabilities or if it was just over-hyped. But, not in these sort of overall reviews.
'The faulty logic I see is that you seem to believe it's the review's job to...'
'I think it could be appropriate to do that sort of thing, in articles that...'
Don't contradict yourself or anything.
If you're not interested in knowing how fast a CPU is that's ... well... I don't know.
Telling people to go for marketing info (which is inherently deceptive — the entire fundamental reason for marketing departments to exist) is obviously silly.
I think the point of confusion is that I'm drawing a distinction between the initial product review and subsequent follow-up articles they often publish to examine specific points of interest. This would also allow for more time to do a more thorough investigation, since the initial reviews tend to be conducted under strict deadlines.
> If you're not interested in knowing how fast a CPU is that's ... well... I don't know.
There's often a distinction between the performance, as users are most likely to experience it, and the full capabilities of the product. I actually want to know both, but I think the former should be the (initial) priority.
"At the same time, we have also spent time a dual Xeon Gold 6330 system from Supermicro, which has two 28-core processors,..." Nonsensical English: "time a duel". I haven't the faintest what you were trying to say.
"DRAM latencies here are reduced by 1.7ns, which isn't very much a significant difference,..." Either use "very much", or use "a significant": DRAM latencies here are reduced by 1.7ns, which isn't a very significant difference,..."
"Inspecting Intel's prior disclosures about Ice Lake SP in last year's HotChips presentations, one point sticks out, and that's is the "SpecI2M optimisation" where the system is able to convert traditional RFO (Read for ownership) memory operations into another mechanism" Excess "is": "Inspecting Intel's prior disclosures about Ice Lake SP in last year's HotChips presentations, one point sticks out, and that's the "SpecI2M optimisation" where the system is able to convert traditional RFO (Read for ownership) memory operations into another mechanism"
"It's a bit unfortunate that system vendors have ended up publishing STREAM results with hyper optimised binaries that are compiled with non-temporal instructions from the get-go, as for example we would not have seen this new mechanism on Ice Lake SP with them" You need to rewrite the sentance or add more commas to break it up: "It's a bit unfortunate that system vendors have ended up publishing STREAM results with hyper optimised binaries that are compiled with non-temporal instructions from the get-go, as, for example, we would not have seen this new mechanism on Ice Lake SP with them"
"The latter STREAM results were really great to see as I view is a true design innovation that will benefit a lot of workloads." Exchange "is" for "this as": "The latter STREAM results were really great to see as I view this as a true design innovation that will benefit a lot of workloads." Or discard "view" and rewrite as a diffinitive instead of as an opinion: "The latter STREAM results were really great to see as this is a true design innovation that will benefit a lot of workloads."
"Intel's new Ice Lake SP system, similarly to the predecessor Cascade Lake SP system, appear to be very efficient at full system idle,..." Missing "s": "Intel's new Ice Lake SP system, similarly to the predecessor Cascade Lake SP system, appears to be very efficient at full system idle,..."
"...the new Ice Lake part to most of the time beat the Cascade Lake part,..." "to" doesn't belong. Rewrite: "...the new Ice Lake part can beat the Cascade Lake part most of the time,..."
"...both showcasing figures that are still 25 and 15% ahead of the Xeon 8380." Missing "%": "...both showcasing figures that are still 25% and 15% ahead of the Xeon 8380."
"Intel had been pushing very hard the software optimisation side of things,..." Poor sentance structure: "Intel had been pushing the software optimisation side very hard,..."
"...which unfortunately didn't have enough time to cover for this piece." Missing "we": "...which unfortunately we didn't have enough time to cover for this piece."
"While we are exalted to finally see Ice lake SP reach the market,..." "excited" not "exalted": "While we are excited to finally see Ice lake SP reach the market,..."
You mean the bottom-tier Xeons? Those are just mainstream desktop chips with fewer features disabled, so that question depends on when Alder Lake hits.
I'd say "no", because the Xeon versions typically lag the corresponding mainstream chips by a few months. So, if Alder Lake launches in November, then maybe we get the Xeons in February-March of next year.
The more immediate question is whether they'll release a Xeon version of Rocket Lake. I think that's likely, since they skipped Comet Lake and there are significant platform enhancements for Rocket Lake.
No, the W-1300 Xeons will be Rocket Lake. The top model will be Xeon W-1390P, which will be equivalent to the top i9 Rocket Lake, with 125 W TDP and 5.3 GHz maximum turbo.
deil - Tuesday, April 6, 2021 - link
that's a lot of upgrade for intelGomez Addams - Tuesday, April 6, 2021 - link
That is a curious-looking wafer. I thought it was fake at first but then I noticed the alignment notch. Actually, I'm still not convinced it's real because I have seen lots and lots of wafers in various stages of production and I have never seen one where partial chips go all the way out to the edges. It's a waste of time to deal with those in the steppers so no one does that.JCB994 - Tuesday, April 6, 2021 - link
Periphery defects? I used to deal with those...buildup of material that would breakdown during wet processing and stream particles all over the wafer. Running partials as far out as possible helped. Nowadays...do they still use big wet benches? I have been out awhile...Gomez Addams - Tuesday, April 6, 2021 - link
Yes, they do. That's one of the systems I spent lots of time working on. Those don't look defects to me. They are just a continuation of the chip pattern.FullmetalTitan - Saturday, April 24, 2021 - link
Still the most chemical efficient tools for some etch processes. It is odd to see die prints out to the edge all around, usually at least the 'corners' are inked out/not patterned by the time it hits copper layers because printing features out that far can increase the chances of film delamination, which just leads to more defectivity. I suppose on DUV tools the extra few seconds to run those shots isn't THAT bad on non-immersion layers, but it adds up over timeArsenica - Tuesday, April 6, 2021 - link
It isn´t real if it doesn´t have DrIan bite marks./jk
ilt24 - Tuesday, April 6, 2021 - link
@Gomez AddamsI spent my entire career working in the semiconductor industry, although in IT, and I have seen many wafers from 4" to 12" and printing partial die off the edge of the wafer is quite common.
check out the pictures in these article:
https://www.anandtech.com/show/15380/i-ran-off-wit...
https://www.anandtech.com/show/9723/amd-to-spinoff...
Kamen Rider Blade - Tuesday, April 6, 2021 - link
So when are we going to hit 450 mm / 18" waffers?Are we ever going to get Hexagonal Die's to maximize possible Yields?
http://www.semiconductor-today.com/news_items/2020...
https://semiaccurate.com/2015/05/18/disco-makes-he...
They can already do that for simple LED's, but trying to bring Hexagonal IC Dies into existence is going to be exciting because there is a theoretical 62.5% increase in Manufactured Dies for a given Waffer Diameter and using Hexagonal IC Dies of a similar/identical area.
ilt24 - Tuesday, April 6, 2021 - link
@Kamen Rider Blade - "So when are we going to hit 450 mm / 18" waffers?"It seems the desire to move to EUV distracted TSMC, Samsung and Intel who are probably the only companies that were really interested in 450mm.
saratoga4 - Tuesday, April 6, 2021 - link
>So when are we going to hit 450 mm / 18" waffers?For logic, never since there is little to no advantage to larger wafers. Possibly NAND might use it, but we'll see if its even worth it there.
>Are we ever going to get Hexagonal Die's to maximize possible Yields?
Probably not for logic. With reticle sizes getting smaller in the coming nodes, it makes even less sense going forward then it did in the past, and it didn't make much sense then to begin with.
rahvin - Sunday, April 18, 2021 - link
The bigger the silicon ingot the more expensive it is to produce. Though the problems with EUV may have delayed development my guess would be the additional cost of the larger ingots negates the cost savings on the other side, particularly with the EUV tools being 10X more expensive.Lukasz Nowak - Thursday, April 8, 2021 - link
There's another curious thing about the wafer. There are a lot of dies with just clipped corners. If they shifted the entire pattern to the left or right by a quarter of a die, they would get 6 more good ones. That's 7% more dies for free (90 instead of 84).Wouldn't that be worth doing?
Smell This - Thursday, April 8, 2021 - link
Maybe Chipzillah can glue them together . . . HA!
Speaking of which __ it this aN *MCM* multi-chip module ?
Chaitanya - Tuesday, April 6, 2021 - link
Even that upgrade is falling short of catching up.Hifihedgehog - Tuesday, April 6, 2021 - link
Exactly. What's with these nonsense comments, anyway? It is like bragging about how I can now run a 10-minute mile instead of a 20-minute mile while the star players are breaking world records and running 4-minute miles. *facepalm*Wilco1 - Tuesday, April 6, 2021 - link
No kidding. It is not even matching Graviton 2! A $8k CPU beaten by an Arm CPU from 2019...Hifihedgehog - Tuesday, April 6, 2021 - link
and yet that's still a lot of falling short for Intel 🤷♂️fallaha56 - Tuesday, April 6, 2021 - link
not really, the 38 and 40 core parts won't be available in any amounts (see Semiaccurate)and as can be seen in the lower spec parts, suddenly Ice Lake is barely beating Cascake Lake never mind AMD
fallaha56 - Tuesday, April 6, 2021 - link
https://semiaccurate.com/2021/04/06/intels-ice-lak...Gondalf - Wednesday, April 7, 2021 - link
LOL semi-accurate. 40-38 cores parts already for sale.Shorty_ - Thursday, April 8, 2021 - link
did you read the article before commenting?I'm inclined to believe him-- I think yields are still an issue (which is why they have so many dark cores) and that getting enough chips to meet demand on the 40 core parts will be tough.
Hifihedgehog - Saturday, April 17, 2021 - link
LOL Gondalf. Who pays $1000 for your thoughts?DannyH246 - Tuesday, April 6, 2021 - link
Another Intel marketing presentation from www.IntelTech.comLet me summarize - slower, hotter, pricier than the AMD equivalent. Zero reason to buy.
SarahKerrigan - Tuesday, April 6, 2021 - link
"As impressive as the new Xeon 8380 is from a generational and technical stand-point, what really matters at the end of the day is how it fares up to the competition. I’ll be blunt here; nobody really expected the new ICL-SP parts to beat AMD or the new Arm competition – and it didn’t."How is that "Intel marketing"?
ParalLOL - Tuesday, April 6, 2021 - link
In this case you did not even need to read the article to know what the tone would be. I guess Danny did not manage to read the title either.fallaha56 - Tuesday, April 6, 2021 - link
how? the chip isn't worth touching with bargepolethat's if the 38-40 core parts are actually available
which they won't be
and what sysadmin is going to go demand this when Milan is a drop in replacement and Intel next-gen is an entirely new platform
Azix - Tuesday, April 6, 2021 - link
Are you assuming they won't be because semiaccurate said so? They have 100% track record? Didn't he also say Rocket Lake S wouldn't clock high at all?yeeeeman - Tuesday, April 6, 2021 - link
this ain't intel marketing presentation. This is a laid back, relaxed, non-biased and professional review. Not everyone hates Intel with their whole heart and not every reviewer hunts for clicks, so as to say that the new Intel server chip are shit. In the grand scheme of things, sure, they are not competitive, BUT Intel still has a few advantages over AMD that for some customers it might matter more than absolute performance.In the server space, price, dependability, upgradeability, quality and support is the name of the game. AMD, as we know even from consumer products isn't that amazing when it comes to drivers, BIOS quality and fixing bugs, whereas Intel is much more reliable in this regard. Sure, sure, you might say I am a fanboy, but first check what I say and then call me that if you want. Nevertheless, Intel needs Sapphire Rapids badly because even with all their advantages, they will keep losing marketshare.
fallaha56 - Tuesday, April 6, 2021 - link
absolute nonsense from a fanboi yesIntel is currently slower, buggier and overpriced with horrific security issues meaning you can have: slow and insecure or even slower and barely secure
and who ever thought servers would regularly need watercooling
also what on Earth are you talking about upgrades? this entire platform is getting chucked shortly while AMD has offered multiple generations on the same platform for years with an upgrade bringing DDR5 and PCIe 5
fallaha56 - Tuesday, April 6, 2021 - link
wonderfully summed up here:https://semiaccurate.com/2021/04/06/intels-ice-lak...
lmcd - Tuesday, April 6, 2021 - link
Linking semiaccurate like it's accurate, the jokes write themselves.arashi - Tuesday, April 6, 2021 - link
Still more accurate than the embarrassment called #silicongang.schujj07 - Tuesday, April 6, 2021 - link
As an actual administrator in a datacenter your statement about those advantages is bogus.Hifihedgehog - Tuesday, April 6, 2021 - link
> AMD, as we know even from consumer products isn't that amazing when it comes to drivers, BIOS quality and fixing bugs, whereas Intel is much more reliable in this regard.What drivel even is this? Have you actually worked in the industry? Clearly, you have not. I have already seen machine learning nodes move to Epyc, my web host has since moved to Epyc, and even a lot of recommendations for home lab equipment (see ServeTheHome) has since been moving heavily towards AMD. You have no clue. So go eat a pound of sand. At least it will put out better crap than Intel’s 10nm.
amootpoint - Wednesday, April 7, 2021 - link
If your ML has moved to AMD, you are already burning a lot of money ... good luck.AI is where AMD is lagging so much compared to Intel, that it doesn’t even make sense.
schujj07 - Wednesday, April 7, 2021 - link
You obviously don't know what the term "Machine Learning Node" actually means. That doesn't mean the accelerators for machine learning are FirePro or Epyc, just the server that houses them are running Epyc.amootpoint - Thursday, April 8, 2021 - link
You clearly sounds like an arrogant guy, with full on personal attacks. No point in further discussion.schujj07 - Thursday, April 8, 2021 - link
Pot calling kettle black.duploxxx - Wednesday, April 7, 2021 - link
This is a release of server chips, which are distributed through OEM mainly with their specific drivers and BIOS releases close design with AMD.... Do you honestly believe that you get instable BIOS, drivers for Server releases? It's not a consumer moboproduct for 50-100-150$ that goes on sale for the masses with generic subset of BIOS that needs to look fancy, has oc potential, looks and feel, fan mngmnt etc....second, price, upgradeability is still in favor of the AMD product, quality and support is delivered by the same OEM that ships both intel and AMD systems... and performance, well we know that answer already.. which only leaves retared ICT members that are aging and still believe in some mythical vendors... well i hope they still like all the spectre and meltdown patches and feel evry confident in there infinit support of a dominant monopoly and like to pay 10-15k$ for a server cpu to allow a bit more Ram support.
DannyH246 - Tuesday, April 6, 2021 - link
Oh, and also don't forget the usual Intel fused-off features, just because... Compare this to AMD where you get all features in all SKUs. Anyone who recommends this crap is simply an Intel Fanboi.
Gondalf - Wednesday, April 7, 2021 - link
Depends on workload. If you need massive per-core bandwidth, there is only one road: Intel. If you need very low cache latency, there is only one road: Intel.
Moreover, consider that AMD is actually selling a small number of 64-core SKUs; the focus of the market is on 32-core parts. So again, in this arena Intel is absolutely the best for bandwidth, latencies and idle power (AMD's main defect).
Not much has changed: AMD is best for many-core apps, Intel for apps with a medium/low number of cores.
Something will change at the end of this year.
SarahKerrigan - Tuesday, April 6, 2021 - link
So the 8380 number is being used for both Ice-SP and Cooper-SP, which are totally unrelated designs, on different platforms, with different microarchitectures? Well, that's not confusing at all.
Drumsticks - Tuesday, April 6, 2021 - link
I thought surely it must have been a typo or you were confused, but yes indeed, you can find the 40 core 8380 right next to the 28 core 14nm 8380HL. Intel's naming has never been stellar but this is a new level.
jeremyshaw - Tuesday, April 6, 2021 - link
Cooper Lake probably existed about as much as Cannon Lake. Intel still doesn't want to acknowledge their failures. This totally bodes well. /s
schujj07 - Tuesday, April 6, 2021 - link
You can actually buy Cooper Lake 4P servers.
fallaha56 - Tuesday, April 6, 2021 - link
well you won't be buying 38-40 core Ice Lake ones. Phantom parts, yields are awful. And as we saw on the lower core count parts, so is performance.
fallaha56 - Tuesday, April 6, 2021 - link
https://semiaccurate.com/2021/04/06/intels-ice-lak...
29a - Wednesday, April 7, 2021 - link
Are you getting paid for every semiaccurate link you post?
eastcoast_pete - Tuesday, April 6, 2021 - link
Maybe that's what Intel meant with "improved cryptographic performance"; nobody can make any sense out of their naming scheme (: . Cryptic indeed!
amootpoint - Tuesday, April 6, 2021 - link
It seems like a great holistic platform. I must say, well done Intel.
Hifihedgehog - Tuesday, April 6, 2021 - link
Correction: Maybe if it was released over two years ago...
amootpoint - Wednesday, April 7, 2021 - link
Not really. Just look at the perf numbers vs AMD. Intel is broadly winning.
Bagheera - Thursday, April 8, 2021 - link
which review article did you read? Didn't sound like you read the same one I did.
Unashamed_unoriginal_username_x86 - Tuesday, April 6, 2021 - link
the 8280 core-to-core bounce latency image is 404ing
Ryan Smith - Tuesday, April 6, 2021 - link
This has been fixed. Thanks!
JayNor - Tuesday, April 6, 2021 - link
News release says "The platform supports up to 6 terabytes of system memory per socket". This story says 4TB. Is Intel wrong?
ParalLOL - Tuesday, April 6, 2021 - link
Depends on whether you agree to say that Optane is "system memory". It is mapped onto the address space and is directly attached to the CPU, so it probably can be said to be system memory.
DigitalFreak - Tuesday, April 6, 2021 - link
It's like when Intel says their desktop processors have 40 PCIe lanes. It's actually 16 CPU and 32 chipset. Well, 20 CPU with Rocket Lake, but still...
brucethemoose - Tuesday, April 6, 2021 - link
Y'all are measuring power from the socket sensors, not the wall, right? I think the latter would be more interesting, even with the apples-to-oranges factor of different hardware surrounding the platform. After all, whole-system power consumption is what one gets charged for.
Jorgp2 - Tuesday, April 6, 2021 - link
That would vary massively between systems.
fallaha56 - Tuesday, April 6, 2021 - link
sure, but when your 64-core part virtually beats Intel's dual-socket 32-core part on performance alone?
add the energy savings and suddenly it's a 300-400% perf lead
Jorgp2 - Tuesday, April 6, 2021 - link
The fuck? You do realize that they put more than CPUs onto servers, right?
Andrei Frumusanu - Tuesday, April 6, 2021 - link
We are testing non-production server configurations, all with varying hardware, PSUs, and other setup differences. Socket comparisons remain relatively static between systems.
edzieba - Tuesday, April 6, 2021 - link
Would be interesting to pit the 4309 (or 5315) against the Rocket Lake octacores. Yes, it's a very different platform aimed at a different market, but it would be interesting to see what a hypothetical '10nm Sunny Cove consumer desktop' could have resembled compared to what Rocket Lake's Sunny Cove delivered on 14nm.
Jorgp2 - Tuesday, April 6, 2021 - link
You could also compare it to the 10900X, which is an existing AVX-512 CPU with large L2 caches.
Holliday75 - Tuesday, April 6, 2021 - link
For typical consumer workloads the RL will be better. For typical server workloads the IL will be better. That is the gist of what would be said.
ricebunny - Tuesday, April 6, 2021 - link
These tests are not entirely representative of real-world use cases. For open source software, the icc compiler should always be the first choice for Intel chips. The fact that Intel provides such a compiler for free and AMD doesn't is a perk that you get with owning Intel. It would be foolish not to take advantage of it.
Andrei Frumusanu - Tuesday, April 6, 2021 - link
AMD provides AOCC and there's nothing stopping you from running ICC on AMD either. The relative positioning in that scenario doesn't change, and GCC is the industry standard in that regard in the real world.
ricebunny - Tuesday, April 6, 2021 - link
Thanks for your reply. I was speaking from my experience in HPC: I've never compiled code that I intended to run on Intel architectures with anything but icc, except when the environment did not provide me such liberty, which was rare.
If I were to run the benchmarks, I would build them with the most optimal settings for each architecture using their respective optimizing compilers. I would also make sure that I am using optimized libraries, e.g. Intel MKL and not OpenBLAS for Intel architectures, etc.
Wilco1 - Tuesday, April 6, 2021 - link
And I could optimize benchmarks using hand-crafted optimal inner loops in assembler. It's possible to double the SPEC score that way. By using such optimized code on a slow CPU, it can *appear* to beat a much faster CPU. And what does that prove exactly? How good one is at cheating?
If we want to compare different CPUs then the only fair option is to use identical compilers and options, like AnandTech does.
ricebunny - Tuesday, April 6, 2021 - link
See it like this: the benchmark is a racing track, the CPU is a car and the compiler is the driver. If I want to get the best time for each car on a given track I will not have them driven by the same driver. Rather, I will get the best driver for each car. A single driver will repeat the same mistakes in both cars, but one car may be more forgiving than the other.
DigitalFreak - Tuesday, April 6, 2021 - link
Is the compiler called The Stig?
Wilco1 - Tuesday, April 6, 2021 - link
Then you are comparing drivers and not cars. A good driver can win a race with a slightly slower car. And I know a much faster driver that can beat your best driver. And he will win even with a much slower car. So does the car really matter as long as you have a really good driver?
In the real world we compare cars by subjecting them to identical standardized tests rather than having a grandma drive one car and Lewis Hamilton drive another when comparing their performance/efficiency/acceleration/safety etc.
Makste - Wednesday, April 7, 2021 - link
Well said
ricebunny - Wednesday, April 7, 2021 - link
Based on the compiler options that Anandtech used, we already have the situation that Intel and AMD CPUs are executing different code for the same benchmark. From there it's only a small step further to use the best compiler for each CPU.
mode_13h - Wednesday, April 7, 2021 - link
So, you're saying make the situation MORE lopsided? Instead, maybe they SHOULD use the same compiled code!
mode_13h - Wednesday, April 7, 2021 - link
This is a dumb analogy. CPUs are not like race cars. They're more like family sedans or maybe 18-wheeler semi trucks (in the case of server CPUs). As such, they should be tested the way most people are going to use them.
And almost NOBODY is compiling all their software with ICC. I almost never even hear about ICC anymore.
I'm even working with an Intel applications engineer on a CPU performance problem, and even HE doesn't tell me to build their own Intel-developed software with ICC!
KurtL - Wednesday, April 7, 2021 - link
Using identical compilers is the most unfair option there is to compare CPUs. Hardware and software on a modern system are tightly connected, so it only makes sense to use, on each platform, the compilers that are best optimised for that particular platform. Using a compiler that is underdeveloped for one platform is what makes an unfair comparison.
Makste - Wednesday, April 7, 2021 - link
I think that using one unoptimized compiler for both is the best way to judge their performance. Such a compiler rules out bias and concentrates on pure hardware capabilities.
ricebunny - Wednesday, April 7, 2021 - link
You do realize that even the same gcc compiler with the settings that Anandtech used will generate different machine code for Intel and AMD architectures, let alone for ARM? To really make it "apples-to-apples" on Linux x86 they should've used the "--with-tune=generic" option: then both CPUs will execute the exact same code.
But personally, I would prefer that they generated several binaries for each test, built with optimal settings for each of the commonly used compilers: gcc, icc, aocc on Linux and perhaps even msvc on Windows. It's a lot more work, I know, but I would appreciate it :)
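To make that concrete, here is a minimal C sketch (my own illustration, not AnandTech's actual harness) that you can build twice with different -march values; the predefined macros below are set by gcc based on the chosen target, so the two binaries genuinely contain different code:

/* march_probe.c - build e.g. with:
 *   gcc -O2 -march=x86-64 march_probe.c -o probe_baseline
 *   gcc -O2 -march=skylake-avx512 march_probe.c -o probe_avx512
 * The output differs because -march changes which ISA extensions the
 * compiler is allowed to assume and emit.
 */
#include <stdio.h>

int main(void) {
#if defined(__AVX512F__)
    puts("built with AVX-512 foundation instructions enabled");
#elif defined(__AVX2__)
    puts("built with AVX2 enabled, but not AVX-512");
#else
    puts("built for a baseline x86-64 target (SSE2 only)");
#endif
    return 0;
}

The same idea scales up to the per-architecture builds I'm suggesting: one binary per compiler and per -march target, each labelled accordingly.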
mode_13h - Wednesday, April 7, 2021 - link
Intel, AMD, and ARM all contribute loads of patches to both GCC and LLVM. There's no way either of these compilers can be seen as "underdeveloped".
And Intel is usually doing compiler work a couple YEARS ahead of each CPU & GPU generation. If anyone is behind, it's AMD.
Oxford Guy - Wednesday, April 7, 2021 - link
It's not cheating if the CPU can do that work at that speed.
It's only cheating if you don't make it clear to readers what kind of benchmark it is (hand-tuned assembly).
mode_13h - Thursday, April 8, 2021 - link
Benchmarks, in articles like this, should strive to be *relevant*. And for that, they ought to focus on representing the performance of the CPUs as the bulk of readers are likely to experience it.
So, even if using some vendor-supplied compiler with trick settings might not fit your definition of "cheating", that doesn't mean it's a service to the readers. Maybe save that sort of thing for articles that specifically focus on some aspect of the CPU, rather than the *main* review.
Oxford Guy - Sunday, April 11, 2021 - link
There is nothing more relevant than being able to see all facets of a part's performance. This makes it possible to discern its actual performance capability.
Some think all a CPU comparison needs are gaming benchmarks. There is more to look at than subsets of commercial software. Synthetic benchmarks also are valid data points.
mode_13h - Monday, April 12, 2021 - link
It's kind of like whether an automobile reviewer tests a car with racing tyres and 100-octane fuel. That would show you its maximum capabilities, but it's not how most people are going to experience it. While a racing enthusiast might be interested in knowing this, it's not a good proxy for the experience most people are likely to have with it.
All I'm proposing is to prioritize accordingly. Yes, we want to know how many lateral g's it can pull on a skid pad, once you remove the limiting factor of the all-season tyres, but that's secondary.
Wilco1 - Thursday, April 8, 2021 - link
It's still cheating if you compare highly tuned benchmark scores with untuned scores. If you use it to trick users into believing CPU A is faster than CPU B even though CPU A is really slower, you are basically doing deceptive marketing. Mentioning it in the small print (which nobody reads) does not make it any less cheating.
Oxford Guy - Sunday, April 11, 2021 - link
It's cheating to use software that's very unoptimized to claim that that's as much performance as the CPU has.
For example... let's say we'll just skip all software that has AVX-512 support — on the basis that it's just not worth testing because so many CPUs don't support it.
Wilco1 - Sunday, April 11, 2021 - link
Running not fully optimized software is what we do all the time, so that's exactly what we should be benchmarking. The -Ofast option used here is actually too optimized since most code is built with -O2. Some browsers use -Os/-Oz for much of their code!
AVX-512 and software optimized for AVX-512 is quite rare today, and the results are pretty awful on the latest cores: https://www.phoronix.com/scan.php?page=article&...
Btw Andrei ran ICC vs GCC: https://twitter.com/andreif7/status/13808945639975...
ICC is 5% slower than GCC on SPECINT. So there we go.
mode_13h - Monday, April 12, 2021 - link
Not to disagree with you, but always take Phoronix' benchmarks with a grain of salt.
First, he tested one 14 nm CPU model that only has one AVX-512 unit per core. Ice Lake has 2, and therefore might've shown more benefit.
Second, PTS is enormous (more than 1 month typical runtime) and I haven't seen Michael being very transparent about his criteria for selecting which benchmarks to feature in his articles. He can easily bias perception through picking benchmarks that respond well or poorly to the feature or product in question.
There are also some questions raised about his methodology, such as whether he effectively controlled for AVX-512 usage in some packages that contain hand-written asm. However, by looking at the power utilization graphs, I doubt that's an issue in this case. But, if he excluded such packages for that very reason, then it could unintentionally bias the results.
Wilco1 - Monday, April 12, 2021 - link
Completely agree that Phoronix benchmarks are dubious - it's not only the selection but also the lack of analysis of odd results and the incorrect way he does cross-ISA comparisons. It's far better to show a few standard benchmarks with well-known characteristics than a random sample of unknown microbenchmarks.
Ignoring all that, there are sometimes useful results in all the noise. The power results show that for the selected benchmarks there really is use of AVX-512. Whether this is typical across a wider range of code is indeed the question...
mode_13h - Monday, April 12, 2021 - link
With regard specifically to testing AVX-512, perhaps the best method is to include results both with and without it. This serves the dual role of informing customers of the likely performance for software compiled with more typical options, as well as showing how much further performance is to be gained by using an AVX-512 optimized build.
KurtL - Wednesday, April 7, 2021 - link
GCC the industry standard in the real world? Maybe in the part of the world where you live, but not everywhere. HPC centres have relied on icc for ages for much of the performance-critical code, though GCC is slowly catching up, at least for C and C++ but not at all for Fortran, an important language in HPC (I just read it made it back into the top 20 of most used languages after falling back to position 34 a year or so ago). In embedded systems and the non-x86 world in general, LLVM-derived compilers have long been the norm. Commercial compiler vendors and CPU manufacturers are all moving to LLVM-based compilers or have been there for years already.
Wilco1 - Wednesday, April 7, 2021 - link
Yes GCC is the industry standard for Linux. That's a simple fact, not something you can dispute.
In HPC people are willing to use various compilers to get best performance, so it's certainly not purely ICC. And HPC isn't exclusively Intel or x86 based either. LLVM is increasing in popularity in the wider industry but it still needs to catch up to GCC in performance.
mode_13h - Wednesday, April 7, 2021 - link
GCC is the only supported compiler for building the Linux kernel, although Google is working hard to make it build with LLVM. They seem to believe it's better for security.
From the benchmarks that Phoronix routinely publishes, each has its strengths and weaknesses. I think neither is a clear winner.
Wilco1 - Thursday, April 8, 2021 - link
Plus almost all distros use GCC - there is only one I know that uses LLVM. LLVM is slowly gaining popularity though.
They are fairly close for general code, however recent GCC versions significantly improved vectorization, and that helps SPEC.
Wilco1 - Tuesday, April 6, 2021 - link
ICC and AMD's AOCC are SPEC trick compilers. Neither is used much in the real world since for real code they are typically slower than GCC or LLVM.
Btw are you equally happy if I propose to use a compiler which replaces critical inner loops of the benchmarks with hand-optimized assembler code? It would be foolish not to take advantage of the extra performance you get only on those benchmarks...
ricebunny - Tuesday, April 6, 2021 - link
They are not SPEC tricks. You can use these compilers for any compliant C++ code that you have. In the last 10 years, the only time I didn't use icc with Intel chips was on systems where I had no control over the sw ecosystem.
Wilco1 - Tuesday, April 6, 2021 - link
They only exist because of SPEC. The latest ICC is now based on LLVM since it was falling further behind on typical code.
ricebunny - Tuesday, April 6, 2021 - link
From my experience, icc consistently produced better vectorized code.
Anandtech again didn't publicize the compiler flags they used to build the benchmark code. By default, gcc will not generate AVX-512 optimized code.
Wilco1 - Tuesday, April 6, 2021 - link
Maybe compared to old GCC/LLVM versions, but things have changed. There is now little difference between ICC and GCC when running SPEC in terms of vectorized performance. Note the amount of code that can benefit from AVX-512 is absolutely tiny, and the speedups in the real world are smaller than expected (see eg. SIMDJson results with hand-optimized AVX-512).
And please read the article - the setup is clearly explained in every review: "We compile the binaries with GCC 10.2 on their respective platforms, with simple -Ofast optimisation flags and relevant architecture and machine tuning flags (-march/-mtune=Neoverse-n1 ; -march/-mtune=skylake-avx512 ; -march/-mtune=znver2 (for Zen3 as well due to GCC 10.2 not having znver3). "
Drazick - Wednesday, April 7, 2021 - link
The ICC compiler has a much better vectorization engine than the one in GCC. It will usually generate better vectorized code, especially for numerical code.
But the real benefit of ICC is its companion libraries: VSML, MKL, IPP.
Oxford Guy - Wednesday, April 7, 2021 - link
I remember that custom builds of Blender done with ICC scored better on Piledriver as well as on Intel hardware. So, even an architecture that was very different was faster with ICC.
mode_13h - Thursday, April 8, 2021 - link
And when was this? Like 10 years ago? How do we know the point is still relevant?
Oxford Guy - Sunday, April 11, 2021 - link
How do we know it isn't?
Instead of whinging, why not investigate the issue if you're actually interested?
Bottom line is that, just before the time of Zen's release, I tested three builds of Blender done with ICC and all were faster on both Intel and Piledriver (a very different architecture from Haswell).
I asked why the Blender team wasn't releasing its builds with ICC since performance was being left on the table but only heard vague suggestions about code stability.
Wilco1 - Sunday, April 11, 2021 - link
This thread has a similar comment about quality and support in ICC: https://twitter.com/andreif7/status/13808945639975...
KurtL - Wednesday, April 7, 2021 - link
This is absolutely untrue. There is not much special about AOCC, it is just an AMD-packaged Clang/LLVM with a few extras, so it is not a SPEC compiler at all. Neither is it true for Intel. Sites that are concerned about getting the most performance out of their investments often use the Intel compilers. It is a very good compiler for any code with good potential for vectorization, and I have seen it do miracles on badly written code that no version of GCC could do.
Wilco1 - Wednesday, April 7, 2021 - link
And those closed-source "extras" in AOCC magically improve the SPEC score compared to standard LLVM. How is it not a SPEC compiler, just like ICC has been for decades?
JoeDuarte - Wednesday, April 7, 2021 - link
It's strange to tell people who use the Intel compiler that it's not used much in the real world, as though that carries some substantive point.
The Intel compiler has always been better than gcc in terms of the performance of compiled code. You asserted that that is no longer true, but I'm not clear on what evidence you're basing that on. ICC is moving to clang and LLVM, so we'll see what happens there. clang and gcc appear to be a wash at this point.
It's true that lots of open source Linux-world projects use gcc, but I wouldn't know the percentage. Those projects tend to be lazy or untrained when it comes to optimization. They hardly use any compiler flags relevant to performance, like those stipulating modern CPU baselines, or link time optimization / whole program optimization. Nor do they exploit SIMD and vectorization much, or PGO, or parallelization. So they leave a lot of performance on the table. More rigorous environments like HPC or just performance-aware teams are more likely to use ICC or at least lots of good flags and testing.
And yes, I would definitely support using optimized assembly in benchmarks, especially if it surfaced significant differences in CPU performance. And probably, if the workload was realistic or broadly applicable. Anything that's going to execute thousands, millions, or billions of times is worth optimizing. Inner loops are a common focus, so I don't know what you're objecting to there. Benchmarks should be about realizable optimal performance, and optimization in general should be a much bigger priority for serious software developers – today's software and OSes are absurdly slow, and in many cases desktop applications are slower in user-time than their late 1980s counterparts. Servers are also far too slow to do simple things like parse an HTTP request header.
pSupaNova - Wednesday, April 7, 2021 - link
"today's software and OSes are absurdly slow, and in many cases desktop applications are slower in user-time than their late 1980s counterparts." a late 1980's desktop could not even play a video let alone edit one, your average mid range smartphone is much more capable. My four year old can do basic computing with just her voice. People like you forget how far software and hardware has come.GeoffreyA - Wednesday, April 7, 2021 - link
Sure, computers and devices are far more capable these days, from a hardware point of view, but applications, relying too much on GUI frameworks and modern languages, are more sluggish today than, say, a bare Win32 application of yore.Oxford Guy - Wednesday, April 7, 2021 - link
You're arguing apples (latency) and oranges (capability).An Apple II has better latency than an Apple Lisa, even though the latter is vastly more powerful in most respects. The sluggishness of the UI was one of the big problems with that system from a consumer point of view. Many self-described power users equated a snappy interface with capability, so they believed their CLI machines (like the IBM PC) were a lot better.
GeoffreyA - Wednesday, April 7, 2021 - link
"today's software and OSes are absurdly slow, and in many cases desktop applications are slower in user-time than their late 1980s counterparts"Oh yes. One builds a computer nowadays and it's fast for a year. But then applications, being updated, grow sluggish over time. And it starts to feel like one's old computer again. So what exactly did we gain, I sometimes wonder. Take a simple suite like LibreOffice, which was never fast to begin with. I feel version 7 opens even slower than 6. Firefox was quite all right, but as of 85 or 86, when they introduced some new security feature, it seems to open a lot slower, at least on my computer. At any rate, I do appreciate all the free software.
ricebunny - Wednesday, April 7, 2021 - link
Well said.
Frank_M - Thursday, April 8, 2021 - link
Intel Fortran is vastly faster than GCC.
How did ricebunny get a free compiler?
mode_13h - Thursday, April 8, 2021 - link
> It's strange to tell people who use the Intel compiler that it's not used much in the real world, as though that carries some substantive point.
To use the automotive analogy, it's as if a car is being reviewed using 100-octane fuel, even though most people can only get 93 or 91 octane (and many will just use the cheap 87 octane, anyhow).
The point of these reviews isn't to milk the most performance from the product that's theoretically possible, but rather to inform readers about how they're likely to experience it. THAT is why it's relevant that almost nobody uses ICC in practice.
And, in fact, BECAUSE so few people are using ICC, Intel puts a lot of work into GCC and LLVM.
GeoffreyA - Thursday, April 8, 2021 - link
I think that a common compiler like GCC should be used (like Andrei is doing), along with a generic x86-64 -march (in the case of Intel/AMD) and generic -mtune. The idea would be to get the CPUs on as equal a footing as possible, even with code that might not be optimal, and reveal relative rather than absolute performance.
Wilco1 - Thursday, April 8, 2021 - link
Using generic (-march=x86-64) means you are building for ancient SSE2... If you want a common baseline then use something like -march=x86-64-v3. You'll then get people claiming that excluding AVX-512 is unfair, even though there is little difference on most benchmarks except for higher power consumption ( https://www.phoronix.com/scan.php?page=article&... ).
GeoffreyA - Saturday, April 10, 2021 - link
I think leaving AVX-512 out is a good policy.
GeoffreyA - Thursday, April 8, 2021 - link
If I may offer an analogy, I would say: the benchmark is like an exam in school, but here we test time to finish the paper (and with the constraint of complete accuracy). Each pupil should be given the identical paper, and that's it.
Using optimised binaries for different CPUs is a bit like knowing each child's brain beforehand (one has thicker circuitry in Brodmann area 10, etc.) and giving each a paper with peculiar layout and formatting but the same questions (in essence). Which system is better, who can say, but I'd go with the first.
Oxford Guy - Wednesday, April 7, 2021 - link
Well, whatever tricks were used made Blender faster with the ICC builds I tested — both on AMD's Piledriver and on several Intel releases (Lynnfield and Haswell).
mode_13h - Thursday, April 8, 2021 - link
Please tell me you did this test with an ICC released only a couple years ago, or else I feel embarrassed for you polluting this discussion with such irrelevant facts.
Oxford Guy - Sunday, April 11, 2021 - link
It wasn't that long ago.
If you want to increase the signal-to-noise ratio you should post something substantive.
For instance, if you think ICC no longer produces faster Blender builds, why not post some evidence to that effect?
eastcoast_pete - Tuesday, April 6, 2021 - link
This Xeon generation exists primarily because Intel had to come through and deliver something in 10 nm, after announcing the heck out of it for years. As an actual processor, they are not bad as far as Xeons are concerned, but clearly inferior to AMD's current EPYC line, especially on price/performance. Plus, we and the world know that the real update is around the corner within a year: Sapphire Rapids. That one promises a lot of performance uplift, not least by having PCIe 5.0 and at least the option of directly linked HBM for RAM. Lastly, if Intel had managed to make this line compatible with the older socket (it's not), one could at least have used these Ice Lake Xeons to update Cooper Lake systems via a CPU swap. As it stands, I don't quite see the value proposition, unless you're in an Intel shop and need capacity very badly right now.
Limadanilo2022 - Tuesday, April 6, 2021 - link
Agreed. Both Ice Lake and Rocket Lake are just placeholders to try to get something out before the real improvement comes with Sapphire Rapids and Alder Lake respectively... I'm one that says that AMD really needs the competition right now so it doesn't get sloppy and become "2017-2020 Intel". I want to see both competing hard in the years ahead.
drothgery - Wednesday, April 7, 2021 - link
Rocket Lake is a stopgap. Ice Lake (and Ice Lake SP) were just late; they would have been unquestioned market leaders if launched on time, and even now mostly just run into problems when the competition is throwing way more cores at the problem.
AdrianBc - Wednesday, April 7, 2021 - link
No, Ice Lake Server cores have a much lower clock frequency and a much smaller L3 cache than Epyc 7xx3, so they are much slower core for core than AMD Milan for any general purpose application, e.g. software compilation.
The Ice Lake Server cores have double the number of floating-point multipliers that can be used by AVX-512 programs, so they are faster (despite their clock frequency deficit) for the applications that are limited by FP multiplication throughput or that can use other special AVX-512 features, e.g. the instructions useful for machine learning.
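To give a concrete flavour of those machine-learning instructions, here is a tiny sketch (purely my own illustration, assuming an AVX-512 VNNI capable part and a matching -march flag) of the int8 dot-product operation that inference kernels lean on:

/* vnni_sketch.c - illustrative only; build with e.g.:
 *   gcc -O2 -march=icelake-server vnni_sketch.c -o vnni_sketch
 * _mm512_dpbusd_epi32 multiplies unsigned 8-bit values from 'a' with signed
 * 8-bit values from 'b' and accumulates the 4-element dot products into the
 * 32-bit lanes of 'acc', which is the core operation of int8 inference.
 */
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    __m512i acc = _mm512_setzero_si512();
    __m512i a   = _mm512_set1_epi8(2);    /* interpreted as unsigned bytes */
    __m512i b   = _mm512_set1_epi8(3);    /* interpreted as signed bytes */

    acc = _mm512_dpbusd_epi32(acc, a, b); /* each 32-bit lane: 4 * (2 * 3) = 24 */

    int out[16];
    _mm512_storeu_si512((void *)out, acc);
    printf("first lane: %d\n", out[0]);   /* expect 24 */
    return 0;
}

Whether that throughput advantage matters in practice obviously depends on how much of the workload actually maps onto these instructions.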
Oxford Guy - Wednesday, April 7, 2021 - link
'limited by FP multiplication throughput or that can use other special AVX-512 features, e.g. the instructions useful for machine learning.'
How do they compare with Power?
How do they compare with GPUs? (I realize that a GPU is very good at a much more limited palette of work types versus a general-purpose CPU. However... how much overlap there is between a GPU and AVX-512 is something at least non-experts will wonder about.)
AdrianBc - Thursday, April 8, 2021 - link
The best GPUs from NVIDIA and AMD can provide between 3 and 4 times more performance per watt than the best Intel Xeons with AVX-512.
However, most GPUs are usable only in applications where low precision is appropriate, i.e. graphics and machine learning.
The few GPUs that can be used for applications that need higher precision (e.g. NVIDIA A100 or Radeon Instinct) are extremely expensive, much more than Xeons or Epycs, and individuals or small businesses have very little chance of being able to buy them.
mode_13h - Friday, April 9, 2021 - link
Please re-check the price list. The top-end A100 does sell for a bit more than the $8K list price of the top Xeon and EPYC, however MI100 seems to be pretty close. Perf/$ is still wildly in favor of GPUs.
Unfortunately, if you're only looking at the GPUs' ordinary compute specs, you're missing their real point of differentiation, which is their low-precision tensor performance. That's far beyond what the CPUs can dream of!
Trust there are good reasons why Intel scrapped Xeon Phi, after flogging it for 2 generations (plus a few prior unreleased iterations), and adopted a pure GPU approach to compute!
mode_13h - Thursday, April 8, 2021 - link
"woulda, coulda, shoulda"Ice Lake SP is not even competitive with Rome. So, they missed their market window by quite a lot!
Oxford Guy - Tuesday, April 6, 2021 - link
Reading the conclusion I'm confused by how it's possible for the product to be a success and for it to be both slower and more expensive.
'But Intel has 10nm in a place where it is economically viable'
Is that the full-fat 10nm or a simplified/weaker version? I can’t remember but vaguely recall something about Intel having had to back away from some of the tech improvements it had said would be in its 10nm.
Yojimbo - Tuesday, April 6, 2021 - link
Because there is more to making decisions when buying servers than the benchmarks in this review. Intel's entire ecosystem is an advantage much bigger than AMD's lead in benchmarks, as is Intel's ability to deliver high volume. The product will be a success because it will sell a lot of hardware. It will, however, allow a certain amount of market share to be lost to AMD, but less than would be lost without it. It will also cut into profit margins compared to if the Intel chips were even with the AMD ones in the benchmarks, or if Intel's 10 nm was as cost effective as they'd like it to be (but TSMC's 7 nm is not as cost effective as Intel would like its own processes to be, either).
RanFodar - Tuesday, April 6, 2021 - link
This.
Oxford Guy - Wednesday, April 7, 2021 - link
So, the argument here is that the article should have been all that instead of focusing on benchmarks.
Yojimbo - Friday, April 9, 2021 - link
I never made any argument or made any suggestions for the article; I only tried to clear up your confusion: "Reading the conclusion I'm confused by how it's possible for the product to be a success and for it to be both slower and more expensive." Perhaps the author should have been more explicit as to why he made his conclusion. To me, the publishing of server processor benchmarks on a hardware enthusiast site like this is mostly for entertainment purposes, although it might influence some retail investors. They are just trying to pit the processor itself against its competitors. "How does Intel's server chip stack up against AMD's server chip?" It's like watching the ball game at the bar.
mode_13h - Saturday, April 10, 2021 - link
> To me, the publishing of server processor benchmarks on a hardware enthusiast site like this is mostly for entertainment purposes, although it might influence some retail investors.
You might be surprised. I'm a software developer at a hardware company and we use benchmarks on sites like this to give us a sense of the hardware landscape. Of course, we do our own, internal testing, with our own software, before making any final decisions.
I'd guess that you'll find systems admins of SMEs that still use on-prem server hardware are probably also looking at reviews like these.
Oxford Guy - Sunday, April 11, 2021 - link
It's impossible to post a rebuttal (i.e. 'clear up your confusion') without making one or more arguments.
I rebutted your rebuttal.
You argued against the benchmarks being seen as important. I pointed out that that means the article shouldn't have been pages of benchmarks. You had nothing.
trivik12 - Tuesday, April 6, 2021 - link
I wish there were tests done with 2nd gen Optane memory. Isn't that one of the selling points of Intel Xeon that is not there in Epyc or Arm servers? Also, please do benchmarks with the P5800X Optane SSD, as that is supposedly the fastest SSD around.
Frank_M - Thursday, April 8, 2021 - link
Optane and Thunderbolt.Azix - Tuesday, April 6, 2021 - link
There's a reason Intel's data center revenues are still massive compared to AMD's. These will sell in large quantities because AMD can't supply.
fanboi99 - Friday, April 9, 2021 - link
That statement is totally inaccurate. Both current-gen Intel and AMD server lines are readily available.
Qasar - Friday, April 9, 2021 - link
Azix, and the other part of that could be contracts, and the prices they charge for them.
lmcd - Tuesday, April 6, 2021 - link
Bunch of desktop enthusiasts failing to understand that as long as Intel provides a new part to augment an existing VM pool that isn't so awful as to justify replacing all existing systems, they're going to retain 90% of their existing customers.
adelio - Wednesday, April 7, 2021 - link
but for almost every new Intel line they have no choice but to replace everything anyway, so AMD are not really at that much of a disadvantage, if any!
Oxford Guy - Wednesday, April 7, 2021 - link
'as long as Intel provides a new part to augment an existing VM pool that isn't so awful as to justify replacing all existing systems'
That's nifty. I thought Intel likes to require new motherboards and things.
I had no idea these chips are so backwards compatible with older hardware.
domih - Tuesday, April 6, 2021 - link
# Where BS == 'Marketing'
AMD = 'Maximum {}, Minimum {}'.format('performance', 'BS')
INTEL = 'Maximum {}, Minimum {}'.format('BS', 'performance')
Foeketijn - Wednesday, April 7, 2021 - link
It's good to know power usage is about the same as the specifications.
It shows the madness of an 8-core i9 using more than a 54-core Xeon.
And even those 54 cores are not delivering decent performance per watt.
If AMD had made that I/O die on 7nm, this Ice Lake CPU would be in even deeper trouble.
yankeeDDL - Wednesday, April 7, 2021 - link
Wow. The conclusion is quite shocking. Massive improvement - still not good enough. Wow.
Imagine how massively behind the current generation is compared to AMD in the server market.
Wow.
rf-design - Wednesday, April 7, 2021 - link
The biggest sign would be if a 3nm fab in the US started hiring engaged but undervalued engineers from a 10nm fab who have now found good reasons not to move to Asia.
mode_13h - Wednesday, April 7, 2021 - link
Andrei, thanks for the review, but please consider augmenting your memory benchmarks with something besides STREAM Triad.
Particularly in light of the way that Altra benefits from their "non-temporal" optimization, it gives a false impression of the memory performance of these CPUs in typical, real-world use cases. I would suggest a benchmark that performs a mix of reads and writes of various sizes.
Another interesting benchmark to look at would be some sort of stress test involving atomic memory operations.
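Just to illustrate the kind of thing I mean, here's a rough sketch I'm making up on the spot (not a proposal for the actual test suite): a handful of threads hammering one shared counter with relaxed atomic increments, which stresses the coherence fabric rather than raw bandwidth, and can be compared against the same increments done on thread-private counters.

/* atomic_stress.c - rough sketch; compile with: gcc -O2 -pthread atomic_stress.c
 * All threads increment a single shared counter using C11 relaxed atomics,
 * so the run time is dominated by contended atomic operations.
 */
#include <stdatomic.h>
#include <pthread.h>
#include <stdio.h>

#define THREADS 8
#define ITERS   10000000UL

static atomic_ulong shared_counter;

static void *worker(void *arg) {
    (void)arg;  /* unused */
    for (unsigned long i = 0; i < ITERS; i++)
        atomic_fetch_add_explicit(&shared_counter, 1, memory_order_relaxed);
    return NULL;
}

int main(void) {
    pthread_t tid[THREADS];
    for (int i = 0; i < THREADS; i++)
        pthread_create(&tid[i], NULL, worker, NULL);
    for (int i = 0; i < THREADS; i++)
        pthread_join(tid[i], NULL);
    /* Expect THREADS * ITERS; time the run externally, e.g. with `time`. */
    printf("final counter: %lu\n", (unsigned long)atomic_load(&shared_counter));
    return 0;
}

Scaling the thread count and comparing sockets would say a lot about how each platform handles contended cache lines.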
TomWomack - Wednesday, April 7, 2021 - link
Is it known whether there will be an IceLake-X this time round? The list of single-Xeon motherboard launches suggests possibly not; it would obviously be appealing to have a 24-core HEDT without paying the Xeon premium.
EthiaW - Wednesday, April 7, 2021 - link
Boeings and Airbuses are never actually sold at their nominal prices; they cost far less, a non-disclosed number, for big buyers after gruesome haggling, sometimes less than half the "catalogue" price.
I think this is exactly what Intel is doing now: set the catalogue price high to avoid losing face, and give huge discounts to avoid losing market share.
duploxxx - Wednesday, April 7, 2021 - link
Well, easy conclusion.
EPYC 75F3 is the clear winner SKU and the must-have for most workloads.
This is based on price - performance - cores and the related 3rd-party sw licensing...
I wonder when Intel will be able to convince VMware to move from a 32-core licensing scheme to a 40-core one :)
They used to get all the dev favor when Pat was still in the house; I had several senior engineers in escalation calls stating that the hypervisor was optimised for Intel... Guess what: even under-optimised, if you are looking for a VM farm in 2020-2021 you are way better off with an AMD build.
WaltC - Wednesday, April 7, 2021 - link
If you can't beat the competition, then what? Ian seems to be impressed that Intel was finally able to launch a Xeon that's a little faster than its previous Xeon, but not fast enough to justify the price tag in relation to what AMD has been offering for a while. So here we are congratulating Intel on burning through wads more cash to produce yet-another-non-competitive result. It really seems as if Intel *requires* AMD to set its goals and to tell it where it needs to go--and that is sad. It all began with x86-64 and SDRAM from AMD beating out Itanium and RDRAM years ago. And when you look at what Intel has done since it's just not all that impressive. Well, at least we can dispense with the notion that "Intel's 10nm is TSMC's 7nm" as that clearly is not the case.JayNor - Wednesday, April 7, 2021 - link
What about the networking applications of this new chip? Dan Rodriguez's presentation showed gains of 1.4x to 1.8x for various networking benchmarks. Intel's entry into 5G infrastructure, NFV, vRAN, ORAN, hybrid cloud is growing faster than they originally predicted. They are able to bundle Optane, SmartNICs, FPGAs, eASIC chips, XeonD, P5900 family Atom chips... I don't believe they have a competitor that can provide that level of solution.
Bagheera - Thursday, April 8, 2021 - link
Patr!ck Patr!ck Partr!ck?
evilpaul666 - Saturday, April 10, 2021 - link
It only works in front of a mirror. Donning a hoodie helps, too.Oxford Guy - Wednesday, April 7, 2021 - link
There is some faulty logic at work in many of the comments, with claims like it's cheating to use a more optimized compiler.
It's not cheating unless:
• the compiler produces code that's so much more unstable/buggy that it's quite a bit more untrustworthy than the less-optimized compiler
• you don't make it clear to readers that the compiler may make the architecture look more performant simply because the other architectures may not have had compiler optimizations on the same level
• you use the same compiler for different architectures when using a different compiler for one or more other architectures will produce more optimized code for those architectures as well
• the compiler sabotages the competition, via things like 'genuine Intel'
Fact is that if a CPU can accomplish a certain amount of work in a certain amount of time, using a certain amount of watts under a certain level of cooling — that is the part's actual performance capability.
If that means writing machine code directly (not even assembly) to get to that performance level, so what? That's an entirely different matter, which is how practical/economical/profitable/effortful it is to get enough code to measure all of the different aspects of the part's maximum performance capability.
The only time one can really cite that as a deal-breaker is if one has hard data to demonstrate that by the time the hand-tuned/optimized code is written, changes to the architecture (and/or support chips/hardware) will obsolete the advantage — making the effort utterly fruitless, beyond intellectual curiosity concerning the part's ability. For instance, if one knows that Intel is going to integrate new instructions (very soon) that will make various types of hand-tuned assembly obsolete in short order, it can be argued that it's not worth the effort to write the code. People made this argument with some of AMD's Bulldozer/Piledriver instructions, on the basis that enough industry adoption wasn't going to happen. But, frankly... if you're going to make claims about the part's performance, you really should do what you can to find out what it is.
Oxford Guy - Wednesday, April 7, 2021 - link
One can, though, of course... include a disclaimer that 'it seems clear enough that, regardless of how much hand-tuned code is done, the CPU isn't going to deliver enough to beat the competition, if the competition's code is similarly hand-tuned' — if that's the case. Even if a certain task is tuned to run twice as fast, is it going to be twice as fast as tuned code for the competition's stuff? Is its performance per watt deficit going to be erased? Will its pricing no longer be a drag on its perceived competitiveness?For example, one could have wrung every last drop of performance out of Bulldozer but it wasn't going to beat Sandy Bridge E — a chip with the same number of transistors. Piledriver could beat at least the desktop version of Sandy in certain workloads when clocked well outside of the optimal (for the node's performance per watt) range but that's where it's very helpful to have tests at the same clock. It was discovered, for instance, that the Fury X and Vega had basically identical performance at the same clock. Since desktop Sandy could easily clock at the same 4.0 GHz Piledriver initially shipped with it could be tested at that rate, too.
Ideally, CPU makers would release benchmarks that demonstrate every facet of their chip's maximum performance. The concern about those being best-case and synthetic is less of a problem in that scenario because all aspects of the chip's performance would be tested and published. That makes cherry-picking impossible.
mode_13h - Thursday, April 8, 2021 - link
The faulty logic I see is that you seem to believe it's the review's job to showcase the product in the best possible light. No, that's Intel's job, and you can find plenty of that material at intel.com, if that's what you want.
Articles like this should focus on representing the performance of the CPUs as the bulk of readers are likely to experience it. So, even if using some vendor-supplied compiler with trick settings might not fit your definition of "cheating", that doesn't mean it's a service to the readers.
I think it could be appropriate to do that sort of thing, in articles that specifically analyze some narrow aspect of a CPU, for instance to determine the hardware's true capabilities or if it was just over-hyped. But, not in these sort of overall reviews.
Oxford Guy - Sunday, April 11, 2021 - link
'The faulty logic I see is that you seem to believe it's the review's job to...'
'I think it could be appropriate to do that sort of thing, in articles that...'
Don't contradict yourself or anything.
If you're not interested in knowing how fast a CPU is that's ... well... I don't know.
Telling people to go for marketing info (which is inherently deceptive — the entire fundamental reason for marketing departments to exist) is obviously silly.
mode_13h - Monday, April 12, 2021 - link
> Don't contradict yourself or anything.
I think the point of confusion is that I'm drawing a distinction between the initial product review and subsequent follow-up articles they often publish to examine specific points of interest. This would also allow for more time to do a more thorough investigation, since the initial reviews tend to be conducted under strict deadlines.
> If you're not interested in knowing how fast a CPU is that's ... well... I don't know.
There's often a distinction between the performance, as users are most likely to experience it, and the full capabilities of the product. I actually want to know both, but I think the former should be the (initial) priority.
ballsystemlord - Thursday, April 8, 2021 - link
Spelling and grammar errors (there are a lot!):
"At the same time, we have also spent time a dual Xeon Gold 6330 system from Supermicro, which has two 28-core processors,..."
Nonsensical English: "time a dual". I haven't the faintest what you were trying to say.
"DRAM latencies here are reduced by 1.7ns, which isn't very much a significant difference,..."
Either use "very much", or use "a significant":
DRAM latencies here are reduced by 1.7ns, which isn't a very significant difference,..."
"Inspecting Intel's prior disclosures about Ice Lake SP in last year's HotChips presentations, one point sticks out, and that's is the "SpecI2M optimisation" where the system is able to convert traditional RFO (Read for ownership) memory operations into another mechanism"
Excess "is":
"Inspecting Intel's prior disclosures about Ice Lake SP in last year's HotChips presentations, one point sticks out, and that's the "SpecI2M optimisation" where the system is able to convert traditional RFO (Read for ownership) memory operations into another mechanism"
"It's a bit unfortunate that system vendors have ended up publishing STREAM results with hyper optimised binaries that are compiled with non-temporal instructions from the get-go, as for example we would not have seen this new mechanism on Ice Lake SP with them"
You need to rewrite the sentence or add more commas to break it up:
"It's a bit unfortunate that system vendors have ended up publishing STREAM results with hyper optimised binaries that are compiled with non-temporal instructions from the get-go, as, for example, we would not have seen this new mechanism on Ice Lake SP with them"
"The latter STREAM results were really great to see as I view is a true design innovation that will benefit a lot of workloads."
Exchange "is" for "this as":
"The latter STREAM results were really great to see as I view this as a true design innovation that will benefit a lot of workloads."
Or discard "view" and rewrite as a diffinitive instead of as an opinion:
"The latter STREAM results were really great to see as this is a true design innovation that will benefit a lot of workloads."
"Intel's new Ice Lake SP system, similarly to the predecessor Cascade Lake SP system, appear to be very efficient at full system idle,..."
Missing "s":
"Intel's new Ice Lake SP system, similarly to the predecessor Cascade Lake SP system, appears to be very efficient at full system idle,..."
"...the new Ice Lake part to most of the time beat the Cascade Lake part,..."
"to" doesn't belong. Rewrite:
"...the new Ice Lake part can beat the Cascade Lake part most of the time,..."
"...both showcasing figures that are still 25 and 15% ahead of the Xeon 8380."
Missing "%":
"...both showcasing figures that are still 25% and 15% ahead of the Xeon 8380."
"Intel had been pushing very hard the software optimisation side of things,..."
Poor sentence structure:
"Intel had been pushing the software optimisation side very hard,..."
"...which unfortunately didn't have enough time to cover for this piece."
Missing "we":
"...which unfortunately we didn't have enough time to cover for this piece."
"While we are exalted to finally see Ice lake SP reach the market,..."
"excited" not "exalted":
"While we are excited to finally see Ice lake SP reach the market,..."
Thanks for the article!
Oxford Guy - Sunday, April 11, 2021 - link
Perhaps Purch would be willing to take you on as a volunteer unpaid intern for proofreading for spelling and grammar?
I would think there are people out there who would do it for resume building. So... if it bothers you perhaps you should make an inquiry.
evilpaul666 - Saturday, April 10, 2021 - link
Are the W-1300s going to use 10nm this year?
mode_13h - Saturday, April 10, 2021 - link
You mean the bottom-tier Xeons? Those are just mainstream desktop chips with fewer features disabled, so that question depends on when Alder Lake hits.
I'd say "no", because the Xeon versions typically lag the corresponding mainstream chips by a few months. So, if Alder Lake launches in November, then maybe we get the Xeons in February-March of next year.
The more immediate question is whether they'll release a Xeon version of Rocket Lake. I think that's likely, since they skipped Comet Lake and there are significant platform enhancements for Rocket Lake.
AdrianBc - Monday, April 12, 2021 - link
No, the W-1300 Xeons will be Rocket Lake. The top model will be Xeon W-1390P, which will be equivalent to the top i9 Rocket Lake, with 125 W TDP and 5.3 GHz maximum turbo.
rahvin - Tuesday, April 20, 2021 - link
Andrei does some of the best server reviews available, IMO.