I'm interested in what this means for the SD8cx successor. A 30% increase from using one or two X1 cores would be great; I'm using the SQ2 chip and it's fast enough for typical office tasks. Still, that's not enough to get within striking range of the M1 or even the A14. What the heck is in those Firestorms that makes them so fast?
On memory subsystems, what are Qualcomm and Samsung doing wrong compared to Apple and HiSilicon? The M1's memory bandwidth is astonishingly high but that's from using custom parts. HiSilicon is doing a great job using standard ARM interconnects.
Agreed on the stupidly high GPU clocking. The SQ2 has a beefy GPU that performs well but it can get hot even in a large tablet form factor. It's time to stop the marketing departments from forcing engineers to chase pointless metrics.
> What the heck is in those Firestorms that makes them so fast?
A trifecta that I share with people because this question comes up so often.
Technical Part 1: https://www.anandtech.com/show/16226/apple-silicon...
Technical Part 2: https://www.anandtech.com/show/16252/mac-mini-appl...
Consumer + Approachable: https://www.youtube.com/watch?v=3SG5e4z-Ygg
Technical Background + More Approachable: https://www.youtube.com/watch?v=cAjarAgf0nI
As Upscaled wrote, "It's not magic. It's good design." And why don't AMD / Intel have these same good designs? The videos above give thorough answers. For example, the last link explains why it matters that Firestorm (A14) and Lightning (A13) have an 8-wide decode, which is one of many major "better designs" versus competing Arm & x86 CPUs.
Uh, wait. Delete that Consumer + Approachable link. I definitely skipped too much of that one. What on Earth is he going on about with 1T vs 1C? Wow, I'm a little stunned at how bad this Upscaled video is. The dev docs + AnandTech are much more reliable.
LoL, Engadget. He keeps saying multithreading when he means SMT or Hyper-Threading (Intel's version of SMT).
Yes! Never mind that 1T vs 1C shows negligible IPC differences in general computing; 1% at best. Please go test an i5-8600K vs an i7-8700K: one has SMT, one does not. Terrible to see the WCCFTech disinformation cycle reach a mainstream audience so quickly.
https://www.anandtech.com/show/16261/investigating...
And then, out of left field, he throws up a Cinebench multi-core score: "See? Intel and Apple are actually very close to each other." But his comparison was the 4+4 M1 vs an 8C Intel...
I wish I could edit comments. I give up on consumer YouTube videos; I saw his earlier interview with RISC's founders and it seemed halfway decent. I'm a fool.
Don't be offended, but I think what you posted is complete BS. First of all, you are comparing arm64 CPUs and x86 CPUs; second, it is very arguable that the arm64 cores Apple uses are faster than the x86 cores of, for example, a Ryzen CPU.
The second thing I want to point out is about the width of the pipeline. It's not that Apple is a genius or that Intel and AMD are stupid: the x86 architecture was built to have a narrowish pipeline and run more cycles, and in fact the pipelines in Intel and AMD CPUs are as wide as that architecture allows. The Arm architecture, on the other hand, lets you use a wider pipeline. However, saying a narrower or wider pipeline is better is pointless, because performance comes down to how you organise the CPU around that specific pipeline, so...
> What the heck is in those Firestorms that makes them so fast?
The same thing since A9 again (CMIIW): super wide decoder + super big cache. Apple isn't stingy when it comes to die size and Apple SoCs are always bigger than Snapdragon on the same generation and process node. 4mm^2 difference is huge when we're talking at nm level. What's weird, Exynos is even bigger but can't match these two. No idea what Samsung put there.
"What's weird, Exynos is even bigger but can't match these two. No idea what Samsung put there."
This is probably due to TSMC having a *far* denser 5nm process node than Samsung's 5nm node. Per the article below, TSMC's 5nm node tops out at 173 million transistors per mm^2 while Samsung's 5nm node reaches only 126.5 MTr/mm^2 (i.e. barely denser than TSMC's second-gen 7nm+ node) due to much more, er, "conservative" design choices (Samsung basically just switched from DDB cells in 7nm to SDB cells; the article explains what that means).
https://semiwiki.com/semiconductor-manufacturers/s...
What is often not clear is that the quoted transistor densities of each process node are always the *maximum* transistor densities, not the actual densities used to fab a die. For instance, Intel has three different 10nm node variants with three different densities: a low, mid and high density variant (ranging, I believe, from ~55 million to ~100 million transistors per mm^2). The last one is the only one that has been widely reported; the other two were intended for the tech-savvy press and audience.
Each Intel 10nm die has a mix of all three libraries, but each design is (obviously) *fixed* with a precise mix of the three. The desktop parts always have a higher percentage of low density cells because these need to clock higher, and vice versa for the mobile parts. Mobile phones are efficiency focused, so their SoCs have the highest percentage of the highest density variant of each process node that is possible (without hindering performance too much).
That is an additional reason their clocks top out at ~3 GHz. Since the two SoCs of the article are both mobile SoCs of an almost identical design, we can assume a roughly equivalent percentage of the highest-density cells each process node maxes out at. Thus, if all else were equal (including the same iGPU), Samsung's SoC would have a roughly ~37% larger die than TSMC's (173 / 126.5 ≈ 1.37). That must be the main reason Samsung kept the cache sizes of the X1 and the A55 cores low.
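To sanity-check that arithmetic, a quick sketch (the densities are the quoted maximum figures; the equivalent-cell-mix assumption is mine):

```python
# Toy sanity check of the density arithmetic above. Both densities are the
# quoted *maximum* figures; assuming an equivalent cell mix on both nodes
# is a simplification.
tsmc_n5 = 173.0       # MTr/mm^2
samsung_5lpe = 126.5  # MTr/mm^2

# Same transistor budget on both nodes: area scales inversely with density.
area_ratio = tsmc_n5 / samsung_5lpe
print(f"Samsung die would be ~{area_ratio - 1:.0%} larger")  # ~37%
```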
p.s.2 I just noticed that the Snapdragon 888 is also fabbed on Samsung's 5nm node. While that renders the comparison between the two SoCs in my comment above moot, the other things I pointed out might have some "academic" value regarding process nodes (I have no way to delete the comment anyway..).
Hi, thank you for your explanation. Do you know how many transistors the Snapdragon 888 and Exynos 2100 have? It is not written anywhere.
I'm not an expert by any means, but I think Samsung's biggest problem was always optimisation - they use lots of die area for computing resources but the memory interfaces aren't optimised well enough to feed the beast, and they kept trying to push clocks higher to compensate.
The handy car analogy would be:
Samsung - Dodge Viper. More cubes! More noise! More fuel! Grrr.
Qualcomm / ARM - Honda Civic. Gets you there. Efficient and compact.
Apple - Bugatti Veyron. Big engine, but well-engineered. Everything absolutely *sings*.
You're right, but you also don't really touch on why Apple can do that and x86 designs can't. The issue is that uOP decoding on x86 is *awfully* slow and inefficient on power.
This was explained to me as follows:
Variable-length instructions are an utter nightmare to work with. I'll try to explain with regular words how a decoder handles variable length. Here are all the instructions coming in:
x86: addmatrixdogchewspout
ARM: dogcatputnetgotfin
Now, ARM is fixed length (3-letter words only), so if I'm decoding them, I just add a space after every 3 letters.
ARM: dogcatputnetgotfin
ARM decoded: dog cat put net got fin
Done. Now I can re-order them in a huge buffer, avoid dependencies, and fill my execution ports on the backend.
x86 is variable length. This means I cannot reliably figure out where the spaces should go, so I have to try all of them and then throw out what doesn't work. Look at how much more work there is to do:
x86: addmatrixdogchewspout
reading frame 1 (n=3): addmatrixdogchewspout -> partially decoded ops: add, _, dog, _, _
reading frame 2 (n=4): matrixchewspout -> partially decoded ops: add, _, dog, chew, _
reading frame 3 (n=5): matrixspout -> partially decoded ops: add, _, dog, chew, spout
reading frame 4 (n=6): matrix -> partially decoded ops: add, matrix, dog, chew, spout
Fully expanded micro-ops: add, ma1, ma2, ma3, ma4, dog, ch1, ch2, ch3, sp1, sp2, sp3
This is why most x86 cores only have a 3-4 wide frontend. Those decoders are massive, and extremely energy intensive. They cost a decent bit of transistor budget and a lot of thermal budget even at idle. And they have to process all the different lengths and then unpack them, like I showed above with "regular" words. They have excellent throughput because they expand instructions into a ton of micro-ops... BUT that expansion is inconsistent, and hilariously inefficient.
This is why x86/64 cores require SMT for the best overall throughput -- the timing differences create plenty of room for other stuff to be executed while waiting on large instructions to expand. And with this example... we only stepped up to 6-byte instructions. x86 is 1-15 bytes so imagine how much longer the example would have been.
Apple doesn't bother with SMT on their ARM core design, and instead goes for a massive reorder buffer, and only presents a single logical core to the programmer, because their 8-wide design can efficiently unpack instructions, and fit them in a massive 630μop reorder buffer, and fill the backend easily achieving high occupancy, even at low clock speeds. Effectively, a reorder buffer, if it's big enough, is better than SMT, because SMT requires programmer awareness / programmer effort, and not everything is parallelizable.
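If it helps, here is the same word analogy as a toy script (purely illustrative; this is a made-up dictionary decoder, not a real ISA decoder):

```python
# Toy model of the word analogy above: fixed-length "ARM" words vs
# variable-length "x86" words. Illustration only, not a real ISA decoder.

def decode_fixed(stream: str, width: int = 3) -> list[str]:
    # Every instruction is `width` letters, so every boundary is known up
    # front; a real core can decode all of these slots in parallel.
    return [stream[i:i + width] for i in range(0, len(stream), width)]

def decode_variable(stream: str, known_ops: set[str]) -> list[str]:
    # Boundaries are unknown: at each position we must try candidate
    # lengths until one matches, and only then do we learn where the next
    # instruction starts. The scan is inherently serial.
    ops, i = [], 0
    while i < len(stream):
        for n in range(3, 7):  # candidate lengths, like the 3-6 letter words
            if stream[i:i + n] in known_ops:
                ops.append(stream[i:i + n])
                i += n
                break
        else:
            raise ValueError(f"cannot decode at offset {i}")
    return ops

print(decode_fixed("dogcatputnetgotfin"))
# -> ['dog', 'cat', 'put', 'net', 'got', 'fin']
print(decode_variable("addmatrixdogchewspout",
                      {"add", "matrix", "dog", "chew", "spout"}))
# -> ['add', 'matrix', 'dog', 'chew', 'spout']
```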
I'm not sure the SPECint2006 benchmark is really reliable; it has been around for a long time and I don't think it's trustworthy any more with these powerful new processors. So I don't think it's very reliable or says anything precise. I think you shouldn't trust this benchmark 100%.
"Looking at all these results, it suddenly makes sense as to why Qualcomm launched another bin/refresh of the Snapdragon 865 in the form of the Snapdragon 870."
So this means Qualcomm is hedging its bets by having two flagship chips on separate TSMC and Samsung processes? Hopefully the situation will improve once X1 cores get built on TSMC 5nm and there's more experience with integrating X1 + A78. All this also makes SD888 phones a bit pointless if you already have an SD865 device.
Why would they skimp on the cache? Was the neural engine or something else with higher priority getting the silicon?
I think Samsung was rushing, and it's usually easier to stamp out something that's smaller (cache takes a lot of silicon real estate). Why they rushed was down to the switch from their M-cores to the X-core, and also internalising the 5G radio.
Here's the weird part: I actually think this time their Mongoose cores would have been competitive. Unlike Andrei, I estimated the Cortex-X1 was going to be a load of crap, and it seems I was right. With node parity with Qualcomm, the immature implementation that is the X1, and a further refined Mongoose core, they would have been quite competitive (better/same/worse), though that's not saying much after looking at Apple.
How do I figure? The Mongoose core started as a Cortex-A57 alternative that was competitive against Cortex-A72 cores. So it started as a mid-core (Cortex-A72 class) and evolved into a big-core implementation by 2018 with the S9, when the cores began to get really wide, really fast, and really hot/thirsty. Those are great properties for a large tablet or ultrabook, but not for a smaller handheld.
There was a precedent for this in the overclocked QSD 845 SoCs, the 855+, and the subpar QSD 865 implementation. Heck, it goes all the way back to 2016, when MediaTek was designing 2+4+4-core chipsets (and they failed miserably, as you would imagine). I think when consumers buy these, companies send orders, fabs build them, etc., they always forget about the software. This is what separates Apple from Qualcomm, and Qualcomm from the rest. You can either brute-force your way to the top, or try to do things more cost- and thermal-efficiently.
> Unlike Andrei, I estimated the Cortex-X1 was going to be a load of crap, and seems I was right.
The X1 *is* great, and far better than Samsung's custom cores.
First of all, apologies for sounding crass. You're a professional in this field and I'm merely an enthusiast (aka armchair expert), so take what I say with a grain of salt; if you correct me, I stand corrected.
Nevertheless, I'm very unimpressed by big cores: the Mongoose M5, to a lesser extent the Cortex-X1, and to a much, much lesser extent the Firestorm. I do not think the X1 is great. Remember, the "middle cores" still haven't hit their limits, so it makes little sense to go even thirstier/hotter. Even if the power and thermal issues weren't so dire with these big cores, the performance difference between the middle cores and the big cores is negligible, and there are no applications that are optimised for, or demand, the big cores. Apple's big-core implementation is much more optimised, they're smarter about thermals, and the performance delta between it and their middle cores is substantial; hence their implementation works and compares favourably against the X1/M5.
I can see a future for big cores. Yet I think it might involve killing the little cores (A53/A55) and replacing them with general-purpose cores that are almost as efficient yet perform much better, to act as middle cores. Otherwise latency is always going to be an issue when shifting work from one core to another, then another. I suspect the Cortex-X2 will right many wrongs of the X1; combined with a node jump, it should hopefully be a solid platform, maybe similar to the 20nm Cortex-A57 versus 16nm Cortex-A72 evolution we saw back in 2016. The vendors have little freedom when it comes to implementing the X1 cores, and I suspect things will ease up for the X2, which could mean operating at reasonable levels.
So even with the current (and future) drawbacks of big cores, I think they could be a good addition for several reasons: application-specific optimisations and external docks. We might get a DeX implementation that's native to Android/AOSP, combined with an external dock that provides higher power delivery AND adequate active cooling. I can see that as a boon for content creators and entertainment consumers alike. My eye is on emulation performance; perhaps this brute force can help stabilise the weak Switch and PS2 emulation currently on Android (Wii U next?).
The improvements with the 888 in DamonPS2 and Egg NS are quite good. Check some vids on YouTube.
Actually, Samsung still has M6 cores in its belly; the development team was shut down only after they completed the M6 cores. Difficult to say whether they would have been better than an X1. However, it seems that Arm rushed this whole A78 and X1 thing, and Samsung rushed to put too much stuff in the CPU with evidently not enough time to do it well.
Feels like 20nm all over again. The move to Samsung's fab certainly did not help the new SD888, and Samsung's Exynos is able to close the performance gap since they are on the same node. In fact, this review also somewhat confirms that Nvidia's jump to Samsung's 8nm contributed to the high power consumption and lower clock speeds.
That would be saying Samsung's 8nm is worse than TSMC 12nm; it's not that bad, it should be a bit better than TSMC 10nm.
I assumed they meant higher power relative to TSMC 7nm - of course overall power is still a little higher than Turing on TSMC 12nm because of the higher logic density.
Samsung's 8nm is based on their 10nm and can be considered a more refined variant, with about a 10% improvement in efficiency. TSMC's 12nm is based on their 16nm, with about the same efficiency improvement. 10LPP vs 14LPP is about 40% less power, and 14LPP was computed to be about 25% less efficient than 16FF+, which would put 8LPP at roughly 30% lower power consumption than 16FF+. TSMC 10nm should be around 40% less power than 16FF+, so Samsung 8nm is in fact worse than TSMC 10nm.
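Chaining those rough figures (all of them approximate public estimates, so treat this as a sanity check rather than data):

```python
# Chaining the rough node-power figures quoted above. Every number is an
# approximate public estimate. All factors are power relative to TSMC 16FF+.
samsung_14lpp = 1.25                  # ~25% less efficient than 16FF+
samsung_10lpp = samsung_14lpp * 0.60  # 10LPP: ~40% less power than 14LPP
samsung_8lpp = samsung_10lpp * 0.90   # 8LPP: ~10% refinement over 10LPP
tsmc_10nm = 0.60                      # ~40% less power than 16FF+

print(f"Samsung 8LPP ~{samsung_8lpp:.2f}x 16FF+ power")  # ~0.68x
print(f"TSMC 10nm    ~{tsmc_10nm:.2f}x 16FF+ power")     # 0.60x
# 0.68 > 0.60, so by these numbers 8LPP indeed trails TSMC 10nm.
```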
Samsung 8nm for Nvidia doesn't have much impact in the desktop PEG scene, because those GPUs are already heavy on power consumption. TSMC would make them more efficient, but a new node always helps with either a performance boost or efficiency, not necessarily both temps / clocks and performance at once.
Nvidia wanted cheap manufacturing for its GPUs and more volume, but the latter is busted due to this BS being artificially pumped up by the mining craze & corona problem. That's why the A100 is on TSMC 7N instead of Samsung: HPC and the other hyperscalers need efficiency.
In mobile it matters a lot due to the stupid Li-ion garbage tech.
Efficiency for desktop GPUs matters a lot. At best you are limited by temperature and noise; at worst you are also limited by power consumption (primarily in OEM PCs). If a cooler can dissipate 375 watts at an acceptable noise and temperature threshold, then that's the max power the GPU can ship at (the ceiling is lower if overclocking headroom is considered).
Switching to TSMC would help temperatures, performance, and clocks. Lower power consumption means lower temperatures, and the TSMC node can also clock higher, which drives performance up. If using TSMC allows the chip to clock n% higher at the same power, ship it with n/2% more frequency, and now performance and OC headroom are higher while temps and power draw are lower.
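As a toy illustration of that last point, assuming the usual rough f^3 power-frequency rule of thumb (an assumption, not vendor data):

```python
# Toy numbers for the "clock n% higher at the same power, ship n/2%" point,
# assuming power grows roughly with f^3 along the voltage/frequency curve
# (a common rule of thumb, not vendor data).
def relative_power(freq: float, iso_freq: float) -> float:
    # Power at `freq` relative to the budget the node hits at `iso_freq`.
    return (freq / iso_freq) ** 3

n = 0.10                 # say the new node gives +10% clock at the old power
iso_freq = 1.0 + n       # frequency reached at a power budget of 1.0
ship_freq = 1.0 + n / 2  # ship at +5% instead

print(f"shipping power: {relative_power(ship_freq, iso_freq):.2f}x")  # ~0.87x
# +5% performance with ~13% less power/heat, leaving headroom for OC.
```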
Both of the major manufacturers' top-end GPUs are limited by power input and heat dissipation - that's why they rarely perform much better than the next tier down, despite having significantly more execution resources. They do better on a performance-per-watt basis, though, because they're operating at a saner part of the efficiency curve.
TSMC 12/16nm was roughly on par with Samsung 14nm.
Yes, when Apple split its SoC production between Samsung and TSMC that one year, when they were looking to replace Samsung with TSMC, it was found here and in other places that TSMC's larger process was 20% more power efficient than Samsung's smaller process. I think it was the 14nm node for Samsung and 16nm for TSMC. So nothing seems to have changed: Samsung's process technology remains inferior to that of TSMC.
What are you smoking? https://www.tomshardware.com/news/iphone-6s-a9-sam... The Samsung 14nm A9s had a 10% advantage.
Yeah, I think a lot of us had suspicions that was the case, but this is really confirming it. Interesting implications for Nvidia's next gen.
Crap, pure crap, Samsung. Qualcomm can do better on the same node with the same CPU IP. Pathetic. And people are still enthusiastic about the RDNA chip from Samsung; if this one is hot with Mali, the one with RDNA will be a toaster.
Honestly, this chip is probably on the rushed side. There was probably some work sunk into switching from the mongoose-series cores. I think Samsung will improve next generation substantially. I'm not surprised this generation they did not match Qualcomm, though I'm a bit surprised it's quite so decisive.
Yes, most certainly that is true. It also seems like the node is holding them back quite a lot, and I think the thermal characteristics of the Samsung process are worse than TSMC's, since both the SD888 and the Exynos exhibit very bad behaviour in this regard. The previous SD865 is much better here on TSMC 7nm, even though that chip is also pushed close to its limits.
Basically, process is critical. Anyone can say whatever they want, but process tech is critical for failure or success. We have so many examples in the industry that show this: Intel on its 14nm vs AMD on TSMC 7nm. AMD could not have done the things they have on GF 12nm; that is absolutely certain. Nvidia has just moved from TSMC 12nm to Samsung 8nm and... it shows. The new cards are hot and power hogs. Samsung needs to invest lots of R&D in their fabs, because their customers have taken a leap of faith with them this time, and if they feel they've lost too much by it, they will avoid Samsung in the future.
Lmao, this is the exact same statement made by Samsung fans every single year, and we know how it turned out. Oh, "the Exynos 990 will be revolutionary" or "the Exynos 9825 will kick ass" and so on and so forth. Now they are pinning their hopes on AMD GPUs. And when that's subpar, they will fabricate another pathetic excuse. Come on, aren't you guys tired of blindly defending Samsung?
Personally, I will keep hoping "the next Exynos is the one!" because I don't have a choice. In Europe their phones come only with Exynos, no Snapdragon.
Why not buy a OnePlus phone then?
Because of OneUI. That's why I bought the Hong Kong version of the S10. The phone still feels premium despite being two years old.
I agree; I'm currently on a Vivo, and I've noticed that OneUI is much better.
Because I think the Galaxy phone is better in terms of quality and consistency; it has no major flaws, and I think it can last well over 10 years.
OneUI, Samsung Knox, Samsung Pass, Secure Folder, SmartThings, updates... yes, updates, display, service and so on. I could be frustrated that in Europe I can only have the Exynos SoC, but I am not. My usage pattern will be far away from the max power of the phone.
I rarely trust any kind of blanket statement about "fans" from someone whose username is literally announcing themselves as a single-issue hater.
Not a Samsung fan. I was hopeful that a new CPU architecture could outperform tiny stock ARM cores; instead, ARM delivered a core that wasn't tiny, but I'm happy to pile on bad results when I see them.
This isn't the same deal at all. This, IMO, is equivalent to Qualcomm's 810, which disastrously rushed an A57 implementation and burned Samsung so badly that they doubled down on their current track.
The fact that they switched from their own IP to ARM's without falling behind Qualcomm in the generational cycle would seem to support your hypothesis.
That said, whether they'll improve "substantially" does remain to be seen - after all, they'll be integrating AMD GPU tech into a smartphone SoC for the first time since they sold Imageon to Qualcomm.
I don't expect the GPU to necessarily match up in efficiency in year one, but this CPU result, if it's not from a rushed implementation, is absolutely unacceptable. Qualcomm surely isn't changing the X1/A78 that much.
Yeah yeah. Every year someone says that.
Mali is just crap. The Kirin 9000 uses a lower-clocked, 70% wider Mali G78 configuration (MP24 vs MP14) built on TSMC N5, which is head and shoulders above Samsung 5LPE, and its efficiency at peak performance is still only on par with the Exynos 2100 configuration. In power-saving mode, where it has an efficiency advantage, it's only 28% better with 70% more cores, so it's obvious the average frequency is throttled way down. Hopefully Nvidia takes Mali out to the farm soon.
And why should I care if the Snapdragon's CPU and GPU are marginally better? You won't feel the CPU difference when opening Chrome or Facebook, and such powerful GPUs in phones are utterly useless. Are you launching missiles or playing Star Citizen with your smartphones, guys? You can play almost all current games on a Mali-400MP GPU. The only thing that matters in this comparison is battery life, where we should give the props to Qualcomm; that is the only thing that actually matters.
Nah, these days there are games that support 144 fps and such; you need something like an Adreno 618 and up. For some stuff like Genshin, to run max graphics at a constant 50+ fps you basically need an SD865 minimum.
Dead by Daylight too; it's very demanding, and you need an SD865's GPU at least to hold 60 fps at high settings at all times. The SD855 is very good too, it just sometimes dips a bit below 60. And there is a gigantic visual difference between low and high.
I don't think Andrei gets enough credit for his work here, he's put hours upon hours into methodical, detailed articles for a website that's been forced into sponsored posts to make ends meet. God freaking bless
Ugh. I'm extremely curious what kind of a deal SLSI gave Qualcomm. The "dark horse" (in the West / outside China), Huawei, looks like they outdid themselves. Damn good job.
As expected, given Sammy's inferior fab node, it's almost pointless to upgrade from an SD865/+ device, except perhaps for a slightly better camera experience, and even that is highly debatable. As the owner of a Sony Xperia 1 II, I am perfectly willing to skip this generation entirely, wait for TSMC's 5nm to become widely available to Qualcomm and, most notably, wait to see what chip Huawei can produce at the end of the year to replace Kirin (I still think they will acquire MTK at some point this year).
I thought there was talk of Huawei selling their smartphone brands (Mate & P, I think)? I recall Huawei denying it, but the source who made the claim had apparently also made an earlier claim about Huawei, which Huawei likewise denied, that turned out to be correct.
I have an iPhone XR, and I'm waiting for the iPhone 13 for a reason. As you can see in the chart, the iPhone 12 didn't improve anything GPU-wise either.
I'd like 120Hz, USB-C, and an actually better GPU. One can hope 2021 will finally deliver.
What are you gonna do with that GPU? The current one handles every single task. It doesn't make sense to wait just for a GPU; there are no apps which can tell the difference between them.
If most folk haven't yet noticed: welcome to the cost of diminishing returns. What's worse: as we climb down the node ladder to one atom per feature, what do we do with all those trillions of transistors? Download your porn in just 5 seconds? That's worth paying for, dontcha think?
The GPU doesn't only increase in power but in efficiency too. For example, both the 11 Pro and the 13 Pro may run a game at a smooth 60 fps, but since (hopefully) the 13 Pro has a more efficient chip, the overall device temperature will be a few degrees lower, thus more comfortable to hold, with more battery life.
The table values aren't the long-term sustained performance points. I was trying to target a 4W measurement point for the table data; the values in the charts are what the phone will actually sustain, which is below 4W.
:) I said before that this makes more sense, because if it's not throttling after a while, that equals being constantly, artificially throttled. That said, these figures suggest far higher potential for the 865, should somebody be able to overclock it so that it boosts to >8W for a short while and then throttles back. From the ROG Phone 3's figures last year I thought the 865 was already relatively inefficient, but compared to this generation, whatever happened, none of the SoCs (not even the two on the TSMC node) are more efficient than the 865. At least those who bought the 865 should be satisfied. Even though Samsung's 5nm seems to have flopped, it's exactly where it should be according to performance predictions from a couple of years ago, so they're actually on track; it's just that this being a half-node one year behind TSMC isn't reflected in its nomenclature.
Thanks Andrei! This kind of review is why I read AT. Your results also confirm my view that QC had good reasons to "launch" the 870 as the backup option to 888-based devices; judging by your findings, even a plain 865 (no +) device will be a very competitive device, and at a lower price point to boot. When you write up your full review, please also cover whether Samsung will guarantee at least three full generational OS updates for their S21 devices for the US also; apparently, they do so for Europe. The absence of such guarantees has turned me off from buying "flagship" Android phones in recent years, and if Samsung comes through on that also for the US, I might reconsider a Sammy for 2021. Thanks!
Forgot to add this: The significant power consumption of either SoC plus the pretty, but still power-hungry display makes me wonder about the battery capacity Samsung chose for the S21 Ultra. My own view is that once the phone is big and the weight is over 200 g, may as well go really big on the battery; so, this looks like a case for >= 6,000 mAh to me, and in both meanings of the word "case".
Would the efficiency matter much in the real world? If you use your phone for regular things (calls, chat, browsing, a few apps) the sustained performance and power consumption shouldn't matter much. The cpu & GPU are only stressed for short bursts, unlike benchmarks and games. The higher frequencies of the Exynos might even make the phone feel a bit snappier. I think...
I've added the web-browsing tests - they're lighter than PCMark on the SoC. In the case of the S21U, the display is extremely efficient so it's still good.
The lower the brightness you use your phone at, the more the SoC difference will appear in the battery life you experience.
Yeah, they do. But Anand & Brian Klug's podcasts were the ones which attracted me the most; the best part was how well Anand designed the bench. Some AnandTech podcasts are required now.
So, next year are we gonna have a Snapdragon flagship on a better process node or not? There is absolutely no compulsion for Qualcomm to push for a better process node; Apple will run away with things.
Yea, they'd definitely skip the X1 core. Which honestly might be the correct move based on these results: the X1 is better, but maybe not enough better to justify the die space.
My experience with Samsung Electronics was heavy use of outsourcing and Indian workers on their software side. I wouldn't be surprised if that's the case for their hardware too. Very poor results for their 5LPE vs TSMC's N7. They're becoming obsessed with cutting costs.
It wouldn't, but they aren't local to South Korea, and India has a lot of IT competence, so it is a place many companies outsource to... and outsourcing usually doesn't help quality.
Process node performance has nothing to do with the nationality of the employees. It's just natural for different teams, fabs, equipment, techniques, etc... to show different results.
And I doubt they would abandon investments in their silicon fabrication R&D, as they're basically the only other serious player for 3nm at this moment (the others being stuck at 7nm).
They're behind the curve, but at least they're delivering new nodes (unlike Intel and GlobalFoundries, the two American fabs).
With Qualcomm owning the US market for Android, and with Samsung's continued Exynos woes, Apple's SoC R&D will return big dividends in the next few years.
They no longer show in download mode. They can be accessed via recovery > kernel logs, but I'm struggling to find them on-device, as the logs are massive and it doesn't seem possible to extract them without root in order to do a search.
Marketing runs the show, as always. More power consumption at the expense of battery destruction, like Apple's battery issue. That will force people to buy new shit again: another year of $1000+. As long as people do not stop buying this shit for Fartnite and the rest, this will continue forever. Utter shame.
About the bootloader unlock: does the S21 (Exynos 2100) still have it, or has greedy pig Samsung yanked that off too?
I'm in the market right now for a phone with no hole/notch and an SD card slot. I've even forgone the 3.5mm jack, since I can get my LG serviced or something for the ESS DAC anyway. Guess what? None exist, except from Sony and ASUS, and the latter isn't in the US market. Fucking bullshit, really.
Thanks for the great research and write-up, Andrei. Looks like I still won't be looking to Samsung for my next upgrade, as I'm stuck with Exynos here in the UK. Shame!
Andrei, also special thanks for the power draw comparison of the A55 little cores in the (TSMC N7) 865 vs the (Samsung 5nm) 888! That one graph tells us everything we need to know about what Samsung's current "5nm" is really comparable to. I really wonder whether QC's decision to choose Samsung's fabbing was based more on availability (or the absence thereof for TSMC's 5nm) or on price?
For such a thorough review it is shocking to see that the software versions (build numbers) used during the tests are not stated. It is absolutely essential that a review contain software versions, so that others can try to replicate the results, and so the reviewing site has references for re-tests.
The milc win is certainly from the data prefetcher. In simulation, milc also benefits massively from runahead execution, i.e. the same principle (bring in data earlier).
Has anyone identified a paper or patent that indicates what ARM is doing? A table-driven approach (Markov prefetcher) still seems impractical, and ARM doesn't go in for blunt solutions that just throw area at the problem. They might be doing something like scanning lines as they enter L2 for what look like plausible addresses, and prefetching based on those, which would cover a large range of pointer-based use cases and seems like the sort of smart, low-area solution they tend to favour.
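To make the "scan lines for plausible addresses" idea concrete, here is a toy software model (pure speculation on my part, not from any ARM paper or patent; the heap range is invented):

```python
# Toy software model of the idea above: scan a 64-byte line as it enters
# L2 and treat any 8-byte value that falls inside a plausible heap range
# as a prefetch candidate. Purely speculative illustration.
import struct

HEAP_LO, HEAP_HI = 0x7F00_0000_0000, 0x7F10_0000_0000  # invented "plausible" range

def prefetch_candidates(line: bytes) -> list[int]:
    assert len(line) == 64           # one cache line, eight 64-bit words
    hits = []
    for off in range(0, 64, 8):
        (value,) = struct.unpack_from("<Q", line, off)
        if HEAP_LO <= value < HEAP_HI:  # looks like a pointer
            hits.append(value & ~0x3F)  # prefetch its containing line
    return hits

# A fake line holding two plausible pointers among other data:
line = struct.pack("<8Q", 42, 0x7F03_1234_5678, 7, 0,
                   0x7F0F_0000_BEE0, 1, 2, 3)
print([hex(a) for a in prefetch_candidates(line)])
# -> ['0x7f0312345640', '0x7f0f0000bec0']
```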
Hope Qualcomm moves its next-gen flagship SoC back to TSMC. It cannot stay at such a disadvantage. Of course Samsung's 3nm could narrow the gap, but that is more for 2023 flagships. Disappointing to see Exynos disappoint again. How is the Exynos 1080 as a mid-range chipset?
Their 3nm is expected to be on par with TSMC N5. The expected gains over their 7nm are only 30% higher performance, 35% die area reduction, and 40-50% power reduction. Considering 5LPE is still behind N7P, that's not much; it will barely be on par with N5 in density, let alone efficiency.
You must be kidding... The Exynos 2100 is at least somewhat close to the Snapdragon 888 in CPU performance. Mali continues to be a problem, and remains so even for the Kirin 9000 on TSMC N5. Mongoose was an abomination that belonged maybe in 2015. Samsung Semiconductor is less competent than TSMC, but SARC's Mongoose team was a joke.
All those attempts to spend transistors stingily and boost performance through high frequency have failed miserably. Per-transistor performance seems to be decaying from node to node now. Flat or barely growing transistor counts = performance regression.
Andrei, when you're testing the actual phone, could you check the battery life with the 5G modem on and off, respectively? 5G modems are supposedly quite power hungry also, and, if it's possible to turn 5G off (but leaving 4G LTE on), it would be interesting to see just how much power 5G really consumes.
I guess this is due to binning; his tests show his Exynos 2100 is in the middle. Strange. Also, battery life is better on the 888, and external temps are about the same.
All this really doesn't look good for Windows on ARM if we're stuck with hot and hungry Qualcomm chips on Samsung 5nm. The 8cx and SQ on TSMC 7nm were very efficient but that's with slower A76 cores. I'm hoping a quad-X1 design on TSMC 5nm will be in the next iteration of the Surface Pro X or Galaxy Book S.
Mi 11 may have the vapor chamber for better cooling, but it also allows for a higher battery temperature. If they throttled at the same temps we could see how useful that thing actually is.
Not all S20s had that vapor chamber. Some just had a graphene layer, which in theory would give similar results. Don't know if the S21 uses graphene tho.
The battery life benchmarks are an indication of how invalid AnandTech's whole premise actually is. Pretty much ALL real-usage tests have shown BIG improvements in battery life between the S20 and S21, yet Andrei wants us to believe the stupid benchmark test that shows a "regression" between the Exynos 990 and Exynos 2100. What a joke..
I would say these results are incomplete rather than invalid. The PCMark Work 2.0 battery life test is a demanding mixed-usage benchmark. When running that benchmark, it isn't exactly a shock that the Exynos 2100 S21 Ultra should return slightly less battery life than the Exynos 990 S20 Ultra. AnandTech isn't alone in noting that when processing demanding workloads the Exynos 2100 draws more power (on average) than the Exynos 990. Andrei, for his part, is explicit that the Exynos 2100 is also significantly more performant than its predecessor. He does say that the increased performance wasn't achieved just through improved efficiency but also through greater power usage, and looking at the numbers that is hard to dispute.
There is a gap in the data, however. The full PCMark Work 2.0 battery life test produces a Work performance score that gives a more complete picture of how much work is completed, and at what rate, while executing the test. That would be very useful information to have. Still, it is undoubtedly the case that the reduction in battery life that Andrei mentions is due not to a regression but to the increased rate at which the Exynos 2100 executes work (when processing demanding mixed-usage workloads). While that information isn't provided for the PCMark battery life test, the GFXBench GPU-heavy test data (arranged in power-efficiency tables) does confirm the high power draw of the Exynos 2100 during peak performance bursts (which must bump up average power consumption as well), even as that chip roundly outperforms the Exynos 990.
Indeed, heavy mixed-usage workloads are not going to put the Exynos 2100's battery life in the best light. Still, Andrei did show results from a web-browsing battery life test, which will undoubtedly be useful to the many phone users who don't view the PCMark Work 2.0 battery life results as having much relevance for them. I, for one, am happy to have that information.
Andrei seems to be adding to/reworking the battery life data in this review.
Nice peak power consumption! It doesn't seem unlikely that in a few years we'll end up with a situation similar to Apple's batterygate on Snapdragon 888 devices. Way to go!
The Snapdragon A55 latency figure above, 264.371 ns @ 131072 KB, seems out of expectation: it is bigger than on the Exynos and Tensor. So what test pattern or lat_mem_rd command did you use for this test? Can you share it?
We’ve updated our terms. By continuing to use the site and/or by logging into your account, you agree to the Site’s updated Terms of Use and Privacy Policy.
123 Comments
Back to Article
serendip - Monday, February 8, 2021 - link
I'm interested in what this means for the SD8cx successor. A 30% increase from using one or two X1 cores would be great, I'm using the SQ2 chip and that's fast enough for typical office tasks. That's still not enough to get within striking range of the M1 or even A14. What the heck is in those Firestorms that makes them so fast?On memory subsystems, what are Qualcomm and Samsung doing wrong compared to Apple and HiSilicon? The M1's memory bandwidth is astonishingly high but that's from using custom parts. HiSilicon is doing a great job using standard ARM interconnects.
Agreed on the stupidly high GPU clocking. The SQ2 has a beefy GPU that performs well but it can get hot even in a large tablet form factor. It's time to stop the marketing departments from forcing engineers to chase pointless metrics.
ikjadoon - Monday, February 8, 2021 - link
>What the heck is in those Firestorms that makes them so fast?A trifecta that I share with people because this question comes up so often.
Technical Part 1: https://www.anandtech.com/show/16226/apple-silicon...
Technical Part 2: https://www.anandtech.com/show/16252/mac-mini-appl...
Consumer + Approachable: https://www.youtube.com/watch?v=3SG5e4z-Ygg
Technical Background + More Approachable: https://www.youtube.com/watch?v=cAjarAgf0nI
As Upscaled wrote, "It's not magic. It's good design." And why don't AMD / Intel have these same good designs? The videos above give thorough answers. For example, the last link explains why everyone cares Firestorm (A14) and Lightning (A13) have an 8-wide decode, which is one of many major "better designs" versus competing Arm & x86 CPUs.
ikjadoon - Monday, February 8, 2021 - link
Uh, wait. Delete. I definitely skipped too much of that one. What on Earth is he going on about 1T vs 1C?~~ Consumer + Approachable: https://www.youtube.com/watch?v=3SG5e4z-Ygg ~~
Wow, I'm a little stunned at how bad this Upscaled video is. The Dev Doc + AnandTech are much more reliable.
Nicon0s - Monday, February 8, 2021 - link
LoL, Engadget.He keeps saying multithreading when he means SMT or Hyperthreading(Intel's version of SMT).
ikjadoon - Monday, February 8, 2021 - link
Yes! Never mind 1T / 1C have shown negligence IPC differences in general computing. 1% at best. Please go test an i5-8600K vs an i7-8700K: one has SMT, one does not. Terrible to see the WCCFTech disinformation cycle reach a mainstream audience so quickly.https://www.anandtech.com/show/16261/investigating...
And then he, out of left field, throws up a Cinebench multi-core score: "See? Intel and Apple are actually very close to each other." But, his comparison was the 4+4 M1 vs an 8C Intel...
I wish I could edit comments. I give up on consumer YouTube videos; I saw his earlier interview with RISC's founders and it seemed halfway decent. I'm a fool.
Archer_Legend - Tuesday, February 9, 2021 - link
Don't be offended but I think that what you posted is completely bs.First of all you are comparing arm64 cpus and x86 cpus, second the arm 64 cores used by apple are very arguably faster than the x86 cores of for example a ryzen cpu.
The second thing which I would want to point out is that about the widht of the pipeline, it is not that apple is a genious or intel and amd are stupid: the x86 architecture was built to have a narrowish pipeline and do more cycles in fact the width of the pipeline in intel and amd cpus is as wide as it gets.
Arm arch on the other end lets you use a wider pipeline, however saying using a narrower or wider pipeline is better or not is pointless because performance comes down to how you organise the cpu around that specific pipeline so....
leledumbo - Monday, February 8, 2021 - link
> What the heck is in those Firestorms that makes them so fast?The same thing since A9 again (CMIIW): super wide decoder + super big cache. Apple isn't stingy when it comes to die size and Apple SoCs are always bigger than Snapdragon on the same generation and process node. 4mm^2 difference is huge when we're talking at nm level. What's weird, Exynos is even bigger but can't match these two. No idea what Samsung put there.
Santoval - Tuesday, February 9, 2021 - link
"What's weird, Exynos is even bigger but can't match these two. No idea what Samsung put there."This is probably due to TSMC having a *far* denser 5nm process node compared to Samsung's process 5nm node. Per the article below TSMC's 5nm node tops at 173 million transistors per mm^2 while Samsung's 5nm node reaches only 126.5 MTr/mm^2 (i.e. barely denser than TSMC's second gen 7nm+ node) due to much more, er, "conservative" design choices (Samsung basically just switched from DDB cells in 7nm to SDB cells; the article explains what that means).
What is often not clear is that the quoted transistor densities of each process node are always the *maximum* transistor densities, not the actual transistor densities used to fab a die. For instance Intel have three different 10nm node variants with three different densities, a low, mid and high density variant (ranging I believe from ~55 million to ~100 million transistors per mm^2). The last one is the only one that has been widely reported, the other two were intended for the tech savvy press and audience.
Each Intel 10nm die has a mix of all three libraries, but each design is (obviously) *fixed* with a precise mix of the three. The desktop parts always have a higher percentage of low density cells because these need to clock higher, and vice versa for the mobile parts. Mobile phones are efficiency focused, so their SoCs have the highest percentage of the highest density variant of each process node that is possible (without hindering performance too much).
That is an additional reason their clocks top at ~3 GHz. Since the two SoCs of the article are both mobile SoCs of an almost identical design we can assume a roughly equivalent percentage of the highest density cells each process node maxes out at. Thus, if all else was being equal (including the same iGPU) Samsung's SoC would have a roughly ~27% larger die than TSMC's SoC. That must be the main reason Samsung kept the cache sizes of the X1 and the A55 cores low.
Santoval - Tuesday, February 9, 2021 - link
p.s. Sorry, I forgot the link to the article :https://semiwiki.com/semiconductor-manufacturers/s...
Santoval - Tuesday, February 9, 2021 - link
p.s.2 I just noticed that the Snapdragon 888 is also fabbed with Samsung's 5nm node. While that rendered the comparison between the two SoCs in my above comment moot the other things I pointed out might have some "academic" value regarding process nodes (I have no away to delete the comment anyway..).mohamad.zand - Thursday, June 17, 2021 - link
Hi , thank you for your explanationDo you know how many transistors Snapdragon 888 and Exynos 2100 are?
It is not written anywhere
Spunjji - Thursday, February 11, 2021 - link
I'm not an expert by any means, but I think Samsung's biggest problem was always optimisation - they use lots of die area for computing resources but the memory interfaces aren't optimised well enough to feed the beast, and they kept trying to push clocks higher to compensate.The handy car analogy would be:
Samsung - Dodge Viper. More cubes! More noise! More fuel! Grrr.
Qualcomm / ARM - Honda Civic. Gets you there. Efficient and compact.
Apple - Bugatti Veyron. Big engine, but well-engineered. Everything absolutely *sings*.
Shorty_ - Monday, February 15, 2021 - link
you're right but you also don't really touch why Apple can do that and X86 designs can't. The issue is that uOP decoding on x86 is *awfully* slow and inefficient on power.This was explained to me as follows:
Variable-length instructions are an utter nightmare to work with. I'll try to explain with regular words how a decoder handles variable length. Here's all the instructions coming in:
x86: addmatrixdogchewspout
ARM: dogcatputnetgotfin
Now, ARM is fixed length (3-letters only), so if I'm decoding them, I just add a space between every 3 letters.
ARM: dogcatputnetgotfin
ARM decoded: dog cat put net got fin
done. Now I can re-order them in a huge buffer, avoid dependencies, and fill my execution ports on the backend.
x86 is variable length, This means I cannot reliably figure out where the spaces should go. so I have to try all of them and then throw out what doesn't work.
Look at how much more work there is to do.
x86: addmatrixdogchewspoutreading frame 1 (n=3): addmatrixdogchewspout
Partially decoded ops: add, , dog, , ,
reading frame 2 (n=4): matrixchewspout
Partially decoded ops: add, ,dog, chew, ,
reading frame 3 (n=5): matrixspout
Partially decoded ops: add, ,dog, chew, spout,
reading frame 4 (n=6): matrix
Partially decoded ops: add, matrix, dog, chew, spout,
Fully Expanded Micro Ops: add, ma1, ma2, ma3, ma4, dog, ch1, ch2, ch3, sp1, sp2, sp3
This is why most x86 cores only have a 3-4 wide frontend. Those decoders are massive, and extremely energy intensive. They cost a decent bit of transistor budget and a lot of thermal budget even at idle. And they have to process all the different lengths and then unpack them, like I showed above with "regular" words. They have excellent throughput because they expand instructions into a ton of micro-ops... BUT that expansion is inconsistent, and hilariously inefficient.
This is why x86/64 cores require SMT for the best overall throughput -- the timing differences create plenty of room for other stuff to be executed while waiting on large instructions to expand. And with this example... we only stepped up to 6-byte instructions. x86 is 1-15 bytes so imagine how much longer the example would have been.
Apple doesn't bother with SMT on their ARM core design, and instead goes for a massive reorder buffer, and only presents a single logical core to the programmer, because their 8-wide design can efficiently unpack instructions, and fit them in a massive 630μop reorder buffer, and fill the backend easily achieving high occupancy, even at low clock speeds. Effectively, a reorder buffer, if it's big enough, is better than SMT, because SMT requires programmer awareness / programmer effort, and not everything is parallelizable.
Karim Braija - Saturday, February 20, 2021 - link
Je suis pas sur si le benchmark SPENCint2006 est vraiment fiable, en plus je pense que ça fait longtemps que ce benchmark est là depuis un moment et je pense qu'il n'a plus bonne fiabilité, ce sont de nouveaux processeurs puissant. Donc je pense que ce n'est pas très fiable et qu'il ne dit pas des choses précises. Je pense que faut pas que vous croyez ce benchmark à 100%.serendip - Monday, February 8, 2021 - link
"Looking at all these results, it suddenly makes sense as to why Qualcomm launched another bin/refresh of the Snapdragon 865 in the form of the Snapdragon 870."So this means Qualcomm is hedging its bets by having two flagship chips on separate TSMC and Samsung processes? Hopefully the situation will improve once X1 cores get built on TSMC 5nm and there's more experience with integrating X1 + A78. All this also makes SD888 phones a bit pointless if you already have an SD865 device.
Bluetooth - Monday, February 8, 2021 - link
Why would they skimp on the cache. Was neural engine or something else with higher priority getting silicon?Kangal - Tuesday, February 9, 2021 - link
I think Samsung was rushing, and its usually easier to stamp out something that's smaller (cache takes alot of silicon estate). Why they rushed was due to a switch from their M-cores to the X-core, and also internalising the 5G-radio.Here's the weird part, I actually think this time their Mongoose Cores would be competitive. Unlike Andrei, I estimated the Cortex-X1 was going to be a load of crap, and seems I was right. Having node parity with Qualcomm, the immature implementation that is the X1, and the further refined Mongoose core... it would've meant they would be quite competitive (better/same/worse) but that's not saying much after looking at Apple.
How do I figure?
The Mongoose core was a Cortex A57 alternative which was competitive against Cortex A72 cores. So it started as midcore (Cortex A72) and evolved into a highcore implementation as early as 2019 with the S9 when they began to get really wide, really fast, really hot/thirsty. Those are great for a Large Tablet or Ultrabook, but not good properties for a smaller handheld.
There was a precedence for this, in the overclocked QSD 845 SoCs, 855+, and the subpar QSD 865 implementation. Heck, it goes all the way back to 2016 when MediaTek was designing 2+4+4 core chipsets (and they failed miserably as you would imagine). I think when consumers buy these, companies send orders, fabs design them, etc... they always forget about the software. This is what separates Apple from Qualcomm, and Qualcomm from the rest. You can either brute-force your way to the top, or try to do things more cost/thermal efficiently.
Andrei Frumusanu - Tuesday, February 9, 2021 - link
> Unlike Andrei, I estimated the Cortex-X1 was going to be a load of crap, and seems I was right.The X1 *is* great, and far better than Samsung's custom cores.
Kangal - Wednesday, February 10, 2021 - link
First of all, apologies for sounding crass.Also, you're a professional in this field, I'm merely an enthusiast (aka Armchair Expert) take what I say with a grain of salt. So if you correct me, I stand corrected.
Nevertheless, I'm very unimpressed by big cores: Mongoose M5, to a lesser extent the Cortex-X1, and to a much Much much lesser extent the Firestorm. I do not think the X1 is great. Remember, the "middle cores" still haven't hit their limits, so it makes little sense to go even thirstier/hotter. Even if the power and thermal issues weren't so dire with these big-cores, the performance difference between the middle cores vs big cores is negligible, also there is no applications that are optimised/demand the big cores. Apple's big-core implementation is much more optimised, they're smarter about thermals, and the performance delta between it and the middle-cores is substantial, hence why their implementation works and why it favours compared to the X1/M5.
I can see a future for big-cores. Yet, I think it might involve killing the little-cores (A53/A55), and replacing it with a general purpose cores that will be almost as efficient yet be able to perform much better to act as middle-cores. Otherwise latency is always going to be an issue when shifting work from one core to another then another. I suspect the Cortex-X2 will right many wrongs of the X1, combined with a node jump, it should hopefully be a solid platform. Maybe similar to the 20nm-Cortex A57 versus the 16nm-Cortex A72 evolution we saw back in 2016. The vendors have little freedom when it comes to implementing the X1 cores, and I suspect things will ease up for X2, which could mean operating at reasonable levels.
So even with the current (and future) drawbacks of big-cores, I think they could be a good addition for several reasons: application-specific optimisations, external dock. We might get a DeX implementation that's native to Android/AOSP, and combined that with an external dock that provides higher power delivery AND adequate active-cooling. I can see that as a boon for content creators and entertainment consumers alike. My eye is on emulation performance, perhaps this brute-force can help stabilise the weak Switch and PS2 emulation currently on Android (WiiU next?).
iphonebestgamephone - Monday, February 15, 2021 - link
The improvement with the 888 in damonps2 and eggns are quite good. Check some vids on youtube.Archer_Legend - Tuesday, February 9, 2021 - link
Actually samsung has still M6 cores in its belly, the development team was shut down only after they completed the M6 cores.Difficoult to say if they would have been better than an X1.
However it seems that arm has rushed this whole a78 and X1 thing and samsung rushed to put too much stuff in the cpu with evidently not enough time to do it well
watzupken - Monday, February 8, 2021 - link
Feels like a 20nm all over again. The move to Samsung's fab certainly did not help with the new SD 888 and Samsung's Exynos is able to close the performance gap since they are on the same node. In fact, this review also somewhat confirmed that Nvidia's jump to Samsung's 8nm certainly contributed to the high power consumption and lower clockspeed.s.yu - Monday, February 8, 2021 - link
That would be saying Samsung's 8nm is worse than TSMC 12nm, it's not that bad, it should be a bit better than TSMC 10nm.Spunjji - Monday, February 8, 2021 - link
I assumed they meant higher power relative to TSMC 7nm - of course overall power is still a little higher than Turing on TSMC 12nm because of the higher logic density.Otritus - Monday, February 8, 2021 - link
Samsung's 8nm is based on their 10nm, and can be considered a more refined variant with about a 10% improvement in efficiency. TSMC's 12nm is based on their 16 nm, with about the same efficiency improvements. 10lpp vs 14lpp is about 40% less power. 14lpp was computed to be about 25% less efficient than 16ff+. Which would mean 8lpp has around 20% lower power consumption than 16ff+. Tsmc 10nm should be around 40% less power than 16ff+, so Samsung 8nm is in fact worse than Tsmc 10nm.Silver5urfer - Monday, February 8, 2021 - link
Samsung 8nm for Nvidia doesn't have much impact in the Desktop PEG scene. Because the GPUs are already heavy on power consumption. Having a TSMC will make it efficient but it won't help with temps / clocks or the performance, always a new node helps with either get perf boost or efficiency.Nvidia wanted cheap manufacturing for it's GPUs and more volume. But the latter is busted due to artificially pumping up this BS by Mining craze & corona problem. That's why A100 is on TSMC 7N instead of Samsung, because HPC and other hyperscalers need efficiency.
In mobile it matters a lot due to the stupid Li Ion garbage tech.
Otritus - Monday, February 8, 2021 - link
Efficiency for desktop gpus matters a lot. At best you are limited by temperature and noise, at worst you are also limited by power consumption (primarily oem pcs). If a cooler can dissipate 375 watts at an acceptable noise and temperature threshold, then that's the max power the gpu can ship at(the ceiling is lower if overclocking headroom is considered).Switching to tsmc will help temperatures, performance, and clock. Lower power consumption means lower temperatures. The tsmc node can also clock higher which drives performance up. If using tsmc allows the chip to clock n% higher at the same power, ship it with n/2% more frequency, and now performance and oc headroom is higher, and temps and power draw are lower.
Spunjji - Thursday, February 11, 2021 - link
Both of the major manufacturer's top-end GPUs are limited by power input and heat dissipation - that's why they rarely perform much better than the next tier down, despite having significantly more execution resources. They do better on a performance-per-watt basis, though, because they're operating at a more sane part of the efficiency curve.geoxile - Monday, February 8, 2021 - link
Tsmc 12/16nm was roughly on par with Samsung 14nm.melgross - Monday, February 8, 2021 - link
Yes, when Apple split its SoC production between Samsung and TSMC that one year when they were looking to replace Samsung with TSMC, it was found here, and in other places, that TSMC’ s larger process was 20% more power efficient than Samsung’s smaller process. I think it was the 14 node for Samsung and the 16 for TSMC.So nothing seems to have changed. Samsung’s process technology remains inferior to that of TSMC.
geoxile - Monday, February 8, 2021 - link
What are you smoking? https://www.tomshardware.com/news/iphone-6s-a9-sam... the Samsung 14nm made A9s had a 10% advantage.Spunjji - Monday, February 8, 2021 - link
Yeah, I think a lot of us had suspicions that was the case, but this is really confirming it. Interesting implications for Nvidia's next gen.yeeeeman - Monday, February 8, 2021 - link
Crap, pure crap samsung. Qcom can do better on the same node with the same CPU IP. Pathetic. And people are still enthusiastic about rdna chip from samsung. If this is hot with Mali, then the one with rdna will be a toaster.lmcd - Monday, February 8, 2021 - link
Honestly, this chip is probably on the rushed side. There was probably some work sunk into switching from the mongoose-series cores. I think Samsung will improve next generation substantially. I'm not surprised this generation they did not match Qualcomm, though I'm a bit surprised it's quite so decisive.yeeeeman - Monday, February 8, 2021 - link
Yes, most certainly that is true. Also, it seems like the node is holding them up quite a lot also and I think the thermal characteristics of the Samsung process vs TSMC are worse since both the SD888 and Exynos exhibit very bad behavior in regards to this. The previous SD865 is much better in this regard with TSMC 7nm even though that chip is also quite pushed to its limits.Basically, process is critical. Anyone can say whatever it wants, but process tech is critical for failure or success. We have so many examples in the industry that show this. Intel with its 14nm vs AMD with 7nm TSMC. AMD could not have done the things they have with GF12nm. That is absolutely certain. Nvidia has just moved to Samsung 8nm from TSMC 12nm and ... it shows. The new cards are hot and power hogs. Samsung needs to invest lots of R&D in their fabs because their customers have given them a leap of faith this time and if they feel they lose too much with it, they will avoid samsung in the future.
Exynos_is_weak - Monday, February 8, 2021 - link
Lmao, this is the exact same statement made by Samsung fans every single year, and we know how it turned out. Oh, "the Exynos 990 will be revolutionary" or "the Exynos 9825 will kick ass" and so on and so forth. Now they are pinning their hopes on AMD GPUs. And when that's subpar, they will fabricate another pathetic excuse. Come on, aren't you guys tired of blindly defending Samsung?
anad0commenter - Monday, February 8, 2021 - link
Personally, I will keep hoping "the next Exynos is the one!" because I don't have a choice. In Europe it's Exynos only, no Snapdragon, in their phones.
Wereweeb - Monday, February 8, 2021 - link
Why not buy a OnePlus phone then?
druzzyaka - Monday, February 8, 2021 - link
Because of OneUI. That's why I bought the Hong Kong version of the S10. This phone still feels premium, despite being 2 years old.
s.yu - Tuesday, February 9, 2021 - link
I agree, currently on a Vivo, and noticed that OneUI is much better.
Karim Braija - Saturday, February 20, 2021 - link
Because I think the Galaxy phone is better in terms of quality and consistency; it has no major flaws, and I think it can last well over 10 years.
Cicerone - Sunday, February 28, 2021 - link
OneUI, Samsung Knox, Samsung Pass, Secure Folder, SmartThings, updates... yes, updates, the display, service and so on. I could be frustrated that in Europe I can only have the Exynos SoC, but I am not. My usage pattern will be far from the max power of the phone.
Spunjji - Monday, February 8, 2021 - link
I rarely trust any kind of blanket statement about "fans" from someone whose username is literally announcing themselves as a single-issue hater.
lmcd - Monday, February 8, 2021 - link
Not a Samsung fan. I was hopeful that a new CPU architecture could outperform tiny stock ARM cores, and instead ARM delivered a core that wasn't tiny, but I was happy to pillage bad results when I saw them. This isn't the same deal at all. This imo is equivalent to Qualcomm's 810, which disastrously rushed an A57 implementation and burned Samsung so badly that they doubled down on their current track.
Spunjji - Monday, February 8, 2021 - link
The fact that they switched from their own IP to ARM's without falling behind Qualcomm in the generational cycle would seem to support your hypothesis. That said, whether they'll improve "substantially" does remain to be seen - after all, they'll be integrating AMD GPU tech into a smartphone SoC for the first time since AMD sold Imageon to Qualcomm.
lmcd - Monday, February 8, 2021 - link
I don't expect the GPU to necessarily match up in efficiency year one, but this CPU result, if it's not from a rushed implementation, is absolutely unacceptable. Qualcomm surely isn't changing X1/A78 that much.
melgross - Monday, February 8, 2021 - link
Yeah yeah. Every year someone says that.
geoxile - Monday, February 8, 2021 - link
Mali is just crap. The Kirin 9000 uses a lower-clocked, 70% wider Mali G78 configuration (MP24 vs MP14) built on TSMC N5, which is head and shoulders above Samsung 5LPE, and efficiency at peak performance is still only on par with the Exynos 2100 configuration. In power-saving mode, where it has an efficiency advantage, it's only 28% better with 70% more cores, so it's obvious the average frequency is throttled way down. Hopefully Nvidia takes Mali out to the farm soon.
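(Back-of-the-envelope check of that throttling claim, assuming GPU throughput scales about linearly with core count and average clock, and ignoring bandwidth limits; the MP counts and the 28% figure are from the comment above.)

```python
# Back-of-envelope check of the MP24-vs-MP14 point above. Assumes GPU
# throughput scales ~linearly with core count and average clock, and
# ignores bandwidth limits. The 1.28x figure is from the comment.

mp24_cores, mp14_cores = 24, 14
observed_gain = 1.28  # MP24 "only 28% better" in power-saving mode

core_ratio = mp24_cores / mp14_cores           # ~1.71x the shader cores
implied_per_core = observed_gain / core_ratio  # relative per-core throughput
print(f"{core_ratio:.2f}x cores but {observed_gain:.2f}x the result:")
print(f"each MP24 core doing ~{implied_per_core:.0%} of the MP14 rate,")
print("i.e. average clocks throttled roughly a quarter lower.")
```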
FirePirate3 - Tuesday, February 9, 2021 - link
And why should I care if the Snapdragon's CPU and GPU are marginally better? You won't feel the CPU difference in opening Chrome or Facebook, and such powerful GPUs in phones are utterly useless. Are you launching missiles or playing Star Citizen with your smartphones, guys? You can play almost all current games with a Mali 400MP GPU. The only thing that matters in this comparison is battery life, where we should give the props to Qualcomm, and that is the only thing that actually matters.
iphonebestgamephone - Tuesday, February 9, 2021 - link
Nah, these days there are games that support 144fps and stuff; you need something like an Adreno 618 and up. For some stuff like Genshin, to do max graphics with a constant 50+ fps you basically need an SD865 minimum.
theblitz707 - Wednesday, February 10, 2021 - link
Dead by Daylight too, it's very demanding and you need an SD865's GPU at least to have 60fps at high at all times; the SD855 is very good too, it just sometimes dips a bit below 60. And there is a gigantic visual difference between low and high.
Unashamed_unoriginal_username_x86 - Monday, February 8, 2021 - link
I don't think Andrei gets enough credit for his work here, he's put hours upon hours into methodical, detailed articles for a website that's been forced into sponsored posts to make ends meet. God freaking bless
Alistair - Monday, February 8, 2021 - link
he always does the best, I'm eagerly awaiting the S21 review also
FunBunny2 - Monday, February 8, 2021 - link
"God freaking bless"one might wonder where all that revenue goes? could it be the corner office Suits?
Spunjji - Monday, February 8, 2021 - link
I don't think it's actually that much revenue TBH
tuxRoller - Monday, February 8, 2021 - link
Ugh. I'm extremely curious as to how much of a deal SLSI gave Qualcomm. The "dark horse" (in the West/outside China) Huawei looks like they outdid themselves. Damn good job.
Fulljack - Monday, February 8, 2021 - link
probably because 80% of TSMC's 5nm capacity is reserved for the Apple A14 and M1, so Qualcomm won't bother with that 20% anyway.
tuxRoller - Tuesday, February 9, 2021 - link
20% for their top-end SoCs might be about right. Last year, at least, they were already splitting their orders between fabs.
tkSteveFOX - Monday, February 8, 2021 - link
As expected, due to Sammy's inferior fab node it's almost pointless to upgrade your SD865/+ device, except for a slightly better camera experience, and even that is highly debatable. As the owner of a Sony Xperia 1 II I am perfectly willing to skip this generation entirely and wait for TSMC's 5nm to become widely available to Qualcomm, and most notably, to see what chip Huawei can produce at the end of the year to replace Kirin (still thinking they will acquire MTK at some point this year).
tuxRoller - Tuesday, February 9, 2021 - link
I thought there was talk of Huawei selling their smartphone brands (Mate & P, I think)? I recall Huawei denying it, but the source who made the claim apparently also made an earlier claim regarding Huawei, which Huawei also denied, that turned out to be correct.
Alistair - Monday, February 8, 2021 - link
I have an iPhone XR, and I'm waiting for the iPhone 13 for a reason. As you can see in the chart, the iPhone 12 didn't improve anything GPU-wise either. I'd like 120Hz, USB-C, and an actually better GPU. One can hope 2021 will finally deliver.
Kishoreshack - Monday, February 8, 2021 - link
What are you gonna do with that GPU? The current one handles every single task
It doesn't make sense to wait for just a GPU
There are no apps which can tell the difference between them
Alistair - Monday, February 8, 2021 - link
That's why I'm waiting, performance isn't improving enough to warrant an upgrade, and prices are WAY higher this year.
FunBunny2 - Monday, February 8, 2021 - link
"performance isn't improving enough"
if most folk haven't yet noticed: welcome to the cost of diminishing returns. what's worse: as we climb down the node ladder to 1 atom/feature, what do we do with all those trillions of transistors? download your porn in just 5 seconds? that's worth paying for, dontcha think?
theblitz707 - Wednesday, February 10, 2021 - link
GPUs don't only increase in power but in efficiency too. For example, both the 11 Pro and 13 Pro can run a game at a smooth 60. But since the 13 Pro (hopefully) has a more efficient chip, the overall device temperature will be a few degrees lower, thus more comfort and more battery life.
Tigran - Monday, February 8, 2021 - link
Why is the Galaxy S21U (Exynos 2100) 🔥 Throttled at 18.55 in the table, while it's 15.52 in the histogram above (GFXBench Aztec High)?
Andrei Frumusanu - Monday, February 8, 2021 - link
The table values aren't the long-term sustained performance points. I was trying to target a 4W measurement point for the table data; the values in the charts are actually what the phone will sustain, which is below 4W.
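(For anyone curious what "targeting a 4W measurement point" can look like in practice, here's a hypothetical sketch; the sample points are made up, and this is not necessarily the article's exact procedure.)

```python
# Hypothetical sketch of normalizing data to a fixed ~4 W measurement
# point: interpolate fps at 4 W from a few (average watts, fps) samples.
# The sample numbers are made up; this isn't necessarily AT's exact method.

samples = [(2.5, 40.0), (4.5, 75.0), (6.0, 95.0)]  # (watts, fps), assumed

def fps_at(power: float) -> float:
    """Linearly interpolate between the two samples bracketing `power`."""
    for (p0, f0), (p1, f1) in zip(samples, samples[1:]):
        if p0 <= power <= p1:
            return f0 + (f1 - f0) * (power - p0) / (p1 - p0)
    raise ValueError("power outside measured range")

print(f"{fps_at(4.0):.1f} fps at the 4 W target")  # 66.2 fps here
```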
Tigran - Monday, February 8, 2021 - link
Thanks!
Monty1401 - Monday, February 8, 2021 - link
Top work as ever, Andrei, always appreciated. I thought lower chip bins were associated with better efficiency?
How do you access the chip binning on the S21? Not seeing it in download mode.
Cellar Door - Monday, February 8, 2021 - link
Andrei does the best reviews at AnandTech - prove me wrong. Excellent review, thanks!
s.yu - Monday, February 8, 2021 - link
:) I said it before that this makes more sense, because if it's not throttling after a while, that equals being constantly, artificially throttled. That said, these figures suggest far higher potential for the 865, should somebody be able to overclock it so that it boosts to >8W for a short while and then throttles back.
From the ROG Phone 3's figures last year I thought the 865 was already relatively inefficient, but compared to this generation, whatever happened, none of the SoCs (not even the two on the TSMC node) are more efficient than the 865. At least those who bought the 865 should be satisfied.
Even though Samsung's 5nm seems to have flopped, it's exactly where it should be according to the performance predictions from a couple of years ago, so they're actually on track; it's just that the fact that this is a half-node one year behind TSMC isn't reflected in its nomenclature.
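(A toy first-order thermal model of that boost-then-throttle idea; all constants are assumptions for illustration, and real phones use far richer skin-temperature models.)

```python
import math

# Toy first-order model: skin temperature rises along an RC curve toward
# T_AMB + P * R. All constants are assumptions, purely for illustration.
R = 8.0      # K/W effective thermal resistance (assumed)
TAU = 120.0  # s, thermal time constant (assumed)
T_AMB, T_MAX = 25.0, 43.0  # ambient and skin-temperature limit, deg C

def burst_seconds(power_w: float) -> float:
    """How long a burst at power_w can run before hitting T_MAX."""
    t_final = T_AMB + power_w * R
    if t_final <= T_MAX:
        return math.inf  # sustainable indefinitely
    return -TAU * math.log(1 - (T_MAX - T_AMB) / (power_w * R))

for p in (2.0, 4.0, 8.0):
    print(f"{p:.0f} W sustainable for {burst_seconds(p):.0f} s")
# -> 2 W: inf, 4 W: ~99 s, 8 W: ~40 s with these made-up constants
```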
eastcoast_pete - Monday, February 8, 2021 - link
Thanks Andrei! This kind of review is why I read AT. Your results also confirm my view that QC had good reasons to "launch" the 870 as the backup option to 888-based devices; judging by your findings, even a plain 865 (no +) device will be very competitive, and at a lower price point to boot.
When you write up your full review, please also cover whether Samsung will guarantee at least three full generational OS updates for their S21 devices in the US as well; apparently, they do so for Europe. The absence of such guarantees has turned me off from buying "flagship" Android phones in recent years, and if Samsung comes through on that for the US, I might reconsider a Sammy for 2021. Thanks!
eastcoast_pete - Monday, February 8, 2021 - link
Forgot to add this: the significant power consumption of either SoC, plus the pretty but still power-hungry display, makes me wonder about the battery capacity Samsung chose for the S21 Ultra. My own view is that once the phone is big and the weight is over 200 g, you may as well go really big on the battery; so this looks like a case for >= 6,000 mAh to me, in both meanings of the word "case".
Xerxesro - Monday, February 8, 2021 - link
Would the efficiency matter much in the real world? If you use your phone for regular things (calls, chat, browsing, a few apps) the sustained performance and power consumption shouldn't matter much. The CPU & GPU are only stressed for short bursts, unlike benchmarks and games. The higher frequencies of the Exynos might even make the phone feel a bit snappier. I think...
Andrei Frumusanu - Monday, February 8, 2021 - link
I've added the web-browsing tests - they're lighter than PCMark on the SoC. In the case of the S21U, the display is extremely efficient so it's still good. The lower the brightness you use your phone at, the more the SoC difference will appear, and the less battery life you'll experience.
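(A toy illustration of the brightness point, with made-up power figures rather than measurements from the review: as display power shrinks, the SoC difference dominates the battery-life gap.)

```python
# Made-up power split between display and SoC to illustrate the point:
# at low brightness the display term shrinks, so the SoC difference
# dominates the battery-life gap. Not measurements from the review.

battery_wh = 18.5             # ~5000 mAh at 3.7 V
soc_good, soc_bad = 0.7, 1.0  # average SoC+system power in watts (assumed)

for name, display_w in (("high brightness", 1.5), ("low brightness", 0.4)):
    life_good = battery_wh / (display_w + soc_good)
    life_bad = battery_wh / (display_w + soc_bad)
    print(f"{name}: {life_good:.1f} h vs {life_bad:.1f} h "
          f"({life_good / life_bad - 1:+.0%} gap)")
# high brightness: 8.4 h vs 7.4 h (+14% gap)
# low brightness: 16.8 h vs 13.2 h (+27% gap)
```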
brucethemoose - Monday, February 8, 2021 - link
Out of the box with a few extra apps, flagships do a ton of processing in the background. And browsing still depends on burst performance. If you zealously clean the phone (and, like Andrei mentioned, use high brightness), then yeah, power draw from the screen and radios is a bigger factor.
Wereweeb - Monday, February 8, 2021 - link
In the latest episode of "Phone manufacturers trying to fit a laptop inside a tiny glass brick"
tkSteveFOX - Monday, February 8, 2021 - link
Forgot to credit Andrei: another piece of sensational research. It puts most reviews to shame. Well done!
Kishoreshack - Monday, February 8, 2021 - link
Andrei reminds me of Brian Klug. But Brian's level was something else.
IanCutress - Tuesday, February 9, 2021 - link
People only remember the best bits.
Kishoreshack - Thursday, February 11, 2021 - link
Yeah, they do. But Anand & Brian Klug's podcasts were the ones that attracted me a lot.
The best part was how well Anand designed the bench
Some AnandTech podcasts are required now
Kishoreshack - Monday, February 8, 2021 - link
Soo next year are we gonna have a Snapdragon flagship on a better process node or not? There is absolutely no compulsion for Qualcomm to push for a better process node
Apple will run with things
SarahKerrigan - Monday, February 8, 2021 - link
Oof. Those numbers are rough, and not a good look at all for 5LPE. Wonder if we'll see X1/A78 on N5 any time soon. With HiSilicon out of the picture, who would even be the obvious candidate for that? MediaTek?
Spunjji - Monday, February 8, 2021 - link
MediaTek would be an odd candidate, given their tendency to focus on area efficiency, but it'd be a nice surprise if they branched out a bit!
lmcd - Monday, February 8, 2021 - link
Yeah, they'd definitely skip the X1 core. Which honestly might be the correct move based on these results; the X1 is better, but maybe not enough better to justify the die space.
Spunjji - Thursday, February 11, 2021 - link
It certainly seems like it would make more sense on an SoC design aimed at larger devices.
geoxile - Monday, February 8, 2021 - link
My experience with Samsung Electronics was heavy use of outsourcing and Indian workers, on their software side. I wouldn't be surprised if that's the case for their hardware too. Very poor results for their 5LPE vs TSMC's N7. They're becoming obsessed with cutting costs.
iphonebestgamephone - Monday, February 8, 2021 - link
Why would being Indian be the reason?
jospoortvliet - Wednesday, February 10, 2021 - link
It wouldn't, but they aren't local to South Korea, and India has a lot of IT competence, so it is a place many companies outsource to... and outsourcing usually doesn't help quality.
iphonebestgamephone - Monday, February 15, 2021 - link
You get what you pay for ofc.
iphonebestgamephone - Monday, February 15, 2021 - link
If it's like you said and TSMC outsourced to Indians too, they are just better because they paid for the better workers.
Wereweeb - Monday, February 8, 2021 - link
Process node performance has nothing to do with the nationality of the employees. It's just natural for different teams, fabs, equipment, techniques, etc. to show different results. And I doubt they would abandon investments into their silicon fabrication R&D, as they're basically the only other serious player for 3nm as of this moment (others being stuck at 7nm).
They're behind the curve, but at least they're delivering new nodes (unlike Intel and Global Foundries, two American fabs).
Wereweeb - Monday, February 8, 2021 - link
Foundries*
reuthermonkey1 - Monday, February 8, 2021 - link
This is becoming a very big issue for Android. With Qualcomm owning the US market for Android, and with Samsung's continued Exynos woes, Apple's SoC R&D will return big dividends in the next few years.
Toss3 - Monday, February 8, 2021 - link
How do you check the bin of the Exynos chip? Is it the three numbers you see when booting into download mode?
Monty1401 - Tuesday, February 9, 2021 - link
They no longer show in download mode; they can be accessed via recovery > kernel logs, but I'm struggling to find it on the device as the logs are massive, and it doesn't seem possible to extract them without root to do a search.
Monty1401 - Tuesday, February 9, 2021 - link
If you find them, let me know; I pulled my hair out scrolling through logs and trying various adb commands.
Silver5urfer - Monday, February 8, 2021 - link
Marketing runs the show as always. More power consumption at the expense of battery degradation, like Apple's battery issue, which will force people to buy new shit again, another year of $1000+. As long as people do not stop buying this shit for Fartnite and the like, this will continue forever. Utter shame. About the bootloader unlock: does the S21 (Exynos 2100) still have it, or did greedy pig Samsung yank that off too?
I'm in the market right now to buy a phone with no hole/notch and an SD card slot. I have even forgone the 3.5mm jack, since I would get my LG serviced or something for the ESS anyway. Guess what? None of them exist, except Sony and ASUS, and the latter is not in the US market. Fucking bullshit really.
5j3rul3 - Monday, February 8, 2021 - link
Can I mention that "TSMC N7+ and TSMC N6 are BETTER than SAMSUNG 5LPE and SAMSUNG 4LPE"?
Spunjji - Monday, February 8, 2021 - link
Thanks for the great research and write-up, Andrei. Looks like I still won't be looking to Samsung for my next upgrade, as I'm stuck with Exynos here in the UK. Shame!
eastcoast_pete - Monday, February 8, 2021 - link
Andrei, also special thanks for the power draw comparison of the A55 little cores in the (TSMC N7) 865 vs the (Samsung 5 nm) 888! That one graph tells us everything we need to know about what Samsung's current "5 nm" is really comparable to. I really wonder whether QC's decision to choose Samsung's fabbing was based more on availability (or the absence thereof for TSMC's 5 nm) or on price?
Well, it seems like Apple hogging most of TSMC's 5nm node leaves others with no choice but to go with the lesser foundry.
heraldo25 - Monday, February 8, 2021 - link
For such a thorough review it is shocking to see that the software versions (build numbers) used during the tests are not stated. It is absolutely essential that the review contain software versions, so that others can try to replicate the results, and so that the reviewing site has references during re-tests.
name99 - Monday, February 8, 2021 - link
The milc win is certainly from the data prefetcher. In simulation milc also benefits massively from runahead execution, i.e. the same principle (bring in data earlier). Has anyone identified a paper or patent that indicates what ARM are doing? A table-driven approach (Markov prefetcher) still seems impractical, and ARM don't go in for blunt solutions that just throw area at the problem. They might be doing something like scanning lines as they enter L2 for what look like plausible addresses, and prefetching based on those, which would cover a large range of pointer-based use cases, and seems like the sort of smart low-area solution they tend to favor.
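(For illustration only: a toy sketch of that line-scanning idea, treating aligned 8-byte words that land in a live heap range as candidate pointers. This is speculation about the concept, not ARM's actual, undisclosed design.)

```python
# Toy model of a content-directed prefetch heuristic: when a line is
# filled, scan its 8-byte words and treat aligned values that fall inside
# a live heap region as pointers worth prefetching. Purely illustrative.

import struct

LINE_BYTES = 64

def plausible_pointers(line: bytes, heap_lo: int, heap_hi: int):
    """Yield 8-byte words in a cache line that look like heap pointers."""
    for off in range(0, LINE_BYTES, 8):
        word = struct.unpack_from("<Q", line, off)[0]
        aligned = word & 0x7 == 0  # real pointers are usually 8-byte aligned
        if aligned and heap_lo <= word < heap_hi:
            yield word

# A fake 64-byte line holding two pointer-looking values among junk:
line = struct.pack("<8Q", 0x7F0000001040, 42, 0x7F0000002000,
                   0xDEAD, 7, 0, 1, 0xFFFF)
for addr in plausible_pointers(line, 0x7F0000000000, 0x7F0100000000):
    print(f"would prefetch line at {addr & ~0x3F:#x}")
```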
trivik12 - Monday, February 8, 2021 - link
Hope Qualcomm moves its next-gen flagship SoC to TSMC again. It cannot be at so much of a disadvantage. Of course Samsung 3nm could narrow the gap, but that is more for 2023 flagships. Disappointing to see Exynos disappoint again. How is the Exynos 1080 as a mid-range chipset?
geoxile - Monday, February 8, 2021 - link
Their 3nm is expected to be on par with TSMC N5. The expected gains over 7nm are only 30% higher performance, a 35% die area reduction, and a 40-50% power reduction. Considering 5LPE is still behind N7P, that's not much, and it will barely be on par with N5 in density, let alone efficiency.
jeremyshaw - Monday, February 8, 2021 - link
In other words, Samsung strangled then killed SARC for their failures, only to find the failures were with SSI itself.
geoxile - Monday, February 8, 2021 - link
You must be kidding... The Exynos 2100 is at least somewhat close to the Snapdragon 888 in CPU performance. Mali continues to be a problem, and remains so even for the Kirin 9000 on TSMC N5. Mongoose was an abomination that belonged maybe in 2015. Samsung Semiconductor is less competent than TSMC, but SARC's Mongoose team was a joke.
EthiaW - Monday, February 8, 2021 - link
All those attempts to spend transistors sparingly and boost performance through high frequency have failed miserably. Per-transistor performance seems to be decaying from node to node now. Flat or insufficient growth in transistor count = performance regression.
eastcoast_pete - Monday, February 8, 2021 - link
Andrei, when you're testing the actual phone, could you check the battery life with the 5G modem on and off, respectively? 5G modems are supposedly quite power-hungry, and, if it's possible to turn 5G off (but leave 4G LTE on), it would be interesting to see just how much power 5G really consumes.
Andrei Frumusanu - Tuesday, February 9, 2021 - link
I don't have 5G coverage here so it's not feasible for me to test.
Edwardmcardle - Wednesday, February 10, 2021 - link
Will you be testing reception differences, e.g. 4G and WiFi? Fantastic write-up as always!
Dorkaman - Tuesday, February 9, 2021 - link
Different S21 Ultra phones can have different performance, says this tech chap:
https://youtu.be/yuNNmf2gIRc
I guess this is due to binning, and his tests show his Exynos 2100 is in the middle. Strange. Also, the battery life is better on the 888 and external temps are about the same.
serendip - Tuesday, February 9, 2021 - link
All this really doesn't look good for Windows on ARM if we're stuck with hot and hungry Qualcomm chips on Samsung 5nm. The 8cx and SQ on TSMC 7nm were very efficient, but that's with slower A76 cores. I'm hoping a quad-X1 design on TSMC 5nm will be in the next iteration of the Surface Pro X or Galaxy Book S.
Raqia - Tuesday, February 9, 2021 - link
Disappointing sustained performance; however, the S21 series lacks the phase-change vapor chamber cooling solution of the S20:
https://9to5google.com/2021/01/18/samsung-galaxy-s...
vs
https://www.ifixit.com/News/43501/why-samsung-buil...
Notably the Mi11 has this:
https://gadgettendency.com/a-triple-chamber-as-a-s...
This makes for better subsequent runs but the SoCs built on 5LPE are still disappointing.
iphonebestgamephone - Wednesday, February 10, 2021 - link
Mi 11 may have the vapor chamber for better cooling, but it also allows for a higher battery temperature. If they throttled at the same temps we could see how useful that thing actually is.
dudedud - Wednesday, February 10, 2021 - link
Not all S20s had that vapor chamber. Some just had a graphene layer, which in theory would give similar results. Don't know if the S21 uses graphene tho.
darkich - Wednesday, February 10, 2021 - link
The battery life benchmarks are an indication of how invalid AnandTech's whole premise actually is. Pretty much ALL real-usage tests have shown BIG improvements in autonomy from the S20 to the S21, yet Andrei wants us to believe the stupid benchmark test that shows a "regression" between the Exynos 990 and Exynos 2100.
What a joke..
ChrisGX - Sunday, February 14, 2021 - link
I would say these results are incomplete rather than invalid. The PCMark Work 2.0 - Battery Life test is a demanding mixed-usage benchmark. When running that benchmark it isn't exactly a shock that the Exynos 2100 S21 Ultra should return slightly lower battery life than the Exynos 990 S20 Ultra. AnandTech isn't alone in noting that when processing demanding workloads the Exynos 2100 draws more power (on average) than the Exynos 990. Andrei, for his part, is explicit that the Exynos 2100 is also significantly more performant than its predecessor. He does say that the increased performance wasn't just achieved through improved efficiency, but also through greater power usage, and it is hard to dispute that looking at the numbers. There is a gap in the data, however. The full PCMark Work 2.0 - Battery Life test involves a Work performance score that gives a more complete picture of the rate at which work is being completed while executing the test. That would be very useful information to have. Still, it is undoubtedly the case that the reduction in battery life that Andrei mentions is not due to a regression but rather to the increased rate at which the Exynos 2100 executes work (when processing demanding mixed-usage workloads). While that information isn't provided in connection with the PCMark Work 2.0 - Battery Life test, the GFXBench GPU-heavy test data (arranged in Power Efficiency tables) does confirm the high power draw of the Exynos 2100 during peak performance bursts (which must bump up average power consumption as well) even as that chip roundly outperforms the Exynos 990.
Indeed, heavy mixed-usage workloads are not going to put the Exynos 2100's battery life in the best light. Still, Andrei did show the results from a Web Browsing Battery Life test that will undoubtedly be useful to a lot of phone users who don't view the results of the PCMark Work 2.0 - Battery Life test as having much relevance for them. But I, for one, am happy to have that information.
Andrei seems to be adding to/reworking the battery life data in this review.
https://benchmarks.ul.com/pcmark-android
https://s3.amazonaws.com/download-aws.futuremark.c...
sachouba - Wednesday, February 10, 2021 - link
Nice peak power consumption! It doesn't seem unlikely that we'll end up with a situation similar to Apple's battery gate on Snapdragon 888 devices, in a few years. Way to go!
mixmaxmix - Saturday, February 13, 2021 - link
Samsung develops its APs for battery efficiency rather than performance.
helloworld_chip - Sunday, February 14, 2021 - link
Why didn't we mention that the 888's A78 only enables a 32KB L1 cache while it can be 64KB? I think L1D size can play a good role here regarding IPC.
catlikesfish - Wednesday, June 22, 2022 - link
The Snapdragon A55 latency data above, 264.371ns @ 131072KB, seems unexpectedly high; it is bigger than the Exynos and Tensor. So what test pattern or lat_mem_rd command did you use in your testing? Can you share it?