221 Comments

  • SarahKerrigan - Thursday, July 2, 2020 - link

    Not only does it not have AVX-512, it appears to have no AVX at all. For a premium product in 2020, that's embarrassing.
  • shabby - Thursday, July 2, 2020 - link

    Atom... premium? Don't be silly 😆
  • SarahKerrigan - Thursday, July 2, 2020 - link

    Hey, I agree, but it's going into devices that start at US$1k. It's clearly being positioned as a premium chip, despite showing every sign of not actually being very good.
  • shabby - Thursday, July 2, 2020 - link

    It'll be a tough sell that's for sure.
  • Smell This - Thursday, July 2, 2020 - link

    Nowhere to go but up ... or 6.5 years back:

    HP Pavilion TouchSmart 11-e115nr - 11.6"
    AMD A6-1450 7w 'Temash' - 8 GB RAM - Samsung SSD

    CB15: 109
    OpenGL: 10.42
    http://dude-gotta-go.com/images/AMD-A6-1450-Temash...

    Ice Storm 1.2: 20243
    http://dude-gotta-go.com/images/AMD-A6-1450-Temash...

    Fire Strike 1.1: 236
    http://dude-gotta-go.com/images/AMD-A6-1450-Temash...
  • eastcoast_pete - Thursday, July 2, 2020 - link

    Yes, premium this ain't. Really disappointed, as I see this overall concept as being the most innovative thing to come out of Intel in a long time. However, this way - no dice.
  • sharath.naik - Friday, July 3, 2020 - link

    Worst are the Cinebench R15 scores: 89 single-thread and 250 multi-thread. That's ridiculous; you can undervolt and power-limit an i5-1035G7 and still get at least twice the single-thread performance.
  • dersteffeneilers - Saturday, July 4, 2020 - link

    Well, the thing goes into laptops smaller than phones; if it's pulling >2W during bursts, it's gonna overheat really badly.
  • Spunjji - Monday, July 6, 2020 - link

    So they've sacrificed speed *and* cost-effectiveness for area. Oh dear.
  • ProDigit - Friday, July 3, 2020 - link

    It would be, if you paired 100 of them together.
  • AhsanX - Thursday, July 2, 2020 - link

    The Sunny Cove core has AVX-512, but Tremont cores don't have any AVX. So Intel disabled AVX on the Sunny Cove core too, as heat was gonna be a problem if they left it enabled.
  • ikjadoon - Thursday, July 2, 2020 - link

    Did we read the same article? AVX-512 was completely removed (i.e., physically) from the Sunny Cove dies because of Windows 10 compatibility problems.

    Windows was never built for x86 heterogeneous processing and still can't handle it in 2020 (perhaps that would've been a smarter investment than going all-in on touch in 2008!).

    Intel & Microsoft remain stuck in the late 2010s on their low-power / mobile-first / thin-client transition, after the dominating success of smartphones & ARM-based architectures wiped out anything of interest in low-power x86.

    There's a reason Intel just nearly stopped all development on Atom: nobody give a crap about the Pentium Silver & Celeron CPUs.
  • ikjadoon - Thursday, July 2, 2020 - link

    *gave
  • Jorgp2 - Thursday, July 2, 2020 - link

    >There's a reason Intel just nearly stopped all development on Atom: nobody give a crap about the Pentium Silver & Celeron CPUs.

    Lol, no
  • ProDigit - Friday, July 3, 2020 - link

    The $160 laptop I purchased from HP, with an N5000 in it, works really well!
    It would have been better on a desktop with the core count quadrupled.
  • extide - Thursday, July 2, 2020 - link

    It's not a Windows issue. Even if you ran Linux or any other OS on here, you would have to run with CPUs all supporting the same ISA. ARM specifically designs cores to pair up together such that they have the exact same ISA (instruction support), so this isn't an issue in cell phones.

    I mean, theoretically you could have the cores support slightly different ISAs and have one throw an interrupt if it tried to execute an instruction that the current core didn't support but a different one did; the scheduler would then have to move that thread to the other core. That could get really janky though, which is why nobody has talked about doing this yet.
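
    For the curious, here's a toy sketch of that trap-and-migrate idea on Linux, assuming a hypothetical big core at CPU index 4. Returning from a SIGILL handler retries the faulting instruction, so re-pinning the thread first means the retry lands on a core that supports it. Purely illustrative; nothing like this actually ships.

    ```cpp
    // Toy sketch only: trap an unsupported instruction via SIGILL and
    // re-pin the thread to a (hypothetical) big core that implements it.
    #include <sched.h>
    #include <signal.h>
    #include <cstdlib>

    constexpr int kBigCore = 4;  // hypothetical big-core CPU index

    void on_sigill(int) {
        // This core can't execute the instruction. Pin the thread to
        // the big core; returning from the handler retries the same
        // instruction, which now runs on a core that supports it.
        cpu_set_t mask;
        CPU_ZERO(&mask);
        CPU_SET(kBigCore, &mask);
        if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
            std::_Exit(1);  // no usable core: give up
    }

    int main() {
        struct sigaction sa = {};
        sa.sa_handler = on_sigill;
        sigaction(SIGILL, &sa, nullptr);
        // ... run code that may contain instructions (e.g. AVX-512)
        // that only the big core supports ...
        return 0;
    }
    ```

    Every unsupported instruction would cost a signal round-trip plus a forced migration, which is exactly the jankiness described above.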

    Also, as they said in this article, even though Intel claims they removed the AVX-512 unit, it can still be seen in the die shots.

    Also, Intel didn't stop development on Atom -- this chip has a brand new core and their public roadmaps have several more in the future.
  • reggjoo1 - Tuesday, July 7, 2020 - link

    They're gonna have to develop more for Atom, and get into the right "governor" for crossover operation, more than scheduler tweaks. They have a lot to learn, and it may come down to the quality of the I/O system for these to really succeed. As long as their "ego" doesn't get in the way, and they learn from the "smartphone arena", they might have something for x86.
  • dotjaz - Friday, July 3, 2020 - link

    Did we read the same article? AVX-512 was clearly not removed.
  • dotjaz - Friday, July 3, 2020 - link

    There, "Intel has stated on the record repeatedly that they removed it. The die shot of the compute silicon shows that not to be the case." If you can read.
  • jeremyshaw - Friday, July 3, 2020 - link

    That was a later edit. Originally Ian claimed it was removed.
  • returnzer0 - Friday, July 3, 2020 - link

    So no, they did not, in fact, read the same article.
  • s.yu - Monday, July 6, 2020 - link

    Mystery solved!
  • vanilla_gorilla - Friday, July 3, 2020 - link

    https://www.anandtech.com/show/15877/intel-hybrid-...

    "At the top is the single Sunny Cove core, also present in Ice Lake. Intel has stated that it has physically removed the AVX-512 part of the silicon, however we can still see it in the die shot. This is despite the fact that it can’t be used in this design due to one of the main limitations of a hybrid CPU. We’ll cover that more in a later topic."

    It was NOT physically removed but it cannot be used so it doesn't really matter. In practice this will have no AVX-512.
  • dotjaz - Friday, July 3, 2020 - link

    Also "However, all modern software assumes a homogeneous processor", that's why they have to support exactly the same ISA extensions. I didn't realise Windows is the only modern software in existence.
  • Meteor2 - Friday, July 3, 2020 - link

    Why so rude, dotjaz?
  • dotjaz - Saturday, July 4, 2020 - link

    So which part is rude? Is this rude asking you what's rude? Sorry your feelings got hurt. There, happy now?
  • jospoortvliet - Sunday, July 5, 2020 - link

    Linux also expects it. Modern enough?
  • jeremyshaw - Thursday, July 2, 2020 - link

    Good. This action ensures this segment of products will be easier to emulate on ARM, helping to tear these products away from Intel's grasp.
  • Kangal - Sunday, July 5, 2020 - link

    To be honest, this is a great innovation.
    It's just that the execution is quite lacking, and on top of that, it's a couple of years too late.

    Just imagine a SoC such as:
    3x Big processor (Intel Core M), eg/ Core i7-8500Y
    5x Small processor (Intel Atom), eg/ Atom x7-Z8750

    Dynamic Scaling:
    (Idle) 4x Small Cores at 500MHz
    (Very-low power) 4x Small Cores starting at 1.0GHz
    (Low-power use) 5x Small Cores up to 2.5GHz
    (Medium power) 3x Big Cores starting at 1.5GHz, 5x Small Cores up to 2.5GHz
    (Regular power) 3x Big Cores at 2.0GHz, 5x Small Cores at 2.0GHz
    (High-power use) 3x Big Cores at 3.0GHz, 5x Small Cores at 2.0GHz
    (Very-high power) 3x Big Cores at 4.0GHz, 5x Small Cores at 2.5GHz
    (Max-power use) 1x Big Core at 5.0GHz, 2x Big Cores at 4.0GHz, 5x Small Cores at 2.5GHz
  • Kangal - Sunday, July 5, 2020 - link

    Now imagine all of this, competing against AMD.
    Their 12nm node is fairly competitive against Intel's 14nm, and their Zen+ architecture is somewhat competitive against Intel's Skylake architecture. So compare the above hybrid processor to a 4c/8t part (eg/ Ryzen-3780U): that's a no-contest victory for Intel. And AMD would struggle to fit those technologies into an 8-core laptop processor, so there would be no threat from above.

    Once AMD steps up to either Zen2 architecture, or 7nm node, or both!...
    ....that's when things get heated. In the 15W / Ultrabook market, the above setup by Intel would secure a slim victory against a similar 4c/8t AMD processor. But when you step up to the 25W / laptop market, AMD will pull ahead with their 8c/16t processor. However, at least in this scenario, Intel has a good showing of their competitiveness and capabilities. That works up to 2021, but after that, Intel will have to make noticeable performance improvements to both Big/Small Core architectures, AND they will have to make substantial efficiency improvements on the lithography side (maybe execute on their 8nm nodes, versus TSMC's 5nm).

    First question, why use examples of Cherry Trail and Amber Lake?
    Well, they're both on Intel's (pretty good) 14nm node. Also, this is the most efficient "Small Core" Atom architecture that Intel has; later Pentium/Celeron/Atom processors come from the same family, but they're designed for higher energy consumption. The "Big Core" stated above is a Core M processor (now rebranded as Core i7-Y), and it is the latest and best they have when it comes to performance whilst maintaining efficiency.

    Why the 3/5 Split you may ask?
    Well, the most useful thread is the first/main one, followed closely by the second, as most code has evolved for dual-cores in the past 20 years. Somewhat important is the third core, as we've also had an evolution to quad-cores in the past 10 years. However, most code hasn't made the full transition from single to dual threads, the same way that dual-threaded code hasn't translated well to quad threads.

    So instead of a 2+6 split, which would see some performance drop in quad-threaded code, it's better to go for 3+5. Then why not just go for an even 4+4 split? Well, most quad-threaded code doesn't utilise the 4th core very well, so we can make do by relegating that to a Small Core instead. This saves us some efficiency, which is what we want to achieve with this concept in the first place.

    The least energy-hungry split would be 0+8; the most performant split would be 8+0. So this 3+5 split is basically the best of both worlds, since you get roughly 90% of the single-threaded performance, 70% of the multi-threaded performance, and 50% of the energy expenditure. It's not perfect, but it's the closest you can get there... until the code evolves further. And since we only started transitioning code to 8-core processors around 2015-2017, there's a good chance we won't see that tipping point until around 2025.
  • Valantar - Sunday, July 5, 2020 - link

    Uhm, I have to ask, did you write this comment eight months ago? AMD has been kicking Intel's butt in 15W laptops since the first Renoir laptops hit the streets. While that did take a while after the initial presentation, their advantage is nonetheless significant both in performance and power draw.
  • serendip - Monday, July 6, 2020 - link

    AMD doesn't have anything in the 5W TDP range. Not yet, anyway. The problem is that Lakefield brings middling performance at a high price. Intel already has 5W and 4W parts, check out the Pentium 4425Y and m3-8100Y in the Surface Go 2. Those chips are much cheaper and easier to fab than Lakefield and they bring equal or higher performance.
  • Kangal - Tuesday, July 7, 2020 - link

    The best SKU AMD makes in the "15W bracket" is the 4800U. However, that's with TDP-down; it's not a proper 15W part. Plus, there are no laptops with that combination yet. The Lenovo Yoga Slim 7 has the chipset, but at the 25W bracket, and apparently that goes much higher during use when possible.

    So no, AMD isn't quite kicking Intel's butt in the Ultrabook segment yet. Maybe in 6 months, when yields improve and more vendors join. But for now, Intel is still the dominant force in the thin-and-light segment. AMD is killing it in the regular laptop market, the entire desktop market, the server market, and the console market. However, ARM is pretty much going to take over the server market now that the big companies are moving that way and Linux drivers have matured on ARMv8_64. The laptop segment is safe for now, but the new Macs might cause other vendors to think beyond Windows, or beyond x86. The console and desktop segments are safe for now (and for at least this decade).
  • Spunjji - Friday, July 10, 2020 - link

    That's an arbitrary distinction if ever I saw one. By that definition, Ice Lake is a 28W part operating in "TDP down" to 15W.

    AMD could conceivably laser 4 cores and 60% of the GPU off a Renoir chip, drop the clocks and end up with a "5W" part not dissimilar to Intel's M3 series. It wouldn't make any sense for them to do so, though, because they can't make enough chips as it is and it wouldn't really buy them any meaningful market share.
  • Kangal - Friday, July 10, 2020 - link

    I didn't discount the 4800U at all. I merely stated that it is, in fact, a 25W chipset and that it can operate at 15W with TDP-down. I'm not sure you quite understand this tier system.

    But anyway, my point was that AMD's best 15W option is the 4800U, and we don't know how it actually performs because there are no devices out there. From what we can speculate, it should be very competitive, but Intel really has championed the Ultrabook market in the last decade. So for all intents and purposes, Intel is probably still ahead here by a hair, yet they could've had a larger lead if they'd implemented the design I tried to explain above. Too bad. AMD will humiliate/supersede them completely in a year or two at this pace.
  • Spunjji - Monday, July 6, 2020 - link

    "And AMD would struggle to fit those technologies into a 8-core laptop processor, so there would be no threat from above."

    Boy, you really need to keep up with the news...
  • Kangal - Tuesday, July 7, 2020 - link

    No, you didn't read that correctly.
    AMD doesn't have any 8-core processor on their 16nm/14nm/12nm nodes that fits the thermal profile of a laptop. I was saying Intel needed to release the processor that I outlined, and release it years ago. If they had, their only competition would be Renoir/Ryzen 4000, and even then AMD would lose the low-voltage (Ultrabook) market and win the regular (laptop) market.

    See the above comment by serendip. AMD is working on lower- and lower-voltage chips; their lowest-power one, I think, is still the V1605B embedded chip. But right now, that small company is really stretched thin. They're working on servers, on HDD optimisations, on making GPUs, on optimising GPUs, on making console processors, on desktops, on laptops, and on a few other budget options.

    By the time AMD properly polishes the driver set for laptops/battery drain, it's going to be another year. But hopefully, on the next set of chips, they update the graphics (from Vega to RDNA). It's possible they might ditch the monolithic design of their mobile chips and shift those over to a chiplet design as well. That would take a hit to performance and to efficiency... but on the bright side, it should mean even cheaper processors for vendors and consumers alike.
  • Spunjji - Friday, July 10, 2020 - link

    I think I understand now, but the phrasing was confusing!
  • eastcoast_pete - Thursday, July 2, 2020 - link

    It's beyond embarrassing, it's borderline idiotic. Intel's "performance" cores have one unique differentiator going for them, and that is the ability to execute AVX, especially AVX2 and AVX512 instructions. Doing what they did means they basically gelded their own big core, and this gelding won't win any performance crowns.
    I sort of get why it's hard to have a scheduler trying to work with cores that have different capabilities, but is it really impossible to have one that makes it a hard and fast rule that if an AVX instruction is called for, the big core gets fired up? Now, I am not able to program one of these myself, but, as a user, I would rather pay a little power consumption penalty and have a real Sunny Cove-like large core than an overgrown Atom as the "performance" core. Big mistake.
  • brantron - Thursday, July 2, 2020 - link

    The trouble with AVX on the Sunny Cove core is that it will still only be one core.

    So add another, and call it...Ice Lake Y? Wait a minute... :p

    After more than a decade, Atom for PC still looks like a square peg in a round hole.
  • ProDigit - Friday, July 3, 2020 - link

    It actually starts making sense once you start doubling up on the cores. Sure, dual- or quad-core Atom processors aren't really a thing. But at 95W TDP, you could be running a 30- to 60-core CPU. And those numbers are for 14nm, not 10.
    That could make it interesting!
  • Alexvrb - Saturday, July 4, 2020 - link

    Heck, toss in 200W of these and they can be used to generate polygons and stuff! Oh wait... they already tried that. Seriously, a pile of slower cores might be OK for a secondary chip (an accelerator), but it's not ideal for a main CPU, and even less so for a consumer use case. Even a fast quad-core would beat the stuffing out of a 60-core Atom in the overwhelming majority of consumer workloads.

    Actually, even as an accelerator for professional use there are often better solutions - GPUs and/or purpose-built accelerators, depending on your workload. That's why Intel shifted gears in that realm too.
  • LiKenun - Thursday, July 2, 2020 - link

    The way I understand the implications for programmers... sometimes a program will do a one-time check for a particular processor feature (e.g., if Avx2.IsSupported == true) and load optimized code at startup, or optimize bytecode compilation to remove unused branches. Then the program uses the loaded implementations for its entire running lifetime. There's going to be a lot of work needed to undo these assumptions about processor feature sets not changing while the program is running.
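
    A minimal C++ sketch of that "check once, dispatch forever" pattern, using the GCC/Clang builtin (the kernels are stand-ins for illustration):

    ```cpp
    #include <cstdio>

    // The answer is probed once, on first call, then cached for the
    // life of the process -- exactly the assumption that breaks if a
    // thread can later migrate to a core with a different feature set.
    static bool has_avx2() {
        static const bool cached = __builtin_cpu_supports("avx2");
        return cached;
    }

    void process() {
        if (has_avx2())
            std::puts("dispatching to the AVX2 kernel");   // wide path
        else
            std::puts("dispatching to the scalar kernel"); // fallback
    }

    int main() {
        process();  // decided by whichever core ran the first probe...
        process();  // ...and never re-checked afterwards
        return 0;
    }
    ```

    If the OS later moved that thread from an AVX2-capable core to one without it, the cached answer would be stale and the first AVX2 instruction would fault.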
  • Lucky Stripes 99 - Saturday, July 4, 2020 - link

    It would really be nice if Windows got better about ISA and API version controls. Instead of having to do a bunch of runtime checks to avoid arcane system errors, I'd like to be able to set some minimum versions in the header of the EXE so that things will gracefully fail at startup. From a scheduler standpoint, this could also allow cores with different ISA versions, since the OS would know which cores to avoid.
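
    Lacking loader support, the closest approximation today is a hand-rolled gate at startup. A rough sketch of the idea (the required-feature list is invented for the example; a loader could read the same information from the EXE header):

    ```cpp
    #include <cstdio>
    #include <cstdlib>

    // The builtin needs literal feature names, so each check is
    // spelled out by hand -- the boilerplate a loader-enforced header
    // could replace.
    static bool missing(const char* name, bool supported) {
        if (!supported)
            std::fprintf(stderr, "This program requires %s.\n", name);
        return !supported;
    }

    int main() {
        // Hypothetical minimum ISA this binary declares for itself.
        bool fail = false;
        fail |= missing("sse4.2", __builtin_cpu_supports("sse4.2"));
        fail |= missing("avx2",   __builtin_cpu_supports("avx2"));
        if (fail)
            return EXIT_FAILURE;  // graceful failure, no SIGILL later

        std::puts("All required ISA extensions present, continuing...");
        return 0;
    }
    ```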
  • ProDigit - Friday, July 3, 2020 - link

    Yeah? Tell me where you can actually use AVX-512 or AVX2? Most home users don't need it. Intel makes chips for businesses and for most home users,
    not for those that occasionally need to run programs made for servers or performance machines.
    AVX2/512 makes no sense, and has no home in a laptop.
  • Cullinaire - Friday, July 3, 2020 - link

    Don't mind him, he's always harping about avx512 every chance he gets even though it's pointless.
  • lefty2 - Friday, July 3, 2020 - link

    Yes, home users do need it. AVX512 is rarely used, but AVX2 is almost universal.
  • eastcoast_pete - Friday, July 3, 2020 - link

    So, I guess you don't use Microsoft Teams or other video conferencing software on your laptop? Because those use AVX or AVX2 for virtual backgrounds, amongst other features.
    Regarding "no place in laptops", I vaguely remember hearing that about SSEs way back.
  • dotjaz - Saturday, July 4, 2020 - link

    Well, AVX512 is truly pointless on a laptop, and possibly on any general consumer parts. That much is true. It is not energy efficient at all. On top of that, there's the mess of subsets.

    But AVX2 does provide sizable benefits over SSE4 even for optimal code. AVX alone is probably not worth it.
  • eastcoast_pete - Sunday, July 5, 2020 - link

    With regard to AVX512, it's also a chicken-or-egg issue: as long as software makers can (correctly) assume that most of their customers don't have CPUs that support it, they won't use it, even where it would speed things up over AVX2. That's why AMD starting on their own implementation of AVX512 is so important; it'll make it more of a mainstream feature that programmers can assume is available for their software to use. That's one of the reasons this boneheaded move by Intel ticks me off.
  • Santoval - Saturday, July 4, 2020 - link

    AVX2/512 is not the only bit that differentiates the Core series from Atom. If you only care about floating point performance then yes, that's their primary difference. AVX has nothing to do with integer code though.
  • ProDigit - Friday, July 3, 2020 - link

    Not a single program other than benchmarks uses it.
    Very few even use AVX-256...
    I wonder why anyone would need it on a laptop, especially considering it's a server feature.
  • Meteor2 - Friday, July 3, 2020 - link

    Not photo and video editing software?
  • lefty2 - Friday, July 3, 2020 - link

    Tremont doesn't support any version of AVX, so that would cause a huge swath of software to run very slowly. AVX is used far more universally than you think - practically all games use it, all multimedia applications, photo editing software, etc., etc.
  • dotjaz - Saturday, July 4, 2020 - link

    Wrong, AVX2 is quite universal on any multimedia tasks. AVX512 is pointless.
  • Samus - Saturday, July 4, 2020 - link

    Clearly a first-gen product. Give it time.

    For my sake I hope this isn't a dud because Intel clearly invested billions into the thing and I own a lot of Intel stock lol.
  • neogodless - Thursday, July 2, 2020 - link

    Why not rename them Core Hybrid Intel Processors? Makes a great initialism!
  • eastcoast_pete - Thursday, July 2, 2020 - link

    With this decision (no AVX), they should call it a "Crybid". Those relaxed marijuana laws in CA sure have unexpected repercussions, but answer the question "what were they smoking?".
  • YB1064 - Thursday, July 2, 2020 - link

    "The bottom chiplet contains the ‘peripheral’ components that are not as performance related, such as security controller, USB ports, and PCIe lanes. This is built on Intel’s cheaper 22nm manufacturing node."

    Surely they are not going to go back to 22nm? The packaging engineering is interesting enough, but there seem to be no fundamental architecture improvements. I see Zen-xx crushing this straight out the gate. Definitely not adaptable for desktop use.
  • III-V - Thursday, July 2, 2020 - link

    >I see Zen-xx crushing this straight out the gate. Definitely not adaptable for desktop use.

    This is a 7W processor, dumbass.
  • eek2121 - Thursday, July 2, 2020 - link

    Right, and AMD has 14nm chips that are 6W. I expect they will be releasing a quad core Zen 2 based product that fits in a 6W power envelope in due time.
  • Jorgp2 - Thursday, July 2, 2020 - link

    Lol, no.

    That's 6W at base clock.
  • yeeeeman - Friday, July 3, 2020 - link

    It might fit into 6W, but it won't fit into the same package as LKF. I think people don't understand this is a 1cm x 1cm package with basically an entire motherboard's worth of components.
  • Spunjji - Friday, July 10, 2020 - link

    Definitely not. But is that really all that big of an advantage in 13" devices that are already hitting the legal limit for battery capacity? It does feel a bit like a solution looking for a problem.

    The low idle power is a bigger sell, but the performance penalty is a heavy one.
  • Spunjji - Friday, July 10, 2020 - link

    Van Gogh is supposed to be around 9W - I expect they'll probably have knocked-down variants capable of 6W operation.
  • ikjadoon - Thursday, July 2, 2020 - link

    And nobody will enjoy running a 7W to 9.5W x86 CPU on Windows 10: not in a laptop, not on a tablet, not on a convertible. There is simply too much legacy junk in Windows 10, and this project may only bear fruit in the 2030s, if it's not canned.

    If these run Windows 10X, perhaps there's a niche market, but Intel is obviously jamming a Hemi engine into a Prius, and from the outside it looks like a last-ditch, mostly fruitless, billion-dollar bet.

    Not unlike Nokia, which, a few years before MS completely abandoned the Lumia lineup, decided to invest huge money into Lumia cameras: a lot of money thrown into a pit of fire.

    Intel has been unable to "get ready" for the mobile / thin-client era of computing for a decade now.
  • Meteor2 - Thursday, July 2, 2020 - link

    Windows runs just fine on low-powered computers -- as long as there's enough RAM and at least a SATA SSD.
  • Icehawk - Friday, July 3, 2020 - link

    Except this maxes out at 8GB, which is barely adequate in W10; 16GB is really necessary unless you like to choke or run super lean. My enterprise laptop sits at 6GB after a fresh boot, just on the desktop, due to all the corporate junk on there - we have some Lenovos with 8GB of soldered RAM and they are hot garbage.
  • yeeeeman - Friday, July 3, 2020 - link

    I have a 2014 Intel Atom Z3735F tablet with 2GB of RAM, and swap handles things pretty well. 8GB of RAM is enough for 95% of people.
  • ProDigit - Friday, July 3, 2020 - link

    6GB is what most need for a 64-bit version of Windows Home;
    8GB only if you're multitasking.
    You could also run Linux and be fine with 2GB.
  • Tomatotech - Friday, July 3, 2020 - link

    Personal-use computers (laptops / phones / etc.) have a hell of a lot going on behind the scenes, with constant and multiple cloud interactions and updates. Apple manages with tiny amounts of RAM on their phones because they have an obsessive focus on cutting out cruft and streamlining iOS. MacOS and Windows? Not so much. 8GB is needed for futureproofing.

    If Microsoft is able to revamp Windows along iOS lines then maybe 6GB or 4GB, but Apple themselves have set 8GB as the minimum for all new MacOS computers, and they are a company with a history of not putting enough RAM in their cheaper offerings.

    As for the new ARM Mac computers, we will see, but it's unlikely they will have less than 8GB. 16GB minimum is a strong possibility for various reasons when they come out in 2021-ish. IF, and it's a BIG IF, Apple releases a new ultra-long-life device running a mix of MacOS and iOS, then the battery difference of running less RAM might swing it, but we're firmly in making-stuff-up land now.
  • nonoverclock - Friday, July 3, 2020 - link

    Corporate laptops are sort of a different story, with all of their agents running; you definitely need more resources to handle all that. On a basic home laptop with Office and a few other apps, most people could probably do OK with 8GB.
  • Lucky Stripes 99 - Saturday, July 4, 2020 - link

    Agreed. My personal laptop runs fewer than a third of the processes and half the services my work laptop has at boot. The encrypted filesystem is an especially nasty resource hog. I never seem to have a fast enough work laptop with all of that stuff.
  • ProDigit - Friday, July 3, 2020 - link

    Then run Linux, like Ubuntu.
  • dotjaz - Saturday, July 4, 2020 - link

    So what? The Ryzen 4300U can be configured to 10W or lower if they want to compete with it. Besides, Van Gogh is around the corner; it's almost certainly a native quad-core part with RDNA2 for 15W and below. 7W is certainly within reach, especially considering the 3200U could do that.
  • Skydanthology - Thursday, July 2, 2020 - link

    Yeah, this is for ultra-mobile laptops or dual-screen tablets that require the lowest standby power. Besides, AMD also uses an older process node for I/O and memory controllers.
  • eek2121 - Thursday, July 2, 2020 - link

    Not for mobile.
  • ikjadoon - Thursday, July 2, 2020 - link

    They said "ultra-mobile laptops"...which is exactly Lakefield's target.

    "Lakefield processors are the smallest to deliver Intel Core performance and full Windows compatibility across productivity and content creation experiences for ultra-light and innovative form factors."

    Literally from Intel: https://newsroom.intel.com/news/intel-hybrid-proce...
  • jeremyshaw - Thursday, July 2, 2020 - link

    It's already somewhat covered in the article, however 22FFL =/= Intel's old "22FinFET". Intel's 22nm FinFET is closer to other foundries' 16/14/12nm FinFET anyways*, so it's strange you aren't bashing AMD for being behind the times on their 14/12nm IOD.

    *This is roughly when the "Intel N-1 node is equal to TSMC/Samsung/GF's N node" started, FWIW. Some say it was 32nm when it really started, but we can all agree that by 22nm, Intel really pulled ahead for a bit. Well, part of that was due to TSMC and GF fumbling badly at 32/28nm, but that somewhat dilutes the metric, anyways.
  • Jorgp2 - Thursday, July 2, 2020 - link

    >Surely they are not going to go back to 22nm? The packaging engineering is interesting enough, but there seem to be no fundamental architecture improvements. I see Zen-xx crushing this straight out the gate. Definitely not adaptable for desktop use.

    And?

    AMD was using a similar node until last year.
  • extide - Thursday, July 2, 2020 - link

    It's 22FFL, which is a derivative of the 14nm process.
  • ProDigit - Friday, July 3, 2020 - link

    All chip manufacturers build CPUs with mixed lithography, even AMD. When they say their Ryzens are built on a 7nm node, it means 7nm is the smallest feature; other parts still use 10, 12, 14 or 22nm.
    Ryzen 2000 CPUs had parts still running on 28nm.
  • FunBunny2 - Friday, July 3, 2020 - link

    "other parts still use 10, 12, 14 or 22 nm. Ryzen 2000 CPUs had parts still running on 28nm."

    which raises a question, which I suppose is answered somewhere in the hardware engineering space: I suppose having multiple 'node' sizes on the same line is possible due to the fact that the native 'node' is way larger than, in this case, 7nm by multi-masking, and backing off on masks to print the larger 'node' segments. so, if we should ever get to some Xnm, say 7nm, as native resolution, would the machinations to print up, say 28nm, be more work than the current process of printing down?
  • bji - Thursday, July 2, 2020 - link

    Very likely you meant 0.2 PICOjoules of energy consumed per bit, not 0.2 PETAjoules.
  • JayNor - Thursday, July 2, 2020 - link

    Someone from Intel mentioned that they have a chiplet version of their LTE modem that can go in the stacked design. I don't recall where the interview is, though...
  • Deicidium369 - Thursday, July 2, 2020 - link

    https://tech.hindustantimes.com/tech/news/intel-s-... makes mention of Lakefield and an LTE modem.

    "chiplet" is a marketing term.
  • Ian Cutress - Monday, July 20, 2020 - link

    We asked that in our interview with Ramune Nagisetty from Intel. They say they can do it, but it's not done here.
  • brucethemoose - Thursday, July 2, 2020 - link

    There's potential for another Micron partnership here, as Intel needs custom stackable DRAM dies with TSVs that they can stick below a compute die. Going through the package and back up to memory over a long, narrow interface seems like a tremendous waste of power.

    And that 4-Atom cluster takes up as much space as an AVX-less Sunny Cove... a bunch of those would be interesting in a reticle-sized or EPYC-style Xeon. Cloud providers subdivide giant Xeons into smaller instances anyway, and I imagine many customers would prefer 4 full cores for the same cost as a single hyperthreaded one. That's more or less what AWS is pitching with their Graviton chips.
  • nandnandnand - Thursday, July 2, 2020 - link

    Just imagine a 256-core Gracemont or later Atom CPU using chiplets. That could be great for servers.

    That's basically a return to Xeon Phi, except those cores were modified to do AVX-512.
  • brucethemoose - Thursday, July 2, 2020 - link

    Indeed. In hindsight, Intel designed and pushed Phi towards the wrong market.
  • extide - Thursday, July 2, 2020 - link

    That's basically what the latter generation Xeon Phi was.
  • JorgeE1 - Thursday, July 2, 2020 - link

    Intel can name the manufacturing process: Foveros Ultra paCKed Scalar Hybrid Intel sTacked Silicon
  • 69369369 - Thursday, July 2, 2020 - link

    EDGY
    D
    G
    Y
  • serendip - Thursday, July 2, 2020 - link

    "But the bottom line is that in most cases, expect Lakefield to perform similar to four Atom cores, just above Goldmont Plus, and not like any of the Skylake/Ice Lake Core products and its derivatives."

    So a Kaby Lake Pentium will outperform this with two fewer cores and a similar TDP while being a lot cheaper. That big core is sitting around doing nothing; it should be used as a turbo-boost core, much like how the m3-8100Y behaves in the Surface Go 2.

    Intel is either ballsy or stupid to pit this against the SD 8CX in the same price range.
  • lmcd - Thursday, July 2, 2020 - link

    That TDP is at its pitiful base clocks. Tremont will outperform both Sunny Cove and Skylake at the lower power levels a second core is allowed, while using less die space. The second core in a Kaby Lake Pentium is worth less than the Tremont core in a theoretical 1+1 Lakefield design, because that second core is throttled to around the 60% mark on the perf/watt curve graphic in the first place.
  • Jorgp2 - Thursday, July 2, 2020 - link

    Atom does support GFNI
  • quorm - Thursday, July 2, 2020 - link

    I don't know. The tech is interesting, especially the stacking, but overall this doesn't seem to offer much benefit. Judging from the provided graphs, the power consumption difference between Core and Atom is too small to justify it. Does Atom have dedicated hardware decode for current video codecs? That's the only way I could see this being beneficial, at least in this first iteration.
  • brucethemoose - Thursday, July 2, 2020 - link

    Even pure Atom SoCs had hardware decoding, right? IDK where it sits on the die, but that's traditionally "part" of the GPU, and stuff like Netflix won't even run without it.
  • lmcd - Thursday, July 2, 2020 - link

    Yeah, that's 100% part of the die, and it's why Silvermont getting upgraded Intel graphics, as opposed to the earlier bad PowerVR graphics (and weak decode blocks), was absolutely essential.
  • alufan - Thursday, July 2, 2020 - link

    Hmm, Intel gluing CPUs together then?
    Pity they are both sows' ears and can't be polished.
  • Alistair - Thursday, July 2, 2020 - link

    The only point of this is if it is dirt cheap. You don't pay a premium for 1 Core and 4 Atom cores. DOA. I'll take a 7nm 4-core Zen 2 laptop instead, thanks.

    Make this a $50 CPU? Then I'm interested.
  • lmcd - Thursday, July 2, 2020 - link

    I don't think it'll be $50 but if it is, I hope to see it on single-board computers. That'd be slick.
  • serendip - Thursday, July 2, 2020 - link

    It's meant for $1000 computers, not cheap sub-$500 devices. I wouldn't pay that much money for 4 Atom cores and a big core that sits around like an unwanted appendage.
  • lmcd - Friday, July 3, 2020 - link

    It's meant for $1000 tablets and ultralights, not traditional computer form factors.

    Glad you won't buy any smartphone then!
  • yeeeeman - Friday, July 3, 2020 - link

    Qualcomm is fighting in the same space with the 8cx, at the same huge prices, and that one doesn't even run x64 apps. This market wants very light laptops with very good battery life, and LKF does just that, whether you like it or not.
  • Spunjji - Monday, July 6, 2020 - link

    We'll see what the market actually wants when this launches. I have a strong suspicion that the market doesn't want the absolutely miserable performance/$ on offer here, even for the quoted battery life benefits, but I've been wrong before.
  • justing6 - Thursday, July 2, 2020 - link

    Amazing article! I learned so much about Lakefield and 3D stacking in general. The technology and engineering is really incredible, but as a consumer product it looks extremely lacking.

    I'm a proud owner and heavy user of a Surface Pro X, and the 8cx/SQ1 is generally "good enough" when running ARM workloads. Going to a 15W Intel chip that can turbo to 25-40W feels noticeably snappier, but considering the SQ1 is 7W-15W, it's really impressive. The 4+4 also allows for very good multitasking performance; it takes 10GB+ of heavy web browser tabs running on an external 3440x1440 display before it really feels like it starts to slow down.

    However, that's when I live inside Chromium Edge running native ARM64 code. Performance is still laughable compared to Apple Silicon, especially for translated x86 code. On Geekbench the A12z on the dev kits running translated x86 code is just as fast as the 8cx running native ARM code, while the 8cx's performance really suffers when it has to run anything more complicated than a text editor or video player written in x86. I expect Apple's successor to the A12z to mop the floor with this whole market at the same price points, even for x86 code. On top of this, Apple has unparalleled leverage over developers by controlling its entire hardware stack. I wouldn't be surprised if in 2 years, all major MacOS applications will be compiled for native ARM64 code. On the other hand, Windows on ARM64 is almost 2 years old now and has very few natively compiled apps.

    I really prefer Windows, but it's going to be a hard choice for me and a lot of consumers if a Lakefield/8cx ultraportable running Windows costs the same as an Apple ultraportable on Silicon that has somewhere around double the performance for the same price, while still keeping a thin and light design with great battery life. Intel and Qualcomm will be fighting for a distant second place.
  • jeremyshaw - Thursday, July 2, 2020 - link

    So you're saying a desktop with desktop TDP outperforms a fanless tablet with tablet TDP?

    I do agree Apple has a stronger push (and will have to, since they are moving their entire ecosystem over, and anyone that isn't fully onboard will simply be left behind).

    Qualcomm got too greedy, Nvidia doesn't want to fight in that market anymore (remember the original Surface tablets with Nvidia Tegra chips?), and nobody else is really eyeing the laptop/consumer segment outside of Apple.

    Oh, well. Some people are propping up the PINE64 as if it's worth anything.
  • lmcd - Thursday, July 2, 2020 - link

    No one's come up with an exciting killer app beyond video decoding for smartphones or tablets so might as well "prop up" the PINE64 :)

    Hopefully Broadcom will get interested in SoCs again with the work they're doing with the RPi foundation. ARM is going toward powerful CPU cores anyway so it shouldn't take an Apple-sized company to come up with competitive ARM designs.
  • justing6 - Thursday, July 2, 2020 - link

    Considering an iPad Pro (a fanless tablet) running an A12Z puts up Geekbench 5 scores 30% to 60% higher than the SQ1/8cx in single/multi-core respectively when running native ARM code, it's safe to say it's a generation or two ahead of anything Qualcomm has. I also doubt they changed much about the TDP of the chip in the ARM transition dev kit; if anything they made it less powerful by disabling the 4 small cores and leaving only the 4 large cores, to give them more time to work out the big.LITTLE scheduling in macOS. A 30% hit to performance when running x86 code sounds about right; it's just that the chip has so much more raw power than the 8cx that it will be able to give users a much better experience.

    I'm not an Apple fan by any means, but I am a fan of innovation. Apple has been pretty stagnant on that front the past decade, but with the move to ARM they have a chance to really get ahead of the market like the Apple of the 2000s.
  • serendip - Thursday, July 2, 2020 - link

    The ARM MacOS devices could be mobile powerhouses at (gasp!) the same price points as Windows devices running either ARM or x86. Imagine a $1000 MacBook with an A13 or A14 and double the performance of a Surface Pro X or Galaxy Book S costing the same.
  • lmcd - Friday, July 3, 2020 - link

    Considering the hurdles just to use any form of open source software with the platform, they're not equal.
  • JayNor - Thursday, July 2, 2020 - link

    Intel already makes LTE modems. By chiplet, I am referring to the Foveros 3D-stackable chiplets in this case... Intel also makes EMIB-stitched chiplets for their FPGAs. So, not just a marketing term. These have to implement certain bus interfaces or TSV placement requirements to work with the FPGA or Foveros manufacturing.
  • henryiv - Thursday, July 2, 2020 - link

    What a shame to disable AVX-512. The circuitry is probably left there to support SSE, which is still a common denominator with the Tremont cores. Also, the 5 cores not running together is a huge, huge bummer.

    This first generation is an experimental product and is to be avoided. In the next generation, the Tremont successor will probably get at least 256-bit AVX support, which it will finally be possible to use across all 5 cores. The transition to 7nm should also give the elbow room needed to run all 5 cores at full throttle within the limited 7W power budget.
  • jeremyshaw - Thursday, July 2, 2020 - link

    By the usual Intel Atom timeline, it will be 2023 before a Tremont successor comes out. By then, will anyone even care about Intel releases anymore?
  • lmcd - Thursday, July 2, 2020 - link

    Bad joke, right? Intel is signaling that Atom is moving to the center of their business model. In previous periods when Intel prioritized Atom, it got every-other-year updates. I'd expect a 2022 core or sooner, assuming Tremont was "ready" in 2019 but had no products (given the 10nm delays and the priority given to Ice Lake).
  • serendip - Friday, July 3, 2020 - link

    Moving Atom to the center of their business model? Atom is still being treated as an also-ran.

    Intel now has to contend with a resurgent AMD gobbling up x86 market share in multiple segments and ARM encroaching on consumer and server segments. Putting Atom as a priority product would be suicidal.
  • Lucky Stripes 99 - Saturday, July 4, 2020 - link

    Not really. The ARM big/little design has been fairly successful. Most likely you'll see the walls between Atom and Core break down a bit in order to keep code optimizations happy on either core type.
  • Deicidium369 - Saturday, July 4, 2020 - link

    revenues show they are gobbling up nothing.
  • Namisecond - Saturday, July 11, 2020 - link

    There is a limit to the amount of market share AMD can gobble up and they are currently at or near their limit. AMD is production limited and always will be.
  • yeeeeman - Friday, July 3, 2020 - link

    The successor (Gracemont) comes next year in Alder Lake S. Stop being a hater and go eat your amd cake.
  • anonomouse - Thursday, July 2, 2020 - link

    There are bigger challenges for an asymmetric core design beyond just ISA support and scheduling, too. Multithreaded software has lots of assumptions around locks, and spinlocks in particular, that will have to be tuned, and effective priority inversions will be problematic too, like where a thread on the big core has to wait on a lock held by a thread running on a small core.
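
    As a rough illustration of the spinlock point, here's a toy C++ lock (invented for this comment) that caps its spin budget and falls back to yielding, so a fast core doesn't burn cycles and watts spinning on a lock held by a thread crawling along on a slow core:

    ```cpp
    #include <atomic>
    #include <thread>

    class HybridFriendlySpinlock {
        std::atomic<bool> locked_{false};

    public:
        void lock() {
            // Short spin window, in case the holder releases soon.
            for (int i = 0; i < 64; ++i) {
                if (!locked_.exchange(true, std::memory_order_acquire))
                    return;  // got the lock
            }
            // Budget exhausted: yield so the scheduler can run the
            // holder instead of spinning on a (possibly slow) core.
            while (locked_.exchange(true, std::memory_order_acquire))
                std::this_thread::yield();
        }

        void unlock() { locked_.store(false, std::memory_order_release); }
    };
    ```

    Spin budgets tuned on symmetric cores implicitly assume the lock holder runs about as fast as the waiter; a 1+4 design breaks that calibration.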

    Notebookcheck's article made it seem like the scheduler right now just doesn't sustain using all of the Tremont cores plus the Sunny Cove at the same time, which neatly sidesteps the issue for now, at the obvious cost of that bigger core's performance. It's not clear whether that's intended behavior that will stick around.
  • wr3zzz - Thursday, July 2, 2020 - link

    There would be no need for so many little cores if software weren't designed to continuously phone home with our personal data, or to continuously skim money via micro-transactions. The entire phone ecosystem is designed around that concept. PC, not so much; for now.
  • jeremyshaw - Thursday, July 2, 2020 - link

    Too late. MSFT and Intel are pushing that rehashed garbage "Modern Standby" (formerly Connected Standby, InstantGo, etc), which is trying to make laptops that don't go into standby - rather they go into a low idle state and "perform tasks" throughout the night.

    Usually, it just drains the battery on my laptop (I have long disabled automatic mail retrieval, and any other scheduled task) and forces the laptop into hibernate. Just what I want out of my laptop - less battery life.

    Luckily for us, AMD laptops don't support this garbage fire.

    MSFT... just because Apple was able to successfully implement "Modern Standby" almost a decade ago, doesn't mean you can. Wake up. Or not.
  • abufrejoval - Friday, July 3, 2020 - link

    Yeah, I've had two laptop batteries killed because they woke up in the middle of a flight, packed tight and overheating. Hybrid and modern standby are absolute "killer features".
  • brantron - Thursday, July 2, 2020 - link

    Why not two tiny Cannon Lake cores?

    I'm no Intel engineer, but the inconvenient fact remains that such a device would be more useful to the average person.

    That leaves Lakefield with the appearance of a Frankenstein experiment. Sorry Intel, sounds fun, but I don't buy those for $1,000+.
  • serendip - Thursday, July 2, 2020 - link

    This is the most damning quote from the article:
    "Intel has made the 1+4 design to act as a 0+4 design that sometimes has access to a higher performance mode. Whereas smartphone chips are designed for all eight cores to power on for sustained periods, Lakefield is built only for 0+4 sustained workloads. And that might ultimately be its downfall."

    And this is going into $1000 devices like the Galaxy Book S and Thinkpad Fold. The ARM 8cx variant of the Galaxy Book S is $999, the Surface Pro X with an upgraded 8cx is also $999, and these offer i5 level performance when running ARM code. They also have surprisingly beefy integrated GPUs.

    Now imagine paying $999 for a 4-core Atom device with a Sunny Cove core that mostly sits idle. I've used cheap Bay Trail and Apollo Lake Atoms; they're decent performers at low price points, but they don't belong in anything over $500 because they're still laggy.

    I've also compared the Pentium 4415Y vs. the m3-8100Y in the old and new Surface Go: the Kaby Lake Pentium dual-core feels slightly laggy because it can't turbo, whereas the m3 feels much more snappy when it turbos. Even then, the Pentium still feels more snappy than Apollo Lake because single-core performance is higher. For daily use, Windows likes fat beefy cores with high turbo because a lot of the UI is single-threaded.
  • brantron - Friday, July 3, 2020 - link

    And in addition to the m3's turbo, there's hyper-threading and AVX to account for.

    What clock speed would Ice Lake Y or Tiger Lake Y have with no hyper-threading or AVX?

    Something doesn't add up here, and it's not just the bizarre hybrid cores.
  • serendip - Friday, July 3, 2020 - link

    Yes, the m3 has HT, and so do the much-maligned Pentium Gold 4415Y and 4425Y.

    Lakefield looks fascinating from a purely technical viewpoint, but from a value standpoint it looks to be a disaster. Intel actually thinks 4 Tremont Atom cores are going to be the main cores for $1000 devices.
  • Meteor2 - Friday, July 3, 2020 - link

    Think of the margins though
  • Spunjji - Monday, July 6, 2020 - link

    Given the cost of producing multiple dies, stacking them, and packaging the whole lot with RAM on top - I doubt even that's going to be particularly compelling. It probably would have been better if they'd transitioned fully to 10nm and had idle 14nm capacity, but as it stands this will be competing for manufacturing space with their own premium products on both lines. D:
  • lmcd - Friday, July 3, 2020 - link

    This isn't designed for Windows. This is designed for Windows 10X. Windows 10X got delayed so partners are shipping it with Windows.
  • serendip - Friday, July 3, 2020 - link

    Will Win10X bring a magical doubling in performance?
  • Spunjji - Monday, July 6, 2020 - link

    My thoughts exactly. I feel like they've tried to do too many things at the same time with this product.

    They obviously wanted to demonstrate Foveros with a relatively low-complexity, relatively low-power chip - but the cost of the first-gen Foveros tech conflicts with one of the big primary selling points of small chips in the first place, i.e. lower cost. So they've gone for a "premium" product, but the first-gen Foveros tech puts a fairly low ceiling on its performance - meaning it's not actually very premium in practice.

    It's a quagmire of mutually contradictory requirements, and tbh that's pretty on-par with Intel's previous efforts in the low-power CPU arena.
  • lmcd - Thursday, July 2, 2020 - link

    Honestly confused why everyone is up in arms about the lack of AVX. This is a tablet SoC for Windows 10X, and any other usage of it is outside its intended scope. Future SoCs might add more of this functionality, but it doesn't really seem like a priority. A low-power SoC that won't wilt under the render thread and "just works" with legacy x86-only apps when necessary sounds good to me.
  • Spunjji - Monday, July 6, 2020 - link

    I think it's mainly that Intel have spent so much time selling that feature so hard, then dropped it - albeit without actually physically removing it. So, once again, Intel are charging their customers money to manufacture something in silicon that they can't actually use.
  • xdrol - Thursday, July 2, 2020 - link

    The Snapdragon 7c is more like a 2+6 than a 0+8 chip: It has 2x Kryo Gold (Cortex A76) and 6x Kryo Silver (A55) cores.
  • Sychonut - Friday, July 3, 2020 - link

    I am not sure the added design complexity is justified by the very minor power savings depicted in the power/performance graph on page 1, or am I reading it wrong? The difference between the two curves seems marginal at best below 58%.
  • ichaya - Friday, July 3, 2020 - link

    That seems like the most meaningful part of the chart: you can still deliver 60%+ of the performance with only 30-50% of the power.

    This is a 1st-gen attempt; a 2+4 design with AVX and an ARM-style system-level cache would definitely be interesting to see in ultraportables.
  • unclevagz - Friday, July 3, 2020 - link

    Which itself is not a good showing for the 'efficiency' cores: the small Thunder cores in the Apple A13 are at ~20-30% of the Lightning cores' performance while consuming 5-15% of the power (2-3x perf/watt). And the A13 Lightning in all likelihood already runs rings around the Lakefield cores in this department.
  • ichaya - Sunday, July 5, 2020 - link

    The chart shows <10% power for <30% perf, and <20% power for <50% perf. That seems like a 2-3x perf/watt difference as well. The A13 has a total of 28MB of cache shared between the CPU and GPU, whereas this seems to have 6MB for the 4+1 CPU cores, excluding L1 caches.

    I'd love to see an AnandTech article on how Apple's large caches help with the code-density differences between x86-64 and ARM, and how they interact with lower clock speeds and power consumption.
  • Wilco1 - Sunday, July 5, 2020 - link

    The code density of AArch64 is significantly better than x86_64's, so even at the same cache sizes Arm has an advantage.
  • ichaya - Wednesday, July 8, 2020 - link

    Source? Everything I've read says x86-64 still has a diminishing but slight advantage in code density. If anything, lower clock speeds are helping Apple by avoiding memory pressure issues at higher clock speeds. I highly doubt AArch64 could perform the same as x86-64 with equal caches at any clock speed. uArch differences could outweigh these differences, but I've seen evidence of this given how large Apple's caches have been.
  • ichaya - Wednesday, July 8, 2020 - link

    * I've seen no evidence of this given how large Apple's caches have been.

    Correcting the last sentence in post above.
  • Wilco1 - Wednesday, July 8, 2020 - link

    No, x86 has never had good code density; 32-bit x86 is terrible compared to Thumb-2, x86_64 has worse code density than 32-bit x86, and it gets really bad if you use SIMD instructions.

    Try building a large binary on both systems using the same compiler and compare the .text sizes. For example I use all of SPEC2017 built with identical GCC version and options. AArch64 code is generally 10-15% smaller.

    Many AArch64 cores already have higher IPC - yes that absolutely means they are faster than x86 cores at the same clock frequency using similar sized caches.

    This https://images.anandtech.com/graphs/graph15578/115... shows Neoverse N1 has ~28% higher IPC than EPYC 7571 and ~21% higher IPC than Xeon Platinum 8259 on SPECINT2017. While Naples has 2x8MB LLC on each chiplet, the Xeon has 36MBytes, more than the 32MB in Graviton 2 (both also have 1MB L2 per core).

    Recent cores like Cortex-A78 and Cortex-X1 are 30-50% faster than Neoverse N1. Do the math and see where this is going. 2020 is the year AArch64 servers outperform the fastest x86 servers; 2021 may be the year AArch64 CPUs outperform the fastest x86 desktops.
  • ichaya - Saturday, July 11, 2020 - link

    If you compare with -march=x86-64, or with a specific uArch like -march=haswell, you'll get code sizes comparable to -march=armv8.4-a. But from the runtime code-density differences I've seen, x86-64 still seems to have a slight advantage.

    From the article you linked the image from (https://www.anandtech.com/show/15578/cloud-clash-a...): "If we were to divide the available cache on a per-thread basis, the Graviton2 leads the set at 1.5MB, ahead of the EPYC's 1.25MB and the Xeon's 1.05MB." ARM's system-level cache is a good idea, as is the shared L2 in Apple's A* chips. But the per-thread cache advantages in Graviton and A* seem to signal it's not the uArch making the difference; similar cores to Graviton's with less cache do a lot worse. Not being able to clock higher than 2.5GHz also seems to signal that the uArch/interconnects cannot keep up with memory pressure.

    To the extent that die sizes of these chips (Graviton 2 is 7nm, Epyc 7571 and Intel Xeon 8259CL are 14nm) are comparable, it's features like AVX2/SMT that seem to have been replaced with cache in the benchmarks in the article. I'll be looking forward to A* chips to see how they might stack up in Laptops and Desktops, but these are the doubts I still have.
  • ichaya - Saturday, July 11, 2020 - link

    Correct link in post above: https://www.anandtech.com/show/15578/cloud-clash-a...
  • Wilco1 - Saturday, July 11, 2020 - link

    Runtime code density? Do you mean accurately counting total bytes fetched from L1I and MOP cache? x86 won't look good because of the inefficiency of byte-aligned instructions, needing 2 extra predecode bits per byte and MOPs being very wide on x86 (64 bits in SandyBridge)... It clearly shows why byte-sized instructions are a bad idea.

    The graph I posted is for single-threaded performance, so the amount of cache per thread is not relevant at all. Arm's IPC is higher, and thus it is a better microarchitecture than Skylake and EPYC 1. IPC is also ~12% better than EPYC 7742, based on https://www.anandtech.com/show/14694/amd-rome-epyc...

    In terms of all-core throughput the fastest EPYC 7742 does only ~30% better than Graviton 2 on INTrate2006. That's pretty awful considering it has 8 times the L3 cache (yes eight times!!!), twice the threads, runs at up to 3.4GHz and uses twice the power...

    In terms of die size, EPYC 7742 is ~3 times larger in 7nm, so it's extremely area inefficient compared to Graviton 2. So any suggestion that cache is used to make a weak core look better should surely be directed at EPYC?

    Graviton 2 is a very conservative design to save cost, hence the low 2.5GHz frequency. Ampere Altra pushes the limits with 80 Neoverse N1 cores at 3.3GHz base (yes that's base, not turbo!). Next year it will have 128 cores, competing with 128 threads in EPYC 3. Guess how that will turn out?
  • ichaya - Sunday, July 12, 2020 - link

    Code density and instruction decoding are separate things. Here's an older paper on the code density of a particular program: http://web.eece.maine.edu/~vweaver/papers/iccd09/l...

    Single-threaded workloads are obviously going to do better with a shared system-level cache and, in Apple's case, shared L2 caches. Sharing caches is something Intel is closer to than AMD. You cannot compare INTrate2006, or any single-threaded benchmark, running on an ARM chip where all system-level cache is available to one thread against an EPYC 7742 where only one CCX's L3 is available to one thread. That would be 32MB on Graviton 2 vs 16MB on an AMD EPYC 2 CCX. So AMD is 30% faster with half the cache while clocked 30% higher than Graviton 2.

    I will definitely give credit for efficient shared system/L2 cache usage to Graviton 2, A*, and other ARM chips, but comparing top-line power figures when one chip has 64 AVX2-capable cores and the other has nothing comparable is an irrelevant comparison if there ever was one.
  • Wilco1 - Sunday, July 12, 2020 - link

    The complexity and overhead of instruction decoding is closely related with the ISA. Byte-aligned instructions have a large cost, and since they don't give a code density advantage, it's an even larger cost! Again if you want to study code density, compare all of SPEC or a whole Linux distro. Code density of huge amounts of compiled code is what matters in the real world, not tiny examples that are a few hundred bytes!

    Well, EPYC 7742 is only 21% faster single-threaded while being clocked 36% faster. Sure, Graviton 2 has twice the L3 available, but the difference between 16 and 32MBytes is hardly going to be 12%. If every doubling gave 10%, then the easiest way to improve performance would be to keep doubling caches!

    AVX isn't used much, surely not in SPEC, so it contributes little to total power consumption (unless you're trying to say that x86 designers are totally incompetent?). At the end of the day getting good perf/W matters to data centers, not whether a core has AVX or not.
  • ichaya - Sunday, July 12, 2020 - link

    You've claimed ARM64 has a code density advantage without any evidence for a few posts now. Being byte-aligned has advantages too, which are clear in the paper with the real world program! You're welcome to provide more real world evidence!

    We're moving the goalposts now with new numbers: you can't estimate IPC from one specific INTrate2006 test and assume it's similar across other workloads as well. If we just stick to INTrate2006, IPC seems within 5%, where Graviton 2 has twice the cache of the AMD EPYC 7742.

    Comparing a top-line power number like you were doing is irrelevant when features like AVX can easily blow past any power envelope you might have, and one chip lacks the feature.
  • Wilco1 - Sunday, July 12, 2020 - link

    No, I am stating that AArch64 has better code density as a fact. Maybe 5 years ago you could argue about it as AArch64 was still relatively new, but today that's not even disputable. So check it out if you'd like to see it for yourself.

    I used the overall intrate result to get an accurate IPC comparison. If you do the math correctly you'll see that Graviton 2 has 12% higher IPC than EPYC 7742.
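
    Spelling the arithmetic out (a sketch using the numbers above; the clocks are EPYC's 3.4GHz boost vs Graviton 2's 2.5GHz):

        perf_ratio  = 1.21        # EPYC 7742 is ~21% faster single-threaded
        clock_ratio = 3.4 / 2.5   # and runs at ~1.36x the frequency
        ipc_ratio   = perf_ratio / clock_ratio  # ~0.89: less work per clock
        print(1 / ipc_ratio)      # ~1.12: Graviton 2 has ~12% higher IPC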

    At the end of the day what matters is performance, perf/W and cost. Whether you have AVX or not is not relevant in this comparison - EPYC 7742 uses the same amount of power whether it executes AVX code or not.
  • ichaya - Tuesday, July 14, 2020 - link

    This is not the first time I've seen someone look at single-thread performance and disregard everything else. All the Graviton 2 and A13 single-thread gains can be attributed to large (100-200% more) shared L2/L3 caches, and when compared with x86, 5% or even 75% IPC gains turn out to be ~10% less real-world performance, or ~10% more with a marginal power-use difference on 7nm. AMD has everything from a 15W to a 280W chip.

    For multi-threaded, the Graviton 2 looks better, but the 64-vCPU EPYC 2 c5a.16xlarge (144MB L2+L3) AWS instance costs the same as the 64-core Graviton 2 m6g.16xlarge (96MB L2+L3) instance and delivers equivalent performance on real-world tasks while having half the real cores, half the system RAM, and 50% more L2+L3.

    perf/W/$ is important, and since ARM has always been on the lower end of W and $, it can be hard to see past it. If you can compare cache sizes, power and real world performance, the only thing revolutionary is the fact that Amazon, Apple and the ARM ecosystem have come this far in a few years. The overall features (AVX2+SMT among others) and openness still leaves a lot to be desired.
  • Wilco1 - Wednesday, July 15, 2020 - link

    Single-threaded performance is important in showing that x86 no longer has the big advantage it once used to have. Overall throughput is well correlated with single-thread performance; you can see that clearly in the results we discussed. Do you believe 64 Graviton 1 cores would do equally well against the 7742 if they had the same huge caches?

    I haven't seen serious benchmarks on c5a; do you have a link? With 32 cores at 3.3GHz it should burn well over 200W, not an improvement...

    It's not that revolutionary if you followed the rapid increase of single thread performance over the last 5 years. Smartphones paid for the progress in microarchitecture and process technology that enabled competitive Arm servers (it helped AMD surpass Intel as well). I don't believe SMT or AVX are useful - 128 cores in Altra Max will beat 64 cores with SMT+AVX on performance and area at similar power.

    As for AVX, this article discusses how Intel's latest CPU disables AVX... Linus had some interesting comments recently about the fragmentation of the many AVX variants. Then there are all the unresolved clocking and power issues. It's a mess.
  • ichaya - Thursday, July 16, 2020 - link

    If there was a significant power difference between m6g.16xlarge and c5a.16xlarge, they would be priced differently. 128GB of RAM can't be more than ~15W.

    Single-thread performance can help multi-thread performance up to a point, but SMT, non-boost clocks, and biasing towards TLP more than ILP (like an in-order GPU) all trade single-thread performance for more multi-threaded throughput.

    AVX-512 is a mess, but AVX2 is worth having in most contexts now. Maybe some AVX-512 instructions worth having will make it into an AVX2.1 which can completely supersede AVX2. For the price of Lakefield there are certainly more attractive options, though compatibility, packaging and performance can trump battery life.
  • Wilco1 - Thursday, July 16, 2020 - link

    Well, there is a much better comparison: c6g.16xlarge has 128GB and is 12% cheaper than c5a.16xlarge. More than enough to pay for the electricity cost of the 280W TDP of c5a.

    Yes you can optimize for multithreaded throughput but SMT remains questionable, especially for large core counts. Why add SMT when you could just add some more cores?

    Indeed, AVX-512 is worse, and could be removed without anyone missing it. Lakefield battery life comparisons are in; the Atom curse has struck yet again...
  • ichaya - Thursday, July 16, 2020 - link

    12% is probably more a reflection of the subsidies these instances are getting. Amazon has a very, very long history of putting any profit margins back into growth. Either that, or 128GB of RAM is 100W+!

    SMT is perhaps the lowest level at which TLP can be extracted; recent multi-core Atoms don't have it, but for server/workstation tasks like compilation, DB engines, or even general multitasking, it's well worth it.
  • Wilco1 - Friday, July 17, 2020 - link

    Graviton 2 is less than a third of the silicon area of EPYC, so it's cheaper to make. 128GB of server DRAM costs over $1000, which is why the 256GB/512GB versions are more expensive. The power cost of the extra DRAM is a tiny fraction of that.

    There are tasks where SMT helps, but equally there are tasks where it is slower. So it looks great on marketing slides where you just show the best cases, but overall it is a small gain.
  • ichaya - Saturday, July 18, 2020 - link

    I wouldn't call a 64-vCPU (180W) system beating or equaling a 64-core (110W) system in web serving/DB and code compilation a small gain. The tasks where SMT hurts are basically single-threaded JS, which is just such a shame. I don't think POWER, SPARC and others were wrong to add SMT years ago.

    For code compilation and DB the differences are 50%-100%+, making perf/W/$ very competitive.
    https://www.phoronix.com/scan.php?page=article&...

    This article also seems to mention SMT might make an appearance in the next Neoverse N* chips: https://www.nextplatform.com/2019/02/28/arm-sharpe...
  • Wilco1 - Sunday, July 19, 2020 - link

    The Phoronix link has various benchmarks that aren't even running identical code between different ISAs (e.g. the Linux kernel compile). So it's not anywhere near a fair CPU comparison like SPEC. And this: https://openbenchmarking.org/result/1907314-AS-RYZ... shows SMT gives almost no gain on multithreaded benchmarks once you stop cherry-picking the good results and ignoring the bad ones...

    Even if we just consider the benchmarks with the largest SMT speedup, Coremark and 7-zip have good SMT gains of 41% and 32%, but m6g *still* outperforms c5a by 5% and 24%.

    So the best SMT gain combined with a 32% frequency advantage and 4 times the L3 cache is still not enough to provide equal per-thread performance!
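
    The per-core and per-thread math, spelled out (a sketch treating the 5% Coremark margin as the totals; both instances expose 64 vCPUs):

        m6g_total, c5a_total = 1.05, 1.00   # m6g beats c5a by 5% overall
        per_core_c5a = c5a_total / 32       # 32 SMT2 Zen 2 cores, fully loaded
        per_core_m6g = m6g_total / 64       # 64 single-threaded N1 cores
        print(per_core_c5a / per_core_m6g)  # ~1.9x: the Zen 2 core does more work...
        print(m6g_total / c5a_total)        # ...but per vCPU/thread m6g is 5% ahead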
  • ichaya - Sunday, July 19, 2020 - link

    SPEC is useful for some IPC comparisons, but it's questionable to use it for much else. pgbench in the Phoronix link has a 50%+ speedup with SMT, which puts perf/W/$ basically in line with the Graviton 2 instance. The worst case is Cassandra, but everything else is within ~5% for similar perf/$, if not comparable perf/W too, since TDP comparisons are workload-dependent as well and not measured by most tests.

    XZ and Blender are ~45% faster with SMT in your OpenBenchmarking link, but that's a 3900X (12-core/24-thread), so any comparison to server chips (64-core Graviton 2) is unfair given the power consumption and core-count differences. 4 times the L3 is also wrong; it's 50% more L2+L3 with half the cores plus SMT, if you're being fair between m6g.16xlarge or c6g.16xlarge and c5a.16xlarge.
  • Quantumz0d - Friday, July 3, 2020 - link

    Intel has lost its edge. And this whole portable nonsense is reaching peaks of stupidity. Those Lakefield-equipped machines will be close to $1000 for a thin and ultra-light with one USB-C port and one 3.5mm audio jack, what a fucking disaster.

    I owned one ultrabook, an Acer Aspire S3, and I even used to play DotA 2 on it. After 1-2 years the whole machine heated like crazy; I repasted it, no dice, cleaned the fans, nothing. And then the battery stopped holding a charge too. Now what? That stupid POS is dead, not even worth fixing. Meanwhile, a Haswell machine from 2013 with an rPGA socket and an MXM slot, and guess what? Its GPU got an upgrade from a Kepler 860M to a Pascal 1070 MXM.

    All these BGA trash machines will neither hold a charge nor stay serviceable. Older ultrabooks at least had a 2.5" drive, newer ones have NVMe SSDs, but these 2-in-1s, like most of the Surface lineup, are almost impossible to repair or service. And because of this thin-and-light market, Windows 10 has been ruined as well to cater to this BS phenomenon: a desktop-class OS hit with that ugly mobile UX which lacks powerful software options, navigation and all. Plus you don't even get to repair it yourself due to unavailable service parts.

    With Apple HW it's the same thing: full BGA, not even NVMe SSDs, and now they've also started to make macOS look and feel like iOS trash. This whole mobile and ultra-portable garbage is ruining everything, from gaming to the HW.
  • PandaBear - Monday, July 6, 2020 - link

    They don't want to cannibalize their highly profitable x86 business, so they have to give you crap for what you want if you want to pay less. The problem right now is other companies don't have to deal with this political monopoly BS and they are eating Intel for lunch.

    Most monopolies die this way: when their monopoly business is obsoleted and they hang on to it to milk the cow till it dies.
  • yeeeeman - Friday, July 3, 2020 - link

    Tigerlake should also be in the pipeline soon, right?
  • Deicidium369 - Saturday, July 4, 2020 - link

    Benchmarks show it destroying AMD Renoir at single core, and within 17% on MT, despite half the cores...

    https://wccftech.com/intel-10nm-core-i7-1165g7-cpu...
  • watzupken - Sunday, July 5, 2020 - link

    "Benchmarks showing it destroying AMD Renoir at single core, and within 17% on MT - despite half the cores...

    https://wccftech.com/intel-10nm-core-i7-1165g7-cpu...

    Till we see the actual performance, you need to take these leaks with a lot of salt. The test beds are not revealed in leaks, so it is not possible to ascertain whether the numbers are realistic. We won't have to speculate for long, though, since it should be out pretty soon.
  • pugster - Friday, July 3, 2020 - link

    Lakefield's 2.5W standby sounds kind of high. An ARM CPU is probably much lower than that.
  • Ian Cutress - Monday, July 20, 2020 - link

    2.5 mW
  • ProDigit - Friday, July 3, 2020 - link

    Qualcomm has proven that a single fast core isn't enough. Intel needs at least 2 fast cores, then at least 6 Atom cores.
    But if Intel wants to compete with AMD, it'll need to create a quad-core big setup with at least 10 to 12 Atom cores.
    Any less will be too little. These are too little as is, competing against AMD's 3000 series.

    It would be awesome if Intel could make a 25W quad-core CPU paired with an additional 40 watts of Atom cores. That's about 20 additional cores, or a 24-core CPU.
  • abufrejoval - Friday, July 3, 2020 - link

    A great article overall, very informative, deeply technical while still readable to a layman, very little judgement or marketing, allowing readers to form their own opinion: Anandtech at its very best!

    Not mentioned in the article and not covered by the comments so far is that the main driver behind Intel’s low power SoCs has been Apple: This is what Intel thought Apple would want and be happy with!

    And if you contrast it to what Apple will now do on their own, that makes me want to sell all my Intel shares: Good thing I never had any.

    This is another Intel iAPX 432 or i860: tons of great ideas engineered into parts, but great parts don’t automatically make a convincing whole.

    And I simply don’t see them iterating this into many more designs over the next years at competitive prices: with that hot-spot-governed layout between the two dies, all the flexibility and cost savings a chiplet design is supposed to deliver go away, and you now have two chips in a very tight symbiosis with no scale-up design benefits.

    It’s a Foveros tech demo, but a super expensive one with very little chance of currying favor even at ‘negative revenues’ in the current market.

    x86 is not competitive in terms of Watts or transistors required for a given amount of compute. That didn’t matter much in PCs, and the competing servers were much worse for a long time, but in the mobile space, from phones to ultrabooks, it seems impossible to match ARM, even if you could rewind the clock by ten years and start taking big.LITTLE seriously. Lakefield is essentially a case study in Core being too big and thus power hungry and Atom failing on performance.

    ISA legacy is still keeping x86 from dying completely, but that matters less and less both at the top of the performance range with servers and at the bottom in mobile, where the Linux kernel rules supreme and many userlands and ISAs compile just fine.

    Gaming is a hold-out, but this may be the last console generation on x86, and gamer PCs alone are too much of a niche to determine the future.

    The desktop will switch to whoever offers the bigger, longer-lasting bang for the buck, and there is a very good chance that will be ARM next.

    Microsoft may be allowed to blunder along with lackluster ARM64 support for a couple more days, but Apple’s switch puts them under long-deserved pressure. A nice Linux/Android/Chromium hybrid ultrabook running whatever Office could get things moving quicker… at least I hope so, because I’d never want to be forced into the bitten Apple by those corporate decision makers I can already see twitching.

    No chance I’d ever let a new Apple into my home: The ][ was the last good one they made.
  • Quantumz0d - Sunday, July 5, 2020 - link

    The PC gaming market cap is supposed to reach $40Bn by 2022, the total gaming market $120Bn including everything, and consoles are built on AMD x86 technology and now DX12U, and you think that is a niche?

    ARM is not going to take over just because Apple switched; there have been so many trials by so many companies. Qualcomm, the company best known for its ROI on R&D, abandoned all of its ARM server market-share dreams with the death of its full-custom Centriq. x86 runs blazingly fast and well optimized on Linux, which is what powers the world; just because ARM is good in thin-and-light garbage doesn't make it a superstar.

    ARM is not going to get into the desktop at all: no one is going to rewrite their programs to support that HW, and no company is going to invest in the DIY market before the server/DC market. The supercomputer market is not DIY or enterprise; look at the top supercomputers, where China's Tianhe and Sunway systems hold two positions with Chinese processors, and AMD's Cray Zen-based IF supercomputer is about to come as well.
  • Wilco1 - Sunday, July 5, 2020 - link

    The #1 supercomputer is Arm, and Arm servers beat x86 servers on performance, cost and power, so not a single "fact" in your post is correct.
  • lmcd - Sunday, July 5, 2020 - link

    That first statement is hilariously disconnected from the second. Fugaku at 3x the cost per flop of its next competitor hardly backs up your assertion.

    ARM servers might beat x86 servers on performance, cost, and power but it's not looking that good vs x86_64. The latter arch is commodity hardware, software, and talent hiring.
  • Wilco1 - Monday, July 6, 2020 - link

    Just looking at the peak FLOPS in comparisons is deceiving. Fugaku is a very different design, as it does not use GPU accelerators like most supercomputers. That means it is far better than the rest in terms of ease of programming and efficiency. So even if the upfront cost is higher, they expect to get far more out of it than other supercomputers.

    I'd say Arm servers are doing really well in 2020; clearly companies want a change from the x86 duopoly. Much of the talent is at companies doing Arm designs. How else do you think Arm CPUs are getting 20-30% faster per year, and mobile phones already outperform the fastest x86 desktops?
  • Quantumz0d - Tuesday, July 7, 2020 - link

    No company wants to develop in-house IP; that R&D and its ROI are not easy. Amazon did it to chop off some costs and set up a plan for the low-end AWS instances with Graviton 2, Ampere's Altra has still yet to show, and Centriq was abandoned by Qualcomm despite so much marketing done around Cloudflare and top-class engineering work by the team which made the 820's full-custom core.

    AND what are you babbling about with the fastest x86 desktops (like the Threadripper 3990X, or 3950X, 10900K) being outperformed by mobile phones? Ooof, you are gulping down AT's SPEC scores, aren't you?

    ARM servers, LMAO. Look at how AMD upped their DC market share with the EPYC 7742; dude, stop posting absolute rubbish. ARM market share in data centers is in the 0.5% area, where IBM also resides.
  • Quantumz0d - Monday, July 6, 2020 - link

    TaihuLight is a fucking Chinese Sunway-processor-based supercomputer, and it's in the top 3, so what did they do? Jack off to Zen with Hygon, or did they make all Chinese use Chinese-made processors? Stop that supercomputer nonsense. IBM has been there for ages and had SMT8 with the POWER9 uarch, which came in 2017 (Summit, which is #2, was first from 2018), and what did they do? x86 is consumer-based and the DC market relies on exactly that. ARM DC market share is less than fucking 2%, AMD is at 4.5%, Intel is at 95%, and that is Q4 2019.

    I don't know why people hate x86 as if their lives were being threatened by it. x86 machines can run a vast, diverse, rich software selection with more freedom-based computing, yet people prefer ARM-based proprietary dogshit: Apple's series trash with their locked APIs and bootloaders (much worse, like chastity), or unlocked Android phones where, even with GNU GPLv2 and Qualcomm's top OSS CAF, the goddamned phones do not get the latest updates or anything, while a Core 2 Quad from a decade ago can run fucking Linux or Win7/Win10 without any bullshit issue.

    Wait for the SPEC benchmarks of the A-series iPhone 12, and then you can be even prouder of that garbage device which cannot compute anything outside what Apple deems fit.
  • Wilco1 - Friday, July 3, 2020 - link

    It would be good to run benchmarks on the 2 variants of Galaxy Book S. One comparison I found:

    https://browser.geekbench.com/v5/cpu/compare/25848...

    So Lakefield wins by only 21% on single-threaded (that's a bad result given it is Cortex-A76 vs IceLake at similar clocks), and is totally outclassed on multithreaded...
  • lmcd - Sunday, July 5, 2020 - link

    The current scheduler doesn't even guarantee that's the Sunny Cove core.
  • Wilco1 - Monday, July 6, 2020 - link

    Given Tremont can't get anywhere near Cortex-A76 performance, we can be sure the single-threaded result is from the Sunny Cove core.
  • PaulHoule - Friday, July 3, 2020 - link

    This is an example of the "Innovator's Dilemma" scenario where it is harder to move upmarket (in terms of performance) than downmarket.

    Put a phone processor into a box with a fan and people will be blown away by how fast it is -- they've never seen an ARM processor cooled by a fan before.

    Put a desktop processor into a thin tablet with little thermal headroom and people will be blown away by how slow it is.

    So first, it is a situation that Intel can't win; but second, it is a disaster that this low-performance (downmarket) chip is expensive to produce and has to be sold upmarket. Sure, you can stick any number of dies together and "scale up" a package in a way that looks as if you scaled up the chip by reducing the feature size, but when you reduce the feature size the cost per feature goes down in the long term -- when you stick a bunch of cheap chips together you get an expensive chip.
  • Drkrieger01 - Friday, July 3, 2020 - link

    I'm not one to criticize, but this comment section is a dumpster fire.

    First of all, this is a FIRST GENERATION PRODUCT that has barely made it to market.
    Secondly, no one has really gotten to do a deep dive on performance of said product.
    Thirdly, this processor package can be used from low end laptops, to tablets, and possibly it other mobile devices.
    Fourth - who the hell cares about AVX? Do you people realize just how little AVX-512 is actually used in day-to-day usage scenarios that this CPU would be designed for? (mobile)

    How about we wait to see what this product actually does for the technology market before we write it off as 'Intel Trash'.
    /drops mic
  • Wilco1 - Friday, July 3, 2020 - link

    What hasn't helped is that both Lakefield and Tremont have been hyped up for some time, so expectations were high. Some sites even claim that Tremont is a Cortex-A77 class core purely based on it having 2x3 decoders... That is setting things up for disappointment and failure.

    "Wait for the next generation, it'll be great" has been used on every Atom, but it never lived up to its promise.
  • Deicidium369 - Sunday, July 5, 2020 - link

    "Wait for the next generation, it'll be great" has been used on every AMD product, but it never lived up to its promise."
  • Wilco1 - Sunday, July 5, 2020 - link

    Without a doubt AMD has a much better track record than Intel - where are the 10nm desktops and servers? And Lakefield getting 60% of the performance of the 18-month-old 8cx is embarrassing...
  • lmcd - Sunday, July 5, 2020 - link

    Track record is more than the last calendar year in a single market segment. You're kidding yourself if you think the company that brought Bulldozer, Piledriver, Steamroller, and Excavator to market, promising each time that this was the one that fixed the architecture, suddenly gets a pass on everything.
  • Korguz - Monday, July 6, 2020 - link

    And yet, it seems Intel gets a pass when they make mistakes or screw up. Go figure.
  • Spunjji - Monday, July 6, 2020 - link

    Do you want to produce a similar list for the Pentium 4, Itanium, and Atom product ranges, or would that require a little too much intellectual honesty?

    Both companies have extended periods of bad products. Only one of them had the excuse of mediocre revenues, and only one of them was punished for their repeated failures with a dramatic loss in market share. Tells you a lot, really.
  • Spunjji - Monday, July 6, 2020 - link

    Weird - pretty sure Athlon 64 was a rout, the first Athlon X2 was a rout, and Ryzen 3000 was a rout... You post some of the most asinine crap in this comment section when you're bagging on AMD, which is a real shame, because the rest of the time you seem to make a fair bit of sense.
  • abufrejoval - Friday, July 3, 2020 - link

    The entire article is about explaining what can already be inferred from the information we have at hand.

    Chips are engineering and physics, very little magic and a great degree of predictability.

    None of the elements here are first generation; only their combination is a bit new. Ice Lake can be measured: you can benchmark a single core at 5 Watts with ThrottleStop by pinning the benchmark to a single core on any Ice Lake system. Atoms are well known, and we can be sure that when Intel claims a 23% improvement at the same power, it won't be 230%.

    You can predict it's going to be much more expensive to make than a normal Atom, and you can measure that a single Core CPU below 5 Watts doesn't have a lot of horsepower, while multiple cores on this design leave no Wattage for the big one.

    This chip will be very expensive to make, so it won't sell at Atom prices. All the engineering is about making it small enough to compete with ARM designs, yet capable of competing at 5 Watts.

    Yes, Ian could still be wrong here and there, but there isn't a lot of room to err.

    The rest of us agree that this chip will fail to make Intel rich and customers happy.
    If we should all be wrong, remind us and we'll show proper contrition and learn.

    But we bet on what we extrapolate from what we can know and measure; that's our duty as engineers.
  • lmcd - Sunday, July 5, 2020 - link

    Tremont is absolutely new and the thermal characteristics of the package and layout also determine a lot.
  • PaulHoule - Saturday, July 4, 2020 - link

    @DrK,

    the engineering on this part is like what you'd get if you contracted out to Rockwell or Litton Industries for a brain for a Stinger missile. Compact, brilliantly packaged, with adequate performance, but no concern at all about thermal dissipation because the missile is going to hit or miss its target before the CPU fries.

    Foveros is an expensive technology for a mass-market device (a cheap tablet) because the fabrication cost depends on the total area and there is an expensive step of stitching the chips together at the end. If you could avoid fabricating "glue" components and just snap together chips from a library, this might be an amazing technology for building 500 of something at low development cost and time (e.g. weeks). If you have to make a new mask for the chip, however, it is a lot less fun.

    So far as AVX goes, the problem is as you say: "who cares about AVX?" Intel has shipped a backlog of features that people don't use because of overhead and complexity. As a software dev I get paid to work on certain aspects of my products, and maximizing performance with the latest instructions may or may not be on my agenda. If it is easy to do I will push for it, but if it means debugging compatibility problems it is a tough ask. "Optimal" performance for a range of users can mean shipping many versions of a function; the performance of loading, installing, and updating those libraries will be not in the least optimal.
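
    For what it's worth, the "many versions" pattern boils down to runtime dispatch, something like this sketch (names are illustrative, not a real library; GCC's target_clones attribute automates the same idea in C):

        def dot_generic(a, b):
            return sum(x * y for x, y in zip(a, b))

        def dot_avx2(a, b):
            # stand-in: a real build would ship a SIMD kernel here
            return sum(x * y for x, y in zip(a, b))

        def _cpu_flags():
            try:  # Linux keeps the CPU feature flags in /proc/cpuinfo
                with open("/proc/cpuinfo") as f:
                    return f.read()
            except OSError:
                return ""

        # Pick the best version once, at load time
        dot = dot_avx2 if "avx2" in _cpu_flags() else dot_generic

    Every such choice point is one more variant to build, test, ship, and keep loaded, which is exactly the overhead I'm complaining about.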

    Intel is like that Fatboy Slim album, 'We're #1, Why Try Harder?' The world has changed and Intel is not the #1 CPU firm any more. Intel has to get more Paranoid or it might not Survive.
  • Spunjji - Monday, July 6, 2020 - link

    Why start with "I'm not one to criticise" and then do it? Clearly you are, and as a rhetorical flourish it's tedious in the extreme.

    1 - It's a first-gen product and it shows, but they're putting it in premium products.
    2 - No deep-dive, for sure, but Intel's own figures are not very encouraging.
    3 - Citation needed here. There's no sign of it being used outside of low-power premium devices.
    4 - Who cares about AVX indeed! Tell that to the Intel fanboys pissing all over the AMD threads?

    I'm entirely in favour of your final conclusion, but it's not really supported by the previous statements. 🤷‍♂️
  • Oxford Guy - Friday, July 3, 2020 - link

    Bricklake or bust.
  • Meteor2 - Friday, July 3, 2020 - link

    Ultimately this is another attempt by Intel to stay relevant in a space where it's always struggled: mobile. With the progress being made by Apple, Microsoft, and Qualcomm using ARM, Intel is looking at losing an ever-growing chunk of what was the laptop market.

    But whatever Intel tries, bottom line is that ARM is more efficient than x86.
  • Beaver M. - Friday, July 3, 2020 - link

    That's not the issue. The issue is that there's not much software in that sector for x86.
  • Valantar - Sunday, July 5, 2020 - link

    A few errors in the article: 2 16-bit channels of LPDDR4X should be 2 32-bit channels of LPDDR4X, given that Renoir (with 4 32-bit LP4X channels at the same clock speed) delivers exactly 2x the bandwidth. Right?
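
    The arithmetic behind that (a sketch; LPDDR4X-4267 for both parts is my assumption):

        def peak_gbs(channels, width_bits, mts):
            return channels * width_bits * mts / 8 / 1000  # bits -> bytes, MB/s -> GB/s

        print(peak_gbs(2, 32, 4267))  # Lakefield: ~34.1 GB/s
        print(peak_gbs(4, 32, 4267))  # Renoir: ~68.3 GB/s, exactly double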

    You should also proofread the pasted-in laptop descriptions; a lot of stuff in them clashes with the previous text.

    Beyond that though: great article! Part of the reason why I love AT is for these technical yet understandable deep-dives. Looking forward to the next one.
  • Pixelpusher6 - Sunday, July 5, 2020 - link

    Interesting choice to place the DRAM right over the core; it seems like it would make more sense to move it next to the chip but still on package. I guess my question is: was it worth the complexity of implementing this Foveros design to save a little space? It seems like they could have gotten the same benefit with traditional packaging, i.e. a slightly larger package. Can you imagine paying $2500, the price of that Lenovo, and getting Atom-esque performance?
  • Farfolomew - Monday, July 6, 2020 - link

    Agreed on the DRAM placement. It seems really out of place. Another "dime-size" piece of silicon right next to the Lakefield CPU doesn't seem like it would take up much more board space, and it would alleviate a ton of the heat dissipation problems by allowing the compute-layer die to be directly connected to a heatsink.
  • serendip - Monday, July 6, 2020 - link

    It seems to be an interesting technical answer to a question nobody asked. Board space is a lot cheaper than what Lakefield would cost. It could also cost more for Intel to produce and they'd be stuck carrying multiple RAM SKUs.

    Heat dissipation could be a major issue. The slow chip could become even slower if it has to constantly throttle down because of thermal loads. Intel is sadly mistaken if this is supposed to be an ARM competitor.
  • Spunjji - Monday, July 6, 2020 - link

    That's the exact impression I got, too. They seem to be jumping around waving "look, we can do this too" when really it would have made far more practical sense *not to do it*.
  • watzupken - Sunday, July 5, 2020 - link

    Conceptually, this is a good way to lower power requirements to make Intel more competitive against ARM SoCs. However, I agree that this is indeed part smartphone and part PC, which unfortunately also means it may not be good at either. From a smartphone perspective, this may be a low-power chip, but I am still not convinced that an x86 chip can be as efficient as a high-end ARM chip. In the PC/laptop space, I feel it will be more economical to just go for pure Tremont-based chips, which should offer sufficient performance for light chores and still offer good battery life. In my opinion, this is going to be a very niche chip and likely won't be cheap either.
  • Farfolomew - Monday, July 6, 2020 - link

    The engineers for this got screwed over by the marketing teams. There is no way this is supposed to compete with higher-end Core chips. It's supposed to be the new Atom replacement, fixing everything that was wrong with previous 0+4 Atom CPUs: sluggish OS response and, to an extent, slow single-threaded perf. And also giving a boost in GPU capabilities.

    I hope Intel doesn't abandon this, even though, as Ian said, this first gen is going to get slammed.
  • serendip - Tuesday, July 7, 2020 - link

    I think Intel should have gone the other way by bringing Sunny Cove idle power down to ARM levels, instead of making this Frankenstein's monster of a chip. ARM licensees all use big cores to speed up UI threads and OS response but Intel seems to be using little Tremont Atom cores for everything. An upgrade to Atom wouldn't have saved Intel's position in mobile, not when ARM big cores have higher perf/watt.
  • PandaBear - Monday, July 6, 2020 - link

    Did I see $2499 for the Lenovo? Holy smoke, it is going to fail with this kind of processor. I think Intel is doomed with this being done on 10nm.
  • hanselltc - Monday, July 6, 2020 - link

    How does Comet Lake fit into that power/performance graph?
  • qwertymac93 - Monday, July 6, 2020 - link

    Those performance numbers have me shaking my head. Currently, the chip is acting more like a "1 OR 4" core, not a "1 AND 4" core design. I can't help but wonder if two "big" cores would have been both faster and more power efficient... Clearly this product is suffering from first-gen-itus. 2021 can't come soon enough for Intel.
  • MS - Tuesday, July 7, 2020 - link

    Another science project by someone who has a stake in the Atom design. Some things are so bad, it's impossible to kill them. Avoton, Covington, they were all terrible products and all you need to do is look at the performance/power graph to see that there isn't even a net power saving. It's like putting a Pinto engine into a Mustang and expecting better gas mileage. What are they thinking? Or are they? The only thing they missed is piggy backing a Larrabee.
  • name99 - Tuesday, July 7, 2020 - link

    "here’s what Intel did, with each connection operating at 500 mega-transfers per second. The key point on this slide is the power: 0.2 picojoules of energy consumed per bit transferred."

    It's worth comparing this to the competition. TSMC has LIPINCON which is not exactly identical (I don't believe TSMC has publicly demo'd a stacked version) but at the abstract level of "chiplet to chiplet communication" it's the same thing.
    TSMC gets 0.56pJ per bit, but at a substantially faster 8GT/s. I don't know the extent to which this would scale down if reduced to Intel speeds, and whether its energy costs would go down (or even up, but that seems unlikely) if it were operating vertically rather than horizontally through an RDL.
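
    To put the two figures side by side at full tilt (a sketch; the 64-bit link widths are my assumption, neither vendor quotes one here):

        PJ = 1e-12
        def link_power_watts(pj_per_bit, gt_per_s, width_bits):
            return pj_per_bit * PJ * gt_per_s * 1e9 * width_bits

        print(link_power_watts(0.2, 0.5, 64))   # Foveros: ~6.4 mW at 500 MT/s
        print(link_power_watts(0.56, 8.0, 64))  # LIPINCON: ~287 mW, but a far faster link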

    Point is, Foveros is branding more than a technology per se. It's Intel's way of performing a particular type of packaging, but as far as we can tell, the same style of packaging is available to anyone who uses e.g. TSMC, if it meets their particular goals.

    (So far it hasn't because Apple, QC, Huawei, etc, can fit their entire SoC on a single die, they don't need to go through these contortions to either reduce the die size or deal with the limited capabilities of their fabs...

    That sounds snarky, but Lakefield is deeply fishy. Sure, you want to save area, but the target is a tablet, and Apple's tablet SoCs have been 120 to 150mm^2. You'd figure a single-die Lakefield all on 10nm would fit in ~150mm^2 or less. So???

    And Samsung? I would guess so, but I don't know.
  • Spunjji - Friday, July 10, 2020 - link

    Apple aren't trying to squeeze all of their profit margins out of the CPU alone, though. That's the difference.

    This is Intel trying to preserve margins by using fancy packaging technology to increase yield (and thus both output and margins) on their increasingly capacity-constrained nodes.
  • EthiaW - Tuesday, July 7, 2020 - link

    How can we expect something this stingy on silicon area (no room for even one more large core) to compete with a Snapdragon 9cx (likely with two Cortex-X1) or an Apple A14? Actually it has no edge over the Apple A12 from 2018, even when the latter loses some 40% of its performance to x86 emulation.
  • Wilco1 - Wednesday, July 8, 2020 - link

    It doesn't even compete with the 18 month old 8cx... It will be interesting to see a side by side Book S review with benchmarks and battery life.
  • serendip - Tuesday, July 14, 2020 - link

    https://www.notebookcheck.net/Samsung-Galaxy-Book-...

    Here it is. It barely competes against the 8cx but gets almost half the battery life running at 5W TDP. Samsung is supposed to release an update to allow running at 7W but that would kill battery life even more.
  • Wilco1 - Wednesday, July 15, 2020 - link

    Ouch... Thanks for that link!
  • reggjoo1 - Tuesday, July 7, 2020 - link

    Just manipulating the scheduler won't be enough. They're gonna have to work on the governor more.
  • 808Hilo - Sunday, July 12, 2020 - link

    Headline:
    Intel expanded its turd business!
    We successfully, and at great cost, replicated the Atom processor and are only 10 years late with our consumer-grade chip. The improvements are amazing: 1 slow processor supported by 3 super-slow processors in a revolutionary new 4-processor die. The chip, designed for warheads, is exclusively down-binned and hand-selected to exacting consumer standards. Support our military. Designing low performance is not cheap. Getting effed - Intel inside!
  • throAU - Monday, July 13, 2020 - link

    So, unless this can compete with the iPad Pro processor of the day, I just don't see the market. Windows 10 on ultra-portable tablet-type devices already sucks, so your realistic choices are Android and iOS. Android has a suite of decently performing, already existing SoCs on the market, likely at far less cost than Intel will no doubt try to charge for this. And no AVX-512? Only a single performance core? I just don't see it working out.

    I would have thought they'd be far better off not neutering the Sunny Cove core, and instead working with Microsoft and others on an API for queuing a workload to the relevant core for the relevant code fragment. Treat the performance core as you would any other co-processor. Use thread affinity to bind specific UI threads to it. I'm sure there are methods that could be used, but no: in order to run on unmodified platforms (which suck for the market segment they are aiming at anyway), they crippled it.
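
    A rough sketch of the idea in Linux terms (core 0 standing in for the big core is an assumption; the real mapping would come from the OS topology):

        import os
        import threading

        def ui_loop():
            os.sched_setaffinity(0, {0})  # pin the calling thread to the big core
            # ... latency-sensitive UI work here ...

        def background_work():
            others = os.sched_getaffinity(0) - {0}
            os.sched_setaffinity(0, others)  # keep batch work on the small cores
            # ... throughput work here ...

        threading.Thread(target=ui_loop).start()
        threading.Thread(target=background_work).start()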
  • serendip - Tuesday, July 14, 2020 - link

    Notebookcheck has a review comparing the Intel Lakefield and ARM models of the Galaxy Book S: https://www.notebookcheck.net/Samsung-Galaxy-Book-...

    The results aren't pretty. For the same price of around $1000, the Lakefield version loses LTE in some markets, has equal or slightly less performance for CPU and GPU, but it has <10 hour battery life compared to the 8cx model's 16 hours. Despite all the fancy packaging, Lakefield is still half as efficient as Qualcomm's best, which makes it outclassed by Apple's silicon.

    The worst part about Lakefield on Windows is how it essentially performs as a quad core Atom chip most of the time. Ian's fears were realized.
  • throAU - Tuesday, July 14, 2020 - link

    This is pretty much exactly what I expected. Except the modern ARM processors have a better feature set than a crippled Lakefield chip. And there's less fragmentation in what they will/will not support vs. other ARM processors of the day.

    I expected Qualcomm to outclass it. It won't even be anywhere near close to an A12Z, and that's a processor from 12-18 months ago, which will no doubt be outclassed itself by whatever Apple releases late this year.
  • ballsystemlord - Wednesday, July 22, 2020 - link

    Spelling and grammar errors:

    "For those that are interested, Lakefield's PMICs are under the codenames Warren Cove and Castro Cover, and were developed in 2017-2018."
    I think you misspelled "cove":
    "For those that are interested, Lakefield's PMICs are under the codenames Warren Cove and Castro Cove, and were developed in 2017-2018."

    "Even those these CPUs are a 1+4 configuration,..."
    "though" not "those":
    "Even though these CPUs are a 1+4 configuration,..."

    "Another thing to note, which Intel glossed over, that most people are going to be really concerned about."
    Missing "is" and concerned about what?
    "Another thing to note, which Intel glossed over, is that most people are going to be really concerned about."
