AMD is no less than doomed if this is all they've got. I say this with sadness and regret.... and anger. There are definitely a few former CEOs who deserve to be publicly beaten.
They doomed themselves when they chose to leverage half the company for ATI just before the second great depression and they doubled down on defeat when they sold off their greatest asset: their fabs. Then, just to seal the deal, they decided that the longer pipelines and greater complexity that had almost ruined Intel the decade before would save them. Without in-house fabs to achieve the efficiency. Without the talent to realize the vision. Without floating point units...
I have to disagree. With Ivy Bridge EP Xeons out soon, the Warsaw product is probably too little, too late. However, the Seattle and Berlin CPUs can carve out their niches. Don't underestimate the micro server market. It is similar to the x86 server market back in the late '90s: it started small and in a niche, but it quickly grew upwards.
It's frustrating to not see Steamroller-based 2P and 4P Opterons. AMD has failed badly in execution. The original Bulldozer was released in late Q3/Q4 2011, the Piledriver updates happened in Q4 2012, and Steamroller-based updates should have been released by Q4 2013 or Q1 2014. But now that's not going to be the case. AMD's execution of big-core products has been miserable. In Feb 2012, during the Financial Analyst Day, AMD talked of Kaveri being their 2013 APU. By late 2012 it was clear that the Steamroller-based Kaveri was delayed, so AMD cooked up a mild Trinity update called Richland and marketed it as the new 2013 APU. What's worse is that AMD has been hit badly by Globalfoundries screwing up their 28nm bulk process.

The only hope is that AMD goes straight to Excavator for the real next big-core Opteron 2P/4P products with a new DDR4-based platform in late 2014 or Q1 2015. The question is what process AMD will be using. AMD needs a 20nm Excavator product to have any chance against Haswell EP/EX.

Intel has no competition till late 2014 in the traditional IT server market. AMD's problem is that they are at the mercy of foundries for process tech, which are at least one generation behind Intel, and even more if you count features like FinFET. Intel's transistor performance is unmatched. So given that AMD is behind on process tech by a generation, they need to out-architect Intel, which is a very difficult task. But against all these odds everyone is hoping AMD can compete against Intel in the big-core race. Let's see what AMD has in store with Excavator-based Opteron refreshes.
AMD isn't in as bad a position as you seem to think. They aren't good competition for Intel at the high end, true, but performance per dollar in the midrange and low-end server market is definitely AMD's area of dominance. I have several AMD-based Dell R515 machines in my lab. They perform very well, especially as virtual hosts, at a good price. They're a good $500-700 cheaper than the equally performing Intel-based machines, or in other terms, you can get two more cores per socket and two 1TB hard drives for the same price. Don't count them out just yet.
The problem is that the low and mid range don't provide that much profit, so AMD has to sell loads of them just to recover the costs of R&D. They don't even have their own fabs anymore, so besides paying for the R&D they also have to pay someone else to actually make the chips. Plus they will soon have real competition from ARM, because let's be honest, ARM will start eating AMD's market share at a much faster rate than it eats Intel's - precisely because of what you said: Intel owns the performance sector, and the ones who want performance will not consider ARM solutions (for the time being).

Hector Ruiz was probably one of the worst CEOs in AMD's history, and just the thought that he had a bigger paycheck than Intel's CEO makes me wonder how AMD as a company is being run...
Unfortunately, $500-$700 is often a drop in the bucket compared to the software costs so a lot of businesses will go Intel despite the superior price/performance ratio. Perhaps more worrisome is the fact that AMD chips are simply larger and more costly than their Intel equivalents. If Intel ever decided to lower their profit margins enough, AMD couldn't compete.
I think AMD's current strategy of capitalizing on GPU tech and integration with SeaMicro tech is the appropriate strategy at the moment. They have no real hope of catching up to Intel's process tech, and Intel appears to have a superior big-chip architecture even without it. Intel is focused on its small-chip architecture, and given their general CPU expertise, it is probable that they will catch up there as well. However, Intel has historically been less than proficient in regards to graphics. While the new Iris line is pretty impressive, it is hard to say how much of that is efficient architecture and how much is simple brute force. Also, while the Xeon Phi is making great strides in the high-end HPC arena, Intel simply doesn't yet have the expertise to leverage their on-die graphics for compute as simply and easily as AMD's upcoming Berlin (for mainstream or low-end HPC). The longer AMD can maintain their technical advantage here, the better. Hopefully they can also keep building off of the SeaMicro tech to distinguish themselves from the manufacturers that think they can just throw an ARM chip in their server and call it a day.
AMD was never on par with Intel's fabrication technology in the first place. They've always been behind (usually one generation, but sometimes more). The only significant (unique) fab technology I can recall AMD releasing was their Silicon on Insulator (SOI) process, but it was still on a larger node. Also, it wasn't so much that Intel couldn't have used SOI as that they deemed strained silicon more useful for their purposes at the time.
Intel has always relied on their process advantage, as it allows them to implement architectures that are simply impractical on previous process nodes. Need more performance in the same power envelope? Use a higher performance architecture on a new process node. Need a lower power envelope with the same performance? A smaller process node can help you maintain the same clock speeds on the same architecture while dropping power. Need new features, such as better power management, to reach your target? A smaller process node gives you more transistors to work with to implement those features within the same die size. And as cost is rather directly related to die size, they can even effectively change nothing and reduce cost. For a long time, clever designs and lower profit margins helped AMD compete, but Intel simply has far more leeway to screw up without an appreciable effect on their final product. One less than optimal move by AMD lands them in budget land with no profit margins.
ARM-based manufacturers seem to have a more optimal architecture (if only slightly) for low power processors than Intel right now. However, Intel's process advantage has once again allowed them to achieve rough parity with their newest Atom architecture. If Intel can close the gap in regards to architectural cleverness, then the ARM houses will need to close the gap in fab tech. Otherwise, like AMD, they will be stuck in a situation where one wrong move could relegate them to the budget end (of a budget market). Luckily, there are many ARM houses and they aren't all likely to screw up at the same time.
AMD's power usage would be better if they didn't implement CnQ on a per-module basis. Sharing resources is one thing, but power circuitry? The dynamic L2 is a big step in the right direction though, especially considering there's up to 8MB of it. They could always go inclusive on the caches as they did with Jaguar.
Johan, in the latest Top 500 supercomputer list, three supercomputers sport 22 nm Intel Ivy Bridge E5 Xeons, including the most powerful supercomputer in the world (which uses 32 thousand Ivy Bridge E5s). The Ivy Bridge E5s, although not yet listed on the Intel price list, are already available to select customers. Ivy Bridge E7s come out in weeks.
AMD's Warsaw gross margins will be squeezed.
Let us assume you are right about power efficient micro-servers. Here is the problem:
--Haswell ultra low power Xeon E3s come out in weeks
--Avoton is likely released by September.
What do you think this will do to Seattle and Berlin CPU gross margins? Keep in mind how much Global Foundries rips AMD off in fabrication. Seattle CPU gross margins will also be reduced by the fact that those chips cannot run native x86 code.
This said, Seattle and Berlin CPUs will do okay in the 2nd half of 2014 (I doubt AMD will be able to sell them in volume before then). The real Seattle killers will be ultra low power 14 nm Broadwell E3 Xeons and Airmont, Avoton's 14 nm successor. Both should be available in Q4 2014.
In 2015, Intel will release microservers powered by 14 nm multi-socket Xeon Phis. Not PCI Express Xeon Phis, but microservers consisting entirely of Xeon Phis that are better at processing x86 code than any ARM Holdings derivative.
The medium-term Intel roadmap for new server chips and server non-core functionality will be extremely difficult for AMD to compete against.
Seattle is not built by Global Foundries, and I can imagine that the R&D on Seattle is quite cheap. After all, it is not like AMD had to reinvent the A57 core. And a lot can happen in 3 quarters, so I don't think Seattle should be compared to Airmont.
I think AMD still has a chance. Intel's most recent releases have shown that the per-core performance increases are asymptotically reaching the fastest x86 is going to get. Haswell was insignificantly faster than Ivy Bridge, which itself was only marginally faster than Sandy Bridge. It's clear that the year-on-year speed increases are quickly going to approach zero.
In this environment AMD has some room to catch up. If they can hold on for a few more years it should give them some time to improve until their IPC is as good as or so close to as good as Intel's as to not really matter.
Depending on how you look at it, it either sucks or is great that x86 is approaching its performance limit. Most would say it sucks because they always want something faster (even if 99.9% of the time they don't actually need it!). I actually am kind of happy that things are slowing down considerably. Rather than focusing on performance for performance's sake, perhaps we can start focusing on the entire computing infrastructure and make it work better for people instead of just faster. Lower power devices, more ubiquitous devices, more useful, more user friendly ... these are all great trends, and it's where the future competition in the computing world will predominantly be.
It reminds me of gaming consoles and their effect on games. I know a lot of people dislike the fact that consoles set a performance standard that it was no longer beneficial for most games to go beyond, leading to a sort of performance stagnation. On the other hand, companies could spend more resources focusing on making better games and less on higher and higher detailed eye candy. I actually appreciate that because I think gameplay is more important than graphics.
Anyway, I don't think you can start sounding AMD's death knell yet. They're going to reach near performance parity with Intel in a few years just because they have more headroom for performance improvement than Intel as we approach the limit. At that point, x86 won't really be getting significantly faster year by year but it sure will get cheaper. I am sure we can all appreciate at least that aspect of it.
x86 is far from hitting its performance limit. The reason Intel's CPUs haven't drastically improved performance in the last few generations is that their efforts have been focused elsewhere, most notably the integrated graphics, which has made significant strides even since Sandy Bridge and consumes a large portion of the die. Haswell put most of its focus on Iris and power consumption. Ivy Bridge was already fast enough for 99.99% of the userbase, so it makes strong business sense to focus efforts elsewhere and spend their transistors on more important matters.
If Intel wanted to increase CPU performance, they could. But they have no reason to bother doing so. Other features are more important and their competition (AMD) is still a generation or two behind in performance. There's not a whole lot of pressure for faster CPUs.
inighthawki is right. In addition to what you said, Intel was focused on reducing the TDP envelope of its CPU cores rather than increasing their single threaded performance. Since Intel uses the same architecture for everything from E7 Xeons to ultra-low power Haswell SoCs that power tablets (two 256-bit-wide SIMD units with FMA), the single threaded performance of E7 Xeons is constrained by the need to fit Haswell and Broadwell into tablets.
However, this will not last. Intel now has three x86 architectures:
--the main Core architecture
--Avoton and Avoton successors
--Xeon Phi (including multi-socketed E5 and E7 versions)
Intel will be able to focus on widening the main core's SIMD from 512 to 1024 bits in future micro-architectures, as well as improving macro fusion, branch predictors, buffer sizes (better unified reservation stations, instruction decode queues, reorder buffers), and increasing the number of parallel execution ports, L1, L2 and L3 caches, etc.
The reason why is that the successors of Avoton will focus on low TDP parts over time, freeing Intel to prioritize increasing single threaded performance on the main core. Similarly Xeon Phi's focus on improving multi-threaded performance allows the main core to focus more on single threaded performance.
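To put the "two 256-bit SIMDs with FMA" point in concrete terms, here is a minimal sketch of my own (not Intel sample code): a single fused multiply-add instruction computes a*b+c for eight packed floats at once, and Haswell can issue two of these per cycle. Widening to 512 or 1024 bits simply means more lanes per instruction. Compile with something like g++ -mavx2 -mfma:

    #include <immintrin.h>  // AVX2 / FMA intrinsics
    #include <cstdio>

    int main() {
        alignas(32) float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
        alignas(32) float b[8] = {8, 7, 6, 5, 4, 3, 2, 1};
        alignas(32) float c[8] = {0.5f, 0.5f, 0.5f, 0.5f, 0.5f, 0.5f, 0.5f, 0.5f};
        alignas(32) float r[8];

        __m256 va = _mm256_load_ps(a);
        __m256 vb = _mm256_load_ps(b);
        __m256 vc = _mm256_load_ps(c);

        // One fused multiply-add: r[i] = a[i]*b[i] + c[i] for all 8 lanes at once.
        __m256 vr = _mm256_fmadd_ps(va, vb, vc);
        _mm256_store_ps(r, vr);

        for (int i = 0; i < 8; ++i) printf("%.1f ", r[i]);  // 8.5 14.5 18.5 20.5 20.5 18.5 14.5 8.5
        printf("\n");
        return 0;
    }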
You refuted your own point at the end when you said "There's not a whole lot of pressure for faster CPUs". This is exactly why x86 is asymptotically approaching its IPC limit. There simply are not enough dollars chasing increased x86 performance to justify the huge costs involved in advancing x86 performance significantly. With mobile markets eating away at desktop and laptop market share, the number of tasks that require ever increasing CPU performance dwindling, and the increasing cost of increasing x86 performance, there just isn't going to be enough money chasing improved x86 performance to justify the billions that it costs to make significant headway.
I apologize if I misinterpreted your original point, but I was speaking explicitly from a technological perspective. There are plenty of ways to improve performance, just very little will to do so. You are correct on that one.
However I'm not sure I agree that the reason is solely just a need to justify R&D costs. I think it's more simply that the performance is high enough that they have the opportunity to better spend their time on more important things. Take away the costs they've put in since Sandy Bridge associated with improving the integrated graphics performance and lowering power consumption and I bet they could've easily ended up with a chip that was at LEAST 50% faster than it is now. It's just that nobody really wants or needs that.
I think we are in agreement. And I must say, things sure are different than they were 15 years ago when the gigahertz wars were heating up. Back then there was almost nothing you could do with a computer that couldn't benefit from more speed. Now ... well it's easy to go 4 or 5 years without bothering to upgrade, and the year-on-year speed bumps are almost not noticeable.
I like AMD. But they are in a tough spot and I don't see them getting out of it with any of their roadmaps. The only interesting product is their ARM server chip. Everything else appears to be more of the same with slight improvements. That's not going to change the market dynamics. Intel squeezed up to 50% better idle battery life (and 30-100% faster graphics) out of Haswell, is targeting 2x CPU performance for the next Atom, and has its new Xeon Phi powering the fastest supercomputer (even if it's slightly less efficient than some of the other machines). They are setting aggressive targets and meeting them.
AMD needs to hit targets that are at least 30-100% better than current offerings at something that matters. Be it IPC, power consumption, graphics, maximum clocks, core counts... but something. They should just sell the PS4 or XBOne chips: 8 cores and GDDR5 or eDRAM. That would give probably a 100% increase in APU graphics performance and double the cores, which would give some CPU tasks a large boost.
I'm not sure I agree. Even with Steamroller fixing multithreading, that in itself won't result in a big gain. If Cinebench is anything to go by, FX scales decently enough there so there's not a huge amount of performance left on the table without a decent single core IPC gain.
Each BD core only contains 2 ALUs and 2 AGUs. AMD argues that dropping the third of each was a necessary evil, however whereas Stars couldn't really utilise them, Core certainly can. Additionally, FlexFPU isn't helping - theoretically, AMD shouldn't lag behind in AVX but they do significantly, so they'd need to boost the FPU to 256-bit per pipe in order to gain parity... but they're not doing this with Steamroller as far as I've heard (in fact, the only thing they're doing is removing an MMX unit). FX's memory controller doesn't appear to be that good, either, and the significant power savings from high density libraries don't come until Excavator (though Steamroller does get the dynamic L2, which is neat).
Not having an FX or Opteron Steamroller 8-core (or higher) is a bit daft regardless of the above, unless it's not a big enough jump, OR Excavator has been pushed forwards.
A bit alarming that there are no 12-16 core Steamroller parts on 28 or 20nm on the roadmap, and what does that tell us about desktop (they don't really have a public roadmap for 2014 desktop)? I do wonder if Warsaw is one huge die or if they are still going with 2 dies of 8 cores each. On the ARM side it will be interesting to see power consumption on 28nm; at least this way they'll have the product sooner. I do wish the DIY players and the SoC makers would set some standards for ARM desktops. At the very least there is an HTPC and file server market that can expand to desktop.
I wonder if AMD is taking an Intel release approach to this. The high core-count EP server CPUs coming out a year after the smaller ones with some tweaks. Steamroller is set to be released for desktop in Kaveri form this year- same die as Berlin I am guessing. But you are right that I don't see a Steamroller FX chip or a server variant with 4m/8c which is puzzling. I hope that they aren't going to rely on fused-off Kaveri 2m/4c CPU only parts for clients as it simply isn't enough horsepower. The BD architecture is actually extremely competitive with anything Intel has when used in a proper Linux environment.
Hopefully these products come out on time, AMD really needs timely execution to have any hope here. Other 64-bit ARM solutions are just around the corner and they will be left in the dust if they have constant delays.
You are a newbie in the server market. Server folks will never commit to a closed language, because they value their development time.

And of course server processors need as much floating point as possible at 64/128-bit widths. The impressive "FP32 performance" numbers that the OpenCL and CUDA creators show us are irrelevant, because you need as many significant bits as possible - 32 bits minus the exponent minus some reserve - so the conclusion is that FP32 is irrelevant here.
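A quick sketch of my own to illustrate the significant-bits problem (nothing to do with any particular GPU): naively summing 0.1 ten million times already drifts well off the exact answer in single precision, while double precision stays essentially exact:

    #include <cstdio>

    int main() {
        float  f = 0.0f;   // ~24 significant bits (~7 decimal digits)
        double d = 0.0;    // ~53 significant bits (~16 decimal digits)
        for (int i = 0; i < 10000000; ++i) {
            f += 0.1f;
            d += 0.1;
        }
        // Exact answer is 1,000,000. The float sum drifts visibly off;
        // the double sum is correct to within rounding error.
        printf("float : %f\n", f);
        printf("double: %f\n", d);
        return 0;
    }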
Intel got greedy and lazy, and AMD had an opportunity in the $200+ desktop with many cores (preferably on 20nm so they can fit 16 cores in a reasonable die size). AMD also has an opportunity to make an APU similar in performance to the Xbox One, for a nice budget gaming box capable of playing console ports for quite a few years, and longer once the APU gets CrossFired with a discrete card. We'll see how big a GPU Kaveri has; it's debatable whether it's worth using a big enough GPU on 28nm. On the consumer ARM side, both Intel and M$ are somewhat alienating customers, so the ARM players should push on all fronts. Sadly it seems AMD doesn't have the resources to be aggressive and fight too many battles.
AMD has had opportunities everywhere since the days they made the Opteron 180 and the overall transition to the many-core era. But they threw away their chance with the Opteron decision to drop floating point. And of course AMD needs a manual cache-control instruction set.
I don't think we can necessarily infer desktop strategy from the server roadmap. AMD already has a really, really small market share (something like 4%) in big x86 server parts. Their market share on the desktop is nowhere near that bad. Sure, they're going to release the APUs first for Steamroller since that's a mainstream product and has a bigger market than the enthusiasts. But once they get that done, they're going to need to do something to keep themselves in the news until the next architecture, and putting together a FX chip wouldn't be too much of a stretch. I do think that Socket AM3+ is dead; the FX parts will most likely be on the same socket as the APUs from now on. This will minimize engineering costs.
The problem is that AMD's big cores are so bad compared to Intel's big cores that it makes next to no sense to compete against Intel with them. Perf/W is worse and manufacturing costs are far too high. To sell chips made out of these cores, they need to cut their prices so low that they can't make any profit to pay for R&D or anything else. This applies to both the server and desktop markets. Sure, you can sell expensive-to-manufacture multi-module Phenoms to cheap-ass people who want the best multicore performance per dollar, but what's the benefit when you can't make any money doing so?
AMD is in dire need of a complete big-core CPU architecture renewal, and with their R&D resources that probably isn't going to happen any time soon. Unless they can pull some kind of magical bunny rabbit from their hat, I don't see them being competitive in big cores ever again.
They are shifting their target to those markets where they hope they can still compete.
The big question right now is if Steamroller can fix the problems with AMD's construction equipment architecture or not. Official estimates are quite bullish, promising 15%-30% gains on a clock-for-clock basis. No doubt these are overly generous estimates and I take them with a grain of salt, but if AMD can increase actual IPC by 10% or more with Steamroller (rather than just cranking the clock speed higher) then there may be hope for the construction equipment cores. If not, then AMD's best bet is ditching that line altogether, and scaling up Jaguar or its successor so it's reasonably competitive on the desktop. The good thing is that Jaguar is already optimized for low power (which is where the Bulldozer lineage really falls short) and its IPC is pretty good. And they've already got some nice design wins with the PS4 and Xbone, which demonstrates that these cores are suitable for gaming (an "enthusiast" use). Perhaps they could backport some of the features that Bulldozer and its successors actually got right, like the improved branch predictor. (Or did they already do that with Jaguar?) After all, this is basically what Intel did when they dropped Netburst in favor of a revised version of the P6 architecture.
@zepi "Sure, you can sell expensive to manufacture multi-module phenoms for cheap-ass people who want best multicore performance per dollar, but what's the benefit when you can't make any money doing so?"
Even if you can't make any money, if you can break even it is still useful for keeping your employees employed. While a business doesn't have an inherent need to employ someone for the sake of employing them, in this case it is useful for maintaining your talent pool. For a company like AMD, the engineers' work cycles are most likely punctuated with periods of high demand and low demand. When you have fewer product lines, this means there could be periods of time where they have no work to do at all while waiting for work to be completed farther up the line. Even a small loss is better than paying a chunk of employees to do nothing while waiting for the next thing to come down the pipe. Having more product lines allows you to even out such lulls by staggering releases and thus filling in the gaps from one product line with work from another.

As an example, if the employees responsible for the low power line's layout only worked the low power line, many of them would have been left with nothing to do while waiting for the Jaguar architecture to be developed and simulated. Tweaks to the Bobcat layout and preparations for the next node change would have kept some of them busy, but it is quite likely that many found work in the mainstream or even FX lines in the interim.
If you take a look at AMD's assets, their portfolio and their current situation, it's not hard to see where they are headed. Money is usually made in the middle of the road. By this I mean most sales of enterprise-class server CPUs in this economic scenario will target a balance of sufficient computing power, price and power consumption.

What would you, as a business owner, opt for as your average VM server for your average medium business needs? The $600, 95W E5-2630 or the $290, 115W Opteron 6320? I won't even discuss the different standards both companies use for their TDP ratings; it's a matter of cold hard cash.
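A rough back-of-envelope sketch of that "cold hard cash" point, under my own assumptions ($0.10/kWh, running 24/7 at TDP for 3 years, ignoring cooling, licensing and the rest of the system):

    #include <cstdio>

    // List price plus 3 years of electricity, using TDP as a crude proxy for draw.
    double threeYearCost(double priceUsd, double tdpWatts) {
        double kwh = tdpWatts * 24 * 365 * 3 / 1000.0;
        return priceUsd + kwh * 0.10;  // assumed $0.10 per kWh
    }

    int main() {
        printf("E5-2630 (95W, $600):       ~$%.0f\n", threeYearCost(600, 95));   // ~$850
        printf("Opteron 6320 (115W, $290): ~$%.0f\n", threeYearCost(290, 115));  // ~$592
        return 0;
    }

Even with the higher TDP, the power bill doesn't come close to eating the price difference in this scenario.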
AMD will sell cheap, will move faster while listening to clients, will take more risks on niche markets, and will leverage their GPU technologies in the server market to make up for their less than stellar FPU performance.
How big is the HPC market compared to the SME one?
I'm not sure this is the correct time, but I do think that eventually we will see a merger or at least closer alignment of the FX line and the A series products. Consider that since before the bulldozer architecture was conceptualized, AMD had been looking to fuse the CPU and GPU into one chip. They wanted to allow people to program code for the "GPU" portions of the processor as easily as the "CPU" portions and even within the same code blocks. They've steadily (if slowly) progressed towards this goal since then culminating in their current HSA and hUMA technologies. When looked at from this perspective, the subpar floating point performance of bulldozer and its derivatives makes sense. If you have a set of "GPU" cores or "stream processors" available to handle floating point operations, then it seems less necessary to include them in the CPU cores.
Unfortunately, this merger is taking longer than AMD initially expected. Even if AMD's intention was to leverage discrete GPUs in the meantime to cover the floating point gap, software hasn't yet progressed to the point of making it happen. For the moment, a GPU-less part is necessary to serve the higher performance sectors. Eventually, though, I do expect to see GPU-ish elements in their high end parts to handle parallel operations and possibly augment the floating point characteristics of the processors. At that point, the transistors dedicated to the "GPU" portion will no longer be useless die space in regards to CPU performance. Such processors would have a much easier time with voice recognition, facial recognition, pattern recognition, neural algorithms (A.I. learning), etc.
AMD can't expect third-party code to be rewritten to accommodate their processors. If they can leverage the GPU for floating point, then fine, but it has to work seamlessly with existing CPU opcodes. In other words, the APU has to *internally* see that a stream of (say) SSE2 floating point instructions is coming, and hand that off to the GPU portion, without requiring anything to be recoded.
AMD doesn't have the market share to tell software vendors to do things their way.
That 2013 picture is some scary schiznit! Last time I went to a concert that was what it was like too. Those screens are right out of some science fiction horror novel. It is amazing what people cannot see, even when it is so plainly obvious.
Yeah, people are more focused on burying their noses in their phones and capturing the moment than actually living the moment. I don't have a smartphone and I notice that I pay a lot more attention to what I'm actually doing than most people most of the time. I don't know why people think it's necessary to make a crappy smartphone recording of an event when you can almost certainly buy a professionally made recording of almost any important event after the fact for a few bucks.
"Andrew Feldman told us that Berlin will offer at least twice CPU processing performance than the Opteron X-series."
I'd damn well hope it was a lot more than this. If it's clocked at twice the speed then Berlin will be forgettable, however if the comparison is with Berlin clocked at, say, 3GHz, that's not so bad.
All non-BD AMD architectures seem to scale very well with additional cores, and this is the main area that SR looks to improve upon.
Small typo on page 4, on the very first sentence," The current Opteron 4310 EE (2 modules, 4 cores at 2.2-3 GHz, 40W TDP) and Opteron 4376 HE (4 modules, 89 cores at 2.6-3.6 GHz, 65W TDP) are about the best AMD can deliver for low power servers that need some more processing power." Unless I'm mistaken (which an 89 core chip would be pretty sweet, especially at just 65 watts) that should read 8 core. Otherwise great read Johan.
To be honest, I wrongly assumed that I still had some time left, then discovered that the deadline was already a few hours ago and just hurried. So I humbly thank you for making this article more readable for our readers :-)
I'm hoping we will see some relatively inexpensive Mini-ITX Kyoto boards. An Opteron-X paired with ECC RAM could make a good, reliable platform for a DIY firewall or low-end NAS.
I didn't know the 9W for the X1150 was at a 1GHz core clock. If so, I guess Intel will have zero competition for Atom servers. Intel will want to grab as much market share as possible between now and the end of 2014, when ARM will have a low power server product. Seattle seems interesting; I wonder if it is possible to have 4x 16-core Seattle chips in a server. Seems like a good candidate for a hosting environment.
Remember that the X1150 has 4 cores vs the 2 for the Atom chip. It also has much better IPC and support for ECC and 32GB (DDR3-1600) of RAM vs only 8GB (DDR3-1333) for the S1260.
The Centerton (S1200-series) Atoms already have ECC support. But the performance is still subpar, so AMD will have an advantage there. Also, as far as I have been able to determine, there is only one Centerton board available to DIYers (the Supermicro X9SBAA-F) and it's hard to find and quite pricey (>$250). If AMD could get Mini-ITX Kyoto boards out at the $150-$200 price point, this would be a quite attractive option for small servers.
So how would the next-gen Intel architecture compare to Jaguar? Since Intel (I think) has Hyper-Threading in their next gen as well, a dual core isn't so far off from a quad core, and it seems Intel can push the GHz a bit higher than the X1150.
"Berlin will use the same 28 nm process technology as the Opteron x1150 and x2150"
Jaguar is made at TSMC. Rumors for Steamroller derived processors point to production at Globalfoundries. Most go as far as claiming that it will be FD-SOI (see STMicroelectronics). That makes sense because 28nm bulk would be a step back compared to 32nm SOI. I don't think AMD is going to stop using SOI for their high-end CPU's anytime soon. Either way, be sure that they won't be using the same process technology (even though they're both 28nm) for these very different CPU's.
Please... AMD said so specifically in the Dec 2012 WSA amendment with Global Foundries. It is on page 6 of the 11-page PDF:
<quote> Separately, AMD will move to standard 28nm process technology and significantly reduce reimbursements to GF for future research and development costs. – We anticipate these savings will be approximately ~$20M per quarter during the next several years which also helps achieve our OPEX target of $450M by Q3 2013 <end quote>
Read it again and again. The standard technology at 28nm is bulk HKMG, not SOI.
Even GF's PowerPoint roadmap states that the 20nm and 14XM processes will both be successors to 28nm bulk HKMG - continuing as bulk HKMG and bulk with FinFETs respectively. And the SOI line stops with 28nm FD-SOI (in collaboration with STM), with just an ambiguous dotted line for the future.
With FDSOI we still do not know if GF has agreed to manufacture FDSOI parts in large volume. All we know is the license agreement between ST Micro and GF.
Do you really think Seattle will hit the market in Q1 2014? Count me skeptical. Seattle will have lower single threaded performance than Avoton. Plus Seattle can't process native X86 code the way Avoton can.
Seattle will sample in Q1 and be available in Q3 according to other articles.
Note that the A57 is much faster than Avoton; even the A15 has better IPC. Given Seattle packs 4 times as many cores, we're talking about 5-6 times the throughput of Avoton.
Wilco1, it's beautiful how you can make definite conclusions based on nothing. All we know now are estimates. But even if you take these estimates into consideration, things won't be as rosy as you describe them.
- The A57 will be around 20-30% faster than the A15 on the same process. Numbers from ARM themselves. In one of Anand's reviews, AMD's Bobcat core was faster than an A15.
- Intel's press release on Silvermont: "Silvermont microarchitecture delivers ~3x more peak performance or the same performance at ~5x lower power over current-generation Intel Atom processor core"

Considering that the newest Atom SoCs aren't that much slower than AMD's Bobcat core, a sane person wouldn't have said what you just said.
Let's assume Silvermont will do 3 times the score, so ~4300 (I don't believe it will but that's another discussion). A57 has 20-30% better IPC as you say, and clocks up to 2.5GHz, so would score ~7300. That means with 4x the cores Seattle would have 6.8 times the throughput. Round it down to 5-6 times because we're talking about estimates. Any sane person would agree with my calculation. Avoton has no chance to compete on throughput, not even with 8 cores. Period.
Btw A15 easily beats Bobcat and trades blows with Jaguar (beats it on overall score, wipes it out on FP but is slower on int and memory - remember this is a phone SoC!): http://browser.primatelabs.com/geekbench2/compare/...
You're comparing a dual core Atom to a quad core A15! http://browser.primatelabs.com/geekbench2/compare/... If it wasn't for the Neon unit kicking ass in some tests, the scores would be pretty close. Since Silvermont will support SSE4.1 & SSE4.2, it will become stronger in that department.
The first Seattle will be an 8-core part. The 16-core one will follow later, but AMD is not precise about the release date. Geekbench is a very low level benchmark that runs mostly in the processor cache. Many things that make or break a processor are not tested in this way. It's not a true indicator of real world performance. http://images.anandtech.com/graphs/graph6877/53966... Here you have an example where a 1.6GHz dual core Bobcat beats a 1.7GHz dual core A15 by almost 20% in a real world CPU intensive test. Just to show that Geekbench is very low-level and things like prefetchers, branch predictors, ... are not tested very well. But as I said, it's too soon to speculate. That's all I'm saying.
Atom supports hyperthreading which gives a big speedup so it is reasonable to compare with a quad core - note most phones are quad-core anyway. But as you show even a dual A15 beats a dual Atom by almost a factor of 2 despite running at a lower frequency. I don't know whether NEON is used at all, but I'd be surprised if it was.
Geekbench is not perfect but it correlates well with other native benchmarks such as SPEC and Phoronix. Your physics example is comparing hand optimized drivers for different GPUs. The A15 is trivially beaten by the Note 2 (quad A9) despite having ~3 times the NEON FP performance. So I'm not sure whether one can conclude anything from that beyond that AMD's drivers appear to be well optimized.
3DMark physics runs completely on the CPU. The GPU is not used at all. The fact that the Note 2 beats the dual A15 shows once more that low level benchmarks like GB and SPEC are almost meaningless between architectures. Also, I don't agree that you should compare a dual core HT Atom to a quad core. It's still a dual core. That the A15 beats a 5 year old Atom core by a factor of 2 is no feat. It's the minimum. Yes, Geekbench is compiled to use ARM's NEON.
Yes I know the physics test runs on the CPU. Without having access to the source code it would be hard to say what causes it. If it doesn't use Neon and has a high correlation with #cores * MHz then that might explain it. Btw where did you get that Geekbench uses Neon? Which benchmarks are actually vectorized?
A dual Atom has 4 threads. A 2-module Bulldozer has 4 threads. A quad core has 4 threads. These are ways of supporting 4 threads with different hardware tradeoffs. However from the software perspective they look and behave in the same way. So it's entirely reasonable to compare thread for thread.
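It's easy to see this from the software side: a quad core, a dual core with Hyper-Threading and a 2-module Bulldozer all just show up as four logical processors. A trivial sketch of my own:

    #include <iostream>
    #include <thread>

    int main() {
        // The OS exposes N logical processors regardless of whether they come
        // from full cores, SMT siblings or CMT module halves.
        std::cout << std::thread::hardware_concurrency()
                  << " hardware threads visible to software\n";
        return 0;
    }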
Btw if you think the dual A15 vs Atom speedup of ~2x is low, you'll be disappointed with Silvermont. Since the claimed 2.8x speedup is for a quad core, a dual Silvermont will only be ~1.4 times faster...
Wilco, Avoton's 14 nm shrink (Airmont) is likely released in Q4 2014. Seattle will be available in Q3 2014. Seattle should really be compared to Airmont.
Avoton has better branch predictors and macro fusion than A57. Some Avoton SoCs will pack 20 Silvermont cores.
"Given Seattle packs 4 times as many cores, we're talking about 5-6 times the throughput of Avoton." :LOL:
Silvermont and Airmont cores are more power efficient than A57 cores.
Didn't you read the news that Intel appears to be delaying 14nm introduction to 2015? And where did you see that Avoton has 20 cores? At best they might have 8.
Nobody has compared A57 and Avoton branch predictors yet, so the jury is out on that. Same for power efficiency.
I believe the road AMD has chosen has a good future for them... the only problem is that, currently, that road is very rocky and difficult to drive through.
Once they manage to fulfill their "Fusion" plans, we won't need to be bothered by the anemic FPU units paired on each Piledriver/Steamroller/whatever module, mostly because that computing should be done on the GPU (which, by the time the Fusion is complete, would be an integral part of the processor itself).
Unfortunately we're still a few years away from such... and I hope they (greatly) improve integer performance on their cores until then. But I do believe it's going to happen, and it'll be great.
Think of the old "math co-processors" of the past, back in the 386 days (they got integrated on the 486 models). The only difference now, though, is that a GCN IGP of today has an order of magnitude more compute/FPU power than Intel's parts. Fast forward to the future, improve that even more, fuse it into the CPU....
"AMD can't expect third-party code to be rewritten to accomodate their processors." AMD doesn't have the market share to tell software vendors to do things their way."
Not true. AMD has *more* than mere market share, it has *complete* *market* *dominance* in the new gaming consoles. *All* of gaming graphics software expertise is now focused on leveraging CPU / GPU tradeoffs in the AMD APU design. If advantages really exist, they will be found and exposed, and new software will go that way as well, if only for AMD users. For example, it would seem that the GPU has plenty of fast FP, which need not also be in the CPU, *provided* it can be accessed easily without copying back and forth.
What is this second great depression we are talking about? I think that is a myth. I don't see any dust bowls or soup lines. Certainly not here in Austin, where AMD is. What you had to wait an extra week to pick up your iPhone 5 and were depressed, and that's the 2nd great depression? Have you ever talked to anyone who was actually alive during the great depression? Wimps.
We're not going to see any major x86 architecture changes from AMD till at least 2015, if not 2016. That's about how long it takes to design and deploy a newly designed architecture, which is hopefully what the major rehires from last year will be working on. So we're stuck with mostly minor tweaks and enhancements till then.
"AMD says that using the graphics core for the heavy scalar floating point will get as easy as C++ programming and as a result, Berlin should make a few heads turn in the HPC world. It even looks like SSE-x will get less and less important over time in that market. "
Ahh, yes, the old "new compilers will make our weird CPU architecture invisible to the programmer" gambit. How's that worked out in the past, guys? TriMedia? Cell? Itanium?

But there's a sucker born every minute. Good luck to anyone foolish enough to invest today on the assumption that this magical compiler will be available tomorrow.
[I'm not claiming this breakthrough --- compiler-transparent GPGPU --- will NEVER happen. I am claiming it ain't gonna happen during the relevant lifetime of this product.]
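For a sense of what the "as easy as C++" pitch actually looks like today, here's a minimal sketch using Microsoft's C++ AMP (my own example; AMD's actual HSA toolchain may look different). Even on the friendliest path you still have to pick the hot loop, reshape it as a data-parallel kernel and manage the views yourself - nothing here is compiler-transparent:

    #include <amp.h>     // Microsoft C++ AMP (Visual Studio 2012+)
    #include <vector>

    // Scale every element of a vector on the GPU.
    void scaleOnGpu(std::vector<float>& data, float factor) {
        using namespace concurrency;
        array_view<float, 1> view(static_cast<int>(data.size()), data);
        parallel_for_each(view.extent, [=](index<1> i) restrict(amp) {
            view[i] *= factor;           // this lambda runs on the accelerator
        });
        view.synchronize();              // copy results back to host memory
    }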
This roadmap is a disaster. No new high margin SKUs until 2015+ (Excavator). No medium margin SKUs to beat Intel's single socket offerings until the Excavator core arrives in some unknown year. The low margin segment is dominated by a NON-x86 core... a vanilla ARM core, not a custom one a la Qualcomm. The process side of things is even worse. The ARM core (H2 2014, per a more accurate AMD official slide reported by xbit) is stuck on 28nm - funny, considering that Qualcomm will be on 20nm in Q1 2014; AMD doesn't even have the money to work with TSMC to deliver a competitive ARM SoC!!! Seattle already looks like a failure on the specs: the process doesn't allow eight cores at a decent TDP to match Intel's Avoton on 22nm Tri-Gate. Recent implementations of the A15 show that the 28nm node isn't even the best thing around for a decent quad-core low power device... you can imagine an eight-core one. Anyway, Seattle is late, i.e. in the same time frame as 14nm Airmont.

The last part of the article is stunning: "It looks like the Intel Avoton will have a very potent challenger in Q1 2014"... too bad Seattle is an H2 2014 device.
"So there is good chance that AMD will make a big comeback in 2014 in the server market"
What server market??? The microserver market??? With a NON-x86 core??? From an x86 company??? As I said: there is a good chance that AMD will make a so-so new entry in 2014 into a 10%, low margin niche of the server market, along with many other contenders, some of them with custom and optimized x86/ARM cores. I love your articles Johan, but still this seems very strange to me.
You guys on here sound like incompetent investors on Wall Street that know nothing about technology. Let me give you investors some advice: if you do not know the field you invest in very well, then you should refrain from commenting like you know who will be more competitive.
When AMD goes all hUMA-aware with their new generation APUs, then SSE instructions, AVX instructions and other such floating point instructions will be utterly destroyed by a program that takes advantage of the GCN cores on these APUs. That is an UNDENIABLE FACT!! A program written to take full advantage of the best floating point instructions that x86 has to offer will not come ANYWHERE near the same program written to take advantage of the GCN cores on the next generation APUs.

That is why the server Kaveri variant does not need to be 2P or 4P. Database programs that leverage GCN cores will outperform the floating point instructions in Intel's processors, even in 2P or 4P configs. It takes a whole lot of CPUs to equal the floating point computation power of the GCN architecture. CPUs are only great at serial code and branch prediction. We need more programmers to comment on here rather than you investor types. I feel like the only technical person on here. Geez...
Too bad the most common server workloads are not purely floating point based, the kind where a crude, repetitive-layout GPU can substitute for a CPU. The bulk of the software is optimized serial code. Kaveri can be nice in low end HPC, but you forget that Intel is shipping nicely powerful integrated GPUs these days, so AMD is not alone anymore in this segment.
There will be certain operations that can be made potentially many times faster. But not everything. Databases are interesting, but at least in my use cases they are more limited by memory capacity and disk/storage I/O than by CPU performance.
Most of the things I code are office and management/ordering/billing systems. Nothing particularly cpu intensive (other than video compression). Just lots of business rules and interop.
Wake me up when those microserver GPUs can use protected memory. As far as I know, all that memory is wide open to any process on the server. I can't imagine many uses of a microserver that could accept that (google and other single owner datacenters, maybe. But I tend to see these things as something you would want for VPS hosting).
GPUs appear perfect for cryptographic uses, but are completely unacceptable as long as they can't protect their own memory (just sift through the entire GPU looking for keys; you will find them quickly). I suppose there exists the odd ECC format you might want to run on your server, but that is sufficiently exotic to simply justify adding a PCIe card.
coburn_c - Tuesday, June 18, 2013 - link
AMD is no less than doomed if this is all they've got. I say this with sadness and regret.... and anger. There are definitely a few former CEOs who deserve to be publicly beaten.They doomed themselves when they chose to leverage half the company for ATI just before the second great depression and they doubled down on defeat when they sold off their greatest asset: their fabs. Then, just to seal the deal, they decided that the longer pipelines and greater complexity that had almost ruined Intel the decade before would save them. Without in-house fabs to achieve the efficiency. Without the talent to realize the vision. Without floating point units...
JohanAnandtech - Tuesday, June 18, 2013 - link
I have to disagree. With Ivy Bridge EP Xeons out soon, the Warsaw product is probably too little, too late. However, the Seattle and Berlin CPUs can carve out their niches. Don't underestimate the micro server market. It is similar to x86 server market back in the late 90ies. It started small and in a niche, but it quickly grew upwards.gostan - Tuesday, June 18, 2013 - link
Not underestimating the micro server market here. just that I don't have high hope on AMD to execute their roadmap in a timely fashion.TheCountess - Tuesday, June 18, 2013 - link
actually they hit all their launch targets in recent years, even if not always the performance they were hoping for.raghu78 - Tuesday, June 18, 2013 - link
its frustrating to not see Steamroller based 2P and 4P Opterons. AMD has failed badly in execution. The original Bulldozer was released in late Q3 and Q4 2011 . the Piledriver updates happened in Q4 2012. Steamroller based updates should have been released by Q4 2013 or Q1 2014. but now thats not going to be the case. AMD's execution of big core based products has been miserable.In Feb 2012 during the Financial Analyst Day , AMD talked of Kaveri being their 2013 APU. By late 2012 it was clear that Steamroller based Kaveri was delayed. So AMD cooked up a mild Trinity update called Richland and marketed it as the new 2013 APU. Whats worse is AMD is hit badly by Globalfoundries screwing up their 28nm bulk process.
The only hope is that AMD go direct to Excavator for the real next big core based Opteron 2P / 4P products with a DDR4 based new platform in late 2014 or Q1 2015. The question is what process will AMD be using. AMD needs a 20nm process based Excavator product to have any chance against Haswell EP / EX.
Intel has no competition till late 2014 in the traditional IT server market. AMD 's problem is they are at the mercy of foundries for process tech , which are atleast 1 generation behind Intel and even more if you count features like FINFET. Intel's transistor performance is unmatched. So given AMD are behind on process tech by a generation , they need to out-architect Intel which is a very difficult task. But against all these odds everyone is hoping AMD can compete against Intel in the big core race. lets see what AMD has in store with Excavator based Opteron refreshes.
dgingeri - Tuesday, June 18, 2013 - link
AMD isn't in as bad a position as you seem to think. They aren't good competition for Intel in the high end, true, but performance per dollar in the midrange and low range server market is definitely AMD's area of dominance. I have several AMD based Dell R515 machines in my lab. They perform very well, especially as virtual hosts, for a good price range. They're a good $500-700 cheaper than the equal performing Intel based machines, or in other terms, you can get two more cores per socket and two 1TB hard drives for the same price. Don't count them out just yet.Strunf - Tuesday, June 18, 2013 - link
The problem is that low and middle end don't provide that much of profit, as in AMD as to sell loads of them just to recover the costs of R&D, they don't even have their own fabs anymore so besides paying the R&D they also have to pay someone else to actually make them, plus they will soon have real competition from the ARM cause let's be honest ARM will start eating AMD market share at a much faster rate than eating Intel market share, this cause exactly of what you said, Intel owns the performance sector and the ones that want performance will not consider ARM solutions (for the time being).Hector Ruiz was probably one of the worst CEO from the AMD history and just to think he had a bigger pay check that the CEO of Intel makes me wonder how AMD as a company is being run...
JPForums - Tuesday, June 18, 2013 - link
Unfortunately, $500-$700 is often a drop in the bucket compared to the software costs so a lot of businesses will go Intel despite the superior price/performance ratio. Perhaps more worrisome is the fact that AMD chips are simply larger and more costly than their Intel equivalents. If Intel ever decided to lower their profit margins enough, AMD couldn't compete.I think AMD's current strategy of capitalizing on GPU tech and integration with SeaMicro tech is the appropriate strategy at the moment. They have no real hope of catching up to Intel's process tech and Intel appears to have a superior big chip architecture even without it. Intel is focused on its small chip architecture and given the general CPU expertise, it is probable that they will catch up their as well. However, Intel has historically been less than proficient in regards to graphics. While the new Iris line is pretty impressive, it is hard to say how much of that is efficient architecture and how much is simple brute force. Also, while the Xeon Phi is making great strides in the highend HPC arena, they simply don't yet have the expertise to leverage their on die graphics for compute as simply and easily as AMD's upcoming Berlin (for mainstream or lowend HPC). The longer AMD can maintain their technical advantage here, the better. Hopefully the can also keep building off of the SeaMicro tech to distinguish themselves from the manufacturers that think they can just throw and ARM chip in their server and call it a day.
Klimax - Wednesday, June 19, 2013 - link
I wouldn't be so sure about computing advantage for AMD:http://techreport.com/review/24954/amd-a10-6800k-a...
JPForums - Tuesday, June 18, 2013 - link
AMD was never even with Intel's fabrication technology in the first place. They've always been behind (usually one generation, but sometimes more). The only significant (unique) fab technology I can recall AMD releasing was their Silicon on Insulator (SOI) process, but it was still on a larger node. Also, it wasn't so much that Intel couldn't have used SOI as they just deemed strained silicon more useful for their purposes at the time.Intel has always relied on their process advantage as it allows them to implement architectures that are simply impractical on previous process nodes. Need more performance in the same power envelope, use a higher performance architecture with a new process node. Need a lower power envelope with the same performance, a smaller process node can help you maintain the same clock speeds on the same architecture while dropping power. Need new features, such as better power management, to reach your target. A smaller process node gives you more transistors to work with to implement those features within the same die size. As cost is rather directly related to die size, they can even effectively change nothing and reduce cost. For a long time clever designs and lower profit margins had helped AMD compete, but Intel simply has far more leeway to screw up without having an appreciable effect on their final product. One less than optimal move by AMD lands them in budget land with no profit margins.
ARM based manufacturers seem to have a more optimal architecture (if only slightly) for low power processors than Intel right now. However, Intel's processes advantage has once again allowed them to achieve a rough parity with their newest Atom architecture. If Intel can close the gap in regards to architectural cleverness, then ARM houses will need to close the gap in fab tech. Otherwise, like AMD, they will be stuck in a situation where one wrong move could relegate them to the budget end (of a budget market). Luckily, there are many ARM houses and they aren't all likely to screw up at the same time.
silverblue - Wednesday, June 19, 2013 - link
The 14nm node has been delayed until 2015.AMD's power usage would be better if they didn't implement CnQ on a per-module basis. Sharing resources is one thing, but power circuitry? The dynamic L2 is a big step in the right direction though, especially considering there's up to 8MB of it. They could always go inclusive on the caches as they did with Jaguar.
Klimax - Wednesday, June 19, 2013 - link
14nm wasn't delayed. (Only desktop won't receive it)1008anan - Tuesday, June 18, 2013 - link
Johann, in the latest 500 supercomputer list, three supercomputer sport 22 nm Intel Ivy Bridge E5 Xeons, including the most powerful supercomputer in the world (which uses 32 thousand Ivy Bridge E5s). The Ivy Bridge E5s, although not yet listed on the Intel Price list, are already available to select customers. Ivy bridge E7s come out in weeks.AMD's Warsaw gross margins will be squeezed.
Let us assume you are right about power efficient micro-servers. Here is the problem:
--Haswell ultra low power Xeon E3s come out in weeks
--Avoton is likely released by September.
What do you think this will do to Seattle and Berlin CPU gross margins? Keeping in mind how much Global Foundries rips AMD off in fabricating? Seattle CPU gross margins will be reduced by the fact that they cannot process native X86 code.
This said, Seattle and Berlin CPUs will do okay in the 2nd half of 2014 (doubt AMD will be able to sell them in mass before the 2nd half of 2014). The real Seattle killers will be ultra low power 14 nm broadwell E3 Xeons and airmont--avoton's 14 nm successor. Both should be available in Q4, 2014.
In 2015, Intel will release microservers powered by 14 nm multisocket Xeon Phis. Not PCI express Xeon Phis, but microservers consisting entirely of Xeon Phis that are better at processing X86 code then any ARM Holding derivatives.
The medium term Intel roadmap for new server chips, and server noncore functionality will be extremely difficult for AMD to compete against.
JohanAnandtech - Thursday, June 20, 2013 - link
Seattle is not build by Global Foundries and I can imagine that the R&D on Seattle is quite cheap. After all, it is not like AMD had to reinvent the A57 core. And a lot can happen in 3 quarters, so I don't think Seattle should be compared to Airmont.bji - Tuesday, June 18, 2013 - link
I think AMD still has a chance. Intel's most recent releases have shown that the per-core performance increases are asymptotically reaching the fastest x86 is going to get. Haswell was insignificantly faster than Ivy Bridge, which itself was only marginally faster than Sandy Bridge. It's clear that the year-on-year speed increases are quickly going to approach zero.In this environment AMD has some room to catch up. If they can hold on for a few more years it should give them some time to improve until their IPC is as good as or so close to as good as Intel's as to not really matter.
Depending on how you look at it, it either sucks or is great that x86 is approaching its performance limit. Most would say it sucks because they always want something faster (even if 99.9% of the time they don't actually need it!). I'm actually kind of happy that things are slowing down considerably. Rather than focusing on performance for performance's sake, perhaps we can start focusing on the entire computing infrastructure and make it work better for people instead of just faster. Lower-power devices, more ubiquitous devices, more useful, more user-friendly ... these are all great trends, and it's where the future competition in the computing world will predominantly be.
It reminds me of gaming consoles and their effect on games. I know a lot of people dislike the fact that consoles set a performance standard that it was no longer beneficial for most games to go beyond, leading to a sort of performance stagnation. On the other hand, companies could spend more resources on making better games and less on ever-more-detailed eye candy. I actually appreciate that, because I think gameplay is more important than graphics.
Anyway, I don't think you can start sounding AMD's death knell yet. They're going to reach near performance parity with Intel in a few years just because they have more headroom for performance improvement than Intel as we approach the limit. At that point, x86 won't really be getting significantly faster year by year but it sure will get cheaper. I am sure we can all appreciate at least that aspect of it.
inighthawki - Tuesday, June 18, 2013 - link
x86 is far from hitting its performance limit. The reason Intel's CPUs haven't drastically improved performance in the last few generations is that their efforts have been focused elsewhere, most notably the integrated graphics chip, which has made significant strides even since Sandy Bridge and consumes a large portion of the die. Haswell put most of its focus on Iris and power consumption. Ivy Bridge was already fast enough for 99.99% of the userbase, so it makes strong business sense to focus efforts elsewhere and spend the transistors on more important matters.
If Intel wanted to increase CPU performance, they could. But they have no reason to bother doing so. Other features are more important, and their competition (AMD) is still a generation or two behind in performance. There's not a whole lot of pressure for faster CPUs.
1008anan - Tuesday, June 18, 2013 - link
inighthawki is right. In addition to what you said, Intel has been focused on reducing the TDP envelope of its CPU cores rather than increasing their single-threaded performance. Since Intel uses the same architecture for everything from E7 Xeons to ultra-low-power Haswell SoCs that power tablets (two 256-bit SIMD units with FMA), the single-threaded performance of E7 Xeons is constrained by the need to fit Haswell and Broadwell into tablets.
However, this will not last. Intel now has three x86 architectures:
--main Core architecture
--Avoton and Avoton successors
--Xeon Phi (including multisocketed E5 and E7 versions)
Intel will be able to focus on increasing the main core's SIMD width from 512 to 1024 bits in future micro-architectures, as well as improving macro fusion, branch predictors, and buffer sizes (larger unified reservation stations, instruction decode queues, and reorder buffers), increasing the number of parallel execution ports, and growing the L1, L2, and L3 caches.
The reason is that Avoton's successors will focus on low-TDP parts over time, freeing Intel to prioritize single-threaded performance on the main core. Similarly, Xeon Phi's focus on multi-threaded performance allows the main core to concentrate on single-threaded performance.
In any case, I would argue that single-threaded performance has improved more since Conroe than you imply. Anand Shimpi explains why better than I could here:
http://www.anandtech.com/show/6355/intels-haswell-...
bji - Tuesday, June 18, 2013 - link
You refuted your own point at the end when you said "There's not a whole lot of pressure for faster CPUs". This is exactly why x86 is asymptotically approaching its IPC limit. There simply are not enough dollars chasing increased x86 performance to justify the huge costs involved in advancing it significantly. With mobile markets eating away at desktop and laptop market share, the number of tasks that require ever-increasing CPU performance dwindling, and the cost of improving x86 performance rising, there just isn't going to be enough money to justify the billions it takes to make significant headway.
inighthawki - Tuesday, June 18, 2013 - link
I apologize if I misinterpreted your original point, but I was speaking strictly from a technological perspective. There are plenty of ways to improve performance, just very little will to do so. You are correct on that one.
However, I'm not sure I agree that the reason is solely the need to justify R&D costs. I think it's more simply that performance is high enough that they have the opportunity to spend their time on more important things. Take away the effort they've put in since Sandy Bridge on improving integrated graphics performance and lowering power consumption, and I bet they could've easily ended up with a chip at LEAST 50% faster than it is now. It's just that nobody really wants or needs that.
bji - Wednesday, June 19, 2013 - link
I think we are in agreement. And I must say, things sure are different than they were 15 years ago when the gigahertz wars were heating up. Back then there was almost nothing you could do with a computer that couldn't benefit from more speed. Now ... well, it's easy to go 4 or 5 years without bothering to upgrade, and the year-on-year speed bumps are barely noticeable.
andrewaggb - Wednesday, June 19, 2013 - link
I like AMD. But they are in a tough spot, and I don't see them getting out of it with any of their roadmaps. The only interesting product is their ARM server chip. Everything else appears to be more of the same with slight improvements. That's not going to change the market dynamics. Intel squeezed up to 50% better idle battery life (and 30-100% faster graphics) out of Haswell, is targeting 2x CPU performance for the next Atom, and has its new Xeon Phi powering the fastest supercomputer (even if it's slightly less efficient than some of the other machines). They are setting aggressive targets and meeting them.
AMD needs to hit targets that are at least 30-100% better than current offerings at something that matters. Be it IPC, power consumption, graphics, maximum clocks, core counts... but something.
They should just sell the PS4 or Xbox One chips: 8 cores and GDDR5 or eDRAM. That would probably give a 100% increase in APU graphics performance and double the core count, which would give some CPU tasks a large boost.
silverblue - Tuesday, June 18, 2013 - link
I'm not sure I agree. Even with Steamroller fixing multithreading, that in itself won't result in a big gain. If Cinebench is anything to go by, FX already scales decently enough there, so there's not a huge amount of performance left on the table without a decent single-core IPC gain.
Each BD core only contains 2 ALUs and 2 AGUs. AMD argues that dropping the third of each was a necessary evil; however, whereas Stars couldn't really utilise them, Core certainly can. Additionally, FlexFPU isn't helping - theoretically, AMD shouldn't lag behind in AVX, but they do significantly, so they'd need to boost the FPU to 256 bits per pipe to gain parity... but they're not doing this with Steamroller as far as I've heard (in fact, the only thing they're doing is removing an MMX unit). FX's memory controller doesn't appear to be that good either, and the significant power savings from high-density libraries don't come until Excavator (though Steamroller does get the dynamic L2, which is neat).
Not having an FX or Opteron Steamroller 8-core (or higher) is a bit daft regardless of the above, unless it's not a big enough jump, OR Excavator has been pushed forwards.
jjj - Tuesday, June 18, 2013 - link
A bit alarming that there are no 12-16 core Steamroller parts on 28 or 20nm on the roadmap - and what does that tell us about desktop (they don't really have a public roadmap for 2014 desktop)?
I do wonder if Warsaw is one huge die, or if they are still going with 2 dies of 8 cores each.
On the ARM side it will be interesting to see power consumption on 28nm; at least this way they'll have the product sooner. I do wish the DIY players and the SoC makers would set some standards for ARM desktops. At the very least there is an HTPC and file server market that can expand to desktop.
SilentSin - Tuesday, June 18, 2013 - link
I wonder if AMD is taking an Intel-like release approach to this, with the high core-count EP server CPUs coming out a year after the smaller ones, plus some tweaks. Steamroller is set to be released for desktop in Kaveri form this year - same die as Berlin, I am guessing. But you are right that I don't see a Steamroller FX chip or a server variant with 4m/8c, which is puzzling. I hope they aren't going to rely on fused-off Kaveri 2m/4c CPU-only parts for clients, as that simply isn't enough horsepower. The BD architecture is actually extremely competitive with anything Intel has when used in a proper Linux environment.
Hopefully these products come out on time; AMD really needs timely execution to have any hope here. Other 64-bit ARM solutions are just around the corner, and AMD will be left in the dust if they have constant delays.
Gigaplex - Tuesday, June 18, 2013 - link
"The BD architecture is actually extremely competitive with anything Intel has when used in a proper Linux environment."Only for integer based workloads. It still struggles with floating point.
TiredOldFart2 - Tuesday, June 18, 2013 - link
Integer workloads are most of the workload for now. FP-intensive work will naturally migrate onto OpenCL over time, as it is more efficient to do so.
sanaris - Monday, February 17, 2014 - link
You are a newbie in the server market. Server folks will never go to a closed language, because they value their development time.
And of course server processors need as much floating point as possible in the 64/128-bit range.
The funny "FP32 performance" numbers which the OpenCL and CUDA creators show us are irrelevant, because you need as many significant bits as possible - 32 minus the exponent minus reserve - so the conclusion is that FP32 is irrelevant.
jjj - Tuesday, June 18, 2013 - link
Intel got greedy and lazy. AMD had an opportunity in the $200+ desktop with many cores (preferably on 20nm, so they could fit 16 cores in a reasonable die size). AMD also has an opportunity to make an APU similar in performance to the Xbox One, for a nice budget gaming box capable of playing console ports for quite a few years - and more when the APU gets CrossFired with a discrete card.
We'll see how big a GPU Kaveri has; it's debatable whether it's worth using a big enough GPU on 28nm.
In ARM consumer, both Intel and M$ are somewhat alienating customers, so the ARM players should push on all fronts.
Sadly it seems AMD doesn't have the resources to be aggressive and fight too many battles.
sanaris - Monday, February 17, 2014 - link
AMD has had opportunities everywhere since the days of the Opteron 180 and the overall transition to the many-core era. But they threw away their chance with the Opteron decision to drop floating point. And of course AMD needs a manual cache-control instruction set.
JDG1980 - Tuesday, June 18, 2013 - link
I don't think we can necessarily infer desktop strategy from the server roadmap. AMD already has a really, really small market share (something like 4%) in big x86 server parts. Their market share on the desktop is nowhere near that bad. Sure, they're going to release the APUs first for Steamroller, since that's a mainstream product and has a bigger market than the enthusiasts. But once they get that done, they're going to need something to keep themselves in the news until the next architecture, and putting together an FX chip wouldn't be too much of a stretch. I do think Socket AM3+ is dead; the FX parts will most likely be on the same socket as the APUs from now on. This will minimize engineering costs.
zepi - Tuesday, June 18, 2013 - link
The problem is that AMD's big cores are so bad compared to Intel's big cores that it makes next to no sense to compete against Intel with them. Perf/W is worse and manufacturing costs are far too high. To sell chips made out of these cores, they need to cut their prices so low that they can't make any profit to pay for R&D or anything else. This applies to both the server and desktop markets. Sure, you can sell expensive-to-manufacture multi-module Phenoms to cheap-ass people who want the best multicore performance per dollar, but what's the benefit when you can't make any money doing so?
AMD is in dire need of a complete big-core CPU architecture renewal, and with their R&D resources that probably isn't going to happen any time soon. Unless they can pull some kind of magical bunny rabbit from their hat, I don't see them being competitive in big cores ever again.
They are shifting their target to those markets where they hope they can still compete.
JDG1980 - Tuesday, June 18, 2013 - link
The big question right now is whether Steamroller can fix the problems with AMD's construction equipment architecture. Official estimates are quite bullish, promising 15%-30% gains on a clock-for-clock basis. No doubt these are overly generous estimates and I take them with a grain of salt, but if AMD can increase actual IPC by 10% or more with Steamroller (rather than just cranking the clock speed higher), then there may be hope for the construction equipment cores. If not, then AMD's best bet is ditching that line altogether and scaling up Jaguar or its successor so it's reasonably competitive on the desktop. The good thing is that Jaguar is already optimized for low power (which is where the Bulldozer lineage really falls short) and its IPC is pretty good. And they've already got some nice design wins with the PS4 and Xbox One, which demonstrates that these cores are suitable for gaming (an "enthusiast" use). Perhaps they could backport some of the features that Bulldozer and its successors actually got right, like the improved branch predictor. (Or did they already do that with Jaguar?) After all, this is basically what Intel did when they dropped NetBurst in favor of a revised version of the P6 architecture.
JPForums - Tuesday, June 18, 2013 - link
@zepi "Sure, you can sell expensive to manufacture multi-module phenoms for cheap-ass people who want best multicore performance per dollar, but what's the benefit when you can't make any money doing so?"Even if you can't make any money, if you can break even it is still useful for keeping your employees employed. While a business doesn't have an inherent need to employ someone for the sake of employing them, in this case, it is useful for maintaining your talent pool. For a company like AMD, the engineers' work cycles are most likely punctuated with periods of high demand and low demand. When you have fewer product lines, this means their could be periods of time where they have no work to do at all while waiting for work to be completed farther up the line. Even a small loss is better that paying a chunk of employees to do nothing while waiting for the next thing to come down the pipe. Having more product lines allows you to even out such lulls by staggering releases and thus filling in the gaps from one product line with work from another.
As an example, if the employees responsible for the low power line's layout only worked the low power line, many of them would have been left with nothing to do as they were waiting for the Jaguar architecture to be developed and simulated. Tweaks to the bobcat layout and preparations for the next node change would have kept some of them busy, but it is quite likely that many found work in the mainstream or even FX lines in the interim.
TiredOldFart2 - Tuesday, June 18, 2013 - link
If you take a look at AMD's assets, their portfolio and their current situation, it's not hard to see where they are headed. Money is usually made in the middle of the road. By this I mean that most sales of enterprise-class server CPUs in this economic scenario will target a balance of sufficient computing power, price and power consumption.
What would you, as a business owner, opt for in your average VM server for your average medium-business needs: the $600, 95W E5-2630 or the $290, 115W Opteron 6320? I won't even discuss the different standards both companies use for TDP ratings; it's a matter of cold hard cash.
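A quick back-of-the-envelope sketch of that "cold hard cash" argument (the prices and TDPs are the figures quoted above; the core counts - 6 for the E5-2630, 8 for the Opteron 6320 - are my assumption):

parts = {
    "Xeon E5-2630": {"price_usd": 600, "tdp_w": 95, "cores": 6},
    "Opteron 6320": {"price_usd": 290, "tdp_w": 115, "cores": 8},
}
for name, p in parts.items():
    # dollars and watts per core: the metrics a budget-minded buyer cares about
    print(f"{name}: ${p['price_usd'] / p['cores']:.0f}/core, "
          f"{p['tdp_w'] / p['cores']:.1f} W/core")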
AMD will sell cheap, will move faster while listening to clients, will take more risks on niche markets, and will leverage their GPU technologies in the server market to make up for their less-than-stellar FPU performance.
How big is the HPC market compared to the SME one?
JPForums - Tuesday, June 18, 2013 - link
I'm not sure this is the correct time, but I do think that eventually we will see a merger, or at least closer alignment, of the FX line and the A-series products. Consider that since before the Bulldozer architecture was conceptualized, AMD had been looking to fuse the CPU and GPU into one chip. They wanted to allow people to program the "GPU" portions of the processor as easily as the "CPU" portions, and even within the same code blocks. They've steadily (if slowly) progressed towards this goal since then, culminating in their current HSA and hUMA technologies. Looked at from this perspective, the subpar floating point performance of Bulldozer and its derivatives makes sense. If you have a set of "GPU" cores or "stream processors" available to handle floating point operations, then it seems less necessary to include them in the CPU cores.
Unfortunately, this merger is taking longer than AMD initially expected. Even if AMD's intention was to leverage discrete GPUs in the meantime to cover the floating point gap, software hasn't yet progressed to the point of making that happen. For the moment, a GPU-less part is necessary to serve higher-performance sectors. Eventually, though, I do expect to see GPU-ish elements in their high-end parts to handle parallel operations and possibly augment the floating point characteristics of the processors. At that point, the transistors dedicated to the "GPU" portion will no longer be useless die space in regards to CPU performance. Such processors would have a much easier time with voice recognition, facial recognition, pattern recognition, neural algorithms (A.I. learning), etc.
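As a purely illustrative sketch of what pushing floating-point-heavy work onto the GPU looks like today (assuming the pyopencl bindings and a generic OpenCL device; this is not AMD's HSA toolchain or any specific API from the article):

import numpy as np
import pyopencl as cl

a = np.random.rand(1_000_000).astype(np.float32)
b = np.random.rand(1_000_000).astype(np.float32)

ctx = cl.create_some_context()      # picks an available GPU (or CPU) device
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

# The FP-heavy kernel runs on the GPU's stream processors instead of the CPU's FPU.
prg = cl.Program(ctx, """
__kernel void mul_add(__global const float *a,
                      __global const float *b,
                      __global float *out)
{
    int i = get_global_id(0);
    out[i] = a[i] * b[i] + a[i];
}
""").build()

prg.mul_add(queue, a.shape, None, a_buf, b_buf, out_buf)
result = np.empty_like(a)
cl.enqueue_copy(queue, result, out_buf)   # explicit copy back to host memory

The last line is the point: today the data still has to be staged into GPU buffers and copied back, which is exactly the overhead AMD's shared-memory (hUMA) approach aims to eliminate.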
JDG1980 - Tuesday, June 18, 2013 - link
AMD can't expect third-party code to be rewritten to accommodate their processors. If they can leverage the GPU for floating point, then fine, but it has to work seamlessly with existing CPU opcodes. In other words, the APU has to *internally* see that a stream of (say) SSE2 floating point instructions is coming and hand that off to the GPU portion, without requiring anything to be recoded.
AMD doesn't have the market share to tell software vendors to do things their way.
Shadowmaster625 - Tuesday, June 18, 2013 - link
That 2013 picture is some scary schiznit! Last time I went to a concert that was what it was like too. Those screens are right out of some science fiction horror novel. It is amazing what people cannot see, even when it is so plainly obvious.
bji - Tuesday, June 18, 2013 - link
Yeah, people are more focused on burying their noses in their phones and capturing the moment than actually living the moment. I don't have a smartphone, and I notice that I pay a lot more attention to what I'm actually doing than most people most of the time. I don't know why people think it's necessary to make a crappy smartphone recording of an event when you can almost certainly buy a professionally made recording of almost any important event after the fact for a few bucks.
silverblue - Tuesday, June 18, 2013 - link
"Andrew Feldman told us that Berlin will offer at least twice CPU processing performance than the Opteron X-series."I'd damn well hope it was a lot more than this. If it's clocked at twice the speed then Berlin will be forgettable, however if the comparison is with Berlin clocked at, say, 3GHz, that's not so bad.
All non-BD AMD architectures seem to scale very well with additional cores, and this is the main area that SR looks to improve upon.
nismotigerwvu - Tuesday, June 18, 2013 - link
Small typo on page 4, in the very first sentence: "The current Opteron 4310 EE (2 modules, 4 cores at 2.2-3 GHz, 40W TDP) and Opteron 4376 HE (4 modules, 89 cores at 2.6-3.6 GHz, 65W TDP) are about the best AMD can deliver for low power servers that need some more processing power." Unless I'm mistaken (though an 89-core chip would be pretty sweet, especially at just 65 watts), that should read 8 cores. Otherwise, great read Johan.
JohanAnandtech - Tuesday, June 18, 2013 - link
thx. fixed :-)
Denithor - Tuesday, June 18, 2013 - link
"It is clear that the micro server market gets the lionshare of AMD’s attention."lion's share
"While the Opteron-X, Opteron 6300 and “Berlin” CPU will all face stiff completion from the Intel alternatives."
competition
"but it looks like Intel will probably have the upperhand in most traditional server markets."
upper hand
Wow, did you guys not edit this piece before going live?
JohanAnandtech - Tuesday, June 18, 2013 - link
To be honest, I wrongly assumed that I still had some time left, then discovered that the deadline had already passed a few hours earlier and just hurried. So I humbly thank you for making this article more readable for our readers :-)
JDG1980 - Tuesday, June 18, 2013 - link
I'm hoping we will see some relatively inexpensive Mini-ITX Kyoto boards. An Opteron-X paired with ECC RAM could make a good, reliable platform for a DIY firewall or low-end NAS.
iwod - Tuesday, June 18, 2013 - link
I didn't know the 9W for the X1150 was for a 1GHz core. If so, I guess Intel will have zero competition in Atom servers. Intel will want to grab as much market share as possible between now and the end of 2014, when ARM will have a low-power server product. Seattle seems interesting; I wonder if it is possible to have 4x 16-core Seattle chips in a server. It seems like a good candidate for a hosting environment.
SuperMecha - Tuesday, June 18, 2013 - link
Remember that the X1150 has 4 cores vs the 2 for the Atom chip. It also has much better IPC and support for ECC and 32GB (DDR3-1600) of RAM vs only 8GB (DDR3-1333) for the S1260.JDG1980 - Tuesday, June 18, 2013 - link
The Centerton (S1200-series) Atoms already have ECC support. But the performance is still subpar, so AMD will have an advantage there. Also, as far as I have been able to determine, there is only one Centerton board available to DIYers (the Supermicro X9SBAA-F), and it's hard to find and quite pricey (>$250). If AMD could get Mini-ITX Kyoto boards out at the $150-$200 price point, this would be a quite attractive option for small servers.
iwod - Tuesday, June 18, 2013 - link
So how would the next-gen Intel architecture compare to Jaguar? Since Intel has (I think) Hyper-Threading in their next gen as well, a dual core isn't so far off from a quad core, and it seems Intel can push the GHz a bit higher than the X1150.
I would love to see some comparisons done.
milli - Tuesday, June 18, 2013 - link
"Berlin will use the same 28 nm process technology as the Opteron x1150 and x2150"Jaguar is made at TSMC.
Rumors about Steamroller-derived processors point to production at GlobalFoundries. Most go as far as claiming that it will be FD-SOI (see STMicroelectronics). That makes sense, because 28nm bulk would be a step back compared to 32nm SOI. I don't think AMD is going to stop using SOI for their high-end CPUs anytime soon.
Either way, be sure that they won't be using the same process technology (even though they're both 28nm) for these very different CPUs.
rocketbuddha - Thursday, June 20, 2013 - link
Please... AMD specifically addressed this in the December 2012 WSA amendment with GlobalFoundries. It is on page 6 of the 11-page PDF:
<quote>
Separately, AMD will move to standard 28nm process technology and significantly reduce reimbursements to GF for future research and development costs.
–
We anticipate these savings will be approximately ~$20M per quarter during the next several years which also helps achieve our OPEX target of $450M by Q3 2013
<end quote>
Read it again and again. The standard technology at 28nm is bulk HKMG, not SOI.
Even GF's PowerPoint roadmap states that the 20nm and 14XM processes will both be successors to 28nm bulk HKMG - bulk HKMG and bulk with FinFETs, respectively. The SOI line stops at 28nm FD-SOI (in collaboration with STM), with just an ambiguous dotted line for the future.
As for FD-SOI, we still do not know whether GF has agreed to manufacture FD-SOI parts in large volume. All we know of is the licensing agreement between STMicro and GF.
1008anan - Tuesday, June 18, 2013 - link
Do you really think Seattle will hit the market in Q1 2014? Count me skeptical. Seattle will have lower single-threaded performance than Avoton. Plus, Seattle can't process native x86 code the way Avoton can.
Wilco1 - Tuesday, June 18, 2013 - link
Seattle will sample in Q1 and be available in Q3, according to other articles.
Note that A57 is much faster than Avoton; even A15 has better IPC. Given Seattle packs 4 times as many cores, we're talking about 5-6 times the throughput of Avoton.
milli - Tuesday, June 18, 2013 - link
Wilco1, it's beautiful how you can draw definite conclusions based on nothing.
All we have now are estimates. But even if you take these estimates into consideration, things won't be as rosy as you describe them.
- A57 will be around 20-30% faster than A15 on the same process. Numbers from ARM themselves. In one of Anand's reviews, AMD's Bobcat core was faster than an A15.
- Intel's press release on Silvermont: "Silvermont microarchitecture delivers ~3x more peak performance or the same performance at ~5x lower power over current-generation Intel Atom processor core"
Considering that the newest Atom SoCs aren't that much slower than AMD's Bobcat core, a sane person wouldn't have said what you just said.
Wilco1 - Tuesday, June 18, 2013 - link
Let's start with the facts, shall we? We already know A15 beats current Atoms by a huge margin on native code:
http://browser.primatelabs.com/geekbench2/compare/...
Let's assume Silvermont will do 3 times the score, so ~4300 (I don't believe it will but that's another discussion). A57 has 20-30% better IPC as you say, and clocks up to 2.5GHz, so would score ~7300. That means with 4x the cores Seattle would have 6.8 times the throughput. Round it down to 5-6 times because we're talking about estimates. Any sane person would agree with my calculation. Avoton has no chance to compete on throughput, not even with 8 cores. Period.
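For what it's worth, here is that back-of-the-envelope estimate written out; every input is an assumption from this thread, not a measurement:

silvermont_chip_score = 4300   # assumed ~3x the current Atom Geekbench result
a57_chip_score = 7300          # assumed A15 score plus 20-30% IPC at ~2.5 GHz
seattle_core_ratio = 4         # Seattle assumed to pack 4x as many cores

throughput_vs_avoton = (a57_chip_score / silvermont_chip_score) * seattle_core_ratio
print(f"~{throughput_vs_avoton:.1f}x")   # ~6.8x, rounded down to "5-6x" above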
Btw A15 easily beats Bobcat and trades blows with Jaguar (beats it on overall score, wipes it out on FP but is slower on int and memory - remember this is a phone SoC!): http://browser.primatelabs.com/geekbench2/compare/...
milli - Tuesday, June 18, 2013 - link
You're comparing a dual-core Atom to a quad-core A15!
http://browser.primatelabs.com/geekbench2/compare/...
If it wasn't for the Neon unit kicking ass in some tests, the scores would be pretty close. Since Silvermont will support SSE4.1 & SSE4.2, it will become stronger in that department.
The first Seattle will be an 8-core part. The 16-core one will follow later, but AMD has not been precise about the release date.
Geekbench is a very low-level benchmark that runs mostly in the processor cache. Many things that make or break a processor are not tested this way. It's not a true indicator of real-world performance.
http://images.anandtech.com/graphs/graph6877/53966...
Here you have an example where a 1.6GHz dual-core Bobcat beats a 1.7GHz dual-core A15 by almost 20% in a real-world, CPU-intensive test. Just to show that Geekbench is very low-level, and things like prefetchers, branch predictors, ... are not tested very well.
But as I said, it's too soon to speculate. That's all I'm saying.
Wilco1 - Wednesday, June 19, 2013 - link
Atom supports Hyper-Threading, which gives a big speedup, so it is reasonable to compare it with a quad core - note most phones are quad-core anyway. But as you show, even a dual A15 beats a dual Atom by almost a factor of 2 despite running at a lower frequency. I don't know whether NEON is used at all, but I'd be surprised if it was.
Geekbench is not perfect, but it correlates well with other native benchmarks such as SPEC and Phoronix. Your physics example is comparing hand-optimized drivers for different GPUs. The A15 is trivially beaten by the Note 2 (quad A9) despite having ~3 times the NEON FP performance. So I'm not sure one can conclude anything from that beyond that AMD's drivers appear to be well optimized.
milli - Wednesday, June 19, 2013 - link
3DMark physics runs completely on the CPU; the GPU is not used at all. The fact that the Note 2 beats the dual A15 shows once more that low-level benchmarks like GB and SPEC are almost meaningless between architectures.
Also, I don't agree that you should compare a dual-core HT Atom to a quad core. It's still a dual core. That A15 beats a five-year-old Atom core by a factor of 2 is no feat. It's the minimum.
Yes, Geekbench is compiled to use ARM's Neon.
Wilco1 - Wednesday, June 19, 2013 - link
Yes, I know the physics test runs on the CPU. Without access to the source code, it would be hard to say what causes it. If it doesn't use NEON and has a high correlation with #cores * MHz, then that might explain it. Btw, where did you get that Geekbench uses NEON? Which benchmarks are actually vectorized?
A dual Atom has 4 threads. A 2-module Bulldozer has 4 threads. A quad core has 4 threads. These are ways of supporting 4 threads with different hardware tradeoffs. However, from the software perspective they look and behave in the same way. So it's entirely reasonable to compare thread for thread.
Btw if you think the dual A15 vs Atom speedup of ~2x is low, you'll be disappointed with Silvermont. Since the claimed 2.8x speedup is for a quad core, a dual Silvermont will only be ~1.4 times faster...
rocketbuddha - Thursday, June 20, 2013 - link
The AMD APU is running the full version of Windows; the A15 SoCs are running the mobile version of Android.
Not at all equivalent.
1008anan - Tuesday, June 18, 2013 - link
Wilco, Avoton's 14 nm shrink (Airmont) is likely to be released in Q4 2014. Seattle will be available in Q3 2014. Seattle should really be compared to Airmont.
Avoton has better branch predictors and macro fusion than A57. Some Avoton SoCs will pack 20 Silvermont cores.
"Given Seattle packs 4 times as many cores, we're talking about 5-6 times the throughput of Avoton." :LOL:
Silvermont and Airmont cores are more power efficient than A57 cores.
Wilco1 - Tuesday, June 18, 2013 - link
Didn't you read the news that Intel appears to be delaying 14nm introduction to 2015? And where did you see that Avoton has 20 cores? At best they might have 8.
Nobody has compared A57 and Avoton branch predictors yet, so the jury is out on that. Same for power efficiency.
1008anan - Tuesday, June 18, 2013 - link
A friend, Ashraf, wrote an article about the 14 nm "delay" not being that big of a delay:
http://seekingalpha.com/article/1503982-intel-s-14...
Avoton SoC packages might include 20 cores. More than one die per package.
JDG1980 - Tuesday, June 18, 2013 - link
I know I love to get my tech news from *investment* websites. (/sarcasm)
Wilco1 - Tuesday, June 18, 2013 - link
A delay is a delay. And a link to a tech site which mentions that 20-core Avoton?
iwod - Wednesday, June 19, 2013 - link
14nm is only being delayed on the desktop, where there is NO competition. It is on schedule for Atom and mobile CPUs.
LordanSS - Tuesday, June 18, 2013 - link
I believe the road AMD has chosen has a good future for them... the only problem is that, currently, that road is very rocky and difficult to drive on.
Once they manage to fulfill their "Fusion" plans, we won't need to be bothered by the anemic FPU units paired with each Piledriver/Steamroller/whatever module, mostly because that computation should be done on the GPU (which, by the time Fusion is complete, would be an integral part of the processor itself).
Unfortunately we're still a few years away from that... and I hope they (greatly) improve integer performance on their cores in the meantime. But I do believe it's going to happen, and it'll be great.
Think of the old "math co-processors" of the past, back in the 386 days (they got integrated in the 486 models). The only difference now, though, is that a GCN IGP of today has an order of magnitude more compute/FPU power than the Intel parts. Fast-forward to the future, and imagine that improved even more and fused into the CPU....
RandSec - Tuesday, June 18, 2013 - link
"AMD can't expect third-party code to be rewritten to accomodate their processors." AMD doesn't have the market share to tell software vendors to do things their way."Not true. AMD has *more* than mere market share, it has *complete* *market* *dominance* in the new gaming consoles. *All* of gaming graphics software expertise is now focused on leveraging CPU / GPU tradeoffs in the AMD APU design. If advantages really exist, they will be found and exposed, and new software will go that way as well, if only for AMD users. For example, it would seem that the GPU has plenty of fast FP, which need not also be in the CPU, *provided* it can be accessed easily without copying back and forth.
ruiner5000 - Tuesday, June 18, 2013 - link
What is this second great depression we are talking about? I think that is a myth. I don't see any dust bowls or soup lines. Certainly not here in Austin, where AMD is. What, you had to wait an extra week to pick up your iPhone 5 and were depressed, and that's the 2nd great depression? Have you ever talked to anyone who was actually alive during the Great Depression? Wimps.
SunLord - Tuesday, June 18, 2013 - link
We're not going to see any major x86 architecture changes from AMD till at least 2015, if not 2016; that's about how long it takes to design and deploy a new architecture, which is hopefully what the major rehires from last year will be working on. So we're stuck with mostly minor tweaks and enhancements till then.
TiredOldFart2 - Tuesday, June 18, 2013 - link
Forget low-power ARM servers, give me a hybrid. Give me a box with x86 and ARM CPUs that allocates resources based on the usage scenario.
Give this to the market cheap and with low power usage, and carve your way back into relevance.
name99 - Tuesday, June 18, 2013 - link
"AMD says that using the graphics core for the heavy scalar floating point will get as easy as C++ programming and as a result, Berlin should make a few heads turn in the HPC world. It even looks like SSE-x will get less and less important over time in that market. "Ahh, yes, the old "New compilers will make our weird CPU architecture invisible to the programmer" gambit. How's that worked out in the past, guys?
Trimedia? Cell? iTanium?
But there's sucker born every minute. Good luck to anyone foolish enough to invest today on the assumption that this magical compiler will be available tomorrow.
[I'm not claiming this breakthrough --- compiler-transparent GPGPU --- will NEVER happen. I am claiming it ain't gonna happen during the relevant lifetime of this product.]
Alberto - Wednesday, June 19, 2013 - link
This roadmap is a disaster.
No new high-margin SKUs until 2015+ (Excavator). No medium-margin SKUs to beat Intel's single-socket offerings, since the Excavator core will be born in an unknown year.
The low-margin segment is covered by a NON-x86 core... a vanilla core from ARM, not a custom one a la Qualcomm.
The process side of things is even worse. The ARM core (H2 2014, per a more accurate official AMD slide reported by X-bit) is stuck on 28nm - a funny thing considering that Qualcomm will be on 20nm in Q1 2014. AMD doesn't even have the money to work with TSMC to deliver a competitive ARM SoC!
Looking at the specs, Seattle already looks like a failure: the process does not allow eight cores with a decent TDP to match Intel's Avoton on 22nm Tri-Gate. Recent implementations of the A15 suggest the 28nm node is not even good enough for a decent low-power quad-core device... let alone an eight-core one.
Anyway, Seattle is late, i.e. in the same time frame as 14nm Airmont.
The last part of the article is stunning: "It looks like the Intel Avoton will have a very potent challenger in Q1 2014"... too bad Seattle is an H2 2014 device.
"So there is good chance that AMD will make a big comeback in 2014 in the server market"
What server market??? The microserver market??? With a NON-x86 core??? From an x86 company???
I would have said: there is a good chance that AMD will make a so-so new entry in 2014 into a 10%, low-margin niche of the server market, along with many other contenders, some of them with custom, optimized x86/ARM cores.
I love your articles, Johan, but this one seems very, very strange to me.
PCpowerman - Wednesday, June 19, 2013 - link
You guys on here sound like incompetent investors on Wall Street who know nothing about technology. Let me give you investors some advice: if you do not know the field you invest in very well, then you should refrain from commenting as if you know who will be more competitive.
When AMD goes all hUMA-aware with their new generation of APUs, then SSE instructions, AVX instructions and other such floating point instructions will be utterly destroyed by a program that takes advantage of the GCN cores on these APUs. That is an UNDENIABLE FACT!! A program written to take full advantage of the best floating point instructions that x86 has to offer will not come ANYWHERE near the same program written to take advantage of the GCN cores on the next generation APUs.
That is why the server Kaveri variant does not need to be 2P or 4P. Database programs that leverage GCN cores will outperform the floating point instructions in Intel's processors, even in 2P or 4P configs. It takes a whole lot of CPUs to equal the floating point computation power of the GCN architecture. CPUs are only great at serial code and branch prediction. We need more programmers to comment on here rather than you investor types. I feel like the only technical person on here. Geez...
Alberto - Wednesday, June 19, 2013 - link
Too bad most common server workloads are not floating-point-based, and a crude GPU with a repetitive layout is no substitute for a CPU. The bulk of the software is optimized serial code.
Kaveri can be nice in low-end HPC; still, you forget that Intel is shipping a nicely powerful integrated GPU these days, so AMD is not alone anymore in this segment.
And yes, Kaveri needs to be 2P, but it is not.
andrewaggb - Wednesday, June 19, 2013 - link
There will be certain operations that can potentially be made many times faster. But not everything. Databases are interesting, but at least in my use cases they are more limited by memory capacity and disk/storage I/O than by CPU performance.
Most of the things I code are office and management/ordering/billing systems. Nothing particularly CPU-intensive (other than video compression). Just lots of business rules and interop.
Klimax - Wednesday, June 19, 2013 - link
See Iris Pro and what it does to GCN...
Alberto - Wednesday, June 19, 2013 - link
Moreover, Intel graphics are now fully OpenGL 4 and OpenCL 1.2 capable.......
Calinou__ - Thursday, June 20, 2013 - link
...on Windows.
1008anan - Wednesday, June 19, 2013 - link
PCpowerman, please comment on Intel's Broadwell integrated graphics and the integrated graphics of the 14 nm tock (maybe Goldstone?).
Intel is closer to truly fused application processors, with many different types of cores working together (both fixed-function and general-purpose).
wumpus - Friday, July 5, 2013 - link
Wake me up when those microserver GPUs can use protected memory. As far as I know, all that memory is wide open to any process on the server. I can't imagine many uses of a microserver that could accept that (Google and other single-owner datacenters, maybe, but I tend to see these things as something you would want for VPS hosting).
GPUs appear perfect for cryptographic uses, but are completely unacceptable as long as they can't protect their own memory (just sift through the entire GPU looking for keys; you will find them quickly). I suppose there exists the odd ECC format you might want to run on your server, but that is sufficiently exotic to simply justify adding a PCIe card.