Many of the features do have a performance impact, ranging from negative over neutral to positive for a single one.
But again, I think your comparison with 'vanilla' software stacks is relevant. This is what people would see out of the box with an existing software stack. It is 101% relevant to do that comparison as this is the market that IBM is trying to break into with these servers.
But what could be fun to see was some tests where all the Bells and Whistles were utilized. As many have written here.. use of Hardware supported Decimal Floating Point. The Vector Execution unit, the ability to do hardware assisted Memory Compression etc. etc.
Thanks Jesper. Looks like I will have to spend even more time on that system :-). And indeed, out of the box performance is important if IBM ever wants to get a piece of the x86 market.
It was my understanding that the SMT mode on the power8 could be changed. Depending on the type of work this would make a giant difference, especially with mysql/mariadb that are limited to 1 process/thread per connection.
With databases the real winner would be with one that supports parallel queries, such as postgresql 9.6, db2, oracle, etc.
Also your benchmark could very easily be limiting the POWER8 if it's not opening enough connections to fill out the number of threads that thing can handle; remember MySQL/MariaDB are 1 process/thread per connection. A lot of database benchmarks default to a small number of connections, and this thing has 160 threads with the dual 10 core. I would suggest trying to run that same benchmark again but do it at the same time from multiple client machines. See if the bench takes a larger dip when a second client machine runs the same bench or if the bench shows similar figures (granted this might hit the HD IO limit on the POWER8 server).
Pretty sure most of the reason for that is due to Intel blocking every attempt Nvidia makes at getting a high bandwidth interface bolted onto a Xeon.
Given that one of the main reasons that Intel blocked Nvidia's chipset business way back in the day was to try to limit the ability of other companies bolting on high bandwidth accelerators onto Intel chips (Presumably to protect their own initiatives in that space).
Not terribly impressive. You have to get SW to play nice and spend time to fine tune it to outperform Intel and it will cost you in power and cooling. More like "yes, if you get quite a bigger TDP you get a bit more performance". And it won't be terribly good in many cases. (Like a public facing service where latency is critical)
Maybe if you are in USA and can waste admins and devs time and waste a lot on cooling and electricity then maybe. Otherwise why bother...
I don't see this as a bad result. This is a 22nm processor, over two years old, and it beats Haswell-EP (which is newer) on efficiency. Broadwell-EP is brand new, and P9 should come out well before the end of BDW-EP's lifecycle.
Some of the POWER9 chips will be out next year, though I suspect that the scale-up models may be an early 2018 part. Considering that those chips go into IBM's big iron Unix servers, they tend to launch a bit later than the low end models so it isn't game changing.
The real question is when SkyLake-EP/EX will launch and in comparison to the scale-out POWER9 chips. I was expecting a first half of 2017 for the Intel parts but I have no reference as to when to expect the POWER9 SO chips. Thus there is a chance Intel can come out first.
Intel also wants a quick transition to SkyLake-EP/EX as they unify those two lines to some extent and provide some major platform improvements. I'm thinking Broadwell-EP/EX will have a relatively short life span compared to Haswell-EP/EX. This mimics much of what happened on the desktop and the challenge to move to 14 nm.
This article neglects one important aspect to costs: per-core licensed software. Those licenses can easily be north of 10 000$ . PER CORE. For some special purpose software the license cost can be over 100 000 $ / core. Yes, per core. It sounds ridiculous, but it's true. So if your 10-core IBM system has the same performance as a 14-core Intel system, and your license cost is 10 000$ / core, well, then you just saved yourself 40 000 $ by using the IBM processor. Even with lower license fee / core, the cost advantage can be significant, easily outweighing the additional electricity bill over the lifetime of the server.
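The licence arithmetic above is easy to redo for other core counts and fees. A minimal sketch in Python, assuming both platforms are charged the same fee per core (real vendor price lists may weight cores differently per architecture); the function names are made up for illustration:

```python
def license_cost(cores: int, fee_per_core: int) -> int:
    """Total licence bill for a server licensed per core."""
    return cores * fee_per_core


def license_saving(fewer_cores: int, more_cores: int, fee_per_core: int) -> int:
    """Licence money saved by the server that needs fewer cores."""
    return license_cost(more_cores, fee_per_core) - license_cost(fewer_cores, fee_per_core)


# Figures from the comment above: 10-core IBM vs 14-core Intel, $10,000 per core.
print(license_saving(10, 14, 10_000))   # 40000
# The same gap at the $100,000 per core mentioned for special-purpose software.
print(license_saving(10, 14, 100_000))  # 400000
```

The saving only holds if the two systems really do perform the same, which is the premise stated in the comment.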
I don't know how much value could have the performed tests, because they don't reflect what happens in the real world. In the real world you don't use an old o.s. version and an old compiler for an x86/x64 platform, only because the POWER platform has problems with the newer ones. And a company which spends so much money in setting up its systems, can also spend just a fraction and buy an Intel compiler to squeeze out the maximum performance. IMO you should perform the tests with the best environment(s) which is available for a specific platform.
I missed your reaction, but we discussed this is in the first part. Using Intel's compiler is good practice in HPC, but it is not common at all in the rest of the server market. And I do not see what an Intel compiler can do when you install mysql or run java based applications. Nobody is running recompiled databases or most other server software.
Then why haven't you used the latest available distro (and compiler) for x86? It's the one which people usually use when installing a brand new system.
This seems rather disappointing, and with regards to optimized Postgres and MariaDB, I think in that case one should also build these software packages optimized for Xeon Broadwell.
@nils_ Optimized for.. simply means that the software has been officially ported to POWER, and yes that would normally include that the specific accelerators that are inside the POWER architecture now are actually used by the software, and this usually means changing the code a bit. So .. to put it in other words .. just like it is with Intel x86 Xeons.
49 Comments
Eden-K121D - Thursday, September 15, 2016 - link
Can't wait for Power9
Kevin G - Thursday, September 15, 2016 - link
Same here. I'm really curious about the differences between the four different dies IBM will be offering. Certainly the mix of two core types and IO types should fill the assorted niches found in the server market.
rahvin - Thursday, September 15, 2016 - link
I can wait, it will be a market share failure like every other power because IBM will price it out of reach of any sensible price range. Going by previous attempts it will cost anywhere from 5-10X as much as an equivalent amount of x86 processing power. Something like $10K for the processor and another $2-5K for the case, memory and motherboard, and it will be equivalent to a quad x86 Xeon server that costs $5k for the same hardware.
No one that doesn't need some special sauce it provides will buy them, particularly because you'd have to recompile all your software to use it. IBM has screwed up power so many times at this point that you'd have to be a fool to bet on it.
Eden-K121D - Friday, September 16, 2016 - link
Tell that to Google
Brutalizer - Friday, September 16, 2016 - link
Power9 will be 50% - 125% faster than power8, according to IBM.
http://www.nextplatform.com/wp-content/uploads/201...
On average it will be 75% faster.
The specjbb2013 benchmark is broken: SPEC discovered the benchmark can be vendor optimized to provide false results, so they fixed it in specjbb2015. IBM have released specjbb2015 numbers for their S812LC server achieving 44.900 for max-jops and 13.000 for critical-jops. That is almost as good as the Intel Xeon E5-2699v4 result. However, what is interesting is the critical-jops, which measures critical throughput under SLAs. IBM have said they will compete with Intel, with their power9.
(Of course, one SPARC M7 cpu achieves 120.600 max-jops and 60.300 critical-jops, that is 2.7x faster max-jops and 4.6x faster critical-jops. This is not using the built in hardware accelerators in SPARC. Next year the SPARC M8 arrives, which is 2x faster than M7. Today, Oracle have released six cpus in six years, each doubling performance (except the low cost S7, which is a crippled M7))
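The two multipliers quoted above can be checked from the scores as posted. A minimal sketch in Python, assuming the dotted figures (44.900, 120.600 and so on) use the dot as a thousands separator, so 44.900 means 44,900 max-jOPS; the function name is made up for illustration:

```python
def speedup(challenger: float, baseline: float) -> float:
    """Return how many times larger the challenger score is than the baseline."""
    return challenger / baseline


# Scores as quoted in the comment above (assumed to be thousands-separated).
s812lc = {"max_jops": 44_900, "critical_jops": 13_000}
sparc_m7 = {"max_jops": 120_600, "critical_jops": 60_300}

for metric in ("max_jops", "critical_jops"):
    ratio = speedup(sparc_m7[metric], s812lc[metric])
    print(f"{metric}: {ratio:.1f}x")
# prints "max_jops: 2.7x" and "critical_jops: 4.6x"
```

The ratios compare raw scores only; they say nothing about system price or release date, which the replies take up.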
wingar - Friday, September 16, 2016 - link
I do like how you come with a comment that's incendiary towards POWER8 and POWER9, doing what you can to make it look worse... and then start touting how magical and wonderful SPARC M7 is. Using the same old Oracle-supplied performance claims without substantiating it. Funny, that. I think it stands out a little bit...
But that's not what matters. If you run a simple google search, "site:anandtech.com brutalizer", you'll find comments with not a lot of variety. Usually commenting on anything x86 and POWER8, and in every single one (Except this one, actually! You actually reference an IBM supplied Spec result. However, you should link to it next time.) you tout the wonder of the latest SPARC of the time. Linking to Oracle-supplied benchmarks, on Oracle's own site, consistently concluding that Oracle outperforms their competitors. And every time you do this the comment seems to be as close to the top of the comment list as possible, for visibility.
Have some links.
http://www.anandtech.com/comments/10158/the-intel-...
http://www.anandtech.com/comments/9193/the-xeon-e7...
http://www.anandtech.com/comments/10230/ibm-nvidia...
http://www.anandtech.com/comments/9567/the-power-8...
http://www.anandtech.com/comments/7757/quad-ivy-br...
http://www.anandtech.com/comments/7852/intel-xeon-...
http://www.anandtech.com/comments/7285/intel-xeon-...
But I found a couple of comments you left that aren't anti-everyone-not-Oracle. Have some links.
http://www.anandtech.com/comments/7334/a-look-at-a...
http://www.anandtech.com/comments/7371/understandi...
http://www.anandtech.com/comments/5831/amd-trinity...
I'm sure there are more comments like this where you're actually adding to the conversation but those are the few I found, and they're always unrelated to CPUs and the server market. They seem to perhaps reflect your own interests? But there is one thing to point out here, and that is that the first religiously-pro-Oracle comment you made seemed to be in 2014. What happened then? Did you buy the account? Did someone start paying you? I don't know.
And hey, for fun I've actually posted this comment before to you, here's a link:
http://www.anandtech.com/comments/10435/assessing-...
Brutalizer - Friday, September 16, 2016 - link
I am not doing something to make power look worse, I put it in perspective and post other benchmark numbers from Intel and Oracle so people can compare. Yes, I am posting hard facts that can be independently verified, or are you rejecting the benchmarks I post? Why? Why do you think it is a bad thing I post benchmarks from other vendors than IBM? You don't want people to be able to build their own opinion about power by comparing with other vendors? Why not? Why is it dangerous when someone quotes benchmarks from other vendors? What's the problem with that?
If you insist, here are the SPARC M7 specjbb2015 results.
https://blogs.oracle.com/BestPerf/entry/201511_spe...
PowerOfFacts - Friday, September 16, 2016 - link
troll
Brutalizer - Friday, September 16, 2016 - link
"...Using the same old Oracle-supplied performance claims without substantiating it..."
Now this is the same old FUD from the IBM supporters. As I have explained, mathematicians can always prove their claims with links to benchmarks, white papers, research papers, or point to common comp sci knowledge, etc. So you are in deep sh-t now. I can always post links to the numbers I claim. You claim I can not, and that I spread unsubstantiated information - now you are lying about me.
Quote me on any number in any post - and I will post links to prove my numbers. If you ever find any post (you will not find any) where I make up numbers out of the blue to discredit IBM or Intel, you are correct that I post unsubstantiated claims. If you can not find any such posts by me, you are spreading FUD about me, and you lie about me. Now go ahead and quote me on any number where I make things up. I am waiting.
You are not really smart to claim a mathematician is not able to prove his figures. I am now able to prove you are a liar and FUDer.
I think it is funny how the IBM supporters always FUD and try to discredit people, instead of countering the benchmark numbers. I post benchmark numbers, and instead of trying to discuss the numbers you always attack me. That is not the scientific way, to avoid the hard facts and instead try to discredit the opponent. You should instead try to dissect my numbers and links instead of attacking me. But always, always, the IBM crowd does that "oh, he is an Oracle supporter" - so what? You are an IBM supporter! The difference is that I post numbers, and the IBM crowd attacks me instead of countering with other numbers.
If you want to disprove my claims about Sparc, post numbers that disprove my benchmarks. Do not attack me, that does not win you any discussions.
SarahKerrigan - Friday, September 16, 2016 - link
Sure, it's true that on SPECjbb2015 a T7-1 beats a low-end IBM Turismo machine, an S812LC (with an entry price under $5000 list, compared to over $30000 entry price for the T7-1), by a factor of 2.7x on max-jops. It's also true that M7 came out almost a year and a half after P8 did, and that you can get a dual-CPU P8 server with that same processor, and 256GB RAM, for well under half of the list price of a single-CPU T7-1 with 128GB.
Starting to see why IBM has over 70% of the non-x86 server market?
PowerOfFacts - Friday, September 16, 2016 - link
troll
BOMBOVA - Friday, October 7, 2016 - link
Rich info, good scout
PowerOfFacts - Friday, September 16, 2016 - link
Sigh ....
PowerOfFacts - Friday, September 16, 2016 - link
That's strange, this site says you can buy a POWER8 server for $4800.
https://www.ibm.com/marketplace/cloud/big-data-inf...
Screwed up Power (so many times)? Please explain? Compared to what....SPARC? Itanium? If you are talking about those platforms, POWER has 70% of that marketshare. Do you mean against "Good Enough" Intel? Absolutely Intel is the market leader but only in share as it isn't in innovation. Power still delivers enterprise features for AIX and IBM i customers with features Intel could only dream about. Where the future of the data center is going with Linux, well it did take IBM a while to figure out they couldn't do it their way. Now, they are committed 100% (from my perspective as a non-IBMer while also being committed to AIX & IBM i as there is a solid install base there) which we all see in the form of IBM & even non-IBM solutions built by OpenPOWER partners and ISV solutions using little endian Linux. Yes, there are some workloads that require extra work to optimize but for those already optimized or those which can be optimized, those customers can now buy a server for less money that has the potential to outperform Intel by up to 2X, in a system using innovative technology (CAPI & NVLink) that is more reliable. I don't know, IBM may be late and Power has some work to do but I really don't think you can back up your statement that "IBM has screwed up power so many times". Latest OpenPOWER Summit was a huge success. Here is a Google interview
https://www.youtube.com/watch?v=f0qTLlvUB-s&fe...
Oh, but you were probably just trying to be clever and take a few competitive shots.
CajunArson - Saturday, September 17, 2016 - link
Yeah, that $4800 Power server wasn't nearly equivalent to what was benchmarked in this review with the "midrange" server that costs over $11K on the same web page you cited.
I could build an 8 or 12 core Xeon that would put the hurt on that low-end Power box for less money and continue to save money during every minute of operation.
JohanAnandtech - Saturday, September 17, 2016 - link
"it will cost anywhere from 5-10X". What do you base this on? Several SKUs of IBM are in the $1500 range. "Something like $10K for the processor". This seems to be about the high-end. The E7s are in the $4.6-7k range. Even if IBM would charge $10k for the high end CPUs, it is nowhere near being 5x more expensive. Unless I am missing something, you seem to have missed that IBM has a scale out range and is offering much more affordable OpenPOWER CPUs.
jesperfrimann - Wednesday, September 21, 2016 - link
IMHO, the place where POWER servers make sense right now, is for use with IBM software. So if you are using something like DB2 or WebSphere, where the real cost is the software licenses.
Then it's really a no-brainer. Not that your local IBM sales guy will like that you'll do a switch to a Linux@Power solution :)
// Jesper
YukaKun - Thursday, September 15, 2016 - link
For the Java tests, did you change the GC collector settings? Also, why only 24GB for the JVM? I run JBoss with 32GB across our servers. I'd use more, but they still have issues with going to higher levels.
Cheers!
madwolfa - Thursday, September 15, 2016 - link
Unless working with huge datasets you want to keep your JVM heap size as reasonably low as possible... otherwise there would be a penalty on GC performance. Granted, with this sort of hardware it would be pretty minuscule, but the general rule of thumb still applies...
JohanAnandtech - Thursday, September 15, 2016 - link
No changes to the GC Collector settings. 24 GB per VM: 4x 24 GB + 4x 3 GB for the Transaction Injectors and 2 GB for the controller = +/- 110 GB memory. We wanted to run it inside 128 GB as most of our DIMMs are 16 GB at DDR4-2400/2133.
nils_ - Monday, September 26, 2016 - link
Isn't the limit slightly lower than 32 GiB? At some point the JVM switches to 64 bit pointers, which means you'll lose a lot of the available heap to larger pointers. I think you might want to lower your settings. I'm curious, what kind of GC times are you seeing with your heap size? I don't currently have access to Java running on non virtualised hardware so I would like to know if the overhead is significant (mostly running Elasticsearch here).
CajunArson - Thursday, September 15, 2016 - link
All in all the Power chip isn't terrible but the power consumption coupled with the sheer amount of tuning that is required just to get it competitive with the Xeons isn't too encouraging. You could spend far less time tuning the Xeons and still have higher performance or go ahead with tuning to get even more performance out of those Xeons.
On top of the fact that this isn't a supposedly "high end" model, the higher end power parts cost more and will burn through even more power, and that's an expense that needs to be considered for the types of real-world applications that use these servers.
dgingeri - Thursday, September 15, 2016 - link
That ad on the last page that claims lower equipment cost of course compares that to an HP DL380, the most overpriced Xeon E5 system out right now. (I know because I shopped them.) Comparing it to a comparable Dell R730 would show less expense, better support, and better expansion options.Morawka - Thursday, September 15, 2016 - link
you mean a company made a slide that uses the most extreme edge cases to make their product look good?!?! Shocking /sGondalf - Thursday, September 15, 2016 - link
Something is wrong with these power consumption data. The platform idles at 221W and under full load only 260W?? The CPU has vanished?? POWER8 at over 3GHz has an active power of only 40W??
1) the idle value is wrong or 2) the under load value is wrong. All this is not consistent with IBM's official TDP values.
IMO the energy consumption page of the article has to be rewritten.
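For anyone who wants to redo the arithmetic behind this objection, here is a minimal sketch. The 221 W and 260 W values are the ones quoted above from the article; the function name is made up for illustration, and the delta alone says nothing about how the power is split between CPU, memory and fans.

```python
def load_delta_watts(idle_w: float, load_w: float) -> float:
    """Extra platform power drawn under load compared to idle."""
    return load_w - idle_w


idle_w = 221.0  # platform idle power, as quoted in the comment above
load_w = 260.0  # platform power under full load, as quoted above

delta = load_delta_watts(idle_w, load_w)
print(f"load minus idle: {delta:.0f} W")
print(f"increase over idle: {delta / idle_w:.1%}")
```

The result is the roughly 40 W "active power" that looks too small next to the official TDP figures.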
JohanAnandtech - Thursday, September 15, 2016 - link
We have double checked those numbers. It is probably an indication that many of the power saving features do not work well under Linux right now.
BTW, just to give you an idea: running c-ray (floating point) caused the consumption to go to 361W.
Kevin G - Thursday, September 15, 2016 - link
I presume that c-ray uses the 256 bit vector unit on POWER8?
Also have you done any energy consumption testing that takes advantage of the hardware decimal unit?
mapesdhs - Thursday, September 15, 2016 - link
C-ray isn't that smart. :D It's a very simple code, brute force basically, and the smaller dataset can easily fit in a modern cache (actually the middling size test probably does too on CPUs like these). Hmm, I suppose it's possible one could optimise the compilation a bit to help, but I doubt anything except a full rewrite could make decent use of any vector tech, and I don't want to allow changes to the code, that would make comparisons to all other test results null. Compiler optimisations are ok, but not multi-pass optimisations that feed back info about the target data into the initial compile, that's cheating IMO (some people have done this to obtain what look like really silly run times, but I don't include them on my main C-ray page).
Ian.
Gondalf - Tuesday, September 20, 2016 - link
Ummm, so in short the software used doesn't stress the CPU at all, not even the hot caches near the memory banks. We need a bench with high memory utilization and a balanced mix of integer and FP, more in line with real world utilization.
I don't know if this test is enough to say POWER8 is power/perf competitive with Haswell at 22nm.
In fact POWER market share is definitely at a historic minimum and 14nm Broadwell is pretty young, so this disaster is not its fault.
jesperfrimann - Wednesday, September 21, 2016 - link
If you have an OPAL system (a bare metal system that cannot run PowerVM), then all the power saving features are off by default AFAIR.
Try to have a look at:
https://public.dhe.ibm.com/common/ssi/ecm/po/en/po...
Many of the features do have a performance impact, ranging from negative through neutral to positive for any single one.
But again, I think your comparison with 'vanilla' software stacks is relevant. This is what people would see out of the box with an existing software stack.
It is 101% relevant to do that comparison as this is the market that IBM is trying to break into with these servers.
But what would be fun to see is some tests where all the bells and whistles are utilized, as many have written here: use of hardware supported decimal floating point, the vector execution unit, the ability to do hardware assisted memory compression, etc.
// Jesper
JohanAnandtech - Sunday, September 25, 2016 - link
Thanks Jesper. Looks like I will have to spend even more time on that system :-). And indeed, out of the box performance is important if IBM ever wants to get a piece of the x86 market.
luminarian - Thursday, September 15, 2016 - link
It was my understanding that the SMT mode on the POWER8 could be changed. Depending on the type of work this would make a giant difference, especially with MySQL/MariaDB, which are limited to 1 process/thread per connection.
With databases the real winner would be one that supports parallel queries, such as PostgreSQL 9.6, DB2, Oracle, etc.
Also, your benchmark could very easily be limiting the POWER8 if it's not opening enough connections to fill out the number of threads that thing can handle; remember, MySQL/MariaDB are 1 process/thread per connection. A lot of database benchmarks default to a small number of connections, and this thing has 160 threads with the dual 10 core. I would suggest trying to run that same benchmark again, but from multiple client machines at the same time. See if the bench takes a larger dip when a second client machine runs the same bench or if it shows similar figures (granted, this might hit the HD I/O limit on the POWER8 server).
So yea, that and try SMT-2 and SMT-4 modes.
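To put rough numbers on the point about connections, here is a small sketch. It assumes the 160 thread figure comes from 2 sockets x 10 cores x SMT-8, and the function name is made up for illustration. With one process/thread per connection, a benchmark needs at least this many concurrent connections before every hardware thread can have work.

```python
def hardware_threads(sockets: int, cores_per_socket: int, smt: int) -> int:
    """Number of hardware threads the OS sees for a given SMT mode."""
    return sockets * cores_per_socket * smt


# Dual 10 core POWER8, as in the comment above; SMT-8 is inferred from the 160 figure.
for smt in (8, 4, 2):
    threads = hardware_threads(sockets=2, cores_per_socket=10, smt=smt)
    print(f"SMT-{smt}: {threads} hardware threads, "
          f"so at least {threads} connections to occupy them all")
```

This only counts threads; whether more connections actually raise throughput depends on the workload and on I/O.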
JohanAnandtech - Friday, September 16, 2016 - link
Hi, I tried SMT-4, throughput was about 25% worse: 11k instead of 14k+. 95th perc response time was better: 3.7 ms.
JohanAnandtech - Friday, September 16, 2016 - link
Updated the MySQL graphs with SMT-4 data. Our Spark tests get worse with SMT-4 and that is also true for SPECjbb.
luminarian - Friday, September 16, 2016 - link
Awesome, thanks for the response.
Meteor2 - Friday, September 16, 2016 - link
The HPC potential is awesome. You can really see why Oak Ridge chose POWER9 and Volta.
Communism - Sunday, September 18, 2016 - link
Pretty sure most of the reason for that is due to Intel blocking every attempt Nvidia makes at getting a high bandwidth interface bolted onto a Xeon.
After all, one of the main reasons Intel blocked Nvidia's chipset business way back in the day was to limit other companies' ability to bolt high bandwidth accelerators onto Intel chips (presumably to protect their own initiatives in that space).
Klimax - Saturday, September 17, 2016 - link
Not terribly impressive. You have to get SW to play nice and spend time fine tuning it to outperform Intel, and it will cost you in power and cooling. More like "yes, if you get a quite bigger TDP you get a bit more power". And it won't be terribly good in many cases. (Like a public facing service where latency is critical.)
Maybe if you are in the USA and can waste admins' and devs' time and waste a lot on cooling and electricity, then maybe. Otherwise why bother...
SarahKerrigan - Sunday, September 18, 2016 - link
I don't see this as a bad result. This is a 22nm processor, over two years old, and it beats Haswell-EP (which is newer) on efficiency. Broadwell-EP is brand new, and P9 should come out well before the end of BDW-EP's lifecycle.
Kevin G - Sunday, September 18, 2016 - link
Some of the POWER9 chips will be out next year, though I suspect that the scale-up models may be early 2018 parts. Considering that those chips go into IBM's big iron Unix servers, they tend to launch a bit later than the low end models so it isn't game changing.
The real question is when SkyLake-EP/EX will launch in comparison to the scale-out POWER9 chips. I was expecting a first half of 2017 for the Intel parts but I have no reference as to when to expect the POWER9 SO chips. Thus there is a chance Intel can come out first.
Intel also wants a quick transition to SkyLake-EP/EX as they unify those two lines to some extent and provide some major platform improvements. I'm thinking Broadwell-EP/EX will have a relatively short life span compared to Haswell-EP/EX. This mimics much of what happened on the desktop and the challenge to move to 14 nm.
loa - Monday, September 19, 2016 - link
This article neglects one important aspect to costs: per-core licensed software.
Those licenses can easily be north of 10 000$. PER CORE. For some special purpose software the license cost can be over 100 000 $ / core. Yes, per core. It sounds ridiculous, but it's true.
So if your 10-core IBM system has the same performance as a 14-core Intel system, and your license cost is 10 000$ / core, well, then you just saved yourself 40 000 $ by using the IBM processor.
Even with lower license fee / core, the cost advantage can be significant, easily outweighing the additional electricity bill over the lifetime of the server.
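The licensing argument above can be written out as a quick sketch. The core counts and the 10 000 $ fee are the ones from this comment; the function name and the extra fee levels in the loop are only illustrative, and real license terms (core factors, minimums, support fees) are not modeled.

```python
def license_savings(cores_a: int, cores_b: int, fee_per_core: float) -> float:
    """License cost saved by running on cores_b cores instead of cores_a."""
    return (cores_a - cores_b) * fee_per_core


# 14-core Intel system vs 10-core IBM system at equal performance (example above).
print(license_savings(cores_a=14, cores_b=10, fee_per_core=10_000))

# Illustrative only: how the gap scales with the per-core fee.
for fee in (1_000, 10_000, 100_000):
    saved = license_savings(14, 10, fee)
    print(f"fee {fee:>7} $/core -> {saved:>7.0f} $ saved")
```

To compare against the electricity bill, one would subtract the extra energy cost over the server's lifetime from this figure; the thread gives no numbers for that, so it is left out.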
aryonoco - Tuesday, September 20, 2016 - link
Thanks Johan for another very interesting article.
As I have said before, there is literally nothing on the web that compares with your work. You are one of a kind!
Looking forward to POWER 9. Should be very interesting.
HellStew - Tuesday, September 20, 2016 - link
Good article as usual. Thanks Johan.
I'd still love to see some VM benchmarks!
cdimauro - Wednesday, September 21, 2016 - link
I don't know how much value the performed tests have, because they don't reflect what happens in the real world. In the real world you don't use an old OS version and an old compiler for an x86/x64 platform only because the POWER platform has problems with the newer ones. And a company which spends so much money in setting up its systems can also spend just a fraction more and buy an Intel compiler to squeeze out the maximum performance.
IMO you should perform the tests with the best environment(s) available for each specific platform.
JohanAnandtech - Sunday, September 25, 2016 - link
I missed your reaction, but we discussed this in the first part. Using Intel's compiler is good practice in HPC, but it is not common at all in the rest of the server market. And I do not see what an Intel compiler can do when you install MySQL or run Java based applications. Nobody is running recompiled databases or most other server software.
cdimauro - Sunday, October 2, 2016 - link
Then why haven't you used the latest available distro (and compiler) for x86? It's the one which people usually use when installing a brand new system.
nils_ - Monday, September 26, 2016 - link
This seems rather disappointing, and with regards to optimized Postgres and MariaDB, I think in that case one should also build these software packages optimized for Xeon Broadwell.
jesperfrimann - Thursday, September 29, 2016 - link
@nils_
Optimized for... simply means that the software has been officially ported to POWER, and yes, that would normally mean that the specific accelerators inside the POWER architecture are actually used by the software, and this usually means changing the code a bit.
So .. to put it in other words .. just like it is with Intel x86 Xeons.
// Jesper
alpha754293 - Monday, October 3, 2016 - link
I look forward to your HPC benchmarks if/when they become available.