36 Comments
ImSpartacus - Tuesday, August 16, 2016 - link
This is kinda hardcore. Intel must really believe they have some special sauce. I look forward to learning more.

bartendalot - Tuesday, August 16, 2016 - link
Intel believes it is a disruptive technology that will eliminate the need for DRAM. That's some special sauce there...
Guspaz - Tuesday, August 16, 2016 - link
Huh? XPoint is still an order of magnitude slower than DRAM and has 8 orders of magnitude less endurance; nothing in here leads me to believe they intend to replace DRAM. Supplement DRAM by turning primary memory into a two-tier system, on the other hand...

Vlad_Da_Great - Tuesday, August 16, 2016 - link
@Guspaz. Endurance is irrelevant, since the devices can easily achieve five years or more under regular consumer usage. I believe the ROI will basically render DRAM obsolete on that kind of metric, and the speed deficit will be offset by the energy savings. DRAM will not be completely avoided, but heavily suppressed: the DIMM modules will hold only a small amount of DRAM, just enough for the OS to keep its essential data in.

Cogman - Tuesday, August 16, 2016 - link
DRAM is not "regular consumer usage"; data in DRAM changes extremely frequently.

The reason these things can be used for swap is that the entire point of swap is to move currently inactive data out of main memory. That is limited writing.
If you replace all DRAM with this, then writing will go berserk. Last I checked, SSDs target somewhere around 1,000-10,000 writes per sector. In regular usage that isn't too big of an issue (this is why SSD controllers work hard to move data around and spread out writes).
For a normal application, that many writes is pretty much peanuts. DDR can pump around 17 GB of data per second.
As a swap, it makes sense. As main memory? No way you would do that.
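A back-of-envelope version of that argument, with assumed round numbers (a 16 GiB module, SSD-class endurance of 10,000 cycles per cell, and one DDR channel's worth of sustained writes):

    # All figures are assumptions for illustration, not measured specs.
    capacity_bytes = 16 * 2**30      # assumed 16 GiB module
    cycles_per_cell = 10_000         # assumed SSD-class endurance
    write_rate_bps = 17 * 10**9      # roughly one DDR channel of writes

    total_writable = capacity_bytes * cycles_per_cell  # bytes before wear-out
    hours_to_wear = total_writable / write_rate_bps / 3600
    print(f"worn out in ~{hours_to_wear:.1f} hours")   # ~2.8 hours

Even with perfect wear leveling, NAND-class endurance would be exhausted in an afternoon under DRAM-class write traffic.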
Vlad_Da_Great - Tuesday, August 16, 2016 - link
@Cogman. That is perfect for main memory on mobile devices, especially notebooks. First, it will combine the main memory and SSD in one PCI-slot device, and second, it will increase battery life about 5-10x, accounting for the non-volatility and very low latency. So a notebook that can handle 9 hours of regular consumer work is destined, with those kinds of DIMMs, to hold out for two days.

The preliminary results show that current standards such as SATA and PCIe are too slow to let the true speed come to life. In exceptionally I/O-intensive environments DRAM has no match, but those cases might become limited as code and operating systems improve.
SetiroN - Wednesday, August 17, 2016 - link
Could you please stop talking about things you clearly have absolutely zero clue about?

Cogman - Wednesday, August 17, 2016 - link
@SetiroN was that directed at me? If so, what would you say I'm wrong about?

SetiroN - Wednesday, August 17, 2016 - link
Of course not. It was a reply to the clueless dude pulling imaginary computing out of his ass.

beginner99 - Wednesday, August 17, 2016 - link
I see it making sense in small mobile devices, because one solution for both storage and memory will save space, but due to price it will take years if this ever happens. However, I fully disagree about battery life. The most important factor on most mobile devices is the display. We can see that now with 14/16 nm: did battery life dramatically increase vs. the "bad" S810? Not really.

Cogman - Wednesday, August 17, 2016 - link
@beginner99, you would still want to have some main memory dedicated to currently executing applications. But I could certainly see it being useful for things like phones, since they usually run few apps concurrently (so moving non-active apps into swap makes sense).

mkozakewich - Wednesday, August 17, 2016 - link
It would act as vastly expanded memory in devices that would otherwise only have 2 GB of RAM, and it might even mean we could see devices with 1 GB of active RAM that depend on a few GB of XPoint, but we will definitely have real RAM.

Cogman - Wednesday, August 17, 2016 - link
It still wouldn't be great for main memory. Applications write very frequently. Every time they call a new method or allocate space for a variable, they are modifying memory (and very frequently the values in those variables change).

You would still want some memory for the currently running applications, but maybe you could save some space and not have so much (say, 512 MB of main memory and 4 GB of swap, for example).
lilmoe - Tuesday, August 16, 2016 - link
As I understand it, this is non-consumer tech. Not yet, at least. Consumers are well served with current-gen NVMe SSDs, probably more so than they need.

This ain't replacing DRAM in the enterprise either.
CaedenV - Wednesday, August 17, 2016 - link
Exactly right. My understanding is that this is for read-heavy enterprise storage: things like large databases, or maybe even Netflix, where you have TBs of relatively static data that several users need to access at a moment's notice. Things that need to stay in active memory, but there is too much of it to practically store in DRAM.

Morawka - Tuesday, August 16, 2016 - link
Less endurance? Anything will have less endurance than RAM, because RAM's endurance is unlimited; it is volatile. When RAM loses power, that's it, the data is gone. When XPoint loses power, the data is still there, even for decades. Its speed is 4x to 8x slower, sure, but endurance is not really going to be an issue for at least 10-15 years if Intel's numbers hold up.

ddriver - Tuesday, August 16, 2016 - link
Slow non-volatile memory is not a competitor, much less a substitute, for fast volatile memory. Data loss is not a big danger, since enterprise machines have backup power and can flush caches to permanent storage on impending power failure, thus avoiding data loss completely.

Intel is stepping up the fictional hype after it became evident the previous round of hype was all words with no weight behind them.
lilmoe - Tuesday, August 16, 2016 - link
This ain't about "speed", it's about latency, in which DRAM is far more than 8x more responsive.

beginner99 - Wednesday, August 17, 2016 - link
I disagree. RAM needs orders of magnitude more endurance than an SSD, and it's also usually a part you keep longer than a mobo or CPU. SSDs are now at like 3,000 write cycles. Anything for RAM would need millions if not billions of write cycles.

Cogman - Wednesday, August 17, 2016 - link
Easily billions.

I had an old game, Black & White, which measured the number of functions called as part of its stats gathering. It was often in the tens of millions. Each of those function calls is a write to memory, and likely a write to the same sector of memory.
Every function call and every variable allocation results in a memory update. In a GCed application, every minor GC will result in tens to hundreds of megabytes of data being written (whoopee for moving garbage collectors).
Software writes and reads data like a beast. It is part of the reason why CPUs have 3 levels of cache on them.
emn13 - Wednesday, August 17, 2016 - link
Billions sounds like a really low estimate. Software often uses specific memory locations as counters, and if you're unlucky and caching won't do, then you might see up to around 30 million writes *a second* to the same location. That's perhaps a bit of a corner case, but DRAM has no way of dealing with write amplification; it's pretty much direct access with some fairly static address remapping (TLB). That's what makes rowhammer bugs possible!

You could hit a trillion writes a day, and I bet there's some workload out there running on a real server today that's written orders of magnitude more than that *in practice* to a single memory location.
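For scale, that arithmetic checks out:

    # A hot counter hammered at the rate cited above, for one day.
    writes_per_second = 30_000_000
    seconds_per_day = 86_400
    print(f"{writes_per_second * seconds_per_day:.2e} writes/day")  # ~2.59e12

Trillions of writes per day to a single location, before even considering multi-year uptimes.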
emn13 - Wednesday, August 17, 2016 - link
Its speed is not 4x to 8x slower, at least not by any metric Intel or Micron have ever claimed. DRAM has a latency of 4-12 ns (nano, not micro!), and it doesn't look like XPoint is anywhere near that. Also, realize that practical memory latencies *including* controller overheads are only a small multiple of that; 20-30 ns is normal. Those great XPoint numbers all exclude the controller; as soon as it's included, latency (necessarily) skyrockets, since it doesn't sound like XPoint can be addressed as simply as DRAM.

Intel hasn't released a lot of data on this stuff, so it's hard to be sure, but my guess is that Optane will be around 1000x slower than DRAM in practice via the NVMe interface. And that no matter what they do, it'll be around 10x slower - a very, very significant difference.
And don't forget that DRAM isn't standing still either; HBM2 isn't nearly as uncertain as XPoint, and that's already slated to deliver 1TB/s bandwidth at even lower latencies, and with less power consumption than DDR4.
I'm not expecting miracles from XPoint. It's going to be one hell of a fast SSD, and it'll be brilliant for swap and cache, but it's not going to dramatically change computer architecture. DRAM isn't going away; nor are larger cheaper SSDs.
Unless the pricing can be really competitive, it's likely to remain a small, niche product.
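A quick sanity check of those ratios, with every latency assumed rather than published:

    # Assumed figures; Intel/Micron have not published end-to-end latencies.
    dram_ns = 25        # practical DRAM access incl. controller (20-30 ns)
    nvme_ns = 25_000    # optimistic NVMe round trip, ~25 us
    dimm_ns = 250       # hypothetical XPoint-on-DIMM access incl. controller

    print(f"via NVMe: ~{nvme_ns // dram_ns}x slower than DRAM")   # ~1000x
    print(f"as a DIMM: ~{dimm_ns // dram_ns}x slower than DRAM")  # ~10x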
MrSpadge - Tuesday, August 16, 2016 - link
I see this mentioned pretty much anytime a new non-volatile memory is being talked about, but usually not from the companies themselves. Replacing DRAM is only going to happen if we can find something faster (which is cheap enough), or for some very low-end applications.

wumpus - Thursday, August 18, 2016 - link
You'll notice that DRAM is still there, even in Intel's marketing drawings. Best guess is it will start out in enterprise with 32 GB (or more) of DRAM "cache" and allow terabytes of XPoint to be used as "memory".

Check the latency figures in the recent POWER8 article: main memory latency was 90 ns. XPoint latency might be roughly that by itself (never mind navigating all the way to the registers), so it works as long as you don't wear it out, and with enterprise-sized arrays and any kind of wear leveling it *can't* be worn out: the system simply can't throw that many writes at it over its lifetime. Give it a decent DRAM cache and watch it run.
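A rough worked example of that wear-leveling point, with all numbers assumed (per-cell XPoint endurance hasn't been published):

    # Assumed: 1 TB pool, perfect wear leveling, brutal sustained writes.
    capacity_bytes = 10**12            # 1 TB of XPoint
    write_rate_bps = 10 * 10**9        # assumed 10 GB/s of sustained writes
    seconds = 5 * 365 * 86_400         # five years of uptime

    cycles_per_cell = write_rate_bps * seconds / capacity_bytes
    print(f"~{cycles_per_cell:.1e} cycles/cell")  # ~1.6e6 full-pool rewrites

Even that pathological write pressure only manages a couple of million rewrites per cell over five years; sheer capacity dilutes the traffic.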
What I really want to see is 8 GB of HBM (or whatever size HBM will be by the time XPoint is made in volume). That should make a great front end to XPoint and likely allow plenty* more threads to run (because HBM can supply a ton of outstanding requests; you won't get "high bandwidth" with just a few threads).

* Think replacing the wasted space of the graphics part in K chips with cores, not slapping hundreds of ARM cores down. Amdahl's law will remain in effect as long as these things do general-purpose computing and single-thread performance is important (although cranking up main memory access should change things a bit).
FunBunny2 - Tuesday, August 16, 2016 - link
mmap() on steroids will rule the earth. That will power real RDBMS systems.
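A minimal sketch of the idea, with a hypothetical XPoint-backed mount point (the path and sizes are illustrative): map the table file once, then read and write records as ordinary memory, with no read()/write() syscall per access.

    import mmap, os

    path = "/mnt/xpoint/table.dat"   # hypothetical XPoint-backed mount
    size = 1 << 20                   # 1 MiB, just for the example

    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    os.ftruncate(fd, size)           # make sure there is something to map
    with mmap.mmap(fd, size) as mem:
        mem[0:8] = (42).to_bytes(8, "little")      # store a field in place
        print(int.from_bytes(mem[0:8], "little"))  # load it back directly
    os.close(fd)

fangdahai - Tuesday, August 16, 2016 - link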
XPoint is far more expensive than NAND, much slower than DRAM, and its endurance is just very, very poor compared to DRAM.

I don't see how it could be useful, except for big data and supercomputers.
ddriver - Tuesday, August 16, 2016 - link
but...but... it is 1000x better! It must be great, really ;)

nandnandnand - Tuesday, August 16, 2016 - link
Store the entire OS and applications in XPoint, without needing constant writes, then see how bad it is.

Hint: XPoint is useful if you use it right.
jwcalla - Tuesday, August 16, 2016 - link
32 GB of DRAM is cheap.

Zertzable - Wednesday, August 17, 2016 - link
Yes, but this targets a market where we're talking about 256 GB, 512 GB, or even 1 TB of memory.

zodiacfml - Tuesday, August 16, 2016 - link
No expert, but something is amiss here. I saw somewhere that Samsung plans to compete using traditional NAND, but with higher parallelism and probably SLC mode.

The advantages seem to come from the interface and from avoiding storage protocols, as I've read with NAND on DIMMs.
K_Space - Tuesday, August 16, 2016 - link
"ScaleMP and Intel have previously demonstrated that flash-based NVMe SSDs can be used as a cost-effective alternative to building a server with extreme amounts of DRAM".To highligh cost-effectiveness as a punch line in this statement is odd since Optane will cost more than NVMe SSDs? Intel is positioning Optane as some sort of a medium between flash based SSDs and DRAM, however I very much suspect that pricing wise it will lie closer to DRAM pricing...
Billy Tallis - Tuesday, August 16, 2016 - link
The cost effectiveness comes primarily from not having to buy a ton of DRAM in order to get sufficient performance. Not only should you be able to save a lot of money on the DIMMs themselves, but there's potential for a lot of platform cost savings by not buying a server that is designed to hold a TB of DRAM and instead just needing a full set of PCIe lanes. Yes, Optane won't be as cheap as flash SSDs on a per-capacity basis, but in terms of application performance it might come out ahead by allowing the use of even less DRAM or fewer machines, or by making NVMe virtual memory viable for a wider range of big data tasks where flash might not be fast enough.
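A rough cost model of that trade-off; every price below is a placeholder assumption, since Optane pricing hadn't been announced:

    # Hypothetical $/GB figures, only to illustrate the shape of the argument.
    dram_per_gb = 8.0     # assumed server DRAM price
    optane_per_gb = 2.0   # assumed Optane price

    all_dram = 1024 * dram_per_gb                        # 1 TB of DRAM
    tiered = 128 * dram_per_gb + 1024 * optane_per_gb    # 128 GB DRAM + 1 TB Optane
    print(f"all-DRAM: ${all_dram:,.0f}  vs  tiered: ${tiered:,.0f}")

The claim is that, as long as the DRAM tier holds the hot working set, the tiered box delivers comparable application performance at a fraction of the memory cost.

iwod - Wednesday, August 17, 2016 - link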
I still don't grasp where this is heading. It is assumed it will have an advantage over DRAM for large datasets, but we will soon have stacked DRAM as well, giving us potentially 4-32x the DRAM capacity; why not use DRAM then?

Dmcq - Thursday, August 18, 2016 - link
For servers, the advantage is that it should allow large amounts of data to be stored more cheaply than DRAM and faster than current SSDs or disks.

For mobile, the savings are that devices could go to sleep and wake up faster, and yet use less power by using less DRAM. Or it could just allow a lot more storage for running programs.
In the longer term it may allow large databases to be stored without having to worry about fitting data into blocks, more like in DRAM. This would help both servers and mobiles.
doggface - Saturday, August 20, 2016 - link
It's simple. Some databases need to be stored in memory to maximize speed of access, and endurance is not a problem because the workload is more about read latency than write endurance. Buying a TB of DRAM, i.e. 1,000 GB, is very, very expensive. XPoint has lower latency than NAND and relatively high endurance, so it should be possible to put your DB in XPoint (especially XPoint DIMMs) and keep relatively fast performance, while having larger databases and/or lower costs.

On the other hand... this will not make Crysis perform significantly better.