At last, this is the start of transitioning from hard drive/memory to just memory.

This is still significantly slower than RAM... maybe for some typical consumer workloads it can take over as an all-in-one storage solution, but for servers and power users we'll still need RAM as we know it today. And the fastest "RAM", if you will, is on-die L1 cache, which has physical limits to its speed and size based on the speed of light!
I can see SSDs going away depending on manufacturing costs, but with so many computers still shipping with spinning disks, I'd say it's well over a decade before we see SSDs become the replacement for all spinning-disk consumer products.
Intel is pricing this right between SSDs and RAM, which makes sense. I just hope this will help the industry start to drive down prices of SSDs!
Estimates from about 2 years back had the cost/GB of SSDs undercutting that of HDDs in the early 2020s. AFAIK those were business-as-usual projections, but I wouldn't be surprised to see it happen a bit sooner as HDD makers pull the plug on R&D for the generation that would otherwise be overtaken, once sales projections fall below the minimums needed to justify the cost of bringing it to market with its useful lifespan cut significantly short.
Hard drive storage cost has not changed significantly in at least half a decade, while SSD prices have continued to fall (albeit at a much slower rate than in the past). This bodes well for the crossover.

Actually it has, unless you regard HDDs with double the density at the same price every 2-2.5 years as not an actual falling cost. $ per GB is what matters, and that is falling steadily for both HDDs and SSDs (although the latter have lately spiked in price due to the flash shortage).
The latency specs include PCIe and controller overhead. Get rid of those by dropping this memory in a DIMM slot and it'll be much faster. Still not as fast as current memory, but it's getting close. Normal system memory is in the range of 0.5 µs; 60 µs is getting very close.

They also include context switching, ISR (pretty board specific), and block layer abstraction overheads.
PCIe latency is below 1 µs. I don't see how subtracting less than 1 from 60 gets you anywhere near 0.5.

All in all, if you want the best value for your money and the best performance, that money is better spent on 128 gigs of ECC memory.
Sure, XPoint is non-volatile, but so what? It is not like servers run straight off the grid and reboot every time the power flickers, LOL. Servers have at the very least several minutes of backup power before they shut down, which is more than enough to flush memory.

Despite Intel's BS PR claims, this thing is tremendously slower than RAM, meaning that if you use it for working memory, it will massacre your performance. Also, working memory is much more write-intensive, so you are looking at your investment crapping out potentially in a matter of months, whereas RAM will be much, much faster and work for years.

4 fast NVMe SSDs will give you something like 12 GB/s of bandwidth, meaning that in the case of an imminent shutdown, you can flush and restore the entire contents of those 128 gigs of RAM in about 10 seconds or less. A totally acceptable trade-off for tremendously better performance and endurance.
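For reference, the arithmetic behind that estimate is simple enough to sketch (using the commenter's own figures, which are assumptions rather than measurements):

```python
# Rough arithmetic behind the "10 seconds or less" claim above (a sketch; the
# 12 GB/s aggregate figure is the commenter's assumption, not a measurement)
ram_gb = 128          # DRAM contents to dump on shutdown
agg_bw_gb_s = 12      # ~4 NVMe SSDs at ~3 GB/s each
print(f"flush/restore time ~ {ram_gb / agg_bw_gb_s:.1f} s")  # ~10.7 s
```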
There is only one very narrow niche where this purchase could make sense: database usage, for databases with frequent low-queue-depth access. This is an extremely rare and atypical application scenario, probably less than 1 in 1000 in server use. Which is why this review doesn't feature any actual real-life workloads: it is impossible to make this product look good in anything other than synthetic benches, especially if used as working memory rather than storage.
ddriver: Do you work for the memory industry? Or hold stock in them? You have a personal gripe with the company that goes beyond logic.

PCI Express latency is far higher than 1 µs. There are unavoidable costs to implementing a controller on the interface, and there's also software-related latency.
I have a personal gripe with lying, which is what Intel has been doing ever since it announced hypetane. If you consider having a problem with lying a problem with logic, I'd say logic ain't your strong point.

Lying is also what you do. PCIe latency is around 0.5 µs. We are talking PHY here. Controller and software overhead affect every communication protocol equally.
XPoint will see only minuscule latency improvements from moving to DRAM slots. Even if PCIe has about 10 times the latency of DRAM, we are still talking nanoseconds, while XPoint is far slower, in the realm of microseconds. And it ain't DRAM either, so the actual latency improvement will be nowhere near the approx. 450 us.

It *could*, however, see significant bandwidth improvements, as the DRAM interface is much wider. But that would require a significantly increased level of parallelism and a controller that can handle it, and clearly the current one cannot even saturate a PCIe x4 link. More bandwidth could help mitigate the high latency by masking it through buffering, but it will still come nowhere near replacing DRAM without a tremendous performance hit.
*450 ns, by which I mean lower by 450 ns. And the current XPoint controller is nowhere near hitting the bottleneck of PCIe. It would take a controller at least 20 times faster than the current one to even get to the point where PCIe is a bottleneck, and an even faster one to see any tangible benefit from connecting XPoint directly to the memory controller.

I'd rather have some nice 3D SLC (better than XPoint in literally every aspect) on PCIe for persistent storage, and RAM in the DIMM slots. Hyped as superior, XPoint is actually nothing but a big compromise. Peak bandwidth is too low even compared to NVMe NAND, latency is way too high, and endurance is way too low for working memory. Low-queue-depth performance is good, but the credit there goes to the controller; such a controller would hit even better performance with SLC NAND. Smarter block management could also double the endurance advantage SLC already has over XPoint.
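To put numbers on the latency argument above, here is a quick back-of-the-envelope sketch (all figures are the rough estimates quoted in this thread, not measured values):

```python
# Removing ~450 ns of PCIe link latency barely moves a ~10 us media access time
pcie_phy_ns = 450      # latency attributed to the PCIe PHY
xpoint_ns   = 10_000   # ~10 us typical XPoint device access time
dram_ns     = 50       # ~50 ns DRAM access time
on_dimm_ns = xpoint_ns - pcie_phy_ns
print(f"XPoint on a DIMM: ~{on_dimm_ns} ns, still ~{on_dimm_ns // dram_ns}x DRAM")
# -> ~9550 ns, still ~191x DRAM
```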
We don't know how much slower the media is than DRAM right now. We know that using DRAM over NVMe has similar (though much better worst-case) performance to this. See my other post regarding polling and latency.
Re-reading, I see it says "typical" latency is under 10 us, placing it within spitting distance of DDR3/4. It's the 99.9999th percentile that is 60 us at QD1. At QD16, the 99.999th percentile is 140 us. That means it takes only 140 us to service 16 requests, under 9 us each, which is pretty much the same as 10 us.
Read QD1 4KiB bandwidth is only about 500 MiB/s, but at QD8 it's about 2 GiB/s, which puts it on par with DDR4-2400.

"placing it within spitting distance of DDR3/4" - I hope you do realize that DRAM latency is like 50 NANOseconds, and 1 MICROsecond is 1000 nanoseconds. So 10 us is actually 200 times as much as 50 ns, making hypetane about 200 times slower in access latency. Not 200%, 200x.

Yes, the DRAM media is that fast, but when it's exposed through NVMe it has the latency characteristics that bcronce described.
That's only on a page hit. For the type of operations that 3D XPoint is looking at (4k or so) you won't find it on an open page, and thus it takes 2-3 times as long until it is ready.

That still leaves you with ~100x the latency. And we are still wondering if losing the PCIe controller will make any significant difference to this number (one problem is that even if Intel/Micron magically fixed this, the endurance is only slightly better than SLC and it would quickly die if used as main memory).
Endurance for the initial batch, postulated from Intel's warranty, would be around 30k P/E cycles, and 50k for the upcoming generation. That's not "only slightly better than SLC", as SLC has 100k P/E cycles of endurance. But the 100k figure is somewhat old, and endurance goes down with process node, so at a comparable process SLC might be going down, approaching 50k.
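For what it's worth, one way endurance gets "postulated from the warranty" is sketched below; the DWPD rating, warranty lengths and ideal write amplification are illustrative assumptions, not Intel's published terms:

```python
# Each full drive write consumes roughly one P/E cycle per cell (times write amplification)
def pe_cycles(dwpd, warranty_years, write_amplification=1):
    return dwpd * 365 * warranty_years * write_amplification

print(pe_cycles(30, 3))  # 32850 -> in the ballpark of the 30k figure
print(pe_cycles(30, 5))  # 54750 -> in the ballpark of the 50k figure
```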
It remains to be seen; the lousy industry is penny-pinching and producing artificial NAND shortages to milk people as much as possible, and pretty much all the wafers are going into TLC, some MLC, and, why oh why, QLC trash.

I guess they are saving the best for last. 3D SLC would address the lower density; Samsung currently has a 2 TB MLC M.2, so 1 TB is perfectly doable via 3D SLC. I am guessing Samsung's Z-NAND will be exactly that: SLC making a long overdue comeback.

The endurance issue is, imho, the biggest concern right now.
You're making the mistake those who know nothing make, which is surprising for you. This is a first-generation product. It will get much faster, and much cheaper, as time goes on. NAND will stagnate. You also have to remember that Intel never made the claim that this was as fast as RAM, or that it would be. The closest they came was to say that this would be in between NAND and RAM in speed. And yes, for some uses, it might be able to replace RAM. But that could be several generations down the road, possibly in 5 years or so.
I'm not sure I understand you. You talk about "pages", but, I hope, the reviewer was only using direct I/O, so there would be no page cache. It's very unclear where you are getting this "~100x" number. NVMe-connected DRAM has a plurality of hits around 4-6 us (depending on software), but it also has a distributed latency curve; however, I don't know what the latency is at the 99.999th percentile. The point is that even with DRAM's sub-100 ns latency, it's still not staying terribly close to the theoretical minimum latency of the bus. Btw, it's not just the controller. A very large amount of latency comes from the block layer itself (amongst other things).
It is quite possible that Intel artificially weakened the P4800X's performance and durability in order to avoid internal competition with their SSD division (they already did the same with Atoms). If your new technology is *too* good, it might make your other, more mainstream technology look bad in comparison and you could see a big drop in sales. Or it might have a "deflationary" effect, where customers delay buying in hope of lower prices later. This way they can also have a clearer storage hierarchy, business-segment-wise, where their mainstream products are good and their niche ones are better, but not too good.

I am not suggesting that it could ever compete with DRAM, just that the potential of 3D XPoint technology might actually be closer to what they mentioned a year ago than the first products they shipped.
Intel won't be reducing the price of Optane but rather will be giving the average consumer a watered-down version which will be charged at a premium but perform only slightly better than the top SSDs. The conclusion? Another overpriced ripoff from Intel.

The fastest SSD on the consumer market is the 960 Pro, which can hit 3.2 GB/s reads under certain circumstances. This is the equivalent of single-channel DDR-400 from 2001, and DDR had far lower latencies to boot. We are a long, long way from replacing RAM with storage.
What makes the biggest impression is that it took a completely different review format to make this product look good. No doubt strictly following Intel's own review guidelines. And of course, not a shred of real-world application. Enter hypetane - the paper dragon.
Also, bandwidth is only one side of the coin. XPoint has 30-100+ times the latency of DRAM, meaning the CPU will have to wait 30-100+ times longer before it has data to compute, and DRAM is already too slow in this aspect, so you really don't want to go any slower.

I see a niche for hypetane: RAM-less systems sporting very slow CPUs. Only a slow CPU will not be wasted on having to wait on working memory. Server CPUs don't really need to crunch that much data either, if any, which is paradoxical, seeing how Intel will only enable AVX-512 on Xeons. So it appears that the "amazingly fast" and overpriced hypetane is at home only in simple low-end servers, possibly paired with those many-core Atom chips. Even overpriced, it would be kind of a decent deal, as it offers about 3 times the capacity per dollar of DRAM; paired with wimpy Atoms it could make for a decent, simple, low-cost, frequent-access server.
You are missing the usefulness of it entirely. Yes, it is a niche product. And I even agree that Intel is hyping it and offering it to consumers with minimal benefit (besides Intel's bottom line). But it realistically slots between NAND and DRAM. This review shows that it has lower latency than NAND and higher density than DRAM. This is the play.

You say it cannot replace DRAM, and for most usage (by far) you are right. However, for a small niche that works with very big data sets (like finance or exploration), having more memory, although slower, will still be much faster than memory + swap (to slower NAND storage).

Let me repeat: this is a niche product, but it has its uses. Intel marketing is hyping it and trying to use it where its tradeoffs (particularly price) make little sense, but the technology itself is good (if limited).
Don't be so sure that latency is keeping it from being used as [secondary] main memory. A 4GB machine can actually function (more or less) for office duty and some iffy gaming capability. I'd strongly suspect that a 4-8GB stack of HBM (preferably the low-cost 512-bit systems, as the CPU really only wants 512-bit chunks of memory at a time) with the rest backed by 3D XPoint would still be effective at this high latency. Any improvement is likely to remove latency as something that would stop it (and current software can use the current stack [with the PCIe connection] to work 3D XPoint as "swappable RAM").

The endurance may well keep this from happening (it is on par with SLC).

The other catch is that this is a pretty steep change along the entire memory system. Expect Intel to have huge internal fights as to what the memory map should look like, where the HBM goes (does Intel pay to manufacture an expensive CPU module or foist it on down the line), and whether you even use HBM (if Raven Ridge does, I'd expect that Intel would have to if they tried to use XPoint as main memory). The big question is what the "cache line" of the DRAM memory would be: the current stack only works with 4k, the CPU "wants" 512 bits, and HBM is closer to 4k. 4k looks like a no-brainer, but you still have to add a funky L5/buffer that deals with the huge cache line, or waste a ton of [top-level, not sure if L3 or L4] cache by giving it 4k cache lines.
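To illustrate the cache-line tradeoff above with some numbers (the sizes are hypothetical, just to show the scale of the problem):

```python
# Coarser cache lines mean far fewer independently tracked lines in the same cache
cache_bytes = 32 * 1024 ** 2        # a hypothetical 32 MiB last-level cache
for line_bytes in (64, 4096):       # 512-bit CPU line vs 4 KiB transfer size
    print(f"{line_bytes:>4} B lines -> {cache_bytes // line_bytes:>6} lines tracked")
# 64 B lines -> 524288 lines; 4096 B lines -> 8192 lines (64x coarser tracking)
```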
What is it with you and RAM? This isn't a RAM replacement for most any use. Intel hasn't said that it is. Why are you insisting on comparing it to RAM?
With ddriver and RAM? I've only skimmed ddriver's posts, but I believe a summary would be:

1) RAM is faster than this product, so adding more RAM would be a better option than adding a middle man that is only faster than the data storage device but still slower than RAM.
2) RAM has much more endurance than these drives.
3) Servers tend to stay on 24/7 and have backup power solutions (UPS, generators, etc.) to allow for a RAM data flush to a non-volatile data storage device prior to any power loss, so it renders Optane's advantage of being non-volatile fairly moot.
ddriver believes these reasons result in this product having very niche uses, yet Intel keeps hyping it as a solution for every user while hiding behind synthetic benchmarks instead of demonstrating real-world applications, which would reveal that more RAM would lead to a superior solution in many/most cases.
I may have missed something but I think that sums up what I have read so far.
Oops, in the last part I forgot that he is saying they are using the benchmarks to hide the fact that it's not as far ahead of NAND speeds (although it is ahead) as they claim.
Is Intel XPoint Optane a fiasco? Check out "Intel crosses an unacceptable ethical line": http://semiaccurate.com/2017/03/27/intel-crosses-u...

A few days ago I registered here on AnandTech and I found it very odd that such a knowledgeable website provided (only) unsecured cleartext registration and log-in forms. I felt awkward and uncomfortable, because that is a big no-no for me. I wanted to register though, so I used the Tor Browser, to risk being sniffed only by the exit node. Now I see that Charlie (whom I used to read ages ago) has taken this quite a few steps further. The guy sells $1,000 annual "professional subscriptions" on a completely private, crystal-clear-transparent, as-public-as-it-gets, 100% unencrypted page. I am utterly dumbfounded... and I have lost all appetite to read his article, or anything from him, ever again. For life. Click the link above and then click the "Become a subscriber" link at the top to enjoy this adorable (in)security atrocity.
You forgot one thing: Crysis 3 FPS?!?!

I find the go-faster stripes on my monitor screen make a massive difference to my FPS. I have many, many more FPS as a result. It's due to the quality of the paint - Dulux one-coat just brings down my latency to the point whe...

.... I've sniffed too much of this paint, haven't I?

If you use this instead of RAM it will most likely be 3 FPS indeed :)
Never has Intel claimed that this product is faster than DRAM... Your indignation is not proportional to even your perceived slight by Intel. You work for SK Hynix, or more likely Powerchip, don't you?
Nope, I am self-employed. I never accused Intel of lying about hypetane being faster than DRAM. I accused them of lying about how much faster than NAND it is and how close to DRAM it is. And I have only noted that it is hundreds of times slower than DRAM, making the population of DIMM slots (which some Intel cheerleaders claim will magically make hypetane faster) a very bad prospect in 99.99% of use cases.

I don't have corporate preferences either; IMO all corporations are intrinsically full of crap, yet the amount of it varies. I also realize that "nicer" companies are only nicer because they are in a tough situation and cannot afford not to be nice.

What annoys me is that, legally speaking, false advertising is a crime, yet everyone is doing it because it has so many loopholes, and, what's worse, the suckers line up to cheer at those lies.
It is still a first-gen product and I think it has potential in servers and scientific computing. First-gen SSDs were also crappy, with low capacity. Give it 5 years and I think it will make more sense.
I know that this drive isn't targeted at consumers at all, but I'm really interested in how it performs in consumer-level workloads, as an example of what a full Optane SSD is capable of. Any chance we can get a part 2 with the consumer drive tests and have it compared to the fastest consumer NVMe drives? Even just a partial test suite for a sampler of how it compares would be great.
I imagine it will be insane - the drive saturates its throughput at <QD6, which covers most consumer workloads. It'll obviously be a while before it's affordable from a consumer perspective, but I suspect the consumer prices will be a lot lower without the enterprise-class requirements thrown in.

This drive looks incredibly good. It costs 2-4x more than enterprise SSDs for pretty similar sequential throughput - BUT it gets there at insanely lower queue depths, which is a big benefit. At those QDs, it's easily justifying its price in throughput. Throw on top of that a 99.999th percentile latency that is often better than their 99th percentile latency, and 3D XPoint has a very bright future ahead of it. It might be gen 1 tech, but it's already justified its existence for an entire class of workloads.
Those are some very impressive numbers for a gen 1 storage device. Basically better than an SSD in almost every way except, of course, price. I'm interested in seeing what Micron does with QuantX, as it should have the same characteristics but potentially be more accessible.
Well, finally! I was waiting for this test ever since I heard about the technology. This is an enterprise drive, yeah, but it is the showcase for the technology, and it shows what we can expect from a consumer drive: 8-10x current SSD speeds for desktop usage (that is, 98% 4-8k random reads at QD=1). That blows everything on the market out of the water. Actually, this technology shines exactly in random Joe's PC, while SSDs shine only in the enterprise market (QD=16+). Can't wait!
All you want (from a desktop user's perspective) is low latency at low queue depth (1). NVMe helps in that regard, though not by a lot. Equal drives, one on SATA and one on NVMe, will make the NVMe one a bit more agile, resulting in more performance for you. So far no current SSD comes close to saturating the SATA 3 bus in desktop use; this one, however, is scratching it. Sure, it will be years until we get affordable consumer drives from this tech, but it is pretty much the same step forward as going from HDD to SSD: the first SSDs were in the range of 20-ish MB per second, while HDDs managed about 1.5 under these circumstances. Here we are talking about a jump from 50 to close to 400+. Moar power! :)
Imagine having long battery life and instant hibernation - at 400 MB/s, waking up from hibernation and reloading memory contents would take a few seconds. Then again, constantly writing a huge page file to XPoint wouldn't be good for longevity, and hibernation doesn't allow background processes to run while asleep. I'm thinking of potential usages for XPoint on phones and tablets, and can't seem to find any.
Yeah, also imagine your system working 10 times slower, because it uses hypetane instead of RAM. And not only that, but you also have to replace that memory every 6 months or so, because working memory is much more write-intensive, and this thing's endurance is barely twice that of MLC flash.

It is well worth it for the benefit of instant resume, because if there is one thing enterprise systems are known for, it is frequently hibernating and resuming.
They didn't say replace the RAM with XPoint. It's a really good idea, since XPoint has faster media access times, so even a smaller amount of it should still be quite a bit faster than NAND.

Why are you mentioning DIMMs? Are you just posting random responses? Neither of your posts in this thread actually addressed anything that the posters were discussing.
Have you been living in a cave for the past five years? SATA 3.0 has been the limiting factor for SSDs for a while now - they all max out around 450 MB/sec.

Now there are plenty of SSDs that connect via PCIe instead of SATA and are able to pull several gigabytes/sec. Examples include the Samsung 960 Pro/EVO, 950 Pro, OCZ RD400, etc. SATA has been the bottleneck for a while, and now that we have NVMe, we're seeing what NAND can really do with M.2 or PCIe connections.
That speed is only for high-queue-depth workloads. Even the 960 Pro only does about 137 MB/s average in random reads over QD1, QD2, and QD4. The QD1 numbers are something like 34 MB/s. Those numbers are far below the SATA spec. Almost all consumer tasks are low queue depth.

With this drive, you get about 400 MB/s even at QD1, and something like 1.3 GB/s at QD4.
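Turning those QD1 throughput figures into per-operation service times makes the gap clearer (a sketch using the approximate numbers quoted above, not new measurements):

```python
# At queue depth 1, latency per operation is roughly 1 / IOPS for 4 KiB transfers
for name, mb_s in (("960 Pro, QD1", 34), ("Optane, QD1", 400)):
    iops = mb_s * 1e6 / 4096
    print(f"{name}: ~{iops/1e3:.0f}k IOPS, ~{1e6/iops:.0f} us per 4 KiB read")
# 960 Pro: ~8k IOPS, ~120 us per read ; Optane: ~98k IOPS, ~10 us per read
```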
A very, very sweet piece of technology, assuming you have the right workloads to take advantage of what it can offer. Obviously it's not going to do much for a consumer-grade desktop, at least not in this form factor and at this price.
It's pretty clear that in at least some of those tests the PCIe interface is doing some bottlenecking too. It will be interesting to see Optane integrated into memory DIMMs where that is no longer an issue.
I don't agree. For gen 1, I'd say it's about right on. It seems that consumer storage advancements are accelerating (SSD, NAND, now this, inside a decade). I for one am happy to see a part of Intel (albeit a joint partnership) pressing ahead and releasing revolutionary tech - soon to be enjoyed by consumers.
Hypetane is based on very mature technology; only the storage medium is allegedly new. And it has gone through at least 3 refinements since it first taped out.

Which explains why gen 1 is so "good", and also why gen 2 will be barely incremental: there is not much to improve upon, and the only performance increase can come from more parallelism (which could be implemented with gen 1 tech just as well) or an improved controller.
I'm impressed. If money is no object, it's a flash killer. Unfortunately it's also way more expensive than I can afford, even if I didn't need a new CPU/mobo/RAM to use it. I'm really interested in seeing if the consumer-focused little Optane cache drives can actually make a significant difference in real-world use. Tiny cache SSDs looked decent in benchmarks, but real-world use patterns were sufficiently random to undermine them unless you went up to a "real SSD sized" cache of 120-ish GB vs the 16/32GB of the cheap cache drives. And a 120GB SSD + HDD was pricey enough and niche enough at the time that outside of Apple, AFAIK, no OEM offered it as a pre-built cache setup; and the enthusiasts who were willing to pay the price premium (myself among them) were able to just configure our boxes to keep music/images/video on the HDD and use the SSD for almost everything else.
Money IS NO object. It is an abstract concept. Paper bills are only a symbolic representation of money; 99% of money doesn't even exist in paper form, it's just imaginary numbers.

Aren't you a clever one. Seriously, you make some good points in other comments and you are technically right in this one as well, but goddamn, you're such an ass.

It is impossible to be smart and not be considered an ass in a world swarming with dummies. I'd rather be an ass than dumb. Playing dumb is not an option, because it eventually gets you. Fitting in with the dummies is not really worth it.
With all the Intel hype and PR, I was expecting the charts to be a bit more, um, flat? Looking at the deltas from start to finish of each benchmark, it looks like the drive has lots of characteristics similar to current flash-based SSDs for the same price.

Not impressed. I'll wait for your hands-on review before bashing it more.
This is what the reviews don't explain, leaving people in total darkness. You think your shiny new Samsung 960 Pro with 2.5 GB/s will be faster than your dusty old 840 EVO barely scratching 500? Yes? Then you are in for a surprise: the graphs look great, but check loading times and real program/game benches and you'll see it is exactly the same. That is why SSD reviews should always either be divided into sections for different usages, or explain in great simplicity and detail what you need to look for in a PART of the graph. This one is about 8-10 times faster than your SSD, so it IS very impressive, but the price is equally impressive.
Yes, that's the problem with readers. They're comparing this to the 960 Pro and other M.2 and even SATA drives. Um... NO. You compare this with similar-form-factor SSDs with similar price tags and heatsinks.

And no, even the QD1 benches aren't that big of a difference.

"And no, even the QD1 benches aren't that big of a difference" - this didn't sound right; I meant to say that even QD1 isn't very different *compared to enterprise full-PCIe SSDs* at similar prices.
You're crazy. This thing is great. The current weak spot of NAND is on full display here, and XPoint is decimating it. We all know SSDs chug when you throw a lot of writes at them; all of AnandTech's "performance consistency" benchmarks show that IOPS take a nosedive if you benchmark for more than a few seconds. XPoint doesn't break a sweat and is orders of magnitude faster.

I'm also pleasantly surprised at the consistency of the sequential transfers. A lot of noise was made about their sequential numbers not being as good as the latest SSDs', but one thing not considered is that SSDs don't hit those numbers until you get to high queue depths. For individual transfers XPoint seems to actually come closer to its max performance.
Queue depth is the number of accesses outstanding to the drive at the same time.
For desktop/gaming you are looking at 4k random read (95-99% of the time), QD=1.
For movie processing you are looking at sequential read/write at QD=1.
For a light file server you are looking at both larger blocks, say 64k random read, and also sequential read, at QD=2/4.
For a heavy file server you go for QD=8/16.
For a light database you are looking at QD=4, random read/random write (depends on the DB type).
For a heavy database you are looking at QD=16 or more, random read/random write (depends on the DB type).
A heavy file server only has such a small queue depth if it's using spinning rust, to keep latency down. When using SSDs, file servers have QDs in the 64-256 range.

Queue depth is how many commands the computer has queued up for the drive. The computer can issue commands to the drive faster than the drive can service them -- so, for example, SATA can support a queue of up to 32 commands. Typical desktop use just doesn't generate enough traffic on the drives to queue up much data, so you are usually in the low 1-2, maybe 4 QD range. Some server workloads can be higher, but even on a DB server, if you are seeing QDs of 16 I would say your storage is not fast enough for what you are trying to do, so being able to get good performance at low queue depths is truly a breakthrough.
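As a rough framework for the queue-depth discussion above, Little's law ties the three quantities together (a sketch with illustrative numbers):

```python
# Outstanding requests (queue depth) ~ throughput (IOPS) x latency (seconds)
def iops(queue_depth, latency_us):
    return queue_depth / (latency_us * 1e-6)

print(f"{iops(1, 10):,.0f} IOPS")    # QD1 at ~10 us per op   -> ~100,000 IOPS
print(f"{iops(32, 100):,.0f} IOPS")  # QD32 at ~100 us per op -> ~320,000 IOPS
```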
For file servers, it's not just the queue depth that's important, it's the number of queues. FreeBSD and OpenZFS have had a lot of blogs and videos about the issues of scaling up servers, especially with regard to multi-core.

SATA only supports 1 queue. NVMe supports up to ~65,000 queues with a depth of ~65,000 each. They're actually having issues saturating high-end SSDs because their I/O stack can't handle the throughput.

If you have a lot of SATA drives, then you effectively have many queues, but if you want a single (or a few) super-fast device(s), like say an L2ARC, you need to take advantage of the new protocol.

The answer is something like the Linux kernel's block multiqueue (ongoing, still not the default for all devices, but it shouldn't be more than a few more release cycles). It's been a massive undertaking and involved rewriting many drivers.
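A toy illustration of the single-queue vs multi-queue point above (conceptual only; real AHCI command queueing and Linux blk-mq are far more involved than this):

```python
from collections import defaultdict

SATA_QUEUES, SATA_DEPTH = 1, 32            # one NCQ queue, 32 outstanding commands
NVME_QUEUES, NVME_DEPTH = 65_535, 65_536   # protocol maximums, per the comment above

class MultiQueueDevice:
    def __init__(self, nr_hw_queues):
        self.nr_hw_queues = nr_hw_queues
        self.hw_queues = defaultdict(list)

    def submit(self, cpu, request):
        # each CPU submits into "its own" hardware queue, so there is no single
        # shared submission lock for every core to fight over
        self.hw_queues[cpu % self.nr_hw_queues].append(request)
```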
It is a pity Intel doesn't make video cards, because 16GB of this would go very well with 4GB of RAM and a decent memory controller. It would lower the overall cost and not impact performance at all.
"It would lower the overall cost and not impact performance at all."
What? This stuff is around 50x slower than DRAM, which itself is reaching its limits in GPUs, hence features like delta color compression... Right now when your gpu runs out of ram it uses your system ram as extra space, this is a far better system.
"Intel's new 3D XPoint non-volatile memory technology, which has been on the cards publically for the last couple of years"
I think you mean "IN the cards". In this context, "ON the cards" makes it sound like we've all been missing out on 3D xPoint PCI cards for a "couple of years" :)
A bit of a suggestion - can you divide SSD reviews (or provide the division in the final thoughts) per user base? A desktop user absolutely does not care about sequential performance or QD16, or even writes for that matter (except for the odd time installing something). Databases couldn't care less about sequential or low QD, etc. Giving the tables is good for the odd few % of readers that actually know what to look for; the rest just take a look at the end of the graph and come away with a stunningly wrong idea. Just a few comparisons tailored per use would make it so much easier for the masses. It was Anand that fought for this during the early SandForce days; he forced OCZ to reconsider their ways and tweak SSDs for real-world performance, not graphs, and got me as a follower. Let that not die in vain, and let those that lack the specific knowledge be informed. Just look at the comments and see how people interpret the results. I know this is an enterprise-grade SSD, but it is also a showcase for a new technology that will come into our hands soonish.
NVMe, yes, the graphs will show it as much faster, yet when you run that game on a 960 Pro it will take exactly the same time (minus ~1 sec) as on an old SATA SSD. Optane, however, really will be some 8 times faster, and it will look totally awesome in the graph. If you don't know what to look for, you come away thinking Optane is not impressive and that some new NVMe SSD is actually very good compared to the old one; both are untrue.
How about power consumption? Could we start seeing similar hardware in tablets and phones in, say, 5 years' time? We will still need DRAM for speed and low power consumption. XPoint would then make for a great system and caching drive, with slower and cheaper NAND being used for media storage, like how SSD + HDD setups are used now.
Literally the last two sentences in the review (and mentioned at other times):
"Since our testing was remote, we have not yet even had the chance to look under the drives's heatsink, or measure the power efficiency of the Optane SSD and compare it against other SSDs. We are awaiting an opportunity to get a drive in hand, and expect some of the secrets under the hood to be exposed in due course as drives filter through the ecosystem."
What I really want to see (not sure if Intel allows them to) is Optane put through all the tests of SSD Bench (for reference only, so we would know how QD1 affects the benchmarks), and power consumption.
"However it is worth noting that the Optane SSD only manages a passing score when the application uses asynchronous I/O APIs. Using simple synchronous write() system calls pushes the average latency up to 11-12µs"
You mentioned "polling mode" for nvme was disabled, which is strange since that's been the default since ~4.5. Also, there are different types of polling modes, so, my hope is that the polling mode you are talking about is the new hybrid polling (introduced in 4.10, but possibly backported to your Ubuntu kernel). If not, then we know that xpoint is faster than the data you've gathered. Western Digital gave a talk at the recent Vault conference and discussed when it makes sense to poll vs reap. Polling ends up being about 1.3x faster (average latency) than waiting for the irq (4.5us vs 6us). If you went with one of the userspace drivers, polling ends up twice as fast, but that would take much more work to benchmark. So, considering that you're benchmarking the kinds of devices that this feature was designed for, and that we are interested in us latencies, what you've ended up benchmarking here was, to a greater extent than needed, the default kernel configuration.
I said the NVMe driver wasn't manually switched into polling mode; I left it with the default behavior, which on 4.8 seems to be not polling unless the application requests it. I'm certainly not seeing the 100% CPU usage that would be likely if it were polling.

If I'd had more time, I would have experimented with the latest kernel versions and the various tricks to get even lower latency.
I wasn't claiming that you disabled polling, only that polling was disabled, since it should be on by default for this device. Assuming you were looking at the sysfs interface, was the key that was set to 0 called io_poll or io_poll_delay? The latter set to 0 enables hybrid polling, so the CPU wouldn't be pegged. Either way, you wouldn't need a new kernel, just to enable a feature the kernel has had since 4.4 for these low-latency devices. Also, did you disable the page cache (direct=1) in your fio commands? If you didn't, that would explain why AIO was faster, since it uses DIO. Btw, it's not my intent to unnecessarily criticize you, because I realize the tests were performed under constrained circumstances. I just would've appreciated some comment in the article about a critical feature for this hardware not being enabled in the kernel.
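For anyone following along, a minimal sketch of checking the knobs discussed above on Linux (the device name and an installed fio are assumptions; io_poll enables polled completions, and io_poll_delay set to 0 selects hybrid polling, -1 classic polling):

```python
from pathlib import Path

# Read the NVMe polling sysfs knobs for a hypothetical nvme0n1 device
q = Path("/sys/block/nvme0n1/queue")
for knob in ("io_poll", "io_poll_delay"):
    print(knob, "=", (q / knob).read_text().strip())

# The page-cache point: direct=1 in fio bypasses the page cache so the sync and
# async I/O engines are compared fairly. Example command (not the review's
# actual configuration):
#   fio --name=qd1 --filename=/dev/nvme0n1 --rw=randread --bs=4k --iodepth=1 \
#       --ioengine=pvsync --direct=1 --runtime=60 --time_based
```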
Optane was supposed to be 1000x faster, have 1000x the endurance, and be 10x denser than NAND (http://hothardware.com/ContentImages/NewsItem/4020... ). I realize this is the first product, but saying that it fell short of expectations is an understatement. It has lower endurance, lower density, and it is measurably faster, but certainly nowhere close to 1000x. Oh, did I mention it is 5-10x more expensive?

I am quite disappointed, to be honest. It will get better, but "not ready" is something that comes to mind reading the article.
3D XPoint memory was supposed to be 1000x faster than NAND, 1000x more durable than NAND, and 10x denser than DRAM. Those claims were about the 3D XPoint memory itself, not the Optane SSD built around that memory.
I disagree. I can agree that the speed may be limited by the drive, but even so, it falls short by a large factor. The durability and the density, however, are pretty much platform-independent, and they are not there by a very, very long shot. Intel itself demonstrated that it is only 2.4-3x faster (https://en.wikipedia.org/wiki/3D_XPoint).

It clearly has a future, especially as NAND is approaching the end of its scalability. Engineering-wise it is interesting, but today it makes really little sense, when it should have been a slam dunk. I mean, who would have thought twice before buying a 500GB drive that maxes out SATA for $20-30? But this one... not so much.
I don't see XPoint replacing DRAM due to both latency and endurance not being up to par, but it's going to disrupt the SSD market, and as the technology matures and prices come down, I can see XPoint revolutionizing the storage market as SSDs did years ago.

The competition is clearly worried, since it seems like paid trolls are trying to spread falsehoods and BS here and elsewhere on the web.
I just bet it will be highly disturbing to the SSD market, LOL. With its inflated price, limited capacity, and pretty much unnecessary advantages, I can just see people lining up to buy it and leaving SSDs on the shelves.
You are either extremely ignorant or a paid troll!!! Anyone who understands technology knows that new tech is always expensive. When SSDs came to the market, they were much more expensive and had a lot less capacity than HDDs, but they closed the gap and disrupted the market. The same is bound to happen for XPoint, which performs better than NAND by orders of magnitude.
It is not expensive because it is new; it is expensive because Intel and Micron wasted a crapload of money on R&D'ing it and it turned out to be mediocre - significantly weaker than good old and almost forgotten SLC. So now they hype and lie about it and sell it significantly overpriced in hopes that they will see some return on the investment.

Also, it seems like you are quite ignorant, ignorant enough to not know what "order of magnitude" means. You just heard someone smart using it and decided to imitate them, following some brilliant logic that it will make you look smart. Well, it doesn't. It does exactly the opposite. Now either stop using it, or at the very least look it up, understand it, and remember what it actually means, so the next time you use it you don't end up embarrassing yourself.
"significantly weaker than good old and almost forgotten SLC"
Seriously?! You must be getting paid to spew this BS! No one can be this ignorant!! Can you read numbers?! What part of 8.9 us latency don't you understand? This is at least 10x better than the latest and greatest NVMe SSDs (be it TLC, V-NAND, or whatever BS marketing terms they feed idiots like you nowadays).

What part of 95K/108K QD1 IOPS don't you understand?! This is 3-10x compared to the best SSDs on the market.

So I repeat again: XPoint performs orders of magnitude better than the latest and greatest SSDs (from Samsung or whichever company) on the market. This is a fact.

You don't even understand basic math; stop embarrassing yourself by posting these idiotic comments!
Well, if this fruitless exchange is any evidence, my intellect is far superior to yours. So if my intellect is equal to that of a parrot, yours must be equal to that of a maggot... lol
From your testing, it looks like the drive offers real advantages at low QD, i.e. for desktop/small-office server use. For these uses a normal SSD is also enough, though. Given that modern Xeons have up to 28 cores (running 56 threads) and server motherboards have 2 or more CPU sockets, a properly loaded server will offer QD > 64 all day long, and certainly not just 4 active threads - where the Micron 9100 offers even higher performance, and if the performance is good enough there, it is certainly good enough at lower QDs where it is even better PER REQUEST. And who cares what the 99.999% latency is, as long as it is milliseconds and not seconds - network and other latencies on accesses to these servers will be higher anyway.
An incredibly good first attempt, but it really does not push the envelope in the market it is priced for - high-performance storage-bottlenecked servers.
We’ve updated our terms. By continuing to use the site and/or by logging into your account, you agree to the Site’s updated Terms of Use and Privacy Policy.
117 Comments
Back to Article
Ninhalem - Thursday, April 20, 2017 - link
At last, this is the start of transitioning from hard drive/memory to just memory.ATC9001 - Thursday, April 20, 2017 - link
This is still significantly slower than RAM....maybe for some typical consumer workloads it can take over as an all in one storage solution, but for servers and power users, we'll still need RAM as we know it today...and the fastest "RAM" if you will is on die L1 cache...which has physical limits to it's speed and size based on speed of light!I can see SSD's going away depending on manufacturing costs but so many computers are shipping with spinning disks still I'd say it's well over a decade before we see SSD's become the replacement for all spinning disk consumer products.
Intel is pricing this right between SSD's and RAM which makes sense, I just hope this will help the industry start to drive down prices of SSD's!
DanNeely - Thursday, April 20, 2017 - link
Estimates from about 2 years back had the cost/GB price of SSDs undercutting that of HDDs in the early 2020's. AFAIK those were business as usual projections, but I wouldn't be surprised to see it happen a bit sooner as HDD makers pull the plug on R&D for the generation that would otherwise be overtaken due to sales projections falling below the minimums needed to justify the cost of bringing it to market with its useful lifespan cut significantly short.Guspaz - Saturday, April 22, 2017 - link
Hard drive storage cost has not changed significantly in at least half a decade, while ssd prices have continued to fall (albeit at a much slower rate than in the past). This bodes well for the crossover.Santoval - Tuesday, June 6, 2017 - link
Actually it has, unless you regard HDDs with double density at the same price every 2 - 2.5 years as not an actual falling cost. $ per GB is what matters, and that is falling steadily, for both HDDs and SSDs (although the latter have lately spiked in price due to flash shortage).bcronce - Thursday, April 20, 2017 - link
The latency specs include PCIe and controller overhead. Get rid of those by dropping this memory in a DIMM slot and it'll be much faster. Still not as fast as current memory, but it's going to be getting close. Normal system memory is in the range of 0.5us. 60us is getting very close.tuxRoller - Friday, April 21, 2017 - link
They also include context switching, isr (pretty board specific), and block layer abstraction overheads.ddriver - Friday, April 21, 2017 - link
PCIE latency is below 1 us. I don't see how subtracting less than 1 from 60 gets you anywhere near 0.5.All in all, if you want the best value for your money and the best performance, that money is better spent on 128 gigs of ecc memory.
Sure, xpoint is non volatile, but so what? It is not like servers run on the grid and reboot every time the power flickers LOL. Servers have at the very least several minutes of backup power before they shut down, which is more than enough to flush memory.
Despite intel's BS PR claims, this thing is tremendously slower than RAM, meaning that if you use it for working memory, it will massacre your performance. Also, working memory is much more write intensive, so you are looking at your money investment crapping out potentially in a matter of months. Whereas RAM will be much, much faster and work for years.
4 fast NVME SSDs will give you like 12 GB\s bandwidth, meaning that in the case of an imminent shutdown, you can flush and restore the entire content of those 128 gigs of ram in like 10 seconds or less. Totally acceptable trade-back for tremendously better performance and endurance.
There is only one single, very narrow niche where this purchase could make sense. Database usage, for databases with frequent low queue access. This is an extremely rare and atypical application scenario, probably less than 1/1000 in server use. Which is why this review doesn't feature any actual real life workloads, because it is impossible to make this product look good in anything other than synthetic benches. Especially if used as working memory rather than storage.
IntelUser2000 - Friday, April 21, 2017 - link
ddriver: Do you work for the memory industry? Or hold a stock in them? You have a personal gripe about the company that goes beyond logic.PCI Express latency is far higher than 1us. There are unavoidable costs of implementing a controller on the interface and there's also software related latency.
ddriver - Friday, April 21, 2017 - link
I have a personal gripe with lying. Which is what intel has been doing every since it announced hypetane. If you find having a problem with lying a problem with logic, I'd say logic ain't your strong point.Lying is also what you do. PCIE latency is around 0.5 us. We are talking PHY here. Controller and software overhead affect equally every communication protocol.
Xpoint will see only minuscule latency improvements from moving to dram slots. Even if PCIE has about 10 times the latency of dram, we are still talking ns, while xpoint is far slower in the realm of us. And it ain't no dram either, so the actual latency improvement will be nowhere nearly the approx 450 us.
It *could* however see significant bandwidth improvements, as the dram interface is much wider, however that will require significantly increased level of parallelism and a controller that can handle it, and clearly, the current one cannot even saturate a pcie x4 link. More bandwidth could help mitigate the high latency by masking it through buffering, but it will still come nowhere near to replacing dram without a tremendous performance hit.
ddriver - Friday, April 21, 2017 - link
*450 ns, by which I mean lower by 450 ns. And the current xpoint controller is nowhere near hitting the bottleneck of PCIE. It would take a controller that is at least 20 times faster than the current one to even get to the point where PCIE is a bottleneck. And even faster to see any tangible benefit from connecting xpoint directly to the memory controller.I'd rather have some nice 3D SLC (better than xpoint in literally every aspect) on PCIE for persistent storage RAM in the dimm slots. Hyped as superior, xpoint is actually nothing but a big compromise. Peak bandwidth is too low even compared to NVME NAND, latency is way too high and endurance is way too low for working memory. Low queue depths performance is good, but credit there goes to the controller, such a controller will hit even better performance with SLC nand. Smarter block management could also double the endurance advantage SLC already has over xpoint.
mdriftmeyer - Saturday, April 22, 2017 - link
ddriver is spot on. just to clarify an early comment. He's correct and the IntelUser2000 is out of his league.mdriftmeyer - Saturday, April 22, 2017 - link
Spot on.tuxRoller - Friday, April 21, 2017 - link
We don't know how much slower the media is than dram right now.We know than using dram over nvme has similar (though much better worst case) perf to this.
See my other post regarding polling and latency.
bcronce - Saturday, April 22, 2017 - link
Re-reading, I see it says "typical" latency is under 10us, placing it in spitting distance of DDR3/4. It's the 99.9999th percentile that is 60us for Q1. At Q16, 99.999th percentile is 140us. That means it takes only 140us to service 16 requests. That's pretty much the same as 10us.Read Q1 4KiB bandwidth is only about 500MiB/s, but at Q8, it's about 2GiB which puts it on par with DDR4-2400.
ddriver - Saturday, April 22, 2017 - link
"placing it in spitting distance of DDR3/4"I hope you do realize that dram latency is like 50 NANOseconds, and 1 MICROsecond is 1000 NANOseconds.
So 10 us is actually 200 times as much as 50 ns. Thus making hypetane about 200 times slower in access latency. Not 200%, 200X.
tuxRoller - Saturday, April 22, 2017 - link
Yes, the dram media is that fast but when it's exposed through nvme it has the latency characteristics that bcronce described.wumpus - Sunday, April 23, 2017 - link
That's only on a page hit. For the type of operations that 3dxpoint is looking at (4k or so) you won't find it on an open page and thus take 2-3 times as long till it is ready.That still leaves you with ~100x latency. And we are still wondering if losing the PCIe controller will make any significant difference to this number (one problem is that if Intel/Micron magically fixed this, the endurance is only slightly better than SLC and would quickly die if used as main memory).
ddriver - Sunday, April 23, 2017 - link
Endurance for the initial batch postulated from intel's warranty would be around 30k PE cycles, and 50k for the upcoming generation. That's not "only slightly better than SLC" as SCL has 100k PE cycles endurance. But the 100k figure is somewhat old, and endurance goes down with process node. So at a comparable process, SLC might be going down, approaching 50k.It remains to be seen, the lousy industry is penny pinching and producing artificial NAND shortages to milk people as much as possible, and pretty much all the wafers are going into TLC, some MLC and why oh why, QLC trash.
I guess they are saving the best for last. 3D SLC will address the lower density, samsung currently has 2 TB MLC M2, so 1 TB is perfectly doable via 3D SLC. I am guessing samsung's z-nand will be exactly that - SLC making a long overdue comeback.
tuxRoller - Sunday, April 23, 2017 - link
The endurance issue is, imho, the biggest concern right now.melgross - Tuesday, April 25, 2017 - link
You're making the mistake those who know nothing make, which is surprising for you. This is a first generation product. It will get much faster, and much cheaper as time goes on. NAND will stagnate. You also have to remember that Intel never made the claim that this was as fast as RAM, or that it would be. The closest they came was to say that this would be in between NAND and RAM in speed. And yes, for some uses, it might be able to replace RAM. But that could be several generations down the road, in possibly 5 years, or so.tuxRoller - Sunday, April 23, 2017 - link
I'm not sure i understand you.You talk about "pages", but, i hope, the reviewer was only using dio, so there would be no page cache.
It's very unclear where you are getting this "~100x" number. Nvme connected dram has a plurality of hits around 4-6 us (depending on software) but it also has a distributed latency curve. However, i don't know what the latency at the 99.999% percentile. The point is that even with dram's sub-100ns latency, it's still not staying terribly close to the theoretical min latency of the bus.
Btw, it's not just the controller. A very large amount of latency comes from the block layer itself (amongst other things).
Santoval - Tuesday, June 6, 2017 - link
It is quite possible that Intel artificially weakened P4800X's performance and durability in order to avoid internal competition with their SSD division (they already did the same with Atoms). If your new technology is *too* good it might make your other more mainstream technology look bad in comparison and you could see a big drop in sales. Or it might have a "deflationary" effect, where their customers might delay buying in hope of lower prices later. This way they can also have a more clear storage hierarchy, business segment wise, where their mainstream products are good, and their niche ones are better but not too good.I am not suggesting that it could ever compete with DRAM, just that the potential of 3D XPoint technology might actually be closer to what they mentioned a year ago than the first products they shipped.
albert89 - Friday, April 21, 2017 - link
Intel wont be reducing the price of the optane but rather will be giving the average consumer a watered down version which will be charged at a premium but perform only slightly better then the top SSD. The conclusion ? Another over priced ripoff from Intel.TheinsanegamerN - Thursday, April 20, 2017 - link
the fastest SSD on the consumer market is the 960 pro, which can hit 3.2GB/s read under certain circumstances.This is the equivalent of single channel DDR 400 from 2001. and DDR had far lower latencys to boot.
We are a long, long way from replacing RAM with storage.
ddriver - Friday, April 21, 2017 - link
What makes the most impression is it took a completely different review format to make this product look good. No doubt strictly following intel's own review guidelines. And of course, not a shred of real world application. Enter hypetane - the paper dragon.ddriver - Friday, April 21, 2017 - link
Also, bandwidth is only one side of the coin. Xpoint is 30-100+ times more latent than dram, meaning the CPU will have to wait 30-100+ times longer before it has data to compute, and dram is already too slow in this aspect, so you really don't want to go any slower.I see a niche for hypetane - ram-less systems, sporting very slow CPUs. Only a slow CPU will not be wasted on having to wait on working memory. Server CPUs don't really need to crunch that much data either, if any, which is paradoxical, seeing how intel will only enable avx512 on xeons, so it appears that the "amazingly fast" and overpriced hypetane is at home only in simple low end servers, possibly paired with them many core atom chips. Even overpriced, it will kind of a decent deal, as it offers about 3 times the capacity per dollar as dram, paired with wimpy atoms it could make for a decent simple, low cost, frequent access server.
frenchy_2001 - Friday, April 21, 2017 - link
You are missing the usefulness of it entirely.Yes, it is a niche product.
And I even agree, intel is hyping it and offering it for consumer with minimal benefit (beside intel's bottom line).
But it realistically slots between NAND and DRAM.
This review shows that it has lower latency than NAND and it has higher density than DRAM.
This is the play.
You say it cannot replace DRAM and for most usage (by far) you are true. However, for a small niche that works with very big data sets (like for finace or exploration), having more memory, although slower, will still be much faster than memory + swap (to a slower NAND storage).
Let me repeat, this is a niche product, but it has its uses.
Intel marketing is hyping it and trying to use it where its tradeoffs (particularly price) make little sense, but the technology itself is good (if limited).
wumpus - Sunday, April 23, 2017 - link
Don't be so sure that latency is keeping it from being used as [secondary] main memory. A 4GB machine can actually function (more or less) for office duty and some iffy gaming capability. I'd strongly suspect that a 4-8GB stack of HBM (preferably the low-cost 512 bit systems, as the CPU really only wants 512bit chunks of memory at a time) with the rest backed by 3dxpoint would still be effective at this high latency. Any improvement is likely to remove latency as something that would stop it (and current software can use the current stack [with PCIe connection] to work 3dxpoint as "swappable ram").The endurance may well keep this from happening (it is on par with SLC).
The other catch is that this is a pretty steep change along the entire memory system. Expect Intel to have huge internal fights as to what the memory map should look like, where the HBM goes (does Intel pay to manufacture an expensive CPU module or foist it on down the line), do you even use HBM (if Ravenridge does, I'd expect that Intel would have to if they tried to use xpoint as main memory)? The big question is what would be the "cache line" of the DRAM memory: the current stack only works with 4k, the CPU "wants" 512 bits, HBM is closer to 4k. 4k looks like a no-brainer, but you still have to put a funky L5/buffer that deals with the huge cache line or waste a ton of [top level, not sure if L3 or L4] cache by giving it 4k cache lines.
melgross - Tuesday, April 25, 2017 - link
What is it with you and RAM? This isn't a RAM replacement for most any use. Intel hasn't said that it is. Why are you insisting on comparing it to RAM?masouth - Tuesday, May 2, 2017 - link
With ddriver and RAM? I've only skimmed ddriver's posts but I believe a summary would be.1) RAM is faster than this product so adding more RAM would be a better option than adding a middle man that is only faster than the data storage device but still slower than RAM.
2) RAM has much more endurance than these drives
3) Servers tend to stay on 24/7 and have back up power solutions (UPS, generators, etc) to allow for a RAM data flush to a non-volatile data storage device prior to any power loss so it renders Optane's advantage of being non-volatile fairly moot.
ddriver believes these reasons result in this product having very niche uses yet Intel keeps hyping this as a solution for every user while hiding behind synthetic benchmarks instead of demonstrating real world applications which would reveal that more RAM would lead to a superior solution in many/most cases.
I may have missed something but I think that sums up what I have read so far.
masouth - Tuesday, May 2, 2017 - link
oops, in the last part I forgot that he saying they are using the benchmarks to hide the fact that it's not as far ahead of NAND speads (although it is ahead) as they claim.AnTech - Saturday, April 29, 2017 - link
Is Intel XPoint Optane a fiasco? Check out:Intel crosses an unacceptable ethical line
http://semiaccurate.com/2017/03/27/intel-crosses-u...
Santoval - Tuesday, June 6, 2017 - link
A few days ago I registered here on Anandtech, and I found it very odd that such a knowledgeable website provided (only) insecure cleartext registration and log-in forms. I felt awkward and uncomfortable, because that is a big no-no for me. I wanted to register though, so I used the Tor Browser, to risk being sniffed only by the exit node. Now I see that Charlie (whom I used to read ages ago) has taken this quite a few steps further. The guy sells $1,000 annual "professional subscriptions" on a completely private, crystal-clear transparent, as-public-as-it-gets, 100% unencrypted page. I am utterly dumbfounded... and I have lost all appetite to read his article, or anything else from him, ever again. For life. Click your link and then click the "Become a subscriber" link at the top to enjoy this adorable (in)security atrocity.
tsk2k - Thursday, April 20, 2017 - link
You forgot one thing, CRYSIS 3 FPS?!?!
philehidiot - Thursday, April 20, 2017 - link
I find the go-faster stripes on my monitor screen make a massive difference to my FPS. I have many, many more FPS as a result. It's due to the quality of the paint - Dulux one-coat just brings down my latency to the point whe....... I've sniffed too much of this paint, haven't I?
ddriver - Friday, April 21, 2017 - link
If you use this instead of ram it will most likely be 3 FPS indeed :)
mtroute - Friday, April 21, 2017 - link
Never has Intel claimed that this product is faster than DRAM... Your indignation is not proportional even to your perceived slight by Intel. You work for SK Hynix, or more likely Powerchip, don't you?
ddriver - Sunday, April 23, 2017 - link
Nope, I am self-employed. I never accused intel of lying about hypetane being faster than dram. I accused them of lying about how much faster than NAND it is and how close to dram it is. And I have only noted that it is hundreds of times slower than dram, making the population of dimm slots (which some intel cheerleaders claim will magically make hypetane faster) a very bad prospect in 99.99% of the use cases. I don't have corporate preferences either; IMO all corporations are intrinsically full of crap, yet the amount of it varies. I also realize that "nicer" companies are only nicer because they are in a tough situation and cannot afford to not be nice.
What annoys me is that legally speaking, false advertising is a crime, yet everyone is doing it, because it has so many loopholes, and what's worse, the suckers line up to cheer at those lies.
MobiusPizza - Sunday, April 23, 2017 - link
It is still a first-gen product and I think it has potential in servers and scientific computing. First-gen SSDs were also crappy, with low capacity. Give it 5 years and I think it will make more sense.
melgross - Tuesday, April 25, 2017 - link
You obviously have some ax to grind. You do seem bitter about something. The first SSDs weren't much better than many HDDs in random R/W. Give it a break!
XabanakFanatik - Thursday, April 20, 2017 - link
I know that this drive isn't targeted at consumers at all, but I'm really interested in how it performs in consumer-level workloads as an example of what a full Optane SSD is capable of. Any chance we can get a part 2 with the consumer drive tests and have it compared to the fastest consumer NVMe drives? Even just a partial test suite as a sampler of how it compares would be great.
Drumsticks - Thursday, April 20, 2017 - link
I imagine it will be insane - the drive saturates its throughput at <QD6, meaning most consumer workloads. It'll obviously be a while before it's affordable from a consumer perspective, but I suspect the consumer prices will be a lot lower without the enterprise-class requirements thrown in. This drive looks incredibly good. 2-4x the price of enterprise SSDs for pretty similar sequential throughput - BUT at insanely lower queue depths, which is a big benefit. At those QDs, it's easily justifying its price in throughput. Throw on top of that a 99.999th percentile latency that is often better than their 99th percentile latency, and 3D XPoint has a very bright future ahead of it. It might be gen 1 tech, but it's already justified its existence for an entire class of workloads.
superkev72 - Thursday, April 20, 2017 - link
Those are some very impressive numbers for a gen1 storage device. Basically better than an SSD in almost every way except, of course, price. I'm interested in seeing what Micron does with QuantX, as it should have the same characteristics but potentially be more accessible.
DrunkenDonkey - Thursday, April 20, 2017 - link
Well finally! I was waiting for this test ever since I heard about the technology. This is an enterprise drive, yeah, but it is the showcase for the technology, and it shows what we can expect from a consumer drive - 8-10x current SSD speeds for desktop usage (that is 98% 4-8k RR, QD=1). That blows everything on the market out of the water. Actually this technology shines exactly in random Joe's PC, while SSDs shine only in the enterprise market (QD=16+). Can't wait!
Meteor2 - Thursday, April 20, 2017 - link
But don't we say SATA3 is good enough and we don't really need (for consumer use) NVMe? So what's the real benefit of something faster?
DrunkenDonkey - Thursday, April 20, 2017 - link
All you want (from a desktop user perspective) is low latency at low queue depth (1). NVMe helps in that regard, though not by a lot. Equal drives, one on SATA, one on NVMe, will make the NVMe one a bit more agile, resulting in more performance for you. So far no current SSD comes close to saturating the SATA3 bus in desktop use; this one, however, is scratching it. Sure, it will be years till we get affordable consumer drives from this tech, but it is pretty much the same step forward as going from HDD to SSD - the first SSDs were in the range of 20ish MB per second, while HDDs managed about 1.5 in these circumstances. Here we are talking a jump from 50 to close to 400+. Moar power! :)
serendip - Thursday, April 20, 2017 - link
Imagine having long battery life and instant hibernation - at 400 MB/s, waking up from hibernation and reloading memory contents would take a few seconds. Then again, constantly writing a huge page file to XPoint wouldn't be good for longevity, and hibernation doesn't allow background processes to run while asleep. I'm thinking of potential usage for XPoint on phones and tablets, but can't seem to find any.
ddriver - Friday, April 21, 2017 - link
Yeah, also imagine your system working 10 times slower, because it uses hypetane instead of ram. And not only that, but you also have to replace that memory every 6 months or so, because working memory is much more write intensive, and this thing's endurance is barely twice that of MLC flash.
It is well worth the benefit of instant resume, because if enterprise systems are known for something, that is frequently hibernating and resuming.
tuxRoller - Friday, April 21, 2017 - link
They didn't say replace the ram with xpoint. It's a really good idea since xpoint has faster media access times, so even when it's a smaller amount it should still be quite a bit faster than nand.
ddriver - Friday, April 21, 2017 - link
So if you populate the dimm slots with hypetane, where does the dram go?
kfishy - Friday, April 21, 2017 - link
You can have a hybrid memory subsystem; the current topology of CACHE-DRAM-SSD/HDD is not the only way to go.
tuxRoller - Friday, April 21, 2017 - link
Why are you mentioning dimms? Are you just posting random responses?
Neither of your posts in this thread actually addressed anything that the posters were discussing.
Kakti - Saturday, April 22, 2017 - link
Have you been living in a cave the past five years? SATA 3.0 has been the limiting factor for SSDs for a while now - they all max out around 450MB/sec. Now there are plenty of SSDs that connect via PCIe instead of SATA and are able to pull several gigabytes/sec. Examples include the Samsung 960 Pro/Evo, 950 Pro, OCZ RD400, etc. SATA has been the bottleneck for a while, and now that we have NVMe, we're seeing what NAND can really do with M.2 or PCIe connections.
cfenton - Thursday, April 27, 2017 - link
That speed is only for high queue depth workloads. Even the 960 Pro only does about 137MB/s on average in random reads across QD1, QD2, and QD4. The QD1 numbers are something like 34MB/s. Those numbers are far below the SATA spec. Almost all consumer tasks are low queue depth. With this drive, you get about 400MB/s even at QD1, and something like 1.3GB/s at QD4.
CajunArson - Thursday, April 20, 2017 - link
A very, very sweet piece of technology, assuming you have the right workloads to take advantage of what it can offer. Obviously it's not going to do much for a consumer-grade desktop, at least not in this form factor and at this price. It's pretty clear that in at least some of those tests the PCIe interface is doing some bottlenecking too. It will be interesting to see Optane integrated into memory DIMMs where that is no longer an issue.
tarqsharq - Thursday, April 20, 2017 - link
I can imagine this being insanely effective on a heavily trafficked database server.
ddriver - Friday, April 21, 2017 - link
Not anywhere near as fast as an in-memory database.
Chaitanya - Thursday, April 20, 2017 - link
Like most recent Intel products: overpriced, and overhyped.vortmax2 - Thursday, April 20, 2017 - link
I don't agree. For Gen1, I'd say it's about right. It seems that consumer storage advancements are accelerating (SSD, NAND, now this inside a decade). I for one am happy to see a part of Intel (albeit a joint partnership) pressing ahead and releasing revolutionary tech - soon to be enjoyed by consumers.
haukionkannel - Friday, April 21, 2017 - link
I agree. A very good product!
ddriver - Friday, April 21, 2017 - link
Hypetane is based on very mature technology; only the storage medium is allegedly new. And it has gone through at least 3 refinements since it first taped out. Which explains why gen1 is so "good" and also why gen2 will be barely incremental, because there is not much to improve upon - the only performance increase can come from more parallelism (which can be implemented with gen1 tech just as well) or an improved controller.
mtroute - Friday, April 21, 2017 - link
First, wrong, just completely wrong. Second, cite your source.
tuxRoller - Friday, April 21, 2017 - link
What's better? Who has better CPUs?
tat tvam asi - Monday, April 24, 2017 - link
You contribute nothing to this discussion, Chaitanya. Just bile.
DanNeely - Thursday, April 20, 2017 - link
I'm impressed. If money is no object it's a flash killer. Unfortunately it's also way more expensive than I can afford, even if I wouldn't need a new CPU/mobo/RAM to use it. I'm really interested in seeing whether the consumer-focused little Optane cache drives can actually make a significant difference in real-world use. Tiny cache SSDs looked decent in benchmarks, but real-world use patterns were sufficiently random to undermine them unless you went up to a "real SSD sized" cache of 120ish GB vs the 16/32GB of the cheap cache drives. And a 120GB SSD + HDD was pricey enough and niche enough at the time that outside of Apple, AFAIK no OEM offered it as a pre-built cache setup; and the enthusiasts who were willing to pay the price premium (myself among them) were able to just configure our boxes to keep music/images/video on the HDD and use the SSD for almost everything else.
ddriver - Friday, April 21, 2017 - link
Money IS NO object. It is an abstract concept. Paper bills are only a symbolic representation of money; 99% of money doesn't even exist in paper form, it's just imaginary numbers.
Leosch - Sunday, April 23, 2017 - link
Aren't you a clever one. Seriously, you make some good points in other comments and you are technically right in this one as well, but goddamn you're such an ass.
ddriver - Sunday, April 23, 2017 - link
It is impossible to be smart and not be considered an ass in a world swarming with dummies. I'd rather be an ass than dumb. Playing dumb is not an option, because it eventually gets you. Fitting in with the dummies is not really worth it.
Leosch - Thursday, April 27, 2017 - link
It is possible, you are just unable, because you're not as great as you think.
lilmoe - Thursday, April 20, 2017 - link
With all the Intel hype and PR, I was expecting the charts to be a bit more, um, flat? Looking at the deltas from start to finish of each benchmark, it looks like the drive has lots of characteristics similar to current flash-based SSDs at the same price. Not impressed. I'll wait for your hands-on review before bashing it more.
DrunkenDonkey - Thursday, April 20, 2017 - link
This is what the reviews don't explain, and it leaves people in total darkness. You think your shiny new Samsung 960 Pro with 2.5GB/s will be faster than your dusty old 840 Evo barely scratching 500? Yes? Then you are in for a surprise - the graphs look great, but check loading times and real program/game benches and you'll see it is exactly the same. That is why SSD reviews should always either divide into sections for different usages or explain, in great simplicity and detail, what you need to look for in a PART of the graph. This one is about 8-10 times faster than your SSD, so it IS very impressive, but the price is equally impressive.
lilmoe - Friday, April 21, 2017 - link
Yes, that's the problem with readers. They're comparing this to the 960 Pro and other M.2 and even SATA drives. Um.... NO. You compare this with similar form factor SSDs with similar price tags and heat sinks. And no, even QD1 benches aren't that big of a difference.
lilmoe - Friday, April 21, 2017 - link
"And no, even QD1 benches aren't that big of a difference"This didn't sound right, I meant to say that even QD1 isn't very different **compared to enterprise full PCIe SSDs*** at similar prices.
sor - Friday, April 21, 2017 - link
You're crazy. This thing is great. The current weak spot of NAND is on full display here, and XPoint is decimating it. We all know SSDs chug when you throw a lot of writes at them; all of Anandtech's "performance consistency" benchmarks show that IOPS take a nosedive if you benchmark for more than a few seconds. XPoint doesn't break a sweat and is orders of magnitude faster. I'm also pleasantly surprised at the consistency of the sequential numbers. A lot of noise was made about the sequential figures not being as good as the latest SSDs', but one thing not considered is that SSDs don't hit those numbers until you get to high queue depths. For individual transfers XPoint seems to actually come closer to its maximum performance.
tuxRoller - Friday, April 21, 2017 - link
I think the controllers have a lot to do with the perf. Its perf profile is eerily similar to the P3700's in too many cases.
Meteor2 - Thursday, April 20, 2017 - link
So... what is a queue depth? And what applications result in short or long QDs?
DrunkenDonkey - Thursday, April 20, 2017 - link
Queue depth is concurrent access to the drive - how many requests are in flight at the same time (a rough toy calculation follows after the list).
For desktop/gaming you are looking at 4k random read (95-99% of the time), QD=1
For movie processing you are looking at sequential read/write at QD=1
For light file server you are looking at both higher blocks, say 64k random read and also sequential read, at QD=2/4
For heavy file server you go for QD=8/16
For light database you are looking for QD=4, random read/random write (depends on db type)
For heavy database you are looking for QD=16/more, random read/random write (depends on db type)
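(A toy back-of-the-envelope, not numbers from this review: at a given per-I/O latency, throughput scales roughly with queue depth - Little's law - which is why QD1 latency is what decides how a desktop feels. The latencies below are hypothetical round figures, and the model ignores the point where a drive or bus saturates.)

    # Rough model: IOPS ~= queue_depth / per-I/O latency, so
    # MB/s ~= queue_depth * block_size / latency.
    # Latencies are hypothetical placeholders; drive/bus saturation is ignored.
    BLOCK = 4 * 1024  # 4k random reads

    def throughput_mb_s(latency_us, queue_depth):
        iops = queue_depth / (latency_us / 1e6)   # completed I/Os per second
        return iops * BLOCK / 1e6                 # megabytes per second

    for name, lat_us in [("NAND SSD, ~90us reads", 90), ("XPoint-class, ~10us reads", 10)]:
        for qd in (1, 4, 16):
            print(f"{name:26s} QD{qd:<3d} ~{throughput_mb_s(lat_us, qd):6.0f} MB/s")

At QD1 the hypothetical 90us drive manages only ~45 MB/s regardless of its spec-sheet sequential numbers, while the 10us device lands around 400 MB/s - which is the gap being discussed above.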
Meteor2 - Thursday, April 20, 2017 - link
Thank you!
bcronce - Thursday, April 20, 2017 - link
A heavy file server only has such a small queue depth if it's using spinning rust, to keep latency down. When using SSDs, file servers have QDs in the 64-256 range.
Queue depth is how many commands the computer has queued up for the drive. The computer can issue commands to the drive faster than the drive can service them -- so, for example, SATA can support a queue of up to 32 commands. Typical desktop use just doesn't generate enough traffic on the drives to queue up much data, so you are usually in the low 1-2, maybe 4 QD range. Some server workloads can be higher, but even on a DB server, if you are seeing QDs of 16 I would say your storage is not fast enough for what you are trying to do, so being able to get good performance at low queue depths is truly a breakthrough.
bcronce - Thursday, April 20, 2017 - link
For file servers, it's not just the queue depth that's important, it's the number of queues. FreeBSD and OpenZFS have had a lot of blogs and videos about the issues of scaling up servers, especially with regard to multi-core. SATA only supports 1 queue. NVMe supports up to ~65,000 queues with a depth of ~65,000 each. They're actually having issues saturating high-end SSDs because their IO stack can't handle the throughput.
If you have a lot of SATA drives, then you effectively have many queues, but if you want a single/few super fast device(s), like say L2ARC, you need to take advantage of the new protocol.
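(To see the "number of queues" point concretely on a Linux box: blk-mq exposes one sysfs directory per hardware dispatch queue. A quick sketch - nvme0n1 is just a placeholder device name, and the mq directory only exists for devices driven through blk-mq.)

    import os

    # One subdirectory per hardware dispatch queue; an AHCI/SATA disk typically
    # shows a single queue, while NVMe devices get one per CPU (or more).
    dev = "nvme0n1"          # placeholder device name
    mq_dir = f"/sys/block/{dev}/mq"

    queues = sorted((d for d in os.listdir(mq_dir) if d.isdigit()), key=int)
    print(f"{dev}: {len(queues)} hardware queue(s)")
    for q in queues:
        with open(os.path.join(mq_dir, q, "cpu_list")) as f:
            print(f"  queue {q}: served by CPUs {f.read().strip()}")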
tuxRoller - Friday, April 21, 2017 - link
The answer is something like the Linux kernel's block multiqueue layer (ongoing, still not the default for all devices, but it shouldn't be more than a few more cycles). It's been a massive undertaking and involved rewriting many drivers.
https://lwn.net/Articles/552904/
Shadowmaster625 - Thursday, April 20, 2017 - link
It is a pity intel doesn't make video cards, because 16GB of this would go very well with 4GB of RAM and a decent memory controller. It would lower the overall cost and not impact performance at all.
ddriver - Friday, April 21, 2017 - link
"It would lower the overall cost and not impact performance at all."Yeah, I bet. /s
Mugur - Friday, April 21, 2017 - link
I think I read something like this when the i740 was launched... :-)
Sorry, couldn't resist. But the analogy stands.
ridic987 - Friday, April 21, 2017 - link
"It would lower the overall cost and not impact performance at all."What? This stuff is around 50x slower than DRAM, which itself is reaching its limits in GPUs, hence features like delta color compression... Right now when your gpu runs out of ram it uses your system ram as extra space, this is a far better system.
anynigma - Thursday, April 20, 2017 - link
"Intel's new 3D XPoint non-volatile memory technology, which has been on the cards publically for the last couple of years"I think you mean "IN the cards". In this context, "ON the cards" makes it sound like we've all been missing out on 3D xPoint PCI cards for a "couple of years" :)
SaolDan - Thursday, April 20, 2017 - link
I think he means it like it's been in the works publicly for a couple of years.
DrunkenDonkey - Thursday, April 20, 2017 - link
A bit of a suggestion - can you divide (or provide in the final thoughts) SSD reviews per consumer base? A desktop user absolutely does not care about sequential performance or QD16, or even writes for that matter (except the odd time installing something). A database couldn't care less about sequential or low QD, etc. Giving the tables is good for the odd few % of readers that actually know what to look for; the rest just take a look at the end of the graph and come away with a stunningly wrong idea. Just a few comparisons tailored per use would make it so easy for the masses. It was Anand who fought for that during the early SandForce days - he forced OCZ to reconsider their ways and tweak SSDs for real-world performance rather than graphs, and got me as a follower. Let that not die in vain, and let those who lack the specific knowledge be informed. Just look at the comments and see how people interpret the results. I know this is an enterprise-grade SSD, but it is also a showcase for a new technology that will come into our hands soonish.
ddriver - Friday, April 21, 2017 - link
Then those reviews would show the minuscule benefit of nvme and hypetane over a regular old ssd.
DrunkenDonkey - Friday, April 21, 2017 - link
For NVMe, yes - it will show that when you run that game on a 960 Pro it takes exactly the same amount of time (within ~1 sec) as on an old SATA SSD. Optane, however, will show up as some 8 times faster and will look totally awesome in the graph. If you don't know what to look for, you'd conclude Optane is not impressive and some new NVMe SSD is actually very good compared to the old one - both are untrue.
ianmills - Thursday, April 20, 2017 - link
Loving how the "vertical axis units" were added after the graph for clarity. Great attention to detail Billy!serendip - Thursday, April 20, 2017 - link
How about power consumption? Could we start seeing similar hardware in tablets and phones in, say, 5 years' time? We will still need DRAM for speed and low power consumption. XPoint would then make for a great system and caching drive, with slower and cheaper NAND being used for media storage, like how SSD + HDD setups are used now.
Ian Cutress - Friday, April 21, 2017 - link
Literally the last two sentences in the review (and mentioned at other times):"Since our testing was remote, we have not yet even had the chance to look under the drives's heatsink, or measure the power efficiency of the Optane SSD and compare it against other SSDs. We are awaiting an opportunity to get a drive in hand, and expect some of the secrets under the hood to be exposed in due course as drives filter through the ecosystem."
random2 - Thursday, April 20, 2017 - link
"so it is interesting to see where Intel is going to lay down its line in the sand."Mixed metaphor; "lay down your cards" or "draw a line in the sand"
iwod - Friday, April 21, 2017 - link
What I really want to see (not sure if Intel allows them to): Optane put through all the tests of SSD Bench. (For reference only, so we would know how QD1 affects the benchmarks.)
And Power Consumption.
ddriver - Friday, April 21, 2017 - link
"Not sure if Intel allows them to"
Most likely the review guidelines intel mandated for this are longer than the actual review ;)
Pork@III - Friday, April 21, 2017 - link
$1520 for only 375GB... we don't live in 2012, we live in 2017! Or am I kidding myself?
tuxRoller - Friday, April 21, 2017 - link
"However it is worth noting that the Optane SSD only manages a passing score when the application uses asynchronous I/O APIs. Using simple synchronous write() system calls pushes the average latency up to 11-12µs"You mentioned "polling mode" for nvme was disabled, which is strange since that's been the default since ~4.5. Also, there are different types of polling modes, so, my hope is that the polling mode you are talking about is the new hybrid polling (introduced in 4.10, but possibly backported to your Ubuntu kernel). If not, then we know that xpoint is faster than the data you've gathered. Western Digital gave a talk at the recent Vault conference and discussed when it makes sense to poll vs reap.
Polling ends up being about 1.3x faster (average latency) than waiting for the irq (4.5us vs 6us). If you went with one of the userspace drivers, polling ends up twice as fast, but that would take much more work to benchmark.
So, considering that you're benchmarking the kinds of devices that this feature was designed for, and that we are interested in us latencies, what you've ended up benchmarking here was, to a greater extent than needed, the default kernel configuration.
http://events.linuxfoundation.org/sites/events/fil...
Billy Tallis - Friday, April 21, 2017 - link
I said the NVMe driver wasn't manually switched into polling mode; I left it with the default behavior, which on 4.8 seems to be not polling unless the application requests it. I'm certainly not seeing the 100% CPU usage that would be likely if it was polling. If I'd had more time, I would have experimented with the latest kernel versions and the various tricks to get even lower latency.
tuxRoller - Friday, April 21, 2017 - link
I wasn't claiming that you disabled polling, only that polling was disabled, since it should be on by default for this device. Assuming you were looking at the sysfs interface, was the key that was set to 0 called io_poll or io_poll_delay? The latter set to 0 enables hybrid polling, so the CPU wouldn't be pegged.
Either way, you wouldn't need a new kernel - just enable a feature the kernel has had since 4.4 for these low-latency devices.
Also, did you disable the pagecache (direct=1) in your fio commands? If you didn't, that would explain why aio was faster since it uses dio.
Btw, it's not my intent to unnecessarily criticize you, because I realize the tests were performed under constrained circumstances. I just would've appreciated some comment in the article about a critical feature for this hardware not being enabled in the kernel.
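(For anyone who wants to poke at synchronous QD1 latency themselves, here's a minimal sketch of a direct-I/O random-read probe on Linux. /dev/nvme0n1 is a placeholder, it needs root, and it is not the fio methodology used in the article - just the simplest possible "one read at a time, page cache bypassed" loop.)

    import mmap, os, random, time

    DEV = "/dev/nvme0n1"   # placeholder; point it at the device you want to probe
    BLOCK = 4096
    SAMPLES = 10_000

    fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)  # O_DIRECT bypasses the page cache
    size = os.lseek(fd, 0, os.SEEK_END)
    buf = mmap.mmap(-1, BLOCK)                    # page-aligned buffer, as O_DIRECT requires

    lat = []
    for _ in range(SAMPLES):
        off = random.randrange(size // BLOCK) * BLOCK
        t0 = time.perf_counter()
        os.preadv(fd, [buf], off)                 # one synchronous 4k random read (QD1)
        lat.append(time.perf_counter() - t0)
    os.close(fd)

    lat.sort()
    print("avg %.1f us, p99 %.1f us" % (sum(lat) / SAMPLES * 1e6,
                                        lat[int(SAMPLES * 0.99)] * 1e6))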
yankeeDDL - Friday, April 21, 2017 - link
Optane was supposed to be 1000x faster, have 1000x the endurance, and be 10x denser than NAND (http://hothardware.com/ContentImages/NewsItem/4020...). I realize this is the first product, but saying that it fell short of expectations is an understatement.
It has lower endurance, lower density, and it is measurably faster, but certainly nowhere close to 1000x.
Oh, did I mention it is 5-10X more expensive?
I am quite disappointed, to be honest. It will get better, but "not ready" is something that comes to mind reading the article.
Billy Tallis - Friday, April 21, 2017 - link
3D XPoint memory was supposed to be 1000x faster than NAND, 1000x more durable than NAND, and 10x denser than DRAM. Those claims were about the 3D XPoint memory itself, not the Optane SSD built around that memory.
ddriver - Friday, April 21, 2017 - link
It is probably as good as they said... if you compare it to the shittiest SD card from 10 years ago. Still technically NAND ;)
yankeeDDL - Monday, April 24, 2017 - link
I disagree. I can agree that the speed may be limited by the drive, but even so, it falls short by a large factor. The durability and the density, however, are pretty much platform independent, and they are not there by a very, very long shot. Intel itself demonstrated that it is only 2.4-3x faster (https://en.wikipedia.org/wiki/3D_XPoint). It clearly has a future, especially as NAND approaches the end of its scalability. Engineering-wise it is interesting, but today it makes really little sense, while it should have been a slam dunk. I mean, who would have thought twice before buying a 500GB drive that maxes out SATA for $20-30? But this one... not so much.
zodiacfml - Friday, April 21, 2017 - link
It will perform better in DIMM form.
factual - Friday, April 21, 2017 - link
I don't see xpoint replacing dram, due to both latency and endurance not being up to par, but it's going to disrupt the ssd market, and as the technology matures and prices come down, I can see xpoint revolutionizing the storage market as ssd did years ago. Competition is clearly worried, since it seems like paid trolls are trying to spread falsehoods and bs here and elsewhere on the web.
ddriver - Saturday, April 22, 2017 - link
I just bet it will be highly disturbing to the SSD market LOL. With its inflated price, limited capacity and pretty much unnecessary advantages, I can just see people lining up to buy that and leaving SSDs on the shelves.
factual - Saturday, April 22, 2017 - link
You are either extremely ignorant or a paid troll!!! Anyone who understands technology knows that new tech is always expensive. When SSDs came to the market, they were much more expensive and had a lot less capacity than HDDs, but they closed the gap and disrupted the market. The same is bound to happen for Xpoint, which performs better than NAND by orders of magnitude.
ddriver - Sunday, April 23, 2017 - link
It is not expensive because it is new; it is expensive because intel and micron wasted a crapload of money on R&D'ing it and it turned out to be mediocre - significantly weaker than good old and almost forgotten SLC. So now they hype and lie about it and sell it significantly overpriced in hopes they will see some return on the investment. Also, it seems like you are quite ignorant, ignorant enough to not know what "order of magnitude" means. You just heard someone smart using it and decided to imitate, following some brilliant logic that it would make you look smart. Well, it doesn't. It does exactly the opposite. Now either stop using it, or at the very least look it up, understand and remember what it actually means, so the next time you use it you don't end up embarrassing yourself.
factual - Sunday, April 23, 2017 - link
"significantly weaker than good old and almost forgotten SLC"Seriously ?! You must be getting paid to spew this bs! no one can be this ignorant!! can you read numbers ?! what part of 8.9us latency don't you understand, this is at least 10x better than the latest and greatest NVMe SSDs (be it TLC, VNAND or whatever bs marketing terms they feed idiots like you nowadays).
What part of 95K/108K QD1 IOPS don't you understand?! That is 3-10x the best SSDs on the market.
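(Side note: those two figures are at least self-consistent - at QD1 the drive services one request at a time, so IOPS is roughly the reciprocal of the mean latency.)

    latency_s = 8.9e-6    # 8.9 us mean read latency
    print(1 / latency_s)  # ~112,000 IOPS, in the same ballpark as the 95K/108K QD1 figures quoted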
So I repeat again, Xpoint is orders of magnitude better performing than the latest and greatest SSDs (from Samsung or whichever company) on the market. This is a fact.
You don't even understand basic math, stop embarrassing yourself by posting these idiotic comments!
ddriver - Monday, April 24, 2017 - link
LOL, your intellect is apparently equal to that of a parrot.
factual - Monday, April 24, 2017 - link
Well, if this fruitless exchange is any evidence, my intellect is far superior to yours. So if my intellect is equal to that of a parrot, yours must be equal to that of a maggot... lol
evilpaul666 - Saturday, April 22, 2017 - link
So where are the 32GB client ones?
tomatus89 - Saturday, April 22, 2017 - link
Who is this ddriver troll? Hahaha, you are hilarious. And the worst part is that people keep feeding him instead of ignoring him.
peevee - Saturday, May 27, 2017 - link
From your testing, it looks like the drive offers real advantages at low QD, i.e. for desktop/small office server use. For these uses a normal SSD is also enough, though. Given that modern Xeons have up to 28 cores (56 threads) and server motherboards have 2 or more CPU sockets, a properly loaded server will offer QD > 64 all day long, and certainly not just 4 active threads - where the Micron 9100 offers even higher performance; and if the performance is good enough there, it is certainly good enough at lower QDs, where it is even better PER REQUEST.
And who cares what the 99.999% latency is, as long as it is milliseconds and not seconds - network and other latencies on accesses to these servers will be higher anyway.
An incredibly good first attempt, but it really does not push the envelope in the market it is priced for - high-performance storage-bottlenecked servers.