Original Link: https://www.anandtech.com/show/12748/the-intel-optane-memory-m10-64gb-review-optane-caching-refreshed
The Intel Optane Memory M10 (64GB) Review: Optane Caching Refreshed
by Billy Tallis on May 15, 2018 10:45 AM EST. Posted in:
- Storage
- SSDs
- Intel
- PCIe SSD
- SSD Caching
- M.2
- NVMe
- Optane
- Optane Memory
Intel is introducing their second generation of Optane Memory products: these are low-capacity M.2 NVMe SSDs with 3D XPoint memory that are intended for use as cache devices to improve performance of systems using hard drives. The new Optane Memory M10 brings a 64GB capacity to the product line that launched a year ago with 16GB and 32GB options.
The complete Optane Memory caching solution consists of an M.2 SSD plus Intel's drivers for caching on Windows, and firmware support on recent motherboards for booting from a cached volume. Intel launched Optane Memory with its Kaby Lake generation of processors and chipsets, and this generation is intended to complement Coffee Lake systems. However, all of the new functionality works just as well on existing Kaby Lake systems as on Coffee Lake.
The major new user-visible feature for this generation of Optane Memory caching is the addition of the ability to cache a secondary data drive, whereas previously only boot drives were possible. Intel refers to this mode as "data drive acceleration", compared to the system acceleration (boot drive) that was the only mode supported by the first generation of Optane Memory. Data drive acceleration has been added solely through changes to the Optane Memory drivers for Windows, and this feature was actually quietly rolled out with version 16 of Intel's RST drivers back in February.
Also earlier this year, Intel launched the Optane SSD 800P family as the low-end alternative to the flagship Optane SSD 900P. The 800P and the new Optane Memory M10 are based on the same hardware and an updated revision of the original Optane Memory M.2 modules. The M10 and the 800P use the same controller and the same firmware. The 800P is usable as a cache device with the Optane Memory software, and the Optane Memory M10 and its predecessor are usable as plain NVMe SSDs without caching software. The 800P and the M10 differ only in branding and intended use; the drive branded as the 58GB 800P is functionally identical to the 64GB M10 and both have the exact same usable capacity of 58,977,157,120 bytes.
Everything said about the 58GB Optane SSD 800P in our review of the 800P family applies equally to the 64GB Optane Memory M10. Intel hasn't actually posted official specs for the M10, so we'll just repeat the 800P specs here:
Intel Optane SSD Specifications

| Model | Optane SSD 800P | Optane SSD 800P / Optane Memory M10 | Optane Memory | Optane Memory |
|---|---|---|---|---|
| Capacity | 118 GB | 58 GB (M10: 64 GB) | 32 GB | 16 GB |
| Form Factor | M.2 2280 B+M key | M.2 2280 B+M key | M.2 2280 B+M key | M.2 2280 B+M key |
| Interface | PCIe 3.0 x2 | PCIe 3.0 x2 | PCIe 3.0 x2 | PCIe 3.0 x2 |
| Protocol | NVMe 1.1 | NVMe 1.1 | NVMe 1.1 | NVMe 1.1 |
| Controller | Intel | Intel | Intel | Intel |
| Memory | 128Gb 20nm Intel 3D XPoint | 128Gb 20nm Intel 3D XPoint | 128Gb 20nm Intel 3D XPoint | 128Gb 20nm Intel 3D XPoint |
| Sequential Read | 1450 MB/s | 1450 MB/s | 1350 MB/s | 900 MB/s |
| Sequential Write | 640 MB/s | 640 MB/s | 290 MB/s | 145 MB/s |
| Random Read | 250k IOPS | 250k IOPS | 240k IOPS | 190k IOPS |
| Random Write | 140k IOPS | 140k IOPS | 65k IOPS | 35k IOPS |
| Read Latency | 6.75 µs | 6.75 µs | 7 µs | 8 µs |
| Write Latency | 18 µs | 18 µs | 18 µs | 30 µs |
| Active Power | 3.75 W | 3.75 W | 3.5 W | 3.5 W |
| Idle Power | 8 mW | 8 mW | 1 W | 1 W |
| Endurance | 365 TB | 365 TB | 182.5 TB | 182.5 TB |
| Warranty | 5 years | 5 years | 5 years | 5 years |
| Launch Date | March 2018 | March 2018 (M10: May 2018) | April 2017 | April 2017 |
| Launch MSRP | $199 | 800P: $129 / M10: $144 | $77 | $44 |
Rather than cover exactly the same territory as our review of the 800P, this review is specifically focused on use of the Optane Memory M10 as a cache drive in front of a mechanical hard drive. Thanks to the addition of the data drive acceleration functionality, we can use much more of our usual benchmark suite for this than we could with last year's Optane Memory review. The data drive acceleration mode also broadens the potential market for Optane Memory, to include users who want to use a NAND flash-based SSD as their primary storage device but also need a more affordable bulk storage drive. The combination of a 64GB Optane Memory M10 (at MSRP) and a 1TB 7200RPM hard drive is about the same price as a 1TB SATA SSD with 3D TLC NAND, and at higher capacities the combination of a hard drive plus Optane Memory is much cheaper than a SATA SSD.
Intel's Optane Memory system works as an inclusive cache: adding an Optane Memory cache to a system does not increase the usable storage capacity, it just improves performance. Data written to the cache will also be written to the backing device, but applications don't have to wait for the data to land on both devices.
Once enabled, there is no need or option for manual tuning of cache behavior. The operation of the cache system is almost entirely opaque to the user. After an unclean shutdown, there is a bit of diagnostic information visible as the cache state is reconstructed, but this process usually seems to only take a second or two before the OS continues to load.
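The behavior described above can be sketched as a small model: an inclusive, write-through LRU cache sitting in front of a slower backing store. This is purely illustrative; Intel's actual promotion and eviction policies are proprietary and undocumented, and every name below is made up for the sketch.

```python
# Minimal sketch of an inclusive, write-through cache (hypothetical names).
from collections import OrderedDict

class InclusiveCache:
    """LRU cache in front of a slower backing store.

    Writes land on both devices, so the cache adds no usable capacity;
    reads are served from the cache on a hit and promoted on a miss.
    """
    def __init__(self, capacity_blocks, backing):
        self.capacity = capacity_blocks
        self.backing = backing        # dict standing in for the hard drive
        self.cache = OrderedDict()    # block -> data, in LRU order
        self.hits = 0
        self.misses = 0

    def _touch(self, block, data):
        self.cache[block] = data
        self.cache.move_to_end(block)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict least-recently-used

    def write(self, block, data):
        self.backing[block] = data    # data also lands on the backing device
        self._touch(block, data)      # but the caller only waits for the fast tier

    def read(self, block):
        if block in self.cache:
            self.hits += 1
            self.cache.move_to_end(block)
            return self.cache[block]
        self.misses += 1              # cache miss: pay full hard drive latency
        data = self.backing[block]
        self._touch(block, data)      # promote the block into the cache
        return data
```

In this model, re-reading recently touched blocks is fast, but a working set larger than the cache produces the hard-drive-speed misses that show up throughout the benchmarks in this review.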
Test Systems
Intel's Optane Memory caching drivers require a Kaby Lake or newer processor and chipset, but our primary consumer SSD testbed is still a Skylake-based machine. For last year's Optane Memory review, Intel delivered the 32GB module pre-installed in a Kaby Lake desktop. This time around, Intel provided a Coffee Lake system. Both of those systems have been used for tests in this review, and a few benchmarks of drives in a non-caching role have been performed on our usual SSD testbed.
AnandTech 2017/2018 Consumer SSD Testbed

| CPU | Intel Xeon E3 1240 v5 |
| Motherboard | ASRock Fatal1ty E3V5 Performance Gaming/OC |
| Chipset | Intel C232 |
| Memory | 4x 8GB G.SKILL Ripjaws DDR4-2400 CL15 |
| Graphics | AMD Radeon HD 5450, 1920x1200@60Hz |
| Software | Windows 10 x64, version 1709; Linux kernel version 4.14, fio version 3.1 |
- Thanks to Intel for the Xeon E3 1240 v5 CPU
- Thanks to ASRock for the E3V5 Performance Gaming/OC
- Thanks to G.SKILL for the Ripjaws DDR4-2400 RAM
- Thanks to Corsair for the RM750 power supply, Carbide 200R case, and Hydro H60 CPU cooler
- Thanks to Quarch for the XLC Programmable Power Module and accessories
- Thanks to StarTech for providing a RK2236BKF 22U rack cabinet.
Test Procedures
Our usual SSD test procedure was not designed to handle multi-device tiered storage, so some changes had to be made for this review and as a result much of the data presented here is not directly comparable to our previous reviews. The major changes are:
- All test configurations were running the latest OS patches and CPU microcode updates for the Spectre and Meltdown vulnerabilities. Regular SSD reviews with post-patch test results will begin later this month.
- Our synthetic benchmarks are usually run under Linux, but Intel's caching software is Windows-only so the usual fio scripts were adapted to run on Windows. The settings for data transfer sizes and test duration are unchanged, but the difference in storage APIs between operating systems means that the results shown here are lower across the board, especially for the low queue depth random I/O that is the greatest strength of Optane SSDs.
- We only have equipment to measure the power consumption of one drive at a time. Rather than move that equipment out of the primary SSD testbed and use it to measure either the cache drive or the hard drive, we kept it busy testing drives for future reviews. The SYSmark 2014 SE test results include the usual whole-system energy usage measurements.
- Optane SSDs and hard drives are not any slower when full than when empty, because they do not have the complicated wear leveling and block erase mechanisms that flash-based SSDs require, nor any equivalent to SLC write caches. The AnandTech Storage Bench (ATSB) trace-based tests in this review omit the usual full-drive test runs. Instead, caching configurations were tested by running each test three times in a row to check for effects of warming up the cache.
- Our AnandTech Storage Bench "The Destroyer" test takes about 12 hours to run on a good SATA SSD and about 7 hours on the best PCIe SSDs. On a mechanical hard drive, it takes more like 24 hours. Results for The Destroyer will probably not be ready this week. In the meantime, the ATSB Heavy test is sufficiently large to illustrate how SSD caching performs for workloads that do not fit into the cache.
Benchmark Summary
This review analyzes the performance of Optane Memory caching both for boot drives and secondary drives. The Optane Memory modules are also tested as standalone SSDs. The benchmarks in this review fall into three categories:
Application benchmarks: SYSmark 2014 SE
SYSmark directly measures how long applications take to respond to simulated user input. The scores are normalized against a reference system, but otherwise are directly proportional to the accumulated time between user input and the result showing up on screen. SYSmark measures whole-system performance and energy usage with a broad variety of non-gaming applications. The tests are not particularly storage-intensive, and differences in CPU and RAM can have a much greater impact on scores than storage upgrades.
AnandTech Storage Bench: The Destroyer, Heavy, Light
These three tests are recorded traces of real-world I/O that are replayed onto the storage device under test. This allows for the same storage workload to be reproduced consistently and almost completely independent of changes in CPU, RAM or GPU, because none of the computational workload of the original applications is reproduced. The ATSB Light test is similar in scope to SYSmark while the ATSB Heavy and The Destroyer tests represent much more computer usage with a broader range of applications. As a concession to practicality, these traces are replayed with long disk idle times cut short, so that the Destroyer doesn't take a full week to run.
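The idle-time trimming used during trace replay can be illustrated with a short sketch. The 25ms cap comes from the description of these tests; the trace format and function name are hypothetical.

```python
# Hypothetical sketch of trace replay with idle-time trimming.
def trim_idle(trace, cap_s=0.025):
    """Rebuild timestamps so no gap between I/Os exceeds cap_s (25 ms)."""
    trimmed = []
    new_time = 0.0
    prev = None
    for timestamp, op in trace:
        if prev is not None:
            new_time += min(timestamp - prev, cap_s)  # clamp long idle gaps
        trimmed.append((new_time, op))
        prev = timestamp
    return trimmed

# A 10-second think-time gap collapses to 25 ms; the ordering and
# contents of the I/O operations are unchanged.
demo = trim_idle([(0.0, "read"), (10.0, "write"), (10.010, "read")])
```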
Synthetic Benchmarks: Flexible IO Tester (FIO)
FIO is used to produce and measure artificial storage workloads according to our custom scripts. Poor choice of data sizes, access patterns and test duration can produce results that are either unrealistically flattering to SSDs or are unfairly difficult. Our FIO-based tests are designed specifically for modern consumer SSDs, with an emphasis on queue depths and transfer sizes that are most relevant to client computing workloads. Test durations and preconditioning workloads have been chosen to avoid unrealistically triggering thermal throttling on M.2 SSDs or overflowing SLC write caches.
BAPCo SYSmark 2014 SE
BAPCo's SYSmark 2014 SE is an application-based benchmark that uses real-world applications to replay usage patterns of business users in the areas of office productivity, media creation and data/financial analysis. In addition, it also addresses the responsiveness aspect which deals with user experience as related to application and file launches, multi-tasking etc. Scores are calibrated against a reference system that is defined to score 1000 in each of the scenarios. A score of, say, 2000, would imply that the system under test is twice as fast as the reference system.
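That calibration is simple arithmetic. The formula below is inferred from the description above (scores inversely proportional to total response time, with the reference system pinned to 1000), not taken from BAPCo documentation.

```python
# Score normalization as described: proportional to speed relative to a
# reference system defined to score 1000 (formula inferred, not BAPCo's).
def sysmark_style_score(reference_time_s, measured_time_s):
    return 1000 * reference_time_s / measured_time_s
```

A system finishing a scenario in half the reference time scores 2000; one matching the reference exactly scores 1000.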
SYSmark scores are based on total application response time as seen by the user, including not only storage latency but time spent by the processor. This means there's a limit to how much a storage improvement could possibly increase scores, because the SSD is only in use for a small fraction of the total test duration. This is a significant difference from our ATSB tests where only the storage portion of the workload is replicated and disk idle times are cut short to a maximum of 25ms.
For this review, SYSmark has been used on two different machines: a relatively high-end system with a six-core Intel Core i7-8700K processor and 16GB of RAM, and a more limited system with a quad-core Intel Core i5-7400 processor and just 4GB of RAM. The low-end system spends a lot of time swapping thanks to its small amount of RAM, and this adds greatly to the storage workload.
AnandTech SYSmark SSD Testbed | |
CPU | Intel Core i7-8700K |
Motherboard | Gigabyte Aorus H370 Gaming 3 WiFi |
Chipset | Intel H370 |
Memory | 2x 8GB Kingston DDR4-2666 |
Case | In Win C583 |
Power Supply | Cooler Master G550M |
OS | Windows 10 64-bit, version 1709 |
AnandTech SYSmark SSD Low-End Testbed | |
CPU | Intel Core i5-7400 |
Motherboard | ASUS PRIME Z270-A |
Chipset | Intel Z270 |
Memory | 1x 4GB Corsair DDR4-2666 |
Case | In Win C583 |
Power Supply | Cooler Master G550M |
OS | Windows 10 64-bit, version 1709 |
None of the Optane Memory modules are large enough to serve as a Windows boot drive while also storing all the applications used for SYSmark, so this section only tests the Optane Memory and Optane SSD 800P as cache drives. (The 118GB Optane SSD 800P is pretty much the smallest drive that could run SYSmark, but it doesn't leave much room for user data.)
The Data/Financial Analysis, Media Creation, and Office Productivity sub-tests are all relatively insensitive to storage performance, and they are shown in order of decreasing sensitivity to the CPU and RAM differences between the two test systems. These results show that a mechanical hard drive can hold back application performance, but almost any solid state storage system—including Optane Memory caching—is sufficient to shift the bottlenecks over to compute and memory.
The Responsiveness test is less focused on overall computational throughput and more on those annoying delays that make a computer feel slow: application launching, opening and saving files, and a variety of multitasking scenarios. Here, moving off a mechanical hard drive is by far the best upgrade that can be made to improve system performance. Going beyond a mainstream SATA SSD provides diminishing returns, but there is a measurable difference between the SATA SSD and the fastest Optane SSD.
Energy Usage
The SYSmark energy usage scores measure total system power consumption, excluding the display. Our SYSmark test system idles at around 26 W and peaks at over 60 W measured at the wall during the benchmark run. SATA SSDs seldom exceed 5 W and idle at a fraction of a watt, and the SSDs spend most of the test idle. This means the energy usage scores will inevitably be very close. A typical notebook system will tend to be better optimized for power efficiency than this desktop system, so the SSD would account for a much larger portion of the total and the score difference between SSDs would be more noticeable.
The Intel Optane SSD 900P is quite power-hungry by SSD standards, but running a hard drive is even worse. The Optane Memory M10 and 118GB 800P further add to power consumption when used as cache devices, but they speed up the test enough that total energy usage is not significantly affected. The 32GB Optane Memory doesn't offer as much of a performance boost, and it lacks the power management capabilities of the more recent Optane M.2 drives.
AnandTech Storage Bench - Heavy
Our Heavy storage benchmark is proportionally more write-heavy than The Destroyer, but much shorter overall. The total writes in the Heavy test aren't enough to fill the drive, so performance never drops down to steady state. This test is far more representative of a power user's day to day usage, and is heavily influenced by the drive's peak performance. The Heavy workload test details can be found here. As noted in the test procedures, the usual full-drive runs were omitted for this review; instead, each caching configuration was run three times in a row to check for cache warm-up effects.
The 118GB Optane SSD 800P is the only cache module large enough to handle the entirety of the Heavy test, with a data rate that is comparable to running the test on the SSD as a standalone drive. The smaller Optane Memory drives do offer significant performance increases over the hard drive, but not enough to bring the average data rate up to the level of a good SATA SSD.
The 64GB Optane Memory M10 offers similar latency to the 118GB Optane SSD 800P when both are treated as standalone drives. In a caching setup the cache misses have a big impact on average latency and a bigger impact on 99th percentile latency, though even the 32GB cache still outperforms the bare hard drive on both metrics.
The average read latency scores show a huge disparity between standalone Optane SSDs and the hard drive. The 118GB cache performs almost as well as the standalone Optane drives while the 64GB cache averages a bit worse than the Crucial MX500 SATA SSD and the 32GB cache averages about half the latency of the bare hard drive.
On the write side, the Optane M.2 modules don't perform anywhere near as well as the Optane SSD 900P, and the 32GB module has worse average write latency than the Crucial MX500. In caching configurations, the 118GB Optane SSD 800P has about twice the average write latency of the 900P while the smaller cache configurations are worse off than the SATA SSD.
The 99th percentile read and write latency scores rank about the same as the average latencies, but the impact of an undersized cache is much larger here. With 99th percentile read and write latencies in the tens of milliseconds, the 32GB and 64GB caches won't save you from noticeable stuttering.
AnandTech Storage Bench - Light
Our Light storage test has relatively more sequential accesses and lower queue depths than The Destroyer or the Heavy test, and it's by far the shortest test overall. It's based largely on applications that aren't highly dependent on storage performance, so this is a test more of application launch times and file load times. This test can be seen as the sum of all the little delays in daily usage, but with the idle times trimmed to 25ms it takes less than half an hour to run. Details of the Light test can be found here. As with the ATSB Heavy test, the usual full-drive runs were omitted; each caching configuration was instead run three times in a row.
The data rates on the Light test show clear signs of a cold cache on the first run, with substantially improved performance for the second and third runs. The 32GB cache module is still a bit small for this test and it can only bring the data rates up to about the level of a SATA SSD, but the 64GB and 118GB modules allow for performance that almost matches low-end NVMe SSDs like the MyDigitalSSD SBX (and without the capacity limitations or steep performance drop when full).
With a warmed-up cache, the Optane Memory M10 64GB and the larger Optane SSD 800P offer better average and 99th percentile latency than SATA SSDs. The 118GB cache beats the SATA drives even with a cold cache. The 32GB Optane Memory is well behind the SATA SSD even with a warm cache, especially for 99th percentile latency. But even so, all of these cache configurations easily beat running on just a hard drive.
The effects of a cold vs. warm cache show up quite clearly on the average read latency chart, but naturally have minimal effect on the average write latencies. It is clear that the 32GB Optane Memory's overall latency fell behind that of the SATA SSD almost entirely because of poor write performance: with a warm cache, the read latency of the 32GB module is slower than that of its larger siblings but is still an improvement over the SATA SSD.
The 99th percentile read latency scores emphasize the impact of a cold cache more than the average latency, especially for the 64GB cache module. Even the 118GB cache lags behind the SATA SSD on the first run. The 99th percentile write latencies are larger in absolute terms than the average write latencies, but the relative differences are almost all the same except that the hard drive stands out even more.
Random Read Performance
Our first test of random read performance uses very short bursts of operations issued one at a time with no queuing. The drives are given enough idle time between bursts to yield an overall duty cycle of 20%, so thermal throttling is impossible. Each burst consists of a total of 32MB of 4kB random reads, from a 16GB span of the disk. The total data read is 1GB.
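For reference, the burst test's parameters work out as follows; the helper function is a hypothetical illustration of how idle time follows from a target duty cycle, since the review only states the resulting 20% figure.

```python
# Arithmetic behind the burst random read test described above.
KIB, MIB, GIB = 1024, 1024**2, 1024**3

ops_per_burst = (32 * MIB) // (4 * KIB)   # 8192 reads of 4 kB per burst
num_bursts = (1 * GIB) // (32 * MIB)      # 32 bursts to transfer 1 GB

def idle_time_for_duty_cycle(busy_s, duty_cycle):
    """Idle time after a burst so that busy/(busy+idle) == duty_cycle."""
    return busy_s * (1.0 / duty_cycle - 1.0)

# A burst taking 0.2 s of I/O gets 0.8 s of idle at a 20% duty cycle,
# leaving the drive ample time to cool between bursts.
```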
The M.2 Optane modules offer the fastest burst random read speeds when tested as standalone drives, but Intel's caching system imposes substantial overhead. Even with that overhead, the random read performance is far above any solution that doesn't involve 3D XPoint memory. As in past reviews, we find that the Optane Memory/Optane SSD 800P has a slight advantage here over the top of the line Optane SSD 900P.
Our sustained random read performance is similar to the random read test from our 2015 test suite: queue depths from 1 to 32 are tested, and the average performance and power efficiency across QD1, QD2 and QD4 are reported as the primary scores. Each queue depth is tested for one minute or 32GB of data transferred, whichever is shorter. After each queue depth is tested, the drive is given up to one minute to cool off so that the higher queue depths are unlikely to be affected by accumulated heat build-up. The individual read operations are again 4kB, and cover a 64GB span of the drive.
The sustained random read test covers a larger span of the drive, and the 32GB and 64GB modules are not large enough to cache the entire dataset plus the necessary cache management metadata, leaving them with performance close to that of the hard drive. The 118GB cache is sufficient to contain the full data set for this test, and its performance is below that of the Optane drives tested as standalone drives, but still out of reach of flash-based storage.
The random read performance scaling of the Optane Memory and 800P drives is rather uneven at higher queue depths, but they do still reach very high throughput. The 118GB cache configuration doesn't scale to higher queue depths as well as the standalone SSD configuration, and the 900P hits a wall at a far lower performance level than it should based on our Linux benchmarking.
Random Write Performance
Our test of random write burst performance is structured similarly to the random read burst test, but each burst is only 4MB and the total test length is 128MB. The 4kB random write operations are distributed over a 16GB span of the drive, and the operations are issued one at a time with no queuing.
On the burst random write test, the larger two caching configurations perform far above what any standalone drive delivers under Windows. The 32GB Optane Memory module also scores better when used as a cache than as a standalone SSD. It is possible that Intel's caching software is also using a RAM cache and is lying to the benchmark software about whether the writes have actually made it onto non-volatile storage. However, the performance here is not actually beyond what NVMe SSDs deliver when we test them under Linux, so it's also possible that there are simply some much-needed fast paths in Intel's drivers.
As with the sustained random read test, our sustained 4kB random write test runs for up to one minute or 32GB per queue depth, covering a 64GB span of the drive and giving the drive up to 1 minute of idle time between queue depths to allow for write caches to be flushed and for the drive to cool down.
The sustained random write test covers more data than can be cached on the 64GB Optane Memory M10, so it and the 32GB cache module fall far behind mainstream SATA SSDs. The standalone Optane SSDs continue to offer great performance, and the 118GB Optane SSD 800P as a cache device tops the chart.
For the one configuration with a cache large enough to handle this test, performance scales up much sooner than in the standalone SSD configuration: QD2 gives almost the full random write speed. When the cache is too small, increasing queue depth just makes performance worse.
Sequential Read Performance
Our first test of sequential read performance uses short bursts of 128MB, issued as 128kB operations with no queuing. The test averages performance across eight bursts for a total of 1GB of data transferred from a drive containing 16GB of data. Between each burst the drive is given enough idle time to keep the overall duty cycle at 20%.
The burst sequential read results are bizarre, with the 32GB caching configuration coming in second only to the Optane SSD 900P while the large Optane M.2 modules perform much worse as cache devices than as standalone drives. The caching performance from the 64GB Optane Memory M10 is especially disappointing, with less than a third of the performance that the drive delivers as a standalone device. Some SSD caching software attempts to have sequential I/O bypass the cache to leave the SSD ready to handle random I/O, but this test is not a situation where such a strategy would make sense. Without more documentation from Intel about their proprietary caching algorithms and with no way to query the Optane Memory drivers about the cache status, it's hard to figure out what's going on here. Aside from the one particularly bad result from the M10 as a cache, all of the Optane configurations do at least score far above the SATA SSD.
Our test of sustained sequential reads uses queue depths from 1 to 32, with the performance and power scores computed as the average of QD1, QD2 and QD4. Each queue depth is tested for up to one minute or 32GB transferred, from a drive containing 64GB of data.
The sustained sequential read test results make more sense. The 32GB cache configuration isn't anywhere near large enough for this test's 64GB dataset, but the larger Optane M.2 modules offer good performance as standalone drives or as cache devices. The 64GB Optane Memory M10 scores worse as a cache drive, which is to be expected since the test's dataset doesn't quite fit in the cache.
Using a 118GB Optane M.2 module as a cache seems to help with sequential reads at QD1, likely due to some prefetching in the caching software. The 64GB cache handles the sustained sequential read workload better than either of the sustained random I/O tests, but it is still slower than the SSD alone at low queue depths. Performance from the 32GB cache is inconsistent but usually still substantially better than the hard drive alone.
Sequential Write Performance
Our test of sequential write burst performance is structured identically to the sequential read burst performance test save for the direction of the data transfer. Each burst writes 128MB as 128kB operations issued at QD1, for a total of 1GB of data written to a drive containing 16GB of data.
As with the random write tests, the cache configurations show higher burst sequential write performance than testing the Optane M.2 modules as standalone SSDs. This points to driver improvements that may include mild cheating through the use of a RAM cache, but the performance gap is small enough that there doesn't appear to be much, if any, data put at risk. The 64GB and 118GB caches have similar performance with the 64GB slightly ahead, but the 32GB cache barely keeps up with a SATA drive.
Our test of sustained sequential writes is structured identically to our sustained sequential read test, save for the direction of the data transfers. Queue depths range from 1 to 32 and each queue depth is tested for up to one minute or 32GB, followed by up to one minute of idle time for the drive to cool off and perform garbage collection. The test is confined to a 64GB span of the drive.
The rankings on the sustained sequential write test are quite similar, but this time the 118GB Optane SSD 800P has the lead over the 64GB Optane Memory M10. The performance advantage of the caching configurations over the standalone drive performance is smaller than for the burst sequential write test, because this test writes far more data than could be cached in RAM.
Aside from some differences at QD1, the Optane M.2 modules offer basically the same performance when used as caches or as standalone drives. Since this test writes no more than 32GB at a time without a break and all of the caches tested are that size or larger, the caching software can always stream all of the writes to just the Optane module without having to stop and flush dirty data to the slower hard drive. If this test were lengthened to write more than 32GB at a time or if it were run on the 16GB Optane Memory, performance would plummet partway through each phase of the test.
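The reasoning above can be captured in a toy model: writes stream at Optane speed until the cache fills with dirty data, after which throughput falls to hard drive speed. The throughput figures below are illustrative placeholders, not measured values.

```python
# Toy model of a write burst through an Optane cache: writes stream at
# cache speed until the cache is full of dirty data, then fall back to
# hard drive speed. Speeds here are illustrative, not measured.
def streamed_write_time(total_gb, cache_gb, fast_mbps=600.0, slow_mbps=150.0):
    """Seconds to absorb total_gb of writes through a cache_gb cache."""
    fast_gb = min(total_gb, cache_gb)     # absorbed at cache speed
    slow_gb = total_gb - fast_gb          # spills through to the hard drive
    return fast_gb * 1024 / fast_mbps + slow_gb * 1024 / slow_mbps
```

In this model a 32GB burst through a 32GB cache never slows down, while a 64GB burst through the same cache spends most of its time at hard drive speed, which is why performance would plummet partway through a lengthened test.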
Mixed Random Performance
Our test of mixed random reads and writes covers mixes varying from pure reads to pure writes at 10% increments. Each mix is tested for up to 1 minute or 32GB of data transferred. The test is conducted with a queue depth of 4, and is limited to a 64GB span of the drive. In between each mix, the drive is given idle time of up to one minute so that the overall duty cycle is 50%.
Unsurprisingly, the mixed random I/O test produces abysmal performance from the hard drive and from the two cache configurations where the cache is too small for this test. The 118GB Optane SSD 800P is more cache than this test needs, and it performs almost as well as the Optane SSD 900P.
When used as a cache for this test, the largest Optane SSD 800P shows slightly different performance characteristics than when it is treated as a standalone drive, but in either case it is a strong performer across the board. The smaller Optane drives aren't large enough to cache the entire working set of this test and can't do much to improve performance over the hard drive.
Mixed Sequential Performance
Our test of mixed sequential reads and writes differs from the mixed random I/O test by performing 128kB sequential accesses rather than 4kB accesses at random locations, and the sequential test is conducted at queue depth 1. The range of mixes tested is the same, and the timing and limits on data transfers are also the same as above.
All of the Optane configurations easily outperform the SATA drives on the mixed sequential I/O test. The 64GB and 118GB modules are tied when tested as standalone drives and close when tested as cache devices, and the cache performance is 30-40% faster than the standalone SSD performance. The 32GB module is substantially slower and performance is much closer between caching and standalone SSD use.
The performance improvements in the caching configurations over the standalone drive configurations generally apply throughout the mixed sequential test. The main exception is in the early phases of the test with the 32GB cache, where cache performance falls far short of the standalone drive performance. Once the proportion of reads has dropped to 70%, the cache configuration comes out ahead.
Conclusion: Going for a Data Cache
At the hardware level, the Optane Memory M10 is a simple and straightforward update to the original Optane Memory, bringing a new capacity that some users will appreciate, and power management that was sorely lacking from the original. On the software side, the experience is quite similar to the first iteration of Optane Caching that Intel released a year ago with their Kaby Lake platform. The only big user-visible change is the ability to cache non-boot drives. This corrects the other glaring omission, giving the Optane Memory system an overall impression of being a more mature product that Intel is taking seriously.
Optane Memory caching now seems to be limited primarily by the fundamental nature of SSD caching: not everything can fit in the cache. Occasional drops down to hard drive performance are more frequent and more noticeable than when a flash-based SSD's SLC write cache fills up, or when an M.2 SSD starts thermally throttling. Neither of those happens very often in real-world use, but cache misses are still inevitable.
The latency from a flash-based SSD tends to grow steadily as the workload gets more intense, up to the point that there isn't enough idle time for garbage collection. The performance of most modern flash-based SSDs degrades gracefully until the SLC cache fills up. The latency distribution from an Optane+hard drive cache setup looks very different: latency is excellent until you try to read the wrong block, and then you have to wait just as long as you would in a hard drive-only setup.
The key to making SSD caching work well for desktop use is thus to ensure that the cache is big enough for the workload. For relatively light workloads, the 32GB Optane Memory is often sufficient, and even the $25 16GB module that we haven't tested should offer a noticeable improvement in system responsiveness. For users with heavier workloads, the larger 64GB Optane Memory M10 and even the 118GB Optane SSD 800P may not be big enough, and the price starts getting close to that of a good and reasonably large SATA SSD.
For power users, the data drive acceleration mode is more appealing. A gamer might want to use a 256GB or 512GB SATA SSD for the OS and most programs and documents, but would need a 1TB or larger drive for their entire Steam library. A 1TB 7200RPM hard drive plus the 64GB Optane Memory M10 or the 58GB Optane SSD 800P is cheaper than a good 1TB SATA SSD, and the cache is large enough to hold one or two games. A 2TB hard drive plus the 118GB Optane SSD 800P can cache even the largest of AAA games and is no more expensive than the cheapest 2TB SATA drive. For capacities beyond that, caching only gets more appealing.
Our SYSmark testing showed that for many common tasks, adding even a 32GB cache to a hard drive can bring performance up to the level of an SSD-only configuration. There are a lot of lightweight everyday workloads that can fit well in such a cache, and for those users the larger 64GB Optane Memory M10 doesn't bring worthwhile performance improvements over the 32GB Optane Memory.
On the other hand, it is clear that no amount of fast storage can make up for a system crippled by too little RAM, which is a disappointment in a time when SSDs are getting cheaper but RAM prices are still climbing. Optane SSDs may be the fastest swap devices money can buy, but they're no substitute for having adequate RAM. The 4GB low-end configuration we tested is simply not enough anymore, and for future storage caching tests we will consider 8GB as the absolute minimum requirement before any storage performance upgrades should be considered.
Our synthetic benchmarks of Intel's Optane Memory caching confirmed the most predictable effects of cache size compared to working set size, but didn't reveal many nuances of Intel's cache management strategies. There is clearly some overhead relative to accessing just the SSD, but not enough to eliminate the fundamental performance advantages of 3D XPoint memory. There also appears to be some write caching and combining done with system RAM, trading a bit of safety for improved write performance beyond even what the Optane SSDs alone can handle. Whether it's advertised or not, this tends to be a feature of almost every third-party add-on software for storage acceleration. It's the simplest way to improve storage benchmark numbers and the tradeoffs are quite acceptable to many users.
The Optane Memory caching seems to be quite responsive to changes in usage patterns. One launch of an application is sufficient to bring its data into the cache, and Intel isn't shy about sending writes to the cache. It doesn't appear that the Optane Memory caching system does anything significant to reduce wear on the cache device, so Intel seems confident that these cache devices have plenty of write endurance.
Intel Optane Product Lineup

| Capacity | Drives |
|----------|--------|
| 16 GB | Optane Memory (M.2) $24.99 ($1.56/GB) |
| 32 GB | Optane Memory (M.2) $58.91 ($1.84/GB) |
| 58 GB (64 GB) | Optane SSD 800P (M.2) $111.48 ($1.92/GB); Optane Memory M10 $144 ($2.48/GB) |
| 118 GB | Optane SSD 800P (M.2) $196.43 ($1.66/GB) |
| 280 GB | Optane SSD 900P (AIC, U.2) $354.99 ($1.27/GB) |
| 480 GB | Optane SSD 900P (AIC) $544.99 ($1.14/GB); Optane SSD 905P (U.2) $599.00 ($1.25/GB) |
| 960 GB | Optane SSD 905P (AIC) $1299.00 ($1.35/GB) |
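The per-gigabyte figures in the table can be reproduced directly from the listed prices. One wrinkle worth noting: the 64GB-branded Optane Memory M10 is computed over its 58GB usable capacity (the same 58,977,157,120 bytes as the 58GB 800P), which is why it comes out to $2.48/GB rather than $2.25/GB.

```python
# Recompute the price-per-GB column from the prices quoted in the table.
# (price in USD, capacity in GB used for the calculation)
lineup = {
    "Optane Memory 16GB":     (24.99, 16),
    "Optane Memory 32GB":     (58.91, 32),
    "Optane SSD 800P 58GB":   (111.48, 58),
    "Optane Memory M10 64GB": (144.00, 58),   # 58GB usable capacity
    "Optane SSD 800P 118GB":  (196.43, 118),
    "Optane SSD 900P 280GB":  (354.99, 280),
    "Optane SSD 900P 480GB":  (544.99, 480),
    "Optane SSD 905P 480GB":  (599.00, 480),
    "Optane SSD 905P 960GB":  (1299.00, 960),
}

def price_per_gb(price: float, capacity_gb: int) -> float:
    return round(price / capacity_gb, 2)

for name, (price, cap) in lineup.items():
    print(f"{name}: ${price_per_gb(price, cap):.2f}/GB")
```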
Intel's 3D XPoint memory is clearly a far better medium for small cache drives than NAND flash, but it is still afflicted by a very high price per GB. Only the 16GB Optane Memory at $25 seems like an easy purchase to make, but it is small enough that its performance potential is much more limited than the larger Optane products. The 64GB Optane Memory M10 is expensive enough that skipping caching altogether and going with just a SATA SSD has to be seriously considered, even when shopping for 1 or 2 TB of storage. In spite of the power management the Optane Memory M10 adds over the original cache module, a hard drive plus a cache module still doesn't seem to make sense for mobile use. Optane prices need to come down faster than NAND prices in order for this caching strategy to gain wide acceptance. This doesn't seem likely to happen, so Optane Memory will remain a niche solution—but that niche is definitely not as small as it was when Optane Memory was first introduced.