Original Link: https://www.anandtech.com/show/13704/enterprise-ssd-roundup-intel-samsung-memblaze
The Enterprise TLC Storage Era Begins: Rounding Up 13 SSDs With Samsung, Intel, and Memblaze
by Billy Tallis on January 3, 2019 9:45 AM EST

2018 was a busy year for consumer SSDs. With all the NAND flash manufacturers now shipping high-quality 3D NAND in volume, we've seen more competition than ever, and huge price drops. NVMe is starting to go mainstream and Samsung is no longer sitting atop that market segment unchallenged. But not all of the interesting SSD advancements have been in the consumer realm. We've reported on new datacenter SSD form factors, the introduction of QLC NAND and enterprise SSDs with staggering capacities, but we haven't been publishing much in the way of traditional product reviews for enterprise/datacenter SSDs.
Tackling the Enterprise SSD Market
Today we're looking at several recent models that cover the wide range of enterprise SSDs, from entry-level SATA drives based on the same hardware as mainstream consumer SSDs, up to giants that deliver 1M IOPS over a PCIe x8 connection while pulling more than 20W.
No two models in this review are direct competitors against each other, but together these drives represent almost every important product segment for enterprise/datacenter SSDs. The selection reflects recent trends: MLC NAND is basically gone from the market, replaced by mature 3D TLC NAND that now offers plenty of endurance and performance when aggregated into the multi-TB drives that are most cost-effective.
For the few niches that still require the highest endurance and lowest latency that money can buy, alternative memories like Intel 3D XPoint and Samsung Z-NAND are filling the gap left by the disappearance of traditional MLC and SLC NAND flash. SAS SSDs still exist, but their relevance is waning now that NVMe SSDs have matured to the point of offering not just higher performance, but all the same reliability and management features that are standard in the SAS ecosystem.
In the consumer SSD market, there's not a big price gap between the cheapest NVMe SSDs and the fastest (NAND-based) NVMe SSDs, so it's difficult to divide the consumer NVMe market into multiple product segments that each contain reasonable products. The enterprise SSD market doesn't have this problem. There's a clear distinction between the lower-power NVMe drives that usually share a controller platform with a consumer product, and the high-end drives that use enterprise-only 16 to 32-channel controllers.
Probing Samsung, Intel, and Memblaze
The table below lists the drives that are the subject of this review. Three of the four new models Samsung announced during Q3 2018 are included (two SATA and one NVMe); the fourth, the 983 ZET Z-SSD, will be reviewed later. The Intel DC P4510 mid-range NVMe SSD and Optane DC P4800X have already been reviewed on AnandTech, but are being revisited now that we have more relevant drives to compare against. The Memblaze PBlaze5 is the most powerful flash-based SSD we have tested, and is a good representative of the top-tier enterprise SSDs that don't get sampled for review very often.
Not shown in the table but included in the review is the older Samsung PM863 SATA drive, which helps illustrate how things have progressed since the days when Samsung had a monopoly on 3D NAND. We do have several other enterprise SSDs on hand, but they're all either in service as boot drives for various testbeds, or are old planar NAND drives, so they're not included in this review.
Reviewed Models Overview

| Model | SATA/PCIe | Form Factor | Capacities | Memory | Write Endurance |
|---|---|---|---|---|---|
| Samsung 860 DCT | SATA | 2.5" 7mm | 960 GB, 1920 GB, 3840 GB | 64L 3D TLC | 0.2 DWPD |
| Samsung 883 DCT | SATA | 2.5" 7mm | 240 GB, 480 GB, 960 GB, 1.92 TB, 3.84 TB | 64L 3D TLC | 0.8 DWPD |
| Samsung 983 DCT | PCIe 3.0 x4 | 2.5" 7mm U.2 | 960 GB, 1.92 TB | 64L 3D TLC | 0.8 DWPD |
| Samsung 983 DCT | PCIe 3.0 x4 | M.2 22110 | 960 GB, 1.92 TB | 64L 3D TLC | 0.8 DWPD |
| Intel DC P4510 | PCIe 3.0 x4 | 2.5" 15mm U.2 | 1 TB, 2 TB, 4 TB, 8 TB | 64L 3D TLC | 0.7–1.1 DWPD |
| Intel Optane DC P4800X | PCIe 3.0 x4 | HHHL AIC | 375 GB, 750 GB, 1.5 TB | 3D XPoint | 60 DWPD |
| Memblaze PBlaze5 D900 | PCIe 3.0 x4 | 2.5" 15mm U.2 | 2 TB, 3.2 TB, 4 TB, 8 TB | 32L 3D TLC | 3 DWPD |
| Memblaze PBlaze5 C900 | PCIe 3.0 x8 | HHHL AIC | 2 TB, 3.2 TB, 4 TB, 8 TB | 32L 3D TLC | 3 DWPD |
To determine where an enterprise SSD fits into the competitive landscape, the first specifications to look at are the form factor, interface, and write endurance rating.
- Interface: As with consumer SSDs, in the enterprise storage market SATA drives are generally cheaper and much slower than NVMe drives. But in a server, the performance differences between SATA and NVMe can have a much larger impact on overall system performance than with an interactive desktop workload. SATA product lines usually top out at 4TB or 8TB; beyond that, the SATA interface bottleneck leads to unacceptably low performance-per-TB ratios. NVMe product lines usually start around 1TB and go up from there, because smaller capacities would not be able to make good use of PCIe bandwidth.
- Form Factor: Smaller form factors like M.2 and 7mm thick 2.5" drives cannot draw as much power as a PCIe add-in card, and this often leads to the physically smaller drives offering lower performance. The largest 16 and 32-channel SSD controllers cannot even fit on a 22mm wide M.2 card. This is why newer standards are coming into the market, like Samsung's NF1 and Intel's Ruler.
- Write Endurance: Endurance ratings indicate what kind of workload a drive is intended for. Drives with higher endurance ratings also have higher write performance, especially random write performance. Ratings of less than one drive write per day usually indicate a drive that is intended for a relatively read-heavy workload. As little as 0.1 or 0.2 DWPD can be plenty for a use case like a streaming video CDN node, which sees little write volume, and what writes it does handle are mostly sequential. Drives with ratings of 5 DWPD or more are becoming rare, except for specialty models like Intel's Optane SSDs and Samsung's Z-SSDs.
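As a rough sanity check on what these ratings mean in practice, here is a minimal Python sketch (not from the article) that converts a DWPD rating into a total-bytes-written figure over the warranty period:

```python
def dwpd_to_tbw(dwpd: float, capacity_gb: float, warranty_years: float = 5) -> float:
    """Terabytes that may be written over the warranty period at the rated DWPD."""
    return dwpd * capacity_gb * 365 * warranty_years / 1000

# Example: a 3.84 TB drive rated for 0.2 DWPD over a 5-year warranty
print(dwpd_to_tbw(0.2, 3840))  # ~1402 TB, close to the 1396 TB rating on the 860 DCT
```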
Based on these guidelines, we can expect the Intel SSD DC P4510 to be a small step up from the Samsung 983 DCT, in price and power consumption if not performance. The Memblaze PBlaze5 D900 is another step up with its higher write endurance rating, and the PBlaze5 C900 is clearly the beefiest NAND-based drive in this bunch with its PCIe 3 x8 interface.
Test System
Intel provided our enterprise SSD test system, one of their 2U servers based on the Xeon Scalable platform (codenamed Purley). The system includes two Xeon Gold 6154 18-core Skylake-SP processors, and 16GB DDR4-2666 DIMMs on all twelve memory channels for a total of 192GB of DRAM. Each of the two processors provides 48 PCI Express lanes plus a four-lane DMI link. The allocation of these lanes is complicated. Most of the PCIe lanes from CPU1 are dedicated to specific purposes: the x4 DMI plus another x16 link go to the C624 chipset, and there's an x8 link to a connector for an optional SAS controller. This leaves CPU2 providing the PCIe lanes for most of the expansion slots, including most of the U.2 ports.
Enterprise SSD Test System

| Component | Details |
|---|---|
| System Model | Intel Server R2208WFTZS |
| CPU | 2x Intel Xeon Gold 6154 (18C, 3.0 GHz) |
| Motherboard | Intel S2600WFT |
| Chipset | Intel C624 |
| Memory | 192 GB total, Micron DDR4-2666 16 GB modules |
| Software | Linux kernel 4.19.8, fio 3.12 |

Thanks to StarTech for providing a RK2236BKF 22U rack cabinet.
The enterprise SSD test system and most of our consumer SSD test equipment are housed in a StarTech RK2236BKF 22U fully-enclosed rack cabinet. During testing for this review, the front door on this rack was generally left open to allow better airflow, since the rack doesn't include exhaust fans of its own. The rack is currently installed in an unheated attic and it's the middle of winter, so this setup provided a reasonable approximation of a well-cooled datacenter.
The test system is running a Linux kernel from the most recent long-term support branch. This brings in the latest Meltdown/Spectre mitigations, though strategies for dealing with Spectre-style attacks are still evolving. The benchmarks in this review are all synthetic benchmarks, with most of the IO workloads generated using FIO. Server workloads are too widely varied for it to be practical to implement a comprehensive suite of application-level benchmarks, so we instead try to analyze performance on a broad variety of IO patterns.
Enterprise SSDs are specified for steady-state performance and don't include features like SLC caching, so the duration of a benchmark run doesn't have much effect on the score, provided the drive has been thoroughly preconditioned. Except where otherwise specified, for our tests that include random writes, the drives were prepared with at least two full drive writes of 4kB random writes. For all the other tests, the drives were prepared with at least two full sequential write passes.
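For reference, a preconditioning pass along the lines described above could be scripted roughly as follows; this is a sketch under assumptions (the device path, queue depth and exact fio options are ours, not the article's actual scripts):

```python
import subprocess

def precondition_random(dev: str = "/dev/nvme0n1") -> None:
    """Write two full drives' worth of 4kB random writes before random-write tests."""
    subprocess.run([
        "fio", "--name=precondition",
        f"--filename={dev}",
        "--direct=1", "--ioengine=libaio",
        "--rw=randwrite", "--bs=4k", "--iodepth=32",
        "--size=100%", "--loops=2",      # two passes over the drive's capacity
        "--group_reporting",
    ], check=True)

precondition_random()
```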
Our drive power measurements are conducted with a Quarch XLC Programmable Power Module. This device supplies power to drives and logs both current and voltage simultaneously. With a 250kHz sample rate and precision down to a few mV and mA, it provides a very high resolution view into drive power consumption. For most of our automated benchmarks, we are only interested in averages over time spans on the order of at least a minute, so we configure the power module to average together its measurements and only provide about eight samples per second, but internally it is still measuring at 4µs intervals so it doesn't miss out on short-term power spikes.
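To illustrate the effect of that configuration, here is a tiny, purely illustrative Python/NumPy sketch (synthetic numbers, not Quarch data) of reducing a high-rate power trace to a handful of averaged samples per second; because each reported value is an average of the raw samples, brief spikes still contribute to it:

```python
import numpy as np

SAMPLE_RATE = 250_000   # raw samples per second (4 µs intervals)
REPORT_RATE = 8         # averaged samples per second actually logged
chunk = SAMPLE_RATE // REPORT_RATE

raw_watts = np.random.normal(2.5, 0.2, SAMPLE_RATE * 10)    # fake 10-second trace
raw_watts = raw_watts[: len(raw_watts) // chunk * chunk]
reported = raw_watts.reshape(-1, chunk).mean(axis=1)         # ~8 values per second

print(len(reported), reported.mean())
```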
Quarch provides a variety of power injection adapters for use with different drive form factors and connectors. For this review, we used two different adapters: one for U.2/SAS/SATA drives that delivers power over a thin ribbon cable that can fit between drives in a hot-swap bay, and a PCIe add-in card riser. For the Samsung 983 DCT M.2, we used a PCIe to M.2 adapter board that includes a 12V to 3.3V converter, and our measurements have been adjusted to account for the overhead of that converter. This is necessary because the 983 DCT M.2 is rated to draw a peak of 3.8A, and it is very difficult to deliver that much current at only 3.3V without serious voltage drop along the cable. Newer Quarch power modules include upgrades to improve high-current delivery and add automatic compensation for voltage drop along the cable, but we haven't had the chance to upgrade yet.
Samsung's Latest Datacenter SSDs
In the latter half of 2018, Samsung launched several new enterprise/datacenter SSD product lines. This is part of a new strategy to make its enterprise storage products more accessible to smaller customers. The 860 DCT, 883 DCT, 983 DCT and 983 ZET all use hardware that Samsung was already selling under different names, but are now being sold by retailers in individual quantities. We'll be testing the 983 ZET next month, but the other three new models have already survived our testing.
Samsung 860 DCT
The Samsung 860 DCT uses the same hardware as the 860 EVO consumer SSD, but with enterprise-oriented firmware. SLC caching is out, and the firmware is optimized to provide consistent sustained performance, even at the cost of far lower peak write speeds and 4% less usable capacity due to increased overprovisioning. The 860 DCT also doesn't offer the idle power saving modes that consumer drives feature. The 860 DCT is an entry-level server drive that lacks many features traditionally associated with enterprise SSDs, most notably power loss protection capacitors. Like consumer SSDs, the 860 DCT operates with a volatile write cache, though this can be disabled at further cost to write performance.
Samsung 860 DCT Specifications

| Capacity | 960 GB | 1.92 TB | 3.84 TB |
|---|---|---|---|
| Controller | Samsung MJX | | |
| Form Factor | 2.5" 7mm SATA | | |
| NAND Flash | Samsung 64-layer 3D TLC | | |
| DRAM | 1 GB LPDDR4 | 2 GB LPDDR4 | 4 GB LPDDR4 |
| Sequential Read | 550 MB/s | | |
| Sequential Write | 520 MB/s | | |
| Random Read | 98k IOPS | | |
| Random Write | 19k IOPS | | |
| Power (Read / Write / Idle) | 1.9 W / 2.9 W / 1.05 W | | |
| Write Endurance | 349 TB (0.2 DWPD) | 698 TB (0.2 DWPD) | 1396 TB (0.2 DWPD) |
| Warranty | 5 years | | |
The 860 DCT is intended for use primarily on very read-heavy workloads, with a very low volume of writes that preferably would be mostly sequential – for example, serving up streaming video. This is reflected by the write endurance rating of just 0.2 drive writes per day. Samsung doesn't intend for the 860 DCT to compete against any of their existing enterprise SSDs. Instead, they are targeting use cases where customers are currently using consumer-grade SSDs or mechanical hard drives.
This is pretty much the same description that everyone is giving for their first enterprise QLC SSDs, but the 860 DCT is still a TLC-based drive at this point. We expect its successor to switch to QLC NAND. Samsung says their current MJX controller can support 8TB SSDs, but so far they haven't gone beyond 4TB. With NAND and DRAM prices set to keep declining through 2019, it probably won't be long before they decide to max out the MJX controller.
The labeling on the 860 DCT differs slightly from Samsung's retail consumer SSDs, but inside we find exactly the same PCB, with components that differ only in date codes and lot numbers. Our 3.84 TB 860 DCT sample uses eight NAND packages, each containing eight of Samsung's 512Gb 64-layer 3D TLC V-NAND dies, for a total raw capacity of 4 TiB.
Samsung 883 DCT
The Samsung 883 DCT is a more traditional enterprise SATA SSD, with the power loss protection capacitors that the 860 DCT lacks. The controller and NAND are still the same as in the 860 DCT and consumer 860 EVO. Random write performance is a bit better than the 860 DCT and the write endurance rating jumps up to 0.8 DWPD, a fairly mainstream value for enterprise drives intended to be used with read-heavy workloads.
The 883 DCT product line extends all the way down to 240GB where the 860 DCT family starts at 960GB. Samsung didn't sample the two smallest capacities of 883 DCT, but they are most likely to be used in applications that aren't performance-critical, such as OS boot drives. Samsung uses their smaller 256Gb TLC dies on the 960GB and smaller models, while the 1.92TB and 3.84TB models use the 512Gb TLC dies.
Samsung 883 DCT Specifications

| Capacity | 240 GB | 480 GB | 960 GB | 1.92 TB | 3.84 TB |
|---|---|---|---|---|---|
| Controller | Samsung MJX | | | | |
| Form Factor | 2.5" 7mm SATA | | | | |
| NAND Flash | Samsung 256Gbit 64L 3D TLC | | | Samsung 512Gbit 64L 3D TLC | |
| DRAM | 512 MB / 1 GB / 2 GB / 4 GB LPDDR4, scaling with capacity | | | | |
| Sequential Read | 550 MB/s | | | | |
| Sequential Write | 520 MB/s | | | | |
| Random Read | 98k IOPS | | | | |
| Random Write | 14k IOPS | 24k IOPS | 25k IOPS | 25k IOPS | 28k IOPS |
| Power (Read / Write / Idle) | 3.6 W / 2.3 W / 1.3 W | | | | |
| Write Endurance | 341 TB (0.8 DWPD) | 683 TB (0.8 DWPD) | 1366 TB (0.8 DWPD) | 2733 TB (0.8 DWPD) | 5466 TB (0.8 DWPD) |
| Warranty | 5 years | | | | |
The addition of power loss protection capacitors doesn't require Samsung to adopt a larger PCB than the 860 DCT/860 EVO hardware, but it does give a more crowded layout. The 3.84 TB 883 DCT only has six large surface-mount capacitors plus empty pads on the back for two more, but there are enough other small components added that the large NAND, DRAM and controller packages had to be rearranged.
Samsung 983 DCT
The Samsung 983 DCT is an entry-level enterprise NVMe SSD. It has the same 0.8 drive writes per day endurance rating as the 883 DCT SATA drive, but boasts much higher performance. The 983 DCT uses the Samsung Phoenix controller that we are familiar with from the consumer 970 EVO and 970 PRO, and the OEM client PM981 SSDs. As usual, SLC caching is not implemented on the enterprise products, so write performance is substantially lower than what the consumer 970 EVO advertises, but read performance is similar.
The 983 DCT is currently offered in only two capacities, 960 GB and 1.92 TB, but customers also have their choice of M.2 or 2.5" U.2 form factors.
Samsung 983 DCT Specifications

| | 960 GB (U.2) | 1.92 TB (U.2) | 960 GB (M.2) | 1.92 TB (M.2) |
|---|---|---|---|---|
| Form Factor | 2.5" 7mm U.2 | 2.5" 7mm U.2 | M.2 22110 | M.2 22110 |
| Controller | Samsung Phoenix | | | |
| Interface, Protocol | PCIe 3.0 x4, NVMe 1.2b | | | |
| NAND Flash | Samsung 256Gbit 64L 3D TLC | | | |
| DRAM | 1.5 GB LPDDR4 | 3 GB LPDDR4 | 1.5 GB LPDDR4 | 3 GB LPDDR4 |
| Sequential Read | 3000 MB/s | | | |
| Sequential Write | 1050 MB/s | 1900 MB/s | 1100 MB/s | 1400 MB/s |
| Random Read | 400k IOPS | 540k IOPS | 400k IOPS | 480k IOPS |
| Random Write | 40k IOPS | 50k IOPS | 38k IOPS | 42k IOPS |
| Power (Read) | 8.7 W | 8.7 W | 7.6 W | 7.6 W |
| Power (Write) | 10.6 W | 10.6 W | 8.0 W | 8.0 W |
| Power (Idle) | 4.0 W | 4.0 W | 2.6 W | 2.6 W |
| Write Endurance | 1366 TB (0.8 DWPD) | 2733 TB (0.8 DWPD) | 1366 TB (0.8 DWPD) | 2733 TB (0.8 DWPD) |
| Warranty | 5 years | | | |
The M.2 version of the 983 DCT is rated to use significantly less power than the U.2 version. The 1.92 TB model is the only one that appears to have its performance meaningfully constrained by this, with sequential write speeds of 1.4GB/s instead of 1.9GB/s, and reduced random read and write performance.
The 983 DCT M.2 uses the 22x110mm card size, longer than the consumer standard of 80mm in order to accommodate the power loss protection capacitors. The 983 DCT M.2 is also double-sided, because unlike the 970 EVO it doesn't need to squeeze into thin laptops. This allows Samsung to use four NAND packages instead of two, and consequently shorter stacks of NAND dies.
The 2.5" version of the 983 DCT uses the same 7mm thick form factor as most SATA SSDs, but the construction is a bit different from Samsung's SATA drives: the screws holding the case together enter from the top instead of the bottom, and are covered by a label that puts Samsung's logo upside down relative to their SATA drives.
Inside, we find a PCB that uses almost all of the available space, with more than twice as many power loss protection capacitors as on the M.2 version. To keep all eight channels of the Phoenix controller busy and maximize performance, Samsung uses their 256Gb TLC dies on both capacities. The 983 DCT also features 50% more DRAM per GB of NAND than is common for SATA drives; some of this extra may be used for more robust ECC, but the primary purpose is most likely enabling higher performance.
Intel SSD DC P4510
When Intel launched the DC P4510 in 2018, our initial review focused on using it to test out Intel's Virtual RAID on CPU (VROC) feature. Now, we're focusing on individual drive performance and adding in power efficiency testing that wasn't practical for multi-drive RAID configurations.
Since the P4510 launched, Intel has announced a new naming scheme for their enterprise SSDs. Their long-term plan is to push a combination of QLC drives for capacity and Optane drives for performance, but TLC drives like the P4510 aren't going away anytime soon and the P4510 is still a current-generation product.
The Intel P4510 is the middle tier of their enterprise NVMe drives with 64-layer 3D NAND. Below it sits the D5-P4320 QLC NAND SSD, and above it is the DC P4610. The Px700 product tier is empty this generation, with the P4610 pitched as the replacement for both the P4600 and the first-generation P3700. With the P4500 and P4600, Intel introduced their second-generation enterprise NVMe controller and paired it with their first-generation (32-layer) 3D NAND. The P4510 is an update to use the current 64-layer 3D TLC.
Intel SSD DC P4510 Specifications

| Capacity | 1 TB | 2 TB | 4 TB | 8 TB |
|---|---|---|---|---|
| Form Factor | 2.5" 15mm U.2 | | | |
| Interface | PCIe 3.1 x4, NVMe 1.2 | | | |
| Memory | Intel 512Gb 64-layer 3D TLC | | | |
| Sequential Read | 2850 MB/s | 3200 MB/s | 3000 MB/s | 3200 MB/s |
| Sequential Write | 1100 MB/s | 2000 MB/s | 2900 MB/s | 3000 MB/s |
| Random Read | 469k IOPS | 624k IOPS | 625.5k IOPS | 620k IOPS |
| Random Write | 72k IOPS | 79k IOPS | 113.5k IOPS | 139.5k IOPS |
| Maximum Power (Active) | 10 W | 10 W | 14 W | 16 W |
| Maximum Power (Idle) | 5 W | 5 W | 5 W | 5 W |
| Write Endurance | 1.1 DWPD | 0.7 DWPD | 0.9 DWPD | 1.0 DWPD |
| Warranty | 5 years | | | |
With the P4510, we're finally starting to see performance specs that look comparable to high-end consumer SSDs, but unlike consumer drives the P4510 can be expected to maintain this performance indefinitely. The write endurance rating varies a bit with capacity, from a low of 0.7 DWPD up to 1.1 DWPD. Overall, this puts it in the same endurance class as the Samsung 983 DCT, but the P4510 is in a slightly higher class for performance and power consumption.
The P4510 uses a 15mm 2.5" U.2 form factor. Inside are two stacked PCBs, connected by a semi-flexible joint. The 12-channel controller and most of the DRAM are at the bottom of the drive, with thermal pads dissipating heat to the case's heatsink surface. The back side of the primary PCB has room for more DRAM (to enable the 8TB model) and a bit more NAND. But most of the NAND is on the secondary PCB, with 10 packages on each side. Due to the extra thickness of the 15mm form factor, a single large power loss protection capacitor is used instead of an array of smaller caps. The NAND and DRAM on the inner faces of the two PCBs do not get any airflow or thermal pads bridging them to the case. This lack of cooling is one of the major motivators of Intel's Ruler form factor, now standardized as EDSFF.
Intel Optane SSD DC P4800X
Intel's Optane SSD DC P4800X is the flagship model not just of Intel's SSD family, but the entire SSD market. Built around 3D XPoint memory instead of NAND flash, the Optane SSD sets the standard for low latency and high endurance. We first tested the 375GB model through remote access before the P4800X was ready for widespread release, then later reviewed the 750GB model hands-on. More recently, Intel has introduced a 1.5TB model and doubled the write endurance rating of new models to 60 DWPD.
Intel Optane SSD DC P4800X Specifications

| Capacity | 375 GB | 750 GB | 1.5 TB |
|---|---|---|---|
| Form Factor | PCIe HHHL or 2.5" 15mm U.2 | | |
| Interface | PCIe 3.0 x4, NVMe | | |
| Controller | Intel SLL3D | | |
| Memory | 128Gb 20nm Intel 3D XPoint | | |
| Typical Latency (R/W) | <10 µs | | |
| Random Read (4 kB) IOPS (QD16) | 550,000 | | |
| Random Read 99.999% Latency (QD1 / QD16) | 60 µs / 150 µs | | |
| Random Write (4 kB) IOPS (QD16) | 500,000 | | |
| Random Write 99.999% Latency (QD1 / QD16) | 100 µs / 200 µs | | |
| Mixed 70/30 (4 kB) Random IOPS (QD16) | 500,000 | | |
| Sequential Read (64 kB) | 2400 MB/s | | |
| Sequential Write (64 kB) | 2000 MB/s | | |
| Active Power (Read) | 8 W | 10 W | 18 W |
| Active Power (Write) | 13 W | 15 W | |
| Idle Power | 5 W | 6 W | 7 W |
| Endurance | 30 DWPD originally; 60 DWPD on newer units | | |
| Warranty | 5 years | | |
Memblaze PBlaze5
Memblaze is not one of the biggest names in the SSD market, but they supplied two of the most powerful SSDs in this review. Memblaze is one of many companies that offer high-end enterprise SSDs but don't manufacture their own memory or controllers. Instead, Memblaze uses the most powerful SSD controllers available on the open market: the Flashtec NVMe family, which originated at IDT, was used in one of the first PCIe SSDs, and has since changed hands through corporate acquisitions to PMC-Sierra, Microsemi and now Microchip. Major players like Intel and Samsung have their own controller ASICs that can compete in this product segment, but most other companies either go for the Flashtec controllers or a big Xilinx FPGA. Familiar controller vendors like Marvell and Silicon Motion have been eyeing this market segment, but so far they are only offering solutions that more or less pair up two of their smaller 8-channel controllers, rather than monolithic 16 or 32-channel controller designs.
The PBlaze5 900 series SSDs use the Flashtec NVMe2016 controller, which provides 16 channels for NAND and up to 8 PCIe lanes for the host interface. The PBlaze5 D900 is the U.2 version and supports operating with either a PCIe 3.0 x4 host interface or in dual-port x2+x2 mode when used in a system with PCIe fabric switches that support redundant multipath connections. For our testing, the D900 has a direct PCIe x4 connection to one of the test system's CPUs.
Memblaze PBlaze5 Series Specifications

| | PBlaze5 D900 | PBlaze5 C900 |
|---|---|---|
| Form Factor | 2.5" 15mm U.2 | HHHL AIC |
| Interface | PCIe 3.0 x4 | PCIe 3.0 x8 |
| Controller | Microsemi Flashtec PM8607 NVMe2016 | |
| Protocol | NVMe 1.2a | |
| NAND | Micron 384Gb 32L 3D TLC | |
| Capacities | 2 TB, 3.2 TB, 4 TB, 8 TB | 2 TB, 3.2 TB, 4 TB, 8 TB |
| Sequential Read | 3.5–6.0 GB/s (varies by capacity) | 5.5–5.9 GB/s (varies by capacity) |
| Sequential Write | 2.2 / 3.2 / 3.4 / 3.5 GB/s | 2.2 / 3.2 / 3.8 / 3.8 GB/s |
| Random Read (4 kB) | 823k–835k IOPS | 1001k–1010k IOPS |
| Random Write (4 kB) | 255k / 280k / 347k / 328k IOPS | 235k / 288k / 335k / 348k IOPS |
| Read Latency (4 kB) | 94 µs | 93 µs |
| Write Latency | 16 µs | 15 µs |
| Power (Idle / Operating) | 7 W / 25 W | 7 W / 25 W |
| Endurance | 3 DWPD | 3 DWPD |
| Warranty | Five years | Five years |
The PBlaze5 C900 is a half-height half-length add-in card with a PCIe x8 connection, giving it the headroom to exceed 4GB/s: our 4TB sample is rated for 5.9GB/s sequential reads, and slightly over 1M random read IOPS.
The PBlaze5 900-series uses Micron's 32-layer 3D TLC NAND. The somewhat awkward 384Gb per-die capacity lends itself well to building a drive with large overprovisioning: these 4TB drives have 6TiB of NAND onboard, plenty to enable a 3 DWPD endurance rating and support very high performance despite the older and slower 3D NAND. Memblaze has also introduced newer PBlaze5 models that move to Micron's 64L TLC, but due to less extreme overprovisioning ratios the drive performance ratings aren't significantly higher.
All this performance comes at the cost of power consumption of up to 25W. The C900's add-in card form factor can easily dissipate this much with its large heatsink, but even with the bottleneck of a narrower PCIe x4 link the D900 is still very power-hungry. Memblaze uses a 2.5" 15mm U.2 case design that allows for some airflow through the drive between the two PCBs, though a typical hot-swap backplane won't provide a very clear path for this flow.
Miscellaneous Drive Features
Enterprise SSDs can be distinguished from client/consumer SSDs by far more than just their performance profile and price. There is a wide variety of features that enterprise SSDs can implement to improve reliability, security and ease of management, and the scope of possibilities continues to grow with the evolution of standards like NVMe. The drives in this review are all relatively 'mainstream' enterprise SSD products which don't target any particular niche that requires obscure features, but there's still some variety in which optional features they provide.
Reliability Features

| | Samsung 860 DCT | Samsung 883 DCT | Samsung 983 DCT | Intel DC P4510 | Intel Optane DC P4800X | Memblaze PBlaze5 |
|---|---|---|---|---|---|---|
| Power Loss Protection | No | Yes | Yes | Yes | Yes | Yes |
| T10 Data Integrity | No | No | No | No | Yes | No |
| Multipath IO | No | No | No | No | No | Yes |
Power loss protection is often considered a mandatory feature for a drive to be considered server-grade, but there are many use cases where losing a bit of data during an unexpected power failure isn't a serious concern. The Samsung 860 DCT is still unusual in omitting power loss protection, but this may become more common as low-end enterprise SSDs push into hard drive price territory.
Support for multipath IO and T10 Data Integrity Field are features commonly found on SAS drives, but they have been appearing more often in NVMe drives as that ecosystem matures toward fully replacing SAS. The T10 Data Integrity Field enables end-to-end data protection by augmenting each sector with a few extra bytes of checksum and metadata that are carried along with the normal payload data. This metadata effectively causes the drive's sector size to expand from 512 bytes to 520 or 528 bytes. All of the NVMe drives in this review already support switching between 512-byte and 4kB sector sizes, but only the Optane SSD supports the extended metadata sector formats.
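For illustration, switching an NVMe namespace between these sector formats is typically done with nvme-cli; the sketch below is hedged (the device path, LBA format index and protection-information setting are placeholder assumptions, and a format operation destroys all data):

```python
import subprocess

DEV = "/dev/nvme0n1"   # placeholder device path

# List the LBA formats (512B/4kB, with or without metadata) the namespace supports.
subprocess.run(["nvme", "id-ns", DEV, "--human-readable"], check=True)

# Reformat to a hypothetical LBA format index that includes protection-information
# metadata per sector. WARNING: this erases the namespace.
subprocess.run(["nvme", "format", DEV, "--lbaf=2", "--pi=1"], check=True)
```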
Multipath IO allows a drive to remain accessible even if one of the switches/port expanders or HBAs between it and the host system fails. Support for two port interfaces is standard for SAS drives, impossible for SATA drives, and rare for NVMe drives. The Microsemi Flashtec controller used by the Memblaze PBlaze5 supports dual-port operation, and Memblaze's firmware exposes that capability. This feature isn't useful for drives that are directly attached to CPU PCIe lanes, but is an important high-availability feature for large arrays that rely on PCIe fanout switches (and there are a lot of those).
Security Features

| | Samsung 860 DCT | Samsung 883 DCT | Samsung 983 DCT | Intel DC P4510 | Intel Optane DC P4800X | Memblaze PBlaze5 |
|---|---|---|---|---|---|---|
| TCG Opal | No* | No | Yes | No | Yes | No |
| Sanitize | No | Yes | No | No | No | No |

*The 860 DCT's label carries a PSID, but the drive does not actually support TCG Opal (see below).
The TCG Opal standard defines a command set for managing self-encrypting drives. Samsung and Crucial are the only two consumer SSD brands that commonly implement TCG Opal, though it was recently revealed that their early implementations suffer from several severe flaws. In the enterprise space, demand for self-encrypting drives is largely confined to certain customer bases that have regulatory obligations to protect customer data. Some market segments actively prefer non-encrypting drives, such as when selling to (or from) certain countries that regulate strong cryptography.
In most cases, SSDs that support TCG Opal can be identified by the presence of a PSID on the drive label. This is a long serial number unique to the drive that can be used to reset and unlock it if the password/keys are forgotten. The PSID cannot be determined by electronically querying the drive, so resetting a drive with the PSID requires physical access to the label. The Samsung 860 DCT's label includes a PSID, but the drive does not respond to TCG Opal commands and is not listed by Samsung as supporting TCG Opal. The Samsung 983 DCT and Intel Optane DC P4800X both implement TCG Opal. (The consumer counterparts of the 983 DCT also support TCG Opal, but the consumer Optane SSDs do not.)
Sanitize commands were introduced to the ATA, SCSI and NVMe standards as an erase method that comes with stronger guarantees than an ATA Secure Erase command. Sanitize operations are required to purge user data from all caches and buffers and from flash memory that is awaiting garbage collection. A Sanitize operation cannot be cancelled and is required to run to completion and resume after a power loss. Sanitize commands also make it clear whether data is destroyed through block erase operations, overwriting, or destroying the key necessary to decrypt the data. Most SSDs already implement adequate erase operations through ATA Secure Erase or NVMe Format commands, but a few also provide the Sanitize command interface. Among this batch of drives, only the Samsung 883 DCT implements this feature.
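As a rough illustration of how this looks in practice with nvme-cli (a hedged sketch; the device path is a placeholder and the sanitize-action value shown requests a block erase, which destroys all data):

```python
import subprocess

DEV = "/dev/nvme0"   # placeholder controller device

# Request a block-erase sanitize operation, then check its progress via the sanitize log.
subprocess.run(["nvme", "sanitize", DEV, "--sanact=2"], check=True)
subprocess.run(["nvme", "sanitize-log", DEV], check=True)
```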
Other NVMe Features

| | Samsung 983 DCT | Intel DC P4510 | Intel Optane DC P4800X | Memblaze PBlaze5 |
|---|---|---|---|---|
| Firmware Slots | 2+1 | 1 | 1 | 2+1 |
| Multiple Namespaces | No | No | No | Yes |
| Active Power States | 1 | 1 | 1 | 3 |
| Temperature Sensors | 3 | 1 | 1 | 4 |
The NVMe standard has grown to encompass a wide range of optional features, and the list gets longer every year. NVMe drives can support multiple firmware slots, allowing a new firmware upgrade to be flashed to the drive without overwriting the currently in-use version. The Samsung 983 DCT and Memblaze PBlaze5 both implement three firmware slots, one of which is a permanently read-only fallback.
The Memblaze PBlaze5 is the first SSD we have tested that implements support for multiple namespaces. At a high level, namespaces are a way of partitioning the drive's storage space. Most interesting use cases involve pairing this feature with something else: for example, support for different sector sizes/formats can allow one namespace to provide T10 Data Integrity Field support while another uses plain 4k sectors. Multiple namespace support also has numerous uses in tandem with virtualization support or NVMe over Fabrics.
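Namespace management is normally done through nvme-cli; here is a hedged sketch of carving out and attaching a namespace (the controller path, namespace size, LBA format index and controller ID are placeholder assumptions):

```python
import subprocess

CTRL = "/dev/nvme0"                # placeholder controller device
blocks = (1 << 40) // 4096         # a hypothetical 1 TiB namespace in 4kB blocks

# Create the namespace, then attach it to controller 0 so it appears as a block device.
subprocess.run(["nvme", "create-ns", CTRL,
                f"--nsze={blocks}", f"--ncap={blocks}", "--flbas=0"], check=True)
subprocess.run(["nvme", "attach-ns", CTRL,
                "--namespace-id=1", "--controllers=0"], check=True)
```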
In client computing, SSD power management is primarily about putting the drive to sleep during idle times. In a server, high wake-up latencies make such sleep states relatively useless, but the baseline idle power consumption of an enterprise SSD without sleep states still contributes to the operating cost of the server. There are also some scenarios where the maximum power draw of an SSD needs to be capped due to limitations on airflow or power delivery. In the client space, this is usually only seen in fanless battery-powered systems. In servers, it can happen if the system design provides less airflow than usual for a particular form factor, or if the rack as a whole is pushing the limits of what the datacenter can handle. The PBlaze5 is the most power-hungry drive in this bunch, but it provides lower-power states that limit it to 20W or 15W instead of the default 25W.
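On NVMe drives these operational power states are exposed through the standard power management feature (feature ID 0x02); a hedged nvme-cli sketch follows (the device path and state number are placeholders, and the mapping of state numbers to wattage limits is drive-specific):

```python
import subprocess

DEV = "/dev/nvme0"   # placeholder controller device

# Read the currently selected power state, then switch to power state 1, which on a
# drive offering several operational states would cap power at a lower level.
subprocess.run(["nvme", "get-feature", DEV, "--feature-id=2", "--human-readable"], check=True)
subprocess.run(["nvme", "set-feature", DEV, "--feature-id=2", "--value=1"], check=True)
```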
The PBlaze5 and the Samsung 983 DCT both provide access to multiple temperature sensors. These are also aggregated in a drive-specific way to produce a composite temperature readout that indicates how close the drive is to its thermal throttling threshold(s). The Intel drives only report the composite temperature.
QD1 Random Read Performance
Drive throughput with a queue depth of one is usually not advertised, but almost every latency or consistency metric reported on a spec sheet is measured at QD1, usually for 4kB transfers. When the drive only has one command to work on at a time, there's nothing to get in the way of it offering its best-case access latency. Performance at such light loads is absolutely not what most of these drives are made for, but they have to make it through the easy tests before we move on to the more realistic challenges.
*The colors on the graphs have no specific meaning; they are simply there to make it easier to group results by drive family.
The Intel DC P4510 and Samsung 983 DCT offer identical QD1 random read performance, showing that Intel/Micron 3D NAND has caught up to Samsung after a first generation that was clearly slower (shown here by the Memblaze PBlaze5). The Samsung SATA drives are about 40% slower than their NVMe drives, and the Optane SSD is almost ten times faster than anything else.
[Charts: Power Efficiency in kIOPS/W; Average Power in W]
QD1 isn't pushing the drives very hard, so they're not very far above idle power draw. That means the bigger, beefier drives draw more power but have little to show for it. The SATA drives all have higher efficiency scores than the NVMe drives, with the exception of the Optane SSD, which offers more than 2.5x the performance per Watt at QD1.
All of these drives have pretty good consistency for QD1 random reads. The 8TB P4510 has a 99.99th percentile latency that's just over 2.5 times its average, and all the other flash-based SSDs are better than that. (The PBlaze5 is a bit slower than the P4510, but consistently so.) The Optane SSD actually has the biggest relative disparity between average latency and 99.99th percentile, but that hardly matters when its worst case is as good as or better than the average-case performance of flash memory.
The random read performance of most of these drives is optimized for 4kB block sizes: smaller block sizes perform roughly the same in terms of IOPS, and thus get much lower throughput. Throughput stays relatively low until block sizes exceed 64kB, after which the NVMe drives all start to deliver much higher throughput. This is most pronounced for the Samsung 983 DCT.
QD1 Random Write Performance
The clearest trend for 4kB random write performance at QD1 is that the NVMe drives are at least twice as fast as the SATA drives, but there's significant variation: Intel's NVMe drives have much faster QD1 random write performance than Samsung's, and the Memblaze drives are in between.
[Charts: Power Efficiency in kIOPS/W; Average Power in W]
The SATA drives draw much less power than the NVMe drives during this random write test, so the efficiency scores end up in the same ballpark with no clear winner. The PBlaze5 is the clear loser, because its huge power-hungry controller is of no use here at low loads.
The Intel DC P4510 has some QoS issues, with 99th and 99.99th percentile random write latencies that are far higher than any of the other NVMe drives, despite the P4510 having one of the best average latency scores. Samsung's latest generation of SATA drives doesn't offer a significant improvement to average-case latency, but the tail latency has clearly improved.
As with random reads, these drives deliver their peak random write IOPS with 4kB transfers. The Memblaze PBlaze5 drives do extremely poorly with writes smaller than 4kB, but the rest of the drives handle tiny writes with roughly the same IOPS as a 4kB write. 8kB and larger random writes always yield fewer IOPS but usually significantly higher overall throughput.
QD1 Sequential Read Performance
At QD1, the SATA drives aren't quite saturating the host interface, but mainly because of the link idle time between the drive finishing one transfer and receiving the command to start the next. The PBlaze5 SSDs are only a bit faster than the SATA drives at QD1 despite the C900 being the drive with the most host bandwidth, reminding us how the first-generation IMFT 3D TLC could be quite slow at times. The Intel drives are a bit slower than the Samsung drives, coming in at or below 2GB/s while the 983 DCT U.2 hits 2.5GB/s and the M.2 is around 2.2GB/s.
[Charts: Power Efficiency in MB/s per W; Average Power in W]
The latest Samsung SATA drives are comparable in efficiency to Intel's NVMe, while Samsung's own NVMe SSD is substantially more efficient. The Memblaze PBlaze5 is by far the least efficient, since it offers disappointing QD1 sequential read performance while drawing more power than all the other flash-based SSDs.
The Memblaze PBlaze5 doesn't seem to be any good at prefetching or caching when performing sequential reads: its throughput is very low for small to medium block sizes, and even at 128kB it is much slower than with 1MB transfers. The rest of the drives generally provide full sequential read throughput for transfer sizes starting around 64kB or 128kB.
QD1 Sequential Write Performance
At QD1, the Intel P4510 and Samsung 983 DCT are only slightly faster at sequential writes than the SATA drives. The Optane SSD and the Memblaze PBlaze5 C900 both perform very well, while the PBlaze5 D900 can't quite hit 1GB/s at QD1.
[Charts: Power Efficiency in MB/s per W; Average Power in W]
Once again the SATA drives dominate the power efficiency rankings; Samsung's current-generation SATA drives draw less than half the power of the most efficient flash-based NVMe drive. The older Samsung PM863 shows that Samsung's SATA drives have improved significantly even though the performance is barely changed from earlier generations. Among NVMe drives, power consumption scales roughly as expected, with the Memblaze PBlaze5 C900 drawing over 22W to deliver 1.6GB/s sustained writes.
As with random writes, for sequential writes the Memblaze PBlaze5 doesn't like being handed writes of less than 4kB, and neither does the Intel Optane P4800X. The rest of the drives generally hit their steady-state sequential write speed starting with transfer sizes in the 8kB to 32kB range.
Peak Throughput And Steady State
For client/consumer SSDs we primarily focus on low queue depth performance for its relevance to interactive workloads. Server workloads are often intense enough to keep a pile of drives busy, so the maximum attainable throughput of enterprise SSDs actually matters. But it usually isn't a good idea to focus solely on throughput while ignoring latency, because somewhere down the line there's always an end user waiting for the server to respond.
In order to characterize the maximum throughput an SSD can reach, we need to test at a range of queue depths. Different drives will reach their full speed at different queue depths, and increasing the queue depth beyond that saturation point may be slightly detrimental to performance, and will drastically and unnecessarily increase latency. SATA drives can only have 32 pending commands in their queue, and any attempt to benchmark at higher queue depths will just result in commands sitting in the operating system's queues before being issued to the drive. On the other hand, some high-end NVMe SSDs need queue depths well beyond 32 to reach full speed.
Because of the above, we are not going to compare drives at a single fixed queue depth. Instead, each drive was tested at a range of queue depths up to the excessively high QD 512. For each drive, the queue depth with the highest performance was identified. Rather than report that value, we're reporting the throughput, latency, and power efficiency for the lowest queue depth that provides at least 95% of the highest obtainable performance. This often yields much more reasonable latency numbers, and is representative of how a reasonable operating system's IO scheduler should behave. (Our tests have to be run with any such scheduler disabled, or we would not get the queue depths we ask for.)
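The selection rule described above can be expressed in a few lines; this Python sketch uses hypothetical numbers rather than measured data:

```python
def reported_qd(results: dict[int, float]) -> int:
    """Lowest queue depth that reaches at least 95% of the best observed throughput."""
    best = max(results.values())
    return min(qd for qd, perf in results.items() if perf >= 0.95 * best)

# Hypothetical IOPS-vs-queue-depth curve for one drive:
example = {1: 90_000, 2: 170_000, 4: 310_000, 8: 520_000,
           16: 610_000, 32: 640_000, 64: 645_000, 128: 642_000}
print(reported_qd(example))  # -> 32
```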
One extra complication is the choice of how to generate a specified queue depth with software. A single thread can issue multiple I/O requests using asynchronous APIs, but this runs into at least one of two problems: if each system call issues one read or write command, then context switch overhead becomes the bottleneck long before a high-end NVMe SSD's abilities are fully taxed. Alternatively, if many operations are batched together for each system call, then the real queue depth will vary significantly and it is harder to get an accurate picture of drive latency.
Using multiple threads to perform IO gets around the limits of single-core software overhead, and brings an extra advantage for NVMe SSDs: the use of multiple queues per drive. The NVMe drives in this review all support 32 separate IO queues, so we can have 32 threads on separate cores independently issuing IO without any need for synchronization or locking between threads. For even higher queue depths, we could use a combination of techniques: one thread per drive queue, issuing multiple IOs with asynchronous APIs. But this is getting into the realm of micro-optimization that most applications will never be properly tuned for, so instead the highest queue depths in these tests are still generated by having N threads issuing synchronous requests one at a time, and it's up to the OS to handle the rest.
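The multi-threaded approach described above can be sketched in Python; this is only an illustration (the device path, block size and runtime are placeholders, and a real benchmark like fio would use direct I/O with aligned buffers, which is omitted here for simplicity):

```python
import os
import random
import threading
import time

DEV, BLOCK, THREADS, RUNTIME = "/dev/nvme0n1", 4096, 32, 10  # placeholders

def worker(stop: threading.Event, fd: int, dev_size: int) -> None:
    # Each thread keeps exactly one synchronous 4kB random read outstanding.
    while not stop.is_set():
        offset = random.randrange(0, dev_size // BLOCK) * BLOCK
        os.pread(fd, BLOCK, offset)

fd = os.open(DEV, os.O_RDONLY)
dev_size = os.lseek(fd, 0, os.SEEK_END)
stop = threading.Event()
threads = [threading.Thread(target=worker, args=(stop, fd, dev_size)) for _ in range(THREADS)]
for t in threads:
    t.start()
time.sleep(RUNTIME)
stop.set()
for t in threads:
    t.join()
os.close(fd)
```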
Peak Random Read Performance
The SATA drives all have no trouble more or less saturating their host interface; they have plenty of flash that could service more read requests if they could actually be delivered to the drive quickly enough. Among NVMe drives, we see some dependence on capacity, with the 960GB Samsung 983 DCT falling well short of the 1.92TB model. The rest of the NVMe drives make it past half a million IOPS before software overhead on the host system becomes a bottleneck, so we don't even get close to seeing the PBlaze5 hit its rated 1M IOPS.
[Charts: Power Efficiency in kIOPS/W; Average Power in W]
The Samsung 983 DCT offers the best power efficiency on this random read test, because the drives with bigger, more power-hungry controllers weren't able to show off their full abilities without hitting bottlenecks elsewhere in the system. The SATA drives offer respectable power efficiency as well, since they are only drawing about 2W to saturate the SATA link.
The 2TB P4510 and both PBlaze5 drives have consistency issues at the 99.99th percentile level, but are fine at the more relaxed 99th percentile threshold. The Optane SSD's latency scores are an order of magnitude better than any of the other NVMe SSDs, and it was the Optane SSD that delivered the highest overall throughput.
Peak Sequential Read Performance
Since this test consists of many threads each performing IO sequentially but without coordination between threads, there's more work for the SSD controller and less opportunity for pre-fetching than there would be with a single thread reading sequentially across the whole drive. The workload as tested bears closer resemblance to a file server streaming to several simultaneous users, rather than resembling a full-disk backup image creation.
The Intel drives don't quite match the performance of the Samsung 983 DCT or the slower PBlaze5. The Optane SSD ends up being the slowest NVMe drive on this test, but it's actually slightly faster than its spec sheet indicates. The Optane SSD's 3D XPoint memory has very low latency, but that doesn't change the fact that the drive's controller only has seven channels to work with. The PBlaze5s are the two fastest drives on this test, but they're both performing significantly below expectations.
[Charts: Power Efficiency in MB/s per W; Average Power in W]
The Samsung 983 DCT clearly has the lead for power efficiency, followed by the slightly slower and more power-hungry Intel P4510. The current-generation SATA drives from Samsung mostly stay below 2W and end up with decent efficiency scores despite the severe performance bottleneck they have to contend with.
Steady-State Random Write Performance
The hardest task for most enterprise SSDs is to cope with an unending stream of writes. Once all the spare area granted by the high overprovisioning ratios has been used up, the drive has to perform garbage collection while simultaneously continuing to service new write requests, and all while maintaining consistent performance. The next two tests show how the drives hold up after hours of non-stop writes to an already full drive.
The Samsung drives don't even come close to saturating their host interfaces, but they are performing according to spec for steady-state random writes, with higher-capacity models offering clearly better performance. The Intel and Memblaze drives have a huge advantage, with the slower P4510 maintaining twice the throughput that a 983 DCT can handle.
[Charts: Power Efficiency in kIOPS/W; Average Power in W]
The Samsung 983 DCTs used about 1.5W more power to deliver only slightly higher speeds than the Samsung SATA drives, so the NVMe drives wind up with some of the worst power efficiency ratings. The Optane SSD with its wide performance lead more than makes up for its rather high power consumption. In second place for efficiency is the lowly Samsung 860 DCT; despite our best efforts, it continued to deliver higher-than-spec performance on this test, while drawing less power than the 883 DCT.
The random write throughput provided by the Samsung 983 DCT at steady-state is nothing special, but it delivers that performance with low latency and extremely good consistency that rivals the Optane SSD. The Intel P4510 and Memblaze PBlaze5 SSDs provide much higher throughput, but with tail latencies that extend into the millisecond range. Samsung's 883 DCT SATA drive also has decent latency behavior that is far better than the 860 DCT.
Steady-State Sequential Write Performance
The steady-state sequential write test mostly levels the playing field. Even the NVMe drives rated at or below 1 DWPD offer largely SATA-like write throughput, and only the generously overprovisioned PBlaze5 can keep pace with the Optane SSD.
[Charts: Power Efficiency in MB/s per W; Average Power in W]
The PBlaze5 requires over 20W to keep up with what the Optane SSD can deliver at 14W, so despite its high performance the PBlaze5's efficiency is no better than the other NVMe drives. It's the SATA drives that come out well ahead: even though this workload pushes their power consumption relatively high, Samsung's latest generation of SATA drives is still able to keep it under 3W, and that's enough for a clear efficiency win.
Mixed Random Performance
Real-world storage workloads usually aren't pure reads or writes but a mix of both. It is completely impractical to test and graph the full range of possible mixed I/O workloads—varying the proportion of reads vs writes, sequential vs random and differing block sizes leads to far too many configurations. Instead, we're going to focus on just a few scenarios that are most commonly referred to by vendors, when they provide a mixed I/O performance specification at all. We tested a range of 4kB random read/write mixes at queue depth 32, and also tested the NVMe drives at QD128. This gives us a good picture of the maximum throughput these drives can sustain for mixed random I/O, but in many cases the queue depth will be far higher than necessary, so we can't draw meaningful conclusions about latency from this test. As with our tests of pure random reads or writes, we are using 32 (or 128) threads each issuing one read or write request at a time. This spreads the work over many CPU cores, and for NVMe drives it also spreads the I/O across the drive's several queues.
The full range of read/write mixes is graphed below, but we'll primarily focus on the 70% read, 30% write case that is a fairly common stand-in for moderately read-heavy mixed workloads.
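For reference, a 70/30 random mix at a fixed effective queue depth can be reproduced with fio roughly as follows (a sketch only; the device path and exact options are assumptions, with 32 single-I/O jobs standing in for QD32 as described above):

```python
import subprocess

def mixed_random(dev: str = "/dev/nvme0n1", read_pct: int = 70, jobs: int = 32) -> None:
    """Run a 4kB random read/write mix with one outstanding I/O per job."""
    subprocess.run([
        "fio", "--name=mixed",
        f"--filename={dev}",
        "--direct=1", "--ioengine=libaio",
        "--rw=randrw", f"--rwmixread={read_pct}",
        "--bs=4k", "--iodepth=1", f"--numjobs={jobs}",
        "--runtime=60", "--time_based",
        "--group_reporting",
    ], check=True)

mixed_random()
```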
[Charts: 70/30 mixed random performance at Queue Depth 32 and Queue Depth 128]
The SATA SSDs are all significantly slower at 70% reads than they were at 100% reads on the previous page, but the higher capacity drives come closer to saturating the SATA link. Among the NVMe drives, the Samsung 983 DCT shows no further improvement from increasing the queue depth from 32 all the way to 128, but the more powerful NVMe drives do need the higher queue depth to deliver full speed. The Intel P4510's improvement at QD128 over QD32 is relatively modest, but the Memblaze PBlaze5 almost doubles its throughput and manages to catch up to the Intel Optane P4800X.
[Charts: Power Efficiency in MB/s per W and Average Power in W, at QD32 and QD128]
The Intel Optane P4800X is the only drive that stands out with a clear power efficiency advantage; aside from that, the different product segments are on a relatively equal footing. The different capacities within each product line all have similar power draw, so the largest (fastest) models end up with the best efficiency scores. The smaller NVMe drives like the 960GB Samsung 983 DCT and the 2TB Intel P4510 waste some of the performance potential of their SSD controllers, so from a power efficiency standpoint only the larger NVMe drives are competitive with the SATA drives.
[Charts: performance and power across the full range of read/write mixes, at QD32 and QD128]
The SATA drives and slower NVMe drives generally show a steep decline in performance as the test progresses from pure reads through the more read-heavy mixes, accompanied by an increase in power consumption. For the more balanced mixes and the more write-heavy half of the test, those drives show a slower performance decline and power consumption plateaus. For the faster NVMe drives (the Memblaze PBlaze5 and Intel Optane P4800X), power consumption climbs through most or all of the test, and they are the only drives for which increasing the queue depth beyond 32 helps on the more balanced or write-heavy mixes. Higher queue depths only help the Samsung 983 DCT and Intel P4510 for the most read-heavy workloads.
Aerospike Certification Tool
Aerospike is a high-performance NoSQL database designed for use with solid state storage. The developers of Aerospike provide the Aerospike Certification Tool (ACT), a benchmark that emulates the typical storage workload generated by the Aerospike database. This workload consists of a mix of large-block 128kB reads and writes, and small 1.5kB reads. When the ACT was initially released back in the early days of SATA SSDs, the baseline workload was defined to consist of 2000 reads per second and 1000 writes per second. A drive is considered to pass the test if it meets the following latency criteria:
- fewer than 5% of transactions exceed 1ms
- fewer than 1% of transactions exceed 8ms
- fewer than 0.1% of transactions exceed 64ms
Drives can be scored based on the highest throughput they can sustain while satisfying the latency QoS requirements. Scores are normalized relative to the baseline 1x workload, so a score of 50 indicates 100,000 reads per second and 50,000 writes per second. Since this test uses fixed IO rates, the queue depths experienced by each drive will depend on their latency, and can fluctuate during the test run if the drive slows down temporarily for a garbage collection cycle. The test will give up early if it detects the queue depths growing excessively, or if the large block IO threads can't keep up with the random reads.
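To make the scoring concrete, here is a small Python sketch of the pass criteria and the score-to-rate conversion described above (the latency percentages in the example are hypothetical):

```python
BASELINE_READS, BASELINE_WRITES = 2000, 1000   # the original "1x" ACT workload

def act_rates(score: float) -> tuple[float, float]:
    """Reads/sec and writes/sec implied by an ACT score."""
    return score * BASELINE_READS, score * BASELINE_WRITES

def act_pass(pct_over_1ms: float, pct_over_8ms: float, pct_over_64ms: float) -> bool:
    """True if the run satisfies all three latency QoS limits."""
    return pct_over_1ms < 5.0 and pct_over_8ms < 1.0 and pct_over_64ms < 0.1

print(act_rates(50))           # -> (100000, 50000), matching the example above
print(act_pass(3.2, 0.4, 0.0)) # hypothetical percentages -> True
```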
We used the default settings for queue and thread counts and did not manually constrain the benchmark to a single NUMA node, so this test produced a total of 64 threads scheduled across all 72 virtual (36 physical) cores.
The usual runtime for ACT is 24 hours, which makes determining a drive's throughput limit a long process. For fast NVMe SSDs, this is far longer than necessary for drives to reach steady-state. In order to find the maximum rate at which a drive can pass the test, we start at an unsustainably high rate (at least 150x) and incrementally reduce the rate until the test can run for a full hour, then decrease the rate further if necessary to get the drive under the latency limits.
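That search procedure amounts to a simple downward scan; the sketch below is illustrative only (the step size and the run_one_hour callback are assumptions, not the actual test harness):

```python
def find_max_act_score(run_one_hour, start: float = 150.0, step: float = 5.0) -> float:
    """Step the ACT rate down until a one-hour run completes within the QoS limits."""
    score = start
    while score > 0:
        if run_one_hour(score):   # True when the run finishes and passes all latency criteria
            return score
        score -= step
    return 0.0
```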
Samsung's SATA drives have vastly improved performance over the older PM863—even the entry-level 860 DCT is several times faster, and it's absolutely not intended for workloads that are this write-heavy. The 3.84TB 883 DCT is a bit slower than the lower capacities, but still offers more than twice the performance of the 860 DCT.
The NVMe drives all outperform the SATA drives, with the Samsung 983 DCT M.2 predictably being the slowest of the bunch. The Intel P4510 outperforms the 983 DCTs, and the Memblaze PBlaze5s are much faster still, though even the PBlaze5 C900 can't quite catch up to the Intel Optane DC P4800X.
[Charts: Power Efficiency; Average Power in W]
The power consumption differences between these drives span almost an order of magnitude. The latest Samsung SATA drives range from 1.6 W up to 2.7 W, while the NVMe drives start at 5.3 W for the 983 DCT M.2 and go up to 12.9 W for the PBlaze5. However, the power efficiency scores don't vary as much. The two fastest NVMe SSDs also take the two highest efficiency scores, but then the Samsung 883 DCT SATA drives offer better efficiency than most of the rest of the NVMe drives. The SATA drives are at a serious disadvantage in terms of IOPS/TB, but for large datasets the SATA drives might offer adequate performance in aggregate at a lower TCO.
Conclusion
The enterprise SSD market has undergone major shifts from a few years ago. PCIe SSDs have expanded from an expensive niche to include a broad range of mainstream products. It's no longer possible to carve the market up into just a few clear segments; the enterprise SSD market is a rich spectrum of options. We're further than ever from having a one size fits all approach to storage.
But at the same time, we're as close as we'll ever get to seeing the market dominated by one kind of memory. TLC NAND has pushed MLC NAND out of the market. QLC, 3D XPoint and Z-NAND are all still niche memories compared to the vast range that TLC currently covers. We tested enterprise SSDs from a variety of market segments: two tiers of SATA SSD and a range of NVMe drives, from a low-power 1TB M.2 up to power-hungry multi-TB U.2 and add-in card drives.
The latest Samsung enterprise SATA drives show that SATA is far from a dying legacy technology. The SATA drives often come out on top of our power efficiency ratings: with power draw that largely stays in the 2-3W range, they can compete in IOPS per Watt even when the raw performance is much slower than the NVMe drives. And the SATA drives aren't always far behind on performance: the smaller and slower NVMe drives don't have a huge advantage in steady-state write performance compared to SATA drives of the same capacity. Granted, most of these drives are intended for heavily read-oriented workloads, and it no longer makes sense to make a high-endurance write-oriented SATA drive because then the interface would be more of a bottleneck than the NAND flash itself.
Where the NVMe drives shine is in delivering read performance far beyond what a single SATA link can handle, and this carries over to relatively read-heavy mixed workloads. The downsides of these drives are higher cost and higher power consumption. Their power efficiency is only competitive with the SATA drives if the NVMe drives are pushed to deliver the most performance their controllers can handle. That usually means higher queue depths than needed to saturate a SATA drive, and it often means that a higher capacity drive is needed as well: the 1TB and 2TB NVMe drives often don't have enough flash memory to keep the controller busy. The big, power-hungry controllers used in high-end NVMe SSDs are most worthwhile when paired with several TB of flash. Samsung's 983 DCT uses the same lower-power NVMe controller as their consumer NVMe drives, and its sweet spot is clearly at lower capacities than the ideal for the Intel P4510 or Memblaze PBlaze5.
The choice between SATA, low-power NVMe and high-end NVMe depends on the workload, and each of those market segments has a viable use case in today's market. The SATA drives are by far the cheapest way to put the most TB of flash into a single server, and in aggregate they can deliver high performance and great performance per Watt. Their downside is in applications requiring high performance per TB: datasets that aren't very large, but are very hot. It takes hours to read or write the entire capacity of a 4TB SATA SSD. A handful of 4TB SATA SSDs can easily be large enough while not offering enough aggregate performance. In those cases, splitting the same dataset across 1TB SATA SSDs won't provide as much performance boost as moving to multi-TB NVMe drives.
The most powerful NVMe SSDs like the Memblaze PBlaze5 have shown that modern 3D TLC NAND can outperform older MLC-based drives in almost every way. With a sufficiently high queue depth, the PBlaze5 can even approach the throughput of Intel's Optane SSDs for many workloads: the PBlaze5 offers similar sequential write performance and better sequential read performance than the Intel Optane P4800X. The random write speed of the PBlaze5 is slower by a third, but for random reads it matches the Optane SSD and with careful tuning it can provide substantially more random read throughput than a single Optane SSD. All of this is from a drive that's high-end even by enterprise standards, but is actually a generation behind the other flash-based SSDs in this review.
Overall, there's no clear winner from today's review, and no obvious sweet spot in the enterprise SSD market. Samsung still puts out a very solid product lineup, but they're not the only supplier of good 3D NAND anymore. Intel's 64-layer 3D TLC is just as fast and power efficient, though Intel's current use of it in the P4510 suggests that they're still a bit behind on the controller side of things: the Samsung 983 DCT's QoS is much better even if the throughput is a bit lower. And the Memblaze PBlaze5 shows that the brute force power of the largest SSD controllers can overcome the disadvantage of being a generation behind on the flash memory; we look forward to testing their more recent models that upgrade to 64-layer 3D TLC.
We're still feeling our way with how we want to present future Enterprise SSD reviews, so if you have comments on what you'd like to see, either product wise or testing methodology, then please leave a comment below.