Original Link: https://www.anandtech.com/show/6371/micron-p320h-pcie-ssd-700gb-review-first-nvme-ssd



Update: Micron tells us that the P320h doesn't support NVMe; we are digging to understand how Micron's controller differs from the NVMe-capable IDT controller with a similar part number.

Well over a year ago Micron announced something unique in a sea of PCIe SSDs that were otherwise nothing more than SATA drives in RAID on a PCIe card. The drive Micron announced was the P320h, featuring a custom ASIC and a native PCIe interface. The vast majority of PCIe SSDs we've looked at thus far feature multiple SATA/SAS SSD controllers with their associated NAND behind a SATA/SAS RAID controller on a PCIe card. These PCIe SSDs basically deliver the performance of a multi-drive SSD RAID-0 on a single card instead of requiring multiple 2.5" bays. There's decent interest in these types of PCIe SSDs simply because of the form factor advantage, as many servers these days have moved to slimmer form factors (1U/2U) that don't have all that many 2.5" drive bays. In the long term, however, this SATA/SAS-RAID-on-a-PCIe-card approach is clunky at best. Ideally you'd want a native PCIe controller that could talk directly to the NAND, rather than going through an unnecessary layer of abstraction. That's exactly what Micron's P320h promised. Today, we have a review of that very drive.

Although it was publicly announced a long time ago (in SSD terms), the P320h's specifications are still very competitive:

Micron P320h
Capacity: 350GB / 700GB
Interface: PCIe 2.0 x8
NAND: 34nm ONFI 2.1 SLC
Max Sequential Performance (Reads/Writes): 3.2 / 1.9 GB/s
Max Random Performance (Reads/Writes): 785K / 205K IOPS
Max Latency (QD=1, Read/Write): 47 µs / 311 µs (non-posted)
Endurance (Max Data Written): 25PB (350GB) / 50PB (700GB)
Encryption: No
TDP: 25W
Form Factor: Half-Height, Half-Length PCIe (68.9mm x 167.65mm x 18.71mm)

In fact, the only indication of this product's age is that it's launching with 34nm SLC NAND. Most of the enterprise SSDs we review these days have shifted to 2x-nm eMLC or MLC-HET. Micron will be making a 25nm SLC version available, as well as eMLC/MLC-HET versions in the future, but the launch product uses 34nm SLC NAND. I don't have official pricing from Micron yet, but I would expect it to be pretty high given the amount of expensive SLC NAND on each of the drives (512GB for the 350GB drive, 1TB for the 700GB drive).

The obvious benefit of using SLC NAND is endurance. While Intel's MLC-HET based 910 tops out at 14PB of writes over the life of the 800GB drive, the 350GB P320h is rated for 25PB. The 700GB drive doubles that to 50 petabytes of writes.
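To put those ratings in perspective, the back-of-the-envelope math looks like this. The five-year service life is our assumption for illustration, not a Micron spec:

# Rough endurance math for the rated write limits above.
CAPACITY_GB = 700
ENDURANCE_PB = 50
YEARS = 5  # assumed service life, not a Micron figure

total_writes_gb = ENDURANCE_PB * 1e6              # 50PB expressed in GB
full_drive_writes = total_writes_gb / CAPACITY_GB
dwpd = full_drive_writes / (YEARS * 365)

print(f"{full_drive_writes:,.0f} full-drive writes")            # ~71,429
print(f"~{dwpd:.0f} drive writes per day over {YEARS} years")   # ~39

Roughly 39 full drive writes per day for five years is an order of magnitude beyond what even write-heavy eMLC drives typically promise.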

Micron is also quite proud of its low read/write latencies, enabled by its low-overhead PCIe controller and driver stack.

As a native PCIe SSD, the P320h features a single controller on the card - a giant 1517-pin controller made by IDT. The huge pin count is needed to connect the controller to its 32 independent NAND channels, 4x what we see from most SATA SSD controllers.

There are no bridge chips or RAID controllers on-board; that single Micron-developed, IDT-manufactured controller is all that's needed. Talk about clean.

Each of the 32 channels can talk to up to 8 targets, for a maximum addressable capacity of 4TB, although Micron only populates 1TB of NAND on-board. Twenty-two percent of the on-board NAND is set aside as spare area for garbage collection, bad block replacement and wear leveling. An additional 1/8 of the user capacity is reserved for parity data.
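Those two overheads are enough to get from 1TB of raw NAND to roughly the 700GB of user capacity. A quick sketch of the accounting, using the figures above (the rounding is ours):

# Back-of-the-envelope capacity accounting for the 700GB drive,
# treating 1TB of raw NAND as 1024GB.
raw_gb = 1024                        # total on-board NAND
after_spare = raw_gb * (1 - 0.22)    # 22% spare for GC/bad blocks/wear leveling
# User capacity plus (user capacity / 8) of parity must fit in what's left:
user_gb = after_spare * 8 / 9
print(f"{after_spare:.0f}GB after spare, ~{user_gb:.0f}GB user capacity")
# -> ~799GB after spare, ~710GB user capacity, in line with the 700GB rating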

The IDT controller features configurable hardware RAID-5 that stripes accesses across multiple logical units. The logical units are broken down into blocks and pages, as is standard for NAND-based SSDs. Blocks and pages are striped across logical units, with parity data calculated from every 7 blocks/pages.

Micron picked 7+1P as its preferred balance of performance, user capacity and failure protection.

Calculating parity across fewer blocks/pages would allow the drive to withstand more failures, but capacity and performance would suffer. As NAND failures should be far rarer and more predictable than mechanical storage failures, this tradeoff shouldn't be a problem.
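Micron doesn't detail the controller's parity implementation beyond the RAID-5 description, but a minimal sketch of conventional 7+1 XOR parity shows why any single failed unit in a stripe is recoverable:

# Minimal sketch of 7+1 XOR parity, the usual RAID-5 scheme. This is
# illustrative only - Micron hasn't documented the controller's exact
# implementation.
import os

PAGE = 4096
data_pages = [os.urandom(PAGE) for _ in range(7)]  # 7 data pages per stripe

def xor_pages(pages):
    out = bytearray(PAGE)
    for p in pages:
        for i, b in enumerate(p):
            out[i] ^= b
    return bytes(out)

parity = xor_pages(data_pages)  # the +1P page

# Lose any one data page; rebuild it from the six survivors plus parity.
lost = 3
survivors = [p for i, p in enumerate(data_pages) if i != lost]
rebuilt = xor_pages(survivors + [parity])
assert rebuilt == data_pages[lost]
print("page", lost, "rebuilt from 6 data pages + parity")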

The P320h is available in one form factor: a half-height, half-length PCIe 2.0 x8 card. In the box are both half- and full-height brackets, allowing the P320h to fit in both types of cases.

Unlike most 2.5" SATA/SAS SSDs, these PCIe SSDs are pretty interesting to look at. With much more bandwidth to saturate, the drive makers have become creative in finding ways to cram as many NAND devices onto a half-height, half-length PCIe card as possible. While sticking to a single-slot profile, Micron uses two smaller daughterboards, attached via high-density interface connectors to the main P320h card, to double the amount of NAND on the drive.

 

Each daughtercard has sixteen 34nm 128Gb NAND packages for a total of 256GB of NAND. That's 512GB of NAND on the daughtercards, and then another 512GB on the main P320h card itself for a total of 1TB of NAND on a 700GB drive. The 350GB drive keeps the daughtercards but moves to 64Gb NAND packages instead. Remember that these are 34nm SLC NAND die, so you're looking at only 2GB per die vs. the 8GB per die we get from 25nm MLC NAND (or 4GB per die from 25nm SLC NAND).
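For reference, the die math works out as follows (pure arithmetic from the package counts and die density above):

# Working through the 700GB drive's NAND math.
GBIT = 1 / 8                       # 1Gb = 0.125GB
pkg_gb = 128 * GBIT                # 128Gb package = 16GB
per_daughtercard = 16 * pkg_gb     # 16 packages -> 256GB
main_card = 2 * per_daughtercard   # another 512GB on the main PCB
total = 2 * per_daughtercard + main_card   # 1024GB = 1TB raw

die_gb = 2                         # 34nm SLC: 2GB (16Gb) per die
print(f"{total:.0f}GB raw NAND, {pkg_gb / die_gb:.0f} die per package,"
      f" {total / die_gb:.0f} die total")
# -> 1024GB raw NAND, 8 die per package, 512 die total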

Of course, with a huge increase in the number of NAND devices comes a correspondingly large increase in the number of DRAM devices needed to keep track of all of the LBAs and flash mapping tables. The P320h features nine 256MB DDR3-1333 devices (also made by Micron) for a total of 2.25GB of on-board DRAM.
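That DRAM sizing lines up with textbook flat-map FTL math. The 4KB mapping granularity and 4-byte entries below are our assumptions for illustration, not a documented Micron design, and the odd ninth DRAM device suggests (again, our guess) an 8+1 arrangement for ECC:

# Why ~2GB of DRAM for 1TB of NAND: a flat logical-to-physical map at
# 4KB granularity with 4-byte entries is the textbook sizing. Assumed
# figures - Micron doesn't document its FTL.
nand_bytes = 1024 * 2**30    # 1TB of NAND
page = 4096                  # assumed mapping granularity
entry = 4                    # assumed bytes per map entry
map_bytes = nand_bytes // page * entry
print(f"~{map_bytes / 2**30:.0f}GB of map data")   # ~1GB
# Nine 256MB devices (2.25GB) leaves headroom for metadata and buffers.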

There's a relatively small heatsink on the custom PCIe controller itself; Micron claims it only needs 1.5m/s of airflow to maintain its operating temperature. Prying the heatsink off reveals IDT's NVMe (Non-Volatile Memory Express) controller. This is a native PCIe controller that supports up to 32 NAND channels, as well as a full implementation of the NVMe spec. Although the controller itself is PCIe Gen 3, Micron only certifies it for PCIe Gen 2 operation. With 8 PCIe lanes there's more than enough host bandwidth on PCIe 2.x, so this isn't an issue. Update: Micron tells us that the P320h doesn't support NVMe; we are digging to understand how Micron's controller differs from the NVMe-capable IDT controller with a similar part number.
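The bandwidth argument is easy to verify. PCIe 2.0 delivers roughly 500MB/s per lane per direction after 8b/10b encoding overhead, so eight Gen 2 lanes comfortably exceed the drive's rated peak:

# Why PCIe 2.0 x8 is enough: per-direction host bandwidth vs. the
# drive's rated peak read from the spec table above.
lanes, per_lane_mb = 8, 500           # PCIe 2.0: ~500MB/s/lane after 8b/10b
host_bw = lanes * per_lane_mb         # 4000 MB/s in each direction
drive_peak_read = 3200                # MB/s, from the spec table
print(f"{host_bw} MB/s host bandwidth vs. {drive_peak_read} MB/s peak read")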

The NVMe spec promises a lower-overhead, more efficient command set for native PCIe SSDs. This is a transition that makes a lot of sense, as the current approach of putting SATA/SAS controllers behind a PCIe switch is unnecessarily complex. With NVMe the NAND talks to a native PCIe controller, which can in turn deliver tons of bandwidth to the host rather than being bottlenecked by 6Gbps SATA or SAS. The NVMe spec also scales the number of concurrent IOs supported all the way up to 64,000 (a max of 256 is currently supported under Windows, vs. 32 for SATA-based SSDs), well beyond what most current workloads would be able to generate.

As the NVMe spec defines the driver interface between the SSD and the host OS, it requires a new set of drivers to function. The goal is that down the road these drivers will be built into the OS, but in the short term you'd hopefully only need one NVMe driver that would work with all NVMe SSDs, rather than the current mess of an individual driver for every PCIe SSD. Companies like Intel have gotten around the driver issue by simply using SATA/SAS-to-PCIe controllers whose drivers are already integrated into modern OSes (e.g. LSI's Falcon 2008 controller on the Intel SSD 910).

In the long run NVMe SSDs should enjoy the same plug-and-play benefits that SATA drives enjoy today. You never have to worry about installing a SATA driver to make your new SSD work (at least you shouldn't), and the same will hopefully be true for NVMe SSDs. The reality today is much more complicated than that.

Micron provided us with drivers for the P320h with the caveat that the driver was only tested/validated for certain server configurations; even having other PCIe devices installed in the system could cause incompatibilities. In practice I found Micron's warnings accurate. While the P320h had no issues working in our X79 testbed, our H67 testbed wouldn't boot into Windows with the P320h installed. What was really strange about the P320h in the H67 system was that the simple presence of the card caused graphical corruption at POST. I noticed other incompatibilities with certain PCIe video cards installed in our X79 system. I eventually ended up with a stable configuration that let me run through our suite of tests, but even then the P320h would sometimes drop out of the system entirely, requiring a power cycle to come back again.

Micron made no attempt to hide the fact that the P320h is only validated on specific servers, but it's something worth considering if you're looking at this drive. Apparently the state of the Linux driver is much better than the Windows one; unfortunately, most of our tests run under Windows, which forced us to deal with these compatibility issues head on.

 



Random Read/Write Speed

The four corners of SSD performance are as follows: random read, random write, sequential read and sequential write speed. Random accesses are generally small in size while sequential accesses tend to be larger, hence the four Iometer tests we use in all of our reviews. For our enterprise suite we make a few changes to our usual tests.

Our first test writes 4KB in a completely random pattern over all LBAs on the drive (compared to an 8GB address space in our desktop reviews). We perform 32 concurrent IOs (compared to 3) and run the test until the drive reaches its steady state. The results are reported as average MB/s over the entire run. We use both standard pseudo-randomly generated data for each write as well as fully random data, to show you both the maximum and minimum performance offered by SandForce based drives in these tests. The average performance of SF drives will likely fall somewhere between the two values you see in the graphs for each drive. For an understanding of why this matters, read our original SandForce article.
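For those who want to approximate this access pattern themselves, here's a rough Python sketch: 32 threads each keep one IO in flight across the drive's full LBA span. It's illustrative only - the device path is a placeholder, and unlike Iometer it doesn't bypass the OS page cache (which requires O_DIRECT with aligned buffers), so treat its numbers accordingly:

# Crude approximation of the 4KB random-read test at QD32.
import os, random, threading, time

DEV, BLOCK, QD, RUNTIME = "/dev/sdb", 4096, 32, 60   # DEV is a placeholder
fd = os.open(DEV, os.O_RDONLY)
span = os.lseek(fd, 0, os.SEEK_END)    # test over all LBAs
stop = threading.Event()
counts = [0] * QD

def worker(i):
    # Each thread keeps one IO outstanding, giving ~QD32 in aggregate.
    rng = random.Random(i)
    while not stop.is_set():
        os.pread(fd, BLOCK, rng.randrange(span // BLOCK) * BLOCK)
        counts[i] += 1

threads = [threading.Thread(target=worker, args=(i,)) for i in range(QD)]
for t in threads: t.start()
time.sleep(RUNTIME); stop.set()
for t in threads: t.join()
iops = sum(counts) / RUNTIME
print(f"~{iops:,.0f} IOPS, ~{iops * BLOCK / 1e6:.0f} MB/s")
# Swapping BLOCK for 131072 and the random offsets for sequential ones
# approximates the 128KB sequential test later in this review.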

Enterprise Iometer - 4KB Random Write

Excluding the two SandForce data points using highly compressible data, the P320h is the new king here. At least in the 700GB configuration, the P320h is able to offer better steady-state 4KB random write performance than Intel's SSD 910. The drive also delivers over 6x the performance of Micron's 2.5" P400e.

Enterprise Iometer - 4KB Random Read

Random read performance is an even more impressive showing for the P320h at 758MB/s. This is truly the benefit of having 32 concurrently accessible NAND channels: given a heavy enough workload, there's more than enough data to parallelize and stripe across all channels.

Sequential Read/Write Speed

Similar to our other enterprise Iometer tests, queue depths are much higher in our sequential benchmarks. To measure sequential performance I ran a one-minute 128KB sequential test over the entire span of the drive at a queue depth of 32. The results are reported as average MB/s over the entire test length.

Enterprise Iometer - 128KB Sequential Write

Peak sequential write performance is slightly behind Intel's SSD 910 operating in its 38W high performance mode, but still very competitive. At 1357MB/s, workloads that need to move large blocks of data will enjoy great performance on the P320h. Micron claims much higher sequential read/write numbers under Linux at 256 concurrent IOs.

Enterprise Iometer - 128KB Sequential Read

Sequential read performance is also very strong at 1817MB/s, although the 910 as well as OCZ's Z-Drive R4 manage better performance here.



Enterprise Storage Bench - Oracle Swingbench

We begin with a popular benchmark from our server reviews: Oracle Swingbench. This is a pretty typical OLTP workload that focuses on servers with a light to medium load of 100 - 150 concurrent users. The database size is fairly small at 10GB, but the workload is absolutely brutal.

Swingbench consists of over 1.28 million read IOs and 3.55 million writes. The read/write ratio in GB is nearly 1:1; with nearly 2.8x as many write operations, the average read must be correspondingly larger than the average write. Parallelism in this workload comes through aggregating IOs, as 88% of the operations in this benchmark are 8KB or smaller. This test is actually something we use in our CPU reviews, so its queue depth averages only 1.33. We will be following up with a version that features a much higher queue depth in the future.

Oracle Swingbench - Average Data Rate

Intel's SSD 910 didn't do well at all in our Oracle Swingbench test, mostly due to its inability to perform well at very small transfer sizes (512B). The P320h is no different here and is easily outperformed by standard 2.5" SATA SSDs (although Micron's P400e also does really poorly). The results just go to show how important it is to understand your workload when picking an enterprise SSD.

Average service time is nice and low for the P320h; we just don't end up seeing a high throughput rate from the SSD.

Oracle Swingbench - Disk Busy Time

Oracle Swingbench - Average Service Time



Enterprise Storage Bench - Microsoft SQL UpdateDailyStats

Our next two tests are taken from our own internal infrastructure. We do a lot of statistics tracking at AnandTech - we record traffic data for all articles as well as aggregate traffic for the entire site (including forums) on a daily basis. We also keep a running total of traffic for the month. Our first benchmark is a trace of the MS SQL process that does all of the daily and monthly stats processing for the site. We run this process once a day, as it puts a fairly high load on our DB server. Then again, we don't have a beefy SSD array in there yet :)

The UpdateDailyStats procedure is mostly reads (3:1 ratio of GB read to GB written) with 431K read operations and 179K write ops. Average queue depth is 4.2 and only 34% of all IOs are issued at a queue depth of 1. The transfer size breakdown is as follows:

AnandTech Enterprise Storage Bench - MS SQL UpdateDailyStats IO Breakdown
IO Size % of Total
8KB 21%
64KB 35%
128KB 35%
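Note that the three buckets above cover 91% of all IOs; the remainder fall into other sizes. A quick weighted average gives a feel for the typical transfer:

# Weighted average transfer size over the listed buckets only.
buckets = {8: 0.21, 64: 0.35, 128: 0.35}   # KB -> fraction of all IOs
avg_kb = sum(kb * f for kb, f in buckets.items()) / sum(buckets.values())
print(f"~{avg_kb:.0f}KB average transfer (listed buckets only)")  # ~76KB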

Microsoft SQL UpdateDailyStats - Average Data Rate

Things look a lot better in our first SQL benchmark: Micron's P320h outperforms both of the OCZ SandForce based offerings. Only Intel's 910 is faster, and it maintains a healthy performance advantage over the P320h (~44%).

Average service times are very low, one of the benefits of a native PCIe SSD controller that can serve so many IOs in parallel.

Microsoft SQL UpdateDailyStats - Disk Busy Time

Microsoft SQL UpdateDailyStats - Average Service Time



Enterprise Storage Bench - Microsoft SQL WeeklyMaintenance

Our final enterprise storage bench test once again comes from our own internal databases. We're looking at the stats DB again; this time, however, we're running a trace of our WeeklyMaintenance procedure. This procedure runs a consistency check on the 30GB database, followed by an index rebuild on all tables to eliminate fragmentation. As its name implies, we run this procedure weekly against our stats DB.

The read:write ratio here remains around 3:1, but we're dealing with far more operations: approximately 1.8M reads and 1M writes. Average queue depth is up to 5.43.

Microsoft SQL WeeklyMaintenance - Average Data Rate

We see the same ~44% performance advantage for the 910 over the P320h in our second SQL benchmark. The P320h is ahead of the remaining competitors, however.

Average IO latency continues to be a clear strength of the P320h.

Microsoft SQL WeeklyMaintenance - Disk Busy Time

Microsoft SQL WeeklyMaintenance - Average Service Time



Final Words

 


For Micron's first PCIe SSD, the P320h performs very well. Random read and write performance are unmatched by any non-SandForce architecture we've tested here. Average service times in our application based workload traces are also class leading, presumably a result of the IDT controller and lightweight driver stack. Sequential performance is also very good, and potentially even better under heavier workloads. The fact that there's no claimed performance difference between the 350GB and 700GB drives is good for users who don't have giant workload footprints. Overall it's an impressive step forward. The native PCIe architecture makes a lot of sense and will hopefully quickly supplant the current crop of SATA-RAID-on-a-PCIe-card solutions on the market today. Where things will get really interesting is when we start coupling multiple PCIe SSDs in a system.

The downsides to the P320h are obvious. By using 34nm SLC NAND Micron ensures wonderful endurance, but prices the solution out of reach of the many customers whose workloads don't require it. Until Micron brings eMLC/MLC-HET NAND to the P320h, I suspect the more conventional PCIe SSDs (e.g. Intel's SSD 910) will remain better values. For the subset of users who require SLC endurance, however, the P320h should definitely fit the bill.

The second downside is just as fundamental: the driver stack is still in its infancy. Although the ultimate goal is SATA-like compatibility with all systems, it will take some time to get there. Until that day comes, if you're considering the P320h you'll want to make sure that Micron has validated the drive on your platform.

PCIe is the future. I don't expect a smooth ride to get there, but it's where solid state storage is headed - particularly in the enterprise market. The P320h is a good starting point, and I'm eager to see where Micron takes it.
