Original Link: https://www.anandtech.com/show/5529/ocz-octane-113-firmware-update-faster-4kb-random-write-performance
OCZ Octane 1.13 Firmware Update: Improving 4KB Random Write Performance
by Anand Lal Shimpi on February 9, 2012 11:21 PM EST

The one thing that OCZ has been missing for so many years is finally one of its staples: focus. The same company that dabbled in everything from brain mice to DIY notebooks is now almost exclusively an SSD company that peddles power supplies on the side. OCZ's penchant for aggressively trying new things hasn't faded away, however. As an SSD maker, OCZ is currently, or will in the near future, be shipping drives based on controllers from three different vendors - each with its own strengths. OCZ's relationship with SandForce continues and the Vertex 3 remains OCZ's highest performing offering. A recent partnership with Marvell gives OCZ early experience with native PCIe based SSDs, experience that is extremely important as the industry marches towards a new PCIe based interface standard for SSDs (SATA Express). Finally there's OCZ's own controller, the Indilinx Everest, which it is quickly building momentum behind. It's obviously in OCZ's best interest to have its own controllers in the bulk of the drives it makes, but one doesn't simply build a better controller than everyone else on the first try.
Everest has promise. Its overall performance is competitive with SandForce based solutions, but the architecture has real limitations when it comes to long term random write performance. In my own internal tests I measured a worst case write amplification of over 50x in high queue depth 4KB random writes. The chart below gives you an idea of estimated write amplification for a number of controllers in a highly random workload:
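For reference, write amplification is just the ratio of data the controller actually commits to NAND to the data the host asked it to write. A minimal sketch of the arithmetic is below; the counter names are my own placeholders, not anything the Octane exposes directly:

```python
# Minimal sketch of how write amplification is computed. The inputs are
# placeholders; real drives report NAND and host write totals through
# vendor-specific SMART attributes or internal telemetry.

def write_amplification(nand_bytes_written: int, host_bytes_written: int) -> float:
    """Write amplification = bytes written to NAND / bytes written by the host."""
    if host_bytes_written == 0:
        raise ValueError("no host writes recorded")
    return nand_bytes_written / host_bytes_written

# Example: if the host issues 10GB of 4KB random writes and the controller
# ends up committing 500GB to NAND, write amplification is 50x.
print(write_amplification(500 * 2**30, 10 * 2**30))  # -> 50.0
```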
In our reviews of OCZ's Everest based Octane SSDs I mentioned that the high write amplification is really only an issue for enterprise workloads. As OCZ has high aspirations of being a player in the enterprise SSD space, it's clear that work needed to be done to make Everest enterprise-worthy.
At CES OCZ showed me Everest 2, which it claimed would get write amplification down to around 10x. OCZ wouldn't go into specifics other than to say that it would come through a combination of firmware and controller improvements. With Everest 2 due out this summer, I figured we wouldn't see much progress on the Everest 1 front in the meantime. I was wrong.
A few weeks ago OCZ released a firmware update for its Octane drives that promised a significant increase in 4KB random write performance. The Octane's original 4KB random write performance wasn't all that high, but it was good enough for most client workloads. The thing to keep in mind when it comes to random performance is that even the best hard drives are only good for a couple hundred random IOs per second. Any client workload that required close to a hundred thousand IOPS would simply be non-functional on any hard drive based system. Instead, being able to deliver around 20 - 40K IOPS seems to be the sweet spot for client SSDs until we move to an all-SSD world and developers can really begin to take advantage of all of the available IOPS on these drives.
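To put those figures in perspective, here's a quick back-of-the-envelope conversion (my own arithmetic, nothing more) between 4KB IOPS and raw throughput:

```python
# Back-of-the-envelope conversion between 4KB IOPS and throughput, to put
# a hard drive's couple hundred random IOs per second next to the
# 20 - 40K IOPS sweet spot for client SSDs.

IO_SIZE = 4 * 1024  # 4KB per random IO

def iops_to_mbs(iops: int, io_size: int = IO_SIZE) -> float:
    return iops * io_size / 10**6  # MB/s

for label, iops in [("hard drive (random)", 200),
                    ("client SSD sweet spot, low end", 20_000),
                    ("client SSD sweet spot, high end", 40_000)]:
    print(f"{label}: {iops} IOPS ~ {iops_to_mbs(iops):.1f} MB/s of 4KB random IO")
```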
Doubling random IO performance would surely make benchmarks look better, but I suspect this new firmware has more to do with Everest 2 and preparing for an entry into the enterprise market than with improving the client Octane experience. Indeed, if you look at estimated write amplification when running a highly random workload, the new firmware has a huge impact on existing Octane drives. While it's not quite as low as I'd like, it's clear that the new firmware is better if you're running a high queue depth, highly random workload - precisely the sort of thing an enterprise customer would be looking for.
You can update the Octane's firmware from within Windows using OCZ's toolbox, provided it's not your OS/boot drive. There's no support for alternate flash routes at this point, although OCZ says a Linux based updater is coming.
OCZ starts off by warning you that firmware 1.13 is a destructive flash. There are a number of reasons why an SSD update would get rid of all of your data, ranging from sloppy coding all the way up to significant firmware architecture changes. If you change the way LBAs are mapped to NAND pages you're faced with an interesting problem. Do you wipe the existing mapping tables and start from scratch, guaranteeing the best possible performance, or do you attempt to slowly migrate the old mappings to the new layout, preserving data but potentially significantly reducing performance? Users of the original X25-M may be familiar with the impact of the latter. If you had a lot of data on your X25-M and ran the first major update, you would've noticed much higher latency IO as the drive attempted to reorganize parts of itself with every write. If OCZ shifted the balance of its hybrid page/block mapped architecture a bit, mapping more LBAs directly to NAND pages, it was likely easier to just rebuild the tables rather than deal with the extra work involved in migrating to the new layout with live data in place.
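To make the trade-off concrete, here's a toy sketch of the two options a firmware update has when the mapping granularity changes. This is purely illustrative - the granularities and structures are my own assumptions, not OCZ's implementation:

```python
# Toy illustration of the rebuild-vs-migrate choice when a firmware update
# changes the LBA-to-NAND mapping granularity. All sizes are assumptions.

OLD_GRANULARITY = 512 * 1024   # old layout: one entry per 512KB block
NEW_GRANULARITY = 4 * 1024     # new layout: one entry per 4KB page

def rebuild(old_table: dict) -> dict:
    # Destructive route: throw the old table away and start clean.
    # User data is lost, but every new write lands in an optimal layout.
    return {}

def migrate(old_table: dict) -> dict:
    # Data-preserving route: expand each coarse entry into fine-grained
    # entries. Data survives, but the drive has to shuffle mappings (and
    # usually NAND pages) on the fly, which shows up as higher IO latency
    # after the update - the X25-M scenario described above.
    ratio = OLD_GRANULARITY // NEW_GRANULARITY
    new_table = {}
    for block_lba, nand_block in old_table.items():
        for i in range(ratio):
            new_table[block_lba * ratio + i] = (nand_block, i)
    return new_table
```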
OCZ's toolbox tries to determine whether or not your Octane is a boot drive by looking at the drive number enumerated by your system. The theory is that drive 0 should be your boot drive, while everything else is expendable. This is obviously a flawed theory as drive enumeration depends on more than the simple choice of SATA port. It's possible to have a secondary drive be detected first and thus appear to be drive 0 to Windows. If this happens, and your secondary drive is enumerated as drive 0, OCZ's toolbox won't allow you to update the firmware. The easiest way around this limitation is to simply boot your system with the Octane's SATA data cable detached (leave power attached) and plug the drive in once you're in Windows. Obviously notebook users are out of luck as this method won't work, although you could try hot plugging your drive while your notebook is on (you could theoretically damage your drive this way, so proceed at your own risk). While this gets you around the drive 0 limitation, the toolbox won't see your drive if you have Intel's RST drivers loaded - you need to be using Microsoft's AHCI drivers. I get that OCZ doesn't have quite the software engineering staff that Intel and Samsung do, but both of those companies allow their toolboxes to work with Intel's drivers installed, and as a competitor OCZ needs to offer a similar user experience.
With all of the requirements met and the Octane showing up as drive 1, I was able to upgrade the firmware. OCZ changed its firmware numbering scheme: this new update is version 1.13, compared to 1463 for its immediate predecessor. Midway through the update you'll get this message:
Ah ha! OCZ is changing its LBA mapping algorithm. After waiting a couple of minutes for the tables to be rebuilt, you need to shut down your machine (a soft reset never works for me with the Octane and firmware updates) and power it back on. After all of this, the Octane was up to 1.13 and I could begin testing.
Given that the focus of the update was to address small block random IO, it's likely that OCZ moved to a more granular mapping algorithm where each LBA now maps either directly to a single NAND page or to a smaller group of pages. Another possibility is that OCZ is using a hybrid page/block mapping algorithm where only a percentage of LBAs are page mapped while the rest are mapped to blocks; if that's the case, OCZ could have simply adjusted the balance between page and block mapped LBAs. The final option is a combination of the two.
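A rough sketch of what a hybrid page/block mapped lookup could look like is below. The structures and the promotion policy are entirely hypothetical; the point is simply that every LBA moved into the page-mapped set avoids the read-modify-write penalty that block mapping imposes on small random writes:

```python
# Hypothetical sketch of a hybrid page/block mapped flash translation layer.
# Shifting more LBAs into page_mapped_lbas is the "adjust the balance" knob
# described above; none of this reflects OCZ's actual firmware.

PAGE_SIZE = 4 * 1024
BLOCK_SIZE = 512 * 1024
PAGES_PER_BLOCK = BLOCK_SIZE // PAGE_SIZE

page_map = {}             # LBA -> physical NAND page (fine-grained, DRAM hungry)
block_map = {}            # logical block -> physical NAND block (coarse, compact)
page_mapped_lbas = set()  # LBAs the firmware has promoted to page mapping

def lookup(lba: int):
    if lba in page_mapped_lbas:
        # Page mapped: a 4KB random write only touches one NAND page.
        return ("page", page_map.get(lba))
    # Block mapped: a 4KB write inside this block can force a read-modify-write
    # of a whole 512KB block, which is where high write amplification comes from.
    logical_block, offset = divmod(lba, PAGES_PER_BLOCK)
    return ("block", block_map.get(logical_block), offset)
```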
Seeing as how the Octane has a ton of DRAM on-board (512MB) the controller should have no problems maintaining sequential performance even with more mapping entries to keep track of. Let's look at the numbers to see how the new firmware changes things.
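Some quick, assumption-laden arithmetic shows why: with 4KB pages and 4-byte entries (both assumptions on my part, not OCZ's actual table format), even a fully page-mapped table needs roughly 1MB of DRAM per 1GB of NAND, which the Octane has:

```python
# Back-of-envelope mapping table sizing. The 4KB page granularity and
# 4-byte entry size are assumptions, not OCZ's actual table format.

PAGE_SIZE = 4 * 1024   # assumed mapping granularity
ENTRY_SIZE = 4         # assumed bytes per mapping entry

def table_size_mb(capacity_gb: int) -> float:
    entries = capacity_gb * 2**30 // PAGE_SIZE
    return entries * ENTRY_SIZE / 2**20

for capacity in (128, 256, 512):
    print(f"{capacity}GB drive: ~{table_size_mb(capacity):.0f}MB of mapping table")
# Even a 512GB drive fits a fully page-mapped table in 512MB of DRAM.
```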
The Test
CPU: Intel Core i7 2600K running at 3.4GHz (Turbo & EIST Disabled) - for AT SB 2011, AS SSD & ATTO
Motherboard: Intel DH67BL Motherboard
Chipset: Intel H67
Chipset Drivers: Intel 9.1.1.1015 + Intel RST 10.2
Memory: Corsair Vengeance DDR3-1333 2 x 2GB (7-7-7-20)
Video Card: eVGA GeForce GTX 285
Video Drivers: NVIDIA ForceWare 190.38 64-bit
Desktop Resolution: 1920 x 1200
OS: Windows 7 x64
Random Read/Write Speed
The four corners of SSD performance are as follows: random read, random write, sequential read and sequential write speed. Random accesses are generally small in size, while sequential accesses tend to be larger and thus we have the four Iometer tests we use in all of our reviews.
Our first test writes 4KB in a completely random pattern over an 8GB space of the drive to simulate the sort of random access that you'd see on an OS drive (even this is more stressful than a normal desktop user would see). I perform three concurrent IOs and run the test for 3 minutes. The results reported are in average MB/s over the entire time. We use both standard pseudo-randomly generated data for each write as well as fully random data to show you both the maximum and minimum performance offered by SandForce based drives in these tests. The average performance of SF drives will likely be somewhere in between the two values for each drive you see in the graphs. For an understanding of why this matters, read our original SandForce article.
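For those who want to see the shape of the test rather than just the description, here's a minimal Python sketch in the same spirit. It is not our Iometer configuration: it runs at a queue depth of 1, doesn't bypass the OS cache, and the target path is a placeholder:

```python
# Rough sketch of a 4KB random-write test: 4KB writes scattered over an
# 8GB span for three minutes, reported as average MB/s. A real test would
# open the device with O_DIRECT (or equivalent) to bypass the page cache
# and keep several IOs outstanding, as Iometer does.

import os, random, time

SPAN = 8 * 2**30          # 8GB test span
IO_SIZE = 4 * 1024        # 4KB writes
DURATION = 180            # 3 minutes
TARGET = "testfile.bin"   # placeholder path on the drive under test

buf = os.urandom(IO_SIZE)  # incompressible data (the "fully random" case)
fd = os.open(TARGET, os.O_RDWR | os.O_CREAT)
os.ftruncate(fd, SPAN)

written = 0
deadline = time.time() + DURATION
while time.time() < deadline:
    offset = random.randrange(SPAN // IO_SIZE) * IO_SIZE  # 4KB-aligned offset
    os.pwrite(fd, buf, offset)
    written += IO_SIZE
os.close(fd)

print(f"average: {written / DURATION / 10**6:.1f} MB/s")
```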
Random read performance takes a small step backwards. We've generally found that anything north of 40MB/s tends to do pretty well in our client workloads, so I'm not too worried about the drop here. The big gains are, of course, in random writes:
The 128GB drive shows the biggest performance increase, more than doubling from the release firmware. The 512GB drive also gets faster here but by only 47%. The Octane is now faster than Intel's SSD 320 when it comes to random write performance. Again I don't expect this to do much for client workloads, but in the enterprise space things are different...
Many of you have asked for random write performance at higher queue depths. What I have below is our 4KB random write test performed at a queue depth of 32 instead of 3. While the vast majority of desktop usage models experience queue depths of 0 - 5, higher depths are possible in heavy I/O (and multi-user) workloads:
The gains continue at higher queue depths.
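If you want to approximate the higher queue depth case with the earlier sketch, the simplest (if crude) approach is to keep 32 writes in flight at once, for example with a pool of worker threads:

```python
# Crude approximation of the queue depth 32 variant of the earlier sketch:
# 32 worker threads each issue 4KB random writes until the timer expires.
# Same caveats as before - placeholder path, no O_DIRECT, not Iometer.

import os, random, time
from concurrent.futures import ThreadPoolExecutor

SPAN, IO_SIZE, DURATION, QD = 8 * 2**30, 4 * 1024, 180, 32
TARGET = "testfile.bin"   # placeholder path on the drive under test

buf = os.urandom(IO_SIZE)
fd = os.open(TARGET, os.O_RDWR | os.O_CREAT)
os.ftruncate(fd, SPAN)
deadline = time.time() + DURATION

def worker(_: int) -> int:
    written = 0
    while time.time() < deadline:
        offset = random.randrange(SPAN // IO_SIZE) * IO_SIZE
        os.pwrite(fd, buf, offset)   # pwrite releases the GIL during the syscall
        written += IO_SIZE
    return written

with ThreadPoolExecutor(max_workers=QD) as pool:
    total = sum(pool.map(worker, range(QD)))
os.close(fd)

print(f"average: {total / DURATION / 10**6:.1f} MB/s")
```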
Sequential Read/Write Speed
To measure sequential performance I ran a 1 minute long 128KB sequential test over the entire span of the drive at a queue depth of 1. The results reported are in average MB/s over the entire test length.
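A comparable sketch for the sequential case would read the drive front to back in 128KB chunks at a queue depth of 1. The device path is a placeholder, and on Linux a real run would use O_DIRECT and elevated privileges:

```python
# Rough sketch of a 128KB sequential read test at queue depth 1 for one
# minute, reported as average MB/s. /dev/sdX is a placeholder for the
# drive under test; reading a raw device requires root.

import os, time

IO_SIZE = 128 * 1024
DURATION = 60
DEVICE = "/dev/sdX"   # placeholder

fd = os.open(DEVICE, os.O_RDONLY)
read_bytes, offset = 0, 0
deadline = time.time() + DURATION
while time.time() < deadline:
    chunk = os.pread(fd, IO_SIZE, offset)
    if not chunk:          # hit the end of the drive - wrap around
        offset = 0
        continue
    offset += len(chunk)
    read_bytes += len(chunk)
os.close(fd)

print(f"average: {read_bytes / DURATION / 10**6:.1f} MB/s")
```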
While random performance goes up, sequential read performance actually drops a bit. This shouldn't happen given how much DRAM is available to cache larger tables on each Octane drive, but needless to say, performance did go down.
Here's where things get a little confusing. Sequential write performance on the 512GB drive went down, while the 128GB drive saw higher performance. Again, this shouldn't be happening, and why there's a difference based on capacity is odd. OCZ tells me that 128/256GB drives should see performance gains here and only the 512GB drive will see a drop. The only thing I can think of is that the 512GB drive has more NAND die across which writes need to be split. Why that would cause performance to go down is a mystery to me. If I had to guess, I'd say this is something OCZ should be able to fix with a future firmware update.
AS-SSD Incompressible Sequential Performance
The AS-SSD sequential benchmark uses incompressible data for all of its transfers. The result is a pretty big reduction in sequential write speed on SandForce based controllers.
It's the same story in AS-SSD: both drives are slower on reads, and the 128GB drive gets faster on writes while the 512GB drive loses a lot of its write speed.
AnandTech Storage Bench 2011
Last year we introduced our AnandTech Storage Bench, a suite of benchmarks that took traces of real OS/application usage and played them back in a repeatable manner. I assembled the traces myself out of frustration with the majority of what we have today in terms of SSD benchmarks.
Although the AnandTech Storage Bench tests did a good job of characterizing SSD performance, they weren't stressful enough. All of the tests performed less than 10GB of reads/writes and typically involved only 4GB of writes specifically. That's not even enough to exceed the spare area on most SSDs. Most canned SSD benchmarks don't even come close to writing a single gigabyte of data, but that doesn't mean that simply writing 4GB is acceptable.
Originally I kept the benchmarks short enough that they wouldn't be a burden to run (~30 minutes) but long enough that they were representative of what a power user might do with their system.
Not too long ago I tweeted that I had created what I referred to as the Mother of All SSD Benchmarks (MOASB). Rather than only writing 4GB of data to the drive, this benchmark writes 106.32GB. It's the load you'd put on a drive after nearly two weeks of constant usage. And it takes a *long* time to run.
1) The MOASB, officially called AnandTech Storage Bench 2011 - Heavy Workload, mainly focuses on the times when your I/O activity is the highest. There is a lot of downloading and application installing that happens during the course of this test. My thinking was that it's during application installs, file copies, downloading and multitasking with all of this that you can really notice performance differences between drives.
2) I tried to cover as many bases as possible with the software I incorporated into this test. There's a lot of photo editing in Photoshop, HTML editing in Dreamweaver, web browsing, game playing/level loading (Starcraft II & WoW are both a part of the test) as well as general use stuff (application installing, virus scanning). I included a large amount of email downloading, document creation and editing as well. To top it all off I even use Visual Studio 2008 to build Chromium during the test.
The test has 2,168,893 read operations and 1,783,447 write operations. The IO breakdown is as follows:
AnandTech Storage Bench 2011 - Heavy Workload IO Breakdown
IO Size | % of Total
4KB | 28%
16KB | 10%
32KB | 10%
64KB | 4%
Only 42% of all operations are sequential; the rest range from pseudo-random to fully random (with most falling in the pseudo-random category). Average queue depth is 4.625 IOs, with 59% of operations taking place in an IO queue of 1.
Many of you have asked for a better way to really characterize performance. Simply looking at IOPS doesn't really say much. As a result I'm going to be presenting Storage Bench 2011 data in a slightly different way. We'll have performance represented as Average MB/s, with higher numbers being better. At the same time I'll be reporting how long the SSD was busy while running this test. These disk busy graphs will show you exactly how much time was shaved off by using a faster drive vs. a slower one during the course of this test. Finally, I will also break out performance into reads, writes and combined. The reason I do this is to help balance out the fact that this test is unusually write intensive, which can often hide the benefits of a drive with good read performance.
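If it helps to see how the two new metrics relate, here's a small sketch of the bookkeeping. The trace record format is my own invention, not our actual Storage Bench tooling; it just shows that average MB/s divides bytes moved by service time, and that disk busy time is that service time with all idle gaps excluded:

```python
# Sketch of how average data rate and disk busy time fall out of a trace
# playback log. The (is_write, bytes, service_time) record format is
# hypothetical.

def summarize(trace):
    totals = {"read": [0, 0.0], "write": [0, 0.0]}  # [bytes, busy seconds]
    for is_write, nbytes, service_time in trace:
        key = "write" if is_write else "read"
        totals[key][0] += nbytes
        totals[key][1] += service_time

    busy = totals["read"][1] + totals["write"][1]
    all_bytes = totals["read"][0] + totals["write"][0]
    return {
        "read MB/s": totals["read"][0] / totals["read"][1] / 10**6,
        "write MB/s": totals["write"][0] / totals["write"][1] / 10**6,
        "combined MB/s": all_bytes / busy / 10**6,
        "disk busy (s)": busy,   # idle time between IOs is excluded entirely
    }
```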
There's also a new light workload for 2011. This is a far more reasonable, typical every day use case benchmark. Lots of web browsing, photo editing (but with a greater focus on photo consumption), video playback as well as some application installs and gaming. This test isn't nearly as write intensive as the MOASB but it's still multiple times more write intensive than what we were running last year.
As always I don't believe that these two benchmarks alone are enough to characterize the performance of a drive, but hopefully along with the rest of our tests they will help provide a better idea.
The testbed for Storage Bench 2011 has changed as well. We're now using a Sandy Bridge platform with full 6Gbps support for these tests.
AnandTech Storage Bench 2011 - Heavy Workload
We'll start out by looking at average data rate throughout our new heavy workload test:
The good news is that overall performance is mostly unaffected. The 512GB drive took a small hit while the 128GB drive actually saw its overall performance increase.
The next chart represents the same data, but in a different manner. Instead of looking at average data rate, we're looking at how long the disk was busy during this entire test. Note that disk busy time excludes any and all idle time; this is just how long the SSD was busy doing something:
AnandTech Storage Bench 2011 - Light Workload
Our new light workload actually has more write operations than read operations. The split is as follows: 372,630 reads and 459,709 writes. The relatively close read/write ratio does a better job of mimicking a typical light workload (although even lighter workloads would be far more read centric).
The I/O breakdown is similar to the heavy workload at small IO sizes; however, you'll notice that there are far fewer large IO transfers:
AnandTech Storage Bench 2011 - Light Workload IO Breakdown
IO Size | % of Total
4KB | 27%
16KB | 8%
32KB | 6%
64KB | 5%
Our lighter workload actually showed more of a performance drop for the 512GB drive, while the 128GB drive's performance remained unchanged.
Performance Over Time
I was curious to see if the latest firmware improved the Octane's worst case scenario. Although write amplification is definitely down, it's still a problem as the drive can get into a pretty nasty performance state when subjected to constant random writes. I suspect we'll need to see a more significant effort on the firmware side to get this addressed. Perhaps that's something we'll need Everest 2 for.
Final Words
I asked OCZ if the 1.13 firmware update offered any bug fixes or if it was purely for performance; the answer was the latter - it's a performance upgrade, nothing else. If you've got a 128GB or 256GB drive, the upgrade is worthwhile. If you've got a 512GB drive, however, you may want to hold off as there's no real benefit. The only exception would be if you've deployed your Octane in a server that's subjected to tons of random writes. I suspect even the bravest enterprise customers aren't too keen on adopting a fairly new consumer drive for use in their servers, though.
The bigger news is that OCZ is clearly addressing one of the performance issues with Octane and the Everest platform. There's still more room to improve, but this is an important step toward hardening Everest. Reducing write amplification and improving random write performance will make Everest more feasible for enterprise workloads, although it may ultimately be Everest 2 that gets OCZ all the way there.
As far as Octane goes, I'm still in wait-and-see mode with this drive. I have one Octane deployed in a system here that's used daily. The drive has been problem-free thus far, but we've still got several months of testing before I'm totally comfortable. The competition is tough for sure (particularly after this last round of Intel and Samsung launches), but the market is growing quickly enough that there's still room for multiple controller vendors.