Original Link: https://www.anandtech.com/show/4329/intel-z68-chipset-smart-response-technology-ssd-caching-review
Intel Z68 Chipset & Smart Response Technology (SSD Caching) Review
by Anand Lal Shimpi on May 11, 2011 2:34 AM EST
The problem with Sandy Bridge was simple: if you wanted to use Intel's integrated graphics, you had to buy a motherboard based on an H-series chipset. Unfortunately, Intel's H-series chipsets don't let you overclock the CPU or memory, only the integrated GPU. If you want to overclock the CPU and/or memory, you need a P-series chipset, which doesn't support Sandy Bridge's on-die GPU. Intel effectively forced overclockers to buy discrete GPUs from AMD or NVIDIA, even if they didn't need the added GPU power.
The situation got more complicated from there. Sandy Bridge's Quick Sync was one of the best features of the platform, however it was only available when you used the CPU's on-die GPU, which once again meant you needed an H-series chipset with no support for overclocking. You could either have Quick Sync or overclocking, but not both (at least initially).
Finally, Intel did very little to actually move chipsets forward with its 6-series Sandy Bridge platform. Native USB 3.0 support was out and won't be included until Ivy Bridge; we got a pair of 6Gbps SATA ports and PCIe 2.0 slots but not much else. I can't help but feel like Intel was purposefully very conservative with its chipset design. Despite all of that, the seemingly conservative chipset design was plagued by the single largest bug Intel has ever faced publicly.
As strong as the Sandy Bridge launch was, the 6-series chipset did little to help it.
Addressing the Problems: Z68
In our Sandy Bridge review I mentioned a chipset that would come out in Q2 that would solve most of Sandy Bridge's platform issues. A quick look at the calendar reveals that it's indeed the second quarter of the year, and a quick look at the photo below reveals the first motherboard to hit our labs based on Intel's new Z68 chipset:
Architecturally Intel's Z68 chipset is no different than the H67. It supports video output from any Sandy Bridge CPU and has the same number of USB, SATA and PCIe lanes. What the Z68 chipset adds however is full overclocking support for CPU, memory and integrated graphics giving you the choice to do pretty much anything you'd want.
Pricing should be similar to P67, with motherboards selling for a $5 to $10 premium. Not all Z68 motherboards will come with video out; those that do may carry an additional $5 premium on top of that to cover the licensing fees for Lucid's Virtu software, which will likely be bundled with most if not all Z68 motherboards that have iGPU out. Lucid's software excluded, any price premium is a little ridiculous here given that the functionality offered by Z68 should've been there from the start. I'm hoping over time Intel will come to its senses but for now, Z68 will still be sold at a slight premium over P67.
Overclocking: It Works
Ian will have more on overclocking in his article on ASUS' first Z68 motherboard, but in short it works as expected. You can use Sandy Bridge's integrated graphics and still overclock your CPU. Of course the Sandy Bridge overclocking limits still apply—if you don't have a CPU that supports Turbo (e.g. Core i3 2100), your chip is entirely clock locked.
Ian found that overclocking behavior on Z68 was pretty similar to P67. You can obviously also overclock the on-die GPU on Z68 boards with video out.
The Quick Sync Problem
Back in February we previewed Lucid's Virtu software, which allows you to have a discrete GPU but still use Sandy Bridge's on-die GPU for Quick Sync, video decoding and basic 2D/3D acceleration.
Virtu works by intercepting the command stream directed at your GPU. Depending on the source of the commands, they are directed at either your discrete GPU (dGPU) or on-die GPU (iGPU).
There are two physical approaches to setting up Virtu. You can either connect your display to the iGPU or dGPU. If you do the former (i-mode), the iGPU handles all display duties and any rendering done on the dGPU has to be copied over to the iGPU's frame buffer before being output to your display. Note that you can run an application in a window that requires the dGPU while running another that uses the iGPU (e.g. Quick Sync).
As you can guess, there is some amount of overhead in the process, which we've measured to varying degrees. When it works well the overhead is typically limited to around 10%, however we've seen situations where a native dGPU setup is over 40% faster.
Lucid Virtu i-mode Performance Comparison (1920 x 1200, Highest Quality Settings)
Configuration | Metro 2033 | Mafia II | World of Warcraft | Starcraft 2 | DiRT 2
AMD Radeon HD 6970 | 35.2 fps | 61.5 fps | 81.3 fps | 115.6 fps | 137.7 fps
AMD Radeon HD 6970 (Virtu) | 24.3 fps | 58.7 fps | 74.8 fps | 116.6 fps | 117.9 fps
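To put numbers on that overhead, here is a small sketch that works out how much faster the native dGPU setup is than Virtu i-mode for each game in the table above. The frame rates come straight from the table; the function and variable names are just illustrative.

```python
# Frame rates copied from the table above (1920 x 1200, highest quality settings).
native_fps = {
    "Metro 2033": 35.2,
    "Mafia II": 61.5,
    "World of Warcraft": 81.3,
    "Starcraft 2": 115.6,
    "DiRT 2": 137.7,
}
virtu_fps = {
    "Metro 2033": 24.3,
    "Mafia II": 58.7,
    "World of Warcraft": 74.8,
    "Starcraft 2": 116.6,
    "DiRT 2": 117.9,
}


def native_advantage_pct(native, virtu):
    """Percent by which the native dGPU result beats the Virtu i-mode result."""
    return (native / virtu - 1.0) * 100.0


advantage = {
    game: native_advantage_pct(native_fps[game], virtu_fps[game])
    for game in native_fps
}

for game, pct in advantage.items():
    print(f"{game}: native dGPU is {pct:+.1f}% vs. Virtu i-mode")
```

Metro 2033 is the outlier at roughly 45% in favor of the native setup, while Starcraft 2 comes out slightly negative, which is within the noise you'd expect from run-to-run variation.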
The dGPU doesn't completely turn off when it's not in use in this situation, however it will be in its lowest possible idle state.
The second approach (d-mode) requires that you connect your display directly to the dGPU. This is the preferred route for the absolute best 3D performance since there's no copying of frame buffers. The downside here is that you will likely have higher idle power as Sandy Bridge's on-die GPU is probably more power efficient under non-3D gaming loads than any high end discrete GPU.
With a display connected to the dGPU and with Virtu running you can still access Quick Sync. CrossFire and SLI are both supported in d-mode only.
As I mentioned before, Lucid determines where to send commands based on the source of the commands. In i-mode all commands go to the iGPU by default, and in d-mode everything goes to the dGPU. The only exceptions are if there are particular application profiles defined within the Virtu software that list exceptions. In i-mode that means a list of games/apps that should run on the dGPU, and in d-mode that is a smaller list of apps that use Quick Sync (as everything else should run on the dGPU).
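The routing rule described above can be sketched in a few lines. This is only a model of the behavior as described, not Lucid's actual code: the profile lists and executable names below are hypothetical placeholders.

```python
from enum import Enum


class Gpu(Enum):
    IGPU = "on-die GPU"
    DGPU = "discrete GPU"


class Mode(Enum):
    I_MODE = "display connected to the iGPU"
    D_MODE = "display connected to the dGPU"


# Hypothetical profile lists; the real Virtu profiles are not reproduced here.
I_MODE_EXCEPTIONS = {"metro2033.exe", "sc2.exe"}   # games that should run on the dGPU
D_MODE_EXCEPTIONS = {"transcoder.exe"}             # apps that use Quick Sync on the iGPU


def route(mode: Mode, app: str) -> Gpu:
    """Pick a GPU from the mode's default plus its list of exceptions."""
    name = app.lower()
    if mode is Mode.I_MODE:
        # Default is the iGPU; listed games are redirected to the dGPU.
        return Gpu.DGPU if name in I_MODE_EXCEPTIONS else Gpu.IGPU
    # d-mode: default is the dGPU; listed Quick Sync apps go to the iGPU.
    return Gpu.IGPU if name in D_MODE_EXCEPTIONS else Gpu.DGPU


print(route(Mode.I_MODE, "sc2.exe").value)
print(route(Mode.D_MODE, "transcoder.exe").value)
```

The key point is that the decision depends only on which application issued the commands, so an app missing from the relevant list simply falls through to the mode's default GPU.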
Virtu works although there are still some random issues when running in i-mode. Your best bet to keep Quick Sync functionality and maintain the best overall 3D performance is to hook your display up to your dGPU and only use Sandy Bridge's GPU for transcoding. Ultimately I'd like to see Intel enable this functionality without the use of 3rd party software utilities.
SSD Caching
We finally have a Sandy Bridge chipset that can overclock and use integrated graphics, but that's not what's most interesting about Intel's Z68 launch. This next feature is.
Alongside Z68, Intel is introducing a feature called Smart Response Technology (SRT), originally known as SSD Caching. Make no mistake, this isn't a hardware feature, but it's something that Intel is only enabling on Z68. All of the work is done entirely in Intel's RST 10.5 software, which will be made available for all 6-series chipsets, but Smart Response Technology is artificially bound to Z68 alone (and some mobile chipsets: HM67, QM67).
It's Intel's way of giving Z68 owners some value for their money, but it's also a silly way to support your most loyal customers—the earliest adopters of Sandy Bridge platforms who bought motherboards, CPUs and systems before Z68 was made available.
What does Smart Response Technology do? It takes a page from enterprise storage architecture and lets you use a small SSD as a full read/write cache for a hard drive or RAID array.
With the Z68 SATA controllers set to RAID (SRT won't work in AHCI or IDE modes) just install Windows 7 on your hard drive like you normally would. With Intel's RST 10.5 drivers and a spare SSD installed (from any manufacturer) you can choose to use up to 64GB of the SSD as a cache for all accesses to the hard drive. Any space above 64GB is left untouched for you to use as a separate drive letter.
Intel limited the maximum cache size to 64GB as it saw little benefit in internal tests to making the cache larger than that. Admittedly after a certain size you're better off just keeping your frequently used applications on the SSD itself and manually storing everything else on a hard drive.
Unlike Seagate's Momentus XT, both reads and writes are cached with SRT enabled. Intel allows two modes of write caching: enhanced and maximized. Enhanced mode makes the SSD cache behave as a write through cache, where every write must hit both the SSD cache and hard drive before moving on. In maximized mode, by contrast, the SSD cache behaves more like a write back cache, where writes hit the SSD and are eventually written back to the hard drive but not immediately.
Enhanced mode is the most secure, but it limits the overall performance improvement you'll see as write performance will still be bound by the performance of your hard drive (or array). In enhanced mode, if you disconnect your SSD cache or the SSD dies, your system will continue to function normally. Note that you may still see an improvement in write performance vs. a non-cached hard drive because the SSD offloading read requests can free up your hard drive to better fulfill write requests.
Maximized mode offers the greatest performance benefit, however it also comes at the greatest risk. There's obviously the chance that you lose power before the SSD cache is able to commit writes to your hard drive. The bigger issue is that if something happens to your SSD cache, there's a chance you could lose data. To make matters worse, if your SSD cache dies and it was caching a bootable volume, your system will no longer boot. I suspect this situation is a bit overly cautious on Intel's part, but that's the functionality of the current version of Intel's 10.5 drivers.
Moving a drive with a maximized SSD cache enabled requires that you either move the SSD cache with it, or disable the SSD cache first. Again, Intel seems to be more cautious than necessary here.
The upside is of course performance as I mentioned before. Cacheable writes just have to hit the SSD before being considered serviced. Intel then conservatively writes that data back to the hard drive later on.
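The difference between the two modes is easier to see in a toy model. The sketch below is a generic write-through vs. write-back cache, assuming a simple dictionary per device; it is not Intel's implementation, and the class and method names are made up for illustration.

```python
class CachedVolume:
    """Toy model of an SSD cache in front of a hard drive, keyed by LBA."""

    def __init__(self, write_back: bool):
        self.write_back = write_back  # False ~ enhanced mode, True ~ maximized mode
        self.ssd = {}                 # lba -> data held in the cache
        self.hdd = {}                 # lba -> data on the backing hard drive
        self.dirty = set()            # cached LBAs that are newer than the HDD copy

    def write(self, lba: int, data: bytes) -> None:
        self.ssd[lba] = data
        if self.write_back:
            # Write-back: the write counts as serviced once it is on the SSD.
            self.dirty.add(lba)
        else:
            # Write-through: the write must also reach the hard drive.
            self.hdd[lba] = data

    def flush(self) -> None:
        """Commit deferred writes to the hard drive."""
        for lba in sorted(self.dirty):
            self.hdd[lba] = self.ssd[lba]
        self.dirty.clear()

    def lose_ssd(self) -> set:
        """Simulate the cache SSD dying; return LBAs whose latest data is gone."""
        lost = set(self.dirty)
        self.ssd.clear()
        self.dirty.clear()
        return lost


enhanced = CachedVolume(write_back=False)
maximized = CachedVolume(write_back=True)
for volume in (enhanced, maximized):
    volume.write(100, b"new data")

print("enhanced lost:", enhanced.lose_ssd())
print("maximized lost:", maximized.lose_ssd())
```

In the write-through case the hard drive always has a current copy, so losing the SSD costs nothing but speed. In the write-back case anything still marked dirty exists only on the SSD, which is exactly the risk described above.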
An Intelligent, Persistent Cache
Intel's SRT functions like an actual cache. Rather than caching individual files, Intel focuses on frequently accessed LBAs (logical block addresses). Read a block enough times or write to it enough times and those accesses will get pulled into the SSD cache until it's full. When full, the least recently used data gets evicted making room for new data.
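As a rough illustration of that policy, here is a minimal sketch of a block cache that promotes an LBA after repeated accesses and evicts the least recently used entry when full. The `promote_after` threshold is a hypothetical parameter; Intel hasn't said how many accesses count as "enough".

```python
from collections import Counter, OrderedDict


class LbaCache:
    """Toy LBA cache: promote after repeated use, evict least recently used."""

    def __init__(self, capacity_blocks: int, promote_after: int = 2):
        self.capacity = capacity_blocks
        self.promote_after = promote_after  # assumed threshold, not Intel's real value
        self.seen = Counter()               # access counts for blocks not yet cached
        self.cached = OrderedDict()         # lba -> True, least recently used first

    def access(self, lba: int) -> bool:
        """Record an access; return True if it was served from the cache."""
        if lba in self.cached:
            self.cached.move_to_end(lba)    # mark as most recently used
            return True
        if self.capacity <= 0:
            return False
        self.seen[lba] += 1
        if self.seen[lba] >= self.promote_after:
            if len(self.cached) >= self.capacity:
                self.cached.popitem(last=False)  # evict the least recently used block
            self.cached[lba] = True
            del self.seen[lba]
        return False


cache = LbaCache(capacity_blocks=2)
for lba in (1, 1, 1, 2, 2, 3, 3):
    print(lba, "hit" if cache.access(lba) else "miss")
print("cached LBAs:", list(cache.cached))
```

With room for only two blocks, promoting LBA 3 pushes out LBA 1, the block that was touched longest ago. Scale that up and you get the eviction behavior examined later in this review.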
Since SSDs use NAND flash, cache data is kept persistent between reboots and power cycles. Data won't leave the cache unless it gets forced out due to lack of space/use or you disable the cache altogether. A persistent cache is very important because it means that the performance of your system will hopefully match how you use it. If you run a handful of applications very frequently, the most frequently used areas of those applications should always be present in your SSD cache.
Intel claims it's very careful not to dirty the SSD cache. If it detects sequential accesses beyond a few MB in length, that data isn't cached. The same goes for virus scan accesses, however it's less clear what Intel uses to determine that a virus scan is running. In theory this should mean that simply copying files or scanning for viruses shouldn't kick frequently used applications and data out of cache, however that doesn't mean other things won't.
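One plausible way to implement the sequential filter is to track the length of the current run of back-to-back requests and stop caching once it passes a limit. The sketch below assumes 512-byte sectors and a 4MB cutoff; both values are guesses standing in for Intel's "a few MB", and the class name is hypothetical.

```python
SECTOR_BYTES = 512                        # assumed sector size
SEQUENTIAL_LIMIT_BYTES = 4 * 1024 * 1024  # assumed stand-in for "a few MB"


class SequentialFilter:
    """Decide whether a request belongs to a long sequential run."""

    def __init__(self, limit_bytes: int = SEQUENTIAL_LIMIT_BYTES):
        self.limit = limit_bytes
        self.next_lba = None  # LBA that would continue the current run
        self.run_bytes = 0    # bytes transferred in the current run

    def should_cache(self, lba: int, sectors: int) -> bool:
        size = sectors * SECTOR_BYTES
        if lba == self.next_lba:
            self.run_bytes += size  # request continues the current run
        else:
            self.run_bytes = size   # a jump starts a new run
        self.next_lba = lba + sectors
        return self.run_bytes <= self.limit


flt = SequentialFilter(limit_bytes=1024 * 1024)
# Nine back-to-back 128KB requests, then one random access.
decisions = [flt.should_cache(i * 256, 256) for i in range(9)]
decisions.append(flt.should_cache(10_000_000, 8))
print(decisions)
```

A large file copy shows up as one long run and stops being cached once it crosses the limit, while a random access resets the counter. That would explain why file copies don't flush the cache but a pile of small, scattered reads from newly launched applications can.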
Intel's SSD 311 20GB: Designed to Cache
Although SRT supports any SSD, Intel created a brand new drive specifically for use as a cache with Z68 platforms. This is the Intel SSD 311, codenamed Larson Creek:
The SSD 311 uses the same controller as Intel's X25-M G2, SSD 310 and SSD 320 drives:
The big difference here is the SSD 311 comes with 20GB of 34nm SLC NAND. If you remember back to the SSD Anthology, SLC NAND is architecturally identical to MLC NAND. With half as many bits stored per NAND cell, SLC NAND not only lasts longer than MLC NAND but is also much faster, particularly for writes.
As a cache that'll be constantly written to, SLC NAND isn't a bad decision on Intel's part. Intel insists that the move wasn't motivated by reliability but rather write performance.
A quick look at the performance of the SSD 311 shows that it packs a lot of punch for being a small 20GB drive with only 5 of 10 NAND channels populated:
The SSD 311 basically offers the performance of a 160GB X25-M G2 but with fewer NAND channels and a much lower capacity.
Remember this is SLC NAND so despite only being a 20GB drive, it's priced more like a 40GB MLC drive: Intel expects the SSD 311 to retail for $110. Thankfully you aren't locked in to only using Intel drives as Smart Response Technology will work with any SSD.
Application & Game Launch Performance: Virtually Indistinguishable from an SSD
We'll get to our standard benchmark suite in a second, but with a technology like SRT we need more to truly understand how it's going to behave in all circumstances. Let's start with something simple: application launch time.
I set up a Z68 system with a 3TB Seagate Barracuda 7200RPM HDD and Intel's 20GB SSD 311. I timed how long it took to launch various applications both with and without the SSD cache enabled. Note that the first launch of anything with SSD caching enabled doesn't run any faster; it's the second, third, etc... times that you launch an application that the SSD cache will come into effect. I ran every application once, rebooted the system, then timed how long it took to launch both in the HDD and caching configurations:
Application Launch Comparison
Intel SSD 311 20GB Cache | Adobe Photoshop CS5.5 | Adobe After Effects CS5.5 | Adobe Dreamweaver CS5.5 | Adobe Illustrator CS5.5 | Adobe Premiere Pro CS5.5
Seagate Barracuda 3TB (No cache) | 7.1 seconds | 19.3 seconds | 8.0 seconds | 6.1 seconds | 10.4 seconds
Seagate Barracuda 3TB (Enhanced Cache) | 5.0 seconds | 11.3 seconds | 5.5 seconds | 3.9 seconds | 4.7 seconds
Seagate Barracuda 3TB (Maximize Cache) | 3.8 seconds | 10.6 seconds | 5.2 seconds | 4.2 seconds | 3.8 seconds
These are pretty big improvements! Boot time and multitasking immediately after boot also benefit tremendously:
Boot & Multitasking After Boot Comparison
Configuration | Boot Time (POST to Desktop) | Launch Adobe Premiere + Chrome + WoW Immediately After Boot
Seagate Barracuda 3TB (No cache) | 55.5 seconds | 37.0 seconds
Seagate Barracuda 3TB (Enhanced Cache) | 35.8 seconds | 12.3 seconds
Seagate Barracuda 3TB (Maximize Cache) | 32.6 seconds | 12.6 seconds
Let's look at the impact on gaming performance. This time we'll also toss in a high end standalone SSD:
Game Load Comparison
Intel SSD 311 20GB Cache | Portal 2 (Game Launch) | Portal 2 (Level Load) | StarCraft 2 (Game Launch) | StarCraft 2 (Level Load) | World of Warcraft (Game Launch) | World of Warcraft (Level Load)
Seagate Barracuda 3TB (No cache) | 12.0 seconds | 17.1 seconds | 15.3 seconds | 23.3 seconds | 5.3 seconds | 11.9 seconds
Seagate Barracuda 3TB (Enhanced Cache) | 10.3 seconds | 15.0 seconds | 10.3 seconds | 15.1 seconds | 5.2 seconds | 5.6 seconds
Seagate Barracuda 3TB (Maximize Cache) | 9.9 seconds | 15.1 seconds | 9.7 seconds | 15.0 seconds | 4.5 seconds | 5.8 seconds
OCZ Vertex 3 240GB (6Gbps) | 8.5 seconds | 13.1 seconds | 7.5 seconds | 14.5 seconds | 4.1 seconds | 4.7 seconds
While the Vertex 3 is still a bit faster, you can't argue that Intel's SRT doesn't deliver most of the SSD experience at a fraction of the cost—at least when it comes to individual application performance.
Look at what happens when we reboot and run the application launch tests a third time:
Game Load Comparison
Intel SSD 311 20GB Cache | Portal 2 (Game Launch) | Portal 2 (Level Load) | StarCraft 2 (Game Launch) | StarCraft 2 (Level Load) | World of Warcraft (Game Launch) | World of Warcraft (Level Load)
Seagate Barracuda 3TB (No cache) | 12.0 seconds | 17.1 seconds | 15.3 seconds | 23.3 seconds | 5.3 seconds | 11.9 seconds
Seagate Barracuda 3TB (Enhanced Cache) | 10.3 seconds | 15.0 seconds | 10.3 seconds | 15.1 seconds | 5.2 seconds | 5.6 seconds
Seagate Barracuda 3TB (Maximize Cache) | 9.9 seconds | 15.1 seconds | 9.7 seconds | 15.0 seconds | 4.5 seconds | 5.8 seconds
Seagate Barracuda 3TB (Maximize Cache), Run 3 | 9.9 seconds | 14.8 seconds | 8.1 seconds | 14.9 seconds | 4.4 seconds | 4.3 seconds
OCZ Vertex 3 240GB (6Gbps) | 8.5 seconds | 13.1 seconds | 7.5 seconds | 14.5 seconds | 4.1 seconds | 4.7 seconds
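One way to summarize the table above is to ask how much of the gap between the bare hard drive and the standalone SSD the cache closes on the third run. The sketch below does that arithmetic with the numbers from the table; the "gap closed" metric is my own framing rather than anything Intel defines.

```python
# Load times in seconds, copied from the table above.
# Each tuple is (HDD with no cache, maximized cache on run 3, standalone Vertex 3).
load_times = {
    "Portal 2 (Game Launch)": (12.0, 9.9, 8.5),
    "Portal 2 (Level Load)": (17.1, 14.8, 13.1),
    "StarCraft 2 (Game Launch)": (15.3, 8.1, 7.5),
    "StarCraft 2 (Level Load)": (23.3, 14.9, 14.5),
    "World of Warcraft (Game Launch)": (5.3, 4.4, 4.1),
    "World of Warcraft (Level Load)": (11.9, 4.3, 4.7),
}


def gap_closed(hdd: float, cached: float, ssd: float) -> float:
    """Fraction of the HDD-to-SSD load time gap removed by the cache (1.0 = parity)."""
    return (hdd - cached) / (hdd - ssd)


results = {name: gap_closed(*times) for name, times in load_times.items()}

for name, fraction in results.items():
    print(f"{name}: {fraction:.0%} of the gap closed")
```

By this measure the StarCraft 2 and World of Warcraft results sit at or above roughly 75%, with the WoW level load actually edging out the Vertex 3, while Portal 2 closes closer to 60% of the gap.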
Performance keeps going up. The maximized SRT system is now virtually indistinguishable from the standalone SSD system.
Gaming is actually a pretty big reason to consider using Intel SRT since games can eat up a lot of storage space. Personally I keep one or two frequently used titles on my SSD; everything else goes on the HDD array. As the numbers above show however, there's a definite performance benefit to deploying an SSD cache in a gaming environment.
I was curious how high of a hit rate I'd see within a game loading multiple levels rather than just the same level over and over again. I worried that Intel's SRT would only cache the most frequently used level and not improve performance across the board. I was wrong.
StarCraft 2 Level Loading: Seagate Barracuda 3TB (Maximize Cache)
Levels Loaded in Order | Load Time
Agria Valley | 16.1 seconds
Blistering Sands | 4.5 seconds
Nightmare | 4.8 seconds
Tempest | 6.3 seconds
Zenith | 6.2 seconds
Remember that SRT works by caching frequently accessed LBAs, many of which may be reused even across different levels. In the case of StarCraft 2, only the first multiplayer level load took a long time as its assets and other game files were cached. All subsequent level loads completed much quicker. Note that this isn't exclusive to SSD caching as you can benefit from some of this data being resident in memory as well.
The Downside: Consistency
Initially it's very easy to get excited about Intel's SRT. If you only run a handful of applications, you'll likely get performance similar to that of a standalone SSD without all of the cost and size limitations. Unfortunately, at least when paired with Intel's SSD 311, it doesn't take much to kick some of that data out of the cache.
To put eviction to the test, I ran through three games—Portal 2, Starcraft 2 and World of Warcraft—then I installed the entire Adobe CS5.5 Master Collection, ran five of its applications and tried running Starcraft 2 again. All of Starcraft 2's data had been evicted from the SSD cache resulting in HDD-like performance:
Starcraft 2 Level Loading: Seagate Barracuda 3TB (Maximize Cache)
Test | Load Time | Load Time After App Install/Launch
Game Launch | 9.7 seconds | 17.4 seconds
Level Load | 15.0 seconds | 23.3 seconds
I thought that may have been a bit excessive so I tried another test. This time I used the machine a bit more, browsed the web, did some file copies and scanned for viruses but I didn't install any new applications. Instead I launched five Adobe applications and then ran through all of our game loading tests. The result was a mixed bag with some games clearly being evicted from the cache and others not being touched at all:
Game Load Comparison
Intel SSD 311 20GB Cache | Portal 2 (Game Launch) | Portal 2 (Level Load) | Starcraft 2 (Game Launch) | Starcraft 2 (Level Load) | World of Warcraft (Game Launch) | World of Warcraft (Level Load)
Load Time | 9.9 seconds | 15.1 seconds | 9.7 seconds | 15.0 seconds | 4.5 seconds | 5.8 seconds
Load Time After Use | 12.1 seconds | 15.1 seconds | 10.1 seconds | 15.3 seconds | 3.6 seconds | 14.0 seconds
Even boot time was affected. For the most part performance didn't fall back down to HDD levels, but it wasn't as snappy as before when I was only running games.
Boot Time: Seagate Barracuda 3TB (Maximize Cache)
Scenario | Time
Boot Time | 32.6 seconds
Boot Time After Use | 37.3 seconds
Boot Time Without Cache | 55.5 seconds
Although Intel felt that 20GB was the ideal size to balance price/performance, and while SRT is supposed to filter out some IO operations from being cached, it's clear that if you frequently use ~10 applications you will evict useful data from your cache on a 20GB SSD 311. For lighter usage models with only a few frequently used applications, a 20GB cache should be just fine.
There's also the bigger problem of the initial run of anything taking a long time since the data isn't cached. The best way to illustrate this is a quick comparison of how long it takes to install Adobe's CS5.5 Master Collection:
Install Adobe CS5.5 Master Collection
Configuration | Time
Seagate Barracuda 3TB (No cache) | 13.3 minutes
Seagate Barracuda 3TB (Maximize Cache) | 13.3 minutes
OCZ Vertex 3 240GB (6Gbps) | 5.5 minutes
A pure SSD setup is going to give you predictable performance across the board regardless of what you do, whereas Intel's SRT is more useful in improving performance in more limited, repetitive usage models. Admittedly most users probably fall into the latter category.
In my use I've only noticed two reliability issues with Intel's SRT. The first issue was with an early BIOS/driver combination where I rebooted my system (SSD cache was set to maximized) and my bootloader had disappeared. The other issue was a corrupt portion of my Portal 2 install, which only appeared after I disabled my SSD cache. I haven't been able to replicate either issue and I can't say for sure that they are even caused by SRT, but I felt compelled to report them nevertheless. As with any new technology, I'd approach SRT with caution, and lots of backups.
AnandTech Storage Bench 2011
With the hand timed real world tests out of the way, I wanted to do a better job of summarizing the performance benefit of Intel's SRT using our Storage Bench 2011 suite. Remember that the first time anything is ever encountered it won't be cached and even then, not all operations afterwards will be cached. Data can also be evicted out of the cache depending on other demands. As a result, overall performance looks more like a doubling of standalone HDD performance rather than the multi-x increase we see from moving entirely to an SSD.
Heavy 2011—Background
Last year we introduced our AnandTech Storage Bench, a suite of benchmarks that took traces of real OS/application usage and played them back in a repeatable manner. I assembled the traces myself out of frustration with the majority of what we have today in terms of SSD benchmarks.
Although the AnandTech Storage Bench tests did a good job of characterizing SSD performance, they weren't stressful enough. All of the tests performed less than 10GB of reads/writes and typically involved only 4GB of writes specifically. That's not even enough to exceed the spare area on most SSDs. Most canned SSD benchmarks don't even come close to writing a single gigabyte of data, but that doesn't mean that simply writing 4GB is acceptable.
Originally I kept the benchmarks short enough that they wouldn't be a burden to run (~30 minutes) but long enough that they were representative of what a power user might do with their system.
Not too long ago I tweeted that I had created what I referred to as the Mother of All SSD Benchmarks (MOASB). Rather than only writing 4GB of data to the drive, this benchmark writes 106.32GB. It's the load you'd put on a drive after nearly two weeks of constant usage. And it takes a *long* time to run.
First, some details:
1) The MOASB, officially called AnandTech Storage Bench 2011—Heavy Workload, mainly focuses on the times when your I/O activity is the highest. There is a lot of downloading and application installing that happens during the course of this test. My thinking was that it's during application installs, file copies, downloading and multitasking with all of this that you can really notice performance differences between drives.
2) I tried to cover as many bases as possible with the software I incorporated into this test. There's a lot of photo editing in Photoshop, HTML editing in Dreamweaver, web browsing, game playing/level loading (Starcraft II & WoW are both a part of the test) as well as general use stuff (application installing, virus scanning). I included a large amount of email downloading, document creation and editing as well. To top it all off I even use Visual Studio 2008 to build Chromium during the test.
The test has 2,168,893 read operations and 1,783,447 write operations. The IO breakdown is as follows:
AnandTech Storage Bench 2011, Heavy Workload IO Breakdown
IO Size | % of Total
4KB | 28%
16KB | 10%
32KB | 10%
64KB | 4%
Only 42% of all operations are sequential, the rest range from pseudo to fully random (with most falling in the pseudo-random category). Average queue depth is 4.625 IOs, with 59% of operations taking place in an IO queue of 1.
Many of you have asked for a better way to really characterize performance. Simply looking at IOPS doesn't really say much. As a result I'm going to be presenting Storage Bench 2011 data in a slightly different way. We'll have performance represented as Average MB/s, with higher numbers being better. At the same time I'll be reporting how long the SSD was busy while running this test. These disk busy graphs will show you exactly how much time was shaved off by using a faster drive vs. a slower one during the course of this test. Finally, I will also break out performance into reads, writes and combined. The reason I do this is to help balance out the fact that this test is unusually write intensive, which can often hide the benefits of a drive with good read performance.
There's also a new light workload for 2011. This is a far more reasonable, typical every day use case benchmark. Lots of web browsing, photo editing (but with a greater focus on photo consumption), video playback as well as some application installs and gaming. This test isn't nearly as write intensive as the MOASB but it's still multiple times more write intensive than what we were running last year.
As always I don't believe that these two benchmarks alone are enough to characterize the performance of a drive, but hopefully along with the rest of our tests they will help provide a better idea.
The testbed for Storage Bench 2011 has changed as well. We're now using a Sandy Bridge platform with full 6Gbps support for these tests. All of the older tests are still run on our X58 platform.
AnandTech Storage Bench 2011—Heavy Workload
We'll start out by looking at average data rate throughout our new heavy workload test:
For this comparison I used two hard drives: 1) a Hitachi 7200RPM 1TB drive from 2008 and 2) a 600GB Western Digital VelociRaptor. The Hitachi 1TB is a good, large but aging drive, while the 600GB VR is a great example of a very high end spinning disk. With a modest 20GB cache enabled, the 3+ year old Hitachi drive is easily 41% faster than the VelociRaptor. We're still not into dedicated SSD territory, but the improvement is significant.
I also tried swapping the cache drive out with a Crucial RealSSD C300 (64GB). Performance went up a bit but not much. You'll notice that average read speed got the biggest boost from the C300 as a cache drive since it does have better sequential read performance. Overall I am impressed with Intel's SSD 311, I just wish the drive were a little bigger.
The breakdown of reads vs. writes tells us more of what's going on:
This isn't too unusual—pure write performance is actually better with the cache disabled than with it enabled. The SSD 311 has a good write speed for its capacity/channel configuration, but so does the VelociRaptor. Overall performance is still better with the cache enabled, but it's worth keeping in mind if you are using a particularly sluggish SSD with a hard drive that has very good sequential write performance.
The next three charts just represent the same data, but in a different manner. Instead of looking at average data rate, we're looking at how long the disk was busy during this entire test. Note that disk busy time excludes any and all idles; this is just how long the SSD was busy doing something:
AnandTech Storage Bench 2011—Light Workload
Our new light workload actually has more write operations than read operations. The split is as follows: 372,630 reads and 459,709 writes. The relatively close read/write ratio does better mimic a typical light workload (although even lighter workloads would be far more read centric).
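For reference, here is the read/write mix of both workloads worked out from the operation counts quoted in this article. The dictionary layout and function name are just for illustration.

```python
# Operation counts quoted in the article for each Storage Bench 2011 workload.
workloads = {
    "Heavy": {"reads": 2_168_893, "writes": 1_783_447},
    "Light": {"reads": 372_630, "writes": 459_709},
}


def read_share(ops: dict) -> float:
    """Fraction of all IO operations that are reads."""
    return ops["reads"] / (ops["reads"] + ops["writes"])


shares = {name: read_share(ops) for name, ops in workloads.items()}

for name, share in shares.items():
    total = workloads[name]["reads"] + workloads[name]["writes"]
    print(f"{name}: {total:,} operations, {share:.1%} reads, {1 - share:.1%} writes")
```

The heavy workload comes out at roughly 55% reads by operation count, while the light workload flips to roughly 55% writes. Keep in mind these are operation counts, not bytes transferred.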
The I/O breakdown is similar to the heavy workload at small IOs, however you'll notice that there are far fewer large IO transfers:
AnandTech Storage Bench 2011, Light Workload IO Breakdown
IO Size | % of Total
4KB | 27%
16KB | 8%
32KB | 6%
64KB | 5%
Despite the reduction in large IOs, over 60% of all operations are perfectly sequential. Average queue depth is a lighter 2.2029 IOs.
While the heavy workload was long enough to not show any benefit in performance by running it multiple times, our light workload boasts serious gains if we run it a second time with the cache active. With a light enough workload the SSD 311 as a cache can actually bring hard drive performance up to the level of an Intel X25-M G2, which is exactly what Intel was targeting with Smart Response Technology to begin with. For light users you can get an SSD-like experience at a fraction of the cost and without having to manage data across two drives.
Final Words
Intel's Z68 should have been the one and only high end launch chipset offered with Sandy Bridge. It enables all of the configurations we could possibly want with Sandy Bridge and does so without making any sacrifices. Users should be able to overclock their CPU and use integrated graphics if they'd like. While Z68 gives us pretty much exactly what we asked for, it is troubling that we even had to ask for it in the first place. With Intel holding onto a considerable performance advantage and a growing manufacturing advantage, I am worried that this may be a sign of things to come. It was strong competition from AMD that pushed Intel into executing so flawlessly time and time again, but it also put Intel in a position where it can enforce limits on things like overclocking. Let's hope that Z68 corrected a mistake that we won't see repeated.
Intel's Smart Response Technology (SRT) is an interesting addition to the mix. For starters, it's not going to make your high end SSD obsolete. You'll still get better overall performance by grabbing a large (80-160GB+) SSD, putting your OS + applications on it, and manually moving all of your large media files to a separate hard drive. What SRT does offer however is a stepping stone to a full blown SSD + HDD setup and a solution that doesn't require end user management. You don't get the same performance as a large dedicated SSD, but you can turn any hard drive into a much higher performing storage device. Paired with a 20GB SLC SSD cache, I could turn a 3+ year old 1TB hard drive into something that was 41% faster than a VelociRaptor.
If you're building a system for someone who isn't going to want to manage multiple drive letters, SRT may be a good alternative. Similarly, if you're building a budget box that won't allow for a large expensive SSD, the $110 adder for an Intel SSD 311 can easily double the performance of even the fastest hard drive you could put in there. The most obvious win here is the lighter user that only runs a handful of applications on a regular basis. As our tests have shown, for light workloads you can easily get the performance of an X25-M G2 out of a fast hard drive + an SSD cache. Even gamers may find use in SSD caching as they could dedicate a portion of their SSD to acting as a cache for a dedicated games HDD, thereby speeding up launch and level load times for the games that reside on that drive. The fact that you can use any SSD as a cache is nice since it gives you something to do with your old SSDs when you upgrade.
I believe there's a real future in SRT, however it needs to be available on more than just the highest end Sandy Bridge motherboards. I'd like to see SSD caching available on all Intel chipsets (something that we'll get with Ivy Bridge and the 7-series chipsets next year), particularly on the more mainstream platforms since that appears to be the best fit for the technology. I would also prefer a larger cache drive offering from Intel (at least 40GB) as it wasn't that difficult to evict frequently used programs from the SSD cache. The beauty of NAND is that we'll of course get larger capacities at similar price points down the road. Along those lines I view SRT as more of a good start to a great technology. Now it's just a matter of getting it everywhere.