17 Comments
superunknown98 - Thursday, October 15, 2015 - link
This seems like a nice feature for RAID arrays that require consistency, but how much CPU power will it take to essentially be the SSD controller?

alaricljs - Thursday, October 15, 2015 - link
Typically disk IO (even SSD) causes the CPU to sit around and ponder what to do. You would have to integrate your software and hardware impossibly well to not have lots of CPU cycles doing nothing when it's time for disk IO. I'd put money down that while this *will* use CPU cycles to accomplish the IO, the overall service time across computation and IO will decrease while CPU utilization climbs a small amount. I'm sure there will be use cases where this isn't what you want... but I can't think of any that this tech (or similar) would be considered for in the first place.

DerekZ06 - Friday, October 16, 2015 - link
I'll take that bet. This just introduces a way for the host and SSD to communicate what's going on and how things should be done. Currently, SSDs communicate over a standard designed for hard disks, so the SSD has to do a lot of guesswork.

woggs - Thursday, October 15, 2015 - link
This doesn't move any real effort to the host. It just gives the host a switch to turn garbage collection on and off, or to make it high priority. There is no CPU effort on the host. This lets the host fire off garbage collection at high priority when it isn't using the drive, and then reclaim all of the SSD's internal bandwidth when it wants it. If used properly, it makes sense. If used improperly, it will do bad things.
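A minimal sketch of what such a switch could look like from the host side on Linux, assuming a vendor-specific NVMe admin command existed for it. The NVME_IOCTL_ADMIN_CMD passthrough ioctl is real kernel plumbing, but the opcode 0xC0 and its cdw10 encoding below are hypothetical placeholders, not anything a real drive implements:

```c
/* Hedged sketch: a host-side garbage collection switch for an SSD.
 * NVME_IOCTL_ADMIN_CMD and struct nvme_admin_cmd are real Linux passthrough
 * plumbing; the opcode 0xC0 and the cdw10 encoding are hypothetical
 * placeholders for a vendor-specific command, not a published spec. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/nvme_ioctl.h>

#define HYPOTHETICAL_GC_CTRL_OPCODE 0xC0   /* assumption, not real */

enum gc_mode { GC_SUSPEND = 0, GC_NORMAL = 1, GC_URGENT = 2 };

static int set_gc_mode(const char *dev, enum gc_mode mode)
{
    int fd = open(dev, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    struct nvme_admin_cmd cmd;
    memset(&cmd, 0, sizeof(cmd));
    cmd.opcode = HYPOTHETICAL_GC_CTRL_OPCODE;
    cmd.cdw10  = (unsigned int)mode;       /* placeholder encoding */

    int err = ioctl(fd, NVME_IOCTL_ADMIN_CMD, &cmd);
    close(fd);
    return err;
}

int main(void)
{
    /* e.g. fire off urgent GC during an idle window, suspend it again
     * before a latency-sensitive burst of host IO */
    if (set_gc_mode("/dev/nvme0", GC_URGENT) != 0)
        fprintf(stderr, "GC control failed (expected on real drives)\n");
    return 0;
}
```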
ZeDestructor - Thursday, October 15, 2015 - link
Call me when SSDs simplify way down and just expose everything, letting the OS take full control of the SSD - i.e., let the OS see the raw blocks, with full information on how they're wired to the controller, and read/write them directly instead of having just LBAs and relying on the underlying controller.

aryonoco - Thursday, October 15, 2015 - link
Right?! We threw out 40 years of hard-learned CS principles when we allowed OEMs to sell us these magic black boxes that provide no visibility to the host OS (or, in fact, flat out lie to it a lot of the time).

melgross - Thursday, October 15, 2015 - link
You're kidding, right? That's all we need: for IT to screw this up too.

ZeDestructor - Friday, October 16, 2015 - link
If your IT can't benchmark for shit, then I'm sorry for you. For the rest of us, alternative methods better suited to our uses would be much more interesting.

woggs - Thursday, October 15, 2015 - link
You are clueless about what goes on inside an SSD.

ZeDestructor - Friday, October 16, 2015 - link
I'd like to know how "clueless" I am. The first step to learning is knowing you don't know something, etc.

mgl888 - Thursday, October 15, 2015 - link
There are certainly merits to hiding the underlying structure of an SSD. I don't think programmers want to deal with error/reliability details. The tricky part is finding the right amount of detail to expose to gain that extra performance.

MrSpadge - Thursday, October 15, 2015 - link
Yep, hiding complexity from "everyone" and letting the professionals (the SSD manufacturer) deal with it is good. As long as it doesn't hurt that the SSD knows nothing about the outside world (i.e., what future workload are the OS and user planning?).

ZeDestructor - Friday, October 16, 2015 - link
For the average webdev schmuck, sure. For the OS-level guys writing disk schedulers and filesystems, raw access would be real nice.

Take CFQ, for example: it's been tuned over a very long time to suit HDDs' large seek times, and in its internal queue it reorders file access so that the disk can reach the blocks in as few sweeps as possible. On an SSD, this sort of raw access would potentially let the OS balance block writes and page erases much, much more effectively than a low-level controller, by virtue of knowing how much pending data needs to be written, its block sizes, and how many blocks may need to be freed. In fact, one of the really fun avenues would be page-erasing some NAND channels while writing to the other NAND channels.
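A toy sketch of that last idea, assuming the host could see channel state directly. The four-channel geometry, the low-water mark, and the first-fit policy are all invented for illustration; one scheduling pass steers the next write to an idle channel while kicking off an erase on a different idle channel, so the slow erase hides behind the write:

```c
/* Toy model of host-visible NAND channels. Geometry, thresholds and the
 * first-fit policy are invented for illustration; a real drive would
 * report its own layout. */
#include <stdbool.h>
#include <stdio.h>

#define NUM_CHANNELS    4
#define MIN_FREE_BLOCKS 8   /* hypothetical low-water mark */

struct channel {
    bool busy;          /* a program or erase is already in flight */
    int  free_blocks;   /* erased blocks ready to take writes */
    int  dirty_blocks;  /* stale blocks eligible for erase */
};

/* Steer the next write to an idle channel that has free space. */
static int pick_write_channel(const struct channel ch[])
{
    for (int i = 0; i < NUM_CHANNELS; i++)
        if (!ch[i].busy && ch[i].free_blocks > 0)
            return i;
    return -1;
}

/* In the same pass, start an erase on a *different* idle channel that is
 * running low, so the slow erase hides behind the write. */
static int pick_erase_channel(const struct channel ch[], int writing)
{
    for (int i = 0; i < NUM_CHANNELS; i++)
        if (i != writing && !ch[i].busy &&
            ch[i].free_blocks < MIN_FREE_BLOCKS && ch[i].dirty_blocks > 0)
            return i;
    return -1;
}

int main(void)
{
    struct channel ch[NUM_CHANNELS] = {
        { false, 12,  3 },  /* plenty of free blocks: good write target */
        { false,  2, 20 },  /* nearly out of free blocks: erase candidate */
        { true,   5,  9 },  /* mid-operation: leave alone */
        { false,  9,  1 },
    };
    int w = pick_write_channel(ch);
    int e = pick_erase_channel(ch, w);
    printf("write on channel %d, erase on channel %d\n", w, e);
    return 0;
}
```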
sor - Friday, October 16, 2015 - link
But you're ignoring that the controller providing an LBA abstraction and a standard interface is actually a good thing. Otherwise you're basically saying that every SSD needs its own driver. You either have the logic in the SSD firmware or in a driver.

It sounds like what you want is basically NAND DIMMs or something similar. It's an interesting concept, but there are various reasons why it wasn't done. The main one is probably compatibility: can you imagine if SSD vendors came up with a standard and just dumped flash DIMMs onto the market, asking CPU manufacturers to develop interfaces and OS vendors to each write their own flash drivers?
The general-purpose CPUs we have these days really aren't great at making real-time, low-latency decisions like this. You probably don't want uninterruptible SSD-controller threads blocking your CPU while you're doing work. Moving toward this model, where the SSD controller is basically a coprocessor that real-time work can be offloaded to and that the CPU has enhanced controls over, is probably the best of both worlds anyway.
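To make concrete what that logic is, wherever it lives: a toy flash translation layer in the spirit of what SSD firmware does today. LBAs map to physical pages, and since NAND pages can't be rewritten in place, every write goes out-of-place and the old page is marked stale for later garbage collection. The sizes and the bump allocator are illustrative assumptions only:

```c
/* Toy flash translation layer: the "logic" that lives in firmware today
 * and would live in a driver under raw access. Sizes and the linear
 * allocator are illustrative assumptions. */
#include <stdint.h>
#include <stdio.h>

#define NUM_LBAS  1024
#define NUM_PAGES 1280          /* overprovisioned: more pages than LBAs */
#define UNMAPPED  UINT32_MAX

static uint32_t l2p[NUM_LBAS];          /* logical-to-physical map */
static uint8_t  page_valid[NUM_PAGES];
static uint32_t next_free = 0;          /* toy bump allocator */

/* NAND pages can't be rewritten in place, so every host write lands on a
 * fresh page; the old copy is merely marked stale for GC to reclaim. */
static uint32_t ftl_write(uint32_t lba)
{
    if (l2p[lba] != UNMAPPED)
        page_valid[l2p[lba]] = 0;       /* invalidate the old copy */
    uint32_t page = next_free++;        /* real FTLs pick by wear, channel... */
    page_valid[page] = 1;
    l2p[lba] = page;
    return page;
}

int main(void)
{
    for (uint32_t i = 0; i < NUM_LBAS; i++)
        l2p[i] = UNMAPPED;
    uint32_t first  = ftl_write(7);
    uint32_t second = ftl_write(7);     /* same LBA, new physical page */
    printf("LBA 7: page %u, then page %u after rewrite\n", first, second);
    return 0;
}
```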
Gigaplex - Thursday, October 15, 2015 - link
Call me too when that happens. So I know when it's time to hide in a bunker until the fallout fades away.

abufrejoval - Saturday, October 24, 2015 - link
Then you get a FusionIO drive. I've never tried looking at the driver source code, and I guess the documentation isn't really publicly available, but that's how they do it.

Admittedly, it's very much a Steve Wozniak approach, like the floppy disk code in the Apple ][, even if he only joined the company much later.
abufrejoval - Saturday, October 24, 2015 - link
To me this is very much aimed at web-scale operations, Open Compute users, and the like.

But your mention of RAID got me thinking that even enterprise customers may like the look of these, once they're partially hidden behind a RAID controller managing the individual SSDs. So if you think of an SSD appliance built out of SSD-form-factor flash with custom controller code, this might hit a nice niche.
But you'd still have to deploy massive numbers of these to get a return on investing in the management code, and then would you trust anyone without tons of verification?
I sure can't see this in the enterprise or the consumer space (except in the fallback mode).
Still, I like the direction this is taking...