12 Comments
CamdogXIII - Monday, July 14, 2014 - link
Those units better have a lot of RAM if they are using ZFS dedupe (roughly 5 GB for every 1 TB stored)
DanNeely - Monday, July 14, 2014 - link
These are enterprise products; I think we can safely assume no expense has been spared in finding ways to drive the sticker price up.
DanNeely - Monday, July 14, 2014 - link
16 GB of RAM on both the 6-bay tower and the 12-bay rackmount boxes. Can't tell if they're upgradable to more RAM. The i3/Xeon processors powering them are capable of more unless hobbled by having only 2 RAM slots. The real soaking appears to be from software licensing costs though: ~$100/300/1000 per desktop/VM/physical server backed up.
CamdogXIII - Monday, July 14, 2014 - link
It's always the licensing that gets you! Crazy expensive just for permission to use a product.
piroroadkill - Monday, July 14, 2014 - link
Jesus! Really? That's a hell of a RAM requirement.
r3loaded - Monday, July 14, 2014 - link
It is if you want to dedup with ZFS. Personally, I don't feel it's usually worth it unless you're dealing with a ton of very similar data. It usually makes more sense to buy more storage than to buy more RAM.
DanNeely - Monday, July 14, 2014 - link
It is. IIRC WHSv1 was only able to make it work on low-end hardware by limiting it to their backup tool instead of the FS and having the client PCs do all the heavy work of computing zillions of hashes.
Since the main gain from dedupe on a many-machine backup box is from eliminating numerous copies of OS/application binaries, I wonder if going from block-level to file-level dedupe would keep a large portion of the benefit at substantially lower resource requirements. Probably a theoretical question more than anything else, since I suspect the level of change involved means that even if it did pay off it'd need to be a new FS instead of a ZFS enhancement.
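To make the block-level vs file-level question concrete, here is a toy sketch in Python. It is not how ZFS or WHS implement dedupe; the function name `dedupe_stats`, the 4-byte block size, and the sample "machines" are all made up for illustration. It just counts how many bytes and how many hash-table entries each approach would need for a handful of in-memory files.

```python
import hashlib


def dedupe_stats(files, block_size=4):
    """Compare file-level and block-level dedupe on in-memory byte strings.

    `files` is a list of bytes objects. `block_size` is a hypothetical
    fixed block size (tiny here so the example stays readable).
    """
    raw_bytes = sum(len(data) for data in files)

    # File-level: one hash per file, each distinct file stored once.
    unique_files = {hashlib.sha256(data).digest(): len(data) for data in files}

    # Block-level: one hash per fixed-size block, each distinct block stored once.
    unique_blocks = {}
    for data in files:
        for start in range(0, len(data), block_size):
            block = data[start:start + block_size]
            unique_blocks[hashlib.sha256(block).digest()] = len(block)

    return {
        "raw_bytes": raw_bytes,
        "file_level_bytes": sum(unique_files.values()),
        "file_level_entries": len(unique_files),
        "block_level_bytes": sum(unique_blocks.values()),
        "block_level_entries": len(unique_blocks),
    }


# Three hypothetical machines: two identical images, one with a patched last block.
image = b"AAAABBBBCCCC"
patched = b"AAAABBBBDDDD"
stats = dedupe_stats([image, image, patched])
print(stats)
```

In this toy case file-level dedupe keeps fewer table entries but stores the patched image in full, while block-level dedupe stores only the one changed block at the cost of a larger table. How that trade-off plays out on real backup data would have to be measured.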
phoenix_rizzen - Monday, July 14, 2014 - link
It's not nearly that high; it all depends on how unique your data is. The more unique your data, the more space needed in the DDT (one entry per unique block), and thus the more RAM you need. If your data is mostly duplicates, the DDT will be fairly small, and you won't need a lot of RAM. The "rule-of-thumb" has been 1 GB of RAM per TB of unique data in your pool for a long time now.
Obviously, if your data is so unique that you need oodles of RAM to support dedupe ... you shouldn't be running dedupe. :)
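For anyone who wants to play with the numbers, here is a rough back-of-envelope sketch in Python of the "one DDT entry per unique block" reasoning. The 320 bytes per in-memory entry is a commonly quoted approximation and is treated as an assumption here, as are the average block sizes and the function name `ddt_ram_gib`; none of these come from the boxes described in this thread.

```python
def ddt_ram_gib(unique_data_tib, avg_block_kib=128, bytes_per_entry=320):
    """Rough estimate of dedupe table (DDT) size in GiB.

    unique_data_tib: amount of unique data in the pool, in TiB.
    avg_block_kib:   assumed average block size in KiB.
    bytes_per_entry: assumed in-memory size of one DDT entry.
    """
    # One DDT entry per unique block.
    unique_blocks = unique_data_tib * 1024**4 / (avg_block_kib * 1024)
    return unique_blocks * bytes_per_entry / 1024**3


# Same amount of unique data, two assumed average block sizes.
for block_kib in (128, 64):
    print(f"{block_kib} KiB blocks: {ddt_ram_gib(1, block_kib):.1f} GiB per TiB of unique data")
```

Under these assumptions the estimate swings by 2x just from the average block size, which may be part of why the per-TB figures people quote differ so much. Treat the output as an order-of-magnitude guide, not a sizing guarantee.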
We have several backup servers running ZFS and dedupe. Home-built systems using SuperMicro chassis and motherboards, and LSI HBAs. A hell of a lot less expensive than the turnkey options like the Netgear thing.
The largest one has 128 GB of RAM and 120 TB of storage in the pool. Dedupe ratio of 1.82, compression ratio of 1.35, for a combined disk space savings of 2.38x. This is the off-site replication box that all the others send snapshots to.
The next largest has 64 GB of RAM and 45 TB of storage. Dedupe ratio of 2.68, compression ratio of 1.64, for a combined disk savings of 4.26x.
In comparison, we have another backup box with 64 GB of RAM and 45 TB of storage without dedupe. Compression ratio is only 1.51.
And our last box has 64 GB of RAM and 75 TB of storage without dedupe. Compression ratio is only 1.14.
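As a quick sanity check on how dedupe and compression ratios combine, a naive estimate is simply to multiply them. The Python sketch below does that for the two dedupe boxes above. The function name `combined_ratio` is made up, and the idea that the quoted combined figures are measured pool numbers rather than a simple product is only a guess.

```python
def combined_ratio(dedupe_ratio, compression_ratio):
    """Naive combined savings: assumes the two ratios multiply independently."""
    return dedupe_ratio * compression_ratio


# (dedupe ratio, compression ratio, combined figure quoted in the comment)
boxes = {
    "120 TB off-site box": (1.82, 1.35, 2.38),
    "45 TB box": (2.68, 1.64, 4.26),
}

for name, (dedupe, compress, quoted) in boxes.items():
    estimate = combined_ratio(dedupe, compress)
    print(f"{name}: product {estimate:.2f}x, quoted {quoted:.2f}x")
```

The simple product comes out a little higher than the quoted figures in both cases, so it is best read as a rough upper estimate rather than what the pool will actually report.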
None of them run out of RAM or ARC/L2ARC. And, between them, the 2 main backup systems keep over 250 remote servers (over half at the other end of an ADSL or E10 link) backed up every night. As we run diskless Linux workstations in the schools, this effectively backs up over 5000 desktops every night.
The only time we had issues was when we ran them with only 16-24 GB of RAM. Once we got above 32 GB, all our dedupe-related issues went away. The only reason the off-site replication box has 128 GB in it is to future-proof it as we add more storage shelves each summer.
CamdogXIII - Monday, July 14, 2014 - link
I don't personally have any real-world experience with ZFS's dedupe, specifically because of how many warnings and cautionary tales I've seen. Might have to look into it again as we have a couple of ZFS devices at work.
I found this an interesting read regarding dedupe requirements: http://constantin.glez.de/blog/2011/07/zfs-dedupe-...
Obviously you have some bad ass storage hardware that makes me jealous :-)
phoenix_rizzen - Tuesday, July 15, 2014 - link
Nothing too fancy, it's all off-the-shelf parts. And amazingly inexpensive for what we get (our first storage box 7-odd years ago using 24 hard drives was just shy of $20,000 CDN; our latest multi-node setup with 45 drives came in under $15,000 CDN.)
SuperMicro SC826 2U chassis (24x 2.5" drive bays) for the head unit, using SuperMicro H8DGi-F6 motherboard, AMD Opteron CPUs, and LSI 9211-8e SAS/SATA controllers. 2x SSDs for the OS and SLOG, connected to the onboard SAS controller. 2x SSDs for L2ARC.
SuperMicro SC846EL2-JBOD (45x 3.5" drive bays) for the storage shelves. Currently, 1 shelf per SAS controller, as we're expanding it as needed every other year or so, instead of building it out as a massive storage setup from the get-go. The biggest box has 2 shelves now. One of the other boxes has only 1 shelf.
The other two, older, storage boxes are just SuperMicro SC846 (24x 3.5" drive bays). Each with 2x SSDs for the OS and L2ARC. Same motherboard, CPU, RAM, and controllers (9211-8i).
kamper - Wednesday, July 16, 2014 - link
"large SMBs" ? :-$
peteraustin - Tuesday, August 6, 2019 - link
This is a critical launch of a much-needed product. But still, for my job, I prefer to use google teams backup recovery from https://spinbackup.com/blog/google-team-drives-bac... It allows me to perform a one-click recovery and to have an unlimited number of manual backups available.