
  • ddriver - Monday, November 13, 2017 - link

    That's such a great product. NAND wears out and you throw away an otherwise perfectly good memory stick.

    Because I suppose it would take too much engineering genius to make the NAND chip socketed.

    And even more challenging to use PCIe NVMe storage to save and restore paged memory dumps.

    My only criticism is they didn't put RGB LEDs on it.
  • Billy Tallis - Monday, November 13, 2017 - link

    It uses SLC NAND, and only writes to it when there's an unexpected power failure. How many servers are expected to survive 10000 power failures?
  • ddriver - Monday, November 13, 2017 - link

    True, but it is still a dumb idea for a number of reasons. There is no point in increasing complexity. More chips - more things to fail. Pointless cost increase. A separate controller for each DIMM. Low bandwidth with a single NAND channel. Requires additional hardware and software support - more room for bugs.

    What's the point when it is trivial to accomplish the same with a few lines of code and a general purpose NVMe drive?

    The question was entirely rhetorical of course. Because this way generates more profit from waste.

    There is an even better solution to begin with - you don't really need a complete memory dump, since a lot of memory will not be holding any important information. What you have to store and then restore is only the usable program and data state. Much less data written or read compared to a dumb memory dump, a much shorter time to save and avoid running out of power in the middle of the operation, and a much shorter time before your servers are back to operational when powered back on.
  • extide - Monday, November 13, 2017 - link

    Again, you clearly didn't read the article...
  • ddriver - Monday, November 13, 2017 - link

    What a convincing argument you make. The maker of a product says it is good and useful - unprecedented. Must be genuine.

    It might be a mystery to you, but it is actually possible to read something and disagree with it based on understanding. Much like your silent agreement is because, as simple as all this is, it goes completely over your head, thus leading to your laughable conclusion - you are smart because you agree with something you don't understand, and I am dumb because I disagree, which is the opposite of agreeing, and the opposite of smart is dumb, therefore I am dumb.

    Or maybe you have some problem with people who have higher standards than you do, and therefore go for a good application of peer pressure to force a dissident to conform. News flash - that only works when you have idiots on both sides. The less of an idiot one gets, the less one craves the approval of idiots.
  • ddrіver - Monday, November 13, 2017 - link

    There's plenty of ways to make this product great. They should make the RAM chips socketed. The controller could just count how many errors have to be corrected in RAM and identify the offending chip, then you could easily replace it. Instead of throwing away a perfectly good RAM stick. Also they could have used some off-the-shelf components (a Pentium/Celeron CPU for example) to make this instead of a custom FPGA. Clearly an attempt to milk stupid multi-billion dollar corporations or people like me who actually do something special with computers.
    They could also implement a software solution to allow you to choose the important data that should be protected.

    And I know I complained a few articles ago that saving 10s every bootup is worthless, but now I tend to think that saving 3-5s every time you recover from a power loss (when the generator, UPS, battery, and local supercapacitor all fail) could be kind of a big deal. This easily happens maybe every few years, so it adds up.
  • edzieba - Monday, November 13, 2017 - link

    "They should make the RAM chips socketed."

    Pfffhahaha! Tell another one!
  • StevoLincolnite - Monday, November 13, 2017 - link

    RAM chips haven't been socketed in several decades...
  • extide - Tuesday, November 14, 2017 - link

    You are completely missing the point of these things. They are not for saving bootup time. They are to ensure you don't lose certain pieces of data if/when a server loses power, like filesystem or database journals (mentioned in the article).

    You need a controller on each DIMM; otherwise you would need an entirely new DIMM interface/standard, which would be more complex.

    The performance of the NAND (with a single package as you mentioned) doesn't matter because the NAND isn't used during normal operation. The NAND is only used when the power cuts off, and then the data is copied from DRAM to NAND. For the same reason the write/erase cycles don't matter, because the NAND isn't used very much.

    You mention that this could be accomplished with a few lines of code and a general purpose NVMe drive. No, it can't. If the server suddenly loses power, then those lines of code won't execute.

    You mention that you don't need a complete memory dump, which is correct. Again, this is mentioned in the article, which you clearly didn't read. It says you would generally only use 1-2 of these DIMMs per CPU in a server, and then your OS/hypervisor would need to be aware of what memory ranges exist on the NVDIMMs vs regular DIMMs.

    Then you go on about a bunch of gibberish trying to insult me, in a post that frankly belongs on r/IamVerySmart.

    RAM chips socketed? Why would you do that when the DIMM ITSELF is socketed. Ridiculous.

    As far as using a general purpose CPU vs an FPGA, an FPGA is a lot simpler here. An FPGA can do these sorts of mass data transfers much more quickly, plus it looks like they are using an Altera MAX series FPGA, which means it has built-in flash so it doesn't need an external EEPROM to store the bitstream. A general purpose CPU would indeed need an external flash device, so again more complicated (which you are trying to advocate against). You also mention them trying to 'milk' people, but those MAX series FPGAs are cheaper than generic Pentium/Celeron chips, and use a lot less power as well.

    Again you mention that they should implement a software solution allowing you to choose the data that is backed up... which is exactly how it works. Again, you clearly didn't read the article. This is accomplished by where you store your data in memory, whether it is on one of the NVDIMMs or a regular DIMM (sketch below).
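
    A minimal sketch of what that placement can look like from software, assuming the OS exposes the NVDIMM as a Linux persistent-memory device mounted with DAX; the mount point, file name, and journal size are illustrative, not from the article:

    ```c
    /* Place a critical journal in NVDIMM-backed memory (hypothetical paths/sizes). */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define JOURNAL_SIZE (64UL * 1024 * 1024)   /* 64 MiB of journal space */

    int main(void)
    {
        /* A file on a DAX-mounted filesystem backed by the NVDIMM (assumed path). */
        int fd = open("/mnt/pmem/db-journal", O_CREAT | O_RDWR, 0600);
        if (fd < 0) { perror("open"); return 1; }
        if (ftruncate(fd, JOURNAL_SIZE) != 0) { perror("ftruncate"); return 1; }

        /* MAP_SHARED + DAX: stores land directly in NVDIMM-backed pages,
           with no separate page-cache copy in ordinary DRAM. */
        char *journal = mmap(NULL, JOURNAL_SIZE, PROT_READ | PROT_WRITE,
                             MAP_SHARED, fd, 0);
        if (journal == MAP_FAILED) { perror("mmap"); return 1; }

        /* Ordinary stores now go to memory the module saves to NAND on power loss. */
        strcpy(journal, "txn 42: committed");

        munmap(journal, JOURNAL_SIZE);
        close(fd);
        return 0;
    }
    ```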
  • PeachNCream - Tuesday, November 14, 2017 - link

    "Clearly an attempt to milk stupid multi-billion dollar corporations..."

    Multi-billion dollar corporations don't become multi-billion dollar corporations by repeatedly doing stupid things or taking unprofitable actions.
  • Hereiam2005 - Tuesday, November 14, 2017 - link

    An FPGA consumes far less power than any off-the-shelf CPU for what it can do.
    Which is offering huge bandwidth. A basic Virtex FPGA is capable of handling 300+ Gbps of bandwidth within a power envelope of a few watts - nothing off the shelf can come close.
    Of course they could make an ASIC to the same spec, but that would delay the product's time to market.
  • Elstar - Monday, November 13, 2017 - link

    The people who are buying these DIMMs think that the tradeoffs are well worth it.

    As the article points out, the primary customers are transaction-heavy workloads. When RAM becomes nonvolatile, then the number of transactions per second can go up dramatically.
  • ddriver - Monday, November 13, 2017 - link

    "When RAM becomes nonvolatile, then the number of transactions per second can go up dramatically."

    Now all that remains to be done is to explain how this statement relates to this product. Because that's still 100% volatile RAM, and having it dumped to flash storage in no way changes its nature or performance. And you don't really need any of that to accomplish said goal. You can just as easily dump paged memory to a PCIe drive, without any need to increase cost or complexity.
  • ddriver - Monday, November 13, 2017 - link

    And if you go for an actually persistent solution rather than "dump on power loss, restore on power on", you will absolutely massacre performance, and the poor flash memory as well.

    An actually persistent solution would mirror every memory operation to the flash memory, and while SLC flash is very fast, it is tremendously slower than DRAM. That would destroy performance, not to mention completely defeat the purpose of having DRAM in the first place.

    It would be even worse in terms of rapidly wearing out the flash memory, because RAM writes are often very small - RAM is byte addressable, which would lead to tremendous write amplification (rough numbers below), horribly worse than what you see from the file system. Even with SLC memory, that module wouldn't last a week of typical use.

    Which is exactly why they didn't go for a true non-volatile solution but a simple dump-restore mechanism, which minimizes NAND usage, doesn't butcher performance and doesn't eat through the flash P/E cycles in record time.
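
    To put rough numbers on that write-amplification claim: if every byte-granular store really were mirrored straight to NAND, and assuming a typical SLC page size of around 16 KB (an assumption; the article doesn't give the page size), an 8-byte store would force a full page program:

    ```latex
    \text{write amplification} \;=\; \frac{\text{NAND page programmed}}{\text{payload stored}}
                               \;=\; \frac{16\,\text{KB}}{8\,\text{B}} \;=\; 2048\times
    ```

    In practice a controller would buffer and coalesce writes, but the mismatch between byte-granular stores and page-granular programs is the point being made.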
  • Elstar - Monday, November 13, 2017 - link

    I think you're overthinking this. Servers are the target market here, and power outages and shutdowns should be rare; therefore "dump and restore" is good enough. In fact, some of these DIMMs might never experience a single "dump and restore" operation in their lifetime. They'll be plugged in, powered on, and if they're lucky they'll have continuous power until decommissioning.
  • Elstar - Monday, November 13, 2017 - link

    Ensuring that data is committed to NVRAM takes maybe a couple hundred nanoseconds (via a few cache-flushing/synchronization instructions - see the sketch at the end of this comment). On the other hand, ensuring that data is committed to NVMe requires many microseconds, if not milliseconds, worth of OS system calls. In other words, orders of magnitude slower. Sure, one can try to batch transactions to minimize the cost of NVMe I/O, but at that point the latency of pending transactions starts to skyrocket, and one still cannot overcome the fact that CPU-to-RAM bandwidth is much higher than PCIe bandwidth.

    Finally, one could buy a large UPS to provide backup power to the whole machine while the database/filesystem held in volatile memory is backed up during a power outage. But at that point, the added cost and complexity of NVDIMMs start to look better than the added cost and complexity of a large UPS setup.

    In short, these are all just tradeoffs, and for some, NVDIMMs are worth it.

    PS – I strongly bet that NVDIMMs are effectively hidden during boot. Otherwise the OS might use them for random needs. Once the OS is up and running, then a tiny driver probably tells the NVDIMM to repopulate the onboard RAM from flash, and then a mapping is assigned to the RAM. From there, a database/filesystem can then assign the mapping to their own address space at launch and resume as if nothing happened.
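
    A minimal sketch of that fast path, assuming an x86 CPU with CLFLUSHOPT, a pointer already mapped to NVDIMM-backed memory, and compilation with -mclflushopt; production libraries (e.g. libpmem's pmem_persist) add feature detection and fallbacks:

    ```c
    /* Flush a committed record out of the CPU caches so it reaches the NVDIMM's
     * DRAM, which is what the module saves to NAND on power loss. */
    #include <immintrin.h>
    #include <stdint.h>
    #include <stddef.h>

    #define CACHE_LINE 64

    static void persist(const void *addr, size_t len)
    {
        uintptr_t p   = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
        uintptr_t end = (uintptr_t)addr + len;

        /* Flush every cache line covering [addr, addr + len). */
        for (; p < end; p += CACHE_LINE)
            _mm_clflushopt((void *)p);

        /* Order the flushes before anything that follows (e.g. setting a "valid" flag). */
        _mm_sfence();
    }
    ```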
  • ddriver - Monday, November 13, 2017 - link

    This is not non-volatile memory, how is this not obvious by now? Paged memory is saved to the flash upon power loss, which will likely take significantly more than "maybe a couple hundred nanoseconds" or even "many microseconds". You are copying potentially gigabytes to a single flash chip, meaning it will not be anywhere near as fast as multi-channel NVMe drives. It will likely take many seconds, probably more than it would take to dump the memory to a fast PCIe drive.

    Now if they used ReRAM or MRAM or something that is actually non-volatile and fast, enduring and addressable enough, it would be a completely different thing. Then it would make sense. This however doesn't. That being said, nonsense doesn't prevent some people from buying it nonetheless. Some people buy golden toilets encrusted with diamonds. That doesn't mean having golden toilets encrusted with diamonds somehow makes a better cr@pper.
  • Elstar - Monday, November 13, 2017 - link

    Yes, how one defines "non-volatile memory" is certainly contextual and debatable. For some, the Micron approach is good enough.

    And yes, the dump-and-restore will be slow, but the target market doesn't care because they hope that it will never be necessary, and the upsides of this design are still worth it.
  • ddriver - Monday, November 13, 2017 - link

    So they will pay extra for a feature that is unnecessary to begin with, with the hope that it will never be necessary?

    That's totally the industry as we know it. And I am not being sarcastic here, not one bit. I am not sarcastic about not being sarcastic either.
  • Elstar - Monday, November 13, 2017 - link

    I give up. George Carlin was right: "Never argue with an idiot. They will only bring you down to their level and beat you with experience."
  • III-V - Monday, November 13, 2017 - link

    "There is no point in increasing complexity. More chips - more things to fail."

    Okay, so let's just run 8-bit CPUs. More transistors equals more things that can fail, after all! Idiot.

    "Pointless cost increase."

    Every second of downtime counts in a data center. Idiot.

    "What's the point when it is trivial to do accomplish the same with a few lines of code and a general purpose nvme drive?"

    Because you've got to spend time doing that. This saves time, and therefore money. If you had half a brain, you'd know this. Idiot.
  • PeachNCream - Monday, November 13, 2017 - link

    Don't feed the troll.
  • ddriver - Monday, November 13, 2017 - link

    Adding "idiot" to every pathetic failure to make a valid point or even a basic adequate analogy only adds value to illustrating what you are :)
  • ddrіver - Monday, November 13, 2017 - link

    Fewer chips means less complexity and fewer things to fail. How is this not obvious? A transistor itself does not fail. The chip it's part of might. You have ~35bn transistors in a 4GB chip. They could easily put 256bn transistors into a 32GB package and reliability would be a lot better. Or at least they could make all the chips socketed so you can easily replace the failing ones.
  • Reflex - Monday, November 13, 2017 - link

    Why did you let him back? First article I click on today, and on the very first comment and throughout most of the comments it's a bunch of ill-informed drivel from someone who does not understand the product or the target market, and who clearly did not even read the article.

    How is this additive?
  • CajunArson - Monday, November 13, 2017 - link

    Oh look, "ddriver" insulting technologies he clearly doesn't understand again from the comfort of his mom's basement.
  • peevee - Monday, November 13, 2017 - link

    The only problem with it is that the capacitors are external. So the DIMM itself is insufficient to maintain the data.
  • theeldest - Monday, November 13, 2017 - link

    Dell PowerEdge 14G has a battery available to provide power to up to 12 (maybe 16?) NVDIMMs. So it's integrated into the system and works exactly as you'd expect.
  • Hereiam2005 - Monday, November 13, 2017 - link

    Thing is, there is this application called an in-memory database, where the entire database is stored in the DIMMs. About 3TB a node.
    Let's say there's a power failure. If you have a super SSD with 3000 MB/s of bandwidth, you have to keep the entire system alive for 1000 seconds, or nearly 17 minutes, to back up your entire memory (worked numbers below). That's time you don't have.
    On the other hand, if you put the SLC cache on the DIMM, 1) you don't have to keep the entire system up, just the DIMM itself is enough, 2) you only need to back up the data on one single DIMM per SLC cache instead of all of them, and 3) you bypass the entire CPU and motherboard, enabling you to have monster bandwidth between the DIMM and the cache with far lower power requirements.
    Yeah, these things will eventually fail. But the pros outweigh the cons. Unless you can solve all those problems without the SSD cache, NVDIMMs are here to stay.
    Just because you can't see the need for these doesn't mean it is not useful to someone else.
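
    The back-of-envelope math behind that, using the figures assumed above (3 TB of resident data, a single 3 GB/s SSD):

    ```latex
    t_{\text{dump}} \;=\; \frac{3\,\text{TB}}{3\,\text{GB/s}} \;=\; 1000\,\text{s} \;\approx\; 16.7\,\text{minutes}
    ```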
  • ddriver - Monday, November 13, 2017 - link

    So in your expert opinion, you are gonna spend $100,000 on RAM but put a single SSD in that system? Yeah, that makes perfect sense; after all, you spent your budget on RAM ;)

    IMO such applications would actually rely on much faster storage solutions than your "super SSD" - current enterprise SSDs are twice as fast and more. For example, the Ultrastar SN260 pushes above 6 GB/s. So that's only 500 seconds. A tad over 8 minutes. And you can put a few of those in parallel too. Two of those will cut the time to 4 minutes, four to just 2. You put $150k in a server and put in a power backup solution that cannot even last 4 minutes? You are clearly doing it wrong. I'd put a power generator on such a machine as well. Not just a beefy UPS.

    But it doesn't even have to take that long. Because in-memory databases can do async flushing of writes with negligible performance impact, and with tremendous returns.

    You DON'T wait for a power failure and then commit the whole thing to storage. You periodically commit modifications, and when power is lost, you only flush what remains (see the sketch at the end of this comment). It won't take more than a few seconds, even with very liberally configured flush cycles. It will usually take less than a second.

    Nobody keeps in-memory databases willy-nilly without flushing data to persistent storage, not only in cases of power loss, but also in cases of component failure. Components do fail, DRAM modules included. And when that happens, your 3 TB database will be COMPLETELY lost, even with them precious pseudo NVDIMMs. As I already said - pointless.

    But hey, don't beat yourself up, at least you tried to make a point, unlike pretty much everyone else. That's a good sign.
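
    A minimal sketch of that periodic-flush scheme, assuming a plain append-only transaction log on an ordinary NVMe-backed filesystem; the file name, 100 ms interval, and record format are invented for illustration (a real database would add group commit, checksums, and so on):

    ```c
    /* Build: cc write_behind.c -lpthread */
    #include <fcntl.h>
    #include <pthread.h>
    #include <stddef.h>
    #include <stdio.h>
    #include <unistd.h>

    static int log_fd;

    /* Called on every committed transaction: cheap buffered append (page cache only). */
    static void log_append(const char *rec, size_t len)
    {
        if (write(log_fd, rec, len) < 0)
            perror("write");
    }

    /* Background flusher: every 100 ms force the log to stable storage,
     * so at most ~100 ms of committed work is at risk on a power cut. */
    static void *flusher(void *arg)
    {
        (void)arg;
        for (;;) {
            usleep(100 * 1000);
            if (fsync(log_fd) < 0)
                perror("fsync");
        }
        return NULL;
    }

    int main(void)
    {
        log_fd = open("txn.log", O_CREAT | O_WRONLY | O_APPEND, 0600);
        if (log_fd < 0) { perror("open"); return 1; }

        pthread_t t;
        pthread_create(&t, NULL, flusher, NULL);

        const char rec[] = "txn 1: debit A, credit B\n";
        log_append(rec, sizeof rec - 1);

        sleep(1);       /* demo: let the flusher run at least once */
        fsync(log_fd);  /* final flush, e.g. on orderly shutdown or a power-fail signal */
        return 0;
    }
    ```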
  • Hereiam2005 - Monday, November 13, 2017 - link

    1) The SN260 tops out at 2.2 GB/s write speed. Check your facts.
    2) The last thing you'd want to do after spending $100k on RAM is to spend another $100k on SSDs that you don't need.
    3) The whole point of an in-memory database is that it might be updating more frequently than an array of SSDs can handle. Something like high-frequency trading. So no flushing, unfortunately.
    4) There's redundancy, in case of component failure. Still, redundancy can only do so much.
    5) RAM fails far less frequently than PSUs and other electronics.
    6) Yeah, power backup solutions are bulky and fail often, even when not in use. If your machine fails and your generator/PSU fails on you too, you are SOL.
    7) If you flush the contents of RAM to SSDs, you have to keep the entire system online. That's 1000 watts per node. If you flush to an NVDIMM, you only need to keep the DIMMs alive - 5 watts per DIMM at most, or about 150W per 32-DIMM system. That's why small supercaps are sufficient. There are many NVDIMM solutions that are self-contained within a single node - that is not something you can do with power backup/generator solutions.
    8) NVDIMMs/supercapacitors are far more compact and reliable than a PSU/power generator solution, and less power hungry and less expensive than an SSD solution. What more do you want?
    I get it, it is pointless to you. Don't extrapolate that to everybody else, please.
  • Hereiam2005 - Monday, November 13, 2017 - link

    I forgot to add: the time to flush the contents of an NVDIMM to the on-DIMM flash is about 40 seconds (rough math below), which is constant no matter how much RAM you have. And that is just the baseline, which will be improved with better NVDIMMs in the future, without having to change the underlying hardware.
    With an SSD, the write speed peaks around 2 GB/s and is generally capped by the speed of the PCIe link itself - that's why I used a generous 3 GB/s figure for an imaginary SSD that is not yet on the market.
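
    For scale, assuming a 32 GB module and the ~40 second save time quoted above, the implied DRAM-to-NAND save rate on the module would be roughly:

    ```latex
    B_{\text{save}} \;\approx\; \frac{32\,\text{GB}}{40\,\text{s}} \;=\; 0.8\,\text{GB/s per module}
    ```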
  • FullmetalTitan - Thursday, November 16, 2017 - link

    Petition to ban this flame war starting idiot. He willingly misses the point of EVERY SINGLE PRODUCT REVIEW, just to start fights with people who have 100x the industry/application knowledge he has on VERY niche products. His contributions to this discussion (as with all others) are nothing but negativity and distraction.
  • vanilla_gorilla - Monday, November 13, 2017 - link

    Serious question: how much and when could this be used as primary storage (i.e., boot from it)?
  • DanNeely - Monday, November 13, 2017 - link

    What's the jumper block for? The way it's positioned, unless you put a single NVDIMM in each block of slots, it's going to prevent using an adjacent slot or two.
  • ddriver - Monday, November 13, 2017 - link

    It is probably just for debugging prototypes.
  • Elstar - Monday, November 13, 2017 - link

    Yup. If one looks closely, the jumper block is missing from the "anatomy" diagram.
  • iwod - Monday, November 13, 2017 - link

    I am much more interested in seeing more memory per stick, and much lower pricing.

    While in-memory computing is taking off, the price of memory is still very high.
  • kolbryn - Monday, November 13, 2017 - link

    Please note this solution is not for the consumer/prosumer market. This is for a very specific use case. For an enterprise spending $1M+ on a storage solution, these DIMMs will actually reduce the cost of the hardware and/or provide additional protection from a service processor power failure.
  • PeachNCream - Monday, November 13, 2017 - link

    It almost looks like Micron is preemptively working to build up the hardware and software ecosystem needed to support XPoint DIMMs in the future. That's probably a smart idea since NV sorts of RAM are peeking up over the horizon (had to read up on this stuff to learn a little more about it and it looks like one of the biggest potential changes to modern computing we've had in a while). It'd be a surprise if we didn't see XPoint DIMMs before the end of next year.
  • ddriver - Monday, November 13, 2017 - link

    Yeah, that will be a blast - much slower than RAM, but at least it wears out, so that's gotta make up for the abysmal performance. Replacing worn-out DIMM modules periodically - that does wonders to reduce downtime, hehehe.
  • PeachNCream - Monday, November 13, 2017 - link

    (>^^)> <3 <(^^<)
  • theniller - Monday, November 13, 2017 - link

    Here I must agree with DDriver! This is why normal people use a UPS! Anyone with an important server has one, which makes this useless.
  • ddriver - Monday, November 13, 2017 - link

    Pretty much, yes - even winblows can hibernate - dump and restore paged memory, resuming a powered-off system to its previous state. You only need a few seconds of battery backup on a decent system. Even the most basic UPS has enough juice for that.

    And data centers go further; they have UPSes as well as power generators. Those are pretty affordable to "normal people" too. I got a 5 kW gasoline generator for like 300 bucks for critical electronics and the security system; it lasts 10 hours on a single fuel tank, and you can always top it off while it is running.
  • Elstar - Monday, November 13, 2017 - link

    Don't confuse having a UPS with a backup plan. These NVDIMMs represent the fastest backup and restore in the event of a total (including UPS) power failure.
  • ddriver - Monday, November 13, 2017 - link

    You are the one confused about how quickly this works. The only advantage it has is relying on a dedicated power source and an on-board implementation: you can lose a UPS or even a PSU and it may still manage to back up the data, possibly even after the rest of the system has lost power.

    That being said, this benefit is much diminished, since proper servers have redundancy at both the UPS and PSU level. And if you somehow lose all that, those NVDIMMs won't be that much of a win in terms of reducing downtime.
  • Reflex - Monday, November 13, 2017 - link

    I've seen a server room lose everything. One upstairs from my own lab, actually. A malfunctioning component set off the sprinkler system. Due to some failures in the fire suppression system the server room was filled with 3ft of water overnight, and additional failures caused a lack of notification until the next morning.

    I don't care how redundant you are, more redundancy is a good thing in certain spaces. In that case it literally knocked the team in question out of an entire industry, as the data in that lab was too large to have been replicated elsewhere (this was pre-cloud, back in 2007 or so). Redundant UPSes couldn't do a damn thing about the sudden loss of power. And while this specific solution would probably not have saved them, the point is that every redundant system can potentially fail, which is why you do layered disaster recovery and NEVER rely on one thing, or even two or three.

    Furthermore, you should routinely exercise disaster scenarios to ensure you do not have an unanticipated weak link.

    Yes, mission-critical servers and data storage should have redundant UPSes, and I'd argue at more levels than just the room itself. Yes, they should have offsite backups and failover regions. Obviously the storage itself should be redundant. There is nothing wrong with also making the memory capable of surviving a failure. Quite frankly, in the space this is targeted at, it's just money, and if you can solve a problem with money, that is the easiest solution available.
  • diehardmacfan - Tuesday, November 14, 2017 - link

    NVDIMMs are commonly used in the SAN space. Nimble uses them to cache all incoming writes, and some are now using them with ZFS as a dedicated ZIL. Having seconds of incoming writes get hosed because you had some massive kernel panic can wreak havoc on a filesystem. NAND is currently not fast enough when it comes to latency and max throughput at low queue depths. There are some other solutions that could be useful (Optane, etc.) but there are drawbacks to all of them.

    Why you refuse to see the usefulness of this product is beyond me.
