14 Comments

  • Kevin G - Tuesday, November 20, 2018 - link

    The change in weight and cooling spec makes me wonder if they included a liquid cooling system internally.
  • DanNeely - Wednesday, November 21, 2018 - link

    I doubt it. The weight increase would only allow ~1kg for each CPU/GPU's share of waterblock, radiator, and coolant. It's a server setup, so the air-cooled version would have relatively small heatsinks paired with case airflow loud enough to require hearing protection; dropping the air-cooling heatsinks doesn't free up much additional weight.
  • MrSpadge - Tuesday, November 20, 2018 - link

    The maximum GPU memory of the DGX-1 should be 8 x 16 GB = 128 GB, shouldn't it?
  • plopke - Tuesday, November 20, 2018 - link

    The data sheet says "GPU Memory 256 GB total system",
    but when I open the DGX-1 white paper it says "The eight Tesla V100 GPUs have a total of 128 GB HBM2 memory".

    Maybe part of system memory is reserved for the GPU?
  • Eric Klien - Wednesday, November 21, 2018 - link

    The original DGX-1 had 128 GB, while the latest DGX-1 has 256 GB because the memory per GPU has doubled. So this chart should be fixed to show that each GPU has 32 GB in all 3 systems. I believe you can still buy the original DGX-1 for a mere $129,000.
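The memory totals discussed above can be checked with a quick calculation. This is only a sketch: the helper name `total_gpu_memory_gb` is made up for illustration, and the per-GPU capacities (16 GB originally, 32 GB after the refresh) are taken from the comments in this thread rather than from a data sheet.

```python
def total_gpu_memory_gb(gpu_count: int, gb_per_gpu: int) -> int:
    """Total GPU memory in GB for a box with identical GPUs (hypothetical helper)."""
    return gpu_count * gb_per_gpu

# Figures as quoted by commenters in this thread (assumptions, not verified specs).
original_dgx1 = total_gpu_memory_gb(gpu_count=8, gb_per_gpu=16)
updated_dgx1 = total_gpu_memory_gb(gpu_count=8, gb_per_gpu=32)

print(f"Original DGX-1: {original_dgx1} GB")
print(f"Updated DGX-1:  {updated_dgx1} GB")
```

Under those assumptions, the 128 GB in the white paper and the 256 GB in the data sheet would simply describe the 16 GB and 32 GB versions of the same eight-GPU system.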
  • Charlie22911 - Tuesday, November 20, 2018 - link

    Maximum operating temperature of 25°C?! Is that normal for systems like this? Why so low?
  • jimjamjamie - Tuesday, November 20, 2018 - link

    There are 16x 450W GPUs in that box. If you're going to spend half a million bucks on something like this, you should probably get some nice AC to stop it from going nuclear when you try to run Minecraft.
  • Death666Angel - Wednesday, November 21, 2018 - link

    For AC-controlled server rooms, that seems quite high, at least compared to the ones I know. You don't want to bake your millions of dollars' worth of computer equipment anyway.
  • mode_13h - Tuesday, November 20, 2018 - link

    I'm actually more impressed they doubled the tensor throughput simply by going to 350 W. The extra bump from going to 450 W isn't worth it, IMO.
  • Santoval - Tuesday, November 20, 2018 - link

    You misread the specs. They did not double the tensor (along with the FP16/32/64) performance of the DGX-1 by raising the TDP of the DGX-2 graphics cards but by doubling their number. Since the numbers exactly doubled we can safely assume that the TDP of the DGX-1 and DGX-2 Tesla V100s is exactly the same (350W).
  • mode_13h - Wednesday, November 21, 2018 - link

    Got it. Thanks.

    BTW, I previously read the mezzanine V100s were rated at 300 W. Maybe the DGX-1 was already overclocking them.
  • DanNeely - Wednesday, November 21, 2018 - link

    I'm not sure the newer model really makes a lot of sense unless you need the better networking. 10% faster for 20% more power (28% if you look only at the Tesla cards' share, which might be relevant when running a workload that leaves the CPU and network idle) isn't an attractive option unless there are scalability issues with spreading workloads across multiple boxes.
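The 28% figure for the GPUs' share appears consistent with the per-GPU ratings quoted elsewhere in this thread (350 W and 450 W, 16 GPUs per box). A rough check, assuming those commenter-supplied numbers are right; the variable names are illustrative only:

```python
GPU_COUNT = 16          # GPUs per box, as stated in the thread
OLD_GPU_WATTS = 350     # per-GPU rating quoted for the earlier system (assumption)
NEW_GPU_WATTS = 450     # per-GPU rating quoted for the newer system (assumption)

old_gpu_power = GPU_COUNT * OLD_GPU_WATTS   # total GPU power, earlier system
new_gpu_power = GPU_COUNT * NEW_GPU_WATTS   # total GPU power, newer system

# Relative increase in GPU power, in percent, rounded to one decimal place.
gpu_power_increase_pct = round((new_gpu_power / old_gpu_power - 1) * 100, 1)

print(f"GPU power: {old_gpu_power} W -> {new_gpu_power} W "
      f"(+{gpu_power_increase_pct}%)")
```

The whole-system increase (20%) is smaller because the CPUs, networking, and other components are not assumed to draw proportionally more power.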
  • Yojimbo - Wednesday, November 21, 2018 - link

    In situations where performance on the original DGX-2 is bound by thermal constraints, the increase in theoretical throughput is not a useful measure of the utility of the new system's higher thermal allowance. I think it's safe to assume that it is exactly those situations this new system is meant to target. We would need real-world benchmarks to draw any conclusions, but the safer assumption would be that NVIDIA didn't make this system just to keep their systems engineers and salesmen busy because they had no other work to do.
  • Impetuous - Wednesday, November 21, 2018 - link

    You know you're getting old when no one has asked if it can run Crysis yet...
