
  • YaleZhang - Monday, December 21, 2015 - link

    Can you achieve the same thing with PCI bridges? You can have up to 255 PCI buses, so you could host about 128 GPUs on the leaf nodes. The main limitation of using PCI bridges seems to be that the topology has to be a tree, which becomes a bottleneck if there's no data locality. (A quick sanity check of the bus arithmetic is sketched below.)
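
A back-of-the-envelope sketch of that bus arithmetic, assuming a full binary tree of PCI-PCI bridges in which each bridge's secondary interface consumes one bus number (a simplification: real PCIe switches also spend a bus number on their internal virtual bridge):

```c
/* Count how many leaf buses fit in PCI's 8-bit (256-value) bus
 * number space when bridges fan out as a full binary tree. */
#include <stdio.h>

int main(void) {
    int leaves = 1, buses = 1;   /* bus 0 = root; it is also the only leaf */
    /* each step replaces every leaf bus with a bridge
       that fans out to two new secondary buses */
    while (buses + leaves * 2 <= 256) {
        buses += leaves * 2;     /* two new buses per old leaf */
        leaves *= 2;
    }
    printf("%d leaf buses within %d bus numbers\n", leaves, buses);
    /* prints: 128 leaf buses within 255 bus numbers */
    return 0;
}
```

This reproduces the figures in the comment: a full binary tree with 128 leaf buses consumes 255 bus numbers in total.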
  • Loki726 - Monday, December 21, 2015 - link

    I think part of the problem is GPU driver performance, and whether the driver even functions correctly with that many devices, but you can get quite far with bridges.
  • SleepyFE - Tuesday, December 22, 2015 - link

    The GPU is more power-hungry and less flexible. With an FPGA you program an ASIC onto it, making it suit your needs better.
  • SaberKOG91 - Tuesday, December 22, 2015 - link

    Application-specific processor, not ASIC. ASICs are full-custom chip designs. Technically an FPGA is an ASIC that can be programmed to perform the specific functionality an application requires, but it will never match the speed or power efficiency of a true ASIC.
  • Vatharian - Monday, December 21, 2015 - link

    If the main host dies, you have to shut down the whole tree. In a distributed, host-less environment you're only taking single nodes offline, and only for a very short time. Also, under high concurrency PCIe tends to congest, and DRAM throughput is a big limitation, especially when work running on one CPU needs data from another DRAM/NUMA node. That can introduce a lot of randomness into latency and further decrease performance. (A NUMA-pinning sketch follows below.)
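
A minimal sketch of the NUMA-locality point, assuming Linux with libnuma (link with -lnuma); pinning to node 0 and the 64 MiB buffer size are illustrative choices, not anything from the article:

```c
/* Keep a worker thread and its buffer on the same NUMA node to
 * avoid the cross-node DRAM round trip described above. */
#include <numa.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support on this system\n");
        return 1;
    }
    size_t len = 64UL << 20;               /* 64 MiB scratch buffer (arbitrary) */
    numa_run_on_node(0);                   /* pin this thread to node 0 */
    char *buf = numa_alloc_onnode(len, 0); /* memory physically on node 0 */
    if (!buf) return 1;
    memset(buf, 0, len);                   /* first touch from the same node */
    /* ... compute against buf with local-node latency ... */
    numa_free(buf, len);
    return 0;
}
```

If the thread instead ran on another node, every access to buf would cross the inter-socket link, which is exactly the latency randomness the comment describes.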
  • ddriver - Monday, December 21, 2015 - link

    Yay, finally. With PCIe switch chips being so affordable, I've been wondering how long it would take before someone figured out that you can build a crazy-fast, complex-topology interconnect on a budget. I've been looking into using PCIe switches to build supercomputers out of affordable single-socket motherboards: snap in a decent CPU and two GPUs for compute, use the remaining PCIe slots to connect to other nodes, and there you have it. No need for crazy expensive proprietary interconnects, no need for crazy expensive components. (A toy topology sketch follows below.)
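
A toy sketch of what such a budget interconnect could look like, under the assumption (mine, not the commenter's) that each single-socket board spends four spare PCIe slots on point-to-point links to its neighbors, forming a 2D torus; a BFS then gives the worst-case hop count:

```c
/* Model an 8x8 torus of single-socket boards linked via spare PCIe
 * slots and compute the worst-case number of hops from board 0. */
#include <stdio.h>
#include <string.h>

#define W 8           /* torus width  (8x8 = 64 boards, arbitrary size) */
#define H 8           /* torus height */
#define N (W * H)

static int id(int x, int y) { return ((y + H) % H) * W + ((x + W) % W); }

int main(void) {
    int dist[N], queue[N], head = 0, tail = 0;
    memset(dist, -1, sizeof dist);
    dist[0] = 0;                 /* BFS from board 0 */
    queue[tail++] = 0;

    while (head < tail) {
        int v = queue[head++], x = v % W, y = v / W;
        int nbr[4] = { id(x+1,y), id(x-1,y), id(x,y+1), id(x,y-1) };
        for (int i = 0; i < 4; i++)
            if (dist[nbr[i]] < 0) {
                dist[nbr[i]] = dist[v] + 1;
                queue[tail++] = nbr[i];
            }
    }

    int diameter = 0;
    for (int i = 0; i < N; i++)
        if (dist[i] > diameter) diameter = dist[i];

    printf("%d boards, worst-case %d PCIe hops\n", N, diameter);
    return 0;     /* 8x8 torus: 8 hops worst case */
}
```

The torus shape is just one option; the point is that with cheap switch silicon the hop count, and hence latency, stays modest even without proprietary interconnects.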
  • agentd - Saturday, December 26, 2015 - link

    Have you seen the Avago (formerly PLX) PEX9700 series switches?
  • extide - Monday, December 21, 2015 - link

    In there you mention you could combine the BI and the BN, but that wouldn't make sense: then you'd have a server with a CPU and an accelerator, which is right back where we started.

    However, it does make sense to combine the CN and the BI; that way you don't need the extra layer of BI nodes.

    Am I misunderstanding this, or was that a typo?
