Original Link: https://www.anandtech.com/show/16819/supermicro-ultra-sys120utnr-review-testing-dual-10nm-ice-lake-xeon-in-1u



With the launch of Intel’s Ice Lake Xeon Scalable platform comes a new socket and a range of features that vendors like Supermicro have to design for. The server and enterprise market is so vast that every design can come in a range of configurations and settings; one of the key elements is balancing compute density against memory and accelerator support. The SYS-120U-TNR we are testing today is a dense system with plenty of trimmings inside a 1U chassis, which Supermicro is aiming at virtualization, HPC, cloud, software-defined storage, and 5G workloads. The system can be equipped with up to 80 cores, 12 TB of DRAM, and four PCIe 4.0 accelerators, making it a high-end solution from Supermicro.

Servers: General Purpose or Hyper Focused?

Due to the way the server and enterprise market is both expansive and optimized, vendors like Supermicro have to decide how to partition their server and enterprise offerings. Smaller vendors might choose to target one particular customer, or go for a general purpose design, whereas the larger vendors can have a wide portfolio of systems for different verticals. Supermicro falls into this latter category, designing targeted systems with large customers, but also enabling ‘standard’ systems that can do a bit of everything but still offer good total cost of ownership (TCO) over the lifetime of the system.


Server size compared to a standard 2.5-inch SATA SSD

When considering a ‘standard’ enterprise system, in the past we have typically observed a dual-socket design in a 2U (3.5-inch, 8.9 cm height) chassis, which allows for a sufficient cooling design along with a number of add-in accelerators such as GPUs or enhanced networking, plus space on the front panel for storage or additional cooling. The system we’re testing today, the SYS-120U-TNR, certainly fits this ‘standard’ definition, although Supermicro takes the additional step of optimizing for density by packing everything into a 1U chassis.

With only 1.75 inches (4.4 cm) of vertical clearance on offer, cooling becomes a priority, which means substantial heatsinks and fast-moving airflow, backed by eight powerful 40 mm fans running at up to 30k RPM under PWM control. The SYS-120U-TNR we’re testing supports two Ice Lake Xeon processors at up to 40 cores and 270 W each, plus additional add-in accelerators (one dual-slot full height and two single-slot full height), and comes equipped with dual 1200 W or dual 800 W Titanium power supplies, indicating that it is ready should a customer want to fill it with plenty of hardware. As you can see in the image above, and on the right of the image below, Supermicro uses plastic baffles to keep airflow through the heatsinks and memory as laminar as possible.


LGA-4189 Socket with 1U Heatsink and 16 DDR4 slots

Even with the 1U form factor, Supermicro has enabled full memory support for Ice Lake Xeon, giving each processor sixteen DDR4-3200 memory slots (32 in total), capable of supporting up to 12 TB of memory when using Intel’s Optane DCPMM 200-series.
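The 12 TB headline figure comes from mixing Optane persistent memory with LRDIMMs across the 32 slots, per the specification table further down. A quick sanity check of the arithmetic:

```python
# Sanity check of the maximum memory configurations (capacities in GB).
slots_per_cpu = 16
cpus = 2
total_slots = slots_per_cpu * cpus          # 32 slots across both sockets

# All-LRDIMM configuration: 32 x 256 GB modules
max_lrdimm_gb = total_slots * 256

# Mixed configuration: 16 x 512 GB Optane PMem 200 + 16 x 256 GB LRDIMM
max_mixed_gb = 16 * 512 + 16 * 256

print(max_lrdimm_gb / 1024)   # 8.0  (TB)
print(max_mixed_gb / 1024)    # 12.0 (TB)
```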

At the front are twelve 2.5-inch SATA/NVMe PCIe 4.0 x4 hot-swappable drive bays, six apiece from each processor. Working out where all the PCIe lanes from each processor go gets confusing very quickly.
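A rough per-socket lane budget helps frame the problem. Each Ice Lake Xeon provides 64 PCIe 4.0 lanes; the six-drives-per-CPU split is as described above, while the assignment of the riser slots to particular sockets is our assumption for illustration:

```python
# Rough per-socket PCIe 4.0 lane budget. Ice Lake Xeon provides 64 lanes
# per socket; the riser-slot split between the two sockets below is an
# illustrative assumption, not Supermicro's documented mapping.
LANES_PER_SOCKET = 64

nvme_per_cpu = 6 * 4             # six U.2 drives at x4 each = 24 lanes
riser_cpu1 = [16, 16]            # e.g. the two full-height x16 slots
riser_cpu2 = [16, 16]            # e.g. low-profile x16 + internal x16

cpu1_used = nvme_per_cpu + sum(riser_cpu1)
cpu2_used = nvme_per_cpu + sum(riser_cpu2)

for name, used in (("CPU1", cpu1_used), ("CPU2", cpu2_used)):
    spare = LANES_PER_SOCKET - used
    print(f"{name}: {used}/{LANES_PER_SOCKET} lanes used, {spare} spare")
```

Under these assumptions each socket has only a handful of lanes left over, which is where options like the Ultra networking riser start competing for resources.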

By default the system comes without network connectivity, only a BMC connection for admin control. Network options require an Ultra add-in riser card for dual 10GBase-T (X710-AT2), or dual 10GBase-T plus dual 10GbE SFP+ (X710-TM4). Through the PCIe slots any other networking option can be configured, but Supermicro also lists the complete no-NIC option for air-gapped systems. The system also has three USB 3.0 ports (two rear, one front), a rear VGA output, a rear COM port, and two SuperDOM ports internally.

Admin control comes via the ASPEED AST2600, which supports IPMI v2.0, the Redfish API, Intel Node Manager, Supermicro Update Manager, and Supermicro’s SuperDoctor 5 monitoring interface.
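Because the BMC exposes a Redfish API, basic inventory can be pulled programmatically. Below is a sketch parsing the kind of JSON a Redfish `ComputerSystem` resource returns; the payload is an illustrative sample shaped after the DMTF schema, not captured from this machine:

```python
import json

# Illustrative sample of a Redfish ComputerSystem resource. Field names
# follow the DMTF Redfish schema; the values here are hypothetical.
sample = json.loads("""
{
  "ProcessorSummary": {"Count": 2, "Model": "Intel(R) Xeon(R) Gold 6330"},
  "MemorySummary": {"TotalSystemMemoryGiB": 512},
  "PowerState": "On"
}
""")

# In practice this JSON would come from an authenticated GET against
# https://<bmc-ip>/redfish/v1/Systems/1 (the exact path can vary by vendor).
cpus = sample["ProcessorSummary"]["Count"]
mem = sample["MemorySummary"]["TotalSystemMemoryGiB"]
print(f"{cpus} CPUs, {mem} GiB RAM, power state: {sample['PowerState']}")
```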

The configuration Supermicro sent to us for review contains the following:

  • Supermicro SYS-120U-TNR
  • Dual Intel Xeon Gold 6330 CPUs (2x28-core, 2.5-3.1 GHz, 2x205W, 2x$1894)
  • 512 GB of DDR4-3200 ECC RDIMMs (16 x 32 GB)
  • Dual Kioxia CD6-R 1.92TB PCIe 4.0 x4 NVMe U.2
  • Dual 10GBase-T via X710-AT2

Full support for the system includes:

Supermicro SYS-120U-TNR Specifications

Motherboard: Super X12DPU-6
CPUs: Dual Socket P+ (LGA-4189); supports 3rd Gen Xeon Scalable (Ice Lake) up to 270 W TDP, 40 cores / 80 threads each; 7+1 phase power design per socket
DRAM: 32 × DDR4-3200 ECC slots (RDIMM, LRDIMM); up to 8 TB (32 × 256 GB LRDIMM) or up to 12 TB (16 × 512 GB Optane + 16 × 256 GB LRDIMM)
Storage: 12 × SATA front panel; optional PCIe 4.0 x4 NVMe cabling
PCIe: 1 × PCIe 4.0 x16 low profile; 1 × PCIe 4.0 x16 low profile (internal); 2 × PCIe 4.0 x16 full height (10.5-inch length); Ultra Riser for networking
Networking: None by default; optional X710-AT2 (dual 10GBase-T) or X710-TM4 (dual 10GBase-T + dual SFP+)
IO: RJ45 BMC via ASPEED AST2600; 3 × USB 3.0 (2 rear, 1 front); VGA (BMC); 1 × COM; 2 × SuperDOM
Fans: 8 × 40 mm double-thick, up to 30k RPM, PWM controlled; 2 air shrouds (1 per CPU socket + DRAM)
Power: Dual 1200 W Titanium redundant, max 100A
Chassis: CSE-119UH3TS-R1K22P-T
Management: IPMI 2.0 via ASPEED AST2600; Supermicro OOB license included; Redfish API; Intel Node Manager; KVM with dedicated LAN; Supermicro Update Manager (SUM); NMI; watchdog; SuperDoctor 5; ACPI power management
Optional: 2 × M.2 RAID carrier; Broadcom CacheVault; Intel VROC RAID key; RAID cards + cabling; hardware-based TPM; Ultra Riser cards
Note: Sold as an assembled system to resellers (minimum 2 × CPU, 4 × DDR, 1 × storage, 1 × NIC)

We reached out to Supermicro for some insight into how this system might be configured for the different verticals. 

Supermicro Ultra-E SYS-120U-TNR Configuration Variants

Virtualization: CPU ++, Memory ++
HPC: CPU ++, Add-In +
Cloud Computing: handles all mainstream configurations
High-End Enterprise: CPU ++, Memory ++, Storage ++, Add-In ++
Software Defined Storage: Storage + (or move to 2U)
Application-as-a-Service: CPU +, Memory +, Storage +, Add-In +
5G/Telco: Ultra-E short-depth version


Read on for our benchmark results.



BIOS, Software, BMC

The networked management for the Supermicro SYS-120U-TNR uses the latest interface from Supermicro through the ASPEED AST2600, which is given an IP by DHCP upon connection. Interestingly, accessing the interface did not work with Chrome at all: after logging in it would simply freeze on the system page while trying to fetch basic system details. In the end I had to use the non-Chromium build of Edge. On top of that, both Chrome and Edge warned that the certificate for the BMC webpage was invalid, which meant jumping through a hoop to access it.

The username and password to access the system are no longer the default admin/admin or admin/password: due to California’s 2018 law SB-327, all IoT devices (including servers) that provide administrator access to settings and configurations must ship with unique passwords. The username for us was still ADMIN; however, the password was found on a pull-out tab at the front of the server, or alternatively on the inside of the double-width PCIe slot inside the chassis.

The Supermicro interface is as detailed as a management interface needs to be, with this main dashboard showcasing firmware versions, power consumption, the remote console, and recent system messages and actions.

The System tab states a lot of similar information to the dashboard, with links to the separate component detection of the server.

The CPUs are both detected here, and although the interface reports a base frequency of 2.00 GHz (versus the official 2.5 GHz) and a turbo frequency of 4.5 GHz (actually 3.1 GHz), we measure the correct numbers in the operating system.

All sixteen memory modules are detected, with ECC enabled, for a total of 512 GB.

The power supplies are detected as well. In this image we only have one of the 1200W models connected to the mains, but the interface still shows the thermal sensor for the unconnected power supply.

In our system the sensor module didn’t seem to read anything from the hardware; however, the fans did run at full speed regardless.

Updating the BMC or BIOS is relatively easy through the update interface when you have a file to hand. The system also keeps track of when it was updated and with what version firmware.

For remote control, both HTML5 and Java are supported; however, we could not get the HTML5 interface to work during our testing. Java worked well, and is likely kept around for legacy and fallback support, despite Java no longer being recommended.

Overall the management options were as standard as we normally expect from this sort of system. On the plus side it looks a lot nicer than some of the base AMI / older interfaces we still encounter from time to time, but on the minus side I’m still unsure why it wouldn’t work in Chrome.

BIOS

On the BIOS/UEFI side of the equation, we get a simple blue and grey interface from AMI which runs as standard on enterprise systems. The X12DPU-6 motherboard we are using has BIOS version 1.0b and a total of 512 GB of memory detected.

In the Advanced CPU section, it shows that we have two Xeon Gold 6330 processors with the D1 stepping. Similar to the BMC, it reports a 2.0 GHz base frequency (Intel’s official specifications state 2.5 GHz), but everything else looks in order. Individual cores can be disabled via the bitmap settings shown here.

One of the new features of the Xeon Gold processors is SGX enclaves, which require Total Memory Encryption (TME) to be enabled.

In the PCIe section, Above 4G Decoding was enabled by default (often disabled by default on consumer platforms), and the system allows a selection of NVMe firmware such that it can be software driven rather than vendor firmware driven.

For the uncore/mesh sub-system, we can see that this system is configured with 11.2 GT/s UPI links (one of the upgrades over the previous generation), but there are also a number of options here that can affect the system based on use case. Customers can configure the system to prioritize topology at the expense of feature performance (e.g. cores over IO), or vice versa. Similarly, a user can enable SNC2 (Sub-NUMA Clustering) to partition each processor into two hemispheres for lower-latency memory accesses at the expense of peak bandwidth. There is also an option to throttle cache snooping to manage power based on the sort of workloads the system will end up running.

All the NVMe slots in the front panel of the system can be PCIe 4.0 x4 enabled, and there’s an option to check that here as well.

Other options in the BIOS include IPMI network settings, event logs, and traditional BIOS security.



System Results and Benchmarks

When it comes to system tests, the most obvious starting point is power consumption. Running through our benchmark suite, the IPMI does a good job of logging power consumption every few minutes. The highest value listed was a 747 W peak, although the graph captured for these final photos shows something north of 750 W.

750 W for a fully loaded dual 28-core, 2×205 W system sounds quite high. The power supply peaks at 1200 W, which leaves roughly 450 W for an AI accelerator and anything additional: a good GPU and a dozen high-power NVMe drives is about your limit. Luckily that’s all you can fit into the system anyway. Users who need 270 W processors in this system might have to cut back on some of the extras.
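The budget above can be sketched out quickly. The 750 W figure is the observed full-CPU-load draw; the GPU and per-drive wattages below are typical values we assume for illustration, not measurements from this system:

```python
# Back-of-the-envelope power budget against the 1200 W PSU rating.
# cpu_load_w is the observed full-load draw; gpu_w and the per-drive
# figure are assumed typical values for illustration.
PSU_W = 1200
cpu_load_w = 750                 # observed with both CPUs fully loaded

headroom = PSU_W - cpu_load_w    # what is left for accelerators/storage

gpu_w = 300                      # a typical dual-slot accelerator
nvme_w = 12 * 12                 # twelve U.2 drives at ~12 W under load

extras = gpu_w + nvme_w
print(f"Headroom: {headroom} W, extras: {extras} W, "
      f"margin: {headroom - extras} W")
```

Under these assumptions a full complement of drives plus one big accelerator consumes essentially all of the remaining budget, which matches the "about your limit" observation above.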

Testing the system at full load, processor power consumption comes out to about 205 W per processor (the rated TDP) during turbo.

From this, idle power appears to be around 100 W, split between cores and DRAM (we assume IO is counted under DRAM). When loaded, the extra budget goes into the processors. We see the same thing in CineBench, except there seems to be less stress on the DRAM/IO in that test.

Benchmarks

While we don't have a series of server-specific tests, we are able to probe the capability of the system as delivered through a mix of our enterprise and workstation testing. The LLVM compile and SPEC tests are Linux-based, while the rest run on Windows, based on personal familiarity and our back catalog of comparison data. It is worth noting that some software has difficulty scaling beyond 64 threads in Windows due to processor groups; this comes down to how the software is compiled and run. All the tests here were able to sidestep this limitation except LinX LINPACK, which has a 64-thread limit (and is Intel-only).
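The 64-thread ceiling comes from Windows processor groups, which cap each group at 64 logical processors; a quick sketch of the arithmetic for this machine:

```python
import math

# Windows assigns logical processors to "processor groups" of at most 64;
# a process only spans multiple groups if the software is group-aware.
GROUP_SIZE = 64

cores = 2 * 28            # dual Xeon Gold 6330
threads = cores * 2       # Hyper-Threading enabled

groups = math.ceil(threads / GROUP_SIZE)
print(f"{threads} logical processors -> {groups} processor groups")
# Non-group-aware software is confined to a single group of 64 threads,
# which is why LinX LINPACK cannot use the full machine here.
```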

  • LLVM Compile
  • (4-1) Blender 2.83 Custom Render Test
  • (8-5) LinX 0.9.5 LINPACK
  • SPECint2017 Base Rate-N
  • SPECfp2017 Base Rate-N
  • (4-5) V-Ray Renderer
  • (4-2) Corona 1.3 Benchmark
  • (2-2) 3D Particle Movement v2.1 (Peak AVX)
  • (1-1) Agisoft Photoscan 1.3, Complex Test
  • (4-7b) CineBench R23 Multi-Thread

In almost all cases, the dual-socket 28-core SYS-120U-TNR sits behind the single-socket 64-core option from AMD. Against the dual 8280 or dual 6258R we can see a generational uplift, however there is still a struggle against AMD's previous-generation top-tier processor. That said, AMD's processor costs $6950, whereas two of these 6330s come to around $3800. There is always a balance between price, total cost of ownership, and the benefits versus the complexities of a dual-socket system against a single-socket one. The benchmarks where the SYS-120U-TNR did best were our AVX tests, such as 3DPM and y-cruncher, where these processors could use AVX-512. As stated by Intel's Lisa Spelman in our recent interview, "70% of those deal wins, the reason listed by our salesforce for that win was AVX-512; optimization is real".
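The price side of that balance is easy to quantify from the list prices quoted above (two Gold 6330s at $1,894 each versus AMD's 64-core part at $6,950); a simple per-core comparison, ignoring platform and TCO factors:

```python
# List-price comparison using the figures quoted in this review.
# Per-core cost ignores platform, memory, and power differences.
intel_6330 = 1894            # list price per Xeon Gold 6330
dual_intel = 2 * intel_6330
amd_64c = 6950               # list price of AMD's 64-core option

intel_cores = 2 * 28
amd_cores = 64

print(f"Dual 6330: ${dual_intel} -> ${dual_intel / intel_cores:.0f}/core")
print(f"AMD 64C:   ${amd_64c} -> ${amd_64c / amd_cores:.0f}/core")
```

On list price alone the dual-Intel setup is notably cheaper per core, which is the lever Intel is pulling even where raw performance trails.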



Conclusions & Thoughts on Dense Compute

Truth be told, when I was in discussions with Supermicro about reviewing one of its Ice Lake systems, I wasn’t sure what to expect.  I spoke to my contact at the company about sending a system that is expected to be a popular all-around enterprise system, one that could entertain many of Supermicro’s markets, and the SYS-120U-TNR fits that bill with the understanding that companies are also requesting denser environments.

The desire to move from the previously standard 2U designs to 1U, even for generic dual-socket systems, seems to be a feature of this next generation of enterprise deployments. Data centers and colocation facilities have built infrastructure to support high-powered racks for AI: enterprises that run super-dense AI workloads now invest in 5U systems consuming 5 kW+, enough that you can't even fill a 42U rack without exceeding standard rack power limits unless high-power infrastructure is in place. The knock-on effect of better colo and enterprise infrastructure is that customers using generic all-round off-the-shelf systems can cut their racks of 2U infrastructure in half. This can also be combined with any benefit of moving from an older generation of processor to the new one.

This causes a small issue for those of us who review servers every now and then: a modern dual-socket server with good CPUs can no longer be tested in a home rack without ear protection. Normally such a system would be tested in a lower-than-peak fan mode, without additional thermal assistance, however these systems require either fans at full speed or additional HVAC just to run standard tests. A modern datacenter lets these systems run as loud as they need to, with a cooling environment optimized for performance density regardless of fan speed. Enterprise customers are taking advantage of this at scale, and that's why companies like Supermicro are designing systems like the SYS-120U-TNR to meet those needs.

Dense Thoughts on Compute

What I think Supermicro is trying to do with the SYS-120U-TNR is cater to the biggest portion of demand across a variety of use cases. This system could serve as a single-CPU caching tier or a multi-tiered database processing hub; it could handle AI acceleration in both training and inference; add a dual-slot NVIDIA GPU with a virtualization license and it could run several remote workers with CUDA access; or with multiple FPGAs it could become a hub for SmartNIC offload or development. I applaud the fact that Supermicro has quite capably built an all-round machine that can be configured to cater to so many markets.

One slight drawback from my perspective is the lack of a default network interface, even a simple gigabit connection, without an add-in card. Supermicro won't ship the system without an add-in NIC anyway; however, users will either have to add their own PCIe solution (taking up a slot) or rely on one of Supermicro's Ultra Riser networking cards drawing PCIe lanes from the processor. We could argue that Supermicro's decision allows for better flexibility, especially when space at the rear of the system is limited, but I'm still of the opinion that at least something should be there, hanging off the chipset.

On the CPU side of things, as we noted in our Intel 3rd Generation Xeon Scalable Ice Lake review, the processors themselves offer an interesting increase in generational performance, as well as key optimization points for things like AVX-512, SGX enclaves, and Optane DC Persistent Memory. The move up to PCIe 4.0, eight channels of DDR4-3200 memory, and a focus on an optimized software stack are plus points for the product, but if your workload falls outside of those optimizable use cases, AMD's equivalent offerings seem to deliver more performance for the same cost, or in some instances a lower cost and lower power.

The Xeon Gold 6330 we are testing today is the update to the previous generation's 28-core Xeon Gold 6258R, running at the same power but half the cost and much lower frequencies. There's a trade-off there, as the Xeon 6330 isn't as fast yet consumes the same power; by charging half as much for the processor, Intel is trying to move the TCO equation to where it needs to be for its customers. The Ice Lake Xeon Gold 6348 is closer in frequency to the 6258R (2.6 GHz base vs 2.7 GHz base) and closer in price ($3072 list vs $3950), but despite the slightly lower frequency it is rated at a higher TDP (235 W vs 205 W). In our Ice Lake review, the new 8380 beat the older 8280 because the power was higher, there were more cores, and we saw an uplift in IPC. The question is now in the mid-range: while Intel struggles to get its new CPUs to match the old without changing list pricing, AMD allows customers to move from dual socket to single socket, all while increasing performance and reducing power costs.

This is somewhat inconsequential for today’s review, in that Supermicro’s system caters to the customers that require Intel for their enterprise infrastructure, regardless of the processor performance.  The SYS-120U-TNR is versatile and configurable for a lot of markets, ripe for an ‘off-the-shelf’ system deployment.
