Original Link: https://www.anandtech.com/show/16236/asrock-4x4-box4800u-renoir-nuc-review



AMD-based ultra-compact form-factor (UCFF) systems are gaining market acceptance, with the Zen architecture steadily catching up with Intel on both performance and power consumption. AMD's latest and greatest has been reserved for the high-end desktop market, with the parts meant for low-power / compact systems appearing a few quarters later. Zen 3-based desktop CPUs were introduced recently. However, Zen 2-based parts with 12-25W TDP (Renoir APUs) have only just started appearing in compact desktop systems. ASRock Industrial launched the Ryzen 4000U-based 4X4 BOX-4000 series in September. The review below looks at the flagship model - the 4X4 BOX-4800U - and how it compares against the equivalent Comet Lake-U-based Frost Canyon NUC from Intel.

Introduction

The PC market has grown in the last few years, thanks in no small part to ultra-compact form-factor (UCFF) and gaming systems. Intel's NUC line-up has been ruling the roost in the former category. Given AMD's focus on multi-threaded performance and core counts with the first-generation Zen microarchitecture, Zen and Zen+-based Ryzen APUs did not have good enough power efficiency and performance per watt to make a dent in Intel's success in the NUC space. ASRock Industrial did release UCFF systems based on the AMD Ryzen Embedded Processors lineup (we reviewed one such system - the 4X4 BOX-V1000M). While the GPU prowess and multi-threaded performance turned out to be appreciable aspects, the single-threaded performance, power efficiency, and driver issues made it a tough sell against competing Intel-based NUCs. The introduction of Zen 2-based APUs (Renoir) fabricated in TSMC's 7nm process changed the equation by addressing all the aforementioned weak points.

AMD prioritized the delivery of Renoir APUs to the notebook market, with mini-PCs following soon after. ASRock Industrial was again at the forefront. Along with Asus's PN50, they were one of the first to launch systems based on these parts. The 4X4 BOX-4000 series has three different SKUs with CPU core counts of 4 (Ryzen 3 4300U), 6 (Ryzen 5 4500U), and 8 (Ryzen 7 4800U) each. The last one is the flagship, and that is the one we are looking at today.

The 4X4 BOX-4800U has a 104mm x 102mm main-board housed in a 110mm x 117.5mm x 47.85mm plastic chassis. The system matches the Intel NUCs in the footprint department. The board comes with a soldered processor - the Ryzen 7 4800U belonging to the AMD Renoir APU series. It is an octa-core processor with SMT enabled (8C/16T). It can operate with a TDP configurable between 12W and 25W.

ASRock Industrial sampled us a barebones version of the system. In partnership with Patriot Memory, they also provided us with their recommended storage (Patriot P300 PCIe 3.0 x4 NVMe SSD) and memory (Patriot Signature Line 2x32GB DDR4-3200 SODIMM) for usage with the PC.

The specifications of our ASRock 4X4 BOX-4800U review configuration are summarized in the table below.

ASRock 4X4 BOX-4800U Specifications

Processor: AMD Ryzen 7 4800U
    Zen 2 (Renoir) 8C/16T, 1.8 - 4.2 GHz
    TSMC 7nm, 8MB L3, 10 - 25 W (15W)
Memory: Patriot Memory PSD432G32002S DDR4 SODIMM
    22-22-22-52 @ 3200 MHz
    2x32 GB
Graphics: AMD Radeon Graphics (Renoir) - Integrated GPU with 8 CUs
Disk Drive(s): Patriot P300
    (512 GB; M.2 2280 PCIe 3.0 x4; Kioxia 96L 3D TLC)
    (Silicon Motion SM2263XT Controller)
Networking: Intel Wi-Fi 6 AX200
    (2x2 802.11ax - 2400 Mbps)
    1x Realtek RTL8111G Gigabit Ethernet Controller
    1x Realtek RTL8125 2.5 Gigabit Ethernet Controller
Audio: 3.5mm Headphone Jack
    Capable of 5.1/7.1 digital output with HD audio bitstreaming (HDMI)
Miscellaneous I/O Ports: 2x USB 2.0
    2x USB 3.2 Gen 2 Type-C
    1x USB 3.2 Gen 2 Type-A
Operating System: Retail unit is barebones, but we installed Windows 10 Enterprise x64
Pricing: $600 (barebones)
    $878 (as configured)
Full Specifications: ASRock Industrial 4X4 BOX-4800U Specifications

The ASRock Industrial 4X4 BOX-4800U kit doesn't come with any pre-installed OS, but does come with a CD containing the drivers. In any case, we ended up installing the latest drivers downloaded off the product support page. In addition to the main unit, the other components of the package include a 90 W (19V @ 4.74A) adapter, a US power cord, a VESA mount (along with the necessary screws), a driver CD, user's manual and a quick-start guide. Installing the storage and RAM is straightforward - a matter of popping off four screws on the chassis underside and mounting the components in the appropriate slot.

The above gallery shows the package components along with the chassis design and the internal components. The system also includes support for the installation of a 2.5" drive, with a very flexible SATA power / data cable already in place.

In the table below, we have an overview of the various systems that we are comparing the ASRock 4X4 BOX-4800U against. Note that they may not belong to the same market segment. The relevant configuration details of the machines are provided so that readers have an understanding of why some benchmark numbers are skewed for or against the ASRock 4X4 BOX-4800U when we come to those sections.

Comparative PC Configurations

Aspect: ASRock 4X4 BOX-4800U
CPU: AMD Ryzen 7 4800U
GPU: AMD Renoir (Radeon RX Vega 8 / GCN5)
RAM: Patriot Memory PSD432G32002S DDR4 SODIMM
    22-22-22-52 @ 3200 MHz
    2x32 GB
Storage: Patriot P300
    (512 GB; M.2 2280 PCIe 3.0 x4; Kioxia 96L 3D TLC)
    (Silicon Motion SM2263XT Controller)
Wi-Fi: Intel Wi-Fi 6 AX200
    (2x2 802.11ax - 2400 Mbps)
Price (in USD, when built): $600 (barebones)
    $878 (as configured)


Platform Analysis

AMD provided the press with a block diagram of the Renoir APU at the time of the launch of the Zen 2 APUs. These APUs are monolithic dies, and true SoCs with all I/Os being sourced from the APU without a platform controller hub in the picture.

The above layout needs to be studied in conjunction with the design of the 4X4 BOX-4800U's motherboard. The various I/Os of the system (as well as internal components) are enabled using the following configuration.

An idea of the distribution of the various PCIe lanes can be obtained from the above diagram:

  • PCIe 3.0 x4 and a SATA port multiplexed behind the M.2 SSD slot (PCI Express x8 Bus #4)
  • PCIe 3.0 x1 for the Intel Wi-Fi 6 AX200 160 MHz WLAN card (PCI Express x1 Bus #3)
  • PCIe 3.0 x1 for the Realtek RTL8125 Gaming 2.5GbE Ethernet Controller (PCI Express x1 Bus #1)
  • PCIe 3.0 x1 for the Realtek RTL8168/RTL8111 GbE Ethernet Controller (PCI Express x1 Bus #2)
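
As a rough sanity check on the lane allocation listed above, the theoretical bandwidth of each link can be estimated from the PCIe 3.0 signaling rate (8 GT/s per lane with 128b/130b encoding). The sketch below is illustrative only: the function and dictionary names are our own, and real-world throughput will be lower because of protocol overhead.

```python
# Theoretical PCIe 3.0 link bandwidth (per direction), ignoring protocol overhead.
PCIE3_GT_PER_S = 8.0               # giga-transfers per second, per lane
ENCODING_EFFICIENCY = 128 / 130    # 128b/130b line encoding

def pcie3_bandwidth_gbps(lanes: int) -> float:
    """Return the usable bandwidth in Gbit/s for a PCIe 3.0 link of the given width."""
    return PCIE3_GT_PER_S * ENCODING_EFFICIENCY * lanes

# Link widths taken from the list above (the labels are illustrative).
links = {
    "M.2 SSD slot (x4)": 4,
    "Wi-Fi 6 AX200 (x1)": 1,
    "RTL8125 2.5GbE (x1)": 1,
    "RTL8111 GbE (x1)": 1,
}

for name, lanes in links.items():
    gbps = pcie3_bandwidth_gbps(lanes)
    print(f"{name}: {gbps:.2f} Gbit/s (~{gbps / 8:.2f} GB/s)")
```

At roughly 7.9 Gbit/s, a single lane comfortably exceeds the 2.5 Gbit/s Ethernet and 2400 Mbps Wi-Fi link rates listed in the specifications.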

Other aspects of interest include the distribution of various USB ports - particularly in terms of bandwidth sharing. For example, we expect ports 'B' and 'C' behind the same root hub to share bandwidth. Note that the PCI Express x16 Bus #5 includes the two USB 3.1 controllers as well as the integrated GPU. The two Type-C ports in the front panel also act as conduits for two display outputs from the latter.

In the remainder of this review, we will first look at BAPCo's SYSmark 25, followed by various UL benchmarks and miscellaneous workloads. We also present some gaming benchmarks. A detailed look at the HTPC credentials of the system is followed by testing of the power consumption and thermal solution.



BAPCo SYSmark 25

The ASRock 4X4 BOX-4800U was evaluated using our Fall 2020 test suite for small-form factor PCs. In the first section, we will be looking at SYSmark 25.

BAPCo's SYSmark 25 is an application-based benchmark that uses real-world applications to replay usage patterns of business users in the areas of productivity, creativity, and responsiveness. The 'Productivity Scenario' covers office-centric activities including word processing, spreadsheet usage, financial analysis, software development, application installation, file compression, and e-mail management. The 'Creativity Scenario' represents media-centric activities such as digital photo processing, AI and ML for face recognition in photos and videos for the purpose of content creation, etc. The 'Responsiveness Scenario' evaluates the ability of the system to react in a quick manner to user inputs in areas such as application and file launches, web browsing, and multi-tasking.

Scores are meant to be compared against a reference desktop (the SYSmark 25 calibration system, a Lenovo ThinkCentre M720q with a Core i5-8500T and 8GB of DDR4 memory to go with a 256GB M.2 NVMe SSD). The calibration system scores 1000 in each of the scenarios. A score of, say, 2000, would imply that the system under test is twice as fast as the reference system.
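
The scoring convention described above amounts to a simple ratio against the calibration system. A minimal sketch follows; the function name is our own, and the sample score is the hypothetical value from the text rather than a measured result.

```python
CALIBRATION_SCORE = 1000  # score of the SYSmark 25 calibration system in every scenario

def relative_speed(score: float) -> float:
    """Return how many times faster than the calibration system a given score implies."""
    return score / CALIBRATION_SCORE

# Hypothetical score, for illustration only
print(relative_speed(2000))
```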

SYSmark 25 - Productivity

SYSmark 25 - Creativity

SYSmark 25 - Responsiveness

SYSmark 25 - Overall

SYSmark 25 also adds energy measurement to the mix. A high score in the SYSmark benchmarks might be nice to have, but potential customers also need to determine the balance between power consumption and the efficiency of the system. For example, in the average office scenario, it might not be worth purchasing a noisy and power-hungry PC just because it ends up with a 2000 score in the SYSmark 25 benchmarks. In order to provide a balanced perspective, SYSmark 25 also allows vendors and decision makers to track the energy consumption during each workload. In the graphs below, we find the total energy consumed by the PC under test for a single iteration of each SYSmark 25 workload. For reference, the calibration system consumes 8.88 Wh for productivity, 10.81 Wh for creativity, and 19.69 Wh overall.
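
One way to weigh a score against its energy cost is sketched below, using the calibration figures quoted above. The helper names and the sample measurements are hypothetical and are not results from this review; "points per Wh" is our own illustrative metric, not one defined by BAPCo.

```python
import math

# Calibration-system energy per iteration, in Wh, as quoted above.
CALIBRATION_WH = {"productivity": 8.88, "creativity": 10.81, "overall": 19.69}

# The quoted overall figure matches the sum of the two scenario figures.
assert math.isclose(
    CALIBRATION_WH["productivity"] + CALIBRATION_WH["creativity"],
    CALIBRATION_WH["overall"],
    abs_tol=0.005,
)

def relative_energy(scenario: str, measured_wh: float) -> float:
    """Energy used by the system under test as a fraction of the calibration system's."""
    return measured_wh / CALIBRATION_WH[scenario]

def score_per_wh(score: float, measured_wh: float) -> float:
    """SYSmark points delivered per watt-hour consumed (illustrative metric)."""
    return score / measured_wh

# Hypothetical measurements, for illustration only (not results from this review).
example_score, example_wh = 1200, 15.0
print(f"{relative_energy('overall', example_wh):.2f}x calibration energy")
print(f"{score_per_wh(example_score, example_wh):.1f} points per Wh")
```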

SYSmark 25 - Productivity Energy Consumption

SYSmark 25 - Creativity Energy Consumption

SYSmark 25 - Overall Energy Consumption

Traditional office and content creation workloads continue to be influenced heavily by single-threaded performance. Based on the above results, it can be said that the Frost Canyon NUC has a slight edge in performance as well as power consumption. It must be noted that the Renoir APU attempts to place 8 CPU cores within a 15W power envelope, while the Comet Lake-U SKU has only 6 cores within a similar profile. Applications unable to take advantage of all the 8 cores may end up performing better on Comet Lake-U, as seen in the above results.
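
To make the core-count argument concrete, a naive even split of the package power across cores gives each Renoir core less headroom than each Comet Lake-U core. This is a back-of-the-envelope sketch that assumes a 15W budget for both chips and ignores the GPU, the uncore, and boost behavior; actual power allocation is dynamic, and the function name is our own.

```python
def naive_watts_per_core(package_watts: float, cores: int) -> float:
    """Evenly divide a package power budget across CPU cores (a deliberate simplification)."""
    return package_watts / cores

# Assumption: both chips run within the same 15W envelope.
renoir = naive_watts_per_core(15, 8)       # Ryzen 7 4800U: 8 cores
comet_lake = naive_watts_per_core(15, 6)   # Comet Lake-U SKU: 6 cores

print(f"Renoir: {renoir:.3f} W/core, Comet Lake-U: {comet_lake:.3f} W/core")
```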



UL Benchmarks - PCMark and 3DMark

This section deals with a selection of the UL Futuremark benchmarks - PCMark 10, PCMark 8, and 3DMark. While the first two evaluate the system as a whole, 3DMark focuses on the graphics capabilities.

PCMark 10

UL's PCMark 10 evaluates computing systems for various usage scenarios (generic / essential tasks such as web browsing and starting up applications, productivity tasks such as editing spreadsheets and documents, gaming, and digital content creation). We benchmarked select PCs with the PCMark 10 Extended profile and recorded the scores for various scenarios. These scores are heavily influenced by the CPU and GPU in the system, though the RAM and storage device also play a part. The power plan was set to Balanced for all the PCs while processing the PCMark 10 benchmark.

Futuremark PCMark 10 - Essentials

Futuremark PCMark 10 - Productivity

Futuremark PCMark 10 - Gaming

Futuremark PCMark 10 - Digital Content Creation

Futuremark PCMark 10 - Extended

For productivity and essentials, we see a situation similar to BAPCo's SYSmark 25 results. However, workloads involving the GPU such as gaming are a big win for the 4X4 BOX-4800U. Digital content creation can take advantage of multiple cores as well as the GPU, and that helps the 4X4 BOX-4800U score another win.

PCMark 8

We continue to present PCMark 8 benchmark results (as those have more comparison points) while our PCMark 10 scores database for systems grows in size. PCMark 8 provides various usage scenarios (home, creative and work) and offers ways to benchmark both baseline (CPU-only) as well as OpenCL accelerated (CPU + GPU) performance. We benchmarked select PCs for the OpenCL accelerated performance in all three usage scenarios. These scores are heavily influenced by the CPU in the system. The GPU acceleration helps the 4X4 BOX-4800U get the edge over the Frost Canyon NUC. However, the Bean Canyon NUC with its Iris Plus iGPU and higher thermal headroom consistently outscores the 4X4 BOX-4800U.

Futuremark PCMark 8 - Home OpenCL

Futuremark PCMark 8 - Creative OpenCL

Futuremark PCMark 8 - Work OpenCL

3DMark

UL's 3DMark comes with a diverse set of graphics workloads that target different Direct3D feature levels. Correspondingly, the rendering resolutions are also different. We use 3DMark 2.4.4264 to get an idea of the graphics capabilities of the system. In this section, we take a look at the performance of the ASRock 4X4 BOX-4800U across the different 3DMark workloads.

3DMark Ice Storm

This workload has three levels of varying complexity - the vanilla Ice Storm, Ice Storm Unlimited, and Ice Storm Extreme. It is a cross-platform benchmark (which means that the scores can be compared across different tablets and smartphones as well). All three use DirectX 11 (feature level 9) / OpenGL ES 2.0. While the Extreme renders at 1920 x 1080, the other two render at 1280 x 720. The graphs below present the various Ice Storm workloads' numbers for different systems that we have evaluated.

UL 3DMark - Ice Storm Workloads

3DMark Cloud Gate

The Cloud Gate workload is meant for notebooks and typical home PCs, and uses DirectX 11 (feature level 10) to render frames at 1280 x 720. The graph below presents the overall score for the workload across all the systems that are being compared.

UL 3DMark Cloud Gate Score

3DMark Sky Diver

The Sky Diver workload is meant for gaming notebooks and mid-range PCs, and uses DirectX 11 (feature level 11) to render frames at 1920 x 1080. The graph below presents the overall score for the workload across all the systems that are being compared.

UL 3DMark Sky Diver Score

3DMark Fire Strike Extreme

The Fire Strike benchmark has three workloads. The base version is meant for high-performance gaming PCs. Similar to Sky Diver, it uses DirectX 11 (feature level 11) to render frames at 1920 x 1080. The Ultra version targets 4K gaming systems, and renders at 3840 x 2160. However, we only deal with the Extreme version in our benchmarking - it renders at 2560 x 1440, and targets multi-GPU systems and overclocked PCs. The graph below presents the overall score for the Fire Strike Extreme benchmark across all the systems that are being compared.

UL 3DMark Fire Strike Extreme Score

3DMark Time Spy

The Time Spy workload has two levels with different complexities. Both use DirectX 12 (feature level 11). However, the plain version targets high-performance gaming PCs with a 2560 x 1440 render resolution, while the Extreme version renders at 3840 x 2160 resolution. The graphs below present both numbers for all the systems that are being compared in this review.

UL 3DMark - Time Spy Workloads

3DMark Night Raid

The Night Raid workload is a DirectX 12 benchmark test. It is less demanding than Time Spy, and is optimized for integrated graphics. The graph below presents the overall score in this workload for different system configurations.

UL 3DMark Night Raid Score

The 3DMark workloads deliver expected results - the AMD iGPU in the Renoir APU is miles ahead of the one in the Comet Lake-U SKUs.



GPU Performance - Gaming Workloads

UCFF systems are typically not put through our gaming tests. Given the capabilities of the AMD iGPU in Renoir, we made an exception and processed the following games used in our gaming SFF PC reviews.

  • Civilization VI (DX12)
  • Dota 2
  • F1 2017
  • Grand Theft Auto V
  • Middle Earth: Shadow of War
  • Far Cry 5

Most system reviews take a handful of games and process them at one resolution / quality settings for comparison purposes. Recently, we have seen many pre-built systems coming out with varying gaming capabilities. Hence, it has become imperative to give consumers an idea of how a given system performs over a range of resolutions and quality settings for each game. With our latest suite, we are able to address this aspect.

Civilization VI (DX12)

The Civilization series of turn-based strategy games is very popular. For such games, the frame rate is not necessarily an important factor in the gaming experience. However, with Civilization VI, Firaxis has cranked up the visual fidelity to make the game more attractive. As a result, the game can be taxing on the GPU as well as the CPU, particularly in the DirectX 12 mode.

Civilization VI (DirectX 12) Performance

We processed the built-in benchmark at two different resolutions (1080p and 2160p), and with two different quality settings (medium and ultra, with the exact differences detailed here). The Ryzen 5 2400G in the DeskMini A300 has a higher power budget and comfortably outscores the rest. The 4X4 BOX-4800U actually has pretty playable frame rates at 1080p with medium settings.

Dota 2

Dota 2 has featured in our mini-PC and notebook reviews for a few years now, but it continues to be a very relevant game. Our evaluation was previously limited to a custom replay file at 1080p resolution with enthusiast settings ('best-looking' preset). We have now revamped our testing to include multiple resolutions - this brings out the fact that the game is CPU-limited in many configurations.

Dota 2 allows for multiple renderers - we use the DirectX 11 mode. The rendering settings are set to 'enthusiast level' (best-looking, which has all options turned on, and at Ultra level, except for the Shadow Quality set to 'High'). We cycle through different resolutions after setting the monitor resolution to match the desired resolution. The core scripts and replay files are sourced from Jonathan Liebig's original Dota 2 benchmarking instructions which used a sequence of frames from Match 3061101068.

Dota 2 - Enthusiast Quality Performance

The 4X4 BOX-4800U is quite capable of playing Dota 2 at 1080p with enthusiast quality settings.

F1 2017

Our gaming system reviews have always included a representative racing game. While our previous benchmark suite for PCs featured Dirt 2, we have moved on to F1 2017 from Codemasters for our revamp.

F1 2017 - Ultra Quality Performance

The supplied example benchmark (with some minor tweaks) is processed at four different resolutions while maintaining the graphics settings at the built-in 'Ultra' level. This is taxing on iGPUs, but the 4X4 BOX-4800U manages to acquit itself passably at 720p.

Grand Theft Auto V

GTA doesn’t provide graphical presets, but opens up the options to users and extends the boundaries by pushing even the hardest systems to the limit using Rockstar’s Advanced Game Engine under DirectX 11. Whether the user is flying high in the mountains with long draw distances or dealing with assorted trash in the city, when cranked up to maximum it creates stunning visuals but hard work for both the CPU and the GPU. For our test we have scripted a version of the in-game benchmark. The in-game benchmark consists of five scenarios: four short panning shots with varying lighting and weather effects, and a fifth action sequence that lasts around 90 seconds. We use only the final part of the benchmark, which combines a flight scene in a jet followed by an inner city drive-by through several intersections followed by ramming a tanker that explodes, causing other cars to explode as well. This is a mix of distance rendering followed by a detailed near-rendering action sequence.

Grand Theft Auto V Performance

We processed the benchmark across various resolutions and quality settings (detailed here). The results are presented above. iGPU-equipped systems including the 4X4 BOX-4800U can only handle the game at Low quality settings.

Middle Earth: Shadow of War

Middle Earth: Shadow of War is an action RPG. In our previous gaming benchmarks suite, we used its prequel - Shadow of Mordor. Produced by Monolith and using the new LithTech Firebird engine and numerous detail add-ons, Shadow of War goes for detail and complexity. The graphics settings include standard options such as Graphical Quality, Lighting, Mesh, Motion Blur, Shadow Quality, Textures, Vegetation Range, Depth of Field, Transparency and Tessellation. There are standard presets as well. The game also includes a 'Dynamic Resolution' option that automatically alters graphics quality to hit a pre-set frame rate. We benchmarked the game at four different resolutions - 4K, 1440p, 1080p, and 720p. Two standard presets - Ultra and Medium - were used at each resolution after turning off the dynamic resolution option.

Middle Earth: Shadow of War Performance

The 4X4 BOX-4800U can handle the game at 720p with medium settings. Other resolutions / quality settings are not playable.

Far Cry 5

Ubisoft's Far Cry 5 is an action-adventure first-person shooter game released in March 2018. The game comes with an in-built benchmark and has standard pre-sets for quality settings. We benchmarked the game at four different resolutions - 720p, 1080p, 1440p, and 2160p. Two preset quality settings were processed at each resolution - normal and ultra.

Far Cry 5 Performance

The 4X4 BOX-4800U can handle the game at 720p even with ultra settings, but other resolution and quality combinations can't deliver a passable experience.
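
The steep drop-off beyond 720p is easier to appreciate by comparing the number of pixels rendered per frame at each of the tested resolutions. The sketch below is a first-order illustration only: pixel count is a rough proxy for GPU load, not a performance model, and the names are our own.

```python
# Render resolutions used in the gaming tests (width, height).
RESOLUTIONS = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "1440p": (2560, 1440),
    "2160p": (3840, 2160),
}

def pixel_count(name: str) -> int:
    """Return the number of pixels rendered per frame at the named resolution."""
    width, height = RESOLUTIONS[name]
    return width * height

base = pixel_count("720p")
for name in RESOLUTIONS:
    pixels = pixel_count(name)
    print(f"{name}: {pixels:,} pixels ({pixels / base:.2f}x 720p)")
```

Moving from 720p to 1080p already means 2.25 times as many pixels per frame, and 2160p means 9 times as many.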

From the perspective of iGPU gaming, Renoir scores well in games like Dota 2 which can also take advantage of the increase in CPU horsepower. Overall, the AMD iGPU emerges as the winner over the ones in the mini-PCs we evaluated before for actual gaming workloads. However, this was always on the cards given the performance of the previous AMD APUs in such scenarios.



SPECworkstation 3 Benchmark

The diminutive 4X4 BOX-4800U is not a workstation by any stretch of the imagination. As an academic exercise, however, it is interesting to look at how the system performs on a comparative basis across various professional applications. The SPECworkstation 3 benchmark measures workstation performance based on such applications. It includes more than 140 tests based on 30 different workloads that exercise the CPU, graphics, I/O and memory hierarchy. These workloads fall into different categories.

  • Media and Entertainment (3D animation, rendering)
  • Product Development (CAD/CAM/CAE)
  • Life Sciences (medical, molecular)
  • Financial Services
  • Energy (oil and gas)
  • General Operations
  • GPU Compute

Individual scores are generated for each test and a composite score for each category is calculated based on a reference machine (HP Z240 tower workstation using an Intel E3-1240 v5 CPU, an AMD Radeon Pro WX3100 GPU, 16GB of DDR4-2133, and a SanDisk 512GB SSD). The SPEC Ratio for the tests in each category is presented in the graphs below.
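
A ratio-based composite of this kind can be sketched as below. This assumes that each ratio is the reference machine's time divided by the measured time, and that the category composite is a geometric mean of the per-test ratios, which is the usual SPEC convention; the SPECworkstation documentation should be consulted for the exact rules. All names and timings here are hypothetical.

```python
import math

def spec_ratio(reference_seconds: float, measured_seconds: float) -> float:
    """A ratio above 1 means the system under test is faster than the reference machine."""
    return reference_seconds / measured_seconds

def composite(ratios: list[float]) -> float:
    """Geometric mean of per-test ratios (assumed aggregation method)."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical (reference, measured) timings in seconds, for illustration only.
timings = [(100.0, 50.0), (80.0, 160.0), (60.0, 60.0)]
ratios = [spec_ratio(ref, meas) for ref, meas in timings]
print(ratios)
print(round(composite(ratios), 3))
```

A geometric mean keeps one very fast or very slow test from dominating the composite, which is why a 2x win and a 2x loss cancel out in the sample data.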

Media and Entertainment

The Media and Entertainment category comprises workloads from five distinct applications:

  • The Blender workload measures system performance for content creation using the open-source Blender application. Tests include rendering of scenes of varying complexity using the OpenGL and ray-tracing renderers.
  • The Handbrake workload uses the open-source Handbrake application to transcode a 4K H.264 file into a H.265 file at 4K and 2K resolutions using the CPU capabilities alone.
  • The LuxRender workload benchmarks the LuxCore physically based renderer using LuxMark.
  • The Maya workload uses the SPECviewperf 13 maya-05 viewset to replay traces generated using the Autodesk Maya 2017 application for 3D animation.
  • The 3ds Max workload uses the SPECviewperf 13 3dsmax-06 viewset to replay traces generated by Autodesk's 3ds Max 2016 using the default Nitrous DX11 driver. The workload represents system usage for 3D modeling tasks.
SPECworkstation 3.0.4 - Media and Entertainment Workloads

All the workloads can take advantage of the extra cores available in the Renoir APU (8 vs. 6 in the Comet Lake-U), and that helps the 4X4 BOX-4800U score handsomely over the Frost Canyon NUC.

Product Development

The Product Development category comprises eight distinct workloads:

  • The Rodinia (CFD) workload benchmarks a computational fluid dynamics (CFD) algorithm.
  • The WPCcfd workload benchmarks another CFD algorithm involving combustion and turbulence modeling.
  • The CalculiX workload uses the Calculix finite-element analysis program to model a jet engine turbine's internal temperature.
  • The Catia workload uses the catia-05 viewset from SPECviewperf 13 to replay traces generated by Dassault Systemes' CATIA V6 R2012 3D CAD application.
  • The Creo workload uses the creo-02 viewset from SPECviewperf 13 to replay traces generated by PTC's Creo, a 3D CAD application.
  • The NX workload uses the snx-03 viewset from SPECviewperf 13 to replay traces generated by the Siemens PLM NX 8.0 CAD/CAM/CAE application.
  • The Solidworks workload uses the sw-04 viewset from SPECviewperf 13 to replay traces generated by Dassault Systemes' SolidWorks 2013 SP1 CAD/CAE application.
  • The Showcase workload uses the showcase-02 viewset from SPECviewperf 13 to replay traces from Autodesk’s Showcase 2013 3D visualization and presentation application.
SPECworkstation 3.0.4 - Product Development Workloads

In GPU-intensive workloads, the Renoir APU marches well ahead of CML-U. In the couple of cases where the focus is on the CPU performance, the Renoir APU ekes out a slender lead.

Life Sciences

The Life Sciences category comprises four distinct test sets:

  • The LAMMPS set comprises five tests simulating different molecular properties using the LAMMPS molecular dynamics simulator.
  • The NAMD set comprises three tests simulating different molecular interactions.
  • The Rodinia (Life Sciences) set comprises four tests - the Heartwall medical imaging algorithm, the Lavamd algorithm for calculation of particle potential and relocation in a 3D space due to mutual forces, the Hotspot algorithm to estimate processor temperature with thermal simulations, and the SRAD anisotropic diffusion algorithm for denoising.
  • The Medical workload uses the medical-02 viewset from SPECviewperf 13 to determine system performance for the Tuvok rendering core in the ImageVis3D volume visualization program.
SPECworkstation 3.0.4 - Life Sciences Workloads

Similar to what was seen in the product development workloads, the Renoir APU marches well ahead of CML-U in the GPU-intensive tasks. In the couple of cases where the focus is on the CPU performance, the Renoir APU is still ahead.

Financial Services

The Financial Services workload set benchmarks the system for three popular algorithms used in the financial services industry - the Monte Carlo probability simulation for risk assessment and forecast modeling, the Black-Scholes pricing model, and the Binomial Options pricing model.

SPECworkstation 3 - Financial Services

This workload is again able to take good advantage of available cores in the system, enabling the 4X4 BOX-4800U to provide more than double the performance of the Frost Canyon NUC.

Energy

The Energy category comprises workloads simulating various algorithms used in the oil and gas industry:

  • The FFTW workload computes discrete Fourier transforms of large matrices.
  • The Convolution workload computes the convolution of a random 100x100 filter on a 400 megapixel image.
  • The SRMP workload processes the Surface-Related Multiples Prediction algorithm used in seismic data processing.
  • The Kirchhoff Migration workload processes an algorithm to calculate the back propagation of a seismic wavefield.
  • The Poisson workload takes advantage of the OpenMP multi-processing framework to solve Poisson's equation.
  • The Energy workload uses the energy-02 viewset from SPECviewperf 13 to determine system performance for the open-source OpendTect seismic visualization application.
SPECworkstation 3.0.4 - Energy Industry Workloads

Most workloads are dependent on single-threaded performance here, with the 4X4 BOX-4800U trailing the Frost Canyon NUC in those cases.

General Operations

In the General Operations category, the focus is on workloads from widely used applications in the workstation market:

  • The 7zip workload represents compression and decompression operations using the open-source 7zip file archiver program.
  • The Python workload benchmarks math operations using the numpy and scipy libraries along with other Python features.
  • The Octave workload performs math operations using the Octave programming language used in scientific computing.
  • The Storage workload evaluates the performance of the underlying storage device using transaction traces from multiple workstation applications.
SPECworkstation 3.0.4 - General Operations

The match-up is fairly even for general operations, with the 4X4 BOX-4800U ahead in some workloads and the Frost Canyon NUC ahead in others (depending on whether the workload is limited by single-threaded performance or is able to take advantage of a large number of cores).

GPU Compute

In the GPU Compute category, the focus is on workloads taking advantage of the GPU compute capabilities using either OpenCL or CUDA, as applicable:

  • The LuxRender benchmark is the same as the one seen in the media and entertainment category.
  • The Caffe benchmark measures the performance of the Caffe deep-learning framework.
  • The Folding@Home benchmark measures the performance of the system for distributed computing workloads focused on tasks such as protein folding and drug design.
SPECworkstation 3.0.4 - GPU Compute

The 4X4 BOX-4800U managed to process all of the GPU compute benchmarks, but the Frost Canyon NUC couldn't process the Caffe and Folding@Home tasks. As expected, the GPU compute performance of the AMD iGPU is better in the single workload where a comparison could be made.



Miscellaneous Performance Metrics

This section looks at some of the other commonly used benchmarks representative of the performance of specific real-world applications.

3D Rendering - CINEBENCH

We use CINEBENCH R23 for 3D rendering evaluation, but continue to present R15 results till we build up a database of R23 results. The R15 program provides three benchmark modes - OpenGL, single threaded and multi-threaded. Evaluation of different PC configurations in all three modes provided us the following results.

3D Rendering - CINEBENCH R15 - Single Thread

3D Rendering - CINEBENCH R15 - Multiple Threads

3D Rendering - CINEBENCH R15 - OpenGL

The 4X4 BOX-4800U is ahead of the Frost Canyon NUC across the board, though the single-threaded performance is quite close for this particular rendering workload.

3D Rendering - CINEBENCH R23 - Single Thread

3D Rendering - CINEBENCH R23 - Multiple Threads

x265 Benchmark

Next up, we have some video encoding benchmarks using x265 v2.8. The appropriate encoder executable is chosen based on the supported CPU features. In the first case, we encode 600 1080p YUV 4:2:0 frames into a 1080p30 HEVC Main-profile compatible video stream at 1 Mbps and record the average number of frames encoded per second.

Video Encoding - x265 - 1080p

Our second test case is 1200 4K YUV 4:2:0 frames getting encoded into a 4Kp60 HEVC Main10-profile video stream at 35 Mbps. The encoding FPS is recorded.

Video Encoding - x265 - 4K 10-bit

The workload is a perfect fit for parallelizing and executing across multiple cores, and the 4X4 BOX-4800U emerges as the comfortable leader, thanks to its octa-core nature.
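
The two x265 test cases described above can be put in perspective with some simple arithmetic on clip duration, approximate output size, and how close a given encoding speed comes to real time. The sketch below uses the frame counts, frame rates, and bitrates from the text; the helper names are our own, the size estimate assumes the target bitrate is hit on average, and the sample encoder speed is hypothetical rather than a measured result.

```python
def clip_seconds(frames: int, fps: float) -> float:
    """Playback duration of a clip."""
    return frames / fps

def output_megabytes(bitrate_mbps: float, seconds: float) -> float:
    """Approximate stream size at a constant average bitrate (1 MB = 8 Mbit)."""
    return bitrate_mbps * seconds / 8

def realtime_factor(encode_fps: float, playback_fps: float) -> float:
    """Values of 1 or more mean the encoder keeps up with real-time playback."""
    return encode_fps / playback_fps

# Test cases from the text: (frames, playback fps, target bitrate in Mbps)
cases = {"1080p30 Main": (600, 30, 1), "4Kp60 Main10": (1200, 60, 35)}
for name, (frames, fps, mbps) in cases.items():
    secs = clip_seconds(frames, fps)
    print(f"{name}: {secs:.0f} s of video, ~{output_megabytes(mbps, secs):.1f} MB")

# Hypothetical encoder speed, for illustration only (not a measured result).
print(realtime_factor(15.0, 30))
```

Both clips work out to 20 seconds of video, so the reported frames per second can be compared directly against the 30 fps and 60 fps playback rates.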

7-Zip

7-Zip is a very effective and efficient compression program, often beating out OpenCL accelerated commercial programs in benchmarks even while using just the CPU power. 7-Zip has a benchmarking program that provides tons of details regarding the underlying CPU's efficiency. In this subsection, we are interested in the compression and decompression rates when utilizing all the available threads for the LZMA algorithm.

7-Zip LZMA Compression Benchmark

7-Zip LZMA Decompression Benchmark

Compression and decompression can also be easily optimized for multi-core processors, and that is evident in the benchmark results above.

Cryptography Benchmarks

Cryptography has become an indispensable part of our interaction with computing systems. Almost all modern systems have some sort of hardware-acceleration for making cryptographic operations faster and more power efficient. In this sub-section, we look at two different real-world applications that may make use of this acceleration.

BitLocker is a Windows feature that encrypts entire disk volumes. While drives that offer encryption capabilities are dealt with using that feature, most legacy systems and external drives have to use the host system implementation. Windows has no direct benchmark for BitLocker. However, we cooked up a BitLocker operation sequence to determine the adeptness of the system at handling BitLocker operations. We start off with a 2.5GB RAM drive in which a 2GB VHD (virtual hard disk) is created. This VHD is then mounted, and BitLocker is enabled on the volume. Once the BitLocker encryption process gets done, BitLocker is disabled. This triggers a decryption process. The times taken to complete the encryption and decryption are recorded. This process is repeated 25 times, and the average of the last 20 iterations is graphed below.

BitLocker Encryption Benchmark

BitLocker Decryption Benchmark

We see the Frost Canyon NUC providing faster encryption support compared to the 4X4 BOX-4800U.
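The averaging step in the BitLocker routine (25 runs, with only the last 20 counted) amounts to discarding the first five iterations before taking the mean. A minimal sketch, assuming the per-iteration times have already been collected into a list; the function name and the sample values are hypothetical:

```python
from statistics import mean

def steady_state_average(times_s, total_runs=25, counted_runs=20):
    """Average the last `counted_runs` timings out of `total_runs` iterations.

    The earlier iterations are dropped, matching the 'average of the last
    20 iterations' described in the text.
    """
    if len(times_s) != total_runs:
        raise ValueError(f"expected {total_runs} timings, got {len(times_s)}")
    return mean(times_s[-counted_runs:])

if __name__ == "__main__":
    # Made-up timings: five slower early runs followed by twenty steady runs.
    sample = [12.0] * 5 + [10.0] * 20
    print(steady_state_average(sample))
```

The same function would be applied separately to the encryption and decryption timings.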

Creation of secure archives is best done through the use of AES-256 as the encryption method while password protecting ZIP files. We re-use the benchmark mode of 7-Zip to determine the AES256-CBC encryption and decryption rates using pure software as well as AES-NI. Note that the 7-Zip benchmark uses a 48KB buffer for this purpose.

7-Zip AES256-CBC Encryption Benchmark

7-Zip AES256-CBC Decryption Benchmark

When it comes to password-protected archives, the 4X4 BOX-4800U comes out on top easily.

Yet another cryptography application is secure network communication. OpenSSL can take advantage of the acceleration provided by the host system to make operations faster. It also has a benchmark mode that can use varying buffer sizes. We recorded the processing rate for an 8KB buffer using the hardware-accelerated AES256-CBC-HMAC-SHA1 feature.

OpenSSL Encryption Benchmark

OpenSSL Decryption Benchmark

Cryptography engines across all eight cores enable the 4X4 BOX-4800U to score much better than the hexa-core Frost Canyon NUC.

Agisoft PhotoScan

Agisoft PhotoScan is a commercial program that converts 2D images into 3D point maps, meshes and textures. The program designers sent us a command line version in order to evaluate the efficiency of various systems that go under our review scanner. The command line version has two benchmark modes, one using the CPU and the other using both the CPU and GPU (via OpenCL). We present the results from our evaluation using the CPU mode only. The benchmark (v1.3) takes 84 photographs and does four stages of computation:

  • Stage 1: Align Photographs (capable of OpenCL acceleration)
  • Stage 2: Build Point Cloud (capable of OpenCL acceleration)
  • Stage 3: Build Mesh
  • Stage 4: Build Textures

We record the time taken for each stage. Since various elements of the software are single threaded, and others multithreaded, it is interesting to record the effects of CPU generations, speeds, number of cores, and DRAM parameters using this software.

Agisoft PhotoScan Benchmark - Stage 1

Agisoft PhotoScan Benchmark - Stage 2

Agisoft PhotoScan Benchmark - Stage 3

Agisoft PhotoScan Benchmark - Stage 4

The 4X4 BOX-4800U scores better in three of the four stages, with the third stage seeing the Frost Canyon NUC edge ahead by less than 15 seconds. Overall, the advantage lies with the Renoir system.

Dolphin Emulator

Wrapping up our application benchmark numbers is the new Dolphin Emulator (v5) benchmark mode results. This is again a test of the CPU capabilities.

Dolphin Emulator Benchmark

The emulator is bottlenecked by single-threaded performance, and the Frost Canyon NUC is able to outperform the 4X4 BOX-4800U for this workload.



Storage and Networking Performance

Storage and networking are two major aspects which influence our experience with any computing system. This section presents results from our evaluation of these aspects in the ASRock 4X4 BOX-4800U. On the storage side, one option would be repetition of our strenuous SSD review tests on the drive(s) in the PC. Fortunately, to avoid that overkill, PCMark 8 has a storage bench where certain common workloads such as loading games and document processing are replayed on the target drive. Results are presented in two forms, one being a benchmark number and the other, a bandwidth figure. We ran the PCMark 8 storage bench on selected PCs and the results are presented below.

Futuremark PCMark 8 Storage Bench - Score

Futuremark PCMark 8 Storage Bench - Bandwidth

The 1TB Crucial P5 SSD in the Frost Canyon NUC manages to provide higher storage bandwidth compared to the 512GB DRAM-less Patriot P300 in the 4X4 BOX-4800U, but the storage subsystem scores are fairly close to each other.

On the networking side, we restricted ourselves to the evaluation of the WLAN component. Our standard test router is the Netgear Nighthawk AX12 RAX120 configured with both 2.4 GHz and 5 GHz networks. The router is placed approximately 11 ft. away with a direct line-of-sight to the PC under test. A wired client (Zotac MI553, with an Akitio T3-10G NBASE-T Thunderbolt 3 adapter) is connected to the 5GbE port of the RAX120 and serves as one endpoint for iperf evaluation.

We first left the 5GHz network at default (meaning, no DFS), and the 4X4 BOX-4800U connected with the following parameters,

A script to run iPerf3 with 1, 2, 4, 8, and 16 parallel streams between the 4X4 BOX-4800U and the Zotac ZBOX MI553 was processed - the first set for TX alone, followed by another set for RX, and finally a third set with bidirectional traffic.
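A test sweep of this kind could be generated along the following lines. This is a sketch under assumptions rather than the script actually used: the server hostname and the per-run duration are placeholders, and the bidirectional mode relies on the `--bidir` option, which is only available in newer iPerf3 releases.

```python
# Illustrative sketch; the hostname and duration are hypothetical placeholders.

STREAM_COUNTS = (1, 2, 4, 8, 16)

# Extra iperf3 options per traffic direction, as seen from the PC under test.
MODES = {
    "tx": [],               # client sends to the server
    "rx": ["--reverse"],    # server sends to the client
    "bidir": ["--bidir"],   # simultaneous traffic in both directions
}

def iperf3_commands(server, duration_s=30):
    """Return (mode, streams, argv) tuples: all TX runs, then RX, then bidirectional."""
    runs = []
    for mode, extra in MODES.items():
        for streams in STREAM_COUNTS:
            argv = [
                "iperf3",
                "--client", server,
                "--parallel", str(streams),
                "--time", str(duration_s),
                "--json",
                *extra,
            ]
            runs.append((mode, streams, argv))
    return runs

if __name__ == "__main__":
    for mode, streams, argv in iperf3_commands("zbox-mi553.local"):
        print(mode, streams, " ".join(argv))
```

Each command line can then be handed to `subprocess.run` on the PC under test, with the wired client running `iperf3 --server`.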

The RAX120 can be explicitly configured to connect over a DFS channel. This works in the absence of any radar presence in the vicinity. In such a scenario, the 4X4 BOX-4800U connected with the following parameters,

The iPerf3 script was processed again and delivered the following result.

With DFS support, we can expect around 1.28 Gbps of best-case throughput via the AX200 in the 4X4 BOX-4800U. The table below presents the iPerf3 benchmark results obtained in the above testing scenario.

Wireless Bandwidth - TCP Traffic (iPerf3 Throughput in Gbps)

Stream Count     80 MHz Wi-Fi 6 (Non-DFS)     160 MHz Wi-Fi 6 (DFS)
                 TX          RX               TX          RX

TX only
1                0.897       -                1.259       -
2                0.893       -                1.242       -
4                0.868       -                1.286       -
8                0.862       -                1.299       -
16               0.876       -                1.294       -

RX only
1                -           0.695            -           0.908
2                -           0.697            -           1.027
4                -           0.698            -           1.154
8                -           0.701            -           1.211
16               -           0.701            -           1.262

Bidirectional
1                0.652       0.153            1.183       0.045
2                0.612       0.208            1.020       0.185
4                0.451       0.342            1.089       0.152
8                0.207       0.541            0.940       0.298
16               0.403       0.383            0.815       0.421

The numbers presented above are slightly lower than the average segment bandwidths noted, as the data in the graph is computed from the network interface's counters, while iPerf reports results based only on the traffic it sends itself.



HTPC Credentials - I

The ASRock 4X4 BOX-4800U comes with four display outputs. All of them support 4Kp60, with the HDMI port providing support for HDR as well as HDCP 2.2. Unfortunately, UHD Blu-rays can't be played back on the system, though HDCP 2.2 enables other protected content to be displayed at full resolution. The system also supports HD audio bitstreaming over the HDMI port.

Our evaluation of the 4X4 BOX-4800U as a HTPC was done using the native HDMI output connected to a TCL 55P607 4K HDR TV via a Denon AVR-X3400H AV receiver.

We tested out various display refresh rates ranging from 23.976 Hz to 59.94 Hz. Of particular interest is the 23.976 Hz (23p) setting, which Intel used to have trouble with in the pre-Broadwell days.

The gallery below presents screenshots from the other refresh rates that were tested. The system has no trouble maintaining a fairly accurate refresh rate throughout the duration of the video playback.
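As a rough illustration of why refresh rate accuracy matters for 23p content, the interval between forced frame repeats or drops can be estimated from the mismatch between the display refresh rate and the content frame rate. The sketch below is our own arithmetic and not part of the test procedure; the function name is hypothetical.

```python
from fractions import Fraction

# "23.976 fps" film content is nominally 24000/1001 frames per second.
FILM_RATE = Fraction(24000, 1001)

def seconds_between_glitches(display_hz, content_fps=FILM_RATE):
    """Seconds until the display and content drift apart by one full frame.

    `display_hz` may be an int, a Fraction, or a decimal string such as "23.976".
    Returns None when the two rates match exactly.
    """
    drift = abs(Fraction(display_hz) - Fraction(content_fps))
    if drift == 0:
        return None
    return 1 / drift

if __name__ == "__main__":
    # A display running at exactly 24 Hz against 24000/1001 fps content.
    print(float(seconds_between_glitches(24)))
    # A display running at exactly 23.976 Hz.
    print(float(seconds_between_glitches("23.976")))
```

With a display locked to exactly 24 Hz, the drift adds up to one full frame roughly every 41.7 seconds, which is why a refresh rate close to 23.976 Hz is desirable for film content.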

Our HTPC testing with respect to YouTube had been restricted to playback of a 1080p music video using the native HTML5 player in Firefox. The move to 4K, and the need to evaluate HDR support have made us choose Mystery Box's Peru 8K HDR 60FPS video as our test sample moving forward. On PCs running Windows, it is recommended that HDR streaming videos be viewed using the Microsoft Edge browser after putting the desktop in HDR mode.

The system is supposed to support 4Kp60 VP9 Profile 2 playback, and MS Edge automatically gets the correct encode from the YouTube servers. However, the playback was punctuated by frequent dropped frames as shown in the statistics segment of the above screenshot.

Various metrics of interest such as GPU usage and at-wall power consumption were recorded for the first four minutes of the playback of the above video. The numbers are graphed below.

The GPU and decoder utilization numbers spiking close to 100% in the above graph explain the dropped frames. Interestingly, as we shall see in the next section, local playback of VP9 Profile 2 videos didn't result in any issues.



HTPC Credentials - II

Evaluation of local media playback and video processing is done by playing back files encompassing a range of relevant codecs, containers, resolutions, and frame rates. A note of the efficiency is also made by tracking GPU usage and power consumption of the system at the wall. Users have their own preference for the playback software / decoder / renderer, and our aim is to have numbers representative of commonly encountered scenarios. Towards this, we played back the test streams using the following combinations:

  • MPC-HC x64 1.8.5 + LAV Video Decoder (DXVA2 Native) + Enhanced Video Renderer - Custom Presenter (EVR-CP)
  • VLC 3.0.8
  • Kodi 18.9

The fourteen test streams (each of 90s duration) were played back from the local disk with an interval of 30 seconds in-between. Various metrics including GPU usage and at-wall power consumption were recorded during the course of this playback. Prior to looking at the metrics, a quick summary of the decoding capabilities of the integrated Radeon GPU is useful to have for context.

On paper, the GPU should be able to play back all codecs with hardware acceleration (except for AV1).

All our playback tests were done with the desktop HDR setting turned on. It is possible for certain system configurations to have madVR automatically turn on/off the HDR capabilities prior to the playback of an HDR video, but we didn't take advantage of that in our testing.

VLC and Kodi

VLC is the playback software of choice for the average PC user who doesn't need a ten-foot UI. Its install-and-play simplicity has made it extremely popular. Over the years, the software has gained the ability to take advantage of various hardware acceleration options. Kodi, on the other hand, has a ten-foot UI, making it the perfect open-source software for dedicated HTPCs. Support for add-ons makes it very extensible and capable of customization. We played back our test files using the default VLC and Kodi configurations, and recorded the following metrics.

Video Playback Efficiency - VLC and Kodi

VLC had trouble with the interlaced VC-1 clip, and there was no hardware acceleration for AV1. Kodi was flawless all through, though the 8Kp60 AV1 clip ended up consuming a lot of power with both players.

MPC-HC

MPC-HC offers an easy way to test out different combinations of decoders and renderers. The configuration we evaluated is the default post-install scenario, with only the in-built LAV Video Decoder forced to DXVA2 Native mode. The metrics collected during the playback of the test files using the above three configurations are presented below.

We usually attempt usage of madVR, but activating the filter resulted in some glitches. In any case, usage of madVR with integrated GPUs is not advisable. Similar to Kodi, the MPC-HC + EVR-CP combination makes good use of the hardware acceleration capabilities of the GPU to achieve satisfactory playback across all the tested codecs and resolutions. Hardware acceleration allows the system to never exceed 40W at the wall even for streams with high frame rates and large resolutions.



Power Consumption and Thermal Performance

The power consumption at the wall was measured with a 4K display being driven through the HDMI port. In the graphs below, we compare the idle and load power of the ASRock 4X4 BOX-4800U with other low power PCs evaluated before. For load power consumption, we ran the AIDA64 System Stability Test with various stress components, and noted the maximum sustained power consumption at the wall.

Idle Power Consumption

The idle power of 10.45W is a tad too high compared to the Intel NUCs. The peak power consumption, however, is low compared to other systems.

Our thermal stress routine starts with the system at idle, followed by four stages of different system loading profiles using the AIDA64 System Stability Test (each of 30 minutes duration). In the first stage, we stress the CPU, caches and RAM. In the second stage, we add the GPU to the above list. In the third stage, we stress the GPU standalone. In the final stage, we stress all the system components (including the disks). Beyond this, we leave the unit idle in order to determine how quickly the various temperatures in the system can come back to normal idling range. The various clocks, temperatures and power consumption numbers for the system during the above routine are presented in the graphs below.

ASRock 4X4 BOX-4800U System Loading with the AIDA64 System Stability Test

The frequencies stay above the base value (1.8 GHz) advertised. Being actively cooled, the temperature of the package doesn't exceed 95C. The key is the package power - for CPU alone, the steady state is around 15W. With the GPU in the mix, it goes up to around 20W (though instantaneous values go as high as 30W for very short bursts).

ASRock 4X4 BOX-4800U System Loading with Prime95 and Furmark

The artificial power virus test of both Prime95 and Furmark results in the package temperature going as high as 100C. According to the official specifications, the maximum permitted temperature of the Ryzen 7 4800U is 105C. The thermal solution is able to keep it below that number, allowing the processor to deliver its advertised performance in a sustained manner.



Concluding Remarks

NUCs such as the 4X4 BOX series from ASRock Industrial have enabled AMD to participate in the burgeoning UCFF PC market. ASRock Industrial has quickly learnt from its mistakes in the 4X4 BOX-V1000M - the 4X4 BOX-4000 series brings support for M.2 2280 PCIe 3.0 x4 NVMe SSDs (compared to the M.2 2242-only support in the previous generation). The WLAN component is also the best that is available in this particular form-factor (2x2 Wi-Fi 6 compared to the 1x1 Wi-Fi 5 card in the previous generation). Reducing the physical footprint of the system is also welcome. The fan curve / noise profile has also improved.

The key face-off for the 4X4 BOX-4000 series is against systems based on Comet Lake-U. Tiger Lake-U is on the way, but no UCFF system based on TGL-U is currently available for purchase. To get a good idea of how Renoir compares against Comet Lake, we went back to our Frost Canyon NUC sample and revamped its internals to match the ASRock-suggested 4X4 BOX-4800U configuration. We replaced the 16GB DDR4 SODIMMs with 64GB DDR4-2666 SODIMMs (maximum supported frequency in CML-U) and the 256GB Kingston A1000 with a 1TB Crucial P5 SSD. On the pricing front, the two systems end up costing almost the same when the storage and RAM are also considered. The preceding pages presented benchmarks that are essentially an apples-to-apples comparison.

The user experience with SFF desktops relies on multiple pillars - single-threaded performance, multi-threaded performance, energy efficiency, and last but not least, driver/software support. In the last few years, Intel has stalled a bit in delivering improvements in these pillars from one generation to the next. With the first-generation Ryzen, we saw AMD tackling the multi-threaded performance aspect with aplomb. Zen 2 has delivered updates across the first three of those pillars - single-threaded performance improvement is good enough to actually challenge CML-U across a large number of workloads within the power envelope dictated by the form factor of NUC-like systems. The 7nm fabrication process has delivered power efficiency gains, though it still doesn't match Intel's when it comes to race to idle (as shown by the gulf in the idle power numbers for the Frost Canyon NUC and the 4X4 BOX-4800U). It is in the drivers/software segment that AMD gives us cause for complaint. For example, we faced issues playing back YouTube HDR content in MS Edge (with Windows 10 20H2 and the latest AMD drivers), and madVR usage resulted in playback issues. Both of these might well turn out to be application bugs over which AMD may not have control. But that is scant consolation for the end-user. It is an unfortunate fact that most QA is done on Intel-based systems, leaving the experience with AMD systems a little less than ideal. Hopefully, with AMD gaining market share, these types of software compatibility issues become a thing of the past.

 

On the pricing front, the barebones version of the 4X4 BOX-4800U is available for $600 - it pretty much matches the launch price of the Frost Canyon NUC10i7FNH. For the same price, the Renoir NUC surpasses the CML-U system by including support for NBASE-T with a 2.5 Gbps LAN port (backed by the Realtek RTL8125BG controller), native support for DDR4-3200 without overclocking, and support for four simultaneous 4Kp60 display outputs. Intel will be playing catch-up with TGL-U here, but at the moment, the 4X4 BOX-4000 series wins the features-per-dollar battle. Given the benchmark numbers we have just seen, the performance-per-dollar metric is also firmly in favor of the 4X4 BOX-4800U. On the performance-per-watt front, there is still scope for improvement. Overall, the ASRock 4X4 BOX-4800U has given us the opportunity to finally evaluate an AMD NUC that can go head-to-head against Intel's current flagship in the same market segment. This has been a remarkable turnaround for AMD. The renewed competition in this market is also excellent news for consumers.
