Original Link: https://www.anandtech.com/show/16905/hot-chips-2021-live-blog-new-tech-infineon-edgeq-samsung



08:22PM EDT - Welcome to Hot Chips! This is the annual conference all about the latest, greatest, and upcoming big silicon that gets us all excited. Stay tuned during Monday and Tuesday for our regular AnandTech Live Blogs.

08:22PM EDT - Going to start here in about 10 minutes

08:30PM EDT - Should be just about to start

08:32PM EDT - First up is Infineon

08:32PM EDT - Next gen automotive challenges

08:33PM EDT - Let's go climb a mountain

08:33PM EDT - Literally drive up a mountain!

08:34PM EDT - Evolving technologies - Battery, Sensing, AI

08:35PM EDT - Adaptable architectures with high availability without any legacy impact

08:35PM EDT - Machine Learning - workload specific compute

08:35PM EDT - fast security accelerators for authentication

08:36PM EDT - E-architecture evolution

08:36PM EDT - Connectivity - logical attacks, spoofing - any connection out is an attack vector

08:36PM EDT - Need fail-safe system

08:38PM EDT - Moving towards future architectures with an ethernet backbone and a central computer

08:38PM EDT - Also helps reduce cost

08:38PM EDT - Infineon Aurix and Tricore architecture

08:38PM EDT - Designed around two decades ago - Tricore

08:38PM EDT - Aurix in production since 2015, Tricore since 1995

08:39PM EDT - Adding modern features as time goes on

08:39PM EDT - 500 MHz in latest gen

08:39PM EDT - new accelerators - parallel processing, enhanced DSPs

08:40PM EDT - ASIL D safety, security standards

08:40PM EDT - Hardware isolation at the core level, 8 VMs per core and Hypervisor

08:40PM EDT - Fine granular access protection, DMA protection

08:41PM EDT - 2 x 5 Gbit ethernet, accelerated MACsec support, hardware acceleration for encryption

08:41PM EDT - two PCIe 3.0 x1 lanes

08:42PM EDT - Full CPU architecture layout

08:42PM EDT - six cores at 500 MHz

08:42PM EDT - Debug and Trace

08:42PM EDT - SIMD Vector DSP and Scalar core

08:42PM EDT - ARC EV71FS Parallel Processing Unit

08:43PM EDT - Software stack

08:44PM EDT - Security - Security cluster

08:57PM EDT - Supports automotive encryption, intrusion detection, physical or digital

08:57PM EDT - Sorry, Internet cut out for 10 minutes, ISP went borked

08:58PM EDT - Just in the Q&A section now of this talk. Going to cut losses, and just wait for the next talk in 2 minutes

09:02PM EDT - Second talk is EdgeQ - Open RISC-V 5G Radio Access Networks

09:03PM EDT - One of the emerging companies

09:04PM EDT - First software programmable SoC for AI and 5G

09:04PM EDT - 5G basestation on a single chip

09:04PM EDT - 50+ SoCs launched, 2 billion modems shipped, $100b revenue generated

09:05PM EDT - Was in stealth until end of last year

09:05PM EDT - Next Generation RAN

09:06PM EDT - Banding for 5G is important

09:06PM EDT - Progression of 5G RAN over time

09:07PM EDT - OpenRAN using Off-the-shelf hardware

09:07PM EDT - Migration to a cloud native model

09:08PM EDT - Central Unit, Distributed Unit, Radio Unit

09:08PM EDT - Signal processing

09:09PM EDT - Requires scheduling of users

09:09PM EDT - Multiple RUs to one central unit

09:10PM EDT - DU is a hybrid architecture - mixed special hardware or general hardware

09:10PM EDT - What's needed is the open interfaces between each section

09:11PM EDT - 5G programmable baseband DSP

09:12PM EDT - one EdgeQ is in the Radio Unit

09:12PM EDT - Distributed Unit has multiple EdgeQ chips for signal processing

09:13PM EDT - Developing a converged SoC

09:13PM EDT - Need a programmable DSP engine

09:14PM EDT - RISC-V with 50+ custom instructions

09:14PM EDT - eight-core Arm Neoverse CPU subsystem

09:14PM EDT - Accelerators, IO subsystem, PCIe, USB, Ethernet

09:14PM EDT - GNU Tool Chain

09:14PM EDT - Massively parallel

09:17PM EDT - Supports multiple configurations and is software upgradeable

09:17PM EDT - beamforming, other intense operations

09:18PM EDT - gang up to 4 chips for up to 40 Gbps
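
Quick back-of-the-envelope on that figure, assuming throughput scales linearly with chip count (the talk did not state this, so treat the per-chip number as an estimate):

```python
# Assumption: throughput scales linearly across ganged chips.
total_gbps = 40   # quoted maximum for the ganged configuration
max_chips = 4     # quoted maximum number of chips

per_chip_gbps = total_gbps / max_chips
print(f"~{per_chip_gbps:.0f} Gbps per chip under linear scaling")
```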

09:18PM EDT - Life of a packet within a chip

09:20PM EDT - 'Profound disruption in 5G and ORAN'

09:20PM EDT - Sampling Now

09:21PM EDT - Q&A time

09:22PM EDT - Q: Process Node - A: Not disclosing publicly, but TSMC FinFET

09:22PM EDT - Q: Neoverse cores? A: E1, at 2 GHz

09:23PM EDT - Q: TDP range? A: Not disclosing. Base station unprecedented. Power is low. Very competitive for this implementation. Maybe in the teens

09:23PM EDT - Q: RISC-V and Arm, What's the RISC-V base? A: Licensed IP from Andes, but functionality is custom

09:30PM EDT - Samsung time

09:30PM EDT - HBM2-PIM

09:32PM EDT - Been working with vendors on PIM for a while

09:33PM EDT - What is PIM - rather than move data to the CPU or accelerator for basic operations, do it right in memory
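
A toy sketch of why that matters: count the bytes that cross the memory bus for an element-wise add done on the host versus in memory. The FP16 operand size, the 64-byte command size, and both function names are my own assumptions for illustration, not figures from the talk.

```python
# Toy cost model for c[i] = a[i] + b[i]; not Samsung's actual design.
BYTES_PER_ELEMENT = 2  # assumption: FP16 operands


def host_side_traffic(n_elements: int) -> int:
    """Host does the add: a and b are read over the bus, c is written back."""
    reads = 2 * n_elements * BYTES_PER_ELEMENT
    writes = n_elements * BYTES_PER_ELEMENT
    return reads + writes


def in_memory_traffic(n_elements: int, command_bytes: int = 64) -> int:
    """Memory does the add: only a command crosses the bus.

    The command size is a made-up constant and does not depend on n_elements.
    """
    return command_bytes


n = 1_000_000
print(f"host-side: {host_side_traffic(n)} bytes over the bus")
print(f"in-memory: {in_memory_traffic(n)} bytes over the bus")
```

The model ignores everything except bus traffic, but it shows the basic appeal for memory-bound workloads.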

09:33PM EDT - PIM proof of concept is difficult, only Samsung so far

09:34PM EDT - Designed to be inserted into current solutions

09:34PM EDT - Expanding the pyramid of storage

09:35PM EDT - Aquabolt-XL, system level 1st gen PIM memory based on HBM2 Aquabolt

09:36PM EDT - Memory bound workloads, such as AI

09:37PM EDT - or perhaps crypto?

09:37PM EDT - 2x system performance at 70% energy

09:38PM EDT - PIM unit has 3 units

09:38PM EDT - FP16 SIMD, controller, and register files

09:39PM EDT - No additional timing impact on memory

09:39PM EDT - Still Samsung specific, working with JEDEC for proper spec

09:40PM EDT - Works by using current signalling techniques with no overhead

09:41PM EDT - Use PIM library replacements for AI and recompile

09:41PM EDT - Python, BLAS, GEMM
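
Roughly what a library replacement could look like from the application side. This is a hypothetical sketch: `gemm_pim_stub` is a name I made up, and no real PIM API is called. The point is only that the call site stays the same while the backend changes.

```python
# Hypothetical drop-in GEMM backend swap; pure Python, no PIM hardware involved.
from typing import Callable, List

Matrix = List[List[float]]


def gemm_reference(a: Matrix, b: Matrix) -> Matrix:
    """Plain host-side matrix multiply."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def gemm_pim_stub(a: Matrix, b: Matrix) -> Matrix:
    """Stand-in for a PIM-backed GEMM with the same signature and result.

    A real library would hand the work to the memory; this stub just
    falls back to the reference implementation.
    """
    return gemm_reference(a, b)


# The application only ever calls `gemm`; swapping the backend is one line.
gemm: Callable[[Matrix, Matrix], Matrix] = gemm_pim_stub

result = gemm([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]])
print(result)
```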

09:42PM EDT - PIM execution blocks

09:42PM EDT - HBM2 8Hi stack has 4 PIM + 4 HBM dies

09:42PM EDT - Bandwidth is 1.23 TB/s off-chip and 4.92 TB/s on-chip for compute
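
Putting the two numbers side by side, reading 1.23 TB/s as the off-chip figure and 4.92 TB/s as the on-chip one (my reading of the slide):

```python
# Assumption: 1.23 TB/s is off-chip bandwidth, 4.92 TB/s is on-chip.
off_chip_tbps = 1.23
on_chip_tbps = 4.92

ratio = on_chip_tbps / off_chip_tbps
print(f"on-chip / off-chip = {ratio:.1f}x")
```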

09:43PM EDT - Synthetic benchmark testing

09:44PM EDT - Best performance gain on batch 1

09:44PM EDT - +5.4% power compared to regular HBM

09:45PM EDT - Is that iso-capacity?

09:46PM EDT - Evaluation with reduced power overall

09:46PM EDT - Reduced overall system power and execution time

09:46PM EDT - Natural Language Processing

09:47PM EDT - Xilinx model with HBM2-PIM, coming September ?

09:47PM EDT - U280+PIM test results

09:48PM EDT - Neural Networks

09:48PM EDT - 3.4x perf/watt

09:49PM EDT - Can also be applied to LPDDR5, such as LPDDR5X-6400

09:49PM EDT - based on simulation results

09:50PM EDT - Camera use cases

09:51PM EDT - DIMM level PIM

09:51PM EDT - DDR4/DDR5 compatible

09:51PM EDT - Requires a buffer

09:52PM EDT - AXDIMM buffer

09:52PM EDT - Evaluation system

09:53PM EDT - Those add-in boards look fun

09:53PM EDT - can I have one

09:53PM EDT - PoC on a Broadwell server

09:54PM EDT - GDDR6 and HBM3 in the future

09:55PM EDT - HBM3 will have FP16 and FP32, currently only INT8 and INT16

09:55PM EDT - Trying to introduce JEDEC standard with HBM3 by initial spec at end of year

09:55PM EDT - Q&A time

09:56PM EDT - Q: How does PIM manage coherence with host? A: Memory regions used for offload will not be cached, but those applications have low data reusability

09:58PM EDT - Q: Does software need to know HBM-PIM is there? A: Yes need to recompile

10:01PM EDT - Q: Is the +5.4% power iso-capacity? A: Not answered, evaded

10:02PM EDT - That's all for today! Come back tomorrow!
