Operandi - Wednesday, May 27, 2020 - link
Ok..... what's an NPU?
psychobriggsy - Wednesday, May 27, 2020 - link
Neural Processing Unit / AI accelerator
PeachNCream - Wednesday, May 27, 2020 - link
A marketing buzzword. It's like saying your company "leverages cloud computing" instead of saying, "We're buying computer resources from a third party."
kaidenshi - Wednesday, May 27, 2020 - link
Except in this case, it's accurate. NPUs are designed for AI work, just as GPUs are designed for graphics processing. I get the pearl-clutching over "the cloud" and the terrible buzzword-heavy language surrounding it, but this ain't that.
rahvin - Wednesday, May 27, 2020 - link
I get the buzzword weariness you comment on, but no one can honestly deny the impact cloud computing has had, both on the industry at large and on all the businesses it touches. Cloud computing has completely altered the industry.

And the GP should consider the impact NPUs have had and just how many of them are being produced. NPUs have significantly different workloads than both CPUs and GPUs. The workloads can be shoehorned onto both of the prior chips, but at a high cost in efficiency. The NPUs that first showed up with Google's Tensor processor more than 10 years ago have been growing exponentially in deployments. If you are curious, there is plenty of information out there on why Google developed the Tensor NPU and why they continue to develop and refine it. (In particular, I remember an article discussing when Google first deployed Google Assistant and realized that if even 3% of Android users used the assistant at once, it would overwhelm all of Google's compute resources.)
Almost every major CPU producer now has a separate NPU unit in production, and the number being sold is growing at a nearly exponential rate. For example, every Pixel phone since the second generation has included a separate Google-designed NPU, and most of the other phone producers have followed suit with NPUs of their own in their most recent generations after seeing the benefits this chip provided to the Pixel phones.
NPUs are here to stay and will likely grow and develop as fast as GPUs did. Though they won't have the retail deployment GPUs had, you're likely to see them deployed, if not directly in products then in the back end that serves them, even if you don't know they are there.
Zoolook - Wednesday, May 27, 2020 - link
Math coprocessors are nothing new. I prefer TPU, for Tensor Processing Unit; NPU is a "buzzword". It's just a very specialized math processor for tensors, much as an FPU is for floating point. There is nothing neural about them at all.
p1esk - Wednesday, May 27, 2020 - link
There's nothing "tensor" about these math coprocessors. They are designed to multiply regular 2D matrices: lots of multiply-accumulate ops performed in parallel on two wide input vectors. Typically the inputs are FP16 (e.g. Google TPU) or INT8 (this one).
sun_burn - Wednesday, May 27, 2020 - link
That depends on the design of the accelerator. Some accelerators are systolic array matrix multipliers. Others implement dot product multipliers that operate on the native tensor without conversion to 2D matrix form.
p1esk - Wednesday, May 27, 2020 - link
Which ones operate on the native tensor without conversion to 2D matrix form?
sun_burn - Wednesday, May 27, 2020 - link
That's not something companies appear to reveal when it comes to NPUs. The information just gets encapsulated in the blob 'N MACs'. For example, 4096 MACs could be a 64x64 matrix multiplier, or a 16x16x16 dot product multiplier stack. It may be possible to infer it indirectly.
dotjaz - Saturday, May 30, 2020 - link
Except ARM uses MCE blocks, 128 MACs per block, and in both the U55's and N77's product briefs it's described as 8x8.
edzieba - Thursday, May 28, 2020 - link
What in the heck are you all on about?!

A 2D matrix is a representation of a 1D tensor, because that's how tensors work. They're mathematical objects, not some Google brand or some nebulous thing that must be 'turned into' a 2D matrix. The matrix is the tensor; that's how it's represented mathematically.
sun_burn - Thursday, May 28, 2020 - link
A tensor is an N-dimensional array. Typically it refers to 3D and 4D tensors. A 1D tensor is a vector. A 2D tensor is a matrix.

A 3D tensor is converted into 2D matrix form using techniques like im2col, enabling a 3D convolution to be performed using a 2D matrix multiplication operation. Different NPU designs may or may not choose to do this depending on how they are designed.
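For anyone curious, the im2col trick described above can be sketched in a few lines of NumPy (a toy single-channel example; the function name and shapes are illustrative, not any particular NPU's layout):

```python
import numpy as np

# Toy 2D convolution via im2col: unroll each input patch into a column,
# so the sliding-window convolution becomes one matrix multiplication.
def im2col_conv(image, kernel):
    H, W = image.shape
    kH, kW = kernel.shape
    oH, oW = H - kH + 1, W - kW + 1
    # Gather every kH x kW patch as a flattened column.
    cols = np.array([image[i:i + kH, j:j + kW].ravel()
                     for i in range(oH) for j in range(oW)]).T
    # One matrix-vector product replaces the nested sliding-window loops.
    return (kernel.ravel() @ cols).reshape(oH, oW)

image = np.arange(16, dtype=np.float32).reshape(4, 4)
kernel = np.ones((3, 3), dtype=np.float32)
print(im2col_conv(image, kernel))  # [[45. 54.] [81. 90.]]
```

The same idea extends to multi-channel 3D tensors by flattening the channel dimension into the columns as well, which is exactly why a plain matrix-multiply engine can serve convolutions.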
Veedrac - Thursday, May 28, 2020 - link
‘Tensor’ is an ML-ism for multi-dimensional arrays.
Deicidium369 - Wednesday, May 27, 2020 - link
Tensor is a specific product from Google - made into systems and integrated into Nvidia's GPUs (Volta, Turing and Ampere). So while Nervana or Habana products work in a similar way to Tensor cores, they ARE NOT Tensor cores.
sun_burn - Thursday, May 28, 2020 - link
The TPU is a specific product range from Google. It is different from the GPU tensor cores in NVidia's Volta and subsequent architectures, though they may both be matrix multipliers.

The former was originally a quantized 8-bit integer MAC engine that later had (or will have?) support for the bfloat16 datatype defined by Google.
The latter supports fp32 and fp16, and since Turing, int8 and int4. Ampere also defines the new tf32 datatype.
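For concreteness on the bfloat16 datatype mentioned above: it keeps fp32's 8 exponent bits and truncates the mantissa from 23 bits to 7, so it can be roughly emulated by zeroing the low 16 bits of an fp32 word (a simplification - real hardware typically rounds rather than truncates):

```python
import struct

def to_bfloat16(x: float) -> float:
    """Truncate an fp32 value to bfloat16 precision (round-toward-zero)."""
    bits = struct.unpack('<I', struct.pack('<f', x))[0]  # reinterpret fp32 as uint32
    return struct.unpack('<f', struct.pack('<I', bits & 0xFFFF0000))[0]

# Same dynamic range as fp32, but only ~2-3 decimal digits of precision.
print(to_bfloat16(3.14159))  # 3.140625
```

That shared exponent range with fp32 is the selling point: models trained in fp32 usually run in bfloat16 without the rescaling gymnastics that int8 quantization requires.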
kaidenshi - Wednesday, May 27, 2020 - link
Thank you, and I don't deny the impact "cloud" computing has had on the world in the past 20+ years. My day job relies on it, and I wouldn't want it any other way. My comment was specific to the whole buzzword phenomenon and nothing more.
surt - Wednesday, May 27, 2020 - link
Neural PU.
Stochastic - Wednesday, May 27, 2020 - link
Good write-up, but I think you meant to say "figuratively" in place of "literally". We would be in trouble if there were a literal explosion of machine learning accelerators.
Stochastic - Wednesday, May 27, 2020 - link
Also, the word "veritable" works in this context.
Valantar - Thursday, May 28, 2020 - link
I hope nobody got hurt! Also, I wonder where that "literal wild west" is. Western... China? Or is it a subtle reference to Silicon Valley and its general "go fast and break things" attitude? I mean, that's in the western US, and it's pretty wild in many ways (few of them good, just like the wild west of old).

On a more serious note: please don't use "literal" when you are using words in a metaphorical manner. I get that that's a thing these days, and I suppose one could claim it is some sort of ironic postmodernist satire (a few decades too late if that is the case), but generally it just makes you look like you don't understand what words mean.
Spunjji - Friday, May 29, 2020 - link
100% on the above. I felt a little bit like I was being trolled with the use of "literal" to mean "figurative" twice in two consecutive sentences 😅
surt - Wednesday, June 3, 2020 - link
You may not be aware, but "literally" now literally means "figuratively". It's in the dictionary:
www.merriam-webster.com/dictionary/literally
"2: in effect : VIRTUALLY —used in an exaggerated way to emphasize a statement or description that is not literally true or possible"
eastcoast_pete - Wednesday, May 27, 2020 - link
Thanks Andrei! As a request: could you do a deep dive into what neural PUs are useful for, especially in mobile? I know about photography, but I understand that machine learning/deep learning is also used in 5G telephony for tasks such as constantly adjusting beamforming. If someone here knows of a good review (overview), I'd appreciate it also. It might help with separating marketing hype from actual requirements.
brucethemoose - Wednesday, May 27, 2020 - link
^ this.

TBH, I wouldn't be surprised if it's mostly used for data mining, in which case we're *never* going to find a good overview.
soresu - Wednesday, May 27, 2020 - link
They are used for a lot of things, including image/video processing, face detection, voice recognition, and handwriting recognition.

Basically anything that requires pattern recognition works well on NN/ML/AI-accelerating hardware.
It's more than likely used in the "inside out" tracking for modern standalone VR devices too.
Mind you, most of these accelerators work best for running the NNs (also called inference) rather than training them, especially on mobile platforms, which are energy constrained.
PeterCollier - Sunday, May 31, 2020 - link
So no practical uses?

Image and video processing such as stacked HDR and fake bokeh all worked fine on the pre-NPU S8, and even on various low-end phones like the Moto G6 with a sideloaded Google Camera app.
Face detection works fine even in digital cameras from 2003.
Handwriting recognition... Maybe on an iPad?
Voice recognition? Why doesn't my S10 ever seem to pick up my voice over music?
Deicidium369 - Wednesday, May 27, 2020 - link
"OK Google" or "Siri ..." for one.
Valantar - Thursday, May 28, 2020 - link
Aren't 99% of voice assistant tasks still handled by cloud servers?
Sivar - Wednesday, May 27, 2020 - link
Not really a literal explosion or a literal Wild West.
Deicidium369 - Wednesday, May 27, 2020 - link
And it all started on a computer in the UK called the Acorn.
DifferentFrom - Thursday, June 4, 2020 - link
"we’ve seen a literal explosion of machine learning accelerators in the industry, with a literal wild west" What exploded, and why are people shooting at each other?
BoyBawang - Monday, June 15, 2020 - link
So how does this ARM N78 NPU relate to the Tensor Accelerator inside Snapdragon's Hexagon? Is there a redundancy in functionality?