ARM discloses that currently all major hard-drive and SSD manufacturers use controllers based on Cortex-R processors, which is, to say the least, an interesting market position.
22 Comments
twotwotwo - Wednesday, February 17, 2016 - link
It just shows what I don't know, but interesting to me to see 'out-of-order' and 'real-time' together. I thought the branch prediction you typically see in OoO, and the misprediction penalties it comes with, would be bad for worst-case performance (your limiter in hard realtime situations) even if good in the average case.
webdoctors - Wednesday, February 17, 2016 - link
I'm not sure what the definition of real-time is in this product. Real-time systems are generally deadline driven, and although OoO does introduce uncertainty to timing, the timescale makes it immaterial.

Realistically, media frame based systems just need the frame to be done within a certain number of ms, and networks have packet latency on the order of ms. A millisecond is huge when an OoO CPU is running at GHz (1 ms is 1000 us, which is a million clock cycles at 1 GHz). Just a guess, I've never made a "RT" CPU...
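A rough sketch of the cycle-budget arithmetic, just to make the orders of magnitude concrete (the clock frequencies and deadlines below are illustrative assumptions, not figures from the article):

```c
#include <stdio.h>

/* Back-of-the-envelope: how many clock cycles fit inside a deadline?
 * The frequencies and deadlines are example values only. */
int main(void) {
    const double freq_hz[]    = { 1.0e9, 1.5e9 };      /* 1 GHz, 1.5 GHz */
    const double deadline_s[] = { 1e-6, 50e-6, 1e-3 }; /* 1 us, 50 us, 1 ms */

    for (int f = 0; f < 2; f++) {
        for (int d = 0; d < 3; d++) {
            printf("%.1f GHz, %7.1f us deadline -> %10.0f cycles\n",
                   freq_hz[f] / 1e9,
                   deadline_s[d] * 1e6,
                   freq_hz[f] * deadline_s[d]);
        }
    }
    return 0;
}
```

At 1 GHz a 1 ms frame deadline is a million cycles, and at 1.5 GHz even a 1 us deadline still leaves 1500 cycles, which is the budget worked through further down the thread.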
ddriver - Friday, February 19, 2016 - link
The RT problem of desktop grade CPUs is not that they are out of order, but the degree of abstraction required to support a user operating system. Programs are running in user space, there are thousands of threads running on a few hardware threads, there is a lot of context switching, and it is all fairly coarse grained, making such a system impractical for RT-critical scenarios.

Compared to that, OoO is minuscule and negligible. In this regard OoO is not obstructive to real-time applications but actually constructive, as it allows significantly higher computational throughput, so you can accomplish a lot more complex tasks and still stay in real time.
ddriver - Friday, February 19, 2016 - link
That being said, there are still some areas where the good old 8-bit AVR still can't be beat: if timing is indeed very, very critical and the application is very, very simple, the simplicity of the 8-bit AVR chip offers significantly better accuracy than a 32-bit ARM. Even though those cores are in order, they are still a lot more complex, harder to predict, and relatively more latent.
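For what it's worth, this is the kind of loop that is trivial to reason about cycle-by-cycle on an AVR. A minimal sketch, assuming an ATmega328P at 16 MHz built with avr-gcc (the device, pin, and delay are my own example choices):

```c
#define F_CPU 16000000UL      /* assumed clock, needed by util/delay.h */
#include <avr/io.h>
#include <util/delay.h>

/* Toggle PB0 with a fixed period. On a simple in-order AVR every
 * instruction has a documented cycle count, so the timing of this loop
 * can be bounded almost exactly; on a cached, pipelined 32-bit core the
 * same bound is much harder to establish. */
int main(void) {
    DDRB |= _BV(DDB0);            /* PB0 as output */
    for (;;) {
        PORTB ^= _BV(PORTB0);     /* read-modify-write toggle */
        _delay_us(10);            /* calibrated busy-wait */
    }
}
```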
willis936 - Wednesday, February 17, 2016 - link
Real time is often synonymous with deterministic. It means that any task will be done on a regular interval, every time. Like if you need to send a packet every 50 us +/- 5 us, you can't use regular operating systems. As long as the processor is fast, it doesn't matter if the instructions are reordered as it's executing them. All that matters is that operations don't wait for long periods of time. Even a maxed out reorder buffer is a relatively short period of time compared to most real-time applications.
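To put a number on the "regular operating systems" point, here is a quick sketch (Linux/POSIX, entirely my own example) that tries to wake up every 50 us and prints how late each wakeup actually is; on a desktop OS the jitter routinely blows well past +/- 5 us:

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <time.h>

/* Wake every 50 us and measure how far each wakeup misses its target.
 * Purely illustrative; the period and iteration count are arbitrary. */
int main(void) {
    const long period_ns = 50 * 1000L;   /* 50 us */
    struct timespec next;
    clock_gettime(CLOCK_MONOTONIC, &next);

    for (int i = 0; i < 1000; i++) {
        next.tv_nsec += period_ns;
        if (next.tv_nsec >= 1000000000L) {
            next.tv_nsec -= 1000000000L;
            next.tv_sec += 1;
        }
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);

        struct timespec now;
        clock_gettime(CLOCK_MONOTONIC, &now);
        long late_ns = (now.tv_sec - next.tv_sec) * 1000000000L
                     + (now.tv_nsec - next.tv_nsec);
        printf("wakeup %4d: %7ld ns late\n", i, late_ns);
    }
    return 0;
}
```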
Dmcq - Thursday, February 18, 2016 - link
Yes, you're right, to a large extent it is about the maximum time. The main problem with using a normal processor for real time, though, is the memory access - having virtual memory and loading the address translation and the data. Their R and M series don't use virtual memory, and the Tightly-Coupled Memory gets rid of the major problem with accessing memory. Disk controllers would still have a lot of shared memory for caching the disk, but that would be a well-contained problem.
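As a purely hypothetical illustration of the TCM point: with a GNU ARM toolchain you typically pin the hot handler and its buffer into TCM through linker sections, so they never take a cache miss or a page-table walk. The section names below are assumptions; real projects use whatever their linker script or vendor startup code defines:

```c
#include <stdint.h>

/* Hypothetical TCM placement: ".itcm" and ".dtcm" must be defined in
 * the linker script (the names are an assumption, not a standard). */
#define IN_ITCM __attribute__((section(".itcm"), used))
#define IN_DTCM __attribute__((section(".dtcm"), used))

/* Buffer and handler that must never miss in cache or fault. */
static volatile uint32_t sector_buf[128] IN_DTCM;

IN_ITCM void disk_irq_handler(void) {
    /* Deterministic work only: every access goes to fixed-latency TCM. */
    for (int i = 0; i < 128; i++) {
        sector_buf[i] ^= 0xA5A5A5A5u;
    }
}
```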
RobATiOyP - Thursday, February 18, 2016 - link
The misprediction penalty is due to the pipeline. If you have some complex iterative processing, your loop will be very fast; then you fall out and the CPU stalls, but it'll still be finished within a shorter worst-case time than on a simpler non-superscalar processor which isn't interleaving instructions.

As others have said, it's the maximum latency that matters and that it be deterministic, so no virtual memory which may be cached or sitting on some HDD.
psychobriggsy - Thursday, February 18, 2016 - link
Real time is usually defined against a metric - a deadline to do something when the interrupt comes in, 50 microseconds for example.

Here we have a CPU that runs at 1,500,000,000 Hz.
If your metric is "react within 1 microsecond" then you have 1500 cycles to deal with that interrupt. That would be pretty harsh (many would see it as a fun challenge) but more typically you'd want tens to hundreds of microseconds, and that's plenty of cycles to check data, signal something else to work on it, etc.
Think of the OoO aspects and the pipeline allowing higher clock speeds as gravy on top, rather than challenges, and it becomes more acceptable.
iwod - Thursday, February 18, 2016 - link
I am starting to think the amount of complexity in 5G modems will be insane.

And why aren't these done on 20nm?
Andrei Frumusanu - Thursday, February 18, 2016 - link
ARM disclosed that the Cortex-R8 will likely see implementations at 28, 16, 10 and even 7nm.
RobATiOyP - Thursday, February 18, 2016 - link
Same reason as GPUs have waited for the 14/16nm node: the processes at 20nm without FinFET sucked too many watts.
extide - Thursday, February 18, 2016 - link
Yeah, but 20nm would likely still be appropriate for this chip.
ImSpartacus - Thursday, February 18, 2016 - link
Why is that? Wasn't 20nm mostly just used for SoCs that needed every ounce of power savings regardless of the cost?
ImSpartacus - Thursday, February 18, 2016 - link
Apparently 20nm kinda sucked. It wasn't a large enough upgrade from 28nm to justify the cost.
lagittaja - Friday, February 19, 2016 - link
Because 20nm isn't worth it over the now very, very mature 28nm process. 14/16 FF or stick with 28nm.
evancox10 - Friday, February 19, 2016 - link
ARM just sells the processor IP, you can take that and implement it in whatever process you want. The 28nm line is ARM giving an example of what clock speeds you could get in that process.
Mr Perfect - Thursday, February 18, 2016 - link
How has this never come up before? I've been reading SSD reviews here since the first one and have never been aware of ARM chips being inside the controllers. Is it just a given, or don't manufacturers talk about it?
vladx - Thursday, February 18, 2016 - link
Then you really suck at reading articles. It was repeatedly stated in Samsung's SSD reviews that the MEX and MGX controllers are using dual-core and tri-core Cortex-R4 chips.
stimudent - Friday, February 19, 2016 - link
When did the S start being pronounced like a Z here in America? This article looks like it was written for a British audience.
FLHerne - Friday, February 19, 2016 - link
Noting also 'centred', I'd agree. Nice to see English on the internet for a change! :-)
Pissedoffyouth - Friday, February 19, 2016 - link
It's almost like ARM is a British company or something.
SydneyBlue120d - Friday, February 19, 2016 - link
I haven't seen any article about the new ARMv8.2-A specification: https://community.arm.com/groups/processors/blog/2...