26 Comments
BitByBit - Tuesday, August 23, 2005 - link
So, no mention of Hyperthreading so far.
It seems reasonable to assume that Intel has either found a way to temporarily deactivate idle execution units, or has implemented a greatly improved out-of-order engine able to keep all four integer units busy. Considering the increase in power draw that additional execution units cause, it seems likely that, with Conroe's emphasis on performance per watt, Intel felt they could widen the core without sacrificing efficiency.
So, how have they managed this?
JarredWalton - Tuesday, August 23, 2005 - link
In the past, I've read that there are two main reasons for doing SMT. One is that you have a deep pipeline and want to keep the penalty of mispredicted branches and stalls to a minimum. The other is if you have a really wide core and lots of execution units, and you want to keep them filled. I'd say there's still a reasonable chance that we'll see some form of SMT in Conroe, though it could be like the original HTT where it sits inactive initially and Intel only turns it on after further testing. (/speculation)
IntelUser2000 - Wednesday, August 24, 2005 - link
The Pentium 4 doesn't have a 4-issue-wide core. Why did Anand assume that?
Or perhaps it means more instructions in flight than the Pentium 4 because Conroe has a 4-issue core?
BitByBit - Wednesday, August 24, 2005 - link
The number of instructions active in the pipeline at any one time is determined by the issue rate and the pipeline length.
Although the P4 had three integer units, its decode rate was actually quite low, as a result of its single decoder (and Trace Cache).
Now that we know Conroe is a four-issue design, it could well have more instructions in flight than the P4, with deeper re-order buffers in order to extract a sufficient amount of ILP from the instruction stream to keep its execution units busy.
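To put rough numbers on the "issue rate times pipeline length" point above, here is a minimal sketch. The stage counts are the ones quoted in this thread; the per-cycle rates for the P4 are illustrative assumptions, not measured figures, and `max_in_flight` is a hypothetical helper. Real designs are also capped by re-order buffer size, and the Trace Cache complicates the P4 side, so treat this as a first-order bound only.

```python
def max_in_flight(sustained_rate: int, pipeline_stages: int) -> int:
    """First-order upper bound on instructions in the pipeline:
    `sustained_rate` instructions enter per cycle, and each one
    occupies the pipeline for `pipeline_stages` cycles."""
    return sustained_rate * pipeline_stages

# Stage counts come from the thread; the rates are assumptions for illustration.
conroe = max_in_flight(4, 14)            # four-issue, 14 stages
p4_peak = max_in_flight(3, 31)           # assumed 3 per cycle, 31 stages
p4_decode_bound = max_in_flight(1, 31)   # assumed 1 per cycle (single decoder)

print(conroe, p4_peak, p4_decode_bound)  # prints: 56 93 31
```

Under these assumptions the P4 only comes out behind Conroe if its sustained rate really is limited to about one instruction per cycle, which is the decode-rate argument made above.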
UNCjigga - Tuesday, August 23, 2005 - link
I'm guessing we'll have ~2.0 to ~2.6GHz at launch, maybe ~1.6 to ~2GHz for the mobile parts?
coldpower27 - Tuesday, August 23, 2005 - link
Nah, this is a 65nm process. We're looking at low to mid 2GHz for the mobile chips, with mid to high 2GHz for the desktop revisions. Also, as time goes on, they should breach 3GHz on this processor for desktop.
Leper Messiah - Tuesday, August 23, 2005 - link
Something like that. Maybe not even that high with a slightly longer pipeline, starting at maybe 2.4GHz max.
Wee for integer performance. How 'bout some concrete benchies, if they actually have some out?
Hacp - Tuesday, August 23, 2005 - link
Why only 2.4GHz? I think Intel could start at 2.8, go up to 3.2-3.4 and work their way down to 2.4...
Doormat - Tuesday, August 23, 2005 - link
I really thought Intel would have put an on-die memory controller in this generation. I think the two cores will be beating each other up for the memory controller in peak usage scenarios, even with a 1066MHz FSB and dual-channel DDR 667.
UNCjigga - Tuesday, August 23, 2005 - link
Where does it say they lack an on-die memory controller? Methinks direct L1-to-L1 transfer and unified L2 cache means something 'intelligent' sits between the cores?
IntelUser2000 - Wednesday, August 24, 2005 - link
Well, that 'intelligent' thing is called the Arbitration Logic, which acts to manage data between two cores.
haelduksf - Tuesday, August 23, 2005 - link
If they want 5x the performance/watt, and these things are going to be about 65W, that means they will be 2-3x faster. Exciting stuff.
Den - Tuesday, August 23, 2005 - link
Well, if you compare new dual core to old single core, the speed increase is more like 20-50% depending on which old CPU you look at...
Furen - Tuesday, August 23, 2005 - link
Sounds like BS to me. If they were truly 5x faster per watt we'd be seeing benchmarks left and right. CPUs are different from video cards. You don't see performance increases of that magnitude.
neogodless - Tuesday, August 23, 2005 - link
Integer Performance...
Which doesn't really mean overall system performance and certainly not real world performance...
Let us not forget the "amazing" integer performance of the upcoming game consoles... which does not equate to amazing real world physics...
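For the "5x the performance/watt at about 65W means 2-3x faster" estimate earlier in this subthread, the arithmetic can be sketched as below. The old-chip wattages are assumptions chosen only to show the range; nothing in the thread states the baseline, and `implied_speedup` is a hypothetical helper.

```python
def implied_speedup(perf_per_watt_gain: float, new_watts: float, old_watts: float) -> float:
    """Performance ratio implied by a perf/watt gain and a change in power:
    (new perf / old perf) = gain * (new watts / old watts)."""
    return perf_per_watt_gain * new_watts / old_watts

# 5x and 65W come from the comment; the baseline wattages are assumed.
for old_watts in (115, 130, 160):
    print(old_watts, round(implied_speedup(5, 65, old_watts), 2))
```

With an assumed 130W baseline this gives 2.5x, so the "2-3x" figure holds only if the comparison chip draws roughly twice the power of the new part.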
Furen - Tuesday, August 23, 2005 - link
How is vector math going to be handled by Conroe? Is it going to be done like it was done on the P4 (using dedicated vector hardware) or like AMD does it (using the x87 FPU)?
bart - Tuesday, August 23, 2005 - link
The basic integer pipeline appears to be 14 stages long, making it a significant decrease from the 31+ stage pipeline in Prescott and the 12 stage pipeline in the Athlon 64.
Is 14 less than 12???
IntelUser2000 - Wednesday, August 24, 2005 - link
No, you are wrong, it says this: The basic integer pipeline appears to be 14 stages long, making it a significant decrease from the 31+ stage pipeline in Prescott and a slight increase from the 12 stage pipeline in the Athlon 64.
mino - Wednesday, August 24, 2005 - link
Heh, you forgot that most writers at AT have a tendency to correct mistakes as soon as they surface... I like this approach.
neogodless - Tuesday, August 23, 2005 - link
Heh yup... I was wondering about the wording chosen for that too!
knitecrow - Tuesday, August 23, 2005 - link
An extremely predictable direction; I actually yawned in the middle ... maybe because it's time for my coffee or maybe the details do not excite.
All-round very boring.
I wanted to see a new architecture that can dazzle.
- a proper vector co-processor similar to what PPC970FX or the Xbox360 XPU have.
- hyperthreading done proper – like what IBM does with the Power5 series
- on-die memory / PCI-e controller
***YAWN****
I want to see what AMD's next gen architecture will offer. They seem to be more on the cutting edge.
erinlegault - Tuesday, August 23, 2005 - link
Quad-core? Who's going to be the first out with that?
Den - Tuesday, August 23, 2005 - link
IBM's had a quad core Power 5 out for a while now...
If you mean in the x86 world though, yes, that will be an interesting race.
Furen - Tuesday, August 23, 2005 - link
Well, AMD can already produce quad cores since the SRI on current A64s actually has 4 ports for cores. If you mean who will be first to market, I'd guess AMD as long as they can pull off a timely 65nm shrink... it always comes back to that.
ViRGE - Tuesday, August 23, 2005 - link
Considering how long it has taken AMD to get into the swing of the 90nm process, I would be very surprised if they beat Intel to 65nm. Intel has the size and the money to make ridiculous investments in new process technologies, and I think that will be enough to keep them ahead, in spite of the AMD/IBM partnership.
Furen - Tuesday, August 23, 2005 - link
LOL, contrary to wild speculation? That article was one of Nicholas Blachford's saner predictions. I mean, it's more down to earth than his anti-graviton engine for faster-than-light travel or his "the Cell will redefine the chip industry" garbage.