Original Link: https://www.anandtech.com/show/906
Intel Introduces 533MHz FSB CPUs - Pentium 4 2.53GHz
by Anand Lal Shimpi on May 6, 2002 12:00 PM EST. Posted in: CPUs
Whether you like Intel or not, you can't argue that the Pentium 4 of today is infinitely more competitive than what was launched a year and a half ago. But did we honestly expect any different? Intel received the same treatment at the launch of their P5 processors at 60/66MHz; competing solutions from AMD such as the 486 DX4-120 were much better buys and although they were running on an older 486 platform, they had more of an upgrade path than the soon to be forgotten 5V P5 processors. Remember the launch of the Pentium Pro? It has been said that until the Pentium 4, Intel has never released a processor that actually performed worse than its previous generation offering. With the majority of the applications back in 1995 written for 16-bit versions of Windows, the Pentium Pro's horrendous 16-bit application performance left everyone but a small percentage of users with a bad aftertaste.
While we wouldn't be caught dead recommending the Pentium 4 after its launch in November 2000, times have changed. The biggest win for Intel came with the release of their 0.13-micron Northwood core; the smaller core meant cooler running processors, even higher clock speeds and an extra 256KB of L2 cache. A favorite in the overclocking community, the 1.6GHz Northwood Pentium 4 brought 2GHz+ clock speeds to users at relatively affordable prices.
Competition from AMD has been fierce and with Thoroughbred a month away, the competition will indeed continue. The only thing that has changed is that now Intel is competing quite well on a performance level, although AMD continues to hold a title they've had for a long time: a tremendous value. In fact, it's the tremendous value and amazingly competitive performance that have convinced 56% of the AnandTech Members who offered their system configuration information to use AMD processors.
Today Intel is working a bit ahead of schedule. Originally Intel was going to release one CPU, the Pentium 4 2.4B processor; the 'B' suffix denotes the use of a 133MHz quad-pumped FSB (effectively 533MHz). But courtesy of high yields and good performance in Intel's strict validation process, today you'll get not one but two new 533MHz FSB processors - clocked at 2.4 and 2.53GHz.
A new FSB requires a new chipset?
The Pentium 4 debuted with the i850 chipset which only carried official support for a 100MHz clocked quad-pumped FSB. The "quad-pumped" nature of the bus indicated that data was transferred four times per clock, twice on the rising edge and twice on the falling edge. As you can expect, with a quad-pumped FSB there are very strict electrical guidelines that must be followed in order to ensure that the correct data gets transferred at the right time during every clock cycle. Although it's very easy to say "400MHz FSB," the amount of effort put into validation to make sure that the chipsets, motherboards and CPUs work with a quad-pumped bus is incredible. We put together this little diagram to help you understand why:
The three images you see above are pictures of a 100MHz rectangular pulse train similar to what the FSB and other clocks in your system operate off of. Since all of the signals run at 100MHz, the time between two rising edges is 10ns. What does a rising edge actually signify? It represents the change from low to high voltage; low is usually defined as a voltage close to 0 while high can vary depending on the platform which in this case is around 1.6V.
The first signal is only triggered once per clock cycle (10ns) meaning that the chipset only has to detect a voltage close to 1.6V in order to initiate a data transfer.
The second signal is triggered twice per clock cycle, once on the rising edge and once on the falling edge. This becomes a little more complicated as the logic must now transfer data twice every 10ns but it has to detect when the voltage is increasing towards 1.6V and then once again when it's decreasing towards 0V. Double-pumped or DDR signaling requires a bit more effort to test and ensure proper operation under all conditions.
The final case is by far the most complex to implement of the three because now the chipset must detect four different voltages. The first two occur somewhere between 0 and 1.6V on the rising edge of the clock, and the other two occur in the same range but on the falling edge. If we assume that 0 - 0.7V defines one part of the rising edge of the clock and 0.8 - 1.6V defines the other part, you can see how the signal must be fairly consistent in order to properly enable four data transfers per clock cycle. It is already difficult enough to distinguish high from low when the spread is only 1.6V wide, but trying to distinguish two different values on each edge of the clock is even worse.
Now let's say we increased the frequency of the FSB from 100MHz to 133MHz; that seems like such a small increase but now all of the circuitry and logic has 25% less time (7.5ns vs. 10ns) to transfer data four times and prepare for the next set of data transfers. The actual modifications that needed to be made to the i850 chipset in order to allow for 133MHz FSB clock speeds were minimal; however the biggest limitation by far was validating the chipsets (ensuring that they would work flawlessly at 133MHz).
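The timing arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are ours, and it assumes the four transfers are spread evenly across each clock period, which simplifies how the bus actually signals.

```python
# Illustrative timing arithmetic for a quad-pumped bus.
# Assumption: the four transfers are spread evenly across one clock period,
# which is a simplification of the real signaling.

TRANSFERS_PER_CLOCK = 4  # "quad-pumped"


def clock_period_ns(freq_mhz: float) -> float:
    """Time between two rising edges, in nanoseconds."""
    return 1000.0 / freq_mhz


def transfer_window_ns(freq_mhz: float, transfers: int = TRANSFERS_PER_CLOCK) -> float:
    """Average time available per data transfer, in nanoseconds."""
    return clock_period_ns(freq_mhz) / transfers


def time_reduction(old_mhz: float, new_mhz: float) -> float:
    """Fraction of the clock period lost when moving from old_mhz to new_mhz."""
    return 1.0 - clock_period_ns(new_mhz) / clock_period_ns(old_mhz)


if __name__ == "__main__":
    for freq in (100.0, 400.0 / 3.0):  # 400/3 is the nominal "133MHz" clock
        print(f"{freq:6.2f}MHz: period {clock_period_ns(freq):.2f}ns, "
              f"{transfer_window_ns(freq):.3f}ns per transfer")
    print(f"Reduction: {time_reduction(100.0, 400.0 / 3.0):.0%}")
```

Under that even-spacing assumption, the window for each individual transfer shrinks from 2.5ns to under 2ns, which gives a sense of why validation, rather than the chipset design itself, was the limiting factor.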
This is one situation in which Intel's strict validation policies put users in a bit of a bind. Currently, the vast majority of i850 chipsets work just fine at a 133MHz FSB frequency but definitely not all. From this point on however, all Pentium 4 processors will be using the 133MHz FSB. What are end users to do? If you have a recent i850 board then chances are it already works fine at the 133MHz FSB, but in order to meet Intel's validation requirements a new chipset with official support for the FSB had to be produced - the i850E.
Intel's 850E Memory Controller Hub
The i850E is no different from the i850 from a feature standpoint other than its official support for the 133MHz quad-pumped FSB. Electrically, it's nearly identical to the i850 with minor modifications to ensure that it will fully pass Intel's validation specifications at the increased FSB frequency.
PC1066 RDRAM - "Not Officially Supported"
When we said that the i850E was no different from its predecessor, we meant it. The chipset does not boast official support for Rambus' latest PC1066 RDRAM either. This is another situation where Intel won't boast official support for the standard but the memory will work on virtually all i850E boards and even some i850 boards. The limiting factor here is memory (most newer Samsung PC800 RIMMs work at PC1066 speeds, and of course all PC1066 modules work fine) and the clock generator on the motherboards themselves.
The ASUS P4T-E and ABIT TH7-II were some of the first i850 boards that started shipping with the proper clock generator for PC1066 RDRAM operation.
Why is PC1066 RDRAM so important? When you move to a 133MHz quad-pumped FSB the amount of FSB bandwidth between the i850 MCH and the Pentium 4 increases from 3.2GB/s to 4.26GB/s. Since the fastest supported memory type of the i850E is PC800 RDRAM, by default the i850E reduces the RDRAM clock multiplier from 4x FSB frequency to 3x FSB frequency to keep the RDRAM clock at 400MHz (for PC800 RDRAM). This keeps the amount of bandwidth between the MCH and the RDRAM banks stuck at 3.2GB/s in spite of the fact that the FSB bandwidth has increased by 33%. You are then in fact, widening one end of a pipe but leaving the other end untouched and expecting a gain in throughput. It's the FSB/memory bandwidth balance that made the i850 perfect for the Pentium 4 and it's that very same balance that's disrupted unless you use PC1066 RDRAM.
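To make the imbalance concrete, here is a rough sketch of the bandwidth math. The function names are our own, and the bus widths are assumptions not stated in the text above: a 64-bit FSB and two 16-bit RDRAM channels that transfer data on both clock edges.

```python
# Rough bandwidth arithmetic for the i850/i850E platform.
# Assumptions (not stated in the article text): 64-bit FSB, two 16-bit
# RDRAM channels, RDRAM transferring on both clock edges.

FSB_WIDTH_BYTES = 8        # 64-bit front-side bus
FSB_TRANSFERS_PER_CLOCK = 4  # quad-pumped
RDRAM_WIDTH_BYTES = 2      # 16-bit channel
RDRAM_TRANSFERS_PER_CLOCK = 2
RDRAM_CHANNELS = 2         # dual-channel


def fsb_bandwidth_mb_s(fsb_clock_mhz: float) -> float:
    """Peak FSB bandwidth in MB/s for a given base clock."""
    return fsb_clock_mhz * FSB_TRANSFERS_PER_CLOCK * FSB_WIDTH_BYTES


def rdram_clock_mhz(fsb_clock_mhz: float, multiplier: int) -> float:
    """RDRAM clock derived from the FSB clock and the chipset multiplier."""
    return fsb_clock_mhz * multiplier


def rdram_bandwidth_mb_s(clock_mhz: float, channels: int = RDRAM_CHANNELS) -> float:
    """Peak RDRAM bandwidth in MB/s across all channels."""
    return clock_mhz * RDRAM_TRANSFERS_PER_CLOCK * RDRAM_WIDTH_BYTES * channels


if __name__ == "__main__":
    fsb_133 = 400.0 / 3.0  # nominal "133MHz" clock
    configs = [
        ("100MHz FSB, 4x (PC800)", 100.0, 4),
        ("133MHz FSB, 3x (PC800)", fsb_133, 3),
        ("133MHz FSB, 4x (PC1066)", fsb_133, 4),
    ]
    for label, fsb, mult in configs:
        mem_clock = rdram_clock_mhz(fsb, mult)
        print(f"{label}: FSB {fsb_bandwidth_mb_s(fsb):.0f}MB/s, "
              f"RDRAM {mem_clock:.0f}MHz -> {rdram_bandwidth_mb_s(mem_clock):.0f}MB/s")
```

Only the last configuration brings the two sides of the pipe back into balance; with the 3x multiplier the memory side stays at the PC800 figure while the FSB side grows by a third.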
In order to understand what sort of performance benefits you should expect you have to understand exactly what's happening when you do open one or both ends of this pipe. The end goal is the same: to help the rest of the system keep up with the CPU. The Pentium 4 is now running at 2.53GHz and that is significantly faster than the rest of the system. The slower the rest of the system is in comparison to the CPU, the more of the CPU's power is wasted. It's the equivalent of taking a Ferrari through bumper to bumper traffic; you're not taking full advantage of its horsepower or torque if you're just creeping forward every few minutes.
Is 4.26GB/s of bandwidth necessary for today's Pentium 4s? No. But the same was said about the first Celeron processors with only 800MB/s of FSB/memory bandwidth, and sure enough, as time went on the limited amount of FSB/memory bandwidth was holding the poor processor back. The following all influence demand on system bandwidth:
- Clock Speed
- Applications
- CPU Architecture
Clock speed should be the most obvious influence on system bandwidth demands; if all things are kept the same, the faster a CPU runs, the faster it needs to be fed data to avoid wasting precious clock cycles and execution resources.
The next thing to keep in mind is that as applications become more demanding, with more features and larger working datasets they begin to eat up more of that precious FSB/memory bandwidth. Making a processor run Word is one thing, but making it handle all of the physics & artificial intelligence calculations for the next-generation of 3D games can easily become a hog of bandwidth.
And finally there is the issue of changing CPU architectures. While this won't apply to the Pentium 4 anytime soon, if the architecture of the processor changes at all then its bandwidth demands could change as well. Introducing faster (or more) execution units or higher bandwidth caches among other things can all change demand on system bandwidth.
In the end, PC1066 RDRAM isn't necessary today but it will be in the future. Your best bet is to try and find some good PC800 modules that will work at 1066 speeds. Intel provided us with a few sticks that work at PC1066 but we were also able to purchase them online very easily through sites like Googlegear.
The CPUs
As we mentioned at the start of this story, the two CPUs being announced today are the Pentium 4 2.4B and the 2.53GHz processors. The 'B' suffix indicates that the CPU uses the 133MHz FSB and from this point on you'll only see 133MHz FSB processors available in higher clock speed bins.
The specifications of these processors are identical to all of the other 0.13-micron Northwood based CPUs so we'll just point you at previous reviews for more information:
The Test
Windows XP Professional Test Bed

| Hardware | Configuration |
|---|---|
| CPU | AMD Athlon XP 2100+ (1.73GHz); AMD Athlon XP 2000+ (1.67GHz); AMD Athlon XP 1800+ (1.53GHz); AMD Athlon XP 1600+ (1.40GHz); Intel Pentium 4 2.53GHz; Intel Pentium 4 2.40B GHz; Intel Pentium 4 2.40GHz; Intel Pentium 4 2.20GHz; Intel Pentium 4 2.0A GHz; Intel Pentium 4 2.0GHz; Intel Pentium 4 1.8GHz; Intel Pentium 4 1.6GHz |
| Motherboard | EPoX 8K3A+ - VIA KT333 Chipset; ABIT TH7-II RAID - Intel 850 Chipset; Intel D850EMV2 - Intel 850E Chipset |
| RAM | 1 x 256MB DDR333 CAS2.5 Kingston DIMM; 2 x 128MB PC800 Samsung RIMMs; 2 x 128MB PC1066 Samsung RIMMs |
| Sound | None |
| Hard Drive | 80GB Maxtor D740X |
| Video Cards (Drivers) | NVIDIA GeForce4 Ti 4600 (28.32) |
Intel 850E Performance
Before we get to the actual CPU performance tests, let's have a look at what sort of improvement to expect from the 533MHz FSB and what sort of gains are in store if you explore PC1066 RDRAM as an option as well (note that we ran all of our CPU tests with PC800 RDRAM):
With a performance gain ranging from 0 - 12%, the combination of PC1066 RDRAM and the 533MHz FSB is a valuable asset for the Pentium 4. As clock speeds increase, the assets will definitely grow in value to the point where we'll quickly shun the older 400MHz parts.
Internet Content Creation & General Usage Performance
With this review we continue to use SYSMark 2002; SYSMark 2002 can be considered to be a much more memory bandwidth intensive version of the Winstone tests. The benchmark is split into two parts, Internet Content Creation which deals with content creation applications (Photoshop, Dreamweaver, etc...) and Office Productivity which is more general usage oriented (Word, Excel, Netscape, Anti-Virus, etc...).
The 2002 update changes things around a bit; first of all the benchmark's total scores are arrived at differently than in the 2001 benchmark. Windows Media Encoder no longer accounts for close to half of the Internet Content Creation test, rather only about 10%. There is also no need for a special Athlon XP SSE patch as the 2002 suite uses a version of the encoding dll that properly detects SSE support on all Palomino cores as well as Pentium 4 cores.
The rest of the benchmark is much more evenly distributed and it is much more memory bandwidth intensive than the old benchmark. The Internet Content Creation tests on average use about 600MB/s of bandwidth vs. 300MB/s in SYSMark 2001. The Office Productivity tests are still stuck at around 580MB/s of memory bandwidth.
For more information on the tests and the applications used consult this whitepaper provided by BAPCo.
While the faster FSB doesn't really help all that much in either of the SYSMark tests, the Pentium 4 doesn't really need it as it has always been favored in these benchmarks. Here the lead is simply extended by the 2.53GHz part.
The race is much closer in the office productivity tests; remember that a margin greater than 10% is usually required for the end user to notice any performance difference as significant.
Media Encoding Performance
What once was a very CPU intensive task is now fairly trivial. Because of the streaming nature of MP3 encoding, having a larger cache doesn't necessarily result in a tangible increase in performance. The reason we continue to stress MP3 encoding as a CPU benchmark is mainly because of the fact that MP3 encoding usually does play a role in larger projects such as MPEG-4 video encoding where you're ripping audio as well as video.
You'll notice there is a slight performance boost when going to the 533MHz FSB alone at 2.4GHz. The reason behind this is simple: the application and its working dataset are too small to saturate the memory bus, so a higher speed FSB simply lets the Pentium 4 make more efficient use of the available memory bandwidth.
As we just mentioned, MP3 encoding does play a role in ripping DVDs to highly compressed DivX 5.0 files, since it's usually no fun to watch a movie with no audio and it defeats the purpose of DivX encoding to use uncompressed audio.
Here the increased FSB doesn't help out too much, although the Pentium 4 does still have command of the performance lead. It will take much more than a die shrink to bring the Athlon XP back to the top of the charts, although the processor is still very competitive and a great value for its price.
Video Effects Rendering Performance
We've added two new benchmarks to our suite for this comparison: Adobe After Effects 5.5 and NewTek's Lightwave 7.5. These two make good examples of what heavy SSE2 optimizations can bring to the Pentium 4. You'll remember from the original discussions of the Pentium 4's architecture, many criticized Intel's decision to move to an essentially weaker x87 FP execution setup in favor of putting great faith in the adoption of SSE2. The adoption of the instruction set has been going well but as you can tell by most of our 3D rendering and other FP intensive benchmarks, the Pentium 4 is only now becoming competitive because of its high clock speeds.
With AMD's Opteron and the next-generation Athlon scheduled to receive support for Intel's SSE2 instructions as well, the assimilation of SSE2 optimizations into as many applications as possible is in the best interests of both CPU giants. If history is any indication however, it will take quite a bit of time to see significant optimizations in place.
Adobe After Effects is one application that has received a high level of SSE2 optimizations as the type of video manipulation the program allows is perfectly suited for SSE2. Let's have a look at the results:
The After Effects results paint a completely different picture from what we're used to seeing when it comes to FP performance with the Pentium 4 and Athlon XP. Even the Athlon XP 2100+ is beaten by Intel's Willamette based Pentium 4 running at 1.8GHz.
While the performance here can't be generally applied to all sorts of video editing/effects rendering packages, After Effects is widely used and for those users that depend on the highest performance in the application the Pentium 4 is your best bet.
What will be the most interesting is to see how the Opteron/next-generation Athlon perform under this and other SSE2 optimized applications.
3D Rendering Performance
Next we have our usual two 3D rendering tests. We'll start off with rendering the first frame of the Waterfall.max scene (provided on the 3DSMAX CD) at 1024x768:
When it comes down to raw x87 FP performance, the Pentium 4 is finally able to capture the lead, but only by a small amount and only with Intel's two fastest processors. The Athlon XP has historically dominated this benchmark because of its extremely strong x87 FP performance courtesy of its three fully capable FP units.
The picture doesn't change much under Maya; there is a bit of shuffling in the standings but overall we see the same thing we saw under 3D Studio MAX. We mentioned before that Intel has put a lot of faith in SSE2 optimizations in favor of raw x87 FP power; let's find out how things stack up when we switch to a heavily SSE2 optimized 3D rendering application - Lightwave.
3D Rendering Performance using SSE2
While 3D Studio MAX is SSE2 optimized, the level of optimization is nowhere near what NewTek reported with Lightwave upon releasing version 7.0b. The performance improvements offered by the new SSE2 optimized version were all above 20% using NewTek's supplied benchmarking scenes.
We chose two benchmarks to use, the least SSE2 optimized one and another that is more optimized just to get an idea of the potential that lies for Pentium 4 users running heavily optimized applications:
Again we see a very impressive showing of the Pentium 4 in real-world scenarios where SSE2 is used heavily. Intel is definitely working hard to make SSE2 as prevalent as possible but without more significant gains in other applications the Pentium 4 will have to use its ability to reach very high clock speeds in order to consistently end up ahead of the game.
If you thought the last benchmark painted the Pentium 4 in a favorable light then this one will truly impress you. Unfortunately for the Pentium 4, situations where such heavy SSE2 optimizations are present are very rare. For heavy Lightwave users, you're in luck.
3D Gaming Performance
When it comes to most 3D games there's generally very little performance to be found by heavily optimizing for SSE2 or 3DNow! on either of these processors and thus the performance is mostly dependent on the overall platform (e.g. FPU capabilities, chipset, memory latency/bandwidth, cache latency/bandwidth, etc...).
We'll start off with our favorite 3D gaming benchmark - the Unreal Performance Test 2002. For an explanation of what this test is and why it is so significant, be sure to read our 15-way GPU Shootout that we used to introduce the test. In short, the benchmark uses the current build of the Unreal Engine (that will power games such as UnrealTournament 2003 and Unreal II) and serves as a great indication for future performance in games that use the engine.
The latest build of the benchmark is severely limited by the graphics card, making it not the best CPU test, but you can get an idea of how things stack up. Provided that your graphics card is powerful enough, you can expect everything faster than a Pentium 4 2.2GHz to perform very similarly.
Jedi Knight 2 is an extremely unique incarnation of the Quake III engine and it also happens to make a terrific CPU benchmark as the game is rarely ever GPU bound with today's video cards.
The performance standings under Jedi Knight 2 are pretty similar to what we saw under UPT2002; however, the gaps between the individual CPUs now widen considerably. For example, the fastest Pentium 4 (2.53GHz) ends up being 11.7% faster than the fastest Athlon XP (2100+), mostly because the benchmark isn't as GPU limited as UPT2002.
Finally we have Comanche 4, another fairly new game (DX8 compliant) with a very heavy CPU dependency as we've seen in recent video card roundups.
Under Comanche 4 we see that the larger L2 cache offered by the Northwood core is strongly favored, giving the Pentium 4 a solid lead here. If you look at the older Willamette CPUs, even the Athlon XP 1600+ can trounce the fastest 2GHz 0.18-micron part.
Final Words
Intel was creeping ahead of AMD in the performance tests when they released the 2.4GHz Pentium 4, but now, with the move to 533MHz FSB parts and a clock speed bump to 2.53GHz, the performance crown is undeniably Intel's. There are still a few cases where the Athlon XP can come out ahead but for the foreseeable future, Intel will claim the right to the highest performing desktop microprocessor.
That statement doesn't come without stipulations however; since all of our benchmarks used officially supported PC800 RDRAM (going to PC1066 would result in another boost in performance over what you see here) that does mean that in order to get the highest performance out of the Pentium 4 you will have to go to the RDRAM based i850/850E. As you can conclude on your own by looking at the necessary math, there isn't a single Pentium 4 DDR solution available today that can offer the amount of bandwidth necessary to feed a 4.26GB/s 533MHz FSB. Especially as CPU clock speeds increase, the Pentium 4's dependency on a high bandwidth memory bus will increase as well. While we haven't included the numbers here (we're planning another Pentium 4 chipset comparison in the near future), pairing the Pentium 4 up with Intel's 845 solution paints a significantly different performance picture.
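For those who would rather not do the math by hand, here is a minimal sketch of it. The names are ours and the inputs are assumptions rather than figures from this article: a 64-bit FSB, and Pentium 4 DDR platforms built around a single 64-bit memory channel running DDR266 or DDR333.

```python
# Peak bandwidth of single-channel DDR vs. the 533MHz FSB.
# Assumptions (ours, for illustration): 64-bit FSB, single 64-bit DDR
# channel, DDR266 and DDR333 as the candidate memory speeds.

BUS_WIDTH_BYTES = 8  # 64-bit FSB and 64-bit DDR channel

# Nominal 133.33MHz clock, four transfers per clock.
FSB_533_MB_S = (400.0 / 3.0) * 4 * BUS_WIDTH_BYTES


def ddr_bandwidth_mb_s(transfers_mt_s: float, channels: int = 1) -> float:
    """Peak DDR bandwidth in MB/s for a given transfer rate in MT/s."""
    return transfers_mt_s * BUS_WIDTH_BYTES * channels


def shortfall(memory_mb_s: float, fsb_mb_s: float = FSB_533_MB_S) -> float:
    """Fraction of the FSB bandwidth that the memory bus cannot supply."""
    return max(0.0, 1.0 - memory_mb_s / fsb_mb_s)


if __name__ == "__main__":
    print(f"533MHz FSB: {FSB_533_MB_S:.0f}MB/s")
    for name, rate in (("DDR266", 800.0 / 3.0), ("DDR333", 1000.0 / 3.0)):
        bw = ddr_bandwidth_mb_s(rate)
        print(f"{name}: {bw:.0f}MB/s, {shortfall(bw):.0%} short of the FSB")
```

Under these assumptions a single DDR channel covers somewhere between half and roughly five eighths of what the 533MHz FSB can carry, which is the gap the paragraph above refers to.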
AMD's chance to regain the performance lead could be had with the Thoroughbred if they ramp quickly enough since it will take more than an XP 2200+ running at 1.8GHz to take the lead away from Intel. But in all honesty we aren't waiting for the Athlon XP to ramp quickly, we're waiting for the inevitable Hammer based Athlon match up against the Pentium 4. With execution power exceeding that of the current Athlon XP, an extremely low-latency path to main memory, SSE2 support and a shipping clock speed of at least 2GHz there is much to expect from Hammer. There has already been talk of a 3400+ model rating for Hammer upon its launch later this year; with Intel targeting 3GHz for the Pentium 4 before the end of the year, it may be Hammer that is AMD's real chance at regaining the performance crown.
On the flip side of the coin we have an equally important issue to take into account: price. Although the 2.53GHz Pentium 4 trounces the Athlon XP 2100+, the chip is well over twice as expensive as the Athlon XP. AMD's Athlon line has almost always offered a significantly better value than Intel's competing solutions; whether it was competing against the Pentium III or the Pentium 4, the value was never understated and that's a good part of the reason why the processor has gained such great acceptance. The only thing that has changed now is that AMD is no longer at the top of the performance charts, but still within striking distance.
What will AMD strike with? Whether it be a Thoroughbred, a Barton or a Hammer it's clear that there's no getting rid of this worthy competitor. Until then, expect to see even faster Pentium 4s as Intel makes the seemingly effortless journey to the 3GHz mark. We say effortless because many overclockers have been running very close to that speed for quite a while now. Until the next die-shrink, enjoy.