Original Link: https://www.anandtech.com/show/158
AMD's Future: The need for the K6-3 and explaining Winstone 99
by Anand Lal Shimpi on November 28, 1998 2:26 PM EST - Posted in CPUs
With the recent review of the K6-2 400, much controversy has been brought to the attention of AnandTech, as well as readers all over the hardware world, as to what makes the K6-2 such a poor performer in comparison to the Pentium II, especially in Ziff Davis' most recently released benchmarking utility, Winstone 99. While it is often easier to say that the benchmark in question, in this case Winstone 99, is biased towards Intel processors, there is a much more accurate explanation for the gap.
In defense of ZDBOp, I must say that Winstone 99 is a much better benchmark than Winstone 98, simply because it is more true to real-world usage. How many of you can sit there and honestly say that you only use one application at a time? Often we're browsing the web, typing up a report, and doing much more than just running a single application. Winstone 99 tests just that. For example, instead of running Microsoft Word 97 and benchmarking a system's performance in that particular task like Winstone 98 used to do, Winstone 99 will open up Microsoft Word, Microsoft Excel, Microsoft Access, and PowerPoint as well as Netscape Navigator, and run tests in all of the applications together. If that seems a little extreme, it also does the same for Corel WordPerfect running alongside Corel Quattro Pro, once again with Netscape open and browsing the web. The software switches between the open applications and tests the system's performance in multitasking environments, which is much more realistic than measuring performance in one area alone, such as word processing. (If all you are using your computer for is word processing, then you shouldn't really be concerned with a K6-2 400; a simple Cyrix or a regular K6 will do.)
Now it is quite obvious, from the first scores published from the benchmark, that processors with slower L2 caches perform much worse under Winstone 99 than they did under Winstone 98. Where the difference between a K6-2 running at 400MHz and a Pentium II running at 400MHz under Winstone 98 was five tenths of a point, the difference is a full 1.5 points under Winstone 99. If you compare Winstone 97 to Winstone 98, for example, you'll notice that Intel processors seem to score much higher on Winstone 98 in comparison to non-Intel (i.e. AMD) processors...the same trend follows with Winstone 99, however there is an explanation for it under Winstone 99. The Pentium II and Celeron A run their L2 caches at speeds well above 100MHz, and those speeds rise along with the core clock. For every clock speed increase the K6-2 goes through, the Pentium II and Celeron A therefore gain more from the equivalent increase. Why? Let's take a look at the following table to find out:
Table 1: Cache Clock Speed Increase vs Processor Speed Increase

| Processor | L1 Cache @ 300MHz (MHz) | L2 Cache @ 300MHz (MHz) | L1 Cache @ 450MHz (MHz) | L2 Cache @ 450MHz (MHz) | L1 Cache % Increase | L2 Cache % Increase |
|---|---|---|---|---|---|---|
| AMD K6-2 | 300 | 100 | 450 | 100 | +50% | +0% |
| Intel Celeron A | 300 | 300 | 450 | 450 | +50% | +50% |
| Intel Pentium II | 300 | 150 | 450 | 225 | +50% | +50% |
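The percentages in Table 1 follow from simple arithmetic once each processor's L2 clocking rule is fixed. The sketch below is only an illustration of that arithmetic: the function names and the "bus", "half" and "full" labels are made up for this example, and the rules themselves (a fixed 100MHz for the K6-2, half the core clock for the Pentium II, the full core clock for the Celeron A) are read straight off the table.

```python
# Illustrative only: names and mode labels are assumptions, not from any tool.
K6_2_L2_MHZ = 100  # Table 1: the K6-2's L2 cache stays at 100MHz


def l2_clock(core_mhz, l2_mode):
    """Return the L2 cache clock in MHz for a given core clock."""
    if l2_mode == "bus":   # K6-2: fixed, independent of core clock
        return K6_2_L2_MHZ
    if l2_mode == "half":  # Pentium II: half the core clock
        return core_mhz / 2
    if l2_mode == "full":  # Celeron A: same as the core clock
        return core_mhz
    raise ValueError(f"unknown L2 mode: {l2_mode}")


def percent_increase(old, new):
    """Percentage change going from old to new."""
    return (new - old) / old * 100


PROCESSORS = {
    "AMD K6-2": "bus",
    "Intel Celeron A": "full",
    "Intel Pentium II": "half",
}


def table_rows(low=300, high=450):
    """Build (name, L1 % increase, L2 % increase) for each processor."""
    rows = []
    for name, mode in PROCESSORS.items():
        l1_gain = percent_increase(low, high)  # L1 runs at core clock
        l2_gain = percent_increase(l2_clock(low, mode), l2_clock(high, mode))
        rows.append((name, l1_gain, l2_gain))
    return rows


for name, l1_gain, l2_gain in table_rows():
    print(f"{name}: L1 +{l1_gain:.0f}%, L2 +{l2_gain:.0f}%")
```

Only the K6-2 row shows no L2 gain, because its L2 clock does not depend on the core clock at all.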
History has shown us that the presence and performance of the L2 cache truly become apparent under business applications, not games. This can be seen by comparing a cacheless Celeron running at 300MHz to a Celeron A outfitted with 128KB of L2 cache, also running at 300MHz: the result is two different pictures in business and gaming situations. Under Quake 2 there is very little difference between the two processors, whereas under Winstone the difference is far larger.
[Chart: Business Winstone Comparison - The Effect of Cache]
The reason such a performance difference is present under business applications is that those applications often repeat certain operations over and over again, and their working sets can usually fit into the L2 cache of the system, whereas 3D games, like Quake 2, usually consist of FPU intensive math calculations with very little redundancy in terms of operations. The above comparison, taken from the Intel Celeron A Review, also shows that the faster L2 cache of the Celeron A, although it is 1/4 of the size of the Pentium II's L2 cache, does provide greater overall system performance due to its sheer clock speed advantage.
With that said, it should now make much more sense that a system with a faster L2 cache would perform much better than a system with a slower L2 cache. If you look at the K6-2, with its L2 cache operating at 100MHz regardless of the clock speed increase, its performance under Winstone (98 or 99) is going to increase with clock speed, however each further increase in clock speed is going to provide a diminishing return. Intel made a smart decision by including the L2 cache on the cartridge of the Pentium II: as the speed of the processor increases, the speed of the L2 cache also increases, making every step up in clock speed correspond with a hefty increase in overall business application performance as well. The downside to this approach is, naturally, cost, however as Intel proved with the integrated L2 cache of the Celeron A, such an approach can be made cost effective. For comparison's sake, let's see how the table from above changes with the introduction of the K6-3, which will feature a full 256KB of L2 cache operating at clock speed à la the Celeron A:
Table 2: Cache Clock Speed Increase vs Processor Speed Increase w/ K6-3

| Processor | L1 Cache @ 300MHz (MHz) | L2 Cache @ 300MHz (MHz) | L1 Cache @ 450MHz (MHz) | L2 Cache @ 450MHz (MHz) | L1 Cache % Increase | L2 Cache % Increase |
|---|---|---|---|---|---|---|
| AMD K6-2 | 300 | 100 | 450 | 100 | +50% | +0% |
| AMD K6-3 | 300 | 300 | 450 | 450 | +50% | +50% |
| Intel Celeron A | 300 | 300 | 450 | 450 | +50% | +50% |
| Intel Pentium II | 300 | 150 | 450 | 225 | +50% | +50% |
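Another way to read Table 2 is to ask how many core clock cycles go by during a single L2 cache clock cycle. The short sketch below computes that ratio from the clock figures in the table. The function name `clock_ratio` and the intermediate 350MHz and 400MHz steps are assumptions added for illustration, and the ratio is a rough indicator of how far the cache lags the core, not a benchmark score.

```python
def clock_ratio(core_mhz, l2_mhz):
    """Core clock cycles that elapse during one L2 cache clock cycle."""
    return core_mhz / l2_mhz


# 300 and 450 come from Table 2; 350 and 400 are illustrative in-between steps.
for core in (300, 350, 400, 450):
    k6_2 = clock_ratio(core, 100)        # K6-2: L2 fixed at 100MHz
    k6_3 = clock_ratio(core, core)       # K6-3 / Celeron A: L2 at core clock
    p2 = clock_ratio(core, core / 2)     # Pentium II: L2 at half core clock
    print(f"{core}MHz core: K6-2 {k6_2:.1f}, K6-3 {k6_3:.1f}, Pentium II {p2:.1f}")
```

The K6-2's ratio climbs from 3.0 at 300MHz to 4.5 at 450MHz, while the ratios for the K6-3, Celeron A and Pentium II stay constant at every clock speed.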
This should paint a more vivid picture of why AMD had to release the K6-3 in order to survive until the release of the K7. With each step towards a higher clock speed, the performance difference between the P2/Celeron A processors and the K6-2 would increase, to the point where the K6-2 would eventually become a noticeably slower alternative. In order to avoid that, AMD chose to integrate 256KB of L2 cache onto the K6-2 and slap on the K6-3 label. A smart move by AMD, however the real question will be whether or not it'll make it into the hands of the consumer in time to be effective.