Original Link: https://www.anandtech.com/show/9533/intel-i7-6700k-overclocking-4-8-ghz
The Intel Skylake i7-6700K Overclocking Performance Mini-Test to 4.8 GHz
by Ian Cutress on August 28, 2015 2:30 PM EST

At the time of our Skylake review of both the i7-6700K and the i5-6600K, due to the infancy of the platform and other constraints, we were unable to probe how performance scales as the processors are overclocked. Our overclock testing showed that 4.6 GHz was a reasonable marker for our processors; however, fast forward two weeks and that all seems to change as updates are released. With a new motherboard and the same liquid cooler, the same processor that previously topped out at 4.6 GHz reached 4.8 GHz with relative ease. In this mini-test, we ran our short-form CPU workload as well as integrated and discrete graphics tests at several frequencies to see where the real gains are.
In the Skylake review we stated that 4.6 GHz still represents a good target for overclockers to aim for, with 4.8 GHz being indicative of a better sample. Both ASUS and MSI stated similar expectations in the press guides that accompanied our samples, although as with any launch, those expectations shift as understanding of the platform evolves over time.
In this mini-test (performed initially in haste pre-IDF, with extra testing after analysing the IGP data), I called on a pair of motherboards – ASUS's Z170-A and ASRock's Z170 Extreme7+ – to provide a four-point frequency scale for our benchmarks. Starting at the i7-6700K's 4.2 GHz turbo frequency, we tested every 200 MHz jump up to 4.8 GHz in both our shortened CPU testing suite as well as iGPU and GTX 980 gaming. Enough of the babble – time for fewer words and more results!
We actually got the CPU to 4.9 GHz, as shown on the right, but it was pretty unstable for even basic tasks.
(Voltage is read incorrectly on the right.)
OK, a few more words before results – all of these numbers can be found in our overclocking database Bench alongside the stock results and can be compared to other processors.
Test Setup
| Test Setup | |
|---|---|
| Processor | Intel Core i7-6700K (ES, Retail Stepping), 91W, $350, 4 Cores / 8 Threads, 4.0 GHz base (4.2 GHz Turbo) |
| Motherboards | ASUS Z170-A, ASRock Z170 Extreme7+ |
| Cooling | Cooler Master Nepton 140XL |
| Power Supply | OCZ 1250W Gold ZX Series, Corsair AX1200i Platinum |
| Memory | Corsair DDR4-2133 C15 2x8 GB 1.2V or G.Skill Ripjaws 4 DDR4-2133 C15 2x8 GB 1.2V |
| Memory Settings | JEDEC @ 2133 |
| Video Cards | ASUS GTX 980 Strix 4GB, ASUS R7 240 2GB |
| Hard Drive | Crucial MX200 1TB |
| Optical Drive | LG GH22NS50 |
| Case | Open Test Bed |
| Operating System | Windows 7 64-bit SP1 |
The dynamics of CPU Turbo modes, both Intel and AMD, can cause concern in environments with a variable threaded workload. There is also the added issue of motherboard consistency, depending on how the manufacturer layers its own boosting technologies over the ones Intel would prefer it used. In order to remain consistent, we implement an OS-level unique high performance mode on all the CPUs we test, which should override any motherboard manufacturer performance mode.
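For reference, a minimal sketch of how such an OS-level override can be applied on Windows is shown below, assuming the built-in powercfg utility is available; SCHEME_MIN is the alias Windows uses for its High performance plan. This is illustrative of the idea rather than our exact test harness.

```python
# Minimal sketch: force the Windows "High performance" power plan before benchmarking,
# so OS-level frequency scaling does not mask motherboard or CPU behaviour.
# SCHEME_MIN is powercfg's built-in alias for the High performance plan.
import subprocess

def force_high_performance():
    subprocess.run(["powercfg", "/setactive", "SCHEME_MIN"], check=True)

if __name__ == "__main__":
    force_high_performance()
```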
Many thanks to...
We must thank the following companies for kindly providing hardware for our test bed:
Thank you to AMD for providing us with the R9 290X 4GB GPUs.
Thank you to ASUS for providing us with GTX 980 Strix GPUs and the R7 240 DDR3 GPU.
Thank you to ASRock and ASUS for providing us with some IO testing kit.
Thank you to Cooler Master for providing us with Nepton 140XL CLCs.
Thank you to Corsair for providing us with an AX1200i PSU.
Thank you to Crucial for providing us with MX200 SSDs.
Thank you to G.Skill and Corsair for providing us with memory.
Thank you to MSI for providing us with the GTX 770 Lightning GPUs.
Thank you to OCZ for providing us with PSUs.
Thank you to Rosewill for providing us with PSUs and RK-9100 keyboards.
Frequency Scaling
Below is an example of our results from overclock testing, in the table format we publish in both processor and motherboard reviews. Our testing involves setting a multiplier and a voltage, running some stress tests, and then either raising the multiplier if successful or increasing the voltage at the point of failure or a blue screen. This methodology has worked well as a quick and dirty way to determine a maximum frequency, though it lacks the subtlety that seasoned overclockers might apply in order to improve performance.
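As a rough illustration, the quick-and-dirty scan reads something like the sketch below. The two helper functions are hypothetical placeholders: in practice the multiplier and voltage are set in the BIOS and the stress run is done by hand, with a blue screen counting as a failure.

```python
# Rough sketch of the quick-and-dirty overclock scan described above. The two helpers
# are hypothetical placeholders, standing in for manual BIOS changes and stress runs.

def apply_overclock(multiplier: int, voltage: float) -> None:
    """Placeholder: in reality this is a manual BIOS change, not a function call."""
    print(f"Set {multiplier}x multiplier at {voltage:.3f} V")

def run_stress_test() -> bool:
    """Placeholder: an OCCT/POV-Ray style run; return True if the system stays stable."""
    return False

def find_max_stable(start_multiplier=42, start_voltage=1.20,
                    max_voltage=1.40, voltage_step=0.025):
    multiplier, voltage = start_multiplier, start_voltage
    best = None
    while voltage <= max_voltage:
        apply_overclock(multiplier, voltage)
        if run_stress_test():
            best = (multiplier, voltage)   # stable: record it and try the next bin up
            multiplier += 1
        else:
            voltage += voltage_step        # failure/blue screen: feed it more voltage
    return best
```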
This was done on our ASUS Z170-A sample while it was being tested for review. When we applied ASUS's automatic overclocking tool, Auto-OC, it finalized an overclock at 4.8 GHz. This was higher than what we had seen with the same processor previously (even with the same cooler), so naturally I was skeptical, as ASUS Auto-OC has been rather optimistic in the past. But it sailed through our standard stability tests easily, without dropping the overclock once, meaning that it was not overheating by any means. As a result, I ran our short-form CPU tests through a recently developed automated script as an extra measure of stability.
These tests run in order of time taken, so last up was HandBrake converting a low quality film followed by a high quality 4K60 film. In low quality mode, all was golden. At 4K60, the system blue screened. I triple-checked with the same settings to confirm it wasn't going to get through, and three blue screens makes a strike out. But therein lies a funny thing – while this configuration was stable with our regular mixed-AVX test, the large-frame HandBrake conversion made it fall over.
So as part of this testing, from 4.2 GHz to 4.8 GHz, I ran our short-form CPU tests over and above the regular stability tests. These form the basis of the results in this mini-test. Lo and behold, it failed at 4.6 GHz as well in similar fashion – AVX in OCCT OK, but HandBrake large frame not so much. I looped back with ASUS about this, and they confirmed they had seen similar behavior specifically with HandBrake as well.
Users and CPU manufacturers tend to view stability in one of two ways. The basic way is as a pure binary yes/no. If the CPU ever fails in any circumstance, it is a no. When you buy a processor from Intel or AMD, that rated frequency is in the yes column (if it is cooled appropriately). This is why some processors seem to overclock like crazy from a low base frequency – because at that frequency, they are confirmed as working 100%. A number of users, particularly those who enjoy strangling a poor processor with Prime95 FFT torture tests for weeks on end, also take on this view. A pure binary yes/no is also hard for us to test in a time limited review cycle.
The other way of approaching stability is the sliding scale. At some point, the system is ‘stable enough’ for all intents and purposes. This is the situation we have here with Skylake – if you never go within 10 feet of HandBrake but enjoy gaming with a side of YouTube and/or streaming, or perhaps need to convert a few dozen images into a 3D model then the system is stable.
To that end, ASUS is implementing a new feature in its automatic overclocking tool. Along with the list of stress test and OC options, an additional checkbox for HandBrake style data paths has been added. This will mean that a system needs more voltage to cope, or will top out somewhere else. But the sliding scale has spoken.
Incidentally, at IDF I spoke to Tom Vaughn, VP of MultiCoreWare (which develops the open source x265 HEVC video encoder and its accompanying GUI interface). We discussed video transcoding, and I brought up this issue on Skylake. He stated that the issue was well known by MultiCoreWare for overclocked systems. Despite the prevalence of AVX testing software, x265 encoding with the right algorithms will push parts of the CPU beyond all others, and with large frames it can require large amounts of data to be pushed around the caches at the same time, offering further permutations for stability. We also spoke about expanding our x265 tests, covering best case/worst case scenarios from a variety of file formats and sources, in an effort to pinpoint where stability can be a factor as well as overall performance. These might be integrated into future overclocking tests, so stay tuned.
CPU Tests on Windows: Professional
Cinebench R15
Cinebench is a benchmark based around Cinema 4D, and is fairly well known among enthusiasts for stressing the CPU for a provided workload. Results are given as a score, where higher is better.
Agisoft Photoscan – 2D to 3D Image Manipulation: link
Agisoft Photoscan creates 3D models from 2D images, a process which is very computationally expensive. The algorithm is split into four distinct phases, and different phases of the model reconstruction require either fast memory, fast IPC, more cores, or even OpenCL compute devices to hand. Agisoft supplied us with a special version of the software to script the process, where we take 50 images of a stately home and convert it into a medium quality model. This benchmark typically takes around 15-20 minutes on a high end PC on the CPU alone, with GPUs reducing the time.
Rendering – PovRay 3.7: link
The Persistence of Vision RayTracer, or PovRay, is a freeware package for, as the name suggests, ray tracing. It is a pure renderer rather than modeling software, but the latest beta version contains a handy benchmark for stressing all processing threads on a platform. We have been using this test in motherboard reviews to test memory stability at various CPU speeds to good effect – if it passes the test, the IMC in the CPU is stable for a given CPU speed. As a CPU test, it runs for approximately 2-3 minutes on high end platforms.
HandBrake v0.9.9 LQ: link
For HandBrake, we take a 2h20 640x266 DVD rip and convert it to the x264 format in an MP4 container. Results are given in terms of the frames per second processed, and HandBrake uses as many threads as possible.
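A rough command-line equivalent of this conversion, driven here through Python, is sketched below. The HandBrakeCLI flags shown (-i, -o, -e x264) are the common ones, but exact options vary between HandBrake versions, and the file paths are placeholders rather than our actual test files.

```python
# Rough equivalent of the low-quality HandBrake test: convert a DVD rip to x264 in an
# MP4 container via HandBrakeCLI. Flags shown are the common ones; exact options may
# differ between HandBrake versions, and the paths here are placeholders.
import subprocess

def convert_lq(source="dvd_rip.mkv", output="converted.mp4"):
    subprocess.run(
        ["HandBrakeCLI", "-i", source, "-o", output, "-e", "x264"],
        check=True,
    )
```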
Conclusions on Professional Performance
In all of our professional level tests, the gain from the overclock is pretty much as expected. Photoscan sometimes offers a differing perspective, partly due to run-to-run randomness in the implementation and partly because it affords a variable thread load depending on the stage. Not published here are the HandBrake results running at high quality (double 4K), because the system actually failed at 4.6 GHz and above. There is a separate page addressing this stability issue at the end of this mini-review.
CPU Tests on Windows: Office
WinRAR 5.0.1: link
Our WinRAR test from 2013 is updated to the latest version of WinRAR at the start of 2014. We compress a set of 2867 files across 320 folders totaling 1.52 GB in size – 95% of these files are small typical website files, and the rest (90% of the size) are small 30 second 720p videos.
3D Particle Movement
3DPM is a self-penned benchmark, taking basic 3D movement algorithms used in Brownian motion simulations and testing them for speed. High floating point performance, MHz and IPC win out in the single threaded version, whereas the multithreaded version has to handle thread management and benefits from more cores.
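The actual 3DPM code is not reproduced here, but the flavour of the workload is broadly like the sketch below: many particles each taking a random fixed-length 3D step over many timesteps, with floating point math dominating. This is purely illustrative and not the benchmark itself.

```python
# Illustrative only: a Brownian-motion style 3D particle movement kernel of the kind
# 3DPM stresses. Not the actual benchmark code - just the shape of the workload.
import math
import random

def move_particles(n_particles=10000, n_steps=100, step=1.0):
    particles = [[0.0, 0.0, 0.0] for _ in range(n_particles)]
    for _ in range(n_steps):
        for p in particles:
            # pick a random direction and take a fixed-length step
            theta = random.uniform(0.0, math.pi)
            phi = random.uniform(0.0, 2.0 * math.pi)
            p[0] += step * math.sin(theta) * math.cos(phi)
            p[1] += step * math.sin(theta) * math.sin(phi)
            p[2] += step * math.cos(theta)
    return particles
```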
FastStone Image Viewer 4.9
FastStone is the program I use to perform quick or bulk actions on images, such as resizing, adjusting for color and cropping. In our test we take a series of 170 images in various sizes and formats and convert them all into 640x480 .gif files, maintaining the aspect ratio. FastStone does not use multithreading for this test, and results are given in seconds.
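The same bulk conversion can be reproduced in a few lines with Pillow, shown below purely as an illustration of the workload; FastStone itself is a GUI tool and, as noted, single-threaded for this task. The folder names are placeholders.

```python
# Illustration of the FastStone-style bulk conversion: resize a folder of images to fit
# within 640x480 while keeping aspect ratio, then save them as GIFs. Pillow stands in
# for the GUI tool here; the source and destination paths are placeholders.
from pathlib import Path
from PIL import Image

def convert_folder(src="input_images", dst="output_gifs"):
    out_dir = Path(dst)
    out_dir.mkdir(exist_ok=True)
    for path in Path(src).iterdir():
        if not path.is_file():
            continue
        with Image.open(path) as img:
            img.thumbnail((640, 480))   # resizes in place, preserving aspect ratio
            img.convert("P").save(out_dir / (path.stem + ".gif"), "GIF")
```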
Synthetic – 7-Zip 9.2: link
As an open source compression tool, 7-Zip is a popular utility for making sets of files easier to handle and transfer. The software offers its own built-in benchmark, from which we report the result.
Conclusions on Office Benchmarks
Similar to the professional tests, the gains here are in-line with what we would expect with +200 MHz overclocks.
Linux Performance
C-Ray: link
C-Ray is a simple ray-tracing program that focuses almost exclusively on processor performance rather than DRAM access. The test in Linux-Bench renders a heavily complex scene, offering a large scalable scenario.
NAMD, Scalable Molecular Dynamics: link
Developed by the Theoretical and Computational Biophysics Group at the University of Illinois at Urbana-Champaign, NAMD is a set of parallel molecular dynamics codes for extreme parallelization up to and beyond 200,000 cores. The reference paper detailing NAMD has over 4000 citations, and our testing runs a small simulation where the calculation steps per unit time is the output vector.
NPB, Fluid Dynamics: link
Aside from LINPACK, there are many other ways to benchmark supercomputers in terms of how effective they are for various types of mathematical processes. The NAS Parallel Benchmarks (NPB) are a set of small programs originally designed for NASA to test their supercomputers in terms of fluid dynamics simulations, useful for airflow reactions and design.
Redis: link
Many online applications rely on key-value caches and data structure servers to operate. Redis is an open-source, scalable web technology with a strong developer base; it relies heavily on memory bandwidth as well as CPU performance.
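For context, the kind of key-value traffic Redis serves looks like the minimal redis-py snippet below, assuming a Redis server running locally on the default port; the benchmark itself simply hammers operations like these at far higher volume.

```python
# Minimal illustration of the key-value operations a Redis benchmark hammers.
# Assumes a Redis server on localhost:6379 and the redis-py client library.
import redis

r = redis.Redis(host="localhost", port=6379, db=0)
r.set("user:1000:name", "anandtech")   # simple SET
print(r.get("user:1000:name"))          # simple GET -> b'anandtech'
print(r.incr("page:views"))             # atomic counter increment
```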
Conclusions on Linux-Bench
Our Linux testing actually affords ten tests, but we chose the most important to publish here (the other results can be found in Bench). But here we see some slight differences when it comes to overclocks - the NPB tests rely on multi-dimensional matrix solvers, which are often more cache/memory dependent and thus a higher frequency processor doesn't always help. With Redis, we are wholly cache/memory limited here. The other results are in-line with CPU performance deltas over the overclock range.
Gaming Benchmarks: Integrated Graphics
Our regular benchmarks for processor reviews consist of Alien Isolation, Total War: Attila, Grand Theft Auto V, GRID: Autosport and Middle Earth: Shadow of Mordor. Rather than the full run of graphics cards from $70 and up, we are limiting here to just the low-end testing on integrated graphics and a full on ASUS GTX 980 Strix assault.
Integrated Graphics
Conclusions on Integrated Graphics
It is clear that an overclocked processor gives worse integrated graphics performance when the graphics setting is left on Auto. In some titles, a larger overclock has a larger effect, although 4.6 GHz and 4.8 GHz seem to give similar numbers. The key here is the power budget: by forcing the CPU to work harder, the IGP has to drop in performance to balance the total power, and as a result it needs to be forced to a fixed frequency. These CPUs are not necessarily purchased for their integrated graphics performance, however. Although if DirectX 12 titles take advantage of multi-adapter modes where the Intel IGP can contribute, finding the best balance between CPU and IGP could make for some interesting testing.
Gaming Benchmarks: Integrated Graphics Overclocked
Given the disappointing results on the Intel HD 530 graphics when the processor was overclocked, the tables were turned and we designed a matrix of both CPU and IGP overclocks to test in our graphics suite. So for this we still take the i7-6700K at 4.2 GHz to 4.8 GHz, but then also adjust the integrated graphics from 'Auto' to 1250, 1300, 1350 and 1400 MHz as per the automatic overclock options found on the ASRock Z170 Extreme7+.
Technically Auto should default to 1150 MHz, in line with what Intel has published as the maximum speed; however, the results on the previous page show that this is more of a see-saw operation when it comes to power distribution within the processor. With any luck, explicitly setting the integrated graphics frequency should maintain that frequency throughout the benchmarks. With the CPU overclocks as well, we can see how performance scales with added CPU frequency.
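The resulting test matrix is simply the cross product of the CPU and IGP settings described above – a quick sketch of how it enumerates:

```python
# Sketch of the CPU/IGP test matrix used for the overclocked IGP runs: every CPU
# frequency is paired with every IGP setting, giving 4 x 5 = 20 configurations per game.
# "Auto" is the default power-shared behaviour; the rest are fixed frequencies in MHz.
from itertools import product

cpu_freqs_ghz = [4.2, 4.4, 4.6, 4.8]
igp_settings = ["Auto", 1250, 1300, 1350, 1400]

test_matrix = list(product(cpu_freqs_ghz, igp_settings))
for cpu, igp in test_matrix:
    print(f"CPU {cpu} GHz, IGP {igp}")
```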
Results for these benchmarks will be provided in matrix form, both as absolute numbers and as a percentage compared to the 4.2 GHz CPU + Auto IGP reference value. Test settings are the same as the previous set of data for IGP.
Absolute numbers
Percentage Deviation from 4.2 GHz / Auto
Conclusions on Overclocking the IGP
It becomes pretty clear that by fixing the frequency of the integrated graphics, the detrimental effect of overclocking the processor is for the most part removed – or at least any remaining reduction in performance falls within standard error. In three of the games, fixing the integrated graphics at 1250 MHz nets a ~10% boost in performance, which for titles like Attila extends to 23% at 1400 MHz. By contrast, GTA V shows only a small gain, indicating that we are perhaps limited in other ways.
Gaming Benchmarks: Discrete Graphics
Our regular benchmarks for processor reviews consist of Alien Isolation, Total War: Attila, Grand Theft Auto V, GRID: Autosport and Middle Earth: Shadow of Mordor. Rather than the full run of graphics cards from $70 and up, we are limiting here to just the low-end testing on integrated graphics and a full on ASUS GTX 980 Strix assault.
NVIDIA GTX 980
Conclusions on NVIDIA GTX 980
In our discrete GPU testing, it is clear that the extra CPU frequency makes very little difference in the games tested.
Conclusions
The how, what and why questions that surround overclocking often result in answers that either confuse or dazzle, depending on the mind-set of the user listening. At the end of the day, it originated from trying to get extra performance for nothing. Buying the low-end, cheaper processors and changing a few settings (or an onboard timing crystal) would result in the same performance as a more expensive model. When we were dealing with single core systems, the speed increase was immediate. With dual core platforms, there was a noticeable difference as well, and overclocking gave the same performance as a high end component. This was noticeable particularly in games which would have CPU bottlenecks due to single/dual core design. However in recent years, this has changed.
Intel sells mainstream processors in both dual and quad core flavors, each with a subset that enables hyperthreading and some other distinctions. This affords five tiers – Celeron, Pentium, i3, i5 and i7, going from weakest to strongest. Overclocking is now solely reserved for the most extreme i5 and i7 processors. Overclocking in this sense now means taking the highest performance parts even further, and there is no recourse to go from low end to high end – extra money has to be spent in order to do so.
As an aside, in 2014 Intel released the Pentium G3258, an overclockable dual core processor without hyperthreading. When we tested it, it overclocked to a nice high frequency and performed in single threaded workloads as expected. However, a dual core processor is not a quad core, and even with a +50% increase in frequency it cannot make up for a +100% or +200% deficit in threads against the high end i5 and i7 processors. With software and games now taking advantage of multiple cores, having too few cores is the bottleneck, not frequency. Unfortunately you cannot graft on extra silicon as easily as pressing a few buttons.
One potential avenue is to launch an overclockable i3 processor, using a dual core with hyperthreading, which might perform on par with an i5 even though it offers hyperthreads rather than additional physical cores. But if it performed well, it might draw sales away from the high end overclocking processors, and with no competition in this space, I doubt we will see it any time soon.
But what exactly does overclocking the highest performing processor actually achieve? Our results, including all the ones in Bench not specifically listed in this piece, show improvements across the board in all our processor tests.
Here we get three very distinct categories of results. Each +200 MHz step is roughly a 5% frequency increase (200 MHz on a 4.2 GHz base is ~4.8%); in our CPU tests each step yields nearer 4%, and slightly less in our Linux Bench. In both cases there were benchmarks that brought the average down due to other bottlenecks in the system: Photoscan Stage 2 (the complex multithreaded stage) was variable, and in Linux Bench both NPB and Redis-1 gave results that were more DRAM-limited. Remove these and the results get closer to the true percentage gain.
Meanwhile, all of our i7-6700K overclocked results are now also available in Bench, allowing direct comparison to other processors. Other CPUs, when overclocked, will be updated in due course.
Moving on to our discrete testing on a GTX 980: the increased frequency had little impact on our series of games at 1080p, or even on Shadow of Mordor at 4K. Some might argue that this is to be expected, because at high settings the onus is more on the graphics card – but ultimately with a GTX 980 you would be running at 1080p or better at maximum settings where possible.
Finally, the integrated graphics results are a significantly different ball game. When we left the IGP at its default frequency and just overclocked the processor, average frame rates declined despite the higher CPU frequency, which is counterintuitive at first glance. The explanation lies in power delivery budgets: when overclocked, more of the power budget is pushed through to the CPU cores and work is processed more quickly. This leaves less of the power budget within the silicon for the integrated graphics, either resulting in lower graphics frequencies to keep the total in check, or in the increased graphical traffic over the DRAM-to-CPU bus causing a memory latency bottleneck. Think of it like a see-saw: push harder on the CPU side and the IGP side comes down. Normally this would be mitigated by increasing the power limit on the processor as a whole in the BIOS, however in this case that had no effect.
When we fixed the integrated graphics frequencies however, this issue disappeared.
Taking Shadow of Mordor as the example, raising the graphics frequency via the presets provided on the ASRock motherboard not only gave a boost in performance, but also removed the issue of balancing power between the processor and the graphics – our results were within expected variance.