Original Link: https://www.anandtech.com/show/10959/intel-launches-7th-generation-kaby-lake-i7-7700k-i5-7600k-i3-7350k
Intel Launches 7th Generation Kaby Lake: 15W/28W with Iris, 35-91W Desktop and Mobile Xeon
by Ian Cutress on January 3, 2017 12:01 PM EST
The death of Intel’s ‘Tick-Tock’ means that Kaby Lake is Intel’s third crack at their 14nm process. 14nm started with Broadwell (5th Gen, tick), introduced a new microarchitecture with Skylake (6th Gen, tock), and now is in the ‘optimization’ stage with Kaby Lake (7th Gen). This means an improved ‘14nm Plus’, offering better power efficiency and higher frequencies through a less strained transistor floorplan. Intel is launching a myriad of SKUs under Kaby Lake, ranging from mobile KBL-U at 15W and 28W through mobile KBL-H at 45W and desktop-class KBL-S at 35W to 91W. This includes three overclocking SKUs for desktop, including an i3 variant. Here’s the front page of AnandTech’s Kaby Lake launch coverage.
Kaby Lake
Despite what has been released elsewhere on the internet, today marks the second official launch of Kaby Lake, the first being back in September with six mobile processors which currently feature in the premium notebook and mini-PC categories. As a ‘trial run’, these six processors have shown promise and as a result another 25+ are coming to the wider market.
The main features for Kaby Lake include support for Optane Memory, a range of new 200-series chipsets to support the processors, an update to the Gen9 graphics featuring HEVC Main10 support and other lower power hardware accelerated video playback, as well as adjustments to the underlying silicon to afford a better voltage frequency curve.
Tick Tock Boom
Intel’s ‘Tick-Tock’ cadence has disappeared, as we’ve reported on several times previously, and Kaby Lake is the first wave of Intel’s ‘Optimization’ step in their ‘Process, Architecture, Optimization’ release structure. The goals of the first two steps in that trio have been well documented over the last decade or so of Intel releases: a process change means a shrink in the minimum feature size of the silicon (and arguably represents the main business focus of Intel’s R&D), such as from 90nm to 65nm or 22nm to 14nm, and ‘architecture’ indicates an improvement to the underlying microarchitecture, typically taking advantage of the new process introduced in the previous step. Both of these steps, barring a significant paradigm shift in the microarchitecture, have yielded 5-15% performance jumps each iteration. Also, based on yield, typically the smaller chips are the ones to come to market first.
The Optimization step is a relative unknown, as the term is somewhat definable in many different contexts when it comes to semiconductor design. Optimization could be an adjustment to the base microarchitecture giving it more support, or an adjustment in the silicon manufacturing process giving better efficiency, or it could be a different set of SKUs for a changing market, or it could be an updated internal graphics implementation. There are many ways in which Intel could play the optimization card, and for the launch of their 7th generation processors, it comes across as a number of features.
At this point I should mention the Devil’s Canyon launch of the i5-4690K and i7-4790K, which were out-of-cycle launches of new overclocking products. While Intel marketed this as an ‘optimization’ of the current design, it technically wasn’t part of the PAO cadence. Most of the Devil’s Canyon optimization was around heat management rather than any other significant shift, whereas the ‘Optimization’ label for Kaby Lake is an actual physical change to the silicon.
The first one to make note of though, in which our readers are probably most interested, is that while there is a change to the silicon, there is no underlying change to the microarchitecture. As a result, Intel are not promoting any performance difference with the previous generation (Skylake, 6th Gen, SKL) at a given frequency. They are however promoting better efficiency, which means that it should do the same work at lower power, or offer better points in the voltage frequency curve. The motto should be ‘do more with less’.
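As a rough illustration of what a better voltage frequency curve buys, here is a minimal sketch using the textbook first-order model of CMOS switching power, P = C x V^2 x f. The capacitance, voltage, frequency and the 5% voltage reduction are all illustrative assumptions rather than Intel figures, and the model ignores leakage.

```python
def dynamic_power(capacitance, voltage, frequency_hz):
    """First-order CMOS switching power: P = C * V^2 * f (activity factor folded into C)."""
    return capacitance * voltage ** 2 * frequency_hz

# Hypothetical baseline operating point (not an Intel specification).
C_EFF = 1.0e-9   # effective switched capacitance in farads, assumed
BASE_V = 1.20    # volts, assumed
BASE_F = 4.0e9   # hertz, assumed
IMPROVED_V = BASE_V * 0.95  # assume the tuned process needs 5% less voltage

baseline = dynamic_power(C_EFF, BASE_V, BASE_F)

# Option 1: same frequency at the lower voltage, so less power.
same_freq = dynamic_power(C_EFF, IMPROVED_V, BASE_F)
power_saving = 1 - same_freq / baseline

# Option 2: spend the saving on clock speed within the original power budget.
# This is an upper bound, since real silicon needs more voltage to clock higher.
freq_at_same_power = baseline / (C_EFF * IMPROVED_V ** 2)
freq_gain = freq_at_same_power / BASE_F - 1

print(f"Same frequency, 5% lower voltage: {power_saving:.1%} less power")
print(f"Same power, 5% lower voltage: up to {freq_gain:.1%} higher frequency")
```

Because voltage is squared in the model, a small voltage reduction gives roughly double that percentage in power saving, which is why a tweak to the process can matter even with an unchanged microarchitecture.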
As we’ll explain in this piece, similar to the six processors launched back in September, the main benefits for Kaby Lake over previous generations of Intel CPUs will be in the feature set, and to liven things up, what SKUs are on offer.
Today’s Launch: 15W, 28W, 35W, 45W, 51W, 60W, 65W, 91W
Intel defines its processor product lines in terms of segments: Y, U, H and S, which we’ll associate via KBL-Y, KBL-U, KBL-H and KBL-S. Recent changes in Intel’s processor naming scheme have made it harder to determine which segment a processor might be from without looking directly at the TDP or the core arrangement, but here’s a slide to show the base layout:
The Y series, using the Core m3, Core i5/i7 and Core i5/i7 vPro nomenclature, are the 4.5W dual-core processors with HyperThreading that are focused on small and light mobile PCs, such as 2-in-1s or premium lightweight laptops, or Compute Stick-like devices. The reason they get so low in power is often due to their super low base frequency, and with such a low TDP rating this enables these processors to be placed in laptops and have a good chunk of battery removed to save weight but still offer decent battery life.
Intel Kaby Lake YUH SKUs
SKU | Cores/Threads | Base/Turbo (GHz) | IGP | L3 | eDRAM | TDP | Cost |
Y-Series | |||||||
i7-7Y75 | 2/4 | 1.3/3.6 | HD 615 | 4 MB | - | 4.5 W | $353 |
i5-7Y57 | 2/4 | 1.2/3.3 | HD 615 | 4 MB | - | 4.5 W | $253 |
i5-7Y54 | 2/4 | 1.2/3.2 | HD 615 | 4 MB | - | 4.5 W | $253 |
m3-7Y30 | 2/4 | 1.0/2.0 | HD 615 | 4 MB | - | 4.5 W | $253 |
U-Series | |||||||
i7-7660U | 2/4 | 2.5/4.0 | Iris Plus 640 | 4 MB | 64 MB | 15 W | $373 |
i7-7600U | 2/4 | 2.8/3.9 | HD 620 | 4 MB | - | 15 W | $353 |
i7-7567U | 2/4 | 3.5/4.0 | Iris Plus 650 | 4 MB | 128 MB ? | 28 W | $373 |
i7-7560U | 2/4 | 2.4/3.8 | Iris Plus 640 | 4 MB | 64 MB | 15 W | $373 |
i7-7500U | 2/4 | 2.7/3.5 | HD 620 | 4 MB | - | 15 W | $353 |
i5-7360U | 2/4 | 2.3/3.6 | Iris Plus 640 | 4 MB | 64 MB | 15 W | $273 |
i5-7300U | 2/4 | 2.6/3.5 | HD 620 | 3 MB | - | 15 W | $253 |
i5-7200U | 2/4 | 2.5/3.1 | HD 620 | 3 MB | - | 15 W | $253 |
i5-7287U | 2/4 | 3.3/3.7 | Iris Plus 650 | 4 MB | 128 MB ? | 28 W | $273 |
i5-7267U | 2/4 | 3.1/3.5 | Iris Plus 650 | 4 MB | 128 MB ? | 28 W | $273 |
i5-7260U | 2/4 | 2.2/3.4 | Iris Plus 640 | 4 MB | 64 MB | 15 W | $273 |
i3-7167U | 2/4 | 2.8 | Iris Plus 650 | 3 MB | 128 MB ? | 28 W | $273 |
i3-7100U | 2/4 | 2.4 | HD 620 | 3 MB | - | 15 W | $253 |
H Series | |||||||
E3-1535M v6 | 4/8 | 3.1/4.2 | HD P630 | 8 MB | - | 45 W | $560 |
E3-1505M v6 | 4/8 | 3.0/4.0 | HD 630 | 8 MB | - | 45 W | $390 |
i7-7920HQ | 4/8 | 3.1/4.1 | HD 630 | 8 MB | - | 45 W | $510 |
i7-7820HQ | 4/8 | 2.9/3.9 | HD 630 | 8 MB | - | 45 W | $340 |
i7-7820HK | 4/8 | 2.9/3.9 | HD 630 | 8 MB | - | 45 W | $340 |
i7-7700HQ | 4/8 | 2.8/3.8 | HD 630 | 6 MB | - | 45 W | $340 |
i5-7440HQ | 4/4 | 2.8/3.8 | HD 630 | 6 MB | - | 45 W | $225 |
i5-7300HQ | 4/4 | 2.5/3.5 | HD 630 | 6 MB | - | 45 W | $225 |
i3-7100H | 2/4 | 3.0 | HD 630 | 3 MB | - | 35 W | $210 |
The U series are at the 15W and 28W mark, still offering two cores with hyperthreading, but with a much higher base clock. These are often cheaper than the Y-series as their voltage/frequency profile does not need to hit such a stringent set of requirements, and these are the parts users will likely see in cheaper designs, or premium gaming notebooks that are not so thick. Some of the U series parts will use an additional on-package chip called eDRAM, either 64 MB or 128 MB, which will act like a transparent DRAM buffer to main memory. This mostly affects graphics performance, and these parts will fall under the Iris Plus branding. The eDRAM use model, as we’ll re-explain in a bit, is different to previous generations of Intel’s eDRAM.
(Astute users will note that this is Iris Plus, rather than just plain Iris or Iris Pro as in previous generations. Intel made clear in our pre-briefings that there will be no Iris Pro products at this launch (which isn’t to say there will not be in the future), however as things stand Iris Plus will be the branding on KBL-U series products that use eDRAM. No exact reason was given for the change, except that it will help differentiate between Skylake parts with Iris grade graphics, but whether this is enough of a reason to adjust the branding remains to be seen).
The H series processors are mobile soldered (BGA) parts that tip the TDP scales at 45W, and the goal of these parts is to provide the top performance to mobile devices that require it. Intel is keen to promote the H-series under its new ‘VR Ready’ branding, indicating that these processors in the right devices will have enough grunt for virtual reality systems that have suitable graphics. The H-series will be wide ranging in parts and performance, as well as bringing in mobile workstation-class Xeon processors.
The S series are the socketable parts for desktops. Just a reminder, that’s ‘S for deSktop’. For users following the desktop line of processors, there is nothing particularly out of the ordinary here to start. Three i7 parts with four cores and hyperthreading, one of which is unlocked and one is low power, followed by a few i5 quad-core parts that have the same trend of unlock and low power, as well as some dual-core with hyperthreading i3 parts.
Intel Kaby Lake S SKUs
SKU | Cores/Threads | Base/Turbo (GHz) | IGP | L3 | eDRAM | TDP | Cost |
i7-7700K | 4/8 | 4.2/4.5 | HD 630 | 8 MB | - | 91 W | $305 |
i7-7700 | 4/8 | 3.6/4.2 | HD 630 | 8 MB | - | 65 W | $272 |
i7-7700T | 4/8 | 2.9/3.8 | HD 630 | 8 MB | - | 35 W | $272 |
i5-7600K | 4/4 | 3.8/4.2 | HD 630 | 6 MB | - | 91 W | $217 |
i5-7600 | 4/4 | 3.5/4.1 | HD 630 | 6 MB | - | 65 W | $199 |
i5-7600T | 4/4 | 2.8/3.7 | HD 630 | 6 MB | - | 35 W | $199 |
i5-7500 | 4/4 | 3.4/3.8 | HD 630 | 6 MB | - | 65 W | $179 |
i5-7500T | 4/4 | 2.7/3.3 | HD 630 | 6 MB | - | 35 W | $179 |
i5-7400 | 4/4 | 3.0/3.5 | HD 630 | 6 MB | - | 65 W | $170 |
i5-7400T | 4/4 | 2.4/3.0 | HD 630 | 6 MB | - | 35 W | $170 |
i3-7350K | 2/4 | 4.2 | HD 630 | 4 MB | - | 60 W | $157 |
i3-7320 | 2/4 | 4.1 | HD 630 | 4 MB | - | 51 W | $139 |
i3-7300 | 2/4 | 4.0 | HD 630 | 4 MB | - | 51 W | $129 |
i3-7300T | 2/4 | 3.5 | HD 630 | 4 MB | - | 35 W | $129 |
i3-7100 | 2/4 | 3.9 | HD 630 | 3 MB | - | 51 W | $109 |
i3-7100T | 2/4 | 3.4 | HD 630 | 3 MB | - | 35 W | $109 |
Two things stand out from the new KBL-S release segments. First is a new part of the stack that comes unlocked and is therefore overclockable: the Core i3-7350K. This is a dual core CPU with hyperthreading, a 60W TDP, comes with a 4.2 GHz base frequency (there’s no turbo on an i3), and the multiplier is adjustable. Certain segments of the enthusiast community have been asking for an unlocked Core i3, to bring overclocking back into having its original meaning of boosting a low-end CPU into performing like a high-end CPU, and this Core i3 will finally answer that question. We have a full review of the Core i3 as part of our Kaby Lake launch series, even though the formal sale date for the i3-7350K is early February.
The second stand-out part is the lack of desktop Xeons being announced, what would be the E3-1200 v6 line of CPUs, as well as the lack of dual-core Pentiums or Celeron parts for the super low-cost segment in the market. The v6 parts, assuming the strategy would follow from the v5 parts, would require a new Xeon-specific chipset for motherboards and no C-series desktop chipset is being announced today. I highly suspect that the Xeon CPUs will be announced later in Q1, given that the target market for these is a little different to standard desktop processors.
Today’s Coverage of Intel Kaby Lake
For a launch day piece, we managed to secure a sizeable chunk of hardware for analysis. As a result, rather than the usual method of shoving it all into one review and somehow getting some key analysis lost, we will be posting several pieces today to cover the main topics of the hardware at hand. Links to all the other pieces are as follows:
Intel Launches 7th Generation Kaby Lake (Overview and Core Improvements)
The Intel Core i7-7700K Review: The New Out-of-the-box Performance Champion
The Intel Core i5-7600K Review: The More Amenable Mainstream Performer
Upcoming (we’re at CES and didn’t have time to finish these yet):
Calculating Generational IPC Changes from Sandy Bridge to Kaby Lake
The Intel Core i3-7350K Review: When a Core i3 Nearly Matches the Core i7-2600K
Intel Core i7-7700K, i5-7600K and i3-7350K Overclocking: Hitting 5.0 GHz on AIR
Intel Launches 200-Series Chipset Breakdown: Z270, H270, B250, Q250, C232
Intel Z270 Motherboard Preview: A Quick Look at 80+ Motherboards
A New Optimized 14nm Process: 14nm+
As originally reported in the Kaby Lake-Y/U Launch
One of the mysteries with the launch of Kaby Lake is the optimized 14nm+ process that Intel is promoting as one of the key components for the performance uptick in Kaby Lake. It’s worth noting that Intel has said nothing about caches, latencies or bandwidths. We are being told that the underlying microarchitecture for Kaby Lake is the same as Skylake, and that the frequency adjustments from the new process, along with features such as Speed Shift v2 and the new fixed function media codecs, account for the performance improvements as well as battery life increases when dealing with 4K content.
For users that speak in pure IPC, this may/may not be a shock. Without further detail, Intel is implying that Kaby Lake will have the same IPC as Skylake (which we can confirm in our reviews), however it will operate with a better power efficiency (same frequency at lower power, or higher frequency at same power) and for media consumption there will be more idle CPU cycles with lower power drain. The latter makes sense for mobile devices such as tablets, 2-in-1s and notebooks, or for power conscious users, but paints a static picture for the future of the desktop platform in January if the user only gets another 200-400 MHz in base frequencies.
However I digress with conjecture – the story not being told is how Intel has changed its 14nm process to create 14nm+. We’ve only been given two pieces of information: taller fins and a wider gate pitch.
When Intel launched Broadwell on 14nm, we were given an exposé into Intel’s latest and greatest semiconductor manufacturing lithography node. Intel at its core is a manufacturing company rather than a processor company, and developing a mature and robust process node allows it to gain performance advantages over the other big players: TSMC, Samsung and GlobalFoundries. When 14nm was launched, we had details on their next generation of FinFET technology, discussions about the issues that faced 14nm as it was being developed, and fundamental dimensional data on how transistors/gates were proportioned. Something at the back of my brain says we’ll get something similar for 10nm when we are closer to launch.
But as expected, 14nm+ was given little extra detail. What would interest me is the scale of results or the problems faced by the two changes in the process we know about. Taller fins means less driving current is needed and leakage becomes less of an issue, however a wider gate pitch is typically associated with a decrease in transistor density, requiring higher voltages but making the manufacturing process easier with fewer defects. There is also the argument that a wider pitch allows the heat generation of each transistor to spread more before affecting others, allowing a bit more wiggle room for frequency – this is at least how Intel puts it.
The combination of the two allows for more voltage range and higher frequencies, although it may come at the expense of die size. We are told that transistor density has not changed, but unless there was a lot of spare unused silicon in the Skylake die design for the wider pitch to spread, it seems questionable. It also depends on which part of the metal stack is being adjusted. It’s worth noting that Intel has not released die size information again, and transistor counts are not being disclosed, similar to Skylake.
Finally, there's some question over what it takes at a fab level to produce 14nm+. Though certainly not on the scale of making the jump to 14nm to begin with, Intel has been tight-lipped on whether any retooling is required. At a minimum, as this is a new process (in terms of design specifications), I think it's reasonable to expect that some minor retooling is required to move a line over to 14nm+. In which case the question is raised over which Intel fabs can currently produce chips on the new process. One of the D1 fabs in Oregon is virtually guaranteed; whether Arizona or Ireland is also among these is not.
I bring this up because of the parallels between the Broadwell and Kaby Lake launches. Both are bottom-up launches, starting with the low wattage processors followed by the bigger parts a few months later. In Broadwell's case, 14nm yields - and consequently total volume - were a bottleneck to start with. Depending on the amount of retooling required and which fabs have been upgraded, I'm wondering whether the bottom-up launch of Kaby Lake is for similar reasons. Intel's yields should be great even with a retooling, but if it's just a D1 fab producing 14nm+, then it could be that Intel is volume constrained at launch and wanted to focus on producing a larger number of small die 2+2 processors to start with, ramping up for larger dies like 4+2 and 4+4e later on.
Speed Shift v2: Speed Harder
As originally reported at Kaby Lake-Y/U Launch
One of the new features for Skylake was Speed Shift. With the right OS driver, the system could relinquish control of CPU turbo to the CPU itself. Using internal metric collection combined with access to system-level sensors, the CPU could adjust the frequency with more granularity and faster than the OS can. The purpose of Speed Shift was to allow the system to respond quicker to requests for performance (such as interacting with a touch screen or browsing the web), reduce delays and improve the user experience. So while the OS was limited to predefined P-state options, a Speed Shift enabled processor with the right driver had a near-continuous selection of CPU multipliers within a wide range to select from.
The first iteration of Speed Shift reduced the time for the CPU to hit peak frequencies from ~100 milliseconds down to around 30. The only limitation was the OS driver, which is now a part of Windows 10 and comes by default. We extensively tested the effects of the first iteration of Speed Shift at launch.
With Kaby Lake, the hardware control around Speed Shift has improved. Intel isn’t technically giving this a new name, but it is an iterative update which I prefer to call ‘v2’, if only because the adjustment from v1 to v2 is big enough to note. There is no change in the OS driver, so the same Speed Shift driver works for both v1 and v2, but the improved hardware control means that a CPU can now reach peak frequency in 10-15 milliseconds rather than 30.
The light green and yellow lines show the difference between v1 and v2, with the yellow Kaby Lake processor getting up to 3.5 GHz incredibly quickly. This will have an impact on latency limited interactions as well as situations where delays occur, such as asynchronous web page loading. Speed Shift is a play for user experience, so I’m glad to see it is being worked on. We will obviously have to test this when we can.
A note about the graph, to explain why the lines seem to zig-zag between lower and higher frequencies, because I have encountered this issue in the past. Intel’s test, as far as we were told, relies on reading register counters that increment as instructions are processed. By monitoring the value of these registers, the frequency can be extrapolated. Depending on the polling time, or adjacent point averaging (a common issue with counter based time benchmarks I’ve experienced academically), it can result in statistical variation depending on the capability of the code.
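To show how that zig-zag can appear even when the clock is perfectly steady, here is a minimal sketch of counter-based frequency estimation. It is not Intel's test code; the 3.5 GHz clock, the 1 ms polling interval and the 50 microsecond timestamp error are assumptions chosen for illustration.

```python
def estimate_frequencies(samples):
    """Estimate clock frequency (Hz) between consecutive (time_s, cycle_count) samples."""
    estimates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        estimates.append((c1 - c0) / (t1 - t0))
    return estimates

# Hypothetical trace: a core running at a steady 3.5 GHz, polled every 1 ms.
TRUE_HZ = 3.5e9
true_times = [0.000, 0.001, 0.002, 0.003, 0.004]
counts = [round(TRUE_HZ * t) for t in true_times]

# Case 1: timestamps are exact, so every interval recovers the true frequency.
exact = estimate_frequencies(list(zip(true_times, counts)))

# Case 2: the counter is read at the true times, but the recorded timestamps
# are off by up to 50 microseconds (assumed polling jitter).
recorded_times = [0.000, 0.00105, 0.00200, 0.00295, 0.00400]
jittery = estimate_frequencies(list(zip(recorded_times, counts)))

for label, series in (("exact", exact), ("jittery", jittery)):
    print(label, [f"{hz / 1e9:.2f} GHz" for hz in series])
```

With the jittered timestamps the estimates alternate above and below the true value even though the simulated core never changes speed, which is the same shape of artifact as the one described for the graph.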
While this graph uses the i7-7500U, which was released back in September, Speed Shift v2 is a feature for all Kaby Lake processors in the stack with the right OS. We still have not received an official word if Intel is working closely with Apple to bring the feature to macOS, or even if it will be promoted if it ever makes the transition – Apple may never promote it so as not to confuse the non-technical media that follow Apple, but also not allow Intel to specify that Apple is using it. Or, it’ll be part of a presentation; we don’t know.
Optane Memory: Support for Intel 3D XPoint
For our regular readers, 3D XPoint will be well known, but for everyone else here’s a quick recap. One of the goals in the memory industry is to create something with the speed of DRAM but persistent, such that the data is saved when the power is switched off. DRAM uses power to refresh data, but is a bulk-source of data movement for software. A lot of the speedup behind software is the speed behind memory accesses, or the ability to have data closer to the core when needed, so having a large, close, non-volatile memory set can be a big help in both performance and power consumption. In order to create non-volatile memory like this, several players have spent the best part of a decade creating new materials and new paradigms to get something that works. Intel (and Micron) formally announced their 3D XPoint solution over a year ago, allowing for a bit-level addressable matrix of non-volatile memory, however no products have formally been announced.
Intel intends to create three levels of 3D XPoint, suitable for different parts of the storage chain:
#1 - Storage (‘Optane SSD’): 120GB+ of SSD like NVMe storage
#2 - Storage Caching (‘Optane Memory’): 16-32GB of HDD Caching storage
#3 - DRAM (‘Optane DRAM’ ?): ?GB of DRAM, DDR4-pin compatible
The first two in that list have been defined, although the use of the term ‘Memory’ to define a storage cache is a little odd. Intel has stated in briefings that ‘Optane Memory’ is different to ‘Optane SSD’, and no exact word yet on how the 3D XPoint DRAM product will be named.
We saw a pair of the Optane SSD evaluation drives at IDF. These are drives that Intel hands out to large partners and OEMs to help configure future products or optimize software, and we were told these were early-production 140GB drives which may or may not represent what’s coming to market. Unfortunately aside from a few images of a heatsink several inches away, we were unable to look any deeper in to the design at the time.
Optane Memory, as announced through the recent Lenovo announcement, forms the brand name for a series of caching drives. We’ve had caching drives before, when SSDs were coming to the market and a combination of Intel RST, an mSATA drive and a spinning HDD meant that some elements of the storage were cached, affording a higher rate of performance over a regular HDD. The same concept applies here, with the 16GB Optane Memory drives acting as a cache for certain elements of the HDD use case enabling faster speeds when needed. We are told that this requires an update to Intel RST, and that Optane Memory will only work on Kaby Lake and up with a 200-series chipset (as in, you can’t use it with a 200-series chipset and Skylake, for example). The benefits of having Optane as a cache over a regular SSD are unclear at this point, aside from the fact that Optane will be under NVMe and use PCIe lanes – Billy Tallis, our SSD editor, has written a piece on the situation which is worth a read.
One thing that Intel mentioned during our Kaby Lake briefings is that ‘Optane Memory’ is different from ‘Optane SSD’, which could lead one to believe that ‘Optane Memory’, as a caching drive, cannot be used as an SSD. The original caching of an mSATA drive was almost like a hybrid RAID setup, and the mSATA drive could still be used as an SSD if it was configured that way. Because Intel is giving this a different naming system, it might stand to reason that the Optane caching drive cannot be used as a separate SSD in itself. Intel also stated that it will be later in the first half of 2017 before these drives are available as a separate purchase.
As of right now, only Lenovo has announced laptops with Optane Memory installed, however we expect several more vendors to announce products over the next few weeks unless Lenovo has initial exclusivity. So while Optane Memory is an advertised feature of Kaby Lake and the 200-series chipsets, it is one that users will have to wait to get to grips with, if relevant at all.
Intel vPro Parts with Intel Authenticate
With the vPro variants of Kaby Lake, Intel is announcing the launch of Intel Authenticate. This is a hardware driven protection system designed to offer more security for user authentication, such as hardware based biosensors (iris, fingerprint) or location based (Bluetooth) security. Similar to how other hardware authentication based systems work, the vPro CPUs will have Intel’s Security and Software Guard Extensions (SGX) enabled to be able to isolate various instruction workflows and also detect when potential intrusion occurs. Intel did not go into any significant detail here, probably based on the initial audience for the Kaby Lake launch being more consumer oriented, but we were told that Intel is working with popular password managers to offer a more secure environment.
The vPro capable processors available from today are the i7-7Y75, i5-7Y57, i7-7600U, i5-7300U, i7-6660U, i5-7350U, E3-1535M v6, E3-1505M v6, i7-7920HQ, i7-7820HQ, i5-7440HQ, i7-7700, i5-7600, i5-7500, i7-7700T, i5-7600T and i5-7500T. These processors need to use the Q270 chipset in order to have vPro capabilities.
OPI 2.0 to OPI 3.0
From the Kaby Lake-Y/U Launch
For anyone following our NUC testing, you may remember that during testing of a Skylake-Y NUC, we found that the PCIe 3.0 x4 slot for M.2 drives was actually limited to PCIe 2.0 x4, effectively reducing the peak bandwidth. At the time we probed Intel and our data sheets to find out that it was actually a limitation imposed by Intel on the CPU which wasn’t obvious from the original launch materials. The interconnect between the CPU cores/system agent and the integrated chipset, what we normally call DMI on a desktop platform but is called OPI on an SoC, was fixed at PCIe 2.0 speeds.
We asked about why this was the case, and we were told that the system actually can support PCIe 3.0 x4 speeds, and the system even tests this on startup, but for stability PCIe 2.0 is chosen. This made the marketing of the NUC a little confusing, especially as PCIe 3.0 x4 for storage was listed as a feature. Working with Intel, they pushed through a new BIOS for the NUC that kept the OPI at PCIe 3.0 x4 speeds, and we were able to get peak bandwidth from our storage devices. However, that BIOS update was limited to one mini-PC from one vendor, leaving all the other vendors to do their own thing.
This time around, Intel is doing the change by default (on every KBL processor, except the Core i3-7100U). The interconnect to the chipset now runs at PCIe 3.0 x4 speeds, allowing the integrated chipset to offer full PCIe 3.0 lanes for extra controllers, storage and other features. For the Y series, this means 10 PCIe 3.0 lanes and the U series get 12 lanes. We could end up seeing some Core i5/i7 devices offer dual M.2 PCIe 3.0 x4 RAID as a result, depending on compatibility. Intel also lists NVMe support for the five processors on OPI 3.0.
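For a sense of scale on the OPI change, the sketch below works out the theoretical one-direction bandwidth of an x4 link at PCIe 2.0 and PCIe 3.0 rates from the standard per-lane transfer rates and line encodings. It ignores packet and protocol overhead, so real storage throughput will be lower; the function name is ours, not part of any Intel tool.

```python
def pcie_bandwidth_gbytes(transfer_rate_gt, payload_bits, encoded_bits, lanes):
    """Usable one-direction bandwidth in GB/s after line-encoding overhead."""
    usable_gbits_per_lane = transfer_rate_gt * payload_bits / encoded_bits
    return usable_gbits_per_lane * lanes / 8  # 8 bits per byte

# PCIe 2.0: 5 GT/s per lane with 8b/10b encoding.
gen2_x4 = pcie_bandwidth_gbytes(5.0, 8, 10, lanes=4)
# PCIe 3.0: 8 GT/s per lane with 128b/130b encoding.
gen3_x4 = pcie_bandwidth_gbytes(8.0, 128, 130, lanes=4)

print(f"PCIe 2.0 x4: {gen2_x4:.2f} GB/s")
print(f"PCIe 3.0 x4: {gen3_x4:.2f} GB/s")
print(f"Ratio: {gen3_x4 / gen2_x4:.2f}x")
```

The link moves from about 2 GB/s to just under 4 GB/s, so a fast NVMe drive behind the chipset is no longer capped at roughly half its potential sequential throughput.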
The Kaby Lake-U/Y GPU - Media Capabilities
Written by Ganesh
While from a feature standpoint Kaby Lake is not a massive shift from Skylake, when it comes to GPU matters it none the less brings across some improvements that are directly visible to the end-user. As with the CPU cores, Intel’s 14nm+ process will allow for higher GPU frequencies and overall better GPU performance, but arguably the more impressive change with Kaby Lake is the updated media capabilities. To be clear, Kaby Lake is still an Intel Gen9 GPU – the core GPU architecture has not changed – but Intel has revised the video processing blocks to add further functionality and improve their performance for Kaby Lake.
The media capabilities of the Skylake GPU were analyzed in great detail in our 2015 IDF coverage. The updates to Kaby Lake-U/Y should be analyzed while keeping those features in mind. The major feature change in the Kaby Lake-U/Y media engine is the availability of full hardware acceleration for encode and decode of 4K HEVC Main10 profile videos. This is in contrast to Skylake, which can support HEVC Main10 decode up to 4Kp30, but does so using a “hybrid” process that spreads out the workload over the CPU, the GPU’s media processors, and the GPU’s shader cores. As a result, not only can Kaby Lake process more HEVC profiles in fixed function hardware than before, but it can do so at a fraction of the power and with much better throughput.
Also along these lines, Kaby Lake has implemented full fixed function 8-bit encode and 8/10-bit decode support for Google’s VP9 codec. Skylake offered hybrid decode support for the codec, which is useful from a feature standpoint, but is a bit more problematic in real-world use since hybrid decode is not as power-efficient as a codec implemented in fixed function hardware. Google has proven eager to serve up VP9 to its YouTube users, so they can now much more efficiently decode the codec. Meanwhile, on the encode side, brand-new to Kaby Lake is VP9 encoding support, to go with the aforementioned HEVC encode support.
An overview of the GPU engine in Kaby Lake-U/Y is presented in the slide below.
The new circuitry for hardware accelerating HEVC Main10 and VP9 is part of the MFX block. The MFX block can now handle 8b/10b HEVC and VP9 decode and 10b HEVC / 8b VP9 encode. The QuickSync block also gets a few updates to improve quality further, and AVC encode performance also receives a boost.
The Video Quality Engine also receives some tweaks for HDR and Wide Color Gamut (Rec.2020) support. Skylake's VQE brought in RAW image processing support with a 16-bit image pipeline for selected filters. While Intel has not discussed the exact updates that enable Rec.2020 support, we suspect that more components in the VQE can now handle higher bit-widths. Intel pointed out that the HDR capabilities involve usage of both the VQE and the EUs in the GPU. So, there is still scope for further hardware acceleration and lower power consumption in this particular use-case.
Intel claims that Kaby Lake-U/Y can handle up to eight 4Kp30 AVC and HEVC decodes simultaneously. HEVC decode support is rated at 4Kp60 up to 120 Mbps (especially helpful for premium content playback and Ultra HD Blu-ray). With Kaby Lake-U/Y's process improvements, even the 4.5W TDP Y-series processors can handle real-time HEVC 4Kp30 encode.
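To put the 4Kp60 at 120 Mbps figure in context, this short sketch compares that bitrate against the uncompressed size of the same video. It assumes a 3840x2160 frame with 10-bit samples and 4:2:0 chroma, which matches the Main10 consumer case discussed here but is our assumption rather than something Intel specified.

```python
WIDTH, HEIGHT = 3840, 2160     # UHD "4K" frame, assumed resolution
FPS = 60
BIT_DEPTH = 10                 # 10-bit samples, as in the Main10 profile
SAMPLES_PER_PIXEL_X2 = 3       # 4:2:0 carries 1.5 samples per pixel; doubled to stay in integers
STREAM_BITRATE = 120_000_000   # the 120 Mbps ceiling quoted for HEVC decode

# Uncompressed size of one frame and of one second of video, in bits.
raw_bits_per_frame = WIDTH * HEIGHT * SAMPLES_PER_PIXEL_X2 * BIT_DEPTH // 2
raw_bitrate = raw_bits_per_frame * FPS

# Average compressed budget available to each frame at the peak bitrate.
compressed_bits_per_frame = STREAM_BITRATE // FPS
compression_ratio = raw_bitrate / STREAM_BITRATE

print(f"Raw video: {raw_bitrate / 1e9:.2f} Gbps")
print(f"Per-frame budget at 120 Mbps: {compressed_bits_per_frame // 8 // 1000} KB")
print(f"Compression ratio: about {compression_ratio:.0f}:1")
```

Even at the top of the supported bitrate range the decoder is reconstructing several gigabits of pixels per second from a stream roughly sixty times smaller, which is why doing this in fixed function hardware rather than on the CPU matters for power.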
On the subject of premium content, in their presentation Intel rather explicitly mentioned that the improved decode capabilities were, in part, for “premium content playback.” When we pushed Intel a bit on the matter – and specifically on 4K Netflix support – they didn’t have much to say beyond the fact that to play 4K Netflix, you need certification. Based on what was said and what was not said (and what we know about the certification process) our educated guess is that the updates in Kaby Lake-U/Y include some new DRM requirements for 4K content, and 4K Netflix should hopefully be good to go with the new platform. However on that note, because of those DRM requirements and that this is being pitched as a new feature for Kaby Lake, we suspect that when 4K Netflix streaming does come to the PC platform, Skylake owners are going to be out of luck.
It must be kept in mind that all the encode / decode aspects discussed above are for 4:2:0 streams. This is definitely acceptable for consumer applications, as even Blu-ray video streams (that have plenty of bandwidth at their disposal) are encoded in 4:2:0. However, if Intel wants to use the new media engine in professional broadcast and datacenter applications, 4:2:2, and, to a much lesser extent, even 4:4:4 support might become necessary. For the purpose of the Kaby Lake-U/Y consumer platforms being introduced today, this is not an issue at all.
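For readers less familiar with the notation, the sketch below turns the J:a:b chroma subsampling labels into an average number of samples per pixel, which shows how much more data 4:2:2 and 4:4:4 streams carry than 4:2:0. The helper name is our own, and it treats b as the number of chroma samples in the second row, which holds for the three common formats listed.

```python
def samples_per_pixel(j, a, b):
    """Average samples per pixel for J:a:b subsampling over a J-wide, 2-row block."""
    luma = 2 * j          # one luma sample per pixel across two rows
    chroma = 2 * (a + b)  # two chroma planes (Cb, Cr): a samples in row 1, b in row 2
    return (luma + chroma) / (2 * j)

formats = {"4:2:0": (4, 2, 0), "4:2:2": (4, 2, 2), "4:4:4": (4, 4, 4)}
density = {name: samples_per_pixel(*jab) for name, jab in formats.items()}
relative = {name: value / density["4:2:0"] for name, value in density.items()}

for name in formats:
    print(f"{name}: {density[name]:.1f} samples/pixel, "
          f"{relative[name]:.2f}x the data of 4:2:0")
```

At the same resolution and bit depth, 4:2:2 carries a third more raw data than 4:2:0 and 4:4:4 carries twice as much, so supporting them is a matter of extra throughput in the media block as well as extra format handling.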
Moving on, like the GPU core itself, Kaby Lake-U/Y's display pipeline is the same as that of Skylake. This means the iGPU can support up to three simultaneous displays.
One of the disappointing aspects from Skylake that has still not been addressed in Kaby Lake-U/Y is the absence of a native HDMI 2.0 port with HDCP 2.2 support. Intel has been advocating the addition of an LSPCon (Level Shifter - Protocol Converter) in the DP 1.2 path. This approach has been used in multiple motherboards and even SFF PCs like the Intel Skull Canyon NUC (NUC6i7KYK) and the ASRock Beebox-S series. Hopefully, future iterations of Kaby Lake (such as the desktop and high-performance mobile parts coming in January) address this issue to simplify BOM cost for system vendors.
In summary, Kaby Lake-U/Y resolves one of the major complaints we had about Skylake's media engine: the absence of hardware-accelerated 4Kp60 HEVC Main10 decode. There are a few other improvements under the hood that enable a more satisfying multimedia experience for consumers. The software and content-delivery ecosystems have plenty of catching up to do when it comes to taking full advantage of Kaby Lake-U/Y's media capabilities.
Chipsets
We discuss the chipsets in greater detail in our main chipset piece, but here’s a brief overview of what is coming to market. As with many previous CPU launches from Intel, a new series of platform controller hubs (PCH), commonly known as chipsets, is being released. These are paired with LGA1151 socket motherboards, and thus support Skylake as well as Kaby Lake CPUs. (This also means that 100-series chipsets, such as Z170, will support Kaby Lake with an appropriate BIOS update.)
Intel’s chipset releases are fairly predictable now: a Z-series chipset focused on multi-GPU and overclocking, an H series chipset which is basically Z without overclocking, a Q series chipset for vPro enabled platforms, and a B series chipset for a cheaper route to enablement.
Intel 200-series Chipsets | ||||||||
Z270 | H270 | B250 | Q270 | Q250 | HM175 | QM175 | CM238 | |
DMI | 3.0 | 3.0 | 3.0 | 3.0 | 3.0 | 3.0 | 3.0 | 3.0 |
PCIe 3.0 Lanes | 24 | 20 | 12 | 24 | 14 | 16 | 16 | 20 |
SATA 6 Gbps Ports | 6 | 6 | 6 | 6 | 6 | 4 | 4 | 8 |
USB 3.0 Ports | >= 10 | 8 | 6 | >= 10 | 8 | >= 8 | >= 8 | >= 10 |
Total USB 2/3 | 14 | 14 | 12 | 14 | 14 | 14 | 14 | 14 |
Drives for PCIe RST | 3 | 2 | 1 | 3 | 1 | 2 | 2 | 3 |
PCIe Config | x16, x8/x8, x8/x4/x4 | x16 | x16 | x16, x8/x8, x8/x4/x4 | x16 | x16, x8/x8, x8/x4/x4 | | |
Overclocking | Yes | No | ||||||
vPro | No | No | No | Yes | No | No | Yes | Yes |
Intel Manageability | No | No | No | Yes | Yes | No | Yes | Yes |
There is also a trio of new mobile chipsets which follow a similar pattern, including a Xeon-enabled chipset in the CM238 to allow for the new E3-1500 v6 processors.
Intel has not yet officially listed power consumption for these chipsets, nor their cost. In recent generations, there have been rumors (not officially confirmed) that chipset pricing has played a big part in some design decisions for OEM products, which may be why we are seeing more 200-series products at launch than we did for the 100-series.
Motherboards
We reached out to all of the big motherboard vendors for details in advance of their new product lines, all of which are summed up in a special motherboard piece coming up this week, with over 80 individual SKUs being announced. Note that some products are region focused, so how much of our analysis applies to you will depend on whether a product is sold in your region. Nonetheless, one of the key features in this year’s motherboard lineups is that some of the vendors are streamlining their product stacks.
ASUS Maximus IX Code, Supermicro C7Z270-CG, GIGABYTE Z270X-Gaming 9
MSI Z270 Gaming Pro Carbon, ASRock Z270 Taichi, ECS Z270H4-I
There will be a series of new controllers seen on these boards, including the ASMedia ASM2142, a USB 3.1 10 Gbps controller that uses two PCIe 3.0 lanes rather than one to supply up to two ports. Previously the ASM1142 used only one PCIe 3.0 lane for two ports, meaning the bandwidth was shared; the ASM2142 is intended to ensure that both ports receive peak performance when both are in use.
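The lane count matters because of simple link budgeting, which can be sketched as below. This is a back-of-the-envelope model under stated assumptions: it uses only the published line rates and encodings (PCIe 3.0 at 8 GT/s per lane with 128b/130b, USB 3.1 Gen 2 at 10 Gbps with 128b/132b) and ignores packet and protocol overhead on both sides, so the results are upper bounds rather than measured throughput. The function names are ours, not ASMedia's.

```python
# Rough uplink budget for a USB 3.1 Gen 2 host controller behind PCIe 3.0.
# Line rates and encodings only; protocol overhead is deliberately ignored.

PCIE3_GT_PER_LANE = 8.0          # GT/s per PCIe 3.0 lane
PCIE3_ENCODING = 128 / 130       # 128b/130b line code
USB31_GEN2_GBPS = 10.0           # raw line rate per USB 3.1 Gen 2 port
USB31_GEN2_ENCODING = 128 / 132  # 128b/132b line code


def pcie3_gbps(lanes: int) -> float:
    """Usable PCIe 3.0 bandwidth after line encoding, in Gbit/s."""
    return lanes * PCIE3_GT_PER_LANE * PCIE3_ENCODING


def usb_demand_gbps(ports: int) -> float:
    """Bandwidth needed to run `ports` USB ports at full line rate, in Gbit/s."""
    return ports * USB31_GEN2_GBPS * USB31_GEN2_ENCODING


def uplink_share(lanes: int, ports: int) -> float:
    """Fraction of full-rate USB demand the uplink can carry, capped at 1.0."""
    return min(1.0, pcie3_gbps(lanes) / usb_demand_gbps(ports))


for lanes, ports in [(1, 1), (1, 2), (2, 1), (2, 2)]:
    print(f"{lanes} lane(s), {ports} port(s): "
          f"uplink {pcie3_gbps(lanes):.2f} Gbit/s, "
          f"demand {usb_demand_gbps(ports):.2f} Gbit/s, "
          f"share {uplink_share(lanes, ports):.0%}")
```

Under these assumptions a single PCIe 3.0 lane falls short of even one 10 Gbps port, which is one plausible reason a controller would want two lanes. With two lanes and both ports saturated, the model still shows the uplink slightly below combined line rate, though real devices rarely sustain that.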
Another new controller to note will be for audio: the Realtek ALC1220 (and ALC1220A from ASUS). This is an upgrade over the ALC1150 and ALC898 codecs we’ve seen on most motherboards over the past couple of years, with the added benefit of a rated 120 dBA output and 113 dBA input. Despite these ratings, as we’ve seen in our audio testing before, the whole audio subsystem has to be up to par in order to produce the best measurable quality. ASUS is using the ALC1220A, which I’m told is a particular variant that ASUS has requested with additional features, but we’re not told if it costs any more.
On the networking side we will still see Intel’s I219-V gigabit Ethernet controller on a lot of products, but not so much Rivet Networks’ Killer E2400 solution. Last year Rivet launched their E2500 controller, which we saw on a small number of late-release motherboards, but we expect that Rivet’s partners (MSI, GIGABYTE, ASRock) will be using the E2500 over the E2400 moving forward. The E2500 has additional benefits afforded by a new software package being developed (we’ve been in constant contact with Rivet on this), as well as announcements coming through Computex beyond the PC NIC.
The big change on the networking side will be the introduction of multi-gigabit Ethernet controllers, in particular Aquantia’s 5G/2.5G AQC108 controller that we reported on only a few weeks ago. From the breakdown of motherboard information we have so far, we can expect to see it on at least two ASRock boards, and we’ve heard murmurings about other vendors also using it later in Q1/Q2. The AQC108 supports 5G, 2.5G and 1G connections through standard RJ-45 Cat 5 cabling at a lower footprint, power consumption, and cost than the 10GBase-T chips we’ve seen pop up on a few products. Speaking with Aquantia, they want to expand a lot into the consumer space with their partners for switches, docks, commercial, professional and so on, meaning that with enough effort on the switch side, we might see a 2.5G/5G drop-in infrastructure take place without the need for re-cabling. It might take a few years though.
A new feature coming through on a few motherboards is a front panel USB 3.1 10 Gbps header. To my immediate knowledge it is already on MSI’s Z270 Gaming M7 and a couple of others, and in conjunction with various cable manufacturers there is hope that this connector will become a standard. At present it is being enabled through the ASM2142, using two PCIe lanes to provide one port of USB 3.1. We tried getting more detailed info from MSI as to why two PCIe lanes were needed for this but didn’t get a coherent answer, leading us to believe that perhaps the older ASM1142 chip will not support a front panel port like this. More investigation is required.
Technically all the motherboards, when equipped with Kaby Lake, should be able to support Optane Memory (remember, that’s the 16GB M.2 cache drive) when the drives come out. Intel will be providing a certification service for this, and depending on how much it costs, we might not see it in play on the low-end products.
RGB is still going to play a big factor for 200-series motherboards as well: vendors are moving the RGB features down to lower price points on the cheaper chipsets. This will either come as onboard LEDs or as bundling an LED strip with the board for a dedicated header onboard (with software in tow). If you didn’t like LEDs before, you won’t like them now either, but there are one or two products at each price point still without LEDs (or you can turn them off).
Head over to our chipset analysis and motherboard preview pieces to learn more.
A Brief Rundown of Results for Kaby Lake on Desktop
At the time of writing this, we’ve managed to test all three overclockable SKUs and started on the 65W parts as well. Each of the K parts will have their own dedicated reviews which we thoroughly recommend to understand more about the specific products, but here’s a page to give a brief overview of performance.
IPC: No Change in CPU Performance (link)
As was to be expected, especially judging by how Intel described the upgrade between Skylake and Kaby Lake, there is no IPC gain between the two for direct performance. In our testing, a 3.0 GHz Core i7-7700K Kaby Lake part performed identically to a 3.0 GHz Core i7-6700K Skylake processor (HT disabled). The only real difference is in the memory support, given that Skylake supports DDR4-2133 and Kaby Lake supports DDR4-2400; however, this has a minor effect on almost all benchmarks.
One thing I will note is that some of the motherboards we tested with implemented the full supported FCLK frequency (1000 MHz) rather than the BIOS base (800 MHz) when they detected a K-series CPU. The FCLK issue we documented on Skylake was down to a rush to get the higher speed certified before the launch, but as a result we saw 800 MHz being the main frequency used (and BIOS updates required to even offer 1000 MHz as an ‘overclockable’ option). So while Kaby Lake seems to go at 1000 MHz out of the box, depending on which motherboard you use, from our testing we didn’t see much change in GPU performance when both CPUs are at 3 GHz.
Power
One of Kaby Lake’s big things has been ‘same frequency for less power’, or ‘more frequency for the same power’, compared to Skylake. After all, if you want more performance out of the box but don’t have an IPC gain, then a higher frequency is required. On the box of the i7-7700K it will say that it supports a 4.5 GHz turbo with a 91W TDP, and in our power testing we practically match that number (leaving aside the fact that TDP != power consumption).
For all the Kaby Lake SKUs we tested, even when manually overclocking, the power consumption of the part was very close to the rated TDP at the box frequency. Normally when we overclock we find that the CPU vendor has vastly overestimated the voltage required to be ‘AnandTech Stable’, but in the case of Kaby Lake we saw numbers that were very close.
Overclocking
Overclocking perceptions will change with Kaby Lake, due to the new AVX Offset feature to be found in the BIOS of each Z270 motherboard. AVX instructions are known to cripple a good overclock, reducing stability and making it harder to push non-AVX code if AVX is the limitation. With Kaby Lake, a user can apply an AVX offset (e.g. -10x) which will reduce the multiplier when an AVX instruction occurs. This means that if an overclock of 4.8 GHz is reached and an AVX Offset of -8x is in play, then AVX code will run at 4.0 GHz, generating less heat and keeping the system stable.
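The offset arithmetic is simple enough to sketch. The snippet below is purely illustrative: the function name is hypothetical, it assumes a 100 MHz base clock so that each multiplier step is worth 100 MHz, and it treats the offset as a magnitude because the value may be written as either 8 or -8.

```python
# Illustrative model of an AVX offset: while AVX code is running, the core
# multiplier is lowered by the offset. Assumes BCLK = 100 MHz.

BCLK_MHZ = 100  # assumed base clock; one multiplier step = 100 MHz


def effective_clock_mhz(core_multiplier: int, avx_offset: int, avx_active: bool) -> int:
    """Core clock in MHz for a given multiplier and AVX offset."""
    offset = abs(avx_offset)  # accept 8 or -8, both mean "drop 8 bins"
    multiplier = core_multiplier - offset if avx_active else core_multiplier
    return multiplier * BCLK_MHZ


# The 4.8 GHz overclock with an offset of 8 from the text:
print(effective_clock_mhz(48, -8, avx_active=False))  # 4800
print(effective_clock_mhz(48, -8, avx_active=True))   # 4000
```

The appeal of the feature is that the non-AVX clock no longer has to be chosen to survive the worst-case AVX load.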
We will have a dedicated overclocking piece going over all our OC results, but the short of it is that all three of our K-SKUs (retail parts) happily reached 4.8 GHz AVX at a reasonable voltage. The i7-7700K was able to hit 4.9 GHz with an AVX offset of -10, and our i5-7600K hit 5.0 GHz even with AVX turned on.
Ultimately, overclocking a 4.2/4.5 GHz CPU like the i7-7700K to 4.8 GHz isn’t much of a step. This will be one of the big results from the launch of Kaby Lake for enthusiasts: overclocking the high-end SKU doesn’t actually do that much. Another 600 MHz on top of the 4.2 GHz base is about +14%, and it is only 300 MHz, or about +7%, over the 4.5 GHz turbo. However, given the voltage profile of the chips we’ve seen, just sitting at 4.5 GHz all day is nice for temperatures and voltage, and still gives a CPU that outperforms the i7-4790K or i7-6700K.
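As a quick check on how small that step is, the sketch below measures a 4.8 GHz overclock against both of the i7-7700K's stock clocks. The helper name is ours and the calculation covers frequency only; it says nothing about how real-world performance scales.

```python
# Frequency headroom of an overclock relative to a stock clock, in percent.

def headroom_percent(stock_ghz: float, overclock_ghz: float) -> float:
    return (overclock_ghz - stock_ghz) / stock_ghz * 100


BASE_GHZ, TURBO_GHZ, OC_GHZ = 4.2, 4.5, 4.8  # i7-7700K figures from the text

print(f"vs base:  +{headroom_percent(BASE_GHZ, OC_GHZ):.1f}%")   # +14.3%
print(f"vs turbo: +{headroom_percent(TURBO_GHZ, OC_GHZ):.1f}%")  # +6.7%
```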
Performance
Read our Core i7-7700K review here.
Read our Core i5-7600K review here.
Read our Core i3-7350K review here.
At the end of the day, the Core i7-7700K takes the performance crown for practically every benchmark (there are a few in which the i7-5775C still wins, due to 128MB of eDRAM), and has a tray price of $305. This means it will probably reach shelves around $330-$350, and we haven’t heard about a new stock cooler so it will probably come without one.
The Core i5-7600K still keeps the mantra of how a Core i5 performs almost the same as the Core i7 except in heavily threaded scenarios (ray tracing), and for day to day work it certainly keeps neck and neck. The downside here is that the Core i5-7600K, due to the lack of an IPC increase, is essentially the Core i5-6600K save for a few MHz. You could consider that ours overclocked well, and the temperatures for the overclock were immensely better than the i7-7700K, but for running at stock there’s nothing out of the ordinary here.
The elephant in the room however is the Core i3-7350K. At a tray price of $159, it is only $11 away from the Core i5-7400 which runs at $170 but has two more full cores, albeit at a lower frequency (3.0/3.5 GHz vs 4.2 GHz all the time). If you want to see our analysis, and what we think, I’ll leave it to the review to tell you. We also look at the question as to whether something like the Core i3-7350K will ever reach the same performance as the perennial favorite, the i7-2600K.
Is Intel Breaking New Ground?
For the most part, Kaby Lake doesn’t do much new. Optane Memory support is a plus, but the rest of the product stack is all about moving the power and efficiency curve. What used to get you 3.0 GHz last year now gets you 3.3 GHz, which means saving time doing work or saving money burning less power. There’s also Speed Shift v2, which is a really nice feature, but is limited to Windows 10 users. Arguably looking at the controller side (ALC1220, E2500, Aquantia) is vaguely more interesting. But this is what we kind of expected from an ‘Optimization’ step in the ‘Process, Architecture, Optimization’ way of doing things: we weren’t expecting to be amazed with the product, but nodding and approving of better efficiency. The fact that there’s a new performance champion, giving us something to cheer about after the Skylake/Devil’s Canyon discussion, is a plus for enthusiasts with a short upgrade cycle.