Intel Core i7 Presentation
"OC3D were present at Intel's recent Core i7 presentation at Heathrow. We'll take a look at some of the new features and what they do."
Intel reveal some more Nehalem information at recent Core i7 presentation
Recently, OC3D received an invitation to a presentation Intel were holding at the Hilton at Heathrow's Terminal 4, and I was the lucky one who got to go.
Before I get down to the nitty-gritty: a lot of the information I'm allowed to publish at this time has already been floating around the net, but at least it has now been confirmed by Intel. There are a couple of new bits though, so stick with it and have a read. As for the stuff I'm not allowed to publish, well, that's a lot more interesting and for another day.
At the end of the day, I took away with me a memory stick full of PowerPoint slides from the presentation and lots of promo pictures of Nehalem wafers and cores. Rather than use those glorified images, though, I've decided to stick with the slides and the real pictures I took on the day throughout this article.
It was a long old day. At about 10:15, after a little hanging around, things kicked off with a presentation on the architecture details of Nehalem by Chief Architect Ronak Singhal, one of four lead architects who worked to create Nehalem. If I'm totally honest, quite a lot of it had me scratching my head a little. I'll try not to bore you with too much techno-babble and instead keep to the major points I think might be of interest.
Ronak has been working at Intel since 1997, when he worked on the Pentium 4 processor. He said he had been working on Nehalem since 2003, and went on to explain that there will be two Nehalem CPUs released in Q4 of 2008: one a mainstream quad-core chip, the other a server product. He also mentioned that Intel intend to launch their octo-core processor towards the latter part of Q4 2009. Bit of a wait, eh?
Tick Tock Tick Tock Tick........
The first slides of the presentation covered Intel's "Tick-Tock" development model, under which a new architecture is released every two years: the "Tock", i.e. Merom (Core 2) of old, Nehalem (Core i7) now, and Sandy Bridge in the future. Staggered in between are Intel's "Tick" releases, which are essentially the same architecture shrunk to a smaller manufacturing process, i.e. Penryn. Westmere is set to be the future "Tick" to Nehalem's "Tock".
Obviously, the "Tock" stages take much longer to create than the "Tick" stages. This means Nehalem was being worked on for quite some time before Penryn, and interestingly, improvements designed for the next architectural "Tock" can make their way into a "Tick" release on a smaller scale.
One of the improvements Ronak spoke about was branch prediction. While this technology was present in Merom and improved a little in Penryn, it has moved on quite a lot further in Nehalem and really makes a difference. In Nehalem, Intel have introduced a second-level branch predictor per core. This new predictor helps out the normal one in the processor pipeline, supporting it in much the same way as an L2 cache backs an L1 cache.
This second-level predictor has a much larger set of history data it can call upon to help predict branches, but while being larger is a good thing, it also makes it quite a lot slower than the first-level predictor.
The first-level predictor runs in much the same way as it always has, predicting branches as best it can. In Nehalem, however, the second-level predictor simultaneously scans the same branches. When the first-level predictor makes a prediction based on the type of branch but doesn't have enough historical data to make a highly accurate one, the second-level predictor can jump in and, with its larger history window to draw from, catch mispredictions on the fly and correct them without too much of a penalty.
Intel gave the example of the second-level branch predictor making significant improvements in applications with very large code sizes, i.e. database applications, and also some games to a certain extent. In short, if the second-level predictor catches something the first-level predictor missed or mispredicted, the wrong path is thrown out of the pipeline early, which saves time on wasted calculations and, in turn, power.
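To give a feel for how a small, fast predictor backed by a larger history-based one can work, here is a toy software model of the idea. To be clear, this is purely illustrative: Intel haven't published Nehalem's actual predictor design, and the table sizes, the saturating counters and the "weak counters defer to the second level" rule below are all my own assumptions for the sketch.

```python
# Toy two-level branch predictor: a small, fast first-level table of
# 2-bit saturating counters, backed by a larger second-level table
# indexed with global branch history. Illustrative only -- this is not
# Intel's actual (unpublished) Nehalem design.

class TwoLevelPredictor:
    def __init__(self, l1_size=16, l2_size=256, history_bits=8):
        self.l1 = [1] * l1_size        # 2-bit counters: 0-1 = not taken, 2-3 = taken
        self.l2 = [1] * l2_size        # larger table, indexed by PC xor global history
        self.history = 0               # global history of recent branch outcomes
        self.history_mask = (1 << history_bits) - 1

    def _l2_index(self, pc):
        return (pc ^ self.history) % len(self.l2)

    def predict(self, pc):
        l1_counter = self.l1[pc % len(self.l1)]
        # Weak first-level states (1 or 2) defer to the second level,
        # mimicking the larger predictor correcting low-confidence guesses.
        if l1_counter in (1, 2):
            return self.l2[self._l2_index(pc)] >= 2
        return l1_counter >= 2

    def update(self, pc, taken):
        for table, idx in ((self.l1, pc % len(self.l1)),
                           (self.l2, self._l2_index(pc))):
            if taken:
                table[idx] = min(table[idx] + 1, 3)
            else:
                table[idx] = max(table[idx] - 1, 0)
        self.history = ((self.history << 1) | int(taken)) & self.history_mask

# A branch that strictly alternates taken/not-taken defeats a simple
# per-branch counter, but the history-indexed second level learns it.
predictor = TwoLevelPredictor()
hits = 0
outcome = False
for _ in range(1000):
    outcome = not outcome
    if predictor.predict(pc=42) == outcome:
        hits += 1
    predictor.update(pc=42, taken=outcome)
print(f"accuracy on alternating branch: {hits / 1000:.0%}")
```

After a short warm-up, the history-indexed table predicts the alternating pattern almost perfectly, which is the same intuition as Nehalem's second level: more history catches patterns the small predictor can't see.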
Now for some Cache & Hyper-Threading on the next page