So what's Intel's Secret?

Prenihility · April 11, 2019, 1:43pm

8700K VS 2700X, case in point. 2 fewer cores, 4 fewer threads. Same frequency. It isn’t a massive difference. But STILL…

What part of Intel’s CPU design makes this possible? Is it in the fabrication process?

Oh yeah, and let’s not forget that Zen+ is at 12nm.

kewldude007 · April 11, 2019, 1:48pm

Intel’s architecture can do more things per cycle aka Instructions per…nvm @Fouquin is replying

mutation666 · April 11, 2019, 1:56pm

Instructions per clock. /thread. OK, there is a bit more to it, branch prediction etc.

Adubs · April 11, 2019, 2:01pm

inb4 they cheat

Fouquin · April 11, 2019, 2:10pm

Are we talking all-core clocks assuming no thermal limit? Because the 2700X definitely does not run at the same turbo as the 8700K. The 8700K sustains 4.3GHz across 5-6 cores, whereas the 2700X is 4.3GHz on 1-2 cores. The 5-8 core PB2 range of the 2700X according to AMD is 4.0-4.075GHz.

The sort of flip-flop of the performance crown (with the 8700K coming out ahead by a minimal amount, as you stated) has a lot to do with Intel’s architecture, yes. It is absolutely more capable in per-core instruction throughput per clock. Intel also does some things differently, like AVX for example, which puts them ahead in some software that supports the newer versions.

The 12nm label is marketing. In the context of Zen+, 12nm is just 14nm with a new name to signal that minor tweaks were made to the process. The individual core architecture was not changed between Zen and Zen+; some elements were just shrunk in place to reduce power usage and help push higher clock rates with more stability.

Skylake at the core level is a serviceable enough µarch. Combined with refinements to the process and throwing efficiency out the window to scale core counts and clock rates (Coffee-Lake’s 6 cores use roughly 28W more than Zen+'s 8 cores), it is definitely capable of edging out the 2700X on average. There are still quite a few applications where the 8700K just doesn’t even compete, such as multi-core rendering (hence Blender and Cinebench being AMD’s go-to stage presentations) and AES encryption. Most of the time though, faster cores with better IPC are more than enough to retain the crown.
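
A rough way to see how fewer, faster cores can land in the same ballpark is to treat throughput as cores × clock × IPC. The sketch below uses the clocks quoted in this post; the IPC values are made-up placeholders (not measurements), and the model ignores SMT, memory latency and power limits entirely.

```python
# Back-of-the-envelope model: throughput ~ cores * clock * IPC.
# Clocks come from the post above; the IPC values are hypothetical
# placeholders, NOT measured numbers.

def relative_throughput(cores: int, clock_ghz: float, ipc: float) -> float:
    """Idealised throughput in arbitrary units, assuming perfect core scaling."""
    return cores * clock_ghz * ipc

ASSUMED_INTEL_IPC = 1.10  # hypothetical: ~10% more work per clock
ASSUMED_AMD_IPC = 1.00    # baseline

# Single thread: one core each, both at the 4.3 GHz peak quoted above.
intel_1t = relative_throughput(1, 4.3, ASSUMED_INTEL_IPC)
amd_1t = relative_throughput(1, 4.3, ASSUMED_AMD_IPC)

# All cores: 8700K holds 4.3 GHz on 6 cores, 2700X sits near 4.0 GHz on 8.
intel_all = relative_throughput(6, 4.3, ASSUMED_INTEL_IPC)
amd_all = relative_throughput(8, 4.0, ASSUMED_AMD_IPC)

print(f"1 thread : Intel/AMD = {intel_1t / amd_1t:.2f}")
print(f"all cores: Intel/AMD = {intel_all / amd_all:.2f}")
```

With these placeholder inputs the single-thread ratio comes out above 1 and the all-core ratio below 1, which matches the pattern described here: Intel ahead in lightly threaded work, AMD ahead once every core is loaded.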

Dynamic_Gravity · April 11, 2019, 4:01pm

They use only the finest free-range organic grass-fed & grass-finished Silicon.

PhaseLockedLoop · April 11, 2019, 4:21pm

This is true, and Intel’s 3D tri-gate transistors DO have an edge over the FinFET process, on top of Intel also controlling its own fab. There’s actually a lot of engineering that goes into this, but you definitely summed up the differences in speed above. I just wanna touch base on one thing.

Intel’s processes are actually smaller (though called the same node) vs AMD’s processes… Intel 14nm is more like an AMD 10nm, is what I am getting at, and this is just the nature of the benefit from 3D stacking vs FinFET.

I might make a post on this, given most of the misconceptions. I’d be borrowing material already made in other places, of course, but I can make one so that the misconception between the processes doesn’t occur. It’s actually this tri-gate technology that allows Intel to pack more into the processor, leaving room for more logic (say, extra AVX) at the same node and giving the IPC a boost. In fact, in the 3D tri-gate there are some dielectrics that can be more finely controlled, allowing the processor to flip and flop faster (its switching capacity is basically directly related to its clock), and that’s why you also see higher clocks on the Intel variants vs AMD. However, AMD definitely has an extremely efficient Samsung 14nm process going, so I’d be interested to see which one wins in the coming years.

In any case, it’s that faster clock that results in enough of a boost, and you’re very correct in that regard… Processor design is fun

Fouquin · April 11, 2019, 4:39pm

With that said, I remember there being some pretty big announcement about the SRAM density in Zen matching (or possibly exceeding) that of Skylake and Kaby-Lake, allowing the extra core and I/O logic required for the Zeppelin dual-CCX pattern to fit with Samsung/GloFo’s 14nm process. That led to the process refinements being implemented on Zen+ to sort of ‘double back’ and shrink that logic after the fact.

Even the minor feature shrink still leaves Intel’s 14nm++ at 1.37x logic density or something equally as ridiculous.

Prenihility · April 11, 2019, 5:21pm

So the 3D stacking method of Intel’s designs greatly contributes to core speed vs FinFET?

And what did AMD use before, then?

I’m still pretty overwhelmed by how 2 FEWER cores, with single threads can make up equal/more performance…

PhaseLockedLoop · April 11, 2019, 5:30pm

32 nm SOI

Silicon on Insulator

anon46267848 · April 11, 2019, 6:17pm

If you’re interested in the strengths and weaknesses of Skylake versus Zen, I recommend you read this article comparing EPYC and Skylake-SP.

MazeFrame · April 11, 2019, 6:44pm

Looking at the difference between Intel and AMD CPUs in the server space pre- and post-Spectre patches, I think Intel takes some shortcuts in the branch prediction.

Prenihility · April 12, 2019, 12:54am

What’s stopping AMD from using the 3D Stacking method?

d0rk · April 12, 2019, 4:09am

This article predates Spectre/Meltdown and sort of pointed to why they happened.

I would definitely suspect them of taking shortcuts in branch prediction (in hardware, duped registers, etc.), and that’s why they took a bigger hit when the fixes came.

thro · April 12, 2019, 7:34am

Partially fab, partially cheating at security

Make no mistake. Intel’s 14nm+++++(+++) process is pretty amazing. But their fabs always have been. That has been intel’s advantage since the 90s.

However, with their stumbling at 10nm they’ve dropped the ball a bit and now their designs are having to compete on a somewhat level playing field and AMD design wise is just generally more ambitious and often more “clever” in terms of trade-offs in order to be more cost-effective (Bulldozer being an unfortunate mistake, but Intel had their own mistakes like the Pentium 4 and RDRAM… oh and how could we forget the Itanic!).

It was the case with Athlon, with Athlon XP, with Athlon 64 and at various other times through history. However, Intel had fab capacity, supplier agreements and dirty tricks on their side.

https://www.theregister.co.uk/2004/02/17/who_sank_itanic/

anon85933304 · April 12, 2019, 9:45pm

There was an old TekSyndicate video about intel and compilers from the Phenom days

Dynamic_Gravity · April 12, 2019, 10:12pm

oh yeah I remember that.

if (amdcpu == true) { for (int i = 0; i < 1000000; i++); }

MazeFrame · April 12, 2019, 10:15pm

Ah yes, the days of this little tool:
https://www.majorgeeks.com/files/details/intel_compiler_patcher.html
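
For anyone who missed that era: the complaint, as commonly reported, was that binaries built with Intel's compiler chose their fast code paths from the CPU vendor string rather than from the feature flags the CPU actually reports. A toy sketch of the two approaches follows; the function names and the feature set are hypothetical and this is not the compiler's real dispatcher.

```python
# Toy model of CPU dispatch. All function names are hypothetical;
# this is not the Intel compiler's actual code.

def pick_path_by_vendor(vendor: str, features: set) -> str:
    """Dispatch on the CPUID vendor string: other vendors get the slow path."""
    if vendor == "GenuineIntel" and "avx2" in features:
        return "avx2"
    return "generic"

def pick_path_by_feature(vendor: str, features: set) -> str:
    """Dispatch on what the CPU says it supports, ignoring who made it."""
    if "avx2" in features:
        return "avx2"
    return "generic"

# An AMD chip that reports AVX2 support (feature list is illustrative).
ryzen = ("AuthenticAMD", {"sse2", "avx", "avx2"})

print(pick_path_by_vendor(*ryzen))   # generic
print(pick_path_by_feature(*ryzen))  # avx2
```

Patcher tools like the one linked above worked, roughly speaking, by neutralising that vendor check in already-compiled binaries.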

risk · April 13, 2019, 6:15am

AMD needs:

more branch prediction table entries
lower cache/inter-core latency
a bit more clock speed wouldn’t hurt
most importantly in the server market: bigger CCXes (e.g. why not go to a 28-core mesh like Intel, or even one-up that with a donut setup, and then stick 4 of them on a PCB with 16 RAM channels and some PCIe 5.0 interconnect; put your SerDes design skills where your mouth is. They could easily sell hundreds of thousands of those per year, probably at $3k - $4k each, even at 500W … AMD, why do you hate nanoseconds and money so much?)

Also, AMD is cheaper in retail pricing and for small business, but Intel is known for giving deeper discounts to large OEMs, as long as they also have viable AMD products (or threaten to have them) - which makes for an interesting capital bootstrapping problem.

Fouquin · April 13, 2019, 6:26am

Because that significantly decreases silicon yields and destroys all the cost effectiveness of the Zeppelin core design. The whole point of which is to be cheap to produce, extremely high yield, and fucking insanely easy to validate for mass shipment. AMD can’t just make monolithic dies anymore and compete, they can’t afford the waste of silicon.

Chiplets are the next step in that race for quantity and quality. You get all the important logic on a bleeding edge process, and all the stubborn logic that is significantly harder to shrink on the now extremely cheap last-gen process. Situate them on a 40x40 package and with some clever interconnect work pop out a processor that can be sold for around $500, instead of $900.
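
The yield argument above can be sketched with the simple Poisson yield model, where the chance of a die having zero defects falls off exponentially with its area. The defect density and die sizes below are assumptions picked for illustration, not figures for any real process or product, and the model ignores die harvesting, packaging losses and wafer-edge effects.

```python
import math

def poisson_yield(die_area_mm2: float, defects_per_mm2: float) -> float:
    """Fraction of dies with zero defects under the Poisson yield model."""
    return math.exp(-die_area_mm2 * defects_per_mm2)

# Hypothetical numbers for illustration only.
DEFECT_DENSITY = 0.002    # defects per mm^2 (assumed)
SMALL_DIE = 200.0         # mm^2, one small die (assumed)
BIG_DIE = 4 * SMALL_DIE   # mm^2, one monolithic die with the same total area

small_yield = poisson_yield(SMALL_DIE, DEFECT_DENSITY)
big_yield = poisson_yield(BIG_DIE, DEFECT_DENSITY)

# Wafer area spent per working product. Small dies are tested individually,
# so only known-good dies end up on a package.
area_per_good_multi_die = 4 * SMALL_DIE / small_yield
area_per_good_monolithic = BIG_DIE / big_yield

print(f"small die yield: {small_yield:.1%}")
print(f"big die yield  : {big_yield:.1%}")
print(f"area per good part, 4 small dies: {area_per_good_multi_die:.0f} mm^2")
print(f"area per good part, monolithic  : {area_per_good_monolithic:.0f} mm^2")
```

Under these assumptions the monolithic part burns several times more wafer area per sellable chip, which is the waste of silicon being described. Real numbers depend heavily on the actual defect density and on how many partially defective dies can be salvaged as lower-tier parts.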