Project Larrabee FOUND: What could this mean?


In case you didn’t see it, Linus managed to buy Knights Corner, AKA the actual Larrabee that isn’t a pile of shit. And for 400 dollars.

So what does this mean? That he has some cool thing? Oh wow, unreleased hardware! Neato! Oops~! Linus dropped it, what a gag!

No. And I’m pissed that he put out a video. Let me explain.

That card, in and of itself, is a complete computer. 64 threads, 8-16 GB of RAM, and an SSD. On that SSD is FreeBSD with a LOT of kernel modifications built to communicate with whatever it is plugged into and to search for a Larrabee driver stack. Think of it as PCIe passthrough for VMs, but rather than passing through a GPU to a local software level, take the entire driver layer and hardware layer that has anything to do with PCIe and that card, and make a bridge that can change data however it wants. Larrabee was set to be an interesting card one way or another, but there’s something more to it that I don’t think anyone is thinking about.

On that card is a BSD-compatible version, 1:1, of DX 9, 10, and 11, as well as the bridge software to accept a data pipeline. What Linus NEEDS to understand is that he could single-handedly flip everything upside down by getting that data off of that card’s SSD. Not only could we run games natively, but we could make our own EXEs for them and run them like we were running Windows, and the apps wouldn’t know the difference. They would think they were on Windows 7.

He needs to be careful. I can tell you that card is worth a lot more than 400 Canadian bucks, if only for the data that’s on it. The only thing we need is to be able to get it to run.

And for fuck’s sake, he needs to put it in a safe.


As much as Intel wants it to be a graphics card, this is a CPU. A CPU with many cores, but a CPU nonetheless. So any DirectX implementation on this is purely a software rasterizer and not useful for anything. I believe DirectX is also implemented in Windows, with the graphics cards only providing some lower-level interface.
So no, this thing is not useful for Linux gaming at all.

Just to be clear, this is the performance of Intel’s “highly optimized rasterizer” running on a Threadripper 1950X:

Yes, that is 27 fps in a Quake 3-like game at 1024x768. Useless.


Are you dense? I’m not saying plug it into a Linux machine and magically it works, I’m saying crack the data out of it and we have DirectX on Linux that works and doesn’t need NT.

That’s huge.

Does it need work? Of course. Larrabee team had to babysit GEN the whole fucking time. But it works, and it’s very usable for what it is.


I understand that. I’m saying you can’t do that.

You are proposing to take a program written for a CPU and run it on a GPU. That’s just wrong on so many levels. Nothing about this thing can make DirectX games run on Linux.

Why then did nobody just buy one of these cards and get the data off it? They were available on Amazon one or two years ago for a couple hundred bucks. Why do people spend years writing their own DirectX wrappers when they could literally copy DirectX off a Xeon Phi, as you say?

Look, it doesn’t work like that. Like, at all.


You do understand that there are very few actual differences between the two technologies, right? They are both ASICs?

Because Knights Corner has been playing Lost Ark for the last 4 fucking years and there were a lot more important things going on 4 years ago, on top of the embargo, as well as it not being public property.

Phis aren’t Knights Corner.

Because there is a specific build of it on that card, paired with an identical build in its driver, that is built as a test head. There is one, and only one, and that is the one.

Do you get it yet or do I have to beat it into you?


Please tell me you are joking. The differences are HUGE. “They are both ASICs” is like saying books are just like nuclear power plants because they are both collections of atoms.

Graphics cards are way more restrictive than CPUs and will not run the same programs. Moreover, software has to be written using entirely different paradigms to reach acceptable performance.

Yes they are. Intel Xeon Phi is to Knights Corner as Intel Core is to Coffee Lake.

Look, I know you want this to be true, but it isn’t. Sorry.


Well, no, actually they are pretty similar. Except CPUs have more actual cores and GPUs have like… 2. At most 2 right now, more commonly 1.

That’d be cool.

Uhhh, I think you mean CPUs? Unless you mean graphics cards that are like the RED Graphics Compute Unit? I really don’t know what you are talking about here.

By that I mean any old Phi doesn’t have what is on that specific card’s SSD. The libs are the important thing here, not the hardware.

The thing I am trying to point out here is that on that card is a whole different system built to run on a VERY complex RISC chip that could, and should, have changed the industry for the better. We’d see AMD and Nvidia be one and the same at this point, or ATI at least being bought by Intel for pipeline developers.

And yes, a GPU isn’t a CPU, it’s overly complicated space magic vs a rock, but my point is that DX built for that card, stored on that card, is x86, sorta, and built for BSD. You’re seriously focusing on the wrong thing here.


They are really not. Manufacturers like to say their graphics cards have X “Compute Units”, or “CUDA Cores”, or “Stream Processors” or whatever, but these things are entirely different from CPU cores. CPUs have few cores that aim to be as fast as possible, while graphics cards have thousands of very slow cores. Thus CPUs excel at executing few, serialized tasks, while graphics cards have to process massively parallel programs to be of any use at all. Current-gen Nvidia cards need to have > 10000 tasks running at the same time just to reach their full performance. I don’t know where your “GPUs have 2 cores” comes from, but you’ve got it completely backwards.

But there’s more: because CPUs only have a couple of cores, each of them has access to at least several hundred kilobytes of cache. Graphics cores only have a couple of bytes of cache each. So while cache misses are bad on CPUs, diverging code can easily slow down a GPU by a factor of ten.
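To make the divergence point concrete, here is a toy model in Python. It is not a real GPU simulator: it just assumes a group of 32 lanes that have to execute in lockstep, and the names (WARP_SIZE, warp_cycles) and the cycle costs are made up for illustration. When every lane takes the same branch, the group pays for one path; when the lanes scatter over ten branches, it pays for all ten, one after another.

```python
# Toy model of branch divergence on lockstep hardware.
# WARP_SIZE and all cycle costs are illustrative assumptions, not measurements.

WARP_SIZE = 32  # lanes that share a single instruction stream (assumed)


def warp_cycles(lane_paths, path_cost):
    """Cycles for one lockstep group to finish.

    The group has to step through every distinct path that at least one
    lane wants; lanes that are not on the current path are masked off
    and simply wait.
    """
    return sum(path_cost[path] for path in set(lane_paths))


# Ten possible branches, each costing 10 cycles (hypothetical numbers).
path_cost = {branch: 10 for branch in range(10)}

uniform = [0] * WARP_SIZE                              # all lanes take branch 0
divergent = [lane % 10 for lane in range(WARP_SIZE)]   # lanes spread over 10 branches

uniform_cycles = warp_cycles(uniform, path_cost)
divergent_cycles = warp_cycles(divergent, path_cost)
slowdown = divergent_cycles / uniform_cycles

print(uniform_cycles, divergent_cycles, slowdown)
```

The slowdown here is exactly ten only because the example spreads the lanes over ten equally expensive branches. A CPU core running one thread would only ever pay for the single branch it actually takes.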

Again: Completely different architecture, requiring entirely different programs. And all of this ignores that GPUs use different instruction sets anyway.

It would, if only it was real. It’s only cool until you realize the ducks are leaving nuclear, uhm, waste everywhere

Yes I meant CPUs. Fixed.

CPUs are okay at everything, but not the best at anything. Graphics cards can only do a couple of things (hence restricted) but dominate all other chips in these few areas. This is why computer graphics, fluid simulations and machine learning are all done by graphics cards: they are orders of magnitude faster, if only you can make your program run on them.

These libraries are software rasterizers. They are not useful for real world rendering and will not run on graphics cards.

You mean CISC? The card has nothing more than ~60 slow-ass Atom cores on it. Plain x86, i.e. CISC.


Ok no. No no no nonononono. You understand that a core is built out of a certain amount of FPUs, IPUs, and other compute components that then have an interconnect to other groupings, right? CUDA cores, at their very base, are slightly more capable FPUs. That’s all they are. You can have one mass of CUDA “cores”, but at most it’s a lot of FPUs piled on top of each other, even if they’re in designated areas. It’s part of the power design.

To add, the reason I am psyched about… What was it? The thing that AMD made that let you use a GPU as a CPU and the name started with an H. Fuck. Anyways, it was sick and would make APUs basically the same as a Phi; though nowhere near as powerful, they would do essentially the same tasks.

HSA. I want HSA to be a thing.

So, let’s dissect.

The above Pascal ASIC has 4 total cores, or GPCs. If I knew more about processor design itself, I’d say that SMs in this diagram are similar to threads in any old processor. I’m not going to make that judgement though, as that is far above where I am at. But I can say that, no, really, there aren’t thousands of cores in there. I really wish there were diagrams of Intel CPUs and AMD CPUs so that you could see that they are the same thing, but at this point all I can do is point you to some Gamers Nexus videos and documentation to say they are the same thing with different labels.

No, they aren’t. They do the same math as a normal processor, but because they have a stupid amount of IPUs and FPUs, all that shit, they can handle a ton more of that sorta stuff. 4 cores, 20 threads, to put it simply, going by the diagram above.

And just the same, CPUs have the same exact components but at a far lesser scale. I guarantee you that Knights Corner, as well as any Phi processor, is designed just the same as a GPU but with the intention of a CPU. They are their own technology.

Sorta? CPUs are built to process the same repeating requests thousands of times in a row. It’s why the AMD “AI” in the Ryzen CPUs does what it does. You couldn’t do that sorta tech in a GPU unless you had an assload of “AIs” running and trading information. GPUs are built to run any process, whether it’s the same one many times in a row or thousands of different instructions in one large line, as fast as possible. It’s why there are so many CUDA cores (Floating Point Units).

4 cores, but that’s beside the point.

Which is why I often make the comparison of GPUs to RISC CPUs. They handle data pretty similarly and are big endian as well. Built to eat and shit as much data as possible.

Uh, no.

Ok I was really really confused.

One thing you should look into with chip design is how Phis are the weird half baked Aspergers Cousin of both GPUs and CPUs, then compare a Phi to an APU. It’s kinda silly how similar they actually are, and it has to do with the MASSIVE differences in little and big endian chips.
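For reference, since little and big endian keep coming up: endianness only describes the order in which the bytes of a multi-byte value are stored. A minimal Python sketch using the standard struct module (the value 0x01020304 is just an arbitrary example) shows the same number laid out both ways, and what happens when the bytes are read back with the wrong order assumed.

```python
import struct
import sys

value = 0x01020304  # arbitrary example value

little = struct.pack("<I", value)  # least significant byte stored first
big = struct.pack(">I", value)     # most significant byte stored first

print("host byte order:", sys.byteorder)
print("little endian bytes:", little.hex())
print("big endian bytes:   ", big.hex())

# The same four bytes, read back with the wrong byte order assumed,
# decode to a different number.
misread = struct.unpack(">I", little)[0]
print("little-endian bytes misread as big-endian:", hex(misread))
```

The host byte order line depends on the machine the script runs on; the packed bytes do not, because the format strings pin the byte order explicitly.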

Not in their current state, no. It’d be a lot of work to even finish them; they’re probably half the reason the project was scrapped in the first place.

Sorta, not really. Complex RISC meaning that the cores are wired way differently than RISC, but nowhere close to CISC either. Half baked asperger cousin, more akin.

There needs to be a processor architecture megathread, but the mods are ass and would kill all the fun.


Hmm, we somewhat agree but use different terms. I have things to take care of right now but I’ll be back in a couple hours.


Indeed, I hope I am teaching you something. This has been the thing that kept me interested in computers at all for years.

I’m going to bed, I look forward to our conversation in the morning.

Also, if you want to look at it in an easier way, CPU manufacturers don’t show diagrams of their chips. Ever. RISC-V might have a diagram out there but fuck me if I will ever find it. But, compare an IBM Pi processor to a Phi or that Pascal up there, you will see they are basically the same. Couple cores, TONNES of threads. And on top of that, that’s CPU to FPGA to GPU. A monochrome scale of hardware, if you will. It’s because CPU manufacturers don’t release this sorta info but GPU manufacturers do that a lot of misinformation gets thrown around, such as that little endian x86 is truly the king and there’s more of it than ever before on the planet, when in fact RISC just left the CPU label and went to the GPU label instead.

Now I’m excited to go to bed to talk about this more tomorrow. This is a first in… well, years.




I don’t have anything to add, just a quick educational tidbit of that whole thing


yo dawg, we heard you like computers so we put a computer in your computer so you can compute while you compute



Why bother? Sure, it might be ok for old software we can (already) run in a VM, but long term all that does is still encourage use of DX rather than porting to something better … like Vulkan.

AFAIK there are already DX-to-Vulkan or Metal-to-Vulkan shims available…


The card is interesting as a compute accelerator, and that is it.
Yeah, it is a halfway step between GPU and CPU, but it fails at both tasks.

Extracting the code from some onboard storage (which may as well be baked into silicon) will give you some Intel-specific… something. You would then basically try Intel’s software rendering from 2003 2007 (?) on modern hardware. That will run like arse (if at all).
If that thing was worth a damn, Intel would have allocated more R&D budget and made it happen.


Let’s take a step back. The card is a Xeon Phi “Knights Corner” coprocessor. Despite what Linus says, this is not a graphics card and was never meant to be used as such. It’s an x86 processor with slow cores, but a lot of them. These are also not a new RISC architecture or anything of the sort, but full-blown x86 cores.

The cores at the heart of Intel’s first Xeon Phi are based on the P54C revision of the original Pentium and appear largely unchanged from the design Intel planned to use for Larrabee.


There are no additional shader cores either. (Almost) all computation, including scheduling and running the OS, is done on the ~60 Pentium cores.

Because this is a regular old computer, all of the rendering is done in software. There is no driver for shader units or anything like that. OpenGL is implemented as a regular program, running directly on the CPU cores. So if we were to extract the rendering code from a Xeon Phi, all we’d get is a slow-ass software rasterizer. Not useful.
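For anyone wondering what “rendering in software” looks like, here is a minimal sketch in Python. It is not Intel’s rasterizer (none of that code appears in this thread), just the textbook edge-function approach with made-up names (edge, rasterize): for every pixel, an ordinary CPU loop tests whether the pixel centre lies inside the triangle. No shader units and no driver are involved, only plain arithmetic.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area test: the sign tells which side of edge A->B point P is on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def rasterize(tri, width, height):
    """Return rows of '#'/'.' marking the pixels covered by one triangle."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # sample at the pixel centre
            w0 = edge(x1, y1, x2, y2, px, py)
            w1 = edge(x2, y2, x0, y0, px, py)
            w2 = edge(x0, y0, x1, y1, px, py)
            # Accept either winding order: all three tests must agree in sign.
            inside = (w0 >= 0 and w1 >= 0 and w2 >= 0) or (
                w0 <= 0 and w1 <= 0 and w2 <= 0
            )
            row.append("#" if inside else ".")
        rows.append("".join(row))
    return rows


image = rasterize(((1, 1), (7, 1), (1, 7)), 8, 8)
covered = sum(row.count("#") for row in image)

print("\n".join(image))
print("covered pixels:", covered)
```

Every pixel costs three edge tests per triangle, all executed one after another on the CPU. Real software rasterizers vectorise and tile this loop heavily, but the work still lands on general-purpose cores.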

I’ve also confirmed with @Nemes in the L1T discord that what I said before is correct. DirectX is provided by Windows, not the GPU/driver.

There is no DirectX code we can get off this thing to run Windows games on Linux. The DirectX code is in Windows.

(We’ve had an interesting discussion regarding GPU architectures in general should anyone care. You can find it in #dev_corner.)


Well, for starters, extracting the libs and rolling our own would be nice. And as I said earlier we have .NET now, so not only can we break down and rebuild applications, we can rebuild EXEs to run on Linux, though that takes a LOT of effort.

Would be a lot better than DXVK.


That thing is worth a big ol’ damn. Intel has been a mismanaged pile of ass ever since they started making money off of suing AMD in 2007, and we know this.

What that card should have proven out is that our GPUs could have done more on-the-fly sort of work. If a texture didn’t load, invent one that matches the scenery. Model missing? Build a placeholder. Stuff like that. On top of that, it’s extremely similar to an FPGA in that you update it all in software. So Vulkan comes out, bam, update the core. New OpenGL, bam, update the core. Yeah, you’d need a new one after you couldn’t do a core update anymore, or core updates were kinda meh, but something like that would have lasted a lot longer than the 2 year cycle our GPUs seem to have.


See, that wouldn’t make any sense to me. There has to be something in there, at the very least a rendering engine built in of some sort. Because it’s a hardware to software to hardware bridge, you have to have the libs on both sides to communicate and check back to correct mistakes. Reading through leaked docs, that’s how this thing is supposed to work.

Right, it’s a giant FPGA for all intents and purposes; it’s just that games were the most complex workload that Intel was willing to bother with at the time.

Well I never said that, but yes I agree with that as well. Assload of threads (not cores, I don’t believe).