A little teaser of what is to come :)

Amazing initiative! I look forward to it!

Tualha

@bsodmike now I just have the strongest urge to go watch 24 again. For those interested in the coding / inner workings of RDMA and RoCE, Mellanox has published a fairly comprehensive PDF on the topic.

Another place to look would be the Open Fabrics Alliance, though that might take a bit more digging as their Wiki leaves quite a lot to be desired.

Should anyone want to know more, PM me as I’d rather not derail the thread too much :wink:

2 Likes

Do I understand correctly that this already works with Intel CPUs, and the point of this thread is getting it working on AMD?

No, that is what the NPT patch was about.

1 Like

If you read carefully, it’s already running on Ryzen; it will be public soon.

The point of the funding campaign is to get it running on Threadripper (or to get power states working on TR4); I don’t know about Intel.

Slightly off-topic maybe?

Does that include the power-state problems on Vega?

I probably should have followed the development there a little closer, given that I use them.

Um… YES PLEASE?

Sort of… you might see the odd skipped frame; it’s best either to match framerates on the guest and the host, or to run the host at 2x the guest’s framerate (see Nyquist as to why). The higher the overall framerate, the lower the latency too: if you could run 240Hz, the maximum worst-case latency would be about 4.2ms.

Correct, it works on Intel and Ryzen, but certain configurations are substandard, such as running a Vega on the host like in Wendell’s setup.

2 Likes

So what are the base requirements hardware wise for this? I only have a quad core, no HT.

VGA PCI Passthrough… that’s it.

3 Likes

Neat. Will be keeping a cautious eye on this :eyes:

1 Like

Perfect name IMO

Can you see this integrating well with virt-manager? or would you need to do some extra stuff to get the input working?

Awesome, keep up the good work guys! :+1:

It would be possible, but that is something that libvirt would have to do; we are basically just inventing the technology here :slight_smile:

4 Likes

@gnif
So what happens once the frame is in the CPU? Is it drawn like any other application? I read a mention of SDL.
You might want to consider using Vulkan for this if applicable.

SDL is currently in use simply for rapid development and input events (X11 keyboard/mouse). It is only used on the host, and even then we are leveraging GL_ARB_buffer_storage directly to improve performance and reduce CPU load. So no, it is not drawn like any other application; a texture is created and we render it to a 2D surface.

Vulkan is certainly on the roadmap, but first we have to make it work :slight_smile:. Also, Vulkan is not likely to help much at all since we are only rendering two polygons with a texture; 99% of the overhead in that process is PCI bus and GPU, not API.
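For those curious about the GL_ARB_buffer_storage technique mentioned above, the general idea is a persistently mapped pixel unpack buffer: the buffer stays mapped while GL sources texture updates from it, avoiding a map/unmap and an extra copy per frame. A rough sketch under that assumption (not the project’s actual code; it presumes a GL 4.4+ context and an extension loader such as GLEW, and the BGRA frame format is an example):

```c
#include <GL/glew.h> /* resolves glBufferStorage and friends */

GLuint pbo, tex;
void *pixels; /* stays mapped for the life of the buffer */

void setup(int width, int height)
{
    GLsizeiptr size = (GLsizeiptr)width * height * 4;
    GLbitfield flags = GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT |
                       GL_MAP_COHERENT_BIT;

    glGenBuffers(1, &pbo);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    /* Immutable storage that may remain mapped while GL reads from it */
    glBufferStorage(GL_PIXEL_UNPACK_BUFFER, size, NULL, flags);
    pixels = glMapBufferRange(GL_PIXEL_UNPACK_BUFFER, 0, size, flags);

    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
                 GL_BGRA, GL_UNSIGNED_BYTE, NULL);
}

void upload(int width, int height)
{
    /* New frame data is written into 'pixels' (e.g. copied from shared
       memory); the texture update then sources from the bound PBO. */
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                    GL_BGRA, GL_UNSIGNED_BYTE, (const void *)0);
}
```

This cannot run standalone (it needs a live GL context), but it shows why the API overhead is small: per frame it is one memcpy into mapped memory and one texture update.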

4 Likes

Remember that while on the surface this looks simple, it is quite complicated to tune, and here is why:

  1. The guest software has to implement NvFBC and DXGI Desktop Duplication, each of which has its own problems.
  2. The guest software is a Windows application and needs to access a shared memory resource (IVSHMEM) from user mode (Ring 3). Events are used for synchronization with the host. This is another project all on its own: a kernel-mode driver.
    1. The IVSHMEM driver provides access from Ring 3 to the shared memory resource which it maps into the guest software’s application space.
    2. The IVSHMEM driver also needs to allow the guest software to register events that are set when interrupts arrive.
    3. The IVSHMEM driver also needs to allow the guest software to ring a doorbell, letting the host know of changes.
  3. The host runs an ivshmem-server that provides the glue for interrupt events between the host software and the ivshmem virtual device.
  4. The host runs the client which connects to the ivshmem-server where it is able to map the same block of memory that the guest has mapped, and register for events.

So to summarize: we have four projects all tightly dependent on each other. Two are Windows projects, one of which is a driver (which adds to the complications), and the other two are Linux applications.

We are using DirectX or NvFBC on Windows and OpenGL (later perhaps Vulkan) under Linux. Then mix into that different GPU vendors and different nuances of those devices.

You can see why this project has not been done by anyone else yet. Too many technologies had to come together at the same time. I know for a fact that there are things in this mess that can be made better, and that is the entire point of going open source. There are people much smarter than I am at the specific parts they specialize in (e.g. OpenGL and DirectX devs), but on their own they wouldn’t be able to build the other parts of this system.

This is the start of a much larger project that is going to need more developers; it is not a one-man show. I will do my best to get the best I can out the door so you all can use it, but more so that other developers can join in and help improve it.

I honestly started this project for fun, to see if it was possible, and because I wanted it; the fact that everyone else wanted it too hadn’t even crossed my mind until I saw the response to the NPT fix. I had a prototype of this working before I even came across the NPT bug :slight_smile:

13 Likes

Alright, looks like it’s a lot of work, but the payoff is good.
Oh, and in that video Hans-Kristian found Vulkan did help with a 2D texture, so there is that for later.
This whole thing also opens the door for other fluff features, like scaling the entire frame with a pixel shader, something you can’t even do on a native Windows machine. I know someone who might be interested in going somewhere with that.

How is input performance compared to say evdev passthrough with virtio input?
Thanks for taking this on and good luck.

No idea at this point; it has not been my focus. The primary goal of this was to make a SPICE client that didn’t use SPICE for video, for easy integration with the host’s window manager.