The perfect build?

So we've had a few threads about passing through a GPU to a Windows virtual machine and running that inside/under Linux. I was thinking about demoing this setup:

Linux Host based on Skylake. Everything set up for virtualization, VT-d, and all that. Linux would be well set up -- for 4k, multiple monitors, etc. -- connected to the onboard graphics.

Add-in graphics card, probably an AMD 390X, connected via DP to the same 4k display (second input).

A Windows VM would be set up with the AMD 390X passed through to it. The Logitech MX Master would be paired to the Windows machine through USB, and the Bluetooth side of the MX Master paired to Linux (the MX Master can swap between up to 3 computers with a mix of Bluetooth and Logitech Unifying receivers). The thumb button on the MX Master is trapped on Windows by a custom program to 'release' the USB keyboard passthrough, and on Linux it releases the keyboard and passes it through to Windows.
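On the Linux side, the 'release' could be as simple as hot-plugging the usb-host device in and out of the guest over the QEMU monitor. A rough sketch, assuming the VM is started with a monitor socket (e.g. -monitor unix:/tmp/win-mon,server,nowait) and that the bus/address values match your keyboard -- all names here are illustrative:

    #!/bin/bash
    # Hypothetical helper bound to the thumb button on the Linux side: toggle the
    # USB keyboard between host and guest via QEMU's HMP monitor.
    MON=/tmp/win-mon            # assumes -monitor unix:/tmp/win-mon,server,nowait
    STATE=/tmp/kbd-in-vm

    if [ -e "$STATE" ]; then
        # Keyboard currently owned by the VM: detach it so the host gets it back.
        echo 'device_del kbd0' | socat - "UNIX-CONNECT:$MON"
        rm "$STATE"
    else
        # Keyboard currently on the host: attach it to the VM.
        echo 'device_add usb-host,hostbus=3,hostaddr=2,id=kbd0' | socat - "UNIX-CONNECT:$MON"
        touch "$STATE"
    fi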

I was also going to show how, if you don't care about gaming on windows, you can run windows apps seamlessly via RDP (rather than the virtualbox passthrough). That's how I run some things that I need on windows.
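For the RDP route, FreeRDP can even publish a single Windows program as its own window on the Linux desktop via RemoteApp. A minimal sketch, assuming the app is published as a RemoteApp on the Windows side, with placeholder host/user names:

    # One Windows app as a seamless window on the Linux desktop (names are examples).
    xfreerdp /v:winvm /u:wendell /app:"||notepad" +clipboard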

Windows sandboxed this way really does work pretty well. It is annoying that you have to actually power cycle (a reboot is not enough); otherwise it'd be easy to shut down/wake up the Windows VM with and without the passthrough GPU.

This method lets you run Linux 100% of the time while having a really "solid" Windows install for all the windowsy things you need.

You do have to toggle the 4k display between the two inputs (HDMI and DP), but that's not too bad imho. I can probably make an Arduino or a Teensy toggle inputs for me when I hit that mouse button, if nothing else. If I don't get the USB keyswitch thing working well, I'll use a USB keyboard switcher and an Arduino in HID mode to both toggle the display (infrared, yay) and send the display-toggle keystroke to the KVM.
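As an alternative to the IR route: a lot of monitors will switch inputs over DDC/CI, so the Linux host might be able to flip the display itself. A sketch using ddcutil, assuming the panel exposes VCP feature 0x60 (input source) and keeping in mind the input values are monitor-specific:

    # Flip the 4k display between inputs over DDC/CI (values vary by monitor).
    ddcutil setvcp 60 0x0f   # e.g. DisplayPort (Windows VM side)
    ddcutil setvcp 60 0x11   # e.g. HDMI (Linux/iGPU side)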

It'd be nice if we could pass through the framebuffer over PCIe the way it works with Crossfire on AMD or Optimus on mobile devices. Then Windows/Linux could share a GPU much more easily with full h/w acceleration. I suspect people are being paid not to work on that though (/tinfoil).

With this setup one has the option of:

  • Passthrough Graphics -- Steam In Home Streaming from The Windows VM back to Steam on Linux (LOZL!)
  • Reboot WinVM w/different config for Virtualized Graphics -- Run apps seamlessly, but no GPU acceleration
  • Passthrough Graphics -- Toggle input and be "in windows" at 60fps. Seamlessly move your mouse from Windows to Linux on a secondary monitor with Synergy or something like that.

UPDATE: 2015-08-28

I've run into a problem with Skylake, though. I've done this before, but never on Skylake, and always with pci stubs instead of vfio. I'd really like to see if vfio/full Intel IOMMU is newer/better/etc. With intel_iommu=on and the i915 enable kernel parameters, I get this at boot:

DMAR:[DMA Write] Request device [00:02.0] fault addr 7099000
               DMAR:[fault reason 23] Unknown
[   10.805742] dmar: DRHD: handling fault status reg 2
[   10.805746] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 709b000
               DMAR:[fault reason 23] Unknown
[   10.822677] dmar: DRHD: handling fault status reg 2
[   10.822683] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 709d000
               DMAR:[fault reason 23] Unknown
[   10.957100] dmar: DRHD: handling fault status reg 2
[   10.957114] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 709d000
               DMAR:[fault reason 23] Unknown
[   11.090397] dmar: DRHD: handling fault status reg 2
[   11.090404] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 709d000
               DMAR:[fault reason 23] Unknown

I am not sure why. I get a graphical prompt but if I try to do anything, I get a kernel panic and the machine hard locks. I can switch to a text console and continue fine. I'm using arch with kernel 4.1-6something (I think).

I'm going to try intel_iommu=pt and see if that resolves the issue. Update: it does resolve the issue, but has it broken my passthrough? Not sure.

I haven't gotten to the part where I actually pass through the Asus AMD 390X GPU, however.

The boot drive is an Intel 750 NVMe SSD, and I'm using systemd-boot to boot (not grub2, because... well, grub2 seems pretty stupid about NVMe devices right now).

Anyone seen anything like this before? The arch forums weren't much help and the 234 page how-to for passthrough has become a bit outdated with the slick new vfio stuff.

21 Likes

Yeah, with a 6700k and an R9 390 you should be good to go.
Pick that ASRock Extreme7 board and 16GB of RAM.

As for the Linux host OS, I would be interested to see this on Ubuntu.
But you could also do KVM with openSUSE.

Would be awesome if you could do a video on this,
because I think a lot of people might be really interested in going this way,
with all the shenanigans going on with Windows 10 atm.

1 Like

I'd love to see a video from you on GPU passthrough with Ubuntu, OVMF and hugepages. It would also be nice if you could do some benchmarks on the performance loss from native performance on apps to virtualized. Also some measurements on input latency and network latency in case the VM NIC is bridged would be nice. You might also try 2 video cards, the 390X you mentioned and an Nvidia one like the 980Ti and show all the quirks for the Nvidia setup.

I think that if more people knew about this and were comfortable with it, then Linux adoption would increase. For the majority, apps like those in the Adobe suite and a lot of games are major reasons why they don't switch over to Linux. So, yes, please do a video on this!

1 Like

Couldn't you just run one 4k monitor off of the integrated graphics, and a second monitor dedicated to the windows VM off the GPU? Turning on the 2nd monitor whenever you needed to use it?

Just gotta turn yourself a bit to use the 2nd monitor when gaming I suppose.

I'm really interested in seeing this on the Linux channel. I think i'm more excited about this than any sane person should be.

Use an Ubuntu flavour for sure. 390X would be nice, if there are any differences between it and an Nvidia card, they would be worth mentioning. The RDP stuff is also definitely worth demoing.

Sounds like a really good idea. I'd love to see the gaming performance of games that run on both Windows and Linux this way: Windows in a VM vs. native Linux vs. native Windows -- do you gain any performance doing things one way or the other? GPU passthrough sounds like a really good idea indeed. I am dual booting at the moment just for the sake of keeping things separate, and I am working towards more separation for security purposes, but it's kind of annoying to have to plug and unplug stuff whenever I want to boot into another OS, so this would be really practical when I just need to check something in Windows. Also, with the GPU passthrough I'd like to see how well F@H on the Windows VM works with the GPU.

Huh. I always found SUSE better for specialized stuff like this. Could also drop down to a 390. 390X isn't really worth it in my opinion.

I would absolutely love to see this. I've used Linux distros on and off over the years but never did fully switch over.
Somehow the dual boot thing always annoyed me so I ended up with pure Windows at some point or another.
Now I am dual booting again because of Windows 10 and my ongoing interest in Linux.

I need / want Windows for Lightroom and Photoshop, the accompanying colorimeter software to keep the monitor in check, and a few games that do not run on Linux. All the other stuff I use on a daily basis is open source already or has a decent replacement in the Linux world.

Out of curiosity: what distro is Wendell using personally?

Don't know anything about solving your problem, sorry.

Just to add to the suggestions above: VRR (G-Sync/FreeSync) should work fine, right?
Maybe check it out, because if we are chasing the ultimate build, we want ultimate gaming as well.

something something IOMMU groups?
https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#IOMMU_groups
I never needed that ACS patch.
Not sure if it's required to pass through the graphics card's audio device along with the graphics card. Make sure there's no sound driver attached to that audio device (lspci -k).
I'm on the vanilla arch kernel, passing through a 290x to a Windows 7 VM works nicely (using OVMF to get around vga-arbitration problems with i915, this might also affect PCI reset behaviour AFAIK).
I'm on Haswell though. I hope it's not your Skylake IOMMU being quirky.
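For what it's worth, a quick way to check that nothing on the host has claimed either function of the passthrough card before the guest starts (addresses here are examples; adjust to your card's):

    # "Kernel driver in use:" should show vfio-pci (or nothing), not radeon
    # or snd_hda_intel, for both functions of the card being passed through.
    lspci -nnk -s 01:00.0
    lspci -nnk -s 01:00.1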

It seems as though intel_iommu=pt solves the issue, but seems to effectively disable iommu on skylake.

Still doing research.

The 390X is in its own iommu group, so I expect that would be okay. I am (attempting to) use OVMF.
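A quick sanity check that the IOMMU is really active (and not just quietly bypassed) is whether the iommu_groups tree gets populated at boot; something like:

    # If the loop prints nothing, the IOMMU isn't actually on despite the kernel parameters.
    dmesg | grep -i -e DMAR -e IOMMU | head
    for d in /sys/kernel/iommu_groups/*/devices/*; do
        g=${d#/sys/kernel/iommu_groups/}
        echo "group ${g%%/*}: $(basename "$d")"
    done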

Sounds pretty rad to me.
A couple questions and comments, though:

1) Which hypervisor are you using? As far as I know, KVM and Xen are the only ones capable of VGA Passthrough.
You mentioned pci-stub, which, as far as I recall, is KVM-only...?
2) Kernel 4.x is still a little....quirky....and I've personally held off on updating my own. I dunno if you'd consider it a valid or feasible option, (especially with Arch and other more cutting/bleeding-edge distros) but 3.x might be the much simpler option to choose.
3) If you want to toggle between displays more easily, and don't want to add a second display, you could just use a kvm switch-- (I'm referring to the physical hardware, having nothing to do with KVM the hypervisor) the wires get a little out of control and the screen takes a few seconds to start working again, but it works for me, for the most part. I find it's also a lot better than dealing with synergy's latency.
4) Why Skylake, if you don't mind my asking? With a typical quad-core mainstream chip you're basically stuck with two i3 (virtual) machines, eliminating any real potential for productivity, if that's your thing.
5) "Being paid not to work on that?" lol, actually, AMD has been really cool about this. They have an engineer who routinely tests out their products with VGA passthrough; Nvidia are the ones who actively prevent this from happening, unless you fork over 5k for a Quadro card. (or unless you can flash your card's BIOS to behave like any other similar "multi-OS" board.)

6) Since the Arch topic has become deprecated, some guy has taken up a blog and formed a mailing list to keep the subject updated over time. The link should be at the top of the first page on that same topic.

7)
REGARDING the [fault reason 23],

It looks like the device it's "requesting" is whatever's in the first pci slot. Running [lspci] returns something with an address similar to that, with "00:xx.x" being the first pci slot. [fault reason 23] appears to be storage related, and the fault address sort of appears to be a memory address. So if that 750 is in the first slot, that may be the problem...?
That's just conjecture, though, since I've never owned or experimented with an NVME, so I wouldn't know for sure. :c

To address the less-likely problem:
8) I'm sorry if this sounds like a "captain obvious" solution, but I don't know if you've messed around with device assignment before, so here's the idea anyway:
You need two display adapters in order to run this setup -- one for Linux and one for passthrough. (You implied early in that post that you expect Linux and Windows to "share" a GPU... sorry again if you're already aware of this and this is a pointless thing to address.) The card you use for passthrough needs to be completely hidden from Linux from the get-go; once Linux touches the video card, that's it -- it can't be dynamically unloaded and then reloaded into Windows until you reboot the system completely.

Try to ensure that the open source drivers are not in use (it has something to do with nouveau and radeon being built into the kernel; I think this isn't actually necessary, but it's definitely a lot easier this way). Another piece of advice, for convenience: if dom0's and domU's video cards use the same drivers, try to ensure -- if possible -- that they are from two different manufacturers. This has something to do with the device id, but again, I can't really specify why. It works for me, but then again, I can't really make an educated comparison between Xen and KVM. : /
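On the KVM/vfio side, one common way to do the "hidden from the get-go" part is to have vfio-pci claim the card's IDs before any graphics driver loads. A sketch -- the IDs below are an example AMD GPU + HDMI audio pair, use your own from lspci -nn, and vfio-pci has to be available early (initramfs or built in):

    # /etc/modprobe.d/vfio.conf -- grab the passthrough GPU and its audio function
    # before radeon/amdgpu can touch them (example IDs).
    options vfio-pci ids=1002:67b0,1002:aac8
    softdep radeon pre: vfio-pci

    # or, equivalently, on the kernel command line:
    #   vfio-pci.ids=1002:67b0,1002:aac8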
The reason I bring this up is that if you only have one display adapter and that's where it's located, (that doesn't look like a video card, though) the problem might be because Linux has already claimed that GPU.
.
.
.
Again, though, I find it more likely that the NVME is the problem. : /

That's all I got for now. Can't tell you how stoked I am to see you guys are finally [considering] doing a video on this.

Thanks for going into details! I'm using KVM.

So it turns out it is a problem with the IOMMU and the i915. It is likely a kernel bug, but it may be fixed in kernel 4.2. I'm doing this on Arch just for the extra challenge, or I would have already recompiled the kernel, since 4.2 was released today. Here was the fix (it was a couple of things):

i915.preliminary_hw_support=1 intel_iommu=on intel_iommu=igfx_off pcie_acs_override=downstream
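Those go on the kernel command line; with systemd-boot that means the options line of the loader entry, something like this (title/paths/root device are illustrative, adjust to your install):

    # /boot/loader/entries/arch.conf (example entry)
    title   Arch Linux
    linux   /vmlinuz-linux
    initrd  /initramfs-linux.img
    options root=/dev/nvme0n1p2 rw i915.preliminary_hw_support=1 intel_iommu=on intel_iommu=igfx_off pcie_acs_override=downstream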

I am not really convinced that the acs_override is required. Interestingly, if I do intel_iommu=igfx_off by itself, the whole Intel IOMMU thing is shut off. I have to do both of those, it seems, to get it to work. That is... different... from my other system with Haswell Xeons.

I went with vfio instead of pci stubs because I have no idea. A lot of guides/howtos use pci stubs and MBR booting. None of that for me... it needs to be UEFI all the way down, plus secure boot or whatever not-quite-secure-boot UEFI support is required for fast boot. OVMF is a must... I want that 2-second Windows 8/10 boot time.

This system is Skylake, so I have the iGPU + Asus Strix 390X.

-[0000:00]-+-00.0  Intel Corporation Sky Lake Host Bridge/DRAM Registers [8086:191f]
           +-01.0-[01]--+-00.0  Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon R9 290X] [1002:67b0]
           |            \-00.1  Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:aac8]
           +-02.0  Intel Corporation Sky Lake Integrated Graphics [8086:1912]

That's the PCI bus layout with me passing through 67b0 and aac8 only. iommu groups look good:

/sys/kernel/iommu_groups/1/devices/0000:01:00.0
/sys/kernel/iommu_groups/1/devices/0000:01:00.1

https://www.kernel.org/doc/Documentation/Intel-IOMMU.txt

There is probably some weirdness with skylake, though I hope it is not Asus UEFI tables or something. The boot line above seems to work fine though I suspect the iGPU can never be passed to a VM. That suits me fine though.

By "paid not to work on that" I meant something like cross-gpu framebuffer. I have a friend that surmises that because crossfire works the way it does, without the bridge, it should be possible to "bridge" two AMD gpus to have a situation such that one GPU is for windows and the other is for linux. Rather than physically connecting both connectors, a framebuffer on one gpu is mapped through to the other. He's made some good progress on this but only glxgears (iirc) runs at the moment. It's very fast/low overhead. You're right, though, AMD is doing more good work here than nvidia. Actually it seems that nvidia is actively sabotaging this kind of thing unless you run a quadro, which I will point out in the video. It is possible to passthrough nvidia but you have to do it in such a way the virtualization cannot be easily detected.

The blog was helpful, but not complete, and I think most of my issue was down to the newness of Skylake.

I think at this point I'm fully operational with the above settings.

Here's what I used for my qemu startup script:

 qemu-system-x86_64 \
        -serial none \
        -parallel none \
        -nodefaults \
        -nodefconfig \
        -enable-kvm \
        -name Windows \
        -cpu host,kvm=off,check \
        -smp sockets=1,cores=2,threads=1 \
        -m 8192 \
        -device ich9-usb-uhci3,id=uhci \
        -device usb-ehci,id=ehci \
        -device nec-usb-xhci,id=xhci \
        -rtc base=localtime \
        -nographic \
        -netdev tap,id=t0,ifname=tap0,script=no,downscript=no -device e1000,netdev=t0,id=nic0 \
        -device vfio-pci,host=01:00.0,multifunction=on \
        -device vfio-pci,host=01:00.1 \
        -vga none \
        -device usb-host,hostbus=3,hostaddr=3 \
        -hda /dev/sdb \
        -boot menu=on \
        -device usb-host,hostbus=3,hostaddr=2 \
        -device usb-host,hostbus=3,hostaddr=5 \
        -drive if=pflash,format=raw,readonly,file=/usr/share/ovmf/x64/ovmf_code_x64.bin \
        -drive if=pflash,format=raw,file=./Windows_ovmf_vars_x64.bin \
        -hdb ./win10.qcow2

The awesome thing about this UEFI setup is that if I want to free up the GPU I can restart windows in about 10 seconds.

Why Skylake? UEFI support all the way down, and DMI 3.0 is fast enough to keep up with the NVMe SSD.

Well, I guess that confirms I won't be of much help. : /
I was about to start shotgunning ideas anyway, but it looks like that might be pointless since some of this is way over my head. Hope you don't mind my asking a couple questions instead.

I don't think I'm really taking the journey with you here, because I can't tell if you're talking about SLI or device assignment now. When you say it ought to be possible to bridge a connection between two GPUs on two different machines (Linux and Windows, in this case), it almost sounds like you're talking about distributed computing...? Or would I be closer to the mark if I assumed this is more like streaming the framebuffer to whatever display belongs to Linux?
I think part of my misunderstanding here is that I'm not fully grasping how this would work in practice.

Yeah, I wasn't aware until just today that KVM could sort of "sneak" a card through the hypervisor like that -- they only openly enable it for boards marked as "Multi-OS compatible," and getting it to work on Xen involved some crazy stuff like pencil modding and BIOS flashing (i.e., a 680 convincingly disguised as a Quadro K5000).

Anyway, I guess that's just another reason to start experimenting with KVM. Hopefully your video will shed some light on this "black art."
Thanks for the educational response-- it's always a pleasure.

To add my two cents to this topic:
KVM switches:
Tl;dr: good luck finding one that supports something digital above full HD and doesn't cost as much as a cheap monitor itself.
Synergy:
Is usable even for twitchy FPS gaming.
Nvidia and GPU passthrough:
Basically, the KVM "hidden" switch (kvm=off) tries to hide any traces of the virtualization in place. There is no failsafe way for the nvidia driver to detect it's running inside a VM; it just looks for all kinds of small clues - paravirtualized clocks, CPU information mentioning KVM, etc.
This IS some money-making bullshit. There's a well-known hack of swapping out SMD resistors on the card to make the VGA BIOS think it's a Quadro model. Apparently a couple of weeks (months?) ago someone figured out how to patch the nvidia binary to not shut down on KVM detection, along with enabling G-Sync on regular DP monitors and such. He got shut down pretty fast, but apparently archive.org still has it - I'll try to find that link soon.
1 VM on a 4C8T CPU = 2C4T for everyone
Basically - no. By default, the virtual cores of the VM are just ordinary threads on your host, so you could have 3 virtual cores (no hyperthreading on them) shuffled around happily by your scheduler across your 8 logical host cores. I'm not sure what counts as 'productivity' for you, but I found something similar to work quite well for me (I do some core pinning). I got a steal on a 3930k + board recently, so I switched to 4 cores (8 threads) for my VM and 2 cores (4 threads) for my host, including some pinning via cgroups. Performance improvements in most games were... unspectacular, as they mostly max out a single core.
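With a plain qemu script like the one above, the crudest version of that is to confine the whole qemu process to a fixed set of host cores; proper per-vCPU pinning needs cgroups or libvirt's cputune, but as a rough sketch (core numbers are just an example for a 4C/8T chip, and start-windows-vm.sh stands in for the script above saved to a file):

    # Keep qemu - and with it all vCPU and I/O threads - on cores 2,3 and their
    # hyperthread siblings 6,7, leaving 0,1/4,5 for the host.
    taskset -c 2,3,6,7 ./start-windows-vm.sh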

Wendell's QEMU script
Oh my. Why not use libvirt? virt-manager covers 99% of the XML you need for KVM gaming and is quite newbie-friendly. It also does image creation and such for you. This also helps when you have to shut down your PC from afar because you did something stupid that led to your PC becoming headless. In that case you can tell libvirt to safely ACPI-shutdown your Windows VM before turning off your host - all via SSH.
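That remote shutdown is basically a one-liner once the guest is managed by libvirt; a sketch with placeholder host/domain names (and a crude sleep instead of actually polling the domain state):

    # Ask the guest to shut down cleanly via ACPI, then power off the host.
    ssh myhost 'virsh shutdown win10 && sleep 90 && sudo systemctl poweroff'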

Did you include using hugepages for backing that virtual RAM?

1 Like

I really would like to switch to this kind of solution, I just don't know how (sorry guys, but I'm still a Linux noob). I'm looking forward to a well-done guide about it, and if you can link some, I would appreciate that.

The GUI was not playing nice with the required UEFI firmware that has to be substituted in. There was also some BS with virt-manager only wanting to create LXC containers. Probably missing some package somewhere, even though the CLI was working fine. Besides, I think if someone copies the script they'll be fine -- except for the networking part, which arch and/or I failed to get working correctly in the GUI. Isn't that the arch way anyway? LOL. You're right, though -- on Fedora/Rawhide it was almost a cakewalk.
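For anyone copying the script: since it uses script=no, qemu won't create or configure tap0 for you, so the host side has to be set up by hand. Roughly, with iproute2 (bridge/NIC/user names are examples):

    # Host-side networking for the VM's tap0 (run as root).
    ip tuntap add dev tap0 mode tap user wendell   # tap owned by the user running qemu
    ip link add name br0 type bridge               # or reuse an existing bridge
    ip link set tap0 master br0
    ip link set tap0 up
    ip link set br0 up
    # To put the VM on the LAN, enslave the physical NIC and move its IP to br0, e.g.:
    #   ip link set enp0s31f6 master br0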

For the KVM, 4k DP KVMs are about $400. However, I have a cheap USB/VGA KVM that is fine for swapping the keyboard/mouse from real to virtual. But you can use Steam In-Home Streaming, VNC and Remote Desktop to never need to swap keyboards at all. You are right about Synergy -- with the proper local network config, it should be sub-millisecond latency.

2 cores/4 threads on the 6700k seems to be the sweet spot for most gaming/most games. It's 90-101% native speed (depends on the game). "Good Enough"

He got shut down pretty fast, but apparently archive.org still has it - I'll try to find that link soon

I heard about that, but saw no details. Need details on this; I've seen similar. The g-sync enable was for freesync monitors but only mobile GPUs (980M) as far as I know.

Did you include using hugepages for backing that virtual RAM?

Didn't seem to need to; arch seemed to have this configured out of the box.
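What Arch gives you out of the box there is transparent hugepages. If you want explicit hugepage backing anyway, a minimal sketch for the 8GB guest above would be something like:

    # Reserve 2MiB hugepages for an 8GiB guest (plus a little slack) and back the VM with them.
    echo 4300 > /proc/sys/vm/nr_hugepages
    mountpoint -q /dev/hugepages || mount -t hugetlbfs hugetlbfs /dev/hugepages
    # then add to the qemu command line:
    #   -mem-path /dev/hugepages -mem-prealloc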

Yeah, the magic happens on the bus, largely absent the CPU (normally, with SLI). Apparently. So in this case it doesn't matter too much that the virtualized OS gets half the graphics card and the real OS gets the other half? Something like that? Because at the end of the day the data is just copied from the framebuffer on the GPU passed through to the VM directly into the other GPU's video memory. So I gather it is like one GPU painting a rectangle on the other GPU? But I am not really sure how it works in reality.

The black magic seems really shady. The video will only demo a 390x; no demo of a 980Ti but my config above does have the crap turned off to get a 980Ti to work in the VM. I think. 85% sure. I can't think of any reasonable reason they would have done that.

1 Like

check out Heterogeneous computing....