GPU Passthrough on Arch Linux - BSOD on driver install (Radeon 7970)

2016-03-26 14:31:27.196+0000: starting up libvirt version: 1.3.2, qemu version: 2.5.0, hostname: workstation
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/sbin/qemu-system-x86_64 -name win10-uefi-nospice -S -machine pc-i440fx-2.5,accel=kvm,usb=off,vmport=off -cpu IvyBridge -drive file=/usr/share/edk2.git/ovmf-x64/OVMF$
Domain id=2 is tainted: high-privileges
2016-03-26T14:32:07.887539Z qemu-system-x86_64: terminating on signal 15 from pid 8351
2016-03-26 14:32:09.288+0000: shutting down

The formatting isn't great; what's a good tool to view logs like these?

How do I apply the .conf? Simply creating a file in /var/lib/libvirt/images doesn't make the VM show up in virt-manager.

In my case the XML is here:

/etc/libvirt/qemu/Windows10.xml

To use the XML directly, create an XML file as above (name it anything you want) then run:

virsh define xml_file_path

to register it with libvirt. You can then start it with

virsh start name_of_vm

The name is whatever you have in the name field of the XML. Edit the XML again with:

virsh edit name_of_vm
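
For example, a minimal run-through, assuming you saved the XML as ~/Windows10.xml and its name field says Windows10 (both just placeholders, use your own path and name):

virsh define ~/Windows10.xml
virsh start Windows10
virsh edit Windows10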

I usually view logs with less. It wraps long lines too.
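
For example, libvirt keeps a per-domain log, so for the domain in the snippet above it would be something like (assuming the default log location):

less /var/log/libvirt/qemu/win10-uefi-nospice.log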

Had to make a lot of edits as I was getting various errors. I used a previous .qcow2 container with Win10. Still the same result: no signal. The GPU sure sounds like it's doing something (audible fan change when the VM starts) but no video.

One thing I was curious about: when you were using the ACS-patched kernel, did you set the pcie_acs_override=downstream kernel option?

I did, along with splash and the "i915.enable_hd_vgaarb=1 iommu=on intel_iommu=on" options, though I really don't think I needed either of the patch options since the PCI groups were clean.
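
For reference, those options just go on the kernel command line; a rough sketch assuming GRUB on Arch (on systemd-boot the same string goes on the options line of the loader entry under /boot/loader/entries/), values only as an example:

# /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on iommu=on i915.enable_hd_vgaarb=1"

# regenerate the config afterwards
grub-mkconfig -o /boot/grub/grub.cfg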

Well, I am officially out of ideas. At this point I would just buy a 390X, but there's no reason to believe it would work either. The only other thing I could do is a fresh install of Arch or another OS and run a series of confirmed-working commands, which would give a strong indication of whether it's hardware or software. Is there an article that you guys followed exactly, or do you remember the steps?

Most of us followed this guide to a certain degree; the entire series is well worth a read.

Certainly some good info, but quite vague. Is there any chance the fact that I am using systemd-boot instead of GRUB, or vfio-pci instead of pci-stub, could cause my issues?

Like I said above, I'm not an expert... but I did try UEFI and vfio and never got it to work, even though I could see the GPU had been blacklisted and there were no conflicts in Linux. I switched to using pci-stub and voila! It has been working ever since. I actually have a list of devices that are being passed to the guest using pci-stub and all work as expected.

Something hardware-wise in my system didn't like UEFI, and it didn't like vfio either. As to what it was that kept me from making it work, I haven't a clue, because everything looked correct and should have worked but didn't... pci-stub worked like a charm though... and of course I do use GRUB.

Definitely worth a try then. What do you mean by "try UEFI" exactly? Do you mean to say you boot your system in BIOS mode, or are you referring to the GPU's UEFI firmware, or...?

When you used vfio, were the GPU devices bound to vfio-pci? I.e., did the output from lspci -nnk show "Kernel driver in use: vfio-pci"? Was the PCI grouping clean (i.e. no other devices in the same group as the GPU)? See the original post for an example of my PCI grouping (clean).
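
Something roughly like this is what I mean (the bus addresses and IDs below are only an example, not your card):

01:00.0 VGA compatible controller [0300]: AMD/ATI Tahiti XT [Radeon HD 7970] [1002:6798]
	Kernel driver in use: vfio-pci
	Kernel modules: radeon
01:00.1 Audio device [0403]: AMD/ATI Tahiti HDMI Audio [1002:aaa0]
	Kernel driver in use: vfio-pci
	Kernel modules: snd_hda_intel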

Can anyone else comment on any of this? Will try pci-stub tomorrow, but to try GRUB I would do a clean test install. Any general reasons not to use systemd-boot over GRUB?

Yeah... GPU UEFI. I tried to follow how Wendell did it on his Skylake build, and yes, vfio did show the device bound to it. As far as the grouping, to be honest I really don't remember; that was over 9 months ago and I was only 3 months into learning Linux, so my head was whirling around a lot trying to digest everything. Looking back at my notes (yeah, I keep notes sometimes) I believe the grouping was, as you say, clean, but I tried so many different configs and several different distros (starting with Ubuntu, then openSUSE, and finally settling on Fedora), so I was wiping the drive and reinstalling a lot. I did a ton of testing and didn't keep notes every time I tried something if I had tried it before on a different distro, but I used the notes as a guideline.

I have a slightly odd setup also, because I'm using 3 GPUs (2 identical 270s and a 270X), so it was tricky figuring out which card was which sometimes. But bottom line, following the guide in the link I provided above and just using pci-stub got it working. If you're set on doing this I would try pci-stub; nothing to lose, and I really don't see where using GRUB over systemd-boot would make much difference. You already know that you were able to bind the card to vfio using systemd-boot, so pci-stub should work the same I'd think... but again, I'm no expert.

What about the GPU UEFI? Isn't UEFI firmware a requirement? Either way, it's not like I can disable UEFI on my GPU (even the oldest GPU firmware is UEFI). Please clarify.

I'll try.....from Wendell's Skylake build he mentioned booting the VM in UEFI mode.

"The next piece of the puzzle is that we want to boot our Virtual
Machine in UEFI mode. QEMU/KVM doesn't do that out of the box with
Seabios, so we need to get a UEFI bios. Fortunately the Fedora folks
have put together an awesome UEFI. You'll want to grab and install it
from here:

There seem to be two UEFI files -- one that is the actual UEFI and
one to store UEFI vars. The one to store UEFI vars you'll want to copy
to your home folder and use it that way. "
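
(For anyone following along: in libvirt terms, those two files end up in the <os> section of the domain XML. A minimal sketch, using the edk2.git directory that shows up in the log at the top of the thread; the filenames and the nvram path are just examples, adjust them to whatever your package actually installs:)

<os>
  <type arch='x86_64' machine='pc-i440fx-2.5'>hvm</type>
  <loader readonly='yes' type='pflash'>/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
  <nvram>/var/lib/libvirt/qemu/nvram/win10_VARS.fd</nvram>
</os>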


Doing the above supposedly makes the VM (KVM) boot in seconds. I tried to configure this several times but had no luck. I'm not really sure why, but someone told me at the time that it was related to the age of my GPUs: while they support UEFI, something wasn't quite right, because it never worked. My KVM boots up just like a bare-metal install, actually a little slower in my opinion, which isn't a problem for me since my guest runs all the time; I don't start and stop it. When my computer is up and running, the guest is running also.

If you haven't looked at the Skylake thread, it is worth reading through. As far as UEFI-compliant GPUs, I'm not sure that is a requirement; any device that can be blacklisted can be passed through to a guest, which includes just about any piece of hardware, from controllers to add-in cards to peripherals, even ones that don't show up in a UEFI configuration.

https://forum.teksyndicate.com/t/gta-v-on-linux-skylake-build-hardware-vm-passthrough-tek-syndicate/87440

The thing I've found is that when doing a hardware passthrough, there are several different methods of doing it: some are distro-based, some are kernel-based, and some are related to the type of hardware you're trying to pass. To get mine to work reliably I used bits and pieces of information from different sources, but the info at vfio.blogspot I found the most valuable in figuring out my setup.

While I didn't follow the guide 100%, I did use it in bits and pieces. I'm not sure just how much sense that makes... lol

Hope this helps.

OK, so you are just using the default BIOS provided by virt-manager (SeaBIOS)? What do you see from lspci -nnk output? Is it "Kernel driver in use: pci-stub"? Looking at one tutorial, even though it uses pci-stub there is a script that is executed that binds the device to vfio-pci; just want to make sure...
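
(The kind of script I mean is the common vfio-bind helper that these tutorials pass around; roughly like this, with the PCI addresses given on the command line, so nothing in it is specific to one card:)

#!/bin/sh
# vfio-bind: unbind each PCI device from its current driver and attach it to vfio-pci
modprobe vfio-pci

for dev in "$@"; do
        vendor=$(cat /sys/bus/pci/devices/$dev/vendor)
        device=$(cat /sys/bus/pci/devices/$dev/device)
        if [ -e /sys/bus/pci/devices/$dev/driver ]; then
                echo $dev > /sys/bus/pci/devices/$dev/driver/unbind
        fi
        echo $vendor $device > /sys/bus/pci/drivers/vfio-pci/new_id
done

# usage (full PCI addresses, e.g. the GPU and its audio function):
#   vfio-bind 0000:01:00.0 0000:01:00.1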

Also, from the link you provided, did you complete the steps relating to "dracut"? I've never seen this mentioned anywhere else.
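
(From skimming that guide, the dracut steps look like they're just about forcing the vfio modules into the initramfs so vfio-pci can grab the card early; something like the line below in a file under /etc/dracut.conf.d/ followed by dracut -f. On Arch the rough equivalent would be adding the same modules to the MODULES line in /etc/mkinitcpio.conf and running mkinitcpio -p linux. File name and exact steps are my guess, not from the guide verbatim.)

add_drivers+=" vfio vfio_iommu_type1 vfio_pci vfio_virqfd "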

I'm at work so I can't look till this evening, but I know I tried to use a script to bind to vfio-pci and never got that to work either. In the guide I posted above, I followed the steps using pci-stub up to the point where he starts binding to vfio...

"If you're using kernel v4.1 or newer, the vfio-pci driver supports the same ids option
so you can directly attach devices to vfio-pci and skip pci-stub.
vfio-pci is not generally built statically into the kernel, so we need
to force it to be loaded early. To do this on Fedora we need to setup
the module options we want to use with modprobe.d. I typically use a
file named /etc/modprobe.d/local.conf for local, ie. system specific, configuration. In this case, that file would include:"
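
(The local.conf he's talking about would just contain something along these lines, with the vendor:device IDs taken from lspci -nn for your own GPU and its HDMI audio function; the IDs below are only placeholders:)

# /etc/modprobe.d/local.conf
options vfio-pci ids=1002:6798,1002:aaa0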


I stopped there and just used pci-stub... it worked. So yes, I'm using SeaBIOS for the KVM instead of the UEFI variation, and only pci-stub to blacklist the devices. It was really simple once I cut out all the BS that I couldn't get to work. Is mine the best setup? I have no clue, but it is very stable and will run anything that I have tried. I have no host issues and no more guest issues than anyone else that I've talked to. I expected more problems with guest stability, and way less performance from the guest running in the KVM, but I'm actually quite pleased with it and will never run Windows on bare metal ever again.

As far as dracut... nope. Since I only used pci-stub I didn't do any of the vfio stuff to get it to work, but I did try to follow his guide word by word and step by step at least twice, and failed each time. Like I said, I was really new to Linux at that point and was flying blind. Today I might have different results since I'm more comfortable editing configuration files, but I'm not very comfortable with the script side of things; I understand mostly what is being done but lack the necessary knowledge to write scripts from scratch.

Once I figured out that pci-stub had taken control of the video card I was trying to pass through, I built a KVM in QEMU/virt-manager. The card was listed in "add other hardware", so I added it to the KVM along with its audio function. I installed Win 7, and of course it saw the device but didn't know what it was other than a display device. I opened a browser, went to AMD's site and downloaded the drivers for the 270X (just in case), but I also ran the auto-detect on the site and it found the card and installed the drivers and software. I rebooted the guest (actually shut down, not reboot) and restarted it... and I had video on the passed-through card and its monitor.
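
(Adding the card through "add other hardware" just drops a hostdev entry for each function into the domain XML, something like the sketch below; the bus/slot numbers are whatever your card actually sits at, these are only examples:)

<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
  </source>
</hostdev>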

At this point it was about the 5th or 6th KVM I had built trying to get it to work, and of course it was just a test KVM, with nowhere near enough drive space allocated for a Windows install and any games or other software, but it was working.

It looks like no matter what method is used, vfio-pci is used in the end. I have the card bound to pci-stub on boot, but as soon as virt-manager starts a VM with the GPU attached it gets bound to vfio-pci. This is why a binding script needs to be used when not using a GUI.
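
(For the record, "bound to pci-stub on boot" here just means claiming the card with a kernel parameter on the options line of the systemd-boot entry; the IDs below are placeholders for whatever lspci -nn reports for the GPU and its audio function:)

# appended to the kernel command line
pci-stub.ids=1002:6798,1002:aaa0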

Using pci-stub didn't make a difference, still the same issue: BSOD on driver install.

Yeah, I understand what you're saying, and since I'm using a GUI I had fewer issues, I'd guess. But I know it can be done on Arch because others have had success; it might be a topic to take over to the Arch forum for help.

https://bbs.archlinux.org/

There might also be information here on Reddit:

https://www.reddit.com/r/linux_gaming/comments/3lnpg6/gpu_passthrough_revisited_an_updated_guide_on_how/

I actually posted on the Arch forum first but no responses yet :) Will try doing the same on an AMD system to rule out hardware issues; if no luck, I'll do a test install on a spare HDD to rule out software issues. If I ever get this to work I'll do a write-up on my blog, like with most other things I do...