GPU Passthrough on Arch Linux - BSOD on driver install (Radeon 7970)

It's under /usr/share/ somewhere, but libvirt should already look for it. You can't change the firmware setting after the first boot though; you'll have to go through the virt-manager setup again and change it before the first boot. Alternatively, you might be able to clone the existing VM and change it there, though I'm not sure.

In /etc/libvirt/qemu.conf there's an nvram variable you can use to specify OVMF, but a recent version of libvirt should already find it.

Yes, you can only change it before installation. It's still not finding it. I tried installing the ovmf package, and I already had OVMF files in a subfolder from my earlier command-line attempts.

The nvram option was commented out, and the directories it points to do not exist anyway. I added paths to the OVMF (.fd) files, but they are still not found. Should the OVMF files come with libvirt?

Here's what it looks like now:

/etc/libvirt/qemu.conf 

#nvram = [
#   "/usr/share/OVMF/OVMF_CODE.fd:/usr/share/OVMF/OVMF_VARS.fd",
#   "/usr/share/AAVMF/AAVMF_CODE.fd:/usr/share/AAVMF/AAVMF_VARS.fd"
#]

nvram = [
    "/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd:/usr/share/edk2.git/ovmf-x64/OVMF_VARS-pure-efi.fd"
]

There's no /usr/share/OVMF directory. Perhaps on Arch you need to get ovmf manually? I did as per the wiki and linked the paths in nvram option but it's still showing as not found.
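
To rule out a simple path typo, something like the following could help. It's only a rough sketch: check_nvram_paths is a made-up helper name, and it assumes the entries use the "CODE.fd:VARS.fd" form shown above. It reads the uncommented nvram entries and reports whether each file exists.

```shell
# Sketch: report whether each firmware file named in the active
# (uncommented) nvram entries of a qemu.conf exists on disk.
# check_nvram_paths is a hypothetical helper, not part of libvirt.
check_nvram_paths() {
    local conf="${1:-/etc/libvirt/qemu.conf}"
    # Drop comment lines, keep quoted "CODE.fd:VARS.fd" pairs,
    # then split each pair on ':' so every path is on its own line.
    grep -v '^[[:space:]]*#' "$conf" \
        | grep -o '"[^"]*\.fd:[^"]*\.fd"' \
        | tr -d '"' | tr ':' '\n' \
        | while IFS= read -r path; do
              if [ -e "$path" ]; then
                  echo "found:   $path"
              else
                  echo "missing: $path"
              fi
          done
}
```

Running check_nvram_paths with no argument reads /etc/libvirt/qemu.conf. If every path shows as found and virt-manager still doesn't list the firmware, the config probably just hasn't been re-read by libvirtd yet.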

Edit:

A reboot fixed the OVMF not being found (or perhaps the new kernel did). But with UEFI enabled I now get a blank screen in the console too when starting the VM.

Edit 2:

Without the PCI device passed to the VM I get a brief flash of a logo in the console, then just black with a single underscore. 100% usage on one core (4 passed to the VM).

Sorry, I don't have much time to help any more today. I forgot to mention you should try restarting libvirtd to see if OVMF shows up > . <

You should try removing the spice server though; it can and probably will break things if you're passing through the card.

I would try again and follow http://vfio.blogspot.com/2015/05/vfio-gpu-how-to-series-part-4-our-first.html
He's on Fedora but it's basically the same. If you're still having issues I'll check back tonight/tomorrow.

Yeah, I am all Linuxed out for today as well :)

I suspected I needed to restart libvirtd, so I just rebooted the system, which made the OVMF file appear in the UEFI option in virt-manager. By removing the spice server, do you mean selecting an alternate setting in the virt-manager details window?

I'm not running a spice server with my VM because it's intended to be driven by the card. I had the spice server there during the initial install because I wanted to get Windows installed first, and once it was installed I added my 780 Ti. Once the driver was installed and working I removed the spice server and display, so it's just running off the card now. I was able to have both, and the spice server/QXL display was basically like a second monitor, but I didn't need that so I removed it.

I assume you mean "Display Spice" in the Details view? That's another thing ruled out then I guess :)

Yes, I went in and double checked Virt-manager, it's Display Spice and Video QXL.

Thanks for checking! Starting to run out of things to try though...

I had a similar issue at one point, but I really don't remember what I did to fix it... Maybe you're getting some errors from libvirt? Try running this before booting the VM and see if there are any errors.

journalctl -u libvirtd.service --no-pager --follow

Check the logs, always a good idea :) Will give it a shot in the morning.

  1. Sorry if this is a dumb question but do you actually have the correct OVMF files? I got mine from here: https://www.kraxel.org/repos/jenkins/edk2/
    There are tools you can use to extract rpms if you're not on Fedora.
    You want OVMF_VARS-pure-efi.fd and OVMF_CODE-pure-efi.fd

  2. Also, I'm wondering if your graphics card actually has a UEFI bios. OVMF won't work without it.

  3. In the libvirt config you should not have a video device defined. Passing through your hardware graphics card as a secondary device (where the virtual device is primary) only works with some Quadro cards as far as I know. Also, unless you are using OVMF, you need to set x-vga=on. See:
    http://vfio.blogspot.com/2015/05/vfio-gpu-how-to-series-part-5-vga-mode.html
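
A quick way to check point 3, as a rough sketch: dump the domain XML and look for emulated display elements. The helper name emulated_display_lines is made up for illustration, and the VM name in the example is a placeholder. It's a plain text match, not an XML parser, so treat the output as a hint only.

```shell
# Sketch: list emulated display elements (<graphics ...> and <video>) found
# in a libvirt domain XML read from stdin. Prints nothing if there are none.
# emulated_display_lines is a hypothetical helper; it is a simple grep,
# not a real XML parser.
emulated_display_lines() {
    grep -nE '<(graphics|video)[ >/]' || true
}
# Example (YOUR_VM is a placeholder for the domain name):
#   virsh dumpxml YOUR_VM | emulated_display_lines
```

If this prints anything, those are the Display Spice / Video QXL entries that virt-manager shows in the details view.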

1 - Not a dumb question at all, but yes, that's exactly where I got them. I do see the TianoCore logo briefly, which should mean it is working but then freezing for some reason (only in the virtual console, nothing on the monitor).

2 - I have a Gigabyte 7970 with F3 firmware, which does say it supports UEFI: http://www.gigabyte.ie/products/product-page.aspx?pid=4102#bios.

3 - It seems like it should still work but I can try removing spice server. Is that all that should be done?

I am also testing with this script, with the same results (no video out):

#!/bin/bash
cp /usr/share/edk2.git/ovmf-x64/OVMF_VARS-pure-efi.fd /tmp/my_vars.fd
qemu-system-x86_64 \
-enable-kvm \
-m 2048 \
-cpu host,kvm=off \
-vga none \
-device vfio-pci,host=01:00.0,multifunction=on \
-device vfio-pci,host=00:01.0 \
-drive if=pflash,format=raw,readonly,file=/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd \
-drive if=pflash,format=raw,file=/tmp/my_vars.fd

Edit: Removed all unnecessary devices including spice server from virt-manager details page with no change.

No errors after VM is started:

-- Reboot --
    Mar 26 11:29:25 workstation systemd[1]: Started Virtualization daemon.
    Mar 26 11:29:26 workstation libvirtd[398]: libvirt version: 1.3.2
    Mar 26 11:29:26 workstation libvirtd[398]: hostname: workstation
    Mar 26 11:29:26 workstation libvirtd[398]: Cannot check dnsmasq binary /sbin/dnsmasq: No such file or directory
    Mar 26 11:29:26 workstation libvirtd[398]: direct firewall backend requested, but /sbin/ebtables is not available: No such file or directory
    Mar 26 11:29:26 workstation libvirtd[398]: internal error: Failed to initialize a valid firewall backend
    Mar 26 11:29:26 workstation libvirtd[398]: internal error: Failed to find path for dmidecode binary
    Mar 26 11:29:26 workstation libvirtd[398]: internal error: Failed to find path for dmidecode binary
    Mar 26 11:29:27 workstation libvirtd[398]: internal error: Failed to initialize a valid firewall backend

Looks like libvirtd isn't the problem.

You need to install dnsmasq and ebtables, which I think comes from the firewalld packages. I don't think it's stopping the VM from booting, but it is an error and you need to have those packages.

I didn't want to install unnecessary packages. After installing 3 packages there are no more errors, but no change otherwise.
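
To see which of the helpers from that log are actually missing, a small check like this might do. It's a sketch: missing_tools is a made-up name, and it only looks on PATH, whereas the log suggests libvirt checks fixed locations such as /sbin, so treat it as a first approximation.

```shell
# Sketch: print the names of helper programs that are not found on PATH.
# missing_tools is a hypothetical helper; pass it the binaries to look for.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
# Example, using the three binaries named in the libvirtd log:
#   missing_tools dnsmasq ebtables dmidecode
```

Anything it prints still needs to be installed. No output means all of them were found.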

Hm. Can you try running a newer kernel? Arch is currently running a much newer kernel, I'm on 4.5 myself. I also have my VM setup and launched from virt-manager, I've never tried to run it from command line. Virt-manager is pretty good at telling you at least if there's some kind of problem when you try to launch it.

Also I had problems with QEMU without having the second line in the nvram config. Not really sure why, seems like it expects an array? In my case, this is my config. It is actually trying to start, when I didn't have the second entry it wouldn't start at all.

nvram = [
   "/usr/share/ovmf/OVMF-pure-efi.fd:/usr/share/ovmf/OVMF_VARS-pure-efi.fd",
   "/usr/share/AAVMF/AAVMF_CODE.fd:/usr/share/AAVMF/AAVMF_VARS.fd"
]

Can you also verify if vfio is binding the card properly? It will show in dmesg. Mine has scrolled off from the beginning of the log so I'd have to reboot. My card binds on boot with a vfio.conf in /etc/modprobe.d.
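
If dmesg has scrolled off, the binding can also be read from sysfs without a reboot. A minimal sketch, assuming the usual /sys layout. pci_driver is a made-up helper name, and its second argument exists only so it can be pointed at a test directory.

```shell
# Sketch: print the kernel driver currently bound to a PCI device by
# reading the "driver" symlink in sysfs, or "none" if nothing is bound.
# pci_driver is a hypothetical helper; the optional second argument
# overrides the sysfs root and is only there to make testing easy.
pci_driver() {
    local addr="$1" root="${2:-/sys}"
    local link="$root/bus/pci/devices/$addr/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}
# Example (use the full address including the 0000: domain prefix):
#   pci_driver 0000:01:00.0
```

For passthrough, each function of the card that is handed to the VM should report vfio-pci here. lspci -nnk shows the same information on its "Kernel driver in use" line.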

I am running Linux 4.4.5-1-ARCH which is the latest. 4.5 is in testing.

My nvram option looks the same as yours.

VFIO seems to bind OK as far as I can tell.

[dom@workstation ~]$ dmesg | grep -i vfio
[    0.412881] VFIO - User Level meta-driver version: 0.3
[    0.427775] vfio_pci: add [1002:6798[ffff:ffff]] class 0x000000/00000000
[    0.441101] vfio_pci: add [1002:aaa0[ffff:ffff]] class 0x000000/00000000
[ 3935.363096] vfio-pci 0000:01:00.0: enabling device (0000 -> 0003)
[ 3935.363180] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
[ 3935.363183] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
[ 3935.376517] vfio-pci 0000:01:00.1: enabling device (0000 -> 0002)
[ 4006.961761] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
[ 4006.961766] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
[ 4006.975006] vfio-pci 0000:01:00.1: enabling device (0400 -> 0402)
[ 4273.582213] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
[ 4273.582221] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
[ 4273.595471] vfio-pci 0000:01:00.1: enabling device (0400 -> 0402)
[10567.427328] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
[10567.427333] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
[10567.457273] vfio-pci 0000:01:00.1: enabling device (0400 -> 0402)
[10670.580076] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
[10670.580080] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
[10670.593360] vfio-pci 0000:01:00.1: enabling device (0400 -> 0402)
[10799.326858] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
[10799.326863] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
[10799.340129] vfio-pci 0000:01:00.1: enabling device (0400 -> 0402)
[10924.096879] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
[10924.096885] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
[10924.110166] vfio-pci 0000:01:00.1: enabling device (0400 -> 0402)
[16570.338020] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
[16570.338026] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
[16570.357954] vfio-pci 0000:01:00.1: enabling device (0400 -> 0402)
[16852.698972] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
[16852.698978] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
[16852.712216] vfio-pci 0000:01:00.1: enabling device (0400 -> 0402)
[17198.554840] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
[17198.554846] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
[17198.568097] vfio-pci 0000:01:00.1: enabling device (0400 -> 0402)
[17229.692389] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
[17229.692395] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
[17229.705617] vfio-pci 0000:01:00.1: enabling device (0400 -> 0402)
[28636.927687] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x19@0x270
[28636.927692] vfio_ecap_init: 0000:01:00.0 hiding ecap 0x1b@0x2d0
[28636.944283] vfio-pci 0000:01:00.1: enabling device (0400 -> 0402)

Ah I thought you were still on 4.1 since you mentioned that earlier. Yes the binds are lines 2 and 3.
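
For reference, those two "add" lines can be turned back into the modprobe option that produces them. This is just a sketch: vfio_ids_line is a made-up helper name, and it assumes the vendor:device IDs are passed through the usual ids= option of vfio-pci in a file like /etc/modprobe.d/vfio.conf.

```shell
# Sketch: read dmesg text on stdin and rebuild the modprobe option line
# from the "vfio_pci: add [vendor:device[...]]" messages.
# vfio_ids_line is a hypothetical helper name.
vfio_ids_line() {
    local ids
    # Keep only the vendor:device pair from each "add" line, then join
    # the pairs with commas.
    ids=$(sed -n 's/.*vfio_pci: add \[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\[.*/\1/p' \
          | paste -s -d, -)
    if [ -n "$ids" ]; then
        echo "options vfio-pci ids=$ids"
    fi
}
# Example:
#   dmesg | vfio_ids_line
```

Fed the log above, it should give back a single options line listing 1002:6798 and 1002:aaa0, which is a handy way to compare against what is actually in the vfio.conf.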

Can you try creating a VM with virt-manager and see if you have the same issues? I think you'll need to remove the hyper-v section with virsh, but maybe that's only nvidia specific. Here's my XML.

<domain type='kvm'>
  <name>Windows10</name>
  <uuid>13ae3cb1-b005-4ede-8c2e-d5467c2c0f41</uuid>
  <memory unit='KiB'>16777216</memory>
  <currentMemory unit='KiB'>16777216</currentMemory>
  <vcpu placement='static'>6</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-2.5'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/ovmf/OVMF-pure-efi.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/Windows10_VARS.fd</nvram>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <kvm>
      <hidden state='on'/>
    </kvm>
    <vmport state='off'/>
  </features>
  <cpu mode='host-passthrough'>
    <topology sockets='1' cores='4' threads='2'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/sbin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/home/michael/VMs/Windows10.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </controller>
    <interface type='network'>
      <mac address='52:54:00:88:12:e6'/>
      <source network='default'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='yes'>
      <source>
        <vendor id='0x1b1c'/>
        <product id='0x0a0a'/>
      </source>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </hostdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
  </devices>
</domain>

You may also have VM-specific logs under /var/log/libvirt/qemu that may give you additional feedback. Could you also post those?
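
Something like this would grab the tail of that log for posting. It's a sketch under assumptions: vm_log_tail is a made-up helper name, and it assumes libvirt names the file after the domain (NAME.log) in that directory.

```shell
# Sketch: print the last lines of one domain's QEMU log as written by libvirt.
# vm_log_tail is a hypothetical helper. Arguments: domain name, number of
# lines (default 40), log directory (default is the one mentioned above).
vm_log_tail() {
    local domain="$1" lines="${2:-40}" dir="${3:-/var/log/libvirt/qemu}"
    local log="$dir/$domain.log"
    if [ -r "$log" ]; then
        tail -n "$lines" "$log"
    else
        echo "cannot read $log (try running as root)" >&2
        return 1
    fi
}
# Example (YOUR_VM is a placeholder for the domain name):
#   vm_log_tail YOUR_VM 60
```

The log usually contains the full QEMU command line libvirt generated, which is also useful for comparing against the hand-written script earlier in the thread.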