[VFIO Example] Primary slot VEGA passthrough now works, VEGA reset issue fixed for good

I had previously posted here about how I couldn’t get the GPU to pass through unless I swapped the GPUs to the other slots.


Obviously this wasn’t ideal, as the Vega takes up 2.5 slots and I prefer to keep the other slots available.

I had originally asked ASRock about an option to select the boot GPU, but they said no, and I haven’t seen one in any of their BIOSes (either never, or not in years).
However, I discovered that turning the compatibility support module (CSM) off in the BIOS makes any extra option unnecessary. Once CSM is off, I can boot to the host GPU no matter which slot it is in (even a 1x slot is OK).
The BIOS will remember this. You set which GPU is the host by unplugging the monitors from the other cards. After the first boot you can plug them back in, and the BIOS screen will still only show on the host GPU.
However, some initialization still happened on the VEGA when doing this (the monitor light came on even though the screen stayed black), and passthrough was broken. I even tried supplying the VBIOS to the GPU manually, but it still did not work.

After updating QEMU to 2.12 (along with the related libvirt and virt-manager packages) and moving to kernel 4.19 RC1, the problem is gone. I got the packages from this PPA:
https://launchpad.net/~jacob/+archive/ubuntu/virtualisation

In addition, the VEGA reset issues now appear to be fixed for good, compared to kernel 4.17, where I still had them.

Here is my current setup for reference. The example shows how to pass QEMU command-line arguments from the XML so that you can enable the ioh3420 option, which prevents crashes during AMD driver installation. I need to do another driver install before I know 100% that this fixes it, but so far I believe it does. It also shows Ryzen CPU pinning that passes through just the physical cores of the 1600X, and it passes through the actual CPU model, so the VM now reports Ryzen instead of EPYC. Finally, it shows working hugepages. You must add the hugepages element to the XML or the VM will not use them and the reserved RAM is wasted.
I also pass through the motherboard audio and 4 of the USB3 ports, and both work perfectly fine.
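On the hugepages point, here is a rough sketch of the sizing arithmetic for the 16 GiB guest. It assumes the usual 2048 KiB hugepage size on x86_64 (check your own host), and `hugepages_needed` is just a hypothetical helper name, not part of libvirt or QEMU.

```shell
#!/bin/sh
# Hypothetical helper: how many hugepages does a guest of a given size need?
# Arguments: guest memory in KiB, hugepage size in KiB (assumed 2048 if omitted).
hugepages_needed() {
    mem_kib=$1
    page_kib=${2:-2048}
    # Round up so the whole guest can be backed by hugepages.
    echo $(( (mem_kib + page_kib - 1) / page_kib ))
}

# The <memory> value in the domain XML is 16777216 KiB (16 GiB).
hugepages_needed 16777216        # prints 8192

# The host's actual hugepage size can be read with:
#   grep Hugepagesize /proc/meminfo
# and the pool can be reserved (as root) with, for example:
#   sysctl vm.nr_hugepages=8192
```

Anything reserved in the pool is unavailable to the host, which is why a VM that does not request hugepages in its XML leaves that RAM sitting unused.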

To reiterate, this is on the ASRock X370 Pro Gaming with BIOS 4.64.


The USB and audio passthrough setup works perfectly. Now that the VEGA reset issue is fixed, I can start the VM and the connected devices (game controllers, mouse/keyboard, etc.) pass through to it cleanly. On VM shutdown the devices work on the host again automatically (no restart needed; audio switches back on the host after a few seconds). Then I can restart the VM whenever I want.

The USB ports passed through from this one controller are the 2 next to the PS/2 port and the 2 underneath the 1 Gb Ethernet port. The other ports are all on the other controller, which sits in a crowded IOMMU group.

Here is the config snippet minus storage and extra controllers.

Note that when I moved the host GPU from the 8x slot to a 1x slot, the IOMMU groupings moved around and the numbers for the audio and USB devices changed. If I re-populate the 8x slot with something else the groupings will change again, so I would need to change the device in virt-manager to get USB/audio working again.
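Since the groupings move around, it helps to re-check them after any slot change. Below is a minimal sketch that lists each PCI device with its IOMMU group by walking sysfs. The function name `list_iommu_groups` is only illustrative, and the optional argument exists just so the function can be pointed at a test directory.

```shell
#!/bin/sh
# Print every PCI device together with its IOMMU group number.
# On a real host the default sysfs path is the one you want.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # no groups (IOMMU disabled) means no output
        group=${dev%/devices/*}        # strip the trailing "/devices/<address>"
        group=${group##*/}             # keep only the group number
        echo "group $group: ${dev##*/}"
    done
}

list_iommu_groups
```

Each address printed (for example `0000:2f:00.0`) can be passed to `lspci -nns <address>` to see which device it is, and it maps directly onto the domain/bus/slot/function values in the `<hostdev>` source addresses.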

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>win7</name>
  <uuid>07d336e7-eb30-4c63-a520-576389a73442</uuid>
  <memory unit='KiB'>16777216</memory>
  <currentMemory unit='KiB'>16777216</currentMemory>
  <memoryBacking>
    <hugepages/>
  </memoryBacking>
  <vcpu placement='static'>6</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='2'/>
    <vcpupin vcpu='2' cpuset='4'/>
    <vcpupin vcpu='3' cpuset='6'/>
    <vcpupin vcpu='4' cpuset='8'/>
    <vcpupin vcpu='5' cpuset='10'/>
    <vcpusched vcpus='0' scheduler='fifo' priority='1'/>
    <vcpusched vcpus='1' scheduler='fifo' priority='1'/>
    <vcpusched vcpus='2' scheduler='fifo' priority='1'/>
    <vcpusched vcpus='3' scheduler='fifo' priority='1'/>
    <vcpusched vcpus='4' scheduler='fifo' priority='1'/>
    <vcpusched vcpus='5' scheduler='fifo' priority='1'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-i440fx-bionic'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/win7_VARS.fd</nvram>
    <bootmenu enable='no'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <vmport state='off'/>
  </features>
  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
    <topology sockets='1' cores='6' threads='1'/>
    <feature policy='disable' name='hypervisor'/>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/bin/kvm-spice</emulator>
    <controller type='pci' index='0' model='pci-root'/>
    <interface type='direct'>
      <mac address='52:54:00:be:0e:35'/>
      <source dev='enp42s0' mode='bridge'/>
      <model type='e1000'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
    </interface>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x31' slot='0x00' function='0x3'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x2f' slot='0x00' function='0x0'/>
      </source>
      <rom bar='on' file='/home/bill/VBIOS/Vega10P.rom'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0d' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x2f' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0e' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x30' slot='0x00' function='0x3'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </hostdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </memballoon>
  </devices>
  <qemu:commandline>
    <qemu:arg value='-device'/>
    <qemu:arg value='ioh3420,bus=pci.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1'/>
    <qemu:arg value='-cpu'/>
    <qemu:arg value='host,kvm=off'/>
  </qemu:commandline>
</domain>
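
The vcpupin entries above use CPUs 0, 2, 4, 6, 8 and 10 because on my host each physical core shows up as a pair of adjacent logical CPUs. That numbering is not guaranteed on other systems, so here is a small sketch for checking it from sysfs. The function name `show_thread_siblings` is hypothetical, and the optional argument is only there for testing.

```shell
#!/bin/sh
# Print, for each logical CPU, the list of logical CPUs sharing its physical core.
show_thread_siblings() {
    root=${1:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/topology/thread_siblings_list; do
        [ -r "$f" ] || continue        # skip if the topology file is missing
        cpu=${f%/topology/*}           # strip "/topology/thread_siblings_list"
        cpu=${cpu##*/cpu}              # keep only the CPU number
        echo "cpu $cpu: siblings $(cat "$f")"
    done
}

show_thread_siblings
```

If the output pairs CPUs as 0-1, 2-3 and so on, pinning to the even-numbered CPUs gives one thread per physical core. If the pairs look different (for example 0,6), the cpuset values need to be adjusted to match.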


Any chance you may be testing Looking Glass with Vega soon? @gnif has been trying non-stop to fix the Vega reset issue.

Does the fix in 4.19 RC1 address Polaris reset issues as well? I really struggled with that on Fedora 28, so much so that I eventually threw in the towel and installed a GTX 1050 Ti despite having to deal with the Code 43 issue.

I am planning to test my RX 480 as well at some point (primarily because I still can’t get Assassin’s Creed Origins to work with the VEGA, so more testing/troubleshooting is required, but that is another topic…).

The problem is that it is a huge pain to take out the Vega and put in the Polaris card. So I am probably going to get a riser, hook the Polaris card into the empty 8x slot, and then create a clone of the VM that uses it instead. It’s incredibly annoying to undo all of the screws and my new GPU sag prevention setup, so getting a riser is easier. I also don’t want to lose $700 by breaking the VEGA from constant swapping. It was very hard to get the VEGA, and I had to wait ages to get one with a DVI port for a decent price. Otherwise I would have set up my VFIO last year, lol.

I tried to set up Looking Glass this weekend on Windows 7, but I need to figure out how to compile dd4seven first. Does anyone have any tips on this before I spend a lot of time on it? I have never set up MinGW-w64 on Windows or installed it on Linux.

Once I get this compiling I plan to post a guide on setup with Windows 7. But it sounds like I will be in for some bugs. Both Looking Glass and dd4seven have their own bugs.

I’ll give VFIO another go by the sounds of it.

I had it somewhat working, and then a kernel update or something broke my shit, and I’ve not had much time at home to play with it.

I have the additional complication of 2 Vegas in the box, but I did get around that (and that’s the bit that broke).

Your post reminds me of something that is missing from many of the guides.
After you put the PCI IDs into the GRUB command line for a particular boot option, you need to run the initramfs update for all kernels, otherwise it will only update the current or latest installed kernel. I currently keep separate boot entries with the PCI IDs included, in case I need or want to boot without them.

The “-k all” option is the important part.

sudo update-grub
sudo update-initramfs -u -k all
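
After rebooting into one of those entries, it is worth confirming the IDs actually made it onto the kernel command line. Here is a minimal sketch, assuming the IDs were added with the `vfio-pci.ids=` kernel parameter (that parameter name is my assumption about the setup; adjust it if yours differs). The function name `vfio_ids_from_cmdline` is hypothetical, and the file argument exists only for testing.

```shell
#!/bin/sh
# Print the IDs from a vfio-pci.ids= kernel parameter, one per line.
# With no argument it reads the running kernel's command line.
vfio_ids_from_cmdline() {
    file=${1:-/proc/cmdline}
    [ -r "$file" ] || return 0
    # Split the command line into one parameter per line, keep only the
    # vfio-pci.ids= value, then split its comma-separated list.
    tr ' ' '\n' < "$file" | sed -n 's/^vfio-pci\.ids=//p' | tr ',' '\n'
}

vfio_ids_from_cmdline
```

No output means the parameter is not present for the kernel you booted. When it is present, `lspci -nnk` should also show `Kernel driver in use: vfio-pci` for the passed-through devices.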

Also, 4.19 RC1 is pretty stable. I have had no issues and have been using it since about the time it came out. This release should also have good AMD APU graphics support, so y’all might want to give it a spin early if you have APUs.

Cheers, that could maybe be something to do with it.
My current Linux install is borked and on a 250 GB SSD (Windows is on my bigger ones).

Given the cost of 1 TB SSDs has dropped significantly, I might just buy a new SSD this weekend and do a fresh install on it.

Yeah, I ran into that too. Was easier to upgrade my monitor, frankly.

Re: Assassin’s Creed Origins - runs nicely, if a bit slow, on my GTX 1050 Ti. But even under native Windows the game doesn’t run great on AMD hardware. Tweaking the settings helps some, but I’ve run into some graphical glitches.