Arch Linux Threadripper 3000 Dual GTX 1080 VFIO Setup Help

Hey all!

I recently built a TRX40 Aorus Xtreme/Threadripper 3960X system and migrated two GTX 1080s from a gaming rig, intending to use VFIO/IOMMU to pass one of them through to a Windows VM for some light gaming.

The Problem

I installed Windows before passing through the GPU. When I start the Windows 10 VM it runs, but the monitor never detects a signal. Nothing in journalctl stands out, but my configuration is below. (When I bring up the console on the host side, it is just a flashing cursor.) This is my first VFIO build, so any leads on what to look for are much appreciated.

win10.xml

<domain type="kvm">
  <name>win10</name>
  <uuid>f40bda08-157a-4ef1-a576-13fa7cbe2e5e</uuid>
  <metadata>
    <libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
      <libosinfo:os id="http://microsoft.com/win/10"/>
    </libosinfo:libosinfo>
  </metadata>
  <memory unit="KiB">8192000</memory>
  <currentMemory unit="KiB">8192000</currentMemory>
  <vcpu placement="static">8</vcpu>
  <os>
    <type arch="x86_64" machine="pc-q35-4.1">hvm</type>
    <boot dev="hd"/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state="on"/>
      <vapic state="on"/>
      <spinlocks state="on" retries="8191"/>
    </hyperv>
    <vmport state="off"/>
  </features>
  <cpu mode="host-model" check="partial">
    <model fallback="allow"/>
  </cpu>
  <clock offset="localtime">
    <timer name="rtc" tickpolicy="catchup"/>
    <timer name="pit" tickpolicy="delay"/>
    <timer name="hpet" present="no"/>
    <timer name="hypervclock" present="yes"/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled="no"/>
    <suspend-to-disk enabled="no"/>
  </pm>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type="file" device="disk">
      <driver name="qemu" type="qcow2"/>
      <source file="/var/lib/libvirt/images/win10.qcow2"/>
      <target dev="sda" bus="sata"/>
      <address type="drive" controller="0" bus="0" target="0" unit="0"/>
    </disk>
    <disk type="file" device="cdrom">
      <driver name="qemu" type="raw"/>
      <source file="/home/archie/jails/firefox/Downloads/Win10_1909_English_x64.iso"/>
      <target dev="sdb" bus="sata"/>
      <readonly/>
      <address type="drive" controller="0" bus="0" target="0" unit="1"/>
    </disk>
    <controller type="usb" index="0" model="qemu-xhci" ports="15">
      <address type="pci" domain="0x0000" bus="0x02" slot="0x00" function="0x0"/>
    </controller>
    <controller type="sata" index="0">
      <address type="pci" domain="0x0000" bus="0x00" slot="0x1f" function="0x2"/>
    </controller>
    <controller type="pci" index="0" model="pcie-root"/>
    <controller type="pci" index="1" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="1" port="0x10"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x0" multifunction="on"/>
    </controller>
    <controller type="pci" index="2" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="2" port="0x11"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x1"/>
    </controller>
    <controller type="pci" index="3" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="3" port="0x12"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x2"/>
    </controller>
    <controller type="pci" index="4" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="4" port="0x13"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x3"/>
    </controller>
    <controller type="pci" index="5" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="5" port="0x14"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x02" function="0x4"/>
    </controller>
    <controller type="pci" index="6" model="pcie-root-port">
      <model name="pcie-root-port"/>
      <target chassis="6" port="0x8"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x0"/>
    </controller>
    <controller type="virtio-serial" index="0">
      <address type="pci" domain="0x0000" bus="0x03" slot="0x00" function="0x0"/>
    </controller>
    <serial type="pty">
      <target type="isa-serial" port="0">
        <model name="isa-serial"/>
      </target>
    </serial>
    <console type="pty">
      <target type="serial" port="0"/>
    </console>
    <channel type="spicevmc">
      <target type="virtio" name="com.redhat.spice.0"/>
      <address type="virtio-serial" controller="0" bus="0" port="1"/>
    </channel>
    <input type="tablet" bus="usb">
      <address type="usb" bus="0" port="1"/>
    </input>
    <input type="mouse" bus="ps2"/>
    <input type="keyboard" bus="ps2"/>
    <sound model="ich9">
      <address type="pci" domain="0x0000" bus="0x00" slot="0x1b" function="0x0"/>
    </sound>
    <hostdev mode="subsystem" type="usb" managed="yes">
      <source>
        <vendor id="0x046d"/>
        <product id="0xc52f"/>
      </source>
      <address type="usb" bus="0" port="4"/>
    </hostdev>
    <hostdev mode="subsystem" type="pci" managed="yes">
      <source>
        <address domain="0x0000" bus="0x4d" slot="0x00" function="0x0"/>
      </source>
      <address type="pci" domain="0x0000" bus="0x05" slot="0x00" function="0x0"/>
    </hostdev>
    <hostdev mode="subsystem" type="pci" managed="yes">
      <source>
        <address domain="0x0000" bus="0x4d" slot="0x00" function="0x1"/>
      </source>
      <address type="pci" domain="0x0000" bus="0x06" slot="0x00" function="0x0"/>
    </hostdev>
    <memballoon model="virtio">
      <address type="pci" domain="0x0000" bus="0x04" slot="0x00" function="0x0"/>
    </memballoon>
  </devices>
</domain>
$ dmesg | grep -i -e DMAR -e IOMMU
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-linux root=/dev/mapper/vg_main-rootfs rw mce=off amd_iommu=on iommu=pt loglevel=3 quiet
[    0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-linux root=/dev/mapper/vg_main-rootfs rw mce=off amd_iommu=on iommu=pt loglevel=3 quiet
[    3.697236] iommu: Default domain type: Passthrough (set via kernel command line)
[    3.887681] pci 0000:60:00.2: AMD-Vi: IOMMU performance counters supported
[    3.887716] pci 0000:40:00.2: AMD-Vi: IOMMU performance counters supported
[    3.887733] pci 0000:20:00.2: AMD-Vi: IOMMU performance counters supported
[    3.887747] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    3.888195] pci 0000:00:01.0: Adding to iommu group 0
[    3.888219] pci 0000:00:01.1: Adding to iommu group 1
[    3.888243] pci 0000:00:01.2: Adding to iommu group 2
[    3.888261] pci 0000:00:02.0: Adding to iommu group 3
[    3.888280] pci 0000:00:03.0: Adding to iommu group 4
[    3.888300] pci 0000:00:04.0: Adding to iommu group 5
[    3.888319] pci 0000:00:05.0: Adding to iommu group 6
[    3.888342] pci 0000:00:07.0: Adding to iommu group 7
[    3.888364] pci 0000:00:07.1: Adding to iommu group 8
[    3.888386] pci 0000:00:08.0: Adding to iommu group 9
[    3.888406] pci 0000:00:08.1: Adding to iommu group 10
[    3.888429] pci 0000:00:14.0: Adding to iommu group 11
[    3.888443] pci 0000:00:14.3: Adding to iommu group 11
[    3.888496] pci 0000:00:18.0: Adding to iommu group 12
[    3.888510] pci 0000:00:18.1: Adding to iommu group 12
[    3.888523] pci 0000:00:18.2: Adding to iommu group 12
[    3.888538] pci 0000:00:18.3: Adding to iommu group 12
[    3.888552] pci 0000:00:18.4: Adding to iommu group 12
[    3.888565] pci 0000:00:18.5: Adding to iommu group 12
[    3.888579] pci 0000:00:18.6: Adding to iommu group 12
[    3.888592] pci 0000:00:18.7: Adding to iommu group 12
[    3.888614] pci 0000:01:00.0: Adding to iommu group 13
[    3.888635] pci 0000:02:00.0: Adding to iommu group 14
[    3.888655] pci 0000:03:00.0: Adding to iommu group 15
[    3.888681] pci 0000:04:00.0: Adding to iommu group 16
[    3.888703] pci 0000:04:00.3: Adding to iommu group 17
[    3.888724] pci 0000:20:01.0: Adding to iommu group 18
[    3.888742] pci 0000:20:02.0: Adding to iommu group 19
[    3.888764] pci 0000:20:03.0: Adding to iommu group 20
[    3.888787] pci 0000:20:03.1: Adding to iommu group 21
[    3.888805] pci 0000:20:04.0: Adding to iommu group 22
[    3.888824] pci 0000:20:05.0: Adding to iommu group 23
[    3.888845] pci 0000:20:07.0: Adding to iommu group 24
[    3.888866] pci 0000:20:07.1: Adding to iommu group 25
[    3.888890] pci 0000:20:08.0: Adding to iommu group 26
[    3.888910] pci 0000:20:08.1: Adding to iommu group 27
[    3.888950] pci 0000:21:00.0: Adding to iommu group 28
[    3.888975] pci 0000:21:00.1: Adding to iommu group 28
[    3.888995] pci 0000:22:00.0: Adding to iommu group 29
[    3.889019] pci 0000:23:00.0: Adding to iommu group 30
[    3.889040] pci 0000:23:00.1: Adding to iommu group 31
[    3.889065] pci 0000:23:00.3: Adding to iommu group 32
[    3.889087] pci 0000:23:00.4: Adding to iommu group 33
[    3.889109] pci 0000:40:01.0: Adding to iommu group 34
[    3.889131] pci 0000:40:01.1: Adding to iommu group 35
[    3.889150] pci 0000:40:02.0: Adding to iommu group 36
[    3.889172] pci 0000:40:03.0: Adding to iommu group 37
[    3.889193] pci 0000:40:03.1: Adding to iommu group 38
[    3.889212] pci 0000:40:04.0: Adding to iommu group 39
[    3.889234] pci 0000:40:05.0: Adding to iommu group 40
[    3.889255] pci 0000:40:07.0: Adding to iommu group 41
[    3.889275] pci 0000:40:07.1: Adding to iommu group 42
[    3.889297] pci 0000:40:08.0: Adding to iommu group 43
[    3.889317] pci 0000:40:08.1: Adding to iommu group 44
[    3.889342] pci 0000:41:00.0: Adding to iommu group 45
[    3.889416] pci 0000:42:00.0: Adding to iommu group 46
[    3.889516] pci 0000:42:01.0: Adding to iommu group 47
[    3.889590] pci 0000:42:02.0: Adding to iommu group 48
[    3.889663] pci 0000:42:03.0: Adding to iommu group 49
[    3.889737] pci 0000:42:04.0: Adding to iommu group 50
[    3.889811] pci 0000:42:05.0: Adding to iommu group 51
[    3.889845] pci 0000:42:08.0: Adding to iommu group 52
[    3.889880] pci 0000:42:09.0: Adding to iommu group 53
[    3.889914] pci 0000:42:0a.0: Adding to iommu group 54
[    3.889994] pci 0000:43:00.0: Adding to iommu group 55
[    3.890089] pci 0000:44:00.0: Adding to iommu group 56
[    3.890172] pci 0000:45:00.0: Adding to iommu group 57
[    3.890251] pci 0000:45:00.1: Adding to iommu group 58
[    3.890312] pci 0000:47:00.0: Adding to iommu group 59
[    3.890367] pci 0000:48:00.0: Adding to iommu group 60
[    3.890427] pci 0000:49:00.0: Adding to iommu group 61
[    3.890448] pci 0000:4a:00.0: Adding to iommu group 52
[    3.890496] pci 0000:4a:00.1: Adding to iommu group 52
[    3.890513] pci 0000:4a:00.3: Adding to iommu group 52
[    3.890534] pci 0000:4b:00.0: Adding to iommu group 53
[    3.890555] pci 0000:4c:00.0: Adding to iommu group 54
[    3.890591] pci 0000:4d:00.0: Adding to iommu group 62
[    3.890616] pci 0000:4d:00.1: Adding to iommu group 62
[    3.890636] pci 0000:4e:00.0: Adding to iommu group 63
[    3.890657] pci 0000:4f:00.0: Adding to iommu group 64
[    3.890677] pci 0000:60:01.0: Adding to iommu group 65
[    3.890695] pci 0000:60:02.0: Adding to iommu group 66
[    3.890716] pci 0000:60:03.0: Adding to iommu group 67
[    3.890734] pci 0000:60:04.0: Adding to iommu group 68
[    3.890753] pci 0000:60:05.0: Adding to iommu group 69
[    3.890774] pci 0000:60:07.0: Adding to iommu group 70
[    3.890794] pci 0000:60:07.1: Adding to iommu group 71
[    3.890819] pci 0000:60:08.0: Adding to iommu group 72
[    3.890839] pci 0000:60:08.1: Adding to iommu group 73
[    3.890861] pci 0000:61:00.0: Adding to iommu group 74
[    3.890884] pci 0000:62:00.0: Adding to iommu group 75
[    3.891026] pci 0000:60:00.2: AMD-Vi: Found IOMMU cap 0x40
[    3.891029] pci 0000:40:00.2: AMD-Vi: Found IOMMU cap 0x40
[    3.891030] pci 0000:20:00.2: AMD-Vi: Found IOMMU cap 0x40
[    3.891032] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    3.894000] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[    3.894029] perf/amd_iommu: Detected AMD IOMMU #1 (2 banks, 4 counters/bank).
[    3.894058] perf/amd_iommu: Detected AMD IOMMU #2 (2 banks, 4 counters/bank).
[    3.894087] perf/amd_iommu: Detected AMD IOMMU #3 (2 banks, 4 counters/bank).
[    3.910960] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <[email protected]>
$ dmesg | grep -i vfio
[    4.204127] VFIO - User Level meta-driver version: 0.3
[  407.162926] vfio-pci 0000:4d:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
[  408.213216] vfio-pci 0000:4d:00.0: enabling device (0000 -> 0003)
[  408.213500] vfio-pci 0000:4d:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[  408.229885] vfio-pci 0000:4d:00.1: enabling device (0000 -> 0002)

My Setup (Due Diligence Details)

I have enabled SVM in UEFI.

I am running Arch Linux with the following installed: qemu, libvirt, virt-manager, ovmf. I have gone through the Arch Wiki article here and set up the following boot script:

/usr/local/bin/vfio-pci-override.sh

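#!/bin/sh

# PCI addresses of the GPU and its HDMI audio function to reserve for vfio-pci.
DEVS="0000:4d:00.0 0000:4d:00.1"

# Only override the driver when the kernel has registered an IOMMU.
if [ -n "$(ls -A /sys/class/iommu)" ]; then
    for DEV in $DEVS; do
        echo "vfio-pci" > "/sys/bus/pci/devices/$DEV/driver_override"
    done
fi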
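
Once the override has run, each device's driver symlink in sysfs shows which driver actually claimed it. A minimal sketch of that check follows; bound_driver is a hypothetical helper name, and its second argument exists only so the function can be pointed at a test directory instead of the real /sys/bus/pci/devices.

```shell
#!/bin/sh
# Print the name of the kernel driver currently bound to a PCI device.
# $1: PCI address, e.g. 0000:4d:00.0
# $2: devices directory (assumption: defaults to /sys/bus/pci/devices)
bound_driver() {
    dev="$1"
    root="${2:-/sys/bus/pci/devices}"
    link="$root/$dev/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/vfio-pci
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}
```

Running `bound_driver 0000:4d:00.0` and `bound_driver 0000:4d:00.1` on the host should print vfio-pci for both if the hook worked; anything else (for example nouveau or nvidia) would suggest another driver grabbed the card first.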

/etc/initcpio/install/vfio


#!/bin/bash
build() {
    add_file / /usr/local/bin/vfio-pci-override.sh
    add_runscript
}

/etc/initcpio/hooks/vfio

#!/usr/bin/ash

run_hook() {
    msg ":: Triggering vfio-pci override"
    /bin/sh /usr/local/bin/vfio-pci-override.sh
}

And I have the following in my /etc/mkinitcpio.conf:

FILES=(/usr/local/bin/vfio-pci-override.sh)
MODULES=(dm_mod kvm_amd vfio_pci vfio vfio_iommu_type1 vfio_virqfd)
HOOKS=(base udev autodetect modconf block mdadm_udev lvm2 filesystems keyboard fsck vfio)

And here are the boot flags in /etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="mce=off amd_iommu=on iommu=pt loglevel=3 quiet"
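
Whether those flags actually reached the running kernel can be confirmed from /proc/cmdline (the dmesg output earlier shows they did on this system). A small sketch, with cmdline_has being a made-up helper name; it takes the file to read as its first argument so it can also be tried against a saved copy.

```shell
#!/bin/sh
# Succeed only if every given parameter appears as a whole word
# on the kernel command line.
# $1: file holding the command line (normally /proc/cmdline)
# remaining arguments: parameters to look for, e.g. iommu=pt
cmdline_has() {
    file="$1"; shift
    # Pad with spaces so each parameter can be matched as a whole word.
    line=" $(cat "$file") "
    for param in "$@"; do
        case "$line" in
            *" $param "*) ;;      # found, check the next one
            *) return 1 ;;        # missing
        esac
    done
    return 0
}
```

For example: `cmdline_has /proc/cmdline amd_iommu=on iommu=pt && echo "IOMMU flags present"`. If the flags are missing after editing /etc/default/grub, the GRUB configuration probably was not regenerated before rebooting.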

Needs OVMF, I think:

 <os>
    <type arch='x86_64' machine='pc-q35-4.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/ovmf/x64/OVMF_CODE.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/main_VARS.fd</nvram>
  </os>

You’ll have to reinstall Windows 10 if you didn’t install it while booted into OVMF in the first place. I don’t see OVMF anywhere on your libvirt config.

Yes, thank you @pantato! You are right. In case other Arch users stumble on the same problem: there were no OVMF or other UEFI firmware options in virt-manager after installing ovmf, so I had to run virsh edit win10 and add the following:

  <os>
    <type arch="x86_64" machine="pc-q35-4.1">hvm</type>
    <loader readonly="yes" type="pflash">/usr/share/ovmf/x64/OVMF_CODE.fd</loader>
    <nvram>/usr/share/ovmf/x64/OVMF_VARS.fd</nvram>
  </os>
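
One caveat with pointing nvram straight at the packaged OVMF_VARS.fd: that file is a shared template, so the guest's UEFI variables would likely end up written into it and could be lost on a package update. A possible alternative is to copy the template to a per-VM file and reference that copy instead. This is a sketch under assumptions: make_vm_nvram is a hypothetical helper, and the destination name win10_VARS.fd is invented here, following the main_VARS.fd pattern from the earlier reply.

```shell
#!/bin/sh
# Copy the OVMF variable-store template to a per-VM file,
# unless a per-VM file already exists.
# $1: template path, $2: per-VM destination path
make_vm_nvram() {
    template="$1"
    dest="$2"
    if [ ! -f "$template" ]; then
        echo "template not found: $template" >&2
        return 1
    fi
    # Never overwrite an existing store: it holds the guest's boot entries.
    if [ -e "$dest" ]; then
        return 0
    fi
    mkdir -p "$(dirname "$dest")" && cp "$template" "$dest"
}
```

Run as root, for example `make_vm_nvram /usr/share/ovmf/x64/OVMF_VARS.fd /var/lib/libvirt/qemu/nvram/win10_VARS.fd`, and then put the destination path in the nvram element.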

That’s not true. You just have to set the correct path in /etc/libvirt/qemu.conf and it shows up in virt-manager. Gotta read the PCI passthrough guide on the Arch wiki a bit more carefully.

I actually did do that, but had no luck getting it to show up. That's why I wrote the previous message, in case this happens more broadly.

Did you observe it not showing up as an option for the already created VM? Or did you not see it after creating a new VM? I do know that you can’t change the BIOS after already creating the VM in virt-manager for some reason. But for me, it shows up as an option during the creation of a VM once the path is set.

Yeah, when I create a new VM it just shows a greyed-out “UEFI not found” as the only other option besides BIOS.

Did you systemctl restart libvirtd and systemctl restart virtlogd.socket after making the change to qemu.conf?

Yeah. Adding it manually totally works for me, though I'm still a bit curious why it doesn't show up. Played my first VM game of Overwatch last night and was very happy! Thanks for pointing me in the right direction.

I am also planning to build a two-gamer PC with similar hardware.
Could you share some details about the system you have?

Since so little is known about the system topology of the TR3000, I would be very happy if you could share the following:

sudo lspci -vvv
lsusb
lsusb -t
dmesg

Best regards,
Maxim Levitsky

@maximlevitsky Too much info for thread. Check DMs. Cheers!

Thanks a million!!

@GuyThreepwood could you DM me the same info you sent to maximlevitsky? I’m also very curious about the IOMMU groupings. Based on the dmesg output above they look like they’re pretty amazing, no?

Or perhaps toss all of this in a GitHub gist or pastebin so that others can get it as well without overloading the thread?
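
For anyone wanting to share the same information, the four commands requested earlier in the thread can be captured into files that are easy to paste or upload. A rough sketch only: save_output and collect_report are made-up names, and trx40-report is just an example directory.

```shell
#!/bin/sh
# Run a command and save its combined output under a report directory.
# $1: report directory, $2: base name of the output file, rest: the command.
save_output() {
    dir="$1"
    name="$2"
    shift 2
    mkdir -p "$dir" || return 1
    if command -v "$1" >/dev/null 2>&1; then
        "$@" >"$dir/$name.txt" 2>&1
    else
        # Record the missing tool instead of aborting the whole report.
        echo "command not found: $1" >"$dir/$name.txt"
    fi
}

# Collect the four outputs asked for above.
# lspci -vvv and dmesg usually need root to show full detail.
collect_report() {
    report_dir="${1:-./trx40-report}"
    save_output "$report_dir" lspci lspci -vvv
    save_output "$report_dir" lsusb lsusb
    save_output "$report_dir" lsusb-tree lsusb -t
    save_output "$report_dir" dmesg dmesg
}
```

Calling `collect_report` as root writes one text file per command, which can then be uploaded individually.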

This is a processed version of the info I received.

The takeaways:

  1. IO die appears to be divided into 4 parts, as if AMD moved the IO portion of each chiplet into the IO die

  2. We have 2 USB controllers in the IO die, each connected to 2 USB ports at the back of the motherboard (on all TRX40 motherboards).
    Passthrough of those two controllers is currently broken, but there is a workaround that should work.

  3. IO die portion 1 does have a sound controller, but it is not connected to anything; most likely it is not exposed in the socket.
    Instead we have USB sound devices on board; on this particular board there are 2 such devices, probably one for the back and one for the front sound ports.

  4. Chipset is connected to IO die part 1 with an x8 link as expected, and it contains 2 USB controllers. Both are lumped into the same IOMMU group, but with the ACS override there are reports that they work well, even separately.
    Each controller has 4 USB3 ports and 2 USB2-only ports; however, some USB3 ports are bonded to create dual-lane 20 Gbit/s Type-C ports.
    Moreover, the USB2 ports are mostly used for onboard devices. On this board, 2 USB2 ports of one USB controller are used for onboard audio,
    one port of the second USB controller is used for the Bluetooth function of the WiFi card, and the last USB2 port is connected to a USB2 hub which provides the 4 USB2 ports on the internal header.
    This complicates passthrough of these controllers a bit, since you will be forced to pass these onboard devices along with the controller.
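
Since several of the points above depend on which devices share an IOMMU group, here is a minimal sketch for listing a device's group mates straight from sysfs. iommu_siblings is a hypothetical name, and the second argument (an alternative sysfs root) is there only so the function can be tried outside a real /sys.

```shell
#!/bin/sh
# List every PCI device that shares an IOMMU group with the given device.
# $1: PCI address, e.g. 0000:4d:00.0
# $2: sysfs root (assumption: defaults to /sys)
iommu_siblings() {
    dev="$1"
    sys="${2:-/sys}"
    group_link="$sys/bus/pci/devices/$dev/iommu_group"
    # No iommu_group link means the IOMMU is off or the address is wrong.
    [ -e "$group_link" ] || return 1
    # The resolved link ends in the group number, e.g. .../iommu_groups/62
    group=$(basename "$(readlink -f "$group_link")")
    ls "$sys/kernel/iommu_groups/$group/devices"
}
```

Anything printed besides the device itself would have to be passed through along with it, unless the ACS override is used to split the group.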

EDIT: there is also an additional ASMedia USB controller here, but in the dmesg I see that it dies even without being passed through. These controllers are crap.

https://pastebin.com/wPZpicBB

Oh, Asmedia…

From the single-monitor L1T KVM switch notes:

NOTE2: we don’t recommend connecting to ASMedia USB controllers if you can help it, as they can be problematic coming out of sleep mode. Please report any issues with USB peripherals.

On every PC I’ve ever built dating back to the mid-00s I’ve had to disable everything Asmedia, both on Intel and AMD platforms. Typically it’s only meant losing a few ports…

With all these lanes, I think we’d all be fine using some of the fancier USB cards from Startech and Orico, even for VR users. Oculus recommends an Inateck AIC, but only because it’s the cheapest known-tested card. I’ve seen plenty of people saying the Renesas/NEC USB controllers are pretty good.

  • StarTech PEXUSB3S44V uses 4x Renesas µPD720202
  • Orico PNU-S4 uses 4x Renesas µPD720201
  • Both cards use 4 controllers for 4 ports, which is overkill for most situations, but in theory should guarantee better resistance to bottlenecks.
  • The only major downside I see is the physical port density on the back of the card is likely to impede the use of most thumb drives and some of the beefier connectors too.

And if you’re already spending this much on a system, you might as well just disable the onboard audio anyways and get yourself a decent USB DAC+AMP like the JDS Labs Objective2+ODAC Combo. No 5.1, but having a pot for volume is just chefkiss.gif

The ASMedia controller that dmesg records as failing seems to be USB 3.1. This makes me think it serves the lone 3.1 port on my case (be quiet! Dark Base Pro 900), which I was not able to get to work. I don't know whether that would have an effect on the controller, but I thought I would point it out.

I also have not been able to get the onboard sound card to show up, but I have not troubleshot it much because I decided to wire the external amp/DAC to USB for Linux and pass a wireless headset through to the gaming VM.

$  cat /proc/asound/cards
 0 [S7             ]: USB-Audio - SteelSeries Arctis 7
                      SteelSeries SteelSeries Arctis 7, full speed
 1 [NVidia         ]: HDA-Intel - HDA NVidia
                      HDA NVidia at 0xb1080000 irq 155
 3 [Audio          ]: USB-Audio - USB Audio
                      Generic USB Audio, high speed
 4 [Device         ]: USB-Audio - USB Audio Device
                      C-Media Electronics Inc. USB Audio Device, full speed
 5 [Audio_1        ]: USB-Audio - USB Audio
                      Generic USB Audio, high speed
 6 [NVidia_1       ]: HDA-Intel - HDA NVidia
                      HDA NVidia at 0xe0080000 irq 275

Most likely the HDA interface in the CPU is a dummy, at least according to this article:
https://www.anandtech.com/show/15121/the-amd-trx40-motherboard-overview-

That's why all TRX40 motherboards have 1 or 2 USB sound cards.