
My notes/tutorial to achieve KVM with passthrough on OpenSUSE and Ryzen/Threadripper system

These are my notes on everything I did to set up a KVM server with GPU passthrough on OpenSUSE Tumbleweed, on a Ryzen 5 1600X in an ASRock X470 Master SLI motherboard.

EDIT: if you check a few posts below I also list some information about adapting what I did to work on a Threadripper system

Since I took quite a bit of information from the guides posted in this forum, I figured I would give back by writing down everything I did here.

It’s in the form of a tutorial, because why not. Most of what I write here is applicable also to Fedora and similar.

START

On Ryzen motherboards, IOMMU/passthrough is broken between AGESA 0.0.7.2 and AGESA 1.0.0.4 patch B (also called ComboPI) so update UEFI if needed. Latest firmware for my motherboard had the fixed AGESA, and I had to update it or I would only get errors when enabling passthrough.

Then go into the UEFI setup and enable IOMMU (set it to “Enabled”, not “Auto”, which usually gives a worse IOMMU grouping to work around Windows issues) and PCIe ACS (I found “ACS Enable” under Advanced\AMD CBS\NBIO Common Options; set it to “Enable”). Enable SR-IOV too if you want.
Then add "amd_iommu=force_isolation iommu=pt" to the kernel command line and reboot.
This is usually done by editing GRUB or bootloader settings; on OpenSUSE I used YaST’s “bootloader” menu to set this.

All the commands below are run as root, or with sudo.

create this script to check what IOMMU groups each device is in

#!/bin/bash
shopt -s nullglob
# Walk every IOMMU group directory and print the devices it contains.
for iommu_group in $(find /sys/kernel/iommu_groups/ -maxdepth 1 -mindepth 1 -type d); do
	echo "IOMMU group $(basename "$iommu_group")"
	for device in $(\ls -1 "$iommu_group"/devices/); do
		# Devices that expose a "reset" file support being reset.
		if [[ -e "$iommu_group"/devices/"$device"/reset ]]; then
			echo -n "[RESET]"
		fi
		echo -n $'\t'
		lspci -nns "$device"
	done
done

It will dump a list of all onboard hardware, grouped by IOMMU group, with [RESET] in front of each device that supports resetting (needed for passthrough).
It won’t sort the IOMMU groups by number though, so you will have to scroll around to find the devices you are looking for.
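
If the unsorted output gets annoying, here is a minimal sketch of a sorted variant. It only prints PCI addresses (no lspci names, no [RESET] tag), and the function name and the optional path argument are my own additions for illustration, not part of the script above.

```shell
#!/bin/sh
# Sketch: list IOMMU groups in numeric order with the PCI addresses in each.
# The optional first argument is a hypothetical override of the sysfs path,
# handy for trying the function against a copy of the tree.
list_groups_sorted() {
    root="${1:-/sys/kernel/iommu_groups}"
    # Group directories are named by number, so a numeric sort orders them.
    for group in $(ls -1 "$root" 2>/dev/null | sort -n); do
        echo "IOMMU group $group"
        for dev in "$root/$group"/devices/*; do
            # Skip the unexpanded glob when a group has no devices.
            [ -e "$dev" ] || continue
            printf '\t%s\n' "${dev##*/}"
        done
    done
}
```

Calling list_groups_sorted with no argument reads the live /sys/kernel/iommu_groups tree.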

We must assign all the hardware that will be passed through to vfio-pci.
This can be done by device ID on the kernel command line or in modprobe options, but that is not as flexible (matching by ID grabs every card of the same model) as manually listing PCIe addresses and having a script force vfio-pci onto them through “driver_override” (a feature added in kernel 3.16, so it should be available everywhere now).

create a file:
nano /etc/modprobe.d/gpu-passthrough.conf

write inside the file:

install vfio-pci /sbin/vfio-pci-override.sh

this means that it will run the script when asked to install the vfio-pci module

Then create the script with
nano /sbin/vfio-pci-override.sh

#!/bin/sh

DEVS="0000:0a:00.0 0000:0a:00.1 0000:08:00.0 0000:09:00.0"

for DEV in $DEVS; do
    echo "vfio-pci" > /sys/bus/pci/devices/$DEV/driver_override
done

modprobe -i vfio-pci

In the DEVS variable at the top, place the PCI addresses (from lspci) of everything you want to pass through. I have four entries in this example: the first two are for a GPU, the other two are for a couple of USB 3.0 cards I want to pass through as well (I have plenty of native USB 3.0 for the host already, and I prefer to have these “native” in the VM instead of doing USB passthrough each time I need to connect a USB device).
Note: most modern GPUs have 2 entries, one for the GPU and one for the “audio device” they use to send the audio stream over HDMI/DisplayPort. Both should be added.

Make it executable with chmod +x /sbin/vfio-pci-override.sh
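
A typo in DEVS only shows up at boot, so it may be worth checking the addresses first. This is a rough sketch, not part of the override script: the function name and the PCI_ROOT variable are made up for illustration, and it only checks that each address exists in sysfs.

```shell
#!/bin/sh
# Sketch: print every PCI address from a DEVS-style list that does not exist.
# PCI_ROOT is a hypothetical override of the sysfs path.
missing_pci_devs() {
    root="${PCI_ROOT:-/sys/bus/pci/devices}"
    for dev in "$@"; do
        # Each PCI device has a directory entry named after its full address.
        [ -e "$root/$dev" ] || echo "$dev"
    done
}
```

For example, missing_pci_devs 0000:0a:00.0 0000:0a:00.1 should print nothing if both addresses are valid on your system.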

now we need to tell dracut to load all these files and add vfio-pci in the initramfs, create a file with

nano /etc/dracut.conf.d/gpu-passthrough.conf

inside write

force_drivers+=" vfio vfio-pci vfio_iommu_type1 "
install_items+=" /sbin/vfio-pci-override.sh "

for some reason, “add_drivers” as suggested in other guides does not do the job.

note the spaces before and after the ", that’s important for not garbling up with other files that add drivers to the list

Rebuild the initramfs only for the current kernel (so we have a fallback in case what we did breaks things).
The command depends on the distro; on OpenSUSE it’s
mkinitrd -k $(uname -r)

Check that everything we need is in the initramfs (this is how I found out that “add_drivers” didn’t do anything).

lsinitrd | grep vfio 
drwxr-xr-x   1 root     root            0 Apr 28 04:34 lib/modules/5.6.4-1-default/kernel/drivers/vfio
drwxr-xr-x   1 root     root            0 Apr 28 04:34 lib/modules/5.6.4-1-default/kernel/drivers/vfio/pci
-rw-r--r--   1 root     root        25824 Apr 18 03:27 lib/modules/5.6.4-1-default/kernel/drivers/vfio/pci/vfio-pci.ko.xz
-rw-r--r--   1 root     root        13480 Apr 18 03:27 lib/modules/5.6.4-1-default/kernel/drivers/vfio/vfio_iommu_type1.ko.xz
-rw-r--r--   1 root     root        12908 Apr 18 03:27 lib/modules/5.6.4-1-default/kernel/drivers/vfio/vfio.ko.xz
-rw-r--r--   1 root     root         3344 Apr 18 03:27 lib/modules/5.6.4-1-default/kernel/drivers/vfio/vfio_virqfd.ko.xz
-rwxr-xr-x   1 root     root          157 Apr 28 03:38 sbin/vfio-pci-override.sh
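
If you don't want to eyeball the listing, something like this could do the check. It is only a sketch: the function name is made up, and the item names are simply the ones from my listing above, so they may differ on other kernels.

```shell
#!/bin/sh
# Sketch: read an initramfs listing on stdin and print each expected item
# that is absent. Item names are taken from the listing shown above.
missing_initrd_items() {
    listing=$(cat)
    for item in vfio-pci.ko vfio_iommu_type1.ko vfio.ko vfio-pci-override.sh; do
        # -F treats the item as a fixed string, not a regular expression.
        printf '%s\n' "$listing" | grep -q -F "$item" || echo "$item"
    done
}
```

Use it as lsinitrd | missing_initrd_items; no output means all four items were found.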

Now add the following to the kernel command line (usually done by editing GRUB or bootloader settings); on OpenSUSE I used YaST’s “bootloader” menu to set this.

amd_iommu=force_isolation iommu=pt rd.driver.pre=vfio-pci

the first two are needed to enable IOMMU with passthrough on the Linux side, and the last is to load the vfio-pci driver before anything else so the script we added will be triggered.
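
After the reboot you can confirm the parameters actually made it to the running kernel. A minimal sketch, assuming nothing beyond /proc/cmdline; the function name and the CMDLINE_FILE variable are hypothetical.

```shell
#!/bin/sh
# Sketch: print each expected kernel parameter that is missing from the
# command line. CMDLINE_FILE is a hypothetical override of /proc/cmdline.
missing_kernel_params() {
    file="${CMDLINE_FILE:-/proc/cmdline}"
    for param in "$@"; do
        # Pad both sides with spaces so only whole parameters match
        # (iommu=pt must not be satisfied by amd_iommu=...).
        case " $(cat "$file") " in
            *" $param "*) ;;
            *) echo "$param" ;;
        esac
    done
}
```

For example, missing_kernel_params amd_iommu=force_isolation iommu=pt rd.driver.pre=vfio-pci prints only the parameters that are not on the command line.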

Reboot the system and check that the card is using the vfio-pci driver

lspci -k
0a:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Oland GL [FirePro W2100]
	Subsystem: Dell Device 2120
	Kernel driver in use: vfio-pci
	Kernel modules: radeon, amdgpu
0a:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Oland/Hainan/Cape Verde/Pitcairn HDMI Audio [Radeon HD 7000 Series]
	Subsystem: Dell Device aab0
	Kernel driver in use: vfio-pci
	Kernel modules: snd_hda_intel
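
The same check can be scripted through sysfs instead of reading lspci -k by eye. This is a sketch only; the function name and PCI_ROOT are made up, and it relies on each bound device having a "driver" symlink in its sysfs directory.

```shell
#!/bin/sh
# Sketch: print the driver bound to each given PCI address, or "none".
# PCI_ROOT is a hypothetical override of the sysfs path.
bound_driver() {
    root="${PCI_ROOT:-/sys/bus/pci/devices}"
    for dev in "$@"; do
        if [ -L "$root/$dev/driver" ]; then
            # The driver entry is a symlink whose target ends in the driver name.
            drv=$(readlink "$root/$dev/driver")
            echo "$dev ${drv##*/}"
        else
            echo "$dev none"
        fi
    done
}
```

With the addresses from my DEVS list, every line should end in vfio-pci.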

IOMMU groups.

Unless you are doing this on server-grade equipment, or you have a Threadripper or other CPU where all PCIe on the board come from its controllers, IOMMU groups will have more than one device in them.
Most usually it’s the integrated stuff (connected or provided by the chipset), and the PCIe slots provided by the chipset.

If you do the above to pass them through, when you go and start the VM you get errors like

Error starting domain: internal error: qemu unexpectedly closed the monitor: 2020-04-28T15:36:59.742769Z qemu-system-x86_64: -device vfio-pci,host=0000:08:00.0,id=hostdev2,bus=pci.9,addr=0x0: vfio 0000:08:00.0: group 13 is not viable
Please ensure all devices within the iommu_group are bound to their vfio bus driver.

It means that the PCIe port places this device in an IOMMU group with other devices, and you must bind everything in that group to the vfio driver.
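
To see what else shares a group with one device without dumping everything, a sketch like this should do. The function name and PCI_ROOT are hypothetical; it follows the iommu_group symlink that each PCI device has in sysfs.

```shell
#!/bin/sh
# Sketch: list every device that shares an IOMMU group with the given one.
# PCI_ROOT is a hypothetical override of the sysfs path.
group_members() {
    root="${PCI_ROOT:-/sys/bus/pci/devices}"
    for member in "$root/$1"/iommu_group/devices/*; do
        # Skip the unexpanded glob if the device or group does not exist.
        [ -e "$member" ] || continue
        echo "${member##*/}"
    done
}
```

For the error above you would run group_members 0000:08:00.0 and get every address that has to be bound to vfio-pci together with it.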

In my case, this is a USB 3.0 card and it’s been placed in IOMMU group 13, together with other stuff connected/provided by the chipset like onboard SATA and USB controllers, plus other cards that I’m not passing through like the Broadcom ethernet and the R5 230 that is the KVM host GPU.

as shown by the output of the IOMMU groups dump script above (only group 13 shown)

IOMMU group 13
[RESET]	01:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:43d0] (rev 01)
	01:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller [1022:43c8] (rev 01)
	01:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge [1022:43c6] (rev 01)
[RESET]	02:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	02:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	02:02.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	02:03.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	02:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	02:06.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	02:07.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
[RESET]	04:00.0 Ethernet controller [0200]: Intel Corporation I211 Gigabit Network Connection [8086:1539] (rev 03)
[RESET]	05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Caicos PRO [Radeon HD 7450] [1002:677b]
[RESET]	05:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Caicos HDMI Audio [Radeon HD 6450 / 7450/8450/8490 OEM / R5 230/235/235X OEM] [1002:a...
[RESET]	06:00.0 Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme II BCM5709 Gigabit Ethernet [14e4:1639] (rev 20)
	06:00.1 Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme II BCM5709 Gigabit Ethernet [14e4:1639] (rev 20)
[RESET]	08:00.0 USB controller [0c03]: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller [1912:0014] (rev 03)
[RESET]	09:00.0 USB controller [0c03]: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller [1912:0014] (rev 03)

  1. The first (and best) way to deal with this is to move the cards to another PCIe slot served by the CPU: either of the x16 slots for the GPU, or the primary M.2 slot. In general, all slots that support Gen3 PCIe must come from the CPU, because the X470 chipset does NOT provide Gen3 PCIe lanes.
    If you are booting with CSM disabled on Ryzen systems, the default boot GPU is the one on the chipset, probably because it has a lower PCIe address. This is perfect for a passthrough GPU configuration (my “boot GPU” is an Asus R5 230, which seems to be the only model of R5 230 with EFI/GOP capability, so it can actually boot with CSM disabled).
    This is kind of annoying on this board, but probably the only “secure” choice. I have the GPU I want to pass through in one of the main x16 slots, and the other x16 slot is occupied by a SAS HBA card that runs the SAS drives in the VM storage array. I also have an M.2 slot, and a dumb adapter that turns it into a PCIe x4 slot.

  2. Another way to deal with this (as also suggested by the error message) is to bind all devices in this group to the vfio-pci driver. In my case that’s a bit annoying, as more or less all x1 PCIe slots and the chipset SATA and USB are in this group. If I did that, I would have to move the root filesystem to an NVMe drive or to a drive connected to the HBA card, and when a VM is using even just ONE of these devices, all the others in the same IOMMU group are blocked off for other VMs.

  3. The third option is ACS override, which is basically “let’s assume ACS exists so it’s safe to split the devices in more IOMMU groups” and IMHO is kind of bad. It may work fine, either because the PCIe bridge genuinely does not support peer-to-peer communication and just does not report it, or because you rely on the VMs not abusing the capability, but I’m using VMs for isolation and I don’t like that.
    For a better explanation of these options, see the YouTube video “A little about Passthrough, PCIe, IOMMU Groups and breaking them up” by Spaceinvader One. Yes, he is using Unraid, which I don’t personally like much, but he details the kernel command line to use for each ACS override possibility.

So I chose option 1: I moved the USB PCIe card to the M.2 slot from the CPU and added its PCIe address to the vfio script list so it’s locked out and not used by Linux. Passed it through, solved firmware issues with it, and boom, all is fine.

Welcome to the forum. Thank you for the awesome first post, and the formatting is a nice bonus.

I added in a link to the video. You can get permission to post links by leveling up, and you can level up by reading posts while logged in.

DUDE! This was the missing link for me to get this working on Tumbleweed. I really appreciate it. I was using resources from Arch but only got so far. I just wanted to thank you before carrying on my way

I’ve got a GTX 970 for my host right now and a Vega 56 passed to a Windows VM and all is well. Going to use a few other Linux distros this way too. The 56 I flashed to a 64 just the other day, and I was kind of surprised that everything worked as well as it did

So, since I was kind of annoyed by the IOMMU groups and the lack of PCIe ports to place passthrough cards in, I kept an eye on eBay and finally managed to acquire a cosmetically-damaged Threadripper motherboard (Fatal1ty X399 Professional Gaming) for around the same price I can sell the current X470 Ryzen board for, and a Threadripper 1900X (8 cores, 16 threads) for slightly more than I can sell the current Ryzen 5 1600X for.
And obviously I had to buy also a new Noctua heatsink for this bad boy because Threadrippers are thicc and I could not reuse the old one.

Yes I know it’s not the best Threadripper CPU ever but it is better than the Ryzen 5 1600x I currently use. The main thing is that it’s the cheapest CPU I can get to use a motherboard with the connectivity and IOMMU groups I need. Great mounting system btw, with the CPU bracket that slides into a rail puts Intel’s high end sockets (the ones in servers) to shame.

Let’s recap what this board offers (that I care about in my build):

  • 4 PCIe x8 slots (if all are filled with a card, or 2 slots at x16), all the slots support bifurcation from x8x8 to x4x4x4x4 in the UEFI settings, so if I want to go completely bananas I can.
  • 3 M.2 slots with PCIe x4 each
  • a single x1 PCIe slot from the chipset. Since it’s from the chipset it does not have its own IOMMU group, so it’s limited to host stuff. For now it holds a GPU I used for changing UEFI settings, but once I’m done I’ll probably remove it and use the slot to power a SAS expander (which uses a PCIe slot as a power supply only)
  • 3 ethernet ports (2x gigabit and a single 10 gigabit), so I can do without a dual port gigabit card. It’s also very nice to have a 10 Gbit port; I guess I will have to buy that fanless managed switch with 24 ports and a couple of 10 Gbit SFP+ slots if I want to use it beyond gigabit speed, maybe in the future.
  • 8 RAM slots, so now I have 4 free slots in the unlikely event that 128GB of RAM is not enough. Yay, more future proofing

But the most important thing of all is that the IOMMU groups are great on this board (as should be on all Threadripper boards afaik).

  • All PCIe slots and the M.2 slots are placed in their own IOMMU group, apart from the single x1 PCIe slot in the middle of the board and the PCIe lanes for the wifi card.

  • USB 3.0 and SATA controllers are split up into different groups: one controller is in the chipset group and the other 2 controllers are each isolated in their own group. This means I can stop wasting an M.2 slot with x4 lanes on an x1 PCIe card with a couple of USB 3.0 ports.

Here is a dump of the devices and their IOMMU groups.

IOMMU group 17
[RESET]	09:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller [1022:145c]
IOMMU group 35
[RESET]	43:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller [1022:145c]
IOMMU group 7
[RESET]	00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
IOMMU group 25
[RESET]	40:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU group 15
[RESET]	09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function [1022:145a]
IOMMU group 33
[RESET]	43:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function [1022:145a]
IOMMU group 5
	00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 23
	40:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 13
[RESET]	01:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] X399 Series Chipset USB 3.1 xHCI Controller [1022:43ba] (rev 02)
	01:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] X399 Series Chipset SATA Controller [1022:43b6] (rev 02)
	01:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] X399 Series Chipset PCIe Bridge [1022:43b1] (rev 02)
[RESET]	02:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
	02:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
	02:05.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
	02:06.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
	02:07.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
[RESET]	03:00.0 Ethernet controller [0200]: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] [1d6a:d107] (rev 02)
[RESET]	04:00.0 Ethernet controller [0200]: Intel Corporation I211 Gigabit Network Connection [8086:1539] (rev 03)
[RESET]	05:00.0 Network controller [0280]: Intel Corporation Dual Band Wireless-AC 3168NGW [Stone Peak] [8086:24fb] (rev 10)
[RESET]	06:00.0 Ethernet controller [0200]: Intel Corporation I211 Gigabit Network Connection [8086:1539] (rev 03)
[RESET]	07:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Caicos PRO [Radeon HD 7450] [1002:677b]
[RESET]	07:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Caicos HDMI Audio [Radeon HD 6450 / 7450/8450/8490 OEM / R5 230/235/235X OEM] [1002:a...
IOMMU group 31
[RESET]	41:00.0 USB controller [0c03]: Renesas Technology Corp. uPD720202 USB 3.0 Host Controller [1912:0015] (rev 02)
IOMMU group 3
	00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 21
	40:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 11
	00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0 [1022:1460]
	00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1 [1022:1461]
	00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
	00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
	00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4 [1022:1464]
	00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5 [1022:1465]
	00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6 [1022:1466]
	00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]
IOMMU group 1
[RESET]	00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU group 28
[RESET]	40:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
IOMMU group 18
[RESET]	0a:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function [1022:1455]
IOMMU group 36
[RESET]	44:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function [1022:1455]
IOMMU group 8
	00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 26
	40:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 16
	09:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
IOMMU group 34
	43:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
IOMMU group 6
	00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 24
	40:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 14
[RESET]	08:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2004 PCI-Express Fusion-MPT SAS-2 [Spitfire] [1000:0070] (rev 03)
IOMMU group 32
[RESET]	42:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Lexa PRO [Radeon 540/540X/550/550X / RX 540X/550/550X] [1002:699f] (rev c7)
	42:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Baffin HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X] [1002:aae0]
IOMMU group 4
	00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 22
[RESET]	40:01.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU group 12
	00:19.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0 [1022:1460]
	00:19.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1 [1022:1461]
	00:19.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
	00:19.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
	00:19.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4 [1022:1464]
	00:19.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5 [1022:1465]
	00:19.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6 [1022:1466]
	00:19.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]
IOMMU group 30
[RESET]	40:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
IOMMU group 2
[RESET]	00:01.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU group 20
	0a:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) HD Audio Controller [1022:1457]
IOMMU group 10
	00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 59)
	00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU group 29
	40:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 0
	00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 19
	0a:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU group 37
	44:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU group 9
[RESET]	00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
IOMMU group 27
	40:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]

So let’s start with the tutorial section update.

Updated the firmware to latest, version 3.80 because why not. Enabled the same stuff as with the other board:

  • IOMMU set to “Enabled” (in 2 different places for some reason)
  • ACS set to Enable
  • PCIe ARI support set to Enable
  • SR-IOV set to Enable

Since I’ve moved the system drive from the older build, most of the setup is done already and does not change.

In the old build I renamed the 3 ethernet devices (and the network config is expecting these custom names), so I need to edit the config so that the 3 ethernet devices of the new build are used instead. This can be done with YaST, but it involves deleting the old interfaces and configuring the new ones from scratch.

Running the ip a command will list the current ethernet interfaces and their MAC addresses.

Just change the MAC address (the ATTR{address}) in /etc/udev/rules.d/70-persistent-net.rules to the one of your current ethernet cards, and reboot. (The MAC addresses below are not real, although technically valid.)

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{type}=="1", ATTR{dev_id}=="0x0", ATTR{address}=="00:00:00:00:00:1a", NAME="eth-vm-0"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{type}=="1", ATTR{dev_id}=="0x0", ATTR{address}=="00:00:00:00:00:1c", NAME="eth-main"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{type}=="1", ATTR{dev_id}=="0x0", ATTR{address}=="00:00:00:00:00:18", NAME="eth-vm-1"
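
Since this has to be typed at the console, a compact listing of just names and MAC addresses can be easier to read than the full ip a output. A small sketch, with a made-up function name and a hypothetical NET_ROOT override; it reads the "address" file each interface has under /sys/class/net.

```shell
#!/bin/sh
# Sketch: print "interface MAC" for every network interface.
# NET_ROOT is a hypothetical override of the sysfs path.
list_macs() {
    root="${NET_ROOT:-/sys/class/net}"
    for iface in "$root"/*; do
        # Entries without a readable address file are skipped.
        [ -r "$iface/address" ] || continue
        echo "${iface##*/} $(cat "$iface/address")"
    done
}
```

Running list_macs with no setup reads the live interfaces; the second column is what goes into ATTR{address}.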

This has to be done manually from the physical keyboard and screen of the system. Once this is done I can connect again through SSH from my main PC and enjoy copy-paste to/from the Internet again.

The second order of business is just changing which PCIe addresses are locked out from the host and can only be passed through to the VMs. That does not need repeating: just edit the /sbin/vfio-pci-override.sh file to add devices and rebuild the initramfs so the one used on boot is updated.

But before we do that, as I said this board has separate IOMMU groups for USB and SATA, and I want to use the USB 3.0 controllers for passthrough. I need to detect which physical ports are connected to each controller, as all controllers have the same name, and I want to have at least two different USB 3.0 banks for two different VMs.

I will be connecting a USB flash drive to the ports and using the lshw tool from OpenSUSE to detect the PCIe address of its controller.
The same method also works for SATA controllers: I can see which controller is serving the SATA drives (the system drive and the ISO files drive). But I don’t need to pass through SATA drives, so I’m not going to care about that for the moment.

I’ll start plugging my USB drive in the first USB 3.0 ports under the PS/2 round connector, and use the command lshw -sanitize to get a dump of all hardware onboard (the -sanitize option is only to remove serial numbers and stuff so this is ok to post online)

Here you can see the part of the output where we find the tree from the USB controller to the USB device (a Transcend 8GB flash drive) I connected.

       *-usb
            description: USB controller
            product: Family 17h (Models 00h-0fh) USB 3.0 Host Controller
            vendor: Advanced Micro Devices, Inc. [AMD]
            physical id: 0.3
            bus info: pci@0000:43:00.3
            version: 00
            width: 64 bits
            clock: 33MHz
            capabilities: pm pciexpress msi xhci bus_master cap_list
            configuration: driver=xhci_hcd latency=0
            resources: irq:66 memory:82000000-820fffff
          *-usbhost:0
               product: xHCI Host Controller
               vendor: Linux 5.7.7-1-default xhci-hcd
               physical id: 0
               bus info: usb@5
               logical name: usb5
               version: 5.07
               capabilities: usb-2.00
               configuration: driver=hub slots=4 speed=480Mbit/s
             *-usb
                  description: Mass storage device
                  product: Mass Storage Device
                  vendor: JetFlash
                  physical id: 1
                  bus info: usb@5:1
                  logical name: scsi11
                  version: 11.00
                  serial: [REMOVED]
                  capabilities: usb-2.00 scsi emulated scsi-host
                  configuration: driver=usb-storage maxpower=500mA speed=480Mbit/s
                *-disk
                     description: SCSI Disk
                     product: Transcend 8GB
                     vendor: JetFlash
                     physical id: 0.0.0
                     bus info: scsi@11:0.0.0
                     logical name: /dev/sdy
                     version: 1100
                     serial: [REMOVED]
                     size: 7728MiB (8103MB)
                     capabilities: partitioned partitioned:dos
                     configuration: ansiversion=4 logicalsectorsize=512 sectorsize=512 signature=4cdd33a6
                   *-volume
                        description: Windows FAT volume
                        vendor: MSWIN4.1
                        physical id: 1
                        bus info: scsi@11:0.0.0,1
                        logical name: /dev/sdy1
                        version: FAT32
                        serial: [REMOVED]
                        size: 7725MiB
                        capacity: 7727MiB
                        capabilities: primary fat initialized
                        configuration: FATs=2 filesystem=fat label=trans8gb
          *-usbhost:1
               product: xHCI Host Controller
               vendor: Linux 5.7.7-1-default xhci-hcd
               physical id: 1
               bus info: usb@6
               logical name: usb6
               version: 5.07
               capabilities: usb-3.00
               configuration: driver=hub slots=4 speed=5000Mbit/s

So now we know this port comes from the controller at 0000:43:00.3 (check the “bus info” entry)
Let’s try the other port in the same stack.
Aaand the output is identical, which confirms that the second port is also on the same controller at 0000:43:00.3

Let’s try the second USB 3.0 stack, between the antenna connectors and the audio jacks.
Nope, same controller for both ports. So the first 4 ports from the left are all on the same controller.

Let’s try the USB ports stack under the first Ethernet (the red ethernet port aka the Aquantia 10G port).

Aand yes they are under a different controller.

       *-usb
            description: USB controller
            product: Family 17h (Models 00h-0fh) USB 3.0 Host Controller
            vendor: Advanced Micro Devices, Inc. [AMD]
            physical id: 0.3
            bus info: pci@0000:09:00.3
            version: 00
            width: 64 bits
            clock: 33MHz
            capabilities: pm pciexpress msi xhci bus_master cap_list
            configuration: driver=xhci_hcd latency=0
            resources: irq:55 memory:bb200000-bb2fffff
          *-usbhost:0
               product: xHCI Host Controller
               vendor: Linux 5.7.7-1-default xhci-hcd
               physical id: 0
               bus info: usb@1
               logical name: usb1
               version: 5.07
               capabilities: usb-2.00
               configuration: driver=hub slots=4 speed=480Mbit/s
             *-usb
                  description: Mass storage device
                  product: Mass Storage Device
                  vendor: JetFlash
                  physical id: 1
                  bus info: usb@1:1
                  logical name: scsi11
                  version: 11.00
                  serial: [REMOVED]
                  capabilities: usb-2.00 scsi emulated scsi-host
                  configuration: driver=usb-storage maxpower=500mA speed=480Mbit/s
                *-disk
                     description: SCSI Disk
                     product: Transcend 8GB
                     vendor: JetFlash
                     physical id: 0.0.0
                     bus info: scsi@11:0.0.0
                     logical name: /dev/sdy
                     version: 1100
                     serial: [REMOVED]
                     size: 7728MiB (8103MB)
                     capabilities: partitioned partitioned:dos
                     configuration: ansiversion=4 logicalsectorsize=512 sectorsize=512 signature=4cdd33a6
                   *-volume
                        description: Windows FAT volume
                        vendor: MSWIN4.1
                        physical id: 1
                        bus info: scsi@11:0.0.0,1
                        logical name: /dev/sdy1
                        version: FAT32
                        serial: [REMOVED]
                        size: 7725MiB
                        capacity: 7727MiB
                        capabilities: primary fat initialized
                        configuration: FATs=2 filesystem=fat label=trans8gb
          *-usbhost:1
               product: xHCI Host Controller
               vendor: Linux 5.7.7-1-default xhci-hcd
               physical id: 1
               bus info: usb@2
               logical name: usb2
               version: 5.07
               capabilities: usb-3.00
               configuration: driver=hub slots=4 speed=5000Mbit/s

So the controller for the ports here has pci address 0000:09:00.3
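
If you don't want to read through the whole lshw tree every time, the same mapping can be pulled out of sysfs. This is only a rough sketch, not what I used above: it assumes that each /sys/bus/usb/devices/usbN entry resolves to a path containing the PCI address of its controller, and the function name is made up for this example.

```shell
#!/bin/bash
# Print the PCI address of the controller behind a resolved sysfs USB path.
# The controller is the last component of the path that looks like a PCI address.
pci_addr_from_sysfs_path() {
    grep -oE '[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]' <<< "$1" | tail -n 1
}

# Walk every USB root hub the kernel exposes and print its controller.
for bus in /sys/bus/usb/devices/usb*; do
    [ -e "$bus" ] || continue   # no USB buses visible (for example inside a container)
    echo "$(basename "$bus") -> $(pci_addr_from_sysfs_path "$(readlink -f "$bus")")"
done
```

The names on the left (usb1, usb2, usb6 and so on) are the same "logical name" values lshw prints, so plug the flash drive in, see which bus it lands on with lsusb, and read the controller address off this list.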

The ports under the last ethernet port come from a controller called “X399 Series Chipset USB 3.1 xHCI Controller” with address 0000:01:00.0.

The ports from the two headers on the board are also on the controller at address 0000:01:00.0, the “X399 Series Chipset USB 3.1 xHCI Controller”.

Now let’s see where these controllers are, and hope the first two are the ones with their own dedicated IOMMU group, because I know the “X399 Series Chipset USB 3.1 xHCI Controller” is in the IOMMU group with the other assorted chipset stuff (unsurprisingly).

Time to run the IOMMU group script again to get a list.

Here they are: the two controllers that have their own IOMMU group are the ones controlling the 8 USB 3.0 ports in the back.

IOMMU group 17
[RESET]	09:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller [1022:145c]

And this

IOMMU group 33
[RESET]	42:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller [1022:145c]

Great, so we have the PCIe addresses and we can add both to the list of PCIe devices we can pass through to the VMs. Yay, two USB 3.0 controllers for passthrough.
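
Once you already know the address, you can also ask sysfs directly which IOMMU group a single device is in, instead of dumping the whole list. A minimal sketch, assuming the usual iommu_group symlink under /sys/bus/pci/devices/; the function name and the optional second argument (a different sysfs root, only there for testing) are my own invention.

```shell
#!/bin/bash
# Print the IOMMU group of a PCI device and every device sharing that group.
# $1 = full PCI address, $2 = sysfs root (defaults to /sys)
iommu_group_of() {
    local addr=$1 sysfs=${2:-/sys}
    local link="$sysfs/bus/pci/devices/$addr/iommu_group"
    [ -e "$link" ] || { echo "no IOMMU group found for $addr" >&2; return 1; }
    local group_dir
    group_dir=$(readlink -f "$link")
    echo "IOMMU group $(basename "$group_dir")"
    ls -1 "$group_dir/devices"
}

# The two USB controllers found above; a missing device just prints a warning.
for addr in 0000:09:00.3 0000:42:00.3; do
    iommu_group_of "$addr" || true
done
```

If the list under a group contains only the controller itself, that controller is alone in its group and is a good candidate for passthrough.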

The host still has 4 USB 2.0 ports and 4 USB 3.0 ports from the board headers, plus the two USB 3.1 Gen2 ports (the ones under the third ethernet port).


Minor update: I’ve written a script that updates the list of PCIe addresses for vfio in /sbin/vfio-pci-override.sh, so that when I move cards around I just need to run this script and then reboot.
Yes I’ve been moving cards a lot and I was getting annoyed by having to check every time manually what addresses stuff was now at.

#!/bin/bash

# Device names (or parts of them) to look for in the lspci output
array_of_strings=('Family 17h (Models 00h-0fh) USB 3.0 Host Controller' 'Lexa PRO' 'Baffin HDMI/DP Audio')

list_of_pci_ids=''

for grep_string in "${array_of_strings[@]}"; do
	echo "$grep_string"

	# take the address column of every matching lspci line and prepend the 0000: PCI domain
	new_pci_id=$( lspci | grep "$grep_string" | awk '{print "0000:"$1}' | tr '\n' ' ' )
	echo "$new_pci_id"

	list_of_pci_ids="${list_of_pci_ids} ${new_pci_id}"
done

echo "$list_of_pci_ids"

# rewrite the DEVS= line in the vfio override script with the new list
sed -i "s/DEVS=.*/DEVS=\"$list_of_pci_ids\"/g" /sbin/vfio-pci-override.sh

# print the updated file so the new DEVS= line can be checked
cat /sbin/vfio-pci-override.sh

# rebuild the initramfs images (OpenSUSE specific)
mkinitrd

To use the script, add the card or device name (or part of it), as shown by lspci, as a string in the array_of_strings variable at the beginning of the script.

for example, these are the entries of the RX550

09:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Lexa PRO [Radeon 540/540X/550/550X / RX 540X/550/550X] (rev c7)
09:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Baffin HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X]

that became the ‘Lexa PRO’ and the ‘Baffin HDMI/DP Audio’ string in the array_of_strings variable in the script.

and these are the entries for the USB controllers I can pass through

0a:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller
41:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller

that became ‘Family 17h (Models 00h-0fh) USB 3.0 Host Controller’ in the array_of_strings

Please avoid using strings containing [ and ], because grep treats them as regular-expression syntax and the match won’t work in the script.
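
If you really need to match something with brackets in it (for example the [vendor:device] ID that lspci -nn prints), one way around this is grep's -F option, which treats the pattern as a fixed string. This is just a sketch run on a sample line copied from the IOMMU group listing above; the script as posted does not use -F or lspci -nn.

```shell
#!/bin/bash
# Sample line in "lspci -nn" format, copied from the IOMMU group listing above.
sample='09:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller [1022:145c]'

# Plain grep reads [...] as a bracket expression (a set of characters),
# not as literal brackets. -F makes it match the pattern as a fixed string.
echo "$sample" | grep -F '[1022:145c]' | awk '{print "0000:"$1}'
```

On the sample line this prints 0000:09:00.3, which is the same address format the script builds for the DEVS list.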

The mkinitrd command at the end is specific to OpenSUSE and rebuilds the initramfs images. If you are using Fedora or another distro the command is different (on Fedora, for example, the initramfs is rebuilt with dracut), so replace it with the right command for your system.

So now you can just download this script, call it update-vfio-passthrough and chmod +x it to make it executable.
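
Before letting the script touch the real file, you can try the sed expression on a scratch copy. A small sketch: the scratch file stands in for /sbin/vfio-pci-override.sh, its contents and the old addresses in it are made up, and the new list just reuses the example addresses from the lspci lines above.

```shell
#!/bin/bash
# Hypothetical stand-in for /sbin/vfio-pci-override.sh; only the DEVS= line matters here.
scratch=$(mktemp)
printf '%s\n' '#!/bin/sh' 'DEVS="0000:01:00.0 0000:01:00.1"' > "$scratch"

# Example list in the same format the update script builds.
list_of_pci_ids='0000:0a:00.3 0000:41:00.3 0000:09:00.0 0000:09:00.1'

# Without -i, sed prints the result and leaves the file itself untouched.
preview=$(sed "s/DEVS=.*/DEVS=\"$list_of_pci_ids\"/" "$scratch")
echo "$preview"
rm -f "$scratch"
```

If the DEVS= line in the preview looks right, the real run with sed -i should produce the same line in the actual file.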
