[Mostly Solved] Passing through PCIe USB controller crashes system

I have a Proxmox 7.0-11 system on kernel 5.11.22-3-pve, and I am suddenly having trouble using PCIe passthrough with a USB controller built into the motherboard. I am trying to boot a Windows 10 VM with an RX 580, the motherboard sound card, and the motherboard USB controller passed through. Everything works fine when the USB controller is not passed through, but when it is, the system immediately crashes.

This is a new issue, and I suspect that a kernel update broke something. Everything was working fine until a power outage caused unexpected downtime (the machine is on a UPS, so it shut down gracefully) and the reboot brought up a new kernel. The bug started on kernel 5.4 (Proxmox 6.4) and it still affects the current Proxmox kernel.

Here are my complete specs:

Motherboard ASUSTeK PRIME X470-PRO
CPU AMD Ryzen 7 2700X
GPU AMD Radeon RX 580 8GB
RAM 3x G-Skill F4-3200C16-16GVK

The USB controller in question is 0b:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Zeppelin USB 3.0 Host controller [1022:145f]

Here is the relevant /var/log/syslog segment after running qm start 102

Aug 14 17:19:04 pve03 qm[86904]: start VM 102: UPID:pve03:00015378:0001A3C8:61184158:qmstart:102:root@pam:
Aug 14 17:19:04 pve03 qm[86900]: <root@pam> starting task UPID:pve03:00015378:0001A3C8:61184158:qmstart:102:root@pam:
Aug 14 17:19:04 pve03 kernel: [ 1074.695912] xhci_hcd 0000:0b:00.3: USB bus 5 deregistered
Aug 14 17:19:04 pve03 systemd[1]: Stopped target Sound Card.
Aug 14 17:19:05 pve03 systemd[1]: Started 102.scope.
Aug 14 17:19:05 pve03 systemd-udevd[86909]: Using default interface naming scheme 'v247'.
Aug 14 17:19:05 pve03 systemd-udevd[86909]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Aug 14 17:19:05 pve03 kernel: [ 1075.594595] device tap102i0 entered promiscuous mode
Aug 14 17:19:05 pve03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port tap102i0
Aug 14 17:19:05 pve03 ovs-vsctl: ovs|00002|db_ctl_base|ERR|no port named tap102i0
Aug 14 17:19:05 pve03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port fwln102i0
Aug 14 17:19:05 pve03 ovs-vsctl: ovs|00002|db_ctl_base|ERR|no port named fwln102i0
Aug 14 17:19:05 pve03 systemd-udevd[86912]: Using default interface naming scheme 'v247'.
Aug 14 17:19:05 pve03 systemd-udevd[86912]: ethtool: autonegotiation is uns

The log cuts off abruptly at the point where the system crashes completely.

All help or pointers are greatly appreciated!

Make sure there are no other devices in the USB controller's IOMMU group.

It’s isolated in IOMMU Group 20
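For anyone who wants to check the same thing on their own host, here is a minimal sketch that walks the IOMMU group tree in sysfs and prints one line per device. The function name list_iommu_groups is made up for this example, and the optional argument exists only so the function can be pointed at a fake directory tree for testing.

```shell
# Print one line per PCI device, prefixed with its IOMMU group number.
# $1 (optional, hypothetical): alternative root directory, used for testing.
# Usage: list_iommu_groups
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        # If the glob matched nothing, the literal pattern is left over; skip it.
        [ -e "$dev" ] || continue
        group="${dev#"$root"/}"   # strip the root prefix
        group="${group%%/*}"      # keep only the group number
        printf 'IOMMU Group %s: %s\n' "$group" "${dev##*/}"
    done
}
```

A controller that is safe to pass through on its own should be the only address listed for its group, which is the case for 0000:0b:00.3 in group 20 here.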

Bumping this because this is still a problem and I have zero clue what’s going wrong (and I’ve finally returned to this problem). I am now on Proxmox PVE 7.1-10 and on kernel version 5.13.19-5-pve.

Some more investigation I’ve done:

I created a tmux session and ran a curl every half second to a logging webserver. When starting the VM, the logs show that the requests stop coming in almost immediately. However, there are two more requests that come in 8 seconds apart after the system hangs. Not exactly sure what this means, other than that the system isn’t completely dead. Here is the tail end of the log with timestamps:

10.0.10.56 - - [08/Mar/2022 19:49:55] "GET / HTTP/1.1" 200 -
10.0.10.56 - - [08/Mar/2022 19:49:55] "GET / HTTP/1.1" 200 -
10.0.10.56 - - [08/Mar/2022 19:49:56] "GET / HTTP/1.1" 200 -
10.0.10.56 - - [08/Mar/2022 19:49:56] "GET / HTTP/1.1" 200 -
10.0.10.56 - - [08/Mar/2022 19:49:57] "GET / HTTP/1.1" 200 -
10.0.10.56 - - [08/Mar/2022 19:49:58] "GET / HTTP/1.1" 200 -
10.0.10.56 - - [08/Mar/2022 19:50:06] "GET / HTTP/1.1" 200 -
10.0.10.56 - - [08/Mar/2022 19:50:14] "GET / HTTP/1.1" 200 -
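To make the stall easier to spot in a longer log, the gaps between consecutive requests can be computed rather than read by eye. This is only a sketch: log_gaps is a made-up helper, it assumes the timestamp format shown above, and it assumes every line falls on the same day.

```shell
# Read an access log on stdin and report gaps longer than $1 seconds (default 1)
# between consecutive lines. Assumes HH:MM:SS timestamps from a single day.
# Usage: log_gaps 1 < access.log
log_gaps() {
    awk -v max="${1:-1}" '
        match($0, /[0-9][0-9]:[0-9][0-9]:[0-9][0-9]/) {
            ts = substr($0, RSTART, 8)
            split(ts, t, ":")
            now = t[1] * 3600 + t[2] * 60 + t[3]   # seconds since midnight
            if (seen && now - prev > max)
                printf "%ds gap before %s\n", now - prev, ts
            prev = now
            seen = 1
        }
    '
}
```

Run against the lines above with a threshold of 1, this reports the two 8-second gaps before 19:50:06 and 19:50:14.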

Additionally, here is the tail end of journalctl -o short-precise -k -b -1 for the logs of the last boot (other entries were minutes before this; these stop when the system hangs):

Mar 08 18:49:55.831184 pve03 kernel: xhci_hcd 0000:0b:00.3: remove, state 4
Mar 08 18:49:55.831385 pve03 kernel: usb usb6: USB disconnect, device number 1
Mar 08 18:49:55.831523 pve03 kernel: usb 6-1: USB disconnect, device number 2
Mar 08 18:49:55.843175 pve03 kernel: xhci_hcd 0000:0b:00.3: USB bus 6 deregistered
Mar 08 18:49:55.843335 pve03 kernel: xhci_hcd 0000:0b:00.3: remove, state 1
Mar 08 18:49:55.843448 pve03 kernel: usb usb5: USB disconnect, device number 1
Mar 08 18:49:55.843588 pve03 kernel: usb 5-1: USB disconnect, device number 2
Mar 08 18:49:55.843735 pve03 kernel: usb 5-1.1: USB disconnect, device number 3
Mar 08 18:49:55.955175 pve03 kernel: xhci_hcd 0000:0b:00.3: USB bus 5 deregistered
Mar 08 18:49:56.735192 pve03 kernel: device tap102i0 entered promiscuous mode
Mar 08 18:49:56.755193 pve03 kernel: fwbr102i0: port 1(tap102i0) entered blocking state
Mar 08 18:49:56.755289 pve03 kernel: fwbr102i0: port 1(tap102i0) entered disabled state
Mar 08 18:49:56.755312 pve03 kernel: fwbr102i0: port 1(tap102i0) entered blocking state
Mar 08 18:49:56.755326 pve03 kernel: fwbr102i0: port 1(tap102i0) entered forwarding state
Mar 08 18:49:56.763193 pve03 kernel: device fwln102o0 entered promiscuous mode
Mar 08 18:49:56.775180 pve03 kernel: fwbr102i0: port 2(fwln102o0) entered blocking state
Mar 08 18:49:56.775260 pve03 kernel: fwbr102i0: port 2(fwln102o0) entered disabled state
Mar 08 18:49:56.775283 pve03 kernel: fwbr102i0: port 2(fwln102o0) entered blocking state
Mar 08 18:49:56.775297 pve03 kernel: fwbr102i0: port 2(fwln102o0) entered forwarding state
Mar 08 18:49:57.215200 pve03 kernel: device tap102i1 entered promiscuous mode
Mar 08 18:49:57.227183 pve03 kernel: vmbr1: port 2(tap102i1) entered blocking state
Mar 08 18:49:57.227273 pve03 kernel: vmbr1: port 2(tap102i1) entered disabled state
Mar 08 18:49:57.227310 pve03 kernel: vmbr1: port 2(tap102i1) entered blocking state
Mar 08 18:49:57.227338 pve03 kernel: vmbr1: port 2(tap102i1) entered forwarding state

I really have no idea what’s happening here. As always, any help/pointers are appreciated.

Can you post your kernel parameters please.

➜ ~ # cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinuz-5.13.19-5-pve root=/dev/mapper/pve-root ro quiet amd_iommu=on iommu=pt video=vesafb:off,efifb:off

I also have /etc/modprobe.d/vfio.conf with:

options vfio-pci ids=1002:67df,1002:aaf0,1022:1455,1022:145f
# 1022:145f is the USB controller in its own IOMMU group

Let me know if I can post any other info.

Which devices are those for?

IOMMU Group 16:
	09:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] [1002:67df] (rev c7)
	09:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] [1002:aaf0]
...
IOMMU Group 20:
	0b:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Zeppelin USB 3.0 Host controller [1022:145f]
...
IOMMU Group 21:
	0c:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function [1022:1455]
IOMMU Group 22:
	0c:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU Group 23:
	0c:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) HD Audio Controller [1022:1457]

I thought I was targeting the GPU and its audio device, the USB controller, and the sound card, but for some reason I had 1022:1455 listed instead of the sound card’s 1022:1457. I don’t remember why this happened or if it was intentional, but it doesn’t seem to cause any problems and I’ve commented out the audio passthrough anyway.

Can you please provide me with the lspci -v section for the device 0b:00.3.

0b:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Zeppelin USB 3.0 Host controller (prog-if 30 [XHCI])
	Subsystem: ASUSTeK Computer Inc. Zeppelin USB 3.0 Host controller
	Flags: bus master, fast devsel, latency 0, IRQ 78, IOMMU group 20
	Memory at fc500000 (64-bit, non-prefetchable) [size=1M]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
	Capabilities: [64] Express Endpoint, MSI 00
	Capabilities: [a0] MSI: Enable- Count=1/8 Maskable- 64bit+
	Capabilities: [c0] MSI-X: Enable+ Count=8 Masked-
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150] Advanced Error Reporting
	Capabilities: [2a0] Access Control Services
	Kernel driver in use: xhci_hcd
	Kernel modules: xhci_pci

Here is the entire output of lspci -v: https://yld.me/raw/gLbO.txt

This might be the cause of your problem. The kernel has a driver loaded for the USB controller, and when you try to pass it through regardless, unforeseen complications might arise, such as crashes. I don't know whether vfio-pci has trouble handling USB controllers, but you cannot have another driver loaded for the device. I suggest you do an internet search for vfio-pci and USB controllers to see how to resolve this.

I figure the problem is specific to vfio-pci and this controller, and that you need to handle it differently, since vfio-pci has loaded correctly for your GPU:

Kernel driver in use: vfio-pci
Kernel modules: amdgpu
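The same check can be done without lspci by reading the driver symlink in sysfs. A small sketch, assuming the usual /sys/bus/pci/devices layout; bound_driver is a made-up name and the second argument is only there so the function can be tested against a fake tree.

```shell
# Print the name of the kernel driver currently bound to a PCI device,
# or "none" if no driver is bound.
# $1: full PCI address, e.g. 0000:0b:00.3
# $2 (optional, hypothetical): alternative sysfs root, used for testing.
# Usage: bound_driver 0000:0b:00.3
bound_driver() {
    addr="$1"
    sysfs="${2:-/sys}"
    link="$sysfs/bus/pci/devices/$addr/driver"
    if [ -L "$link" ]; then
        # The symlink points at the driver directory; its last component is the name.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}
```

For the controller in this thread the call would be expected to print xhci_hcd before the VM starts and vfio-pci once the device has been handed over.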

Ok, so after some digging it seems you were right, thanks! I was able to temporarily resolve the issue by manually unbinding the controller from the xhci_hcd driver and binding it to the vfio-pci driver. For anyone else with the same problem, here’s what I did:

echo -n '0000:0b:00.3' > /sys/bus/pci/drivers/xhci_hcd/unbind
echo -n '0000:0b:00.3' > /sys/bus/pci/drivers/vfio-pci/bind

Still not sure how to make it bind on boot, but at least everything seems to work now!
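One variant that avoids hard-coding the old driver name is the driver_override file in sysfs. What follows is a sketch under assumptions, not a confirmed fix for this board: rebind_to_vfio is a made-up name, the second argument exists only for testing against a fake tree, and the vfio-pci module is assumed to be loaded already (for example via modprobe vfio-pci).

```shell
# Detach a PCI device from its current driver and let vfio-pci claim it.
# $1: full PCI address, e.g. 0000:0b:00.3
# $2 (optional, hypothetical): alternative sysfs root, used for testing.
# Usage (as root): rebind_to_vfio 0000:0b:00.3
rebind_to_vfio() {
    addr="$1"
    sysfs="${2:-/sys}"
    dev="$sysfs/bus/pci/devices/$addr"
    [ -d "$dev" ] || { echo "no such device: $addr" >&2; return 1; }
    # From now on only vfio-pci is allowed to bind to this device.
    echo vfio-pci > "$dev/driver_override"
    # Release the device from whatever driver holds it (xhci_hcd in this thread).
    if [ -e "$dev/driver" ]; then
        echo "$addr" > "$dev/driver/unbind"
    fi
    # Ask the PCI core to probe the device again; the override selects vfio-pci.
    echo "$addr" > "$sysfs/bus/pci/drivers_probe"
}
```

Calling this from something that runs at boot, or just before qm start, might cover the bind-on-boot part, but where exactly to hook it in is an assumption that has not been verified here.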

Hmm, no, vfio can dynamically unbind xhci-pci when you assign the controller to a VM: it will unbind xhci, bind itself, and keep going.
The only time starting a VM crashed Proxmox for me was when the PCI IDs changed and I suddenly tried to pass through my Proxmox boot drive, thinking it was a USB controller.
Can you post /etc/pve/qemu-server/102.conf?

/etc/pve/qemu-server/102.conf:

agent: 1
args: -cpu host,+svm
balloon: 0
bios: ovmf
boot: order=sata0
cores: 8
cpu: host
efidisk0: local-lvm:vm-102-disk-1,size=4M
ide0: none,media=cdrom
ide2: none,media=cdrom
machine: q35
memory: 8196
name: win10-gpu
net0: virtio=C2:10:81:39:99:CA,bridge=vmbr0,firewall=1,tag=50
net1: virtio=36:99:5F:B7:99:09,bridge=vmbr1
numa: 0
hostpci0: 09:00,pcie=1,romfile=rx480new.bin,x-vga=1
hostpci1: 0b:00.3,pcie=1
ostype: win10
sata0: local-lvm:vm-102-disk-0,size=355G
sata1: /dev/disk/by-id/ata-Hitachi_HUA722010CLA330_JPW930HQ1PXAYW,size=976762584K
scsihw: virtio-scsi-pci
smbios1: uuid=1e178e39-dd5c-6bc9-ccdf-7a285536125d
sockets: 1
unused0: bigchungus-thin:vm-102-disk-0
vga: none
vmgenid: b98a2b51-3df7-190c-b409-15bb4f0920b8

Both those pci ids should be correct.

I usually pass the GPU and its audio function through as two separate hostpci entries.
Can you try

hostpci0: 09:00.0,pcie=1,romfile=rx480new.bin,x-vga=1
hostpci2: 09:00.1,pcie=1

?