@maximlevitsky I had the VM booting when I didn’t pass anything but the VGA and Audio portion of my GPU. It seems related to anything USB: both the controller on the GPU and the extra one I have in there.
With ASPM off I’m able to pass everything, including the problematic USB controllers.
Everything is now working as expected.
It seems the first-gen and third-gen Threadripper have that fix in common.
No no.
I mean, when you passed the USB controllers, did the VM just boot, or were you also able to attach
some devices to the controllers and see them in the guest?
@maximlevitsky I am able to attach USB devices to the USB controllers and see them in the guest. Keyboard, mouse, USB DAC, and USB mic all work as expected on either the 2080’s USB controller (Type-C to Type-A to a hub) or the StarTech PEXUSB3S24 (Renesas).
@FutureFade that is interesting. I wonder which method is more stable. Right now I haven’t run it for more than a couple of hours, but that’s usually about as long as any gaming session would go. We’ll see how it does on a full day of solid works or something similar. Also, I’m on BIOS F3 (AGESA 1.0.0.3), not the latest F4A (AGESA 1.0.0.3A?).
Using pcie_aspm=off doesn’t work for these devices:
Matisse USB 3.0 Host Controller
Starship/Matisse HD Audio Controller
Starship USB 3.0 Host Controller
Other than those, you’ll be good to go. You can also pass through ASMedia controllers without pcie_aspm=off.
To conclude this little investigation: if you have a 20-series Nvidia card, you’ll need pcie_aspm=off to use the built-in Type-C controller.
Although I am a little surprised that it worked, because pcie_aspm is an Active State Power Management option, something that shouldn’t really affect passthrough.
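For anyone who wants to confirm what ASPM state each device actually ended up in after booting with or without pcie_aspm=off, here is a minimal Python sketch that reads the LnkCtl lines from `lspci -vv` output. The line format is assumed from what pciutils normally prints, and the sample devices below are made up for illustration, not taken from anyone's machine in this thread.

```python
import re

# Matches the ASPM field of a "LnkCtl:" line as printed by `lspci -vv`,
# e.g. "LnkCtl: ASPM Disabled; RCB 64 bytes, ...".
# ASSUMPTION: this is the usual pciutils layout; adjust if yours differs.
_LNKCTL = re.compile(r"LnkCtl:\s+ASPM\s+([^;]+);")


def aspm_states(lspci_vv_text):
    """Map each PCI slot (e.g. '0a:00.2') to its LnkCtl ASPM state."""
    states = {}
    current = None
    for line in lspci_vv_text.splitlines():
        if line and not line[0].isspace():
            # Device header lines start in column 0: "<slot> <class>: <name>"
            current = line.split(" ", 1)[0]
        else:
            match = _LNKCTL.search(line)
            if match and current is not None:
                states[current] = match.group(1).strip()
    return states


# Hypothetical sample, only to show the shape of the input.
SAMPLE = (
    "0a:00.2 USB controller: example device A\n"
    "\tLnkCtl:\tASPM Disabled; RCB 64 bytes, Disabled- CommClk+\n"
    "0b:00.0 USB controller: example device B\n"
    "\tLnkCtl:\tASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+\n"
)
```

To use it on a real system, feed it the text of `sudo lspci -vv` (root is usually needed to see the link capability fields) and look at the entries for the controllers you plan to pass through.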
I got it working.
I compiled my own kernel with the DPC error driver disabled, and voilà, both my RTX 2070S cards work and I don’t need to disable ASPM, so my Thunderbolt driver works too.
By “work” I mean that the USB controller passes through cleanly, and so do all the other Nvidia devices. Now my only wish is that Nvidia keeps that little USB port on the RTX 3000 series I am waiting for (I will sell this card when the RTX 3000 series is released).
Most likely the same can be done with blacklisting and/or poking at sysfs to disable that driver.
These DPC errors probably are just bogus, and are leftover from some enterprise EPYC features.
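A rough, untested sketch of what the sysfs route might look like, in Python. It assumes the DPC port service driver shows up under /sys/bus/pci_express/drivers/dpc and accepts writes to its `unbind` file; that path is an assumption about the running kernel, not something confirmed in this thread, so the default is a dry run that only prints what it would do.

```python
from pathlib import Path

# ASSUMPTION: the PCIe port service driver for DPC is exposed at this path.
# Check that the directory exists on your kernel before relying on it.
DPC_DRIVER_DIR = Path("/sys/bus/pci_express/drivers/dpc")


def bound_devices(driver_dir):
    """Return the names of devices currently bound to the driver."""
    driver_dir = Path(driver_dir)
    if not driver_dir.is_dir():
        return []
    # In a sysfs driver directory, bound devices appear as symlinks;
    # "module" is also a symlink for modular drivers, so skip it.
    return sorted(
        p.name for p in driver_dir.iterdir()
        if p.is_symlink() and p.name != "module"
    )


def unbind_all(driver_dir, dry_run=True):
    """Write each bound device name to the driver's 'unbind' file."""
    driver_dir = Path(driver_dir)
    names = bound_devices(driver_dir)
    for name in names:
        if dry_run:
            print(f"would unbind {name}")
        else:
            # Requires root; this detaches the driver from that device.
            (driver_dir / "unbind").write_text(name)
    return names
```

Calling `unbind_all(DPC_DRIVER_DIR)` lists what would be detached; passing `dry_run=False` as root performs the writes. The change would not survive a reboot, so it would have to run before the VM starts on every boot.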
Hi—coming here from TRX40 Nvidia Single GPU Passthrough where I seemed to be hitting a similar issue (although weirdly mine worked completely fine for the first host boot but then failed after the guest was shut down the first time). Did you ever resolve this with blacklisting or anything other than a custom kernel?
Hi @allu, what ultimately solved it for me was either not passing the USB controller on my RTX 2080 through or turning the PCIe ASPM setting off.
Use kernel version 5.4, that’s what I’m using today.
Here’s my vfio.conf (/etc/modprobe.d/vfio.conf):
options vfio-pci ids=10de:1e87,10de:10f8,10de:1ad8,10de:1ad9,1912:0015,8086:1528
options vfio_iommu_type1 allow_unsafe_interrupts=1
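A typo in the ids list is easy to make and hard to spot, so here is a small Python sketch that pulls the vendor:device pairs out of a modprobe config like the one above and rejects anything malformed. It is a simplification: it only accepts the plain four-hex-digit `vendor:device` form, even though vfio-pci also accepts longer entries with subvendor and class fields.

```python
import re

# Plain "vendor:device" in hex. ASSUMPTION: the longer forms that vfio-pci
# also accepts (subvendor, subdevice, class, mask) are not handled here.
_ID = re.compile(r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{4}$")


def vfio_ids(conf_text):
    """Return the IDs listed in 'options vfio-pci ids=...' lines."""
    ids = []
    for line in conf_text.splitlines():
        parts = line.split("#", 1)[0].split()  # drop comments, tokenize
        if len(parts) < 3 or parts[0] != "options":
            continue
        # modprobe treats '-' and '_' in module names as equivalent.
        if parts[1].replace("-", "_") != "vfio_pci":
            continue
        for opt in parts[2:]:
            key, _, value = opt.partition("=")
            if key != "ids":
                continue
            for item in value.split(","):
                if not _ID.match(item):
                    raise ValueError(f"malformed ID: {item!r}")
                ids.append(item.lower())
    return ids


# The config from the post above, one option per line.
CONF = """\
options vfio-pci ids=10de:1e87,10de:10f8,10de:1ad8,10de:1ad9,1912:0015,8086:1528
options vfio_iommu_type1 allow_unsafe_interrupts=1
"""
```

The returned list can be compared by eye against the bracketed IDs in `lspci -nn` for the devices you mean to pass through.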
Here are the parts of “GRUB_CMDLINE_LINUX_DEFAULT” that matter for you (/etc/default/grub): amd_iommu=on iommu=pt video=efifb:off pcie_aspm=off pci-stub.ids=144d:a808,144d:a801
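Make sure to run `sudo mkinitcpio -p linux54` (or linuxXX for whatever version you’re using) after you edit vfio.conf, and regenerate your GRUB config (for example with `sudo grub-mkconfig -o /boot/grub/grub.cfg`) after you edit /etc/default/grub, then reboot.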
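After the reboot it is worth checking that the parameters really reached the kernel, since a forgotten config regeneration fails silently. A minimal Python sketch, assuming the four generic parameters from the post above are the ones you care about (the pci-stub IDs are specific to that machine and left out); the helper names are mine.

```python
# Parameters from the post above that apply generally.
# ASSUMPTION: adjust this list to match your own /etc/default/grub.
EXPECTED = ["amd_iommu=on", "iommu=pt", "video=efifb:off", "pcie_aspm=off"]


def missing_params(cmdline, expected):
    """Return the expected parameters that are absent from the command line."""
    present = set(cmdline.split())
    return [param for param in expected if param not in present]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read()
```

On the host, `missing_params(read_cmdline(), EXPECTED)` returns an empty list when everything made it through; anything it does return points to a GRUB config that was not regenerated or was edited in the wrong place.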