Linux USB3.0 card with IOMMU ON = unstable

Hello, I built myself a Linux server that I use primarily for VMs. It runs CentOS 7 with the latest mainline kernel. I pass through various PCI-E cards (SCSI, SAS RAID, SAS IT mode, Quadro GPU) and they work absolutely fine in all the Windows guest OSes that I set up with KVM/QEMU.

I also wanted to pass through a USB 3.0 PCI-E card, because I want its native performance in case I plug in high-speed USB devices.

A VIA-based card works for about 5 minutes and then dies. dmesg shows errors saying that the xHCI controller failed/died and was cleaned up. Sometimes I can do a remove and rescan of the PCI bus and it will come back for a short time.
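For reference, a remove and rescan like the one described above can be done through sysfs by writing 1 to the device's remove node and then to the bus-wide rescan node. A minimal Python sketch, assuming root privileges; the PCI address in the usage comment and the sysfs_root parameter (included only so the function can be pointed at a test directory) are placeholders, not values from this system:

```python
from pathlib import Path


def remove_and_rescan(pci_addr, sysfs_root="/sys"):
    """Detach one PCI device from the kernel, then ask the bus to rescan.

    pci_addr is a full address as printed by `lspci -D`, e.g. "0000:03:00.0"
    (hypothetical value). Needs root when run against the real /sys.
    """
    root = Path(sysfs_root)
    remove_node = root / "bus/pci/devices" / pci_addr / "remove"
    rescan_node = root / "bus/pci/rescan"

    # Writing "1" to remove makes the kernel drop the device and unbind its driver.
    remove_node.write_text("1")
    # Writing "1" to rescan makes the kernel re-enumerate the PCI bus.
    rescan_node.write_text("1")


# Example (as root): remove_and_rescan("0000:03:00.0")
```

This only automates the workaround; it does not address whatever makes the controller die in the first place.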

A Renesas-based card seemed to work for longer, but when it dies it can cause the entire PC to freeze up.

As a last resort I bought an ASMedia card to try, but again, after a while it dies (thankfully without locking up the PC). I noticed with this card that I get DMAR-related errors and eventually “Kernel Message: IRQ X disabled”. The IRQs in question always point to the USB 3.0 card.

I’ve read lots of threads from people having issues with xHCI (USB 3.0) on Linux, and the majority of solutions were to set “iommu=soft” as a kernel parameter. As I understand it, this causes Linux to emulate the IOMMU in software? I did try this and it seems to have fixed the problem, but it’s not really viable for me because it breaks the ability to pass PCI devices through to KVM. When I try to pass through with iommu=soft, KVM just says “system doesn’t have pass through support”.
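When comparing boots with and without iommu=soft, it helps to confirm which IOMMU-related options the running kernel actually received. A small Python sketch that filters /proc/cmdline for them; the function names are made up for illustration, and the prefix list (iommu=, intel_iommu=, amd_iommu=) is an assumption about which options matter here, since the thread does not say which platform options are set:

```python
# Kernel command-line options that influence IOMMU behaviour (assumed list).
IOMMU_PREFIXES = ("iommu=", "intel_iommu=", "amd_iommu=")


def iommu_params(cmdline):
    """Return the IOMMU-related tokens found in a kernel command line."""
    return [tok for tok in cmdline.split() if tok.startswith(IOMMU_PREFIXES)]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()


# Example: print(iommu_params(read_cmdline()))
```

The sketch only reports what is set; it does not say which combination is right for this board.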

Can anyone shed light on this? If you need more information, just let me know what I should provide.

Thanks,
James.

VIA chips simply DON’T WORK when IOMMU is on. You should disable that in BIOS.

The best luck seems to come from Fresco Logic chipsets. But be aware that those controllers have a reset bug.

Are you using the ACS patch? I was able to pass through an onboard USB controller. Maybe you could do that and leave the add-on card for the host?

Shame about the reset bug, otherwise I would have tried a Fresco. I did look around to see if Intel made a PCI-E card, but it seems they don’t exist… I’m running out of USB chipsets to try. I really don’t want to resort to a USB 2 card unless I have to.

I thought about seeing if there was a Thunderbolt card I could buy and then using an adapter to get USB 3.0 through it. I’m not 100% sure this is doable, but it’s an option I suppose. That’s if a Thunderbolt card would pass through without any problems…

I’m not using any ACS patch. It’s a Supermicro dual-socket board, but the IOMMU grouping has never been a problem. Everything I pass through is in its own group, with nothing else sharing it.
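One way to double-check the grouping is to walk /sys/kernel/iommu_groups, where each numbered directory holds a devices subdirectory with one entry per PCI address. A minimal Python sketch, assuming the IOMMU is enabled so that directory exists; the function name and the root parameter are illustrative only:

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        # No IOMMU groups exposed (IOMMU off or unsupported).
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(d.name for d in devices_dir.iterdir())
    return groups


# Example: list groups that contain more than one device.
# for number, devices in sorted(iommu_groups().items()):
#     if len(devices) > 1:
#         print(number, devices)
```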

Just a random question here: do you have any ACPI features enabled in the BIOS? Are you seeing any ACPI or power-state messages in dmesg? In the 4.16 kernel, there was work on power saving and turning off “idle” USB chipsets to allow systems to stay in a lower C-state. Maybe what you are experiencing is the USB chipset going to sleep and not being woken up properly?
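One way to test that theory is to look at the controller's runtime power management nodes in sysfs. Under the device's power directory, control reads "auto" when the kernel may suspend the device at runtime and "on" when it is kept powered, and runtime_status reports the current state. A rough Python sketch; the function name, the PCI address in the usage comment, and the sysfs_root parameter are placeholders:

```python
from pathlib import Path


def runtime_pm_state(pci_addr, sysfs_root="/sys"):
    """Read the runtime power management settings of one PCI device.

    Returns a dict with "control" and "runtime_status"; a value is None
    when the corresponding sysfs node does not exist.
    """
    power_dir = Path(sysfs_root) / "bus/pci/devices" / pci_addr / "power"
    state = {}
    for name in ("control", "runtime_status"):
        node = power_dir / name
        state[name] = node.read_text().strip() if node.exists() else None
    return state


# Example: print(runtime_pm_state("0000:03:00.0"))
```

If control shows "auto", writing "on" to that node as root keeps the device powered, which would be a cheap way to rule runtime suspend in or out.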

More power-state changes are being worked on in 4.17 and they are really aggressive.

Random is good; anything that gives another line of investigation is helpful. I couldn’t see anything in the BIOS for ACPI/power-related settings for USB, although even if there were, I’m not sure they would apply to an add-on card. I did try toggling PCI-E ASPM (a PCI-E low-power state option), but it made no difference whether it was on or off.

Sorry then. I tried.

I don’t know if this is actually fixed yet, but there has been some progress. One of the other issues I had with the USB 3.0 cards was that they would instantly die when I plugged in a 2.5 inch USB disk. Low-power items would work but, like I said before, the cards would always die after a while, sometimes after just a few minutes.

Looking at lspci again today, I noticed that the USB card is sharing an IRQ with several other devices. One of them is a very, very old Adaptec Ultra SCSI card that is connected via a PCI to PCI-E adapter. I seem to recall that the Adaptec card didn’t like IRQ masking (or there was some error about INTx masking in KVM when I was trying to get it to pass through a few months back). I pass this card through sometimes to use my ancient Epson Perfection scanner! So I did some card swapping a moment ago and managed to get the USB card to use its own IRQ… Now I can plug in 2.5 inch disks and other higher-powered devices without it instantly falling over. It’s too early to tell if the card is actually 100% fixed now, but this seems like progress!
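Besides lspci, shared IRQs also show up in /proc/interrupts, where a shared line lists several handler names separated by commas at the end of the row. A rough Python sketch that picks out those rows; the function name is made up, the parsing is a heuristic that assumes the usual column layout, and the sample in the usage comment is invented rather than taken from this machine:

```python
def shared_irqs(interrupts_text):
    """Return {irq_number: [handler names]} for IRQs with more than one handler."""
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    # The header row has one column per CPU (CPU0, CPU1, ...).
    n_cpus = len(lines[0].split())
    shared = {}
    for line in lines[1:]:
        irq, sep, rest = line.partition(":")
        irq = irq.strip()
        if not sep or not irq.isdigit():
            continue  # skip NMI, LOC, ERR and other non-numeric rows
        # Skip the per-CPU counters; keep everything after them as one string.
        fields = rest.split(None, n_cpus)
        if len(fields) <= n_cpus:
            continue
        parts = [p.strip() for p in fields[n_cpus].split(",")]
        first = parts[0].split()
        if not first:
            continue
        # The first handler name follows the chip and trigger columns.
        names = [first[-1]] + [p for p in parts[1:] if p]
        if len(names) > 1:
            shared[int(irq)] = names
    return shared


# Example:
# with open("/proc/interrupts") as f:
#     print(shared_irqs(f.read()))
```

A row such as "16: ... xhci_hcd, aic7xxx" would be reported as shared, while a row with a single handler would not.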