XHCI power management in Linux SUCKS. /rant

Here’s the beginning of my headache regarding XHCI power management in xhci_hcd using third-party USB 3.0 controllers.

Controllers that rely on standard bus power don’t sleep properly at all. Not unless the attached device is extremely low current or belongs to a specific class of device that plays well, such as a card reader.

High current devices just damn don’t work. If there’s no supplemental power going to the plugs, the broken port sleep/wake functions break even further: the controller thinks 500mA is the limit, the device pulls 600mA, and it just breaks.

Controller cards with supplemental power work great. 5V is directly fed externally so the sleep/wake only has to worry about the data links, not the power management.

Chipset based (B450/X470/Z370) ones likely work far better than external ones and this crapshoot of power management. I’ve not heard of any complaints from people using native Intel or native AMD USB 3.0, so I guess add-in card people are SOL.

I’m sorry, but because the third party XHCI controller market is the minority now, this won’t be fixed. You just have to be EXTREMELY PICKY on controller cards for VFIO.

What I have not tested yet are powered hubs. Looking into getting this one from Orico:

and getting this Apricorn USB 3.0 Y Cable to troubleshoot all these power management issues:

This also wouldn’t be as bad of a problem if I had access to the PCI-E x1 slots below the GPU. Apparently this exists to take care of that, but good luck tracking the OEM for this…

This is a problem with no easy solution.

Do flex PCI plugs to USB exist? Like the riser cables miners use. Just an x1 and a USB end.

Too tall, and the one that’s mass produced is facing the wrong way and the GPU obstructs it.

The other thing I was thinking of is that some USB 3.0 controllers are just 4 controllers on a x4 PLX chip. I would like to run a PLX chip that feeds both an Intensity Pro and a Renesas USB 3.0 card, with the hope of IOMMU separation (Intensity Pro on the host, Renesas on the guest). But if that doesn’t work out, I have to get the ultra low profile riser card and put the x1 end under the GPU.

Ooh, Amazon Germany also has this low profile riser:

And so does Micro SATA Cables, but they don’t have the small x1 end:

http://www.microsatacables.com/pcie-1x-to-pcie-16x-card-with-usb-30-cable-pci-e1x-16x-usb

And finally, a reasonable price from a USA seller on eBay:

And the OEM on AliExpress:

https://www.aliexpress.com/item/Mini-PCI-E-PCI-Express-Extension-1X-Riser-Card-Power-USB-30cm-Extender-Cable-4-Pin/32856086189.html

Okay, the 4 port powered hub works as long as nothing flaky needs a power cycle from the root controller, but it still manages to crap out the ENTIRE HUB when a second power cycle is issued for a port/device that went to sleep. If one device on the hub needs to be woken up, IT KILLS THE ENTIRE HUB.

This is why I hate the Fresco Logic early firmware controllers. The FL1009 can go DIAF.

All VIA USB 3.0 controllers can go DIAF too for breaking when the kernel parameter iommu=1 is active. It was never fixed and will never be fixed.
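Before blaming the card, it helps to confirm which IOMMU options the kernel was actually booted with. Here is a minimal Python sketch that picks the IOMMU-related parameters out of a kernel command line. The function name and the sample command line are made up for illustration; on a real system you would feed it the contents of /proc/cmdline.

```python
def iommu_params(cmdline: str) -> dict:
    """Return every kernel parameter whose name mentions 'iommu'.

    `cmdline` is the text of the kernel command line, e.g. the
    contents of /proc/cmdline on a running Linux system.
    """
    params = {}
    for token in cmdline.split():
        # Split "key=value"; a bare flag gets an empty value.
        key, _, value = token.partition("=")
        if "iommu" in key:
            params[key] = value
    return params


# Hypothetical command line, for illustration only.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 iommu=1 amd_iommu=on quiet"
print(iommu_params(sample))
```

If the result is empty, the IOMMU was never requested on the command line, so a card failing there is failing for some other reason.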

This really doesn’t bode well if you have a controller with buggy power management. Because the side effect of that can be passed on down to the downstream hub.

The next thing I’m going to try is the Apricorn cable, with the power only USB plug going to somewhere where there is consistent 5V power. I don’t have high hopes for it though due to this broken power management being an absolute douche.

Edit: I double checked if it was OCP… It wasn’t OCP, cause after a little while, the upstream controller will decide to crap out on a subsequent power cycle. Now you may complain that xhci_hcd has an absolutely terrible power management system.

@wendell

The root cause of flaky USB 3.0 ports is…

  1. Insufficient current

  2. Bad controller/hub firmware for power management

  3. Broken power management in the Linux XHCI drivers

If you have all 3 working against you, god help you…

If the Surface Book is “already flaky enough as it is,” a hub won’t help if the bad USB controller firmware chooses to shut down the entire hub because of broken power management. If that was an Intel PCH USB 3.0 port, then all the more reason to add a mini-PCIE USB 3.0 card with an outside 5V rail.

Seems the Fresco Logic works best in Windows with the official drivers. Problem is, that is the root cause of reset issues with VFIO.

And once again, I tried the VIA controller assuming that if no XHCI kernel module is loaded on the VIA controller, it won’t affect the passthrough…

The VIA controller just completely refuses to work when IOMMU is turned on, no matter if no modules are loaded, cause the actual hardware calls are broken between the kernel and the controller. MANY threads describe people’s trouble with VIA controllers and turning on IOMMU:

https://bugzilla.redhat.com/show_bug.cgi?id=1376455

https://bugzilla.redhat.com/show_bug.cgi?id=1409098

https://ubuntuforums.org/showthread.php?t=2390208

https://bugs.launchpad.net/linuxmint/+bug/1353050

One person disclosed a fix through a firmware update from a Dropbox file, (oh boy…) but that is SPECIFICALLY for the VL805 chipset, not anything older. So if you don’t have that controller, you’re SOL.

I guess if I want to use a Blackmagic card at the same time, I have to use that PCI-E low profile riser to drive my Renesas USB 3.0 card, take the 2 ports, use my front panel USB 3.0 ports, and abandon the Fresco Logic and VIA chipsets for passthrough. The card will be protected with electrical tape and will rest on top of my bottom GPU.

So for s***s and giggles, I went ahead and used a MacBook Air for my Avermedia capture device. I thought “That can’t possibly support a high amperage device, can it?”

IT CAN.

And it dawned on me that USB 3.0 traces going directly to the Intel or AMD based USB 3.0 controller just have zero issues. The Surface Book takes 1 single USB 3.0 port from the primary screen and tries to deliver all the USB ports off of a Genesys hub. In theory, if the upstream controller is Intel, no problem, but the Genesys firmware is where things are going terribly wrong, in addition to the issues with power delivery. You need a direct pinout from the docking interface to bypass the weak hub entirely and just go to a singular USB 3.0 port, then do a powered hub from there. Using bus power to drive a hub and a few ports on a quick disconnect is the source of all the power issues, because upstream it’s only delivering 500mA TOTAL. Split across two ports, that means each port gets 250mA. Not to mention this Genesys internal hub adds incompatibilities.
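To put numbers on the bus-powered hub problem, here is a rough Python sketch of the arithmetic. It assumes the hub splits the upstream current evenly across its ports, which is a simplification and not something any hub is guaranteed to do. The function names and the per-device draws are hypothetical.

```python
def per_port_budget_ma(upstream_ma: float, ports: int) -> float:
    """Current available per downstream port, assuming an even split."""
    if ports < 1:
        raise ValueError("a hub needs at least one downstream port")
    return upstream_ma / ports


def starved_devices(budget_ma: float, draws_ma: dict) -> list:
    """Names of devices that want more current than one port can give."""
    return [name for name, draw in draws_ma.items() if draw > budget_ma]


# 500mA upstream shared by two ports, as in the post above.
budget = per_port_budget_ma(500, 2)

# Hypothetical device draws, for illustration only.
draws = {"capture device": 600, "card reader": 100}
print(budget, starved_devices(budget, draws))
```

Anything that lands in the starved list is a candidate for a powered hub or an externally powered cable rather than bus power.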

Sorry if this is bugging you @wendell but this has been bugging me about why an Intel chipset would have so many problems… Turns out, it’s a bus powered hub with a single upstream port and very little available upstream bus power. And bad hub firmware amplifies the problem.

So, I’ve run into a problem… I know USB 3.0 Male to internal USB 3.0 19 pin Male cables exist, but finding one that isn’t gonna take 2-3 months to ship by boat is near impossible.

Cooler Master apparently makes this cable:

http://www.coolermaster.com/cooling/cooling-accessories/internal2external-usb3-adapter/

But I don’t see anywhere else to buy it other than their EU store. Not a single seller in North America if you search its model number: RA-USB-3002-IN

The reason I need this is because I’m plugging my front panel USB 3.0 into the Renesas card which will be lying on top of my secondary GPU, with a x1 riser going to a place underneath the STRIX GPU in my system.

I guarantee you that there is no place that sells this cable in North America that is reasonable with its shipping to Canada.

Edit: The UK huh? 4 weeks by Atlantic boat is better than 2 months by Pacific boat:

All,

I can confirm that this riser allows you to plug a card into a slot underneath a GPU. I will be using it soon to do some additional VFIO related testing to free up some slots.

With AM4 GPU passthrough systems however, you can’t get a separate grouping on the 1x slots (without ACS patching), but using a riser like this you can move your host GPU away to make room for 2 GPUs or plug in any USB card you want.
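To see how a given slot is actually grouped, you can walk the standard sysfs tree at /sys/kernel/iommu_groups. Below is a minimal Python sketch, with function names that are made up for illustration. It only reads the group layout and changes nothing.

```python
from pathlib import Path


def iommu_groups(root: str = "/sys/kernel/iommu_groups") -> dict:
    """Map each IOMMU group name to the sorted PCI addresses inside it."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        # IOMMU disabled, or not running on Linux.
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if devices_dir.is_dir():
            groups[group_dir.name] = sorted(p.name for p in devices_dir.iterdir())
    return groups


def shares_group(groups: dict, addr_a: str, addr_b: str) -> bool:
    """True if both PCI addresses sit in the same IOMMU group."""
    return any(addr_a in devs and addr_b in devs for devs in groups.values())


if __name__ == "__main__":
    for name, devs in sorted(iommu_groups().items()):
        print(f"group {name}: {' '.join(devs)}")
```

If the USB card shares a group with something the host needs, passing it through on its own won’t work without moving it to another slot or patching.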

I have one of these USB 3.1 cards (the only easy way that I currently know of to add type C USB 3.1 ports if your mobo doesn’t have those special headers).

https://www.silverstonetek.com/product.php?pid=712&area=en


https://www.amazon.com/gp/product/B074TSPW95/ (plug in cable with type C)

I got this a few months ago and verified that it does support the reset command, but I didn’t use it for VFIO previously due to the 1x slot IOMMU grouping, and because my mobo’s USB 3 4x port passthrough was working 100% perfectly.
However I can now do some testing on this card for you guys in addition to more testing against the X370 USB controller.

I also have the Avermedia U3 to test USB with (requires huge bandwidth to work correctly, is not compatible with many early USB 3 chipsets, and currently doesn’t work in Linux).

So is there a good summary of the issues that people are having with various USB controllers that we can test?

  • Start/Shutdown of VM doesn’t work reliably
  • Shutdown VM, sleep host, then restart VM test?
  • High bandwidth card testing such as Avermedia U3
  • USB2/3 self/host powered hub testing

I’m trying to test that at this moment.

The problem with that adapter is that the ribbon cable carries the power, and the card can become unreliable if enough power can’t be delivered. The one I linked supplements the power with a floppy power connector, and uses a non-standard USB 3.0 cable: Type-A on one end, and on the other a plug shaped like USB 2.0 Mini-B but carrying the 3.0 pins. That ensures only data is carried over the cable, and the power is dealt with by the board at the other end with the x1 slot.

I’ll give you a Gist:

FL1009 = BAD. Says it supports reset but doesn’t actually.
FL1100 = Early revisions of the firmware suffered, but cards with the latest firmware apparently work fine. Take that with a grain of salt though. When buying a card with this controller, you’re playing a lottery with firmware revision, with some working great, and some being terrible.
VIA VL8XX = AVOID LIKE THE PLAGUE. Fails as soon as iommu=1
Etron = Multiple incompatibility issues with video capture devices over UVC. Not worth the trouble.
Renesas uPD720201 = Contains issues detailed on the VFIO subreddit.
Renesas uPD720202 = If the card has supplemental power input (SATA or Molex) USE IT. With that external power this is the golden chip for VFIO.
Asmedia = VFIO subreddit has been getting a mixed bag of results. Some report reset hangs.

Native Intel USB 3 = No issues natively, not sure about passthrough. One report details no reset capabilities.
Native AMD USB 3 from the CPU = Doesn’t work with VR headsets from one report, likely amperage/firmware/driver issue.

Hub testing:

If your upstream controller has broken power management, it can completely crash the hub firmware and drop every downstream USB 3.0 connection. USB 2.0 seems to handle power management better on a powered hub if you limit everything to USB 2.0. The key thing for USB 3.0 is your upstream controller has to be rock solid. (AKA just use the Renesas uPD720202)

Side note:

The ExtremeCap U3 isn’t supported in Linux, but the ExtremeCap UVC is supported in Linux, even over USB 2.0.

How much money do you think you’re going to save annually on USB power management?

It’s less about power use than “IT JUST HAS TO DAMN WORK”

The point of this thread is that the broken power management at firmware level on some controllers and at XHCI kernel driver level on the Linux side makes things NOT WORK. I’m not concerned about energy bills so much as “Are you going to crap out on me when I cycle the device? Yes? You’re crap, FL1009.”
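Nobody in this thread has tested it, but one thing worth ruling out on the kernel side is the per-device runtime power management setting. Assuming the standard sysfs layout, each USB device has a power/control file under /sys/bus/usb/devices, where "auto" lets the kernel suspend the device and "on" keeps it awake. A rough Python sketch to list those settings follows. The function name is made up for illustration, and it only reads the files.

```python
from pathlib import Path


def usb_power_control(root: str = "/sys/bus/usb/devices") -> dict:
    """Map each USB sysfs entry to the contents of its power/control file."""
    settings = {}
    base = Path(root)
    if not base.is_dir():
        # No USB sysfs tree here (not Linux, or sysfs not mounted).
        return settings
    for dev in sorted(base.iterdir()):
        control = dev / "power" / "control"
        try:
            settings[dev.name] = control.read_text().strip()
        except OSError:
            # Entry has no readable power/control file; skip it.
            continue
    return settings


if __name__ == "__main__":
    for name, mode in usb_power_control().items():
        print(f"{name}: {mode}")
```

Whether forcing a device to "on" (which needs root) changes anything for the controllers discussed here is an open question, not a confirmed fix.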

Ladies and gentlemen… The solution… IS THIS CABLE: (For USB 3.0 devices)

By plugging the power-only end in first, the device gets consistent power, and then plugging in the data end means the device can properly power cycle, even on a flaky controller! My Avermedia ExtremeCap UVC now properly power cycles on a “flaky” controller.

So the solution wasn’t a hub, it was to brute force 5V to be constantly fed to the device.

A Hub will always depend on the reliability of the upstream controller’s power management. If the upstream controller has bad power management, it kills the whole downstream hub. But brute forcing the destination device connected directly without a hub with additional 5V power supply is actually the way to go.

Be aware though that if the upstream USB 3.0 controller is flaky with USB 2.0 power management, nothing will save it for USB 2.0 devices. This solution is to fix flaky USB 3.0 in Linux and with potentially bad USB 3.0 controller firmware.

Video about this and the Avermedia ExtremeCap UVC Soon™.

(FWIW: I have a B450-I mobo but no processor, RAM or GPU. I could test AM4 USB 3.0 but I can’t at this current point. That will have to be a later video.)

I am reviving this topic, as I am facing the same issues with flaky USB3 on different Linux distros, and I wanted to point to a few different threads (sorry, but I cannot post URLs for some reason; the threads can be found by searching for their titles on Google).

There is a discussion on the Arch Linux forums:
[xHCI host controller not responding, assume dead]

And here is a patch on the linux-usb list which supposedly has a fix, but it looks like the patch has not been applied yet? (Not sure what the problem is, can someone check please?)

[TESTPATCH] xhci: Fix perceived dead host due to runtime suspend race with event handler
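To check whether a box is hitting the same failure as that Arch thread, you can search the kernel log for the message in the thread title. A minimal Python sketch, assuming you have saved the `dmesg` output as text first. The function name and the sample log lines are made up for illustration.

```python
DEAD_HOST_MARKER = "xHCI host controller not responding, assume dead"


def dead_host_lines(log_lines) -> list:
    """Return the log lines that report a dead xHCI host controller."""
    return [line.rstrip("\n") for line in log_lines if DEAD_HOST_MARKER in line]


# Hypothetical log excerpt, for illustration only.
sample_log = [
    "[  100.000000] usb 2-1: new SuperSpeed USB device number 3 using xhci_hcd\n",
    "[  250.000000] xhci_hcd 0000:03:00.0: xHCI host controller not responding, assume dead\n",
    "[  250.000100] usb 2-1: USB disconnect, device number 3\n",
]
for hit in dead_host_lines(sample_log):
    print(hit)
```

The PCI address in a matching line tells you which controller gave up, which you can then compare against the card you are passing through.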

Would love to get this running.

I literally just tried a UVC device on a USB 3.1 Gen 2 port and the EXACT same power issues cropped up on a B450 system… So it’s not solved.
