Anyone looked at TRX40 IOMMU Groups and Passthrough yet?

Fantastic! Has anyone happened to notice whether it’s possible to enable it on the Gigabyte TRX40 Extreme motherboard? I’ve read some reports on this site that mention it might not be a setting on some motherboards. I’m hoping it is, since they are using server-grade Intel X550 10Gb NICs.

I have three different X399 and TRX40 mobos, and they all have it. It’s difficult to find a decent high-end mobo (all TRX40 boards are high-end) that does not support SR-IOV.

From the manual for the motherboard, available on the manufacturer’s site: note the SVM and IOMMU options. Set SVM to Enabled and IOMMU to Enabled (not Auto).

Some cards have their own BIOS and an enable switch for SR-IOV. I know of at least one Emulex NIC like that.

That is for the card, though; it would still need M/B support. (Just saying: I’m not worried that the mobo wouldn’t have it, but don’t confuse the card having the option with not needing it in the M/B BIOS.)

@DJ_Datte

I am also planning to build a two-gamer PC with more or less the same hardware.

Could you share some details about the system you have?
What motherboard do you have?

Since there is so little known about system topology of the TR3000, I would be very happy if you can share the following:

sudo lspci -vvv
lsusb
lsusb -t
dmesg

and the IOMMU groups.
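For the IOMMU groups, a minimal sketch like the one below should be enough. It assumes a Linux host that exposes groups under `/sys/kernel/iommu_groups` (the usual sysfs location); the function name `iommu_groups` is just an illustrative helper, not something from an existing tool.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group number: [PCI addresses]} as read from sysfs.

    An empty dict means the directory is missing, which usually
    indicates the IOMMU is disabled or this is not a Linux host.
    """
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            # Each entry in devices/ is named after a PCI address.
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    # Sort numerically so group 10 comes after group 2.
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    for number, devices in iommu_groups().items():
        print(f"IOMMU group {number}: {' '.join(devices)}")
```

Devices that share a group generally have to be passed through together, so the per-group listing is the part that matters.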

BTW, I think the core topology of the 3960X should still be 4 chiplets, each with 2 CCXs, and each CCX should have 3 cores.

Best regards,
Maxim Levitsky

Hi, absolutely!

After some testing, I am running the CPU in NPS2 mode (two NUMA nodes) because it offers slightly better latencies (a 14% improvement) and memory separation (but there is no cost in crossing NUMA nodes, so you are not creating a problem by doing this). Keep this in mind, as it means some devices will list their NUMA nodes.
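To see which NUMA node each PCI device reports in NPS2 mode, a rough sketch along these lines could help. It assumes the standard sysfs attribute `/sys/bus/pci/devices/<address>/numa_node` (which reads -1 when the platform reports no affinity); `pci_numa_nodes` is a made-up helper name.

```python
from pathlib import Path


def pci_numa_nodes(root="/sys/bus/pci/devices"):
    """Return {PCI address: NUMA node} for every device that exposes one."""
    nodes = {}
    base = Path(root)
    if not base.is_dir():
        return nodes
    for dev in sorted(base.iterdir()):
        try:
            nodes[dev.name] = int((dev / "numa_node").read_text().strip())
        except (OSError, ValueError):
            # Attribute missing or unreadable: skip this device.
            continue
    return nodes


if __name__ == "__main__":
    for address, node in pci_numa_nodes().items():
        print(f"{address}: NUMA node {node}")
```

This is handy when deciding which host cores and memory to pin a guest to, so that the passed-through GPU sits on the same node.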

M/B Zenith Extreme II /w GSkill Fury 3200C
Comment: GSkill FlareX 3200C @ 3400 (TridentZ Neo 3600C16-16 were incompatible; I have ordered TZ Neo 3800C14, which are on the Asus M/B compatibility list).

VFIO USB Info:

Motherboard has a total of 5 PCIe USB controllers
2x Starship controllers (IOd controllers)
2x Matisse controllers (Chipset controllers)
1x Asmedia controller

Passing through the Matisse controllers (after ACS) works perfectly. No errors and you can just plug / unplug things, and one of them is connected to half of the backplate usb I/O.

Passing through the Starship controllers does not work; doing so freezes the system shortly afterwards.

[ 496.514136] vfio-pci 0000:03:00.3: not ready 1023ms after FLR; waiting
[ 498.562137] vfio-pci 0000:03:00.3: not ready 2047ms after FLR; waiting

The VM never inits, it goes into “pause” (or never leaves pause more likely).
About a minute later, the host freezes.
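If you want to pull the affected devices out of a long dmesg, here is a small sketch. It only assumes the message format shown in the two log lines above; `flr_stalls` is a hypothetical helper name.

```python
import re

# Matches e.g. "vfio-pci 0000:03:00.3: not ready 1023ms after FLR; waiting"
FLR_WAIT = re.compile(
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"not ready (?P<ms>\d+)ms after FLR"
)


def flr_stalls(log_text):
    """Return {PCI address: longest reported wait in ms} for FLR stalls."""
    worst = {}
    for match in FLR_WAIT.finditer(log_text):
        address = match["bdf"]
        waited = int(match["ms"])
        worst[address] = max(waited, worst.get(address, 0))
    return worst


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python3 this_script.py
    for address, waited in flr_stalls(sys.stdin.read()).items():
        print(f"{address}: still not ready {waited} ms after FLR")
```

Any address that shows up here is a candidate for a device whose function-level reset does not complete.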

Sadly, for some reason, the same thing happens when you use the built-in USB controller on the NVIDIA cards. Shortly after a USB-C to USB3 hub is connected, the VM gets paused, with this in the debug log:

[ 113.986239] pcieport 0000:00:01.1: DPC: containment event, status:0x1f01 source:0x0000
[ 113.986241] pcieport 0000:00:01.1: DPC: unmasked uncorrectable error detected
[ 115.241243] pcieport 0000:00:01.1: Data Link Layer Link Active not set in 1000 msec
[ 115.241244] pcieport 0000:00:01.1: link reset at upstream device 0000:00:01.1 failed
[ 115.241252] pcieport 0000:00:01.1: AER: Device recovery failed
[ 115.241254] pcieport 0000:00:01.1: DPC: containment event, status:0x1f01 source:0x0000
[ 115.241254] pcieport 0000:00:01.1: DPC: unmasked uncorrectable error detected
[ 115.369262] pcieport 0000:00:01.1: AER: Device recovery successful

I agree with that statement, but what is this in response to?

Find the output files in the attachments!

Thanks

dmesg.txt (113.6 KB) iommu_groups.txt (10.3 KB) iommu_groups_acs.txt (10.4 KB) lspci.txt (176.4 KB) lspci_acs.txt (176.6 KB) lsusb.txt (1.8 KB) lsusb_tree.txt (2.8 KB)

Thanks a million!!
About my response on the topology: you mentioned trouble defining the topology for the guest, and I thought you might think that this chip has 6 full CCXes instead.

About the Starship USB controllers, it looks like there is a patch

Maybe something like that can work for NVIDIA USB controller as well.

Speaking of Starship, I am a die hard SpaceX fan…

Wow thanks for linking to that thread, will test it out tomorrow.

Regarding the topology: yep, I am aware it’s 3-core CCXs. The problem is that TOPOEXT always reports 4-core CCXs, even on CPUs with 3-core CCXs (the 6 / 12 / 24 core processors), which means your cache topology in Windows is off and performance breaks. I created a bug for it here, and you can read more details there:

https://bugs.launchpad.net/qemu/+bug/1856335
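To illustrate why a 4-core grouping hurts on a 24-core part, here is a back-of-the-envelope sketch. It assumes cores are numbered sequentially, pinned 1:1 to the guest, and that SMT is ignored; both function names are made up for the example.

```python
def l3_groups(cores, cores_per_ccx):
    """Map each core index to the L3 (CCX) group it falls into."""
    return [core // cores_per_ccx for core in range(cores)]


def misgrouped_l3(cores, real_ccx_size, reported_ccx_size):
    """Return the reported L3 groups that span more than one physical CCX.

    Keys are reported (guest-visible) group numbers, values are the
    physical CCX numbers that the group actually covers.
    """
    real = l3_groups(cores, real_ccx_size)
    reported = l3_groups(cores, reported_ccx_size)
    spans = {}
    for core in range(cores):
        spans.setdefault(reported[core], set()).add(real[core])
    return {group: sorted(ccxs) for group, ccxs in spans.items() if len(ccxs) > 1}


if __name__ == "__main__":
    # 24 cores: physically 8 CCXs of 3 cores, reported as 6 CCXs of 4 cores.
    for group, ccxs in misgrouped_l3(24, 3, 4).items():
        print(f"guest L3 group {group} really spans physical CCXs {ccxs}")
```

Under those assumptions every guest-visible L3 group straddles two physical CCXs, so the guest scheduler treats cores as cache neighbours when they are not.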

I have a small question.
Can you select the boot GPU in the BIOS on this motherboard?

Nope. If you have CSM on, top gpu is boot gpu, if you have CSM off, all gpus are boot gpu (at the same time).

Why do you need that feature, out of curiosity?

Gigabyte boards tend to have this feature. Zenith boards from ASUS obviously don’t have this feature.

Thank you for the information, and I might not need boot gpu selection after all.

My system will have 2 high end GPUs (for dual gaming/whatever system), plus one simple GPU for the host. Due to PCIe layout, to have 16x links on both GPUs, I will have to put the host GPU in between them. Thus host GPU will not be the 1st GPU.

Overall it is healthier for VFIO to keep the UEFI BIOS from touching the passed-through GPUs, to avoid various issues. With NVIDIA this should still work, though.
One of the issues is getting a clean GPU ROM, which is usually tainted by the UEFI BIOS boot.

Plus I’ll have to fiddle with Linux to make it use the middle GPU as the boot GPU, but that is not a problem at that stage (I studied exactly how boot GPU selection on Linux works, so no big deal, but still).

However assuming that I at least will get BIOS output on all GPUs, I might as well still go for this motherboard, and not some Gigabyte motherboard, since IMHO Gigabyte sucks.

If you have any more info about this board, I would be really happy to hear, I think I’ll go with this board after all.

Ask away and I can reply with what you want to know.
Regarding the GPU, you really don’t need a GPU for the host in a VFIO system / a system that you will otherwise access by SSH (to start the VMs, etc.). It’s just a waste of a precious slot / power.

About the extra GPU for the host: I still prefer to have it, since in my experience, especially when debugging things, having a GPU for output is much better than network access. I might buy an M.2-to-serial card, though (I love that this board has an M.2 slot on the back).

Well most of the questions I have would require some effort to be answered, and I don’t want to waste your time, but if you have some more info that you discovered anyway, I would like to hear.

The full set of questions I still have is more or less the following (most of them will be rather hard to answer):

  1. 128GB 3600Mhz, CL16. Is that possible on this board?
    Seems so according to
    https://www.reddit.com/r/Amd/comments/e3g3k5/threadripper_3970x_with_128gb_3600_mhz_cl16_ddr4/

  2. Does passthrough of the ASMedia USB controller work? It does seem to work for some.

  3. Is there a way to make the NVIDIA USB controller passthrough work? The NVIDIA device also has an I2C controller, so maybe you forgot to pass it as well, or maybe all the NVIDIA functions should be passed as one multifunction device, or something else. Maybe these AER errors are bogus and disabling them in the BIOS works? Maybe also blacklist the FLR reset for this controller, as for the Starship USB controllers.

  4. With the FLR blacklist patch, do the Starship controllers work for passthrough?

  5. (About USB topology.) It’s not clear from lsusb which ports are connected to the chipset’s pair of USB controllers.
    My best guess is:

CHIPSET1 (bus 9,10 in lsusb)

USB3:
   |__ Port 1: PROBABLY REMOVABLE PORT, PROBABLY BACK
   |__ Port 2: ?????
   |__ Port 3: ?????
   |__ Port 4: PROBABLY REMOVABLE PORT, PROBABLY BACK
       ^^^ according to your lsusb and another lsusb I have seen,
           ports 1,4 are for the back ports and ports 2,3 are probably for
           the internal USB Type-C headers. Do you use them?

USB2:
   |__ Port 5: Dev 3, If 0, Class=Hub, Driver=hub/4p, 480M (Genesys Logic, Inc. 4-port hub)
   |   |__ Port 1: Dev 4, If 0, Class=Human Interface Device, Driver=usbhid, 12M (Corsair)
   |   |__ Port 2: ?????
   |   |__ Port 3: Dev 5, If 2, Class=Human Interface Device, Driver=usbhid, 12M (ASUSTek Computer, Inc)
   |   |__ Port 4: Dev 6, If 0, Class=Human Interface Device, Driver=usbhid, 12M (ASUSTek Computer, Inc)
   |   ^^^ this is the onboard USB2 hub. I wonder if the above devices are plugged into the onboard headers
           or whether they are built in as well


   |__ Port 6: Dev 5, If 0, Class=Wireless, Driver=btusb, 12M
       ^^^ this is bluetooth portion of the wifi device

CHIPSET 2 (bus 11,12 in lsusb)

    USB3:
    |__ Port 1: (BACK,TYPEC)
    |   ^^^ guess, based on usb ID of VIA USB hub you have plugged in here
    |__ Port 2: ?????
    |   ^^^ guess: probably not connected
    |
    |__ Port 3: ASM1074 SuperSpeed hub
    |   |__ Port 1: (BACK,BLUE)
    |   |__ Port 2: (BACK,BLUE)
    |   |__ Port 3: (BACK,BLUE)
    |   |__ Port 4: (BACK,BLUE)
        ^^^ this is for sure, because it fits the mobo spec
    |
    |__ Port 4: ASM1074 SuperSpeed hub
    |   |__ Port 1: (INTERNAL USB 3.0 HEADER)
    |   |__ Port 2: (INTERNAL USB 3.0 HEADER)
    |   |__ Port 3: (INTERNAL USB 3.0 HEADER)
    |   |__ Port 4: (INTERNAL USB 3.0 HEADER)
        ^^^ this is for sure, because it fits the mobo spec
    |
    USB2:
    |__ Port 5: ASUSTek Computer, Inc. USB Audio
    |__ Port 6: ASUSTek Computer, Inc. USB Audio
        ^^^ these two are the onboard audio
  6. I am curious where the 8X PCIe ports end up, but that is really not important right now, and I can find out when I buy the board.

  7. Does suspend/resume work on this board?

Best regards,
Maxim Levitsky

I don’t have the board you are asking about, but I can answer two questions:

2. Not tested, but on my board it is in a separate IOMMU group. Maybe it needs the FLR patch, but the only way to find out is to try it.

4. Yes, the FLR patch does work. It even works for the audio controllers, which are disguised as Starship USB controllers, due to how audio is configured on TRX40.

I watch tech news every day and I haven’t seen anything like that. From everything I’ve seen, Zen3 is not a completely new architecture that requires a totally different socket. In fact it’s supposed to be the last release for the AM4 socket, and I’d bet, without seeing anything to the contrary, that the Zen3 Threadripper will ALSO be compatible with TRX40.

What AMD has said, though, is that Zen3 could be considered something like a new architecture because of the improvements they’ve made. But Zen3 comes out at the end of 2020, and for Ryzen it will be the last one on the AM4 socket. Zen4 will have PCIe5 and DDR5 and will of course require new sockets. Since I don’t know for a FACT what will happen with Threadripper and TRX40, I can’t say anything definitive, but if AMD has released any definitive news, could you provide a link to it? It’s contrary to what I understand. Given the expense that went into making these new sockets and the cost of these boards, which are a bit better than the older-gen boards, I find it hard to believe that they would make them for only one generation. From everything I’ve seen on tech news so far, the changes don’t require new MBs, since X570 and TRX40 both have an improved architecture to deal with these faster CPUs. I’m pretty sure a new MB will still come out with Zen3, but that doesn’t mean X570 and TRX40 won’t be capable of running Zen3 CPUs.

There are some boards that have 5 PCIe slots, with one being an X1. But, wouldn’t a USB card be X1, or at least allow running at X1?

Also, some boards have 10Gbps LAN, so that would take care of the last bit.

From what I see, the boards do X16 to two slots and X8 to two others; I didn’t bother looking to see what goes through the chipset and which are direct to the CPU. I think, though, that the chipset-to-CPU link is PCIe4 X8, so it gives plenty of bandwidth for whatever does go through the chipset.

Maybe one day, instead of needing a USB card, there will be breakout boxes that you can connect to one of the ports in the back and that give you multiple ports. That would eliminate the need for a USB card. :)

Isn’t that SAS controller PCIe2.0? When I look it up it’s PCIe2.0 X8, which means 4GB/s max. That’s like one high-quality NVMe drive. I do understand, though, that that’s plenty of bandwidth for dealing with RAIDed mechanical drives. The problem is that if your system is doing a few things at once, the transmission time (meaning a single data transmission) is slower than on a gen3 device and substantially slower than on a gen4 device. If you’re doing a lot of data transfers with that SAS controller, at some point that slower transmission time will create latency for the system, so you will probably want to bump it up to PCIe3 eventually. But this is workload dependent, and only you know exactly how your system is being used.
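For reference, the 4GB/s figure falls out of the per-lane numbers. The sketch below uses the published transfer rates and line encodings for PCIe 2.0 through 4.0 and ignores protocol overhead, so real-world throughput will be somewhat lower; the function name is just for illustration.

```python
# (transfer rate in GT/s per lane, line-encoding efficiency) per PCIe generation
PCIE_GENERATIONS = {
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
    4: (16.0, 128 / 130),  # 128b/130b encoding
}


def pcie_bandwidth_gbytes(gen, lanes):
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    rate_gt, efficiency = PCIE_GENERATIONS[gen]
    # GT/s * efficiency = usable Gbit/s per lane; divide by 8 for bytes.
    return rate_gt * efficiency * lanes / 8


if __name__ == "__main__":
    for gen in (2, 3, 4):
        print(f"PCIe {gen}.0 x8: {pcie_bandwidth_gbytes(gen, 8):.2f} GB/s")
```

The same arithmetic applies to the PCIe4 X8 chipset link mentioned earlier, which works out to roughly four times the gen2 X8 figure.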

Could you explain more about the audio on TRX40? Which TRX40 board do you have?