Anyone looked at TRX40 IOMMU Groups and Passthrough yet?

No worries. It can be frustrating when behavior changes from board to board, with some giving issues and others not, on top of AGESA or chipset differences.

Curious what your hand-built custom case looks like, @snowmirage.

I’m an Intel guy making the switch to AMD (since they are killing it right now), but there are some features I’m used to seeing, and I don’t know whether there is an equivalent. I do a lot of virtualization (mostly VMware) and rely heavily on SR-IOV for networking. Does AMD have a feature like this? I can’t find any mention of it in any documentation.

I have an entire build sitting here in my office… just missing the most important piece… waiting on the CPU to finally come back in stock… fingers crossed it doesn’t take too much longer!

Thanks!

SR-IOV exists on both Ryzen and Threadripper and is called exactly that.

Fantastic! Has anyone happened to notice whether it’s possible to enable it on the Gigabyte TRX40 Extreme motherboard? I’ve read some reports on this site that mention it might not be a setting on some motherboards. I’m hoping it is, since they are using server-grade X550 10Gb Intel NICs.

I have three different X399 and TRX40 mobos, and they all have it. It’s difficult to find a decent high-end mobo (all TRX40 are high-end) that does not support SR-IOV.

From the manual for the Motherboard available on the manufacturer’s site. Note SVM and IOMMU options. Set SVM to enabled and IOMMU to Enabled (not auto).

Some cards have their own BIOS and an enable switch for SR-IOV; I know of at least one Emulex NIC like that.

That is for the card though, it would still need M/B support. (Just saying, not worried that the mobo wouldn’t have it, but, don’t confuse the fact the card has the option with not needing it in the M/B bios).
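Once both the board and the card have it enabled, a Linux host exposes the capability per device through the `sriov_totalvfs` attribute in sysfs. A minimal sketch for listing such devices, assuming a Linux host; the function name `sriov_capable_devices` and its `root` parameter are made up for illustration:

```python
import os


def sriov_capable_devices(root="/sys/bus/pci/devices"):
    """Return {pci_address: max_virtual_functions} for devices exposing SR-IOV."""
    result = {}
    if not os.path.isdir(root):
        return result  # not a Linux host, or sysfs is not mounted
    for addr in sorted(os.listdir(root)):
        path = os.path.join(root, addr, "sriov_totalvfs")
        try:
            with open(path) as handle:
                result[addr] = int(handle.read().strip())
        except (OSError, ValueError):
            continue  # attribute absent or unreadable: skip this device
    return result


if __name__ == "__main__":
    for addr, total in sriov_capable_devices().items():
        print(f"{addr}: up to {total} virtual functions")
```

An empty result does not say which side is missing the support (board firmware, card firmware, or driver); it only shows that the kernel sees no SR-IOV capable device.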

@DJ_Datte

I am also planning to build a two-gamer PC, with more or less the same hardware.

Could you share some details about the system you have?
What motherboard do you have?

Since so little is known about the system topology of the TR3000, I would be very happy if you could share the following:

sudo lspci -vvv
lsusb
lsusb -t
dmesg

and the IOMMU groups.
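For the IOMMU groups, a minimal sketch that walks `/sys/kernel/iommu_groups`, the sysfs location the Linux kernel uses when the IOMMU is enabled. The function name `list_iommu_groups` and its `root` parameter are made up for illustration:

```python
import os


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [pci_address, ...]} from a sysfs-style tree."""
    groups = {}
    if not os.path.isdir(root):
        return groups  # IOMMU disabled, or not a Linux host
    for name in os.listdir(root):
        devices_dir = os.path.join(root, name, "devices")
        if name.isdigit() and os.path.isdir(devices_dir):
            groups[int(name)] = sorted(os.listdir(devices_dir))
    return groups


if __name__ == "__main__":
    for group, devices in sorted(list_iommu_groups().items()):
        print(f"IOMMU group {group}: {' '.join(devices)}")
```

Redirecting the output to a text file gives something that can be attached next to the `lspci` and `lsusb` dumps.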

btw, I think that the core topology of the 3960X should still have 4 chiplets, each with 2 CCXs, and each CCX should have 3 cores.

Best regards,
Maxim Levitsky

Hi, absolutely!

After some testing, I am running the CPU in NPS2 mode (two NUMA nodes) because it offers slightly better latencies (a 14% improvement) and memory separation. You don’t pay a cost for crossing NUMA nodes, so doing this does not create a problem. Keep this in mind, as it means some devices will list their NUMA node.

M/B Zenith Extreme II /w GSkill Fury 3200C
Comment: GSkill FlareX 3200C @ 3400 (TridentZ Neo 3600C16 -16 were incompat., have ordered TZ Neo 3800C14 which are on Asus M/B compatible list).

VFIO USB Info:

Motherboard has a total of 5 PCIe USB controllers
2x Starship controllers (IOd controllers)
2x Matisse controllers (Chipset controllers)
1x Asmedia controller

Passing through the Matisse controllers (after ACS) works perfectly. No errors and you can just plug / unplug things, and one of them is connected to half of the backplate usb I/O.

Passing through the Starship controllers does not work: the system freezes shortly after the VM is started.

[ 496.514136] vfio-pci 0000:03:00.3: not ready 1023ms after FLR; waiting
[ 498.562137] vfio-pci 0000:03:00.3: not ready 2047ms after FLR; waiting

The VM never inits, it goes into “pause” (or never leaves pause more likely).
About a minute later, the host freezes.

Sadly, for some reason, the same thing happens when you use the built-in USB controller on the NVIDIA cards. Shortly after a USB-C to USB3 hub is connected, the VM gets paused, with this in the debug log:

[ 113.986239] pcieport 0000:00:01.1: DPC: containment event, status:0x1f01 source:0x0000
[ 113.986241] pcieport 0000:00:01.1: DPC: unmasked uncorrectable error detected
[ 115.241243] pcieport 0000:00:01.1: Data Link Layer Link Active not set in 1000 msec
[ 115.241244] pcieport 0000:00:01.1: link reset at upstream device 0000:00:01.1 failed
[ 115.241252] pcieport 0000:00:01.1: AER: Device recovery failed
[ 115.241254] pcieport 0000:00:01.1: DPC: containment event, status:0x1f01 source:0x0000
[ 115.241254] pcieport 0000:00:01.1: DPC: unmasked uncorrectable error detected
[ 115.369262] pcieport 0000:00:01.1: AER: Device recovery successful
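To pull these events out of a long `dmesg.txt`, a minimal sketch that counts FLR timeouts, DPC containment events and failed AER recoveries per PCI address. The regular expression is written against the log lines quoted in this post; the function name `summarize_pci_errors` and the event labels are made up for illustration:

```python
import re
from collections import Counter

# Matches kernel log lines such as
# "[ 496.514136] vfio-pci 0000:03:00.3: not ready 1023ms after FLR; waiting"
LINE_RE = re.compile(
    r"\]\s+(?P<driver>[\w-]+)\s+"
    r"(?P<addr>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]):\s+"
    r"(?P<msg>.*)$"
)


def summarize_pci_errors(dmesg_text):
    """Count selected error messages per (pci_address, event_label)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue
        addr, msg = match.group("addr"), match.group("msg")
        if "after FLR" in msg:
            counts[(addr, "flr_timeout")] += 1
        elif msg.startswith("DPC: containment event"):
            counts[(addr, "dpc_containment")] += 1
        elif msg.startswith("AER: Device recovery failed"):
            counts[(addr, "aer_recovery_failed")] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "[ 496.514136] vfio-pci 0000:03:00.3: not ready 1023ms after FLR; waiting\n"
        "[ 498.562137] vfio-pci 0000:03:00.3: not ready 2047ms after FLR; waiting\n"
    )
    for (addr, event), count in sorted(summarize_pci_errors(sample).items()):
        print(addr, event, count)
```

Other message types are ignored on purpose, so the summary only helps to see quickly which device is affected; it does not explain why.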

I agree with that statement, but what is this in response to?

Find the output files in the attachments!

Thanks

dmesg.txt (113.6 KB) iommu_groups.txt (10.3 KB) iommu_groups_acs.txt (10.4 KB) lspci.txt (176.4 KB) lspci_acs.txt (176.6 KB) lsusb.txt (1.8 KB) lsusb_tree.txt (2.8 KB)

Thanks a million!!
About my remark on the topology: you mentioned trouble defining the topology for the guest, and I thought that you might think this chip has 6 full CCXs instead.

About the Starship USB controllers, it looks like there is a patch

Maybe something like that can work for NVIDIA USB controller as well.

Speaking of Starship, I am a die hard SpaceX fan…

Wow thanks for linking to that thread, will test it out tomorrow.

Regarding the topology, yep, I am aware it’s 3-core CCXs. The problem is that TOPOEXT always reports 4-core CCXs, even on CPUs with 3-core CCXs (6 / 12 / 24 core processors), which means your cache topology in Windows is off and performance breaks. I created a bug for it here, and you can read more details there:

https://bugs.launchpad.net/qemu/+bug/1856335
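To make the mismatch concrete, here is a small sketch comparing the two groupings for a 24-core part. It assumes cores are numbered consecutively within a CCX, which is a simplification for illustration and not something taken from the bug report; `l3_groups` is a made-up helper name:

```python
def l3_groups(total_cores, cores_per_ccx):
    """Split consecutive core IDs into groups that share one L3 (one CCX)."""
    return [
        list(range(start, min(start + cores_per_ccx, total_cores)))
        for start in range(0, total_cores, cores_per_ccx)
    ]


actual = l3_groups(24, 3)    # 4 chiplets x 2 CCXs x 3 cores = 8 groups
reported = l3_groups(24, 4)  # what a 4-cores-per-CCX description yields: 6 groups

print(len(actual), len(reported))  # 8 6
print(actual[1], reported[1])      # [3, 4, 5] [4, 5, 6, 7]
```

With the 4-core description, core 3 is grouped with cores 0 to 2 even though, under the assumption above, it shares its L3 with cores 4 and 5, so a guest scheduler that trusts the reported groups gets the cache sharing wrong.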

I have a small question.
Can you select the boot GPU in the BIOS on this motherboard?

Nope. If you have CSM on, top gpu is boot gpu, if you have CSM off, all gpus are boot gpu (at the same time).

Why do you need that feature, out of curiosity?

Gigabyte boards tend to have this feature. Zenith boards from ASUS obviously don’t have this feature.

Thank you for the information, and I might not need boot gpu selection after all.

My system will have 2 high end GPUs (for dual gaming/whatever system), plus one simple GPU for the host. Due to PCIe layout, to have 16x links on both GPUs, I will have to put the host GPU in between them. Thus host GPU will not be the 1st GPU.

Overall it is healthier for VFIO if the UEFI BIOS does not touch the passed-through GPUs, which avoids various issues. With NVIDIA this should still work though.
One of the issues is keeping the GPU’s ROM clean, since it is usually tainted by the UEFI BIOS during boot.

Plus I’ll have to fiddle with Linux to make it use the middle GPU as the boot GPU, but that is not a problem at that stage (I studied exactly how boot GPU selection on Linux works, so no big deal, but still).

However assuming that I at least will get BIOS output on all GPUs, I might as well still go for this motherboard, and not some Gigabyte motherboard, since IMHO Gigabyte sucks.

If you have any more info about this board, I would be really happy to hear, I think I’ll go with this board after all.

Ask away and I can reply with what you want to know.
Regarding the GPU, you really don’t need a GPU for the host in a VFIO system, or in any system that you will otherwise access by ssh (to start the VMs, etc.). It’s just a waste of a precious slot and power.

About the extra GPU for the host, I still prefer to have it, since in my experience, especially when debugging things, having a GPU for output is much better than network access. I might buy an M.2 to serial card though (I love that this board has an M.2 slot on the back).

Well most of the questions I have would require some effort to be answered, and I don’t want to waste your time, but if you have some more info that you discovered anyway, I would like to hear.

The full set of questions I still have is more or less the following (most of them will be rather hard to answer):

  1. 128GB at 3600 MHz, CL16. Is that possible on this board?
    Seems so according to
    https://www.reddit.com/r/Amd/comments/e3g3k5/threadripper_3970x_with_128gb_3600_mhz_cl16_ddr4/

  2. Does passthrough of the asmedia usb controller work? It does seem to work for some

  3. Is there a way to make the NVIDIA controller USB passthrough work? The NVIDIA device also has an I2C controller, so maybe you forgot to pass it as well, or maybe all NVIDIA devices should be passed as a multifunction device, or something else. Maybe these AER errors are bogus and disabling them in the BIOS would work? Maybe also blacklist the FLR reset for this controller, as for the Starship USB controllers.

  4. With the FLR blacklist patch, do the Starship controllers work for passthrough?

  5. (About USB topology). It’s not clear from lsusb which ports are connected to the chipset’s pair of USB controllers.
    My best guess is:

CHIPSET1 (bus 9,10 in lsusb)

USB3:
   |__ Port 1: PROBABLY REMOVABLE PORT, PROBABLY BACK
   |__ Port 2: ?????
   |__ Port 3: ?????
   |__ Port 4: PROBABLY REMOVABLE PORT, PROBABLY BACK
       ^^^ according to your lsusb and another lsusb I have seen,
           ports 1,4 are for the back ports and then ports 2,3 are probably for
           the internal USB Type-C headers. Do you use them?

USB2:
   |__ Port 5: Dev 3, If 0, Class=Hub, Driver=hub/4p, 480M (Genesys Logic, Inc. 4-port hub)
   |   |__ Port 1: Dev 4, If 0, Class=Human Interface Device, Driver=usbhid, 12M (Corsair)
   |   |__ Port 2: ?????
   |   |__ Port 3: Dev 5, If 2, Class=Human Interface Device, Driver=usbhid, 12M (ASUSTek Computer, Inc)
   |   |__ Port 4: Dev 6, If 0, Class=Human Interface Device, Driver=usbhid, 12M (ASUSTek Computer, Inc)
   |   ^^^ this is the onboard USB2 hub. I wonder if the above devices are plugged into the onboard headers
           or if these are built in as well


   |__ Port 6: Dev 5, If 0, Class=Wireless, Driver=btusb, 12M
       ^^^ this is bluetooth portion of the wifi device

CHIPSET 2 (bus 11,12 in lsusb)

    USB3:
    |__ Port 1: (BACK,TYPEC)
    |   ^^^ guess, based on usb ID of VIA USB hub you have plugged in here
    |__ Port 2: ?????
    |   ^^^ guess: probably not connected
    |
    |__ Port 3: ASM1074 SuperSpeed hub
    |   |__ Port 1: (BACK,BLUE)
    |   |__ Port 2: (BACK,BLUE)
    |   |__ Port 3: (BACK,BLUE)
    |   |__ Port 4: (BACK,BLUE)
        ^^^ this is for sure, because it fits the mobo spec
    |
    |__ Port 4: ASM1074 SuperSpeed hub
    |   |__ Port 1: (INTERNAL USB 3.0 HEADER)
    |   |__ Port 2: (INTERNAL USB 3.0 HEADER)
    |   |__ Port 3: (INTERNAL USB 3.0 HEADER)
    |   |__ Port 4: (INTERNAL USB 3.0 HEADER)
        ^^^ this is for sure, because it fits the mobo spec
    |
    USB2:
    |__ Port 5: ASUSTek Computer, Inc. USB Audio
    |__ Port 6: ASUSTek Computer, Inc. USB Audio
        ^^^ these two are the onboard audio
  6. I am curious where the x8 PCIe ports end up, but that is really not important right now and I can find out when I buy that board.

  7. Does suspend/resume work on this board?

Best regards,
Maxim Levitsky