SR-IOV: all virtual functions in same IOMMU group, breaking VMs

I recently got my Intel X550-T network card and am trying to use SR-IOV to give each of my VMs its own virtual function of the card.

But when I tried to add a virtual function to a VM, it failed because every single one of my virtual functions is in the same IOMMU group, making it impossible to give each VM its own.
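To see which devices share a group, here is a minimal Python sketch that reads the standard Linux sysfs layout (`/sys/kernel/iommu_groups/<group>/devices/<pci-address>`). The function name `iommu_groups` and its `root` parameter are illustrative, not part of any existing tool.

```python
import os


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map IOMMU group number -> sorted list of PCI addresses in that group."""
    groups = {}
    if not os.path.isdir(root):
        # Directory is missing when the IOMMU is disabled or not exposed.
        return groups
    for name in os.listdir(root):
        devices_dir = os.path.join(root, name, "devices")
        if name.isdigit() and os.path.isdir(devices_dir):
            groups[int(name)] = sorted(os.listdir(devices_dir))
    return groups


if __name__ == "__main__":
    for group, devices in sorted(iommu_groups().items()):
        print(f"group {group}: {' '.join(devices)}")
```

If all the virtual functions show up on the same `group N:` line, they cannot be handed to different VMs individually.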

Why would Intel program the card to do this? Am I doing something wrong? What is the intended use of SR-IOV virtual functions if not to give each function to a VM?

No, I will not consider ACS override.

AFAIK, the IOMMU groups are a function of the motherboard implementation, so it would not matter which SR-IOV device you are using; it would always have the virtual functions in the same group. So it is not really the fault of Intel, at least not of their network card division.

  • Make sure you are using a new(ish) kernel. Old kernels (4.8 and older maybe?) have worse groups.
  • Make sure that the BIOS/UEFI is at the best version for IOMMU. Generally, that is the newest, but not always.
  • Make sure that you are using a PCIe slot wired directly to the CPU, instead of one going through the chipset. Sometimes everything, or most things, behind the chipset ends up in large groups (or even one giant group).
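For the slot check in the last point, one rough way to see what sits between a device and the root complex is to resolve its sysfs symlink and list the PCI addresses along the path. This is only a sketch: the helper names are made up for illustration, and a longer chain of bridges is a hint that the slot hangs off the chipset or a switch, not proof.

```python
import os
import re

# Matches a full PCI address such as 0000:41:00.0 (domain:bus:device.function).
PCI_ADDR = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")


def upstream_chain(resolved_path):
    """Return the PCI addresses found in a resolved sysfs device path,
    ordered from the root port down to the device itself."""
    return [part for part in resolved_path.split("/") if PCI_ADDR.match(part)]


def device_chain(addr, root="/sys/bus/pci/devices"):
    """Resolve /sys/bus/pci/devices/<addr> and list the bridges above it."""
    return upstream_chain(os.path.realpath(os.path.join(root, addr)))
```

For example, `device_chain("0000:41:00.0")` (a hypothetical address) would list the root port, any intermediate bridges, and finally the card.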

May I ask why?

Does this mean I have a faulty motherboard/firmware? (ASUS Prime X399-A)

Linux 5.8 with @gnif’s Navi BACO reset patch applied

 BIOS Information
        Vendor: American Megatrends Inc.
        Version: 1002
        Release Date: 02/15/2019
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 16 MB
        Characteristics:
                PCI is supported
                APM is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                BIOS ROM is socketed
                EDD is supported
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                3.5"/2.88 MB floppy services are supported (int 13h)
                Print screen service is supported (int 5h)
                8042 keyboard services are supported (int 9h)
                Serial services are supported (int 14h)
                Printer services are supported (int 17h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Targeted content distribution is supported
                UEFI is supported
        BIOS Revision: 5.13

I think this may be my issue. The device 8086:1563 is my SR-IOV card. Is there any reliable way around this?
Here is my lstopo:

I won’t use ACS override, as it’s not reliable and can cause system instability.

I could swap the PCIe slots of my RAID controller (1000:005d) and this network card, but I’m scared that might cause me to lose data on my RAID array and might not even fix my IOMMU grouping issue.

Version 1203 is the latest. Looking at the changelog, I don’t really think it would change anything though.

Great.

Swapping around PCIe slots.

This is a large part of why software arrays are better than a hardware controller these days. I don’t think it would be an issue to swap slots for the card, but I’m not 100% sure.

Swapping the slots works!!! I did not lose my RAID array, and I can now just pass through virtual functions to VMs.
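To confirm a result like this after a slot swap, a small Python sketch can walk the physical function’s `virtfnN` symlinks in sysfs and read each virtual function’s `iommu_group` link. The function names and the `root` parameter are illustrative assumptions; the check only confirms that the VFs are in different groups from each other, not that nothing else shares those groups.

```python
import os


def vf_groups(pf_addr, root="/sys/bus/pci/devices"):
    """Map each virtual function's PCI address -> its IOMMU group name.

    pf_addr is the PCI address of the physical function (the card itself).
    A value of None means the VF has no iommu_group link.
    """
    pf_dir = os.path.join(root, pf_addr)
    result = {}
    for entry in sorted(os.listdir(pf_dir)):
        if not entry.startswith("virtfn"):
            continue
        vf_path = os.path.realpath(os.path.join(pf_dir, entry))
        group_link = os.path.join(vf_path, "iommu_group")
        if os.path.exists(group_link):
            group = os.path.basename(os.path.realpath(group_link))
        else:
            group = None
        result[os.path.basename(vf_path)] = group
    return result


def vfs_in_distinct_groups(groups):
    """True if every VF has a group and no two VFs share one."""
    values = list(groups.values())
    return None not in values and len(set(values)) == len(values)
```

Calling `vf_groups("0000:41:00.0")` with the card’s real address (the one here is a placeholder) and passing the result to `vfs_in_distinct_groups` gives a quick yes/no.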

Hallelujah.

It would be really nice if the more consumer-oriented boards provided block diagrams so people could figure out where to put stuff. Lots of manuals I look at only have stuff for SLI and whatnot, and not the actual layout.

IOMMU is on Auto in your BIOS. Auto is partially enabled, not fully enabled. It’ll probably also work fine in a different slot.

Go enable it in the BIOS and let us know.

Same here with an MSI X670E Carbon in the 3rd PCIe slot (from the chipset). All VFs are in the same IOMMU group. Ironically, my old Z87 motherboard breaks the IOMMU groups down nicely when SR-IOV is active.

Same with the B650 Creator: it works in the second slot, but not in the 3rd one via the chipset.

Always switch your BIOS settings for IOMMU, SVM, and VFIO from Auto to Enabled. IOMMU in particular tends to make a big difference: Auto on one of my motherboards would clump everything on the chipset into one group and everything on the CPU into another, making it effectively useless, while Enabled gave every single device its own group. Why that’s not just the default, I’ve no idea.
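To compare Auto against Enabled on a given board, it may help to record a few numbers before and after changing the setting. The sketch below is an assumption-laden illustration (the function names are made up): it reads each PCI device’s `iommu_group` link from sysfs and reports how many groups exist and how large the biggest one is.

```python
import os
from collections import Counter


def group_of_each_device(root="/sys/bus/pci/devices"):
    """Map PCI address -> IOMMU group name (None if the device has no group)."""
    mapping = {}
    if not os.path.isdir(root):
        return mapping
    for addr in sorted(os.listdir(root)):
        link = os.path.join(root, addr, "iommu_group")
        if os.path.exists(link):
            mapping[addr] = os.path.basename(os.path.realpath(link))
        else:
            mapping[addr] = None
    return mapping


def summarize(mapping):
    """Count devices, distinct groups, and the size of the largest group."""
    sizes = Counter(g for g in mapping.values() if g is not None)
    return {
        "devices": len(mapping),
        "groups": len(sizes),
        "largest_group": max(sizes.values(), default=0),
    }


if __name__ == "__main__":
    print(summarize(group_of_each_device()))
```

A `largest_group` close to the total device count is the “one giant group” situation; a value near 1 means most devices are isolated.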