Non-isolated IOMMU group passthrough

I am in the midst of setting up my first VM on a TR-based system, and would like someone more knowledgeable than myself to confirm that my understanding of the procedure is correct with regard to GPU and USB controller pass-through.

I have moved the GPU that I want to pass to the guest so it is now in an IOMMU group by itself:

IOMMU Group 11 40:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 11 40:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU Group 11 44:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU104GL [Quadro RTX 4000] [10de:1eb1] (rev a1)
IOMMU Group 11 44:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f8] (rev a1)
IOMMU Group 11 44:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1ad8] (rev a1)
IOMMU Group 11 44:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1ad9] (rev a1)

Per the Arch wiki gotcha section, am I correct in thinking that neither the PCIe Dummy Host Bridge nor the PCIe GPP Bridge should be passed through?

I would also like to be able to pass through an entire USB controller, but the problem is that none are in an IOMMU group by themselves. My two options are group 4 or group 13, both of which share the exact same set of entries:

IOMMU Group 4 00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 4 00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
IOMMU Group 4 0a:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function [1022:145a]
IOMMU Group 4 0a:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
IOMMU Group 4 0a:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] USB 3.0 Host controller [1022:145f]

Assuming for the moment that neither the PCIe Dummy Host Bridge nor the PCIe GPP Bridge is passed through, as in the GPU case, but that all other entries within the IOMMU group need to be, what are the implications of passing through the Raven2 PCIe Dummy Function and the Platform Security Processor? I think I would be okay, but this is a situation where I would like to be sure first.

Thanks in advance for any advice / replies.

EDIT

Updated the IOMMU group 11 listing as I had missed an entry.

usually it’d just be a:00.0, .1, .2, .3 that you pass through, not the host bridge or PCI bridge.

what motherboard is this though?

It's a Gigabyte X399 AORUS XTREME. I have updated the OP as there was a line missing from the IOMMU group 11 listing. Having gone back and read the Arch wiki entry on PCI passthrough via OVMF very carefully from top to bottom, I am now convinced that the solution is to pass through the four NVIDIA IDs and nothing else, i.e.:

  • 10de:1eb1
  • 10de:10f8
  • 10de:1ad8
  • 10de:1ad9
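
For anyone following along, here is a minimal Python sketch of how that list can be derived from the group 11 listing above: keep every function in the group except the host bridge (class 0600) and the PCI bridge (class 0604). The function names are hypothetical, the parser assumes the exact line format quoted in this thread, and the `options vfio-pci ids=...` line is the modprobe form described in the Arch wiki, so treat the result as something to check against your own setup rather than a drop-in config.

```python
import re

# Matches lines in the format quoted in this thread, e.g.
# "IOMMU Group 11 44:00.0 VGA compatible controller [0300]: ... [10de:1eb1] (rev a1)"
LINE_RE = re.compile(
    r"IOMMU Group (?P<group>\d+) (?P<addr>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) "
    r".*?\[(?P<cls>[0-9a-f]{4})\]: .*\[(?P<id>[0-9a-f]{4}:[0-9a-f]{4})\]"
)

# PCI class codes that should stay with the host: host bridge, PCI bridge.
BRIDGE_CLASSES = {"0600", "0604"}

GROUP_11_LISTING = """\
IOMMU Group 11 40:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 11 40:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU Group 11 44:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU104GL [Quadro RTX 4000] [10de:1eb1] (rev a1)
IOMMU Group 11 44:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f8] (rev a1)
IOMMU Group 11 44:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1ad8] (rev a1)
IOMMU Group 11 44:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1ad9] (rev a1)
"""


def passthrough_ids(listing, group):
    """Return vendor:device IDs in `group`, skipping host and PCI bridges."""
    ids = []
    for line in listing.splitlines():
        m = LINE_RE.search(line)
        if m and int(m["group"]) == group and m["cls"] not in BRIDGE_CLASSES:
            ids.append(m["id"])
    return ids


def vfio_options_line(ids):
    """Format the IDs as a modprobe options line for vfio-pci."""
    return "options vfio-pci ids=" + ",".join(ids)


if __name__ == "__main__":
    print(vfio_options_line(passthrough_ids(GROUP_11_LISTING, 11)))
```

Filtering on the class code rather than on the device name avoids depending on how a given lspci version words "Host bridge" or "PCI bridge".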

EDIT

Removed reference to passing through the USB 3.0 host controller from IOMMU group 13 as the fact that it shares the exact same PCIe ID as the one in IOMMU group 4 gives me the screaming heebie-jeebies.
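
The worry about the shared ID seems justified: binding by `vendor:device` ID matches every device carrying that ID, so both USB 3.0 controllers would be claimed, not just one. A rough Python sketch for spotting such clashes on a running system follows. It assumes the standard sysfs layout (`/sys/bus/pci/devices/<address>/vendor` and `device`), and the function name is made up for illustration.

```python
from collections import defaultdict
from pathlib import Path


def duplicate_pci_ids(sysfs_root="/sys/bus/pci/devices"):
    """Map each vendor:device ID used by more than one device to its addresses."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return {}

    by_id = defaultdict(list)
    for dev in sorted(root.iterdir()):
        try:
            # sysfs stores these as hex strings such as "0x1022"
            vendor = int((dev / "vendor").read_text().strip(), 16)
            device = int((dev / "device").read_text().strip(), 16)
        except (OSError, ValueError):
            continue  # skip entries that are unreadable or malformed
        by_id[f"{vendor:04x}:{device:04x}"].append(dev.name)

    return {pci_id: addrs for pci_id, addrs in by_id.items() if len(addrs) > 1}


if __name__ == "__main__":
    for pci_id, addrs in duplicate_pci_ids().items():
        print(pci_id, "is shared by", ", ".join(addrs))
```

Any ID this reports cannot be singled out by ID alone; the device would have to be selected by its PCI address instead.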

So it turns out that was the ticket. Just pass through the four NVIDIA entries in the IOMMU group and Robert is your father’s brother. Yay. I have moved on to tuning the VM, and am now having some trouble interpreting the output of lstopo:

This looks different to the TR 2950x maps that I have previously seen online. For one, there are no NUMA nodes in the map, which seems very weird. I can see the two GPUs in the PCIe tree (10de:1cb3 which is the P400 host GPU, and 10de:1eb1 which is the RTX4000 guest GPU), but it's not obvious which cores I should prefer for the VM.

Is someone able to explain how to interpret this map?

I have looked at the article on memory and core configuration for passthrough on Threadripper by @SgtAwesomesauce, and my lstopo output is quite different to his…

Your BIOS probably has your CPU in UMA mode.

Aww darn it - you are right. And the setting you need to change is not exactly listed as NUMA / UMA in the BIOS either. You have to dive deep and set memory interleaving to “channel”. Thanks for the heads-up Sarge.
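
A quick way to confirm the change took effect without reading the lstopo picture is to count the NUMA nodes the kernel exposes. This is only a sketch: it assumes the usual sysfs layout under `/sys/devices/system/node`, and the function name is hypothetical. In UMA mode you would expect a single node; after switching interleaving to "channel" there should be more than one, each with its own CPU list.

```python
import re
from pathlib import Path


def numa_nodes(root="/sys/devices/system/node"):
    """Return {node number: cpulist string} for each NUMA node found in sysfs."""
    base = Path(root)
    if not base.is_dir():
        return {}

    nodes = {}
    for entry in base.iterdir():
        m = re.fullmatch(r"node(\d+)", entry.name)
        if not m:
            continue  # ignore files such as "online" or "possible"
        cpulist = entry / "cpulist"
        nodes[int(m.group(1))] = cpulist.read_text().strip() if cpulist.exists() else ""
    return dict(sorted(nodes.items()))


if __name__ == "__main__":
    found = numa_nodes()
    print(f"{len(found)} NUMA node(s) visible")
    for node, cpus in found.items():
        print(f"  node{node}: CPUs {cpus}")
```

The CPU list per node is what matters for pinning: pick the VM's cores from the node that the guest GPU hangs off.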

Now to do battle with the USB issue!
