Ryzen IOMMU: PCIe Passthrough works, BUT... | Level One Techs

Some more progress: I have a Windows VM (OVMF EFI BIOS) connecting to the GPU over the IOMMU. Unfortunately, something is clearly wrong with it.

My guess is that Arch overwrote the stock VGA ROM when it booted, and the VM can't make sense of the modified ROM. I'm going to try to find a stock ROM to use instead.

Alternatively, I might get a second keyboard to pass through to the VM. The image may be distorted, but it would be encouraging to see Windows boot.

Edit: Another thought: the 9500 GT doesn't support UEFI. Can OVMF still work with it WITHOUT UEFI support on the GPU? Perhaps my 1070 would work better. Hopefully one of the gurus out there can give me some pointers.

AFAIK the card's ROM needs to provide a UEFI GOP driver for it to work in an OVMF VM.

Running a Fury on the guest and a 1070 on the host. Previously I ran a 7970 on the guest with a shunted UEFI BIOS, and set one up for a friend, for whom I bought a Sapphire card that had native UEFI support. I've never tried to get older hardware working on TianoCore, but you may have more luck with a SeaBIOS setup given how old your cards are.

If you boot a graphical OS installer, the ROM should not matter. But ideally the card being passed through is a UEFI-native card.

And thanks for working hard on this and reporting back. It means a lot.

I'm a bit confused at all this information. The general consensus is that IOMMU for Ryzen is a no-go, but is it essentially up to motherboard manufacturers to push out better grouping in their EFI updates, or is this something that's gimped in the chips themselves?

IOMMU works. It just doesn't work as well as we might like. Basically, it isn't isolating every device into its own IOMMU group. Instead, the group seems to be determined by where the PCIe port comes from: the CPU or the chipset. Everything off the chipset gets shoved into one shared group.

Check out Wendell's video on the subject, or look at the IOMMU group listings (output of the bash script) people are posting.
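
For anyone who wants to produce that kind of listing themselves, here is a minimal Python sketch. It assumes the usual Linux sysfs layout, where each group is a numbered directory under /sys/kernel/iommu_groups with a devices/ subdirectory. The function name is my own and it only prints PCI addresses, not the lspci descriptions.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [pci_address, ...]} read from the sysfs tree."""
    groups = {}
    root = Path(root)
    if not root.is_dir():
        # No such directory: IOMMU is disabled or unsupported on this boot.
        return groups
    for group_dir in root.iterdir():
        devices_dir = group_dir / "devices"
        if not group_dir.name.isdigit() or not devices_dir.is_dir():
            continue
        groups[int(group_dir.name)] = sorted(d.name for d in devices_dir.iterdir())
    # Sort by group number so the output reads like the listings in this thread.
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    for number, devices in iommu_groups().items():
        for address in devices:
            print(f"IOMMU Group {number} {address}")
```

An empty result usually means the IOMMU is switched off in the firmware or on the kernel command line.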

See here for an example of someone with working passthrough on Ryzen:

I saw that, thanks for the info. Are IOMMU groups assigned by the board's EFI or are they hard wired though? Is it possible for board manufacturers to fix the groups in firmware so the patches aren't needed?

I haven't seen a screenshot yet of the group for an M.2 NVMe SSD. The reason I want to check is that it is technically possible to install an M.2 to PCIe adapter, with a riser cable, to install another GPU in that port. If it is not part of the bifurcated x16 group, that could be a new approach. My biggest fear is that some video cards draw a lot of PCIe power, more than M.2 is probably capable of... however...

I happen to have a Gigabyte AX370 Gaming 5, and as an extra bonus it has U.2, which doesn't supply power but instead uses a SATA or Molex connector for power, so I don't have to worry about overloading my M.2 port. So, if it is in a different group...

U.2 > M.2 > PCIe > ribbon riser > GPU

Since the U.2 / M.2 port is direct to the CPU, it should be possible to prioritize its initialization too, or if not, to use that as the passthrough port. Anyway, does anyone have an M.2 PCIe SSD and can check the grouping for the port? I sadly don't.

M.2 is in group 0, it's in the video but I don't call it out.

So I guess the question is: can we initialize the "M.2" GPU ahead of the PCIe x16 one? If yes, great, but probably not. It is going through the chipset rather than direct, so I suspect it will be no different from using the x4 2.0 slot.

Does anyone have an M.2 to PCIe slot adapter like this to try:

If there is no difference in init, then basically the only benefit is PCIe 3.0 speed over 2.0.

No, this is not a good idea with this type of adapter. I have one, but they are limited to 25 W even with the power adapter. A graphics card may draw up to 75 W from the slot, which requires a special type of adapter.

There are some B350 boards that actually provide a real x16/x4 electrical slot from these PCIe lanes, and it is still IOMMU group 0.

I have asked for a UEFI option to initialize that graphics card first in that case, but so far no luck.

There are x1 GT 710 cards that draw 19 W, but going to that length somewhat defeats the purpose of getting high-performance GPU passthrough going.

Yes, but group 0, the GT710 as you say, is the card you are NOT passing through.

My true goal is to be able to split my CrossFire setup in half and pass one card through, but without proper isolation that will be impossible. ACS is no hope for me. My next best option is to install my passively cooled Nvidia card in an x4 slot and pass through both AMD cards, and hopefully CrossFire works over the IOMMU?

Could be a good way to get more PCIe devices on microATX/mini-ITX if you don't care about NVMe. I might try running SATA cards off one of these and report back.

More updates

I installed my 1070. It looks like the IOMMU is working fine. Windows can see the 1070 from within the VM:

I think what's going on, as described on other forums, is that when Linux boots using the 1070, it shadows the 1070's ROM with its own boot ROM. It then writes to this shadow ROM, which confuses UEFI. The solution is to create a ROM file for the graphics card and use that in the VM instead of passing through the one from the host OS.
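
A rough sketch of how such a ROM file can be created on Linux, assuming the standard sysfs `rom` attribute of the PCI device (write 1 to enable reads, read the contents, write 0 to disable again). It needs root, the function names are my own, and the PCI address in the comment is only a placeholder for whatever your card shows up as. Reads tend to come out clean only when the card is not the one the host booted on.

```python
from pathlib import Path

ROM_SIGNATURE = b"\x55\xaa"  # first two bytes of a PCI option ROM image


def looks_like_option_rom(data: bytes) -> bool:
    """Cheap sanity check: a valid dump starts with the 55 AA signature."""
    return len(data) >= 2 and data[:2] == ROM_SIGNATURE


def dump_rom(pci_address: str, out_path: str) -> int:
    """Copy the option ROM of a PCI device into out_path; return its size."""
    rom = Path("/sys/bus/pci/devices") / pci_address / "rom"
    with open(rom, "w") as f:
        f.write("1")  # enable reading the ROM through sysfs
    try:
        data = rom.read_bytes()
    finally:
        with open(rom, "w") as f:
            f.write("0")  # always disable it again
    if not looks_like_option_rom(data):
        raise ValueError("dump does not start with 55 AA, refusing to save it")
    Path(out_path).write_bytes(data)
    return len(data)


# Example (as root, address is hypothetical):
# dump_rom("0000:29:00.0", "gpu.rom")
```

The saved file can then be handed to the VM instead of letting it read the (possibly shadowed) ROM from the card.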

I plan on doing this, but need to rip the ROM from my 1070: the stock ROM as downloaded elsewhere includes a hybrid BIOS boot image, and that seems to be tripping up OVMF. I should be able to dump the ROM from the 1070 while it is running in the x4 PCIe 2.0 slot, though.
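
To see which boot images a ROM file actually contains (legacy BIOS, EFI, or both), something like the following can help. It is only a sketch: it walks the standard PCI option ROM layout (55 AA header, pointer to the "PCIR" structure at offset 0x18, code type at PCIR+0x14, last-image flag at PCIR+0x15) and assumes the file starts directly with the first image. Files with a vendor-specific header in front would need that stripped first. The function name is made up for illustration.

```python
import struct

CODE_TYPES = {
    0x00: "x86 legacy BIOS",
    0x01: "Open Firmware",
    0x02: "HP PA-RISC",
    0x03: "EFI",
}


def rom_images(data: bytes):
    """Return [(offset, code_type_name), ...] for each image in the ROM."""
    images = []
    offset = 0
    while offset + 0x1A <= len(data) and data[offset:offset + 2] == b"\x55\xaa":
        # 16-bit little-endian pointer to the PCI data structure.
        (pcir_ptr,) = struct.unpack_from("<H", data, offset + 0x18)
        pcir = offset + pcir_ptr
        if data[pcir:pcir + 4] != b"PCIR" or pcir + 0x16 > len(data):
            break
        (length_blocks,) = struct.unpack_from("<H", data, pcir + 0x10)
        code_type = data[pcir + 0x14]
        indicator = data[pcir + 0x15]
        name = CODE_TYPES.get(code_type, f"unknown (0x{code_type:02x})")
        images.append((offset, name))
        # Bit 7 of the indicator marks the last image in the chain.
        if indicator & 0x80 or length_blocks == 0:
            break
        offset += length_blocks * 512  # image length is in 512-byte units
    return images


# Example: print(rom_images(open("gpu.rom", "rb").read()))
```

If no entry says "EFI", the card has no UEFI image in that ROM, which would explain trouble under OVMF.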

As an aside, CPU performance in a VM is really, really good with KVM. I haven't played with static huge pages or core pinning yet.

I did some tests with an Asus Crosshair VI Hero lately. Graphics cards available: Nvidia 1080, AMD 7850 and a passive Nvidia GT 520 (from an old HTPC). First I tried messing around with just the 1080 and the 7850, but the grouping is similar to the ones you posted:

IOMMU Group 0 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
IOMMU Group 0 00:01.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1453]
IOMMU Group 0 03:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b9] (rev 02)
IOMMU Group 0 03:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b5] (rev 02)
IOMMU Group 0 03:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b0] (rev 02)
IOMMU Group 0 1d:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
IOMMU Group 0 1d:02.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
IOMMU Group 0 1d:03.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
IOMMU Group 0 1d:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
IOMMU Group 0 1d:05.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
IOMMU Group 0 1d:06.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
IOMMU Group 0 1d:07.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
IOMMU Group 0 21:00.0 USB controller [0c03]: ASMedia Technology Inc. Device [1b21:1343]
IOMMU Group 0 23:00.0 Ethernet controller [0200]: Intel Corporation I211 Gigabit Network Connection [8086:1539] (rev 03)
IOMMU Group 1 00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
IOMMU Group 2 00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
IOMMU Group 2 00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1453]
IOMMU Group 2 00:03.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1453]
IOMMU Group 2 29:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80] (rev a1)
IOMMU Group 2 29:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
IOMMU Group 2 2a:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn PRO [Radeon HD 7850 / R7 265 / R9 270 1024SP] [1002:6819]
IOMMU Group 2 2a:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series] [1002:aab0]
IOMMU Group 3 00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
IOMMU Group 4 00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
IOMMU Group 4 00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1454]
IOMMU Group 4 2b:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device [1022:145a]
IOMMU Group 4 2b:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Device [1022:1456]
IOMMU Group 4 2b:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:145c]
IOMMU Group 5 00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
IOMMU Group 5 00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1454]
IOMMU Group 5 2c:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device [1022:1455]
IOMMU Group 5 2c:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU Group 5 2c:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Device [1022:1457]
IOMMU Group 6 00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 59)
IOMMU Group 6 00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU Group 7 00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1460]
IOMMU Group 7 00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1461]
IOMMU Group 7 00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1462]
IOMMU Group 7 00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1463]
IOMMU Group 7 00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1464]
IOMMU Group 7 00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1465]
IOMMU Group 7 00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1466]
IOMMU Group 7 00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1467]
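
To make the problem in a listing like this easier to spot, here is a small Python sketch that parses lines in the format above and reports which other endpoints share a group with a given device. The sample lines are copied from the listing. The regular expression and function names are my own, and ignoring bridges (class 06xx) plus the card's own audio function is a simplification, not a full ACS analysis.

```python
import re
from collections import defaultdict

SAMPLE = """
IOMMU Group 2 00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1452]
IOMMU Group 2 00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1453]
IOMMU Group 2 29:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80] (rev a1)
IOMMU Group 2 29:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
IOMMU Group 2 2a:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn PRO [Radeon HD 7850 / R7 265 / R9 270 1024SP] [1002:6819]
IOMMU Group 2 2a:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series] [1002:aab0]
"""

LINE = re.compile(
    r"^IOMMU Group (\d+) ([0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) (.+?) \[([0-9a-f]{4})\]:"
)


def parse_groups(listing: str):
    """Return {group: [(address, class_name, class_code), ...]}."""
    groups = defaultdict(list)
    for line in listing.splitlines():
        m = LINE.match(line.strip())
        if m:
            groups[int(m.group(1))].append((m.group(2), m.group(3), m.group(4)))
    return dict(groups)


def group_mates(groups, address):
    """Non-bridge devices in the same group, excluding the card's own functions."""
    slot = address.rsplit(".", 1)[0]
    for devices in groups.values():
        if any(a == address for a, _, _ in devices):
            return [
                a for a, _, cls in devices
                if not cls.startswith("06") and a.rsplit(".", 1)[0] != slot
            ]
    return []


if __name__ == "__main__":
    print(group_mates(parse_groups(SAMPLE), "29:00.0"))
```

For the 1080 at 29:00.0 this reports both functions of the 7850, i.e. the two cards in the CPU slots cannot be separated without a patch.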

That means there will be issues. So the solution here, too, was to go for the x4 slot, which is always initialized first when it is populated. However, that is where I ALWAYS got a kernel panic when putting the AMD card there. The only exception is the Arch installer disc: it complains that AMD-Vi seems to have an issue, but it doesn't panic. The panic happened even with the Ubuntu 17.04 Beta 2 images, which means the Ryzen patches are in there. But then I opened up my old HTPC and replaced the AMD card with the GT 520. And, like a miracle, the system boots.

It seems that the radeon driver and/or its proprietary (blob) equivalent has massive issues when the AMD GPU is not in a slot with dedicated CPU lanes. But I am not knowledgeable enough about these issues to open a bug tracker ticket somewhere, so that will either require your help or I will have to leave that topic for now.

The next step will be to try to run KVM with the GT 520 as the "main GPU" and the 1080 as the vfio'ed one. Let's see when I will have time to dig into that more...

Any news on getting your BIOS exported, @witnaaay?

Regards, Bigfoot29

Same AMD-Vi kernel issue on an MSI B350 board, where the only option for a second video card is the PCH slot.

@wendell You said that you are passing through a Gigabyte RX 460 to the VM running Heaven in the background. Were there any challenges getting it to work on the Windows side? I'm running into driver issues passing through a 480 on an Asus Prime 370 Pro.

OK, so I'm about to go in on a Ryzen system with this use case in mind. People seem to be having the most success with the ASRock and ASUS boards. I'm fine with the early adopter tax on this; I've dealt with broken ACS on Skylake for two years now, so this won't be a huge difference for me.

My questions are as follows:

  • Which motherboard (preferably with an external clock gen) would be 'good' for this?
  • Is there any reason running the host GPU through the chipset would degrade performance on GPU compute tasks (specifically deep learning and neural network stuff)?
  • Is there any more detailed info on what is causing certain card combos to crash?

Thanks in advance for your assistance.

The main challenge was that the host GPU had to be Nvidia. Other than that it worked fine. Only one GPU in the VM -- the passthrough one.

Wendell, what kernel did you use, and was KVR24E17D8/16 the ECC RAM you used?

I tried 4.10 and 4.11 with KVR24E17D8/16MA (Micron version) on both the ASRock K4 and the X370 Pro Gaming.
The EDAC log in syslog says the BIOS disabled ECC or ECC is not installed, or something along those lines.
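
Besides syslog, the kernel's EDAC sysfs tree is another place to look. A minimal sketch, assuming the usual layout with one mcN directory per memory controller under /sys/devices/system/edac/mc, each holding ce_count and ue_count files. The function name is my own, and an empty result only says that no EDAC driver registered a controller, not why.

```python
from pathlib import Path


def edac_controllers(root="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected_errors, uncorrected_errors)}."""
    result = {}
    root = Path(root)
    if not root.is_dir():
        return result  # EDAC core not loaded at all
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text())
            ue = int((mc / "ue_count").read_text())
        except (OSError, ValueError):
            continue  # skip entries that do not look like a controller
        result[mc.name] = (ce, ue)
    return result


if __name__ == "__main__":
    controllers = edac_controllers()
    if not controllers:
        print("No EDAC memory controller registered")
    for name, (ce, ue) in controllers.items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```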

Going to go after the non-Micron-based memory next. If you got it working, I think I will have a good chance.

Wish they would put something in the BIOS to say whether it is detecting ECC RAM.