Switching from Nvidia as primary video driver to AMD so QEMU can claim the Nvidia card

Hello.

I’ve been trying for a week to set up my PC for GPU passthrough.

Specs:
Asrock x470 Taichi
Ryzen 3800x
64GB ram
GTX 1080 (top slot)
9070XT (middle slot)

  • The cards do not fit the other way around (unless I stick the GTX card in pcie_4, which is PCIe 2.0 x1).
    The bottom PCIe slot is unusable because it shares lanes with an in-use NVMe drive.

I have tried various guides, but it turns into a hodgepodge because I’m on Endeavour (with GRUB) and dracut isn’t a drop-in replacement for mkinitcpio.

I have tried to blacklist the nouveau driver; sometimes it works, sometimes it doesn’t.
When it does work, the system hangs because it…
I don’t know, forgets that there is a second GPU?

I’ve checked the IOMMU grouping and it’s not ideal but should be usable afaik.
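
For anyone who wants to reproduce the grouping check, here is a minimal Python sketch that walks sysfs and collects the PCI addresses in each IOMMU group. It assumes the kernel exposes the groups under /sys/kernel/iommu_groups, which should be the case when the IOMMU is enabled. The function name iommu_groups is made up for this example.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [pci_address, ...]} as read from sysfs.

    `root` is a parameter only so the function can be pointed at a test
    directory; on a real system the default path is the one to use.
    """
    groups = {}
    base = Path(root)
    if not base.is_dir():
        # IOMMU disabled (or not a Linux host): nothing to report.
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if not group_dir.name.isdigit() or not devices_dir.is_dir():
            continue
        # Each entry under devices/ is named after a PCI address.
        groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    # Sort by group number so the output is stable.
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    for number, devices in iommu_groups().items():
        print(f"Group {number}:")
        for address in devices:
            print(f"\t{address}")
```

This only prints addresses; running `lspci -nns <address>` on each one adds the device names and IDs.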

Current info:

Group 3:
	00:03.0 Host bridge [0600]: Starship/Matisse PCIe Dummy Host Bridge [1482]
	00:03.1 PCI bridge [0604]: Starship/Matisse GPP Bridge [1483]
	00:03.2 PCI bridge [0604]: Starship/Matisse GPP Bridge [1483]
	0e:00.0 VGA compatible controller [0300]: GP104 [GeForce GTX 1080] [1b80]
	0e:00.1 Audio device [0403]: GP104 High Definition Audio Controller [10f0]
	0f:00.0 PCI bridge [0604]: Navi 10 XL Upstream Port of PCI Express Switch [1478]
	10:00.0 PCI bridge [0604]: Navi 10 XL Downstream Port of PCI Express Switch [1479]
	11:00.0 VGA compatible controller [0300]: Navi 48 [RX 9070/9070 XT] [7550]
	11:00.1 Audio device [0403]: Device [ab40]

lspci (nvidia for guest):

0e:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:119e]
        Kernel modules: nouveau
0e:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:119e]
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel

lspci (amd for host)

0f:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch [1002:1478] (rev 24)
        Subsystem: Sapphire Technology Limited Device [1da2:1478]
        Kernel driver in use: pcieport
10:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch [1002:1479] (rev 24)
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch [1002:1479]
        Kernel driver in use: pcieport
11:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 48 [RX 9070/9070 XT] [1002:7550] (rev c0)
        Subsystem: Sapphire Technology Limited Device [1da2:e490]
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu
11:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:ab40]
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:ab40]
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel

How do I “switch primary GPU” on the OS side so I can pass the Nvidia card over to a QEMU guest?
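
One way to see which card the firmware picked as the primary (boot) display is the boot_vga attribute that the kernel exposes in sysfs for VGA devices. The sketch below is only a diagnostic, not a fix: it reads that attribute for every PCI device that has one. The function name boot_vga_devices is hypothetical, and it assumes the usual /sys/bus/pci/devices layout.

```python
from pathlib import Path


def boot_vga_devices(root="/sys/bus/pci/devices"):
    """Return {pci_address: is_boot_vga} for devices exposing boot_vga.

    `root` is a parameter only so the function can be tested against a
    fake directory tree.
    """
    result = {}
    base = Path(root)
    if not base.is_dir():
        return result
    for device in sorted(base.iterdir()):
        flag = device / "boot_vga"
        if flag.is_file():
            # The file contains "1" for the boot display device, else "0".
            result[device.name] = flag.read_text().strip() == "1"
    return result


if __name__ == "__main__":
    for address, is_boot in boot_vga_devices().items():
        print(address, "boot VGA" if is_boot else "secondary")
```

If the GTX 1080 shows up as the boot VGA device, that may be related to the hang when the host loses it, but that is a guess rather than something this check can confirm.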

I don’t have a full answer, but just be aware that on the GTX 1080 (I also have one) the nouveau driver is such utter dogshit that it can barely run the normal desktop without bugs. It struggles to run YouTube videos. Personally, I’d never, ever, use that driver unless there was literally no other option.

The Arch Linux wiki has an entry on how to set up GPU passthrough.
Since it is Arch, it also covers dracut.

Just to cover the basics. Your monitor is plugged into the amd card?

https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF

It doesn’t seem like you read the post? I’m trying to get the Nvidia card “released” from the host. That includes the nouveau driver.

I have one monitor plugged into each card at the moment.

Maybe that’s part of the problem? (Although it seems like the best option for getting low-latency video out of the guest later.)

I did read the post, but it was a bit murky exactly what happened in what order. I recently tried to roll back from Nvidia 570 to 550 (because the 570 drivers are trassshhh), and in that process you just uninstall the drivers and all their associated components. The OS REALLY doesn’t like this, but it should let you do it (probably via CLI). I had to remove the old driver completely (no reboot needed) before I could install a different one.

ok. I don’t quite see how downgrading the nvidia driver is relevant.

Edit: why does this post look like a reply only in EDIT mode and not normal view?

Did you figure it out or are you still working on that?

Hmm, both GPUs are in the same IOMMU group (group 3), so you are out of luck as it stands, and I think you need the ACS override patch. I’m not sure about the details since I haven’t had to deal with this issue yet.
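
To make the grouping problem concrete, here is a small Python sketch that takes the Group 3 listing from the first post and reports every non-bridge device in the group that is not part of the card meant for the guest. The reasoning it encodes is that an IOMMU group is handed over as a unit, so any extra endpoint in the group gets in the way. The helper name extra_endpoints and the choice to treat class codes 0600 and 0604 as bridges are assumptions for this example.

```python
import re

# Group 3 as pasted in the first post.
GROUP_3 = """\
00:03.0 Host bridge [0600]: Starship/Matisse PCIe Dummy Host Bridge [1482]
00:03.1 PCI bridge [0604]: Starship/Matisse GPP Bridge [1483]
00:03.2 PCI bridge [0604]: Starship/Matisse GPP Bridge [1483]
0e:00.0 VGA compatible controller [0300]: GP104 [GeForce GTX 1080] [1b80]
0e:00.1 Audio device [0403]: GP104 High Definition Audio Controller [10f0]
0f:00.0 PCI bridge [0604]: Navi 10 XL Upstream Port of PCI Express Switch [1478]
10:00.0 PCI bridge [0604]: Navi 10 XL Downstream Port of PCI Express Switch [1479]
11:00.0 VGA compatible controller [0300]: Navi 48 [RX 9070/9070 XT] [7550]
11:00.1 Audio device [0403]: Device [ab40]
"""

# Matches "<bus:dev.fn> <class name> [<class code>]:" at the start of a line.
LINE = re.compile(
    r"^\s*(?P<addr>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+.+?\s+\[(?P<cls>[0-9a-f]{4})\]:"
)
BRIDGE_CLASSES = {"0600", "0604"}  # host bridge, PCI-to-PCI bridge


def extra_endpoints(listing, wanted):
    """Return non-bridge devices in the listing that are not in `wanted`."""
    extras = []
    for line in listing.splitlines():
        match = LINE.match(line)
        if not match or match.group("cls") in BRIDGE_CLASSES:
            continue
        if match.group("addr") not in wanted:
            extras.append(match.group("addr"))
    return extras


# The GTX 1080 and its audio function are what the guest should get.
print(extra_endpoints(GROUP_3, {"0e:00.0", "0e:00.1"}))
# prints ['11:00.0', '11:00.1']
```

The two addresses it prints are the 9070 XT and its audio function, i.e. the host GPU sits in the same group as the guest GPU.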

Secondly, if you follow the Arch guide, the Nvidia GPU should not be using nouveau (or nvidia) on boot; lspci should show ‘vfio-pci’ as the kernel driver in use…
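
As a rough sketch of what the guide asks for, the Python below pulls the vendor:device IDs out of the `lspci -nn` lines for the GTX 1080 posted above and builds the matching vfio-pci settings. The option syntax (`vfio-pci.ids=`, `options vfio-pci ids=`, `softdep`, dracut's `force_drivers`) follows the wiki page linked earlier as far as I recall it, so check it against the wiki before using it. The function names and dictionary keys are made up for this example, and it only produces text; it does not write any files.

```python
import re

# Device lines for the guest card, as posted in the thread.
LSPCI_NN = """\
0e:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:119e]
0e:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
        Subsystem: NVIDIA Corporation Device [10de:119e]
"""

DEVICE_LINE = re.compile(r"^[0-9a-f]{2}:[0-9a-f]{2}\.[0-7] ")
VENDOR_DEVICE = re.compile(r"\[([0-9a-f]{4}:[0-9a-f]{4})\]")


def vfio_ids(lspci_output):
    """Collect vendor:device IDs from device lines, skipping Subsystem lines."""
    ids = []
    for line in lspci_output.splitlines():
        if not DEVICE_LINE.match(line):
            continue  # indented detail line (Subsystem, Kernel driver, ...)
        found = VENDOR_DEVICE.findall(line)
        if found and found[-1] not in ids:
            ids.append(found[-1])
    return ids


def config_snippets(ids):
    """Build the text for the kernel command line, modprobe.d and dracut."""
    joined = ",".join(ids)
    return {
        # Appended to the kernel command line (e.g. via GRUB).
        "kernel_cmdline": f"vfio-pci.ids={joined}",
        # Alternative to the command line: a file under /etc/modprobe.d/.
        "modprobe_conf": f"options vfio-pci ids={joined}\n"
                         "softdep nouveau pre: vfio-pci",
        # A file under /etc/dracut.conf.d/ so vfio loads from the initramfs.
        "dracut_conf": 'force_drivers+=" vfio_pci vfio vfio_iommu_type1 "',
    }


if __name__ == "__main__":
    for name, text in config_snippets(vfio_ids(LSPCI_NN)).items():
        print(f"# {name}\n{text}\n")
```

After regenerating the initramfs and rebooting, `lspci -nnk -s 0e:00` is the quick way to see which driver ended up bound to the card. Note that this does nothing about the IOMMU grouping issue mentioned above.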