GPU passthrough causes SSD to vanish

Hi Gentlemen,

I am posting this in hopes someone has an idea that can lead me to a solution.

I have recently bought a second GPU (4070) to try and get pass-through working from Debian (Host) to Windows (Guest).

However, when I specify my ids:

GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on vfio-pci.ids=10de:2786,10de:22bc"

Upon reboot, the system cannot find the (Linux) boot drive and looks in the wrong place to boot.

I can restore grub to:

GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"

using the Debian install CD (rescue option), and the system boots fine again.
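
Since vfio-pci.ids matches devices by vendor:device pair rather than by slot, one thing worth ruling out is whether anything other than the 4070 carries those IDs. A minimal Python sketch, assuming the usual sysfs layout under /sys/bus/pci/devices; the function name devices_matching is hypothetical, and the IDs are the ones from the GRUB line above:

```python
from pathlib import Path


def devices_matching(ids, root="/sys/bus/pci/devices"):
    """Return (pci_address, "vendor:device") for every device whose IDs are in `ids`."""
    wanted = {pair.lower() for pair in ids}
    matches = []
    base = Path(root)
    if not base.is_dir():
        return matches  # sysfs not available at this path
    for dev in sorted(base.iterdir()):
        try:
            # sysfs exposes these as hex strings such as "0x10de"
            vendor = int((dev / "vendor").read_text().strip(), 16)
            device = int((dev / "device").read_text().strip(), 16)
        except (OSError, ValueError):
            continue  # skip entries without readable ID files
        pair = f"{vendor:04x}:{device:04x}"
        if pair in wanted:
            matches.append((dev.name, pair))
    return matches


if __name__ == "__main__":
    # IDs taken from the vfio-pci.ids= kernel parameter in this thread
    for address, pair in devices_matching(["10de:2786", "10de:22bc"]):
        print(address, pair)
```

If this prints anything besides the two functions of the 4070, the extra device would also be claimed by vfio-pci at boot.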

IOMMU support is on.

[    0.035120] DMAR: IOMMU enabled
[    0.068435] DMAR: DRHD base: 0x000000fed90000 flags: 0x0 
... 
[    0.925461] DMAR: Intel(R) Virtualization Technology for Directed I/O

The GPU is in its own isolated IOMMU group 17. The 4070 ids are correct.
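
The isolation claim can be double-checked by listing every IOMMU group and its members. A rough Python sketch, assuming the kernel exposes groups under /sys/kernel/iommu_groups (the standard location when the IOMMU is enabled); iommu_groups is a hypothetical helper name:

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups  # IOMMU disabled, or sysfs not mounted at this path
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return groups


if __name__ == "__main__":
    for number, devices in sorted(iommu_groups().items()):
        print(f"group {number}: {' '.join(devices)}")
```

For the setup described here, group 17 would be expected to list only the two functions of the 4070, and the boot SSD's controller should appear in a different group.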

The vfio modules are loaded:

lsmod | grep vfio
vfio_pci               16384  0
vfio_pci_core          86016  1 vfio_pci
irqbypass              12288  2 vfio_pci_core,kvm
vfio_iommu_type1       45056  0
vfio                   57344  3 vfio_pci_core,vfio_iommu_type1,vfio_pci
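
Loaded modules do not by themselves show which driver has claimed which device. A small Python sketch that reads the driver symlink in sysfs for a given PCI address; bound_driver is a hypothetical helper, and the addresses 0000:02:00.0 and 0000:02:00.1 are an assumption based on the 4070 sitting on bus 02 in domain 0000:

```python
import os


def bound_driver(address, root="/sys/bus/pci/devices"):
    """Return the name of the kernel driver bound to a PCI device, or None."""
    link = os.path.join(root, address, "driver")
    if not os.path.islink(link):
        return None  # no driver bound, or no such device
    # The symlink points at the driver's directory; its last component is the name
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # Assumed addresses for the 4070 and its audio function
    for address in ("0000:02:00.0", "0000:02:00.1"):
        print(address, bound_driver(address))
```

Running the same check against the SSD's controller address would show whether it still has its normal storage driver once the vfio-pci.ids parameter is active.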

System
CPU: 11900K
Motherboard: EVGA Z590 FTW WIFI (latest BIOS)

GPUs:
3090 Primary
4070 Secondary pass-through

Do you have any idea what might be happening?
I am starting to think this is a motherboard hardware/firmware problem.

IOMMU groupings, maybe? Same root port? I may have the exact terminology wrong.
I only really read the title, btw.

probably

Thank you, root ports are new info to me.
The only devices that share a root port are the functions on each GPU:

           +-01.0-[01]--+-00.0  NVIDIA Corporation GA102 [GeForce RTX 3090]
           |            \-00.1  NVIDIA Corporation GA102 High Definition Audio Controller
           +-01.1-[02]--+-00.0  NVIDIA Corporation AD104 [GeForce RTX 4070]
           |            \-00.1  NVIDIA Corporation Device 22bc

via lspci -tv

I’ll contact EVGA for support. I think the problem is with hardware.

I think… IOMMU groups can contain multiple root ports.
Like I said though, I may be using very wrong terminology.
I’m not good at remembering exact terms, which is why I struggle to do anything with programming.


Did you check whether the SSD was passed through? Maybe with minimal work it would also show up in the VM.

No, the SSD in question is the Linux drive; I am not passing it through. I am only trying to pass through the 4070.

I understand that much.

This looks suspicious.


Not properly enumerated, and the SSD is next in line, so maybe something about the process messes it up?
IDK; poke around in that direction.

:thinking:
Not sure what to think of that, but yeah… that looks wrong.