GPU passthrough works, but host graphics card drivers do not load, any suggestions?

My Windows VM can use the GTX 670 I am passing through to it, and games work just fine, but the host's dedicated GPU never gets a driver: neither the nvidiafb nor the nouveau driver loads for it.

Does anyone know how to fix this? I have tested this on several fresh installs of Linux Mint 18.1 (based on Ubuntu 16.04), with the same result each time. I upgraded to kernel 4.15.0-36-generic to support my hardware.

Oddly, I can access my desktop environment on Linux, and things look just fine, but I want to be able to play games in either OS, so I definitely need to get the nvidia driver loaded for my host GPU. I’ve been trying to get this working all week.

lspci output for my Linux graphics card (GT 220, which I want to use for Linux native games):

# lspci -nnk -d 10de:0a20
65:00.0 VGA compatible controller [0300]: NVIDIA Corporation GT216 [GeForce GT 220] [10de:0a20] (rev a2)
	Subsystem: eVga.com. Corp. GT216 [GeForce GT 220] [3842:1226]
	Kernel modules: nvidiafb, nouveau

Note that there is no “Kernel driver in use:” line, as there would normally be. I think that this is the core of the problem.
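For anyone who wants to check the same thing on their own machine, here is a minimal sketch that scans `lspci -nnk` output and prints the slot of every device that has no "Kernel driver in use:" line. The function name `unbound_devices` is made up for this example, and the sketch assumes the usual lspci layout where each device block starts with its slot address.

```shell
# Hypothetical helper: print the PCI slot of every device in `lspci -nnk`
# output that has no "Kernel driver in use:" line. Reads the output on stdin.
unbound_devices() {
    awk '
        /^[0-9a-f:]+\.[0-7] / {       # a slot address starts a new device block
            if (slot != "" && !bound) print slot
            slot = $1; bound = 0
            next
        }
        /Kernel driver in use:/ { bound = 1 }
        END { if (slot != "" && !bound) print slot }
    '
}

# Usage on a live system (needs pciutils), limited to NVIDIA devices:
#   lspci -nnk -d 10de: | unbound_devices
```

On the system described here, this should list 65:00.0 and nothing from the 17:00.x pair, since those are claimed by vfio-pci.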

lspci output for the GTX 670 for the Windows VM, which works just fine with qemu + kvm:

# lspci -nnk -d 10de:1189; lspci -nnk -d 10de:0e0a
17:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK104 [GeForce GTX 670] [10de:1189] (rev a1)
	Subsystem: eVga.com. Corp. GK104 [GeForce GTX 670] [3842:2678]
	Kernel driver in use: vfio-pci
	Kernel modules: nvidiafb, nouveau
17:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)
	Subsystem: eVga.com. Corp. GK104 HDMI Audio Controller [3842:2678]
	Kernel driver in use: vfio-pci
	Kernel modules: snd_hda_intel

The problem can also be seen in lshw (note that the second display is “UNCLAIMED”):

# lshw -c display
  *-display               
       description: VGA compatible controller
       product: GK104 [GeForce GTX 670]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:17:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress vga_controller cap_list rom
       configuration: driver=vfio-pci latency=0
       resources: iomemory:38000-37fff iomemory:38000-37fff irq:11 memory:a4000000-a4ffffff memory:380070000000-380077ffffff memory:380078000000-380079ffffff ioport:7000(size=128) memory:a5000000-a507ffff
  *-display UNCLAIMED
       description: VGA compatible controller
       product: GT216 [GeForce GT 220]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:65:00.0
       version: a2
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress vga_controller bus_master cap_list
       configuration: latency=0
       resources: memory:d7000000-d7ffffff memory:c0000000-cfffffff memory:d0000000-d1ffffff ioport:b000(size=128) memory:c0000-dffff

And the same thing in inxi (note the “FAILED: nvidia,nouveau” part):

# inxi -Fxz | grep -A 6 Graphics
Graphics:  Card-1: NVIDIA GK104 [GeForce GTX 670] bus-ID: 17:00.0
           Card-2: NVIDIA GT216 [GeForce GT 220] bus-ID: 65:00.0
           Display Server: X.org 1.18.4 drivers: vesa (unloaded: fbdev) FAILED: nvidia,nouveau
           tty size: 209x55 Advanced Data: N/A for root
Audio:     Card-1 Intel Device a2f0 driver: snd_hda_intel bus-ID: 00:1f.3 Sound: ALSA v: k4.15.0-36-generic
           Card-2 NVIDIA GK104 HDMI Audio Controller driver: vfio-pci bus-ID: 17:00.1
           Card-3 NVIDIA GT216 HDMI Audio Controller driver: snd_hda_intel bus-ID: 65:00.1

I based my configuration on this post by wendell, adapting it to nvidia by replacing “amdgpu” with “nvidiafb” or “nvidia”; neither variant fixed the problem: Ubuntu 17.04 -- VFIO PCIe Passthrough & Kernel Update (4.14-rc1)

Here are the key files of interest:

From /etc/default/grub:

#the problem does not change whether nomodeset is there or not
#also tried without "iommu=1", same problem
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nomodeset iommu=1 intel_iommu=on vfio-pci.ids=10de:1189,10de:0e0a"
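As a sanity check, the options above can be compared against what the running kernel actually received in /proc/cmdline (on Ubuntu-based systems the grub file only takes effect after running update-grub and rebooting). This is a rough sketch; `cmdline_has` is a made-up helper name, and it matches whole space-separated parameters only.

```shell
# Hypothetical helper: succeeds if the kernel command line (first argument)
# contains the exact space-separated parameter given as the second argument.
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the running system:
#   for p in intel_iommu=on vfio-pci.ids=10de:1189,10de:0e0a; do
#       cmdline_has "$(cat /proc/cmdline)" "$p" && echo "present: $p" || echo "missing: $p"
#   done
```

Matching whole parameters avoids a false hit such as "iommu=1" being found inside "intel_iommu=1".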

/etc/initramfs-tools/modules:

softdep nvidia pre: vfio vfio_pci 
# tried the above live with "nvidiafb" as well
vfio
vfio_iommu_type1
vfio_virqfd
options vfio_pci ids=10de:1189,10de:0e0a
vfio_pci ids=10de:1189,10de:0e0a
vfio_pci
# tried this with "nvidiafb" as well
nvidia

/etc/modules:

vfio
vfio_iommu_type1
vfio_pci ids=10de:1189,10de:0e0a

/etc/modprobe.d/vfio_pci.conf

options vfio_pci ids=10de:1189,10de:0e0a

/etc/modprobe.d/nvidia.conf

softdep nvidia pre: vfio vfio_pci
#tried this with nvidiafb, and with this file empty, too

Again, my goal here is really just to get the GT 220 that is Linux only to have the nvidia proprietary driver loaded, so I can play games, but also, so I can run the looking glass client, which needs some graphics driver loaded on the host GPU.
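Since lspci only lists candidate modules and not whether they are actually loaded, a small check against /proc/modules can help narrow things down. The sketch below is only an illustration; `module_loaded` is a made-up helper name, and the optional second argument exists purely so the function can be pointed at a saved copy of the listing.

```shell
# Hypothetical helper: succeeds if the named module appears in a
# /proc/modules-style listing (module name is the first field of each line).
# The second argument is optional and defaults to /proc/modules.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Usage on the running system:
#   for m in nvidia nouveau nvidiafb vfio_pci; do
#       module_loaded "$m" && echo "$m: loaded" || echo "$m: not loaded"
#   done
```

A module can be loaded without being bound to a particular card, so this complements the "Kernel driver in use:" line rather than replacing it.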

In this case, I figure that we might as well forget about the qemu + kvm aspect, since the problem happens when booting the system.

One other thing: the “driver-manager” GUI program shows my GT 220 as using the nvidia proprietary driver, and it shows the same for the vfio’d GTX 670, which definitely does not have the nvidia driver loaded, so “driver-manager” clearly has problems. Oddly, it also shows both GPUs on only 1 out of every 3 launches, so take what it says with a grain of salt.

I’m extremely comfortable with the command line, if that matters. Any help would be immeasurably appreciated!

I was able to fix this by swapping out my GT 220 for a GTX 465, and using “nvidiafb”. A GT 710 did not work. I’m guessing that nvidia doesn’t care about these older GT series cards and doesn’t actually support them, despite them being listed as supported on their website for the driver versions I had installed.