Kernel panics when unbinding nvidia module and binding vfio

Has anyone experienced a similar issue with the nvidia-open drivers? Unbinding the devices in the GPU’s IOMMU group and handing them to vfio-pci fails, and the kernel crashes after a few seconds:

for i in {0..3}; do echo 0000:01:00.$i | sudo tee /sys/bus/pci/devices/0000:01:00.$i/driver/unbind; done
for i in 1ed1 10f8 1ad8 1ad9; do echo "10de $i" | sudo tee /sys/bus/pci/drivers/vfio-pci/new_id; done

journalctl -e -k -b -1:

vfio-pci 0000:01:00.0: not ready 1023ms after resume; waiting
vfio-pci 0000:01:00.0: not ready 2047ms after resume; waiting
vfio-pci 0000:01:00.0: not ready 4095ms after resume; waiting
vfio-pci 0000:01:00.0: not ready 8191ms after resume; waiting
vfio-pci 0000:01:00.0: not ready 16383ms after resume; waiting
crash...

It doesn’t happen every time, only after the computer has been in use for a day or so. I’m a newbie, so if anyone knows a more reliable way to bind/unbind drivers, please let me know.
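
For the question above, one commonly used alternative to writing vendor/device IDs into new_id is the per-device driver_override attribute in sysfs, followed by a write to drivers_probe. The following is a minimal sketch under that assumption, not a confirmed fix for this crash. The function name rebind_to_vfio and the SYSFS_ROOT variable are made up for illustration; SYSFS_ROOT exists only so the function can be pointed at a scratch directory for a dry run. It has to run as root and assumes the vfio-pci module is already loaded.

```shell
# Sketch: hand PCI functions to vfio-pci via driver_override instead of new_id.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

rebind_to_vfio() {
    local addr dev
    for addr in "$@"; do
        dev="$SYSFS_ROOT/bus/pci/devices/$addr"
        # Unbind only if a driver is currently attached to this function.
        if [ -e "$dev/driver/unbind" ]; then
            echo "$addr" > "$dev/driver/unbind"
        fi
        # Restrict this one device to vfio-pci, then ask the PCI core to probe it.
        echo vfio-pci > "$dev/driver_override"
        echo "$addr" > "$SYSFS_ROOT/bus/pci/drivers_probe"
    done
}
```

Assumed usage, from a root shell: rebind_to_vfio 0000:01:00.0 0000:01:00.1 0000:01:00.2 0000:01:00.3. Unlike new_id, the override only affects the listed devices, and writing an empty line to driver_override should clear it again.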

This is an Optimus laptop, and Xorg is configured to use only the iGPU.

nvidia-open 535.86.05-2

inxi -G:

Graphics:
  Device-1: Intel CometLake-H GT2 [UHD Graphics] driver: i915 v: kernel
  Device-2: NVIDIA TU104BM [GeForce RTX 2070 SUPER Mobile / Max-Q]
    driver: nvidia v: 535.86.05
  Device-3: Syntek Integrated Camera driver: uvcvideo type: USB
  Display: x11 server: X.Org v: 21.1.8 with: Xwayland v: 23.1.2 driver: X:
    loaded: modesetting dri: iris gpu: evdi,i915 resolution: 1: 1920x1080~60Hz
    2: 1920x1080~240Hz
  API: OpenGL v: 4.6 Mesa 23.1.3 renderer: Mesa Intel UHD Graphics (CML GT2)

I may have found the reason why the NVIDIA card wasn’t available: applications were still using it via EGL instead of the iGPU. Removing /usr/share/glvnd/egl_vendor.d/10_nvidia.json may solve it.

I say “may” because this issue only happened after a day or so, so I will only be able to confirm it tomorrow or the day after.
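
One way to check the suspicion above before unbinding is to look for processes that still hold a /dev/nvidia* device node open. This is a rough sketch that assumes a process using the NVIDIA EGL vendor library keeps such a node open. The function name list_nvidia_users and the PROC_ROOT variable are made up for illustration, and PROC_ROOT is only there so the scan can be tried against a scratch directory. Run it as root to see processes owned by other users.

```shell
# Sketch: list processes that have a /dev/nvidia* node open.
# PROC_ROOT is a hypothetical override for testing; it defaults to /proc.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Prints one "PID COMMAND" line per matching process.
list_nvidia_users() {
    local piddir fd target pid
    for piddir in "$PROC_ROOT"/[0-9]*; do
        [ -d "$piddir/fd" ] || continue
        for fd in "$piddir"/fd/*; do
            # Each fd entry is a symlink to the file the process has open.
            target="$(readlink "$fd" 2>/dev/null)" || continue
            case "$target" in
                /dev/nvidia*)
                    pid="${piddir##*/}"
                    echo "$pid $(cat "$piddir/comm" 2>/dev/null)"
                    break   # one line per process is enough
                    ;;
            esac
        done
    done
}
```

If list_nvidia_users prints nothing, no process visible to the caller has the card open; tools such as lsof or fuser pointed at /dev/nvidia* should give a similar answer.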

As suggested by another user, it may also be possible to set the __EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/50_mesa.json environment variable instead (but I couldn’t find this documented anywhere).
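
A small sketch of how that variable could be applied per command rather than by deleting the NVIDIA JSON file. The wrapper name with_mesa_egl and the MESA_EGL_JSON variable are made up for illustration; the default path is the one quoted above and may differ on other distributions. Whether this actually keeps applications off the NVIDIA card is still unconfirmed.

```shell
# Sketch: run a command with EGL vendor selection limited to Mesa.
# MESA_EGL_JSON is a hypothetical knob; the default path comes from the post.
MESA_EGL_JSON="${MESA_EGL_JSON:-/usr/share/glvnd/egl_vendor.d/50_mesa.json}"

with_mesa_egl() {
    # Refuse to run if the vendor file is missing or unreadable.
    if [ ! -r "$MESA_EGL_JSON" ]; then
        echo "missing $MESA_EGL_JSON" >&2
        return 1
    fi
    # The variable is set only in the environment of the launched command.
    __EGL_VENDOR_LIBRARY_FILENAMES="$MESA_EGL_JSON" "$@"
}
```

Assumed usage: with_mesa_egl followed by the application and its arguments. To apply it to the whole session instead, the same variable could be exported from a login script.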
