I’m getting unexpected kernel panics when booting a Windows VM with GPU passthrough. The most recent one seems to give a meaningful reason: vfio gets stuck waiting for the Nvidia card to become available.
I’ve had Xorg blocking the Nvidia card before, but that never caused the vfio driver to get stuck and trigger a kernel panic. Has anyone seen anything like this before? Journal output from the most recent attempt:
Jul 13 01:03:26 host3 kernel: vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=io+mem:owns=none
Jul 13 01:03:26 host3 kernel: xhci_hcd 0000:01:00.2: remove, state 4
Jul 13 01:03:26 host3 kernel: usb usb4: USB disconnect, device number 1
Jul 13 01:03:26 host3 kernel: xhci_hcd 0000:01:00.2: USB bus 4 deregistered
Jul 13 01:03:26 host3 kernel: xhci_hcd 0000:01:00.2: remove, state 4
Jul 13 01:03:26 host3 kernel: usb usb3: USB disconnect, device number 1
Jul 13 01:03:26 host3 kernel: xhci_hcd 0000:01:00.2: USB bus 3 deregistered
Jul 13 01:03:26 host3 libvirtd[52416]: libvirt version: 9.5.0
Jul 13 01:03:26 host3 libvirtd[52416]: hostname: host3
Jul 13 01:03:26 host3 libvirtd[52416]: Domain id=1 name='win10-off' uuid=aab5554f-8df9-4344-b691-776796a3ab04 is tainted: custom-argv
Jul 13 01:03:26 host3 systemd-machined[922]: New machine qemu-1-win10-off.
Jul 13 01:03:26 host3 systemd[1]: Started Virtual Machine qemu-1-win10-off.
Jul 13 01:03:26 host3 qemu-system-x86_64[52589]: xoauth2_scope is not set
Jul 13 01:03:34 host3 kernel: vfio-pci 0000:01:00.0: not ready 1023ms after resume; waiting
Jul 13 01:03:35 host3 kernel: vfio-pci 0000:01:00.0: not ready 2047ms after resume; waiting
Jul 13 01:03:37 host3 kernel: vfio-pci 0000:01:00.0: not ready 4095ms after resume; waiting
Jul 13 01:03:41 host3 kernel: vfio-pci 0000:01:00.0: not ready 8191ms after resume; waiting
Jul 13 01:03:50 host3 kernel: vfio-pci 0000:01:00.0: not ready 16383ms after resume; waiting
Jul 13 01:03:59 host3 libvirtd[52416]: Cannot start job (query, none, none) in API remoteDispatchConnectGetAllDomainStats for domain win10-off; current job is (async nested, none, start) owned by (52423 remoteDispatchDomainCreate, 0 <null>>
Jul 13 01:03:59 host3 libvirtd[52416]: Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainCreate)
Jul 13 01:04:03 host3 systemd[1233]: Started scdaemon-reload-to-clear-cache.service.
Jul 13 01:04:07 host3 kernel: vfio-pci 0000:01:00.0: not ready 32767ms after resume; waiting
Jul 13 01:04:19 host3 kernel: i915 0000:00:02.0: [drm] *ERROR* Atomic update failure on pipe A (start=465751 end=465752) time 1933 us, min 1052, max 1079, scanline start 907, end 121
Jul 13 01:04:29 host3 libvirtd[52416]: Cannot start job (query, none, none) in API remoteDispatchConnectGetAllDomainStats for domain win10-off; current job is (async nested, none, start) owned by (52423 remoteDispatchDomainCreate, 0 <null>>
Jul 13 01:04:29 host3 libvirtd[52416]: Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainCreate)
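
In case anyone wants to compare against their own logs, here is a minimal Python sketch that pulls the vfio-pci "not ready" waits out of journal text like the output above. The helper name `vfio_waits` is made up for illustration, the regex only assumes the message format seen in these lines, and the sample is a few lines copied from the log rather than a full capture.

```python
import re

# Matches kernel messages of the form seen in the log above, e.g.
#   "vfio-pci 0000:01:00.0: not ready 1023ms after resume; waiting"
# The pattern is an assumption based only on those lines.
WAIT_RE = re.compile(
    r"vfio-pci (?P<dev>[0-9a-f:.]+): not ready (?P<ms>\d+)ms after (?P<event>[^;]+); waiting"
)


def vfio_waits(journal_text):
    """Return a (device, milliseconds) tuple for each 'not ready' line."""
    return [
        (m.group("dev"), int(m.group("ms")))
        for m in WAIT_RE.finditer(journal_text)
    ]


# A few lines copied from the journal output in this post.
sample = """\
Jul 13 01:03:34 host3 kernel: vfio-pci 0000:01:00.0: not ready 1023ms after resume; waiting
Jul 13 01:03:35 host3 kernel: vfio-pci 0000:01:00.0: not ready 2047ms after resume; waiting
Jul 13 01:03:59 host3 libvirtd[52416]: Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainCreate)
Jul 13 01:04:07 host3 kernel: vfio-pci 0000:01:00.0: not ready 32767ms after resume; waiting
"""

waits = vfio_waits(sample)
for dev, ms in waits:
    print(f"{dev}: still not ready after {ms / 1000:.1f}s")
```

Feeding it the saved kernel journal (for example the text from `journalctl -k`) instead of `sample` should list every wait, which makes it easy to see that the delay keeps doubling and the card never comes back.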