I had a VM with a passed-through GPU working.
A few days ago I updated the BIOS on the host and, since then, the VM no longer boots: not even the VM splash screen or the BIOS shows up, only a black screen. (I saved the BIOS settings and restored them after the update.)
I checked the IOMMU groups and the kernel drivers in use, and everything seems fine.
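For anyone wanting to repeat the IOMMU group and driver check, here is a minimal sketch that reads the standard sysfs layout under /sys/kernel/iommu_groups. The function name `list_iommu_groups` and the `sysfs_root` parameter are made up for illustration; the parameter only exists so the code can be pointed at a test directory.

```python
import os
from pathlib import Path


def list_iommu_groups(sysfs_root="/sys"):
    """Return {group_number: [(pci_address, driver_or_None), ...]}.

    Assumes the usual Linux layout:
    <sysfs_root>/kernel/iommu_groups/<N>/devices/<pci_address>/driver
    """
    groups = {}
    base = Path(sysfs_root) / "kernel" / "iommu_groups"
    if not base.is_dir():
        # IOMMU is disabled, or this is not a Linux host.
        return groups
    for group_dir in base.iterdir():
        if not group_dir.name.isdigit():
            continue
        devices = []
        devices_dir = group_dir / "devices"
        if devices_dir.is_dir():
            for dev in sorted(devices_dir.iterdir()):
                driver_link = dev / "driver"
                driver = None
                # The "driver" entry is a symlink whose last path
                # component is the name of the bound driver.
                if os.path.islink(driver_link):
                    driver = os.path.basename(os.readlink(driver_link))
                devices.append((dev.name, driver))
        groups[int(group_dir.name)] = devices
    return groups


if __name__ == "__main__":
    for number, devices in sorted(list_iommu_groups().items()):
        for address, driver in devices:
            print(f"group {number}: {address} driver={driver}")
```

For passthrough, the GPU (and its audio function) should show `vfio-pci` as the driver, and ideally nothing else that the host needs should share its group.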
If I remove the GPU from the VM, then the VM starts normally.
I can't see anything in the QEMU logs.
I took the GPU out of vfio-pci.ids on the GRUB kernel command line, and it seems to be working on the host. I haven’t tried it in Windows.
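To double-check which IDs the running kernel was actually booted with, something like the sketch below can parse /proc/cmdline. The helper names are hypothetical, and the IDs used in the usage note are example values, not the ones from this system.

```python
from pathlib import Path


def vfio_pci_ids(cmdline):
    """Return the IDs listed in vfio-pci.ids=... on a kernel command line."""
    ids = []
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # The kernel treats '-' and '_' in parameter names as equivalent.
        if sep and key.replace("-", "_") == "vfio_pci.ids":
            ids.extend(v for v in value.split(",") if v)
    return ids


def running_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel ('' if unavailable)."""
    p = Path(path)
    return p.read_text() if p.is_file() else ""


if __name__ == "__main__":
    print(vfio_pci_ids(running_cmdline()))
```

For example, `vfio_pci_ids("quiet vfio-pci.ids=1002:67df,1002:aaf0")` gives a list with the two example IDs, and an empty list means no device is being claimed by vfio-pci through this parameter.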
I tried updating the kernel, but nothing changed (I am on 5.14 now).
I have no idea how to go about finding the problem.
With the new BIOS version, either some setting in your BIOS was reset or the PCIe IDs changed. The former would require you to check all options in the BIOS, especially things like IOMMU being enabled, ACS override being enabled, and similar options. The latter would require you to remove the old PCI entries from the VM and add them again after making sure you have the correct IDs, preferably with virt-manager.
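One way to check the second possibility without clicking through virt-manager is to compare the PCI addresses in the libvirt domain XML against what the host currently exposes. This is only a rough sketch: it assumes the VM is managed by libvirt, that the XML comes from `virsh dumpxml`, and the function names and the `sysfs_root` parameter are invented for the example.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def hostdev_pci_addresses(domain_xml):
    """Return host PCI addresses (e.g. '0000:0a:00.0') of passed-through devices."""
    root = ET.fromstring(domain_xml)
    addresses = []
    # Only the <address> inside <source> names the host device; the other
    # <address> element in a <hostdev> is the slot seen by the guest.
    for addr in root.iterfind(".//hostdev[@type='pci']/source/address"):
        domain = int(addr.get("domain", "0x0000"), 16)
        bus = int(addr.get("bus"), 16)
        slot = int(addr.get("slot"), 16)
        function = int(addr.get("function"), 16)
        addresses.append(f"{domain:04x}:{bus:02x}:{slot:02x}.{function:x}")
    return addresses


def missing_on_host(addresses, sysfs_root="/sys"):
    """Return the addresses that have no entry under bus/pci/devices."""
    devices = Path(sysfs_root) / "bus" / "pci" / "devices"
    return [a for a in addresses if not (devices / a).exists()]
```

Feeding it the output of `virsh dumpxml` for the VM and passing the result to `missing_on_host` should return an empty list if the addresses did not move. Note that this only shows that a device exists at each address, not that it is still the same device.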
I have an NVMe SSD and a PCI USB hub passed to the VM besides the GPU.
I checked the addresses of all the passed-through devices (they shouldn’t have changed anyway, as they are bound to the hardware) and they are all still correct. There should also be an error at VM startup if there were any issues here.
Also, after only removing the GPU, the VM boots correctly.
Are there any configs/logs I should post? I am at a total loss right now.
If your BIOS/UEFI has a setting for VGA boot order, it has probably been reset to its default by the update. Make sure the primary VGA is the one the host uses, not the one you are passing through to your VM.
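From the running host you can also see which card the firmware picked as primary: VGA-class PCI devices have a `boot_vga` attribute in sysfs that reads 1 for the boot display device. A small sketch, with a hypothetical function name and a `sysfs_root` parameter that is only there for testing:

```python
from pathlib import Path


def boot_vga_devices(sysfs_root="/sys"):
    """Return {pci_address: is_boot_vga} for devices exposing boot_vga."""
    result = {}
    base = Path(sysfs_root) / "bus" / "pci" / "devices"
    if not base.is_dir():
        return result
    for dev in sorted(base.iterdir()):
        flag = dev / "boot_vga"
        # Only VGA-class devices have this attribute.
        if flag.is_file():
            result[dev.name] = flag.read_text().strip() == "1"
    return result


if __name__ == "__main__":
    for address, primary in boot_vga_devices().items():
        print(address, "primary" if primary else "secondary")
```

If the card you pass through shows up as primary here, the VGA boot order is worth changing in the BIOS/UEFI.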
Managed to fix it.
Apparently the vendor-reset kernel module was not loaded.
I had it working on 5.9.0.
The GPU was still working perfectly on 5.12.0 (I guess without the vendor-reset module loaded) until I updated the BIOS, when everything went down the drain.
I am now running kernel 5.14.0 with the vendor-reset module loaded, and the VM is working normally.
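In case it helps someone hitting the same thing, here is a minimal sketch for checking whether a module is loaded by reading /proc/modules. The helper names are hypothetical. Module names are listed there with underscores, so the check normalizes dashes and `vendor-reset` is matched as `vendor_reset`.

```python
from pathlib import Path


def loaded_modules(proc_modules="/proc/modules"):
    """Return the set of module names listed in /proc/modules."""
    path = Path(proc_modules)
    if not path.is_file():
        return set()
    # The first whitespace-separated field on each line is the module name.
    return {line.split()[0] for line in path.read_text().splitlines() if line.strip()}


def is_module_loaded(name, proc_modules="/proc/modules"):
    """Check for a module, treating '-' and '_' in the name as equivalent."""
    return name.replace("-", "_") in loaded_modules(proc_modules)


if __name__ == "__main__":
    print("vendor-reset loaded:", is_module_loaded("vendor-reset"))
```

If this prints False after a reboot, the module is not being loaded automatically at boot.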