I’m on Fedora 39 and this dracut configuration is breaking my host nvidia drivers:
add_drivers+=" vfio vfio_iommu_type1 vfio_pci "
When this is added to dracut and the initramfs is rebuilt, every time I restart the computer I get an error on startup that says “nvidia kernel module missing, falling back to nouveau”. It doesn’t actually fall back to nouveau, since I have it blacklisted; the GPU ends up bound to vfio whenever I check with lspci.
Note that vfio works completely fine. HOWEVER, when I try to remove the vfio drivers and add my nvidia drivers back in for gaming on the host, it appears to work, in that I get no errors from my commands, but I cannot use the GPU for any GPU-accelerated tasks whatsoever.
Quick sanity check: do you use the nvidia akmod drivers, and is the nvidia kernel module actually built and available? This might not be related to dracut at all.
nvidia kernel module missing, falling back to nouveau
On the surface, this looks like “I have problem A, but I’m trying to solve unrelated B”.
Kernel updates have done this to me on Fedora a few times. I think it was a missing matching kernel-devel-X.Y.Z package that led to a silently failing akmod build for the newer kernel.
There were also issues with the akmod build for certain driver versions and kernel versions. Check that too.
Booting next from the old working kernel, not the newer one, led to exactly the same scenario.
However, I don’t use IOMMU and similar, so if it actually is related, I have no other input.
TLDR:
If it says the module is missing, check that first. It can happen.
Check whether older kernels have the same issue with the missing module.
Do a distro-sync and force-install all kernel packages.
Re-enable nouveau for debugging, or use the integrated GPU if you can.
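For the "older kernels" check, something like the following could list every installed kernel and say whether an nvidia module file exists for it. This is only a sketch: the function name list_nvidia_modules is made up for illustration, and it assumes the module file is called nvidia.ko, possibly with a compression suffix such as .xz, somewhere under the kernel's module directory.

```shell
#!/usr/bin/env bash
# Sketch: report, per installed kernel, whether an nvidia module file exists.
# Assumption: the module file is named nvidia.ko (optionally compressed,
# e.g. nvidia.ko.xz) and lives somewhere below <modules_root>/<kernel version>/.
# The optional first argument only exists so the function can be tried on a
# test tree; by default it looks in /lib/modules.
list_nvidia_modules() {
    local modules_root="${1:-/lib/modules}"
    local kdir kver
    for kdir in "$modules_root"/*/; do
        # Skip the unexpanded glob when the directory is empty or absent.
        [ -d "$kdir" ] || continue
        kver="$(basename "$kdir")"
        if [ -n "$(find "$kdir" -name 'nvidia.ko*' -print 2>/dev/null)" ]; then
            echo "$kver: nvidia module present"
        else
            echo "$kver: nvidia module MISSING"
        fi
    done
}
```

Calling list_nvidia_modules with no argument checks the real /lib/modules. A kernel reported as MISSING is one where the akmod build probably never completed, which is where the kernel-devel check above comes in.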
Absolutely. The first thing I did after installing was update everything, including the kernel, boot into the fresh kernel, install the proprietary nvidia drivers and let them finish building. I verified that the build had finished with modinfo -F version nvidia and saw that I got output.
Then I restarted, installed a GPU-accelerated game and successfully played it using my nvidia GPU. Then I updated my grub configuration for vfio and restarted; still no issues.
But as soon as I rebuild the initramfs, that dreaded error comes straight back and using my GPU on the host is impossible, as the drivers seem to be broken. Then I uninstall the dracut config, update the initramfs, and the error magically goes away and I’m back to gaming on my dGPU.
I believe it’s correlated, but I have no clue what to do from here.
Whoops, sorry, I somehow totally missed this kinda important part of your post:
Note that vfio works completely fine. HOWEVER, when I try to remove the vfio drivers and add my nvidia drivers back in for gaming on the host, it appears to work, in that I get no errors from my commands, but I cannot use the GPU for any GPU-accelerated tasks whatsoever.
I will let someone less bad at reading answer the rest.
Maybe it’s as simple as dracut not packaging the nvidia driver with your custom config. Did you try adding “nvidia” to the line of vfio* modules?
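As a rough illustration of that suggestion, the line from the first post with nvidia appended might look like this. It is untested, and the file name vfio.conf is only an assumption, since the original post doesn't say where the line lives.

```shell
# /etc/dracut.conf.d/vfio.conf
# (hypothetical file name; the original post does not name the file)
# Line from the first post, with the nvidia module appended as suggested.
add_drivers+=" vfio vfio_iommu_type1 vfio_pci nvidia "
```

After rebuilding with dracut -f, running lsinitrd | grep nvidia as root should show whether the module actually made it into the initramfs. Note that this only tests whether the driver gets packaged; it doesn't by itself decide which driver ends up bound to the card.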
You didn’t explain what you are trying to accomplish with the new dracut configuration (I can make an educated guess, but it would be better if you explained it). It’s possible that your config change causes dracut/Fedora to use the vfio driver with the NVIDIA GPU, leaving you with no graphics output at boot. If this is what you want, do you have a second GPU that can output? How do you ensure that vfio is leaving one GPU alone and that this is the correct (i)GPU?
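One way to answer that last question is to look at which kernel driver each GPU is bound to after boot. Here is a minimal sketch that filters lspci -nnk output down to the graphics devices; the function name gpu_drivers is made up, and it assumes the usual lspci layout where each device starts on an unindented line and its "Kernel driver in use:" line is indented below it.

```shell
#!/usr/bin/env bash
# Sketch: read `lspci -nnk` output on stdin and print "<slot> <driver>" for
# every VGA or 3D controller that has a kernel driver bound to it.
# Assumption: device header lines start at column 0 with the slot address,
# and detail lines (including "Kernel driver in use:") are indented.
gpu_drivers() {
    awk '
        /^[0-9a-f]/ {
            # New device header: remember the slot and whether it is a GPU.
            is_gpu = ($0 ~ /VGA compatible controller|3D controller/)
            slot = $1
        }
        is_gpu && /Kernel driver in use:/ {
            # Strip everything up to and including the label, keep the driver.
            sub(/^.*Kernel driver in use: */, "")
            print slot, $0
        }
    '
}
```

Used as lspci -nnk | gpu_drivers, this prints one line per GPU, so a card showing vfio-pci is reserved for passthrough and a card showing nvidia or i915 is still owned by the host. A GPU with no driver bound at all prints nothing.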