Nvidia kernel module issues with Fedora 28

I have a laptop with Intel integrated graphics and a rather weak Nvidia dGPU (NVS 5200M). A few updates ago I started having a problem where X would not start if Optimus was enabled and the Intel GPU was the primary GPU. I worked around this by toggling Optimus off and using the Nvidia dGPU as the primary GPU.

I was hoping a recent update of the Nvidia drivers (loaded from RPMFusion) would correct this problem, but the situation ended up even weirder. If I enable Optimus, the desktop loads but as it loads I see the message:

NVIDIA kernel module not found. Failing back to Nouveau.

And the desktop loads. I confirmed that the nouveau and i915 modules were loaded with lsmod. The weird thing is that nouveau is blacklisted, both in the kernel options and in /etc/modprobe.d/.
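A rough way to double-check the blacklisting described above is to look at the kernel command line itself. The sketch below assumes the two parameters normally used for this, `modprobe.blacklist=` and `rd.driver.blacklist=`; the helper name `has_blacklist` is made up for illustration. A blacklist file in /etc/modprobe.d/ may also not take effect early in boot until the initramfs has been regenerated, which is what the `lsinitrd` hint in the comments is for.

```shell
#!/bin/sh
# has_blacklist PARAM MODULE CMDLINE  (hypothetical helper)
# Succeeds if CMDLINE contains PARAM=<comma-separated list> that names MODULE.
has_blacklist() {
    param=$1
    module=$2
    for word in $3; do
        case $word in
            "$param"=*)
                list=${word#*=}
                case ",$list," in
                    *",$module,"*) return 0 ;;
                esac
                ;;
        esac
    done
    return 1
}

# Report what the running kernel was actually booted with.
cmdline=$(cat /proc/cmdline 2>/dev/null || true)
for p in modprobe.blacklist rd.driver.blacklist; do
    if has_blacklist "$p" nouveau "$cmdline"; then
        echo "$p: nouveau is listed"
    else
        echo "$p: nouveau is NOT listed"
    fi
done

# Further manual checks (not run here):
#   lsmod | grep -E 'nouveau|nvidia|i915'   # which drivers are loaded now
#   sudo lsinitrd | grep -i nouveau         # is nouveau packed into the initramfs?
```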

OTOH if I try to disable Optimus and boot with the Nvidia dGPU alone, I get no desktop at all. If I try 'modprobe nvidia.ko' I get a message that the kernel module can't be found in /usr/lib/(kernel version). Even if I hard link the nvidia modules into /usr/lib/(kernel version), modprobe still says it can't find the bloody thing. (Is there a command like ldconfig for kernel modules?) Even if I provide an absolute path to the kernel module files, modprobe still complains that it can't find them.

I also tried 'sudo dnf reinstall xorg-x11-drv-nvidia akmod-nvidia', which accomplished absolutely nothing.

This is not an urgent issue. The Nvidia dGPU is so weak that the Intel driver actually provides roughly equivalent performance. I'm just trying to wrap my head around the Fedora driver stack and understand what's going on.

To recap, my questions are:

  1. Why can't I load nvidia.ko and nvidia-drm.ko with modprobe manually?
  2. Why does modprobe complain that it can't find the *.ko files, even though they're right bloody there?
  3. Is there some kind of utility that needs to be run so that Fedora can find its kernel modules (like, say, ldconfig)?
  4. Why is nouveau loading even though it's blacklisted?
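For reference on questions 1 to 3, here is a minimal sketch of how the tools fit together. modprobe takes a module name (nvidia), not a file name (nvidia.ko) or a path, and resolves it through an index under /lib/modules/$(uname -r) that depmod rebuilds, so depmod is roughly the ldconfig analogue; insmod is the tool that takes a path. The helper names `module_name` and `print_load_steps` are made up for illustration, and the script only prints the commands rather than running them.

```shell
#!/bin/sh
# module_name FILE_OR_PATH  (hypothetical helper)
# Strip the directory and the .ko / .ko.xz / .ko.gz suffix to get the
# name that modprobe expects.
module_name() {
    name=$(basename "$1")
    name=${name%%.ko*}
    printf '%s\n' "$name"
}

# Print (do not execute) the usual sequence for loading a module by name.
print_load_steps() {
    mod=$(module_name "$1")
    echo "sudo depmod -a"        # rebuild modules.dep for the running kernel
    echo "sudo modprobe $mod"    # load by name, with dependencies
}

print_load_steps nvidia.ko
print_load_steps nvidia-drm.ko

# Manual checks worth running first (not run here):
#   find "/lib/modules/$(uname -r)" -name 'nvidia*.ko*'   # was a module built for THIS kernel?
#   modinfo nvidia                                        # does the module index know about it?
```

If the find command turns up nothing for the running kernel, the module was probably never built for it, and no amount of linking files into place will help modprobe.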

Fedora's kernel updates for 4.17.7 REALLY broke a lot of things. Downgrade to 4.17.3 for now, because even on the vanilla kernel, you can't update past 4.17.8 or Nvidia DKMS won't build.

The nouveau issue may be an issue with the new version of dracut. Downgrade dracut.

I'm actually on 4.17.9. Does the 4.17.7 advice still apply? I'll have to see if 4.17.3 is still in my GRUB settings...

I strongly recommend downgrading to 4.17.3 specifically, since a plain dnf downgrade goes all the way back to 4.16.3. If you're on Ryzen, you want the 4.17 optimizations.

Here's what I get with dnf downgrade:

[root@e6430 ~]# dnf downgrade kernel-4.17.3-200.fc28
Last metadata expiration check: 1:44:47 ago on Sat 28 Jul 2018 11:37:23 PM CDT.
No package kernel-4.17.3-200.fc28 available.
Error: No packages marked for downgrade.
[root@e6430 ~]# dnf downgrade kernel-4.17.3
Last metadata expiration check: 1:44:57 ago on Sat 28 Jul 2018 11:37:23 PM CDT.
No package kernel-4.17.3 available.
Error: No packages marked for downgrade.
[root@e6430 ~]# dnf downgrade kernel-4.17.3-200
Last metadata expiration check: 1:45:02 ago on Sat 28 Jul 2018 11:37:23 PM CDT.
No package kernel-4.17.3-200 available.
Error: No packages marked for downgrade.

So I guess I'm doing it wrong...?

Yup. You have to manually download the RPMs for that kernel, then launch DNF from the dedicated folder you put the kernel RPMs in.

What do I need besides 'kernel-4.17.3-200.fc28.rpm'? I assume I also need 'kernel-modules', but do I need anything else?
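A sketch of what the manual route might look like, under a few assumptions: the machine is x86_64, the build can still be fetched from somewhere such as Fedora's Koji build system, and the files are installed side by side with the current kernel rather than replacing it. On Fedora 28 the kernel package is a small metapackage that pulls in kernel-core and kernel-modules, and akmods needs the matching kernel-devel to build the nvidia module. The likely reason the dnf downgrade attempts failed is that the repos only carry the release kernel and the newest update, not the versions in between. The helper name `kernel_rpms` is made up for illustration.

```shell
#!/bin/sh
# kernel_rpms VERSION-RELEASE ARCH  (hypothetical helper)
# List the RPM file names to collect in one directory.
kernel_rpms() {
    ver=$1
    arch=$2
    for pkg in kernel kernel-core kernel-modules kernel-devel; do
        echo "$pkg-$ver.$arch.rpm"
    done
}

kernel_rpms 4.17.3-200.fc28 x86_64

# Possible way to fetch and install them (not run here; assumes the koji
# client is installed and the build has not been garbage-collected):
#   koji download-build --arch=x86_64 kernel-4.17.3-200.fc28
#   sudo dnf install ./kernel-*.rpm
```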

OK, I downgraded the kernel. Couldn't find a 4.17.3 RPM anywhere, so I downgraded to 4.16.3. After rebooting I ran

sudo dnf reinstall xorg-x11-drv-nvidia akmod-nvidia

again. And I still get

NVIDIA kernel module not found. Failing back to Nouveau.

Any other ideas?

Downgrade dracut, regenerate your initial ramdisk, and reinstall your nvidia driver by purging it and reinstalling.
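Spelled out, those steps could look roughly like the sketch below. It is a dry run by default and only prints each command; the package names are the RPMFusion ones mentioned earlier in the thread, and the ordering (rebuild the module with akmods before regenerating the initramfs) is an assumption rather than something stated above. The `run` and `rebuild_nvidia` helpers are made up for illustration.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rebuild_nvidia() {
    run sudo dnf downgrade dracut
    # Purge the driver packages, then install them fresh.
    run sudo dnf remove 'xorg-x11-drv-nvidia*' akmod-nvidia 'kmod-nvidia*'
    run sudo dnf install akmod-nvidia xorg-x11-drv-nvidia
    # Force a rebuild of the kernel module for the running kernel.
    run sudo akmods --force
    # Regenerate the initramfs for the running kernel.
    run sudo dracut --force
}

rebuild_nvidia
```

After a reboot, 'modinfo -F version nvidia' is a quick way to see whether a module actually got built for the kernel that is running.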

Still no worky. As I mentioned earlier, the performance improvement isn't worth the trouble, so I just completely removed nvidia and blacklisted nouveau. And the Intel driver performs quite a bit better now. Thanks anyway.

Ah, I didn't read that you're on a laptop. That's different with Nvidia Optimus, which has a completely different set of graphics driver installation instructions.

Have you tried reinstalling the Nvidia driver with the iGPU disabled? And can you even do that from your laptop's BIOS?

If I disable Optimus in UEFI, the Nvidia dGPU becomes the primary and sole GPU. But... it's relatively power hungry, and I'd like to use the laptop on battery for more than an hour. I do have an extended 9-cell battery, but the laptop has a Fermi GPU...

Wait... Fermi is compatible with Nouveau. You may not get top of the line performance, but using Nouveau on Fermi is actually a good option, since Nvidia has ended support for Fermi with their official driver stack.

Unfortunately it's far worse than "not top of the line performance." In this case, the Intel driver completely smokes the nouveau driver. No point in using the dGPU unless I can use the nvidia driver. Windows is a different story, though.

What driver repo were you using? rpmfusion or negativo17?

RPMFusion. I did try disabling SELinux, which didn't seem to help.

You don't need to disable SELinux... I don't know what would suggest that.

I would suggest trying the negativo17 repo for Nvidia drivers. I've found them better packaged and easier to deal with in the past, and they provide a dkms package, which I think works better than the kmod versions.

If you try it, make sure your stuff's all up to date and that you've undone any blacklisting or holding back of packages you did here.

This is why I disabled SELinux:
https://ask.fedoraproject.org/en/question/121596/fedora-28-failed-to-boot-after-system-update-with-installed-nvidia-driver-from-rpm-fusion-nonfree-nvidia-driver/

But as mentioned, it didn't change anything. I'll take a look at negativo17 later tonight.

The post mentions that the bug they had was fixed and the fix was pushed to stable. So as long as you're up to date, SELinux should be fine.