Nvidia kernel module issues with Fedora 28

I have a laptop with Intel integrated graphics and a rather weak Nvidia dGPU (NVS 5200M). A few updates ago I started having a problem where X would not start if Optimus was enabled and the Intel GPU was the primary GPU. I worked around this by toggling Optimus off and using the Nvidia as the primary GPU.

I was hoping a recent update of the Nvidia drivers (loaded from RPMFusion) would correct this problem, but the situation ended up even weirder. If I enable Optimus, the desktop loads but as it loads I see the message:

NVIDIA kernel module not found. Failing back to Nouveau.

And the desktop loads. I confirmed that the nouveau and i915 modules were loaded with lsmod. The weird thing is that nouveau is blacklisted, both in the kernel options and in /etc/modprobe.d/.
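A minimal sketch of how one might double-check where nouveau is blacklisted and what is actually loaded. The function names `cmdline_blacklists` and `report_nouveau` are made up for illustration; `modprobe.blacklist=` and `rd.driver.blacklist=` are the kernel command-line parameters normally used for this, and both take comma-separated module lists.

```shell
# Return 0 if the given kernel command line blacklists the module via
# modprobe.blacklist= or rd.driver.blacklist=, else return 1.
cmdline_blacklists() {
    local cmdline="$1"
    local module="$2"
    local param list
    for param in $cmdline; do
        case "$param" in
            modprobe.blacklist=*|rd.driver.blacklist=*)
                # Wrap the list in commas so only whole names match.
                list=",${param#*=},"
                case "$list" in
                    *",$module,"*) return 0 ;;
                esac
                ;;
        esac
    done
    return 1
}

# Not called automatically: reads the running system's state.
report_nouveau() {
    if cmdline_blacklists "$(cat /proc/cmdline)" nouveau; then
        echo "kernel command line blacklists nouveau"
    else
        echo "kernel command line does NOT blacklist nouveau"
    fi
    # Blacklist entries in the modprobe configuration directories.
    grep -rs nouveau /etc/modprobe.d/ /usr/lib/modprobe.d/
    # Which of the three graphics modules are loaded right now.
    lsmod | grep -E '^(nouveau|nvidia|i915)'
}
```

Running `report_nouveau` after boot shows all three pieces side by side. Keep in mind that a blacklist only stops automatic loading by hardware alias; an explicit `modprobe nouveau` still loads the module. The boot message suggests something may be loading nouveau on purpose once the nvidia module fails to appear. As far as I know RPM Fusion's packaging includes an `nvidia-fallback.service` for that purpose, but that is an assumption to verify, not something confirmed in this thread.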

OTOH if I try to disable Optimus and boot with the Nvidia dGPU alone, I get no desktop at all. If I try ‘modprobe nvidia.ko’ I get a message that the kernel module can’t be found in /usr/lib/(kernel version). Even if I hard link the nvidia modules in /usr/lib/(kernel version), modprobe still says it can’t find the bloody thing. (Is there a command like ldconfig for kernel modules?) Even if I provide an absolute path to the kernel module files, modprobe still complains that it can’t find them.

I also tried ‘sudo dnf reinstall xorg-x11-drv-nvidia akmod-nvidia’ which accomplished absolutely nothing.

This is not an urgent issue. The Nvidia dGPU is so weak that the Intel driver actually provides roughly equivalent performance. I’m just trying to wrap my head around the Fedora driver stack and understand what’s going on.

To recap, my questions are:

  1. Why can’t I load nvidia.ko and nvidia-drm.ko with modprobe manually?
  2. Why does modprobe complain that it can’t find the *.ko files, even though they’re right bloody there?
  3. Is there some kind of utility that needs to be run so that Fedora can find its kernel modules (like say ldconfig)?
  4. Why is nouveau loading even though it’s blacklisted?
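For questions 1 to 3, a rough sketch of the usual mechanics, not a confirmed fix for this laptop. modprobe takes a module name (for example `nvidia`), not a file name or a path, and it resolves names through an index that `depmod` builds under /lib/modules/(kernel version); `depmod` is the closest thing to ldconfig for kernel modules. To load a file by path, `insmod` is the tool. The helper names `module_name` and `load_module` are hypothetical.

```shell
# Turn a module file name or path into the name modprobe expects,
# e.g. ".../nvidia-drm.ko.xz" becomes "nvidia-drm".
module_name() {
    local file
    file=$(basename "$1")
    file=${file%.xz}    # strip a compression suffix, if present
    file=${file%.ko}    # strip the .ko suffix, if present
    printf '%s\n' "$file"
}

# Not called automatically: needs root and a module that exists on disk.
load_module() {
    local name
    name=$(module_name "$1")
    sudo depmod -a          # rebuild the module index for the running kernel
    modinfo -n "$name"      # print the file modprobe would load, if it finds one
    sudo modprobe "$name"
}
```

For example, `load_module nvidia.ko` ends up running `modprobe nvidia`. If `modinfo -n nvidia` still finds nothing after `depmod -a`, the module was probably never built for the running kernel; with akmod packages, `sudo akmods --force` is the usual way to trigger a rebuild.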

Fedora’s kernel updates for 4.17.7 REALLY broke a lot of things. Downgrade to 4.17.3 for now, because even on the vanilla kernel, you can’t update past 4.17.8 or Nvidia DKMS won’t build.

The nouveau issue may be an issue with the new version of dracut. Downgrade dracut.

I’m actually on 4.17.9. Does the 4.17.7 advice still apply? I’ll have to see if 4.17.3 is still in my GRUB settings…

I strongly recommend downgrading to 4.17.3, since using dnf downgrade goes all the way back to 4.16.3. If you’re on Ryzen, you want the 4.17 optimizations.

Here’s what I get with dnf downgrade:

# dnf downgrade kernel-4.17.3-200.fc28
Last metadata expiration check: 1:44:47 ago on Sat 28 Jul 2018 11:37:23 PM CDT.
No package kernel-4.17.3-200.fc28 available.
Error: No packages marked for downgrade.
# dnf downgrade kernel-4.17.3
Last metadata expiration check: 1:44:57 ago on Sat 28 Jul 2018 11:37:23 PM CDT.
No package kernel-4.17.3 available.
Error: No packages marked for downgrade.
# dnf downgrade kernel-4.17.3-200
Last metadata expiration check: 1:45:02 ago on Sat 28 Jul 2018 11:37:23 PM CDT.
No package kernel-4.17.3-200 available.
Error: No packages marked for downgrade.

So I guess I’m doing it wrong…?

Yup. You have to manually download the RPMs for that kernel, then launch DNF from the dedicated folder you put the kernel RPMs in.
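A rough sketch of that manual route, under a few assumptions: the package set below (kernel, kernel-core, kernel-modules, kernel-modules-extra, plus kernel-devel so akmods can build against the kernel) is the usual Fedora split, the directory name is whatever you chose, and the helper names `kernel_rpm_names` and `install_local_kernel` are invented for illustration.

```shell
# Print the RPM file names that normally make up one kernel build.
# $1 = version-release, e.g. 4.17.3-200.fc28; $2 = arch (default x86_64).
kernel_rpm_names() {
    local vr="$1"
    local arch="${2:-x86_64}"
    local pkg
    for pkg in kernel kernel-core kernel-modules kernel-modules-extra kernel-devel; do
        printf '%s-%s.%s.rpm\n' "$pkg" "$vr" "$arch"
    done
}

# Not called automatically: installs every kernel RPM found in a directory.
# $1 = directory holding the downloaded RPMs.
install_local_kernel() {
    local dir="$1"
    sudo dnf install "$dir"/kernel-*.rpm
}
```

`kernel_rpm_names 4.17.3-200.fc28` lists the files to look for, and `install_local_kernel ~/kernel-4.17.3` installs them once they are downloaded. dnf downgrade can only see versions still in the enabled repos, which `dnf list --showduplicates kernel` will show. Older builds usually have to come from Fedora's build system; if the koji client is installed, something like `koji download-build --arch=x86_64 kernel-4.17.3-200.fc28` should fetch them, assuming the build is still kept there.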

What do I need besides ‘kernel-4.17.3-200.fc28.rpm’? I assume I also need ‘kernel-modules’, but do I need anything else?

OK I downgraded the kernel. Couldn’t find a 4.17.3 RPM anywhere, so downgraded to 4.16.3. After rebooting I ran

sudo dnf reinstall xorg-x11-drv-nvidia akmod-nvidia

again. And I still get

NVIDIA kernel module not found. Failing back to Nouveau.

Any other ideas?

Downgrade dracut, regenerate your initial ramdisk, and reinstall your nvidia driver by purging it and reinstalling.
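The purge, reinstall, and ramdisk steps might look roughly like this. It is a sketch, not a tested recipe for this machine: the package names are the RPM Fusion ones used earlier in the thread, the `run` and `rebuild_nvidia` helpers are hypothetical, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
# Print a command instead of running it unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# $1 = kernel version to rebuild for (defaults to the running kernel).
rebuild_nvidia() {
    local kver="${1:-$(uname -r)}"
    # Remove and reinstall the driver packages.
    run sudo dnf remove -y akmod-nvidia xorg-x11-drv-nvidia
    run sudo dnf install -y akmod-nvidia xorg-x11-drv-nvidia
    # Force akmods to rebuild the kernel module for this kernel.
    run sudo akmods --force --kernels "$kver"
    # Regenerate the initial ramdisk so it matches the new module set.
    run sudo dracut --force "/boot/initramfs-$kver.img" "$kver"
}
```

Calling `rebuild_nvidia` with no arguments prints the four commands for the running kernel so they can be reviewed first; `DRY_RUN=0 rebuild_nvidia` actually runs them. Regenerating the ramdisk last matters because the akmods build has to finish before dracut decides what goes into the image.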

Still no worky. As I mentioned earlier, the performance improvement isn’t worth the trouble, so I just completely removed nvidia and blacklisted nouveau. And the Intel driver performs quite a bit better now. Thanks anyway.

Ah, I didn’t read that you’re on a laptop. That’s different with Nvidia Optimus. That has a completely different set of graphics driver installation instructions.

Have you tried reinstalling the Nvidia driver with the iGPU access disabled? And can you even do that from your laptop’s BIOS?

If I disable Optimus in UEFI, the Nvidia dGPU becomes the primary and sole GPU. But…it’s relatively power hungry, and I’d like to use the laptop on battery for more than an hour. I do have an extended 9-cell battery, but it’s got a Fermi GPU…

Wait… Fermi is compatible with Nouveau. You may not get top of the line performance, but using Nouveau on Fermi is actually a good option, since Nvidia has ended support for Fermi with their official driver stack.

Unfortunately it’s far worse than “not top of the line performance.” In this case, the Intel driver completely smokes the nouveau driver. No point in using the dGPU unless I can use the nvidia driver. Windows is a different story, though.

What driver repo were you using? rpmfusion or negativo17?

RPMFusion. I did try disabling SELinux, which didn’t seem to help.

You don’t need to disable SELinux… I don’t know who would suggest that.

I would suggest trying to use the negativo17 repo for Nvidia drivers. I’ve found them better packaged and easier in the past and they provide a dkms package which I think works better than the kmod versions.

Make sure your stuff’s all up to date and you’ve undone any blacklisting or package holds you set up here before you try it.

This is why I disabled SELinux:

But as mentioned, it didn’t change anything. I’ll take a look at negativo17 later tonight.

The post mentions that the bug they had was fixed and the fix went into stable. So as long as you’re up to date, SELinux should be fine.