Fedora 37, Dell 9520, Nvidia driver

I did, thanks. That’s what led me to try getting the CUDA toolkit from Nvidia, which in turn requires the latest driver, which is incompatible with the drivers I’ve already installed to get this far. Rather than risk a broken install, I decided against going further until there’s a matched Fedora and Nvidia installation process. I feel as though I’m waiting on rpmfusion to package the latest Nvidia driver. The one that I used is apparently working, or at least not crashing, but I’m still unable to prove that the 3050Ti is operational. I have yet to plug the laptop into a 4K external monitor to see if that does the trick (the 15" panel is 4K and is being driven very nicely by the Alder Lake iGPU).

So, Discover just installed a bunch of files - mostly grub executables it looked like - and rebooted, and that has screwed over the Nvidia driver from rpmfusion and is trying to load the blacklisted nouveau driver:

Dec 18 14:44:32 mandrake systemd[1]: Starting nvidia-fallback.service - Fallback to nouveau as nvidia did not load…

(Why is it starting the nouveau fallback process as I’ve blacklisted nouveau in grub.cfg? I guess because it doesn’t know that there’s another GPU - the Alder Lake iGPU running Mesa? But it’s the kernel that knows everything…?)
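One way to confirm the blacklist actually reached the running kernel is to look at the command line it booted with. This is only a minimal sketch: the function names are made up for illustration, and it assumes the blacklist was passed with one of the usual parameters (`modprobe.blacklist`, `rd.driver.blacklist` or `module_blacklist`) rather than through a file under /etc/modprobe.d.

```python
from pathlib import Path

# Boot parameters commonly used to blacklist a module (assumed set, not exhaustive).
BLACKLIST_KEYS = ("modprobe.blacklist", "rd.driver.blacklist", "module_blacklist")


def blacklisted_modules(cmdline: str) -> set:
    """Return the module names blacklisted on a kernel command line string."""
    found = set()
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in BLACKLIST_KEYS:
            # Values may be comma-separated lists of module names.
            found.update(name for name in value.split(",") if name)
    return found


def running_kernel_blacklist(path: str = "/proc/cmdline") -> set:
    """Read the running kernel's command line and return its blacklisted modules."""
    return blacklisted_modules(Path(path).read_text())
```

Calling `"nouveau" in running_kernel_blacklist()` on the laptop would show whether the entry in grub.cfg made it through to the booted kernel. It says nothing about why nvidia-fallback.service still runs.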

The only prior mentions of nvidia in journalctl are:

Dec 18 14:44:32 mandrake systemd[1]: Starting nvidia-powerd.service - nvidia-powerd service…
Dec 18 14:44:32 mandrake /usr/bin/nvidia-powerd[947]: nvidia-powerd version:1.0(build 1)

So no idea why the nvidia driver is actually failing to load after this latest reboot with new bits. The kernel isn’t happy about the Nvidia driver being unsigned, but it wasn’t happy before this recent reboot either, and it still loaded the Nvidia drivers.

The kernel either tries to load the nvidia driver again, or there’s an ordering issue in journaling. After nouveau is attempted, the following appears in the journal:

Dec 18 14:44:32 mandrake kernel: nvidia: loading out-of-tree module taints kernel.
Dec 18 14:44:32 mandrake kernel: nvidia: module license ‘NVIDIA’ taints kernel.
Dec 18 14:44:32 mandrake kernel: Disabling lock debugging due to kernel taint
Dec 18 14:44:32 mandrake kernel: nvidia: module verification failed: signature and/or required key missing - tainting kernel

but it’s been complaining about failed module verification since I managed to install the rpmfusion nvidia drivers (since 15 Dec). And presumably if the kernel is tainted by an untrusted driver then it’s been loaded?
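The taint messages are logged when the module is inserted, but they don’t say whether it is still loaded afterwards. A rough check is to parse /proc/modules, where the first field of each line is a module name. The helper names below are hypothetical and the sketch only reports presence, not whether the GPU is usable.

```python
from pathlib import Path


def loaded_modules(proc_modules_text: str) -> set:
    """Parse /proc/modules content; the first field on each line is the module name."""
    return {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}


def gpu_driver_status(proc_modules_text: str) -> dict:
    """Report whether the nvidia and nouveau modules appear in the given module list."""
    names = loaded_modules(proc_modules_text)
    return {"nvidia": "nvidia" in names, "nouveau": "nouveau" in names}


def current_gpu_driver_status(path: str = "/proc/modules") -> dict:
    """Same check against the running system."""
    return gpu_driver_status(Path(path).read_text())
```

If `current_gpu_driver_status()` shows nvidia as False right after boot, the module was either never inserted successfully or was removed again, which would fit the fallback service kicking in.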

I knew I’d likely have problems in the future with this Nvidia driver, I just didn’t expect the future to arrive so soon.

Getting Nvidia drivers working on Linux continues to be a sh!t show. It’d be good if Fedora/Red Hat and Nvidia and maybe rpmfusion could get their act together and produce a reliable, working solution. The onus is on Nvidia as their drivers don’t play well with other versions of Linux either. (My previous attempt to get Nvidia drivers working on Linux ended in failure: my desktop 12900 system is using the nouveau driver on Debian 11)

I’ll see if I can’t drum up some help from Fedora/Red Hat or Nvidia, or even rpmfusion

The sooking is likely because you have secure boot on, just disable that and it should be good

Secure Boot has been off since before I started on the rpmfusion install.

Installing the latest cuda version supported by driver 520 should work. That should be cuda 11-8. You might need to use the fedora35 repo for that (but it should be compatible).

Alternatively, you could install Nvidia’s driver rpms from the fedora36 repo, but I haven’t tried this. RPMFusion is generally the recommended way.

The older repos are accessible through the archive: https://developer.nvidia.com/cuda-toolkit-archive

Thanks. I definitely got the impression that rpmfusion was the way to go especially after my attempt to install the latest drivers from Nvidia (525.60.11) failed in the DKMS phase. Alas, what was working, or at least loading, yesterday no longer is today after a system update which is not a hopeful sign, and something I was concerned about. If I start unloading and re-loading drivers I expect I’ll be left with a system that crashes or at the very least is in an unknown and unstable state. Right now it’s working with the iGPU (very well as it happens) and it’s not crashing like it was out of the box with problems in the nouveau driver (mainly because it’s blacklisted).

I’m going to stick with the Alder Lake iGPU until there’s a version of Fedora and Nvidia drivers that I believe has a better chance of working: maybe in a couple of months. Maybe by the time 38 is out, 37 will work with the drivers from Nvidia.

I think I’m kind of on the bleeding edge with this Dell laptop, although Alder Lake isn’t exactly too new for Linux. Fedora 36 hung, or appeared to, when I tried to install it; Fedora 37 was the one distro out of the 4 I tried (Ubuntu 22.04, Debian 11.5, Fedora 36 and Fedora 37) that finally installed without problems. But I have failed to get Nvidia drivers running properly, whether from Nvidia, from rpmfusion, or by following another description from linuxcapable.com. I’ve done 4 installs already and I don’t really want to do any more for now.

I have a desktop machine I put together with an Intel 12900 and a Nvidia 3050. I didn’t get the Nvidia drivers working on that either - Debian 11 - and did at least 3 full installs trying (didn’t try rpmfusion at the time but followed a couple of online articles that got their systems working with Nvidia drivers).

I have a Fedora 36 installation on that desktop machine. I think I’ll experiment with that, since I don’t have to sign drivers there (or I didn’t the last time I tried it, which was 8 months ago TBH). It multi-boots several flavors of Linux in addition to Windows 11, so unlike the laptop, if I wreck that installation I still have a working machine.

The only bleeding here is nvidia and their dog shit drivers and even worse optimus experience, and it’ll be that way for a long time, until it makes monetary sense for them to improve it

:rofl: I am getting that impression!

I had the wizard wheeze of trying to update my desktop Fedora installation. I haven’t used it for a while so the first thing it did was update to Fedora 36. Then I started following the instructions (yet again) for installing the Nvidia .run file for the 525.60 drivers. I have to ensure kernel-headers are installed, but I just updated to 6.0.12-200.fc36.x86_64 and the most recent headers at Red Hat are 6.0.5-200.fc36 (per the “Packages in Fedora / RHEL / CentOS / EPEL” listing). It’s a wonder anyone ever manages to squeeze into the interstitial niche where drivers and kernels align and produce the semblance of a working system… Meanwhile, over in WindowsLand, the OS, drivers and 3050Ti are happily cohabiting. I’m starting to think I must be a masochist, or at the very least someone who prefers whittling away their days interminably installing software rather than getting anything productive done.
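The version comparison described here can be sketched as below. The helper names and the list of architecture suffixes are assumptions for illustration. Also, as far as I know the package that has to match the running kernel for building an out-of-tree module is kernel-devel, while kernel-headers can legitimately lag behind on Fedora; treat that as an assumption and check it against the installer’s own documentation.

```python
# Architecture suffixes to strip before comparing (assumed list, extend as needed).
KNOWN_ARCHES = ("x86_64", "aarch64", "i686", "noarch")


def strip_arch(release: str) -> str:
    """Drop a trailing '.<arch>' from a kernel or package version string, if present."""
    for arch in KNOWN_ARCHES:
        suffix = "." + arch
        if release.endswith(suffix):
            return release[: -len(suffix)]
    return release


def versions_match(running_kernel: str, package_version: str) -> bool:
    """True when a package version equals the running kernel version, ignoring arch."""
    return strip_arch(running_kernel) == strip_arch(package_version)
```

With the numbers from this post, `versions_match("6.0.12-200.fc36.x86_64", "6.0.5-200.fc36")` is False, which is the mismatch being described. The running kernel string is what `uname -r` prints.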

Well, I’m giving up. I did get some help from the Fedora Project forum but I didn’t really get anything more than I’d already gathered here.

I re-installed Fedora 37 from scratch for the umpteenth time (at least 4 fresh installs trying to get this to work) and with the 525 drivers from rpmfusion that apparently were updated today.

The machine still attempts to fall back to nouveau - blacklisted - for some unknown reason. The lines in journalctl with the most culpability near to where the Nvidia driver is reported as not loading are these:

Dec 20 00:35:23 fedora systemd-udevd[811]: nvidia: Process ‘/usr/bin/bash -c ‘/usr/bin/mknod -Z -m 666 /dev/nvidiactl c 195 255’’ failed with exit code 1.
Dec 20 00:35:23 fedora systemd-udevd[783]: nvidia: Process ‘/usr/bin/bash -c ‘/usr/bin/mknod -Z -m 666 /dev/nvidiactl c 195 255’’ failed with exit code 1.
Dec 20 00:35:23 fedora systemd-udevd[811]: nvidia: Process ‘/usr/bin/bash -c ‘for i in $(cat /proc/driver/nvidia/gpus/*/information | grep Minor | cut -d \ -f 4); do /usr/bin/mknod -Z -m 666 /dev/nvidia${i} c 195 ${i}; done’’ failed with exit code 1.
Dec 20 00:35:23 fedora systemd-udevd[783]: nvidia: Process ‘/usr/bin/bash -c ‘for i in $(cat /proc/driver/nvidia/gpus/*/information | grep Minor | cut -d \ -f 4); do /usr/bin/mknod -Z -m 666 /dev/nvidia${i} c 195 ${i}; done’’ failed with exit code 1.

By the time the machine has booted those directories and devices exist with the correct version numbers and permissions so I don’t know what the issue is. I’m guessing it’s some combination of SELinux, Secure Boot (disabled), signed drivers (unsigned), CA loaded in UEFI (not) and the Dell laptop platform which seems to have security dialed up to 11.

It beats me! At least for now. I haven’t been able to find anything on the net that describes what’s going on here.

Thanks for all the help. For now I’ll be thankful that I have a working Alder Lake GPU and pine for an operational 3050Ti. (This is an ex-driver! Bereft of life, it rests in peace, etc. etc.)

One last thing: I just installed the rpmfusion drivers on Fedora 36 on my Z690 12900 build and it installed and runs the GPU fine.

The weird thing is that I see the same failures in journalctl which I incorrectly assumed were the reason the driver was failing to load (or being unloaded as a result) on the laptop, like these lines for example:

Dec 21 00:20:11 fedora systemd-udevd[827]: nvidia: Process ‘/usr/bin/bash -c ‘/usr/bin/mknod -Z -m 666 /dev/nvidiactl c 195 255’’ failed with exit code 1.
Dec 21 00:20:11 fedora systemd-udevd[827]: nvidia: Process ‘/usr/bin/bash -c ‘for i in $(cat /proc/driver/nvidia/gpus/*/information | grep Minor | cut -d \ -f 4); do /usr/bin/mknod -Z -m 666 /dev/nvidia${i} c 195 ${i}; done’’ failed with exit code 1.

It looks like the same state and same issues are present on both the 12900 desktop and 12700 laptop, the difference being that the desktop works and the laptop doesn’t. Neither has signed drivers, both are running SELinux, but one (the desktop) is running an up-to-date Fedora 36 with the 6.0.13 kernel (same as the laptop) and has 520.56 Nvidia drivers, and the other is running up-to-date Fedora 37 with Nvidia 525.60 drivers.

The same - non-fatal apparently - errors are occurring on both platforms.
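To compare the two machines without eyeballing timestamps and PIDs, the failed udev commands can be pulled out of saved journal text and compared as sets. This is a sketch only: the regular expression is written against the message shape in the lines quoted in this thread, and the function name is made up.

```python
import re

# Matches the systemd-udevd "Process ... failed with exit code N." messages quoted above.
FAILED = re.compile(
    r"systemd-udevd\[\d+\]: (?P<rule>\S+): Process (?P<cmd>.+) "
    r"failed with exit code (?P<code>\d+)\."
)


def udev_failures(journal_text: str) -> set:
    """Return the set of (command, exit_code) pairs for failed udev processes."""
    failures = set()
    for line in journal_text.splitlines():
        match = FAILED.search(line)
        if match:
            failures.add((match.group("cmd"), int(match.group("code"))))
    return failures
```

Feeding it the output of `journalctl -b` saved from each machine and comparing the two sets would show whether the failures really are identical. If they are, they are unlikely to be what separates the working desktop from the laptop.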

I suspect that the high security of the Dell laptop, and the fact that it came with Windows and Secure Boot enabled (and BitLocker enabled - still), is the likely culprit. I really ought to do yet another clean install, this time with Fedora 36 and the 520.56 rpmfusion Nvidia drivers, but I’m burned out on re-installing Fedora from scratch on the laptop just to have it fail to load the Nvidia driver. It’s my main laptop.

I’ll wait until there’s more hope that 37 etc. should work.

Meanwhile Blender 3.4 is working well with CUDA on the lowly 3050 - under 40s to render the BMW test file from blender.org.

Ciao.

The mknod error has got SELinux written all over it

If getenforce reports Enforcing then run setenforce 0 and try again
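The same state can be read without the SELinux tools, since the kernel exposes it as a 1 or 0 in /sys/fs/selinux/enforce. A minimal sketch follows; the function names are hypothetical, and a missing file is assumed to mean SELinux is disabled or not mounted.

```python
from pathlib import Path
from typing import Optional


def mode_from_enforce_value(value: Optional[str]) -> str:
    """Map the content of the selinuxfs 'enforce' file to a mode name."""
    if value is None:
        # Assumption: no enforce file means SELinux is disabled or selinuxfs is absent.
        return "Disabled"
    return "Enforcing" if value.strip() == "1" else "Permissive"


def current_selinux_mode(path: str = "/sys/fs/selinux/enforce") -> str:
    """Read the enforce file, if it exists, and return the current mode name."""
    enforce_file = Path(path)
    return mode_from_enforce_value(
        enforce_file.read_text() if enforce_file.exists() else None
    )
```

Note that `setenforce 0` only lasts until the next reboot, so a test of the mknod theory would need the driver reloaded in the same session.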

I realize I’m late to the party, but I wanted to share my experience with Fedora 36 and an Optimus laptop. I have an Asus M15 (2020) laptop with a 2070 Max-Q. I did get it working with F36, Optimus, and the Nvidia drivers but it was very weird and quirky. Sometimes games would run without any special environment variables, sometimes they needed to be explicitly specified. Sometimes they would run in one desktop environment, but not another. Sometimes they would only run on the internal screen, sometimes only on an external monitor. All very janky and oddball.

I finally got tired of this BS and installed PopOS, which cleared up 95% of the inconsistencies and weirdness I experienced with Fedora. I still generally prefer Fedora, and run it on my desktop with a 2070 Super. That desktop has nowhere near the number of issues I had on the Optimus laptop.

Whatever OS you decide to run, good luck…