The small linux problem thread

I wish small stuff like that were sorted out long before release so it is all nice and well, but I run into problems like these often too. On several occasions I have also landed back on L1 while looking for a solution. You get bonus points if you land in a thread you already answered in the past :rofl:.

Or - in fact - created … :skull:

I’m experimenting with running Nvidia and nouveau at the same time, for different cards. In my case, it’s a 2080ti and a 1070 in bazzite.

If I run modprobe nouveau manually, everything basically works. I haven’t tested this too extensively because I’m not about doing things manually.

If I create a systemd service (After=graphical.target, WantedBy=default.target), the desktop is very unstable and crashes during KDE animations.

If I run an @reboot crontab, the nouveau card doesn’t always show the desktop correctly after waking from sleep. Logging out / in corrects the issue.

A bazzite dev I was talking to in discord explained that this cannot be done as the drivers will conflict. Totally reasonable to not officially support.

That being said. . . anyone else experiment with this? Is there a less sketchy way to use two Nvidia graphics cards at once like a “normal” desktop? Besides nouveau across the board preferably.

Everyone says this and their source is nearly always “trust me, bro.” They told me the same thing with an AMD dGPU + nVidia dGPU.

If manually running modprobe works then perhaps the systemd unit needs to start later – I’m guessing after Plasma/Gnome/whatever has fully initialized. I reckon the problem is graphical.target is just the logon manager (sddm?) and not the full desktop (which isn’t handling twinVidia properly at startup).

Perhaps something like:

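[Unit]
Description=Secondary GPU Module Loader
After=graphical.target

[Service]
Type=oneshot
RemainAfterExit=yes
# Keep waiting for a logon instead of failing after a start timeout
TimeoutStartSec=infinity

# systemd does not expand globs itself, so the check runs in a shell:
# poll until a folder exists under /run/user, then pause 10 seconds
ExecStartPre=/bin/sh -c 'until ls -d /run/user/*/ >/dev/null 2>&1; do sleep 1; done; sleep 10'
ExecStart=/sbin/modprobe <gpu module name goes here>

[Install]
WantedBy=multi-user.target

The idea is a folder appears under /run/user with the user’s <uid> when they log on. The ExecStartPre shell loop polls until such a folder exists, waits another 10 seconds, and only then does ExecStart run modprobe. The loop goes through /bin/sh because systemd does not expand globs on its own. One caveat: the logon manager’s greeter may get its own folder under /run/user, in which case this fires at the login screen rather than after the user logs on.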

Note I’m dumb, I’m not an expert, and I haven’t tested this. At all. It might not work. It probably won’t work. But maybe it’s a nudge in the right direction.

As for waking from sleep, there are a number of possibilities, but the brute force approach might be easiest: automatically rmmod the module prior to system sleep.
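
A rough sketch of that brute force approach as a systemd sleep hook, untested on a dual-GPU setup. systemd-sleep runs executables placed in /usr/lib/systemd/system-sleep/ with "pre" or "post" as the first argument. The function name, the GPU_MODULE and DRY_RUN variables, and the choice of nouveau as the module are all assumptions for illustration; it uses modprobe -r rather than rmmod, and reloading the module after wake is an addition to the idea above.

```shell
#!/bin/sh
# Sketch of a systemd sleep hook (assumption: nouveau drives the second card).
# systemd-sleep calls hooks as: <script> pre|post <sleep type>
GPU_MODULE="${GPU_MODULE:-nouveau}"

# Print the command that fits the given phase ("pre" or "post").
sleep_hook_command() {
    case "$1" in
        pre)  echo "modprobe -r $GPU_MODULE" ;;  # unload before sleeping
        post) echo "modprobe $GPU_MODULE" ;;     # load again after waking
        *)    ;;                                 # anything else: do nothing
    esac
}

# Act only when called with a phase argument, as systemd-sleep does.
if [ -n "${1:-}" ]; then
    cmd=$(sleep_hook_command "$1")
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$cmd"        # DRY_RUN=1 prints instead of touching modules
    elif [ -n "$cmd" ]; then
        $cmd
    fi
fi
```

Running the script by hand with DRY_RUN=1 and "pre" or "post" as the argument shows what it would do without unloading anything. The unload may well fail if the card is still driving a display, so treat this as a starting point only.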

Being able to modprobe later in the boot process was exactly what I was looking for, thanks!

In btop, under the options for the CPU, change the sensor input from the default to the TSIO or k10temp sensor. With that set, I see my CPU temp in the upper right of the CPU panel, right underneath the CPU clock frequency.

I swear, all of my Linux problems have to do with sleep. To be fair, my Windows problems did as well…

Anyway, I’m hoping for some pointers in what direction to go research / troubleshoot, as I’m kinda out of ideas.

Problem 1:

On my desktop I have issues with graphical display glitches on my monitor after waking from sleep, but only if I have my TV plugged in (even if it isn’t enabled). Admittedly, TV is an HDMI 2.1 connection on an AMD graphics card (via one of the few active adapters that mostly-work), so I’m already in a weird state, but I just want to be able to sleep and wake my desktop without having to deal with display glitches from having a device plugged in I’m not even using half the time.

The display issues on my monitor (1440p/165 Hz FreeSync via DisplayPort) are little hiccups that display a horizontal line, as if there were an extreme screen tearing issue. My best guess is that FreeSync is glitching? They happen a few times every five minutes or so, but seem partially dependent on what is displayed on screen.

The TV also has (different) display issues after resuming from sleep, but I know what is causing those and I already have workarounds for them. Plus I’m using a jank setup and expect jank in return. If the TV isn’t plugged in when the desktop goes to sleep, zero issues with the monitor.

R9 5900X, 7900XTX, EndeavourOS (KDE/Wayland)

Problem 2:

My laptop does not like waking from sleep in Linux. I swear this worked once upon a time and broke after a distro swap, but even reverting to my original distro it still doesn’t work. Which makes me think this is either a driver-related issue, a firmware-related issue, or an “aetherspoon’s memory doesn’t work right” issue.

Basically, if the laptop sleeps for any reason (lid close, command to sleep, etc), it will hang on waking up. On EndeavourOS I get a blinking cursor in the top left and the keyboard initially responds, then stops responding. On other distros, the visual symptoms change (sometimes it will load some graphics and just hang at that point) but the end result (keyboard initially works then stops working, mouse/trackpad never works, machine locks up) remains.

Lenovo Legion 5 Slim, R7 7840HS, RTX 4070, EndeavourOS (KDE/Wayland) but also other distros that are still using X.

Lately I’ve been experiencing really slow boots. If I hit escape during bootup I see boot hanging with slow output of several lines like usb 3-7: device descriptor read/64, error -110 before I think the kernel gives up and moves on. dmesg shows me this:

dmesg
[   23.291836] usb 3-7: device descriptor read/64, error -110
[   23.573800] usb 3-7: new high-speed USB device number 5 using xhci_hcd
[   28.922841] usb 3-7: device descriptor read/64, error -110
[   44.794957] usb 3-7: device descriptor read/64, error -110
[   44.904248] usb usb3-port7: attempt power cycle
[   45.344780] usb 3-7: new high-speed USB device number 6 using xhci_hcd
[   50.144773] usb 3-7: Device not responding to setup address.
[   55.149884] usb 3-7: Device not responding to setup address.
[   55.357747] usb 3-7: device not accepting address 6, error -71
[   55.736773] usb 3-7: new high-speed USB device number 7 using xhci_hcd
[   60.537319] usb 3-7: Device not responding to setup address.
[   65.542422] usb 3-7: Device not responding to setup address.
[   65.749773] usb 3-7: device not accepting address 7, error -71
[   65.752559] usb usb3-port7: unable to enumerate USB device

How do I identify which device corresponds to usb 3-7? I thought lsusb -t would tell me, but it doesn’t show anything useful:

lsusb
/:  Bus 001.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/12p, 480M
    |__ Port 001: Dev 002, If 0, Class=Human Interface Device, Driver=usbhid, 12M
    |__ Port 002: Dev 003, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 003: Dev 005, If 0, Class=Human Interface Device, Driver=usbhid, 12M
    |__ Port 003: Dev 004, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 001: Dev 006, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 004: Dev 007, If 0, Class=Human Interface Device, Driver=usbhid, 12M
        |__ Port 004: Dev 007, If 1, Class=Human Interface Device, Driver=usbhid, 12M
        |__ Port 004: Dev 007, If 2, Class=Human Interface Device, Driver=usbhid, 12M
/:  Bus 002.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/6p, 10000M
    |__ Port 003: Dev 002, If 0, Class=Hub, Driver=hub/4p, 5000M
        |__ Port 001: Dev 003, If 0, Class=Hub, Driver=hub/4p, 5000M
/:  Bus 003.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/12p, 480M
    |__ Port 005: Dev 002, If 0, Class=Human Interface Device, Driver=usbhid, 12M
    |__ Port 005: Dev 002, If 1, Class=Human Interface Device, Driver=usbhid, 12M
    |__ Port 006: Dev 003, If 0, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 006: Dev 003, If 1, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 006: Dev 003, If 2, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 006: Dev 003, If 3, Class=Vendor Specific Class, Driver=[none], 480M
    |__ Port 008: Dev 008, If 0, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 008: Dev 008, If 1, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 008: Dev 008, If 2, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 008: Dev 008, If 3, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 008: Dev 008, If 4, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 008: Dev 008, If 5, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 008: Dev 008, If 6, Class=Human Interface Device, Driver=usbhid, 480M
    |__ Port 009: Dev 009, If 0, Class=Video, Driver=uvcvideo, 480M
    |__ Port 009: Dev 009, If 1, Class=Video, Driver=uvcvideo, 480M
    |__ Port 009: Dev 009, If 2, Class=Audio, Driver=snd-usb-audio, 480M
    |__ Port 009: Dev 009, If 3, Class=Audio, Driver=snd-usb-audio, 480M
/:  Bus 004.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/5p, 20000M/x2
/:  Bus 005.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/2p, 480M
/:  Bus 006.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/2p, 20000M/x2
/:  Bus 007.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/2p, 480M
/:  Bus 008.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/2p, 10000M
/:  Bus 009.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/2p, 480M
/:  Bus 010.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/2p, 10000M
/:  Bus 011.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/1p, 480M
    |__ Port 001: Dev 002, If 0, Class=Human Interface Device, Driver=usbhid, 12M
/:  Bus 012.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/0p, 5000M

I’ve tried unplugging everything connected to USB ports, but that doesn’t have any effect.

System
System:
  Kernel: 6.15.2-300.vanilla.fc42.x86_64 arch: x86_64 bits: 64
  Desktop: GNOME v: 48.2 Distro: Fedora Linux 42 (Workstation Edition)
Machine:
  Type: Desktop System: ASRock product: X870E Taichi Lite v: N/A
    serial: <superuser required>
  Mobo: ASRock model: X870E Taichi Lite serial: <superuser required>
    UEFI: American Megatrends LLC. v: 3.25 date: 05/14/2025
CPU:
  Info: 8-core model: AMD Ryzen 7 9800X3D bits: 64 type: MT MCP cache:
    L2: 8 MiB
  Speed (MHz): avg: 3594 min/max: 603/5272 cores: 1: 3594 2: 3594 3: 3594
    4: 3594 5: 3594 6: 3594 7: 3594 8: 3594 9: 3594 10: 3594 11: 3594 12: 3594
    13: 3594 14: 3594 15: 3594 16: 3594
Graphics:
  Device-1: Advanced Micro Devices [AMD/ATI] Navi 31 [Radeon RX 7900 XT/7900
    XTX/7900 GRE/7900M] driver: amdgpu v: kernel
  Device-2: Advanced Micro Devices [AMD/ATI] Granite Ridge [Radeon Graphics]
    driver: amdgpu v: kernel
  Device-3: Realtek Streamplify CAM driver: snd-usb-audio,uvcvideo type: USB
  Display: wayland server: X.Org v: 24.1.6 with: Xwayland v: 24.1.6
    compositor: gnome-shell driver: dri: radeonsi gpu: amdgpu resolution:
    1: 2560x1440~120Hz 2: 2560x1440~60Hz
  API: OpenGL v: 4.6 vendor: amd mesa v: 25.0.7 renderer: AMD Radeon RX
    7900 XT (radeonsi navi31 LLVM 20.1.5 DRM 3.63
    6.15.2-300.vanilla.fc42.x86_64)
  API: EGL Message: EGL data requires eglinfo. Check --recommends.
  Info: Tools: api: glxinfo x11: xdriinfo, xdpyinfo, xprop, xrandr
Audio:
  Device-1: Advanced Micro Devices [AMD/ATI] Navi 31 HDMI/DP Audio
    driver: snd_hda_intel
  Device-2: Advanced Micro Devices [AMD/ATI] Radeon High Definition Audio
    [Rembrandt/Strix] driver: snd_hda_intel
  Device-3: Advanced Micro Devices [AMD] Family 17h/19h/1ah HD Audio
    driver: N/A
  Device-4: Focusrite-Novation Scarlett Solo 4th Gen driver: snd-usb-audio
    type: USB
  Device-5: Generic USB Audio driver: hid-generic,snd-usb-audio,usbhid
    type: USB
  Device-6: Realtek Streamplify CAM driver: snd-usb-audio,uvcvideo type: USB
  API: ALSA v: k6.15.2-300.vanilla.fc42.x86_64 status: kernel-api
  Server-1: PipeWire v: 1.4.5 status: active
Network:
  Device-1: Realtek RTL8126 5GbE driver: r8169
  IF: enp10s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
  Device-2: MEDIATEK MT7925 Wi-Fi 7 160MHz driver: mt7925e
  IF: wlp11s0 state: down mac: <filter>
Drives:
  Local Storage: total: 3.64 TiB used: 3.23 TiB (88.8%)
  ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 990 PRO 4TB size: 3.64 TiB
Partition:
  ID-1: / size: 3.64 TiB used: 3.23 TiB (88.8%) fs: btrfs dev: /dev/dm-0
  ID-2: /boot size: 973.4 MiB used: 437.3 MiB (44.9%) fs: ext4
    dev: /dev/nvme0n1p2
  ID-3: /boot/efi size: 598.8 MiB used: 19.3 MiB (3.2%) fs: vfat
    dev: /dev/nvme0n1p1
  ID-4: /home size: 3.64 TiB used: 3.23 TiB (88.8%) fs: btrfs dev: /dev/dm-0
Swap:
  ID-1: swap-1 type: zram size: 8 GiB used: 0 KiB (0.0%) dev: /dev/zram0
Sensors:
  System Temperatures: cpu: 39.6 C mobo: N/A
  Fan Speeds (rpm): N/A
  GPU: device: amdgpu temp: 37.0 C fan: 2 device: amdgpu temp: 36.0 C
Info:
  Memory: total: 48 GiB note: est. available: 46.13 GiB used: 3.68 GiB (8.0%)
  Processes: 479 Uptime: 11m Shell: Zsh inxi: 3.3.38

I’ve had some issues with GPU power management that may be caused by the mobo, so a part of me wonders whether this is just more of that. I don’t know how to pinpoint issues to the motherboard though.

Your description reads like what I experienced. At one point the following solved sleep not waking for me (on a 7640U, on a rolling distro though not an Arch-based one):

https://bbs.archlinux.org/viewtopic.php?id=296954
(see post #9 for a unit file - but revert it if you don’t see any difference)

Several minor kernel versions later, within 6.14, something else caused it to fail in a different way (specifically, it showed the display buffer from the time the screen closed), so it is more likely the UEFI, firmware packages, and kernel not talking to each other.

There are some sleep info/debug tools here I was meaning to look into further -
https://pypi.org/project/amd-debug-tools/

Though honestly, I gave up on this for now and just start on an older kernel (6.12 in my case).

If it’s always the same ‘port’ 3-7 at issue, you might be able to either disable it in the BIOS or disable it with e.g. usbguard, though I don’t know whether the latter would take effect at boot.

Whatever it is doesn’t seem to complete registration, so lsusb has nothing to report (aside 1 - lsusb.py may exist in your distribution & is arguably more readable; also try lsusb -tv). The ‘usb’ device may actually be soldered to the board, or the culprit may be e.g. a shorted or bad front panel connector itself (i.e. not necessarily the back panel ports). I’d plug in something else until I found which set of physical ports on the board is on that USB 2.0 part of the tree, and start disconnecting anything in it.

aside 2 - systemd-analyze blame can also help identify services delaying startup.
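
For the usb 3-7 question, another place to look is sysfs, where the kernel exposes a directory per hub port even when nothing enumerates on it. Below is a small sketch that turns a kernel USB name from dmesg (such as 3-7) into that port’s sysfs directory. The function name is made up for this example, and the path layout is my understanding of how the kernel names hub ports (it matches the usb3-port7 name in the dmesg output above), so double-check it on your system.

```shell
#!/bin/sh
# Sketch: map a kernel USB name from dmesg to its sysfs port directory.
# usb_port_sysfs_path is a hypothetical helper name, not an existing tool.
usb_port_sysfs_path() {
    dev="$1"             # "3-7" = bus 3, root port 7; "1-3.4" = hub 1-3, port 4
    bus="${dev%%-*}"     # everything before the first dash
    chain="${dev#*-}"    # the port chain after the dash
    case "$chain" in
        *.*)
            # Port on an external hub: the hub is the chain minus its last hop
            hub="$bus-${chain%.*}"
            echo "/sys/bus/usb/devices/$hub/$hub:1.0/$hub-port${chain##*.}"
            ;;
        *)
            # Port directly on the root hub of that bus
            echo "/sys/bus/usb/devices/usb$bus/$bus-0:1.0/usb$bus-port$chain"
            ;;
    esac
}
```

With that defined, cat "$(usb_port_sysfs_path 3-7)/connect_type" should print what the firmware says about the port, if your kernel exposes that attribute. A value of hardwired would fit the idea that the device is soldered to the board rather than sitting on an external port.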

Hm - yeah, this definitely sounds like what mine is going through. Sadly, the initial tips didn’t work for me; I’ll poke around further with the debug tools, I really don’t want to revert kernel versions.

Background


A week ago I was seeing really poor performance in looking glass (45 FPS, linux guest GPU maxed out). Eventually narrowed it down to a driver problem that I solved by reverting to 470xx via aur (gpu is a 1070) and using X11 instead of Wayland. Reading through the aur entry it was applying some patches to the driver for the latest kernel to get it to compile.

At this point, looking glass worked perfectly but the desktop ran pretty badly. Additionally, any time I tried to launch a game through steam I would get an error about not having a D3D11 compatible card.

So I was reading the docs from the more recent 570 driver and saw something about Ubuntu 24.04 being “officially” supported. Went to find out what was so special about them and learned about the work they put into their HWE kernel.

Extended hardware support sounded like exactly what I needed, so I downloaded the 6.14-hwe kernel source from archive.ubuntu.org, followed the Kernel page in the arch wiki, learned what DKMS is for, installed the 570 drivers, and now I have a better (not perfect) functioning LG, a smooth desktop AND I can launch games with the linux guest gpu. I still have to use X11 instead of Wayland, but that seems OK to me.


Question
My question is: now that I am running a manually compiled kernel, does that mean I’m committed to compiling ubuntu’s HWE kernel for the rest of time? Or is there a way to start using the ubuntu binary kernel in arch and receive automatic updates?