Unkillable processes hogging all CPU

Hi,
I have a Raspberry Pi 4 (8 GB) running Ubuntu Server 64-bit, with root on an external USB 3 HDD (no SD card).

    PID USER      PRI  NI  VIRT   RES   SHR S CPU%-MEM%   TIME+  Command
 833404 root       20   0  8836   380     0 R 74.5  0.0 79h12:56 /bin/bash /usr/share/prometheus-node-exporter-collectors/smartmon.sh
1802894 root       20   0 22976  1784     0 R 71.8  0.0 12h01:16 /lib/systemd/systemd-udevd
1587158 root       20   0 22976  1784     0 R 64.9  0.0 38h01:20 /lib/systemd/systemd-udevd
  23210 root       20   0 22976  1784     0 R 60.1  0.0 86h15:02 /lib/systemd/systemd-udevd

When I try to kill any of these processes with sudo kill -9 $pid, nothing happens.
Can anybody enlighten me as to what to do with them?
Why are they running away?
The only way I have found to kill them is to restart the RPi.

The Ubuntu image for RPi uses Snap containerization for everything. I am running Ubuntu 21.10 on the RPi I use as a NAS, because my NAS HAT requires a feature of RPiOS that Ubuntu has ported over.

The reboot will kill it, but as soon as things start launching, it will kick off those Snaps, which will start udev, more than likely because you are using an external HDD. udev is a good thing; you need it for hardware eventing.

The question is, why are you trying to kill the processes? Are they causing issues?

The exporter collector looks to be something custom. Are you getting this image from Ubuntu?

I try to kill the processes because they are hogging the whole CPU.
After a reboot I get 1-2 days of normal operation, then some process (not always the same one) decides to become unkillable and hogs all the CPU.

The Ubuntu image is from rpi imager (the official imaging software from Raspberry Pi).

  0[###***************************************100.0%*1500MHz*48C]    2[#############*****************************100.0%*1500MHz*48C] Tasks: 438, 1442 thr, 208 kthr; 4 running 
  1[###################************************98.4%*1500MHz$48C]    3[################**************************100.0%*1500MHz*48C] Load average: 16.96 21.10 30.73
Avg[##########################************************************************************************************99.9%*1500MHz*48C] Uptime: 6 days, 00:13:40
Mem[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||#******************************************4.36G/7.63G] ARC: 3.81G Used:2.05G MFU:1.64G MRU:188M Anon:11.6M Hdr:33.9M Oth:183M
    PID USER      PRI  NI  VIRT   RES   SHR S CPU%-MEM%   TIME+  Command
 833404 root       20   0  8836   380     0 R 63.5  0.0 96h24:16 /bin/bash /usr/share/prometheus-node-exporter-collectors/smartmon.sh
1587158 root       20   0 22976  1784     0 R 55.7  0.0 53h07:57 /lib/systemd/systemd-udevd
1802894 root       20   0 22976  1784     0 R 54.1  0.0 27h09:05 /lib/systemd/systemd-udevd
1888099 root       20   0  161M  1008     0 R 46.3  0.0  0:00.89 /usr/sbin/zed -F
  23210 root       20   0 22976  1784     0 R 42.1  0.0     101h /lib/systemd/systemd-udevd
   5000 root       20   0 5059M  217M 12844 S 33.8  2.8 10h41:15 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock

Have you just tried disabling SMART? (Homer prancing S M R T) Not sure how relevant this will be, as it seems everything has to be systemctl disable yada yada these days, but it’s a reference point: smartctl(8) - Linux man page

As far as I can see, SMART is not supported on this drive (SEAGATE Expansion Portable 4TB HDD),
and I don’t know how to disable SMART.

sudo smartctl -a -T permissive /dev/sda 
smartctl 7.2 2020-12-30 r5155 [aarch64-linux-5.13.0-1013-raspi] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

Read Device Identity failed: scsi error unsupported field in scsi command

=== START OF INFORMATION SECTION ===
Device Model:     [No Information Found]
Serial Number:    [No Information Found]
Firmware Version: [No Information Found]
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   [No Information Found]
Local Time is:    Tue Jan 25 12:23:21 2022 UTC
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 82-83 don't show if SMART supported.
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 85-87 don't show if SMART is enabled.
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

SMART is often not available over USB (though the drive inside will have it if connected traditionally). However, SMART checks are what are hogging your CPU. Did you read the man page I linked? Kinda odd the RPi build is running SMART, given an RPi won’t have attached SATA and SD cards won’t have it.
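
If it helps, here is a rough sketch of how the collector itself could be switched off, rather than SMART on the drive. It assumes the package starts smartmon.sh from a systemd timer named after the script; prometheus-node-exporter-smartmon.timer is my guess and is not confirmed anywhere in this thread, so list the timers first and only disable what actually shows up. Both function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: guess the timer unit that launches a collector script.
# ASSUMPTION: units are named prometheus-node-exporter-<script>.timer.
timer_for_script() {
    name=$(basename "$1" .sh)
    printf 'prometheus-node-exporter-%s.timer\n' "$name"
}

# List the collector timers that really exist on this machine.
list_collector_timers() {
    # Without systemd there is nothing to list.
    command -v systemctl >/dev/null 2>&1 || return 0
    # Print only the unit name column; ignore failures (e.g. no running systemd).
    systemctl list-unit-files --type=timer --no-legend \
        'prometheus-node-exporter-*' 2>/dev/null | awk '{print $1}' || true
}

list_collector_timers
echo "candidate: $(timer_for_script /usr/share/prometheus-node-exporter-collectors/smartmon.sh)"

# Once the candidate name is confirmed in the listing above:
#   sudo systemctl disable --now prometheus-node-exporter-smartmon.timer
```

Disabling the timer only stops new runs. A copy of the script that is already stuck would probably still need the reboot.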

You originally said “ubuntu image is from rpi imager (oficial imaging software of raspberry)”, which is a tad strange to me. By RPI imager I’m guessing you mean just the imaging software, but that won’t be who makes the real Ubuntu or Raspbian builds. Have you considered just running Raspbian rather than Ubuntu?

According to the man page:

smartctl -s off /dev/hdd

Disable SMART monitoring and data log collection on drive /dev/hdd .

so

sudo smartctl -s off -T permissive /dev/sda 
smartctl 7.2 2020-12-30 r5155 [aarch64-linux-5.13.0-1013-raspi] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

Read Device Identity failed: scsi error unsupported field in scsi command

SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 82-83 don't show if SMART supported.
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 85-87 don't show if SMART is enabled.
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

And yes, I meant “Raspberry Pi Imager”, which downloads the official Ubuntu image for the RPi.

I wanted Ubuntu because it has better support for 64-bit and ZFS.

Something happened:

sudo smartctl -s off /dev/sda -T permissive -T permissive -T permissive  -T permissive -T permissive -T permissive -T permissive -T permissive 
smartctl 7.2 2020-12-30 r5155 [aarch64-linux-5.13.0-1013-raspi] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

Read Device Identity failed: scsi error unsupported field in scsi command

SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 82-83 don't show if SMART supported.
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 85-87 don't show if SMART is enabled.
                  Checking to be sure by trying SMART RETURN STATUS command.
SMART support is: Unknown - Try option -s with argument 'on' to enable it.=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Disable failed: scsi error unsupported field in scsi command

SMART Disabled. Use option -s with argument 'on' to enable it.

I will restart and let you know if it helped or not

Sounds good. I suspect that because the normal Ubuntu install is newish for Pis it’s not 100% yet, and this might be one of those issues. SMART for something that by default runs off a microSD card is el stoopeedo. :wink: I also wonder if this is an odd little quirk: since the type of drives you have won’t support SMART, maybe the query process hangs and loops because it expects the drives/BIOS to give it a YES/NO answer on whether SMART is supported, and the drives/BIOS are replying “Say what now?”
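
One way to test that theory the next time it happens, before rebooting, is to look at what state the stuck process is in and where in the kernel it sits. As far as I know, a SIGKILL is only acted on once the task leaves kernel code, so a process spinning inside a driver or filesystem call would ignore kill -9 until that call returns. A minimal sketch, assuming a Linux /proc; the function names and the PROC_ROOT override are mine, and /proc/<pid>/stack needs root and is not available on every kernel.

```shell
#!/bin/sh
# PROC_ROOT can be overridden for testing; it defaults to the real /proc.
PROC_ROOT=${PROC_ROOT:-/proc}

# Print the one-letter state (R, S, D, Z, T, ...) from /proc/<pid>/status.
proc_state() {
    awk '/^State:/ {print $2; exit}' "$PROC_ROOT/$1/status"
}

# Print state, wait channel and (if readable) the kernel stack of a PID.
inspect_pid() {
    pid=$1
    echo "state: $(proc_state "$pid")"
    # wchan: kernel function the task is sleeping in (often 0 or empty while running).
    if [ -r "$PROC_ROOT/$pid/wchan" ]; then
        echo "wchan: $(cat "$PROC_ROOT/$pid/wchan" 2>/dev/null)"
    fi
    # stack: kernel call chain; usually readable by root only.
    if [ -r "$PROC_ROOT/$pid/stack" ]; then
        cat "$PROC_ROOT/$pid/stack" 2>/dev/null || true
    fi
    return 0
}

# Demo on the current shell; for a stuck process run it as root with that PID.
inspect_pid "$$"
```

If the stack of a stuck smartmon.sh or udevd keeps showing the same USB, SCSI or ZFS functions, that would at least point at which layer is looping.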

So it’s a little better:
the problem now appears after 10-15 days rather than after 2.
Any other ideas?

  0[######************************************100.0%*1500MHz*50C]    2[#########*********************************100.0%*1500MHz*50C] Tasks: 472, 1360 thr, 209 kthr; 4 running                                                                                       [0/1]
  1[##################************************100.0%*1500MHz*50C]    3[###############***************************100.0%*1500MHz*50C] Load average: 25.60 35.84 34.32
Avg[#########################************************************************************************************100.0%*1500MHz*50C] Uptime: 17 days, 19:50:47
Mem[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||#************************************************3.98G/7.63G] ARC: 3.81G Used:2.39G MFU:1.93G MRU:188M Anon:32.9M Hdr:45.6M Oth:198M
    PID USER      PRI  NI  VIRT   RES   SHR S CPU%-MEM%   TIME+  Command
3354041 root       20   0  161M  1004     0 R 44.6  0.0     194h /usr/sbin/zed -F
 618288 systemd-c  20   0  208M  1824     0 R 43.0  0.0     183h postgres
4067292 root       20   0 2837M  8652  1108 R 33.3  0.1  7h36:14 runc init
2303464 root       20   0  8836   388     0 R 27.1  0.0 46h29:39 /bin/bash /usr/share/prometheus-node-exporter-collectors/smartmon.sh
 529891 root       20   0  8704   268     0 R 25.1  0.0     120h /bin/bash /usr/share/prometheus-node-exporter-collectors/apt.sh
   4488 prometheu  20   0 2432M 23144  6420 S 21.5  0.3 28h11:04 /usr/bin/prometheus-node-exporter
 869618 root       20   0 22976  1812     0 R 21.5  0.0  1h47:52 /lib/systemd/systemd-udevd
3417063 root       20   0 22976  1808     0 R 21.0  0.0  4h09:50 /lib/systemd/systemd-udevd

Yeah, the powerpc image had this problem so bad they dropped it. There was a thread leak they didn’t bother fixing and lazily said ‘it’s the platform, buy a new computer’ instead of actually doing anything about it.

I would just use a lighter distro. To see high CPU use on a RISC system is kinda normal tho, unless you are getting lag. At least on PPC. It’s more showing efficiency than anything on there.

Look at void linux if you cannot resolve this. You might have a better time as it doesn’t use sysD and it has a more active dev group.

He’s not on RISC or PPC but a lighter distro is a better option, go back to actual Raspbian.

@vonProteus While at least it’s not SMART this time, this simply looks like basic maintenance stuff for a normal running system, and the tools chosen in the base Ubuntu image are just way too “feature rich” for a Pi. You really should go back to Raspbian and call it a day. In the end it’s better vetted for the platform, and Debian is Debian. Even Armbian would be a better choice over Ubuntu really.

This reminds me of an issue I had for years with LUbuntu where my CPU would be bagged. System was completely unusable and my CPU/System monitoring tools would not agree on this. So taskmanager would say CPU was at 0.2% and never show the offending process but the tool that would show the cpu over 100% would also either never show a process (or several) amounting to the massive CPU load or the system would be too bagged to even launch a tool to investigate.

Raspberry Pi OS is available in 64-bit now,
so I will try to install it on ZFS and see if it does a better job.

Thanks for your help @get_off_my_lawn

ZFS IMO has too much overhead for a small CPU like that. Plus you can’t really attach any real drives in a way that will be meaningful for ZFS. However, like with anything, you’re free to try.
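
For what it’s worth, if ZFS stays, one knob commonly used on small machines is a cap on the ARC (the htop output earlier in the thread shows ARC at 3.81G of 7.63G). A minimal sketch, assuming OpenZFS on Linux, where the limit is the zfs_arc_max module parameter given in bytes. The helper name and the 1 GiB figure are arbitrary examples, not a recommendation, and this addresses memory use rather than the runaway CPU.

```shell
#!/bin/sh
# Hypothetical helper: build the modprobe option line for an ARC cap in GiB.
arc_max_option() {
    echo "options zfs zfs_arc_max=$(( $1 * 1024 * 1024 * 1024 ))"
}

# Show the current ARC size in bytes if the OpenZFS kstat file is present.
ARCSTATS=/proc/spl/kstat/zfs/arcstats
if [ -r "$ARCSTATS" ]; then
    awk '$1 == "size" {print "current ARC bytes:", $3}' "$ARCSTATS"
fi

# Print the line for a 1 GiB cap (example value only).
arc_max_option 1

# To apply, the printed line would go into a modprobe config file such as
# /etc/modprobe.d/zfs.conf, followed by rebuilding the initramfs and a reboot.
```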

I hope I helped, but really I just knew what a few processes were and that normal Ubuntu for a low-power CPU is still nuts. Which isn’t a dig at you or anyone else running it, but more kinda at how Canonical has become bad at most things they were once really good at. Even the full Desktop Ubuntu flavors are riddled with issues these days. Hell, the installer can’t even finish its completed-install shutdown/reboot haha.

I suppose another caveat is if you’re that nut Jeff Geerling. Hey guys, today I’ve got some obscure pre-production Raspberry Pi NAS HAT (that will never make it to production) that costs $5000 USD and I’ve attached $20K worth of SATA SSDs. Let’s see how it does with a ZFS pool! - And in my next video I’ll show you how I used home automation to make the propeller on my beanie hat spin whenever someone in my proximity says my first name. As a bonus Red Shirt Jeff will also make it play a sample of a slide whistle when anyone says “Poop.”