I’ve been having this issue for a few years now. It first started way back when I was running Ubuntu (16.04 I think) and, after not being able to find the root cause, I found some workarounds that made it stop crashing… or at least, made it crash much less.
The problem is now rearing its head on my Debian 10 (root on ZFS) install.
The crashes only occur when I’m away from the system. They also only occur when the system is not doing certain tasks… let me explain. Back when my Ubuntu install was crashing, the symptoms were basically that the screen would remain blank when waking or switching the monitor back on, or everything apart from the mouse would be frozen and unresponsive. Sometimes though, not even the mouse would work. I couldn’t switch TTYs… even SysRq shortcuts wouldn’t work. My only recourse was to do a hard reboot or power cycle.
I can’t remember the reasoning behind it, but I tried running Rhythmbox (Ubuntu’s default music application) and leaving it playing an internet radio station. This drastically reduced the crashes, to the point where I can’t be certain that it ever crashed again. When I said it only crashed when not doing certain tasks, those tasks were anything that involved continual user input (basically any form of physical interaction with the mouse and keyboard) or media playback such as music. The system only crashed when left to transfer or hash files, download torrents, etc.
Fast forward a few years to my Debian install: after a few months without issue, it started freezing when I left it transferring and hashing files between local and NAS storage. There’s a deadline on transferring and hashing the files, so I installed Rhythmbox and tuned into an internet radio station as a short-term “fix”, which helped for a couple of days, but now it has crashed again. Only this time, the system was left idling. A Firefox window was open, I was using vi to make a shopping list, and the internet radio was playing. I left the system for maybe fifteen minutes and when I came back, there was a blank screen, my GPU fans were running (they should be in 0rpm mode), and 1 of my 2 CPU fans had stopped.
The thing is though, these issues have never happened on my Windows install (which makes me think the issue is related to Linux and it not playing nice with my hardware). I’m a real noob when it comes to diagnosing this sort of thing on Linux and would really appreciate some guidance please.
Here are my hardware specs:
CPU: Ryzen 7 1700x
CPU Heatsink: Noctua NH-D15
Mobo: Gigabyte AX370 Gaming 5
GPU: MSI 1660 Super Gaming X
Case: SilverStone FT02 (3x 180mm fans and 1x 120mm fan)
Boot - Linux: WD Black 500GB NVMe
Boot - Microsoft Spyware 10: Samsung 850 500GB
Defunct hardware
CPU Heatsink: Scythe Kotetsu
GPU: Asus GTX 670 DirectCU II TOP
Case: Fractal Design XL R2
Boot - Linux: Crucial MX200
Boot - Microsoft Spyware 10: Samsung 830 250GB
[NOTE] Back when I was using my old Fractal case and the GTX 670, I stress tested the system for reasons unrelated to these (or any) crashes. I do have fan curves that err on the side of silence, but cooling was not an issue then and, although I haven’t properly stress tested the updated system in the FT02 case, I have no reason to believe cooling is an issue now, given that I can play games and transcode Blu-rays on my Windows install without any issues.