Manjaro hard crashing after switching hardware

I recently swapped most of my hardware, going from Intel to AMD. I run Manjaro, usually with the ACS patch. After using the system for a minute or two, it hard crashes, and this is consistent. I have tried regenerating the initramfs with mkinitcpio and toggling different settings in my BIOS (IOMMU and virtualization), but the crashing persists. Reviewing the logs and dmesg reveals nothing scary, but if they are needed I can try to get them and post them here.

My motherboard is an Asus X570-E Gaming WiFi II with a Ryzen 7 2700X. The GPUs are a 2080 Super and an RX 260X.

I am very sure the issue isn’t hardware based: I booted Windows and played Doom Eternal on Ultra Nightmare settings with zero issues. I am at a loss. What could be happening? Do I need to reinstall?

Any help is appreciated,
Thank you.

Not sure about how Manjaro configures things, but the first idea that popped into my head was the vendor-specific microcode image, which is typically loaded really early in the boot process.

See the Arch wiki entry on Microcode.

It may be worth checking your boot configs and kernel params for anything Intel-specific?
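
As a rough sketch of that check, something like the following would list any kernel parameters that mention Intel. The function name `intel_params` is made up for illustration, and it assumes the running kernel's command line is readable at /proc/cmdline (an optional file argument overrides that).

```shell
# Hypothetical helper: print Intel-specific parameters from a kernel command line.
# Reads /proc/cmdline unless another file is given as the first argument.
intel_params() {
    # Put each parameter on its own line, then keep the ones that mention "intel"
    tr ' ' '\n' < "${1:-/proc/cmdline}" | grep -i 'intel'
}
```

Running `intel_params` with no arguments checks the currently booted kernel; a leftover such as `intel_iommu=on` from a passthrough setup would show up here. On Arch-based systems the microcode images normally live at /boot/amd-ucode.img and /boot/intel-ucode.img, so `ls /boot/*-ucode.img` shows which ones are installed.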

I suggest booting from a Live USB, to see if the problem reproduces. If it doesn’t, a reinstall will likely fix it.

I believe I have removed anything Intel-specific. I have also installed the AMD microcode, and GRUB did detect it correctly.

Hmmm - and nothing is showing up in your logs?

I’d definitely try @jlittle’s suggestion and get a Live USB to rule out some weird instability. Otherwise, I think you need to make a decision based on your goals…

  1. Scour the logs, review the wiki, and try to crack the puzzle - I’d expect you’d eventually figure out what’s causing the instability, but it might take a while…
  2. Reinstall - Hopefully you have your dotfiles, data, and other configuration backed up. If so, that’s definitely the quickest route to “solving” things.

I booted into a live USB and everything was fine, so I have no clue what the issue was. I imagine something might show up in dmesg, but probably just before it crashes, which would be almost impossible to catch in real time. Are dmesg or kernel logs stored somewhere that I could check?

I think it’s also useful to mention that KDE no longer starts; it simply boots into the Manjaro web terminal. When I attempt to start sddm manually, nothing happens. I’m thinking this could be some sort of issue spawned from the crashing? As in, maybe some things are getting corrupted from crashing so often? Or just straight up breaking?

Yeah, systemd keeps track of it with their “journal”

# journalctl -b -1

The “-b -1” option points it at the previous boot. I think the “-k” flag gives you dmesg output, but I don’t remember off the top of my head.
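
For what it’s worth, the journalctl man page does list `-k` (`--dmesg`) as the kernel-message filter, so the two can be combined. Below is a minimal sketch, assuming the journal is persistent (i.e. /var/log/journal exists, otherwise earlier boots are not kept). The function name `prev_boot_kernel_tail` is hypothetical.

```shell
# Hypothetical helper: show the last kernel messages logged before the previous boot ended.
# First argument: number of lines to show (defaults to 50).
prev_boot_kernel_tail() {
    # -k        kernel messages only (what dmesg would have shown)
    # -b -1     the boot before the current one
    # --no-pager  write straight to stdout so the output can be piped
    journalctl -k -b -1 --no-pager | tail -n "${1:-50}"
}
```

Calling `prev_boot_kernel_tail 100` after a crash and reboot prints the final 100 kernel lines of the crashed session. `journalctl --list-boots` shows which boots the journal still has.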

After booting into Manjaro and letting it crash, the log seems innocuous. Everything, including the kernel, is just going about its processes and then just stops, and the stopping point is not the same between crashes. I will try to get the latest logs and post them here.

This is the log file. It is pretty long so I thought best to upload the entire file.
logs.txt (228.7 KB)

Hmm - Don’t really see anything in the log, aside from TOR crapping out. Not sure if that would cause anything though.

There may be some other logs you can check in /var/log/

I didn’t see this before (or maybe it was the edit?). Did you end up changing out your GPU config as well? If you’re running X (and not wayland), there may be some gpu specific configuration when starting up your DE? Those logs should be in /var/log/

I do a GPU passthrough setup, so my main GPU doesn’t really do much within X. But I did add another GPU and remove one, going from Intel integrated graphics as my main GPU to an AMD GPU. Could this be the issue? We can only speculate now, though: I needed my main setup working and moved over to Arch. I can say that my log directory didn’t show anything scary. I scoured it and still didn’t find anything alarming.

Caveat - I haven’t run X in a few years and I’ve never messed with KDE… so what’s below is most likely all lies:

My guess is that there are some KDE-specific configuration files somewhere that use your GPU setup. When switching around GPUs, the output names can change and I think xorg.conf files can have mappings to particular screens or cards?? Maybe something happened along those lines - especially since it seems like your system is “stable”, aside from the DE…

It’s happening again! On my new Arch install it’s happening! This is actually crazy! I don’t know what happened: it crashed once, and now it just crashes, restarts, and crashes again! I will attempt to get logs and retrace what was happening.

I just uninstalled libvirt, virt-manager, and QEMU, and removed the vfio modules from mkinitcpio.conf. Everything appears to be stable; the crashing has stopped.
I’m going to slowly start reinstalling everything and post my results here.
Edit: Got VFIO working with no issues.
It crashed almost immediately after starting libvirtd and virtlogd.
!!!

!!!

I have fixed it; I am no longer crashing. But what is interesting is that this wasn’t an issue when I had an Intel CPU. Is this a bug? I believe when I had an Intel CPU on Manjaro I had /dev as a storage pool. So what changed when I switched to AMD? Should I report this to the libvirt devs?

To summarize, the issue I ran into is AMD-specific, as far as I know. After the /dev directory is added as a storage pool, libvirtd crashes the entire system (I don’t know why libvirtd doesn’t just crash on its own; instead, everything just stops). Libvirtd then starts again at boot and the cycle repeats. This is possibly an issue in libvirtd itself, but it is a major one either way, considering that no logs show that anything was wrong.
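
For anyone who wants to check for the same setup without starting libvirtd, here is a rough sketch. It assumes system-mode libvirt, which keeps its pool definitions as XML files under /etc/libvirt/storage. The function name `find_dev_pools` is made up for illustration, and the pattern only matches a pool whose target path is exactly /dev.

```shell
# Hypothetical helper: list storage pool definitions whose target path is /dev.
# First argument: directory holding the pool XML files
# (defaults to /etc/libvirt/storage, an assumption about a system-mode install).
find_dev_pools() {
    dir="${1:-/etc/libvirt/storage}"
    # -l: print only the names of matching files
    # -s: stay quiet if the directory has no XML files
    # -E: extended regex, so "/?" tolerates a trailing slash on the path
    grep -lsE '<path>/dev/?</path>' "$dir"/*.xml
}
```

Any file it prints defines a pool pointed at /dev. With libvirtd running, `virsh pool-autostart <pool> --disable` stops such a pool from being activated automatically; if the daemon cannot be started safely, the same files can be inspected from a live USB instead.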

Hmm - the /dev folder contains interfaces to physical devices. It makes sense that those mappings change when you change hardware. See the Arch wiki.

I’m happy you solved it, though I’m not sure why you’d ever map the entire /dev dir and not a specific device within /dev. There’s a lot of stuff in there that’s not related to storage.

I guess in my mind I knew it had some storage devices I would need. So I thought “why not”, especially since at the time it didn’t really hurt. And sometimes I did use devices by UUID anyway.

ha - makes sense! I’m honestly surprised it worked :rofl:

It didn’t! I ended up passing through controllers for the needed devices instead. I just didn’t go out of my way to remove it.
