Need help figuring out why Ubuntu Server 18.04 LTS is crashing

Hi all,

So I posted a while back about getting an LSI card into IT mode. I got through that fine, flashed the latest firmware onto the card, and finished my home server.

It’s a Frankenstein’s monster built from leftover consumer parts, plus a bunch of drives I’ve added.

CPU: Ryzen 5 1600
MBOARD: ASRock B450 Pro4 ATX AM4
RAM: 4 x Kingston ValueRAM 4 GB ECC Unregistered (ECC/EDAC actually works).
HDD: 8 x Western Digital Red 4 TB (raidz2)
HBA: LSI SAS9211-8I (flashed into IT mode)
GPU: Gigabyte AMD Radeon HD 6670 (an old card I had lying around, since this Ryzen has no iGPU)

Most of the time when I’ve had problems with Linux, the hardware turned out to be at fault rather than the software, but this time I’m having a hard time pinning down the cause.

Every 2-3 days I try to access my server and find it’s unreachable. No SSH, no ping response, nothing. I even tried plugging in a monitor to see if anything was displayed, but I got no signal.

The only way to bring it back online is a hard reset, and then a couple of days later it drops out again.

I’d post some logs, but honestly I’m not sure where to look. Any ideas on how I can track down the culprit?
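For anyone else wondering where to look: the systemd journal for the boot before the reset (`journalctl -b -1`) is the usual starting point, alongside /var/log/syslog and /var/log/kern.log on Ubuntu. Below is a rough Python sketch that pulls the previous boot's journal and flags lines that often go with hardware trouble. It assumes the journal is stored persistently (otherwise there is no previous boot to read), and the keyword list and function names are made up for illustration. A hard freeze often leaves nothing in the log at all, so an empty result doesn't rule out hardware.

```python
import re
import shutil
import subprocess

# Illustrative keyword list (an assumption, not an exhaustive set of kernel messages).
SUSPICIOUS = re.compile(
    r"\bmce\b|machine check|hardware error|panic|oops|soft lockup|watchdog",
    re.IGNORECASE,
)


def find_suspicious(lines):
    """Return the log lines that match any of the keywords above."""
    return [line for line in lines if SUSPICIOUS.search(line)]


def previous_boot_log():
    """Read the journal of the previous boot (-b -1) as a list of lines."""
    result = subprocess.run(
        ["journalctl", "-b", "-1", "--no-pager"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # works on the Python 3.6 that ships with 18.04
    )
    return result.stdout.splitlines()


if __name__ == "__main__" and shutil.which("journalctl"):
    lines = previous_boot_log()
    print("--- last 30 lines before the reset ---")
    print("\n".join(lines[-30:]))
    print("--- lines matching hardware-related keywords ---")
    print("\n".join(find_suspicious(lines)))
```

Run it with sudo (or as a member of the adm group) so the system journal is readable. The last few lines before the log stops are usually more telling than the keyword matches.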

Update the BIOS to the latest version that supports first-gen Ryzen. That would be 3.50 on this board (you might need to flash 1.8 first). Also disable C-states in the BIOS.

The BIOS is on the latest safe version 3.50.

It looks like ASRock have either moved or removed the usual settings for C-states and Cool’n’Quiet.

I’m following the manual from this page:

The “Cool ‘n’ Quiet” and “C6 Mode” are both mysteriously missing from Advanced > CPU Configuration. The manual hasn’t been updated so I can’t find the settings.

I’ve gone through all of the advanced settings pages but there doesn’t seem to be any trace of them.

It should say C-States or Global C-States or something like that. I’m not talking about C6 specifically; I disable C-states entirely.

I can’t look up where it might be because I don’t have that exact board, and all my other systems (except my 2009 Mac Pro) are currently out of use while they wait to be moved to a new place.

@Copious I have an ASRock X570 Taichi. Try Advanced > AMD CBS > CPU Common Options; there should be an option to disable Global C-states.

That was it. I’ve disabled C-states and will let it run for the next few days to see how it behaves.
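In case it helps anyone checking the same change from the OS side: the kernel lists the idle states it knows about under /sys/devices/system/cpu/cpuN/cpuidle/. Here is a small Python sketch that prints them for cpu0, with a made-up helper name. Which states show up depends on the kernel and idle driver, so treat the output as a sanity check rather than proof that the BIOS setting took. My assumption is that the deeper states should no longer be listed once Global C-states is off.

```python
from pathlib import Path


def read_idle_states(cpu_dir):
    """Return (state, name, usage) tuples from one CPU's cpuidle directory."""
    state_dirs = sorted(
        Path(cpu_dir).glob("cpuidle/state[0-9]*"),
        key=lambda p: int(p.name[len("state"):]),  # numeric order: state2 before state10
    )
    states = []
    for state_dir in state_dirs:
        name = (state_dir / "name").read_text().strip()
        usage_file = state_dir / "usage"
        # "usage" counts how many times the CPU has entered this state.
        usage = int(usage_file.read_text()) if usage_file.exists() else 0
        states.append((state_dir.name, name, usage))
    return states


if __name__ == "__main__":
    cpu0 = Path("/sys/devices/system/cpu/cpu0")
    for state, name, usage in read_idle_states(cpu0):
        print(f"{state}: {name} (entered {usage} times)")
```

Comparing the output before and after the BIOS change shows whether the list of states actually changed.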

Marked as Solved?

I can’t be sure it’s solved until it has run for a few days, since that’s the only way to reproduce the problem.

Thanks @noenken. After disabling C-states, the system has remained stable.
