[SOLVED] Kdump cannot use NVME on Debian

i am trying to configure kdump on my debian system
however, when the kdump kernel loads, it seems to fail to mount my storage device. this drops it to the initramfs shell, but as you can see, the kdump kernel also fails to bring up USB, so i cannot use my keyboard or the initramfs shell to help debug this.
Linux 5.8.10
asus prime x399-a motherboard.
threadripper 1950x.

Seems like you do not have enough drivers in your initramfs/mkinitcpio to actually load your hardware. Check what is in your Debian kernel versus the kernel that you rolled.
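
A minimal sketch of that driver check, assuming the listing comes from Debian's `lsinitramfs` tool (from initramfs-tools). The function names and the example image path are illustrative only and not from the thread. An empty result does not prove the driver is missing, since NVMe support can also be built into the kernel itself.

```python
import re
import subprocess

# Matches kernel module files whose file name starts with "nvme"
# (nvme.ko, nvme-core.ko, ...), optionally compressed (.ko.xz, .ko.zst, .ko.gz).
NVME_MODULE = re.compile(r"(^|/)nvme[^/]*\.ko(\.\w+)?$")


def nvme_modules(listing):
    """Return the paths in an initramfs file listing that look like NVMe modules."""
    return [path for path in listing if NVME_MODULE.search(path.strip())]


def list_initramfs(image):
    """List the files inside an initramfs image by calling lsinitramfs."""
    result = subprocess.run(
        ["lsinitramfs", image], check=True, capture_output=True, text=True
    )
    return result.stdout.splitlines()


# Example use (the path is an assumption; pass whichever initrd kdump loads):
#   print(nvme_modules(list_initramfs("/boot/initrd.img-5.10.0")))
```

Running the same check against the normal initrd and the one kdump is configured to load would show whether the two actually differ.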

my kdump kernel is a symlink to the system’s normal kernel. therefore kdump has all the drivers it should need.
kdump’s kernel isn’t even loading the initrd file, as it’s unable to mount the nvme device on which its initramfs file is stored.

But the normal Kernel boots? If so, check the permissions on the symlinks!?

all the kdump symlinks have permissions lrwxrwxrwx
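
On Linux a symlink's own mode is always displayed as lrwxrwxrwx, so the more telling check is whether each link resolves to an existing, readable target. A rough sketch of that check follows; the paths in the usage comment are an assumption about where kdump-tools keeps its links and may differ on this system.

```python
import os
import stat


def inspect_link(path):
    """Follow a symlink and report where it points and whether the target exists."""
    target = os.path.realpath(path)  # fully resolved target path
    info = {"link": path, "target": target, "exists": os.path.exists(target)}
    if info["exists"]:
        st = os.stat(target)  # stat() follows links, so this is the target's mode
        info["mode"] = stat.filemode(st.st_mode)
        info["size"] = st.st_size
    return info


# Example use (paths are assumed, not taken from the thread):
#   for link in ("/var/lib/kdump/vmlinuz", "/var/lib/kdump/initrd.img"):
#       print(inspect_link(link))
```

A dangling link, or a target with a size of zero, would point to a packaging or configuration problem rather than a driver problem.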

That is a strange one.

update: i have tried disabling APST and increasing the NVME IO timeout to 255. none of this changed this behavior at all.
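
For anyone repeating this test, here is a small sketch for confirming that such settings really appear on a kernel command line. It assumes the changes were made through the `nvme_core.default_ps_max_latency_us` and `nvme_core.io_timeout` module parameters; the post does not say how they were applied, so treat those names as a guess about the method.

```python
def nvme_core_params(cmdline):
    """Extract nvme_core.* settings from a kernel command line string."""
    prefix = "nvme_core."
    params = {}
    for token in cmdline.split():
        if token.startswith(prefix) and "=" in token:
            key, value = token.split("=", 1)
            params[key[len(prefix):]] = value
    return params


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the currently running kernel."""
    with open(path) as f:
        return f.read()


# Example use on the running kernel:
#   print(nvme_core_params(read_cmdline()))
```

Since the crash kernel never reaches a usable shell here, the function takes a plain string, so the command line configured for kdump can be pasted in and checked the same way.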

Have you reached out to the Debian or Ubuntu forums? Your issue is seriously weird, but since I don’t have NVMe, I cannot independently test.

i have reached out to the Debian forums. no replies as of yet.

after over a month of no replies on the user forums, i have filed a bug report with Debian.

Do you frequently perform tasks that require the use of kdump?

Otherwise you don’t really need it.

Lots of guides actually recommend turning off kdump to get the system resources reclaimed.

So unless you really need it, it’s not really worth your time.

yes.
my setup is unstable and the kernel crashes frequently. without kdump, i have no means of finding out what the cause of these crashes is.
also my system lacks IPMI, so kdump is very useful to reboot my system when the kernel does crash.

Hmm.

Something is not right if your system still continues to be unstable after this time.

there are so many things that could be the culprit(s) of my instability. if only i could capture dumps of my system crashes.

Do you have the latest firmware for your board?

According to their site, the UEFI should be version 1203.

Secondly, what other distros have you tried? Does something like the latest Ubuntu work?

And could you please post your full system specs?

i am using firmware 1002. i have not upgraded because i see threads reporting that upgrading firmware on AGESA boards can break IOMMU.
also, losing my AGESA settings would be a huge inconvenience.

changing distros just to try it is not something i’m willing to put in the time to do. recreating all my services and VMs would take too long. also, the downtime required for it would be unacceptable.
as for my system specs, here’s a hardinfo report: hardinfo_report.html - Google Drive

What I meant was a live distro to test with via a USB. Just to see if other distros have the same behavior.

However, upon reading your system report, I see you have custom compiled the latest LTS kernel, so just distro-hopping probably won’t help that much.

We do seem to be between a rock and a hard place here, with the inability to upgrade firmware, certain devices being flaky, and distro maintainers being unresponsive.

the only customization to my kernel is gnif’s “vendor-reset” module being built-in. otherwise, it is debian’s 5.10 kernel.

Ahh, thank you for that clarification.


@gnif what sort of further troubleshooting steps would you recommend for our friend?

OP has already contacted the maintainers and opened a bug report. However, I know you are very familiar with Debian and might know a bit more about what else the OP can do.

i managed to get some clearer and more detailed pictures of this problem occurring.

images