NVMe drive suddenly not showing up

One day the NVMe boot drive in my laptop suddenly stopped booting. When I went into the BIOS, the page that is supposed to list all NVMe drives showed only my data NVMe drive; the boot drive just wasn’t there. I booted into an Ubuntu live USB and I can see the drive in lspci:

5b:00.0 Non-Volatile memory controller: Sandisk Corp WD PC SN810 / Black SN850 NVMe SSD (rev 01) (prog-if 02 [NVM Express])
	Subsystem: Sandisk Corp WD PC SN810 / Black SN850 NVMe SSD
	Flags: fast devsel, IRQ 16, IOMMU group 22
	Memory at 86300000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [80] Power Management version 3
	Capabilities: [90] MSI: Enable- Count=1/32 Maskable- 64bit+
	Capabilities: [b0] MSI-X: Enable- Count=65 Masked-
	Capabilities: [c0] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [1b8] Latency Tolerance Reporting
	Capabilities: [300] Secondary PCI Express
	Capabilities: [900] L1 PM Substates
	Capabilities: [910] Data Link Feature <?>
	Capabilities: [920] Lane Margining at the Receiver <?>
	Capabilities: [9c0] Physical Layer 16.0 GT/s <?>
	Kernel modules: nvme

but nothing in lsblk
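
One detail worth checking in output like the above: `lspci -v` normally prints a "Kernel driver in use:" line when a driver has actually bound to the device, while "Kernel modules:" only lists modules that could handle it. Below is a minimal Python sketch that checks pasted lspci text for that line. The function name `driver_binding` and the `SAMPLE` string (two lines copied from the output above) are just for illustration, not part of any tool.

```python
import re

# Matches the start of a device header line such as "5b:00.0 ..." or
# "0000:5b:00.0 ..." (optional PCI domain prefix).
PCI_ADDR = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])\s", re.I)

def driver_binding(lspci_verbose: str) -> dict:
    """Map each PCI address in `lspci -v`-style text to its bound driver, or None."""
    result = {}
    current = None
    for line in lspci_verbose.splitlines():
        header = PCI_ADDR.match(line)
        if header:
            # New device block; assume unbound until a driver line is seen.
            current = header.group(1)
            result[current] = None
        elif current is not None and "Kernel driver in use:" in line:
            result[current] = line.split(":", 1)[1].strip()
    return result

# Two lines taken from the lspci output in this post.
SAMPLE = (
    "5b:00.0 Non-Volatile memory controller: Sandisk Corp WD PC SN810 / "
    "Black SN850 NVMe SSD (rev 01) (prog-if 02 [NVM Express])\n"
    "\tKernel modules: nvme\n"
)

if __name__ == "__main__":
    for addr, driver in driver_binding(SAMPLE).items():
        print(addr, "bound to", driver if driver else "no driver")
```

If the device shows no bound driver, that would be consistent with the controller being visible on the PCIe bus while no block device ever appears in lsblk.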

Any ideas on how I can at least recover some data?

I’ve seen this behavior from SN850s before.
Try removing the drive from the laptop, turning the laptop on for a few minutes without the drive, and then turning it back off. After that, reinstall the drive and see if it shows up correctly.

No dice, unfortunately. I took out the drive, booted into Ubuntu live, waited around 5 minutes, powered off, put the drive back in, and it still doesn’t show up in the BIOS.

Got any other computers you could temporarily transplant the drive into? That would rule out something strange happening to the PCIe lanes on the laptop itself.

Just tried that. Still no dice.

I plugged it into a different laptop which only has 1 M.2 slot.

After it didn’t immediately boot, I also went into Ubuntu live and gathered some more info.

Here is some lshw output:

        *-pci:1
             description: PCI bridge
             product: Renoir/Cezanne PCIe GPP Bridge
             vendor: Advanced Micro Devices, Inc. [AMD]
             physical id: 2.4
             bus info: pci@0000:00:02.4
             version: 00
             width: 32 bits
             clock: 33MHz
             capabilities: pci pm pciexpress msi ht normal_decode bus_master cap_list
             configuration: driver=pcieport
             resources: irq:38 memory:fce00000-fcefffff
           *-nvme UNCLAIMED
                description: Non-Volatile memory controller
                product: WD PC SN810 / Black SN850 NVMe SSD
                vendor: Sandisk Corp
                physical id: 0
                bus info: pci@0000:02:00.0
                version: 01
                width: 64 bits
                clock: 33MHz
                capabilities: nvme pm msi msix pciexpress nvm_express cap_list
                configuration: latency=0
                resources: memory:fce00000-fce03fff

Here is the only dmesg output about nvme:

[   10.982148] nvme nvme0: Device not ready; aborting initialisation, CSTS=0x0

This might be related: 217863 – Lexar NM790 SSDs are not recognized anymore after 6.1.50 LTS

I.e. the drive reports a bogus zero time to get ready, some versions of the Linux kernel respect that bogus value, and so the drive fails to initialise. Did this issue start after a kernel upgrade?
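
For anyone reading along, here is a rough Python sketch of what those two values mean, based on my reading of the NVMe spec register layout: CSTS (the value in the dmesg line) has RDY in bit 0 and CFS in bit 1, and the controller's advertised ready timeout is the TO field in bits 31:24 of the CAP register, in 500 ms units. The function names are made up for illustration, and the CAP values are hypothetical since the thread doesn't show CAP for this drive.

```python
def decode_csts(csts: int) -> dict:
    """Split an NVMe Controller Status (CSTS) value into its low fields."""
    return {
        "RDY": bool(csts & 0x1),      # bit 0: controller ready
        "CFS": bool(csts & 0x2),      # bit 1: controller fatal status
        "SHST": (csts >> 2) & 0x3,    # bits 3:2: shutdown status
        "NSSRO": bool(csts & 0x10),   # bit 4: NVM subsystem reset occurred
        "PP": bool(csts & 0x20),      # bit 5: processing paused
    }

def cap_timeout_ms(cap: int) -> int:
    """Ready timeout advertised in CAP.TO (bits 31:24), converted to milliseconds."""
    return ((cap >> 24) & 0xFF) * 500

if __name__ == "__main__":
    # CSTS=0x0 is the value from the dmesg line in this thread.
    print(decode_csts(0x0))
    # Hypothetical CAP values: TO=0 gives a zero timeout, TO=0x3C gives 30 s.
    print(cap_timeout_ms(0x00 << 24), cap_timeout_ms(0x3C << 24))
```

So CSTS=0x0 reads as "not ready, but no fatal error flagged", and a TO field of zero would give the driver no time at all to wait for RDY, which is the situation that bug report describes.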

My primary OS on this machine is Windows and it is installed on the drive. Everything worked perfectly for years until it suddenly just rebooted into the BIOS showing no bootable devices. I’m just using Linux to gather some kind of info, because it is the only thing that even sees it.

Dang, I ran into what seemed like the same issue on an SN850 two weeks ago, and swapping the drive into another NVMe slot on the motherboard cured it.
It’s possible it took me longer than 5 minutes to turn the computer off and swap slots, but I would have thought 5 minutes would be enough for any kind of volatile storage on the NVMe controller to discharge.