AM5 with NVMe hotplug, what's the best way?

I am trying to build a small home server. Among my objectives:

  • Runs Linux
  • Hot-pluggable NVMe and SATA storage
  • 10GBase-T
  • Reasonable idle power consumption

I have come really close to checking all of these boxes with a cheap AM5 board, a Silverstone CS382 case, and an Icy Dock MB699VP.

The idle power consumption ended up at 35 W, which isn’t nearly as low as some Intel platforms can manage, but that doesn’t bother me.

The remaining issue is NVMe hotplug. I connected the Icy Dock to the motherboard’s bifurcated x16 slot. It works fabulously, as long as the U.2 drives are plugged in at boot time. Unused U.2 slots are not reported by the OS, and if a drive is unplugged after boot, the record of its PCIe connection just seems to … disappear from lspci.

Is there any way to make this work?

I really don’t want to get an HBA or a PCIe switch board – a modestly priced x8 HBA would cut the PCIe lanes available to the drives in half, and push the power consumption to uncomfortable levels.

I could also drop everything and move to Intel. Like W680? Those boards aren’t cheap.

Thanks for reading.

Have you tried the PCIe hotplug functionality (the remove and rescan files in the /sys tree)? I haven’t tried it myself, but a simple web search gives lots of hits about it.
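
From what I can tell, those hits boil down to writes like these (substitute the drive’s PCI address; again, untested by me):

# echo 1 > /sys/bus/pci/devices/<pci-address>/remove
# echo 1 > /sys/bus/pci/rescan

The first write detaches the driver and deletes the device from the PCI tree; the second asks the kernel to re-enumerate the buses and pick up anything that appeared.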

That’s a good question, thanks for bringing it up.

I have two U.2 drives, and if I plug them both in before boot, they’re visible, and attached directly to the root complex:

# lspci -tv
-[0000:00]-+-00.0  Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Root Complex
           +-01.1-[01]----00.0  Intel Corporation SSD 670p Series [Keystone Harbor]
           +-01.3-[03]----00.0  Intel Corporation SSD 670p Series [Keystone Harbor]

I can perform a software remove and rescan – pretending to unplug the device and plug it back in – and the drive is functional afterward:

# lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
nvme0n1     259:0    0   1.9T  0 disk  
nvme1n1     259:1    0   1.9T  0 disk  
# echo 1 > /sys/bus/pci/devices/0000\:00\:01.3/0000\:03\:00.0/remove
# lspci -tv
-[0000:00]-+-00.0  Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Root Complex
           +-01.1-[01]----00.0  Intel Corporation SSD 670p Series [Keystone Harbor]
           +-01.3-[03]--
# lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
nvme0n1     259:0    0   1.9T  0 disk  
# echo 1 > /sys/bus/pci/devices/0000\:00\:01.3/rescan
# lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
nvme0n1     259:0    0   1.9T  0 disk  
nvme1n1     259:1    0   1.9T  0 disk  

Notice that after the remove command, lspci still lists the 01.3 port, but with no device behind it.

A more complete scenario is to actually unplug the drive, first issuing the polite remove command as above:

# echo 1 > /sys/bus/pci/devices/0000\:00\:01.3/0000\:03\:00.0/remove
# lspci -nnk -s '0000:03:00.0'

After this, I unplug the drive and plug it back into the same slot. The system recognizes it and configures it, but the nvme driver silently refuses to attach:

# echo 1 > /sys/bus/pci/devices/0000\:00\:01.3/rescan
# lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
nvme0n1     259:0    0   1.9T  0 disk  
# lspci -nnk -s '0000:03:00.0'
03:00.0 Non-Volatile memory controller [0108]: Intel Corporation SSD 670p Series [Keystone Harbor] [8086:f1aa] (rev 03)
	Subsystem: Intel Corporation Device [8086:390f]
# lspci -nnk -s '0000:01:00.0'
01:00.0 Non-Volatile memory controller [0108]: Intel Corporation SSD 670p Series [Keystone Harbor] [8086:f1aa] (rev 03)
	Subsystem: Intel Corporation Device [8086:390f]
	Kernel driver in use: nvme

The dmesg logs show:

[  554.480536] pci 0000:03:00.0: [8086:f1aa] type 00 class 0x010802
[  554.480557] pci 0000:03:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit]
[  554.480588] pci 0000:03:00.0: Max Payload Size set to 512 (was 128, max 512)
[  554.480822] pci 0000:03:00.0: Adding to iommu group 16
[  554.480980] pcieport 0000:00:01.3: ASPM: current common clock configuration is inconsistent, reconfiguring
[  554.734067] pci 0000:03:00.0: BAR 0: assigned [mem 0xf6c00000-0xf6c03fff 64bit]
[  554.734290] nvme nvme1: pci function 0000:03:00.0
[  554.734303] nvme 0000:03:00.0: enabling device (0000 -> 0002)

That said, the Icy Dock has two other slots that were empty at boot time. I can plug the drive into either of them, but issuing rescan commands – on the root or on a specific slot device – is useless; none of them cause the device to be recognized. It is as if a slot only exists when a drive was plugged into it at boot time.
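
(One thing I have not dug into yet: I assume the slot capabilities would show whether these ports even advertise hotplug support, something like

# lspci -vv -s 0000:00:01.3 | grep -i hotplug

looking for HotPlug+ in the SltCap line.)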

Also, if I skip the polite remove command and just yank the drive, the 01.3 port eventually disappears from the lspci output.

The kernel command line includes pci=realloc.

Interesting!

While looking around a bit I found a few tidbits:

This post also mentions re-binding the driver to the device, but doesn’t show how that is done. It also mentions reserving space for PCI bus numbers for new devices.
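
If it’s the mechanism I’m thinking of, re-binding also goes through sysfs, roughly like this (the address is just an example and none of it is tested by me):

# echo 0000:03:00.0 > /sys/bus/pci/drivers/nvme/bind

or, to let the kernel pick the matching driver itself:

# echo 0000:03:00.0 > /sys/bus/pci/drivers_probe

As for reserving space, the kernel parameters I’ve seen mentioned are along the lines of pci=hpbussize=N and pci=hpmemsize=64M, which set aside extra bus numbers and memory windows behind hotplug bridges – though I have no idea whether they help with bifurcated root ports.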

Maybe some of those things are useful?

Here’s another take:

From https://linuxhaxor.net/code/pci-linux.html