Hi All,
I am running a Linux server with the following specs:
NVMe Drives: 2x Corsair MP600 PRO NH 8 TB PCIe4 NVMe in Software (mdadm) RAID-1
Motherboard: AsRockRack B650D4U
CPU: AMD Ryzen 7950X
I notice a significant slowdown, regardless of disk measuring tool (I tried both dd and fio), after the server has been powered on for a few hours. If I reboot the server, the disk I/O results are back up in the 2-3 GB/s range, but if I try again a few hours later, it's down to the 300 MB/s range.
To illustrate, here's what it's supposed to look like:
fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/md126):
Block Size | 4k (IOPS) | 64k (IOPS)
---------- | ---------- | ----------
Read | 903.14 MB/s (225.7k) | 2.44 GB/s (38.2k)
Write | 905.52 MB/s (226.3k) | 2.45 GB/s (38.4k)
Total | 1.80 GB/s (452.1k) | 4.90 GB/s (76.6k)
| |
Block Size | 512k (IOPS) | 1m (IOPS)
---------- | ---------- | ----------
Read | 3.08 GB/s (6.0k) | 3.14 GB/s (3.0k)
Write | 3.24 GB/s (6.3k) | 3.35 GB/s (3.2k)
Total | 6.33 GB/s (12.3k) | 6.49 GB/s (6.3k)
And here’s what it looks like after the server has been powered on for more than a few hours:
fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/md126):
Block Size | 4k (IOPS) | 64k (IOPS)
---------- | ---------- | ----------
Read | 85.42 MB/s (21.3k) | 328.25 MB/s (5.1k)
Write | 85.65 MB/s (21.4k) | 329.98 MB/s (5.1k)
Total | 171.07 MB/s (42.7k) | 658.23 MB/s (10.2k)
| |
Block Size | 512k (IOPS) | 1m (IOPS)
---------- | ---------- | ----------
Read | 412.86 MB/s (806) | 414.53 MB/s (404)
Write | 434.80 MB/s (849) | 442.14 MB/s (431)
Total | 847.67 MB/s (1.6k) | 856.67 MB/s (835)
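For reference, here is a fio job file that roughly reproduces the mixed 50/50 tests above (this is an illustrative sketch, not the exact invocation the benchmark script uses; the test file path and size are placeholders - it deliberately targets a file rather than the raw /dev/md126 device so it's non-destructive):

```ini
; mixed-4k.fio -- approximate the 4k mixed 50/50 test above
; (illustrative job file; the benchmark's exact options may differ)
[global]
ioengine=libaio
direct=1
rw=randrw
rwmixread=50
iodepth=64
numjobs=2
runtime=30
time_based
group_reporting

[mixed-4k]
bs=4k
filename=/root/fio.test   ; placeholder path on the md126 filesystem
size=2G
```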
At first I thought it might be an issue with the NVMe drives getting more full, because in the beginning stages when I first deployed this server, the I/O tests were consistently good. However, some of the other servers we have with the exact same build are at 80-90% capacity and still have decent disk I/O performance, so I don't think that's the case here. Also, the issue occurs on this server when the NVMe is only at 30% usage, so I don't think it's related to how much storage capacity is being used.
iotop shows less than 50-100 MB/s of usage at any given time, and CPU usage is low:
Total DISK READ : 768.05 K/s | Total DISK WRITE : 11.91 M/s
Actual DISK READ: 768.05 K/s | Actual DISK WRITE: 12.09 M/s
Firmware looks to be up to date according to nvme list:
[root@server ~]# nvme list
Node SN Model Namespace Usage Format FW Rev
/dev/nvme0n1 A5LIB340001QRC Corsair MP600 PRO NH 1 8.00 TB / 8.00 TB 512 B + 0 B EIFM51.3
/dev/nvme1n1 A5LIB340001PT7 Corsair MP600 PRO NH 1 8.00 TB / 8.00 TB 512 B + 0 B EIFM51.3
Here are the temperature readings/SMART log data:
[root@server ~]# nvme smart-log /dev/nvme0n1
Smart Log for NVME device:nvme0n1 namespace-id:ffffffff
critical_warning : 0
temperature : 63 C (336 Kelvin)
available_spare : 100%
available_spare_threshold : 5%
percentage_used : 0%
endurance group critical warning summary: 0
data_units_read : 115,067,351
data_units_written : 36,177,997
host_read_commands : 925,472,295
host_write_commands : 831,088,874
controller_busy_time : 2,688
power_cycles : 3
power_on_hours : 1,185
unsafe_shutdowns : 1
media_errors : 0
num_err_log_entries : 4
Warning Temperature Time : 0
Critical Composite Temperature Time : 0
Thermal Management T1 Trans Count : 0
Thermal Management T2 Trans Count : 0
Thermal Management T1 Total Time : 0
Thermal Management T2 Total Time : 0
[root@server ~]# nvme smart-log /dev/nvme1n1
Smart Log for NVME device:nvme1n1 namespace-id:ffffffff
critical_warning : 0
temperature : 61 C (334 Kelvin)
available_spare : 100%
available_spare_threshold : 5%
percentage_used : 0%
endurance group critical warning summary: 0
data_units_read : 137,448,840
data_units_written : 20,550,570
host_read_commands : 1,113,104,387
host_write_commands : 810,468,119
controller_busy_time : 2,616
power_cycles : 3
power_on_hours : 1,185
unsafe_shutdowns : 1
media_errors : 0
num_err_log_entries : 4
Warning Temperature Time : 0
Critical Composite Temperature Time : 0
Thermal Management T1 Trans Count : 0
Thermal Management T2 Trans Count : 0
Thermal Management T1 Total Time : 0
Thermal Management T2 Total Time : 0
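For scale, the data_units_* counters in `nvme smart-log` are reported in units of 1,000 512-byte sectors (512,000 bytes each, per the NVMe spec), so the totals above for nvme0n1 work out as follows (quick sketch using the values reported above):

```shell
# data_units_* from `nvme smart-log` count 1000 * 512-byte units each.
# Values below are the ones reported for nvme0n1 above.
units_read=115067351
units_written=36177997

# Convert to decimal terabytes.
awk -v r="$units_read" -v w="$units_written" 'BEGIN {
    printf "host reads:  %.1f TB\n", r * 512000 / 1e12
    printf "host writes: %.1f TB\n", w * 512000 / 1e12
}'
```

That's roughly 58.9 TB read and 18.5 TB written over 1,185 power-on hours - well within what these drives should sustain, consistent with percentage_used reporting 0%.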
[root@server ~]# dmesg | grep nvme
[ 1.234403] nvme nvme0: pci function 0000:0c:00.0
[ 1.234415] nvme nvme1: pci function 0000:09:00.0
[ 1.257469] nvme nvme1: Shutdown timeout set to 10 seconds
[ 1.260143] nvme nvme0: Shutdown timeout set to 10 seconds
[ 1.535500] nvme nvme1: 32/0/0 default/read/poll queues
[ 1.538258] nvme1n1: p1 p2 p3 p4 p5
[ 1.584838] nvme nvme0: 32/0/0 default/read/poll queues
[ 1.587866] nvme0n1: p1 p2 p3 p4 p5
[root@server ~]# cat /proc/mdstat
Personalities : [raid1]
md123 : active raid1 nvme1n1p4[0] nvme0n1p4[1]
      52160 blocks super 1.0 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md124 : active raid1 nvme1n1p5[0] nvme0n1p5[1]
      7661665280 blocks super 1.2 [2/2] [UU]
      bitmap: 19/58 pages [76KB], 65536KB chunk

md125 : active raid1 nvme0n1p3[1] nvme1n1p3[0]
      1047552 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md126 : active raid1 nvme0n1p1[1] nvme1n1p1[0]
      83885056 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md127 : active raid1 nvme0n1p2[1] nvme1n1p2[0]
      67107840 blocks super 1.2 [2/2] [UU]

unused devices: <none>
As you can see above, temperatures look fine for both NVMe drives as well - so in my mind, I don't think it's a thermal throttling issue (unless I'm missing something here).
I already tried updating to kernel-lt (5.x) as well as kernel-ml (6.x) - the same symptoms persist.
What am I missing here? I've already verified that nothing crazy is going on in terms of resource usage (iotop and top look fine) and that the RAID array is not rebuilding. pcie_aspm is already set to performance as well:
[root@server ~]# cat /sys/block/nvme0n1/queue/scheduler
[none] mq-deadline kyber bfq
[root@server ~]# cat /sys/module/pcie_aspm/parameters/policy
default [performance] powersave powersupersave
[root@server ~]# cat /sys/block/nvme0n1/queue/write_cache
write back
[root@server ~]#
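In case it's relevant, one power-management angle besides ASPM is NVMe APST (autonomous power state transitions), where a deep idle state with a long exit latency can drag down throughput after the drives have been idle. A quick sketch of how to inspect it (the sysfs path is the standard nvme_core module parameter; the nvme-cli commands are commented out since they need a real device):

```shell
# Sketch: check the NVMe APST (autonomous power state transition) knobs.
f=/sys/module/nvme_core/parameters/default_ps_max_latency_us
if [ -r "$f" ]; then
    echo "default_ps_max_latency_us = $(cat "$f")"
else
    echo "nvme_core parameter not exposed on this system"
fi

# Per-device checks (need nvme-cli and a real device, so commented out):
# nvme id-ctrl /dev/nvme0 | grep -E '^ps +[0-9]'   # advertised power states
# nvme get-feature /dev/nvme0 -f 0x0c -H           # current APST settings
```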
Thanks in advance for any help or guidance here.