Critiquing: Really shitty AMD X570 (also B550) SATA SSD RAID1/10 performance - Sequential Write Speed Merely A Fraction Of What It Could Be :(

So I’m messing around with RaidXpert2. Last time we created an array it was only building at about 75 MB/s across all 6 drives; now it’s doing 2 GB/s with Zero Create checked


Pushing 2 GB/s through the SATA controller hasn’t made the chipset explode like I’ve heard older chipsets do, so that’s good. Though that is just zeros; CrystalDiskMark will tell the real story soon

I don’t know if it’s on the CPU-NB or the southbridge for this generation; I’ll have to look that up

it appears to slow down when it has to write a sector that isn’t already a zero

1 Like

I had tested “zero create” on the X570 system before creating this thread; it didn’t make a difference, although I had been hopeful since the “MB/s” counter in the AMD RAID management software was roughly in its normal range.

While the B550 remains shitty at the moment, the X570 system is functioning normally, and I’m honestly afraid of touching anything in its RAID drivers since I’d like to just use the system and avoid triggering the performance regression the way I did on B550… :roll_eyes:

1 Like

Without zero create, it took the entire day to build the 6-drive array; this only took an hour

On B550

1 Like

That I have never experienced; the creation/initialization process always took about as long as I expected.

Maybe that’s a thing with the QLC drives? As long as the SSD controller temperature stays in its normal operational range*, the Micron TLC enterprise SSDs can sustain their “normal” performance numbers 24/7.

*Not a good idea to house 8 of them in a single 5.25" backplane and forget to plug its fans in.

1 Like

My Linux-fu is not the best and I’m a little stumped on how to get the RAID driver installed. There was one where you had to add a repository, but it ended up not working, and AMD’s site only had an X370 driver that was last updated in 2017

1 Like

Since my Linux level is possibly 0.1, I cannot assist with that; maybe open a bottle of orange soda for @wendell to appear?

Can you confirm that the AMD RAID management software isn’t running a background service on Windows with its May 25th release?

Normally there should also be a little tray icon with this software running in the background as soon as Windows boots up.

Do you have any contact at AMD to relay this to or inquire about?

I don’t mean to sound self-righteous, but I think we can say quite surely now that some shit’s fucked up, just like in a certain Warren Zevon song…

1 Like

Turns out he already finished his testing; we’re doing Linux testing at the moment. He showed me how, but retaining that information is a different matter :rofl::rofl::rofl:

So far we solved the build time using an echo command
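
For reference, the echo command in question was presumably the usual md resync speed-limit tweak; the exact command wasn’t posted, so this is just the common form of it (values are examples):

```
# md throttles the initial sync/rebuild to 200 MB/s per device by default;
# raising the limits lets the array build as fast as the drives allow
echo 50000   > /proc/sys/dev/raid/speed_limit_min
echo 2000000 > /proc/sys/dev/raid/speed_limit_max
```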

1 Like

Any news regarding the Windows issues?
(Do they exist outside my imagination and my mental capability of suddenly producing sad CrystalDiskMark results?)

(Yes/No/Maybe?)

So far Wendell mentioned shitty RAID10 sequential READ (no benefit of 6 drives in RAID10 over 3 drives in RAID0).

An updated Linux driver isn’t exactly the thing I was looking for - although I could appreciate the irony quite a lot… :+1:

1 Like

So the diagnostic mode I’m in is basically: is this a hardware or a software problem?

The common types of insanity here could be that the driver is always in synchronous-write mode, or always uses a certain block size that’s misaligned, or blah blah blah.

It’s looking to me like, in the case of mirrors, the driver only operates the mirror in an indirect or standby mode, meaning the mirror never gets read.
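
One rough way to check that sort of thing (at least on the Linux/md side; on Windows the per-disk counters in Resource Monitor would be the closest analogue) is to watch per-member read traffic while a big read runs against the array. Just a sketch, device names are placeholders:

```
# kick off a large read against the md array in the background
dd if=/dev/md127 of=/dev/null bs=1M count=20000 iflag=direct &

# watch per-member traffic; if one half of each mirror pair never shows
# any read throughput across different workloads, reads aren't being
# balanced (exact balancing depends on RAID layout and access pattern)
iostat -dxm sda sdb sdc sdd sde sdf 1
```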

The Linux performance so far seems good, but we’re still testing. I could run a particular workload on Intel chipsets to make them overheat and throttle; I was kinda expecting that here, but it doesn’t seem like that’s the case

here is what the hardware is capable of, best case scenario, in raid 10:

Results:

Sequential Read: 1214MB/s IOPS=1
Sequential Write: 737MB/s IOPS=0

512KB Read: 357MB/s IOPS=714
512KB Write: 10MB/s IOPS=20

Sequential Q32T1 Read: 1300MB/s IOPS=40
Sequential Q32T1 Write: 837MB/s IOPS=26

4KB Read: 19MB/s IOPS=5000
4KB Write: 0MB/s IOPS=65

4KB Q32T1 Read: 19MB/s IOPS=5000
4KB Q32T1 Write: 19MB/s IOPS=5000

4KB Q8T8 Read: 78MB/s IOPS=20000
4KB Q8T8 Write: 1MB/s IOPS=470

more runs:
tests: 5, size: 1G, target: /home/lv1/1/cdm 8.1GiB/2.7TiB
|Name        | Read(MB/s)|Write(MB/s)|
|------------|-----------|-----------|
|SEQ1M Q8 T1 |    1242.47|    1069.16|
|SEQ1M Q1 T1 |     795.53|     239.06|
|RND4K Q32T16|     791.14|      25.64|
|... IOPS    |  193130.02|    6240.35|
|... latency |    2650.48|   82056.07|
|RND4K Q1 T1 |      22.82|      29.54|
|... IOPS    |    5570.46|    7213.07|
|... latency |     179.03|     137.23|
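
(That table looks like output from a CrystalDiskMark-style fio wrapper; for anyone wanting to reproduce a row directly, roughly this kind of fio invocation matches the SEQ1M Q8 T1 case. File path and runtime are just placeholders:)

```
# approximate the "SEQ1M Q8 T1" row: 1 MiB sequential reads, queue depth 8, one job
fio --name=seq1m_q8_t1 --filename=/home/lv1/1/fio_testfile --size=1G \
    --rw=read --bs=1M --ioengine=libaio --iodepth=8 --direct=1 \
    --numjobs=1 --runtime=20 --time_based --group_reporting
```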



4 Likes

Q: maybe for @wendell

I’m no SATA expert, but is it possible there’s a command rate limit (that is being hit) on the SATA bus? i.e., due to clocking/whatever it can only shove 5000 IO requests/sec over the SATA bus to the drive itself?
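
(Not an answer to the bus-level command-rate question, but it’s at least easy to see what the kernel negotiated per port; a rough check, device names assumed:)

```
# negotiated NCQ queue depth per device (31/32 is the SATA ceiling)
grep -H . /sys/block/sd?/device/queue_depth

# negotiated link speed per port (should read "6.0 Gbps" on these boards)
dmesg | grep -i 'SATA link up'
```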

Also, I honestly expected SSDs to handle random 4k IO better than this, as a RAID10 of 4x SATA 7200 RPM spindles averages around 150 IOPS (random 4k), which is better than some of your 4k write figures for flash

:astonished:

If those figures are accurate, I’d hazard a guess that on random 4k writes, those drives would be outperformed by an old Seagate Momentus XT hybrid hard drive with a couple of gigs of SLC cache on it.

Are these tests done with the chipset SATA controller in AHCI (= letting Linux handle all the RAID stuff in software to max out the chipset’s SATA ports) or RAID mode (to find firmware/driver issues here)?

Also, what are the “neutral AHCI” best-case individual drive results of the SSDs you’re testing to have more context about the RAID results?
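
(For completeness, a quick way to confirm which mode the controller is presenting and to grab the per-drive baselines, just a sketch with assumed device names:)

```
# the PCI class string shows whether the chipset SATA controller presents
# itself to the OS as AHCI or RAID
lspci -nn | grep -iE 'sata|ahci|raid'

# rough single-drive sequential read baseline, one drive at a time
for d in sda sdb sdc sdd sde sdf; do hdparm -t /dev/$d; done
```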

The 4K stuff seems quite fishy, as @thro already mentioned; additionally, I find this result absurdly low for a RAID10 with 6 SSDs:

|SEQ1M Q8 T1 |    1242.47|    1069.16|

I already saw a bit more than 1,600 MB/s read and about 1,000 MB/s write here with 4 Micron SSDs in RAID10 (best case, when the performance issues aren’t triggered).

Those numbers are with AHCI, with Linux handling the RAID

Can you give a more specific model number for the used QLC SSDs?

When I see “1 TB QLC SATA SSD” reported as the drive name on your screenshots, I get a bit concerned (“generic Chinese trash vibes”) and am uncertain whether these can really stress the chipset’s SATA host controller to the max.

But okay, if they do and the issues are more or less “clear”, then I just wasn’t able to completely follow.

I don’t know the model, but they are Inland Pros

From Wendell: I tested the drives to ~400+ MB/s read and 350+ MB/s write, but the hdparm stats are interesting:

sda 8:0 0 953.9G 0 disk
└─sda1 8:1 0 953.8G 0 part
└─md127 9:127 0 2.8T 0 raid10 /home/lv1/1
sdb 8:16 0 953.9G 0 disk
└─sdb1 8:17 0 953.8G 0 part
└─md127 9:127 0 2.8T 0 raid10 /home/lv1/1
sdc 8:32 0 953.9G 0 disk
└─sdc1 8:33 0 953.8G 0 part
└─md127 9:127 0 2.8T 0 raid10 /home/lv1/1
sdd 8:48 0 953.9G 0 disk
└─sdd1 8:49 0 953.8G 0 part
└─md127 9:127 0 2.8T 0 raid10 /home/lv1/1
sde 8:64 0 953.9G 0 disk
└─sde1 8:65 0 953.8G 0 part
└─md127 9:127 0 2.8T 0 raid10 /home/lv1/1
sdf 8:80 0 953.9G 0 disk
└─sdf1 8:81 0 953.8G 0 part
└─md127 9:127 0 2.8T 0 raid10 /home/lv1/1
nvme0n1 259:0 0 238.5G 0 disk
├─nvme0n1p1 259:1 0 512M 0 part /boot/efi
└─nvme0n1p2 259:2 0 238G 0 part /
root@lv1-System-Product-Name:/home/lv1/1# hdparm -t /dev/sda

/dev/sda:
Timing buffered disk reads: 564 MB in 3.02 seconds = 186.63 MB/sec
root@lv1-System-Product-Name:/home/lv1/1# hdparm -t /dev/sdb

/dev/sdb:
Timing buffered disk reads: 1318 MB in 3.00 seconds = 439.32 MB/sec
root@lv1-System-Product-Name:/home/lv1/1# hdparm -t /dev/sdc

/dev/sdc:
Timing buffered disk reads: 1320 MB in 3.00 seconds = 439.92 MB/sec
root@lv1-System-Product-Name:/home/lv1/1# hdparm -t /dev/sdd

/dev/sdd:
Timing buffered disk reads: 1312 MB in 3.00 seconds = 436.97 MB/sec
root@lv1-System-Product-Name:/home/lv1/1# hdparm -t /dev/sde

/dev/sde:
Timing buffered disk reads: 1312 MB in 3.00 seconds = 437.07 MB/sec
root@lv1-System-Product-Name:/home/lv1/1# hdparm -t /dev/sdf

/dev/sdf:
Timing buffered disk reads: 250 MB in 3.00 seconds = 83.29 MB/sec

sda and sdf are much slower than expected; these drives are known to not be this slow. We are investigating.

That is testing each in turn. Testing (reading) each simultaneously shows no bottleneck at a hardware level:

/testall.sh

/dev/sda:

/dev/sdb:

/dev/sdc:

/dev/sdd:
root@lv1-System-Product-Name:/home/lv1/1#
/dev/sde:

/dev/sdf:
Timing buffered disk reads: Timing buffered disk reads: Timing buffered disk reads: Timing buffered disk reads: Timing buffered disk reads: Timing buffered disk reads:
1318 MB in 3.00 seconds = 439.28 MB/sec
1312 MB in 3.00 seconds = 437.27 MB/sec
1316 MB in 3.00 seconds = 438.52 MB/sec
1312 MB in 3.00 seconds = 437.05 MB/sec
1148 MB in 3.00 seconds = 382.34 MB/sec
418 MB in 3.00 seconds = 139.10 MB/sec

except for the two odd drives. (sda has sped up a bit since wiggling the cable?)
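
(The testall.sh contents weren’t posted; presumably it just backgrounds one hdparm run per drive, something along these lines:)

```
#!/bin/bash
# hypothetical reconstruction of a parallel read test; the real testall.sh
# wasn't shown, this is only what such a script typically looks like
for d in sda sdb sdc sdd sde sdf; do
    hdparm -t /dev/$d &
done
wait
```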

Any cable problems should immediately and permanently be logged in the C7 SMART value (UDMA CRC error count) of the SSD if the firmware is functioning properly.

Easy to test: if the values are not 000000000000 anymore (maybe from previously using the drive with a faulty connection), save them, then do a large file transfer; if the electrical connection is bad for whatever reason and the SSD has to repeat I/O operations, the C7 SMART value will have changed.

Note: the operating system should also log that; at least I know Windows does, so I doubt Linux doesn’t.
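
On Linux, smartmontools can read the same attribute out; a quick sketch, device names are placeholders:

```
# attribute 199 (0xC7) is the UDMA CRC error count; a rising raw value
# usually points at a cable/connection problem rather than the drive itself
for d in sda sdb sdc sdd sde sdf; do
    echo "== /dev/$d =="
    smartctl -A /dev/$d | grep -i UDMA_CRC
done
```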

I should be able to free up 4 x Samsung 850 EVO 1 TB SSDs soon; just off the top of my head, these should make for the fastest RAID10 configuration I’d be able to test yet.

@GigaBusterEXE

Any update-worthy developments?

  1. You suck, everything is working fine
  2. Yes, there seems to be something
  3. GTFO?

I don’t need exact schematics if an issue has been located (maybe just whether it’s realistic that it can be fixed via firmware or driver updates), but it would be nice to get some closure on whether it even makes sense for me to keep poking around, or whether that would be a complete waste of time since you guys have already found something.

Not necessarily. The platform error handling for PCIe errors can also show up here on AMD platforms… suppressing those errors also seems to have the side effect of suppressing SATA errors. It’s not logged in Linux, but it was for sure struggling, at least partially, with a cabling issue.

We’ve got a RAID 10 setup with 4 drives and are getting appropriate performance from each drive individually in Linux. Still testing.

The other drive has some problem, as it really is glacial, and that makes no sense.
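
(If it helps, the places such link/CRC trouble would normally surface on the Linux side, assuming the platform isn’t suppressing it, would be something like the following; the drive names are just the suspects from the earlier output:)

```
# ATA link errors and resets, if the kernel is allowed to see and report them
dmesg | grep -iE 'ata[0-9]+.*(error|reset|failed)'

# SMART error log on the two suspect drives
smartctl -l error /dev/sda
smartctl -l error /dev/sdf
```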

1 Like

While I don’t have data for B550 or X570, my last encounter with a SATA SSD on a mechanically damaged SATA port was on X470, with SATA in AHCI mode, Platform-first error handling disabled in the UEFI, and Windows 10 - there the C7 errors were properly logged both in the SSD itself and in the Windows Event Viewer.

@wendell

Have you by any chance ever heard of a modern third-party software RAID0/1/10 whatever solution “like” the FuzeDrive stuff that can be used for Windows to boot from etc. BUT is motherboard chipset-independent?

Compared to “intelligent” tiered caching this must be pretty trivial and it would be nice to not be dependent on a specific motherboard platform.

This way the motherboard chipset SATA controller could remain in AHCI mode and you could bypass shitty firmware/drivers.