PCIe Gen 4 x16 RAID card with support for RAID 5 and 6 for 4 / 8 U.3 SSDs

Hello,

I am building a silent data server (RHEL) based on U.3 drives and 100 GbE Ethernet connectivity.

Software RAID is not a solution for me.

Drives: Micron 9400 Pro x8. (Is there a cheaper but still very reliable and sufficiently performant alternative?)

Unfortunately, I hit a bottleneck with the RAID card: the only card I found is the HighPoint SSD7580B (https://www.highpoint-tech.com/product-page/ssd7580b), and it supports only RAID 0, 1, and 10.

Is there a DIY-friendly, high-performance RAID card that supports RAID 5 or 6 (4 to 8 drives) and can achieve speeds high enough to saturate the 100 GbE NIC?

Thank you all!

Welcome!

Yes, a SmartRAID Ultra 3258P-32 RAID controller will saturate 100GbE assuming you feed it with 8 PCIe4.0 SSDs that are relatively quick.

That HighPoint SSD7580B isn’t a true hardware RAID controller; it relies on software RAID.

For RAID 5 writes, you should be able to saturate 100 GbE with even lower-spec PCIe 4.0 SSDs, but if you want to saturate 100 GbE on RAID 6 writes you will likely need the nicer Micron 9400 drives.
If all you cared about was saturating 100 GbE with reads, then 4-8 of almost any NVMe SSD would do.
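
To put some rough numbers behind that (the per-drive figures below are assumed ballpark values, not measurements):

```python
# Rough sanity check: can N PCIe 4.0 NVMe SSDs saturate a 100 GbE link?
# The per-drive numbers below are assumed ballpark figures, not measurements.

NIC_GBIT = 100                    # 100 GbE line rate in gigabits per second
nic_gbytes = NIC_GBIT / 8         # ~12.5 GB/s, ignoring protocol overhead

drive_read = 7.0                  # GB/s sequential read, high-end PCIe 4.0 x4 SSD (assumed)
drive_write = 5.0                 # GB/s sustained sequential write (assumed)

for n in (4, 8):
    print(f"{n} drives: ~{n * drive_read:.0f} GB/s read, ~{n * drive_write:.0f} GB/s write "
          f"vs ~{nic_gbytes:.1f} GB/s NIC")

# Even 4 mid-range drives comfortably exceed what a 100 GbE NIC can move,
# which is why reads are easy; writes are where RAID 5/6 parity overhead bites.
```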

Thank you!

By “nicer Micron 9400 drives”, do you mean the MAX series?

Oh no, I just meant the Micron 9400 family in general, which I consider one of the better PCIe 4.0 drive families compared to, say, what Solidigm offers.

How can you feed the card 8x PCIe 4.0 SSDs if it only has 16 PCIe lanes available?

That particular RAID card effectively has 32 PCIe lanes of input, so each SSD can be connected with x4 PCIe lanes.

You still only get the bandwidth of 16 lanes. It’s overbooked, just as chipsets are, which offer more lanes than are actually connected to the CPU.

But when we’re talking PCIe 4.0 x16… ~30 GB/s is plenty for most people.
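
Here is the rough lane math behind that, using nominal PCIe figures:

```python
# Nominal PCIe 4.0 lane bandwidth: 16 GT/s with 128b/130b encoding.
lane_gbytes = 16 * 128 / 130 / 8       # ~1.97 GB/s per lane, per direction

host_uplink = 16 * lane_gbytes         # x16 link to the host: ~31.5 GB/s
drive_side = 32 * lane_gbytes          # 8 drives at x4 behind the card: ~63 GB/s

print(f"host uplink: ~{host_uplink:.1f} GB/s")
print(f"drive side : ~{drive_side:.1f} GB/s (overbooked roughly 2:1 against the uplink)")

# Real-world throughput on the x16 link lands closer to ~28-30 GB/s once
# packet/protocol overhead is accounted for.
```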

The card itself says it supports 32 NVMe drives. Does that mean I can use 4x 2305400-R cables and direct-connect 32 NVMe SSDs to it?

Yes, that is correct, assuming the NVMe drives will play nice with the U.3 cable mentioned; most older NVMe drives only work on U.2 cables, so in that case you’d need the x8 U.2 cables for the card.

Since this is a hardware RAID card, internal bandwidth among the SSDs can actually be higher than the ~30 GB/s the x16 PCIe 4.0 connector provides to the host.
But even then, the host can only write to the card at ~12 GB/s in RAID 5, since the parity calculation and the read-data/read-parity/write-data/write-parity (RD-RP-WD-WP) cycle becomes the bottleneck. This is still a country mile better than any normal/reasonable software RAID scheme can muster.
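
A very crude sketch of where a number in that range can come from, using the assumed ~63 GB/s drive-side figure as input (the real limit depends on the controller’s parity engine, cache policy, and write pattern):

```python
# Crude model of the RAID 5 small-write (read-modify-write) penalty.
# Each partial-stripe host write turns into four array I/Os:
# read old data (RD), read old parity (RP), write new data (WD), write new parity (WP).

drive_side_bw = 63.0        # GB/s aggregate of 8 x4 PCIe 4.0 drives (assumed)
ios_per_write = 4           # RD + RP + WD + WP

host_write_bw = drive_side_bw / ios_per_write
print(f"~{host_write_bw:.0f} GB/s of host-visible RAID 5 write bandwidth, at best")

# That lands in the same ballpark as the ~12 GB/s quoted above; in practice the
# controller's parity engine and cache policy can push the real limit lower
# (or higher, for full-stripe sequential writes).
```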

I’ve seen ZFS RAIDZ do 14 GB/s writes with checksumming and compression and all that. So your numbers might be a bit outdated there.

Software like ZFS and Xinnor (even faster software RAID) falls outside of my arbitrary definition of “normal/reasonable”, mostly due to the amount of maintenance involved in using them and the extra failure modes you can get into trouble with.

Doesn’t mean they aren’t legitimate choices for others though.

Like no RAID 5 write hole, or 256-bit checksums? Or triple parity for extra safety? Or the incorruptible, transactional nature?

I do see more failure modes in legacy controllers. The best hardware is the CPU; we have plenty of cores now…

I think RAID controllers are a useful crutch for Windows users and for local storage needs, because you can take the controller to your next system and not worry about BIOS RAID and its limitations.

Thanks. Does the same apply to a Broadcom 9670-24i, so I can use their x8 U.2 cables for 3x8 = 24 NVMe drives?

An x8 connector means 8 lanes. So that’s 3 connectors x 2 NVMe drives, with 4 lanes each.

I was talking about this cable: 05-60006-00, SFF-8654 to 8x U.3

There are millions of cables out there…got any link?

docs.broadcom.com/doc/96xx-MR-eHBA-Tri-Mode-UG
Page 59, 05-60006-00

That looks like a nice 8-way breakout cable. This should work. Be ready to get enough SATA power cables too.

I’m not sure how the signal integrity will be… it could drop to PCIe 3.0; the cable is a meter long, after all. And that’s on top of only getting x1 bandwidth per drive.
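
For what it’s worth, here is what those two cabling options would mean for per-drive bandwidth, using nominal PCIe numbers and assuming the link really does drop a generation over a long cable:

```python
# Per-drive bandwidth on a 24-lane tri-mode card (e.g. a 9670-24i), depending
# on how each x8 connector is broken out. Nominal PCIe rates, assumed values.
lane_bw = {"Gen4": 16 * 128 / 130 / 8,   # ~1.97 GB/s per lane
           "Gen3": 8 * 128 / 130 / 8}    # ~0.98 GB/s per lane

configs = {
    "x4 per drive (2 drives per x8 connector, 6 total)": 4,
    "x1 per drive (8 drives per x8 connector, 24 total)": 1,
}

for desc, lanes in configs.items():
    print(f"{desc}: ~{lanes * lane_bw['Gen4']:.1f} GB/s at Gen4, "
          f"~{lanes * lane_bw['Gen3']:.1f} GB/s if the link drops to Gen3")
```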

But I’m looking forward to your experience :slight_smile:

RAID 5 on any modern battery-backed hardware RAID card from the past ~10 years isn’t subject to the RAID 5 write hole.
In my experience, hardware RAID tends to handle power-loss events more gracefully than ZFS; there is a non-trivial chance of ZFS corrupting its free-space map when power is lost, due to how it is appended. This is probably my second biggest problem with it, the first being the inevitable, permanent fragmentation and how it affects rebuilds, scrubs, and the life of the HDDs.

Checksumming is a filesystem function, so it isn’t an applicable construct for hardware RAID.

I would like to see RAID 7.3 (triple parity) implemented in hardware RAID controllers; perhaps it will be a feature of future hardware releases. Until then, nested RAID on smaller RAID groups must be used to keep catastrophic-failure probabilities comparable to triple-parity RAID, at the expense of storage efficiency.
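
To illustrate that trade-off with a made-up 24-drive example (this is only a capacity-efficiency and worst-case comparison, not a real failure-probability model):

```python
# Toy comparison for a hypothetical 24-drive array: one triple-parity group
# vs. nested RAID 6 (RAID 60). Only looks at capacity efficiency and the
# worst-case number of simultaneous failures survived; no rebuild times or
# failure rates are modeled.

drives = 24

# Single triple-parity group (RAID 7.3 / RAIDZ3 style)
tp_usable = drives - 3          # 21 drives of usable capacity
tp_survives = 3                 # survives ANY three simultaneous failures

# RAID 60: three RAID 6 groups of 8 drives, striped together
groups, per_group = 3, 8
r60_usable = groups * (per_group - 2)   # 18 drives of usable capacity
r60_survives = 2                # a 3rd failure is fatal only if it hits an already-degraded group

print(f"triple parity: {tp_usable}/{drives} usable, any {tp_survives} failures survivable")
print(f"RAID 60      : {r60_usable}/{drives} usable, only {r60_survives} failures guaranteed survivable")
```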

There are many theoretical reasons why ZFS should be incorruptible and an excellent storage system, but in my experience it isn’t as rosy as the software developers would have me believe.

This is very true. I don’t understand why Microsoft doesn’t remedy this with something like MDRAID (which, according to my arbitrary definition, is “normal/reasonable”).

I’d be cautious here. I don’t think Broadcom actually sells any x8 SFF-8654 >> eight U.2 cables for their adapters; they only sell the U.3 variety, which will be fine if you get U.3-compatible NVMe drives as opposed to U.2-only NVMe drives.
It’s possible that an x8 SFF-8654 >> eight U.2 cable will work on the Broadcom card, but the fact that they don’t offer one themselves makes me suspicious that support for such a cable might not be in the card’s firmware.

When I started this topic, I was trying to solve one specific problem; after seeing the “intense messaging”, I wondered if you guys could give me some advice on the whole setup.

I am working on a RISC-V out-of-order core that will go open-source (budget = my money). I have been developing it from scratch alone for a while, which is a lot of work, but the verification is even more work.

A few weeks ago I found a guy who worked on the verification of some nice cores based on a different ISA, and he agreed to help me with an industrial-grade verification environment.

Some of the most basic requirements are: the more cores the better, a lot of fast storage for traces/seeds (some companies don’t necessarily save a lot of these, but I do want to!), and 24/7 uptime…

I did a bit of Googling for solutions, but the prices are way more than I can afford since I am basically paying out of my own pocket.

I am currently scouring eBay for a Genoa EPYC at a decent price, starting from 32 cores (the V-Cache seems to work well with tools like VCS, and I suppose with Verilator as well), and some RDIMMs, maybe 384 GB?

What I am not: a sysadmin, a DIY wizard, a storage/compute/networking expert, or a rich person (all of this costs a king’s ransom, so I will take the eBay route).

Therefore I found the following parts:
Case: 45HomeLab HL15 - 15-bay storage server (this will be sitting next to me, therefore I need to upgrade the case fans)
Motherboard: Supermicro H13SSL-NT (this should fit in the case)
RDIMM: ~384 GB (TBD)
CPU: Some EPYC with 32 cores or more (V-Cache if possible, but it depends)
SSD: Micron 9400 Pro 7.6 TB (starting with 3-5 and going up to 15) in RAID 5
OS drive: Crucial T500 (this one seems to generate the least amount of heat)
NIC: Mellanox ConnectX-6 (I honestly don’t think we need this right now; 40 GbE might be enough)
GPU: some random GPU…

Concerning matters:
PSU: a quiet PSU that can run 24/7; any ideas?
Cables: cables for the backplane that are compatible with the RAID card???
Case: drive caddies for the Micron 9400 Pro?
Is the backplane of the case going to work with the Micron 9400 Pro?
Is the backplane going to run at only SAS speeds?
RAID card: I need a RAID card (PCIe Gen 4.0) that is cheap enough and fast enough to saturate that NIC (RAID 5 should be okay, or maybe RAID 6 since there are so many drives???).

The target for this project is the end of next year or the beginning of 2025, so we have a few more months before we need this.

Thank you all!