A NAS to saturate 40Gbps. Advice?

TL;DR: I need to build a fast storage server for around $10–20k.

I’m the new sysadmin for a large-ish research compute cluster (40+ nodes). The cluster currently has no storage other than a spare 500 GB SSD on the head node (long story). I have the funding to put a network storage solution in place.

I know my way around hardware, networking, and Linux, but I have never actually built or configured a network storage server, or ever touched ZFS before. So I just want to sound my ideas out here to see if I’m being an idiot. Input appreciated. I have the funding to actually buy this thing and I’ll post pictures, so I guess treat this as a “what would be the fastest NAS you can think to build” thread.

Requirements:

  • Saturate a 40 Gbps uplink for burst file reads. No more than around 100 GB; reading data larger than this can be slower.
  • The general use case: someone starts a batch job on the cluster and all 40+ nodes need to copy files (~50 GB) to their local storage as fast as possible, then each will write data back over time (rough numbers after this list).
  • Have around 5 TB of total storage.
  • Some redundancy. The data is not mission-critical, but I’ll have all sorts of annoyed people if I lose data.
  • Max budget of $20k, ideally more like $13k.
  • Current infrastructure is 40 Gbps; we may move to InfiniBand later.
  • 2U would be best. I can accommodate larger, but I’d have to move things around, it’s loud and dusty in there, and servers are heavy.
  • I’m in the UK, btw.
  • Supermicro motherboard based, because I’m an SM fanboy and all the other kit is SM, so I can remote-manage things without having to go into the office or even wear pants.
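
For scale, my back-of-envelope numbers (happy to be corrected): 40 Gbps is roughly 5 GB/s, so a ~100 GB burst read at line rate is about 20 seconds, and if all the nodes are pulling the same ~50 GB files, I’m hoping a large RAM cache can serve most of that from memory after the first read.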

Current plan

  • Get something like this: an SM-based server with NVMe
  • Add a nice Intel 40 Gbps QSFP NIC
  • Then get some fast spinning rust for colder storage, some faster SSDs for hotter files, and possibly an Optane drive for a cache (need to investigate mobo/CPU compatibility there; rough notes on the cache idea below this list).
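
On the Optane-cache idea: from what I’ve read so far, ZFS can take cache and log devices after the fact, something like the sketch below (pool name and device paths are placeholders I made up):

```
# Add an NVMe/Optane device to an existing pool "tank" as L2ARC (read cache)
zpool add tank cache /dev/disk/by-id/nvme-EXAMPLE_OPTANE

# Or as a SLOG (separate intent log) to absorb sync writes
zpool add tank log /dev/disk/by-id/nvme-EXAMPLE_OPTANE2
```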

Questions:

  1. Am I being dumb with the hardware? Is there something more off-the-shelf, or should I just wuss out, call Dell, and get them to sell me something?
  2. How much RAM/CPU power is needed? This thing will only ever be doing this job. No extra VMs or secret Minecraft servers.
  3. Which OS? (Something Unix-like, obviously.)
  4. ZFS: completely new to it, pointers/guides appreciated.
  5. SAS/SATA/PCIe cards. I’m new to storage tech. If I’m doing ZFS software RAID, then I don’t need a RAID card. However, I think I may still need (depending on the chassis/mobo) a non-RAID host bus adapter (HBA) to connect all the drives. These seem to top out at 12 Gb/s per lane. Can I use multiple HBAs and split the load across them? Is this possible? I take it if it is, then I’ve got to keep track of PCIe lanes (my attempt at the maths is below this list). I guess if I have enough RAM for caching, this shouldn’t be an issue anyway.
  6. In addition to Q5, see this server for example, where the HDD backplane is connected directly to the mobo. How can I tell the max bandwidth of that? Do I need to care?
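
Back-of-envelope for Q5, as promised: 40 Gbps is roughly 5 GB/s. A single SAS3 lane is 12 Gb/s (~1.2 GB/s usable), so an 8-lane HBA has ~9.6 GB/s of raw SAS bandwidth, and in a PCIe 3.0 x8 slot it gets ~7.9 GB/s to the host. If I’m reading that right, a single HBA could in principle feed the 40 Gbps link, provided enough drives behind it can stream in parallel; please correct me if that maths is off.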

Thanks for reading,
Cheers


I like this combo…

https://www.newegg.com/Product/Product.aspx?Item=N82E16820167467&ignorebbr=1 (or the larger 1.5 TB model)

https://www.dell.com/en-us/work/shop/povw/poweredge-r7415

Uh wat?

I actually don’t think this is possible because of PCIe limitations, but I might be wrong about that.

  1. No, you are not being dumb. SM has some great solutions for what you are looking for. However, I don’t think going with spinning rust is the best plan, as running those disks that hard would cause reliability issues. NVMe would be best, but it’s expensive. You could get away with 12 Gbps SAS drives, but again, you might not get the saturation you need out of that. With 16 × 480 GB drives using RAID-Z2 in two separate pools, you get approximately 5.76 TB of usable storage space (see the pool sketch after this list).

  2. If you are doing ZFS, the rule of thumb is 1 GB of RAM per 1 TB of storage. Since you only need ~5 TB of total storage you won’t need a ton of RAM, but more is better for obvious reasons. I would think 64 GB would be enough headroom for what you are describing (see the ARC note after this list).

  3. Ho boy. This is an interesting one. Personally, I would run Debian with ZFS on Linux. You can read all about that on the ZFS on Linux wiki. Keep in mind, though, that there are a ton of other options: FreeNAS and XigmaNAS (both BSD-based) are two built-from-the-ground-up alternatives, but I still think your best bet is Debian with ZFS on top (install sketch after this list).

  4. Start reading.

  5. This is the bottleneck, no matter what. I believe the fastest SAS standard in common use is SAS3 at 12 Gb/s per lane, and SAS4 gear isn’t reachable on your budget.

  6. No, it isn’t. The backplane is a completely separate part from the motherboard, and it will generally be compliant with the highest standard currently available.

PDF link to an example server.
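
To make the layout in answer 1 concrete, here’s a sketch of one way to arrange those 16 drives: a single pool made of two 8-drive RAID-Z2 vdevs, which ZFS stripes reads and writes across (pool and device names are placeholders; use /dev/disk/by-id paths in practice):

```
# Each RAID-Z2 vdev loses 2 of its 8 drives to parity, so usable space is
# 2 vdevs x 6 data drives x 480 GB = ~5.76 TB before filesystem overhead
zpool create tank \
  raidz2 sda sdb sdc sdd sde sdf sdg sdh \
  raidz2 sdi sdj sdk sdl sdm sdn sdo sdp
```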
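On the RAM point in answer 2: ZFS on Linux lets the ARC (its read cache) grow to about half of RAM by default. If you want to pin it explicitly, it’s one module option (the 48 GiB figure here is just an example):

```
# /etc/modprobe.d/zfs.conf -- cap the ARC at 48 GiB (value is in bytes)
options zfs zfs_arc_max=51539607552
```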
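And for the Debian route in answer 3, ZFS lives in Debian’s contrib repo and builds as a DKMS module, so the install is roughly:

```
# Enable the "contrib" component in /etc/apt/sources.list first, then:
apt update
apt install linux-headers-amd64 zfs-dkms zfsutils-linux
```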

I agree with running Debian: a great server OS with software up to date enough to get things done. You will have normal CLI control over the zpool, but OpenMediaVault is a great web-based control panel for ZFS as well. I’ve had a great experience running it in an enterprise setting before, but YMMV. It really is the best FreeNAS equivalent on Linux, and FreeNAS has gone downhill with some of its decisions after version 9.

The 1 GB to 1 TB rule mostly applies to running with deduplication, which, if you are running a speed-optimised build, should be turned off. Save the overhead, as this is a small pool (example below). That being said, something like 64 GB sounds like a solid starting point, but check system usage.
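
If you do want to set things explicitly, these are the sort of per-dataset properties worth looking at for a speed-oriented build (pool name “tank” is a placeholder):

```
zfs set dedup=off tank        # skip the dedup table overhead entirely
zfs set compression=lz4 tank  # cheap CPU-wise, often a net speed win
zfs set atime=off tank        # don't write an access time on every read
zfs set recordsize=1M tank    # large records suit big sequential files
```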

What OS are the 40+ nodes running?

Backups.
Don’t forget about backups. Having RAID is all well and good, but if something is deleted or corrupted, or the server is destroyed or stolen, you’ll need them.

Include some sort of backup provision in your budget.
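
If the server ends up on ZFS, snapshots plus zfs send/receive are the natural building blocks for this; a minimal sketch, assuming a second machine called backuphost with a pool named backup:

```
# Snapshot the dataset, then replicate it to another box over SSH
zfs snapshot tank/data@2019-01-01
zfs send tank/data@2019-01-01 | ssh backuphost zfs receive backup/data

# Subsequent runs send only the delta between two snapshots
zfs send -i tank/data@2019-01-01 tank/data@2019-01-02 | \
  ssh backuphost zfs receive backup/data
```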
