A NAS with a No Name Motherboard

Didn’t find the thread for the recently published video, so I am creating one.

(Mods: Please feel free to move this to wherever you see fit.)

@wendell
In regards to the questions that you asked in your video:

A few things:

  1. If you’re using Ethernet on your Mellanox cards rather than InfiniBand, then you’ll probably want to make sure RoCE v2 is enabled (a quick sketch of checking/setting it is below).

  2. In my experience, for most things RDMA-related, RHEL derivatives tend to perform better than Debian derivatives.
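
On point 1 (RoCE v2): here’s a minimal sketch of how you might check, and if needed switch, the default RoCE mode through the kernel’s rdma_cm configfs interface. The device name `mlx5_0`, port `1`, and the assumption that configfs is mounted and the rdma_cm module is loaded are all mine; adjust for your system and run it as root.

```python
# Minimal sketch: check/set the default RoCE mode via the rdma_cm configfs
# interface. Assumes an mlx5 device named "mlx5_0", port 1, root privileges,
# configfs mounted at /sys/kernel/config, and the rdma_cm module loaded.
from pathlib import Path

DEV = "mlx5_0"          # assumption: your RDMA device name (see `ibv_devices`)
PORT = "1"              # assumption: the port to configure
CFG = Path("/sys/kernel/config/rdma_cm") / DEV

# Creating this directory tells rdma_cm to expose the per-port knobs for DEV.
CFG.mkdir(exist_ok=True)

mode_file = CFG / "ports" / PORT / "default_roce_mode"
current = mode_file.read_text().strip()
print(f"{DEV} port {PORT} default RoCE mode: {current}")

if "v2" not in current:
    # Switch new RDMA-CM connections on this port to RoCE v2.
    mode_file.write_text("RoCE v2\n")
    print("Switched to:", mode_file.read_text().strip())
```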

As an example of that RHEL-vs-Debian gap: if you want to enable SR-IOV for InfiniBand (Mellanox ConnectX-4 100 Gbps dual-port IB NIC), the OpenSM subnet manager that ships on Debian doesn’t support virtual functions. That matters if, like me, you’re using an externally managed 100 Gbps IB switch (externally managed switches are cheaper than Mellanox’s managed 100 Gbps IB switches), because then OpenSM has to run on a host. In talking with the dev team, they have no plans to port that support over from RHEL (and its derivatives) to Debian.

So, if you want IB VFs/SR-IOV, you’ll either need to run a managed IB switch, or you’ll need to run RHEL (or a derivative) so that you can run the RHEL variant of opensm.
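
For what it’s worth, creating the VFs themselves is just sysfs; it’s the subnet manager’s virtualization support that’s the sticking point. A rough sketch of the VF side, assuming a ConnectX-4 that shows up as `mlx5_0` and that SR-IOV has already been enabled in the NIC firmware (e.g. via mlxconfig); as I understand it, the opensm builds that do support virtualization take a `virt_enabled 2` option in opensm.conf, which is the part that was missing for me on Debian.

```python
# Rough sketch: spin up InfiniBand VFs on a ConnectX-4 via sysfs.
# Assumptions: the card shows up as mlx5_0, SR-IOV is already enabled in the
# NIC firmware (SRIOV_EN=1, NUM_OF_VFS high enough), and this runs as root.
from pathlib import Path

IBDEV = "mlx5_0"   # assumption: RDMA device name
NUM_VFS = 4        # assumption: how many virtual functions you want

# /sys/class/infiniband/<dev>/device is a symlink to the PCI device,
# which exposes the standard SR-IOV knobs.
pci_dev = Path(f"/sys/class/infiniband/{IBDEV}/device")

total = int((pci_dev / "sriov_totalvfs").read_text())
print(f"{IBDEV}: firmware allows up to {total} VFs")

# Writing 0 first is the usual dance if some VFs already exist.
(pci_dev / "sriov_numvfs").write_text("0")
(pci_dev / "sriov_numvfs").write_text(str(min(NUM_VFS, total)))

print("VFs now:", (pci_dev / "sriov_numvfs").read_text().strip())
# Reminder: for IB (not ETH) VFs to come up, the subnet manager needs
# virtualization support enabled -- e.g. "virt_enabled 2" in opensm.conf on
# an opensm build that actually supports it.
```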

  3. I have found that IB tends to work better than ETH. With two 5950X nodes connected point-to-point (via DAC), the Proxmox host itself could hit close to 96.9 Gbps over IB with 8 iperf streams, but when I set the VPI port to run as ETH instead of IB and put an Ethernet bridge on it, the same 8 iperf streams were only hitting around 23.4 Gbps max.

So, the PHY protocol matters.

(The idea with this test was to get multiple VMs and/or LXCs sharing the same 100 Gbps connection as much as possible. It’s a pity that the VFs/SR-IOV ended up being a bust on the Debian-based Proxmox system.)
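
If anyone wants to reproduce that kind of IB-vs-ETH comparison, here’s a minimal sketch that just drives iperf3 with 8 parallel streams and reports the aggregate. The peer address is a placeholder; you’d run it once against the IPoIB address and once against the Ethernet-bridge address, assuming iperf3 is installed on both ends and `iperf3 -s` is already running on the other node.

```python
# Minimal sketch: run iperf3 with 8 parallel streams against a peer and print
# the aggregate throughput. Assumes iperf3 is installed on both ends and a
# server ("iperf3 -s") is already running there. PEER is a placeholder.
import json
import subprocess

PEER = "10.0.0.2"   # assumption: IPoIB or Ethernet address of the other node
STREAMS = 8
SECONDS = 10

result = subprocess.run(
    ["iperf3", "-c", PEER, "-P", str(STREAMS), "-t", str(SECONDS), "-J"],
    capture_output=True, text=True, check=True,
)
report = json.loads(result.stdout)

# For TCP tests, the aggregate receive rate lives under end.sum_received.
bps = report["end"]["sum_received"]["bits_per_second"]
print(f"{STREAMS} streams to {PEER}: {bps / 1e9:.1f} Gbps aggregate")
```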

It’s also a pity that XCP-ng, the type 1 hypervisor that’s based on a RHEL derivative, doesn’t have as much going for it vs. Proxmox (e.g. it doesn’t support virtio-fs, which, ironically, was developed by a senior software engineer at Red Hat).

  4. If you want higher levels of performance, CentOS/Rocky Linux gets you very CLOSE to RHEL-level performance without actually being RHEL.

(I used to run CentOS on my micro HPC cluster, including the headnode, where I had four Samsung 850 EVO 1 TB SATA 6 Gbps SSDs pegged, at least once, at 38 Gbps against the 24 Gbps (combined) that the four drives could nominally supply. It was a HW RAID0 array managed by an LSI MegaRAID SAS 9341-8i (I wanted the capacity and speed of the four drives, and didn’t care about redundancy nor fault tolerance), formatted with XFS, and then exported to the network via NFSoRDMA.)
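
The NFSoRDMA part of that is fairly compact these days. A hedged sketch of the two moving pieces (server-side RDMA listener, client-side mount); the paths and hostname are placeholders, it assumes the NFS server service and the RDMA stack are already up, and on newer nfs-utils you can do the server half via /etc/nfs.conf instead.

```python
# Hedged sketch: the two pieces of an NFS-over-RDMA export. Server side adds
# an RDMA listener on port 20049 (the standard NFSoRDMA port); client side
# mounts with the rdma transport. Paths/hostname are placeholders; run as
# root, and the nfs-server service is assumed to be running already.
import subprocess
from pathlib import Path

EXPORT = "/mnt/raid0"          # assumption: the XFS filesystem being shared
SERVER = "headnode"            # assumption: hostname of the NFS server
MOUNTPOINT = "/mnt/scratch"    # assumption: where the client mounts it

def server_side():
    subprocess.run(["modprobe", "rpcrdma"], check=True)   # RDMA transport module
    subprocess.run(["exportfs", "-o", "rw,no_root_squash", f"*:{EXPORT}"],
                   check=True)
    # Tell knfsd to also listen on the RDMA transport, port 20049.
    Path("/proc/fs/nfsd/portlist").write_text("rdma 20049\n")

def client_side():
    subprocess.run(["modprobe", "rpcrdma"], check=True)
    subprocess.run(
        ["mount", "-t", "nfs", "-o", "rdma,port=20049",
         f"{SERVER}:{EXPORT}", MOUNTPOINT],
        check=True,
    )
```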

  5. I don’t know if the ETH Mellanox 100 GbE cards can do this, but you can also look into iSER (basically iSCSI over RDMA).

You might be able to get better performance with that.
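
If you do go down the iSER road, the initiator side is mostly just telling open-iscsi to use the iser transport instead of tcp. A hedged sketch; the target IQN and portal address are placeholders, and it assumes open-iscsi is installed, an RDMA-capable NIC is up, and the target side already has iSER enabled.

```python
# Hedged sketch: log in to an existing iSCSI target over iSER instead of TCP.
# Assumes open-iscsi is installed, an RDMA-capable NIC is up, and the target
# portal already has iSER enabled. IQN and portal address are placeholders.
import subprocess

PORTAL = "10.0.0.2:3260"                                    # assumption: target portal
IQN = "iqn.2003-01.org.linux-iscsi.nas.x8664:sn.deadbeef"   # assumption: target IQN

def run(*args):
    print("+", " ".join(args))
    subprocess.run(args, check=True)

# Discover targets behind the portal.
run("iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", PORTAL)

# Switch the node record's transport from tcp to iser, then log in.
run("iscsiadm", "-m", "node", "-T", IQN, "-p", PORTAL,
    "-o", "update", "-n", "iface.transport_name", "-v", "iser")
run("iscsiadm", "-m", "node", "-T", IQN, "-p", PORTAL, "--login")
```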

But yeah, there’s a LOT that you can do.

  6. re: AI workloads – I’ve found that unless you’re doing the model training, LOADING a model doesn’t actually demand that much from storage. Even a PCIe 3.0 x4 NVMe SSD is PLENTY sufficient. (I think that I was loading the codestral:22b model last night at something like < 500 MB/s from the Intel 670p series 1 TB NVMe SSD.)

There are limits as to how hard you can push an application to load its data any faster.
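
If you want to sanity-check that on your own hardware, the crude way is just to time a sequential read of the model file and divide. A small sketch; the model path is a placeholder, and for a true cold-cache number you’d drop the page cache first (as root: `echo 3 > /proc/sys/vm/drop_caches`).

```python
# Small sketch: measure sequential read throughput of a (large) model file.
# The path is a placeholder. For a cold-cache number, drop the page cache
# first (as root: echo 3 > /proc/sys/vm/drop_caches) before running this.
import time
from pathlib import Path

MODEL = Path("/models/codestral-22b.gguf")   # assumption: path to a big model file
CHUNK = 16 * 1024 * 1024                     # 16 MiB reads

total = 0
start = time.perf_counter()
with MODEL.open("rb") as f:
    while True:
        chunk = f.read(CHUNK)
        if not chunk:
            break
        total += len(chunk)
elapsed = time.perf_counter() - start

print(f"read {total / 1e9:.1f} GB in {elapsed:.1f} s "
      f"-> {total / elapsed / 1e6:.0f} MB/s")
```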

  7. 100% agree with you that for media storage, the PM7s would be wasted.

  8. Unless you are going to boot and reboot VMs often, you don’t necessarily need super fast storage for that, either.

(LXCs boot even faster.)

Interestingly enough, if you analyze Windows’ data access pattern while it boots, booting the Windows kernel is a mix of load time and processing time.

Again, you can only push both of those so much/so far.

(I’ve tried booting Windows off of a RAM drive. The RAM itself can be very fast, but Windows didn’t really see much of an appreciable reduction in boot time, despite being given a MUCH faster storage medium/interface.)
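
That lines up with basic serial-vs-parallel arithmetic: if only part of the boot is actually waiting on storage, even an infinitely fast drive can only shrink that part. A tiny worked example with completely made-up numbers, just to show the shape of it:

```python
# Illustrative only: why a RAM drive barely moves Windows boot time if a big
# chunk of boot is CPU/processing rather than storage I/O. The 30 s baseline,
# the 50/50 split, and the 20x factor are made-up numbers, not measurements.
baseline_boot_s = 30.0
io_fraction = 0.5          # assumption: half the boot is waiting on storage
storage_speedup = 20.0     # assumption: RAM drive vs. SATA SSD for this workload

io_s = baseline_boot_s * io_fraction
cpu_s = baseline_boot_s - io_s

new_boot_s = cpu_s + io_s / storage_speedup
print(f"boot goes from {baseline_boot_s:.0f} s to {new_boot_s:.1f} s "
      f"({baseline_boot_s / new_boot_s:.2f}x), even with a "
      f"{storage_speedup:.0f}x faster drive")
```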

  9. For Steam games, I’ve found that loading a game is, again, not as storage-intensive as getting the game (and its updates) installed.

The installation realised more of a benefit from a faster storage and network subsystem (I’m already running iSCSI for my Windows gaming clients), but the game itself (e.g. Halo Infinite) wouldn’t load much faster than 500 MB/s, even though the drive’s STR is capable of 2-3.5 GB/s reads. Load time doesn’t scale with the capability of the drive.

(As such, even a SATA 6 Gbps SSD is able to deliver the 500 MB/s that Halo Infinite was requesting. Giving it a faster drive (I also tested it with a Silicon Power US70 PCIe 4.0 x4 NVMe SSD, on a 7950X with a Supermicro H13SAE-HF) again barely reduced the time it took to load Halo Infinite.)

You’ll probably realise more benefit with more concurrent users. But for a single user (or a mostly-single-user scenario), the benefits are mehhh…
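
If anyone wants to see what a given game actually pulls while loading (as opposed to what the drive could supply), here’s a quick sketch using the third-party psutil package (`pip install psutil`) that samples a process’s cumulative read bytes once a second. The process name is a placeholder, and the same sketch works on both Windows and Linux.

```python
# Sketch: sample how fast a single process is actually reading from disk.
# Requires the third-party psutil package; the process name is a placeholder.
import time
import psutil

TARGET = "HaloInfinite.exe"   # assumption: name of the process to watch

procs = [p for p in psutil.process_iter(["name"]) if p.info["name"] == TARGET]
if not procs:
    raise SystemExit(f"{TARGET} is not running")
proc = procs[0]

last = proc.io_counters().read_bytes
while proc.is_running():
    time.sleep(1.0)
    now = proc.io_counters().read_bytes
    print(f"{(now - last) / 1e6:8.1f} MB/s read")
    last = now
```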

edit
Apparently the codestral:22b model WILL ask for ~1.4 GB/s of read speed from my Intel 670p PCIe 3.0 x4 NVMe SSD.

So I guess there is a use case for an all NVMe SSD NAS.


At about 10:30 in this video, a diagram with a switch and VLANs is shown. I didn’t understand what was being shown with the green connections vs. the final two connections on the right. Has he covered this elsewhere, or can someone help me understand what was being shown there?

PS: New here, I’m not sure if this should have been a new post?

Thanks.


I have posted the video discussed in this thread. @alpha754293, please ignore the DM I sent you. I know now I had the wrong video.


I added these other videos because @wendell mentions The Forbidden Router in the first video I posted.

@Tobarja Welcome to the forum. Since your question concerns the video I posted, you didn’t do anything wrong by posting it in this thread. I have some questions but haven’t finished watching the video, so I will wait to ask.

No worries.

I just saw your reply and your DM, and I responded to your DM before clicking the notifications to see your comment here.

Either way, it’s all good.

While loading the mixtral:22b LLM model (which is about 80 GB in terms of file size), HWiNFO64 recorded about 1.93 GB/s of reads from my Intel 670p NVMe SSD, so that’s the fastest that I’ve seen it go.

But even if that was 2 GB/s, that would still only be about 16 Gbps bandwidth demand/pull from the system, which, yes, an all-NVMe NAS with a 25 GbE connection ought to be able to fill/supply, let alone 100 GbE.
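
Spelling that arithmetic out, since it’s really the whole NAS-sizing question in one line:

```python
# Back-of-the-envelope: how much link a ~2 GB/s model-load read actually needs.
read_gbytes_per_s = 2.0
demand_gbps = read_gbytes_per_s * 8          # bytes -> bits

for link_gbps in (10, 25, 40, 100):
    headroom = link_gbps / demand_gbps
    print(f"{link_gbps:>3} GbE link: {headroom:4.1f}x the ~{demand_gbps:.0f} Gbps demand")
```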

(I would test out the 100 Gbps IB side of things except that my Asus Z170-E motherboard doesn’t have enough PCIe slots for that.)
