Introduction and some NAS guidance

Hi all!

Subscriber to Level One Techs and somewhat long-time lurker of the Level One forums. I’m finally signing up to become an active member of the community and hopefully can provide some useful input once in a while. I’m a software developer and tech geek who has recently been bitten by the homelab bug. I’ve done a decent amount of cloud-based work in the past, but wanted to start building a small homelab for a fresh perspective and for additional capabilities for me and a pretty technical family (my daughter and I are both CLI-first Linux people. My wife and son…not so much).

The logical place to start seems (to me anyway) to be setting up a 10GbE network and a NAS. I have two constraints. First, we, a family of four, live in a small Bay Area apartment, so I want it silent. I presently have a LaCie 6Big direct attached storage unit for my main machine that sounds like a bag full of nickels falling down the stairs anytime it’s running. As a result, it’s almost always turned off, limiting its use as an automatic backup drive. Second, my hobby is audio production, so I’d like minimal latency when pulling samples off the NAS for programs like Superior Drummer or Kontakt (sample sizes range from hundreds of megabytes to single-digit gigabytes). Here the LaCie also fails: reads are slow enough to cause Kontakt to stutter when scrolling through samples. In terms of future NAS usage, I’d like to set up automatic backups for the family’s ~8 computers as well as caches for gaming/Steam libraries and music production (a few terabytes of audio libraries and samples). Likely on the order of low teens of terabytes all in.

This seems to hint at an all-SSD build, though it’s unclear if that means all NVMe or just SATA SSDs like some IronWolf NAS SSDs. I’ve read (and been thoroughly confused) about the (maybe?) limitations of Unraid or TrueNAS using only SSDs, and have read that SSDs are a waste as they will bottleneck on networking anyway. In terms of networking, I’m thinking of the Buffalo BSMP2012 and running a bunch of Cat8 around the apartment. Most of those 8 machines are 10GbE, with a few being 2.5GbE. The NAS itself would be a home build on some Ryzen platform with as many drives hooked up as required to meet the constraints listed above (likely in a ZFS raidz2 config, as I do value reliable storage).

Long term, I’ll likely get a few NUCs to run a Kubernetes cluster and host home automation, Grafana, and Postgres for various projects, learning, and experiments.

My ask here:

  1. No tomatoes please, I’m still very new to this side of things and just trying to get started
  2. Any advice, framing, links, mental models or things to look into or look out for would be super valuable at this point
  3. If I’m totally going in the wrong direction or not thinking about this right, please feel free to suggest a different direction; I’d be happy to pick up some new knowledge/skills outside of server-side and web dev as part of this project

Thanks all and feel free to hit me up if you have any questions in the web dev space!


Welcome to the forum!

I’ll note down the requirements, to make it easier to read:

  1. Silent operation (small apartment, family of four)
  2. Low-latency reads for audio sample libraries (Superior Drummer, Kontakt)
  3. Automatic backups for ~8 computers
  4. Caches for gaming/Steam libraries and music production assets
  5. Roughly low teens of terabytes of capacity, with reliable storage (ZFS RAID-Z2)
  6. 10GbE networking (a few clients at 2.5GbE)

Ok, with that out of the way, you somewhat answered your own question: you’re going DIY. With this many requirements, it might be a little hard to recommend something in one box. It would also be helpful if you’d give us your storage capacity requirement; you point out a few TBs of audio libraries.

I guess you have the budget to spend. I would recommend getting an Antec P101 Silent or be quiet! Dark Base 900 (opt for the metal side panels) and Noctua or be quiet! fans and radiators. For RAM, the usual recommendation is 1GB of RAM for each TB of storage, and if you want advanced stuff like deduplication in ZFS, 3x that amount. And obviously it should be ECC memory. I would recommend going with a lower core count Threadripper; the first-generation 8-core TR would be perfect (I’ll explain in a bit), albeit it’s more finicky with RAM.

If you plan on getting 6x 10TB HDDs for about 40TB of storage, you should basically have 40GB of RAM, round that to 64GB. If you want deduplication, 40 x 3 = 120, so round that to 128GB of RAM. It’s a little overkill, but worth it if you want deduplication. If not, I still suggest you run something like Czkawka to somewhat manually remove duplicate files.

Don’t get SMR HDDs.

Also, if you want that SSD performance, you may be interested in getting something like an Asus Hyper M.2 PCI-E expansion card (or 2 of them if you’re a baller) and filling it with 2x 1-4TB M.2 SSDs (PCI-E gen3 should be fine; it will saturate 10G and probably even 2x LAGG 10G in pure benchmarks, though not really in real-world applications with very few clients accessing it, where latency matters more than raw bandwidth), plus 2x Optane drives if you can, otherwise just 2x 64-256GB normal NAND flash NVMe SSDs. Set the 2x M.2 SSDs up as a RAID mirror, but I believe (don’t quote me on this) your motherboard will have to support PCI-E bifurcation. Otherwise, get 2x Hyper M.2 cards and make a RAID 10 (striped mirrors) with 4 drives (twice). Have the high-capacity SSD array be used as an L2ARC (read cache) for the zpool and the smaller, preferably lower-latency array (which is why Optane would be nice) be used as a SLOG (write cache) for the zpool.

That way, you basically get SSD speeds from spinning rust. Of course, if you have lower total capacity, go with lower-capacity SSDs. The Hyper M.2 card + 1 or 2 LSI SAS controllers + a 10G NIC will run you about 48 PCI-E lanes, which is pretty insane. And that is excluding a basic display adapter for initial OS installation and configuration (basically a GT 210 for all that matters, whatever you have laying around; you can remove it afterwards). You can ditch the SAS controllers and use the 6 integrated SATA ports, but SATA can be pretty terrible sometimes. In my experience, it’s not that bad; I’m using both on-board SATA + 3x 2-port SATA cards on my NAS. I had 12 drives connected, but one on-board port sometimes decides to just die, which is unfortunate, because the drive itself works perfectly (I had it as a hot spare in my zpool, not anymore). I also had an old motherboard with 1 completely dead SATA port. Other than that, I never saw SATA dying or malfunctioning. Apparently SAS is more reliable, but again, in my experience, not by much; the chances of SATA going awry are very low and ZFS will detect and alert you.
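To give a rough idea of where those ~48 lanes go, here’s a quick tally (the per-card lane counts below are the typical x16/x8 values, so treat them as assumptions and check your actual cards):

```python
# Rough PCIe lane budget for the "full fat" build described above.
# Per-card lane counts are typical values (assumptions), not spec-sheet facts.
devices = {
    "Asus Hyper M.2 card #1 (4x NVMe, needs bifurcation)": 16,
    "Asus Hyper M.2 card #2": 16,
    "LSI SAS HBA": 8,
    "10G NIC": 8,
}

for name, lanes in devices.items():
    print(f"{name}: x{lanes}")

total = sum(devices.values())
print(f"Total: {total} lanes -> way more than a plain AM4 Ryzen exposes, hence the Threadripper suggestion")
```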


Now for the other requirements. You said you needed a Steam and audio cache; that means you need a caching proxy, like Squid, running on your network. I believe there may be alternative ways to do that; I believe LTT did a Steam cache video not using Squid, though I don’t remember what he used. Check that out (you’ll have to search their channel for “steam cache”).

You also said you wanted to have a homelab, messing with K8s, Grafana, PostgreSQL and other stuff. I would suggest running TrueNAS Core as the OS and using other PCs, like NUCs, for other services like the Steam cache, K8s and whatnot, with the NAS being just the backend for the VMs’ vdisks or the containers’ persistent storage. But you could also add more RAM (up to 256GB I believe, which should be plenty), use Proxmox instead of TrueNAS Core, and virtualize everything, or use LXC containers on it for services like Restic (backup software) or Squid. OCI containers are good for other services, like Nextcloud or Bitwarden_rs.

This is a long post already; I’ll wait for corrections if I got anything wrong rather than make it any harder to read.

I just realized that I’ve pointed out industry standards, however:

Even for 40TB of storage, you can get away with 32GB of RAM easily, but I wouldn’t go lower than that with all the services you want to host.

Also, I was mistaken: it’s 1GB of RAM for every TB of actual disk storage, including what’s lost to redundancy. So for 60TB raw (6x 10TB drives), of which 40TB are usable (since RAID-Z2 takes 2 drives’ worth for parity), that would be 60GB, still rounded up to 64GB of RAM. Funny coincidence though.
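If it helps to see the rule of thumb as a formula, something like this (rounding up to the next common memory-kit size is my own habit, not a hard rule):

```python
# Rule of thumb: ~1 GB of RAM per TB of *raw* disk (including parity drives),
# ~3x that if you want ZFS deduplication. Rounding up to the next common
# memory-kit size is an assumption of mine, not part of the rule.
def recommended_ram_gb(raw_tb: float, dedup: bool = False) -> int:
    base = raw_tb * (3 if dedup else 1)
    for size in (16, 32, 64, 128, 256):
        if base <= size:
            return size
    return 256  # past this you're sizing a server, not a home NAS

print(recommended_ram_gb(60))              # 6x 10TB raw -> 64, matching the numbers above
print(recommended_ram_gb(60, dedup=True))  # -> 256; higher than the earlier 128GB because it uses raw rather than usable TB
```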

Also, I forgot to post this important aspect of ZFS, I usually post it on every ZFS build question:

So for ZFS with default settings, in order not to get degraded performance, you need either 4, 6 or 10 drives. 4 drives is pointless, because RAID 10 makes more sense at that point (even though then only 2 of “the right” disks can fail, instead of any 2 disks as with RAID-Z2). 6 disks is usually a good middle ground, with many businesses maybe going for 2 or 3 either independent or striped RAID-Z2 arrays (so basically the equivalent of RAID 60). RAID-Z2 with 10 drives is stretching it a little, but I believe it gives you the best bang for your buck. Whatever RAID-Z2 configuration you do, you lose 2 drives’ worth of storage to parity.

So in RAID-z2, using 4 drives gives you 50% capacity, using 6 drives gives you 66% capacity and using 10 drives gives you a whopping 80% capacity. Going any further is risky though, more disks = more potential for more than 2 disks failing.
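To put the same numbers in a formula, usable capacity of a single RAID-Z2 vdev is roughly (n - 2) / n of raw, ignoring ZFS overhead:

```python
# Usable fraction of a single RAID-Z2 vdev: (n - 2) / n. This ignores ZFS
# metadata and padding overhead, which shaves off a bit more in practice.
def raidz2_usable(drives: int, drive_tb: float) -> tuple[float, float]:
    frac = (drives - 2) / drives
    return frac, frac * drives * drive_tb

for n in (4, 6, 10):
    frac, usable = raidz2_usable(n, 10)
    print(f"{n}x 10TB in RAID-Z2: {frac:.0%} usable -> ~{usable:.0f} TB")
# 4 -> 50% (~20 TB), 6 -> 67% (~40 TB), 10 -> 80% (~80 TB)
```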

Adding more vdevs to a ZFS pool gives you more performance. A vdev can be a single disk, a pair of mirrored disks, or a parity RAID array, so a single RAID-Z2 is one vdev in a pool. Doing striped RAID-Z2 gives you more performance (aggregate bandwidth and IOPS), because you have 2, 3 or 4 RAID-Z2 vdevs in a single pool, which means reads and writes are parallelized across more vdevs. Ceteris paribus, for n vdevs that make up your pool, it will be able to deliver n times the IOPS and n times the bandwidth of a single vdev. However, this is mostly true for parallelized workloads, so basically many VMs or services requesting data at the same time from the NAS. For home stuff, a single RAID-Z2 vdev is fine. Obviously a striped mirror will perform better, because you don’t have parity calculations and, obviously, because of striping. More often than not, I see single-vdev arrays rather than striped arrays, with workloads split among them.
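A rough sketch of that scaling, with made-up per-vdev baseline numbers purely for illustration (not benchmarks of any particular drives):

```python
# All else equal, aggregate pool performance scales roughly with the number
# of top-level vdevs. The per-vdev baseline numbers are placeholders.
def pool_estimate(vdevs: int, vdev_iops: int = 250, vdev_mb_s: int = 400):
    return vdevs * vdev_iops, vdevs * vdev_mb_s

for v in (1, 2, 3):
    iops, mb_s = pool_estimate(v)
    print(f"{v} RAID-Z2 vdev(s): ~{iops} random IOPS, ~{mb_s} MB/s sequential")
```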

So again, if you want to accelerate the zpool, SSDs for L2ARC and SLOG are a better option than adding more vdevs, and arguably the more practical one.


Thank you for the response @ThatGuyB . I appreciate it very much!

I did some additional research on my end to further understand the space. Most of these questions are me figuring out what the right questions are.

I apologize for the meandering wall of text. Learning something new is always a little awkward.

To confirm the requirements more clearly:

  1. This is intended to be a learning project with an outcome of a working home network and NAS. I fully expect to build a computer and install an OS. That’s part of the fun.
  2. The primary goal of the NAS is backup. A secondary goal is fast retrieval of music production assets (which may be removed as a requirement as it complicates things considerably)
  3. Family lives in a small Bay Area apartment, so someone will be sleeping 6-8 feet away from this wherever it’s placed. I think that means no HDDs, as the sound of drives spinning/working at 3AM would be unacceptable. I’d be happy to learn I’m wrong and there are some fancy super quiet HDDs, but I suspect that’s not the case. This is really the primary constraint.

Based on what I’ve learned thus far:

  1. 10 Gb networking is cool, but that works out to 1.25 gigabytes per second in a best-case scenario (rough numbers sketched after this list). I’m unclear how to calculate latency, but it seems like the audio asset use case may not be worth doing with the NAS.
  2. That requirement could be solved in favor of alternatives like an external Thunderbolt NVMe enclosure hooked up to the audio production machine for mid hundreds of dollars. That would likely be much faster, lower latency, and, potentially, more cost effective
  3. The only way I could find of getting NVMe-level speeds out of a NAS was an LTT video involving 100Gb networking and a Honey Badger. That’s cool, but I’m not in the market for low to mid 5 figures, so that’s out.
  4. If you limit the use case of the NAS to just being a computer the other machines push backups to, latency and throughput requirements go way down and seemingly any solution is fine.
  5. Space requirements for backing up 8 machines would be around 16TB, max.
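As a sanity check on point 1, here’s the back-of-envelope math I did (the link efficiency and local NVMe speed are rough assumptions on my part):

```python
# Back-of-envelope: how long a Kontakt-sized library takes to pull over the
# network vs. from local NVMe. Efficiency factor and NVMe speed are rough
# assumptions, not measurements.
def transfer_seconds(size_gb: float, link_gbps: float, efficiency: float = 0.9) -> float:
    gigabytes_per_sec = link_gbps / 8 * efficiency
    return size_gb / gigabytes_per_sec

for size_gb in (0.5, 2, 8):  # hundreds of MB to single-digit GB samples
    t_10g = transfer_seconds(size_gb, 10)
    t_nvme = size_gb / 3.0   # assume ~3 GB/s for a decent PCIe gen3 NVMe drive
    print(f"{size_gb:>4} GB: ~{t_10g:.1f}s over 10GbE vs ~{t_nvme:.1f}s from local NVMe")
```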

Thank you so much. Your post was super helpful in just better framing the problem for me:

I think the proposal above still assumes at least six HDDs; that’s the part that worries me most.

I watched the LTT video and it turns out I misunderstood how Steam libraries work. I thought it would be clever to save my son and me from downloading a game twice and storing it on both of our machines. The more I think about it, the more it seems it’s not a problem worth solving. Our internet is fast enough, and storage isn’t really that expensive over the lifetime of a computer, for this to be worth solving. Your advice here was super helpful. Thank you!

In the case of Superior Drummer or Kontakt (the audio libraries), you download hundreds of gigs or terabytes of samples, sound packs and various files to local storage somewhere. When you log into the software, your license is checked and then you use the files from wherever you configure Superior or Kontakt to find them. The problem I was really trying to solve: my audio production is tied to Apple and I don’t feel like paying Apple for overpriced storage to hold these huge audio libraries. Right now, I have these libraries on the LaCie DAS and there’s a really flow-breaking stutter (1-5 seconds) switching from one Kontakt instrument to the next, which makes sampling different virtual instruments annoying. I’ve confirmed it does not stutter at all if the files are stored on the machine’s NVMe-based storage.

I’ve read about this, and I’m super interested in this, but need to do more research here to understand this better.

Conclusion:

  1. Forget Steam cache, it’s a solution in search of a problem
  2. For audio libraries, just use some external NVMe drive. You only need the libraries for one machine at a time and getting a NAS to hit NVMe bandwidth/latency numbers for this one use case doesn’t make any sense
  3. Collect more info on an SSD-only NAS using TrueNAS or Proxmox for the much more pedestrian use case of daily backups of all machines (without any hard-to-hit bandwidth or latency requirements)
  4. Thank @ThatGuyB profusely for all the help thus far!

Don’t worry. If you stick around, you’ll see that my posts here are the most -vvv (very very verbose). I post longer replies than that most of the time.

Believe me, HDDs are the least of the problem. I’ve got 11x 7200 RPM 2TB HDDs (and 1 SSD for the boot drive, albeit I recommend RAID 1 for the OS; I’m just YOLOing it because I have no more ports or PCI-E lanes left). My setup’s loudest part is the 48-port switch, which is loud af. I used to sleep with it about 5-6 feet away from me. But I’m kinda insane. The storage box itself isn’t really audible; the fans are more audible than the HDDs and they are still pretty quiet. And I don’t have Noctua or be quiet! fans, I’m rocking an Intel stock cooler, a Corsair 120mm exhaust fan and an older Seasonic PSU. No problems with it.

If you go with one of the cases mentioned above, which are aimed at dampening sound more than airflow, you won’t notice it unless you really try to. Low RPM fans also make a really pleasant humming, kinda like listening to the wind, unlike high RPM whine, which is high pitched and very disturbing and distracting (which is the case for my switch).

Uhmm… 1 gigabyte = 8 gigabits, so yeah, obviously 1.25 Gigabytes / second in best-case scenario…

Latency at short distances is basically negligible (<1ms). Just use ping between 2 devices. Don’t worry about it, unless there are interference or congestion happening in your network (never saw this on a gigabit home LAN, not to mention 10G).

LTT is just a meme most of the time. 10G Ethernet + a RAID-Z2 of 6 disks with a RAID 1 SLOG and L2ARC is plenty for even some very demanding production stuff. Not sure how latency-sensitive stuff would work, but with SSD acceleration, I see no reason why it should be any different from literal locally-attached storage, except maybe Optane or NVMe drives directly connected to the CPU.

Just mere backups? RAID striped mirrors with 4x 10TB drives. Lower power consumption, less heat, less noise, smaller chassis, less powerful CPU, less RAM. You can get away with 16GB of RAM here; you could potentially cheap out and get away with 8GB, but I still recommend ECC and getting 2x 8GB UDIMMs for that sweet-spot dual-channel performance, and get the cheapest possible Ryzen, like a 2200G / 3200G, but make sure your motherboard will support ECC.

I was also wondering if you could split the “production” system (as in, the NAS box that would run your services, like k8s, nginx and maybe VMs) from the backup system, as that would be highly recommended.

It does, but you can get away with a RAID mirror setup of 4 disks. However, a RAID mirror for the SLOG is mandatory if you want SSD acceleration without risking corrupted data. RAID for the L2ARC is optional, but recommended anyway, because reading from 2 SSDs is better than reading from one, even if the write speed when caching is only as fast as 1 SSD (ceteris paribus).

But again, for mere backups, you don’t need any of that stuff. RAID mirror is more than enough for home backups, but since you need more storage, RAID 10 and you’re off to the races.

Sweet Jesus.

Then having a USB 3 / TB3 SSD or DAS array makes way more sense, as you don’t really need this to be accessible from multiple machines.

Never heard of it. Ok, there might be a bottleneck somewhere, maybe in the LaCie firmware? I’m not knowledgeable enough to debug that. Crystal Disk Mark alternatives? (A quick search shows Amorphous Disk Mark, which appears to be just a CDM port.) Maybe try out a barebones M.2 NVMe USB 3.1 enclosure with a decent SSD like a Samsung 970 Pro or ADATA XPG SX8200 Pro? The bigger the capacity, the more performant they should be, due to more NAND flash chips working together. But I may be giving bad advice here. Well, thank smoke for n-day return policies.

Expensive AF. Maybe if you’re lucky to buy cheap enterprise SSDs, but even so, these things run hot, they’re made for server airflow and constant AC.

Consumer-grade SSDs like the (ass-slow) 8TB Samsung 870 QVO SATA SSDs are listed for $850. And they use 4-bit-per-cell NAND, which is awful. Better than HDDs, sure, but not worth it. You’d need basically 4 of these in RAID 10. IronWolf 7200 RPM 10TB NAS drives? $400 a piece, and you still need 4 of them. Apparently there are 16TB 7200 RPM IronWolf drives for between $500 and $600, meaning you can get away with only 2, saving you an additional $400-$600, which means you could keep a cold spare around. And as mentioned, a RAID mirror is perfectly acceptable just for home backups.
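Rough cost per usable TB from those list prices, assuming a simple mirror layout in each case (the $550 for the 16TB drive is just the midpoint of that range):

```python
# Cost per usable TB for mirrored layouts, using the list prices above.
# "Usable" is simply half the raw capacity of the mirrored drives.
options = [
    ("Samsung 870 QVO 8TB SATA SSD", 850, 8, 4),   # 4 drives in RAID 10
    ("IronWolf 10TB 7200 RPM HDD",   400, 10, 4),  # 4 drives in RAID 10
    ("IronWolf 16TB 7200 RPM HDD",   550, 16, 2),  # 2 drives, plain mirror
]

for name, price, tb, count in options:
    usable = tb * count / 2          # mirrors halve raw capacity
    total = price * count
    print(f"{name}: {usable:.0f} TB usable for ${total} (~${total / usable:.0f}/TB)")
```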

Obviously, the first backup will be the slowest, unless you plan on doing more than 1 full backup and then differentials. But full backups every week or month are kinda wasteful.

Also worth mentioning: when lots of data is sent to the NAS at once, an SLOG can sometimes be counter-productive, especially if it’s not big enough to hold all the data coming in from all the devices. For example, if just 3 PCs are filling up, say, a 1TB SSD RAID 1 SLOG, then it’s useless, as writes now have to wait for data to be flushed to the spinning rust anyway, meaning HDD speeds. If you do the full backups all at once instead of spreading them across different days of the week, I believe it will fill up.
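To put a number on it, here’s a worst-case sketch under the simplifying assumption that the whole backup stream has to pass through the SLOG faster than the HDD vdevs can absorb it (the rates are assumptions for illustration):

```python
# Worst-case sketch: if backups pour in faster than the spinning rust can
# absorb them, the SLOG buffers the difference and eventually fills up.
# Both rates below are illustrative assumptions, not measurements.
def seconds_until_slog_full(slog_gb: float, ingest_gb_s: float, flush_gb_s: float) -> float:
    surplus = ingest_gb_s - flush_gb_s  # GB/s piling up in the SLOG
    return float("inf") if surplus <= 0 else slog_gb / surplus

# e.g. 1 TB SLOG, ~1.1 GB/s coming in over 10GbE, HDD vdevs absorbing ~0.4 GB/s
print(f"~{seconds_until_slog_full(1000, 1.1, 0.4) / 60:.0f} minutes until full")  # ~24 minutes
```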

