The “optimal” approach varies with individual needs; that’s why there are a gazillion threads and ways of doing things, and that’s why it all seems confusing.
The main things to understand are that RAID isn’t a backup, and that a backup is meant to be a copy of your data you’re willing to lose, because you still have your main copy. And if you don’t want to pay for some kind of subscription service, you have to fall back on your own knowledge and diligence.
Options for storage I’m aware of:
btrfs for small mirrored raid.
mergerfs + snapraid for data hoarding
zfs for large raid (often TrueNAS SCALE)
ceph on bare metal with one OSD per disk for small cloud setups
ceph in rook for medium cloud setups (medium = Linode, Hetzner, DigitalOcean)
pay Amazon, Microsoft or Google to have their people run air-gapped facilities in your own buildings, separate from their own cloud (e.g. if you’re a government of some kind), for single-digit billions per year
do an Apple and negotiate to pay some hyperscaler double digit billions per year at a discount to store icloud data
Then there’s LizardFS, and other lesser known storage options.
Then there’s Acronis, Backblaze, Tarsnap and friends that offer cloud storage.
Then there’s rsync, rclone, restic, duplicity, duplicati, bup, borgbackup,…used either with your disks, or some cloud storage.
Then there’s Storj and various other space exchanges with random people.
Then there’s the option of shipping an 18T USB HDD + NanoPi to a friend.
Everyone needs to store data, and everyone’s needs are slightly different; that’s why there are so many options.
It’s worth noting that in Linus’s case, if he had used TrueNAS and set up email alerts, he wouldn’t be in this situation. The default TrueNAS install performs a scrub every 35 days. He and/or his team set up those NASes manually and skipped some essential steps (no scrubs or alerts).
Exactly the channels/articles you have been following so far, but you need to understand that for reliability you have to pay in some way… it can be your time and your hardware, someone else’s time and your hardware, or any combination you can think of: you / someone else / your hardware / a storage provider.
The more you try to do yourself the more the onus of reliability falls on your knowledge and experience, and also on how much you are potentially willing to lose …
If your purpose is to learn how to make systems that run at home reliable, and not have issues like the guys at LTT, you need to read about systems/networking/storage and get familiar with procuring hardware and balancing performance/consumption/spend/ease of use.
You will soon find out that the golden goose doesn’t exist, otherwise we’d all have it, and that there’s some content that is not on YouTube, not because it’s a secret and people want to make money out of it, but because it’s complex and requires a lot of time and experiments and money… and then some. It looks to me like that is not sellable content on modern media channels, as opposed to blowing up/zapping stuff, assembling splendid-looking PCs, doing cryo overclocking and mad-scientist stuff…
The gist of it is: a RAID array will tolerate some amount of failure, as opposed to a single-disk solution; the more redundancy/resiliency you choose, the more you will pay in usable space and in the cost of additional drives.
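The space tradeoff can be sketched roughly like this; the six 8 TB disks are made-up example numbers, and filesystem/controller overhead is ignored:

```python
# Rough usable capacity vs. redundancy for common RAID levels.
# Disk count and size below are illustrative examples only.

def usable_tb(level: str, disks: int, size_tb: float) -> float:
    """Approximate usable capacity, ignoring filesystem overhead."""
    if level == "raid0":
        return disks * size_tb          # striped, no redundancy
    if level in ("raid1", "raid10"):
        return disks * size_tb / 2      # mirrored: half the raw space
    if level == "raid5":
        return (disks - 1) * size_tb    # one disk's worth of parity
    if level == "raid6":
        return (disks - 2) * size_tb    # two disks' worth of parity
    raise ValueError(f"unknown level: {level}")

for level in ("raid0", "raid10", "raid5", "raid6"):
    print(f"{level}: {usable_tb(level, 6, 8):.0f} TB usable of 48 TB raw")
```

With six 8 TB drives you get 48 TB raw, but only 24 TB usable as mirrors versus 40 TB as RAID5; the difference is the price you pay for being able to lose more drives.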
RAID is NOT a BACKUP - you will read and hear this repeated everywhere, and it refers to the fact that a RAID setup will make loss of data due to a disk failing less likely, but it will not prevent user error (files deleted, drives formatted) or catastrophic server failure, for that you need backup, and there you will go down another rabbit hole …
I completely understand RAID not being a backup. If the drives themselves weren’t so expensive, LTO tape even at a small scale looks interesting for keeping off-site somewhere, as the shelf life they advertise sounds really appealing and it sounds safer than keeping a bunch of drives on a shelf somewhere.
What really instigated this topic is that I have just bought a video camera for some hobby stuff. It can shoot uncompressed 4K RAW. You won’t even get 2 hours of video out of 500GB. That is very quickly going to be a problem.
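To put a back-of-the-envelope number on that: the frame size, bit depth and frame rate below are assumed (DCI 4K, 12-bit, 24 fps), so check your camera’s actual recording format, but the order of magnitude holds:

```python
# Rough data rate for uncompressed 4K RAW video.
# Assumed format: DCI 4K (4096x2160), 12-bit, 24 fps - illustrative only.

width, height = 4096, 2160       # DCI 4K resolution
bits_per_pixel = 12              # 12-bit RAW sensor data
fps = 24

bytes_per_sec = width * height * bits_per_pixel / 8 * fps
gb_per_hour = bytes_per_sec * 3600 / 1e9
minutes_on_500gb = 500e9 / bytes_per_sec / 60

print(f"{bytes_per_sec / 1e6:.0f} MB/s, {gb_per_hour:.0f} GB per hour")
print(f"500 GB holds about {minutes_on_500gb:.0f} minutes of footage")
```

Under those assumptions it works out to well over a terabyte per hour, so 500GB buys you roughly 26 minutes, not 2 hours; it really does become a problem fast.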
That’s why, unless you have an unlimited budget, everybody establishes a workflow where they try to keep the raw/uncut/unprocessed camera footage for the least amount of time possible, and they only back up the end result…
It really boils down to:
do you need a fast, SSD-backed scratch area where you dump your files and work on them with whatever workflow? This will ideally be RAID-10 for max performance, and supported by something like ZFS snapshots to avoid ‘mishaps’ during the post-production cycle
how big are your post-produced artifacts and how long do you need to store them? This second tier can be slower and use some form of RAID5/6, RAIDZ1/2, unRAID parity or whatever
do you need a backup, can it be local to another storage or remote somewhere else
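Those questions can be turned into a rough sizing exercise; every input below is a made-up example number to swap for your own workflow figures:

```python
# Rough two-tier sizing sketch. All inputs are illustrative assumptions;
# substitute your own shooting rate, retention policy and project sizes.

raw_tb_per_hour = 1.1        # ballpark for uncompressed 4K RAW footage
hours_shot_per_month = 4
raw_retention_months = 2     # keep raw footage only until the edit ships
finished_tb_per_project = 0.05
projects_kept = 100          # finished work you keep indefinitely

# Fast scratch tier: raw footage held during the post-production cycle.
scratch_tb = raw_tb_per_hour * hours_shot_per_month * raw_retention_months

# Slow archive tier: finished artifacts kept long term.
archive_tb = finished_tb_per_project * projects_kept

print(f"fast scratch tier: ~{scratch_tb:.1f} TB")
print(f"slow archive tier: ~{archive_tb:.1f} TB")
```

With these example numbers the fast tier needs under 10 TB while the archive grows slowly, which is exactly why the scratch area can be small and expensive (SSD, RAID-10) while the second tier is big and slow.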
This will establish your ideal goal in terms of storage and number of appliances.
From there budget and time will be your limit, and your workflow will dictate some of the choices.
If you haven’t thought about that and just bought a 4k camera because it was on sale … you’ll need to decide whether you want to go YOLO and use a bunch of USB drives, establish a filing system, and pray that the drives are still working some years down the line, or whether you need a proper NAS, the cost and number of which will depend on the factors above.
Only you know your workflow, your budget and your expectations in terms of availability and reliability of your data, so you’ll have to do a lot of homework to create what works best for you or, if you happen to have loads of money and not so much time, you can just define your goals and then have a professional design/quote/build it for you…
I guess what I’m picturing in my head is something I can have attached to a network, whether in a rack, a NUC, or even a PC case attached to some kind of drive bay that gives me room to add drives as time goes on.
That is your first requirement, a NAS as opposed to direct attached storage.
Now, do you want your NAS to be relatively cheap, expandable and fast, and can you compromise on space used and power consumption? If so, going with an older-generation rack server from Dell or HP will give you capacity, room to grow, and relatively low prices, but you will pay with noise and space occupied…
That is fine, and it means that you can do with a NAS that only supports 1Gbps as opposed to 10Gbps networking, that will save you a lot of money.
Will you be needing snapshots (i.e. point in time saves on your nas that will cover you from accidentally deleting files)?
Do you want an off the shelf product or do you want to go DIY?
Off the shelf will give you a solution that works from day 1; you usually pay with a higher price for the same feature set, fewer options for expandability (unless you add $$$), vendor lock-in and fewer options in terms of what software you can run.
DIY gives you back all of that: flexibility, lower cost, you decide the pace you want to grow at, and you have better options for incremental upgrades, allowing you to spend money in stages.
You pay with your time, with all the responsibility of making it work falling on you and, usually, a less ‘ergonomic’ experience, as you’ll inevitably end up making some mistakes / having to replace hardware, with all the stress that comes with that.
So, if all you want is a quick and dirty solution that will allow you to focus on shooting film and working on your workflow, something like a DS720+ (DS720+ | Synology Inc.) will be more than enough to get you started on a relatively low budget and still leave you room for expanding later with a DX517 unit.
An equivalent model from QNAP would be the TS-253D.
You will get your initial NAS for ~600 USD (plus storage), and that will give you 10-20TB of RAID-1 capacity to get you started, with all the bells and whistles of the Synology/QNAP OSes/interfaces.
For the same amount of money you will struggle to find an equivalent DIY solution that gives you new hardware, the same small form factor, networking and expandability, but you will have plenty of options if going older generation…
for example, the same amount of money would easily net you a MicroServer Gen8 with a 4-core Xeon, 16GB of RAM, 4 drive bays, IPMI and a full PCIe slot where you could either add 10Gb networking or an HBA for an external disk shelf when you need more space… or an old Supermicro 1U rack server with a Xeon that will be much more powerful but suck 150+ watts of power at idle…
then why is LTO too expensive for you? If you are going to take the added step of transferring data from a warm or cold storage area to a hot storage area, work on it, then save it back off to warm/cold storage, then LTO is going to be the cheapest cost in the long run. Yes, the drives are more expensive up front, but spending 20-80 USD for a 1.2TiB to 12TiB (compressed) tape is cheaper than spending 140 USD every time you need to add another 8TiB. LTO also holds up better as a long-term backup solution.
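Running the cost-per-TiB numbers from that comparison (prices and capacities are the figures quoted above; the one-off cost of the LTO drive itself is deliberately left out and shifts the break-even point):

```python
# Cost-per-TiB sketch using the figures quoted in the post.
# Tape generations and street prices vary; these are examples, not quotes.

def cost_per_tib(price_usd: float, capacity_tib: float) -> float:
    return price_usd / capacity_tib

small_tape = cost_per_tib(20, 1.2)    # cheap older-generation tape
big_tape = cost_per_tib(80, 12)       # 12 TiB (compressed) tape
hdd = cost_per_tib(140, 8)            # the 8 TiB drive from the post

print(f"small tape: ${small_tape:.2f}/TiB")
print(f"big tape:   ${big_tape:.2f}/TiB")
print(f"8 TiB HDD:  ${hdd:.2f}/TiB")
```

Per TiB, the big tape comes out well under the hard drive, which is the point being made: once the tape drive is paid off, the incremental cost of growing the archive is much lower on tape.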
With that said, you really need to understand what your workflow is and what your requirements are. That will help as you decipher all of the storage information and philosophies out there.