Practical Backup - when 3-2-1 meets reality - questions and discussion

BLUF => As a private person, where is the sweet spot between storing data only on a TLC SSD made from the purest chinesium and the extreme “… I also have a fourth offsite backup, where all my data is etched into the stone foundations of a UNESCO-protected building”?

My situation:
A NAS with ~40 TB of data of mixed importance. Family media, certificates etc. are very important, while all the format-shifted DVDs and CDs are less important, although they would be a lot of work to restore from scratch.
I would like the photos of my family not to be used as training data for some unhinged tech billionaire’s image and face recognition software. The same goes for tax returns etc., which are all digital-only where I live. I have backed up the family media on an external HDD. It is a hassle, because I take it offline after each backup in order to ensure data sanctity and prolonged HDD life - yes, I am aware of bitrot. Also, I am currently running SHR-2 (basically RAID-6) for redundancy on the NAS.
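Since bitrot on the offline HDD came up: one way to notice it is to keep a checksum manifest next to the backup and compare it on the next run. This is only a minimal sketch, assuming the files in question (photos, scans) are not supposed to change between backups; the function names are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large media files never have to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every file below root (by relative path) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Files present in both manifests whose content hash differs."""
    return sorted(name for name in old.keys() & new.keys() if old[name] != new[name])
```

A differing hash only says the content changed, not why, so a deliberately edited file shows up the same way as a corrupted one.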

3-2-1 is the rule of thumb, of course. For that I need way more storage, and a whole other NAS for offsite (but online) backup. So where is the sweet spot? A NAS and a JBOD?
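For anyone who wants the rule spelled out: 3-2-1 means at least three copies, on at least two kinds of media, with at least one copy offsite. A small sketch of that check follows; the `Copy` class, the medium labels, and the "cloud bucket" entry are hypothetical, and it assumes the external HDD is kept at home. RAID/SHR redundancy is counted as one copy here, not several.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    medium: str    # free-form label, e.g. "nas-hdd", "usb-hdd", "cloud"
    offsite: bool


def meets_3_2_1(copies: list[Copy]) -> bool:
    """True if there are >= 3 copies, on >= 2 media types, with >= 1 offsite."""
    return (
        len(copies) >= 3
        and len({c.medium for c in copies}) >= 2
        and any(c.offsite for c in copies)
    )


# Roughly the setup described in the post (assumed: the external HDD stays at home).
current = [
    Copy("Synology SHR-2", "nas-hdd", offsite=False),
    Copy("external HDD", "usb-hdd", offsite=False),
]

print(meets_3_2_1(current))                                        # two copies, none offsite
print(meets_3_2_1(current + [Copy("cloud bucket", "cloud", True)]))
```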

I hope to get some good feedback, but general discussion is also welcome :slight_smile:

Old hard disks can get grumpy when started from cold. Watch for slow start-up times once they’ve been in use for a handful of years.

How big is that offline archive? Can you afford a few pennies each month for a copy of it at Backblaze, sync’d using rclone?

K3n.


It does get both expensive and complicated, being a data hoarder! For me, I guess the sweet spot is when I’m equally uncomfortable with the amount I’m spending on storage and the risk I’m taking with my data. :slight_smile:

Since uptime isn’t critical for me I’ve forgone redundancy and instead replicate my data once per day within the same machine: the primary storage is mostly non-redundant flash and this is replicated to non-redundant HDD on the same machine. So that takes care of two copies on two different kinds of media. Then once in a while I hook up an external HDD and replicate to that for offsite, offline storage (I have two of these and rotate them so one copy is always offsite and offline).

I have less data than you, <10 TB, which means I can store a complete copy of my data on a single HDD, which makes things easier.

Either way, zfs is what makes this feasible. It’s easy to have multiple different datasets on different drives and snapshot and replicate them to other storage as needed. (So far I’ve managed this manually but I’m looking into sanoid+syncoid now since it’s starting to get a bit out of hand. ;))
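To make the snapshot-and-replicate idea concrete, here is a rough sketch that only builds the shell commands for one manual replication round and does not run anything. The dataset names (`tank/photos`, `backup/photos`) and the `backup-YYYY-MM-DD` snapshot naming are assumptions for illustration; tools like sanoid/syncoid automate this with their own naming and options.

```python
from datetime import date
from typing import Optional


def snapshot_name(dataset: str, day: date) -> str:
    """Hypothetical naming scheme: <dataset>@backup-YYYY-MM-DD."""
    return f"{dataset}@backup-{day.isoformat()}"


def replication_commands(
    dataset: str, target: str, new_day: date, previous_day: Optional[date] = None
) -> list[str]:
    """Return the commands for one snapshot + replication round (not executed)."""
    new_snap = snapshot_name(dataset, new_day)
    if previous_day is None:
        # First run: send the whole snapshot.
        send = f"zfs send {new_snap}"
    else:
        # Later runs: send only the changes since the previous snapshot.
        send = f"zfs send -i {snapshot_name(dataset, previous_day)} {new_snap}"
    return [f"zfs snapshot {new_snap}", f"{send} | zfs receive {target}"]


for command in replication_commands(
    "tank/photos", "backup/photos", date(2024, 1, 2), date(2024, 1, 1)
):
    print(command)
```

An incremental send only works while both sides still hold the previous snapshot, which is why the source has to keep it around until the next round.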

For the external copies, so far I’ve used a USB<->SATA caddy, but hooking up power + USB and managing the cables is rather unwieldy which makes me not do those offsite replications as often as I’d like. So now I’ve installed a Trayless Hot Swap Mobile Rack in the NAS. Hopefully that will make things easier. (A nice thing about these types of rack is that they have a special SATA connector that reduces the wear on the drives’ connectors substantially - from something like 50 cycles to 50,000!)


Well said!

I think this is a moving target for most homelabbers.

Start by finding a technology that you believe will last you for a while (say 1-3 years). Buy it and put it into service and possibly get angry once you find out that it doesn’t quite work (or ideally it outlives your target time horizon).

I personally have three different quality standards for my data storage systems.

The first one I want to trust quite a bit, because I don’t intend to fix it all the time. Also, it’s the one that is “always on”. It contains technology that protects against human error, such as a simple time machine (in my case hourly btrfs/zfs snapshots). It has the newest components and the highest performance.
I receive daily emails that show me that its main functions are working (or not). I validate the data with monthly scrubs. Because the machine runs 24/7, a cron job pulls updates weekly; however, I only reboot manually when I feel the need for it. I have very few issues with this (probably because it’s not Windows based). In my mind this process offers better protection than most enterprise organizations enjoy with dedicated staff.

The second one is a cold backup. It runs on hw that I set aside after the last upgrade and repurposed into its current role. It doesn’t have great power efficiency, which is why it’s off for 23 hours 55 mins a day. The remaining time, it boots, pulls daily snapshots from the main machine and goes back to sleep. It runs unattended and sends daily emails about its progress that keep me informed about its continued operation (or lack thereof).
Once a month the machine runs a scrub on its datasets for a few hours. Over the years I learned a few things about what hw NOT to use (e.g. SMR drives), and occasionally this machine gets upgraded with more storage or better components that migrate off of other machines.
Whenever I have time I manually update the software (security, feature fixes), typically once a month.
If anything goes wrong with this system, the main machine is still fine, and when I get around to fixing this system it will catch up on pulling snapshots. I estimate that I get to it within about 2 weeks, so that is how many snapshots I need to keep around on the main machine.
OTOH if the main machine breaks this system is first in line to supply the backup once the main machine is fixed.
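The two-week catch-up window translates into a retention rule on the main machine: keep every snapshot younger than the window and prune the rest. A minimal sketch of that split, assuming daily snapshots and made-up snapshot names; it only decides what to keep and does not delete anything.

```python
from datetime import datetime, timedelta


def split_by_retention(
    snapshots: dict[str, datetime], now: datetime, keep_days: int = 14
) -> tuple[list[str], list[str]]:
    """Split snapshot names into (keep, prune) using a keep_days window."""
    cutoff = now - timedelta(days=keep_days)
    keep = sorted(name for name, taken in snapshots.items() if taken >= cutoff)
    prune = sorted(name for name, taken in snapshots.items() if taken < cutoff)
    return keep, prune


# Hypothetical snapshot names and timestamps.
snapshots = {
    "daily-2024-01-10": datetime(2024, 1, 10),
    "daily-2024-01-20": datetime(2024, 1, 20),
    "daily-2024-01-30": datetime(2024, 1, 30),
}
keep, prune = split_by_retention(snapshots, now=datetime(2024, 1, 31))
print("keep:", keep)
print("prune:", prune)
```

If the backup machine stays broken for longer than the window, the last snapshot it received may already be pruned on the source, and incremental replication can no longer pick up where it left off.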

The third one is even older and does the same as the second - just on cheaper and jankier hw. It’s set up for remote operation, “calling home” via a Tailscale VPN, so that it can do its job remotely (e.g. from a parents’ home or from a friend’s place). Longer repair times are expected.

Yes, I have a lot of computer components around my place, even racked. I also acquired quite intricate knowledge on how to troubleshoot storage issues over the years and I am quite comfortable with this. That is likely not the right solution for you today.

Generally - the fewer parts your solution has, the fewer can break or cause issues. The price is right when you’re willing to pay it.
I am willing to guess that many on this forum followed some part of the following progression:

  1. A backup strategy built on 2-3 single 20TB USB drives will get replaced as soon as one or two drives fail, or you find that you cannot keep them in sync, or your data simply outgrows a single storage medium.
  2. A basic NAS with two to four large drives is a good step up, but can get frustrating when the speed is sooo much slower than any other storage medium in your house. That can go through a few iterations with better CPU, flash cache options, etc.
  3. Once your data needs have grown beyond the storage capacity of ~4 HDDs (let’s be real - HDDs still have the highest byte/$$$ ratio) you probably have built up so much experience that you’re ready to try a custom build because it seems to be sooo much cheaper than the 8x NAS solutions.
  4. You find a corner in your basement or garage to install a rackmount home data center :slight_smile:

Where do you think you are on this sliding scale?


Personally I have my main storage box running ZFS RAIDZ2 on six 6 TB disks for 24 TB usable storage, which is on 24/7, and then I have a backup box with a mix of disks using mergerfs. Once a week the main box wakes up the backup box, which runs an rsync script against the main one and then turns itself off. I just wrote about it in Linux Magazine, actually. It has worked well for me for years, and since the backup uses mergerfs with truly a bunch of cheap disks, if one fails I swap it out and that data is repopulated within a week by the next rsync.
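The post doesn't say how the main box wakes the backup box; Wake-on-LAN is one common way, so the sketch below assumes that. A magic packet is six 0xFF bytes followed by the target MAC address repeated 16 times, usually sent as a UDP broadcast. The MAC address, host names and paths are placeholders, and the rsync helper only builds the command without running it.

```python
import socket


def magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet: 6 x 0xFF, then the MAC 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return b"\xff" * 6 + bytes.fromhex(hex_digits) * 16


def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send the magic packet as a UDP broadcast (port 9 is a common choice)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(magic_packet(mac), (broadcast, port))


def rsync_command(source: str, destination: str) -> list[str]:
    """Command the backup box could run once it is up (archive mode)."""
    return ["rsync", "-a", source, destination]


# Placeholder values: substitute the real MAC address, host and paths.
print(len(magic_packet("00:11:22:33:44:55")))
print(" ".join(rsync_command("mainbox:/tank/", "/mnt/backup/")))
```

Wake-on-LAN also has to be enabled in the backup box's firmware and on its network interface, otherwise the packet is ignored.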


Thanks, I’ll look into that

Some solid points. Maybe the “single HDD” thing can be mitigated with some sort of JBOD

Besides being waaaayy too accurate with the slider description :laughing: I’d say that I am on the last leg of the NAS stage. I have a good Synology 8-bay NAS and the convenience is very much appreciated. But it also “kept” me from learning some software intricacies. Homelabbing is definitely on the horizon, only held at bay by my lack of knowledge and money.
While used equipment is an option I’m looking into, the market for homelab hardware is a bit of a quagmire… Usually some ancient SKUs (e.g. 2014 Dell stuff) are being peddled at absurd prices.


In prod, we build new NAS and repurpose current NAS as backup.

When upgrading from 4 drive NAS to 24 drive high capacity monsters, we’ll just configure backups for mission critical data (your family photos, tax dox, etc.) and let the prod NAS handle the rest.

For real mission critical NAS deployments we’ll deploy a pair for redundancy and still use the retired one for n+1 redundancy of mission critical data.

So really, you NEED to build another NAS and you can tell your wife the internet told you so…


Entry to homelabbing should be cheap and easy. Any old computer will do. Find one or more storage devices to start and test the waters. From here pick the OS you’re most comfortable with and figure out how to share (part of) your storage over the net (LAN) - badaboom! You have a NAS :smiley:
You can then add any old app to your new “NAS” - either by installing directly on the OS, or, more modern, in a virtual machine, or, even better, in a container. I’d recommend doing all three in that order with a relatively simple piece of software (e.g. pihole). You’ll learn quite a bit doing that. You’ll also develop a preference for one of these three methods (cough … containers … cough).

Another approach is to use NAS software, such as TrueNAS. It offers a bunch of functionality baked into a maintained OS image - with a GUI!

Man - use what you have lying around unused in your home. Don’t you have a rarely used laptop? Maybe an old desktop computer your aunt wants to get rid of?
I don’t recommend spending any money until you get a little bit more familiar with the software.

You are in a great spot. You don’t “need” to get a NAS working soon. You already have one. Now you can take your time to copy and explore - extend the existing functionality or add a backup option.

Here is a thread with ideas for inspiration.

