Evolving Concept with 3-2-1 off-cloud backup - TrueNAS, SyncThing, Plex - comments welcome

Hello all,

This is a long one; I’ve been crafting this post for some time! TL;DR: I’m unsure about how to arrange ma’ data. :roll_eyes:

Following THIS THREAD, I’m having a few concerns about space and practical implementation.

How I do it now (you may cringe)

This is a mess. I have a ‘storage only’ Windows machine to store long term data (Backed up with BackBlaze Personal), though it has been specifically built for TrueNAS, eventually. I use Google Drive (1TB Paid) to store active and nearly archive data. Then Dropbox (20GB) to store live data. It gets moved slowly down to the storage machine.
I have 1 TrueNAS server (a re-purposed old workstation) that I’ve played with over the last 1-2 years and use for test/learning storage and Plex. I’ve also dabbled with SyncThing and found it to be brilliantly reliable.

Why Bother?

This isn’t related to a data privacy concern, but to data integrity. I’ve spent decades storing data on hard drives using Windows machines, and a few times files have become corrupted. I’d quite like to pass on my data to the next generation if possible…and if they even care about it!
Thru L1Tech, Lawrence Systems and Craft Computing I learned of TrueNAS. It seemed to be the logical solution, even if it’s not as turnkey as the alternatives (Synology, etc.) and very disk consuming.
Also, Google Drive doesn’t seem to like files exceeding 5GB (throws up sync errors) and Dropbox seems to be less immediate, potentially because it’s a free account that I’ve been using for many years. I’m constantly teetering on the edge of using up that Dropbox 20GB storage limit; it’s really annoying having to play musical data every week or so!

The data

I have around 12TB Non-Cloud (but covered with Backblaze) and around 800GB on cloud.
Broken down:
5TB is Plex films/media (increasing by 1TB every 3-6 months)
1TB of personal videos/photos (+1GB per week, maybe)
1TB programmes
3TB of other people’s disk images
1TB work related (+1-2GB per week)
1TB of general fluff
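
As a rough sanity check on those growth figures, here is a minimal sketch that turns them into TB per year and estimates how long a pool would last. The 52-week year, the decimal GB-to-TB conversion and the 21TB capacity (the nine-drive figure mentioned later in this post) are assumptions for illustration, not measurements.

```python
# Rough growth projection from the breakdown above.
# Assumptions: 52 weeks per year, 1 TB = 1000 GB, growth stays linear.

GB_PER_TB = 1000
WEEKS_PER_YEAR = 52


def yearly_growth_tb(plex_months_per_tb, personal_gb_per_week, work_gb_per_week):
    """Total growth in TB per year across Plex, personal and work data."""
    plex = 12 / plex_months_per_tb
    personal = personal_gb_per_week * WEEKS_PER_YEAR / GB_PER_TB
    work = work_gb_per_week * WEEKS_PER_YEAR / GB_PER_TB
    return plex + personal + work


def years_until_full(current_tb, capacity_tb, growth_tb_per_year):
    """Years before current data grows to fill the given capacity."""
    return (capacity_tb - current_tb) / growth_tb_per_year


slow = yearly_growth_tb(plex_months_per_tb=6, personal_gb_per_week=1, work_gb_per_week=1)
fast = yearly_growth_tb(plex_months_per_tb=3, personal_gb_per_week=1, work_gb_per_week=2)

for label, rate in (("slow", slow), ("fast", fast)):
    years = years_until_full(current_tb=12, capacity_tb=21, growth_tb_per_year=rate)
    print(f"{label}: {rate:.2f} TB/year, about {years:.1f} years until 21TB is full")
```

On these numbers Plex dominates everything else, so trimming the film library moves the result far more than tidying work or personal files.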

Why not just use BackBlaze?

Weeeeell, that was great for when I stored data on a ‘storage only’ Windows machine, but with TrueNAS comes a hugely different cost of £600/$800 per year to start with (not including future additional data). I figure that even if I’m buying hard drives, over 5 years I will be saving money…although there is the electric cost that I’ve not considered yet. Something to note about BackBlaze is that it doesn’t back up ALL file types (Personal Edition), which then involves zipping them and all that hassle.

The financial breakdown

I already have a primary server (of sorts, Xeon + ECC), I built a snapshot server (currently the windows storage machine) consisting of a SuperMicro board, low power CPU + ECC. The 3rd ‘off-site’ machine will be a reused older machine I have.

So the only cost, with the exception of hardware failure/damage, is the hard drives.

At the moment I have 7 x 4TB Ironwolfs and a few older consumer drives that I plan to run into the ground. The primary server gets Ironwolfs, the snapshot server gets re-used drives and the off-site machine will have high capacity drives (like Exos), but only 2 mirrored.
When all drives are bought, it will likely exceed the cost of 2 years worth of Backblaze, however, the cost will be more fixed and unlikely to waver, and I like that.

I have most of the hardware needed to ensure there’s a 10G connection between all these machines (SFP+).

The Disk and backup arrangement (today)

My latest plan for the main server was:
9TB Useable RAIDZ2 Active Storage (5x4TB) and 12TB Useable RAIDZ1 Archive/rarely accessed (5x4TB)
I thought that the Archive could be set to spin down when not in use - sometimes it wouldn’t be accessed for weeks or even months.
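
For anyone checking the capacity figures, this is a back-of-envelope sketch of how usable space works out for a RAIDZ vdev. It assumes drives are sold in decimal TB, the result is reported in TiB, and 20% is kept free as a buffer; it ignores ZFS metadata and padding overhead, so real numbers will be a little lower.

```python
# Back-of-envelope RAIDZ usable capacity.
# Assumptions: decimal TB drives, result in TiB, 20% kept free,
# ZFS metadata/padding overhead ignored.

TB = 10**12
TIB = 2**40


def usable_tib(drives, drive_tb, parity, buffer=0.20):
    """Usable TiB for one RAIDZ vdev after parity and a free-space buffer."""
    data_drives = drives - parity
    raw_bytes = data_drives * drive_tb * TB
    return raw_bytes / TIB * (1 - buffer)


print(f"5 x 4TB RAIDZ2: {usable_tib(5, 4, parity=2):.1f} TiB")  # about 8.7
print(f"5 x 4TB RAIDZ1: {usable_tib(5, 4, parity=1):.1f} TiB")  # about 11.6
```

The same function covers the other layouts in this thread by changing `drives` and `parity`.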

How much Plex content do I really need?

This is something I’ve asked myself recently, especially when uncompressed a film is around 25-30GB. Do I really need this media conveniently accessible? Something I would like to do is arrange my photo collection for easier viewing on Plex. I’ve tested this already and it’s pretty good, I bought my parents an Nvidia Shield and my personal videos play well on it.

The paradox is:

  • If I really like a film/series, I’m going to enjoy and watch once, then many years later if I’ve forgotten the story/plot, I might watch it again.
  • But if it’s a film/series I’ve never watched and decided to buy, it could be rubbish and I might not want to watch it again and therefore won’t put it on Plex!

So as I mentioned, I’m going to have 3 ‘servers’.

Data I have right now is around 12TB, though some of this can be discarded (other people’s disk images, very bad films/TV series).

Primary Server (TN1)

Pool 1: Active (9TB RAIDZ2)
Pool 2: At rest (12TB RAIDZ1)

Snapshot (TN2)

Pool 1: for Active Snapshot
Pool 2: for less frequent snapshots (of data at rest)

Off-site...ish (TN3)

Just a reliable enough box that will hold 18TB noisy drives and be turned on weekly.

Some other notes and things

Why all this fuss and bother for just home use?

This is the rub, it’s not just for personal use - I work from home (now) and my intention is to use SyncThing to sync my work files, so there’s always at least 3 copies (even without the 3rd server on). My work files are a combination of architectural and photo/video records. A whole project file is around the 3GB range.
One handy thing is that I’m self-employed/sole trader, so virtually all money spent on IT is Tax deductible.

My first thoughts (not so good)

At first, I was going to build up slowly to having 9 x Ironwolfs in RAIDZ2, which would give me around 21TB of storage (including 20% buffer). My hope was that before I got close to filling it up, I would already be replacing 4TB drives with 8TB ones, expanding the pool and, happy days, I get lots of storage. Thing is, that’s 9 drives constantly spinning, while only 10% of the stored data would be accessed on more than a weekly basis. So as above, I thought it practical to split 10 drives in half, have one half always spinning and the other half spin down.

Am I a TrueNAS fan boy?

No, I just like the concept. I dabbled with UnRAID and found it so easy to do - but as always, there are sacrifices with the easy solution. I looked at Synology and really disliked that you MUST buy another Synology if it goes bang with working drives in it (citation welcome). I really like the idea that push comes to shove, I can load TrueNAS on virtually anything and recover data. I like the data scrubbing, the snapshot options, the GUI, the order of it all. I’m the same person that hand writes websites (for myself) because I disagreed with the lack of portability or hassle of updating a CMS website. I’m also the same person that has virtually all the power/hand tools (and experience) needed to fix almost anything that goes wrong in a house.

What do you think of TrueNAS Scale?

I’m far from qualified to answer this, but I will :smile: Its (future) capability is in excess of my abilities, but I have done a test install and played with it. If I get more involved in containerising my life, I’ll definitely give it another try. Big respect to iXsystems for taking this course of action :clap: For the moment though, I’m going to use that test box to practice my snapshot/restore experience…ready for that moment when it all goes tits up!

So, you’ve got to the end, my hat off to you! I hope this is fairly understandable. I was thinking of doing a YouTube video of it as well, in case that was more easily digested. I very much welcome your thoughts and suggestions :clap: :+1:


Something I have noticed is that with over 100,000 files on Google Drive, it does take a fair amount of time to sync, so having it done locally via SyncThing will be really helpful. In fairness I should probably trim quite a few files. I often wonder how much wear on the drive sync’ing files actually causes?

Truenas scale has a lot of potential. More features, more intuitive to non-experts, broader hardware support. Still a bit rough around the edges though. I’m not ready to switch everything over…yet.

Start by putting your data into three tiers based on how possible/difficult it is to replace. Irreplaceable data backs up all the way to the cloud, stuff that would be a pain to replace stops at the back-up server, and stuff that can be easily replaced or that you don’t mind losing is fine not being backed up. That’s the system that works for me anyways.
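
The three tiers can be written down as a small lookup, which makes it easier to decide per dataset where copies should end up. This is only a sketch: the dataset names and the tier each one is assigned to are hypothetical examples, not a recommendation for anyone's actual data.

```python
# Sketch of the three-tier idea: each tier maps to the places a copy lives.
# Dataset names and their tier assignments below are hypothetical.
from enum import Enum


class Tier(Enum):
    IRREPLACEABLE = "irreplaceable"   # backs up all the way to the cloud
    PAINFUL = "pain to replace"       # stops at the backup server
    REPLACEABLE = "easily replaced"   # not backed up


DESTINATIONS = {
    Tier.IRREPLACEABLE: ["primary", "backup-server", "cloud"],
    Tier.PAINFUL: ["primary", "backup-server"],
    Tier.REPLACEABLE: ["primary"],
}

DATASETS = {
    "personal-photos": Tier.IRREPLACEABLE,
    "work-projects": Tier.IRREPLACEABLE,
    "disk-images": Tier.PAINFUL,
    "plex-media": Tier.REPLACEABLE,
}


def backup_plan(datasets):
    """Return {dataset name: list of locations that should hold a copy}."""
    return {name: DESTINATIONS[tier] for name, tier in datasets.items()}


for name, places in backup_plan(DATASETS).items():
    print(f"{name}: {', '.join(places)}")
```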


Oh TrueNAS Scale definitely has potential, I just found there to be far more steps (boxes to fill in) to install things, such as TrueNAS.

That’s a very different and healthy perspective (thank you), I should really look at it that way. There’s certainly some of it that I wouldn’t mind losing, but to separate them from ‘keep for life’ files would be a chore and a half to do.

The more I live with files, the more I’m looking forward to using the NAS full time. Right now I’m practicing the recovery of different failure scenarios, so that I’m familiar enough with them in the event they occur.


It honestly took me a couple of months to organize all my data, but I found that of the 60TB of crap I have, only about half really needed backups (I can always download Star Trek again, right?) and only about 5 percent was irreplaceable (lots of small files). So I figure the time was well worth it given how much money it saved versus triplicating everything.
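
To put rough numbers on that saving, here is a small sketch comparing tiered storage against three full copies. It assumes the irreplaceable 5 percent is part of the half that gets backed up, and that each tier adds exactly one extra copy; both are my reading of the figures, not something stated outright.

```python
# Total storage needed: tiered backups vs three full copies.
# Assumption: the irreplaceable share is included in the backed-up share,
# and each tier adds one extra copy (backup server, then cloud).


def tiered_total_tb(total_tb, backed_up_pct, irreplaceable_pct):
    """Primary copy + backup-server copy + cloud copy, in TB."""
    primary = total_tb
    backup_server = total_tb * backed_up_pct / 100
    cloud = total_tb * irreplaceable_pct / 100
    return primary + backup_server + cloud


def triplicated_total_tb(total_tb):
    """Three full copies of everything, in TB."""
    return total_tb * 3


tiered = tiered_total_tb(60, backed_up_pct=50, irreplaceable_pct=5)
full = triplicated_total_tb(60)
print(f"tiered: {tiered:.0f} TB, triplicated: {full} TB")
```

With the figures quoted, that works out to 93 TB against 180 TB, so roughly half the raw storage.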


Wow, that’s commitment right there.

I think I could probably remove quite a few films from my plex library…some are in folders called “not-very-good” for example.

An element I am looking forward to is general arrangement, I have so many files in so many places that need centralising.

12TB isn’t that much to handle. :slight_smile: Up to around 25 to 30TB you can just backup to a couple large SATA USB drives and store them offsite at a friend or relative’s house.

I don’t know what format your media is in, but I pre-process all media before adding it to my collection. I add a 2 channel AAC audio track if not present, strip out any subtitles in languages that are of no use to me, download external full and forced subs if needed and then convert the video to h.265 8-bit (not 10 bit) with a modest setting. By modest I mean a setting that works across the board, where it is nearly impossible to see any degradation in picture quality on a 75" TV.

H.265/HEVC done like this produces smaller files that need less bandwidth to stream and with the inclusion of 2 channel audio allows direct play on nearly any device sold these days. Basically, browsers are the only problem for h.265 at this point but that’s easy to fix as you just run a dedicated client on the PC vs using a browser or transcode it on the fly.
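
As an illustration of that kind of pre-processing, here is a sketch that builds an ffmpeg command line in Python. The CRF value, the preset and the `eng` subtitle language are placeholder assumptions rather than the settings described above, and this version always adds a stereo AAC track instead of checking whether one is already present. It only builds the argument list; check it against the ffmpeg documentation before running it on a real library.

```python
# Build (but do not run) an ffmpeg command for 8-bit HEVC plus stereo AAC.
# Assumptions: crf=22 / preset="medium" are placeholders, "eng" is the only
# subtitle language kept, and a stereo AAC track is always added.


def hevc_command(src, dst, crf=22, preset="medium", sub_lang="eng"):
    """Return an ffmpeg argument list for the given source and destination."""
    return [
        "ffmpeg", "-i", src,
        "-map", "0:v:0",                         # first video stream
        "-map", "0:a:0",                         # original first audio track
        "-map", "0:a:0",                         # same track again, re-encoded below
        "-map", f"0:s:m:language:{sub_lang}?",   # matching subtitles, if any
        "-c:v", "libx265", "-pix_fmt", "yuv420p",  # HEVC, 8-bit 4:2:0
        "-crf", str(crf), "-preset", preset,
        "-c:a:0", "copy",                        # keep the original audio as-is
        "-c:a:1", "aac", "-ac:a:1", "2",         # second audio output: 2-channel AAC
        "-c:s", "copy",
        dst,
    ]


cmd = hevc_command("film.mkv", "film.hevc.mkv")
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```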

The 12 TB library now magically becomes something closer to 6 or 7 TB, so you need less storage and backup per average media file. Translation: it allows you to accumulate more/faster with each new storage addition. :slight_smile:

ZFS with 4 drives is tough, and personally I wouldn’t go down the rabbit hole until you have closer to 8 drives. Even using small vdevs of 4 drives, you have to consider expansion. You have to add like vdevs, so if you have ten 20 TB drives and create a pool from those as one vdev, you’d better start saving money now for the expansion, because you’ll need to add 10 additional “like” drives. If you did a raidz2 of 8 drives, you’ll need 8 drives for the expansion.

On a smaller system, there are a couple of things to think about. First, don’t purchase massive drives; limit yourself to 10 TB and purchase 2 of them vs one 20 TB drive. That gives you more IO to work with. So if you have four 10 TB drives, you could do raidz with a 4-disk vdev, which is basically like RAID 5, and the rough translation to ZFS speak would be 25% redundancy. So if one drive failed, was replaced and rebuilt (resilvering), you have 30 raw TB at most to reprocess to be back online. If they were 20 TB drives, you could have to reprocess 60 TB, or twice as much data. You now have a much higher chance of losing a 2nd drive during this critical resilvering process. The good news is that future expansion will require 4 more drives of 10 TB each, so it is less costly than with the bigger drives.
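
The resilver arithmetic can be sketched in a couple of lines. This is the worst case for a completely full vdev, where every surviving drive has to be read in full; ZFS only resilvers allocated data, so a half-empty pool has proportionally less to reprocess.

```python
# Worst-case data read from surviving drives when one drive is replaced.
# Assumption: the vdev is completely full.


def resilver_read_tb(drives, drive_tb):
    """Raw TB read from the remaining drives during a single-drive resilver."""
    return (drives - 1) * drive_tb


print(resilver_read_tb(drives=4, drive_tb=10))  # 30
print(resilver_read_tb(drives=4, drive_tb=20))  # 60
```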

If you are worried about raidz having only 1 redundant drive, you could use raidz2, which uses 2 drives for parity, meaning the vdev can survive 2 drive losses with your data safe, but a 3rd drive loss kills your pool. But you probably don’t want to do a 4-drive raidz2, as that is 50% redundant. If you went with 8 drives, you are back to 25% again but now have better protection, which is good. But now expansions require 8 drives at a time, which might be too much. It’s a trade-off, and that’s what makes it harder to use efficiently at home vs a business where adding 8 drives isn’t a factor.
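
The redundancy percentages quoted here are simply parity drives divided by vdev width. A minimal sketch for comparing layouts, treating parity as whole drives the way the post does:

```python
# Share of a RAIDZ vdev's raw capacity that goes to parity.


def redundancy_pct(drives, parity):
    """Parity drives as a percentage of the vdev width."""
    return parity / drives * 100


layouts = [
    ("raidz, 4 drives", 4, 1),
    ("raidz2, 4 drives", 4, 2),
    ("raidz2, 8 drives", 8, 2),
]
for name, drives, parity in layouts:
    print(f"{name}: {redundancy_pct(drives, parity):.0f}% redundancy")
```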

You could mirror drives and then stripe them, which gives good performance, but you really don’t need that kind of performance for a media server. The advantage is that you can expand with 2 drives (a new mirror set), but now you have 50% loss of space. Again, it’s a trade-off.

Another route to take is to not buy new drives. Don’t use SATA but move to SAS drives, which makes it far easier to expand, since you can pick up 12- and 24-bay storage expansion boxes off eBay for $200-$300ish. You can also purchase refurbished lots of SAS drives cheaply this way. Target 6 to 8 TB drives, which makes it cheaper. 6 TB drives in lots of 5 are selling for roughly $375 right now. SAS drives across the board are generally faster/better than consumer SATA drives and better for ZFS use. Use a minimum of raidz2 with 5 drives or, better, for reasons I won’t go into here, use 8 drives with raidz3. That gives you 3 redundant drives (since they were used) but 5 drives of “performance”. Using multiples of 4 drives is ideal since that matches your SAS ports, where each port is good for 4 devices. You will have much better ZFS performance with 6 to 8 TB drives vs one 20 TB drive (assuming you add the same gross TB).

I gave up long ago on cheap offsite storage with things like Google Drive, MS OneDrive, Amazon and others as they tend to close the doors on “unlimited” plans. It’s one thing if it’s just a backup but quite another for those people using it as a primary (with rclone) when this happens. Companies like Backblaze are kind of a joke for home media as it could take you a year to recoup your data from them. They will “gladly” put your media on USB drives and ship it back to you for $500 or so per drive to help you out. You could do S3 storage but that gets expensive fast even with knockoff companies which still cost $6 per TB per month for storage plus bandwidth over 10% of storage.
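
For scale, here is what the $6 per TB per month figure adds up to for a 12TB library. This is a storage-only sketch: the bandwidth charges mentioned above are left out, and the 5-year horizon is just an example.

```python
# Storage-only cost of S3-style cloud storage at a flat per-TB monthly rate.
# Assumption: bandwidth/egress charges are ignored.


def cloud_storage_cost(tb, usd_per_tb_month=6, years=1):
    """Total USD for storing `tb` terabytes for the given number of years."""
    return tb * usd_per_tb_month * 12 * years


print(cloud_storage_cost(12))            # 864 per year
print(cloud_storage_cost(12, years=5))   # 4320 over five years
```

That yearly figure lands close to the £600/$800 estimate in the opening post.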

Your best bet is to set up a cheap backup server at a friend or relative’s house and do backups to it. This can be done with an old router that has a USB3 port and an external drive. Just switch the drive out as needed. You could of course go bigger with a NAS like TrueNAS, but most people have an old home router not being used that has USB3 drive sharing built in, so that’s something to think about. However, one of the best methods of “backup” for media is to find a couple of “buddies” with similar setups and use each other for backup. You basically trade space with each other, so you provide 10 TB to them and they do the same for you. You can choose to encrypt your data and compress it, or just rsync so your “buddy” can look at your collection. In the event something bad happens, hopefully your buddy can load up or give you USB3 drives full of content without “convenience” charges. :slight_smile:
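
The plain rsync variant of the buddy backup could look something like this sketch, which builds the command in Python. The paths and the `buddy@example-host` remote are hypothetical, and `--delete` is left off by default so that removing a file locally does not wipe the offsite copy.

```python
# Build (but do not run) an rsync command for an unencrypted buddy backup.
# Assumptions: paths and the remote host name are hypothetical placeholders.


def rsync_command(src_dir, remote, dest_dir, delete=False):
    """Return an rsync argument list that mirrors src_dir to remote:dest_dir."""
    cmd = ["rsync", "-a", "--partial"]   # archive mode, keep partial transfers
    if delete:
        cmd.append("--delete")           # also remove files deleted locally
    # A trailing slash on the source copies its contents, not the folder itself.
    cmd += [src_dir.rstrip("/") + "/", f"{remote}:{dest_dir}"]
    return cmd


cmd = rsync_command("/mnt/tank/media", "buddy@example-host", "/mnt/backup/media")
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```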

If you do that with a few people you can handle your offsite storage similar to a raid setup with redundant copies of media offsite with redundancy in case one of your “buddies” crashes. LOL

Hopefully, there was a nugget or two in the above that was worth reading,
Carlo


Thank you for such an extensive response @Carlo :slight_smile:

I like your thinking about the USB drive option, I’d prefer to use SATA though and control data corruption where possible.

My media is mainly MKVs. I did play around with Handbrake and reducing file size, but I wasn’t able to do a consistent job, so I gave up and settled for the original file size. I also found it frustrating that I had to ‘test watch’ the new media to confirm it worked OK, and that slightly ruined my experience! I’d be interested to know what you use for compressing though? As I watch media on the LAN, bandwidth has never been a real problem, I’d say.

For expansion I was going to start with 4TB drives and assume that if space is needed later, I can gradually swap them out for 8TB ones. Originally I was really keen on optimizing drive capacity and having a 9-drive pool, but then the cost of electricity went up massively in my country and I’m now more enthusiastic about reducing running costs (as in kWh). I thought the best way of doing this is simply to reduce the number of active drives.

It seems that generally having high capacity drives is a bit risky when it comes to replacement and resilvering, so I am generally sticking to 4/8TB drives as a precaution.

I will only be using RAIDZ2 for the most part, though I am having a change of mind; see the separate post shortly (again, for the millionth time!).

I would really like to get SAS drives, but the cost of them (in my country, UK) seems far far higher than SATA, so I’ve always been put off by that.

I don’t mind BackBlaze in general, and the re-download speeds are tolerable, considering that I only have to lean on them when my own backup procedure has failed. Though, as one of my comments in the opening post shows, now that I’m going to a NAS from a traditional PC, the cost of Backblaze is far in excess of what I feel I can justify, so I’m opting to spend that money on my own 3rd backup routine.

I am preparing some of my network to be 10G, because I would like the backups to be done quickly, before an incident occurs. I have a generic router in use at the moment, but I do have a dedicated modem ready and waiting, along with a USG (which I will probably replace with pfSense when able).

Sadly, I’m one lonely amateur geek, most of my friends are in construction or motorcycles…so I stand alone! :slight_smile:

Really grateful for your feedback, I am having a re-think about the arrangement and your comments have jogged my plan into a different shape. :+1:

So…it’s number 4 re-think time :frowning:

I’ve been thinking about how I actually use data storage right now. For business and light personal use.

The important stuff is quite small in storage size, so I’m thinking now about:

A 3TB Pool (consisting of 2 x 4TB mirrored drives) for active files (modified hourly).
17TB pool (consisting of 8x 4TB RAIDZ2) for less accessed files.

That is pretty much the way things work right now, I very rarely access the vast majority of files, so why have drives spinning for no good reason?

The snapshot machine will be on, but will have a similar arrangement that totals 3TB in storage space. So that’s 2 copies of files, not including those on the workstation.

Then I would still have a 3rd TrueNAS machine with a high capacity drive, that just deals with everything and is turned on, perhaps twice weekly.

That’s “A” plan for today, but not necessarily THE plan! :laughing: