How large is your library? How do you back it up?

A colleague of mine is a videographer, who has talked to me over the years about a problem he faces. His library is over 100TB in size, and he doesn’t have a good way to back it up that won’t cost an arm and a leg.

I figured I would ask the community, how large are your media libraries and what is your backup strategy? Do you even have one?

I only have backups of my crucial data… Family photos, documents, encrypted files, etc. Everything else I am happy to lose.

So in essence I have about 200-300GB of real data that requires backup.

I tinker too much with OS and phone so I really don’t keep much on these devices and am happy to redownload those if required.

I have Unraid for main storage, plus a copy of it on another PC. Then I have three BitLocker-encrypted external drives: two in the house (one at the front and one at the rear) and a third that I keep in a shock case in the car. I might move that one to my new rear shed, as it’s far enough from the house that I’d be fine if the house burnt down.

100TB is a huge amount of data. Does the person actually need all of it backed up? Can they get away with backing up only the exported, finished projects? My guess is that they would need a decent setup to store that much data: a server or home server with large drives in some kind of RAID or similar. That would be expensive to set up initially but would probably work out over the longer term.

I too prioritize which data I really need to back up. That’s currently less than 1 TB in backup size. But I’m currently not good at swapping the HDDs and storing them in separate places, so if the house burns down, it would all be lost.

But I have also started looking, together with my dad (who lives in a different place, but in the same town), at how we can exchange backups automatically. Automatically is important, because if it’s a manual task I will for sure forget it or say “let’s do it tomorrow”.

For protection against 1-2 drive failures, I currently use 6 HDDs in a small 2-node Ceph cluster + 1 additional monitor node (raspi), but I need a different case for my old Mini-ITX PC, so that I can distribute the disks more equally.


I have a separate 2x 4TB mirror array in my personal computer. I wrote a few scripts to RAR my stuff onto it every week. Once a month I make a backup of the RAR archives on an encrypted 5TB external hard disk, which is then stored elsewhere. Not the most cutting-edge strategy, but it works.
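A weekly archive job like the one described can be quite small. Below is a minimal sketch, not the poster's actual scripts: it swaps RAR for Python's standard `tarfile` module, and the function name and the paths in the usage note are hypothetical.

```python
import tarfile
from datetime import date
from pathlib import Path


def archive_directory(source: Path, dest_dir: Path, stamp=None) -> Path:
    """Pack `source` into a dated .tar.gz inside `dest_dir` and return its path."""
    # Default to today's date so each weekly run gets its own archive file.
    stamp = stamp or date.today().isoformat()
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive_path = dest_dir / f"{source.name}-{stamp}.tar.gz"
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the source folder.
        tar.add(source, arcname=source.name)
    return archive_path
```

Calling something like `archive_directory(Path("/home/me/documents"), Path("/mnt/mirror/backups"))` from a weekly cron job or scheduled task would cover the weekly step. The monthly copy to the encrypted external disk stays a separate step.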


My library is around 60TB or so. I have backups for critical stuff, and for everything else I use SnapRAID. While SnapRAID isn’t a backup solution, it has some functionality that overlaps with one, and it is less likely to suffer a complete failure than a traditional RAID array.

I haven’t lost any data since 2006 when I started using a similar solution.


At 100TB, it’s about… $1500 for raw, $2000 or so for something raid5? Maybe less if you shop around, but I’m not too sure.
I think LTO-8 starts to make sense around 4~5k, but it does have its advantages for cold storage, since even if the mechanism fails, you can still recover the data. But the drives are some $3k or so, iirc.
Not too sure on LTO maintenance though, but I sometimes see LTO-5 drives that look cheapish ($500 or so?), and the tapes are about half the price of HDD per TB? It might be something to look into, but if the hoard is still growing, it might be more prudent to get LTO-7 or LTO-8 for a more forward-looking solution. It’s also still a lot of money, especially if the drive fails.

But, no, 100TB is not cheap, no matter how you set it up.

On topic: I have about 75TB of storage, with I think about 50TB used.


IIRC Lenovo (ex IBM) TS4300 Tape Libraries can be had for 4k new.
You would then still need the surrounding infrastructure pushing the overall investment into 8-ish k range.

I was thinking just a used SAS 5.25" tape drive, which are around 3~4k CAD, so probably 2~3k USD. Is that not a good option for casual-if-expensive data hoarding?

For that size, it is costly no matter which way you go.
Local will be cheaper, but not safe. In case of fire or a break-in, they will lose everything.
Cloud will be slow and expensive for this size.

I was using CrashPlan Pro, but moved to iDrive because they have a much better Linux/NAS client.
However, CrashPlan Pro had no data limit, so in principle they could upload as much as they want. I don’t know their current plans though, but it’s worth checking.


Define “arm and a leg” in USD?
He/she should be able to get a decent NAS solution for anywhere from 5K to 10K, depending on how much he/she can do DIY … if that is already in arm-and-a-leg territory, then there’s really nothing to do other than deciding what budget is allowable and then trimming the data to fit … there’s no escaping the cost of the storage, and there’s no secret technology that will give you better prices at this relatively small scale …


It sure is. I don’t know the specifics, but from what I gathered it’s mostly stored on a pile of external hard drives.

My math broke it down to about $10k per 100TB for a full (spinning rust) solution. That could be something prebuilt or by someone like 45Drives. I’m sure there are deals out there though. LTO has about the same upfront cost, but starts to look better the farther up you scale.

I don’t think there is any specific number in mind. I pitched a few options for cloud storage (B2, Glacier, and, as a shameless plug, myself, since I’m building a datacenter) and a few local options (Drobo, servers). They didn’t seem too thrilled with the options.

I suspect this is a case of them suddenly realizing that piles and piles of external drives are not ideal, and not knowing what their options are.

The lowest you can get for 100TB of reliable (at least RAID6) local storage is 1200USD for a QNAP TS-873A, plus 3-4000USD for 120-160TB of raw disks, plus all the time it takes to set it up and create the shares, the workflow and the backups. So yes, 5K plus services minimum, without getting creative or considering used enterprise gear, which in this case I wouldn’t touch anyway.
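As a rough check on how raw capacity maps to usable space under RAID6, here is a small sketch. It assumes an 8-bay unit filled with identical drives (the drive sizes are illustrative), and it ignores filesystem overhead and the TB/TiB difference. The function name is made up for this example.

```python
def raid6_usable_tb(drive_count: int, drive_tb: float) -> float:
    """Usable capacity of a RAID 6 array: two drives' worth goes to parity."""
    if drive_count < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    return (drive_count - 2) * drive_tb


# Assumed: 8 bays, all populated with same-size drives.
BAYS = 8
for drive_tb in (16, 18, 20):
    raw_tb = BAYS * drive_tb
    usable_tb = raid6_usable_tb(BAYS, drive_tb)
    print(f"{BAYS} x {drive_tb}TB: {raw_tb}TB raw, {usable_tb}TB usable")
```

Under those assumptions, 8 x 16TB gives 128TB raw but only 96TB usable, while 8 x 20TB gives 160TB raw and 120TB usable, which is why the raw figure needs to sit well above the 100TB target.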

If that is too much for running their business the only thing I can say is that they will be spending that amount of money for a point refresh of one of their Sonys or Nikons or Canons, so only they can decide if they can skip one and have backups/one less lens or YOLO it until something bad happens …


I am assuming they would want to have local access on demand, and that options like Glacier or infrequent-access cloud object storage are not an option?

100TB of infrequent-access storage on Oracle Cloud Object Storage is 250USD/month … the upload cost may be zero using Data Transfer Appliances, but the scale of the project and the amount of complexity in dealing with an enterprise cloud service will cost well over 5K in the end … and they will want the customer to pay in advance every year …
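To put the 250USD/month figure in perspective, here is a rough sketch of how the storage fees alone add up over time. It assumes the price stays flat and ignores retrieval, egress and setup costs; the function name is made up for illustration.

```python
def cumulative_cloud_cost(usd_per_month: float, months: int) -> float:
    """Total storage fees paid after `months`, assuming a flat monthly price."""
    return usd_per_month * months


MONTHLY_USD = 250  # figure quoted above for 100TB of infrequent-access storage

for years in (1, 2, 3, 5):
    total = cumulative_cloud_cost(MONTHLY_USD, years * 12)
    print(f"After {years} year(s): {total} USD in storage fees")
```

That is 3000USD per year, so the storage fees alone pass the 5K mark during the second year, before any of the project and complexity costs mentioned above.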

If you’re on a budget, something prebuilt is not the way to go.
It sounds like the best solution is probably looking for deals on high-capacity enterprise/datacenter drives on liquidation via eBay or Amazon sellers, and assembling something custom, but for 100TB worth of drives… There aren’t many consumer cases on the market that will handle that much 3.5" capacity without sticking to the bleeding edge of data density.

Doing some price checking on ebay, I think you can get the drives themselves for ~$1700 for 7x16TB, with some space to spare at about 112TB of raw capacity. A non-redundant backup is better than no backup, so that might be a good starting point.
If you get something like an Anidees AI Raider XL and four 3x5.25"->5x3.5" drive adapters, you could fit 20 drives in it, in theory. It might be better to do 16, for thermal reasons, but I think these helium drives run pretty cool, so it’d probably be fine. I’m not sure how well that case fits these adapters, though, so it might require some modification.
That would be another $400 or so, so I think you could certainly get a complete running system for under $3000, $4500 maybe for a fully redundant backup, everything included.
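The arithmetic above, as a quick sketch. The prices are the rough figures quoted in this post, and the function names are illustrative only.

```python
def raw_capacity_tb(drive_count: int, drive_tb: float) -> float:
    """Raw capacity with no redundancy: every drive holds data."""
    return drive_count * drive_tb


def cost_per_tb(total_usd: float, drive_count: int, drive_tb: float) -> float:
    """Price per raw terabyte for a batch of identical drives."""
    return total_usd / raw_capacity_tb(drive_count, drive_tb)


DRIVE_COUNT = 7
DRIVE_TB = 16
DRIVES_USD = 1700            # rough price for the 7 drives
CASE_AND_ADAPTERS_USD = 400  # rough price for case plus drive adapters
BUDGET_USD = 3000            # ceiling for a complete non-redundant system

print(f"Raw capacity: {raw_capacity_tb(DRIVE_COUNT, DRIVE_TB)}TB")
print(f"Drive cost: ${cost_per_tb(DRIVES_USD, DRIVE_COUNT, DRIVE_TB):.2f} per raw TB")
print(f"Left for the rest of the build: ${BUDGET_USD - DRIVES_USD - CASE_AND_ADAPTERS_USD}")
```

That works out to roughly $15 per raw TB, with about $900 left under the $3000 ceiling for the remaining parts.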

But, if he’s unwilling to assemble, and doesn’t have any friends or family willing to help out, that means spending a lot of money.


There are not many options, and none will be cheap once we exceed a certain amount of data. :wink:

Server with many HDD… or Tape, Blu-ray, Cloud. With a scale above 100TB, nothing will be super cheap, unfortunately. :frowning:
