First home-build NAS -- BTRFS Questions

Hey all!

I heard that this forum is a great place to get help diving deeper down the Linux Server rabbit hole (of course, not exclusively), and would like to get some “experienced” input and understand a couple of concepts regarding BTRFS and how to plan ahead.

I will try to keep it short and to the point:

  1. I am building my first Server/NAS after dreaming about it for almost a decade now
  2. I am a programmer and quite familiar with Linux but less so with Server related parts (which will hopefully change in the near future!)
  3. The new Server will be used as a playground for some ‘server-y’ tasks and also as a NAS for my data.

So while I am waiting for my hardware to arrive (yes, I bought it in advance…) I am trying to make my final decision on how I want to configure it. And this is where we finally get to BTRFS. (Side note: I have used ZFS in the past with TrueNAS at work, though very briefly, but would like to go with BTRFS)

I have 3x 6TB hard drives that should be used for the “NAS” part, and I currently have a 3TB and a 4TB hard drive lying around with my current data, which I don’t want to keep using for the NAS.

I would like to be able to ‘lose’ a drive without losing any data. So, currently, I am looking at setting up a BTRFS-managed RAID1 configuration. RAID1C3 is a bit aggressive, as I would really like to have enough space to copy the data from my old drives. But with the ability to add more devices in the future, what does the ‘classic’ BTRFS upgrade path look like? Let’s say that I get another 3x 6TB drives and add them to BTRFS. If two drives out of those 6 drives fail, I will lose all/some of my data, correct?
But that seems a bit weak, and I would assume that you would usually not use this configuration in practice.
With 6 drives, I would assume a good target would be two possible drive failures, but how would I set this up with BTRFS?
Or, more clearly, what resources or tools do you use to plan the future upgrade path of your BTRFS volumes?
How do you handle redundancy when adding/removing disks (even without the additional complexity of having differently sized HDDs)?
Creating multiple volumes (like datasets in ZFS?) is something I would really like to avoid since that is a very ‘messy’ solution, IMHO.
The reason I am asking is that I can see my storage needs doubling or tripling in the next 1-2 years, and I would really like not to make a bad decision early on.
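To make the plan concrete, the initial setup I have in mind would look roughly like this (device names are placeholders, not my actual drives):

```shell
# Create a btrfs filesystem spanning all three drives, with RAID1
# for both data (-d) and metadata (-m): two copies of everything.
mkfs.btrfs -d raid1 -m raid1 /dev/sda /dev/sdb /dev/sdc

# Mounting any member device mounts the whole multi-device filesystem.
mount /dev/sda /mnt/nas

# Show allocated vs. estimated usable space per profile.
btrfs filesystem usage /mnt/nas
```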

I am really looking forward to your feedback! :slight_smile:
Even if you say that I should just read the BTRFS documentation more closely :smiley:

Thanks!

I would strongly recommend against using btrfs’s RAID functionality; IMO it is experimental at best, and I would not trust my data to it.
If you absolutely must use btrfs as a file system, use mdadm to manage the RAID underneath btrfs. With mdadm you’ll be restricted to “traditional” RAID levels (i.e. no RAID1C3).
If it were me in that situation, I’d use mdadm RAID 5 (although I’d really wish I had a 4th drive so I could go to RAID 6), and then use mdadm’s RAID expansion functionality for future upgrades.
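A rough sketch of that approach (device names and mount point hypothetical, not a tested recipe):

```shell
# Build a 3-disk mdadm RAID 5 array, then put plain btrfs on top of it.
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sda /dev/sdb /dev/sdc
mkfs.btrfs /dev/md0

# Later expansion: add a 4th disk, grow the array, then grow the filesystem.
mdadm --add /dev/md0 /dev/sdd
mdadm --grow /dev/md0 --raid-devices=4
btrfs filesystem resize max /mnt/nas
```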

Also, consider whether you want your data encrypted; it makes a difference how you apply LUKS.

I would say that applies to RAID5 and RAID6, but I would feel safe with RAID1.

I am currently running 2 HDDs in RAID1, encrypted with LUKS (applied separately to each drive, with the same LUKS header), to preserve btrfs self-healing of data even when encrypted.
If I went for mdadm RAID + LUKS, btrfs would not be able to repair data easily.
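For anyone curious, that layout is roughly the following (a sketch with placeholder device names, not my exact commands):

```shell
# Encrypt each drive separately, then build btrfs RAID1 on the mapped
# devices, so btrfs still sees two independent copies and can self-heal
# from checksum errors even though the data is encrypted at rest.
cryptsetup luksFormat /dev/sda
cryptsetup luksFormat /dev/sdb
cryptsetup open /dev/sda crypt0
cryptsetup open /dev/sdb crypt1
mkfs.btrfs -d raid1 -m raid1 /dev/mapper/crypt0 /dev/mapper/crypt1
```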

I am not a professional data hoarder, but I think I could expand in the future with two same-sized HDDs and create some sort of RAID10. I hope I will be able to rebuild it live without needing an extra backup.

If I am completely full of BS, please correct me, for all of our sakes. :slight_smile:

Ahh, I hadn’t really considered volume-level encryption. I always containerize my encryption because I don’t want to take any recovery or management options off the table if things go sideways (or, more realistically, complicate recovery/management procedures).

Raid 1 is definitely safer than the higher raid levels with btrfs’s raid, but I’m too paranoid to use it by association.

My thinking is: if I’m not going to choose a completely rock-stable, battle-tested FS, I might as well opt for something more advanced than btrfs, like BetrFS.

just a sidenote on BTRFS RAID5/6

seems like there are improvements coming, according to Phoronix


The article lists improvements, but not an “OK to use”. Also, the improvements were merged for kernel version 6.2, which is going to be released in the spring of 2023 - don’t hold your breath until it arrives in the Linux distro you use.


I didn’t say I’d recommend using BTRFS RAID 5/6 now or with 6.2, just that there is some movement and “hope?”

I have lots of hope for btrfs, but since your comment appeared in the thread that starts with “my first Server/NAS” I felt a little clarification was needed.

I trust raid1 btrfs way way more than raid1 ext4 or any other fs.

I use btrfs “raid 0” on 4 block devices. But those block devices are sitting on a remote server, and are backed by ZFS ZVOLs…


Hey, thank you for your thoughts!
From what I’ve seen, many dislike BTRFS because of its misleading/incomplete RAID56 support. As long as they don’t explicitly remove the ‘warning’ from the documentation, I definitely wouldn’t use it.
But I wouldn’t ‘discard’ the other options by association. At least not in my ‘hobby environment’, but maybe I just haven’t been burned yet and will learn my lesson :smiley:

Encryption is a good point! Yes, I was thinking about encrypting the drives, but now that @twin_savage mentioned containerized encryption, I realize I have never really thought about that… What tools do you use? How much work is it to continuously grow the encrypted “directories”? I am mostly thinking about family pictures, which would be the ‘main’ reason for encryption, but that would also be the directory with the most churn…
I have definitely been bitten by trying to recover from an encrypted drive before. But on the other hand, with containerized encryption, the container usually holds exactly the data I want to recover. So how does that play out in the end?

I guess I am more leaning toward volume-level encryption but don’t have a good reason to do so. I am happy to hear your input on this subject too!

As far as changes to the Linux kernel go, I will be using NixOS as the operating system, so I will probably be running ‘newer’ kernels than most other first-time NAS builders. :slight_smile:


Hi,

I’ve been running btrfs for years and years on a few machines at home: raid5/raid6/raid1/raid10, with various numbers of drives.

Raid1 is the safest/easiest setup for three 6TB drives, giving you 9TB of space. This is because of how degraded mode is handled in btrfs.
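That 9TB comes from raid1 keeping two copies of every chunk. A back-of-envelope formula (a rough approximation, not the allocator’s exact behavior) is usable = min(total / 2, total - largest drive):

```shell
# Rough usable capacity for btrfs raid1, sizes in TB (3x 6TB here).
total=$((6 + 6 + 6))
largest=6
half=$((total / 2))          # two copies of everything
rest=$((total - largest))    # the second copy must fit on other drives
usable=$(( half < rest ? half : rest ))
echo "${usable} TB usable"   # 9 TB usable
```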

You could run raid1c3 or raid5 with 3 disks, but...(caveats)

…you won’t be able to mount it except in degraded mode with only 2 remaining disks (should one of them die), effectively forcing you to keep a hot spare or have backups. This is because lots of protections are disabled in degraded mode, and you practically only want to use it to get out of a disaster, not run in that mode for weeks.
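In practice, recovering a raid1 array from a dead disk looks something like this (the devid and device names are illustrative):

```shell
# One-time degraded mount to bring the filesystem up without the dead disk:
mount -o degraded /dev/sdb /mnt/nas

# Replace the missing device (devid 1 here, as reported by
# `btrfs filesystem show`) with a new drive; -B waits until it finishes.
btrfs replace start -B 1 /dev/sdd /mnt/nas

# Verify the array is back to full redundancy.
btrfs filesystem show /mnt/nas
```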

ZFS handles degradation a lot better (LTT famously ran with dead drives for years without noticing, before the pool died and Wendell saved their bacon), but ZFS sucks for any kind of asymmetric drive expansion: you need to upgrade a whole vdev of storage at a time. If your storage needs are small, that may mean never upgrading (paying for storage you never use) or not riding the drive price-reduction curve. If your storage needs grow predictably at about 50% per year, or you’re not cost-sensitive, this isn’t an issue.

By having one more than the minimum number of disks required for a RAID level, you can continue to store data at that RAID level even with wonky disks.

If you want maximum, maximum, maximum space and flexibility and your data is mostly static, snapraid is for you. Otherwise, btrfs is more cost-efficient than ZFS for the space used in the long term.

Generally, with either btrfs or snapraid: about a month or two before you think you’ll run out of space, buy whichever new disk is cheapest per TB. See diskprices.com or skinflint.co.uk (or one of their sister sites for your local market), or subscribe to alerts on camelcamelcamel. Typically expect prices per TB to fall about 15-20% YoY.


There’s a primer here which I wrote a while back and could do with a refresh
Small Debian NAS [btrfs] [WIP]

Ironically, it’s missing periodic scrubbing and weekly utilization reports, but it should be enough to get you going with all the crypto nonsense.

It has you setting things up with 1 disk, then adding more, changing raid levels, rebalancing data and so on.

Removing disks in the future (because you filled up your chassis and have small disks) is easy: you rebalance data off the disk and then remove it.
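The add/rebalance/remove cycle boils down to a few commands (devices and mount point illustrative):

```shell
# Grow: add a new disk and rebalance so existing chunks spread across it.
btrfs device add /dev/sde /mnt/nas
btrfs balance start /mnt/nas

# Changing RAID levels later is the same balance with convert filters:
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/nas

# Shrink: migrate data off a disk and drop it from the array.
btrfs device remove /dev/sdc /mnt/nas

# The periodic scrub mentioned above (run it from cron or a systemd timer):
btrfs scrub start /mnt/nas
```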


It’s possible to sneak LVM underneath btrfs later, but it requires a bit of computation with a spreadsheet and dd commands (or some weird tooling that does basically that), so it’s best done ahead of time… Or, if you really want SSD caching and storage tiering, then ZFS is probably for you.

Since it’s a wiki, do make edits where you spot fixes are needed.

Thanks!
I will have a detailed look at the guide!

I guess I will be going for BTRFS with RAID1.
I took a quick look at snapraid, but it seems a bit ‘too specialized’ for me, and I would like to ‘play around’ with compression and quotas, mostly to get some experience with them :smiley:
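Both are easy to experiment with; assuming a mount point like /mnt/nas, something along these lines:

```shell
# Transparent zstd compression (level 3), applied to newly written files:
mount -o compress=zstd:3 /dev/sda /mnt/nas

# Quota groups (qgroups) track usage per subvolume:
btrfs quota enable /mnt/nas
btrfs qgroup show /mnt/nas
```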

Thanks for sharing the websites! I’ve used camelcamelcamel this year, too, to buy the hard drives and would also recommend it.

Dang. SSD caching is something I have never really thought of. I guess that could be nice to keep the disk usage down when I keep accessing the same data… I think I will still stick to BTRFS and check out ZFS again when I get another chance.

For me, at least, it isn’t just the raid5/6 support in btrfs that turns me off; it’s the seemingly constant stream of updates to fix some decent-sized bug that gets found.
Just last week I had to patch UEK because a btrfs-related CVE (CVSS 7.8) was finally patched in the kernel in question.
To be clear, I’m not saying btrfs should be abandoned; I applaud the people willing to run it and solidify its codebase, but it’s not for me with my risk tolerance.

I’ve been using VeraCrypt for both Windows and Linux encryption containerization. It’s simple to deal with: the entire encrypted volume is stored as a file on the host and will grow in size with the data fed into it, assuming you created the volume with the “dynamic” option checked and overprovisioned it sufficiently; otherwise, manual volume expansion will need to be done through VeraCrypt while the volume is dismounted. The VeraCrypt application mounts that file to present a new volume to the OS.

For me the biggest selling point of this method of encryption beyond the simplicity in data recovery is the hidden volume/plausible deniability aspect of it; I’m not going to get into why this is so good to protect future me.

It may be possible to use external headers to create hidden volumes with LUKS, but I haven’t actually looked into whether that leaves breadcrumbs behind or has other gotchas… If anyone knows the answer, feel free to enlighten me.
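For what it’s worth, LUKS2 does support detached headers, which at least keeps any LUKS signature off the data device; a sketch with placeholder paths:

```shell
# Pre-allocate a header file on a separate device (e.g. a USB stick),
# then format the data disk with the header stored externally:
truncate -s 16M /mnt/usb/header.img
cryptsetup luksFormat --header /mnt/usb/header.img /dev/sdx
cryptsetup open --header /mnt/usb/header.img /dev/sdx secret0
```

Note that unlike VeraCrypt there is no inner/outer hidden-volume scheme here; the data device simply looks like random data, which may or may not count as plausible deniability.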

I just put things on LUKS in order to worry less about data being exposed when, e.g., sending drives in for RMA or having machines stolen while I travel… Realistically, less than 0.1% of my data would concern me if it ended up floating around on the internet, but managing what goes where is a pain. Full-disk encryption (except the bootloader, kernel, and initramfs) with unlocking over SSH or with a USB stick works well for me.


Also, don’t forget collating and buffering random writes: you get about 100 IOPS per drive, and many writes on any system are small writes, appending to logs and other kinds of journals.

If you could turn each one into 1M or half-meg writes, woohoo happy days.
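The back-of-envelope math: at a fixed ~100 IOPS, write size is everything.

```shell
# Throughput at 100 IOPS for two write sizes:
iops=100
echo "$(( iops * 4 )) KiB/s at 4 KiB per write"      # 400 KiB/s
echo "$(( iops * 1024 )) KiB/s at 1 MiB per write"   # 102400 KiB/s, ~100 MiB/s
```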


If you just put LVM caching under btrfs raid1, you’re probably storing data in cache twice.

Thanks for getting back to me!
I see. As I am not using it in a professional setting, I am more willing to ‘live’ with a still-evolving file system, and it will maybe force me to keep everything up to date. :smiley:

> I’ve been using veracrypt for both windows and linux encryption containerization. Its simple to deal with, the entire encrypted volume is stored as a file on the host and will grow in size with the data fed into it assuming you created the volume with the “dynamic” option checked and overprovisioned it sufficiently; otherwise manual volume expansion will need to be done through veracrypt while the volume is dismounted. the veracrypt application mounts that file to present a new volume to the OS.

I have heard of VeraCrypt and used it for whole-drive encryption once before, but I wasn’t aware that there was an option to create dynamic volumes.

Yeah, that sounds like an interesting application, though I don’t really see a use case for me, and disk-level encryption seems like the simpler, more direct solution. Although, if I run into issues with LUKS and BTRFS, I might switch to containerized encryption. :slight_smile:

Do you mind expanding on this a bit? How do I manage that?
What terms do I have to google?
Also, I don’t quite get what you mean by:
