Btrfs: ways to free up space in a more predictable/controllable manner?

I intend to switch to Btrfs send/receive as the backup solution for media files on numerous slow SMR 1-4 TB external archival HDDs (not using these drives by choice; it’s the only use I can come up with for them), as an alternative to backup software like Borg/Kopia (which I will still use as my primary backup solution for more performant drives). I suspect it can do better than the 15 MB/s write speed I’m getting with Kopia, since rsync gets 2-5x that. Borg/Kopia do de-duplication, compression, encryption, etc., but I really only need encryption (Btrfs on LUKS) and handling of file renames, and I suspect that Btrfs, understanding more about the filesystem, can be more efficient in this regard. I also intend to switch to Btrfs on my workstations if I can settle on a good workflow.
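For concreteness, the workflow I have in mind is the standard incremental send/receive pattern, roughly like this (paths are hypothetical; assume /mnt/source is the live filesystem, /mnt/backup is the LUKS-backed destination, and everything runs as root):

    # Initial full transfer: the snapshot must be read-only to be sendable
    btrfs subvolume snapshot -r /mnt/source/media /mnt/source/snap.0
    btrfs send /mnt/source/snap.0 | btrfs receive /mnt/backup/snapshots

    # Subsequent incremental transfers: send only the delta against a
    # parent snapshot that exists on both sides
    btrfs subvolume snapshot -r /mnt/source/media /mnt/source/snap.1
    btrfs send -p /mnt/source/snap.0 /mnt/source/snap.1 | btrfs receive /mnt/backup/snapshots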

It’s not straightforward to me how one can free up space predictably. For a 4 TB disk, I would ideally like to maximize the storage capacity used by comfortably filling it up to, say, 3.0-3.5 TB. Typical examples of using Btrfs snapshots involve a rules-based policy where e.g. the last 5 snapshots are retained and older ones get deleted, but this is arbitrary in that it does not take disk usage into account. For backing up large media files incrementally, a “size-based” policy seems more appropriate (when disk space is the concern, deleting the oldest snapshot may not necessarily free up enough space), but I’m not aware of people or tools using such a strategy, e.g. “delete the X oldest snapshots that will bring the filesystem to just under Y amount of space used”.
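To make the kind of policy I mean concrete, here is a rough sketch of a size-based pruning loop (the mount point, snapshot layout, and 3.5 TB threshold are all assumptions; note that freed space only becomes visible after Btrfs finishes cleaning deleted subvolumes in the background, hence the sync):

    #!/bin/sh
    # Sketch: delete oldest snapshots until usage drops below a target.
    MOUNT=/mnt/backup
    LIMIT=$((3500 * 1000 * 1000 * 1000))   # ~3.5 TB, assumed threshold

    used() {
        # 'Used:' from the overall section, in raw bytes thanks to -b
        btrfs filesystem usage -b "$MOUNT" | awk '/Used:/ {print $2; exit}'
    }

    # Snapshots assumed to sort oldest-first (e.g. timestamped names)
    for snap in "$MOUNT"/snapshots/*; do
        [ "$(used)" -le "$LIMIT" ] && break
        btrfs subvolume delete "$snap"
        btrfs subvolume sync "$MOUNT"   # wait for background cleanup
    done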

  • How do I know how much space an incremental backup will take, so I can be confident the destination drive has enough room to comfortably accept the full transfer (e.g. filling up to 3.5 TB)? Knowing this should also give a decent idea of how long a transfer may take, which matters because send/receive is not interruptible: all progress is lost. (Technically you can send to a file locally, transfer that file, and receive it on the other end, as in the sketch after this list, but that requires even more time to send/replicate and additional space on both ends.)

  • How do I know how much space deleting a snapshot will free up? Is it the amount reported as exclusive by btrfs fi du <path> -s?

  • Would deleting e.g. a 2 GiB file from all my snapshots free up 2 GiB of usable space? Doubtful, since Btrfs works at the block level rather than the file level, but how does one free up space at a more granular level than deleting whole snapshots, which frees an “arbitrary” amount of space rather than a known set of files as on a traditional filesystem?

  • Which of the numbers from the various btrfs utilities (df, du, btdu) are most relevant for this purpose (knowing how much practical space is used/free)? I only vaguely understand the nuances of the similarities between what they report, and I’m not sure, from a user standpoint, which one should get the most attention when it comes to being space-conscious.
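As for the send-to-file workaround mentioned in the first bullet, my understanding is it would look roughly like this (file names hypothetical); the upside is that the stream file can be copied with a resumable tool like rsync:

    # Serialize the incremental stream to a file instead of piping it
    btrfs send -p /mnt/source/snap.0 -f /tmp/snap.1.stream /mnt/source/snap.1

    # Move the file with something interruptible/resumable, then apply it
    rsync --partial /tmp/snap.1.stream /mnt/backup/incoming/
    btrfs receive -f /mnt/backup/incoming/snap.1.stream /mnt/backup/snapshots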

Any tips are much appreciated, including specific commands I might want to know for getting these numbers, so I can use Btrfs for backups in a more space-conscious way.

For your purposes, look into btrbk. I use it to back up and snapshot my desktop and home server and it works great. The only catch is the learning curve (on the Btrfs side), as you’re already finding out.
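From memory, a minimal btrbk.conf for this looks something like the following (paths and retention values are just illustrative; check the btrbk documentation for the exact syntax):

    # /etc/btrbk/btrbk.conf
    snapshot_preserve_min   2d
    snapshot_preserve       14d

    volume /mnt/btr_pool
      snapshot_dir btrbk_snapshots
      subvolume home
        target /mnt/backup/btrbk
        target_preserve_min  no
        target_preserve      20d 10w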

I’m not aware of a way of doing it, unless you are sure you have the same ‘parent’ subvolumes on both sides and look at the exclusive use.
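For example (snapshot names hypothetical), the exclusive figure gives a rough proxy, and piping a send to wc -c gives the exact stream size at the cost of reading all the data, so it doubles as a dry run:

    # Rough proxy: data exclusive to the newest snapshot
    btrfs filesystem du -s /mnt/source/snap.1

    # Exact: count the bytes of the incremental stream without writing it
    btrfs send -p /mnt/source/snap.0 /mnt/source/snap.1 | wc -c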

Yes, that should be it. Note that deleting one snapshot can change the exclusive use of another: e.g. if two snapshots hold some large file and none of the others do, exclusive use for both may be very low, but once you delete one, the other suddenly has a large exclusive use.
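To see the effect, compare the Exclusive column across all snapshots before and after a deletion (directory layout assumed):

    # Columns: Total, Exclusive, Set shared, Filename
    btrfs filesystem du -s /mnt/backup/snapshots/*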

If you have deleted all references/reflinks to that 2 GiB file, yes. Space is freed once no files reference the blocks any more.
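One caveat if these are received snapshots: they are read-only, so deleting a file from each means temporarily flipping them writable, something like the loop below (layout hypothetical). As far as I know, making a received snapshot writable clears its received_uuid on recent kernels, which breaks its use as a parent for future incremental receives.

    # Sketch: purge one file from every snapshot on the backup side
    for snap in /mnt/backup/snapshots/*; do
        btrfs property set -ts "$snap" ro false
        rm -f "$snap/path/to/bigfile"
        btrfs property set -ts "$snap" ro true
    done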

btrfs filesystem usage gives the high-level overview of free space. btdu is very useful too. Plain du is less useful on Btrfs: if I ran du on my snapshots directory, it would say I’m using hundreds of TBs, since it is not aware of reflinks or snapshots.
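i.e. the ones I actually look at (mount point assumed):

    # Allocation overview: Used / Free (estimated) in the Overall section
    btrfs filesystem usage /mnt/backup

    # Interactive, reflink- and snapshot-aware breakdown
    btdu /mnt/backup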

IMO this is the real cost of the zero-overhead snapshots: the complexity. The learning curve is significant, but for me it’s worth it. I enjoy having 15-minute snapshots of my home directory in case I screw something up, and it has already saved me a couple of times.