Linux: backup strategies for boot and root FS for easy rollback from broken upgrade

I’d like to make a list of strategies for backing up a Linux workstation - while it is running - so that if an upgrade breaks something (e.g. on a rolling-release distro), rolling back is as easy as possible, regardless of what component broke. Anyone who knows a good strategy for this, please weigh in.

Initial Ideas

  • Periodically rsync or tar /boot, /boot/efi, and / to a separate drive or partition. If an update results in an unbootable system, boot into a live USB environment (or a separate install on another partition), mount the broken system, and copy the aforementioned things back.

  • Perhaps xfsdump can be used as an alternative to file-level copies for the above strategy (in the case where the root partition is xfs, of course)?

  • Perhaps BTRFS snapshots can be used to recover from certain kinds of unbootable systems?
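
For the first idea, here is a minimal sketch of what the periodic copy could look like, written as a small Python wrapper around rsync. The backup mount point /mnt/backup, the directory names under it, and the helper names are all assumptions made for illustration. The flags are standard rsync options: -a (archive), -A (ACLs), -X (extended attributes), -H (hard links), -x (stay on one filesystem) and --delete. It only prints the commands unless run=True is passed, and it assumes the backup drive uses a Linux filesystem that can store ownership and permissions.

```python
import subprocess

# Hypothetical layout: backup drive mounted at /mnt/backup (an assumption).
BACKUP_ROOT = "/mnt/backup"

# Mount points named in the list above. In a typical UEFI setup each one is
# its own filesystem, so each gets its own rsync run.
SOURCES = ["/", "/boot", "/boot/efi"]


def rsync_command(source, backup_root=BACKUP_ROOT):
    """Build the rsync argv for copying one mounted filesystem."""
    # "/" -> "rootfs", "/boot/efi" -> "boot_efi"
    name = source.strip("/").replace("/", "_") or "rootfs"
    # Trailing slash on the source: copy its contents, not the directory itself.
    src = source.rstrip("/") + "/"
    dest = f"{backup_root}/{name}/"
    # -x keeps rsync from descending into other mounts (/proc, /sys, the
    # backup drive itself, ...).
    return ["rsync", "-aAXH", "-x", "--delete", src, dest]


def backup_all(run=False):
    """Print each command; execute only when run=True (needs root)."""
    commands = [rsync_command(s) for s in SOURCES]
    for cmd in commands:
        print(" ".join(cmd))
        if run:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    backup_all(run=False)
```

Restoring from the live USB would be the same commands with source and destination swapped, pointed at wherever the broken system is mounted.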

This was prompted by my familiarity with FreeBSD. On the surface, FreeBSD has a much easier time with this than Linux, because of native support for ZFS boot environments, and the very recent addition of UFS snapshots even when soft updates journalling is enabled, but I’m wondering if people here, who might have more bleeding-edge Linux experience than I do, have ideas that make recovery from a bad upgrade equally trivial.

Edit: There are some more ideas here on the Arch wiki.

This is my choice. I run a distro (openSUSE Tumbleweed) that puts all snapshots into the GRUB menu, so I can just boot into any previous snapshot.

Grub → boot into read-only snapshot xyz → check if I booted into working system → sudo snapper rollback → reboot → done.

Snapshots aren’t a backup. They just save you from restoring the system from backup most of the time. I back up my system by sending the BTRFS snapshots to my ZFS-backed homeserver. Fast and easy. But sending the snapshot to a USB SSD/HDD works just as well.
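
For anyone who hasn't seen it, a rough sketch of the "send the snapshot to a USB drive" variant, using plain btrfs commands wrapped in Python. The snapshot path /.snapshots/manual-backup and the mount point /mnt/usb-backup are made-up names, and the helper functions are hypothetical. It also assumes the USB drive is itself formatted as BTRFS, which `btrfs receive` requires. Snapper manages its own snapshot layout, so this is not what the distro scripts do, just the underlying mechanism.

```python
import subprocess


def snapshot_command(subvolume, snapshot_path):
    """argv for a read-only snapshot (btrfs send only accepts read-only ones)."""
    return ["btrfs", "subvolume", "snapshot", "-r", subvolume, snapshot_path]


def send_command(snapshot_path):
    """argv that serializes the snapshot to stdout."""
    return ["btrfs", "send", snapshot_path]


def receive_command(target_dir):
    """argv that recreates the snapshot under target_dir (must be on btrfs)."""
    return ["btrfs", "receive", target_dir]


def send_to_drive(snapshot_path, target_dir):
    """Pipe `btrfs send` into `btrfs receive`; needs root."""
    sender = subprocess.Popen(send_command(snapshot_path), stdout=subprocess.PIPE)
    receiver = subprocess.Popen(receive_command(target_dir), stdin=sender.stdout)
    sender.stdout.close()  # so the sender gets SIGPIPE if the receiver exits early
    receiver_status = receiver.wait()
    sender_status = sender.wait()
    if sender_status != 0 or receiver_status != 0:
        raise RuntimeError("btrfs send/receive failed")


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    snap = "/.snapshots/manual-backup"  # hypothetical snapshot path
    target = "/mnt/usb-backup"          # hypothetical mount point of the USB drive
    print(" ".join(snapshot_command("/", snap)))
    print(" ".join(send_command(snap)) + " | " + " ".join(receive_command(target)))
```

The same pipe works over SSH to another machine, as long as the receiving side also has a BTRFS filesystem to receive into.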

BTRFS is great: CoW, snapshots, native Linux support, and distros that use it by default and ship the scripts that make these things happen.

BTRFS is an invaluable resource that can unbrick your OS, but it’s not a backup. I copy and paste everything from my /home directory that I don’t want to lose onto an external hard drive just in case I get like multiple drive failures or accidentally delete stuff or something.

Interesting. Two quick votes for BTRFS! What about the contents of /boot and /boot/efi? I’ve never used BTRFS so maybe /boot (or wherever your kernel is) is itself on btrfs, but surely the bootloader (e.g. grub) and the grub config are not?

/boot itself is BTRFS because / is. /boot/efi is the EFI system partition (vfat), while /boot/grub2/x86_64-efi and /boot/grub2/i386-pc are fully fledged subvolumes in BTRFS. (welcome to my /etc/fstab :slight_smile: )

I’m not that much of an expert in partitioning and boot stuff. I don’t touch the stuff and I don’t want to. But from experience with multiple distros whose scripts update grub before and after the package manager runs, plus manual BTRFS snapshots, I was always able to boot into a working snapshot again.

Sending a snapshot to either a file or another BTRFS filesystem produces a serialized stream (much like ZFS does). It’s like 100x faster than rsync and covers every single block, not just files.

Send/Receive via BTRFS or ZFS is the way to do backups today if you have access to either of them. There is no competition.

That is not necessarily the case. My /boot is a separate partition and thus can be a different FS, as is /boot/efi on my server (which I had to use 'cause it won’t accept MBR style booting, much to my chagrin :rage: )

I guess this is getting into the weeds a little bit, but I guess that means that GRUB can read BTRFS filesystems? Since it needs to do so in order to load the kernel?

Interesting - you prefer MBR style booting? I don’t know much about boot processes, but I just use UEFI always out of habit. Why do you prefer MBR/BIOS?

Another vote for btrfs. The others haven’t emphasised that snapshots are close to instant, and reverting to a snapshot takes only as long as a reboot. Restoring from a backup, even with rsync, is time consuming and can fail.

For some forms of encryption /boot is a separate partition.

The EFI system partition (ESP), usually mounted on /boot/efi, should have its own strategy. An OS update can in principle bork the boot, so I don’t let OS updates touch it: I tell grub not to update automatically, install grub to its own subvolume independent of any OS, and maintain grub and the ESP manually. That’s rarely necessary, maybe a few times a decade.

And if restoring over WAN, latency can make it a very tedious endeavor. Don’t ask why I know this :slight_smile:

Yeah, snapshots are like a 120Hz display, an optical mouse, broadband internet or free refills. Once you’ve used them, everything else just feels like the stone age.

This is a wise approach nonetheless. When I switched to Linux, I was told “don’t touch the /boot”. I admire people knowing all that stuff. I’m just happy not having to use fdisk and deciding on using MBR/EFI. For some parts of a system, I’m just the average guy with no clue.

I totally agree, I love zfs on root on FreeBSD. My experience with it on Linux has been sketchy though, and I’m afraid that on bleeding edge distros it might break often.

Well, I’m an “old f@rt”, really :stuck_out_tongue: I dislike the fact that an E(U)FI partition can only be formatted in a proprietary M$ file system, not an open source one, when there’s really no reason it can’t. Being a long term Linux user (ditched Win98SE back in 2005 after one BSOD too many, used Linux as my main OS ever since) I also take offence at the cr@p that’s systemd, but that’s another, irrelevant story.

My fstab file contains separate entries for /boot, /, swap, /var, /usr, /tmp and /home (the latter mostly on a separate disk as well) on pretty much all of my systems. Saved me more than once on dying hardware or renegade processes filling space with enormous log files etc.

HTH!

Fair reasons! Thanks for weighing in.

I totally agree on ZFS with FreeBSD. I wish we could have boot environments for ZFS just like FreeBSD. But BTRFS on certain distros can be a 95% substitution.

There were some attempts to imitate boot environments for ZFS on Linux, but projects like zsys being abandoned/deprecated and general non-native problems just led me to using BTRFS, which sees great support from some distros. I feel comfortable using BTRFS as the root partition, and unless you’re building a server with parity RAID, BTRFS is more than a good substitute and is well recognized and supported by both the kernel and several enterprise/leading distros.

Right, this makes sense to me… These days on linux I use ext4 or xfs for the root and then use large zfs pools for the “actual data”, but perhaps I should move over to btrfs for the root, in light of this discussion, particularly on more bleeding edge distros…

I just changed root to BTRFS. I’d prefer ZFS but for most distros, that’s just too much work and running leading edge distros on ZFS is outright dangerous.
I have a zvol with BTRFS as backup drive and replication (once a week). Would be way smoother with a simple ZFS send/receive, but Zvol with BTRFS does the job just as fine.

If you come with some ZFS experience, BTRFS commands and features will be very familiar. You know what CoW, checksums, metadata and scrubs are. But you have to use /etc/fstab again because you don’t get 100% of the good things. Compression is actually a mount option in BTRFS :frowning:
I treat BTRFS as the little brother of ZFS.

I know (in the internet sense) someone who feels like you about UEFI, but he was an early adopter of GPT, so his systems are BIOS/GPT. IMO the old MBR scheme from the early 80’s was a kludge back then, and it’s a horrible crock now.

Both 1. (rsync /boot to elsewhere on /), followed by 3. (btrfs snapshot of / with snapper).

In my case / is on btrfs within luks2 within lvm; /boot is non-encrypted btrfs, and /boot/efi is fat32, as is usually required.

After the snapper snapshots are taken I upload them to an rclone-encrypted Google Drive folder.

You can do incremental btrfs too, and you can store the send stream (same as with ZFS - not recommended).
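
A small sketch of what the incremental variant looks like, again as Python that only builds the command lines. `btrfs send -p <parent>` sends just the difference against an earlier snapshot, and `-f <file>` writes the stream to a file instead of stdout. The snapshot paths and function names below are hypothetical. One thing to keep in mind: the parent snapshot has to exist on the receiving side too, otherwise the incremental stream can't be applied.

```python
import shlex


def incremental_send_command(parent_snapshot, new_snapshot, stream_file=None):
    """argv for an incremental send of new_snapshot relative to parent_snapshot.

    If stream_file is given, the stream is written there (-f) instead of stdout.
    """
    cmd = ["btrfs", "send", "-p", parent_snapshot]
    if stream_file is not None:
        cmd += ["-f", stream_file]
    cmd.append(new_snapshot)
    return cmd


def incremental_pipeline(parent_snapshot, new_snapshot, target_dir):
    """Shell pipeline string: incremental send piped straight into receive."""
    send = incremental_send_command(parent_snapshot, new_snapshot)
    receive = ["btrfs", "receive", target_dir]
    return shlex.join(send) + " | " + shlex.join(receive)


if __name__ == "__main__":
    # Hypothetical snapshot names; both must be read-only snapshots.
    print(incremental_pipeline("/.snapshots/monday", "/.snapshots/tuesday", "/mnt/usb-backup"))
    print(shlex.join(incremental_send_command(
        "/.snapshots/monday", "/.snapshots/tuesday", "/mnt/usb-backup/tuesday.stream")))
```

Piping into `btrfs receive` gives you a browsable snapshot on the other side, whereas a stored stream file is an opaque blob until it is received somewhere, which is one reason keeping only the stream is usually discouraged.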

/home … ok. /boot because crypt.

Keeping the rest separated is probably a mistake these days for a general purpose system.

There’s this cryptfs thing that’s supposed to land in btrfs that might work slightly better than LUKS, and could mean /boot can be unencrypted on the same subvolume as the encrypted rest of /.

It can do all kinds of things, incl. various btrfs and zfs … but its support for various filesystems is usually a bit lagging e.g. it didn’t support raid1c3 for btrfs at same time it was implemented in kernel, …
… unfortunately btrfs doesn’t let you mix raid levels on a filesystem.

The linux distro maintainers really need to look long and hard at FreeBSD (or solaris, RIP) BEADM.

It’s been around for oh… about a decade (or more) at this point.

https://man.freebsd.org/cgi/man.cgi?beadm

It’s very interesting to me that everyone who has weighed in so far and who feels like they have an actual solution uses btrfs as part of that solution. How did Arch people deal with this in the days before btrfs was available?

Very much agreed.