Database-safe backups of data and VMs using sanoid and ZFS

I’ll address the elephant in the room first. I am aware that Proxmox exists and that I’m essentially trying to MacGyver my way into a subset of things it does well. I will probably move my home server setup to Proxmox eventually, but before that day comes I want to take some baby steps to improve my current configuration. Moving to Proxmox would require a lot more work at the moment.

Current situation

My “homelab” is essentially a single physical box running Fedora 38. On that box I have an OpenZFS pool with all my data. It also hosts a Fedora 38 VM that has some ZFS datasets exposed from the host using a mix of NFS shares and virtiofs (I am mid-switch from NFS to virtiofs, but I never finish anyth^C). This VM runs all of my services in Podman, and I attach all data to containers using simple mounts of directories that live on NFS/virtiofs shares from my ZFS pool. I’ve recently added sanoid to snapshot the pool, and I want to add syncoid and a remote box for backups. The VM image lives on the root filesystem and is not snapshotted; as long as the container data is safe I can recreate it. I would also like to create more VMs and LXC containers in the future.

The goal

Improving my backup situation, which includes moving the Podman VM and any future VMs to the ZFS pool so it can serve as a single source of truth. That will allow me to snapshot and back up everything at once with sanoid/syncoid.

The problem

How do I make sure that snapshots of the ZFS pool are consistent and that I can restore them without corrupting data, especially databases? My strategy up to this point was to just snapshot ZFS (and previously btrfs) and forget. On the rare occasions I had to restore something from a snapshot it just worked, but I am aware this is not guaranteed.

I’ve searched the Internet, but there aren’t many resources about this, and most look outdated or at least ignore other things I’ve found. The most interesting is this blog post about solving the problem with sanoid[1]. Unfortunately I don’t quite get why it does what it does. Until recently I believed the only way was to either shut down the VM before the backup (which Jim Salter, creator of sanoid/syncoid, seems to prefer[2]) or take a full live snapshot, including RAM. My understanding of this solution is that it doesn’t really care about a live snapshot, but simply forces the VM to flush pending writes to disk and temporarily redirects further I/O elsewhere, merging it back after the snapshot. Is that even sufficient to make databases happy? From the snapshot’s point of view it still feels like pulling the power cord; only the running VM state remains consistent. Wouldn’t it be better to save the full live state along with the VM image and libvirt’s domain XML file and snapshot that? I checked, and Proxmox does more or less that for running VMs.
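For comparison, here is a minimal sketch of a quiesce-only pre/post hook that skips live-snapshot machinery entirely and just asks the guest to flush, using `virsh domfsfreeze`/`domfsthaw` (which require qemu-guest-agent inside the guest). This is not the blog post's method, just an alternative. The VM name `podman-vm` is made up, and it assumes sanoid exposes the hook phase via the `SANOID_SCRIPT` environment variable when running `pre_snapshot_script`/`post_snapshot_script`. `RUN` defaults to `echo`, so by default it only prints what it would do.

```shell
#!/usr/bin/env bash
# Quiesce-only sanoid hook (sketch). Assumptions: the VM is named
# "podman-vm", qemu-guest-agent runs inside it, and sanoid sets
# SANOID_SCRIPT=pre|post|prune when calling its hook scripts.
set -eu

VM=${VM:-podman-vm}
RUN=${RUN:-echo}              # dry run by default; set RUN= (empty) to execute
PHASE=${SANOID_SCRIPT:-pre}

planned=""                    # record of planned commands, ';'-separated
plan() { planned="$planned$*;"; $RUN "$@"; }

case "$PHASE" in
  pre)
    # Ask the guest agent to sync and freeze guest filesystems;
    # guest writes are blocked until the thaw below.
    plan virsh domfsfreeze "$VM"
    ;;
  post|prune)
    # Thaw right after the (near-instant) ZFS snapshot is taken.
    plan virsh domfsthaw "$VM"
    ;;
esac
```

Note this only gives filesystem-level consistency inside the guest; a database can still have state that only lives in its own buffers.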

Wouldn’t it be simpler to use libvirt’s manual snapshot feature[3], saving the memory file alongside the zvol, in a pre-script, and virsh resume in a post-script? I guess the advantage of a temporary external snapshot over a manual one is that it limits how long the VM stays paused, as sanoid takes up to 10 seconds to snapshot all datasets. But with some finer-grained configuration I could pause-snapshot-resume each VM individually with its respective datasets and snapshot shared datasets separately. I don’t share datasets between VMs yet, but that will certainly happen; I have yet to figure out how I want it to work.
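The pause-save-resume variant described above could look roughly like this. It is only a sketch with made-up names (`podman-vm`, `/tank/vmstate`); one detail worth noting is that `virsh save` stops the domain after writing RAM to disk, so `virsh restore` rather than `virsh resume` is its counterpart. Again `RUN=echo` makes it a dry run that only records what it would execute.

```shell
#!/usr/bin/env bash
# "Manual snapshot" variant (sketch): save full VM state to the pool,
# let sanoid snapshot it together with the zvol, then restore.
# Made-up names: VM "podman-vm", state directory /tank/vmstate.
set -eu

VM=${VM:-podman-vm}
STATE_DIR=${STATE_DIR:-/tank/vmstate}
RUN=${RUN:-echo}              # dry run by default; set RUN= (empty) to execute
PHASE=${SANOID_SCRIPT:-pre}

planned=""                    # record of planned commands, ';'-separated
plan() { planned="$planned$*;"; $RUN "$@"; }

case "$PHASE" in
  pre)
    # Keep the domain XML next to the state file so a single ZFS
    # snapshot holds everything needed to rebuild the VM.
    plan sh -c "virsh dumpxml $VM > $STATE_DIR/$VM.xml"
    # virsh save writes RAM + device state to a file and stops the
    # domain, leaving the zvol fully quiescent for the snapshot.
    plan virsh save "$VM" "$STATE_DIR/$VM.state"
    ;;
  post|prune)
    # restore (not resume) is the counterpart of save: the VM comes
    # back exactly where it left off.
    plan virsh restore "$STATE_DIR/$VM.state"
    ;;
esac
```

The downtime here is dominated by writing the guest's RAM to disk, which is the trade-off against the shorter pause of the external-snapshot approach.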

My last issue with the blog post[1] is that I completely don’t get what the deal is with RAM_BACKUP. Is it saved so it can be reconstructed into a live snapshot along with the ZFS snapshot of the disk image?

Disadvantages

It seems that using zvols and ZFS for snapshotting means giving up the easy on-demand snapshot functionality in virt-manager and cockpit-machines. They simply refuse when anything other than qcow2 is attached. I can imagine why the people behind Proxmox chose to create a custom solution. But it can most likely be scripted as a workaround.

[1] Consistently backup your virtual machines using libvirt and zfs - part 1
[2] Reshuffling ZFS pool storage on the fly
[3] libvirt: Snapshots

I’ve done this in the past with MySQL by issuing a FLUSH TABLES WITH READ LOCK statement prior to the snapshot, then taking the snapshot and unlocking the tables with UNLOCK TABLES right after. This flushes all changes to disk and then locks all tables in the database to prevent further changes during the snapshot. MySQL’s manual even mentions ZFS as a recommended use case for this: https://dev.mysql.com/doc/refman/8.0/en/flush.html#flush-tables-with-read-lock
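A sketch of that pattern in shell, with the one subtlety worth spelling out: FLUSH TABLES WITH READ LOCK only holds while the client session that issued it stays open, so the snapshot must happen while that session is alive (two separate `mysql -e` invocations would drop the lock in between). The dataset name is made up; by default `MYSQL=cat`, so the script only records the SQL and the snapshot command instead of touching a real server or pool.

```shell
#!/usr/bin/env bash
# FLUSH TABLES WITH READ LOCK around a ZFS snapshot (sketch).
# The lock dies with the client session, so one session must stay open
# across the snapshot. Dry run by default: MYSQL=cat records the SQL
# in a temp file, and the snapshot command is recorded, not executed.
set -eu

DATASET=${DATASET:-tank/containers/mysql}   # made-up dataset name
MYSQL=${MYSQL:-cat}                         # real use: MYSQL="mysql -u root -p..."
SQLLOG=$(mktemp)

fifo=$(mktemp -u)
mkfifo "$fifo"
$MYSQL < "$fifo" > "$SQLLOG" &    # one long-lived client session
exec 3> "$fifo"                   # fd 3 feeds SQL into that session

echo "FLUSH TABLES WITH READ LOCK;" >&3    # flush dirty pages, block writes
snapcmd="zfs snapshot $DATASET@pre-backup" # real use: run this while locked
echo "UNLOCK TABLES;" >&3                  # release once the snapshot exists
exec 3>&-                                  # closing the fifo ends the session
wait                                       # let the client drain its output
rm -f "$fifo"
```

For real use, point `MYSQL` at the actual client and execute the recorded `zfs snapshot` command between the two statements, keeping the lock window as short as possible.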

However, that of course only helps with MySQL. You’d have to check your database application to see if there is a similar way to do this efficiently.

Outside of databases it will likely vary even more, as the goal is either to hook into the application and tell it to stop making writes, or to do something similar with the OS itself. Backup agents typically have some functionality for this.

The ultimate answer is that you’re asking for application consistent snapshots, which are going to require some kind of integration with wherever the application is running to properly quiesce the disk(s) before the snapshot is taken. The easiest (from the admin’s perspective) way to do this is what you mentioned above with just shutting the VM down.
