Layers of backing up VMs

Say you’ve got a hypervisor (Proxmox, XCP-ng) running a VM, which in turn runs services and Docker containers. You can snapshot and back up the virtual disk to a NAS. You can run Timeshift to back up the OS in the VM. You can use BorgBackup to grab your service’s data, maybe even the Docker Compose files.

How deep do you go? Is there such a thing as having too much redundant data? What do you think is the most practical “depth” to reach to make restoration easy without having to manage too many layers?

I’m intentionally being vague because I want to hear opinions for different situations. A homelab is very different than a datacenter, and I don’t know what the “standard” backup strategy is for either. My own setup involves a homelab proxmox cluster backing up all of the above (disk images, docker compose files, etc) to my NAS, backing up to another NAS off-site. I always think “but what if I don’t want to restore to a hypervisor later and instead need to spin up the containers on a bare metal machine as fast as possible”, as unlikely as that scenario is. I can’t help but feel much of it is a waste of storage.

  1. Make a list of all the things you want/need to back up.
  2. Implement your backup strategy.
  3. Try to restore every item from the list in step 1. If any item fails, revise or enhance your backup strategy and start again.
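One way to make step 3 repeatable is to record a checksum manifest when the backup is taken and compare a test restore against it. This is only a minimal sketch: `build_manifest` and `failed_items` are made-up helper names, and it assumes the backed-up items are plain files under one directory, which won't cover disk images or databases that need their own consistency checks.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum at backup time."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def failed_items(manifest: dict[str, str], restored_root: Path) -> list[str]:
    """Return the manifest entries that are missing or altered after a restore."""
    failures = []
    for relative, expected in manifest.items():
        candidate = restored_root / relative
        if not candidate.is_file() or sha256_of(candidate) != expected:
            failures.append(relative)
    return failures
```

An empty list from `failed_items` means everything on the list came back intact; anything else is the cue to go back to step 2.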

in a home lab it is mostly ‘appetite for risk vs budget authority (WAF)’

WAF is the good ol Wife Acceptance Factor, just in case someone does not know.

so in the event of a disaster, and a need to recover something usable quickly, statements like ‘what if i want…’ really do not matter. you WILL do what is needed to recover quickly. if that is a NUC with Proxmox because you found the hardware at a garage sale, then so be it.

for my home lab i have a ZFS clone of my user storage, and VM dumps that i grab monthly. that is all. i have had to recover once.

at my datacenter at work, i have far more.

You can treat your VMs as physical systems and do backups from inside the OS if you want to. Entirely practical for a small number of VMs. You’ll need to log a little bit of additional info though, like disk sizes (dump the partition layout), how much memory & CPU is allocated and which networks it was connected to, in order to allow quickly recreating the VM. Slower to restore, and gets pretty unwieldy with a large number of VMs.
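The extra info mentioned above can live in a small JSON file stored alongside the in-guest backup. A rough sketch, assuming you fill the values in by hand or from your hypervisor's own tooling; `VMSpec` and its field names are invented for illustration and don't correspond to any Proxmox or XCP-ng format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class VMSpec:
    """Hypothetical record of what is needed to recreate an empty VM."""
    name: str
    vcpus: int
    memory_mib: int
    disks_gib: dict[str, int] = field(default_factory=dict)  # disk label -> size
    networks: list[str] = field(default_factory=list)        # bridge/network names


def dump_spec(spec: VMSpec) -> str:
    """Serialise the spec to stable, human-readable JSON."""
    return json.dumps(asdict(spec), indent=2, sort_keys=True)


def load_spec(text: str) -> VMSpec:
    """Rebuild the spec from JSON written by dump_spec."""
    return VMSpec(**json.loads(text))


if __name__ == "__main__":
    # Example values only; replace with the real allocation of your VM.
    print(dump_spec(VMSpec("docker-host", 4, 8192, {"scsi0": 64}, ["vmbr0"])))
```

Keeping this file in the same backup set means the numbers are there when you need to recreate the VM shell before restoring into it.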

Generally the better option is to back up the VMs and rely on that. Will have all your data in there. Very quick to restore. Should be possible to extract select files from the image if required, etc.

The key to backups isn’t how you copy the data, it’s making sure you’ve got all of it, making sure you do it often enough, ensuring you’ve got enough copies and that they’re not all in one vulnerable location, that you can both back up and restore in a reasonable time, and double and triple checking that all of this is true and will work when you really need it.
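Some of those checks can be written down as a tiny audit you run on a schedule. This is only a sketch: the `BackupCopy` record, the 7-day freshness window and the two-copy minimum are assumptions picked for the example, not recommendations from anyone in this thread.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BackupCopy:
    """Hypothetical description of one copy of a backup."""
    location: str          # e.g. "nas-home" or "nas-offsite"
    offsite: bool          # True if it lives away from the primary site
    taken_at: datetime     # when this copy was last refreshed
    restore_tested: bool   # True if a restore from it has been verified


def audit(copies: list[BackupCopy], now: datetime,
          max_age: timedelta = timedelta(days=7),
          min_copies: int = 2) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    fresh = [copy for copy in copies if now - copy.taken_at <= max_age]
    if len(fresh) < min_copies:
        problems.append(f"only {len(fresh)} fresh copies, want {min_copies}")
    if not any(copy.offsite for copy in fresh):
        problems.append("no fresh off-site copy")
    if not any(copy.restore_tested for copy in copies):
        problems.append("no copy has a verified restore")
    return problems
```

It doesn't measure backup or restore time, so that part still needs an actual timed restore.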

If you’re using Proxmox, Proxmox Backup Server has deduplication.
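To show why deduplication takes the sting out of overlapping backups, here is a toy chunk store. It is not how Proxmox Backup Server is implemented; it just splits data into fixed-size chunks keyed by their SHA-256 hash, with an unrealistically small chunk size so the effect is visible, and `ChunkStore` is a made-up name.

```python
import hashlib


class ChunkStore:
    """Toy content-addressed store: identical chunks are kept only once."""

    def __init__(self, chunk_size: int = 4) -> None:
        self.chunk_size = chunk_size
        self.chunks: dict[str, bytes] = {}

    def put(self, data: bytes) -> list[str]:
        """Store data and return the ordered list of chunk hashes (the index)."""
        index = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            key = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(key, chunk)  # no-op if already stored
            index.append(key)
        return index

    def get(self, index: list[str]) -> bytes:
        """Reassemble the original data from its index."""
        return b"".join(self.chunks[key] for key in index)

    def stored_bytes(self) -> int:
        """Total bytes actually held, after deduplication."""
        return sum(len(chunk) for chunk in self.chunks.values())
```

Two backups that share most of their chunks only cost the space of the chunks that differ, which is why keeping several layers of similar data can be less wasteful than it looks.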