KVM virtual machine replication

I want to replicate the Hyper-V virtual machine replication feature using KVM. I am hoping someone can list which services/projects/components I need and where in the stack each one sits, so I can learn them in the right order and relate them to each other. I would also like to understand each component's purpose and how it directly affects the others for the purpose of KVM virtual machine replication.

I've seen mention of DRBD, but also clustered LVM, and both of those apparently need GFS for a reason I don't understand. Can I still temporarily migrate the VM to the server I replicate to? Does migrating move the storage? Is KVM aware that the storage is replicated and does not need to be moved along with the migration?
I read that you need Pacemaker and Corosync to detect when a resource is offline. If I don't intend to use high availability, and will instead start the VM manually if a server goes offline, do I need Pacemaker and/or Corosync?
So many questions, and it is difficult to find enough information to know what to do: which services I should use and which I should not, since some of them do almost the same thing…

Please help me reduce the immense amount of reading and sorting of information. It's a big topic and I only have evenings and weekends to learn and figure it out.


So let's get started! For HA on KVM you need some kind of shared storage (Ceph, Gluster, DRBD, NFS, iSCSI, etc.). As far as I'm aware there is no equivalent of Storage vMotion, which obviously means that every server that should be able to run the VM needs access to the VM's storage in some way.
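As a concrete example of the "shared storage" idea, this is roughly how the same NFS-backed libvirt pool could be defined on every host that should be able to run the VM. A hedged sketch: the server name, export path, and pool name are placeholders, not from this thread.

```shell
# Define an identical NFS-backed storage pool on each KVM host,
# so every host sees the same disk images at the same target path.
virsh pool-define-as vmpool netfs \
    --source-host nfs-server \
    --source-path /export/vmstore \
    --target /var/lib/libvirt/vmstore
virsh pool-start vmpool
virsh pool-autostart vmpool
```

With the pool identical on both hosts, a migrated VM finds its disk at the same path on the destination.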

You could check out oVirt, which has HA and all the bells and whistles Hyper-V does. (Proxmox would also work, I guess.)

Corosync and Pacemaker are, as far as I know, just there to check which servers in a cluster are alive and well. So just having them doesn't make failover work. Making all of this work on your own is just too much; you should use something like oVirt or Proxmox! But even a normal virt-manager setup with libvirt can do live migrations if you add shared storage to it.
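For reference, once shared storage is in place, a plain libvirt live migration is a single command. A hedged sketch: the guest name "deb" and the hostname "host2" are placeholders.

```shell
# Live-migrate the running guest "deb" to host2 over SSH.
# Because both hosts see the same shared storage, only RAM and
# device state move across the wire; the disk image stays put.
virsh migrate --live --persistent deb qemu+ssh://host2/system
```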

Did I miss something?


As far as I can tell oVirt uses glusterfs, so I’ll use that for the replication.

Yep, they do. QEMU even seems to have an integration to directly access a Gluster volume via libgfapi, which speeds up drive access a lot.
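Assuming a QEMU build that includes the Gluster block driver, the libgfapi path looks like this. A hedged sketch: the node, volume, and image names are placeholders.

```shell
# Create and inspect a qcow2 image directly on a Gluster volume via
# libgfapi (gluster:// protocol), bypassing the FUSE mount entirely.
# Requires qemu-img built with Gluster support.
qemu-img create -f qcow2 gluster://node1/vmstore/disk0.qcow2 20G
qemu-img info gluster://node1/vmstore/disk0.qcow2
```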

Actually, I have to amend my statement about live storage migration: KVM/QEMU can totally do that! I just did it on Proxmox.

I just installed oVirt on a couple of VMs on my workstation. The WebUI looks purdy.

Sad day. I have to use 3 nodes with Gluster :(

If it is possible to do with GlusterFS, I’ll have to manage it manually I guess.

Trying to learn GlusterFS, but I can't figure out why I can't create volumes. Typical: guides don't work… :(

You actually don't have to! If you click on the self-hosted engine deployment, you can click on the single-node Gluster install. I didn't see that straight away either.

Just make sure to have a second HDD with more than 61 GB of space. You'll need to enable it in LVM, because the stupid installer blacklists every drive that is not the boot drive…

# vi /etc/lvm/lvm.conf
...
filter = ["a|^/dev/disk/by-id/lvm-pv-uuid-PFcZNx-pF4f-ryXM-6jKc-2MFF-LHwB-Ldiry6$|", "a|/dev/sdb|", "r|.*|"]
...

The important thing to note here is the "r|.*|", which blacklists everything else; removing it would make every drive visible. But since the installer created that, I didn't feel like changing it, so I instead opted to whitelist the drive for Gluster. The accept entry has to go before the "r|.*|", because the filter patterns are evaluated in order and the first match wins. Please use a /dev/disk/by-id path like the installer's entry, so you don't have to worry about the drive name changing at some point.
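To find a stable by-id alias for the second drive, you can list the persistent names that point at it. A hedged sketch: /dev/sdb is an example device name, not necessarily yours.

```shell
# Show which /dev/disk/by-id/ aliases (wwn-..., ata-..., etc.) resolve
# to /dev/sdb, then use one of them in the lvm.conf whitelist.
ls -l /dev/disk/by-id/ | grep -w sdb
```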

Now everything should deploy nicely!

To get HA to work you could add another node with the option to run the hosted-engine (!) and replicate the Gluster setup on that node as well. I don't know if this is going to work, though.

What I instead did on a testbed was deploy a three-node cluster with an arbiter setup. Since I used VMs with macvtap interfaces (yeah, stupid me didn't think about the fact that with macvtap the VM can't communicate with the other nodes), the hosted-engine deployed fine and the Gluster volume was working great. But since the communication didn't work I couldn't add the other two nodes, and I was basically left with a single-node setup.

But what I originally wanted to say is that I was able to change the arbiter node to a Pi 4. Gluster was doing fine and the hosted-engine did not complain. So that might be an option, but don't expect any great results… the latency of the NIC is just awful. I'm still waiting for my Xeon (it's coming from China…), and when it finally arrives I'll shift my homelab from Proxmox to oVirt.

I couldn't get anything to work with the oVirt project. I also don't have the 3 nodes its replication requires.

I am instead setting up manually using Debian + GlusterFS + cockpit-machines.
I am stuck, though: I can't seem to create .qcow2 files in the virsh pool. I get the following error:

ERROR Couldn't create storage volume 'deb.qcow2': 'this function is not supported by the connection driver: storage pool does not support volume creation'

My command:
virt-install -n deb --memory 1024 --vcpus 1 \
    --cdrom /glusterfs/iso/debian-10.8.0-amd64-netinst.iso \
    --disk pool=default,size=20,format=qcow2 \
    --network bridge=brSwitchVM \
    --graphics type=spice --virt-type kvm --video qxl --channel spicevmc
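That error usually means the pool type behind "default" doesn't implement volume creation in libvirt. One workaround, assuming the Gluster volume is FUSE-mounted under /glusterfs (the vmstore subdirectory below is a placeholder), is to pre-create the image yourself and point virt-install at the file:

```shell
# Create the qcow2 on the FUSE mountpoint directly, then hand
# virt-install an existing file instead of asking the pool to
# allocate a volume.
qemu-img create -f qcow2 /glusterfs/vmstore/deb.qcow2 20G
virt-install -n deb --memory 1024 --vcpus 1 \
    --cdrom /glusterfs/iso/debian-10.8.0-amd64-netinst.iso \
    --disk path=/glusterfs/vmstore/deb.qcow2,format=qcow2 \
    --network bridge=brSwitchVM \
    --graphics type=spice --virt-type kvm --video qxl --channel spicevmc
```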

I'm replacing the Debian install with an Ubuntu Server install to see if the QEMU package from Ubuntu has Gluster support.

Still the same thing! If I try to create a file using qemu-img, that fails as well:
qemu-img create -f qcow2 gluster://localhost/vmstore/ubuntu1-root.qcow2 40G

qemu-img: gluster://localhost/vmstore/ubuntu1-root.qcow2: Unknown protocol 'gluster'
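That "Unknown protocol" error indicates qemu-img was built without the Gluster block driver. On Debian/Ubuntu the optional block drivers ship in a separate package; a hedged sketch, assuming a recent release where that package is named qemu-block-extra and modules live under the usual multiarch path:

```shell
# Install the optional QEMU block backend modules (gluster, rbd,
# iscsi, ...) and check that the gluster module is now present.
sudo apt install qemu-block-extra
ls /usr/lib/x86_64-linux-gnu/qemu/block-gluster.so
```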

Related? Or rather still true? GlusterFS support for Libvirt, Qemu, Samba & TGT in Ubuntu - Just another IT(?) blog

Fedora or openSUSE seem to have the required packages: https://pkgs.org/download/qemu-block-gluster

What problems did you face? I mean, if you click on the Gluster setup you can set up oVirt on a single host as well.

If I do a single-host Gluster setup I can only do distributed, not replicated. I'll have to re-install oVirt and try again to get the error log.

Ah okay, yeah, that makes sense. So no need to recreate it. Someone on the forum told me that he always creates his Gluster volumes manually before running the installer. If you do that, I'm sure you can achieve what you want.
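Creating the volume by hand before running the installer would look roughly like this. A hedged sketch: the hostnames, volume name, and brick paths are placeholders.

```shell
# Build a replica-3 volume where the third brick is an arbiter
# (metadata only), then start it so the oVirt installer can
# simply consume the existing volume.
gluster peer probe node2
gluster peer probe arb1
gluster volume create vmstore replica 3 arbiter 1 \
    node1:/gluster/bricks/vmstore \
    node2:/gluster/bricks/vmstore \
    arb1:/gluster/bricks/vmstore
gluster volume start vmstore
```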

Since the installer just runs the Gluster Ansible script, you should also be able to install with an arbiter node running as a CentOS VM with the gluster7 repo enabled. But, as you probably know, that comes with its own set of problems…

You can use oVirt without Gluster, but then you need some other external storage system. I'm using an external NFS server, and HA/replication work just fine.
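For reference, an NFS export that oVirt can use needs specific ownership. A hedged sketch: the export path is a placeholder, and 36:36 is the vdsm:kvm uid/gid that oVirt's storage documentation calls for on NFS storage domains.

```shell
# Prepare a directory as an oVirt NFS storage domain: oVirt's vdsm
# user (uid 36, gid 36) must own the export or attachment will fail.
mkdir -p /export/vmstore
chown 36:36 /export/vmstore
echo '/export/vmstore *(rw,sync,no_subtree_check)' >> /etc/exports
exportfs -ra
```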

That would be ideal. All I have is two servers, so I am trying to make that work.


I'm not 100% sure, but you might need 3 oVirt nodes for HA to work correctly, at least for the HostedEngine (the VM that controls the other VMs). So with NFS in that setup you would need a total of 4 physical servers. But they don't have to be big: an RPi 4 with a hard drive could technically be your NFS server, though it would probably be a little too slow.

If I set up a desktop PC in the server rack and use that as an arbiter, that could work. Definitely not ideal, but a 2-node cluster without an arbiter is not ideal either, I suppose.

My plan was to have the servers directly connected to each other (no switch in between). That is a bit complex with 3 servers, but I figure that if I bridge 2 interfaces on each server (with STP enabled) and connect them in a ring, I can still achieve direct connectivity.
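The per-server bridge described above could be sketched with iproute2 like this. A hedged sketch: the interface names are placeholders; STP is what prevents the 3-server ring from becoming a switching loop.

```shell
# Bridge two NICs with spanning tree enabled, so each server can
# sit in a ring of direct links without creating a loop.
ip link add br0 type bridge stp_state 1
ip link set enp1s0 master br0
ip link set enp2s0 master br0
ip link set enp1s0 up
ip link set enp2s0 up
ip link set br0 up
```

The same could be made persistent in the distribution's network config once it works.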

Note: I have a 4-port 1G Ethernet card (VM and host connectivity) and a dual-port 10G SFP+ card (for storage replication) in each of the two servers. I think I have another 10G SFP+ card I can pop into a desktop PC for this scenario.

Alright, new plan!
