VFIO Gaming Tip: Multi-GPUs / Multi-Users / "Shared Storage" VMs

Hello All,

I came across an interesting VFIO Gaming use case that I figured I would share - along with my solution which I hope can help some others here…

This is a rather unique and ‘edge’ use case that I doubt will show up too frequently.

Use Case:
I am building a new two-user / two-GPU gaming rig for my better half and myself; however, I wanted to ensure that each of us could independently choose which GPU to use. My partner would not want to monkey around with attaching / detaching PCI devices on their VM, so this would leave us with two VMs per person (and, more importantly, two different OS installations, maybe even two copies of each game)… not great.

Addressing the problem…
This got me thinking… What if I create not one but two VMs per person, but with an odd twist… carbon-copy configurations and shared storage image files. :face_with_open_eyes_and_hand_over_mouth: Essentially, think of it like this… Alice-VM-4090 and Alice-VM-3080 are clones (even down to the MAC address on the vNIC). The one difference between them is Alice-VM-4090 and Alice-VM-3080 have two different GPUs attached (RTX 4090 and RTX 3080 respectively).

Now if I were to make Alice-VM-3080 use the same disk image files as Alice-VM-4090, I would need only one OS installation and a single installation of each game per person. Practically, a simple way to do this is to delete the disk image files created with the cloned VM (Alice-VM-3080) and replace them with symbolic links to the disk image files of the original machine (Alice-VM-4090).
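For anyone who would rather script the symlink step than do it by hand, here is a minimal sketch in Python. The function name `share_disk_image` and the file names are hypothetical (they are not from my setup), and real image paths depend on your storage layout. Only do this while both VMs are powered off.

```python
import tempfile
from pathlib import Path


def share_disk_image(original: Path, clone: Path) -> None:
    """Replace the clone's disk image with a symlink to the original image."""
    if not original.is_file():
        raise FileNotFoundError(f"original image not found: {original}")
    if clone.is_symlink() and clone.resolve() == original.resolve():
        return  # already linked to the original, nothing to do
    # Remove the clone's own image (or a stale/broken link) before linking.
    if clone.exists() or clone.is_symlink():
        clone.unlink()
    clone.symlink_to(original.resolve())


if __name__ == "__main__":
    # Demonstration in a throwaway directory with placeholder file names.
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "alice-vm-4090.qcow2"
        clone = Path(tmp) / "alice-vm-3080.qcow2"
        original.write_bytes(b"original image")
        clone.write_bytes(b"cloned image")
        share_disk_image(original, clone)
        print(clone.is_symlink(), clone.read_bytes())
```

Calling the function a second time is a no-op, so it is safe to rerun after recreating a clone.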

Great idea… but there is a problem… Unlike with physical hardware devices that are passed through to a VM, KVM/Qemu will not stop the execution of a virtual machine that is attempting to use the same storage device (disk image file) as another running virtual machine. This can very easily lead to data corruption if both machines are somehow powered on (using the same storage image) at the same time. I figured that, for my build, this was far too big a risk to leave to “trusting that only one of each pair of VMs would be running at a given time”, so… what to do?

The Solution:
Broad-stroke: Qemu Guest Locking and Hookscripts - both very well documented.

To solve this problem, I created a hookscript (in Perl) that each of these machines will use - link to code shared below. Qemu, being as nice and cuddly as it is, allows hook scripts to automate functions in four different phases of VM operation: pre-start, post-start, pre-stop, and post-stop. Combining this with the ability to lock machines via the Qemu shell, we have a rather elegant solution. Essentially, whenever Alice-VM-4090 starts, Alice-VM-3080 is locked via Qemu - and vice-versa.
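To illustrate the phase handling, here is a rough sketch in Python (my actual script is Perl, and this is not a port of it). It assumes the hookscript receives the VM ID and the phase name as its two arguments, and that locking happens on pre-start and unlocking on post-stop; the action names are made up for the example.

```python
# The four hook phases named above.
PHASES = ("pre-start", "post-start", "pre-stop", "post-stop")


def action_for_phase(phase: str) -> str:
    """Map a hook phase to what the script should do with the sibling VMs."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    if phase == "pre-start":
        return "lock-siblings"      # block the clones before this VM boots
    if phase == "post-stop":
        return "unlock-siblings"    # release the clones once this VM is off
    return "noop"                   # post-start and pre-stop need no action


if __name__ == "__main__":
    # A real hookscript would read the VM ID and phase from its arguments
    # (assumed calling convention); here we simply walk through all phases.
    for phase in PHASES:
        print(f"{phase}: {action_for_phase(phase)}")
```

Keeping the phase-to-action mapping in one small function makes it easy to see that only two of the four phases actually touch the locks.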

About the Script
Link: KVM/Qemu Storage Collision Control
I just posted this moments ago, so please forgive me for not providing a complete write-up as of yet…

I tried to keep this script pretty simple. There is a dynamic data structure called %Families that can be configured to suit your needs. The script takes the ID of the VM that is starting and locks (qemu clone lock) the cloned / associated VM(s) before the requested VM starts. When the running VM is powered off, the lock is removed from the cloned / associated VM(s).
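As a rough picture of the idea behind %Families, here is a Python sketch rather than the Perl from the script itself. The family names and VM IDs are placeholders. The commands assume a Proxmox VE host where the `qm` CLI provides `qm set <vmid> --lock clone` and `qm unlock <vmid>`; adjust if your tooling differs. The sketch only builds the command lists and does not run them.

```python
from typing import Dict, List

# Hypothetical families: each entry groups the VM IDs that share disk images.
FAMILIES: Dict[str, List[int]] = {
    "alice": [101, 102],   # e.g. Alice-VM-4090 and Alice-VM-3080
    "bob": [201, 202],
}


def siblings(vmid: int, families: Dict[str, List[int]] = FAMILIES) -> List[int]:
    """Return the other VM IDs in the same family as vmid (empty if none)."""
    for members in families.values():
        if vmid in members:
            return [m for m in members if m != vmid]
    return []


def lock_commands(vmid: int) -> List[List[str]]:
    """Commands to run before vmid starts (assumes the Proxmox VE qm CLI)."""
    return [["qm", "set", str(s), "--lock", "clone"] for s in siblings(vmid)]


def unlock_commands(vmid: int) -> List[List[str]]:
    """Commands to run after vmid has stopped (assumes the Proxmox VE qm CLI)."""
    return [["qm", "unlock", str(s)] for s in siblings(vmid)]


if __name__ == "__main__":
    print(lock_commands(101))
    print(unlock_commands(101))
```

A VM that belongs to no family gets an empty command list, so the same hookscript can be attached to unrelated VMs without side effects.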

I hope this can be of use to others who are playing around with some unique use cases for VFIO.

Best,
AX

I don’t get the use case :wink:
What happens if you both want to play at the same time, or are you essentially hot-seating the PC? If so, I don’t understand why you need a two-GPU solution, or even a VM, in the first place (the storage being essentially the same for all VMs).

It would make somewhat more sense to me if you or your partner were using the host OS (if it is a desktop OS) while the other one is gaming on the VM in your example, or if the second GPU is used for “something else” (compute tasks, for example) and gets bumped up and down (4090 vs. 3080) depending on the workload and the games being played.

Another solution to the “multiple copies of x” would be deduplication if the host’s file system supports it. But this may not be feasible for gaming (depending on your setup).

When you put two gaming OSes into one box, you also create a dependency, which is sometimes almost a deal breaker. It means that whenever you have to tinker with the host rig, your very important half cannot play games.

Build a separate PC, decorate it with full RGB (yeah, very important), and your partner will appreciate it.