Is suspend to RAM or disk feasible with GPU passthrough?

qpok · February 17, 2023, 6:43pm

I’m in the process of building my first VFIO machine, using QEMU/libvirt. My host boots headless and I’m running my Linux desktop as a GPU accelerated guest. That’s all working nicely - so far so good!

If it makes any difference to this, I’m currently running Nvidia 970 and 770 cards for VFIO on an ASRock B650 LiveMixer (AMD AM5).

I’d like to be able to sleep my physical machine, which means suspending the running guest to either RAM or disk first, but my initial attempts with this haven’t worked. The guest often doesn’t suspend fully and most of the time I need to reboot the host to make the VFIO GPU usable again. In fact, I’ve been unable to get a VM to reliably suspend (and resume) to disk or RAM either with or without passing through a GPU.
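For anyone following along, here is a minimal sketch of the sequence described above: suspend the guest, sleep the host, then bring the guest back. It assumes libvirt's `virsh` CLI is available on the host and that the QEMU guest agent is running inside the guest (which `dompmsuspend` relies on). The domain name `desktop` is hypothetical, and the script only prints the commands unless `dry_run=False` is passed, so treat it as an illustration rather than a tested recipe.

```python
import shlex
import subprocess

VALID_TARGETS = ("mem", "disk", "hybrid")  # targets accepted by `virsh dompmsuspend`


def guest_suspend_cmd(domain, target="mem"):
    """Build the virsh command asking the guest to suspend itself."""
    if target not in VALID_TARGETS:
        raise ValueError(f"unknown suspend target: {target!r}")
    return ["virsh", "dompmsuspend", domain, target]


def guest_resume_cmd(domain, target="mem"):
    """Build the command that brings the guest back.

    Assumption: a guest suspended to disk ends up shut off, so it is
    started again; a guest suspended to RAM is woken with dompmwakeup.
    """
    if target == "disk":
        return ["virsh", "start", domain]
    return ["virsh", "dompmwakeup", domain]


def sleep_plan(domain, target="mem"):
    """Ordered commands: suspend the guest, sleep the host, resume the guest."""
    return [
        guest_suspend_cmd(domain, target),
        ["systemctl", "suspend"],  # host sleep; timing of its return may vary
        guest_resume_cmd(domain, target),
    ]


def run_plan(plan, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for cmd in plan:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "desktop" is a hypothetical libvirt domain name.
    run_plan(sleep_plan("desktop", "mem"), dry_run=True)
```

In practice the resume step would probably live in a systemd sleep hook or similar, so that it runs once the host has actually woken up, rather than immediately after `systemctl suspend` returns.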

On one hand, perhaps I shouldn’t really be all that surprised, since suspend and resume on Linux have always been fairly hit or miss for me, but my intuition told me that it would be more reliable with virtual machines, where a good chunk of the hardware is defined in software. Seems my intuition missed the mark!

What are your experiences with suspending guests (Linux in particular)? Is this avenue worth pursuing for me? Or, since my real problem is wanting to suspend the physical machine, maybe my time would be better spent figuring out how to persist and restore my desktop/windowing layout between reboots? I welcome any thoughts.

FWIW, I have seen reports of people successfully hibernating Windows VMs. How much of this is down to the luck of particular hardware combinations is not clear.

vic · February 19, 2023, 6:09pm

I briefly tried hibernating a Windows 10 VM and resuming from it. It worked reliably the few times I tried it with GPU passthrough, many months ago.

I can’t recall whether the Windows 10 VM resumed successfully after I additionally suspended and resumed the Linux host. Knowing myself, I believe the Win10 VM didn’t fail, 'cos I’m quite sure I’d remember a failure more clearly than a success. So I think it does work for Win10.

Anyway, I no longer have Win10, only Win11. Both tests failed. I’m not surprised at all though, since Win11 seems to have a few bugs to iron out in virtualization, from experience.

macOS failed miserably, but that’s a different story. I’ve never tried a Linux VM with passthrough.

Actually… why do you think it’s better to run your Linux desktop as a VM instead of directly on the host? I don’t see any benefit.

qpok · February 20, 2023, 11:45am

Thanks a lot for the info! So many moving parts with all of this stuff means making clear sense of it all is tough. Definitely a bit of luck involved!

Mostly for braindead simple snapshots and rollbacks, although I suppose I could achieve that a different way too.

Something about being able to spin up experimental OSes for my desktop feels freeing too (or at least I imagine it will when I get it fully working!). Not that there was really anything stopping me from doing that by multi-booting, but it was awkward enough that I never really bothered.

vic · February 20, 2023, 4:50pm

I think you’re taking the right approach: run something as a VM when you can’t run it on bare metal. In the case of your main Linux desktop, though, it’s better to run it on the host.

Unlike other hypervisors, Linux KVM lets you run a Linux desktop on the host alongside your VMs. That’s a very nice feature IMO. You could still run other guest OSes or experimental Linux desktops/servers as VMs.

qpok · February 21, 2023, 4:21pm

Aye, that’s definitely my fallback if things don’t work out as planned!

Something I didn’t mention in my last reply is that simplifying my machines is a big draw, and I’m keen to reduce the responsibilities of each machine as much as possible, ultimately to make them more understandable. Of course, this type of simplification at the machine level only happens by introducing a whole additional layer of virtualisation (increasing the complexity), so it will only turn out to be a net win if the layer that enables all that tasty isolation turns out to be fairly stable and keeps out of the way once it’s set up.

qpok · March 25, 2023, 10:37am

In case anyone stumbles upon this wondering the same thing, the answer is yes - suspend to RAM and disk are both possible (although I hardly ever use suspend to disk).

I’ve been experimenting with nouveau and the binary drivers, as well as with an AMD RX 6400, in Linux guests for a few weeks, and as far as I can tell things are just as reliable (or not) as suspending on a non-VFIO setup. Since debugging an issue unrelated to VMs, suspend and resume on VFIO guests and the host have been working without issue for me.