I have a Linux QEMU VM (Linux host on an AMD 5950X, OVMF firmware) with an Nvidia GPU and a USB controller passed through to it, and I want to hibernate it the usual way, from inside the guest.
I’ve set up a swap disk for it, added the resume kernel parameter and so on, and hibernating through systemd works just fine.
QEMU quits, I restart it, and there is roughly a 50% chance that the VM resumes and works fine, GPU and all.
This post is about the other 50%, when it fails to resume and does a clean boot.
The error is:
[ 9.860061] Hibernate inconsistent memory map detected!
[ 9.860062] PM: hibernation: Image mismatch: architecture specific data
[ 9.860065] PM: hibernation: Read 13405188 kbytes in 0.01 seconds (1340518.80 MB/s)
[ 9.860960] PM: Error -1 resuming
[ 9.860963] PM: hibernation: Failed to load image, recovering.
[ 9.861363] PM: hibernation: Basic memory bitmaps freed
After some googling I found that this happens because the e820 memory map differs slightly between boots, for reasons I don’t understand.
Boot #1:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009ffff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000007fffff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000000800000-0x0000000000807fff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x0000000000808000-0x000000000080ffff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000000810000-0x00000000008fffff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x0000000000900000-0x000000007df09fff] usable
[ 0.000000] BIOS-e820: [mem 0x000000007df0a000-0x000000007df0afff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000007df0b000-0x000000007e8b4fff] usable
[ 0.000000] BIOS-e820: [mem 0x000000007e8b5000-0x000000007e8b8fff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x000000007e8b9000-0x000000007e8bafff] ACPI data
[ 0.000000] BIOS-e820: [mem 0x000000007e8bb000-0x000000007e8c2fff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x000000007e8c3000-0x000000007e8dafff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000007e8db000-0x000000007e8fafff] usable
[ 0.000000] BIOS-e820: [mem 0x000000007e8fb000-0x000000007e91afff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000007e91b000-0x000000007f99afff] usable
[ 0.000000] BIOS-e820: [mem 0x000000007f99b000-0x000000007f9cafff] type 20
[ 0.000000] BIOS-e820: [mem 0x000000007f9cb000-0x000000007f9f2fff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000007f9f3000-0x000000007f9fafff] ACPI data
[ 0.000000] BIOS-e820: [mem 0x000000007f9fb000-0x000000007f9fefff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x000000007f9ff000-0x000000007fe5ffff] usable
[ 0.000000] BIOS-e820: [mem 0x000000007fe60000-0x000000007fe7ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000007fe80000-0x000000007fffffff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000ffe00000-0x00000000ffffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x0000000e7fffffff] usable
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] efi: EFI v2.7 by EDK II
[ 0.000000] efi: SMBIOS=0x7f9cc000 ACPI=0x7f9fa000 ACPI 2.0=0x7f9fa014 MEMATTR=0x7ea21018
[ 0.000000] efi: Remove mem49: MMIO range=[0xffe00000-0xffffffff] (2MB) from e820 map
[ 0.000000] e820: remove [mem 0xffe00000-0xffffffff] reserved
[ 0.000000] SMBIOS 2.8 present.
Boot #2:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009ffff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000007fffff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000000800000-0x0000000000807fff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x0000000000808000-0x000000000080ffff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000000810000-0x00000000008fffff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x0000000000900000-0x000000007e38afff] usable
[ 0.000000] BIOS-e820: [mem 0x000000007e38b000-0x000000007e38bfff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000007e38c000-0x000000007e8b4fff] usable
[ 0.000000] BIOS-e820: [mem 0x000000007e8b5000-0x000000007e8b8fff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x000000007e8b9000-0x000000007e8bafff] ACPI data
[ 0.000000] BIOS-e820: [mem 0x000000007e8bb000-0x000000007e8c2fff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x000000007e8c3000-0x000000007e8dafff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000007e8db000-0x000000007e8fafff] usable
[ 0.000000] BIOS-e820: [mem 0x000000007e8fb000-0x000000007e91afff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000007e91b000-0x000000007f99afff] usable
[ 0.000000] BIOS-e820: [mem 0x000000007f99b000-0x000000007f9cafff] type 20
[ 0.000000] BIOS-e820: [mem 0x000000007f9cb000-0x000000007f9f2fff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000007f9f3000-0x000000007f9fafff] ACPI data
[ 0.000000] BIOS-e820: [mem 0x000000007f9fb000-0x000000007f9fefff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x000000007f9ff000-0x000000007fe5ffff] usable
[ 0.000000] BIOS-e820: [mem 0x000000007fe60000-0x000000007fe7ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000007fe80000-0x000000007fffffff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000ffe00000-0x00000000ffffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x0000000e7fffffff] usable
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] efi: EFI v2.7 by EDK II
[ 0.000000] efi: SMBIOS=0x7f9cc000 ACPI=0x7f9fa000 ACPI 2.0=0x7f9fa014 MEMATTR=0x7ea48118
[ 0.000000] efi: Remove mem51: MMIO range=[0xffe00000-0xffffffff] (2MB) from e820 map
[ 0.000000] e820: remove [mem 0xffe00000-0xffffffff] reserved
[ 0.000000] SMBIOS 2.8 present.
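The sixth entry (the usable range starting at 0x900000) ends at a different address in each boot, so the one-page reserved region that follows it lands at 0x7df0a000 in boot #1 and at 0x7e38b000 in boot #2, and this throws off the resume.
As far as I can tell from googling, it’s some kind of kernel bug that the kernel devs refused to fix, on the grounds that the BIOS/UEFI should be doing things right and it’s not their job to work around BIOS bugs.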
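To pin down exactly which entries move between boots, the two listings can be compared mechanically rather than by eye. The following is only a minimal sketch in Python: the function names (parse_e820, diff_e820, fmt) are made up for illustration, and it assumes the dmesg lines have the same "BIOS-e820: [mem start-end] type" shape as the ones pasted above.

```python
import re
import sys

# Matches dmesg lines such as:
# [ 0.000000] BIOS-e820: [mem 0x0000000000900000-0x000000007df09fff] usable
E820_RE = re.compile(
    r"BIOS-e820: \[mem (0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)\] (.+?)\s*$"
)


def parse_e820(dmesg_text):
    """Return a list of (start, end, type) tuples found in dmesg output."""
    entries = []
    for line in dmesg_text.splitlines():
        m = E820_RE.search(line)
        if m:
            entries.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
    return entries


def diff_e820(a, b):
    """Return (entries only in a, entries only in b), each sorted by address."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    return only_a, only_b


def fmt(entry):
    """Format an entry the way the kernel prints it."""
    start, end, kind = entry
    return f"[mem {start:#018x}-{end:#018x}] {kind}"


# Hypothetical usage: python3 e820diff.py boot1.txt boot2.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
        only_a, only_b = diff_e820(parse_e820(f1.read()), parse_e820(f2.read()))
    for entry in only_a:
        print("-", fmt(entry))
    for entry in only_b:
        print("+", fmt(entry))
```

The input files would be the saved dmesg output of each boot (the file names above are placeholders). For the two maps pasted above, this reports three differing entries on each side: the usable range starting at 0x900000, the one-page reserved region, and the usable range after it. Every other entry is identical.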
So the question is: is there a way around this?
Can I somehow convince QEMU or OVMF to provide a consistent table between runs?
Or somehow get the kernel to ignore such shifts, or blacklist the whole range, or something?
QEMU config:
qemu-system-x86_64 \
-nodefaults \
-nographic \
-enable-kvm \
-m 57344 -mem-path /dev/hugepages \
-cpu host,kvm=off,hv-vendor-id=PC,hv-frequencies=on,hv-reenlightenment=on,hv-relaxed=on,hv-reset=on,hv-runtime=on,hv-spinlocks=4096,hv-time=on,hv-stimer=on,hv-stimer-direct=on,hv-synic=on,hv-vapic=on,hv-vpindex=on \
-smp cores=32,threads=1,sockets=1 \
-machine q35,vmport=off,kernel_irqchip=on \
-drive if=pflash,format=raw,readonly=on,file=ovmf_code.fd \
-drive if=pflash,format=raw,file=ovmf_vars-1024x768.fd \
-smbios type=2 \
-netdev user,id=net0,hostfwd=tcp::5002-:22,hostfwd=tcp::5902-:5900 \
-device e1000,netdev=net0,mac=00:25:4B:00:00:02 \
-device nvme,drive=nvme0,serial=deadbeaf1,max_ioqpairs=8 -drive file=vm2_sys.qcow2,if=none,id=nvme0 \
-device nvme,drive=nvme1,serial=deadbeaf2,max_ioqpairs=8 -drive file=vm5_aux.qcow2,if=none,id=nvme1 \
-device nvme,drive=nvme2,serial=deadbeaf3,max_ioqpairs=8 -drive file=vm5_games.qcow2,if=none,id=nvme2 \
-device nvme,drive=nvme3,serial=deadbeaf4,max_ioqpairs=8 -drive file=data.qcow2,if=none,id=nvme3 \
-device nvme,drive=nvme4,serial=deadbeaf5,max_ioqpairs=8 -drive file=swap.qcow2,if=none,id=nvme4 \
-device pcie-root-port,chassis=1,id=root1,bus=pcie.0 \
-device vfio-pci,host=0a:00.0,bus=root1,multifunction=on,addr=00.0 \
-device vfio-pci,host=0a:00.1,bus=root1,addr=00.1 \
-device vfio-pci,host=0c:00.3 \
-vga none