I have a working Windows 10 guest VM on my Arch Linux host, with GTX 1070 Ti passthrough and virtual USB keyboard/mouse input. Everything works fine except guest boot time: it takes about 5 minutes to boot with 4 cores and 16 GB of RAM. That's kind of slow, but I didn't use Windows that much.
However, I recently needed to test some Windows-specific software and decided to give my Windows guest 10 cores and 64 GB of RAM. The boot time went from 5 to 16 minutes, and CPU usage was sky high during boot. Does more RAM slow down guest boot time? Is there any setting I need to tweak?
Run something like glances or atop on the host and see if you spot a bottleneck causing that high CPU usage.
Does your guest run apps so highly threaded that it needs 20 threads? How many are left over for the host? You could try pinning specific core/thread pairs instead of letting the host decide. Also pin your I/O to a specific core.
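In libvirt, that pinning goes in the domain XML under `<cputune>`. A rough sketch (the host core numbers here are purely illustrative, not tuned for an E5-2683 v3's actual topology; check `lscpu -e` for your real core/thread pairs):

```xml
<vcpu placement='static'>10</vcpu>
<iothreads>1</iothreads>
<cputune>
  <!-- Pin each guest vCPU to a specific host core/thread -->
  <vcpupin vcpu='0' cpuset='4'/>
  <vcpupin vcpu='1' cpuset='5'/>
  <!-- ... one vcpupin entry per vCPU ... -->
  <!-- Keep QEMU's emulator threads off the guest's cores -->
  <emulatorpin cpuset='0-1'/>
  <!-- Pin the I/O thread to its own dedicated host cores -->
  <iothreadpin iothread='1' cpuset='2-3'/>
</cputune>
```

The idea is that the guest's vCPUs, the emulator threads, and the I/O threads each get their own host cores, so they stop migrating and fighting each other for cache.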
I ran atop while booting up my VM. All I saw was 2000% CPU usage by qemu-system-x86, dropping to 5x% after boot. I have a Xeon E5-2683 v3, so 4 cores are left for the host. I probably don't need all 10 cores for testing purposes, but most of the 3D rendering software I use can max out all 10 cores while rendering a heavy scene.
I will try CPU pinning this weekend and see how it goes. Regarding pinning I/O to a specific core, can you explain a bit more, or maybe share a link?
Right, the CPU is pegged, but something is causing a bottleneck; it's not just raw computation bogging down your CPU.
What does your iowait look like? Glances is easier to read than atop, and it highlights and logs issues like iowait. What are the overall specs on the host machine?
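If you want a quick iowait number without installing anything, you can compute it from the cumulative tick counters in `/proc/stat` (a rough sketch for Linux; glances reports the same figure more conveniently):

```shell
#!/bin/sh
# First line of /proc/stat is: cpu  user nice system idle iowait ...
read -r _ u1 n1 s1 i1 w1 _ < /proc/stat
sleep 1
read -r _ u2 n2 s2 i2 w2 _ < /proc/stat
# Deltas over the one-second window
total=$(( (u2 + n2 + s2 + i2 + w2) - (u1 + n1 + s1 + i1 + w1) ))
wait=$(( w2 - w1 ))
echo "iowait: $(( 100 * wait / total ))%"
```

Run it while the guest is booting; a sustained high percentage points at storage rather than the CPU itself.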
I installed glances and monitored the VM boot. It's 0.0% iowait most of the time, 4-5% at the end of boot, then it drops back to 0.0%. 7x% CPU and 7x% MEM usage during boot.
Host Spec
CPU : Xeon E5-2683 v3
MB : ASRock Taichi X99
RAM : 96GB DDR4-2133
GPU0 : GTX 970 for host
GPU1 : GTX 1070 for passthrough
HDD : 512GB NVMe for host system
HDD : 60GB raw image on 240GB SATA SSD for guest
Host OS : Arch Linux 5.7.2-arch1-1
Guest OS : Windows 10 1903
Thanks for the link, it seems like a good read, really appreciate it. I will try to optimize my config this weekend.
What CPU is your system running? It may be that your hypervisor is running into an issue while waiting for resource availability from the CPU or another NUMA node.
There was a discussion on Reddit mentioning Arch Linux and preemption causing slow Windows 10 VM boots on an Arch Linux host. I might give it a try, although it seems to require compiling the kernel, which I've never done before.
I think you'll need to hunt around your BIOS and see what options you have for the CPU configuration. I'm reading that this CPU used something called Cluster-on-Die to segregate cores and memory controllers. I'm not finding a lot about it other than stuff like "contact your OEM for more information".
Thank you @gordonthree,
I already enabled hugepages, so I guess that's not the reason. But I will definitely look into my BIOS and the CoD stuff, and see if I can find something related to VM boot time.
Yes, more guest RAM does increase startup time. But it shouldn’t be minutes, that’s just ridiculous.
I have a 64 GB machine where I run a 32 GB Win10 guest. It starts up in about 30 seconds (NVMe drive). I've adjusted the guest's RAM from 8 to 32 GB at various times. Startup time is affected by how much RAM qemu has to clear out during startup; if the host decides to start swapping things, it slows down. I believe there's some slowdown from huge page compaction as well, since the kernel does page defrag during startup.
Hey, wait a minute. If I understand this correctly, your setup requires hugepages; they're not optional. No wonder startup takes so long! Linux has to do very time-consuming memory defrag operations to get enough hugepages.
It'll be a lot better to just let qemu do its own thing. It already marks its pages for transparent huge page allocation: it will get as many huge pages as it can at start, and in the background kernel threads will acquire more as they become available.
You only want to require hugepages if you've preallocated 64 GiB or more of them at kernel start. And if you do that, that memory isn't available for any other use.
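For reference, the two approaches look roughly like this (values illustrative, matching the ~64 GiB pool discussed here):

```
# Transparent huge pages: check the current policy; qemu madvises its
# guest RAM, so "madvise" or "always" both let it benefit.
cat /sys/kernel/mm/transparent_hugepage/enabled

# Explicit hugepages: reserve at boot via the kernel command line
# (32800 x 2 MiB pages ~= 64 GiB), before memory fragments:
hugepages=32800

# Or at runtime, which may fall short once memory is fragmented:
echo 32800 > /proc/sys/vm/nr_hugepages
```

The boot-time reservation avoids the defrag cost, but permanently removes that memory from the host's general pool.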
Sorry, I didn't mention that I followed this ArchWiki page and this to set up hugepages. I tried removing the memoryBacking tag in the XML, but my host system froze after the guest started.
I will check my hugepages setup again to see if I set it up wrong, or maybe remove my hugepage setup for now, because I saw no hugepage Rsvd and Surp usage at all while my guest VM was running.
#before vm start
#grep Huge /proc/meminfo
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 32800
HugePages_Free: 32800
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 67174400 kB
#after vm starts, during high cpu usage, before seeing Tiano logo
#grep Huge /proc/meminfo
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 32800
HugePages_Free: 32
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 67174400 kB
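That output actually looks consistent: HugePages_Free dropping from 32800 to 32 means the guest's ~64 GiB did land in the explicit hugepage pool (Rsvd/Surp staying at 0 is normal once the pages are in use). It may also explain the freeze when removing memoryBacking: with ~64 GiB still locked in the pool, a 64 GiB guest on normal pages leaves the 96 GB host almost nothing. The stanza in question, per the wiki-style setup, is something like (a sketch, not your exact config):

```xml
<memoryBacking>
  <hugepages/>
  <!-- Removing this whole stanza makes qemu use normal pages, which the
       kernel promotes to transparent huge pages in the background -- but
       only if the explicit hugepage pool has been released first -->
</memoryBacking>
```

So if you drop the stanza, also shrink the boot-time hugepage reservation, or the memory stays unavailable.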
Over my head and beyond where I’ve tread in the land of KVM … Hopefully you will find a fix.
Any interesting-looking CPU options in the BIOS? I wonder if there are side effects from running a server CPU on a HEDT chipset, versus the full C602 (whatever) server chipset?
I'm not familiar with BIOS setup related to VMs, but I'll keep it in mind while searching for an answer. I'll try disabling the hugepages setting first this weekend and see how it goes.
This looks really promising.
I did try it a couple of days ago, using linux-vfio from the AUR. It failed to compile on my host but worked in a clean chroot. I was going to test it out yesterday, but I accidentally overwrote my initramfs without noticing. While I was trying to figure out what was going on, my BIOS CMOS battery died. I do believe in Murphy's law now.
Anyway, I got my system back and am ready to do this again. Hope this works, knock on wood.
BTW, cloning from the Arch Linux git repo took forever. Is 150.00 KiB/s a normal speed for cloning from the Arch Linux git? A speed test shows 45 Mbps for my internet download speed.