Windows Gaming VM on Proxmox: Performance Optimization in MSFS 2020

Hello all! About two years ago my gaming desktop ran Ubuntu with a Windows gaming VM, but I had trouble getting good performance in Microsoft Flight Simulator specifically. Around that time I switched back to running Windows on bare metal for several other reasons. I recently moved back to a Linux host, this time Proxmox, and I appear to have the same issue despite the OS change. Other games seem to run with performance similar to bare metal; however, MSFS is the most demanding game I play, so it could just be that the difference in other games was never big enough for me to notice.

A similar issue was reported on Reddit, however the solution the author found seems to be AMD specific and would not work for me: https://www.reddit.com/r/MicrosoftFlightSim/comments/ig1mwz/running_msfs2020_under_qemu_utilizes_cpu_poorly/

While in a flight with the A320 in the VM, the game averages around 15 fps with the 'high-end' preset, DX12, and Motion Blur off. Changing the graphics settings does not appear to affect performance. When I was running Windows 10 on bare metal, I was averaging 50 fps most of the time with the same graphics settings. In the VM, CPU usage never went past 40%, GPU usage never went past 30%, and there was plenty of RAM available to the VM. These tests were done with a monitor connected directly to the GPU, on a completely fresh installation of Windows 10 with only Steam, the game, and GPU drivers installed.

I tried changing a number of the settings exposed in the Proxmox GUI, but the numbers above were the best I got. I am not very familiar with the available KVM CPU arguments, so I am looking for advice on performance tuning in Proxmox/KVM. I understand I probably won't get 100% of bare-metal performance in a VM, but my goal is a steady 30 fps at minimum. Any help is greatly appreciated, and if you need any more info just let me know!

Computer Specifications:
-Dell Precision 5820
-Xeon W-2140B (8c16t, if unfamiliar the benchmarks online show it to be very similar in performance to an r5-3600)
-64GB total host RAM
-RTX 3060

Proxmox VM Configuration (/etc/pve/qemu-server/101.conf):

agent: 1
balloon: 0
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 4
cpu: host,flags=+pdpe1gb;+hv-tlbflush;+aes
efidisk0: local-lvm:vm-101-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:65:00,pcie=1,x-vga=1
ide2: local:iso/virtio-win-0.1.221.iso,media=cdrom,size=519030K
machine: q35
memory: 32768
meta: creation-qemu=6.2.0,ctime=1660431967
name: Windows
net0: virtio=[REDACTED],bridge=vmbr0
numa: 1
ostype: win10
scsi0: local-lvm:vm-101-disk-1,cache=writethrough,size=128G,ssd=1
scsi2: /dev/disk/by-id/ata-WDC_WD10EZEX-00WN4A0_WD-WCC6Y0ZYY37E,size=976762584K
scsihw: virtio-scsi-pci
smbios1: uuid=10761cfd-a9a0-4dc6-ac0a-90ab79a53a2b
sockets: 4
tablet: 0
tpmstate0: local-lvm:vm-101-disk-2,size=4M,version=v2.0
usb0: host=046d:c539,usb3=1
usb1: host=06a3:0c2d,usb3=1
usb2: host=046d:c215,usb3=1
vmgenid: 544df9ed-4e9a-4ac7-a184-479c372575a8

NOTE: While the config file shows 4 cores allocated to the VM, I am using 4 sockets with the NUMA option enabled (shown later in the config file), for a total of 16 vCPUs.
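
As a quick check of the arithmetic in the note: the guest's vCPU count is `cores` multiplied by `sockets`. The sketch below assumes a config file in the same `key: value` format as above. The function name `total_vcpus` is made up for illustration, and defaulting missing keys to 1 is an assumption of this sketch.

```bash
#!/bin/bash
# Hypothetical helper: print cores * sockets for a Proxmox-style VM config file.
# Missing keys default to 1 (an assumption made for this sketch).
total_vcpus() {
    awk -F': *' '
        $1 == "cores"   { cores = $2 }
        $1 == "sockets" { sockets = $2 }
        END {
            if (cores == "") cores = 1
            if (sockets == "") sockets = 1
            print cores * sockets
        }
    ' "$1"
}
```

For a file with `cores: 4` and `sockets: 4`, as in the config above, `total_vcpus path/to/101.conf` prints 16.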

I have a nearly identical setup with a similar performance issue and haven't figured it out yet. I have a Windows 11 guest VM on a Proxmox host, running on a W-2140B and an RTX 3070. My average frame rates are fairly high, but every 5 seconds or so the game stutters into the single digits and the audio glitches. None of the usual MSFS stutter fixes have any effect (thread priority, disabling asset streaming, graphics settings), and the odd thing is that it happens in menus as well. Are you seeing the slowdown in-game only?

My performance issues seem to happen only while in game. The menu and the rendering of the aircraft in the background are both very smooth. I wouldn't describe my in-game issue as a "stutter," since the in-game FPS is consistent but low. For me this behavior is not affected by the application/thread priority set in Task Manager. I haven't tried disabling asset streaming since the world textures seem to load in fine, and the behavior is still present even when I'm on the ground and not moving, and therefore not loading in any new scenery. Additionally, after making my initial post I tried adjusting the CPU Units and CPU core count, and the game runs the same with half the cores and with any CPU Units value.

Based on your description, I'm inclined to think that we are having two separate issues. But here's what I'm thinking: maybe we can piece together the differences in our setups to figure out why I'm not having your issue and why you aren't having mine, and then use that information to make two working setups.

Do any of your Proxmox VM settings differ from mine?
Have you installed the VirtIO drivers available here: Windows VirtIO Drivers - Proxmox VE? (I think not doing so may cause storage/memory IO performance issues.)

I would try, in order

Configuring Hugepages
CPU pinning
Interrupt pinning
Shielding the kvm/proxmox cores from the VM dedicated cores

It is usually not a good idea on a hypervisor to allocate all the available cores to a VM; you may get better performance by allocating 12 and pinning them...


Add these settings to your grub configuration and reboot:

default_hugepagesz=1G hugepagesz=1G hugepages=xxxnum

where xxxnum is the amount in GB of memory you want dedicated to hugepages (32 to match your setup)

then, in your vm config add:

hugepages: 1024
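
After rebooting, it may be worth confirming that the host actually reserved the pages before starting the VM. This is a minimal sketch, assuming the standard `HugePages_Total`, `HugePages_Free` and `Hugepagesize` fields of /proc/meminfo. The function name `hugepage_summary` is illustrative only.

```bash
#!/bin/bash
# Hypothetical helper: summarize hugepage reservation from a meminfo-style file.
# Defaults to /proc/meminfo; another path can be passed in for testing.
hugepage_summary() {
    awk '
        $1 == "HugePages_Total:" { total = $2 }
        $1 == "HugePages_Free:"  { free = $2 }
        $1 == "Hugepagesize:"    { size_kb = $2 }
        END {
            # Hugepagesize is reported in kB; 1048576 kB = 1 GiB
            printf "total=%d free=%d reserved_gib=%d\n", total, free, total * size_kb / 1048576
        }
    ' "${1:-/proc/meminfo}"
}
```

With `hugepages=32` and 1G pages, the expectation is `total=32` before the VM starts, and a lower `free` value once the VM is running on hugepages.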

Clone this repo:

Install the relevant files:

cp root/var/lib/vz/snippets/exec-cmds /var/lib/vz/snippets/taskset-hook.sh
cp root/usr/local/bin/shield-set.sh /usr/local/bin/

Shielding:

I have a 24 core/48 thread EPYC, and dedicate 4 cores to the hypervisor, and 20 to VMs:

root@pve:~# cat /usr/local/bin/shield-set.sh
#!/bin/bash
cset shield --kthread on  --cpu 0-19,24-43
cset proc --move --fromset=root --toset=system --threads --kthread --force
root@pve:~# cset shield
cset: --> shielding system active with
cset: "system" cpuset of CPUSPEC(20-23,44-47) with 757 tasks running
cset: "user" cpuset of CPUSPEC(0-19,24-43) with 13 tasks running

This effectively splits the system into two partitions, with kvm (and its related interrupts) running on a limited set of cores.

Then, for each VM for which I want maximum performance, I pin the virtual cores to physical cores and pin the interrupts to them too, and I assign a dedicated core to the kvm emulator thread:

CPU Governor:

#cpu_governor ondemand

This sets the governor for the pinned threads to ondemand (by default they are in power save mode)

CPU Pinning:

#cpu_emulatorpin 20

This assigns a specific thread to the emulator process for this VM

#cpu_taskset 0,24,1,25,2,26,3,27,4,28,5,29,6,30

This locks the 12 virtual threads to 12 physical threads, coupled to match actual cores/thread on my processor
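
To find which host thread numbers belong to the same physical core on your own machine (the pairs will differ from the EPYC numbering above), the kernel exposes sibling lists in sysfs. This is a small sketch; the function name `list_thread_siblings` is made up, and the overridable root directory exists only so the function can be tried against a fake tree.

```bash
#!/bin/bash
# Hypothetical helper: print each physical core's sibling threads once.
# The sysfs root can be overridden (first argument) for testing.
list_thread_siblings() {
    local root="${1:-/sys/devices/system/cpu}"
    # Every logical CPU reports the same list as its sibling, so deduplicate,
    # then order numerically by the first CPU id in each list.
    cat "$root"/cpu[0-9]*/topology/thread_siblings_list 2>/dev/null \
        | sort -u | LC_ALL=C sort -n
}
```

Each output line is one core, for example `0,24` on a layout like the one above. Depending on the platform, the kernel may print siblings as a range (such as `0-1`) instead of a comma list.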

#assign_interrupts 0,24,1,25,2,26,3,27,4,28,5,29,6,30 0000:c1:00.0
#assign_interrupts 0,24,1,25,2,26,3,27,4,28,5,29,6,30 0000:c1:00.1
#assign_interrupts 0,24,1,25,2,26,3,27,4,28,5,29,6,30 0000:81:00.0

This locks interrupts for the GPU and USB PCI card to the threads that are running the VM
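
For readers who want to see which interrupts are involved, the vfio lines in /proc/interrupts carry the PCI address of the passed-through device. The sketch below only lists the matching IRQ numbers. It is not how the hookscript itself is implemented, and the function name `vfio_irqs_for` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: print the IRQ numbers vfio registered for one PCI address.
# Reads an interrupts-style file (default /proc/interrupts).
vfio_irqs_for() {
    local pci_addr="$1" file="${2:-/proc/interrupts}"
    # Fixed-string match on e.g. "(0000:65:00.0)", keep vfio lines only,
    # then strip the trailing ":" from the first field (the IRQ number).
    grep -F "($pci_addr)" "$file" | awk '/vfio/ { sub(":", "", $1); print $1 }'
}
```

On a running host, each returned IRQ has a `/proc/irq/<n>/smp_affinity_list` file that root can write a CPU list to. The IRQs only appear once the VM has started and the guest driver has initialized the device.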

hookscript: local:snippets/taskset-hook.sh

This registers the script to run when the VM starts or stops; the script parses the commented directives above and acts accordingly.

root@pve:~# cat /etc/pve/qemu-server/106.conf
#cpu_governor ondemand
#cpu_emulatorpin 20
#cpu_taskset 0,24,1,25,2,26,3,27,4,28,5,29,6,30
#assign_interrupts 0,24,1,25,2,26,3,27,4,28,5,29,6,30 0000:c1:00.0
#assign_interrupts 0,24,1,25,2,26,3,27,4,28,5,29,6,30 0000:c1:00.1
#assign_interrupts 0,24,1,25,2,26,3,27,4,28,5,29,6,30 0000:81:00.0
balloon: 0
bios: ovmf
boot: order=virtio0;ide2
cores: 12
cpu: host,hidden=1,flags=+pcid;-ssbd;+ibpb;-virt-ssbd;+amd-no-ssb;+pdpe1gb;+aes
cpulimit: 12
efidisk0: vmpool:vm-106-disk-1,size=1M
hookscript: local:snippets/taskset-hook.sh
hostpci0: 0000:c1:00.0,pcie=1,x-vga=1
hostpci1: 0000:81:00.0,pcie=1
hostpci3: 0000:c1:00.1,pcie=1
hugepages: 1024
ide2: local-zfs:iso/virtio-win-0.1.185.iso,media=cdrom,size=402812K
kvm: 1
machine: pc-q35-5.2
memory: 16384
name: win10GamingTest
net0: virtio=D2:98:19:5F:EB:02,bridge=vmbr0,firewall=1
numa: 1
ostype: win10
scsihw: virtio-scsi-pci
smbios1: uuid=8f5cff06-d005-4928-92e0-4e63d6be635b
sockets: 1
startup: order=1,up=20
unused0: local-lvm:vm-106-disk-0
vcpus: 12
vga: none
virtio0: vmpool-4TB:vm-106-disk-0,backup=0,size=950G
virtio1: vmpool-4TB:vm-106-disk-1,size=32G

Sorry for writing a lot of disconnected comments, but that's how my brain works in the morning. If you are having low fps and (I am assuming) low GPU usage, that is probably the system struggling with memory allocation. Before going into the pinning stuff, try hugepages on their own. What they do is pre-allocate the RAM that kvm is going to use for your VM, which avoids constant paging back and forth of single 4K pages from normal RAM. For memory-intensive tasks (and I would bet FS is a hell of a memory-intensive program), hugepages give a significant performance boost...


Thank you very much for the info, here's where I'm at now:

I first tried setting up hugepages, but unfortunately there was no effect on performance, and GPU/CPU usage in the VM and on the host was still very low. As part of this process, I had to change my CPU settings to 1 socket with 12 cores (for some reason I was getting a NUMA error with my previous configuration after enabling hugepages), so it seems that no matter how many vCPUs and sockets I have, performance does not change...

I then tried to set up the CPU shielding/pinning, but unfortunately I was not able to follow your instructions even after reading the documentation in the repo. I started by following the steps in your post as written, but I didn't have the cset command available, so I then tried following the instructions in the repo to set it up with the VM hookscript. However, after following the repo instructions, this is the output I get in Proxmox when starting the VM:

swtpm_setup: Not overwriting existing state file.
/var/lib/vz/snippets/exec-cmds: line 10: expect: command not found
* No VCPUS for VM101
hookscript error for 101 on post-start: command '/var/lib/vz/snippets/exec-cmds 101 post-start' failed: exit code 1

TASK OK

I feel like I'm either not understanding something or I'm missing a step. Could you clarify where the "cset" command comes from, and how the exec-cmds/taskset-hook.sh files are supposed to be used?

In the meantime I did try running taskset manually and tried using NUMA options to pin CPU cores, however neither made any performance difference. (Following suggestions here: CPU pinning? | Proxmox Support Forum)

EDIT: I looked into the expect error a little further and found that expect is a package that is not installed on Debian by default, so I'm going to try the steps again with the expect package installed and report back when I can.


I got a bit farther, but I still need some assistance setting up cpuset.

I installed the 'expect' and 'cpuset' packages on Proxmox and gave it another go with the instructions in the repo. I can now run cset commands, but this is the output of running 'cset shield':

mount: /cpusets: none already mounted or mount point busy.
cset: **> mount of cpuset filesystem failed, do you have permission?

I tried researching this error, but I never saw anything that would help, or even indicate what the problem is. With that said I tried starting the VM despite this error and now the VM output looks like this:

swtpm_setup: Not overwriting existing state file.
Running taskset for emulator task with 20 for 7306...
* Assigning process 7306 to core 20 ...
mount: /cpusets: none already mounted or mount point busy.
cset: **> mount of cpuset filesystem failed, do you have permission?
pid 7306's current affinity list: 0-15
taskset: failed to set pid 7306's affinity: Invalid argument
* Setting Governor for core 20 to ondemand...
/var/lib/vz/snippets/exec-cmds: line 81: cpupower: command not found

* Detected 12 assigned to VM101...
* Resetting cpu shield...
* Assigning 5 to 7344...
mount: /cpusets: none already mounted or mount point busy.
cset: **> mount of cpuset filesystem failed, do you have permission?
pid 7344's current affinity list: 0-15
pid 7344's new affinity list: 5
* Setting Governor for core 5 to ondemand...
/var/lib/vz/snippets/exec-cmds: line 107: cpupower: command not found
* Assigning 6 to 7345...
mount: /cpusets: none already mounted or mount point busy.
cset: **> mount of cpuset filesystem failed, do you have permission?
pid 7345's current affinity list: 0-15
pid 7345's new affinity list: 6
* Setting Governor for core 6 to ondemand...
/var/lib/vz/snippets/exec-cmds: line 107: cpupower: command not found
* Assigning 7 to 7346...
mount: /cpusets: none already mounted or mount point busy.
cset: **> mount of cpuset filesystem failed, do you have permission?
pid 7346's current affinity list: 0-15
pid 7346's new affinity list: 7
* Setting Governor for core 7 to ondemand...
/var/lib/vz/snippets/exec-cmds: line 107: cpupower: command not found
* Assigning 8 to 7347...
mount: /cpusets: none already mounted or mount point busy.
cset: **> mount of cpuset filesystem failed, do you have permission?
pid 7347's current affinity list: 0-15
pid 7347's new affinity list: 8
* Setting Governor for core 8 to ondemand...
/var/lib/vz/snippets/exec-cmds: line 107: cpupower: command not found
* Assigning 9 to 7348...
mount: /cpusets: none already mounted or mount point busy.
cset: **> mount of cpuset filesystem failed, do you have permission?
pid 7348's current affinity list: 0-15
pid 7348's new affinity list: 9
* Setting Governor for core 9 to ondemand...
/var/lib/vz/snippets/exec-cmds: line 107: cpupower: command not found
* Assigning 10 to 7349...
mount: /cpusets: none already mounted or mount point busy.
cset: **> mount of cpuset filesystem failed, do you have permission?
pid 7349's current affinity list: 0-15
pid 7349's new affinity list: 10
* Setting Governor for core 10 to ondemand...
/var/lib/vz/snippets/exec-cmds: line 107: cpupower: command not found
* Assigning 11 to 7350...
mount: /cpusets: none already mounted or mount point busy.
cset: **> mount of cpuset filesystem failed, do you have permission?
pid 7350's current affinity list: 0-15
pid 7350's new affinity list: 11
* Setting Governor for core 11 to ondemand...
/var/lib/vz/snippets/exec-cmds: line 107: cpupower: command not found
* Assigning 12 to 7351...
mount: /cpusets: none already mounted or mount point busy.
cset: **> mount of cpuset filesystem failed, do you have permission?
pid 7351's current affinity list: 0-15
pid 7351's new affinity list: 12
* Setting Governor for core 12 to ondemand...
/var/lib/vz/snippets/exec-cmds: line 107: cpupower: command not found
* Assigning 13 to 7352...
mount: /cpusets: none already mounted or mount point busy.
cset: **> mount of cpuset filesystem failed, do you have permission?
pid 7352's current affinity list: 0-15
pid 7352's new affinity list: 13
* Setting Governor for core 13 to ondemand...
/var/lib/vz/snippets/exec-cmds: line 107: cpupower: command not found
* Assigning 14 to 7353...
mount: /cpusets: none already mounted or mount point busy.
cset: **> mount of cpuset filesystem failed, do you have permission?
pid 7353's current affinity list: 0-15
pid 7353's new affinity list: 14
* Setting Governor for core 14 to ondemand...
/var/lib/vz/snippets/exec-cmds: line 107: cpupower: command not found
* Assigning 15 to 7354...
mount: /cpusets: none already mounted or mount point busy.
cset: **> mount of cpuset filesystem failed, do you have permission?
pid 7354's current affinity list: 0-15
pid 7354's new affinity list: 15
* Setting Governor for core 15 to ondemand...
/var/lib/vz/snippets/exec-cmds: line 107: cpupower: command not found
* Assigning  to 7355...
mount: /cpusets: none already mounted or mount point busy.
cset: **> mount of cpuset filesystem failed, do you have permission?
taskset: failed to parse CPU list: 
* Setting Governor for core  to ondemand...
/var/lib/vz/snippets/exec-cmds: line 107: cpupower: command not found

Wating 30 seconds for all vfio-gpu interrupts to show up...
Moving 0000:65:00.0 interrupts to 5,6,7,8,9,10,11,12,13,14,15 cpu cores 101...
- IRQ:  101:          0          0          0          0       3902          0          0          0          0          0          0          0          0          0          0          0  IR-PCI-MSI 52953088-edge      vfio-msi[0](0000:65:00.0)
Wating 30 seconds for all vfio-gpu interrupts to show up...
Moving 0000:65:00.1 interrupts to 5,6,7,8,9,10,11,12,13,14,15 cpu cores 101...
- IRQ:   85:          0          0          0          0          0        279          0          0          0          0          0          0          0          0          0          0  IR-IO-APIC    4-fasteoi   vfio-intx(0000:65:00.1)
TASK OK

...and unfortunately there was no performance difference at all. Any suggestions on how to fix the cpuset error?

I failed to mention earlier that I am running proxmox 7.2-7 with kernel 5.15.39-3-pve.

I love Linux with all my heart, but sometimes I feel like it's going a little too fast with the fundamental changes:

Looks like cset has been deprecated and there's a new way to accomplish the same thing... I will need to upgrade my Proxmox, see what gets broken, and update the script accordingly...

I am on proxmox 7 and 5.13.19-6-pve:

root@pve:~# pveversion
pve-manager/7.1-12/b3c09de3 (running kernel: 5.13.19-6-pve)



At least it looks like it's not an uncommon situation (low fps on MSFS and KVM/VFIO):

However very much playable at high-end settings now with my modifications while earlier it was utterly unplayable.

Check out this thread I created. We had a pretty nice discussion about it there and in the order of greatest to smallest difference, the changes I've made were:

(1) Ensure AVIC is enabled and utilized. Made a huge difference and came at no cost.

(2) Potentially disable "mitigations". This improved other issues I've been having but performance is actually pretty good even without disabling them.

(3) Ensure each vCPU has two sibling cores isolated and dedicated.

(4) Proper NUMA Configuration

If you did (1), (3) and (4), I think there is a good chance you'll get near bare-metal performance.

Add the flag

systemd.unified_cgroup_hierarchy=0

to your kernel boot command line. It should revert to cgroups v1, and cset should then work:
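
To see which cgroup hierarchy the host is actually running after the reboot, one common check is the filesystem type mounted at /sys/fs/cgroup. This is a minimal sketch; the function name `cgroup_mode` is made up, and the mapping covers only the two usual cases.

```bash
#!/bin/bash
# Hypothetical helper: map the filesystem type mounted at /sys/fs/cgroup to a cgroup mode.
# "cgroup2fs" indicates the unified (v2) hierarchy; "tmpfs" indicates v1 or hybrid.
cgroup_mode() {
    case "$1" in
        cgroup2fs) echo "v2 (unified)" ;;
        tmpfs)     echo "v1 or hybrid" ;;
        *)         echo "unknown" ;;
    esac
}

# Usage on the host: cgroup_mode "$(stat -fc %T /sys/fs/cgroup)"
```

If the result is still "v2 (unified)" after adding the flag, the kernel command line was probably not regenerated or the host was not rebooted.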

Alright @MadMatt, here is where I'm at now:

I added that kernel flag and installed the 'linux-cpupower' package, and now cset appears to work correctly. However, there still seem to be some slight issues shown in the VM output:

swtpm_setup: Not overwriting existing state file.
Running taskset for emulator task with 20 for 6921...
* Assigning process 6921 to core 20 ...
cset: moving following pidspec: 6921
cset: moving 1 userspace tasks to /user
cset: done
pid 6921's current affinity list: 2-15
taskset: failed to set pid 6921's affinity: Invalid argument
* Setting Governor for core 20 to ondemand...
Error parsing cpu list

* Detected 12 assigned to VM101...
* Resetting cpu shield...
* Assigning 5 to 6960...
cset: moving following pidspec: 6960
cset: moving 1 userspace tasks to /user
cset: done
pid 6960's current affinity list: 2-15
pid 6960's new affinity list: 5
* Setting Governor for core 5 to ondemand...
Setting cpu: 5
Error setting new values. Common errors:
- Do you have proper administration rights? (super-user?)
- Is the governor you requested available and modprobed?
- Trying to set an invalid policy?
- Trying to set a specific frequency, but userspace governor is not available,
   for example because of hardware which cannot be set to a specific frequency
   or because the userspace governor isn't loaded?
* Assigning 6 to 6961...
cset: moving following pidspec: 6961
cset: moving 1 userspace tasks to /user
cset: done
pid 6961's current affinity list: 2-15
pid 6961's new affinity list: 6
* Setting Governor for core 6 to ondemand...
Setting cpu: 6
Error setting new values. Common errors:
- Do you have proper administration rights? (super-user?)
- Is the governor you requested available and modprobed?
- Trying to set an invalid policy?
- Trying to set a specific frequency, but userspace governor is not available,
   for example because of hardware which cannot be set to a specific frequency
   or because the userspace governor isn't loaded?
* Assigning 7 to 6962...
cset: moving following pidspec: 6962
cset: moving 1 userspace tasks to /user
cset: done
pid 6962's current affinity list: 2-15
pid 6962's new affinity list: 7
* Setting Governor for core 7 to ondemand...
Setting cpu: 7
Error setting new values. Common errors:
- Do you have proper administration rights? (super-user?)
- Is the governor you requested available and modprobed?
- Trying to set an invalid policy?
- Trying to set a specific frequency, but userspace governor is not available,
   for example because of hardware which cannot be set to a specific frequency
   or because the userspace governor isn't loaded?
* Assigning 8 to 6963...
cset: moving following pidspec: 6963
cset: moving 1 userspace tasks to /user
cset: done
pid 6963's current affinity list: 2-15
pid 6963's new affinity list: 8
* Setting Governor for core 8 to ondemand...
Setting cpu: 8
Error setting new values. Common errors:
- Do you have proper administration rights? (super-user?)
- Is the governor you requested available and modprobed?
- Trying to set an invalid policy?
- Trying to set a specific frequency, but userspace governor is not available,
   for example because of hardware which cannot be set to a specific frequency
   or because the userspace governor isn't loaded?
* Assigning 9 to 6964...
cset: moving following pidspec: 6964
cset: moving 1 userspace tasks to /user
cset: done
pid 6964's current affinity list: 2-15
pid 6964's new affinity list: 9
* Setting Governor for core 9 to ondemand...
Setting cpu: 9
Error setting new values. Common errors:
- Do you have proper administration rights? (super-user?)
- Is the governor you requested available and modprobed?
- Trying to set an invalid policy?
- Trying to set a specific frequency, but userspace governor is not available,
   for example because of hardware which cannot be set to a specific frequency
   or because the userspace governor isn't loaded?
* Assigning 10 to 6965...
cset: moving following pidspec: 6965
cset: moving 1 userspace tasks to /user
cset: done
pid 6965's current affinity list: 2-15
pid 6965's new affinity list: 10
* Setting Governor for core 10 to ondemand...
Setting cpu: 10
Error setting new values. Common errors:
- Do you have proper administration rights? (super-user?)
- Is the governor you requested available and modprobed?
- Trying to set an invalid policy?
- Trying to set a specific frequency, but userspace governor is not available,
   for example because of hardware which cannot be set to a specific frequency
   or because the userspace governor isn't loaded?
* Assigning 11 to 6966...
cset: moving following pidspec: 6966
cset: moving 1 userspace tasks to /user
cset: done
pid 6966's current affinity list: 2-15
pid 6966's new affinity list: 11
* Setting Governor for core 11 to ondemand...
Setting cpu: 11
Error setting new values. Common errors:
- Do you have proper administration rights? (super-user?)
- Is the governor you requested available and modprobed?
- Trying to set an invalid policy?
- Trying to set a specific frequency, but userspace governor is not available,
   for example because of hardware which cannot be set to a specific frequency
   or because the userspace governor isn't loaded?
* Assigning 12 to 6967...
cset: moving following pidspec: 6967
cset: moving 1 userspace tasks to /user
cset: done
pid 6967's current affinity list: 2-15
pid 6967's new affinity list: 12
* Setting Governor for core 12 to ondemand...
Setting cpu: 12
Error setting new values. Common errors:
- Do you have proper administration rights? (super-user?)
- Is the governor you requested available and modprobed?
- Trying to set an invalid policy?
- Trying to set a specific frequency, but userspace governor is not available,
   for example because of hardware which cannot be set to a specific frequency
   or because the userspace governor isn't loaded?
* Assigning 13 to 6968...
cset: moving following pidspec: 6968
cset: moving 1 userspace tasks to /user
cset: done
pid 6968's current affinity list: 2-15
pid 6968's new affinity list: 13
* Setting Governor for core 13 to ondemand...
Setting cpu: 13
Error setting new values. Common errors:
- Do you have proper administration rights? (super-user?)
- Is the governor you requested available and modprobed?
- Trying to set an invalid policy?
- Trying to set a specific frequency, but userspace governor is not available,
   for example because of hardware which cannot be set to a specific frequency
   or because the userspace governor isn't loaded?
* Assigning 14 to 6969...
cset: moving following pidspec: 6969
cset: moving 1 userspace tasks to /user
cset: done
pid 6969's current affinity list: 2-15
pid 6969's new affinity list: 14
* Setting Governor for core 14 to ondemand...
Setting cpu: 14
Error setting new values. Common errors:
- Do you have proper administration rights? (super-user?)
- Is the governor you requested available and modprobed?
- Trying to set an invalid policy?
- Trying to set a specific frequency, but userspace governor is not available,
   for example because of hardware which cannot be set to a specific frequency
   or because the userspace governor isn't loaded?
* Assigning 15 to 6970...
cset: moving following pidspec: 6970
cset: moving 1 userspace tasks to /user
cset: done
pid 6970's current affinity list: 2-15
pid 6970's new affinity list: 15
* Setting Governor for core 15 to ondemand...
Setting cpu: 15
Error setting new values. Common errors:
- Do you have proper administration rights? (super-user?)
- Is the governor you requested available and modprobed?
- Trying to set an invalid policy?
- Trying to set a specific frequency, but userspace governor is not available,
   for example because of hardware which cannot be set to a specific frequency
   or because the userspace governor isn't loaded?
* Assigning  to 6971...
cset: moving following pidspec: 6971
cset: moving 1 userspace tasks to /user
cset: done
taskset: failed to parse CPU list: 
* Setting Governor for core  to ondemand...
Error parsing cpu list

Wating 30 seconds for all vfio-gpu interrupts to show up...
Moving 0000:65:00.0 interrupts to 5,6,7,8,9,10,11,12,13,14,15 cpu cores 101...
- IRQ:   84:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-IO-APIC    0-fasteoi   vfio-intx(0000:65:00.0)
Wating 30 seconds for all vfio-gpu interrupts to show up...
Moving 0000:65:00.1 interrupts to 5,6,7,8,9,10,11,12,13,14,15 cpu cores 101...
- IRQ:   85:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  IR-IO-APIC    4-fasteoi   vfio-intx(0000:65:00.1)
TASK OK

Specifically, cpupower is producing those 'Error setting new values' errors, which I'm not too concerned about; taskset is apparently having trouble reading a "CPU list", but I'm not exactly sure what it's referring to; and there is one PID whose CPU affinity taskset can never change, but again I'm not too concerned about this.

Despite the improvement in the output, the performance in MSFS remains unchanged. Again, guest CPU usage hovers around 30%, as does GPU usage. However, it seems to me that we have eliminated CPU scheduling on the host side as the problem. To prove this I installed PassMark PerformanceTest to get some cold hard numbers, and it appears I have 85% of the host CPU and GPU performance, with acceptable memory speed. Considering that I also have a small pfSense VM running and was using NVIDIA GameStream to access the VM during these tests, these results are more than acceptable.

This is something that really doesn't seem very well documented online, so I greatly appreciate your contribution. If I have some time I may start an issue on GitHub to help add clarity to the documentation.

But back to the topic at hand: given the low guest CPU usage, I have to conclude that there is something about my config that the MSFS game engine doesn't like. I'm not really sure where to go from here, but it does give me hope that there is at least one other person out there with a similar Intel CPU (@GMntR) who does not have this issue.

At least it looks like it's not an uncommon situation (low fps on MSFS and KVM/VFIO):

I mentioned that thread in my initial post; unfortunately these fixes did not work for me. #1 is AMD specific, we already saw that #3 did not make a difference, and I don't think #4 would apply to me since I only have 1 physical CPU. To be honest I did not try #2, since the only Intel option Proxmox exposes in the GUI for this applies only to Westmere, SandyBridge, and IvyBridge CPUs, which does not include mine.

Again, I'm glad I was able to rule out host CPU scheduling as a potential issue, but it appears a different approach is needed. Still open to suggestions while I do more research.

For record-keeping purposes I am currently researching/trying solutions to similar problems, but not necessarily ones specifically involving a Proxmox/KVM setup. So far I found this: https://www.reddit.com/r/MicrosoftFlightSim/comments/o0nv84/low_gpu_and_cpu_usage_low_fps/ which recommended turning on hardware-based GPU scheduling in Windows, and to try a program called Bitsum Process Lasso which is supposed to improve Windows CPU scheduling. Unfortunately, neither changed the performance for me.

Can you post your config?

You don't have core 20; just comment out the emulatorpin setting for now...
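
The affinity lists in the log above (`0-15`, later `2-15`) suggest the host exposes CPU ids 0 through 15, so an id such as 20 copied from the EPYC example cannot be pinned. The empty "Assigning  to ..." line may likewise come from listing fewer CPU ids (5 through 15, eleven entries) than the 12 vCPUs detected; that reading is an inference from the log, not something confirmed by the script's author. Below is a small sketch for checking a list before putting it in the VM config. The function name `invalid_cpus` is hypothetical, and it handles plain comma-separated ids only, not ranges.

```bash
#!/bin/bash
# Hypothetical helper: print every CPU id in a comma-separated list that is
# out of range for the given CPU count (ids are zero-based: valid ids are 0..count-1).
invalid_cpus() {
    local list="$1" count="$2" cpu
    local IFS=','
    for cpu in $list; do
        if [ "$cpu" -ge "$count" ]; then
            echo "$cpu"
        fi
    done
}

# Usage on the host: invalid_cpus "5,6,7,20" "$(nproc --all)"
```

No output means every id in the list exists on the host.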

Can you run a

root@pve:~# cpupower frequency-info
analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency:  Cannot determine or is not supported.
  hardware limits: 1.50 GHz - 2.80 GHz
  available frequency steps:  2.80 GHz, 2.40 GHz, 1.50 GHz
  available cpufreq governors: conservative ondemand userspace powersave performance schedutil
  current policy: frequency should be within 1.50 GHz and 2.80 GHz.
                  The governor "schedutil" may decide which speed to use
                  within this range.
  current CPU frequency: 1.50 GHz (asserted by call to hardware)
  boost state support:
    Supported: yes
    Active: yes
    Boost States: 0
    Total States: 3
    Pstate-P0:  2800MHz
    Pstate-P1:  2400MHz
    Pstate-P2:  1500MHz

and post the result?

In general, can you change the governor for any given core from the command line?

I might repeat something that was already said, but I am a long-time KVM user, and since switching from a single-die CPU to a multi-die (NUMA-like) CPU, I had similar issues.
As said, I changed my 3700X to a 5900X and also noticed some issues when virtualizing with Proxmox, and later with Fedora and libvirt.
What helped me a lot was using hwloc and isolating one NUMA node/die per VM.
Also, core pinning does help with latency but not performance.
So to sum it up:

  1. Hugepages, transparent or not: use them, and better yet, use the 1 GB size
  2. Check your CPU and cores with hwloc and make sure you are not using cores from multiple NUMA nodes/dies
  3. Assign the corresponding threads per core (hwloc will show you which threads each core has)
  4. Pin the cores. This depends on you; I would only do this if I were using the first 2 cores/corresponding threads
  5. Remove any bloat from the VM configuration and use host for the CPU model. Some CPU flags could help as well, but I currently cannot remember which ones I used.

In the end, to get a true metric of how good your VM is, run a benchmark on bare metal, check the single-core and GPU scores, and then run the same test in the VM; the difference should be under 2%.
For benchmarking I use 3DMark (GPU) and CPU-Z.
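
For points 2 and 3 in the list above, here is a rough sketch of how the thread-to-core-to-node mapping could be listed on the host. It uses lscpu from util-linux instead of hwloc, simply because its parseable output is easy to script; siblings_by_core is a made-up helper name, not part of either tool. On CPUs where the dies only show up as separate L3 caches rather than separate NUMA nodes, lstopo from hwloc is probably the better view.

```shell
#!/bin/sh
# Sketch only: group logical CPUs by (NUMA node, physical core).
# siblings_by_core is a hypothetical helper, not part of hwloc or lscpu.
siblings_by_core() {
    # stdin: lines in `lscpu -p=CPU,CORE,NODE` format; '#' lines are headers
    awk -F, '
        /^#/ || NF < 3 { next }
        {
            key = $3 "," $2                      # node,core
            if (key in cpus) {
                cpus[key] = cpus[key] " " $1     # another thread on this core
            } else {
                cpus[key] = $1
                order[++n] = key                 # remember first-seen order
            }
        }
        END {
            for (i = 1; i <= n; i++) {
                split(order[i], k, ",")
                printf "node %s core %s: cpus %s\n", k[1], k[2], cpus[order[i]]
            }
        }'
}

# Run against the live host only if lscpu is available.
if command -v lscpu >/dev/null 2>&1; then
    lscpu -p=CPU,CORE,NODE | siblings_by_core
fi
```

Each output line names one physical core and the logical CPUs (threads) that belong to it, so a VM's pinned CPUs can be checked to stay within a single node and to include both threads of every core used.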

Sorry for the late response, was moving residences.

#cpu_governor ondemand
#cpu_taskset 5,6,7,8,9,10,11,12,13,14,15
#assign_interrupts 5,6,7,8,9,10,11,12,13,14,15 0000%3A65%3A00.0
#assign_interrupts 5,6,7,8,9,10,11,12,13,14,15 0000%3A65%3A00.1
agent: 1
balloon: 0
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 12
cpu: host,flags=+pdpe1gb;+hv-tlbflush;+aes
cpuunits: 2048
efidisk0: local-lvm:vm-101-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hookscript: local:snippets/exec-cmds
hostpci0: 0000:65:00,pcie=1,x-vga=1
hugepages: 1024
ide2: local:iso/virtio-win-0.1.221.iso,media=cdrom,size=519030K
machine: q35
memory: 32768
meta: creation-qemu=6.2.0,ctime=1660431967
name: Windows
net0: virtio=[MAC],bridge=vmbr0
numa: 1
ostype: win10
scsi0: local-lvm:vm-101-disk-1,cache=writethrough,size=128G,ssd=1
scsi2: /dev/disk/by-id/ata-WDC_WD10EZEX-00WN4A0_WD-WCC6Y0ZYY37E,size=976762584K
scsihw: virtio-scsi-pci
smbios1: uuid=10761cfd-a9a0-4dc6-ac0a-90ab79a53a2b
sockets: 1
tablet: 0
tpmstate0: local-lvm:vm-101-disk-2,size=4M,version=v2.0
usb0: host=046d:c539,usb3=1
usb1: host=06a3:0c2d,usb3=1
usb2: host=046d:c215,usb3=1
vmgenid: 544df9ed-4e9a-4ac7-a184-479c372575a8
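
As a sanity check on the hugepages: 1024 and memory: 32768 lines above, the following is a minimal sketch of how one could confirm that the host has enough free 1 GiB pages before the VM starts. The sysfs path is the standard kernel location for the 1 GiB pool; hugepages_needed and check_hugepages are hypothetical helper names, and HP_DIR is only overridable so the logic can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers.

# Round VM memory (MiB) up to a whole number of pages of the given size (MiB).
hugepages_needed() {
    mem_mib="$1"
    page_mib="$2"
    echo $(( (mem_mib + page_mib - 1) / page_mib ))
}

# Standard sysfs directory for the 1 GiB hugepage pool (overridable for dry runs).
HP_DIR="${HP_DIR:-/sys/kernel/mm/hugepages/hugepages-1048576kB}"

# Succeeds only if enough 1 GiB pages are currently free for the VM.
check_hugepages() {
    need=$(hugepages_needed "$1" 1024)
    if [ ! -r "$HP_DIR/free_hugepages" ]; then
        echo "no 1 GiB hugepage pool found at $HP_DIR"
        return 1
    fi
    free=$(cat "$HP_DIR/free_hugepages")
    echo "need $need pages, $free free"
    [ "$free" -ge "$need" ]
}

# Example (run on the host while the VM is stopped):
#   check_hugepages 32768
```

With memory: 32768 this works out to 32 pages of 1 GiB. The check is only meaningful while the VM is stopped, since a running VM has already taken its pages out of the free count.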

Removed. I wasn't initially sure what that setting does…

root@pve1:~# cpupower frequency-info
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency:  Cannot determine or is not supported.
  hardware limits: 1.20 GHz - 4.20 GHz
  available cpufreq governors: performance powersave
  current policy: frequency should be within 1.20 GHz and 4.20 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency: 1.33 GHz (asserted by call to kernel)
  boost state support:
    Supported: yes
    Active: yes

Based on the information above, I don't think I can. I can't really figure out how to do it on a per-core basis using cpupower though. With that said, I observed expected CPU frequencies while running the benchmark I tried earlier.
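
On the per-core question, here is a minimal sketch of one way it could be done through the cpufreq sysfs files, assuming the kernel exposes a writable scaling_governor for each core. set_governor is a hypothetical helper, not a cpupower or Proxmox command, and CPU_SYSFS is only overridable so the logic can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch only: set_governor is a hypothetical helper.
# CPU_SYSFS defaults to the standard cpufreq location.
CPU_SYSFS="${CPU_SYSFS:-/sys/devices/system/cpu}"

# Usage: set_governor GOVERNOR CPU [CPU...]
set_governor() {
    gov="$1"
    shift
    for cpu in "$@"; do
        f="$CPU_SYSFS/cpu$cpu/cpufreq/scaling_governor"
        avail="$CPU_SYSFS/cpu$cpu/cpufreq/scaling_available_governors"
        if [ ! -w "$f" ]; then
            echo "cpu$cpu: no writable scaling_governor" >&2
            return 1
        fi
        # Refuse governors the active driver does not offer for this core.
        if [ -r "$avail" ]; then
            case " $(cat "$avail") " in
                *" $gov "*) ;;
                *) echo "cpu$cpu: governor '$gov' not available" >&2
                   return 1 ;;
            esac
        fi
        echo "$gov" > "$f"
    done
}

# Example (as root), for the cores listed in the cpu_taskset line of the config:
#   set_governor performance 5 6 7 8 9 10 11 12 13 14 15
```

cpupower also accepts a CPU list before the subcommand, so `cpupower -c 5-15 frequency-set -g performance` should be roughly equivalent. With the intel_pstate driver shown above, only performance and powersave are offered, which is why the commented-out ondemand setting would be rejected.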

Hello, and thank you for the response!

Unfortunately, all of those things have been tried. I do know, however, that benchmarks put my setup 15% slower than bare metal with all of those tweaks. This was while using NVIDIA GameStream to remote into the VM and with a second pfSense VM running on the host at the same time, so that number could probably be improved by not using those two services. Unfortunately, it's hard for me to test with the pfSense VM not running and without using GameStream.

However, the 15% performance reduction does not explain why I'm getting 1/3 the framerate at best, leading me to conclude there is something else going on here. It might end up being as simple as the MSFS engine not liking KVM.

I appreciate the advice regardless.