Proxmox slow RAM in Windows VM

Hey guys/girls!
Long time viewer, first time writer here.

I'm somewhat of a noob in these things, so forgive me if my question is dumb, but I couldn't find any answers in the last few days of searching.

I have Proxmox installed and a Windows VM with an AMD R9 380 passed through. I tried to play GTA V and other games there. I thought it lagged because Parsec is just too much for the hardware, but I ducked(?) a bit deeper and found out that my RAM speed inside the VM is REALLY bad.

I get around ~1500-1700 MB/s in AIDA64. All VirtIO drivers are installed and ballooning is on.

With sysbench in an Ubuntu VM I get around 4000 MB/s with a 1K block size and around 24 GB/s with a 1M block size.
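(For reference, the sysbench memory runs were roughly along these lines — I'm not 100% sure these were my exact flags, but block size and thread count were the only things I varied:)

sysbench memory --memory-block-size=1K --memory-total-size=10G --threads=1 run
sysbench memory --memory-block-size=1M --memory-total-size=100G run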

I tried to set up a fresh VM with all drivers and no programs whatsoever, but that didn't work either. NUMA is on, and an SSD is passed through.

Is there anything I can do to fix that? I can provide all necessary info if you ask, but I don't really know what to include atm.

I'm sorry for my bad English, it's not my native language. I hope we can figure out this problem together!

Thanks in advance, and a happy New Year!

Setup:
OS: Proxmox 6.3-3
CPU: Dual Xeon X5675 on Z8NA6D
SSD passthrough into the VM; no difference with a RAW image on the ZFS SSD mirror.
HBA: IBM M1015 in IT mode
RAM: 96 GB 1066 MHz DDR3 ECC CL7
GPU: Sapphire R9 380 4GB
PSU: 550 W 80 Plus, nothing special, but not that crappy. Will be changed soon. Edit: changed to a Seasonic PX650W.

Could it be using the SSD as a paging file?

What is the R/W performance of your SSD?

Could be. I will disable that as a test. Thanks for pointing that out!

R/W is tbh also really bad. On bare metal the SSD does something around 400 MB/s R/W; in KVM it's more like 40 MB/s peak. Will check that again.
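(When I re-test, I'll probably use something like fio on both the host and inside a Linux guest for comparison — just a rough sketch, the test file path is only an example:)

fio --name=seqwrite --filename=/tmp/fio-test --size=4G --bs=1M --rw=write --direct=1 --ioengine=libaio --numjobs=1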

EDIT: So atm I have the ZFS mirror RAW image in there, because the SSD passthrough was just extra pain.
Here are the results for the RAW image file on the ZFS SSD mirror (Crucial MX500 M.2 SATA 500GB):

EDIT2: These are my ZFS settings for the storage the RAW file sits on.

Summary
root@pve:~# zfs get all VMData/VM
NAME       PROPERTY              VALUE                  SOURCE
VMData/VM  type                  filesystem             -
VMData/VM  creation              Sat Nov 28 18:46 2020  -
VMData/VM  used                  262G                   -
VMData/VM  available             187G                   -
VMData/VM  referenced            262G                   -
VMData/VM  compressratio         1.00x                  -
VMData/VM  mounted               yes                    -
VMData/VM  quota                 none                   default
VMData/VM  reservation           none                   default
VMData/VM  recordsize            4K                     local
VMData/VM  mountpoint            /VMData/VM             default
VMData/VM  sharenfs              off                    default
VMData/VM  checksum              on                     default
VMData/VM  compression           on                     local
VMData/VM  atime                 off                    local
VMData/VM  devices               on                     default
VMData/VM  exec                  on                     default
VMData/VM  setuid                on                     default
VMData/VM  readonly              off                    default
VMData/VM  zoned                 off                    default
VMData/VM  snapdir               hidden                 default
VMData/VM  aclinherit            restricted             default
VMData/VM  createtxg             72                     -
VMData/VM  canmount              on                     default
VMData/VM  xattr                 on                     default
VMData/VM  copies                1                      default
VMData/VM  version               5                      -
VMData/VM  utf8only              off                    -
VMData/VM  normalization         none                   -
VMData/VM  casesensitivity       sensitive              -
VMData/VM  vscan                 off                    default
VMData/VM  nbmand                off                    default
VMData/VM  sharesmb              off                    default
VMData/VM  refquota              none                   default
VMData/VM  refreservation        none                   default
VMData/VM  guid                  13252447685947962808   -
VMData/VM  primarycache          all                    default
VMData/VM  secondarycache        all                    default
VMData/VM  usedbysnapshots       0B                     -
VMData/VM  usedbydataset         262G                   -
VMData/VM  usedbychildren        0B                     -
VMData/VM  usedbyrefreservation  0B                     -
VMData/VM  logbias               latency                default
VMData/VM  objsetid              135                    -
VMData/VM  dedup                 off                    default
VMData/VM  mlslabel              none                   default
VMData/VM  sync                  standard               default
VMData/VM  dnodesize             legacy                 default
VMData/VM  refcompressratio      1.00x                  -
VMData/VM  written               262G                   -
VMData/VM  logicalused           258G                   -
VMData/VM  logicalreferenced     258G                   -
VMData/VM  volmode               default                default
VMData/VM  filesystem_limit      none                   default
VMData/VM  snapshot_limit        none                   default
VMData/VM  filesystem_count      none                   default
VMData/VM  snapshot_count        none                   default
VMData/VM  snapdev               hidden                 default
VMData/VM  acltype               off                    default
VMData/VM  context               none                   default
VMData/VM  fscontext             none                   default
VMData/VM  defcontext            none                   default
VMData/VM  rootcontext           none                   default
VMData/VM  relatime              off                    default
VMData/VM  redundant_metadata    all                    default
VMData/VM  overlay               off                    default
VMData/VM  encryption            aes-256-gcm            -
VMData/VM  keylocation           prompt                 local
VMData/VM  keyformat             passphrase             -
VMData/VM  pbkdf2iters           977K                   -
VMData/VM  encryptionroot        VMData/VM              -
VMData/VM  keystatus             available              -
VMData/VM  special_small_blocks  0                      default

EDIT3: Removing the page file makes no difference at all.

So, I tested the same thing on a VM with Windows Server 2019 and got almost the same results. RAM is abysmally slow. IDK what to do next or how to tackle this problem. Linux VMs are still fine.

I tested the RAM speed again with the Phoronix Test Suite and I can confirm that it's not the "fault" of the benchmark. In Windows the test took ~1300 seconds.

In a Linux container it's around 10 seconds.

Does nobody have any idea what it could possibly be? I'm really out of ideas…

Have you over-provisioned your total system RAM amongst all of the VM’s on the hypervisor?

Not really. Even if ALL of my VMs took all the RAM they have assigned, I would come to a total usage of around 81 GB of RAM. That said, ZFS also has ~16 GB of ARC. But whenever I run the test, most VMs are off and there is plenty of free RAM.

Are you using a dataset or a zvol? There's currently an issue with zvol performance in some cases, such as writing to NVMe drives causing 100% CPU usage on all cores.
edit: Nevermind, didn't notice your other post. Still wouldn't hurt to look at htop on the host while benchmarking.

Also, doesn’t everyone recommend against memory ballooning?

Do you have huge pages enabled?

Also, doesn’t everyone recommend against memory ballooning?

Could be, but it makes no difference for me whether it's on or off.

Do you have huge pages enabled?

At least it's on in the Processor tab for the VM under Proxmox, so I would guess yes. But I don't really know much about hugepages tbh.
EDIT: Searched a bit, and

grep -e AnonHugePages /proc/*/smaps | awk '{ if($2>4) print $0 }' | awk -F "/" '{print $0; system("ps -fp " $3)}'

gives me some output for the running VMs, so I guess it works and is in use.
The Windows VM is also there.

EDIT2: If I turn hugepages OFF, I lose around 500 MB/s of RAM speed, so I get around 1.2 GB/s. Just as a reminder, on a Linux VM it's around 24 GB/s, and at worst 4 GB/s if the block size is 1K and only 1 thread is used.

#push
Does no one have any idea? It MUST be something software related, but I can't figure out what it might be.

I've had some RAM performance issues, but nothing nearly as bad.

For me, I made sure my huge pages were set to 10240 pages. (Was passing 20 GB to the VM.)

On Ubuntu, typing:
cat /proc/meminfo | grep Huge

If it shows a small value like 1024, then huge pages isn't reserving that memory.
In the VM's XML file I added:

<memoryBacking>
<hugepages/>
</memoryBacking>

(NOTE: this was Ubuntu with qemu/kvm, not Proxmox)
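From what I've read, the Proxmox equivalent of that XML snippet should be the hugepages option on the VM itself — I haven't tried it on Proxmox myself, so treat this as an assumption and check the docs first:

# in /etc/pve/qemu-server/<VMID>.conf, page size in MB (2, 1024 or "any"):
hugepages: 2
# or via the CLI:
qm set <VMID> --hugepages 2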

  • Some people with dual processors say to try disabling NUMA, but… very mixed results with that.

  • Some people were saying their ZFS pool was partly to blame, giving terrible performance for IO. (But that's more on the storage side, not the memory.)

  • How are you initially connecting to the VM? Is it through RDP or Spice?
    RDP says:
    It turns out that by default, starting from Remote Desktop Protocol (RDP) 8.0, it will try to use UDP for connections in order to provide a better user experience on slow connections… it causes a lot of IP fragmentation which can't be properly reassembled in time, or at all - leading to dropouts, black screens and freezing.

Forcing RDP not to use UDP may fix some problems (but that's just for some people), and only applies if you are initially using RDP to log in / set up the VM.

  • Are all the virtio drivers installed correctly on windows?

  • The one post I found that had a solution: he found that if his HP server motherboard was set to balanced power saving, it would throttle his PCIe performance. So anything that was passed through would cause major issues and lag on the VM. Changing it to performance mode fixed his issue.

  • When doing these tests for IO / memory, what is your CPU doing on the host / guest? Is it pegged during this poor performance? Also, have you done any CPU pinning to try and maximize performance?

  • I'm curious what BIOS mode you are using for the VM? I know 440fx is outdated, and some say q35 is the best choice (I honestly don't know if you have a different option with Proxmox). But it might be worth looking into.

I did find a chunk of posts and complaints about poor performance on Windows 10 with Proxmox, with many eventually giving up :confused:

Some people said they think it might have been a problem with the Windows scheduler / trying to hide the VM to pass an Nvidia card through (which you clearly aren't doing).

Sorry if none of this helped, it's a very strange looking problem.

^---- From the context I think you meant to say "you dug a little deeper". (Your English overall was very easy to comprehend, thumbs up)

Thanks a lot for the effort!

I will try and provide as much info as I can.
First things first:

root@pve:~# cat /proc/meminfo | grep Huge
AnonHugePages:  28952576 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB

That's the output for said command.
The Windows VM is running atm, so it should reserve something, right? I tbh don't really have an idea how hugepages work and whether "AnonHugePages" is what I'm looking for. I just know that they can boost performance.

I tried disabling NUMA, at least the "tick" in the options for the VM on Proxmox. Doesn't do a thing for me, at least not memory-performance-wise.

I provided some pictures further up. That's the storage performance of the RAW image the Windows VM is running on atm. A ZFS mirror on 2 SATA M.2 drives is the underlying filesystem, outside the VM. Normally it runs on a dedicated SATA SSD (passed through), but that's slow too.

For the most part I use TeamViewer for some management and light tasks, but, in case it works as intended, I would use Parsec (for gaming).

Pretty sure they are, yes. Nothing shows up in Device Manager and there were no errors during install.

That's a "feature" on my platform too. I will have a look into that. I enabled it because energy prices in Germany are pretty damn high, and the system sits at around 200+ W idle atm. For me the 2 options I can choose from are "Power Optimized" and "Traditional Power Management"(?). Not sure on the last one.

Yes, I tried CPU pinning and it improved stuttering in GTA V a bit. With a few tweaks in the Parsec config it's now somewhat playable, but not nearly "nice". The overall memory bandwidth inside the VM didn't get any better, at least in AIDA64 Extreme, which is the program I'm testing with and using as a comparison. I didn't even have the idea to check the CPU usage on host/VM while the test runs… stupid me. It's packed, to the max. Even though the VM only has 6 vCPUs, the 24-thread host is on its knees, for whatever reason. At least according to htop.

I use q35 as suggested by the community, and I guess it's the only one where passthrough is possible, at least on Proxmox. I also use a UEFI BIOS for the VM, if that's important.

Yes, same. That's why I made an account here, as @wendell always says the most knowledgeable ppl are around here :wink:
Thought it's worth a shot, and I guess we are at least on the way to tackling this!

It helped me a lot. At least there are now one or two things I can check.
I'm really thankful for the effort you made and hope we can continue to solve this problem.

Yes, that's what I was trying to express :stuck_out_tongue: At least the context is understandable, lol.

Glad to help any way I can :slight_smile:

Ok good, so most of that checks out and looks like it's fine. But there are a few small things to try first.

  • First, try getting huge pages set up; it looks like they're enabled, but RAM was not reserved

That’s the value of the reserved hugepages.
If you were reserving 20 GB, that value would be 10240 (Not including extra for safety)

To calculate it: take the RAM being passed into the VM in GB, multiply by 1024 to get MB, then divide by 2 MB (the hugepage size); for safety add 1-10% extra.
16 GB example:

16 x 1024 MB = 16384 MB
16384 MB / 2 MB = 8192 huge pages
8192 * 10% = 819.2, 8192 + 819.2 = 9011.2, rounded to 9011
8192 * 2% = 163.84, 8192 + 163.84 = 8355.84, rounded to 8356 (some recommend only 2% extra is needed for safety)
8356 - 9011 hugepages would be a good value for 16 GB of RAM.
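If you don't want to do the math by hand, a quick shell one-liner does the same thing (just a sketch for a 16 GB guest with 2 MB pages and ~2% headroom):

echo $(( 16 * 1024 / 2 * 102 / 100 ))   # prints 8355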

Now, making the system reserve those hugepages can be done a bit differently. For my GPU passthrough setup I just edited
/etc/sysctl.conf and added this line at the end of the file:

vm.nr_hugepages = 13056

That's what I had for 25 GB being passed through. After a system reboot it showed up with cat:

cat /proc/meminfo | grep Huge
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       13056 kB
Hugetlb:               0 kB

Keep in mind that is the "old way", and there is a possibility of a conflict with NUMA enabled. With NUMA you may have to use an echo command to directly manipulate the hugepage files for each node, e.g.
echo <COUNT> > /sys/devices/system/node/node<NODEID>/hugepages/hugepages-<SIZE>kB/nr_hugepages
or use the utility hugeadm (note: I have not personally used it, but it may be required with the dual CPU system).
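For example, something like this on a two-node system with 2 MB pages (just a sketch — check your actual node IDs and available page sizes under /sys first):

# reserve 4096 x 2 MB pages (8 GB) on each NUMA node, then verify
echo 4096 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
echo 4096 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
grep Huge /proc/meminfo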

Here are a few pages I found that may be useful:
https://help.ubuntu.com/community/KVM%20-%20Using%20Hugepages

Huge pages do reserve system memory, so they may cause annoying problems; if this doesn't work you can just remove the hugepage entries, or set the size back to 2048 kB, and reboot again.

I will note, for my Ryzen PCI passthrough setup with DDR4: without hugepages set up, using the PassMark benchmark tool to look at RAM speed, it showed a score of something around 450, while bare metal people were getting 3332 on average. With huge pages enabled it jumped up to something like 1800-2200. Still not even close to bare metal for my memory, but a lot better. (I'm sorry I can't provide exact statistics, that's going off of memory, heh, pun not intended.)

Please note: this was using Ubuntu qemu/kvm with libvirt. Things may be different for Proxmox (waiting on RAM from eBay to get more into Proxmox).



Now to change gears, specifically because you said the host CPU was pegged while doing tests on the guest.

Hugepages are supposed to help with IO responsiveness on Windows, because Windows can be a trash fire at dealing with IO.

The main thing about a VM is that it's supposed to separate the host from the guest, so if the guest is at 100% load, the host shouldn't slow down at all, except for the resources the guest is allowed to utilize.

I found a bunch of people complaining specifically about Windows 1803 causing tons of problems on the HOST machine.
A few Examples:

  • 100% pegged CPU on the host, while the guest was idle with low CPU usage.
  • Under even small load on the guest Windows VM, 100% usage on the host.
  • Small load would make the host unresponsive to keyboard and mouse input, with 30+ second delays on the host.
  • 100% CPU usage on all Windows guests and 100% CPU usage on the host.

The one fix I could find was removing all unused devices from the VM: network adapters, virtio keyboard, mice, serial adapters/devices, virtual hubs, USB devices, etc. (One example was a network adapter passed into the VM that was constantly trying to configure itself, glitching out the host; another was a USB/serial device in the guest that was also trying to poll a service on the guest, causing the host to lag out.)

Some of the fixes I could find involve setting certain CPU flags on startup that could correct the issue, i.e. the hv_synic && hv_stimer enlightenments.

A possible thing to check along that vein of thought: this bug report for RHEL: https://bugzilla.redhat.com/show_bug.cgi?id=1610461

It was reporting that Windows 1803 would have more than 2000 interrupts per second, while on 1709 Windows was getting only 140-200 interrupts per second, and that:
Adding the flags 'hv_synic,hv_stimer' resolves the problem.
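I'm not on Proxmox myself, so this is only a rough sketch of how you might pass those flags through the args line of the VM config (the exact hv_* names/syntax depend on the QEMU version, hv_stimer usually also wants hv_synic and hv_time, and a second -cpu option can clash with the flags Proxmox already sets):

# /etc/pve/qemu-server/<VMID>.conf
args: -cpu host,hv_time,hv_synic,hv_stimer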

I'm not sure, but you might be able to run the command:
perf kvm --host stat live --event=ioport
on the host to see how many interrupts you have while trying to run the guest, or while running something on the guest.

Sadly I am also somewhat new to the whole virtualization thing, but I do feel that having the host CPU maxed out is a very big hint to the problem that is happening.

However it does still feel like throwing a wrench at a car that won’t start, and hoping it will fix the problem.

I really do hope one of these things would help. Good luck


So, thanks again for helping me out here!

I researched hugepages a bit and found out that "AnonHugePages" are essentially the same as hugepages, but not "reserved", so the host can free them up as soon as the VM stops running.

I upped the limit from 2 MB to 1 GB per page, as my host supports it.

root@pve:~# cat /proc/meminfo | grep Huge
AnonHugePages:  26046464 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
Hugetlb:               0 kB

Proxmox SHOULD take care of the rest, as I can see reserved stuff for the VM via:

grep -e AnonHugePages /proc/*/smaps | awk '{ if($2>4) print $0 }' | awk -F "/" '{print $0; system("ps -fp " $3)}'

The Output looks something like this:

root 22045 1 99 17:27 ? 02:29:36 /usr/bin/kvm -id 105 -name Windows
/proc/22045/smaps:AnonHugePages: 8388608 kB

Also, I made a mistake earlier and I'm really sorry for that, but I had allocated 24 threads to the VM and forgot about it, which is why it pushed the host to the limit. Now, with 6 vCPUs, everything behaves normally, at least resource-wise.

Tried these; unfortunately they didn't do anything for me. Passed them through via the args line in the VM config in Proxmox and also tried setting them as CPU flags in the config.
However, I disabled HPET in the Windows VM, which now feels ~50% snappier. TimerBenchmark confirms that too: before it was at around 7.0 microseconds, now it shows 0.059 or something. IDK if that can even be right tho. Memory bandwidth is still really low and GTA V still lags like heck as soon as the mouse is moved (via Parsec), but except for that, it now runs somewhat smooth(er). Not quite as it should on that kind of hardware, but we're getting there. Will try another game just for testing.

I can't, because it's not available for me.

/usr/bin/perf: line 13: exec: perf_5.4: not found
E: linux-perf-5.4 is not installed.

I can't install it either. Or maybe I'm not smart enough. (yet)

I also took a few screenshots from my Netdata while the VM performed a memory benchmark via AIDA64.

On the second one, you can see the Windows VM idling and then the benchmark starting. IDK if that's too high or whatever, but that's what it looks like. NOTE: other VMs were running in the background, but doing mostly nothing. They are all Ubuntu Server VMs.

I will try and disable HPET in the BIOS tomorrow and see if that improves the situation, but I doubt it.

I also disabled the "Power Optimization" feature we talked about before. It has done nothing for me; it's not worse, it's not better. Will set it back to Power Optimized, as energy costs are somewhat high here. Also tomorrow.

Thanks again, maybe you have any more ideas?

AnonHugepages are fine if you’re not using the IOMMU.

It’s not a limit. A hugepage is either 2MB or 1GB.

You allocate it in those chunks. If you’re not happy with inefficient allocation in 1GB increments, switch to 2MB.

Yes, that's what I tried to express, sorry for the confusion.

Well… that's a problem, as I use the IOMMU because I'm passing through a GPU.
So I would be better off just using regular hugepages?

You need static hugepages, yes.

Well, then I will try and set that up without crashing the system, lol.
Thanks for that piece of information. Apparently I missed that.


Tried setting up Hugepages and failed miserably.

If I set /etc/default/grub to

GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on video=efifb:off default_hugepagesz=1G hugepagesz=1G hugepages=18"

and run

update-grub

and reboot, then it shows me that there are 18 hugepages and 17 are free. I then decrypt my ZFS storage and start a VM. Then it goes back to this:

root@pve:~# cat /proc/meminfo | grep Huge
AnonHugePages:   3463168 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:       1
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        1
Hugepagesize:    1048576 kB
Hugetlb:         1048576 kB

I can't figure it out, and I'm damn near ready to just throw money at the problem and buy a dual socket 2011 system.

in /etc/sysctl.conf

vm.nr_hugepages=18
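Then apply and verify it without touching grub, roughly like this (just a sketch — keep in mind vm.nr_hugepages allocates pages of the *default* hugepage size, so 18 only means 18x 1G if 1G is still the default):

sysctl -p
grep -E 'HugePages_Total|Hugepagesize' /proc/meminfo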

I recommend not setting hugepages from the kernel commandline.

This is a software issue, not hardware, buying new stuff won’t help.

ZFS must be messing with the memory setup. Never seen this behavior before.

CC @wendell have you ever seen ZFS reset the hugepages config?
