Specs:
CPU: TR 1950x
GPU: 1080 Ti
RAM: 6x4GB sticks of DDR4-2667 in slots 0, 1, 2, 5, 6, 7
Board: MSI Gaming Carbon AC X399
Distro: Arch Linux
Kernel: linux-zen
Bootloader: rEFInd
qemu: qemu-patched
I’m looking to optimize my Windows gaming VM for better gaming performance. I’m mainly asking for help with CPU pinning topology; I’ve read a lot of posts about NUMA and I don’t know the best way to go about it, but any other optimizations would be appreciated.
I’m unsure whether I want to use hugepages; as far as I know, the reserved memory is permanently set aside and unavailable to the host, but if I’m wrong please let me know.
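For what it’s worth, you can see how much memory is currently set aside for hugepages straight from /proc/meminfo; this is just a quick sanity check, nothing Arch-specific:

```shell
# Statically reserved hugepages show up here whether or not a VM is
# using them; HugePages_Free says how much of the reservation is idle.
grep -E 'HugePages_(Total|Free)|Hugepagesize' /proc/meminfo
```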
And I believe I want to keep threads 0 and 16 for the host, maybe also 1 and 17 just in case. I’ve seen guides do odd things with the thread numbering, though; maybe that was just for the 1900X or an older AGESA version (I’m running 1.1.0.2).
Currently I unbind my 1080 Ti from the proprietary NVIDIA driver on the host, bind it to vfio-pci, and pass it through to the VM, which all works perfectly.
Here are my files: unbind rebind xml boot params
So if I understand this correctly, it would be beneficial to give the VM all of the cores/threads on node 1 and none on node 0, and to make sure my RAM and GPU are on node 1 too? Would I need to use hugepages to make sure the RAM the VM is accessing is on that node?
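One way to check which node your CPUs and GPU actually sit on is to ask sysfs directly; a sketch, where the PCI address is a placeholder you’d replace with the 1080 Ti’s address from lspci:

```shell
# Map each logical CPU to its physical core and NUMA node
lscpu -e=CPU,CORE,NODE

# NUMA node of a PCI device; 0000:0a:00.0 is a placeholder address,
# substitute the GPU's address from `lspci`
gpu=0000:0a:00.0
if [ -e "/sys/bus/pci/devices/$gpu/numa_node" ]; then
    cat "/sys/bus/pci/devices/$gpu/numa_node"
fi
```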
It would be a good idea to use hugepages to make sure the RAM being used is attached to the same NUMA node you are pinning your threads to; otherwise a lot of latency is introduced by asking the other node for memory access. However, you don’t need to reserve them statically, since you can allocate them on the fly at runtime. That’s what I did on my CentOS server to pin certain VMs to certain NUMA nodes, especially for my media server, which is the one I use the hugepages for. Runtime allocation can be a bit of a pain, though: I found that after I shut the media VM down, not all of the memory freed up, and I had to allocate a little more to be able to start it again, or reboot the hypervisor. I don’t have this problem on my desktop with static hugepages.
Depending on whether you want to use 2MB or 1GB pages, you can basically do this to allocate them. Right now I’m drawing a blank on what I did to make this persist across reboots, since I don’t see a systemd unit and there are no GRUB arguments, but I’m sure I did something, since I would have tested it after a reboot.
#!/bin/sh
nodes_path=/sys/devices/system/node
if [ ! -d "$nodes_path" ]; then
    echo "ERROR: $nodes_path does not exist" >&2
    exit 1
fi

# reserve_pages <count> <node>: reserve <count> 1G hugepages on the given node
reserve_pages()
{
    echo "$1" > "$nodes_path/$2/hugepages/hugepages-1048576kB/nr_hugepages"
}

# This example reserves two 1G pages on node0 and one 1G page on node1.
# You can modify it to your needs or add more lines to reserve memory on
# other nodes. Don't forget to uncomment the lines, otherwise they won't
# be executed.
# reserve_pages 2 node0
# reserve_pages 1 node1
reserve_pages 16 node0
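In case it helps with the persistence question, a oneshot systemd unit is one way to run the script at boot; this is only a sketch, and the unit name and script path here are assumptions, not what I actually used:

```ini
# /etc/systemd/system/reserve-hugepages.service (hypothetical name/path)
[Unit]
Description=Reserve 1G hugepages per NUMA node
Before=libvirtd.service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/reserve-hugepages.sh

[Install]
WantedBy=multi-user.target
```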
I’m not 100% sure about the <numa> part, but something like that. On my system it’s pinned to node 0, so I’m not confident that everything I changed to a 1 is what should actually be a 1; review the libvirt documentation on those parameters.
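For reference, the relevant libvirt XML pieces look roughly like this. Treat it as a sketch to compare against, not a drop-in config; the confusing part is that the nodeset on <page> refers to the guest’s NUMA node, while the nodeset in <numatune> refers to the host’s (here, host node 1):

```xml
<memoryBacking>
  <hugepages>
    <!-- back guest NUMA node 0 with 1G pages -->
    <page size='1048576' unit='KiB' nodeset='0'/>
  </hugepages>
</memoryBacking>
<numatune>
  <!-- allocate the VM's memory from host node 1 only -->
  <memory mode='strict' nodeset='1'/>
</numatune>
```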
This is my XML,
and this is the error I’m getting: Error starting domain: internal error: Unable to find any usable hugetlbfs mount for 1048576 KiB
I have the systemd unit you gave loaded and it’s reserving the RAM (I modified it to use node1 and confirmed it’s running), but it’s late, so I’ll continue in the morning. Thanks for all the help!
Check if /dev/hugepages1G exists; I may have done some other things to set up the pages. mount | grep huge returns this for me:
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,hugetlb)
hugetlbfs on /dev/hugepages1G type hugetlbfs (rw,relatime,seclabel,pagesize=1024M)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel,pagesize=2M)
Shouldn’t you alternate between the two ranges? Like:
<vcpupin vcpu='0' cpuset='8'/>
<vcpupin vcpu='1' cpuset='24'/>
<vcpupin vcpu='2' cpuset='9'/>
<vcpupin vcpu='3' cpuset='25'/>
....etc., up to vcpu 15, using host CPUs 8-15 and 24-31
IIRC QEMU exposes SMT siblings adjacently, so vcpus 0-1, 2-3, etc. are the SMT pairs inside the VM.
Yes, both of those are at the top of my post. I think I set it up correctly for my topology, though; I think Intel is the one that does C/T, C/T, etc., while mine does CCCCCCCCTTTTTTTTCCCCCCCCTTTTTTTT.
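Rather than guessing, the pairing can be confirmed from sysfs, which lists each CPU’s SMT sibling(s); a quick check along these lines:

```shell
# On a cores-first layout (0-15 physical, 16-31 SMT) cpu8's sibling list
# should read "8,24"; on an alternating layout it would read "8,9".
sib=/sys/devices/system/cpu/cpu8/topology/thread_siblings_list
if [ -r "$sib" ]; then
    cat "$sib"
fi
```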
This is what that returns; I guess /dev/hugepages1G doesn’t exist:
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
Intel numbers cores starting from 0 and then the hyperthreaded siblings from wherever the real cores ended. I thought Ryzen/TR was different, but maybe that has changed; I haven’t been able to play with a Ryzen system yet, maybe in summer 2019.
Okay, yes, I see you don’t have it. I followed the Red Hat article for getting 1GB pages (8.2.3.3, Procedure 8.3, “Allocating 1GB huge pages at runtime”). I believe once the mount has been created it will exist on every boot, and the systemd service will handle the allocation.
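For anyone following along, that runtime procedure amounts to roughly the following; this is my paraphrase of the article, not a verbatim copy. It needs root, and 1G pages require the pdpe1gb CPU flag:

```shell
# Allocate 16 1G pages on node1 at runtime, then give qemu a dedicated
# hugetlbfs mount for that page size. Guarded so it only acts when run
# as root on a machine that actually has a node1.
node=/sys/devices/system/node/node1/hugepages/hugepages-1048576kB
if [ "$(id -u)" -eq 0 ] && [ -d "$node" ]; then
    echo 16 > "$node/nr_hugepages"
    mkdir -p /dev/hugepages1G
    mount -t hugetlbfs -o pagesize=1G none /dev/hugepages1G
fi
```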
Alright, after reading some Red Hat docs and getting help from r/VFIO, I was able to wrap my head around transparent hugepages. I guess I don’t need to allocate anything; I just had to tell it which node to use. This is my current XML, and it will allocate the RAM on that node before any other node, whereas before it was putting half on node 0 and the rest on node 1.
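To double-check where the memory actually landed, numastat (from the numactl package) can break a process’s usage down per node; the qemu process name below is an assumption, match it to whatever your binary is called:

```shell
# Per-NUMA-node memory usage of the running qemu process; with strict
# node1 allocation, the node0 column should stay near zero.
pid=$(pgrep -f qemu-system-x86_64 | head -n1)
if [ -n "$pid" ]; then
    numastat -p "$pid"
fi
```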
Ah yes, because you’ve set strict allocation to that NUMA node; that’s probably why. oVirt wouldn’t let me start my VM because it was configured for hugepages and the pages weren’t allocated at boot. I’m still not sure whether I want to allocate statically at boot or not, but runtime allocation is working okay. I actually need to go back over my VM configs, because I think I set the mode to preferred, but according to numastat there are usually at least a few MB on the other node for qemu-kvm despite that, so it should be set to strict.