Specs:
CPU: TR 1950x
GPU: 1080 Ti
RAM: 6x4GB sticks of DDR4-2667 in slots 0, 1, 2, 5, 6, 7
Board: MSI Gaming Carbon AC X399
Distro: Arch Linux
Kernel: linux-zen
Bootloader: rEFInd
qemu: qemu-patched
I’m looking to optimize my Windows gaming VM for better gaming performance. I’m mainly asking for help with CPU pinning topology; I’ve read a lot of posts about NUMA and I don’t know the best way to go about it, but any other optimizations would be appreciated.
I’m unsure whether I want to use hugepages; as far as I know, the reserved memory is permanently set aside and unavailable to the host, but if I’m wrong please let me know.
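For what it’s worth, you can see how much memory is currently set aside for hugepages straight from /proc/meminfo; this is just a quick sanity check, nothing Arch-specific:

```shell
# Statically reserved hugepages show up here whether or not a VM is
# using them; HugePages_Free says how much of the reservation is idle.
grep -E 'HugePages_(Total|Free)|Hugepagesize' /proc/meminfo
```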
And I believe I want to keep threads 0 and 16 for the host, maybe also 1 and 17 just in case. I’ve seen guides do odd things with the thread numbering, though; maybe that was just for the 1900X or an older AGESA version (I’m running 1.1.0.2).
Currently I unbind my 1080 Ti from the proprietary NVIDIA driver on the host, bind it to vfio-pci, and pass it through to the VM, which all works perfectly.
Here are my files: unbind rebind xml boot params
So if I understand this correctly, it would be beneficial to give the VM all of the cores/threads on node 1 and none on node 0, and to make sure my RAM and GPU are on node 1 too? Would I need to use hugepages to make sure the RAM the VM is accessing is on that node?
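One way to check which node your CPUs and GPU actually sit on is to ask sysfs directly; a sketch, where the PCI address is a placeholder you’d replace with the 1080 Ti’s address from lspci:

```shell
# Map each logical CPU to its physical core and NUMA node
lscpu -e=CPU,CORE,NODE

# NUMA node of a PCI device; 0000:0a:00.0 is a placeholder address,
# substitute the GPU's address from `lspci`
gpu=0000:0a:00.0
if [ -e "/sys/bus/pci/devices/$gpu/numa_node" ]; then
    cat "/sys/bus/pci/devices/$gpu/numa_node"
fi
```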
It would be a good idea to use hugepages to make sure the RAM being used is attached to the same NUMA node you are pinning your threads to; otherwise a lot of latency is introduced by asking the other node for memory access. However, you don’t need to reserve them statically, since you can allocate them on the fly at runtime. That’s what I did on my CentOS server to pin certain VMs to certain NUMA nodes, especially for my media server, which is the one I use the hugepages for. Runtime allocation can be a bit of a pain, though: I found that after I shut the media VM down, not all of the memory freed up, and I had to allocate a little more to be able to start it again, or reboot the hypervisor. I don’t have this problem on my desktop with static hugepages.
Depending on whether you want to use 2MB or 1GB pages, you can basically do this to allocate them. Right now I’m drawing a blank on what I did to make this persist across reboots, since I don’t see a systemd unit and there are no GRUB arguments, but I’m sure I did something, since I would have tested it after a reboot.
#!/bin/sh
nodes_path=/sys/devices/system/node
if [ ! -d "$nodes_path" ]; then
    echo "ERROR: $nodes_path does not exist" >&2
    exit 1
fi

# reserve_pages <count> <node>: reserve <count> 1G hugepages on the given node
reserve_pages()
{
    echo "$1" > "$nodes_path/$2/hugepages/hugepages-1048576kB/nr_hugepages"
}

# This example reserves two 1G pages on node0 and one 1G page on node1.
# You can modify it to your needs or add more lines to reserve memory on
# other nodes. Don't forget to uncomment the lines, otherwise they won't
# be executed.
# reserve_pages 2 node0
# reserve_pages 1 node1
reserve_pages 16 node0
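In case it helps with the persistence question, a oneshot systemd unit is one way to run the script at boot; this is only a sketch, and the unit name and script path here are assumptions, not what I actually used:

```ini
# /etc/systemd/system/reserve-hugepages.service (hypothetical name/path)
[Unit]
Description=Reserve 1G hugepages per NUMA node
Before=libvirtd.service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/reserve-hugepages.sh

[Install]
WantedBy=multi-user.target
```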
I’m not 100% sure about the <numa> part, but something like that. On my system it’s pinned to node 0, so I’m not confident that everything I changed to a 1 is what should actually be a 1; review the libvirt documentation on those parameters.
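For reference, the relevant libvirt XML pieces look roughly like this. Treat it as a sketch to compare against, not a drop-in config; the confusing part is that the nodeset on <page> refers to the guest’s NUMA node, while the nodeset in <numatune> refers to the host’s (here, host node 1):

```xml
<memoryBacking>
  <hugepages>
    <!-- back guest NUMA node 0 with 1G pages -->
    <page size='1048576' unit='KiB' nodeset='0'/>
  </hugepages>
</memoryBacking>
<numatune>
  <!-- allocate the VM's memory from host node 1 only -->
  <memory mode='strict' nodeset='1'/>
</numatune>
```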
This is my XML,
and this is the error I’m getting: Error starting domain: internal error: Unable to find any usable hugetlbfs mount for 1048576 KiB
I have the systemd unit you gave loaded and it’s reserving the RAM (I modified it to use node1 and confirmed it’s running), but it’s late, so I’ll continue in the morning. Thanks for all the help!
Check if /dev/hugepages1G exists; I may have done some other things to set up the pages. mount | grep huge returns this for me:
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,hugetlb)
hugetlbfs on /dev/hugepages1G type hugetlbfs (rw,relatime,seclabel,pagesize=1024M)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel,pagesize=2M)
Shouldn’t you alternate between the two ranges? Like:
<vcpupin vcpu='0' cpuset='8'/>
<vcpupin vcpu='1' cpuset='24'/>
<vcpupin vcpu='2' cpuset='9'/>
<vcpupin vcpu='3' cpuset='25'/>
....etc., up to vcpu 15, using host CPUs 8-15 and 24-31
IIRC QEMU exposes SMT siblings adjacently, so vcpus 0-1, 2-3, etc. are the SMT pairs inside the VM.
Yes, both of those are at the top of my post. I think I set it up correctly for my topology, though; I think Intel is the one that does C/T, C/T, etc., while mine does CCCCCCCCTTTTTTTTCCCCCCCCTTTTTTTT.
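Rather than guessing, the pairing can be confirmed from sysfs, which lists each CPU’s SMT sibling(s); a quick check along these lines:

```shell
# On a cores-first layout (0-15 physical, 16-31 SMT) cpu8's sibling list
# should read "8,24"; on an alternating layout it would read "8,9".
sib=/sys/devices/system/cpu/cpu8/topology/thread_siblings_list
if [ -r "$sib" ]; then
    cat "$sib"
fi
```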
This is what that returns; I guess /dev/hugepages1G doesn’t exist:
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
Intel numbers cores starting from 0 and then the hyperthreaded siblings from wherever the real cores ended. I thought Ryzen/TR was different, but maybe that has changed; I haven’t been able to play with a Ryzen system yet, maybe in summer 2019.
Okay, yes, I see you don’t have it. I followed the Red Hat article for getting 1GB pages (8.2.3.3, Procedure 8.3, “Allocating 1GB huge pages at runtime”). I believe once the mount has been created it will exist on every boot, and the systemd service will handle the allocation.
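For anyone following along, that runtime procedure amounts to roughly the following; this is my paraphrase of the article, not a verbatim copy. It needs root, and 1G pages require the pdpe1gb CPU flag:

```shell
# Allocate 16 1G pages on node1 at runtime, then give qemu a dedicated
# hugetlbfs mount for that page size. Guarded so it only acts when run
# as root on a machine that actually has a node1.
node=/sys/devices/system/node/node1/hugepages/hugepages-1048576kB
if [ "$(id -u)" -eq 0 ] && [ -d "$node" ]; then
    echo 16 > "$node/nr_hugepages"
    mkdir -p /dev/hugepages1G
    mount -t hugetlbfs -o pagesize=1G none /dev/hugepages1G
fi
```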
Alright, after reading some Red Hat docs and getting help from r/VFIO, I was able to wrap my head around transparent hugepages. I guess I don’t need to allocate anything; I just had to tell it which node to use. This is my current XML, and it will allocate the RAM on that node before any other node, whereas before it was putting half on node 0 and the rest on node 1.
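To double-check where the memory actually landed, numastat (from the numactl package) can break a process’s usage down per node; the qemu process name below is an assumption, match it to whatever your binary is called:

```shell
# Per-NUMA-node memory usage of the running qemu process; with strict
# node1 allocation, the node0 column should stay near zero.
pid=$(pgrep -f qemu-system-x86_64 | head -n1)
if [ -n "$pid" ]; then
    numastat -p "$pid"
fi
```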
Ah yes, because you’ve set strict allocation to that NUMA node; that’s probably why. oVirt wouldn’t let me start my VM because it was configured for hugepages and the pages weren’t allocated at boot. I’m still not sure whether I want to allocate statically at boot or not, but runtime allocation is working okay. I actually need to go back over my VM configs, because I think I set the mode to preferred, but according to numastat there are usually at least a few MB on the other node for qemu-kvm despite that, so it should be set to strict.