7950X3D + Proxmox CPU Core Pinning = Gaming VM w/ V-Cache?

An additional thought on this subject:

One of my concerns with the 7950X3D under virtualization is how the system would handle the 7950X3D/7900X3D’s core parking mechanic. Basically, will Windows still attempt to park cores?

In one of the 7800X3D reviews (I think GN), it was pointed out that systems previously used with the 7950X3D would need a fresh Windows install because it would still attempt to park cores (parkour!) despite this mechanism not being needed on the 7800X3D. I am curious as to what exactly happens to a 7950X3D VM with half the core count.

It’s probably a good idea to stay as far away from chipset drivers in a VM as possible (also for non-V-Cache CPUs). AFAIK those are responsible for the parking behaviour, so if you never install them, I don’t see how there could be an issue.

That said, I don’t have a 7950X3D to try it with. I can confirm, however, that the cache sizes of my 7950X are recognised correctly by Windows regardless of how I pin cores (1 CCD, 2 CCDs, parts of CCDs, etc.).

There’s probably no way to be 100% sure without trying, in case the AMD/Microsoft implementation is so buggy that it won’t recognise the difference between a 7950X3D and a 7800X3D. Even then, one might be able to spoof the CPU model in the hypervisor so that the guest is unaware of the exact model you have.

Ah, you’re right. I wasn’t sure how exactly the core parking mechanism worked, but chipset drivers makes sense. In my current system, I have it set up such that I can either boot into Windows via a Proxmox VM, or Windows baremetal (it’s installed on its own dedicated drive). This setup would probably not fly with the 3D v-cache parts, haha.

I am receiving my 7950X3D by the end of this week, and my plan is to pass through only the V-Cache cores, so Windows will “see” a 7800X3D. Therefore, I won’t install chipset drivers (the VM’s chipset is actually Q35) or Game Bar.
My only concern is how Linux will handle core 0, since the kernel uses it even if you pin/isolate it.
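
To work out which host CPUs sit on the V-Cache CCD before pinning, one option is to read the L3 sizes from sysfs. This is only a minimal sketch: it assumes a Linux host that exposes the usual `cache/index*/level`, `size` and `shared_cpu_list` files, and the function names are made up for illustration.

```python
from pathlib import Path


def parse_size_kib(text):
    """Convert a sysfs cache size such as '98304K' or '32M' to KiB."""
    text = text.strip()
    units = {"K": 1, "M": 1024}
    if not text or text[-1] not in units:
        raise ValueError(f"unexpected cache size format: {text!r}")
    return int(text[:-1]) * units[text[-1]]


def l3_groups(sysfs_root="/sys/devices/system/cpu"):
    """Map each L3 domain (its shared_cpu_list string) to its size in KiB."""
    groups = {}
    for index in Path(sysfs_root).glob("cpu[0-9]*/cache/index*"):
        level = index / "level"
        # Only level-3 caches are interesting here; skip L1/L2 entries.
        if not level.exists() or level.read_text().strip() != "3":
            continue
        shared = (index / "shared_cpu_list").read_text().strip()
        groups[shared] = parse_size_kib((index / "size").read_text())
    return groups
```

Calling `l3_groups()` on the host returns one entry per L3 domain. On a 7950X3D I would expect two entries, with the larger one (96 MB) being the V-Cache CCD, but check the actual output rather than trusting that.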

Have you had a chance to do any testing with the 7950X3D in a VM yet?

Actually I did. I had lots of issues setting up my system and VM; you can see my thread here.

Without any isolation, with just the CPUs pinned, I have these results (compared to the same settings with an isolated and pinned 5900X):

  <vcpu placement="static">16</vcpu>
  <iothreads>2</iothreads>
  <cputune>
    <vcpupin vcpu="0" cpuset="0"/>
    <vcpupin vcpu="1" cpuset="16"/>
    <vcpupin vcpu="2" cpuset="1"/>
    <vcpupin vcpu="3" cpuset="17"/>
    <vcpupin vcpu="4" cpuset="2"/>
    <vcpupin vcpu="5" cpuset="18"/>
    <vcpupin vcpu="6" cpuset="3"/>
    <vcpupin vcpu="7" cpuset="19"/>
    <vcpupin vcpu="8" cpuset="4"/>
    <vcpupin vcpu="9" cpuset="20"/>
    <vcpupin vcpu="10" cpuset="5"/>
    <vcpupin vcpu="11" cpuset="21"/>
    <vcpupin vcpu="12" cpuset="6"/>
    <vcpupin vcpu="13" cpuset="22"/>
    <vcpupin vcpu="14" cpuset="7"/>
    <vcpupin vcpu="15" cpuset="23"/>
    <emulatorpin cpuset="15,31"/>
    <iothreadpin iothread="1" cpuset="13,29"/>
    <iothreadpin iothread="2" cpuset="14,30"/>
    <vcpusched vcpus="0" scheduler="rr" priority="1"/>
    <vcpusched vcpus="1" scheduler="rr" priority="1"/>
    <vcpusched vcpus="2" scheduler="rr" priority="1"/>
    <vcpusched vcpus="3" scheduler="rr" priority="1"/>
    <vcpusched vcpus="4" scheduler="rr" priority="1"/>
    <vcpusched vcpus="5" scheduler="rr" priority="1"/>
    <vcpusched vcpus="6" scheduler="rr" priority="1"/>
    <vcpusched vcpus="7" scheduler="rr" priority="1"/>
    <vcpusched vcpus="8" scheduler="rr" priority="1"/>
    <vcpusched vcpus="9" scheduler="rr" priority="1"/>
    <vcpusched vcpus="10" scheduler="rr" priority="1"/>
    <vcpusched vcpus="11" scheduler="rr" priority="1"/>
    <vcpusched vcpus="12" scheduler="rr" priority="1"/>
    <vcpusched vcpus="13" scheduler="rr" priority="1"/>
    <vcpusched vcpus="14" scheduler="rr" priority="1"/>
    <vcpusched vcpus="15" scheduler="rr" priority="1"/>
  </cputune>
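
The `<vcpupin>` list above follows a regular pattern (each core followed by its SMT sibling), so it can be generated instead of typed by hand. A minimal sketch, assuming the sibling of host CPU n is n + 16 as in the config above; `vcpupin_lines` is a made-up helper name, and the numbering should be checked with `lscpu -e` on your own host.

```python
def vcpupin_lines(cores, smt_offset=16):
    """Return libvirt <vcpupin> lines pairing each core with its SMT sibling.

    Assumes the sibling of host CPU n is n + smt_offset (an assumption
    taken from the config above, not something this code detects).
    """
    lines = []
    vcpu = 0
    for core in cores:
        for host_cpu in (core, core + smt_offset):
            lines.append(f'<vcpupin vcpu="{vcpu}" cpuset="{host_cpu}"/>')
            vcpu += 1
    return lines


# Cores 0-7 reproduce the sixteen vcpupin lines of the config above.
print("\n".join(vcpupin_lines(range(8))))
```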

I haven’t enabled EXPO or Curve Optimizer yet.

Does your board have the option to set the CCD for your host OS?
I got my new system yesterday, an ASUS B650 Creator and a 7950X. Luckily I had no problems; I just transferred my config from the old system and was done.
On my CPU, the first CCD is a bit better, but apparently not all boards have the option to set the CCD used to initialize the host system.
Did you change the CPU governor on the host?

My board has 3 options for the CCDs:
Frequency
Gaming
Auto
I set it to Frequency but haven’t noticed any difference in behavior. Even when running plain Linux, not the VM, it uses any available CPU core, not just the non-V-Cache ones.

What I meant was: you pinned the cores of CCD0 for your VM, but your host OS also uses CCD0. With the new BIOS versions you can change that.

Prioritization of CCDs

Asus doesn’t follow AMD’s guidelines 100%, so I am not sure how efficient this will be. At the moment, all the tests I have done point to correct utilization of cores, and I haven’t even enabled isolation.

Here is the explanation of why I asked this:

Pinning and isolation of 7950x3d

Possible solution

These are my posts :)
And as I said, for now I am not isolating, as I don’t have any issues. It turns out the Linux scheduler does a great job. Also, kernel 6.3 has even better scheduling for the V-Cache processors, but I haven’t tried it, as it is in beta and I am getting a white screen as soon as I log in to KDE.

Yes, I just realized that this is your post :)

yes with Kernel 6.2 and 6.3, I’m also waiting for the fix

It works for me with the kernel parameter "amdgpu.sg_display=0".
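
After adding the parameter to the bootloader config and rebooting, it is worth confirming that the running kernel actually received it. A small sketch that parses a kernel command line; `kernel_param` is a hypothetical helper, and the sample string (including the root device) is invented for illustration. On a real host you would pass in the contents of `/proc/cmdline`.

```python
def kernel_param(cmdline, name):
    """Return the value of `name` in a kernel command line, or None if absent."""
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == name:
            return value
    return None


# Invented sample; on a real host use Path("/proc/cmdline").read_text() instead.
sample = "BOOT_IMAGE=/vmlinuz-linux root=/dev/sda2 rw amdgpu.sg_display=0"
print(kernel_param(sample, "amdgpu.sg_display"))  # prints 0
```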

Thanks for that, will try it today!

do you use 16 cores or 8 cores 16 threads? I get 97% single thread and exactly 50% multi-thread performance.

I use 4 cores per CCD. I also tried using only one CCD, but the latency doesn’t get any better; currently I’m at about 60 ns memory latency in AIDA64.

  <vcpu placement='static' current='16'>32</vcpu>
  <iothreads>1</iothreads>
  <cputune>
    <vcpupin vcpu='0' cpuset='1'/>
    <vcpupin vcpu='1' cpuset='17'/>
    <vcpupin vcpu='2' cpuset='9'/>
    <vcpupin vcpu='3' cpuset='25'/>
    <vcpupin vcpu='4' cpuset='2'/>
    <vcpupin vcpu='5' cpuset='18'/>
    <vcpupin vcpu='6' cpuset='10'/>
    <vcpupin vcpu='7' cpuset='26'/>
    <vcpupin vcpu='8' cpuset='3'/>
    <vcpupin vcpu='9' cpuset='19'/>
    <vcpupin vcpu='10' cpuset='11'/>
    <vcpupin vcpu='11' cpuset='27'/>
    <vcpupin vcpu='12' cpuset='4'/>
    <vcpupin vcpu='13' cpuset='20'/>
    <vcpupin vcpu='14' cpuset='12'/>
    <vcpupin vcpu='15' cpuset='28'/>
    <emulatorpin cpuset='7,23'/>
    <iothreadpin iothread='1' cpuset='9,25'/>
  </cputune>
  <os firmware='efi'>
    <type arch='x86_64' machine='pc-q35-8.0'>hvm</type>
    <firmware>
      <feature enabled='no' name='enrolled-keys'/>
      <feature enabled='yes' name='secure-boot'/>
    </firmware>
    <loader readonly='yes' secure='yes' type='pflash'>/usr/share/edk2/x64/OVMF_CODE.secboot.4m.fd</loader>
    <nvram template='/usr/share/edk2/x64/OVMF_VARS.4m.fd'>/var/lib/libvirt/qemu/nvram/win11-offg_VARS.fd</nvram>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv mode='custom'>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vpindex state='on'/>
      <synic state='on'/>
      <stimer state='on'/>
      <reset state='on'/>
      <vendor_id state='on' value='1234567890ab'/>
      <frequencies state='on'/>
    </hyperv>
    <pmu state='off'/>
    <vmport state='off'/>
    <smm state='on'/>
  </features>
  <cpu mode='host-passthrough' check='none' migratable='off'>
    <topology sockets='1' dies='2' cores='8' threads='2'/>
    <cache mode='passthrough'/>
    <feature policy='require' name='topoext'/>
    <feature policy='require' name='invtsc'/>
  </cpu>
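
One thing worth double-checking in a config like this is the arithmetic between `<vcpu>` and `<topology>`: 1 socket × 2 dies × 8 cores × 2 threads = 32, which is the maximum in `<vcpu>`, while `current='16'` is how many are actually brought up. As far as I know libvirt expects the topology product to match that maximum. A rough sketch of the check, using a trimmed copy of the fragment above; the `<domain>` wrapper and the function names are only there for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the config above, wrapped in <domain> so it parses on its own.
FRAGMENT = """
<domain>
  <vcpu placement='static' current='16'>32</vcpu>
  <cpu mode='host-passthrough' check='none' migratable='off'>
    <topology sockets='1' dies='2' cores='8' threads='2'/>
  </cpu>
</domain>
"""


def topology_total(domain):
    """Multiply sockets * dies * cores * threads from <cpu><topology>."""
    topo = domain.find("./cpu/topology")
    total = 1
    for key in ("sockets", "dies", "cores", "threads"):
        total *= int(topo.get(key, "1"))
    return total


def vcpu_counts(domain):
    """Return (maximum, current) vCPU counts from the <vcpu> element."""
    vcpu = domain.find("./vcpu")
    maximum = int(vcpu.text)
    current = int(vcpu.get("current", maximum))
    return maximum, current


domain = ET.fromstring(FRAGMENT)
print(topology_total(domain), vcpu_counts(domain))
```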

Works! Finally running on Kernel 6.3 which supposedly has optimizations for X3D CPUs

Cinebench single thread always runs way slower in my VM, same as Geekbench 6. 85% is optimistic in my case.
You get 60 ns latency inside the VM? That is amazing. I get about 60 ns on bare metal.

Yes inside the VM.
You’re right, I hadn’t tested R23 single thread yet. Single-thread R23 is not quite there yet.
I am using a 40-buck air cooler right now. It’s a good one, but when I get my AIO next week, I might reach 2000 points in single thread.

This is yet another configuration. It has the same 1% lows (Metro Exodus) as a single-CCD configuration but better multi-thread performance.

  <vcpu placement='static' current='16'>32</vcpu>
  <vcpus>
    <vcpu id='0' enabled='yes' hotpluggable='no'/>
    <vcpu id='1' enabled='yes' hotpluggable='yes'/>
    <vcpu id='2' enabled='yes' hotpluggable='yes'/>
    <vcpu id='3' enabled='yes' hotpluggable='yes'/>
    <vcpu id='4' enabled='yes' hotpluggable='yes'/>
    <vcpu id='5' enabled='yes' hotpluggable='yes'/>
    <vcpu id='6' enabled='yes' hotpluggable='yes'/>
    <vcpu id='7' enabled='yes' hotpluggable='yes'/>
    <vcpu id='8' enabled='no' hotpluggable='yes'/>
    <vcpu id='9' enabled='no' hotpluggable='yes'/>
    <vcpu id='10' enabled='no' hotpluggable='yes'/>
    <vcpu id='11' enabled='no' hotpluggable='yes'/>
    <vcpu id='12' enabled='no' hotpluggable='yes'/>
    <vcpu id='13' enabled='no' hotpluggable='yes'/>
    <vcpu id='14' enabled='no' hotpluggable='yes'/>
    <vcpu id='15' enabled='no' hotpluggable='yes'/>
    <vcpu id='16' enabled='no' hotpluggable='yes'/>
    <vcpu id='17' enabled='no' hotpluggable='yes'/>
    <vcpu id='18' enabled='no' hotpluggable='yes'/>
    <vcpu id='19' enabled='no' hotpluggable='yes'/>
    <vcpu id='20' enabled='no' hotpluggable='yes'/>
    <vcpu id='21' enabled='no' hotpluggable='yes'/>
    <vcpu id='22' enabled='no' hotpluggable='yes'/>
    <vcpu id='23' enabled='no' hotpluggable='yes'/>
    <vcpu id='24' enabled='yes' hotpluggable='yes'/>
    <vcpu id='25' enabled='yes' hotpluggable='yes'/>
    <vcpu id='26' enabled='yes' hotpluggable='yes'/>
    <vcpu id='27' enabled='yes' hotpluggable='yes'/>
    <vcpu id='28' enabled='yes' hotpluggable='yes'/>
    <vcpu id='29' enabled='yes' hotpluggable='yes'/>
    <vcpu id='30' enabled='yes' hotpluggable='yes'/>
    <vcpu id='31' enabled='yes' hotpluggable='yes'/>
  </vcpus>
  <iothreads>1</iothreads>
  <cputune>
    <emulatorpin cpuset='8,24'/>
    <iothreadpin iothread='1' cpuset='6,22'/>
  </cputune>
  <os firmware='efi'>
    <type arch='x86_64' machine='pc-q35-8.0'>hvm</type>
    <firmware>
      <feature enabled='no' name='enrolled-keys'/>
      <feature enabled='yes' name='secure-boot'/>
    </firmware>
    <loader readonly='yes' secure='yes' type='pflash'>/usr/share/edk2/x64/OVMF_CODE.secboot.4m.fd</loader>
    <nvram template='/usr/share/edk2/x64/OVMF_VARS.4m.fd'>/var/lib/libvirt/qemu/nvram/win11-offg-clone2_VARS.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv mode='custom'>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vpindex state='on'/>
      <synic state='on'/>
      <stimer state='on'/>
      <reset state='on'/>
      <vendor_id state='on' value='1234567890ab'/>
      <frequencies state='on'/>
    </hyperv>
    <pmu state='off'/>
    <vmport state='off'/>
    <smm state='on'/>
  </features>
  <cpu mode='host-passthrough' check='none' migratable='off'>
    <topology sockets='1' dies='2' cores='8' threads='2'/>
    <cache mode='passthrough'/>
    <feature policy='require' name='topoext'/>
    <feature policy='require' name='invtsc'/>
    <feature policy='disable' name='monitor'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
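
The `<vcpus>` list in this config enables IDs 0-7 and 24-31, which should add up to the 16 in `current='16'` and to 4 full cores on each guest die. Here is a quick sketch to verify that arithmetic. It assumes guest vCPU IDs are numbered threads first, then cores, then dies (so IDs 0-15 are die 0 and 16-31 are die 1 with `cores='8' threads='2'`); that numbering is my assumption, and `cores_per_die` is a made-up helper.

```python
def cores_per_die(enabled_ids, cores=8, threads=2):
    """Count fully enabled guest cores on each die.

    Assumes vCPU IDs run threads-first, then cores, then dies, so that
    IDs 0-15 form die 0 and 16-31 form die 1 for cores=8, threads=2.
    """
    enabled = set(enabled_ids)
    per_die = {}
    for vcpu in sorted(enabled):
        if vcpu % threads != 0:
            continue  # count each core once, at its first thread
        core = vcpu // threads
        die = core // cores
        # A core counts only if all of its threads are enabled.
        if all(core * threads + t in enabled for t in range(threads)):
            per_die[die] = per_die.get(die, 0) + 1
    return per_die


# IDs marked enabled='yes' in the <vcpus> list above.
enabled = list(range(0, 8)) + list(range(24, 32))
print(len(enabled))            # prints 16
print(cores_per_die(enabled))  # prints {0: 4, 1: 4}
```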
