Proxmox vs XCP-NG performance

I've been evaluating Proxmox & XCP-NG for MSSQL/Windows workloads and, long story short, I can't replicate the performance of XCP-NG with Proxmox.
Whilst fio, Geekbench & other benchmarks were fairly similar, when it came to the actual workloads that will eventually run on these servers the difference was staggering.

MSSQL ETL task
Current server: Xeon E5-2689 v4 on VMware 6.7 (OVH)
40m 18s

Dell R740, 2 x Xeon Gold 6152, RAID 10 SAS SSD on PERC H730
Proxmox 7.3-4 - VM 16 vCPU, 32GB RAM
41m 06s

XCP-NG 8.2.1 - VM 16 vCPU, 32GB RAM
17m 20s

My Proxmox VM settings

agent: 1
bios: ovmf
boot: order=scsi0;net0;ide0;ide2
cores: 16
cpu: host
efidisk0: SSD-R10:vm-200-disk-1,efitype=4m,pre-enrolled-keys=1,size=528K
ide0: none,media=cdrom
machine: pc-q35-7.1
memory: 32768
meta: creation-qemu=7.0.0,ctime=1663398507
name: Win-Srv-22-01
net0: virtio=AE:9A:D1:19:00:E7,bridge=vmbr0,firewall=1,tag=25
numa: 0
ostype: win11
scsi0: SSD-R10:vm-200-disk-2,cache=writeback,discard=on,iothread=1,size=64G
scsi1: SSD-R10:vm-200-disk-3,discard=on,iothread=1,size=64G
scsi2: SSD-R10:vm-200-disk-4,discard=on,iothread=1,size=128G
scsi3: SSD-R10:vm-200-disk-5,discard=on,iothread=1,size=64G
scsihw: virtio-scsi-single
smbios1: uuid=38eb0316-63e2-4dbb-9e1a-e0f45bc1d220
sockets: 2
tags: windows
tpmstate0: SSD-R10:vm-200-disk-0,size=4M,version=v2.0
vmgenid: 5a31f7fb-22dd-4e1a-adcc-136fcba06c34

uname -a
Linux pve02 5.15.83-1-pve #1 SMP PVE 5.15.83-1 (2022-12-15T00:00Z) x86_64 GNU/Linux

I've tried different CPU types in PVE with minimal change in results.
Any recommendations on what to try next?

Dave

I've not used PVE, but it uses QEMU/KVM, so have you tried the "host-passthrough" option? This should pass along the host CPU flags, so you should have the same instructions available (well, detected; I think they're always available).
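From what I can tell, the Proxmox equivalent is the "host" CPU type. A rough sketch, assuming VM ID 200 carried over from the config dump above:

qm set 200 --cpu host      # expose the host's CPU model and flags to the guest
qm config 200 | grep cpu   # check what the VM is currently set to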

Maybe it's not the CPU though, so have you tried running a disk/IO benchmark? What's your backing device for the VM storage on Proxmox?
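Something like this fio run on the Proxmox host, pointed at the VM storage, would be a reasonable starting point (file name, size and job counts are placeholders; 8k blocks roughly match SQL Server's page size):

fio --name=sqlish-randrw --filename=/mnt/test/fio.dat --size=4G \
    --rw=randrw --rwmixread=70 --bs=8k --ioengine=libaio --iodepth=32 \
    --direct=1 --numjobs=4 --runtime=60 --time_based --group_reporting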

Backing storage is 6 x 960 GB SAS SSDs connected to a Dell PERC H730 card in RAID 10.
I'm digging into BIOS settings at the moment to see if I've missed something obvious.

In Proxmox I'm using raw format with LVM-thin.

Have also now tried:

  • disabling memory ballooning
  • changing the controller cache to writeback
  • setting SQL Server trace flag T8038
  • disabling "Use tablet for pointer"

Tried the same ETL process on a different node that has 2 x Xeon Silver 4114 CPUs.
The initial run took 46m, and went down to 23m after disabling all CPU vulnerability mitigations.
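For reference, that was done via the kernel command line - a rough sketch below (it's a security trade-off, so treat it as a diagnostic step rather than a fix):

# /etc/default/grub on the PVE host
GRUB_CMDLINE_LINUX_DEFAULT="quiet mitigations=off"
update-grub && reboot

# confirm what's still active afterwards
grep . /sys/devices/system/cpu/vulnerabilities/*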

XCP-NG seems less affected by CPU vuln mitigation than Proxmox - at least for Windows workloads.

Have you enabled huge pages?
Is the test CPU or disk bound?

Aren't they enabled by default? I'll check later when I'm doing more testing.
Oddly, the task doesn't really max out CPU or disk.

Nope …
You need to configure them in your kernel boot params, then you can use them in your VMs:
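A rough sketch, assuming a GRUB-booted PVE host and the VM ID 200 from the config above (page counts are just an illustration for a 32 GB guest):

# /etc/default/grub - reserve 1 GiB hugepages at boot
GRUB_CMDLINE_LINUX_DEFAULT="quiet default_hugepagesz=1G hugepagesz=1G hugepages=32"
update-grub && reboot

# verify the pool exists
grep Huge /proc/meminfo

# then let the VM use it (1024 = 1 GiB pages; I believe PVE also wants NUMA enabled on the VM for this)
qm set 200 --numa 1 --hugepages 1024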

Any progress @davemcl, or did you decide to stick with XCP-NG for performance in the end?

Went with Proxmox and did lots of test workloads; however, there's nothing quite like production ramping up, and I've had lots of issues with VMs locking up under load. Basically we get SCSI reset messages in the VM and obscure errors on the PVE host.
There are a number of forum posts about it, and others are experiencing similar issues.
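For anyone searching for the same symptoms: the host-side errors turn up in the journal, and the guest-side resets in the Windows System event log (from the vioscsi driver, if I remember right). Roughly:

# on the PVE host
journalctl -b | grep -iE 'qemu|scsi|reset'

# in the guest: Event Viewer -> Windows Logs -> System, filtered on the vioscsi source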

Did you install the virtio-win drivers in the VM?

Have you tried increasing the number of I/O threads? Have you tried pinning the CPUs?
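For what it's worth, the config above already has one I/O thread per disk (iothread=1 with virtio-scsi-single), so pinning would be the next thing to try. If your PVE version has the affinity option (I believe 7.3 added it), a sketch would be:

qm set 200 --affinity 0-15   # pin the guest's vCPUs to host cores 0-15, ideally all on one NUMA node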

You should probably be running with cache=none on all those volumes. Let Windows manage its own caching.
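If I'm reading the config above right, only scsi0 sets cache=writeback explicitly (the Proxmox default for the others should already be none), so it's really just that one line to change - either directly in /etc/pve/qemu-server/200.conf or with something like this (the full drive string has to be repeated):

qm set 200 --scsi0 SSD-R10:vm-200-disk-2,cache=none,discard=on,iothread=1,size=64G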

You should probably be running with hugepages, as someone else mentioned. It's most likely using 2 MiB transparent hugepages out of the box, but you can switch to 1 GiB static hugepages by adding default_hugepagesz=1G hugepagesz=1G hugepages=32 to your kernel command line and this to your domain XML:

<memoryBacking>
	<hugepages/>
</memoryBacking>

I’m not familiar with Proxmox’s VM configuration file format but if you can figure out how to dump the domain XML I can take a closer look. Cheers.

Yes, the latest virtio-win drivers are installed; plenty of others are experiencing the same issue.
It's being tracked on the virtio side here: