Hi all,
I am currently studying how GPU benchmark performance is affected by a CPU-bound benchmark when the two are co-located on the same machine. A strange observation I got is that one benchmark, the CUDA build of NAMD, actually runs faster in a VM than in native execution under the same interference from the CPU-bound co-runner.
Here are my experiment settings:
Native: Ubuntu 18.04.2 LTS, 16 [email protected], 1 NVIDIA Tesla P100 PCIe 16GB GPU; both NAMD and the CPU-bound benchmark launch 16 threads and run simultaneously on the machine.
VM: built on the same native machine as above; the hypervisor is KVM. Two VMs are created, both running Ubuntu 18.04.2 LTS:
VM1 (runs NAMD) – 16 vCPUs (pinned one-to-one onto host cores; see the sketch after this list), 1 NVIDIA Tesla P100 PCIe 16GB GPU (direct PCI passthrough)
VM2 (runs CPU-bound benchmark) – 16 vCPUs (pinned one-to-one onto host cores)
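In case it matters, the one-to-one pinning can be expressed with virsh roughly as follows (just a sketch, assuming the VMs are libvirt domains named vm1 and vm2, which are placeholder names):

# Pin each of VM1's 16 vCPUs onto the matching host core (vm1 is a placeholder domain name)
for i in $(seq 0 15); do virsh vcpupin vm1 $i $i; done
# Same one-to-one pinning for VM2 (both VMs necessarily share the same 16 host cores)
for i in $(seq 0 15); do virsh vcpupin vm2 $i $i; done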
The command to run NAMD is ./namd2 +p 16 +devices 0 apoa1/apoa1.namd
The co-running CPU-bound program uses 1600% CPU (i.e., all 16 cores) when it runs alone, on either the native machine or the VM.
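(This can be confirmed with something like the following, assuming the sysstat tools are installed; <pid> is a placeholder for the co-runner's process ID:)

# Report the co-runner's CPU usage once per second; the %CPU column should sit near 1600
pidstat -u -p <pid> 1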
Here are the results (NAMD run time and GPU utilization):
Native:
Execution time: 85.1 seconds
GPU utilization: 21.5%
VM:
Execution time: 69.5 seconds
GPU utilization: 29.0%
Native execution of NAMD under interference from the CPU-bound program shows a longer execution time and lower GPU utilization, which contradicts the usual expectation that virtualization adds overhead rather than removing it. Does anyone know what could cause this, or how I can analyze this observation with more experiments?
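One experiment I am planning is to log per-second GPU utilization and per-core CPU usage during both the native and the VM runs, roughly like this (just a sketch; the 1-second sampling interval and the log file names are my own choices):

# Sample GPU utilization every second in the background (-s u = utilization metrics, -d 1 = 1 s interval)
nvidia-smi dmon -s u -d 1 > gpu_util.log &
# Sample per-core CPU usage every second
mpstat -P ALL 1 > cpu_util.log &
# Run NAMD while the CPU-bound co-runner is active, then stop the samplers
./namd2 +p 16 +devices 0 apoa1/apoa1.namd
kill %1 %2

The idea is to see whether the scheduling of NAMD's 16 threads alongside the co-runner looks different in the two cases while the GPU is being fed.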