QEMU/libvirt SATA disk performance

This morning I decided to finally dig into disk tuning for QEMU, and it turns out that the libvirt/QEMU defaults are totally junk for SSDs. By default a single IO thread handles all IO operations, so requests block each other and disk performance suffers.

Before:
[benchmark screenshot]

After:
[benchmark screenshot]
I am passing an entire disk into the VM, an older Samsung 840 EVO 1TB SSD, and I switched to using SCSI rather than AHCI in the guest as its performance is much more predictable.

Here is the magic:

-object iothread,id=iothread1 \
-device virtio-scsi-pci,id=scsi1,iothread=iothread1 \
-drive if=none,id=hd1,file=/dev/disk/by-id/ata-Samsung_SSD_840_EVO_1TB_xxxx,format=raw,aio=threads \
-device scsi-hd,bus=scsi1.0,drive=hd1,bootindex=1

A separate virtio-scsi controller should be created for each additional disk so that each gets its own IO thread, as in the sketch below.
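For example (device paths and IDs here are placeholders, not my actual setup), a second disk on its own controller and IO thread would look something like this:

-object iothread,id=iothread2 \
-device virtio-scsi-pci,id=scsi2,iothread=iothread2 \
-drive if=none,id=hd2,file=/dev/disk/by-id/ata-SECOND_DISK,format=raw,aio=threads \
-device scsi-hd,bus=scsi2.0,drive=hd2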

This could likely be tuned even further by telling QEMU to report 4K blocks instead of 512-byte sectors, but I will need to reinstall Windows to test that.
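If anyone wants to try it, the block size is a property of the scsi-hd device, so it should just be a matter of something like this (untested on my end):

-device scsi-hd,bus=scsi1.0,drive=hd1,bootindex=1,logical_block_size=4096,physical_block_size=4096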

I also tested all the different cache modes. Write performance improves if cache is set to none, but since my use case involves minimal writes, the default used by QEMU gives the best read performance.
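For reference, the cache mode goes on the -drive line; cache=none (which bypasses the host page cache) would look like this, with the rest of the options unchanged:

-drive if=none,id=hd1,file=/dev/disk/by-id/ata-Samsung_SSD_840_EVO_1TB_xxxx,format=raw,aio=threads,cache=none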

I will leave it up to the reader to figure out how to translate this to libvirt :slight_smile:
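That said, a rough (untested) sketch of the libvirt equivalent would be along these lines:

<domain>
  ...
  <iothreads>1</iothreads>
  ...
  <devices>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <driver iothread='1'/>
    </controller>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' io='threads'/>
      <source dev='/dev/disk/by-id/ata-Samsung_SSD_840_EVO_1TB_xxxx'/>
      <target dev='sda' bus='scsi'/>
      <boot order='1'/>
    </disk>
  </devices>
</domain>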

Well done!

I think you can improve performance even further by applying “queues” to the VirtIO-SCSI device. According to the VFIO Discord you can add as many queues as there are vCPU threads assigned to the VM.
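For anyone sticking to the raw QEMU command line from the original post, the equivalent (as far as I know) is the num_queues property on the controller, e.g.:

-device virtio-scsi-pci,id=scsi1,iothread=iothread1,num_queues=4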

The guest is running on a ZFS zvol (4K block size) with writeback caching enabled.
Have a look at my current config (it's libvirt XML, though). Notice how I strictly separated the CPU pins and dedicated 2 cores (4 threads) to the emulator and the IO thread. It did in fact improve performance in my case. At the bottom I enabled 4 queues, one for each vCPU added to the VM.
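For reference, creating a zvol with a 4K block size looks roughly like this (the pool/volume name is taken from the config below; the size is just an example):

zfs create -V 200G -o volblocksize=4K hdd/VM/gamelib-01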

EDIT: Benchmarks were run on a WD Red 3TB, with the ZFS write cache (SLOG) on a Samsung 950 Pro 256GB. I didn't specifically test latency back then. Will pick this up one more time and report back. :slight_smile:

<domain>
  ...
  <vcpu placement='static'>4</vcpu>
  <iothreads>1</iothreads>
  <cputune>
    <vcpupin vcpu='0' cpuset='2'/>
    <vcpupin vcpu='1' cpuset='6'/>
    <vcpupin vcpu='2' cpuset='3'/>
    <vcpupin vcpu='3' cpuset='7'/>
    <emulatorpin cpuset='0-1,4-5'/>
    <iothreadpin iothread='1' cpuset='0-1,4-5'/>
  </cputune>
  ...
  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
    <topology sockets='1' cores='2' threads='2'/>
  </cpu>
  ...
  <devices>
  ...
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='writeback' io='threads'/>
      <source dev='/dev/zvol/hdd/VM/gamelib-01'/>
      <target dev='sdd' bus='scsi'/>
      <address type='drive' controller='0' bus='0' target='0' unit='3'/>
    </disk>
    ...
    <controller type='scsi' index='0' model='virtio-scsi'>
      <driver queues='4' iothread='1'/>
      <address type='pci' domain='0x0000' bus='0x0b' slot='0x00' function='0x0'/>
    </controller>
    ...
  </devices>
</domain>

A little heads-up regarding latency with the above configuration. Since I am using writeback caching, I've included 2 runs per disk to show how speeds and latency behave when data is being read from RAM. Further runs after the 2nd one did not improve the results any further.

NVMe Latency (Samsung 950 Pro 256 GB)

1st Run: [HD Tune screenshot: hdtune_nvme_1]

2nd Run: [HD Tune screenshot: hdtune_nvme_2]

HDD Latency (WD Red 3 TB)

1st Run: [HD Tune screenshot: hdtune_HDD_1]

2nd Run: [HD Tune screenshot: hdtune_HDD_2]

I’m trying to use these settings and am seeing greatly improved HD Tune results; however, SQL performance and benchmarking with HammerDB have gone way down compared to the SATA controller I was using. It seems to hit a limit on how fast it can go (250,000 TPM), and nothing makes it go any faster. In Device Manager, when using the SATA controller, each drive is on its own bus, but with the VirtIO-SCSI driver they all seem to be on bus 0. Could one of you check your config to see if that may be the problem for me?

Not really an answer to your question, but if you really care about disk performance, you could try passing through the SATA controller your drives are physically attached to.
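In libvirt terms that's a PCI hostdev entry for the controller, something like this (the PCI address is just an example, check lspci for the real one, and make sure the host isn't using any disks attached to that controller):

<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x00' slot='0x17' function='0x0'/>
  </source>
</hostdev>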