Is it possible to direct all NVMe SSD interrupts to certain cores in Linux?

Long story short, I have a gaming VM that I’m trying to get maximum performance out of. One of the ways I’m doing that is to direct all interrupt sources to cores that aren’t busy running the VM. (I’m using cores 0-3 for host stuff.)

This is easy enough:

# Send all interrupts to cores 0-3 by default
echo 0000000f | tee /proc/irq/default_smp_affinity

# Map all current interrupt sources to cores 0-3
echo 0-3 | tee /proc/irq/*/smp_affinity_list

This works for everything EXCEPT interrupts coming from my NVMe SSDs. These appear to be pre-mapped evenly across all my cores:

$ cat /proc/irq/81/smp_affinity_list 
0-1,16-17
$ cat /proc/irq/82/smp_affinity_list
2-3,18-19
$ cat /proc/irq/83/smp_affinity_list
4-5,20-21
$ cat /proc/irq/84/smp_affinity_list
6-7,22-23
...
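
For reference, the per-queue NVMe interrupts show up in /proc/interrupts with names like nvme0q0, nvme0q1, and so on, so you can find their IRQ numbers with:

$ grep nvme /proc/interrupts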

If I try to change the affinity of these interrupt sources, I get this error:

bash: line 1: echo: write error: Input/output error
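
A minimal reproduction, from a root shell (IRQ 81 is one of the NVMe queue interrupts listed above; any of them behaves the same):

# fails with EIO even as root
echo 0-3 > /proc/irq/81/smp_affinity_list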

Is there simply nothing I can do about these? Any time disk activity goes up, my virtualization cores get interrupted to service the disk IRQs, causing stutters in the VM.


I’m on kernel 5.12.1.

Hardware:

  • Gigabyte Aorus X570 Master
  • Ryzen 5950X
  • 3x Corsair Force Series MP510 960GB (LVM on top of md raid-0)

I asked this question on Serverfault months ago but I haven’t received any answers, so I’m hopeful somebody on this forum may have some insight. I just started a bounty on the Serverfault post, if somebody wants to grab 250 points.

Have you considered hitting the kernel mailing list with this question? I know mailing lists are outdated and cumbersome to deal with, but maybe you can find and talk to the person who actually wrote the NVMe interrupt code.

A quick Google search suggests that perhaps Keith Busch would know where to head.
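
If you go down that road, the kernel tree ships a script that tells you whom to CC for a given file; I believe the NVMe queue IRQ setup lives in drivers/nvme/host/pci.c, so from a kernel source checkout:

$ ./scripts/get_maintainer.pl drivers/nvme/host/pci.c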

If you find a solution, we’d appreciate an update!

So here’s something interesting, from the irqbalance(1) man page:

IRQBALANCE_BANNED_CPUS

Provides a mask of CPUs which irqbalance should ignore and never assign interrupts to. If not specified, irqbalance uses the mask of isolated and adaptive-ticks CPUs on the system as the default value.

Have you looked into setting this environmental variable?

This seems to explain how to actually use it: IRQBALANCE_BANNED_CPUS explained and working on Ubuntu 14.04 | ForDoDone
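
If your distro’s irqbalance service reads /etc/default/irqbalance (that part is an assumption; check how your unit file passes environment variables), banning everything except cores 0-3 on a 32-thread CPU would look something like:

# /etc/default/irqbalance
# bits 4-31 set: never assign IRQs to CPUs 4-31
IRQBALANCE_BANNED_CPUS="fffffff0"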

That actually hadn’t occurred to me; I’ll do that!

I don’t think that will do anything. As far as I understand, irqbalance is a userland daemon that automatically balances IRQs across cores using the same /proc/irq interface. If I can’t do it manually, I don’t see why irqbalance would succeed.

I’ll give it a shot anyway though, just to be sure.


Meanwhile though, I found the documentation for the isolcpus kernel command line option:

https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html?highlight=isolcpus

Apparently one of the things it can do is:

managed_irq

Isolate from being targeted by managed interrupts
which have an interrupt mask containing isolated
CPUs. The affinity of managed interrupts is
handled by the kernel and cannot be changed via
the /proc/irq/* interfaces.

So I tried adding this to my kernel command line: isolcpus=managed_irq,8-31, aaaaand… it did absolutely nothing. Interrupts are still evenly spread across my cores, even the ones I’m allowed to move.
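
For anyone following along, I set it via GRUB (assuming a GRUB-based distro; the regen command is update-grub on Debian/Ubuntu, grub2-mkconfig elsewhere):

# /etc/default/grub -- append to the existing options
GRUB_CMDLINE_LINUX_DEFAULT="... isolcpus=managed_irq,8-31"

# then regenerate the config and reboot
sudo update-grub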

Have you tried vfio-isolate already?

So I have very little working knowledge of IRQ stuff (I only started looking into it when you brought it up the other day), but I wonder if for NVMe the IRQ stuff is vestigial and we’re looking in the wrong place, since some adaptive/hybrid polling scheme might be what is actually being used now.

Here’s what brought this up:

Unfortunately I don’t know enough yet to form a better question.
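
One thing that might be worth checking: the block layer exposes the polling knobs per device in sysfs (nvme0n1 here is a placeholder for whatever your device is, and as far as I understand, polling only applies to I/O submitted with the high-priority flag anyway):

# 1 = polling enabled for this queue, 0 = disabled
$ cat /sys/block/nvme0n1/queue/io_poll

# -1 = classic busy-poll, 0 = adaptive hybrid, >0 = fixed sleep in usecs
$ cat /sys/block/nvme0n1/queue/io_poll_delay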
