I’ve been trying the 6.11 kernel since it plays better with the nvidia 560 drivers but realised that it has changed my iommu groups.
I’ve confirmed that 6.10.10 gives me the same group layout as the current LTS (6.6.52). I also haven’t found any similar reports for this particular kernel update.
I went from a total of 14 groups, with my secondary GPU conveniently isolated, to only eight groups, with the secondary GPU now grouped with my main NVMe storage among other devices.
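For anyone wanting to compare layouts between kernels, here is a minimal sketch that prints each group and its devices. It assumes the usual sysfs layout under /sys/kernel/iommu_groups; the function name `iommu_groups` and its `root` parameter are only illustrative.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses in it."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        # IOMMU disabled, or sysfs not available: nothing to report.
        return groups
    for group_dir in base.iterdir():
        devices = group_dir / "devices"
        if group_dir.name.isdigit() and devices.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices.iterdir())
    return groups


if __name__ == "__main__":
    for number, devices in sorted(iommu_groups().items()):
        print(f"group {number}: {' '.join(devices)}")
```

Saving the output under each kernel and diffing the two files shows exactly which devices moved.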
Do any of you know if this is the intended behaviour or should I file a bug report? Where is the proper channel to get the info and logs to the correct dev?
There’s still a Bugzilla instance at bugzilla.kernel.org and the process that gets more than zero attention from the Linux Kernel Mailing List is:
Find the mainline kernel commit that brought about this change
Write a Bugzilla entry that specifies your hardware and the commit that caused the regression, and that makes the case that you need 6.11 for the nVidia 560 drivers as the reason for them to fix it in 6.11
Copy the content of the bug report to the LKML and the PCI/PCIE and ACPI sub-list
That ‘find the commit’ sounds daunting, but you get to use a binary chop to subdivide the search space into a good and bad half and then resume search on the bad half to find the culprit. If there’s 4000 or so commits, it should take 12 iterations of the test to complete, not 4000 iterations.
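The arithmetic behind that estimate, as a rough sketch: each round halves the candidate range, so the number of rounds is about log2 of the commit count. The 4000 figure is just the example number from above; `git rev-list --count good..bad` should give the real count for your range.

```python
import math


def bisect_steps(commit_count):
    """Approximate upper bound on the build-and-test rounds a binary search needs."""
    return math.ceil(math.log2(commit_count))


print(bisect_steps(4000))  # 12
```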
Step 0 – confirm that it’s not caused by patches added by your Linux distribution: clone the mainline kernel git tree, copy the config from /boot/ and build a local edition using your distribution’s step by step instructions. Install and reboot to see the IOMMU groupings.
Step 1 – find the last-known good marker, say the hash from the commit that’s tagged 6.10.10, and a bad commit hash, say the one that’s tagged 6.11 (git version tags are like symlinks to a commit hash).
Step 2 – check the git bisect documentation then begin with git bisect start [bad hash] [good hash]
Step 3 (and loop) – compile this revision, test it, and report to git bisect either git bisect good or git bisect bad.
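Since every round needs a reboot, the good/bad call has to be made by hand after booting the test kernel. A small sketch of a helper that makes the call consistent: it assumes the standard /sys/bus/pci/devices/<address>/iommu_group link, and the script name, function names and PCI addresses are all hypothetical. Pass the GPU address first, followed by any sibling functions (such as its audio device) that are allowed to share the group.

```python
import sys
from pathlib import Path


def group_members(pci_address, pci_root="/sys/bus/pci/devices"):
    """List every device sharing an IOMMU group with pci_address."""
    devices = Path(pci_root) / pci_address / "iommu_group" / "devices"
    return sorted(p.name for p in devices.iterdir())


def verdict(members, allowed):
    """Return 'good' when the group holds nothing beyond the allowed addresses."""
    return "good" if set(members) <= set(allowed) else "bad"


if __name__ == "__main__":
    # Hypothetical usage: python3 check_group.py 0000:0a:00.0 0000:0a:00.1
    allowed = sys.argv[1:]
    if allowed:
        members = group_members(allowed[0])
        print(f"group members: {' '.join(members)}")
        print(f"suggested: git bisect {verdict(members, allowed)}")
    else:
        print("usage: check_group.py GPU_ADDRESS [ALLOWED_SIBLING ...]")
```

It only prints a suggestion; you still run git bisect good or git bisect bad yourself in the kernel tree.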
When completed, there should be a commit that causes these IOMMU groups to fall into a different layout than you’ve previously desired. There wasn’t anything highlighted in the summaries of the 6.11 merge window I read (LWN first half / LWN second half) that points to an ACPI or PCIE change.
I’m seeing the same thing, wondering if you reported it.
My GPU is in the same IOMMU group as other devices that I cannot bind to vfio-pci (pass-through) on kernel 6.11.3 but the GPU is in its own IOMMU group on kernel 6.10.12.
This is why I gave up on the whole thing. It’s so niche and bleeding edge, it’s not worth it if kernel updates break it. (I run Debian Testing). Still looking for a 4k/120fps KVM switch.