dmesg | grep IOMMU shows the following:
[ 1.240358] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[ 1.245782] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 1.247402] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[ 1.280673] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel [email protected]
So I assume it’s enabled?
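As a second check besides dmesg, here is a minimal sketch that counts IOMMU groups in sysfs. It assumes the kernel exposes groups under /sys/kernel/iommu_groups when the IOMMU is actually in use; the function name iommu_group_count is just something made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count the IOMMU groups under a sysfs-style directory.
# Assumption: an active IOMMU populates /sys/kernel/iommu_groups with one
# numbered subdirectory per group.
iommu_group_count() {
    dir="${1:-/sys/kernel/iommu_groups}"
    if [ -d "$dir" ]; then
        find "$dir" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' '
    else
        echo 0
    fi
}

count=$(iommu_group_count)
if [ "$count" -gt 0 ]; then
    echo "IOMMU looks active: $count groups"
else
    echo "no IOMMU groups found"
fi
```

If the count is above zero, the IOMMU is most likely enabled and in use, not just detected.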
I do want to add that yesterday I tested my 5700 XT on the Windows 10 install I keep on a spare SSD for testing. I just wanted to run 3DMark and benchmark the card to see if it’s performing as it should. I then played a round of Call of Duty: Warzone with my buddy, and when I alt-tabbed out to Chrome, my screen flashed black and the machine locked up (no mouse or keyboard, I couldn’t even toggle Caps Lock). Windows recovered after a little over a minute, and Event Viewer reported two TDRs caused by the AMD GPU driver.
Thinking about it, that crash was very similar to this crash on Linux. Same symptoms: the screen flashed black, hard lock, couldn’t toggle Caps Lock. The only difference from the Windows “crash” is that after a minute or so the machine rebooted itself instead of recovering. On Linux I was alt-tabbing out of Warcraft 3 Reforged after I lost my game. I also had CoreCtrl running to set a fixed manual fan speed, and it was monitoring temps and whatnot.
I understand Linux doesn’t handle GPU crashes as gracefully, so that’s why I asked whether the MCE could be related to the GPU. According to that Wikipedia article, MCEs can come from I/O, memory, the CPU, or the bus. If the GPU hangs, how would Linux handle it? I’d assume as either a kernel panic or an MCE, depending on how it crashes? When I was doing my initial research into Navi stability on Linux, I also noticed people talking about their screens going black and their machines randomly rebooting, very similar symptoms to what I just went through, which makes me think it was an MCE for them too. So your statement about the IOMMU has me wondering whether that’s the cause.
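To see whether the kernel actually logged a machine check around the crash, something like the sketch below might help. It assumes systemd’s journalctl is available and that the journal is persistent (otherwise the previous boot’s log is gone); filter_mce is a made-up helper name, and the grep patterns are only a guess at what MCE-related lines look like.

```shell
#!/bin/sh
# Hypothetical helper: keep only log lines that look MCE-related.
# The patterns are an assumption, not an exhaustive list.
filter_mce() {
    grep -i -E 'mce:|machine check|hardware error'
}

# Check the previous boot (-b -1) and the current one (-b 0), since after a
# hard reset the crash itself may never have been written to disk.
if command -v journalctl >/dev/null 2>&1; then
    for boot in -1 0; do
        echo "== boot $boot =="
        journalctl -k -b "$boot" --no-pager 2>/dev/null | filter_mce \
            || echo "no MCE-looking lines found"
    done
fi
```

If nothing shows up for either boot, that doesn’t rule out an MCE, it may just mean the log never made it to disk before the reset.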
Should I disable it in the BIOS (my BIOS has an option for it, according to my manual) or with the kernel parameter amd_iommu=off?
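If I go the kernel parameter route, a small sketch like this could confirm after the reboot that the setting was picked up. It reads /proc/cmdline; amd_iommu_setting is a made-up function name. On GRUB-based distros the parameter usually goes into GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, but that part is an assumption about the setup and isn’t shown here.

```shell
#!/bin/sh
# Hypothetical helper: print the value of amd_iommu= from a kernel command
# line, or "unset" if the parameter is absent. With no argument it reads
# the running kernel's command line from /proc/cmdline.
amd_iommu_setting() {
    cmdline="${1:-$(cat /proc/cmdline 2>/dev/null)}"
    for arg in $cmdline; do
        case "$arg" in
            amd_iommu=*)
                echo "${arg#amd_iommu=}"
                return 0
                ;;
        esac
    done
    echo "unset"
}

echo "amd_iommu: $(amd_iommu_setting)"
```

Seeing "off" here after rebooting would mean the parameter reached the kernel; "unset" would mean the bootloader config didn’t take.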