X570 Taichi - AER Cap Enabled causing errors?

Hi all,
I have an X570 Taichi and am trying to pass through a GPU to a VM within UnRAID. This has worked before, but now (potentially after a BIOS upgrade to the latest version, 2.70) it does not work, and instead I get these errors:

Dec 25 12:33:10 Tower kernel: pcieport 0000:00:03.2: AER: Uncorrected (Non-Fatal) error received: 0000:00:03.2
Dec 25 12:33:10 Tower kernel: pcieport 0000:00:03.2: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
Dec 25 12:33:10 Tower kernel: pcieport 0000:00:03.2: device [1022:1483] error status/mask=00100000/00000000
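
For anyone trying to read the status/mask value in the last log line: the first number is the AER Uncorrectable Error Status register, and each set bit names one error type. Here is a minimal Python sketch for decoding it, assuming the standard PCIe bit layout for that register. The function name `decode_uncorrectable` is made up for illustration, and only the more common bits are listed.

```python
from typing import List

# Bit positions in the PCIe AER Uncorrectable Error Status register
# (assumed from the PCIe spec; not every defined bit is listed here).
UNCORRECTABLE_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request Error",
    21: "ACS Violation",
}


def decode_uncorrectable(status_hex: str) -> List[str]:
    """Return the names of the error bits set in a hex status value."""
    status = int(status_hex, 16)
    return [
        name
        for bit, name in sorted(UNCORRECTABLE_BITS.items())
        if status & (1 << bit)
    ]


# Status value taken from the log above.
print(decode_uncorrectable("00100000"))
```

With the value from the log, only bit 20 is set, which is the Unsupported Request Error bit. The mask of 00000000 means none of the error types are masked on that port.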

A Windows VM will boot but hang on the Windows boot screen, and a Linux VM will just crash after a few minutes.
The GPU works perfectly fine when AER Cap is set to Disabled, but then the IOMMU groupings are a mess.
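
To compare the IOMMU groupings with AER Cap enabled and disabled, the groups can be read straight from sysfs on the host. This is a rough sketch, assuming a Linux host that exposes `/sys/kernel/iommu_groups`. The function name `iommu_groups` and its `root` parameter are just for illustration.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group number: sorted list of PCI addresses} read from sysfs."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        # IOMMU is disabled, or this is not a Linux host.
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(
                dev.name for dev in devices_dir.iterdir()
            )
    return groups


# Print one line per group, in numeric order.
for number, devices in sorted(iommu_groups().items()):
    print(f"group {number}: {', '.join(devices)}")
```

Running it once with each BIOS setting and diffing the output should show exactly which devices end up lumped together.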

I’ve downgraded back to the BIOS version that last worked, 2.10, but this no longer seems to work either. I’ve also moved the GPU to a different slot.

If anyone has any ideas or suggestions, I’d be really, really grateful!
Thanks! :grinning:

Is AER Cap disabled the same as booting with the pci=noaer option?

I have a temperamental onboard USB controller that throws up the same type of Uncorrected (Non-Fatal) errors unless I boot with noaer. Of course, noaer suppresses everything across the board, but I don’t recall it changing my IOMMU groups at all.

FWIW I found a nice guide on how to suppress only a targeted type of error on a single device, rather than all errors on everything, and I use that instead.

I can’t boot into Windows 10 with it enabled, not sure if that helps confirm an issue exists.

Sorry to bring up this old thread, but I am running 4.60 on my X570 Taichi, and disabling both ACS and AER Cap is the only thing that allowed me to pass through any GPU in the top PCIe slot.

I have no issues with passthrough to a Windows VM with an Nvidia P620 in the middle slot with ACS and AER Cap on, but anything in the top slot fails.

It makes no sense, as I didn’t have this problem with an X470 K4 Gaming, although I did have to set pcie_acs_override=downstream as a boot option for both GPUs to function properly in separate VMs. That is what I’m trying to avoid by getting proper IOMMU groupings with ACS enabled in the BIOS.
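
When juggling these boot options it is easy to lose track of which ones the running kernel actually got. Here is a small sketch that checks the kernel command line for pci=noaer and pcie_acs_override, assuming a Linux host with `/proc/cmdline`. The helper name `active_flags` is made up, and as far as I know pcie_acs_override only exists on kernels that carry the ACS override patch.

```python
from pathlib import Path


def active_flags(cmdline):
    """Return the AER/ACS-related parameters present in a kernel command line."""
    found = []
    for param in cmdline.split():
        key, _, value = param.partition("=")
        # pci= takes a comma-separated list, e.g. pci=nomsi,noaer
        if key == "pci" and "noaer" in value.split(","):
            found.append("pci=noaer")
        elif key == "pcie_acs_override":
            found.append(param)
    return found


proc = Path("/proc/cmdline")
if proc.exists():
    print(active_flags(proc.read_text()) or "neither flag is set")
```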

Anyway, has anyone been able to get a top-slot GPU to work properly without disabling ACS and AER Cap? I am sure it’s either a goofy BIOS option I haven’t toggled or a grub boot flag I haven’t enabled.

Thanks in advance.

I got it working. I found another thread on here that mentioned specifying the PCIe generation per slot to fix similar issues. I now have AER Cap and ACS enabled with both GPUs working. It seems like an odd thing to change, but now either my RX 480 or my 5700 XT works perfectly in the top slot with a Linux Mint VM, and the P620 works in the middle slot with Windows 10. It also seemed to fix the issue where my i350-T4 would randomly disable SR-IOV. I also disabled the onboard Intel NIC. I am super happy with the Taichi after these changes.

It also works properly without the acs_override flag in grub. I hope this helps someone else in a similar situation.
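
For anyone who wants to confirm what a slot actually negotiated after forcing the PCIe generation in the BIOS, the link speed and width can be read from sysfs. This is a minimal sketch, assuming a Linux host whose kernel exposes the `current_link_speed` and related attributes. The helper name `link_info` is hypothetical, and the address below is just the root port from the log in the first post, so substitute your own.

```python
from pathlib import Path

LINK_ATTRS = (
    "current_link_speed",
    "max_link_speed",
    "current_link_width",
    "max_link_width",
)


def link_info(address, root="/sys/bus/pci/devices"):
    """Return the PCIe link attributes for one device (None where missing)."""
    dev = Path(root) / address
    info = {}
    for name in LINK_ATTRS:
        attr = dev / name
        info[name] = attr.read_text().strip() if attr.is_file() else None
    return info


# Example address only; use the address of your own GPU or root port.
print(link_info("0000:00:03.2"))
```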