SATA AHCI IO_PAGE_FAULT on ASRock Rack X399D8A-2T

Hi,

Under a mostly idle config (rootfs/swap/lvm/mdraid) I keep getting this error after about a day…

ahci 0000:01:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x000000006af54000 flags=0x0000]

after which none of the disks attached to the X399 controller are available anymore. It seems like the whole controller is gone…

root:~# ls -d /sys/devices/pci0000\:*
pci0000:00/ pci0000:40/

I cannot find any similar reports…

After the first crash I switched one leg of the mdraid to the NVMe,
which has been rock solid so far.

Is there any way to reset just the SATA controller without resetting the whole box?
Does anyone have an idea what might be causing this, or how I might go
about finding the cause?

greetings
/moddie

hardware:
Threadripper 1950X
ASRockRack X399D8A-2T
BIOS Version: P1.30
Release Date: 08/08/2019
root:~# lspci | egrep -i 'sata|ahci|lsi|asm'
06:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 02)
0c:00.2 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
41:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor] (rev 02)
43:00.2 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)

It looks like your AHCI controller is faulty. I recommend contacting the manufacturer if the board is still under warranty.
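
On the question of resetting just the SATA controller without rebooting: one thing that might be worth trying is removing the PCI function through sysfs and then rescanning the bus. Below is a minimal sketch, assuming the address 0000:01:00.1 from the log line in the original post, root privileges, and that nothing critical (rootfs, swap) still depends on the disks behind that controller. The function name pci_remove_rescan and the SYSFS_ROOT variable are made up for illustration; only the per-device "remove" file and the bus-wide "rescan" file are standard kernel sysfs interfaces. There is no guarantee the controller comes back after an IOMMU fault.

```shell
#!/bin/sh
# Hypothetical helper: detach one PCI function and ask the kernel to rescan.
# SYSFS_ROOT is an illustrative override (defaults to the real /sys).
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

pci_remove_rescan() {
    dev="$1"   # full PCI address, e.g. 0000:01:00.1
    node="$SYSFS_ROOT/bus/pci/devices/$dev"

    if [ -d "$node" ]; then
        # Detach the device (and its driver) from the kernel's PCI tree.
        echo 1 > "$node/remove" || return 1
        # Short pause before re-enumeration (the length is an arbitrary guess).
        sleep 1
    else
        echo "$dev not present, rescanning only" >&2
    fi

    # Re-enumerate all PCI buses; the device is probed again if it responds.
    echo 1 > "$SYSFS_ROOT/bus/pci/rescan"
}

# Usage (as root): pci_remove_rescan 0000:01:00.1
```

If the function has already vanished from the bus, only the rescan step applies. Checking dmesg afterwards should show whether the ahci driver probed the device again.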