TR System Hangs When GPU Flagged for Passthrough

Hey all, new account on the forum, but I have been browsing for a while.
My current specs are:

TR 1950X
ASRock X399M Taichi
2x8GB Corsair Vengeance DDR4
AMD Radeon HD 6450 (first slot)
Powercolor AMD 5700XT Red Devil (second slot)
EVGA G3 850W
2x500GB Samsung SSD
WD 1TB M.2 SSD

The problem:

The system hangs, seemingly at random, when I try to enable the 5700XT for passthrough, and the only way out is to shut the system down. After that, it goes into a boot loop. The only fix I have found is to unplug the computer, let it sit for a few minutes, and then plug it back in. After this it boots, but the BIOS settings are reset.

I have tried this on several different platforms: first unRAID, then Proxmox, then ESXi. Always the same issue. On unRAID I assumed it was a software issue, so I made the switch. Then it happened on Proxmox, so I figured it was a manual config error. With ESXi, it happened again.

Since ESXi is the last operating system I installed, I left it on overnight and did not have any hangs. However, this morning when I went into ESXi to enable the 5700XT for passthrough, it reached the 95% mark and then the entire system hung. So I figure the hang is related to enabling passthrough?

Steps I have taken:

  • Updated BIOS
  • Disabled anything C-State related in BIOS
  • Reseated RAM, GPUs
  • Tried different software

The cooling situation:

Corsair iCUE H100i Elite Capellix. The MB consistently reports temps of ~60C. However, when I put my hand on the exhaust fan, the air does not feel hot. To be safe, I have made sure all the fan headers are running at full speed from the BIOS (I set it every time the BIOS resets).

These are all known working parts from separate systems I have.

I am at a loss for what to do.

Thank you for your time.

I have the ESXi support bundle file but I cannot include links in my post.

EDIT:

To anyone who has this issue in the future: I was able to solve it by reflashing the BIOS and disabling anything C-State related in the BIOS. I guess it was a fluke from me updating the BIOS the first time.

Can you set the kernel to panic on hang and then use kdump to send us a log of this hang?

I'm also using gen1 TR with passthrough and have none of these issues, so something must be going wrong here.

I'm having trouble figuring out how to do this, as I'm not very experienced with that part of Linux. I do have the ESXi full support dump. Can I send you a link to download it?

I really only need to see the dmesg logs at the instant of the hang.
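
For anyone unsure how to set up "panic on hang", here is a rough sketch of the sysctl settings involved. This only applies to a Linux host (Proxmox, for example), not to ESXi. The file name is just an example I made up, and the 30 second reboot delay is an arbitrary choice. Whether the watchdogs actually fire depends on the kind of hang, so treat this as a starting point rather than a guaranteed capture.

```conf
# Example file: /etc/sysctl.d/99-panic-on-hang.conf (name is hypothetical)

# Keep the NMI watchdog enabled so hard lockups can be detected
kernel.nmi_watchdog = 1

# Turn a detected soft lockup into a kernel panic
kernel.softlockup_panic = 1

# Turn a detected hard lockup into a kernel panic
kernel.hardlockup_panic = 1

# Reboot automatically 30 seconds after a panic (0 would mean stay halted)
kernel.panic = 30
```

After a reboot (or `sysctl --system` as root) the settings should be active. kdump additionally needs memory reserved with a `crashkernel=` kernel parameter and your distro's kdump package installed. If the journal is persistent, `journalctl -k -b -1` shows the kernel log from the previous boot, which may already contain the messages from just before the hang.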