SDL2 on Radeon R9 390 crashing video/audio

Hello,

I have been teaching myself to code with the SDL2 library in C++ recently. However, within ~10 seconds of starting my program, all three of my monitors lose signal and the last ~1 second of audio output loops endlessly, necessitating a hard reboot of the system.

I’m too new to link or upload the source code (it’s a very simple test program), but if I’m able to post it in the comments, I will for anyone interested.

I don’t believe bad code is to blame, however. Firstly, the crash does not occur on my Microsoft Surface running Linux Mint Debian Edition. Secondly, the make/model of video card I have installed has been known to cause video-output-related issues with Linux, such that I have needed to set the following line in my /etc/default/grub in order to have stable video output just on the desktop:

GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 quiet radeon.cik_support=0 radeon.si_support=0 amdgpu.cik_support=1 amdgpu.si_support=1 amdgpu.dc=1"

I’m currently on kernel 6.7.5-arch1-1

Let me know what other log or command output would help resolve the issue. It makes SDL2 development effectively impossible.

I think the first step is to isolate whether this is a driver or hardware issue. You can run sudo journalctl -x --boot=-1 to list the systemd log from the previous boot.

You can type /<phrase> to search for any keyword you want.
Then use n or shift+n to jump forward and backward through the matches.

If the software stack or kernel driver crashed, you should see something in the logs; otherwise, I imagine this is purely a hardware issue.
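To avoid paging through the whole journal, the kernel messages from the previous boot can be narrowed to GPU-related lines. This is only a rough sketch: gpu_log_filter is a hypothetical helper name, and the keyword list is a guess at what is relevant for this card, not an exhaustive one.

```shell
# Keep only kernel log lines that mention the GPU stack.
# Reads log text on stdin, writes matching lines to stdout.
gpu_log_filter() {
    grep -iE 'amdgpu|radeon|drm|kfd'
}

# Intended use (not executed here): the last 40 GPU-related kernel
# lines of the previous boot, i.e. the ones closest to the crash.
#   sudo journalctl -k -b -1 --no-pager | gpu_log_filter | tail -n 40
```

The lines nearest the end of the previous boot are usually the interesting ones, since anything the driver managed to log before the lockup would be there.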

You could also install lm-sensors if you haven’t already, and check the GPU’s temperature at idle and right before a crash happens.
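Since the machine locks up hard, a reading taken "right before a crash" is easiest to capture by logging to disk continuously. Below is a minimal sketch that reads the temperature straight from the hwmon sysfs files, which report millidegrees Celsius. The function names and the gpu_temp.log file name are made up for illustration, and it assumes the card's edge sensor is exposed as temp1_input.

```shell
# Print the temperature file of the first hwmon device whose name matches.
# $1: device name (e.g. amdgpu), $2: hwmon root (defaults to /sys/class/hwmon).
find_hwmon_temp() {
    root=${2:-/sys/class/hwmon}
    for dir in "$root"/hwmon*; do
        if [ -r "$dir/name" ] && [ "$(cat "$dir/name")" = "$1" ]; then
            echo "$dir/temp1_input"
            return 0
        fi
    done
    return 1
}

# Convert millidegrees Celsius (the sysfs unit) to degrees with one decimal.
millideg_to_c() {
    printf '%d.%d\n' "$(( $1 / 1000 ))" "$(( ($1 % 1000) / 100 ))"
}

# Append one timestamped reading per second, flushing to disk each time
# so the last reading has a chance of surviving a hard reboot.
# $1: output file (hypothetical default name).
log_gpu_temp() {
    out=${1:-gpu_temp.log}
    file=$(find_hwmon_temp amdgpu) || { echo "no amdgpu hwmon device found" >&2; return 1; }
    while :; do
        echo "$(date +%T) $(millideg_to_c "$(cat "$file")") C" >> "$out"
        sync
        sleep 1
    done
}
```

Start log_gpu_temp in one terminal, launch the test program in another, and after the reboot look at the tail of gpu_temp.log.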

Does this only happen when running your code, or with any 3D-accelerated application? I’ve had odd crashes on Radeon cards in the past, but they were patched around 2021 or so.
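One way to test whether GPU acceleration is involved at all is to run the same binary with SDL2 and Mesa pushed onto software paths. This is a sketch under assumptions: run_software_only is a hypothetical wrapper, ./sdl_test stands in for the actual program, and the SDL variables only take effect if the program does not explicitly request an accelerated renderer.

```shell
# Run a command with SDL2 and Mesa steered away from the GPU,
# and with SDL's dummy audio driver instead of real audio output.
run_software_only() {
    SDL_RENDER_DRIVER=software \
    SDL_FRAMEBUFFER_ACCELERATION=0 \
    SDL_AUDIODRIVER=dummy \
    LIBGL_ALWAYS_SOFTWARE=1 \
    "$@"
}

# Intended use (not executed here; ./sdl_test is a placeholder name):
#   run_software_only ./sdl_test
```

If the crash disappears with all four variables set, removing them one at a time should show whether rendering or audio is the trigger.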

Okay, so that journalctl command looks like it might be pointing me in the right direction. The most notable part is the bold red line that says kfd kfd: amdgpu: HAWAII not supported in kfd.

Feb 24 08:28:52 Archibald kernel: [drm] amdgpu kernel modesetting enabled.
Feb 24 08:28:52 Archibald kernel: amdgpu: Virtual CRAT table created for CPU
Feb 24 08:28:52 Archibald kernel: amdgpu: Topology: Add CPU node
Feb 24 08:28:52 Archibald kernel: [drm] initializing kernel modesetting (HAWAII 0x1002:0x67B1 0x1462:0x2015 0x80).
Feb 24 08:28:52 Archibald kernel: [drm] register mmio base: 0xF7E00000
Feb 24 08:28:52 Archibald kernel: [drm] register mmio size: 262144
Feb 24 08:28:52 Archibald kernel: [drm] add ip block number 0 <cik_common>
Feb 24 08:28:52 Archibald kernel: [drm] add ip block number 1 <gmc_v7_0>
Feb 24 08:28:52 Archibald kernel: [drm] add ip block number 2 <cik_ih>
Feb 24 08:28:52 Archibald kernel: [drm] add ip block number 3 <gfx_v7_0>
Feb 24 08:28:52 Archibald kernel: [drm] add ip block number 4 <cik_sdma>
Feb 24 08:28:52 Archibald kernel: [drm] add ip block number 5 <powerplay>
Feb 24 08:28:52 Archibald kernel: [drm] add ip block number 6 <dm>
Feb 24 08:28:52 Archibald kernel: [drm] add ip block number 7 <uvd_v4_2>
Feb 24 08:28:52 Archibald kernel: [drm] add ip block number 8 <vce_v2_0>
Feb 24 08:28:52 Archibald kernel: resource: resource sanity check: requesting [mem 0x00000000000c0000-0x00000000000dffff], which spans more than PCI Bus 0000:00 [mem 0x000d0000-0x000e7fff window]
Feb 24 08:28:52 Archibald kernel: caller pci_map_rom+0x69/0x1b0 mapping multiple BARs
Feb 24 08:28:52 Archibald kernel: amdgpu 0000:01:00.0: No more image in the PCI ROM
Feb 24 08:28:52 Archibald kernel: amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
Feb 24 08:28:52 Archibald kernel: amdgpu: ATOM BIOS: MS-V30823-F5
Feb 24 08:28:52 Archibald kernel: amdgpu: KFD support on Hawaii is experimental. See modparam exp_hw_support
Feb 24 08:28:52 Archibald kernel: kfd kfd: amdgpu: HAWAII  not supported in kfd
Feb 24 08:28:52 Archibald kernel: amdgpu 0000:01:00.0: vgaarb: deactivate vga console
Feb 24 08:28:52 Archibald kernel: amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
Feb 24 08:28:52 Archibald kernel: [drm] PCIE gen 3 link speeds already enabled
Feb 24 08:28:52 Archibald kernel: [drm] vm size is 128 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
Feb 24 08:28:52 Archibald kernel: amdgpu 0000:01:00.0: amdgpu: VRAM: 8192M 0x000000F400000000 - 0x000000F5FFFFFFFF (8192M used)
Feb 24 08:28:52 Archibald kernel: amdgpu 0000:01:00.0: amdgpu: GART: 1024M 0x000000FF00000000 - 0x000000FF3FFFFFFF
Feb 24 08:28:52 Archibald kernel: [drm] Detected VRAM RAM=8192M, BAR=256M
Feb 24 08:28:52 Archibald kernel: [drm] RAM width 512bits GDDR5
Feb 24 08:28:52 Archibald kernel: [drm] amdgpu: 8192M of VRAM memory ready
Feb 24 08:28:52 Archibald kernel: [drm] amdgpu: 15980M of GTT memory ready.
Feb 24 08:28:52 Archibald kernel: [drm] GART: num cpu pages 262144, num gpu pages 262144
Feb 24 08:28:52 Archibald kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F400800000).
Feb 24 08:28:52 Archibald kernel: amdgpu: hwmgr_sw_init smu backed is ci_smu
Feb 24 08:28:52 Archibald kernel: [drm] Found UVD firmware Version: 1.64 Family ID: 9
Feb 24 08:28:52 Archibald kernel: [drm] Found VCE firmware Version: 50.10 Binary ID: 2
Feb 24 08:28:52 Archibald kernel: amdgpu 0000:01:00.0: [drm] dce110_link_encoder_construct: Failed to get encoder_cap_info from VBIOS with error code 4!
Feb 24 08:28:52 Archibald kernel: amdgpu 0000:01:00.0: [drm] dce110_link_encoder_construct: Failed to get encoder_cap_info from VBIOS with error code 4!
Feb 24 08:28:52 Archibald kernel: amdgpu 0000:01:00.0: [drm] dce110_link_encoder_construct: Failed to get encoder_cap_info from VBIOS with error code 4!
Feb 24 08:28:52 Archibald kernel: [drm] Display Core v3.2.259 initialized on DCE 8.0
Feb 24 08:28:52 Archibald kernel: hid-generic 0003:1EA7:CCBB.0001: input,hidraw0: USB HID v1.11 Keyboard [UBEST zoom65 wireless rgb] on usb-0000:00:14.0-4.3/input0
Feb 24 08:28:52 Archibald kernel: input: UBEST zoom65 wireless rgb Consumer Control as /devices/pci0000:00/0000:00:14.0/usb3/3-4/3-4.3/3-4.3:1.2/0003:1EA7:CCBB.0002/input/input5
Feb 24 08:28:52 Archibald kernel: input: UBEST zoom65 wireless rgb System Control as /devices/pci0000:00/0000:00:14.0/usb3/3-4/3-4.3/3-4.3:1.2/0003:1EA7:CCBB.0002/input/input6
Feb 24 08:28:52 Archibald kernel: input: UBEST zoom65 wireless rgb Mouse as /devices/pci0000:00/0000:00:14.0/usb3/3-4/3-4.3/3-4.3:1.2/0003:1EA7:CCBB.0002/input/input7
Feb 24 08:28:52 Archibald kernel: hid-generic 0003:1EA7:CCBB.0002: input,hidraw1: USB HID v1.11 Mouse [UBEST zoom65 wireless rgb] on usb-0000:00:14.0-4.3/input2
Feb 24 08:28:52 Archibald kernel: hid-generic 0003:1EA7:CCBB.0003: hiddev96,hidraw2: USB HID v1.11 Device [UBEST zoom65 wireless rgb] on usb-0000:00:14.0-4.3/input1
Feb 24 08:28:52 Archibald kernel: input: Kensington Expert Mouse as /devices/pci0000:00/0000:00:14.0/usb3/3-4/3-4.4/3-4.4:1.0/0003:047D:1020.0004/input/input8
Feb 24 08:28:52 Archibald kernel: hid-generic 0003:047D:1020.0004: input,hidraw3: USB HID v1.11 Mouse [Kensington Expert Mouse] on usb-0000:00:14.0-4.4/input0
Feb 24 08:28:52 Archibald kernel: usbcore: registered new interface driver usbhid
Feb 24 08:28:52 Archibald kernel: usbhid: USB HID core driver
Feb 24 08:28:52 Archibald kernel: [drm] UVD initialized successfully.
Feb 24 08:28:52 Archibald kernel: [drm] VCE initialized successfully.
Feb 24 08:28:52 Archibald kernel: amdgpu 0000:01:00.0: amdgpu: SE 4, SH per SE 1, CU per SH 11, active_cu_number 40
Feb 24 08:28:52 Archibald kernel: amdgpu 0000:01:00.0: amdgpu: Using BOCO for runtime pm
Feb 24 08:28:52 Archibald kernel: [drm] Initialized amdgpu 3.57.0 20150101 for 0000:01:00.0 on minor 2
Feb 24 08:28:52 Archibald kernel: fbcon: amdgpudrmfb (fb0) is primary device
Feb 24 08:28:52 Archibald kernel: fbcon: Deferring console take-over
Feb 24 08:28:52 Archibald kernel: amdgpu 0000:01:00.0: [drm] fb0: amdgpudrmfb frame buffer device

I did a bit of Googling and modified the GRUB_CMDLINE_LINUX_DEFAULT value in my /etc/default/grub file from:
"loglevel=3 quiet radeon.cik_support=0 radeon.si_support=0 amdgpu.cik_support=1 amdgpu.si_support=1 amdgpu.dc=1"
to
"loglevel=3 quiet splash radeon.cik_support=0 radeon.si_support=0 amdgpu.cik_support=1 amdgpu.si_support=1"

But this did not noticeably change the outcome.
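It may be worth confirming that the edited parameters actually reached the running kernel, since changes to /etc/default/grub only apply after grub.cfg is regenerated (on Arch, typically with grub-mkconfig -o /boot/grub/grub.cfg) and the machine is rebooted. A small sketch, with has_kernel_param as a hypothetical helper name:

```shell
# Succeed if the given parameter appears as a whole word on the kernel
# command line.
# $1: parameter (e.g. amdgpu.cik_support=1)
# $2: file to read (defaults to /proc/cmdline)
has_kernel_param() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Intended use (not executed here):
#   has_kernel_param amdgpu.cik_support=1 && echo "cik_support is set"
#   has_kernel_param amdgpu.dc=1 || echo "amdgpu.dc=1 is no longer set"
```

The values the driver ended up with can also be read from /sys/module/amdgpu/parameters/, for example /sys/module/amdgpu/parameters/dc.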

Output of sensors:

amdgpu-pci-0100
Adapter: PCI adapter
vddgfx:        1.13 V  
fan1:         812 RPM  (min =    0 RPM, max = 6000 RPM)
edge:         +64.0°C  (crit = +104000.0°C, hyst = -273.1°C)
PPT:          53.07 W  (cap = 230.00 W)

As for 3D-accelerated applications, I haven’t tried any. I have a Windows machine to handle all my gaming, so I’m not sure if my Linux machine has ever done 3D graphics since I first installed it.