RX 7900 XTX on Fedora 37

Hey all,

I’ve just picked up an XFX RX 7900 XTX and I’m having a bit of fun trying to get it running on my main system with Fedora 37.

I’ve bumped to the rawhide kernel (6.1.0-65.fc38.x86_64), but I seem to be running into an issue with loading the firmware.
I noticed Wendell mentioned day 1 Linux support in his review, but I’m beginning to wonder if that was sarcasm!

Relevant portion of the logs:

kernel: amdgpu 0000:4d:00.0: vgaarb: deactivate vga console
kernel: amdgpu 0000:4d:00.0: enabling device (0006 -> 0007)
kernel: [drm] initializing kernel modesetting (IP DISCOVERY 0x1002:0x744C 0x1002:0x0E3B 0xC8).
kernel: [drm] register mmio base: 0xB1B00000
kernel: [drm] register mmio size: 1048576
kernel: [drm] add ip block number 0 <soc21_common>
kernel: [drm] add ip block number 1 <gmc_v11_0>
kernel: [drm] add ip block number 2 <ih_v6_0>
kernel: [drm] add ip block number 3 <psp>
kernel: [drm] add ip block number 4 <smu>
kernel: [drm] add ip block number 5 <dm>
kernel: [drm] add ip block number 6 <gfx_v11_0>
kernel: [drm] add ip block number 7 <sdma_v6_0>
kernel: [drm] add ip block number 8 <vcn_v4_0>
kernel: [drm] add ip block number 9 <jpeg_v4_0>
kernel: [drm] add ip block number 10 <mes_v11_0>
kernel: amdgpu 0000:4d:00.0: amdgpu: Fetched VBIOS from VFCT
kernel: amdgpu: ATOM BIOS: 113-D7020100-102
kernel: [drm] VCN(0) encode/decode are enabled in VM mode
kernel: [drm] VCN(1) encode/decode are enabled in VM mode
kernel: amdgpu 0000:4d:00.0: [drm:jpeg_v4_0_early_init [amdgpu]] JPEG decode is enabled in VM mode
kernel: amdgpu 0000:4d:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
kernel: [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
kernel: amdgpu 0000:4d:00.0: amdgpu: VRAM: 24560M 0x0000008000000000 - 0x00000085FEFFFFFF (24560M used)
kernel: amdgpu 0000:4d:00.0: amdgpu: GART: 512M 0x0000000000000000 - 0x000000001FFFFFFF
kernel: [drm] Detected VRAM RAM=24560M, BAR=256M
kernel: [drm] RAM width 384bits GDDR6
kernel: [drm] amdgpu: 24560M of VRAM memory ready
kernel: [drm] amdgpu: 32071M of GTT memory ready.
kernel: [drm] GART: num cpu pages 131072, num gpu pages 131072
kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
kernel: amdgpu 0000:4d:00.0: Direct firmware load for amdgpu/psp_13_0_0_sos.bin failed with error -2
kernel: amdgpu 0000:4d:00.0: amdgpu: failed to init sos firmware
kernel: [drm:psp_sw_init [amdgpu]] *ERROR* Failed to load psp firmware!
kernel: [drm:amdgpu_device_init.cold [amdgpu]] *ERROR* sw_init of IP block <psp> failed -2
kernel: amdgpu 0000:4d:00.0: amdgpu: amdgpu_device_ip_init failed
kernel: amdgpu 0000:4d:00.0: amdgpu: Fatal error during GPU init
kernel: amdgpu 0000:4d:00.0: amdgpu: amdgpu: finishing device.
kernel: amdgpu: probe of 0000:4d:00.0 failed with error -2
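
For anyone hitting the same thing: error -2 is ENOENT, meaning the kernel could not find that file in its firmware search path. Below is a rough sketch for pulling the missing file names out of a kernel log, assuming the message format shown above. `missing_firmware` is a made-up helper name, not part of any existing tool.

```shell
# Print the firmware files the kernel failed to find (error -2 is ENOENT).
# Reads kernel log text on stdin; prints one firmware path per line, de-duplicated.
# missing_firmware is a hypothetical helper name used only for this sketch.
missing_firmware() {
    sed -n 's/.*Direct firmware load for \([^ ]*\) failed with error -2$/\1/p' | sort -u
}

# Possible use on a live system (reading the kernel log may need root):
#   journalctl -k -b | missing_firmware
```

With the log above this should report amdgpu/psp_13_0_0_sos.bin, which is then the file to look for in the installed firmware directory.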

I should note that while my (also installed) 6900XT appears to initialize correctly, I can’t get display out of either when both are installed without resorting to nomodeset.

Any ideas?

Phoronix does a very good job of showing how to actually get GPUs to work in Linux. Lots of issues and their workarounds are covered in their review.


Thanks for this, was able to get it up and running after pulling firmware from kernel source.
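
The exact steps were not posted, so treat the following as a sketch under assumptions: that the firmware comes from a local checkout of the upstream linux-firmware repository, that it belongs under /lib/firmware/amdgpu, and that the initramfs is rebuilt with dracut afterwards. `copy_amdgpu_firmware` is a hypothetical helper name.

```shell
# copy_amdgpu_firmware SRC_DIR DEST_DIR
# Copies every amdgpu/*.bin file from a linux-firmware checkout (SRC_DIR)
# into DEST_DIR/amdgpu and prints the number of files copied.
# Hypothetical helper for illustration; paths are supplied by the caller.
copy_amdgpu_firmware() {
    src=$1
    dest=$2
    mkdir -p "$dest/amdgpu" || return 1
    count=0
    for f in "$src"/amdgpu/*.bin; do
        [ -e "$f" ] || continue          # glob matched nothing
        cp "$f" "$dest/amdgpu/" || return 1
        count=$((count + 1))
    done
    echo "$count"
}

# Assumed workflow on a real system (run as root, at your own risk):
#   git clone --depth 1 https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
#   copy_amdgpu_firmware linux-firmware /lib/firmware
#   dracut --force     # rebuild the initramfs so early boot sees the new files
```

Copying into a scratch directory first and comparing against what the distro package ships is a safer way to try this than overwriting files in place.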

Sorry for the delay on my end, I’ve been a bit under the weather.


Not at all! I hope you feel better.
Thanks for all the great work.

A bit more feedback after some time to play:

OpenGL seems to be working fine - I’m seeing from 30 to 60 percent uplift over my 6900XT in Heaven and Superposition.

Vulkan is another story entirely. With mesa 22.2.3 on kernel 6.1.0-65.fc38.x86_64 and just the firmware added from upstream it locks up on init.

Launching vkcube hangs the driver, which eventually tries and fails to soft reset, then does a hard reset that breaks my Wayland session:

kernel: amdgpu 0000:4d:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32768, for process vkcube pid 3987 thread vkcube pid 3987)
kernel: amdgpu 0000:4d:00.0: amdgpu:   in page starting at address 0x0000058100000000 from client 10
kernel: amdgpu 0000:4d:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00601031
kernel: amdgpu 0000:4d:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
kernel: amdgpu 0000:4d:00.0: amdgpu:          MORE_FAULTS: 0x1
kernel: amdgpu 0000:4d:00.0: amdgpu:          WALKER_ERROR: 0x0
kernel: amdgpu 0000:4d:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
kernel: amdgpu 0000:4d:00.0: amdgpu:          MAPPING_ERROR: 0x0
kernel: amdgpu 0000:4d:00.0: amdgpu:          RW: 0x0

Fortunately, switching to a text console still works, and the system shuts down cleanly.

I haven’t had much luck with amdgpu-pro either. I can get the RHEL 9 packages installed, and ROCm and HIP report that they’re working (via rocminfo and hipconfig), but they segfault as soon as I try to access them, at least via PyTorch.


Updating mesa to 22.3.0-2 from rawhide has resolved the issues with Vulkan. All the games I’ve tested seem to run fine (Stray, Valheim & CP2077) and show between 40 and 70 percent improvement over my 6900XT.
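
Since the Vulkan fix came down to the Mesa version, here is a small sketch for checking whether an installed version is at least 22.3. It assumes GNU sort (for -V), and the 22.3 threshold is just what worked in this thread, not an official minimum. `version_ge` is a hypothetical helper name, and querying the mesa-vulkan-drivers package is an assumption about which package to inspect.

```shell
# version_ge A B: succeeds (exit 0) if version A is greater than or equal to B.
# Relies on GNU sort's -V (version sort); hypothetical helper for this sketch.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Possible use on Fedora (the package name is an assumption):
#   mesa_ver=$(rpm -q --qf '%{VERSION}\n' mesa-vulkan-drivers)
#   if version_ge "$mesa_ver" 22.3; then echo "Mesa $mesa_ver looks new enough"; fi
```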

I still haven’t had any success with compute, and I’ve noticed a regression with 2D graphics: Firefox frequently (3 times in the last couple of hours) causes hard lockups, necessitating a forced reset. I haven’t noticed anything in common at the time of each lockup, such as video playback, but anecdotally it seems to happen just after switching tabs.

The last logs recorded prior to lockup are:

kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=426303, emitted seq=426305
kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process firefox pid 10128 thread firefox:cs0 pid 10249
kernel: amdgpu 0000:4d:00.0: amdgpu: IP block:gfx_v11_0 is hung!
kernel: amdgpu 0000:4d:00.0: amdgpu: GPU reset begin!
kernel: [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:67:crtc-0] hw_done or flip_done timed out
gnome-shell[2735]: Could not release device '/dev/input/event2' (13,66): Timeout was reached
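
To see whether it is always the same application behind these timeouts, the "Process information" lines can be tallied per process. This is a minimal sketch that assumes the message format in the log above; `gpu_hang_processes` is a made-up name.

```shell
# Count amdgpu job timeouts per process name.
# Reads kernel log text on stdin; prints lines of the form "name: count".
# gpu_hang_processes is a hypothetical helper name used only for this sketch.
gpu_hang_processes() {
    sed -n 's/.*amdgpu_job_timedout.*Process information: process \([^ ]*\) pid [0-9]*.*/\1/p' |
        sort | uniq -c | awk '{ print $2 ": " $1 }'
}

# Possible use across the previous boot (reading the log may need root):
#   journalctl -k -b -1 | gpu_hang_processes
```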

What about OpenCL via rocm? It might work even if the other rocm stuff doesn’t.

Did you end up having any success?

Not trying to take over your thread but

After quickly working out that my Pop!_OS install was dead in the water after the switch to my 7900 XTX, I tried Arch Linux, having read that the 6.0 kernel should support the card.
Any time I start GDM, I just get a blinking cursor.

I tried installing the amdgpu driver per the Arch wiki page on AMDGPU: blinking cursor.
I compiled 6.1: blinking cursor.

It seems that when people try Linux with these cards, it’s either a flawless boot or completely unsuccessful.

Well, I ended up getting GNOME loaded after installing the testing branch of linux-firmware, which contains the latest amdgpu firmware. But it’s definitely not ready for prime time. Something is very wrong with the image buffer: lots of screen flickering and image ghosting. I’m guessing the configuration is going to take time to mature.

I’ll have to try the testing branch for F37; my XTX isn’t giving video out either.

The system boots and I can SSH in, and the last line on screen is something about ccp not being able to access the device.

Updated the firmware from GitHub and reran dracut.
Updated mesa to 22.3 from the testing branch.

Now games are working again.




No dice so far… The fedora provided rocm/hip drivers on mainline and rawhide are both borked - they give me this:

$ rocminfo
ROCk module is loaded
Unable to open /dev/kfd read-write: Resource temporarily unavailable
sythezn is member of render group
$ rocm-clinfo
Number of platforms:				 1
  Platform Profile:				 FULL_PROFILE
  Platform Version:				 OpenCL 2.1 AMD-APP (3513.0)
  Platform Name:				 AMD Accelerated Parallel Processing
  Platform Vendor:				 Advanced Micro Devices, Inc.
  Platform Extensions:				 cl_khr_icd cl_amd_event_callback 


  Platform Name:				 AMD Accelerated Parallel Processing
Number of devices:				 0

$ clinfo
Number of platforms                               1
  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.1 AMD-APP (3513.0)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_amd_event_callback 
  Platform Extensions function suffix             AMD
  Platform Host timer resolution                  1ns

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 0

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  AMD Accelerated Parallel Processing
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No devices found in platform [AMD Accelerated Parallel Processing?]
  clCreateContext(NULL, ...) [default]            No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No devices found in platform

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loaderns
  ICD loader Vendor                               OCL Icd free softwarens
  ICD loader Version                              2.3.1ns
  ICD loader Profile                              OpenCL 3.0ns

I’ve been trying to get amdgpu-pro set up correctly, but I’m not having much luck so far.
Keep running into either hard lockups or just no platform reported at all.
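
When retrying this after each driver or firmware change, it helps to reduce the output to the one number that matters. Here is a rough sketch that totals the "Number of devices" lines from either clinfo or rocm-clinfo output, assuming the line format pasted above. `opencl_device_count` is a hypothetical helper name.

```shell
# Sum the "Number of devices" lines from clinfo or rocm-clinfo output on stdin.
# Prints the total, or "unknown" if no such line was seen.
# opencl_device_count is a hypothetical helper name used only for this sketch.
opencl_device_count() {
    awk '/^Number of devices/ { total += $NF; seen = 1 }
         END { if (seen) print total; else print "unknown" }'
}

# Possible use:
#   clinfo | opencl_device_count
#   rocm-clinfo | opencl_device_count
```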

mesa 22.3 is in mainline @updates now.
The firmware on rawhide has been updated as well: dnf update --enablerepo=rawhide amd-gpu-firmware linux-firmware

Seems a lot more stable - haven’t had any lockups from firefox since updating.

Edit: scratch that. I’m still getting lockups from Firefox, but they’re more graceful (the app goes unresponsive rather than the whole system locking up hard).


Was there a Linux video on the L1 channel for these GPUs? It’s mentioned in the main review but I found nothing. Still not much luck getting this usable.

Did you look at the Manjaro thread? Maybe what worked for that person might help you.

I’ve been able to game on F37 with the firmware and mesa 22.3 updates.

I’ve seen some reports of the drivers working with kernels as far back as 5.16, but I haven’t had any luck at all below 6.1. I’m on 6.2 at the moment and it seems fairly stable for OpenGL and Vulkan, on Wayland at least (apart from the ongoing issues with Firefox specifically).

I don’t use Pop!_OS myself, but from what I understand it still uses Xorg by default? Maybe support just isn’t there yet for the older display server?

Bit of an update: I’ve been able to get ROCm working with my 6900XT at least, by setting the ROCR_VISIBLE_DEVICES environment variable so that the 7900XTX is excluded.
For the moment that at least brings my workstation back to functional.
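
In case it helps anyone with a similar mixed setup, here is a small sketch of building that variable. It assumes ROCR_VISIBLE_DEVICES is given a comma-separated list of GPU indices, and that you have worked out which index the unsupported card has on your system (the index 0 in the usage line is only a placeholder). `visible_devices_excluding` is a made-up helper name.

```shell
# visible_devices_excluding TOTAL SKIP
# Prints a comma-separated list of the indices 0..TOTAL-1 with SKIP left out.
# Hypothetical helper; the device numbering itself must be checked per system.
visible_devices_excluding() {
    total=$1
    skip=$2
    list=""
    i=0
    while [ "$i" -lt "$total" ]; do
        if [ "$i" -ne "$skip" ]; then
            list="${list:+$list,}$i"     # add a comma only between entries
        fi
        i=$((i + 1))
    done
    echo "$list"
}

# Possible use with two GPUs where index 0 is assumed to be the 7900XTX:
#   ROCR_VISIBLE_DEVICES=$(visible_devices_excluding 2 0) rocminfo
```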

Still no dice with HIP/ROCm, or OpenCL via HIP, on the 7900XTX, but going by the support list in AMD’s docs that’s expected, as there’s no RDNA3/CDNA3 support yet.

Firefox is still exhibiting weird symptoms. With hardware acceleration disabled I get long (5+ second) delays where the app goes unresponsive whenever I open a context menu via right click or the (meta?) key.
There are also long delays when rendering more complex pages, though I somewhat expect that.

Watching the logs I see this at the time of a context menu command, though it’s not clear to me whether it’s related:

firefox.desktop[18903]: [Parent 18903, Main Thread] WARNING: gtk_widget_get_clipboard: assertion 'gtk_widget_has_screen (widget)' failed: 'glib warning', file /builddir/build/BUILD/firefox-108.0.1/toolkit/xre/nsSigHandlers.cpp:167
firefox.desktop[18903]: [Parent 18903, Main Thread] WARNING: gtk_clipboard_request_contents: assertion 'clipboard != NULL' failed: 'glib warning', file /builddir/build/BUILD/firefox-108.0.1/toolkit/xre/nsSigHandlers.cpp:167
firefox[18903]: gtk_widget_get_clipboard: assertion 'gtk_widget_has_screen (widget)' failed
firefox.desktop[18903]: [Parent 18903, Main Thread] WARNING: gtk_widget_get_clipboard: assertion 'gtk_widget_has_screen (widget)' failed: 'glib warning', file /builddir/build/BUILD/firefox-108.0.1/toolkit/xre/nsSigHandlers.cpp:167
firefox.desktop[18903]: [Parent 18903, Main Thread] WARNING: gtk_clipboard_request_contents: assertion 'clipboard != NULL' failed: 'glib warning', file /builddir/build/BUILD/firefox-108.0.1/toolkit/xre/nsSigHandlers.cpp:167
firefox.desktop[18903]: [Parent 18903, Main Thread] WARNING: gtk_widget_get_clipboard: assertion 'gtk_widget_has_screen (widget)' failed: 'glib warning', file /builddir/build/BUILD/firefox-108.0.1/toolkit/xre/nsSigHandlers.cpp:167
firefox.desktop[18903]: [Parent 18903, Main Thread] WARNING: gtk_clipboard_request_contents: assertion 'clipboard != NULL' failed: 'glib warning', file /builddir/build/BUILD/firefox-108.0.1/toolkit/xre/nsSigHandlers.cpp:167
firefox.desktop[18903]: [Parent 18903, Main Thread] WARNING: gtk_widget_get_clipboard: assertion 'gtk_widget_has_screen (widget)' failed: 'glib warning', file /builddir/build/BUILD/firefox-108.0.1/toolkit/xre/nsSigHandlers.cpp:167
firefox.desktop[18903]: [Parent 18903, Main Thread] WARNING: gtk_clipboard_request_contents: assertion 'clipboard != NULL' failed: 'glib warning', file /builddir/build/BUILD/firefox-108.0.1/toolkit/xre/nsSigHandlers.cpp:167
firefox[18903]: gtk_clipboard_request_contents: assertion 'clipboard != NULL' failed
firefox[18903]: gtk_widget_get_clipboard: assertion 'gtk_widget_has_screen (widget)' failed
firefox[18903]: gtk_clipboard_request_contents: assertion 'clipboard != NULL' failed
firefox[18903]: gtk_widget_get_clipboard: assertion 'gtk_widget_has_screen (widget)' failed
firefox[18903]: gtk_clipboard_request_contents: assertion 'clipboard != NULL' failed
firefox[18903]: gtk_widget_get_clipboard: assertion 'gtk_widget_has_screen (widget)' failed
firefox[18903]: gtk_clipboard_request_contents: assertion 'clipboard != NULL' failed

Have you updated again?

I noticed the firmware and mesa 22.3 hit the stable branch in the last two days.


Yeah, currently on mesa 22.3.1, kernel 6.2.0-rc0 and rocm 5.4.1