Fedora 33 - Gnome 3 Screen Flicker

Hello,

I’m looking for some instruction on how to troubleshoot the screen flicker I’m getting on my workstation.

The issue begins in GDM after a boot. Certain mouse movements or menu animations are accompanied by a flicker across my screen. After logging in, the issue persists, and a flicker can be seen when I press the Super key to open the GNOME search window.

If I go into displays and change the screen refresh rate from 60 to 30 and back to 60, this usually fixes my issue, including if I log out and log back in. However, if I reboot my system, the issue comes back and I have to play with the display refresh options again.
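The refresh-rate workaround described above could be scripted rather than clicked through after every reboot. The following is only a sketch, assuming an X11 session (it does not apply to a Wayland session) and the standard `xrandr --output/--mode/--rate` options. The output name `HDMI-A-0` is a hypothetical placeholder; `xrandr --query` lists the real names. The helper names are illustrative.

```python
import subprocess


def refresh_toggle_commands(output, mode, rates):
    """Build one xrandr command per refresh rate, in the order given."""
    return [
        ["xrandr", "--output", output, "--mode", mode, "--rate", str(rate)]
        for rate in rates
    ]


def run_toggle(output, mode, rates=(30, 60), dry_run=True):
    """Print the commands (dry run) or execute them one after another."""
    for cmd in refresh_toggle_commands(output, mode, rates):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "HDMI-A-0" is an assumed output name; the mode comes from the post.
    run_toggle("HDMI-A-0", "5120x1440")
```

With `dry_run=True` the script only prints the two commands, so they can be checked before anything changes the display.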

I’ve tried disabling all extensions. The issue occurs on X11 and Wayland. I’ve also deleted the monitors.xml file in my ~/.config and /var/lib/gdm/.config directories. This resulted in a lower default resolution which also had a flicker.

Can I get information on where I should look for logs to troubleshoot the issue?

System Information:
Fedora 33
Kernel 5.9.8-200.fc33.x86_64
Gnome 3.38.1 on Wayland/Xorg
AMD Radeon RX 5700XT
5120x1440@60 (or 30 or 70)

lspci output:
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT] (rev c1)
Subsystem: Gigabyte Technology Co., Ltd Radeon RX 5700 XT Gaming OC
Kernel driver in use: amdgpu
Kernel modules: amdgpu

I was unable to capture the flicker in a screenshot, but a photo taken with my phone shows what I’m seeing. Basically, it flashes these bars during animations.

I tried posting on ask.fedoraproject.org but I didn’t get any traction.

Thanks

Hardware debugging
Have you been able to replicate this on…

  1. Other ports on your graphics card?
  2. Other cables?
  3. On HDMI or DisplayPort?
  4. On a different monitor?

Software
Have you been able to replicate this on…

  1. Windows?
  2. The amdgpu-pro drivers?
  3. Another display manager such as SDDM?
  4. Another DE such as Cinnamon, Xfce, or KDE?

Monitors are very finicky with Linux. There are a lot of variables to debug, and it gets tougher if you have multiple monitors. I have found that if I can’t replicate the problem on Windows, the easiest place to start is the hardware.

I approach debugging with the mindset of making small changes to an existing workflow, for both hardware and software.

Hardware debugging

  1. Try to change the ports on your graphics card
  2. Try to change the ports on your monitor
  3. Try a different cable type (if you are on HDMI, try DisplayPort, and vice versa)
  4. Try a replacement cable
  5. Try steps 1-4 on another monitor
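The "one small change at a time" approach in the list above can be written down as a test plan. This is a minimal sketch: the factor names and values are hypothetical placeholders, and `one_change_at_a_time` is an illustrative helper, not an existing tool.

```python
def one_change_at_a_time(baseline, alternatives):
    """Return trials that each differ from the baseline in exactly one factor."""
    plan = []
    for factor, options in alternatives.items():
        for option in options:
            if option == baseline[factor]:
                continue  # identical to the baseline, nothing to test
            trial = dict(baseline)
            trial[factor] = option
            plan.append((factor, trial))
    return plan


# Hypothetical setup; replace with the hardware actually on the desk.
BASELINE = {"gpu_port": "HDMI", "cable": "current cable", "monitor": "main monitor"}
ALTERNATIVES = {
    "gpu_port": ["HDMI", "DisplayPort 1", "DisplayPort 2"],
    "cable": ["current cable", "replacement cable"],
    "monitor": ["main monitor", "spare monitor"],
}

if __name__ == "__main__":
    for changed, trial in one_change_at_a_time(BASELINE, ALTERNATIVES):
        print(f"change {changed}: {trial}")
```

Each trial keeps every other factor at its baseline value, so a change in behaviour can be attributed to the one thing that was swapped.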

I’m running on HDMI right now. I didn’t think to try DisplayPort. I only have a single HDMI port. I’ll give that a go.

For what it’s worth, I don’t have any issues when I use a MacBook Pro on this screen, so I don’t think it’s an issue with the screen itself. I suppose the HDMI connection could still be the problem, since with the Mac I use USB-C/Thunderbolt.

I didn’t have the issue when I was on openSUSE, where I tried out KDE, Xfce, and several window managers. I’m not sure which kernel version and whatnot I was on when I switched to Fedora a month or so ago. My Tumbleweed install was quite out of date.

Doing some testing. Be back in a bit.

I think it might be GNOME. I’ve had more issues with GNOME than with the other DEs. GNOME refuses to let me change the overlapping point between two monitors. I learned that after switching to Cinnamon.

If you are curious, I documented it here

I really don’t think this is a GNOME issue.

Switched to DisplayPort and all hell broke loose. I’ve tried two ports on the video card. I’m now showing llvmpipe as my display driver, Wayland isn’t an option at login so it’s X11 only, the system doesn’t recognize my monitor, and I can’t select a refresh rate or the appropriate resolution.

Found this in dmesg while using displayport:

[    4.183424] amdgpu 0000:03:00.0: amdgpu: failed send message: TransferTableSmu2Dram (18) 	param: 0x00000009 response 0xffffffc2
[    4.183425] amdgpu 0000:03:00.0: amdgpu: Failed to get overdrive table!
[    4.183426] amdgpu 0000:03:00.0: amdgpu: Failed to setup default OD settings!
[    4.183509] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block <smu> failed -62
[    4.183510] amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_late_init failed
[    4.183511] amdgpu 0000:03:00.0: amdgpu: Fatal error during GPU init
[    4.183521] [drm] amdgpu: finishing device.
[    4.389505] [drm] amdgpu: ttm finalized
[    4.389748] amdgpu: probe of 0000:03:00.0 failed with error -62
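Kernel log excerpts like the one above can be filtered mechanically. The sketch below is a minimal Python example, assuming log text in the usual `[timestamp] message` form. The function names `amdgpu_errors` and `probe_failure` and the keyword list are illustrative choices, not part of any existing tool.

```python
import re

DMESG_LINE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")
PROBE_FAIL = re.compile(r"probe of (?P<dev>\S+) failed with error (?P<code>-?\d+)")
ERROR_HINTS = ("failed", "error", "fatal")  # assumed keywords, adjust as needed

# Three lines copied from the dmesg output in this thread.
SAMPLE = """
[    4.183425] amdgpu 0000:03:00.0: amdgpu: Failed to get overdrive table!
[    4.183521] [drm] amdgpu: finishing device.
[    4.389748] amdgpu: probe of 0000:03:00.0 failed with error -62
"""


def amdgpu_errors(log_text):
    """Return (timestamp, message) for amdgpu/drm lines that look like errors."""
    hits = []
    for line in log_text.splitlines():
        match = DMESG_LINE.match(line.strip())
        if not match:
            continue
        msg = match.group("msg")
        lowered = msg.lower()
        is_gpu_line = "amdgpu" in lowered or "[drm" in lowered
        if is_gpu_line and any(hint in lowered for hint in ERROR_HINTS):
            hits.append((float(match.group("ts")), msg))
    return hits


def probe_failure(log_text):
    """Return (pci_device, error_code) for the first failed probe, or None."""
    match = PROBE_FAIL.search(log_text)
    if match is None:
        return None
    return match.group("dev"), int(match.group("code"))


if __name__ == "__main__":
    for ts, msg in amdgpu_errors(SAMPLE):
        print(f"{ts:>12.6f}  {msg}")
    print(probe_failure(SAMPLE))
```

The input text can come from `dmesg` or `journalctl -k -b`, either of which may need root. On Linux, errno 62 is ETIME ("Timer expired"), and the SMU response 0xffffffc2 in the first log line is the same value, -62, read as a signed 32-bit integer.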

I’m running a 5700 XT with dual monitors on Fedora 33 without issues. As suggested above, I guess swapping in replacement parts is the only way to narrow down the problem.

One display is DP and the other HDMI.

Have you tried booting a live image from a USB stick and checking whether the problem shows up there?

Just did a system update, checked my BIOS settings, and checked my cabling. On reboot, the driver is correct again and my resolution is correct while using DisplayPort.

The flicker was still present.

I’m digging through the dmesg logs and I found this:

WARNING: CPU: 7 PID: 271 at drivers/gpu/drm/amd/amdgpu/../display/dc/dcn20/dcn20_resource.c:3241 dcn20_validate_bandwidth_fp+0x8d/0xd0 [amdgpu]
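To search bug trackers for a warning like this, it helps to pull out the source file, line, and function. The following is a sketch only, assuming the `WARNING: CPU: … PID: … at file:line func+0x…/0x… [module]` layout seen in the line above; `parse_kernel_warning` is an illustrative name.

```python
import posixpath
import re

WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<path>\S+):(?P<line>\d+) "
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>[^\]]+)\])?"
)

# The warning quoted in this thread.
SAMPLE_WARNING = (
    "WARNING: CPU: 7 PID: 271 at "
    "drivers/gpu/drm/amd/amdgpu/../display/dc/dcn20/dcn20_resource.c:3241 "
    "dcn20_validate_bandwidth_fp+0x8d/0xd0 [amdgpu]"
)


def parse_kernel_warning(text):
    """Extract location details from a kernel WARNING line, or return None."""
    match = WARNING_RE.search(text)
    if match is None:
        return None
    path = posixpath.normpath(match.group("path"))  # collapses "amdgpu/.."
    return {
        "cpu": int(match.group("cpu")),
        "pid": int(match.group("pid")),
        "path": path,
        "file": posixpath.basename(path),
        "line": int(match.group("line")),
        "function": match.group("func"),
        "module": match.group("module"),
    }


if __name__ == "__main__":
    info = parse_kernel_warning(SAMPLE_WARNING)
    # A compact search string for a bug tracker.
    print(f'{info["function"]} {info["file"]}')
```

The function name plus file name is usually a more stable search key than the line number, which shifts between kernel versions.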

Which has two associated bugs on Red Hat Bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=1829049
https://bugzilla.redhat.com/show_bug.cgi?id=1871324

This seems to be the right thread:

But I’m not sure whether the flickering I’m observing is related to that message or not.

Thoughts?

Reseat the GPU and CPU, I guess. It’s a free check.
My dmesg output seems plain, with nothing unusual in it.

I’ve reverted back to HDMI so I’m back to just having a flicker which I can fix with a refresh change after logging in.

I’ve also tested kernel 5.8.18 and the flicker is present there as well.

Hi @Komrade_K, I have the same problem with Fedora 33. I upgraded the video driver, changed the kernel version, and even changed the OS to Ubuntu, and the problem continues. I don’t know the cause, because it should not occur on Ubuntu, but it all started when I upgraded from Fedora 32 to 33. If you have solved your problem, I would appreciate it if you shared the solution.

BR!
Felipe

No solution. I simply change the resolution a few times upon logging in. I’m pretty sure the AMDGPU gitlab has a bug logged for the issue. Not sure if a fix went out with the 5.10 kernel or not.

My laptop has an Nvidia video card, so I think the problem has to do with how the kernel manages some resources. I hope it will be solved in the next release, because it is a common problem.

BR!

I have the same problem on Fedora 33 (GNOME on Wayland) with Nouveau on an Nvidia card:

$ sudo lshw -C video
[sudo] password for admin:
*-display
description: VGA compatible controller
product: GF106 [GeForce GTS 450]
vendor: NVIDIA Corporation
physical id: 0
bus info: pci@0000:01:00.0
version: a1
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress vga_controller bus_master cap_list rom
configuration: driver=nouveau latency=0
resources: irq:131 memory:ec000000-edffffff memory:e0000000-e7ffffff memory:e8000000-ebffffff ioport:e000(size=128) memory:c0000-dffff

What I noticed:

  1. The only kernel where the problem is absent is 5.8.14. All 5.9.xx and 5.10.xx kernels fail: a slight but annoying flicker across the screen is present.

  2. There is no flickering on Windows 10 LTSC or on Ubuntu 20.04 with GNOME on Wayland.

Any help with a kernel patch, Nouveau, or anything else?

A miracle has happened. I just updated to kernel 5.10.13 and there is no flickering on my system. God bless Fedora!

What I’ve found:

commit e4d2a196fdc5f7eeab21d3d6a27566f3dc1f4d60
Author: Bastian Beranek
Date: Thu Jan 21 15:27:36 2021 +0100

drm/nouveau/dispnv50: Restore pushing of all data.

commit fd55b61ebd31449549e14c33574825d64de2b29b upstream.

Commit f844eb485eb056ad3b67e49f95cbc6c685a73db4 introduced a regression for
NV50, which lead to visual artifacts, tearing and eventual crashes.

In the changes of f844eb485eb056ad3b67e49f95cbc6c685a73db4 only the first line
was correctly translated to the new NVIDIA header macros:

-               PUSH_NVSQ(push, NV827C, 0x0110, 0,
-                                       0x0114, 0);
+               PUSH_MTHD(push, NV827C, SET_PROCESSING,
+                         NVDEF(NV827C, SET_PROCESSING, USE_GAIN_OFS, DISABLE));

The lower part ("0x0114, 0") was probably omitted by accident.

This patch restores the push of the missing data and fixes the regression.
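For the Nouveau reports in this thread, a running kernel can be compared against the versions people tested. This is a rough sketch: the boundaries come only from the posts here (5.8.14 fine, 5.9.xx and early 5.10.xx flickering, 5.10.13 and later fine), not from the kernel changelog, and `thread_report` is an illustrative helper.

```python
import platform
import re


def kernel_tuple(release):
    """Turn a release string such as '5.9.8-200.fc33.x86_64' into (5, 9, 8)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def thread_report(release):
    """Summarise what this thread reported for the given kernel (Nouveau)."""
    version = kernel_tuple(release)
    if version == (5, 8, 14):
        return "no flicker reported"
    if version[:2] == (5, 9):
        return "flicker reported"
    if version[:2] == (5, 10):
        # Assumption: releases after the tested 5.10.13 to 5.10.15 keep the fix.
        return "no flicker reported" if version[2] >= 13 else "flicker reported"
    return "no report in this thread"


if __name__ == "__main__":
    running = platform.release()  # on Linux, the same string as `uname -r`
    print(running, "->", thread_report(running))
```

Anything outside the 5.8.14, 5.9, and 5.10 series is deliberately left as "no report in this thread" rather than guessed.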

5.10.9-201 had regressed so far that my computer would kernel panic on boot.

I switched to the Rawhide kernel 5.11.0-0.rc4.129 and that seems to have got me back into working condition. I’ll test out 5.10.13.

Here’s a link to the related ask fedora thread:

My Fedora 33 works excellently with 5.10.14 and 5.10.15. I’m really happy, because I thought the problem was my old video card.