Vega64 fan control

After a slew of updates, I’ve just noticed that the fans on my ageing Vega64 have decided to go on strike. I only noticed when I loaded up Steam to plough through a few titles. In short, everything appears to be working well apart from the fans, so any reasonable load results in the card thermal throttling…

So far, I’ve played with a few kernel parameters, poked at sysfs, and tried forcing 50%/100% fixed fan speeds, all to no avail…
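For anyone following along, forcing a fixed speed through sysfs generally looks something like the sketch below. It assumes the amdgpu hwmon attributes pwm1_enable, pwm1 and pwm1_max. set_fan_percent is a made-up helper name, and the hwmon index differs between machines and boots, so treat the path in the comment as an example rather than the exact commands that were run here.

```shell
#!/usr/bin/env bash
# Sketch: force a fixed fan duty cycle through the amdgpu hwmon interface.
# set_fan_percent is a hypothetical helper; on real hardware it must run as root.

set_fan_percent() {
    local hwmon_dir=$1 percent=$2
    local max
    # pwm1_max is normally 255; fall back to that if the file is missing.
    max=$(cat "$hwmon_dir/pwm1_max" 2>/dev/null || echo 255)
    # 1 = manual control, 2 = automatic control.
    echo 1 > "$hwmon_dir/pwm1_enable" || return 1
    # Scale the percentage to the PWM range (integer arithmetic).
    echo $(( percent * max / 100 )) > "$hwmon_dir/pwm1" || return 1
    # Read back what the driver reports after the writes.
    printf 'pwm1_enable=%s pwm1=%s\n' \
        "$(cat "$hwmon_dir/pwm1_enable")" "$(cat "$hwmon_dir/pwm1")"
}

# Example (assumed path; resolve the hwmon index with a glob):
# for d in /sys/class/drm/card0/device/hwmon/hwmon*; do set_fan_percent "$d" 50; done
```

If the read-back line does not show pwm1_enable=1, the driver did not accept manual mode, which points at the driver or firmware rather than the fan itself.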

Arch, 5.11 kernel, mesa-git, etc…

My Google-Fu has reached the point where it no longer pulls up fresh avenues to look into and just has me running in circles.

Any ideas that might be worth looking into?

Edit: ASUS Strix Vega64

Have you ruled out the possibility of it being a hardware issue?

Not fully, but I would have expected there to be a few signs before failing outright. Though I could be very wrong here…

Usually yeah, I’ve never had a fan just stop all of a sudden, but I’d still check it just in case.
I’d check if the fan spins freely (which it likely does), and then I’d throw it in another machine if you have one.

Good shout!

Well the fans (all three) do indeed spin freely. As for swapping to an alternative tower, unfortunately most of my HW is rack mounted without enough space to shoehorn the card.

For the time being I’m primarily wanting to focus on the SW side, as cobbling together a fresh tower from parts in my spare junk bin (all the past builds that I’ve kept hold of over the years) doesn’t sound like a weekend of entertainment.

I will try and see if I can get the fans spinning up on a slightly older Manjaro build and then play spot the difference.

If you think it’s an update issue I would just pull a Live-ISO and boot that, check if the fan works there since Live-ISOs don’t come with updated-everything.
Side note: Vega 64 on 5.10.14 here and no issues as of yet.

I’m in the process of packaging up an AUR / Debian package for an amdgpu fan controller I’ve been working on and use myself on my Vega64 and 6800xt.

I’ve got it published to the test PyPI instance as I’m just working out the packaging stuff now, so you could pip install it from there. Or snag it from my GitHub for amdfan if you’re comfortable building your own Arch packages etc.

It may help you troubleshoot, to make sure you’re actually setting the sysfs settings properly.

Also on Arch, 5.10.16 no issues with the fan control there either.

I will be keeping an eye out for when your package hits the AUR. I pulled it down from GitHub and went through the motions to get it running, with no issues. However, still no dice…

I have been able to confirm that the fans do still work on kernel 5.9.x (Manjaro) and FreeBSD. So that rules out the possible hardware failure.

After a little more digging about, it appears that my issue could be linked to a regression within the kernel. This commit looks to be the cause…

8d6e65adc25e23fabbc5293b6cd320195c708dca

The good news is that there is a fix in the pipeline, but it appears that the fix is targeting 5.12…

One thing that is bugging the living daylights out of me is that poking about inside sysfs yields no tangible results.

# echo 1 > /sys/class/drm/card0/device/hwmon/hwmon2/pwm1_enable
bash: /sys/class/drm/card0/device/hwmon/hwmon2/pwm1_enable: Permission denied

&

# echo 1 | tee /sys/class/drm/card0/device/hwmon/hwmon2/pwm1_enable
1
# cat /sys/class/drm/card0/device/hwmon/hwmon2/pwm1_enable
0

Even root has no power here… :confused:
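Worth noting that tee just echoes its input, so the "1" it prints says nothing about whether the driver accepted the write. A rough sketch of a write-then-read-back check is below. write_and_verify is a hypothetical helper name, not something from amdfan or the kernel, and it only assumes the attribute can be read back after writing.

```shell
#!/usr/bin/env bash
# Sketch: write a value to a sysfs attribute and confirm it actually stuck.
# write_and_verify is a hypothetical helper name used for illustration only.

write_and_verify() {
    local file=$1 want=$2 got
    # Return 1 if the write itself is rejected (e.g. permission denied).
    if ! printf '%s\n' "$want" > "$file"; then
        echo "write to $file failed" >&2
        return 1
    fi
    # Return 2 if the write "succeeded" but the value did not change.
    got=$(cat "$file")
    if [ "$got" != "$want" ]; then
        echo "$file: wrote $want but read back $got" >&2
        return 2
    fi
    echo "$file = $got"
}

# Example (assumed path, run as root):
# write_and_verify /sys/class/drm/card0/device/hwmon/hwmon2/pwm1_enable 1
```

An exit status of 2 would match the behaviour shown above, where the write appears to go through but pwm1_enable still reads 0.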

Cut the fan power wire and connect it directly to the power supply and be happy))) That is, if you need to solve the problem quickly.

Though that is very tempting. I’ll take a rain-check on the hardware butchery for now. :wink:

@xentoo I’ve added amdfan to the AUR, and the package is up on PyPI as well.

Sucks about the regression in 5.11; I was kinda looking forward to trying it out.

I’m assuming it’s the same commit from the later 5.10.x kernel, the symptoms seem to be the same as discussed here.

So I could be way off the mark, but for now I believe this is the issue I’m fighting with.

@xentoo Updated to 5.11.1 (zen) in Arch and still able to control my Vega’s Fan speeds, not getting any permission denied issues so far.

When I was using a Vega 56 in my old Linux rig, I used a tool called CoreCtrl to edit the fan curve. It’s also useful for raising the power limit, overclocking, etc. It does require that you add amdgpu.ppfeaturemask=0xffffffff to your kernel options, but is otherwise painless to use. Setting that kernel option might also get rid of the permission errors you’ve been seeing in sysfs.
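If you go this route, it is worth confirming the option actually reached the kernel. The sketch below pulls the value out of a command-line string. ppfeaturemask_from_cmdline is a made-up helper name, and the commented commands assume the amdgpu module is loaded and exposes its parameters under /sys/module.

```shell
#!/usr/bin/env bash
# Sketch: report the amdgpu.ppfeaturemask value found in a kernel command line.
# ppfeaturemask_from_cmdline is a hypothetical helper name.

ppfeaturemask_from_cmdline() {
    local cmdline=$1 arg
    # Deliberately unquoted so the command line splits into separate arguments.
    for arg in $cmdline; do
        case $arg in
            amdgpu.ppfeaturemask=*)
                echo "${arg#amdgpu.ppfeaturemask=}"
                return 0
                ;;
        esac
    done
    return 1
}

# On a running system:
# ppfeaturemask_from_cmdline "$(cat /proc/cmdline)" || echo "not set on the command line"
# cat /sys/module/amdgpu/parameters/ppfeaturemask   # value the loaded driver reports
```

Bear in mind the option can also be set through modprobe.d, in which case it will not appear in /proc/cmdline at all and only the /sys/module check is meaningful.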

Finally got the fans to play ball. @robbbot

Solution :- (In my case at least)

  • Updated UEFI (because why not at this point)
  • Pulled the latest kernel source and built it against my last known good config (so now running 5.12-rc1)
  • Moved to using modprobe.d/amdgpu.conf (as having “War and Peace” inside GRUB’s “GRUB_CMDLINE_LINUX_DEFAULT” was getting really annoying)
  • Poked about inside /sys/class/drm/card0/device/hwmon/hwmon2/ {pwm1_enable, pwm1, fan1_enable, fan1_target, etc…} and managed to get a flutter from the fans.
  • Ran “amdfan --manual” (finally manual control is up), then set it to auto, and all seems well now.
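For the modprobe.d step, the move is roughly mechanical: each amdgpu.name=value argument on the kernel command line becomes name=value on an "options amdgpu" line. The sketch below does that conversion. cmdline_to_modprobe is a made-up helper name, and the parameters in the example are purely illustrative, since the thread never lists which options were actually in use.

```shell
#!/usr/bin/env bash
# Sketch: turn amdgpu.* kernel command-line arguments into a modprobe.d options line.
# cmdline_to_modprobe is a hypothetical helper; the parameters below are examples only.

cmdline_to_modprobe() {
    local cmdline=$1 arg opts=""
    # Deliberately unquoted so the command line splits into separate arguments.
    for arg in $cmdline; do
        case $arg in
            # Keep only amdgpu module parameters and strip the "amdgpu." prefix.
            amdgpu.*=*) opts="$opts ${arg#amdgpu.}" ;;
        esac
    done
    # Print nothing (and return non-zero) when no amdgpu parameters were found.
    [ -n "$opts" ] && echo "options amdgpu$opts"
}

# Example:
# cmdline_to_modprobe "quiet amdgpu.ppfeaturemask=0xffffffff amdgpu.dc=1"
# Typical use (assumed path, run as root):
# cmdline_to_modprobe "$(cat /proc/cmdline)" > /etc/modprobe.d/amdgpu.conf
```

If amdgpu is loaded from the initramfs, the image probably needs regenerating (mkinitcpio -P on Arch) before the new file takes effect, and the old entries should come out of the GRUB config so the two do not disagree.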

Now it’s sorted, I’ll be running away from poking sysfs for a while, as apparently with my config "Here be schizophrenic dragons!"

Glad you got it sorted!
