How to overclock Vega on Linux...?

I recently upgraded to a Vega 56 (specifically the MSI Air Boost Vega 56 “OC” edition). Initial clock speeds under Windows were pretty weak; I was only hitting 1320 MHz in demanding scenarios. Then, following some advice from @thro, I undervolted by 100 mV and increased the power limit by 50%, and saw clock speeds jump to a high of 1580 MHz. Keep in mind that the vendor set the clock speed target to 1622 MHz, so technically I’m not overclocking…

I started with Windows, because I’m familiar with stress tests, benchmarks and overclocking tools available on that platform. Now I’m trying to accomplish the same thing on Linux. I found a few scant articles on Phoronix about how to increase the power limit, but even manipulating the files in /sys/class/drm/card0/device/hwmon/hwmon0 will only let me lower the power limit, not raise it. So in Linux I’m still stuck at a 165 watt power limit.
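For anyone following along, here is a minimal Python sketch of the hwmon approach described above. It assumes the amdgpu hwmon directory exposes `power1_cap`, `power1_cap_min` and `power1_cap_max`, all in microwatts; the hwmon index in the path differs between systems, and the function names are only illustrative. The clamp shows why a request above the driver's maximum has no effect.

```python
from pathlib import Path

# Assumed location; the hwmon index (hwmon0 here) differs between systems.
HWMON_DIR = Path("/sys/class/drm/card0/device/hwmon/hwmon0")


def watts_to_microwatts(watts):
    """hwmon power files are expressed in microwatts."""
    return int(round(watts * 1_000_000))


def clamp_power_cap(requested_uw, min_uw, max_uw):
    """Clamp a requested cap to the range the driver advertises."""
    return max(min_uw, min(requested_uw, max_uw))


def read_uw(hwmon_dir, name):
    """Read one integer microwatt value from a hwmon file."""
    return int((hwmon_dir / name).read_text().strip())


def set_power_cap(hwmon_dir, watts):
    """Write a new power cap (needs root on a real system).

    Returns the value actually written, in microwatts.
    """
    lo = read_uw(hwmon_dir, "power1_cap_min")
    hi = read_uw(hwmon_dir, "power1_cap_max")
    value = clamp_power_cap(watts_to_microwatts(watts), lo, hi)
    (hwmon_dir / "power1_cap").write_text(str(value))
    return value
```

Calling `set_power_cap(HWMON_DIR, 220)` as root on a card whose `power1_cap_max` reads 165000000 would still write 165000000, which matches the behaviour described above: the cap can be lowered but not raised past the advertised maximum.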

Any advice? Is this only possible with amdgpu-pro?

There’s overclocking built into amdgpu itself, but it used to not work properly. Might be worth a Google.

It seems that the amdgpu driver only allows lowering the power limit, on all cards. The code responsible for this is here: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/powerplay/amd_powerplay.c?h=v4.19-rc4#n940 (lines 940 and 941). You could just comment out these two lines and see what happens (of course, at your own risk).

I’ve also reported a kernel bug here: https://bugzilla.kernel.org/show_bug.cgi?id=201201

Though for me stock Vega performance is enough for everything I run on Linux.

I took the path of least resistance and flashed the MSI with a Vega 64 BIOS. The power limit is now 220 watts, and clocks are now in the upper 1400s, sometimes the low 1500s. Memory clocks are now 945 MHz, whereas before they were 800 MHz. So far it seems stable in a few benchmark runs, but I still need to put in some extended gaming sessions to make sure.

I’m in the same place: no tools on Ubuntu, and it sucks :frowning:
In Windows I modified the powerplay table so that power usage can now go up to +100%, then used “OverdriveNTool” to specify clocks and a fan target, and finally exported the powerplay table.
I read somewhere that I could just replace the powerplay table (pp_table) in the “/sys/kernel/debug/dri/0/” directory; however, I get a permissions error when trying to delete, replace or modify the current “pp_table” file in that directory.
I’ve also read that Radeon Open Compute’s ROC-smi (https://github.com/RadeonOpenCompute/ROC-smi) will let you specify clocks, power usage and fan/temperature through the terminal, but I haven’t tried that yet. I freaking wish we didn’t have to go through all this BS and could just mod the BIOS like in times past, or at a bare minimum get a Linux GUI. I really kind of hate AMD for locking the BIOS. At least in the past you could overclock AMD GPUs to try to bring their subpar products up to Nvidia’s level, albeit while consuming double the power. The two Vega 64 cards I just bought will likely be the last AMD Radeon cards I buy. The only good thing I can say is that my OC settings got me +16% FPS in Fortnite vs stock. Vega’s still shit tho.

@seansplayin The +33% power limit and 945 MHz memory clocks from the Vega 64 BIOS are “good enough” for me under Linux. That’s already pushing the cooling capabilities of the Vega’s blower, so I don’t have a whole lot of headroom left anyway.

While you’re right about the lack of OC tools being disappointing, I love the seamless amdgpu driver. No muss, no fuss, no worrying if my X desktop will be blown away by a kernel update. So despite the limitations, I’ll still take AMD over Nvidia.

I have the Sapphire Nitro+ Vega 64, so I’m sure I have more thermal headroom than the reference blower design. That said, undervolting by 100 mV drops power usage by 95 W (to 225 W) with the GPU core set to 1680 MHz; that yields an actual core clock of around 1630 MHz while gaming and allows an HBM speed of 1120 MHz as long as the HBM temperature stays around 56 °C. My other profile is -50 mV, which consumes 320 W with the GPU core set to 1730 MHz; that yields an actual core clock of 1655 MHz while gaming and an HBM clock of 1090 MHz as long as the HBM temperature stays below 60 °C.

I get the benefits native kernel support brings, but I’d trade that in a heartbeat for the ability to flash my settings into the GPU BIOS and just have it work on any OS I choose, whether that’s Fedora, Ubuntu or Windows.
Previously, on Ubuntu 12.04/14.04, once I got Catalyst installed I would uncheck xorg and the community AMD driver; then I could run updates and that shit would just work for years at a time. I will admit updating the kernel was problematic tho.

@seansplayin I tried an experiment under Windows last night. With the Vega 64 BIOS (which already has a +33% power limit) I dialed in a +50% power limit in MSI Afterburner. My core clock actually hit 1620 MHz and memory temps stayed at 85 °C or below, which was a surprise. However, the blower was insanely loud and there was little actual difference in Fire Strike/Time Spy. Power consumption was up to 310 W at times, but more often around 295 W.

My MSI card has a reference board design, but has a custom blower. From my experiment, it does seem superior to the reference blower. But it’s still a blower.

Good luck with your card. Please let me know if you have any luck with ROC…

EDIT: According to what I’ve heard, the reason BIOS can’t be flashed to whatever you like is to conform to Microsoft’s Secure Boot. I have some choice words for M$ if that’s true…

There is now a proper patch from AMD that allows increasing the power limit: https://patchwork.freedesktop.org/patch/255970/

This is what I have been looking for, for the last 5 days! Thank you so much for showing me this.

Any idea when this can be expected to land on Arch Linux, or what is required for it to be “activated” or put to use?

Will that code come in the next kernel update?

And could you point me to a how-to article on using this patch?
As far as my experience goes, I have been able to compile some simple Linux programs, but I have never tried anything this complex.

Now I see. It’s part of the kernel, and the patch is not yet included in 4.19.

[insert wait for 4:20 blazin’ fast GPU pun]

So when it gets added to the kernel… do we just echo some value into a file, and then the GPU can target higher clocks based on thermal limits?

It should be like that… I guess it will work the way it does now, except that it will let us raise the power limit instead of only lowering it.

I am currently trying to build my own kernel, but the process has changed so much since 2007 that I am still working on it.

Also, here you can view the file that needs to be patched: https://elixir.bootlin.com/linux/v4.19/source/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
As I stated previously, it is still not patched.

I guess you could say they will be more power hungry.

Here’s how it’s supposed to work:
https://www.phoronix.com/scan.php?page=news_item&px=AMDGPU-OverDrive-Linux-4.15
You’re also supposed to be able to change the power limit with similar methods. And for my card at least, upping the power limit is way more effective.
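As a rough illustration of that OverDrive interface, here is a small Python sketch. It assumes the Vega-era `pp_od_clk_voltage` syntax from the kernel's amdgpu documentation ("s" or "m", then state index, clock in MHz, voltage in mV, followed by "c" to commit), and that the file lives under `/sys/class/drm/card0/device/`. The helper names are made up for this example, and on some kernels OverDrive may need to be enabled first, so treat it as a sketch rather than a recipe.

```python
from pathlib import Path

# Assumed path; the card index may differ on your system.
OD_FILE = Path("/sys/class/drm/card0/device/pp_od_clk_voltage")


def od_command(domain, level, clock_mhz, voltage_mv):
    """Build one OverDrive line: 's' edits a core (sclk) state, 'm' a memory (mclk) state."""
    if domain not in ("s", "m"):
        raise ValueError("domain must be 's' (sclk) or 'm' (mclk)")
    return f"{domain} {level} {clock_mhz} {voltage_mv}"


def apply_od(commands, od_file=OD_FILE):
    """Write each command as its own write, then 'c' to commit (needs root)."""
    for cmd in list(commands) + ["c"]:
        od_file.write_text(cmd + "\n")


# Illustrative values only (taken from the kernel docs' example, not tuned for any card):
# apply_od([od_command("s", 1, 500, 820)])
```

Writing `r` instead of `c` is documented as resetting the table to defaults, which is handy if a setting turns out to be unstable.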

Nothing seems to be working right now.

I tried changing it on the default Fedora 28 kernel, and didn’t get anywhere. Maybe Rawhide would work…?

Yeah I guess we gotta wait for 4.19