Hi all, I’ve been struggling with OC/UV on my Vega 64 and found out a few things, so I’m sharing them here in case they help people stuck with the same issues. This bumps an old thread, sorry if this has already been addressed somewhere else.
I also started with the “echo s …” and “echo m …” commands and found that some settings do get applied, but not the ones that matter to me. The power limit goes up and my card ends up chewing 300W, but the UV settings are ignored: core voltage stays at the default 1200mV and my HBM2 is actually downclocked and stuck at 800MHz, which seems to be a very common issue. All in all not what I want to achieve: a mix of efficient OC/UV and <200W if possible.
So I started looking into the powerplay tables, and I found a way that seems to work, at least in my case. There is certainly a better / more straightforward way of doing this, but at least it’s a start.
You need a tool that can produce a binary powerplay table. I do it this way: I use OverdriveNTool in a Windows VM to produce a registry file from a Vega 64 BIOS downloaded from TechPowerUp. Then I use Notepad to remove ALL characters other than the hex code. I then save this as an ANSI text file that I feed into the Java tool from here (you need the Java 10 SDK):
https://github.com/xmrminer01102018/VegaToolsNConfigs/tree/master/config/PPTDIR
This produces the powerplay table binary file. If the size of this file is anything other than 694 bytes, or if the Java tool complains, you’re doing something wrong.
(edit Nov 2019: if you download the .jar file by clicking on it, it seems to be corrupt ATM, or there is something I’m missing with GitHub. What you need to do is clone the git repository and then go to the directory containing the .jar)
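In case it helps anyone who wants to skip the Notepad cleanup, here is a rough Python sketch of the "keep only the hex code" step. It is not what the Java tool does internally (I haven't checked), and it assumes the registry file stores the table as a standard "hex:" value with comma-separated bytes and backslash line continuations. The function name reg_hex_to_bytes is just something I made up for illustration.

```python
import re


def reg_hex_to_bytes(reg_text: str) -> bytes:
    """Extract the first 'hex:' value from .reg file text and return it as bytes.

    Assumption: the value uses the usual .reg layout, i.e. comma-separated
    two-digit hex bytes, with a trailing backslash on every continued line.
    """
    _, marker, tail = reg_text.partition("hex:")
    if not marker:
        raise ValueError("no 'hex:' value found in the registry text")

    chunks = []
    for line in tail.splitlines():
        stripped = line.strip()
        chunks.append(stripped.rstrip("\\"))
        # A line without a trailing backslash ends the value.
        if not stripped.endswith("\\"):
            break

    tokens = [t for t in re.split(r"[,\s]+", "".join(chunks)) if t]
    # int(t, 16) rejects non-hex tokens, bytes() rejects values above 0xff.
    return bytes(int(t, 16) for t in tokens)
```

You would read the file into a string first (registry exports may be UTF-16, so pick the encoding accordingly), write the returned bytes to a file, and then check that the result is 694 bytes as mentioned above. If the size is off, don't trust the output.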
I then feed this binary to the AMDGPU driver with a
cat binary_ppt > /sys/class/drm/card0/device/pp_table
(adjust as needed if your card has a different ID)
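If you want a safety net around that cat, something like the following sketch refuses to upload a table that doesn't have the expected size. The 694-byte figure is just what I see for my Vega 64, and upload_pp_table is a hypothetical helper name, not part of any existing tool. It needs to run as root to write to sysfs.

```python
from pathlib import Path

EXPECTED_SIZE = 694  # size I get for a Vega 64 table; an assumption for other cards
DEFAULT_SYSFS = "/sys/class/drm/card0/device/pp_table"  # adjust the card ID as needed


def upload_pp_table(table_path, sysfs_path=DEFAULT_SYSFS, expected_size=EXPECTED_SIZE):
    """Write a binary powerplay table to the driver after a size sanity check."""
    data = Path(table_path).read_bytes()
    if len(data) != expected_size:
        raise ValueError(
            f"{table_path} is {len(data)} bytes, expected {expected_size}; not uploading"
        )
    # Unbuffered so the whole table goes to the driver in a single write.
    with open(sysfs_path, "wb", buffering=0) as f:
        f.write(data)
    return len(data)
```

Calling upload_pp_table("binary_ppt") does the same thing as the cat above, only with the size check in front.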
Then I noticed that my HBM2 is still stuck at 800MHz. But for whatever reason, if I do a
echo "m 3 1050 900" > /sys/class/drm/card0/device/pp_od_clk_voltage
on top of having applied the PPT, this gets me where I wanted: all “s” and “m” states are exactly as I wanted them:
cat /sys/class/drm/card0/device/pp_od_clk_voltage
OD_SCLK:
0: 852Mhz 800mV
1: 991Mhz 810mV
2: 1084Mhz 820mV
3: 1138Mhz 840mV
4: 1200Mhz 870mV
5: 1401Mhz 900mV
6: 1536Mhz 930mV
7: 1630Mhz 960mV
OD_MCLK:
0: 167Mhz 800mV
1: 500Mhz 800mV
2: 800Mhz 820mV
3: 1050Mhz 900mV
OD_RANGE:
SCLK: 852MHz 2400MHz
MCLK: 167MHz 1500MHz
VDDC: 800mV 1200mV
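To check from a script that the states really are what you asked for, a small parser along these lines could work. It is only a sketch that assumes the output layout shown above (an "OD_..." header followed by "index: clock voltage" lines); parse_od_states is a made-up name, and other cards or kernels may print a different format.

```python
import re

HEADER_RE = re.compile(r"^(OD_\w+):$")
STATE_RE = re.compile(r"^(\d+):\s*(\d+)MHz\s+(\d+)mV$", re.IGNORECASE)


def parse_od_states(text):
    """Return {section: {index: (mhz, mv)}} for the state lines in the output.

    Lines that are not 'index: clock voltage' (e.g. the OD_RANGE entries)
    are ignored.
    """
    states = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            section = header.group(1)
            continue
        state = STATE_RE.match(line)
        if state and section is not None:
            index, mhz, mv = (int(g) for g in state.groups())
            states.setdefault(section, {})[index] = (mhz, mv)
    return states
```

With the output above, parse_od_states(text)["OD_MCLK"][3] gives (1050, 900), which is the HBM2 state I was after.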
This passes a loop of Unigine Superposition, the card draws 190W, GPU frequency is at 1488MHz stable (I’m on water), and HBM2 is at 1050MHz. Yay.
FYI I use kernel 5.1.9 ATM, and amdgpu.ppfeaturemask=0xffffffff in the kernel boot string, nothing else.
Hope it helps, and if you have a more straightforward way to produce PPT tables on Linux, let me know. Note I don’t claim my PPT table is stable or fully optimized; what matters is getting the OC/UV settings actually applied to the GPU…