MI25, Stable Diffusion's $100 hidden beast

  • In this guide I’ll show you how to get Stable Diffusion up and running on your $100 MI25 under Linux

Cooling

  • This thing does not come with a fan, so you need to rig up your own cooling solution

  • This thing runs HOT and its heatsink is not that large; I had enough space to fit an entire blower fan in the shroud

  • You might be able to do this on the stock MI25 BIOS, or even flashed to a Vega FE, but I’ve done my testing with the MI25 flashed to a WX9100

  • The MI25 and Vega FE both have higher power limits, but neither has working video output, and with the stock heatsink it’s VERY hard to cool more than 170 W

  • A stock Vega FE BIOS will need A LOT of voltage tuning to be stable

  • There are VRMs on the back that you need to keep cool as well; they need direct airflow

  • I went a little overboard, but an 80mm x 15mm or 92mm x 15mm fan raised 5 mm off the back of the card would be fine

  • Just don’t use metal to raise it; sticky tack or rubber feet are fine

Transformation Sequence

  • The MI25 has a Mini DisplayPort meant for debugging, but it’s not active under any VBIOS except the WX9100’s

  • If you wish to use it you must free it from its cage

  • Be very gentle, as it’s a fragile connector not designed for regular use; maybe add some hot snot (hot glue) to reinforce it

  • First things first, we need to get your MI25 flashed into a WX9100

  • You’ll need ATI Flash 2.93 specifically

  • You’ll need a UEFI VBIOS, which you can nab here

  • Make sure CSM is disabled

  • Put the ROM in the same folder as the flasher

  • Then open cmd as admin and navigate to the folder with the flasher inside; you want to use amdvbflash.exe, not amdvbflashWin.exe

  • Run amdvbflash.exe -i to find the device ID

  • amdvbflash.exe -p 0 218718.rom -fs
    where 0 is the device ID of the MI25
    you might have to right-click amdvbflash.exe and choose Run as administrator once first to get it working in cmd


Installing Linux

  • First things first, get yourself an ISO of Ubuntu 20.04.4
  • Just Ctrl+F 20.04.4-desktop-amd64.iso

  • It has to be that exact version to save yourself some headache; if you’re a Linux Jedi master maybe you can use something else, but I am a caveman, and if I can do it I know you can

  • I tend to install it with the minimal setup; you can download updates if you want


Hyper Specific Kernel

  • If you let it update the kernel, we’ll have to remove the new one it installs
  • You’ll want to boot into the 5.13.0-30 kernel; you can do this from the advanced boot options right after the PC posts its splash screen
  • To remove the newer kernels, follow this guide

https://help.ubuntu.com/community/RemoveOldKernels

  • To show what kernels are installed
apt-mark showauto 'linux-image-.*'
  • Then copy and paste onto the end of the command below any kernel that isn’t one of these two
linux-image-5.13.0-30-generic
linux-image-generic-hwe-20.04
sudo apt-get purge linux-image-version-you-want-to-remove linux-image-unsigned-same-version
  • After that it’ll leave an unsigned version behind that you’ll need to purge too; just run the same command, and after that
sudo apt autoremove
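  • As a concrete sketch, assuming apt-mark listed a newer 5.15.0-56 kernel (a placeholder version, yours will differ), the whole sequence looks like this
uname -r
# should print 5.13.0-30-generic, i.e. you are booted into the kernel you’re keeping
sudo apt-get purge linux-image-5.15.0-56-generic linux-image-unsigned-5.15.0-56-generic
sudo apt autoremove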

Reboot


AMD ROCm Suite

  • Now we want the 5.2.5 installer of ROCm, which we can get from here

http://repo.radeon.com/amdgpu-install/22.20.5/ubuntu/focal/
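  • Something like the following should fetch and install it; the exact .deb filename is my guess from the repo’s naming pattern, so check the directory listing above first
wget http://repo.radeon.com/amdgpu-install/22.20.5/ubuntu/focal/amdgpu-install_22.20.50205-1_all.deb
sudo apt install ./amdgpu-install_22.20.50205-1_all.deb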

  • Once that package is installed, run
amdgpu-install --usecase=rocm,lrt,opencl,openclsdk,hip,hiplibsdk,dkms,mllib
  • Then we need to add your user to these groups, so just type your username at the end of each command
sudo usermod -a -G video 
sudo usermod -a -G render  
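  • If you’re already logged in as that user, $USER saves you typing it out
sudo usermod -a -G video $USER
sudo usermod -a -G render $USER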
  • Wooo buddy, a 16 GB driver suite; not sure if you need all of those usecases but we can figure that out later
    Reboot

Dependencies, Dependencies, Dependencies!


  • Next we need a few prerequisites
sudo apt install -y git
sudo apt install -y python3
sudo apt install -y python3-pip 
  • Now we need the appropriate version of PyTorch, the ROCm 5.2 variant
pip3 install torch torchvision --extra-index-url https://download.pytorch.org/whl/rocm5.2
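  • To sanity-check that PyTorch can actually see the card (the ROCm builds expose it through the regular torch.cuda API), this one-liner should print True and the device name
python3 -c 'import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))'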

Stable Diffusion with a web GUI

  • Next we need to grab the webui for Stable Diffusion
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
  • It’s at this point you want to grab your models and VAE and put them in the appropriate folders (see below) before starting
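  • For reference, these are the webui’s default locations (relative to the repo you just cloned), nothing MI25-specific
stable-diffusion-webui/models/Stable-diffusion/   <- checkpoints (.ckpt / .safetensors)
stable-diffusion-webui/models/VAE/                <- VAE files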

  • FP16 is twice as fast and uses half as much VRAM, but you need FP16 checkpoints to use it

cd stable-diffusion-webui
  • If you have FP16 checkpoints just use
python3 launch.py
  • If you use regular FP32 then do
python3 launch.py --precision full --no-half
  • You have to enter those commands while you’re in the stable-diffusion-webui folder in the terminal

  • You might get a warning like this every time you start the program and do your first generation

MIOpen(HIP): Warning [SQLiteBase] Missing system database file: gfx900_64.kdb Performance may degrade. Please follow instructions to install: https://github.com/ROCmSoftwarePlatform/MIOpen#installing-miopen-kernels-package
  • I haven’t figured out how to fix that yet, but it’s fine; the first generation will take forever but the rest will be fine

Underclocking? In My Linux?

  • The WX9100 BIOS is a little more aggressive with the frequency than the stock MI25, so you may want to limit the core to 991 or 1138 MHz using CoreCtrl and rocm-smi
    CoreCtrl: https://gitlab.com/corectrl/corectrl

  • You can set the power limit with rocm-smi using this command

rocm-smi -d 0 --setpoweroverdrive 140
  • Replace 140 with how many watts you want it to top out at
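  • A couple of optional rocm-smi sketches: plain rocm-smi in a watch loop is handy for keeping an eye on clocks, temps and power while it generates, and if you’d rather cap the core from the CLI instead of CoreCtrl, --setsclk pins it to a DPM state (the state-to-MHz mapping depends on the VBIOS, so check what rocm-smi reports first)
watch -n 2 rocm-smi
rocm-smi -d 0 --setperflevel manual
rocm-smi -d 0 --setsclk 3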

  • You will likely have no voltage control

28 Likes

At roughly 1 GHz, 16 images at 35 steps and 640x768 take about 20 minutes
not bad for 100 bucks


5 Likes

Hey, congrats, you got it to work! You should really come to the channel (maybe) and show this off!

2 Likes

Nah I’m no good on camera, I’ll leave that to Wendell, video in the works

4 Likes

How far is the P100 fund from victory?

1 Like

Yes this is awesome.

Don’t forget we have that liquid block V100 you could use for testing

2 Likes

You stream on Twitch; I’d argue you could tailor a better image over on YouTube where there is prep time.

1 Like

Pretty awesome! Tempted to pick one up to play around with and compare along with my RTX A5000 and Radeon Pro Duo

Found this SD benchmark for Nvidia, AMD & Intel; the 7900 XTX is doing well on the charts now with its WMMA.
They also talked about models for the 6000 series that should boost performance; could also look at running in lower precision?

2 Likes

Yeah, the 6000 series is not as fast as it could be in that Tom’s Hardware benchmark. On Linux with ROCm you get about double the speed they show there.

2 Likes

I kept getting NaN errors at higher resolutions under half precision; maybe I need to use a different sampler, model, VAE, or something.
The number of abominations also drastically increased with resolution, which limits the usefulness of FP16 beyond doubling the speed of 512x512 images.

It’s great for consumer Vega, which only has 8 GB of VRAM.

Could you generate images at 512 and use an upscaler on the ones worth keeping?

1 Like

Probably, though the upscaler in AUTOMATIC1111 is broken at the moment.

I’ve also found Euler a is twice as fast as the DPM samplers, but DPM doesn’t give as many abominations.

Testing whether the MI25 VBIOS works, and how much faster and hotter it’ll be

4 Likes

Most MI25s come with a 110 W BIOS; this is the 220 W one

1 Like


You can make it a fire-breathing Vega FE if you really want to; I don’t like having my edge and HBM temps that high

2 Likes

6 Likes

After much modification and testing…
maybe don’t go past 220 W…
Your core, junction and HBM will be fine, but your VRM WILL burn your finger at 265 W, even with three 6000 RPM fans pointed at it.
I think the Vega FE BIOS has lower voltage per level; if it’s unstable, use the MI25 220 W BIOS.

Don’t go above 1440 MHz SET on any BIOS.
220 W BIOS

264 W BIOS

If you get the 264 W one, I would set the limit to 233 W and the GPU core to 1348 MHz SET; you get 1300 MHz GET with that, and that’s as hard as I would push it. That’ll get you 2.37 it/s at 512 with Euler a.

3 Likes