Mi25, Stable Diffusion's $100 hidden beast

thanks. i don’t know if it’s unstable yet. i rendered the classroom scene in blender on windows and it got HOT! yes, i have a 100mm fan pulling air out the back.

it’s going to be used in a Linux environment, so i’ll probably underclock it as you described in your first post.

how much of a drop did you get using liquid metal?

thanks

Pretty noticeable, but you need more than an intake. I recommend a blower fan.

It’s not just the core that needs to keep cool; the VRM gets toasty too.

thanks. doesn’t the VRM get cooling from the front of the card/opening? the fan i’m using on the back of the case is sealed up tight, drawing all of its air through the card. [edit] i see the VRM (as you show) is on the back of the card. i have a 590 next to it and i’m forcing those fans to run. i’ll be adding a fan to move that air up. my motherboard is laid flat, not on its side.

if i underclock it in linux, shouldn’t the voltage going to the GPU drop? for example P2 at 1138MHz / 950mV? that should be much easier on the VRM than 1500MHz @ 1.2V.
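
As a rough sanity check on the question above: dynamic power in CMOS logic scales approximately with frequency times voltage squared. The sketch below applies that approximation to the two operating points quoted in the post. It ignores leakage, HBM power and VRM losses, and `relative_power` is just an illustrative helper name, not part of any tool.

```shell
# Rough estimate only: assumes dynamic power ~ f * V^2.
# Usage: relative_power <mhz_new> <mv_new> <mhz_ref> <mv_ref>
relative_power() {
    LC_ALL=C awk -v f1="$1" -v v1="$2" -v f0="$3" -v v0="$4" \
        'BEGIN { printf "%.2f\n", (f1 / f0) * (v1 / v0) ^ 2 }'
}

# 1138 MHz @ 950 mV compared against 1500 MHz @ 1200 mV.
relative_power 1138 950 1500 1200
```

This prints 0.48, so under this approximation the P2 point needs a bit under half the dynamic core power of the 1500MHz point, which fits the expectation that underclocking is easier on the VRM.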

Well, that’s half of it; the other half is under a heatpipe.

ah, i see. thanks

The seller of the MI25 cooling shrouds on eBay finally released all of their .stl files on Thingiverse after I messaged them about it. They are really nice and should be compatible with other Instinct cards. They are available here: https://www.thingiverse.com/thing:6636428

I have been attempting to compile SYCL code with openSYCL to run on a Vega 64 (and soon an MI25). openSYCL uses ROCm as a backend, and so far my code only outputs garbage answers, even with a simple vector addition example. What was the motivation to use ROCm 5.2.5? I have been looking into installing ROCm 4.5.2 since it’s the last supported release for gfx900 GPUs, so I was wondering why this guide uses a later, unsupported release.

It’s supported, it just isn’t listed as supported.

Has anyone tried passing this through to a VM? I’m getting a lot of bluescreens, and the VM sometimes won’t start depending on whether I blacklisted the driver. The driver even crashed a few times.

Did you do the scary thing and use the amdgpu and ROCm packages for Ubuntu, or did you only use the ROCm libs in Debian? I only get noise when I try to use it on my AMD RX 5600 XT.

Has anyone tested ROCm 5.6? I just upgraded from ROCm 5.5 and am seeing some speed improvements in training. On my 6900 XT, the time to process all of my data went from 1 hour 30 minutes to 1 hour 20 minutes, so at least for what I was doing it seems to be faster.

Hi,

So just to make sure I’m clear: if I flash the Vega 64 VBIOS, will the MI25 be more compatible with later versions of ROCm?

Also, I flashed my MI25 with the WX9100 VBIOS, but the readings from CoreCtrl in openSUSE (it’s what I daily drive, and I just wanted to see how it would interact with the card) are all over the place. Do I need to run the card in Ubuntu 22.04.4 to get proper readings from CoreCtrl, or do I need the Vega VBIOS for that?

Also, I don’t know if anyone’s tried it, but when I booted the card with the WX9100 VBIOS it was pretty capable at gaming. I just couldn’t get any readings from CoreCtrl to see the temps and voltages.

You shouldn’t flash a Vega 64 VBIOS, because those cards do not have 16GB of memory. Only the FE and the WX9100 VBIOSes are correct for the MI25, and only the WX9100 one has working DisplayPort.

CoreCtrl works fine on Arch with the WX9100, so you’re probably looking at some other issue. I recommend checking what the debug interface located at /sys/kernel/debug/dri//amdgpu_pm_info spits out.

Try using sudo watch -n1 "cat /sys/kernel/debug/dri/0/amdgpu_pm_info". If you don’t see anything wacky, then the problem is likely the application itself.
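
If debugfs is not mounted or the pm_info dump is hard to read, the standard hwmon sysfs files are another place to cross-check what CoreCtrl shows. A minimal sketch, assuming the card is `card0` (the index may differ on multi-GPU systems) and that `temp1_input` is reported in millidegrees Celsius, as hwmon drivers normally do. `millideg_to_c` is a hypothetical helper name.

```shell
# Convert a hwmon temperature file (millidegrees Celsius) to degrees.
millideg_to_c() {
    LC_ALL=C awk '{ printf "%.1f\n", $1 / 1000 }' "$1"
}

# card0 is an assumption; adjust the index to match your system.
for f in /sys/class/drm/card0/device/hwmon/hwmon*/temp1_input; do
    if [ -r "$f" ]; then
        echo "GPU temp: $(millideg_to_c "$f") C"
    fi
done
```

If these values look sane while CoreCtrl’s do not, that also points at the application rather than the card or VBIOS.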

Hey, I just got my MI25 up and running with the WX9100 BIOS. SD is running, but unfortunately I can only get it to run with the --lowvram argument. If I use --medvram, SD will try to run for a second and then the system immediately BSODs. I see a few other people have a similar issue, but I have only seen people talking about solving it on Linux. I am wondering if there is a workaround on Windows.

Edit: Disregard. Immediately after posting, I realized I should just try running SD without any arguments at all, and found out it works just fine.

I also figured out that disabling the HBCC memory segment in the AMD driver improves overall stability.

BTW there is a new driver from a Chinese source that gets this card working now.

Got it working on mine. I could DM you the link, as it’s on another forum.

Someone who knows how to use the HxD hex editor should have a look at the Chinese driver and the official one to make sure it is safe to use, and to find out what was changed to make it work.

So, small tip if you’re using the pptables: loading a pptable disables AVFS (adaptive voltage frequency scaling). That means you lose power savings unless you set the voltages manually, and performance and performance per watt will be worse. To re-enable AVFS you have to set the feature mask via pp_features, like so:
sudo bash -c "echo 0x000000000ba1ff4f > /sys/class/drm/card0/device/pp_features"
This is kinda important if you don’t want to touch the voltages but do want the extra power / clock headroom.
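
Since a wrong feature mask can change how the card behaves, it may be worth saving whatever pp_features currently reports before overwriting it. The sketch below does that. `set_pp_features` and the backup file location are hypothetical, the read-back format is not assumed and is stored verbatim, and the mask is the one quoted in the post, not one verified here.

```shell
# Save the current pp_features output, then write a new mask.
# $1 = hex mask, $2 = pp_features path, $3 = backup file (hypothetical)
set_pp_features() {
    case $1 in
        0x*) ;;
        *) echo "mask must be hex, e.g. 0x000000000ba1ff4f" >&2; return 1 ;;
    esac
    # Keep a verbatim copy of the current state before changing anything.
    cat "$2" > "$3" || return 1
    echo "$1" > "$2"
}
```

From a root shell with the function defined, a call might look like `set_pp_features 0x000000000ba1ff4f /sys/class/drm/card0/device/pp_features /root/pp_features.before`, where `card0` is an assumption about which index the MI25 has.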

I got my second MI25 installed into my water loop. I flashed the BIOS, and everything seems to be functional in Linux (although I am still working on getting Stable Diffusion working in openSUSE). On Windows 11, however, I am getting a BSOD about 5 minutes after bootup, claiming an issue with driver power states.

Disabling the second GPU in Device Manager solves the BSOD issue, but obviously that is not ideal. The card itself seems to be working. It outputs video, plays games, and reports itself as a WX9100, so I don’t see why this second card is an issue. Any thoughts?

Where do you find the drivers?

Sorry mate, haven’t been here for a while. I didn’t install Ubuntu packages in Debian at all, so this was only what is in the Debian repos.
But in the end I never needed those either. The PyTorch install for AMD already bundles a specific ROCm version, so that was enough to get SD going.

Do you have more info you can give? We know Windows drivers exist for the MI25, but last I heard they were locked to Azure cloud VMs.