Thanks. Doesn’t the VRM get cooling from the front of the card/opening? The fan I’m using on the back of the case is sealed up tight, drawing all of its air through the card. [edit] I see the VRM (as you show) is on the back of the card. I have a 590 next to it and I’m forcing those fans to run. I’ll be adding a fan to move that air up. My motherboard is laid down flat, not on its side.
If I underclock it in Linux, shouldn’t the voltage going to the GPU drop? For example, P2 at 1138 MHz and 950 mV, etc.? That should be much easier on the VRM than 1500 MHz @ 1.2 V.
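As a rough sanity check on that intuition, here is a minimal sketch using the textbook approximation that dynamic (switching) power scales with frequency times voltage squared. That model is an assumption: it ignores leakage and memory power, and the function name is just for illustration. The two operating points are the ones quoted above.

```python
def relative_dynamic_power(freq_mhz, volts, ref_freq_mhz, ref_volts):
    """Return dynamic power at (freq, volts) as a fraction of the reference point.

    Assumes the common CMOS approximation P ~ f * V^2 (no leakage, no HBM power).
    """
    return (freq_mhz / ref_freq_mhz) * (volts / ref_volts) ** 2


# P2 (1138 MHz @ 0.95 V) compared with 1500 MHz @ 1.2 V
ratio = relative_dynamic_power(1138, 0.95, 1500, 1.2)
print(f"estimated core power: {ratio:.0%} of the 1500 MHz / 1.2 V figure")
```

Under this simplified model the lower state lands at a bit under half the switching power, so the VRM load should drop noticeably, provided the driver really does lower the voltage along with the clock.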
The seller of the MI25 cooling shrouds on eBay finally released all of their .stl files on Thingiverse after I messaged them about it. They are really nice and should be compatible with other Instinct cards. They are available here: https://www.thingiverse.com/thing:6636428
I have been attempting to compile SYCL code with openSYCL to run on a Vega 64 (and soon an MI25). openSYCL uses ROCm as a backend, and so far my code only outputs garbage answers for a simple vector addition example. What was the motivation to use ROCm 5.2.5? I have been looking into installing ROCm 4.5.2 since it’s the last supported release for gfx900 GPUs, but I was wondering why this guide uses a later, unsupported release.
Has anyone tried passing this through to a VM? I’m getting a lot of bluescreens, and the VM sometimes won’t start, depending on whether I blacklisted the driver… the driver even crashed a few times.
Did you do the scary thing and use the amdgpu and ROCm packages for Ubuntu, or did you only use the ROCm libs in Debian? I only get noise when I try to use it on my AMD RX 5600 XT.
Has anyone tested ROCm 5.6? I just upgraded from ROCm 5.5 and am seeing some speed improvements in training. On my 6900 XT, the time to process all of my data went from 1 hour 30 minutes to 1 hour 20 minutes, so at least for what I was doing it seemed to be faster.
So just to make sure I’m clear, if I flash to the Vega 64 VBIOS, the MI25 will be more compatible with later versions of ROCm?
Also, I flashed my MI25 to the WX9100 VBIOS, but the readings from CoreCtrl in openSUSE (it’s what I daily drive, and I just wanted to see how it would interact with the card) are all over the place. Do I need to run the card in Ubuntu 22.04.4 to get proper readings from CoreCtrl, or do I need to use the Vega VBIOS for that?
Also, I don’t know if anyone’s tried, but when I booted the card with the WX9100 VBIOS, it was pretty capable at gaming… I just couldn’t get any readings from CoreCtrl to see the temps and voltages.
You shouldn’t flash a Vega 64 VBIOS because those cards do not have 16 GB of memory. Only the FE and the WX9100 VBIOSes are correct for the MI25, and only the WX9100 has working DisplayPort.
CoreCtrl works fine on Arch with the WX9100, so you’re probably looking at some other issue. I recommend checking what the debug interface located at /sys/kernel/debug/dri//amdgpu_pm_info spits out.
Try using sudo watch -n1 "cat /sys/kernel/debug/dri/0/amdgpu_pm_info". If you don’t see anything wacky, then the problem is likely the application itself.
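If you are not sure which index under the dri directory belongs to the card, a small sketch like this can list every amdgpu_pm_info file it finds. The function name and the base_dir parameter are made up for illustration. debugfs normally needs root to read, so run it with sudo.

```python
import glob
import os


def find_pm_info_files(base_dir="/sys/kernel/debug/dri"):
    """Return sorted paths of every amdgpu_pm_info file one level below base_dir."""
    return sorted(glob.glob(os.path.join(base_dir, "*", "amdgpu_pm_info")))


if __name__ == "__main__":
    paths = find_pm_info_files()
    if not paths:
        print("no amdgpu_pm_info found (not root, or debugfs not mounted?)")
    for path in paths:
        print(f"==> {path}")
        with open(path) as handle:
            print(handle.read())
```

With more than one AMD card installed it should print one block per card, which makes it easier to tell which index to put in the watch command.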
Hey, I just got my MI25 up and running with the WX9100 BIOS. SD is running, but unfortunately I can only get it to run with the --lowvram argument. If I use --medvram, SD will try to run for a second and then it immediately BSODs. I see a few other people have a similar issue, but I have only seen people talking about solving the issue on Linux. I am wondering if there is a workaround on Windows.
Edit: Disregard. Immediately after posting, I realized I should just try running SD without any arguments at all, and found out it works just fine.
I also figured out that disabling the HBCC memory segment in the AMD driver improves overall stability.
Someone who knows how to use the HxD hex editor should compare the Chinese driver with the official one to make sure it is safe to use, and find out what changes were made to make it work.
So, small tip if you’re using the pptables: loading a pptable disables AVFS (adaptive voltage frequency scaling), so you lose the power savings unless you set the voltages manually, and performance and performance per watt will be worse. To re-enable AVFS you have to set the feature mask via pp_features,
like so: sudo bash -c "echo 0x000000000ba1ff4f > /sys/class/drm/card0/device/pp_features"
This is kinda important if you don’t want to touch the voltages but want the extra power/clock headroom.
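To see what loading a pptable actually changed, it can help to note the feature mask before and after and compare the two bit by bit. This is only a sketch: the helper name is made up, it does not know what each bit means (reading pp_features on your own card should list that), and the values in the example are toy numbers, not real masks.

```python
def changed_bits(before, after):
    """Return the bit positions (0 = least significant) that differ between two masks."""
    diff = before ^ after
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


# Toy values only; substitute the masks you read from pp_features
# before and after loading the pptable.
print(changed_bits(0b1010, 0b0110))
```

If the list comes back empty, the pptable did not change the mask and there is nothing to restore.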
I got my second MI25 installed into my water loop. I flashed the BIOS, and everything seems to be functional in Linux (although I am still working on getting Stable Diffusion working in openSUSE). On Windows 11, however, I get a BSOD about 5 minutes after bootup that claims an issue with driver power states.
Disabling the second GPU in Device Manager solves the BSOD issue, but obviously that is not ideal. The card itself seems to be working. It outputs video, plays games, and says it’s a WX9100, so I don’t see why this second card would be an issue. Any thoughts?
Sorry mate, haven’t been here for a while. I didn’t install Ubuntu packages in Debian at all, so I only used what is in the Debian repos.
But in the end I never needed those either. The PyTorch install for AMD already bundles a specific ROCm version, so that was enough to get SD going.
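A quick way to confirm which ROCm (HIP) version a PyTorch install bundles is something like the sketch below. It assumes a ROCm build of PyTorch, where torch.version.hip is set; on CUDA or CPU-only builds it is None. The function name is just for illustration.

```python
def describe_torch_backend():
    """Return a short description of the PyTorch build and whether it sees a GPU."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    hip = getattr(torch.version, "hip", None)  # set on ROCm builds only
    backend = f"ROCm/HIP {hip}" if hip else "non-ROCm build"
    # ROCm builds expose the GPU through the torch.cuda API
    return f"torch {torch.__version__} ({backend}), GPU visible: {torch.cuda.is_available()}"


print(describe_torch_backend())
```

Run it inside the same Python environment that SD uses, since a system-wide PyTorch may be a different build.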