Radeon Instinct MI50 and ollama in a VM

So, these days you can find some 32GB Radeon Instinct MI50 cards for around $200, which seems quite a bargain if you want to experiment a bit with AI on the cheap.
So I bought one, and here are some random notes from my journey to get it working.

First, the MI50 is no longer supported in ROCm - the latest version that supports it is 6.3.3.
Also, after struggling to get amdgpu-dkms to compile on Ubuntu 24.04, I switched to Ubuntu 22.04 with the 5.15 kernel.

So, here are more-or-less the steps I followed to make it work.

First, pass the MI50 to the VM in the usual way, nothing strange here. But you'll need the vendor-reset DKMS module, otherwise the MI50 won't work properly in the VM.
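For reference, here is a rough sketch of building vendor-reset from gnif's repository on the host - the package names are the usual Ubuntu ones, so adjust for your distro:

```shell
# Build and install the vendor-reset DKMS module on the host
# (repository: https://github.com/gnif/vendor-reset)
sudo apt install dkms git linux-headers-$(uname -r)
git clone https://github.com/gnif/vendor-reset.git
cd vendor-reset
sudo dkms install .
# Load it now and on every boot, before the guest grabs the GPU
sudo modprobe vendor-reset
echo vendor-reset | sudo tee -a /etc/modules
```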

Second, no SPICE video: ROCm seems to get confused when there's a virtual GPU in the system and tries to use it - failing miserably and falling back to the CPU. Setting various environment variables like CUDA_VISIBLE_DEVICES didn't help either.
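A quick way to check you're in the good state - these are just generic sanity checks inside the guest, nothing MI50-specific:

```shell
# ROCm should report the MI50 (gfx906) and nothing else
rocminfo | grep -E "Name:|gfx"
# List display devices in the guest; ideally only the MI50 shows up,
# with no virtual VGA adapter for ROCm to trip over
lspci | grep -iE "vga|display"
```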

After setting up the VM, install ROCm 6.3.3:

wget -c https://repo.radeon.com/amdgpu-install/6.3.3/ubuntu/jammy/amdgpu-install_6.3.60303-1_all.deb
sudo dpkg -i ./amdgpu-install_6.3.60303-1_all.deb
sudo amdgpu-install --vulkan=amdvlk --usecase=rocm,lrt,opencl,openclsdk,hip,hiplibsdk,dkms,mllib
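After the install it's worth verifying that ROCm actually sees the card before moving on; the group names below are the ones ROCm normally uses on Ubuntu:

```shell
# Allow your user to access the GPU device nodes
sudo usermod -aG render,video $USER
# Log out and back in, then check the MI50 is detected (it reports gfx906)
rocminfo | grep gfx
rocm-smi
```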

After that, install ollama 0.12.4 - later versions don't support the MI50 anymore; maybe it will work again with Vulkan support, but that's still experimental and you'd have to compile it yourself.

curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION=0.12.4 sh
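To confirm ollama is really using the GPU and not quietly falling back to the CPU (the model name below is just an example - any small model works):

```shell
# Run a small model, then check where it landed
ollama run llama3.2 "hello"
ollama ps   # the PROCESSOR column should read "100% GPU"
```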

With this you should be good to go (hopefully 😉).

Hope it helps other people also trying to use this card 🙂
Bye
Andrea

PS: I also tried llama.cpp, but it segfaults when trying to run a model.


If you are using Debian or Arch Linux, you can use the distro-maintained version of ROCm, and you don't need the DKMS module to pass the card through to a container (at least if you are using LXC). I am on ROCm 7.0+.
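For anyone following this route, passing the GPU into an LXD container is a one-liner; the container and device names here are placeholders:

```shell
# Pass the GPU into an LXD container ("mi50" is an arbitrary device name)
lxc config device add mycontainer mi50 gpu
# Inside the container, the ROCm device nodes should then be visible
ls /dev/dri /dev/kfd
```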

Debian and Arch Linux enable support for all ROCm-capable devices. You may have to use some overrides once AMD removes gfx families from the header files of some of the libraries, as they have been known to do.
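The usual override is the HSA_OVERRIDE_GFX_VERSION environment variable; whether it helps depends on which library dropped the ISA, but since the MI50 is gfx906 the value would be:

```shell
# The MI50 is gfx906; the value format is major.minor.stepping
export HSA_OVERRIDE_GFX_VERSION=9.0.6
```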


If you compile llama.cpp with the Vulkan backend, the MI50 does inferencing just fine. You need the Vulkan SDK linked during compilation for some libraries, but after compilation all you need is the amdgpu driver active and the compiled program.
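A build along these lines, assuming the Vulkan SDK is already installed (the model path in the last line is just a placeholder):

```shell
# Build llama.cpp with the Vulkan backend
git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release -j
# Run with all layers offloaded to the GPU
./build/bin/llama-cli -m model.gguf -ngl 99 -p "Hello"
```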
