ROCm and containers

Can you run the AMD ROCm stack inside a container when the host machine is not running AMD's closed drivers, just the default open-source driver plus non-free firmware (Debian 12 with an older AMD 5600 XT GPU)?

Backstory: I'm trying to get into machine learning and would like to run GPU-accelerated workloads, but I'd also like to keep my system clean of the AMD driver stack, since everything is working so well and stably on the default install.

How can you give a container access to a GPU stack that isn't functioning on the host…?

I’m not very familiar with containers. I have the default (open) drivers running, and they provide /dev/kfd and /dev/dri/renderD128 on the host, which, if I understand correctly, are at least needed. But as I said, this is all new to me. Does the host itself need ROCm, or is it enough that the driver device files are accessible on the host?
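A quick way to confirm those device nodes exist and see which group owns them (a sketch; the paths are the ones mentioned above, and output will vary by machine):

```shell
# Report presence and owning group of the device nodes ROCm user-space uses.
# Prints "missing" instead of failing if a node is absent.
for dev in /dev/kfd /dev/dri/renderD128; do
  if [ -e "$dev" ]; then
    echo "$dev: present, group $(stat -c %G "$dev")"
  else
    echo "$dev: missing"
  fi
done
```

On a typical Debian install with the amdgpu kernel driver loaded, both nodes should show up, usually owned by the render group.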

Sorry, I read your post before my coffee. It should work as long as you have amdgpu installed on the host.

This might help…


Short answer is yes. I run it inside Distrobox, which is basically rootless podman.

ROCm is not a driver stack; it's more like a runtime and SDK.

The driver part is actually included in every Linux kernel newer than 5.14.

Share /dev/kfd and /dev/dri/renderD128 with your container and make sure it has access to these devices. Generally the render group owns them.
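The device sharing described above can be sketched as a rootless podman invocation; the image name is an assumption (rocm/rocm-terminal is one of AMD's published ROCm images), so substitute whatever image you actually use:

```shell
# Sketch: pass the GPU device nodes into a rootless podman container.
# --device shares /dev/kfd and the render node with the container;
# --group-add keep-groups preserves the host's supplementary groups
# (e.g. render) so the container user can open the devices.
podman run -it --rm \
  --device /dev/kfd \
  --device /dev/dri/renderD128 \
  --group-add keep-groups \
  docker.io/rocm/rocm-terminal:latest
```

Inside the container, rocminfo should then list the GPU if everything is wired up. Distrobox passes devices through in much the same way under the hood.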
