Background
Note: still working on this, but chime in if you’re following along or want to help.
We’ve done a few videos lately about … which is a great resource for building ROCm support in your environment. It is great for getting up and running on Ubuntu 24.04 LTS, and to a lesser extent 25.04, but on bleeding-edge distros like Arch with newer kernels it can be a little problematic.
Firstly, there is no warning in the repo’s README that it really wants cmake prior to version 4.0. It can also be a little misleading as to whether you need the proprietary AMD driver or whether it will work with the open source driver. [ I recommend the open source driver and the latest kernel / linux-firmware packages, generally, fwiw… ]
The rationale for this guide is to walk through setting up TheRock for gfx1151, for example, on the HP G1A.
This approach is also how you can get close to 5-10 tokens/sec on Q4 70b models. Note that I find the Vulkan backend to be faster and more reliable on gfx1151, at least as of the first week of June 2025.
… this guide should work with this setup, if your setup is this or resembles this.
These patch steps are not necessary on distros like Ubuntu 24.04 LTS.
Setup Guide - ROCm “TheRock” on Arch Linux with Kernel 6.14+
If you just git clone and run cmake, errors abound. Things like:
[therock-host-blas] /home/w/TheRock/build/third-party/host-blas/source/lapack-netlib/SRC/sgges.c:896:27: error: too many arguments to function ‘selctg’; expected 0, have 3
[therock-host-blas] 896 | bwork[i__] = (*selctg)(&alphar[i__], &alphai[i__], &beta[i__]);
[therock-host-blas] | ~^~~~~~~~ ~~~~~~~~~~~~
[therock-host-blas] /home/w/TheRock/build/third-party/host-blas/source/lapack-netlib/SRC/sgges.c:997:22: error: too many arguments to function ‘selctg’; expected 0, have 3
[therock-host-blas] 997 | cursl = (*selctg)(&alphar[i__], &alphai[i__], &beta[i__]);
[therock-host-blas] | ~^~~~~~~~ ~~~~~~~~~~~~
[therock-host-blas] [3/7024] Building C object build-openblas/CMakeFiles/LAPACK_OVERRIDES.dir/lapack-netlib/SRC/sgeesx.c.o
[therock-host-blas] FAILED: build-openblas/CMakeFiles/LAPACK_OVERRIDES.dir/lapack-netlib/SRC/sgeesx.c.o
Here are the changes I needed to make for TheRock.
So the first thing is the L_fp function pointer type, which is defined in a way modern compilers seem not to like:
/* le sigh
#ifdef __cplusplus
typedef logical (*L_fp)(...);
#else
typedef logical (*L_fp)();
#endif
*/
/* this seems like the more correct way to do this
that is more compatible with c and c++ compilers */
typedef int logical;
typedef double doublereal;
// different files use different versions of select and selctg with different
// argument counts so uncomment the one you need for the
// particular file
typedef logical (*L_fp)(const doublereal *, const doublereal *, const doublereal *);
//typedef logical (*L_fp)(const doublereal *, const doublereal *);
//typedef logical (*L_fp)(const doublereal *);
I made this change in several files:
build/third-party/host-blas/source/lapack-netlib/SRC/sggesx.c
build/third-party/host-blas/source/lapack-netlib/SRC/sgges.c
build/third-party/host-blas/source/lapack-netlib/SRC/dgges3.c
build/third-party/host-blas/source/lapack-netlib/SRC/dgges.c
build/third-party/host-blas/source/lapack-netlib/SRC/dggesx.c
...and more
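A likely explanation, offered as an assumption rather than something verified against this build: GCC 15 defaults to C23, and in C23 an empty parameter list like () means "takes no arguments" instead of "arguments unspecified". That would match the "expected 0, have 3" wording in the errors. As a rough sketch of the same fix without uncommenting a different L_fp per file, you could use one prototyped typedef per argument count. The names select2_fp, select3_fp, beta_is_nonzero, is_real_eigenvalue, count_selected2 and count_selected3 are made up for illustration and do not exist in lapack-netlib.

```c
#include <assert.h>
#include <stddef.h>

typedef int logical;
typedef double doublereal;

/* Hypothetical names, not from lapack-netlib: one fully prototyped
   function pointer type per argument count. */
typedef logical (*select2_fp)(const doublereal *, const doublereal *);
typedef logical (*select3_fp)(const doublereal *, const doublereal *,
                              const doublereal *);

/* Illustrative 3-argument selector: accept entries whose beta is nonzero. */
logical beta_is_nonzero(const doublereal *alphar, const doublereal *alphai,
                        const doublereal *beta)
{
    (void)alphar;
    (void)alphai;
    return *beta != 0.0;
}

/* Illustrative 2-argument selector: accept entries with zero imaginary part. */
logical is_real_eigenvalue(const doublereal *wr, const doublereal *wi)
{
    (void)wr;
    return *wi == 0.0;
}

/* Same call shape as the failing lines: (*selctg)(&a[i], &b[i], &c[i]).
   Because select3_fp carries a prototype, the three-argument call is
   accepted whether the compiler is in C17 or C23 mode. */
int count_selected3(select3_fp selctg, const doublereal *alphar,
                    const doublereal *alphai, const doublereal *beta, size_t n)
{
    assert(selctg != NULL);
    int count = 0;
    for (size_t i = 0; i < n; ++i) {
        if ((*selctg)(&alphar[i], &alphai[i], &beta[i])) {
            ++count;
        }
    }
    return count;
}

/* Two-argument variant, matching the (*select)(&wr[i], &wi[i]) call shape. */
int count_selected2(select2_fp sel, const doublereal *wr,
                    const doublereal *wi, size_t n)
{
    assert(sel != NULL);
    int count = 0;
    for (size_t i = 0; i < n; ++i) {
        if ((*sel)(&wr[i], &wi[i])) {
            ++count;
        }
    }
    return count;
}
```

Keep in mind that the single-precision files (the s-prefixed ones) pass real (float) pointers rather than doublereal, so the parameter types in the typedef would presumably need to match there.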
Here’s the error I’m trying to correct:
build/third-party/host-blas/source/lapack-netlib/SRC/sgees.c:720:27: error: too many arguments to function ‘select’; expected 0, have 2
[therock-host-blas] 720 | bwork[i__] = (*select)(&wr[i__], &wi[i__]);
[therock-host-blas] | ~^~~~~~~~ ~~~~~~~~
[therock-host-blas] /home/w/TheRock/build/third-party/host-blas/source/lapack-netlib/SRC/sgees.c:836:22: error: too many arguments to function ‘select’; expected 0, have 2
and
[therock-host-blas] /home/w/TheRock/build/third-party/host-blas/source/lapack-netlib/SRC/zggesx.c:1236:27: error: too many arguments to function ‘selctg’; expected 1, have 2
[therock-host-blas] /home/w/TheRock/build/third-party/host-blas/source/lapack-netlib/SRC/zggesx.c:1314:22: error: too many arguments to function ‘selctg’; expected 1, have 2
It’s a bit of a game of whack-a-mole, looking at each error: emitted while compiling. As this is a third-party download, git diff is useless here for seeing what I changed, so I can’t just give you a diff. There might be something to add to the host-blas source to make it okay with definitions like (L_fp)() as a function definition, so the compiler doesn’t worry about argument count, but… that’s more calories than I’m willing to commit at the moment.
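One way to narrow this down is to check which C standard the compiler is actually using, since C23 is where () stopped meaning "any arguments". The sketch below assumes the published C23 value of __STDC_VERSION__ (202311L); the function names are hypothetical. If it reports C23, building the host-blas C sources with -std=gnu17 might restore the old behavior, but that is untested here and this guide has not worked out where TheRock's build would take such a flag.

```c
#include <stdio.h>

/* Returns 1 when this file is compiled as C23 or later, where a
   declarator like `logical (*L_fp)()` means "no parameters".
   Returns 0 for older standards, where it means "unspecified". */
int empty_parens_mean_no_args(void)
{
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: prints the standard version the compiler reports. */
void report_c_standard(void)
{
#if defined(__STDC_VERSION__)
    printf("__STDC_VERSION__ = %ldL\n", (long)__STDC_VERSION__);
#else
    printf("__STDC_VERSION__ is not defined (pre-C95 mode)\n");
#endif
    printf("empty parentheses mean no arguments: %s\n",
           empty_parens_mean_no_args() ? "yes" : "no");
}
```

Compile it once with the compiler's defaults and once with -std=gnu17 to compare the two results.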
Other types of errors are just missing headers:
Then in ./compiler/amd-llvm/llvm/include/llvm/ADT/SmallVector.h I added this code block just below #include <cstdint>:
#if __cplusplus >= 201103L
#include <cstdint>
#else
#include <stdint.h>
#endif
This solves the problem of the fixed-width integer types being undefined, hopefully with backward compatibility.
More of the same:
./TheRock/compiler/amd-llvm/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.h:44:43: error: ‘uint8_t’ has not been declared
And the fix:
#if defined(__cplusplus)
#include <cstdint>
#else
#include <stdint.h>
#endif
This goes right after #include <memory>.
The next cluster of errors is from rocprofiler-sdk:
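Some background on why these show up at all: since GCC 13, the libstdc++ headers pull in fewer other headers as a side effect, so code that used uint8_t without including <cstdint> itself stops compiling. A minimal sketch of the guard pattern, using a hypothetical helper named pack_u16 that needs those types:

```c
/* Pull in the fixed-width integer types explicitly instead of relying on
   some other header to include them as a side effect. */
#if defined(__cplusplus)
#include <cstdint>
#else
#include <stdint.h>
#endif

/* Hypothetical helper: fails to compile with "uint8_t has not been
   declared" if the include block above is removed. */
uint16_t pack_u16(uint8_t hi, uint8_t lo)
{
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

In a file that is only ever compiled as C++, a plain #include <cstdint> does the same job. The guard only matters for headers shared with C.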
[rocprofiler-sdk] /home/w/TheRock/profiler/rocprofiler-sdk/external/yaml-cpp/src/emitterutils.cpp:221:11: error: ‘uint16_t’ was not declared in this scope
[rocprofiler-sdk] /home/w/TheRock/profiler/rocprofiler-sdk/external/yaml-cpp/src/emitterutils.cpp:221:21: error: ‘uint16_t’ was not declared in this scope
This needs the same change near the top of the file to include the integer types:
#if defined(__cplusplus)
#include <cstdint>
#else
#include <stdint.h>
#endif
… same for [rocprofiler-sdk] /home/w/TheRock/profiler/rocprofiler-sdk/external/elfio/elfio/elf_types.hpp:30:20: error: ‘uint16_t’ does not name a type
but since it is a header I did it slightly differently:
#ifndef ELFTYPES_H
#define ELFTYPES_H
// add the next 4 lines below the above lines near the top of the file
#ifndef ELF_TYPES_HPP_CSTDINT_INCLUDED
#define ELF_TYPES_HPP_CSTDINT_INCLUDED
#include <cstdint>
#endif
I think -DCMAKE_CXX_FLAGS="-include cstdint" would work better, but I didn’t see where to add this flag to the compiler.
In my case I wanted to target gfx1151, but as the support is a bit experimental, I also got errors like:
[hipBLASLt] /usr/lib64/gcc/x86_64-pc-linux-gnu/15.1.1/../../../../include/c++/15.1.1/array:219:2: error: reference to __host__ function '__glibcxx_assert_fail' in __host__ __device__ function
[… the same error repeated nine more times …]
[hipBLASLt] 10 errors generated when compiling for gfx1151.
Fortunately (unfortunately), these problems are specific to gfx1151, where most of the forward progress so far seems to be contributions from the community.
Don’t forget you’ll have to install an older version of cmake until TheRock supports 4+. You can download it from the Arch archives: Index of /packages/c/cmake/
pacman -U ~/cmake-3.31.6-1-x86_64.pkg.tar.zst
One thing I’m not entirely sure of is whether it is a good idea to switch to an older version of gcc the way we switched to an older cmake. I suspect the answer is no. The answer should be no. But if you have some thoughts on this, please engage below.
Docker Alternative
It is also possible to provide support for gfx1151 via docker image builds, I think. If you have had success with the docker approach, please engage below.
TODO rest of stuff here.