I struggled to get CUDA, NVCC, and the cuda-samples project running on my computer, so I am writing this to help others.
Installing NVIDIA drivers
As someone who uses the professional Blackwell CUDA cards, I am required to use the NVIDIA open kernel module. It works for both gaming and professional cards. I would advise you to use it too, as the closed-source kernel module will eventually be deprecated.
sudo dnf install rpmfusion-nonfree-release # make sure rpmfusion non-free is enabled
sudo sh -c 'echo "%_with_kmod_nvidia_open 1" > /etc/rpm/macros.nvidia-kmod' # use the open modules when building the kernel modules
sudo dnf install akmod-nvidia xorg-x11-drv-nvidia xorg-x11-drv-nvidia-cuda xorg-x11-drv-nvidia-cuda-libs # install the fedora drivers and cuda libraries
DO NOT REBOOT UNTIL THE KERNEL MODULES HAVE BEEN COMPILED. You can check whether the build is still running by calling ps aux | grep kmod. I personally run it with the watch command and wait until nothing but the issued watch commands are returned:
watch -n 2 "ps aux | grep kmod"
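If you would rather not watch the process list by hand, the same wait can be scripted. This is only a sketch: the function name wait_for_kmod_build is made up for this post, and it assumes the build helpers have "kmod" in their process name (akmods, akmodsbuild, kmodtool), which pgrep matches by name rather than by full command line.

```shell
# Block until no process whose name matches the pattern is running.
# Assumption: the build helpers have "kmod" in their process name
# (akmods, akmodsbuild, kmodtool); adjust the pattern if yours differ.
wait_for_kmod_build() {
    pattern=${1:-kmod}
    interval=${2:-2}
    while pgrep "$pattern" >/dev/null 2>&1; do
        sleep "$interval"
    done
    echo "no '$pattern' processes left"
}
```

Call wait_for_kmod_build with no arguments to poll every 2 seconds. Once it returns, modinfo -F version nvidia should print the driver version instead of a "module not found" error.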
More information can be found on the RPM Fusion NVIDIA how-to site
Reboot
Install CUDA 12.9
Instructions are based on the RPM Fusion CUDA page
sudo dnf config-manager addrepo --from-repofile=https://developer.download.nvidia.com/compute/cuda/repos/fedora41/$(uname -m)/cuda-fedora41.repo
sudo dnf clean all
sudo dnf config-manager setopt cuda-fedora41-$(uname -m).exclude=nvidia-driver,nvidia-modprobe,nvidia-persistenced,nvidia-settings,nvidia-libXNVCtrl,nvidia-xconfig
sudo dnf -y install cuda-toolkit # 12.9.0 at time of writing
Now comes the juicy part: installing the requirements to build the cuda-samples.
Installing NVCC, compatible GCC
To compile CUDA C/C++ code, you will need NVCC from the NVIDIA CUDA Fedora repo we added. We also need GCC 14, because Fedora 42 ships with GCC 15, which is too new for NVCC.
sudo dnf install gcc14.x86_64 gcc14-c++.x86_64 cuda-nvcc-12-9
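A quick sanity check that the compiler you are about to hand to nvcc really is version 14 can save a confusing build failure later. The helper names major_of and host_compiler_ok are hypothetical, and the maximum of 14 is taken from the statement above that GCC 15 is too new, not from a support matrix.

```shell
# Extract the major version from a dotted version string such as "14.2.1".
major_of() { printf '%s\n' "${1%%.*}"; }

# Succeed when the compiler's major version is at most the given maximum.
host_compiler_ok() {
    [ "$(major_of "$1")" -le "$2" ]
}

# Report on g++-14 if it is installed; do nothing otherwise.
if command -v g++-14 >/dev/null 2>&1; then
    version=$(g++-14 -dumpversion)
    if host_compiler_ok "$version" 14; then
        echo "g++-14 reports $version: usable as the nvcc host compiler"
    else
        echo "g++-14 reports $version: newer than expected"
    fi
fi
```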
Now set the correct environment variables so that nvcc uses g++ 14 and cmake projects use the correct versions of the GCC compilers:
export CUDAHOSTCXX=/usr/bin/g++-14
export CPATH=/usr/include/openmpi-x86_64:$CPATH
export PATH=$PATH:/usr/lib64/openmpi/bin
export CC=/usr/bin/gcc-14
export CXX=/usr/bin/g++-14
export NVCC_CCBIN=/usr/bin/g++-14
Finally, we want to make sure that the library and include folders are correctly added to the paths and that the nvcc binary is in our executable path.
export LD_LIBRARY_PATH=/usr/local/cuda-12.9/targets/x86_64-linux/lib:$LD_LIBRARY_PATH
export CPATH=/usr/local/cuda-12.9/targets/x86_64-linux/include:$CPATH
export PATH=/usr/local/cuda-12.9/bin:$PATH
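These exports only last for the current shell, so it can help to collect them in one file and source it. The sketch below assumes a file called cuda-env.sh (a name made up here) and adds a small helper, prepend_once, so that sourcing it twice does not keep growing the paths. The OpenMPI entries are left out for brevity and can be added the same way.

```shell
# cuda-env.sh (hypothetical file name): load with `. ./cuda-env.sh`

CUDA_ROOT=/usr/local/cuda-12.9   # install prefix used in this post
HOST_CXX=/usr/bin/g++-14         # host compiler handed to nvcc

# Prepend a directory to a colon-separated list unless it is already there.
prepend_once() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;
        *)        printf '%s\n' "$1${2:+:$2}" ;;
    esac
}

export CC=/usr/bin/gcc-14
export CXX="$HOST_CXX"
export CUDAHOSTCXX="$HOST_CXX"
export NVCC_CCBIN="$HOST_CXX"
export PATH="$(prepend_once "$CUDA_ROOT/bin" "$PATH")"
export LD_LIBRARY_PATH="$(prepend_once "$CUDA_ROOT/targets/x86_64-linux/lib" "${LD_LIBRARY_PATH:-}")"
export CPATH="$(prepend_once "$CUDA_ROOT/targets/x86_64-linux/include" "${CPATH:-}")"
```

Sourcing the file from ~/.bashrc makes the settings available in every new terminal. After that, nvcc --version should report release 12.9.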
The dirty hack
/usr/local/cuda-12.9/targets/x86_64-linux/include/crt/math_functions.h declares external functions that are incompatible with the declarations of the same functions in /usr/include/bits/mathcalls.h. So let’s fix that by editing /usr/local/cuda-12.9/targets/x86_64-linux/include/crt/math_functions.h. Here is the set of diffs showing what I did:
*
* \note_accuracy_double
*/
-extern __DEVICE_FUNCTIONS_DECL__ __device_builtin__ double sinpi(double x);
+extern __DEVICE_FUNCTIONS_DECL__ __device_builtin__ double sinpi(double x) noexcept (true);
/**
* \ingroup CUDA_MATH_SINGLE
* \brief Calculate the sine of the input argument
@@ -2576,7 +2576,7 @@
*
* \note_accuracy_single
*/
-extern __DEVICE_FUNCTIONS_DECL__ __device_builtin__ float sinpif(float x);
+extern __DEVICE_FUNCTIONS_DECL__ __device_builtin__ float sinpif(float x) noexcept (true);
/**
* \ingroup CUDA_MATH_DOUBLE
* \brief Calculate the cosine of the input argument
@@ -2598,7 +2598,7 @@
*
* \note_accuracy_double
*/
-extern __DEVICE_FUNCTIONS_DECL__ __device_builtin__ double cospi(double x);
+extern __DEVICE_FUNCTIONS_DECL__ __device_builtin__ double cospi(double x) noexcept (true);
/**
* \ingroup CUDA_MATH_SINGLE
* \brief Calculate the cosine of the input argument
@@ -2620,7 +2620,7 @@
*
* \note_accuracy_single
*/
-extern __DEVICE_FUNCTIONS_DECL__ __device_builtin__ float cospif(float x);
+extern __DEVICE_FUNCTIONS_DECL__ __device_builtin__ float cospif(float x) noexcept (true);
/**
* \ingroup CUDA_MATH_DOUBLE
* \brief Calculate the sine and cosine of the first input argument
Notice how we added noexcept (true) at the end of the sine and cosine declarations (sinpi, sinpif, cospi and cospif); that is what is needed to make them compatible with the newer libraries.
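If you prefer not to edit the header by hand, the same four changes can be made with sed. Treat this as a sketch rather than a tested patch: patch_math_functions is a name invented here, the pattern assumes the declarations look exactly like the lines in the diff (apart from extra spaces between the words), and the sin line in the sample is only an illustrative control. The block tries it on a throwaway file first.

```shell
# Append " noexcept (true)" to the sinpi/sinpif/cospi/cospif declarations in
# the given header. A backup is kept next to it with a .bak suffix.
patch_math_functions() {
    sed -E -i.bak \
        's/^(extern __DEVICE_FUNCTIONS_DECL__ __device_builtin__[[:space:]]+(double|float)[[:space:]]+(sinpi|cospi)f?\((double|float) x\));$/\1 noexcept (true);/' \
        "$1"
}

# Dry run on a throwaway file: four target declarations plus one control line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
extern __DEVICE_FUNCTIONS_DECL__ __device_builtin__ double sinpi(double x);
extern __DEVICE_FUNCTIONS_DECL__ __device_builtin__ float sinpif(float x);
extern __DEVICE_FUNCTIONS_DECL__ __device_builtin__ double cospi(double x);
extern __DEVICE_FUNCTIONS_DECL__ __device_builtin__ float cospif(float x);
extern __DEVICE_FUNCTIONS_DECL__ __device_builtin__ double sin(double x);
EOF
patch_math_functions "$sample"
cat "$sample"
```

Lines that already end in noexcept (true); no longer match, so running it twice is harmless. To apply it for real, run the same sed command with sudo against the math_functions.h path above, then diff the header against the .bak copy to confirm that only those four lines changed.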
And that is it, folks. The rest is all about getting the cuda-samples working.
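Before building the whole samples tree, a tiny kernel is a cheap way to confirm that nvcc, g++ 14 and the driver work together. This is a minimal sketch, not part of cuda-samples: the file name hello.cu and the temporary directory are arbitrary, and -arch=native assumes nvcc can see your GPU.

```shell
workdir=$(mktemp -d)

# Write a minimal CUDA program that prints from four GPU threads.
cat > "$workdir/hello.cu" <<'EOF'
#include <cstdio>
#include <cuda_runtime.h>

// Each of the four threads prints its own index.
__global__ void hello() {
    printf("hello from GPU thread %d\n", threadIdx.x);
}

int main() {
    hello<<<1, 4>>>();
    // Check the launch itself first, then wait for the kernel to finish.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
EOF

# Compile with the GCC 14 host compiler and run, if nvcc is available.
if command -v nvcc >/dev/null 2>&1; then
    nvcc -ccbin "${CUDAHOSTCXX:-/usr/bin/g++-14}" -arch=native \
        -o "$workdir/hello" "$workdir/hello.cu" && "$workdir/hello"
else
    echo "nvcc not found on PATH; check the exports above"
fi
```

If this compiles and prints four lines, the toolchain is in good shape and any remaining failures are more likely specific to individual samples.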
Getting cuda-samples working
Now, there are a few packages that you can install to help with the OpenGL, Vulkan, FreeImage and MPI demos:
sudo dnf install freeglut freeglut-devel freeimage-devel openmpi openmpi-devel vulkan
To get MPI working, you want to add its binaries to your path:
export PATH=$PATH:/usr/lib64/openmpi/bin
In a directory of your choice, clone the cuda-samples project:
git clone https://github.com/NVIDIA/cuda-samples.git
cd cuda-samples
Now let’s get building:
# in the cuda-samples directory
mkdir build
cd build
cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CUDA_ARCHITECTURES=native .. # I prefer ninja over make
ninja -j30 # change the number to whatever you want, I have 32 threads available in my computer
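Rather than hard-coding the job count, you can derive it from the machine. The sketch below mirrors my choice of 30 jobs on 32 threads by leaving two threads free; build_jobs is a hypothetical helper, and the reserve of two is just my preference.

```shell
# Leave a couple of hardware threads free, but never go below one job.
build_jobs() {
    total=$1
    reserve=${2:-2}
    n=$((total - reserve))
    [ "$n" -ge 1 ] || n=1
    echo "$n"
}

# Print the ninja command this machine would use.
echo "ninja -j$(build_jobs "$(nproc)")"
```

If you leave -j off entirely, Ninja chooses a job count based on the number of CPUs by itself, so this only matters when you want to keep the machine responsive during the build.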
Grab a cuppa tea and wait (or not). If all goes well, we should have projects we can now run. In the same build directory, let’s run two of my favorite examples:
./Samples/2_Concepts_and_Techniques/MC_EstimatePiP/MC_EstimatePiP # simple monte-carlo algorithm for estimating Pi
./Samples/5_Domain_Specific/Mandelbrot/Mandelbrot # tests OpenGL with CUDA, use `+` and `-` to zoom in and out
Happy hacking!
Note:
- I am not a professional C++ developer; I have not done it professionally in over a decade
- My only real-world CUDA experience was at university, where I used it to accelerate simulations that were written with OpenMP
- If there is a better way let me know!
- If you are not married to Fedora, just use Ubuntu. It’s far easier. For me, I want the newer packages, no Snapshot, and I work with RHEL at work so it’s closer to what I use there.
BONUS: Working inside CLion
Open the root project directory (the one containing the top-level CMakeLists.txt) in CLion. You will notice that cmake won’t initially work. That is because you need to add the environment variables we set above there:
Go to Settings → Build, Execution, Deployment → CMake and add the following in Environment:
CUDAHOSTCXX=/usr/bin/g++-14;CXX=/usr/bin/g++-14;NVCC_CCBIN=/usr/bin/g++-14;CC=/usr/bin/gcc-14
Then reload the CMake project, which you can easily do from the bottom left corner of the IDE.
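If you change compilers later, retyping that semicolon-separated string gets tedious. The following sketch builds it from shell variables; clion_env is a name made up for this post, and the four assignments simply repeat the values used earlier.

```shell
# Join NAME=value pairs for the listed variables with semicolons.
clion_env() {
    out=""
    for name in "$@"; do
        eval "value=\${$name:-}"   # look up the variable called $name
        out="${out:+$out;}$name=$value"
    done
    printf '%s\n' "$out"
}

CUDAHOSTCXX=/usr/bin/g++-14
CXX=/usr/bin/g++-14
NVCC_CCBIN=/usr/bin/g++-14
CC=/usr/bin/gcc-14

clion_env CUDAHOSTCXX CXX NVCC_CCBIN CC
```

The printed line can be pasted straight into the Environment field.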