GPU thermal test under Ubuntu 22.04 LTS

Hi everyone. I am currently using gpu_burn to run a thermal test on two RTX A6000 cards, but it does not seem to push them hard enough. Each card only draws about 150 W against a 300 W power cap.

Every 1.0s: nvidia-smi gpu: Sat Dec 16 10:59:26 2023

Sat Dec 16 10:59:26 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A6000                On | 00000000:41:00.0 Off |                  Off |
| 36%   64C    P2              141W / 300W|  43842MiB / 49140MiB |    100%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA RTX A6000                On | 00000000:81:00.0 Off |                  Off |
| 38%   66C    P2              149W / 300W|  43853MiB / 49140MiB |    100%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2874      G   /usr/lib/xorg/Xorg                            4MiB |
|    0   N/A  N/A      5948      C   ./gpu_burn                                43834MiB |
|    1   N/A  N/A      2874      G   /usr/lib/xorg/Xorg                          283MiB |
|    1   N/A  N/A      3089      G   /usr/bin/gnome-shell                         84MiB |
|    1   N/A  N/A      5957      C   ./gpu_burn                                43482MiB |
+---------------------------------------------------------------------------------------+

Does anyone know a better GPU stress test for verifying thermal performance? I am looking for something simple that does not require installing a lot of extra libraries.

Thanks!
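
For anyone who wants to log power and temperature instead of reading them off the nvidia-smi table by eye, here is a minimal sketch in Python. It assumes `nvidia-smi` is on the PATH and that the driver reports all five queried fields as plain numbers (some boards return `[N/A]` for power, which this does not handle). The names `parse_sample`, `read_gpus`, and `log` are only illustrative and not part of any tool.

```python
import subprocess
import time

# Field names accepted by `nvidia-smi --query-gpu`; `nounits` drops "W", "%", etc.
QUERY = "index,power.draw,power.limit,temperature.gpu,utilization.gpu"


def parse_sample(csv_text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        idx, draw, limit, temp, util = [field.strip() for field in line.split(",")]
        rows.append({
            "gpu": int(idx),
            "power_w": float(draw),
            "cap_w": float(limit),
            "temp_c": int(temp),
            "util_pct": int(util),
            # Share of the power cap actually being drawn, in percent.
            "pct_of_cap": round(100.0 * float(draw) / float(limit), 1),
        })
    return rows


def read_gpus():
    """Take one sample from the driver (assumes nvidia-smi is installed)."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_sample(result.stdout)


def log(interval_s=1.0):
    """Print one line per GPU every interval_s seconds until interrupted."""
    while True:
        for g in read_gpus():
            print(f"GPU{g['gpu']}: {g['power_w']:.0f} W of {g['cap_w']:.0f} W "
                  f"({g['pct_of_cap']}%), {g['temp_c']} C, {g['util_pct']}% util")
        time.sleep(interval_s)
```

Calling `log()` prints a line per GPU every second until Ctrl-C. With the readings posted above, 141 W of 300 W works out to 47.0% of the cap for GPU 0 and 149 W to 49.7% for GPU 1, which is the gap a heavier load would need to close.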

Furmark perhaps?

Thanks! But the system is running Linux. Furmark is only available for Windows.

I have not tried either of them, but a quick ddg.gg search brought up two apps that are launched from the command line:

sudo apt install glmark2

glmark2 --run-forever

sudo apt install furmark

sudo furmark
According to the article, this one looks less useful because the run is short:

This command will launch FurMark, and you will see a window displaying the test in progress. The test will typically last for a few minutes, and you will see your graphics card’s temperature and other details in real-time.
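
Since `glmark2 --run-forever` never exits by itself, a soak test of a fixed length needs something to stop it. Below is a minimal sketch, assuming Python 3 on the test machine; `run_for` is a hypothetical helper name, and the durations are only examples.

```python
import subprocess
import time


def run_for(cmd, seconds):
    """Run cmd, stop it after `seconds`, and report how it ended.

    Returns ("exited", elapsed) if the command finished by itself, or
    ("stopped", elapsed) if it was still running and had to be terminated.
    """
    start = time.monotonic()
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=seconds)
        ended = "exited"        # the load finished (or crashed) early
    except subprocess.TimeoutExpired:
        proc.terminate()        # ask politely first (SIGTERM on Linux)
        try:
            proc.wait(timeout=10)
        except subprocess.TimeoutExpired:
            proc.kill()         # force it if the request is ignored
            proc.wait()
        ended = "stopped"
    return ended, time.monotonic() - start
```

For a ten-minute run this would be called as `run_for(["glmark2", "--run-forever"], 600)`. An "exited" result before the time is up is worth a look, since it means the load ended before the cards had time to heat-soak.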

Try memtest_vulkan:

Thanks again. I ended up using FurMark through the Phoronix Test Suite. I was hoping for a tool that does not require X, but that will do for now.

Thanks! If I read the information correctly, this tool is designed to verify GPU memory stability, and I am not sure how GPU-intensive that is. OTOH, a graphics renderer such as FurMark pushed the power draw to the TDP limit.

Thanks for reporting back. There must be some tools that do a better job, like renderers or ML trainers, even altcoin miners, but I haven’t tried any.

Although the name and description say memtest, it uses GPU compute to read and write memory at maximum speed in a loop, so it puts a lot of load on the compute units as well. It’s not as intensive as FurMark, but in my testing it can pull 300W (100% of TDP) on a 7900 XTX, and around 320W (75% of TDP) on an RTX 4090, without rendering anything on screen.

I have no idea why nvtop shows 0% for the memtest_vulkan process, but the graph does report the GPU load correctly.

Thank you for the details. I will test memtest_vulkan next time.

FWIW, I keep a Windows install around for testing new PC builds. I mostly run my workstations as headless Ubuntu boxes, but for new builds I find the Windows stress-testing and benchmarking tools a bit easier to use, and their results easier to understand and to compare with community results.

The list would include:
AIDA64 - overall system stress tests, including GPU(s). Gets everything well loaded (90-100% on the GPU when testing the whole system) and hot, and typically brings out problems that other stress tests don't. I paid for this one because it has been so useful.
Blender - good for GPU benchmarks, but not so good for an extended stress test. Free.
Cinebench - very good stress test for GPU(s). Free.
Geekbench - useless for stress testing, but good for checking that a system performs in line with its specs compared to previous builds or the community. Multi-OS, and free.