WIP: Blackwell RTX 6000 Pro / Max-Q Quickie Setup Guide on Ubuntu 24.04 LTS / 25.04

DRAFT — COMMENTS WELCOME

Quick Background

I did this setup on my 96-core Threadripper Falcon Northwest system because it’s the nicest computer that I own. It also has two RTX 5580s in it (which are sort of the 300 W “Max-Q” equivalent of that generation).

@wFNWtr:~$ nvidia-smi 
Sun May 11 16:45:55 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.51.03              Driver Version: 575.51.03      CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX PRO 6000 Blac...    Off |   00000000:C1:00.0  On |                  Off |
| 30%   37C    P0             70W /  600W |    1559MiB /  97887MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            5377      G   /usr/lib/xorg/Xorg                      820MiB |
|    0   N/A  N/A            6998      G   /usr/bin/gnome-shell                    306MiB |
|    0   N/A  N/A            8266      G   .../6103/usr/lib/firefox/firefox        358MiB |
+-----------------------------------------------------------------------------------------+
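If you want to sanity-check the card from a script rather than eyeballing the table, nvidia-smi’s CSV query mode is easier to parse. A minimal sketch (the query fields in the comment are real nvidia-smi fields; the sample line stands in for a live call on a machine with the card installed):

```python
# Parse one row of `nvidia-smi --query-gpu=name,driver_version,power.draw,memory.total
#                               --format=csv,noheader,nounits`
FIELDS = ["name", "driver_version", "power.draw", "memory.total"]

def parse_smi_csv(line, fields=FIELDS):
    """Turn one CSV row from nvidia-smi query mode into a dict."""
    values = [v.strip() for v in line.split(",")]
    return dict(zip(fields, values))

# Sample string mirroring the output above, not a live subprocess call
sample = "NVIDIA RTX PRO 6000, 575.51.03, 70.12, 97887"
gpu = parse_smi_csv(sample)
print(gpu["driver_version"])   # 575.51.03
print(gpu["memory.total"])     # 97887 (MiB)
```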

You need PyTorch built against cu128 (or newer); the cu126 wheels will not do it. There is also some weird circular dependency with the cuda-drivers package as of the time I’m writing this, so the manual driver download from nvidia.com’s driver section or the driver utility in Ubuntu is really not the best way to go.

This page starts with downloading the NVIDIA CUDA keyring package, and that gets you most of the way there:
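For reference, the keyring route boils down to something like this (the repo path assumes Ubuntu 24.04 on x86_64; swap in ubuntu2504 as appropriate, and check NVIDIA’s repo page for the current keyring version):

```shell
# Add NVIDIA's apt repo via the cuda-keyring package
# (path assumes Ubuntu 24.04 / x86_64 -- adjust for your release)
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt update
```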

First, I did

apt install cuda-toolkit-12-9 nvidia-open

then after that succeeded I rebooted. That worked!

To get PyTorch up and running, you need something like

bin/pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128

to grab the cu128 version.
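Once it’s installed, it’s worth checking that the wheel actually ships Blackwell (sm_120) kernels; `torch.cuda.get_arch_list()` is the real API for that, and this little helper just inspects its output:

```python
# Check whether a torch build ships Blackwell (sm_120) kernels.
# Feed it the result of torch.cuda.get_arch_list(), e.g.:
#   import torch
#   has_blackwell_support(torch.cuda.get_arch_list())
def has_blackwell_support(arch_list):
    """True if any compiled arch targets compute capability 12.0."""
    return any(arch in ("sm_120", "sm_120a", "compute_120") for arch in arch_list)

# cu126 wheels top out below sm_120, which is why they fail on these cards
print(has_blackwell_support(["sm_80", "sm_90"]))            # False
print(has_blackwell_support(["sm_90", "sm_100", "sm_120"])) # True
```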

Huge thanks to @eousphoros for helping out on this one; this was the fastest path to sanity on both Ubuntu 25.04 and Ubuntu 24.04 LTS. Kudos, and good work.


Linux wFNWtr 6.11.0-25-generic #25~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Apr 15 17:20:50 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

I also like this script, which we usually use in live streams and hangouts, for testing.

Quick setup guide:

# grab Blackwell-aware deps
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128

# run script
./bin/python3 ./mamf-finder.py --m_range 0 4096 256 --n_range 0 4096 256 --k_range 0 20480 256 --output_file=2025-05-08-14:50:42.txt

Here There Be Tigers

Nvidia themselves are still debugging. These cards are not widely distributed yet, so there may be bugs, fwiw.

Ballpark Numbers

RTX Pro 6000 @ 600 W: DeepSeek-R1 70B Distill, Q8: 19.94 t/s
RTX Pro 6000 @ 300 W: DeepSeek-R1 70B Distill, Q8: 16.4 t/s

MAMF @ 300 W: 377.5 TFLOPS max (288.4 median)
MAMF @ 450 W: 391.5 TFLOPS max (374.4 median)
MAMF @ 600 W: 414.4 TFLOPS max (404.0 median)
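One way to read those numbers is perf per watt. A quick back-of-the-envelope in Python, with the figures copied from the table above:

```python
# Tokens/sec per watt from the DeepSeek-R1 numbers above
results = {600: 19.94, 300: 16.4}  # watts -> t/s

for watts, tps in results.items():
    print(f"{watts} W: {tps / watts * 1000:.1f} t/s per kW")

# Halving the power budget costs only about 18% of throughput
drop = 1 - results[300] / results[600]
print(f"throughput drop at 300 W: {drop:.1%}")  # 17.8%
```

So the 300 W Max-Q operating point gives noticeably better tokens per watt, at a fairly modest throughput cost.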

Blender


It’s confusing-as-hell naming, but cuda-drivers is the older closed-source GPU driver, while nvidia-open is the modern GPU driver. So what you want is nvidia-open; don’t bother with cuda-drivers.


Anecdotally I have not had the best experience with nvidia-open in terms of compatibility and stability, especially with older OpenGL software.

Whether or not nvidia-open currently has feature parity with cuda-drivers I am not sure; this was not the case as recently as version 560.

It’s also not a difference in modernity; both are maintained and current.

The difference is that nvidia-open hides everything important in proprietary binary blobs (much as AMD does). In some cases these proprietary binaries are even inlined within the “open” code :laughing:

I moved over to nvidia-open starting with 570 and now on 575. Zero issues with any of the software I use.

Anyone able to get this working?

marton# nvidia-smi -i 0 -mig 1  
Unable to enable MIG Mode for GPU 00000000:41:00.0: Not Supported
Treating as warning and moving on.
All done.
marton# nvidia-smi 
Sat May 31 05:57:21 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.51.03              Driver Version: 575.51.03      CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX PRO 6000 Blac...    On  |   00000000:41:00.0 Off |                  Off |
| 30%   28C    P8             11W /  600W |       4MiB /  97887MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

Hi everyone,
Does it support GPU Confidential Computing?
If so, has anyone managed to get it working?
In my lab we are still trying to activate Confidential Computing on an H100, without success (back and forth with NVIDIA support, without any solution in sight).

Thanks!
Rolando

What nvidia-smi commands would be useful to get more performance out of a Pro 6000 or a 5090? I have power-limited them, but I haven’t played around with under/overclocks or undervolts.

Not sure if you’ve seen this; I don’t understand it myself, so I asked about something I saw on Reddit: RTX Pro 6000 Blackwell in the house - #23 by lerner

I sent you a message as well, Wendell. If you want to do a video to test performance across multiple workloads, I have a machine with seven of these cards running right now. Ping me and let’s chat; it’s definitely an extreme workstation setup.


I would love to see how this scales with 7 cards

Currently have two but was debating getting more; the more you buy, the more you save, as the man says.

Some of you may be interested to know that Seasonic has released a 2200W PSU. It’s time to get that 240 volt line in for your home lab!

https://seasonic.com/atx3-prime-px-2200/


That power supply looks nice, but it’s unobtainium. I tried the three usual spots and no one can get one.

I wouldn’t mess with the voltage, but you can lock the memory frequency lower to get more core clock headroom, or lock the core frequency to get more memory speed. It’s pretty good at managing within the power envelope you set.
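For reference, the locking knobs being described are real nvidia-smi flags; a sketch (the clock values below are examples, not recommendations; query your card’s supported clocks first):

```shell
# Query supported clocks, then lock them (values are examples only)
nvidia-smi -q -d SUPPORTED_CLOCKS              # list valid core/memory clocks
sudo nvidia-smi -pl 450                        # cap board power at 450 W
sudo nvidia-smi -lgc 300,2100                  # lock GPU core clock range (MHz)
sudo nvidia-smi -lmc 810                       # lock memory clock (MHz)
sudo nvidia-smi -rgc && sudo nvidia-smi -rmc   # reset both to defaults
```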

OK, at the moment I do not know if it is Europe-only or coming to North America. I have seen both claims in my search.

edit: I emailed Seasonic to inquire about the availability of the PSU in North America.

{
#if defined(NV_CC_PLATFORM_PRESENT)
    os_cc_enabled = cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT);

#if defined(NV_CC_ATTR_SEV_SNP)
    os_cc_sev_snp_enabled = cc_platform_has(CC_ATTR_GUEST_SEV_SNP);
#endif

    os_cc_sme_enabled = cc_platform_has(CC_ATTR_MEM_ENCRYPT);

#if defined(NV_HV_GET_ISOLATION_TYPE) && IS_ENABLED(CONFIG_HYPERV) && defined(NVCPU_X86_64)
    if (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP)
    {
        os_cc_snp_vtom_enabled = NV_TRUE;
    }
#endif

#if defined(X86_FEATURE_TDX_GUEST)
    if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
    {
        os_cc_tdx_enabled = NV_TRUE;
    }
#endif
#else
    os_cc_enabled = NV_FALSE;
    os_cc_sev_snp_enabled = NV_FALSE;
    os_cc_sme_enabled = nv_detect_sme_enabled();
    os_cc_snp_vtom_enabled = NV_FALSE;
    os_cc_tdx_enabled = NV_FALSE;
#endif //NV_CC_PLATFORM_PRESENT
}

I’m not set up at the moment to move a card into a VM (which looks required for CC), but there might be some requirements on the host that you have to enable in order for Confidential Computing to get detected correctly?

A quick guess would be SEV-SNP, SME, and a TPM at a minimum.

EDIT:

$ nvidia-smi conf-compute -q

==============NVSMI CONF-COMPUTE LOG==============

    CC State                   : OFF
    Multi-GPU Mode             : None
    CPU CC Capabilities        : None
    GPU CC Capabilities        : CC Capable
    CC GPUs Ready State        : Not Ready

I have this one in a new build that is awaiting its RTX Pro 6000. Be aware this is a very long PSU. E.g. mounting it fan down in a North XL isn’t possible because 1/3 of the fan would be covered. Otherwise, delighted so far, high-quality cables and accessories. I’m in Europe, so it’s 2200W for me.


Hi there,
We took care of enabling the necessary host requirements, namely SEV-SNP, in our EPYC rig. We have it working (we followed NVIDIA’s guide, though we had to rewrite it as it was completely outdated). When we bootstrapped the VM, we got a firmware error on the H100 side when it tried to enable CC. Our guess is that our firmware has issues. We requested firmware updates from Supermicro; that did not fix the issue. And we have been trying for months to get proper support from NVIDIA…

Is it possible to overclock with nvidia-smi in Linux? I’ve only used the -pl power-limit function.

In my country, >1600 W PSUs are hard to find, and prices increase exponentially above that. At the moment I’m using 1000 W Corsair ATX 3.1 PSUs daisy-chained, each connected to its own UPS, and I haven’t had any problems.


Not OC per se, but you can tweak where the power budget goes.