TrueNAS SCALE 25.04: Nvidia P4000 not working

I am assembling a new Proxmox server with a used 45Drives 60XL. I got my hands on server-class hardware: an AMD EPYC 7532 and an ASRock ROMED8-2T. I went through the BIOS, turned on virtualization and IOMMU, and installed Proxmox just like I have on my other build. I passed through the HBA cards, an M2000 I had lying around, and a P4000 that work was going to toss. I then installed TrueNAS SCALE 25.04, went to Apps → Configuration → Settings, checked the Install Nvidia Drivers box, and saved. I gave it a minute and restarted. When I went to test-install an app just to make sure the video cards would show up, neither card was there. When I run nvidia-smi I can see both cards, and they do not have any processes. If I run sudo nvidia-smi -i -pm ENABLED against each card, I can then see them in the app, but I get a UUID error when I try to start the app. So I shut that TrueNAS version down and loaded 23.10, because I know that had the Nvidia drivers baked in. I attached all the same PCIe devices as before and boom, the GPUs show up and work properly.

What am I doing wrong with the new version that I'm not with the old version?

Finally had time to test installing TrueNAS SCALE 25.04 straight onto the 2 NVMe drives, and everything is working fine. Seems like the issue is with the way Proxmox is passing through the GPUs.

I was trying to get it to work on Proxmox 8.4.1. I have done quite a bit of searching, but I can't seem to find anything about passing through GPUs in 8.4.1. Do I have to pass through the GPU's audio function along with the GPU itself now?

Did you blacklist the drivers for your GPUs to avoid conflicts with the Proxmox host?

In the Proxmox 8.4.1 docs, there is a line that reads, “If you have multiple GPUs, you’ll need to ensure you only pass through the specific P4000 device, not any other GPUs”

Use lspci in the Proxmox host terminal to list the PCI device addresses.

Edit the VM's configuration file (/etc/pve/qemu-server/<vmid>.conf, where <vmid> is the VM's ID).

Add a line with the correct PCI address: hostpci0: 0000:01:00.0

Or use the Web interface to add the hardware:
Open the settings for the VM that will use the P4000.
Click the “Hardware” tab.
Click “Add” and select the “PCI Device” type.
Select the PCI device found with lspci (like 01:00.0).
Save
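
For anyone following along, here is a rough sketch of how the lspci step above could look on the Proxmox host. It only filters for the NVIDIA vendor ID (10de) so the GPU and its audio function are easy to spot. The helper name nvidia_functions, the VM ID 100, and the address 0000:01:00 are made up for illustration, so swap in your own values.

```shell
#!/usr/bin/env bash
# Sketch: list NVIDIA PCI functions on the Proxmox host so each GPU (and its
# HDMI audio function, if present) can be matched to a hostpci entry.

# Reads `lspci -Dnn` style output on stdin and prints the PCI address of every
# function whose vendor ID is 10de (NVIDIA), one address per line.
nvidia_functions() {
    awk 'tolower($0) ~ /\[10de:/ { print $1 }'
}

# Only query real hardware when lspci is available.
if command -v lspci >/dev/null 2>&1; then
    lspci -Dnn | nvidia_functions
fi

# Hypothetical follow-up (VM ID 100 and 0000:01:00 are placeholders):
#   qm set 100 -hostpci0 0000:01:00,pcie=1
# which writes a "hostpci0: 0000:01:00,pcie=1" line into the VM config.
```

Quadro cards usually show the GPU at function .0 and an HDMI audio device at .1. Per the Proxmox docs, leaving the function off the address (0000:01:00 instead of 0000:01:00.0) passes every function of the device, which is the same as ticking “All Functions” in the web interface. The pcie=1 option needs the VM to use the q35 machine type.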

Hope this helps.

I thought I had done it the right way. The following is how I did it before, and how I did it this time. Did I miss something new?

nano /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"

nano /etc/modules
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

echo "options vfio_iommu_type1 allow_unsafe_interrupts=1" > /etc/modprobe.d/iommu_unsafe_interrupts.conf
echo "options kvm ignore_msrs=1" > /etc/modprobe.d/kvm.conf
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
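
One way to sanity-check the steps above is to look at the IOMMU groups after rebooting. This is only a sketch: the function name list_iommu_groups is made up, and it assumes the edits were followed by update-grub, update-initramfs -u -k all, and a reboot, which the steps above do not mention.

```shell
#!/usr/bin/env bash
# Sketch: confirm the IOMMU is active and see which devices share a group.
# Assumes the GRUB and module edits were applied with `update-grub` and
# `update-initramfs -u -k all` followed by a reboot. Hosts that boot through
# systemd-boot keep the kernel command line in /etc/kernel/cmdline and use
# `proxmox-boot-tool refresh` instead of update-grub.

# Prints "group N: <pci address>" for every device under the given sysfs root
# (defaults to the real /sys/kernel/iommu_groups).
list_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}"
    local dev group
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue       # nothing matched: IOMMU is not active
        group="${dev%/devices/*}"       # strip "/devices/<address>"
        group="${group##*/}"            # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}

list_iommu_groups
```

If this prints nothing, the IOMMU is not enabled as far as the kernel is concerned. If a GPU shares a group with unrelated devices, passing it through on its own is likely to cause trouble.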

I'd make sure that the BIOS is on the latest version (there's a lot of talk online about the slew of BIOS versions and their various issues for that motherboard).

I think the best bet is to follow the “standard steps”: make sure your BIOS settings line up for Proxmox and the VMs, and then get the blacklists correct after that because of the mixed hardware. I think you are missing a couple of blacklist items, which may be a problem with the M2000 in there.
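
As a rough idea of what the extra blacklist items could look like, here is a sketch based on what passthrough guides commonly suggest, not something confirmed for this board. The function name vfio_modprobe_conf, the file name vfio-gpu.conf, and the vendor:device IDs are placeholders; the real IDs come from lspci -nn on the host.

```shell
#!/usr/bin/env bash
# Sketch: build modprobe.d content that keeps the host's GPU drivers off the
# cards and asks vfio-pci to claim them instead.

# Prints the config for the given vendor:device IDs.
# Example call: vfio_modprobe_conf 10de:1bb1 10de:10f0   (IDs are placeholders)
vfio_modprobe_conf() {
    local ids
    ids="$(IFS=,; printf '%s' "$*")"    # join all arguments with commas
    printf 'blacklist nouveau\n'
    printf 'blacklist nvidia\n'
    printf 'blacklist nvidiafb\n'
    printf 'options vfio-pci ids=%s\n' "$ids"
}

# Hypothetical usage on the Proxmox host (run as root, then reboot):
#   vfio_modprobe_conf 10de:1bb1 10de:10f0 > /etc/modprobe.d/vfio-gpu.conf
#   update-initramfs -u -k all
```

With two different cards like the M2000 and P4000, each GPU and each audio function has its own ID, so all of them would go in the list.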

I highly suggest taking a look at: https://www.youtube.com/watch?v=Il6HhOCfDjI

These steps take you from the beginning (after enabling it in the BIOS) and start with basic testing to make sure you are ready to move forward. They generally work for getting everything to be seen correctly; at least the steps align with my personal experience. (See the pinned first post with commands, they are helpful.)

Hope this helps!

Got the GPU to show up in the app finally, but I got this error when trying to save the app.

Error: Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/job.py", line 515, in run
    await self.future
  File "/usr/lib/python3/dist-packages/middlewared/job.py", line 562, in __run_body
    rv = await self.middleware.run_in_thread(self.method, *args)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 599, in run_in_thread
    return await self.run_in_executor(io_thread_pool_executor, method, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 596, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/service/crud_service.py", line 294, in nf
    rv = func(*args, **kwargs)
         ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/api/base/decorator.py", line 96, in wrapped
    result = func(*args)
             ^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/plugins/apps/crud.py", line 229, in do_update
    app = self.update_internal(job, app, data, trigger_compose=app['state'] != 'STOPPED')
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/plugins/apps/crud.py", line 259, in update_internal
    update_app_config(app_name, app['version'], new_values, custom_app=app['custom_app'])
  File "/usr/lib/python3/dist-packages/middlewared/plugins/apps/ix_apps/lifecycle.py", line 60, in update_app_config
    render_compose_templates(
  File "/usr/lib/python3/dist-packages/middlewared/plugins/apps/ix_apps/lifecycle.py", line 51, in render_compose_templates
    raise CallError(f'Failed to render compose templates: {cp.stderr}')
middlewared.service_exception.CallError: [EFAULT] Failed to render compose templates: Traceback (most recent call last):
  File "/usr/bin/apps_render_app", line 33, in <module>
    sys.exit(load_entry_point('apps-validation==0.1', 'console_scripts', 'apps_render_app')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/catalog_templating/scripts/render_compose.py", line 48, in main
    render_templates_from_path(args.path, args.values)
  File "/usr/lib/python3/dist-packages/catalog_templating/scripts/render_compose.py", line 19, in render_templates_from_path
    rendered_data = render_templates(
                    ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/catalog_templating/render.py", line 31, in render_templates
    rendered_templates[i.name] = env.get_template(i.name).render(
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 1301, in render
    self.environment.handle_exception()
  File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 936, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "/mnt/.ix-apps/app_configs/plex/versions/1.1.22/templates/docker-compose.yaml", line 3, in top-level template code
    {% set c1 = tpl.add_container(values.consts.plex_container_name, values.plex.image_selector) %}
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/.ix-apps/app_configs/plex/versions/1.1.22/templates/library/base_v2_1_16/render.py", line 59, in add_container
    container = Container(self, name, image)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/.ix-apps/app_configs/plex/versions/1.1.22/templates/library/base_v2_1_16/container.py", line 94, in __init__
    self.deploy: Deploy = Deploy(self._render_instance)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/.ix-apps/app_configs/plex/versions/1.1.22/templates/library/base_v2_1_16/deploy.py", line 15, in __init__
    self.resources: Resources = Resources(self._render_instance)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/.ix-apps/app_configs/plex/versions/1.1.22/templates/library/base_v2_1_16/resources.py", line 24, in __init__
    self._auto_add_gpus_from_values()
  File "/mnt/.ix-apps/app_configs/plex/versions/1.1.22/templates/library/base_v2_1_16/resources.py", line 55, in _auto_add_gpus_from_values
    raise RenderError(f"Expected [uuid] to be set for GPU in slot [{pci}] in [nvidia_gpu_selection]")
base_v2_1_16.error.RenderError: Expected [uuid] to be set for GPU in slot [0000:02:00.0] in [nvidia_gpu_selection]

I think the issue is that TrueNAS SCALE 25.04 is changing the GPU's UUID to something different from what it is under Proxmox.
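
To check the UUID theory, it may help to see which UUID the driver inside the TrueNAS VM actually reports for the slot named in the error. The sketch below only reads information and does not fix anything. The helper name uuid_for_slot is made up, and it assumes the slot is in PCI domain 0000 like the one in the error above.

```shell
#!/usr/bin/env bash
# Sketch: look up the UUID nvidia-smi reports for a given PCI slot.

# Reads "pci.bus_id, uuid" CSV lines on stdin and prints the UUID of the GPU
# whose bus ID ends with the requested slot. nvidia-smi pads the PCI domain to
# 8 hex digits (00000000:02:00.0) while the TrueNAS error shows 4
# (0000:02:00.0), so only the bus:device.function part is compared.
uuid_for_slot() {
    local slot="${1#0000:}"             # 0000:02:00.0 -> 02:00.0
    awk -F', *' -v slot="$slot" '
        tolower(substr($1, length($1) - length(slot) + 1)) == tolower(slot) { print $2 }
    '
}

# Only query the driver when nvidia-smi is available (run this inside the VM).
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=pci.bus_id,uuid --format=csv,noheader \
        | uuid_for_slot 0000:02:00.0
fi
```

If this prints a UUID for 0000:02:00.0, the driver can see the card fine and the missing [uuid] is more likely on the app configuration side than a Proxmox passthrough problem. If it prints nothing, the card at that slot is not visible to the driver inside the VM.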