Good afternoon. The tl;dr is that I cannot figure out why PCI passthrough in my TrueNAS SCALE install has stopped working. I’ve tried a lot of different things, including reinstalling TrueNAS SCALE, moving the card to a different PCI slot, and about a dozen other things, all of which I have described in detail below.
I have also described everything I did that led up to this issue and all the solutions I’ve tried so far to diagnose and remedy it. At this point, though, I have no idea why it’s not working.
–Back story–
It started with my old daily-driver PC, which still has fairly good specs: an Intel i7-4770 CPU and 32GB of RAM. I built it as a gaming PC back in 2013 and used it until a few months ago, when I replaced it after it had some issues handling a newer graphics card.
Not wanting to throw it out or have it become a dust collector, I put Linux Mint on it, bought a TV tuner card (Hauppauge 1196 WinTV HVR-1265 PCI Express), attached an antenna to it, and hooked it up to my living room’s TV, running Plex and NextPVR. This worked great as a little media server in my living room for several months with no problems.
However, after watching Level1 and getting inspired to de-Googlify my life, I decided it might be better to turn that machine into something that could handle such things better, and I had read and heard about TrueNAS SCALE. In my research I saw that while it was not quite there for passing hardware through to a Docker container, a virtual machine should be no problem. So I decided to get some extra HDDs and install TrueNAS SCALE.
–Discovering the Problem–
I got TrueNAS SCALE up and running on this machine very quickly and set up my first pool. After tooling around with the provided Plex app, I decided a VM was what I really needed, so I set up an Ubuntu Server VM in TrueNAS. I tried to add hardware, but the list of PCI devices was empty. After some research (hxxps://www.truenas.com/community/threads/how-to-pass-through-a-pcie-device-such-as-a-network-card-to-vm.95635/) I discovered there might be some settings not enabled on my motherboard. So into the BIOS I went. Virtualization was enabled, but when I dug around I found that VT-d was not, so I turned that on and booted back into TrueNAS. I was then able to see all the PCI devices and could add them to the VM.
Using lspci I found the TV tuner card and made sure to add it as a PCI passthrough device in the VM settings. I then booted up the VM, installed NextPVR, and did a channel scan, and sure enough I had broadcast TV coming through the VM to the web browser on my other desktop.
From here I went to set up NFS mounting for this media server VM, but to do that I had to set up a network bridge. As an aside, I must say that following these directions from TrueNAS was a complete PITA and required several reboots to get the changes to stick (hxxps://www.truenas.com/docs/scale/virtualization/accessingnasfromvm/). Finally the VM could ping the internet and the host machine.
At this point I wanted to add another pair of HDDs, so I shut the machine down and plugged in the additional pair. I turned it back on and created a second pool with them.
Happy with my creation, I decided to double-check the media server’s NextPVR instance to see if the channel guide had updated, and when I clicked to watch a channel I saw "Streaming Failed (transcoder exited)" at the bottom. I slowly went through rebooting everything, first the VM and then the machine TrueNAS was on, but kept running into the same issue.
–Diagnosis–
It was previously working. I had seen it with my own eyes, so I wasn’t sure what had changed. I started with the last non-hardware thing I did, which was the network bridge. I deleted it and put the network back to how it was, but that did not solve the problem.
I then suspected a power draw problem, so I opened the machine back up and unplugged the power from the two additional HDDs I had added. I got the same problem when I booted back up.
Next, to make sure it was not an actual hardware problem, I booted Linux Lite from a USB stick, installed NextPVR, and told it to do a channel scan. It worked, and I was once again able to stream live TV to my web browser. So the hardware was fine.
Next I moved the TV tuner PCI card to a different PCI slot and changed the VM to point to the new PCI address. Same problem.
Next I spun up a new VM, added the TV card’s PCI address as a passthrough device to that new VM, and installed NextPVR there. Still the same problem.
Next I looked through dmesg to see if there were any problems with the tuner card.
[ 5.967675] tveeprom: Hauppauge model 161111, rev A1I6, serial# 4036167623
[ 5.967676] tveeprom: MAC address is 00:0d:fe:93:07:c7
[ 5.967677] tveeprom: tuner model is SiLabs Si2157 (idx 186, type 4)
[ 5.967677] tveeprom: TV standards NTSC(M) ATSC/DVB Digital (eeprom 0x88)
[ 5.967678] tveeprom: audio processor is CX23888 (idx 40)
[ 5.967678] tveeprom: decoder processor is CX23888 (idx 34)
[ 5.967679] tveeprom: has no radio, has IR receiver, has no IR transmitter
[ 5.967679] cx23885: cx23885[0]: hauppauge eeprom: model=161111
[ 5.986754] Console: switching to colour frame buffer device 240x67
[ 6.011266] amdgpu 0000:01:00.0: [drm] fb0: amdgpudrmfb frame buffer device
[ 6.036847] cx25840 8-0044: cx23888 A/V decoder found @ 0x88 (cx23885[0])
[ 6.041001] b43-phy0: Broadcom 4352 WLAN found (core revision 42)
[ 6.041634] b43-phy0 ERROR: FOUND UNSUPPORTED PHY (Analog 12, Type 11 (AC), Revision 1)
[ 6.041987] b43: probe of bcma0:1 failed with error -95
[ 6.042221] Broadcom 43xx driver loaded [ Features: PNLS ]
[ 6.049063] [drm] Initialized amdgpu 3.40.0 20150101 for 0000:01:00.0 on minor 0
[ 6.668160] cx25840 8-0044: loaded v4l-cx23885-avcore-01.fw firmware (16382 bytes)
[ 6.697143] intel_rapl_common: Found RAPL domain package
[ 6.697376] intel_rapl_common: Found RAPL domain core
[ 6.697590] intel_rapl_common: Found RAPL domain dram
[ 6.748200] cx23885: cx23885[0]: registered device video0 [v4l2]
[ 6.748484] cx23885: cx23885[0]: registered device vbi0
[ 6.748807] cx23885: cx23885[0]: alsa: registered ALSA audio device
[ 6.748809] cx23885: cx23885_dvb_register() allocating 1 frontend(s)
[ 6.749080] cx23885: cx23885[0]: cx23885 based dvb card
[ 6.761028] si2157 7-0060: Silicon Labs Si2147/2148/2157/2158 successfully attached
[ 6.761353] dvbdev: DVB: registering new adapter (cx23885[0])
[ 6.761595] cx23885 0000:06:00.0: DVB: registering adapter 0 frontend 0 (LG Electronics LGDT3306A VSB/QAM Frontend)...
[ 6.762261] cx23885: cx23885_dev_checkrevision() Hardware revision = 0xd0
[ 6.762559] cx23885: cx23885[0]/0: found at 0000:06:00.0, rev: 4, irq: 16, latency: 0, mmio: 0xf0800000
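No errors were reported for the tuner card; the only error in this stretch of the log comes from the unrelated b43 Wi-Fi driver.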
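In case it helps anyone reproduce the check: the dmesg output above shows the cx23885 driver registering the card at 0000:06:00.0. My understanding (an assumption on my part, not something I have confirmed on this box) is that a KVM passthrough device is normally bound to vfio-pci while the VM is running, so it may be worth looking at which driver owns the card at that moment. Here is a minimal Python sketch that reads the standard Linux sysfs "driver" symlink for a PCI device. The function name and the sysfs_root parameter are my own, added only so the function can be pointed at a test directory.

```python
import os
from pathlib import Path


def bound_driver(pci_address, sysfs_root="/sys"):
    """Return the name of the kernel driver bound to a PCI device.

    Returns None if the device is absent or has no driver bound.
    sysfs_root is a hypothetical parameter for testing; on a real
    host the default "/sys" is what you want.
    """
    link = Path(sysfs_root) / "bus" / "pci" / "devices" / pci_address / "driver"
    if not link.is_symlink():
        return None
    # The symlink target ends in the driver name, e.g. .../drivers/cx23885
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # 0000:06:00.0 is the tuner's address taken from the dmesg output above.
    addr = "0000:06:00.0"
    print(addr, "->", bound_driver(addr))
```

Running this on the host once with the VM stopped and once with it started would show whether the card actually changes hands between the host driver and the passthrough driver.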
The latest thing I’ve done is replace the battery on my machine’s motherboard. Since that reset the CMOS, I turned virtualization and VT-d back on afterwards and checked that they were enabled correctly, which it looks like they were:
dmesg | grep -i -e DMAR -e IOMMU
[ 0.000000] Command line: BOOT_IMAGE=/ROOT/22.02.RELEASE@/boot/vmlinuz-5.10.93+truenas root=ZFS=boot-pool/ROOT/22.02.RELEASE ro console=ttyS0,9600 console=tty1 libata.allow_tpm=1 systemd.unified_cgroup_hierarchy=0 amd_iommu=on iommu=pt kvm_amd.npt=1 kvm_amd.avic=1 intel_iommu=on zfsforce=1
[ 0.007451] ACPI: DMAR 0x00000000DD955738 000080 (v01 INTEL HSW 00000001 INTL 00000001)
[ 0.007467] ACPI: Reserving DMAR table memory at [mem 0xdd955738-0xdd9557b7]
[ 0.018013] Kernel command line: BOOT_IMAGE=/ROOT/22.02.RELEASE@/boot/vmlinuz-5.10.93+truenas root=ZFS=boot-pool/ROOT/22.02.RELEASE ro console=ttyS0,9600 console=tty1 libata.allow_tpm=1 systemd.unified_cgroup_hierarchy=0 amd_iommu=on iommu=pt kvm_amd.npt=1 kvm_amd.avic=1 intel_iommu=on zfsforce=1
[ 0.018097] DMAR: IOMMU enabled
[ 0.091269] DMAR: Host address width 39
[ 0.091410] DMAR: DRHD base: 0x000000fed90000 flags: 0x1
[ 0.091604] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap d2008c20660462 ecap f010da
[ 0.091888] DMAR: RMRR base: 0x000000dd8c8000 end: 0x000000dd8d4fff
[ 0.092114] DMAR-IR: IOAPIC id 2 under DRHD base 0xfed90000 IOMMU 0
[ 0.092343] DMAR-IR: HPET id 0 under DRHD base 0xfed90000
[ 0.092539] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.093041] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 0.423257] iommu: Default domain type: Passthrough (set via kernel command line)
[ 1.176592] DMAR: No ATSR found
[ 1.176749] DMAR: dmar0: Using Queued invalidation
[ 1.176967] pci 0000:00:00.0: Adding to iommu group 0
[ 1.177164] pci 0000:00:01.0: Adding to iommu group 1
[ 1.177357] pci 0000:00:14.0: Adding to iommu group 2
[ 1.177552] pci 0000:00:16.0: Adding to iommu group 3
[ 1.177744] pci 0000:00:19.0: Adding to iommu group 4
[ 1.177937] pci 0000:00:1a.0: Adding to iommu group 5
[ 1.178129] pci 0000:00:1b.0: Adding to iommu group 6
[ 1.178321] pci 0000:00:1c.0: Adding to iommu group 7
[ 1.178515] pci 0000:00:1c.1: Adding to iommu group 8
[ 1.178707] pci 0000:00:1c.3: Adding to iommu group 9
[ 1.178900] pci 0000:00:1d.0: Adding to iommu group 10
[ 1.179104] pci 0000:00:1f.0: Adding to iommu group 11
[ 1.179299] pci 0000:00:1f.2: Adding to iommu group 11
[ 1.179538] pci 0000:00:1f.3: Adding to iommu group 11
[ 1.179730] pci 0000:01:00.0: Adding to iommu group 1
[ 1.179917] pci 0000:01:00.1: Adding to iommu group 1
[ 1.180110] pci 0000:03:00.0: Adding to iommu group 12
[ 1.180305] pci 0000:04:00.0: Adding to iommu group 13
[ 1.180500] pci 0000:05:01.0: Adding to iommu group 14
[ 1.180695] pci 0000:05:04.0: Adding to iommu group 15
[ 1.180894] pci 0000:05:05.0: Adding to iommu group 16
[ 1.181089] pci 0000:05:06.0: Adding to iommu group 17
[ 1.181284] pci 0000:05:07.0: Adding to iommu group 18
[ 1.181479] pci 0000:05:08.0: Adding to iommu group 19
[ 1.181674] pci 0000:05:09.0: Adding to iommu group 20
[ 1.181872] pci 0000:06:00.0: Adding to iommu group 21
[ 1.182069] pci 0000:08:00.0: Adding to iommu group 22
[ 1.182266] pci 0000:0a:00.0: Adding to iommu group 23
[ 1.182463] pci 0000:0c:00.0: Adding to iommu group 24
[ 1.182701] DMAR: Intel(R) Virtualization Technology for Directed I/O
[ 1.197929] AMD-Vi: AMD IOMMUv2 functionality not available on this system - This is not a bug.
[ 1.340976] intel_iommu=on
And still nothing. Finally I thought I’d pull out a big hammer and reinstall TrueNAS SCALE, which I did, but I didn’t nuke my old settings (it did not seem to give me the option). This did not work either, and I do wonder if I should escalate to a full wipe.
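For reference, the "Adding to iommu group" lines in the dmesg output above can be summarized with a small script to see whether the tuner shares its IOMMU group with anything else. This is only a rough Python sketch; the function names are my own, and the sample lines are copied from the log above rather than read live from dmesg.

```python
import re
from collections import defaultdict

# Matches lines like: "pci 0000:06:00.0: Adding to iommu group 21"
GROUP_RE = re.compile(r"pci (\S+): Adding to iommu group (\d+)")


def iommu_groups(dmesg_lines):
    """Map IOMMU group number -> list of PCI addresses, in log order."""
    groups = defaultdict(list)
    for line in dmesg_lines:
        match = GROUP_RE.search(line)
        if match:
            groups[int(match.group(2))].append(match.group(1))
    return dict(groups)


def group_of(pci_address, groups):
    """Return (group_id, members) for a PCI address, or None if not found."""
    for group_id, members in groups.items():
        if pci_address in members:
            return group_id, members
    return None


# A few lines copied from the dmesg output above.
sample = [
    "[ 1.177164] pci 0000:00:01.0: Adding to iommu group 1",
    "[ 1.179730] pci 0000:01:00.0: Adding to iommu group 1",
    "[ 1.179917] pci 0000:01:00.1: Adding to iommu group 1",
    "[ 1.181872] pci 0000:06:00.0: Adding to iommu group 21",
]

print(group_of("0000:06:00.0", iommu_groups(sample)))
# prints (21, ['0000:06:00.0'])
```

Going by the full log, the tuner at 0000:06:00.0 sits alone in group 21, so group sharing does not look like the cause here.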
–Conclusion–
So I am stumped both on what could be wrong and how to diagnose this.
Please let me know your thoughts, or whether there is any additional information I can provide.
Thanks