PXE booting RPi from TrueNAS

Hi guys, not sure if this is the most appropriate place for this topic but here we go.

I’m trying to network boot a Raspberry Pi after an SD card died, but I’m at the end of the line in trying to diagnose my problems and have no idea what else to try. I’m running my TFTP server on TrueNAS SCALE and my DHCP server on my EdgeRouter 4, and I’m booting a Raspberry Pi 4B. I’m following this guide from Wendell, as well as using a few of his resources:

For reference, my IPs are assigned as follows:
Gateway: 192.168.1.1
TFTP: 192.168.1.181
RPi: 192.168.1.154

Here is the important stuff from the bootup screen of the RPi:

Raspberry Pi 4 Model B 2GB - bootloader: c2f8c388 Apr 29 2021
update-ts: 1649213267
board: b03112 37ca3181 dc:a6:32:7a:3d:63
boot: mode NETWORK 2 order 241 retry 2/5 restart SD: card not detected
part: O mbr [0x00:00000000 0x00:00000000 0x00:00
fw: start.elf fixup.dat 
net: up ip: 192.168.1.154 s sn: 255.255.255.0 
tftp: 192.168.1.181 18:31:bf:cd:4b:70

Reset USB port-power 1000 ms
Failed to open device: 'sdcard' (cmd 371a0010 status 1fff0001)
Failed to open device: 'sdcard' (cmd 371a0010 status 1fff0001)
USB2[1] 400202e1 connected
USB2 root HUB port 1 init
HUB [01:00] 2.16 000000:01 init port 3 speed 1
USB MSD timed out after 20 seconds 
NET_BOOT: dc:a6:32:7a:3d:63 wait for link TFTP: 192.168.1.181
Link ready

YI_ADDR 192.168.1.154 
[66]: 192.168.1.181
TFTP 1: File not found
TFTP 1: File not found
Read config.txt bytes 816 hnd 0x00000000
TFTP 1: File not found
TFTP 1: File not found
TFTP 1: File not found
TFTP 1: File not found 
TFTP 1: File not found
TFTP 1: File not found
Firmware not found

I’m able to connect to the TFTP server from the Pi, since I can mount the share and transfer my boot files without any problems. I’ve run a diff, and the output is pretty much what I would expect (note that I’ve added cmdline.txt to the network boot folder while trying to diagnose this issue):

[email protected]:~# sudo diff /boot/ /nfs/boot/
diff: /nfs/boot/System.map-5.13.0-1008-raspi: Permission denied
diff: /nfs/boot/System.map-5.13.0-1024-raspi: Permission denied
Only in /nfs/boot/: cmdline.txt
diff /boot/config.txt /nfs/boot/config.txt
0a1,32
> [pi4]
> max_framebuffers=2
>
> [all]
> kernel=vmlinuz
> cmdline=cmdline.txt
> initramfs initrd.img followkernel
>
> # Enable the audio output, I2C and SPI interfaces on the GPIO header
> dtparam=audio=on
> dtparam=i2c_arm=on
> dtparam=spi=on
>
> # Enable the serial pins
> enable_uart=1
>
> # Comment out the following line if the edges of the desktop appear outside
> # the edges of your display
> disable_overscan=1
>
> # If you have issues with audio, you may try uncommenting the following line
> # which forces the HDMI output into HDMI mode instead of DVI (which doesn't
> # support audio output)
> #hdmi_drive=2
>
> # If you have a CM4, uncomment the following line to enable the USB2 outputs
> # on the IO board (assuming your CM4 is plugged into such a board)
> #dtoverlay=dwc2,dr_mode=host
>
> # Config settings specific to arm64
> arm_64bit=1
> dtoverlay=dwc2
Common subdirectories: /boot/dtbs and /nfs/boot/dtbs
Common subdirectories: /boot/firmware and /nfs/boot/firmware

I’ve also got my DHCP logs from the router here, though I’m not sure exactly what they’re supposed to look like. I have enabled option 66 for the entire subnet at this point:

Apr  7 10:42:43 EdgeRouter-4 dhcpd3: DHCPDISCOVER from dc:a6:32:7a:3d:63 via eth1
Apr  7 10:42:43 EdgeRouter-4 dhcpd3: DHCPOFFER on 192.168.1.154 to dc:a6:32:7a:3d:63 via eth1
Apr  7 10:42:43 EdgeRouter-4 dhcpd3: DHCPREQUEST for 192.168.1.154 (192.168.1.1) from dc:a6:32:7a:3d:63 via eth1
Apr  7 10:42:43 EdgeRouter-4 dhcpd3: DHCPACK on 192.168.1.154 to dc:a6:32:7a:3d:63 via eth1
Apr  7 10:42:47 EdgeRouter-4 dhcpd3: DHCPDISCOVER from dc:a6:32:7a:3d:63 via eth1
Apr  7 10:42:47 EdgeRouter-4 dhcpd3: DHCPOFFER on 192.168.1.154 to dc:a6:32:7a:3d:63 via eth1
Apr  7 10:42:47 EdgeRouter-4 dhcpd3: DHCPREQUEST for 192.168.1.154 (192.168.1.1) from dc:a6:32:7a:3d:63 via eth1
Apr  7 10:42:47 EdgeRouter-4 dhcpd3: DHCPACK on 192.168.1.154 to dc:a6:32:7a:3d:63 via eth1
Apr  7 10:42:57 EdgeRouter-4 dhcpd3: DHCPDISCOVER from dc:a6:32:7a:3d:63 via eth1
Apr  7 10:42:57 EdgeRouter-4 dhcpd3: DHCPOFFER on 192.168.1.154 to dc:a6:32:7a:3d:63 via eth1
Apr  7 10:42:57 EdgeRouter-4 dhcpd3: DHCPREQUEST for 192.168.1.154 (192.168.1.1) from dc:a6:32:7a:3d:63 via eth1
Apr  7 10:42:57 EdgeRouter-4 dhcpd3: DHCPACK on 192.168.1.154 to dc:a6:32:7a:3d:63 via eth1
Apr  7 10:43:07 EdgeRouter-4 dhcpd3: DHCPDISCOVER from dc:a6:32:7a:3d:63 via eth1
Apr  7 10:43:07 EdgeRouter-4 dhcpd3: DHCPOFFER on 192.168.1.154 to dc:a6:32:7a:3d:63 via eth1
Apr  7 10:43:07 EdgeRouter-4 dhcpd3: DHCPREQUEST for 192.168.1.154 (192.168.1.1) from dc:a6:32:7a:3d:63 via eth1
Apr  7 10:43:07 EdgeRouter-4 dhcpd3: DHCPACK on 192.168.1.154 to dc:a6:32:7a:3d:63 via eth1
Apr  7 10:43:17 EdgeRouter-4 dhcpd3: DHCPDISCOVER from dc:a6:32:7a:3d:63 via eth1
Apr  7 10:43:17 EdgeRouter-4 dhcpd3: DHCPOFFER on 192.168.1.154 to dc:a6:32:7a:3d:63 via eth1
Apr  7 10:43:17 EdgeRouter-4 dhcpd3: DHCPREQUEST for 192.168.1.154 (192.168.1.1) from dc:a6:32:7a:3d:63 via eth1
Apr  7 10:43:17 EdgeRouter-4 dhcpd3: DHCPACK on 192.168.1.154 to dc:a6:32:7a:3d:63 via eth1
Apr  7 10:43:27 EdgeRouter-4 dhcpd3: DHCPDISCOVER from dc:a6:32:7a:3d:63 via eth1
Apr  7 10:43:27 EdgeRouter-4 dhcpd3: DHCPOFFER on 192.168.1.154 to dc:a6:32:7a:3d:63 via eth1
Apr  7 10:43:27 EdgeRouter-4 dhcpd3: DHCPREQUEST for 192.168.1.154 (192.168.1.1) from dc:a6:32:7a:3d:63 via eth1
Apr  7 10:43:27 EdgeRouter-4 dhcpd3: DHCPACK on 192.168.1.154 to dc:a6:32:7a:3d:63 via eth1

Does anyone have any idea why the RPi isn’t able to find the firmware on the TFTP server? I’d greatly appreciate the help.

When debugging this kind of thing, it’d be hugely helpful if you could log into your TrueNAS SCALE system and capture packets into a pcap file.

Later on, you can open this capture file in Wireshark and follow the conversation between your TrueNAS SCALE server and your Raspberry Pi.

Specifically,
tcpdump -w output.pcap -i <network_if_name> port 67 or port 68 or port 69
will do the capture. You can drop the -w output.pcap to print the packets to the terminal instead and check that the command line works at all.

In Wireshark you should be able to see which file (if any) the Raspberry Pi is requesting and not finding, and what DHCP parameters are being passed.

For example, normal/usual PXE would offer up both the TFTP server address and the filename to load as part of the DHCP offer/acknowledgement process.

Bear in mind that some TFTP servers are case sensitive and some are not, some depend on configuration, and some will follow symlinks while others won’t. Your filesystem plays a role as well.

Also, there might be firewall settings on TrueNAS that are blocking TFTP; it’s just a computer after all. It’s worth checking whether you can fetch a file that you’re expecting the Raspberry Pi to fetch, but from your own machine, using whatever TFTP client you have installed.
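If you don’t have a TFTP client handy, the check above can be approximated with a few lines of Python. This is only a rough sketch: it sends a single read request and reports the first packet that comes back, it doesn’t download the file. The function names (build_rrq, describe_reply, probe) are made up for illustration, and the address and file names in the usage note are taken from the first post.

```python
import socket
import struct


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request (opcode 1): opcode, filename, NUL, mode, NUL."""
    return (
        struct.pack("!H", 1)
        + filename.encode("ascii") + b"\x00"
        + mode.encode("ascii") + b"\x00"
    )


def describe_reply(packet: bytes) -> str:
    """Summarise the first packet the server sends back."""
    (opcode,) = struct.unpack("!H", packet[:2])
    if opcode == 3:  # DATA: the file exists and the server started sending it
        (block,) = struct.unpack("!H", packet[2:4])
        return f"DATA block {block}, {len(packet) - 4} bytes"
    if opcode == 5:  # ERROR: e.g. code 1 is "File not found"
        (code,) = struct.unpack("!H", packet[2:4])
        message = packet[4:].split(b"\x00", 1)[0].decode("ascii", "replace")
        return f"ERROR {code}: {message}"
    return f"unexpected opcode {opcode}"


def probe(server: str, filename: str, timeout: float = 3.0) -> str:
    """Send one read request to UDP port 69 and describe the answer, if any."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (server, 69))
        try:
            packet, _ = sock.recvfrom(2048)
        except socket.timeout:
            return "no reply (firewall, wrong address, or server not running?)"
        return describe_reply(packet)
```

Something like print(probe("192.168.1.181", "config.txt")) or probe("192.168.1.181", "start.elf") would then tell you whether the server answers with data, with an error, or not at all. Since the sketch never acknowledges the data block, the server will retry a few times and give up, which is harmless for a quick test.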


(Advanced: you can also pipe tcpdump output over ssh into Wireshark directly if you’re crafty with command line redirections; check the tcpdump manpage for how to control buffering in that case. This would let you see all the details on a nice comfortable screen in real time. For example, your ER-4 has tcpdump as well but no real storage, so you could capture on the ER-4 and view the packets on your own machine.)

Legend! I was convinced it was a DHCP problem. Didn’t even think to do a tcpdump. Anyway, for whatever reason, it seems the /boot directory on the SD card is laid out differently to what netboot is expecting.

tcpdump
22:36:06.170507 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from dc:a6:32:7a:3d:63 (oui Unknown), length 322
22:36:06.171387 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from dc:a6:32:7a:3d:63 (oui Unknown), length 309
22:36:06.172214 IP 192.168.1.154.33243 > 192.168.1.181.tftp: TFTP, length 49, RRQ "37ca3181/start4.elf" octet tsize 0 blksize 1024
22:36:06.173879 IP 192.168.1.154.33244 > 192.168.1.181.tftp: TFTP, length 48, RRQ "37ca3181/start.elf" octet tsize 0 blksize 1024
22:36:06.174919 IP 192.168.1.154.33245 > 192.168.1.181.tftp: TFTP, length 40, RRQ "config.txt" octet tsize 0 blksize 1024
22:36:06.176907 IP 192.168.1.154.33246 > 192.168.1.181.tftp: TFTP, length 39, RRQ "vl805.sig" octet tsize 0 blksize 1024
22:36:06.177921 IP 192.168.1.154.33247 > 192.168.1.181.tftp: TFTP, length 42, RRQ "pieeprom.sig" octet tsize 0 blksize 1024
22:36:06.178883 IP 192.168.1.154.33248 > 192.168.1.181.tftp: TFTP, length 42, RRQ "recover4.elf" octet tsize 0 blksize 1024
22:36:06.179793 IP 192.168.1.154.33249 > 192.168.1.181.tftp: TFTP, length 42, RRQ "recovery.elf" octet tsize 0 blksize 1024
22:36:06.180560 IP 192.168.1.154.33250 > 192.168.1.181.tftp: TFTP, length 40, RRQ "start4.elf" octet tsize 0 blksize 1024
22:36:06.181457 IP 192.168.1.154.33251 > 192.168.1.181.tftp: TFTP, length 39, RRQ "start.elf" octet tsize 0 blksize 1024

start.elf and start4.elf were in /boot/firmware. Unfortunately, I’m not finding vl805.sig, pieeprom.sig, recovery.elf, or recover4.elf anywhere. So I’ll look into that next. Thanks for the advice!
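In case it helps anyone else doing the same comparison, here is a rough Python sketch that pulls the requested file names out of tcpdump’s text output and lists the ones that don’t exist under the TFTP root. The function names and the idea of passing the TFTP root as a directory path are my own assumptions for illustration; it only looks for the RRQ "..." pattern shown in the capture above.

```python
import re
from pathlib import Path

# Matches the quoted file name in lines such as:
#   ... TFTP, length 40, RRQ "config.txt" octet tsize 0 blksize 1024
RRQ_PATTERN = re.compile(r'RRQ "([^"]+)"')


def requested_files(tcpdump_text: str) -> list[str]:
    """Return the requested file names in first-seen order, without duplicates."""
    seen: list[str] = []
    for name in RRQ_PATTERN.findall(tcpdump_text):
        if name not in seen:
            seen.append(name)
    return seen


def missing_files(tcpdump_text: str, tftp_root: str) -> list[str]:
    """Return the requested names that are not regular files under tftp_root."""
    root = Path(tftp_root)
    return [name for name in requested_files(tcpdump_text)
            if not (root / name).is_file()]
```

Feeding it the saved tcpdump text plus the path of the TFTP share gives a list of names to go hunting for. Not every missing name is necessarily a problem: in the capture the Pi asks for the serial-prefixed path (37ca3181/...) as well as the same file at the top level.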