TrueNAS Scale VM randomly locking up

I’m continuing to rebuild my existing home server in a new architecture.

I have a single physical Fedora 38 machine hosting two VMs: one runs TrueNAS Scale (kraka) and serves NFS shares to the other VM (esteban), which runs some containers off of those shares.

The problem is that TrueNAS will just randomly lock up with nearly 100% CPU usage until forcibly rebooted. It’s unresponsive even on the attached VNC console. The last time it happened was while rsyncing data from an old HDD attached to the host machine to my new ZFS pool, but it has also happened randomly before with no load at all.

All the logs say is:

# last log before problem starts
Jul 18 11:31:25 kraka kernel: hrtimer: interrupt took 9953 ns
# VM starts being unresponsive; the following line repeats every few seconds
Jul 18 19:15:45 kraka kernel: lockd: server esteban not responding, timed out
# ...
Jul 18 19:19:02 kraka kernel: lockd: server esteban not responding, timed out
Jul 18 19:19:11 kraka kernel: lockd: server esteban not responding, timed out
Jul 19 21:59:18 kraka kernel: lockd: couldn't shutdown host module for net f0000000!
Jul 19 21:59:18 kraka kernel: nfsd: last server has exited, flushing export cache
Jul 19 21:59:19 kraka kernel: NFSD: Using UMH upcall client tracking operations.
Jul 19 21:59:19 kraka kernel: NFSD: starting 90-second grace period (net f0000000)
Jul 19 22:10:07 kraka kernel: nfsd: last server has exited, flushing export cache
Jul 19 22:10:09 kraka kernel: NFSD: Using UMH upcall client tracking operations.
Jul 19 22:10:09 kraka kernel: NFSD: starting 90-second grace period (net f0000000)
Jul 19 22:12:21 kraka kernel: nfsd: last server has exited, flushing export cache
Jul 19 22:12:23 kraka kernel: NFSD: Using UMH upcall client tracking operations.
Jul 19 22:12:23 kraka kernel: NFSD: starting 90-second grace period (net f0000000)
# last log repeats again until I forcibly reboot the VM
Jul 20 23:50:34 kraka kernel: lockd: server esteban not responding, timed out
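
Those logs don’t tell me much, so next time it hangs I’m thinking of poking at the VM from the host side instead. A rough sketch of what I have in mind, assuming the VM runs under libvirt (truenas below is a placeholder for the actual domain name):

# check how much CPU time each vCPU is burning, as seen by the host
virsh vcpuinfo truenas

# dump the guest CPU registers via the QEMU monitor to see where it’s spinning
virsh qemu-monitor-command --hmp truenas 'info registers'

# with kernel.unknown_nmi_panic=1 set inside the guest beforehand, an injected
# NMI should force a panic with a backtrace instead of a silent spin
virsh inject-nmi truenas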

The entire time this happens, the Podman server (esteban) remains operational, except for the containers that try to access data on the NFS shares. Once TrueNAS reboots, the containers on esteban resume functioning normally, but this is obviously unacceptable.
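
I suspect the containers hang because the shares are hard NFS mounts, which block forever while the server is down. Something like this in /etc/fstab on esteban might at least contain the damage (the export path and mountpoint are made up, just to illustrate the options):

# soft + timeo/retrans makes I/O fail with an error after a bounded time
# instead of blocking forever; _netdev delays the mount until the network is up
kraka:/mnt/tank/apps  /srv/apps  nfs4  soft,timeo=100,retrans=3,_netdev  0 0

The usual caveat applies: soft mounts can turn a server hiccup into write errors inside the containers, so this would be a mitigation, not a fix.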

Any ideas on how to debug this issue? Or is it better to just ditch TrueNAS, install OpenZFS directly on the host, and live without the pretty interface? I could use cockpit-zfs-manager and set up all the tasks manually, but I can’t deny TrueNAS has a nice UI.
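
For what it’s worth, the manual tasks don’t look too bad. A minimal sketch of a scrub schedule as a systemd template timer (tank is a placeholder pool name; newer OpenZFS packages may already ship similar units):

# /etc/systemd/system/zfs-scrub@.service
[Unit]
Description=Scrub ZFS pool %i

[Service]
Type=oneshot
ExecStart=/usr/sbin/zpool scrub %i

# /etc/systemd/system/zfs-scrub@.timer
[Unit]
Description=Monthly scrub of ZFS pool %i

[Timer]
OnCalendar=monthly
Persistent=true

[Install]
WantedBy=timers.target

# enable per pool
systemctl enable --now zfs-scrub@tank.timer

Snapshots and replication would need the same treatment (sanoid/syncoid or zfs-auto-snapshot seem to cover most of it), so it’s doable, just more pieces to maintain myself.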