TrueNAS Scale JBOD Issue

Hello everyone. I have a Dell R720XD with a PERC H710 Mini flashed to IT mode and an LSI SAS9207-8e, also in IT mode. Both cards are running P20 firmware.
The LSI SAS9207-8e is connected to a NetApp DS4243 JBOD. I made a 36-drive pool and everything works fine; however, the fun starts when I reboot the R720XD… Almost every single time after a reboot, TrueNAS is unable to properly mount the drives in the JBOD. I get flooded with alerts that most of the drives are unhealthy and are therefore not mounted. To rectify the issue, I have to restart the server an additional 5-20 times until the pool miraculously mounts properly again, which is a real headache… Thoughts?

So far I have tried going back and forth between the Bluefin RC1 release and the stable release, to no avail. I have also reflashed both cards in case of a bad flash, but the issue persists. I appreciate any help/insight!

How do the drives show up in zpool status? Do they have generic names, e.g. sda-sdz or similar?
If that's the case, the lettering may not be assigned identically at every boot. The ZFS best practice is to build a pool using drive-specific identifiers.
On Linux, the recommended way is to use the device links in the /dev/disk/by-id folder. Not sure what exists in TrueNAS.
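
For example, something like this (a rough sketch from a shell; tank is just a placeholder pool name) shows the persistent names and which block device each pool member currently maps to:
~# ls -l /dev/disk/by-id/ | grep -v -- -part   # persistent per-drive names (WWN/serial), whole disks only
~# zpool status -LP tank                       # -L/-P resolve each pool member to its real /dev path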

Here is my zpool status screen.

Thanks for sharing. This looks pretty device-specific :smiley:

Thought as much :sweat_smile: Thanks for your input, though.

Just a thought: do you have enough power for the initial startup, when everything comes online at once?

I recall seeing QNAP devices having problems when TrueNAS is installed. The default OS (QTS) would bring groups of drives online instead of everything at once.

2 Likes

I was thinking the same thing as @vivante, except that it is not power-related per se, but rather that the QNAP chassis firmware itself is designed to spin up drives in batches, causing the OS to boot before the drives are up.

There may be a way, via bash or fstab or something, to cause a wait state in the OS and get it to pause long enough for the chassis to come up. Actually, you might be able to just add _netdev to the fstab options, as that might cause enough hang time for the QNAP to be fully up.
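
As a rough sketch of the same idea on a systemd-based system, a drop-in can make the ZFS import service wait before it runs. This assumes the pool is imported by zfs-import-cache.service; TrueNAS SCALE's middleware may import pools itself, in which case this has no effect:
~# mkdir -p /etc/systemd/system/zfs-import-cache.service.d
~# cat > /etc/systemd/system/zfs-import-cache.service.d/wait-for-shelf.conf <<'EOF'
[Service]
# give the enclosure time to finish spinning up before the pool import runs
ExecStartPre=/bin/sleep 60
EOF
~# systemctl daemon-reload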

1 Like

I doubt power is the issue, as the drives come up right away with no problems when booting into Windows on the same server or into a GParted live CD. The controller not being given enough time before the OS loads makes a lot of sense.

I’m having a similar issue with my NetApp JBOD and TrueNAS Scale. I’ll try to edit with some more pertinent details when I have a moment.

Edit:
My system is an HPE ProLiant DL380e Gen8 running TrueNAS-SCALE-22.02.4 with an HP 9217-4i4e (LSI SAS 9207-4i4e) HBA in IT mode. sas2flash lists the firmware as 20.00.07.00. This HBA is connected to a NetApp DS4246 via an IOM6 module. The NetApp shelf is fully populated with 24 disks.

zpool status:
~# zpool status
  pool: boot-pool
 state: ONLINE
  scan: scrub repaired 0B in 00:01:46 with 0 errors on Sun Dec  4 03:46:48 2022
config:

        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdad3   ONLINE       0     0     0
            sda3    ONLINE       0     0     0

errors: No known data errors

  pool: internal
 state: ONLINE
  scan: scrub repaired 0B in 00:05:40 with 0 errors on Wed Dec  7 00:05:42 2022
config:

        NAME                                      STATE     READ WRITE CKSUM
        internal                                  ONLINE       0     0     0
          mirror-0                                ONLINE       0     0     0
            05628f32-293b-47cb-8ec1-f464164ee3ab  ONLINE       0     0     0
            a1bd107b-77af-4c13-8599-8677c3d39309  ONLINE       0     0     0
          mirror-1                                ONLINE       0     0     0
            a47fdd82-cb6a-486b-9b64-c7aedd39d68a  ONLINE       0     0     0
            52ed0948-5f2f-459a-8db1-8c05c4762700  ONLINE       0     0     0

errors: No known data errors

  pool: shelf
 state: ONLINE
  scan: scrub repaired 0B in 11:33:19 with 0 errors on Sun Nov 13 11:33:24 2022
config:

        NAME                                      STATE     READ WRITE CKSUM
        shelf                                     ONLINE       0     0     0
          raidz2-0                                ONLINE       0     0     0
            843c78cb-e4f4-49c3-94ed-6df39813d0ef  ONLINE       0     0     0
            ab837ac4-b2f8-4e90-be23-f3eba9270a6a  ONLINE       0     0     0
            c9dc3a80-dba2-43d7-ae07-e4c795fe5725  ONLINE       0     0     0
            f6ce1fbe-9b24-4f77-bb71-a1dcb917be7a  ONLINE       0     0     0
            01448d17-2e96-4a23-ae24-784205b5baac  ONLINE       0     0     0
          raidz2-1                                ONLINE       0     0     0
            ff427fbc-4378-4704-96b0-c5bafd733ac9  ONLINE       0     0     0
            e4db13f0-9f23-40e5-89c3-1c80109b5be8  ONLINE       0     0     0
            9d3fa290-bd66-4237-9d62-4cbfca338006  ONLINE       0     0     0
            f46fd6a3-04f8-4f6d-8c4b-2a6af5d57afa  ONLINE       0     0     0
            da0f430b-c41e-4c73-a6cf-2cb796196e0a  ONLINE       0     0     0
          raidz2-2                                ONLINE       0     0     0
            b2d582ab-40aa-4308-993c-fa19da583b53  ONLINE       0     0     0
            0fca336a-3ba0-43ba-a7db-1c25f5c53b1d  ONLINE       0     0     0
            08b6f785-af46-444b-9bfb-7f328781c1e9  ONLINE       0     0     0
            1b1edf4a-ec8b-416d-86ef-160c511717ac  ONLINE       0     0     0
            c40684a5-ba2b-4acf-b14f-754579413e19  ONLINE       0     0     0
          raidz2-3                                ONLINE       0     0     0
            9391acdb-fa3c-4e70-a372-63437236f630  ONLINE       0     0     0
            04197bd5-6ceb-4155-972d-d0b6413b36d7  ONLINE       0     0     0
            34d68ce2-d06e-4cd4-93b5-d6e13e184cb9  ONLINE       0     0     0
            629e798e-3783-4132-bd3f-b0b0e59a909f  ONLINE       0     0     0
            b37cf4bd-02e4-4046-a70f-643333701aa6  ONLINE       0     0     0
        special
          mirror-4                                ONLINE       0     0     0
            9efed506-136d-482e-8bf4-298baf9b7577  ONLINE       0     0     0
            cce016bb-169b-49a5-898b-a3b0081d3b7d  ONLINE       0     0     0
        logs
          mirror-5                                ONLINE       0     0     0
            7e4395d2-a6da-4073-a3b1-69b6559571dd  ONLINE       0     0     0
            bbb6919a-c4ad-4628-a16b-a211da06dc05  ONLINE       0     0     0
        cache
          9fa1042f-573a-4498-b98d-f503666eb500    ONLINE       0     0     0
          036aaa42-e7d7-4dcd-8b66-f9fa87e6ed8c    ONLINE       0     0     0
        spares
          de525a83-906f-49e0-89d1-1a95fabdac5f    AVAIL   
          e0a461ee-1099-455e-a79f-f57e7208a23f    AVAIL   

errors: No known data errors

I haven’t tried Bluefin, nor have I tried the multiple reboots. I just told ZFS that everything was alright, cleared the errors, and have been fine since (no noticed data loss).
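
In practice that amounted to roughly the following (shelf is my pool name):
~# zpool clear shelf       # reset the error counters and let I/O resume
~# zpool status -v shelf   # confirm everything reports ONLINE again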

This chassis for sure supports staggered spin-up. I do not know how to check whether it is on or off, or how to change it, but that seems like it could cause your issue.

How about exporting before the reboot, then importing after boot?

I’m not familiar with TrueNAS/FreeNAS. Exporting does kill shares… my bad…

If that would make a difference, then delay the zpool import task somehow?

I would love to know if anyone else knows how to tweak this setting on these NetApp shelves.
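
If nothing else, it's possible to dump the SES pages the shelf's IOM exposes and see whether anything spin-up-related is even advertised. This is a shot in the dark; I don't know whether the IOM6 offers such a control, and the /dev/sgN device below is just a placeholder:
~# lsscsi -g | grep -i enclosu     # find the enclosure's SCSI generic (/dev/sgN) device
~# sg_ses /dev/sgN                 # see what the enclosure reports over SES
~# sg_ses --page=0x2 /dev/sgN      # enclosure status diagnostic page, for a closer look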

I believe that would really mess up shares, k8s apps, etc.

1 Like

One must bring those services down for a reboot anyway?

And if the pool isn't coming up on boot, then they can't be that adversely affected?

I am happy to be corrected, and for sure a delayed import may not at all resolve the issue.

Just trying to reduce the number of reboots before reaching a working system, is all.
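
One concrete thing that might save some of those reboots (a sketch only; I haven't tried it on TrueNAS, and the middleware may not like a pool imported behind its back): when the pool comes up broken, export it and re-import it once all the shelf drives are visible, instead of rebooting:
~# zpool export shelf                             # release the half-imported pool (may fail if pool I/O is suspended)
~# zpool import -d /dev/disk/by-partuuid shelf    # re-scan and import using the partition UUIDs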

1 Like

It is okay to say I am mistaken in my guess.
I was just spitballing a possible avenue of approach, is all.

I could be wrong, but I believe exporting a pool that is attached to TrueNAS will try to clean up all its dependencies. If a pool is gone after a reboot it’ll retain the dependencies, but the services will fail to start.

I guess I should say this system is as “production” as homelabs can get, so I try to reboot it very infrequently. Data is all backed up, but no HA/redundancy is available. I should also say this appears to be new behavior. It’s worked for years without this issue, but it’s also only recently that I completely filled the shelf.

1 Like

That sounds fair, to be honest. Thanks.

I updated my system from Angelfish to Bluefin today hoping for some improvement, but it appears the same issue occurred during boot. I don’t believe there’s an issue with the hard drives themselves, as ZFS doesn’t report any errors during normal operation.

On the first boot it showed the two metadata SSDs had 4 read errors each, and together had 6 (not sure how that math is being done). I ran zpool clear shelf, and then the IO errors skyrocketed, saying the hard drives were disconnecting.

I’ve attached the dmesg output of the second boot. :point_down:
dmesg.txt (237.1 KB)

I’ve done two more reboots with no improvement.

zpool status output
~# zpool status -v shelf
  pool: shelf
 state: ONLINE
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-JQ
  scan: scrub repaired 0B in 11:16:10 with 0 errors on Thu Jan 26 11:16:35 2023
config:

	NAME                                      STATE     READ WRITE CKSUM
	shelf                                     ONLINE       0     0     0
	 raidz2-0                                ONLINE       0    18     0
	   843c78cb-e4f4-49c3-94ed-6df39813d0ef  ONLINE       3    13     0
	   ab837ac4-b2f8-4e90-be23-f3eba9270a6a  ONLINE       3    13     0
	   c9dc3a80-dba2-43d7-ae07-e4c795fe5725  ONLINE       3    11     0
	   f6ce1fbe-9b24-4f77-bb71-a1dcb917be7a  ONLINE       3    11     0
	   01448d17-2e96-4a23-ae24-784205b5baac  ONLINE       0    10     0
	 raidz2-1                                ONLINE       0    48     0
	   ff427fbc-4378-4704-96b0-c5bafd733ac9  ONLINE       3    39     0
	   e4db13f0-9f23-40e5-89c3-1c80109b5be8  ONLINE       3    38     0
	   9d3fa290-bd66-4237-9d62-4cbfca338006  ONLINE       3    41     0
	   f46fd6a3-04f8-4f6d-8c4b-2a6af5d57afa  ONLINE       3    39     0
	   da0f430b-c41e-4c73-a6cf-2cb796196e0a  ONLINE       3    36     0
	 raidz2-2                                ONLINE       0     0     0
	   b2d582ab-40aa-4308-993c-fa19da583b53  ONLINE       0     0     0
	   0fca336a-3ba0-43ba-a7db-1c25f5c53b1d  ONLINE       0     0     0
	   08b6f785-af46-444b-9bfb-7f328781c1e9  ONLINE       0     0     0
	   1b1edf4a-ec8b-416d-86ef-160c511717ac  ONLINE       0     0     0
	   c40684a5-ba2b-4acf-b14f-754579413e19  ONLINE       0     0     0
	 raidz2-3                                ONLINE       0     0     0
	   9391acdb-fa3c-4e70-a372-63437236f630  ONLINE       0     0     0
	   04197bd5-6ceb-4155-972d-d0b6413b36d7  ONLINE       0     0     0
	   34d68ce2-d06e-4cd4-93b5-d6e13e184cb9  ONLINE       0     0     0
	   629e798e-3783-4132-bd3f-b0b0e59a909f  ONLINE       0     0     0
	   b37cf4bd-02e4-4046-a70f-643333701aa6  ONLINE       0     0     0
	special	
	 mirror-4                                ONLINE   21.3K    12     0
	   9efed506-136d-482e-8bf4-298baf9b7577  ONLINE       4    12     0
	   cce016bb-169b-49a5-898b-a3b0081d3b7d  ONLINE       4    12     0
	logs	
	 mirror-5                                ONLINE       0     0     0
	   7e4395d2-a6da-4073-a3b1-69b6559571dd  ONLINE       0     0     0
	   bbb6919a-c4ad-4628-a16b-a211da06dc05  ONLINE       0     0     0
	cache
	 9fa1042f-573a-4498-b98d-f503666eb500    ONLINE       0     0     0
	 036aaa42-e7d7-4dcd-8b66-f9fa87e6ed8c    ONLINE       0     0     0
	spares
	 de525a83-906f-49e0-89d1-1a95fabdac5f    AVAIL   
	 e0a461ee-1099-455e-a79f-f57e7208a23f    AVAIL   

errors: List of errors unavailable: pool I/O is currently suspended

@wendell, have you ever come across something like this with the NetApp enclosures and/or IT Mode HBAs? I’m not sure how to troubleshoot this without buying new hardware to see if it resolves the issues.

You have a couple of hardware errors in the boot log. I am not sure if these are related to your zpool issue, but it is something you probably want to look into.

These errors happen even before the mpt2sas driver is loaded.

From the log:
[    0.807758] swapper/0: page allocation failure: order:12, mode:0xcc1(GFP_KERNEL|GFP_DMA), nodemask=(null),cpuset=/,mems_allowed=0-1
[    0.811675] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G          I       5.15.79+truenas #1
[    0.815669] Hardware name: HP ProLiant DL380e Gen8, BIOS P73 05/24/2019
[    0.815669] Call Trace:
[    0.815669]  <TASK>
[    0.815669]  dump_stack_lvl+0x46/0x5e
[    0.815669]  warn_alloc+0x13b/0x160
[    0.815669]  ? __alloc_pages_direct_compact+0xa9/0x200
[    0.815669]  __alloc_pages_slowpath.constprop.0+0xcc3/0xd00
[    0.815669]  ? __cond_resched+0x16/0x50
[    0.815669]  __alloc_pages+0x1e9/0x220
[    0.815669]  alloc_page_interleave+0xf/0x60
[    0.815669]  atomic_pool_expand+0x11d/0x220
[    0.815669]  ? __dma_atomic_pool_init+0x97/0x97
[    0.815669]  ? __dma_atomic_pool_init+0x97/0x97
[    0.815669]  __dma_atomic_pool_init+0x45/0x97
[    0.815669]  dma_atomic_pool_init+0xb9/0x15e
[    0.815669]  do_one_initcall+0x44/0x1d0
[    0.815669]  kernel_init_freeable+0x216/0x27d
[    0.815669]  ? rest_init+0xc0/0xc0
[    0.815669]  kernel_init+0x16/0x120
[    0.815669]  ret_from_fork+0x22/0x30
[    0.815669]  </TASK>
[    0.815672] Mem-Info:
[    0.818245] active_anon:0 inactive_anon:0 isolated_anon:0
                active_file:0 inactive_file:0 isolated_file:0
                unevictable:0 dirty:0 writeback:0
                slab_reclaimable:30 slab_unreclaimable:3328
                mapped:0 shmem:0 pagetables:3 bounce:0
                kernel_misc_reclaimable:0
                free:18478151 free_pcp:273 free_cma:61440
[    9.471321] DMAR: [INTR-REMAP] Request device [01:00.0] fault index 0x27 [fault reason 0x26] Blocked an interrupt request due to source-id verification failure

The driver and the drives (28 of them) are identified and connected fine.

[   19.122820] ERST: [Firmware Warn]: Firmware does not respond in time.

What firmware is being referred to here? Probably not storage-related?

It takes another minute for the iSCSI system to start ([ 78.583082] to [ 80.942195]), but seemingly without issues.

Seven seconds later comes what I think leads to the issues you’re seeing in ZFS.
Between [ 87.707695] and [ 102.484660] the system deals with the enclosure and all the drives not being accessible and as a result being removed from the system.

Then the enclosure gets reset and re-attached (starting [ 105.640809] through [ 115.005482]).

About two minutes later, there is a storage-related issue.

task txg_sync:7325 blocked for more than 120 seconds
[  242.527903] INFO: task txg_sync:7325 blocked for more than 120 seconds.
[  242.536657]       Tainted: P          IOE     5.15.79+truenas #1
[  242.544577] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  242.554430] task:txg_sync        state:D stack:    0 pid: 7325 ppid:     2 flags:0x00004000
[  242.564925] Call Trace:
[  242.568902]  <TASK>
[  242.572481]  __schedule+0x2f0/0x950
[  242.577470]  schedule+0x5b/0xd0
[  242.581999]  schedule_timeout+0x88/0x140
[  242.587310]  ? __bpf_trace_tick_stop+0x10/0x10
[  242.593152]  io_schedule_timeout+0x4c/0x80
[  242.598568]  __cv_timedwait_common+0x128/0x160 [spl]
[  242.604953]  ? finish_wait+0x90/0x90
[  242.609798]  __cv_timedwait_io+0x15/0x20 [spl]
[  242.615626]  zio_wait+0x109/0x220 [zfs]
[  242.621017]  dsl_pool_sync_mos+0x37/0xa0 [zfs]
[  242.627014]  dsl_pool_sync+0x3ab/0x400 [zfs]
[  242.632811]  spa_sync_iterate_to_convergence+0xdb/0x1e0 [zfs]
[  242.640268]  spa_sync+0x2e9/0x5d0 [zfs]
[  242.645611]  txg_sync_thread+0x229/0x2a0 [zfs]
[  242.651636]  ? txg_dispatch_callbacks+0xf0/0xf0 [zfs]
[  242.658586]  thread_generic_wrapper+0x59/0x70 [spl]
[  242.664940]  ? __thread_exit+0x20/0x20 [spl]
[  242.670630]  kthread+0x127/0x150
[  242.675139]  ? set_kthread_struct+0x50/0x50
[  242.680680]  ret_from_fork+0x22/0x30
[  242.685564]  </TASK>
[  242.688935] INFO: task agents:9256 blocked for more than 120 seconds.
[  242.698756]       Tainted: P          IOE     5.15.79+truenas #1
[  242.708194] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  242.719637] task:agents          state:D stack:    0 pid: 9256 ppid:     1 flags:0x00000000
[  242.731879] Call Trace:
[  242.737582]  <TASK>
[  242.742977]  __schedule+0x2f0/0x950
[  242.743030]  schedule+0x5b/0xd0
[  242.754461]  io_schedule+0x42/0x70
[  242.754482]  cv_wait_common+0xaa/0x130 [spl]
[  242.767147]  ? finish_wait+0x90/0x90
[  242.767180]  txg_wait_synced_impl+0x92/0x110 [zfs]
[  242.780708]  txg_wait_synced+0xc/0x40 [zfs]
[  242.789230]  spa_vdev_state_exit+0x8a/0x170 [zfs]
[  242.798152]  zfs_ioc_vdev_set_state+0xe2/0x1b0 [zfs]
[  242.807327]  zfsdev_ioctl_common+0x698/0x750 [zfs]
[  242.816286]  ? __kmalloc_node+0x3d6/0x480
[  242.823730]  ? _copy_from_user+0x28/0x60
[  242.831118]  zfsdev_ioctl+0x53/0xe0 [zfs]
[  242.839031]  __x64_sys_ioctl+0x8b/0xc0
[  242.846057]  do_syscall_64+0x3b/0xc0
[  242.852939]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[  242.861637] RIP: 0033:0x7f310d7326b7
[  242.868844] RSP: 002b:00007f310c97f308 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  242.880305] RAX: ffffffffffffffda RBX: 00007f310c97f320 RCX: 00007f310d7326b7
[  242.891329] RDX: 00007f310c97f320 RSI: 0000000000005a0d RDI: 000000000000000d
[  242.901808] RBP: 00007f310c982d10 R08: 000000000006ebf4 R09: 0000000000000000
[  242.913141] R10: 00007f310d8a0216 R11: 0000000000000246 R12: 00007f31000425f0
[  242.924152] R13: 00007f310c9828d0 R14: 0000557bd43a1e30 R15: 00007f3100041730
[  242.935224]  </TASK>

This seems to repeat every 2 mins or so until the end of the log.

So, the enclosure connects fine, then locks up and reconnects. Is there a way for you to stagger the start of your boxes? E.g. start the enclosure about 3 mins before you start the server?
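
If it helps on the next boot, a rough filter like the following shows just the HBA and enclosure messages (the patterns are guesses; adjust them to taste):
~# dmesg -T | grep -Ei 'mpt2sas|enclosure|attached scsi|i/o error' | less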

1 Like

I could use some Wi-Fi smart plugs, but the enclosure has no way to shut off its power, so it’s always on.

Honestly no idea

I looked up this error and the first result was someone on Proxmox with a bad RAM stick. Running memtest86 now.

1 Like

Been going for around 5 hours… Not sure if I should be seeing errors by now if a DIMM was bad?
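
In the meantime, one way to keep an eye on ECC errors while the box is running normally (assuming the EDAC driver is loaded for this platform; the counters may simply be absent otherwise):
~# grep -H . /sys/devices/system/edac/mc/mc*/ce_count   # corrected-error count per memory controller
~# grep -H . /sys/devices/system/edac/mc/mc*/ue_count   # uncorrected-error count per memory controller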

1 Like