Recover faulted ZFS pool?

Hi Gang!

I know it’s a long shot, but I’m wondering if anyone has ideas on how I might recover a corrupted ZFS pool. I’m not sure what diagnostic information to provide, so I’ll post what I have and what I’ve tried so far.

I originally had a 12-drive pool arranged as a single vdev, raidz2. Due to power distribution issues (too many SATA splitters and extension cables) I would get random drive drops and server crashes. After the last crash (May 29th) the pool refused to import, so I kept the server shut down while I built custom SATA power cables and cleaned up the data cabling. My server is running Fedora 35 and the latest stable DKMS release of OpenZFS.

I do have backups of the important data from this pool, but they’re cloud backups, so restoring them will take a while. Thought I would give recovering the pool a shot first.

First off, here’s the output from zpool import showing the pool, including one missing device: a drive that has completely failed and is no longer accessible. That drive has been pulled from the server and is inaccessible on my test machine as well.

[[email protected] ~]# zpool import
   pool: wide
     id: 12866334539261191398
  state: FAULTED
status: One or more devices contains corrupted data.
 action: The pool cannot be imported due to damaged devices or data.
        The pool may be active on another system, but can be imported using
        the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
 config:

        wide                                 FAULTED  corrupted data
          raidz2-0                           DEGRADED
            wwn-0x5000c50063bbf42a           UNAVAIL
            wwn-0x5000c50063c0fea6           ONLINE
            wwn-0x5000c50063c75cff           ONLINE
            wwn-0x5000c50063c9e54d           ONLINE
            wwn-0x5000c5006435dab6           ONLINE
            wwn-0x5000c5006435f0ba           ONLINE
            wwn-0x5000c50092937c98           ONLINE
            wwn-0x5000c50092939808           ONLINE
            wwn-0x5000c5009295a5e5           ONLINE
            wwn-0x5000c5009296f867           ONLINE
            wwn-0x5000c5009297eb04           ONLINE
            wwn-0x5000c50092989b50           ONLINE
        logs
          mirror-1                           ONLINE
            nvme-eui.0025385581b1b75e-part4  ONLINE
            nvme-eui.6479a751d0c0005a-part4  ONLINE

Next step, trying basic recovery…

[[email protected] ~]# zpool import -f -F wide
cannot import 'wide': I/O error
        Destroy and re-create the pool from a backup source.
[[email protected] ~]# zpool import -d /dev/disk/by-id -f -F wide
cannot import 'wide': I/O error
        Destroy and re-create the pool from a backup source.
[[email protected] ~]# zpool import -d /dev/disk/by-id -f -F -XN wide
cannot import 'wide': one or more devices is currently unavailable

I tried setting zfs_recover in the module parameters; it hasn’t changed the output of the recovery steps above.
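(For anyone curious, this is roughly how I set it; the runtime path only exists while the zfs module is loaded, and the modprobe.d filename is just my choice.)

```shell
# Turn on recovery mode at runtime:
echo 1 | sudo tee /sys/module/zfs/parameters/zfs_recover

# Or make it persistent across module reloads:
echo "options zfs zfs_recover=1" | sudo tee /etc/modprobe.d/zfs.conf
```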

Found an article from 2011 about label corruption that recommended the diagnostic command zdb -lll. Here’s my output, showing all four labels intact:
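For reference, the label dump comes from pointing zdb at one of the member disks, something along these lines (the device path here is just one member from the zpool import listing above):

```shell
# Read the vdev labels from one pool member; repeating -l
# increases verbosity of the label dump.
zdb -lll /dev/disk/by-id/wwn-0x5000c50063c75cff-part1
```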

Here is the output from zdb without any extra arguments:

How about importing with -f, without recovery? ZFS likes to lock down to raise attention to things that aren’t normal but are usually resolvable. I’ve had this multiple times with different pool/ZFS versions.

Redundancy is still fine, as only a single drive is down. If -f works, run a scrub ASAP. Then try recovery again.

Is this a recent problem or did you have errors in previous scrubs?


Tried a vanilla import as well (even without -f, since this is the machine the pool was last used on); no difference, same error:

[[email protected] ~]# zpool import wide
cannot import 'wide': I/O error
        Destroy and re-create the pool from a backup source.
[[email protected] ~]# zpool import -f wide
cannot import 'wide': I/O error
        Destroy and re-create the pool from a backup source.

The total failure is a recent problem… the previous times the server crashed, I ran scrubs and they didn’t find anything.

I’m not sure, but I think the missing drive is the blocker to importing.

I think it needs to say something like “437284373892794 - was wwn-blhablha”,
as in, so the system knows it’s not going to import that one disk, and will import in a degraded state…

just googling now


I was wondering that too… The only thing I’m finding on Google is how to import if your LOG device is missing, which mine isn’t.

I am reviewing old logs on the server now to see if any other devices went missing prior to the crash. I’m wondering if the metadata is out of sync across devices in the pool… not sure how to tell that.


Two things I found in the docs: importing with read-only mount options, and rolling back to previous TXGs (which isn’t trivial, because you lose data).

zpool import tank -o readonly=on

If readonly works, you can at least get your data before taking further steps.

I won’t post the rollback flag here, even though you’ve already used desperate measures like -XN. People like to copy&paste dangerous stuff.


I’ve used zdb to find different txgs in the journal and tried rolling back to them. Sometimes I get the I/O error, sometimes I get a “required device not available” error, but it doesn’t indicate which device.
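(For anyone following along, the txg candidates I’m working from come from dumping the uberblock rings on a member device, roughly like this; the device path is again just one member from the listing above.)

```shell
# -l reads the vdev labels; adding -u also dumps the uberblock
# ring, with txg numbers and timestamps, which is what I'm
# picking rollback targets from.
zdb -lu /dev/disk/by-id/wwn-0x5000c50063c75cff-part1
```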

Anyone happen to know what this debug message means?

ZFS_DBGMSG(zdb) START:
spa.c:6092:spa_import(): spa_import: importing wide
spa_misc.c:418:spa_load_note(): spa_load(wide, config trusted): LOADING
vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-id/wwn-0x5000c50063c75cff-part1': best uberblock found for spa wide. txg 1941839
spa_misc.c:418:spa_load_note(): spa_load(wide, config untrusted): using uberblock with txg=1941839
spa_misc.c:403:spa_load_failed(): spa_load(wide, config untrusted): FAILED: couldn't get 'config' value in MOS directory [error=5]
spa_misc.c:418:spa_load_note(): spa_load(wide, config untrusted): UNLOADING
ZFS_DBGMSG(zdb) END

I can’t comment on zdb, as I don’t have a real clue; I’ve always hated debugging.

If the cause of the I/O errors was your “fancy cabling” and power connections, or the now-removed drive, you could also try a zpool clear tank, with or without -f, which clears the entire error log. BUT if the source of the errors is still there, it will damage the pool even further (that’s why the pool went into lockdown in the first place, as a damage-mitigation measure).


I don’t have any useful experience with ZFS pool recovery, but you may try running the trial version of Klennet ZFS Recovery and see if what it finds is worth $400.


zpool clear fails; it claims it can’t find the pool… I think that command only works on imported pools?

From what you’ve said, is it possible that multiple drives dropped out previously, and perhaps multiple drives are missing data (hence the ZFS corrupted-data message)?

i.e., one drive currently isn’t coming online, but due to the drops, multiple drives (i.e., three or more) are in an inconsistent state and the uberblock is corrupted?

Sorry, I don’t have any suggestions on how to fix it; I’m more thinking about how it got to this point despite being raidz2…


Yes, I think that’s the problem… the uberblocks it finds don’t match up with what it sees on the currently available disks.

I went through several days of dmesg logs prior to the crash that took the pool down, and I only found the one drive puking… so I’m not really sure what happened.

If I could get it to tell me which other device it’s having an I/O error with, that would be helpful. I’ve run a SMART long test on the 11 remaining drives, and all report healthy status.

Also… found this, which seems similar to your issue from what I’m reading:

I’m not a ZFS expert, but what the errors say seems like uberblock corruption, and maybe it’s possible to roll back to a previous copy. Maybe this will give you some leads or things to look at (i.e., try to get the diagnostic info out by listing the uberblocks).

(and then maybe consider trying to import the pool using a previous transaction’s uberblock copy).


I have tried rolling back to a txg (the -T argument) from a good half hour before the disk went bonkers, but the import still fails… I think the pool must already have been degraded by the time the last disk went, and that nuked the pool.
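For completeness (and with the earlier warning about copy&pasting dangerous stuff in mind), the attempts looked roughly like this; the txg number shown is a placeholder, not one of my actual values, and read-only keeps the rewind from writing anything:

```shell
# DANGEROUS: -T rewinds the pool to an explicit txg.
# Importing read-only first means nothing gets written if the
# rewound state turns out to be inconsistent.
zpool import -f -o readonly=on -T 1941000 wide
```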


Possible to roll back any further in time? Guessing there’s only so far you can go… if the hardware issues were going on for too long, the pool may indeed be toast…


I’m running a command I found on an Oracle forum from 2009 that examines and maybe repairs block damage using zdb… it seems a lot like a scrub, and it’s reporting it will take roughly 40 hours to complete. Bummer that I started it without the benefit of a screen session.


Interested to hear the results. I’ve been fortunate enough to never run into these things, but my setup is pretty simple… dual mirror vdevs.

Glad to hear you have backups! Gives you the option to attempt a repair without quite so much pressure :)


Can you maybe background it with Ctrl+Z (and then the bg command), re-attach with screen, and then fg it?

Maybe stay connected in the first session, in case it needs to write to stdout, until you’ve fg’d it in a new session though.

edit:
hmm not sure if that will work
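Roughly the sequence I had in mind, for what it’s worth. The catch is that the job’s output still goes to the original terminal, and a new screen session can’t simply fg a process owned by another shell; actually adopting it would need a reparenting tool such as reptyr:

```shell
# In the session where the long-running zdb is running:
#   press Ctrl+Z to suspend the foreground job
bg          # resume it in the background
disown -h   # stop the shell from sending SIGHUP if the SSH session drops
jobs -l     # note the PID in case you want to check on it later
```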