I know it’s a long shot, but I’m wondering if anyone has ideas on how I might recover a corrupted ZFS pool. I’m not sure what diagnostic information to provide, so I’ll post what I have and what I’ve tried so far.
I originally had a 12-drive pool arranged as a single raidz2 vdev. Due to power distribution issues (too many SATA splitters and extension cables) I would get random drive drops and server crashes. After the last crash (May 29th) the pool refused to import, so I have kept the server shut down while I built custom SATA power cables and cleaned up the data cabling. My server is running Fedora 35 and the latest stable DKMS release of OpenZFS.
I do have backups of the important data from this pool, but they’re cloud backups, so restoring them will take a while. Thought I would give recovering the pool a shot first.
First off, here’s the output from
zpool import showing the pool, including one missing device, which is a drive that has completely failed. That drive has been pulled from the server, and it is inaccessible on my test machine as well.
[[email protected] ~]# zpool import
   pool: wide
     id: 12866334539261191398
  state: FAULTED
 status: One or more devices contains corrupted data.
 action: The pool cannot be imported due to damaged devices or data.
         The pool may be active on another system, but can be imported using
         the '-f' flag.
    see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
 config:

        wide                                 FAULTED  corrupted data
          raidz2-0                           DEGRADED
            wwn-0x5000c50063bbf42a           UNAVAIL
            wwn-0x5000c50063c0fea6           ONLINE
            wwn-0x5000c50063c75cff           ONLINE
            wwn-0x5000c50063c9e54d           ONLINE
            wwn-0x5000c5006435dab6           ONLINE
            wwn-0x5000c5006435f0ba           ONLINE
            wwn-0x5000c50092937c98           ONLINE
            wwn-0x5000c50092939808           ONLINE
            wwn-0x5000c5009295a5e5           ONLINE
            wwn-0x5000c5009296f867           ONLINE
            wwn-0x5000c5009297eb04           ONLINE
            wwn-0x5000c50092989b50           ONLINE
        logs
          mirror-1                           ONLINE
            nvme-eui.0025385581b1b75e-part4  ONLINE
            nvme-eui.6479a751d0c0005a-part4  ONLINE
Next step, trying basic recovery…
[[email protected] ~]# zpool import -f -F wide
cannot import 'wide': I/O error
        Destroy and re-create the pool from a backup source.
[[email protected] ~]# zpool import -d /dev/disk/by-id -f -F wide
cannot import 'wide': I/O error
        Destroy and re-create the pool from a backup source.
[[email protected] ~]# zpool import -d /dev/disk/by-id -f -F -XN wide
cannot import 'wide': one or more devices is currently unavailable
I tried setting zfs_recover in the module parameters, but it hasn’t changed the output of the recovery steps above.
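Since a module parameter that never took effect would look the same as one that made no difference, here is a minimal sketch for confirming the live value. It assumes the Linux module exposes its parameters under /sys/module/zfs/parameters (the usual location); the helper name zfs_param and the ZFS_PARAM_DIR override are hypothetical names used only for illustration.

```shell
# Hypothetical helper: print the live value of a ZFS module parameter.
# ZFS_PARAM_DIR is an illustrative override, not something OpenZFS defines.
zfs_param() {
    param_dir="${ZFS_PARAM_DIR:-/sys/module/zfs/parameters}"
    if [ -r "$param_dir/$1" ]; then
        cat "$param_dir/$1"
    else
        echo "unavailable"
    fi
}

# Prints 1 if recovery mode is enabled, 0 if it is not,
# or "unavailable" if the module is not loaded.
zfs_param zfs_recover

# To enable at runtime (as root):
#   echo 1 > /sys/module/zfs/parameters/zfs_recover
# To persist across module reloads, a line in /etc/modprobe.d/zfs.conf:
#   options zfs zfs_recover=1
```

If this prints 0 after editing a modprobe.d file, the module was probably loaded before the option was read, and the runtime write shown in the comments may be needed instead.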
Found an article from 2011 about label corruption, and the diagnostic command
zdb -lll was recommended, so here’s my output, showing all four labels intact:
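To repeat the label check on every member disk rather than one at a time, a rough sketch follows. It assumes the pool members are the -part1 partitions of the wwn-* entries under /dev/disk/by-id, which is a guess about how the disks were partitioned; the helper name label_txgs is hypothetical. It prints the distinct txg values recorded in each disk's labels.

```shell
# Hypothetical helper: read `zdb -l` output on stdin and print each distinct
# label txg value once, in order of first appearance.
# Lines such as "create_txg:" are ignored because the first field must be "txg:".
label_txgs() {
    awk '$1 == "txg:" && !seen[$2]++ { print $2 }'
}

# Walk every wwn-* partition and show the txg its labels record.
# The -part1 suffix is an assumption; adjust it to match the actual layout.
for dev in /dev/disk/by-id/wwn-*-part1; do
    [ -e "$dev" ] || continue
    printf '%s: ' "$dev"
    zdb -l "$dev" | label_txgs | tr '\n' ' '
    echo
done
```

If some disks report a lower txg than the rest, those disks likely missed the last writes before the crash, which may help when judging how far back a rewind import would have to go.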
Here is the output from
zdb without any extra arguments: