ZFS Cannot import <pool>: I/O error

I had a raidz1 pool that I was expanding, and while it was working I forgot it was mid-expansion and powered off the system. After powering the system back on, the expansion continued but stalled, saying it would take several days. At some point during this, one of the disks entered a faulted state with too many READ and CKSUM errors. So I replaced the disk, but that replacement also stalled after around 3 days. I rebooted, which allowed it to continue with no errors; it stalled again after about a day, so I rebooted one more time and it made a lot of progress. But this morning I noticed multiple errors in the console. Unfortunately I forgot to screenshot them, but it was something with mpt2sas_cm0 IO. By the time I noticed those, more drives had faulted with a lot of data errors.

[screenshot: zpool error output showing the faulted drives]

At this point I started to think it wasn't a drive fault but an HBA or backplane issue, so I decided to connect all the drives directly to the motherboard. However, I'm unable to import the array and get the error: cannot import 'Tank': I/O error. Destroy and re-create the pool from a backup source.

Once I saw that, I started trying to import the pool by manually specifying individual disks to see if I could get it to import at all, but that didn't work. I then ran zdb -l /dev/disk/by-id/ata-<disk> on each disk to verify the labels, and everything looked good to me. At this point I'm out of ideas of where to go from here. Any direction or tips would be greatly appreciated!
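For reference, what I tried looked roughly like this (the device name below is just an example, not my actual disk):

# List pools ZFS can find, scanning the by-id device directory
zpool import -d /dev/disk/by-id

# Try importing the pool by name from that directory
zpool import -d /dev/disk/by-id Tank

# Dump the ZFS labels on one member disk to check they are intact
zdb -l /dev/disk/by-id/ata-<disk>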


Well it does tell you what to do. Is restoring from backup totally out of the question?

Otherwise searching the net for

zfs "cannot import" "I/O error" Destroy and re-create the pool from a backup source.

does seem to return quite a lot of hits. Maybe you can find something useful in those?

It's a hard lesson to learn, but at this point you should have stopped everything, mounted everything read-only, and transferred all the data you could to a backup.
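Something along these lines, assuming the pool would still import at all at that point:

# Import the pool read-only so nothing gets written to the damaged disks
zpool import -o readonly=on Tank

Then copy the data off (zfs send to another pool, or plain rsync to any other storage) before touching anything else.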

Powering off a machine during a rebuild seems like the biggest test of ZFS, and it looks like it struggles here. Unless you're willing to start deep diving into how ZFS lays its data out on disk and digging through the on-disk metadata structures, fixing this might be really difficult.

I'm sorry.

Unfortunately no, this all happened right after creating the pool.

I did try a couple things I found while searching around, but then I started getting worried about causing more harm than good so I decided to make this post just in case.

When I shut it down by mistake there was no indication that anything had gone wrong which is why I let it continue, and when the one drive started showing errors I figured it was just because it was an older used drive.

If there's nothing else I can do I might try this since there's not much to lose.

I would try asking in the openzfs chat room (#openzfs on Libera.chat). According to their newcomers wiki, there are some Slack channels and mailing lists too.

You can also try reposting this question on ServerFault and the Unix&Linux Stack Exchange. You're likely to get more responses from their broader audience.

Good luck!

I'd take the pool offline and run some SMART tests on the drives. Something doesn't add up.

I have had situations like this where a drive was dying or near death and ZFS would stop using it. However, I was able to rescue the pool by using ddrescue to 1:1 clone the malfunctioning drive onto a fresh drive; then zpool import worked as expected.
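A minimal sketch of that clone, assuming the pool is exported and sdX/sdY are placeholders for the failing disk and the replacement:

# First pass: copy everything readable, skip bad areas, record progress in a map file
ddrescue -f -n /dev/sdX /dev/sdY rescue.map

# Second pass: go back and retry only the bad sectors a few times
ddrescue -f -r3 /dev/sdX /dev/sdY rescue.map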

It seems like one of your drives has faulted and maybe another drive is throwing errors, so you may have to run ddrescue multiple times onto fresh/known-good drives.

With the pool offline you can rely on SMART for the drives to self-test. If that passes, then that might be telling you your interface, cabling, or HBA (or motherboard) is the part that is malfunctioning.
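For example, with smartctl (device name is a placeholder; repeat for each member disk):

# Kick off an extended (long) self-test on the disk
smartctl -t long /dev/sdX

# After the test finishes, review the result and the error/reallocation counters
smartctl -a /dev/sdX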

A lot of the time backplanes are not designed for SATA signaling, and motherboard SATA + backplane is asking for trouble, whereas HBA + backplane is totally okay. When in one of these dicey situations I always use known-good SATA 6 Gb/s rated direct-to-motherboard cables.