This morning, my 1-vdev, 1-drive ZFS pool on an EXTERNAL USB hard drive, the one I was working on last night, failed to get auto-imported. It is a zstd-10 compressed and aes-256-gcm encrypted pool. I got a message about CORRUPTED metadata (I don't remember the specific code), and the suggested action was to DESTROY the pool and restore from backup. It didn't give a hint of hope. This is a non-mirrored drive.
You can imagine the panic. After calming down a bit, I looked the error up, and on the suggestion of a Server Fault thread I tried:
zpool import [pool] -F -X
After roughly the time it would take to scrub the pool (single vdev, single disk), the pool was imported, and then I realized that ZFS was also SCRUBBING the disk, with a start timestamp of yesterday evening!
The scrub has now finished with 0B repaired and 0 errors found. Everything seems perfectly fine.
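Side note, in hindsight: if I'm reading the zpool-import man page right, -F attempts recovery by discarding the last few transactions, -X allows a more extreme rewind, and adding -n to -F only reports whether recovery would succeed without actually doing it. So a more cautious sequence would have been something like this (with [pool] as the placeholder for the pool name):

zpool import -F -n [pool]    # dry run: report whether recovery is possible, change nothing
zpool import -F [pool]       # actual recovery, discarding the last few transactions
zpool import -F -X [pool]    # extreme rewind, only as a last resort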
So all of this leads me to some very serious (for me) questions:
- Can I be sure that my drive is 100% back to its original state, since 0 bytes were found corrupt and no errors were reported by the latest scrub? (The checks I've been relying on are after this list.)
- Is it possible that this corruption happened because a scrub was initiated yesterday evening (the scrub start timestamp seems plausible, as I was working on the PC at that time) and was then interrupted by shutting down? (This is a USB drive we're talking about, so that adds more possible error vectors.)
- I did not (unless I'm losing my mind) initiate last night's scrub. Can I check what did? I don't know which output or logs to look at. (Some things I've been poking at are also after this list.)
- Why on earth would ZFS suggest total pool destruction, when (apparently) the simplest of actions returned the pool to 100% health? The encryption key loads fine, the dataset mounts normally, and I can see yesterday's work. If something had gone wrong, surely it wouldn't let me open ANY files, right (again, this is an encrypted and compressed dataset, at the pool level)?
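For the first question, this is what I have been staring at (a sketch; [pool] is just a placeholder for my pool name):

zpool status -v [pool]    # scrub summary plus per-vdev READ/WRITE/CKSUM counters; -v lists any files with permanent errors
zpool get health [pool]   # should report ONLINE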
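And for the third question, this is where I have been looking for whatever kicked off last night's scrub (the cron path is my guess based on Debian's zfsutils-linux packaging, which IIRC schedules a monthly scrub; I'm not certain that's what fired here):

zpool history [pool]                  # pool command log; a 'zpool scrub' entry should show when it was issued
zpool events -v                       # recent ZFS event log, including scrub start/finish
cat /etc/cron.d/zfsutils-linux        # Debian's periodic scrub cron job, if present
systemctl list-timers | grep -i zfs   # in case a systemd timer is doing it instead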
I am willing to read and learn if you can point me to documentation.
I wish @wendell would shed some light on this; I was in awe of his video, and he surely has input here.
I am running Debian testing and ZFS on Linux (OpenZFS) 2.0. Please spare me the scolding about my setup; I know it is well-meaning, but this was supposed to be temporary anyway…
zfs-2.0.3-1
zfs-kmod-2.0.3-1
UPDATE May 15th:
I have verified through external means that no data appears to be corrupted (checksumming approx. 1 TB of data from the backup and verifying it against the zpool). This is to be expected since ZFS reports no errors, but it's a nice reassurance to have.
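For the curious, the check was roughly along these lines (the paths are placeholders for my backup copy and the mounted dataset):

cd /path/to/backup && find . -type f -exec sha256sum {} + > /tmp/backup.sha256
cd /mnt/[pool]/dataset && sha256sum -c --quiet /tmp/backup.sha256    # prints only mismatches and failures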