[SOLVED] ZFS mirrors, do they have checksums like RAID-Z?

Do ZFS mirrors have checksums? I know you can get checksum errors, so I figure they do, but how is it different from RAID-Z?

I have a bunch of 2-drive mirrors in my pool, and I’ve been trying to figure out why everyone recommends mirrors over RAID-Z when RAID-Z2 has far better fault-tolerance and two sets of parity bits. Forget about resilver timing and shrink-grow features, I wanna focus on checksums specifically.

With a 2-drive mirror, if you lose a drive, how do you know the data’s valid until the resilver is done? And if it’s invalid because you lost a drive, those files won’t be recoverable, right?

I’m thinking about moving to 3-drive mirrors even for my modern SSDs (which never have issues) only because I truly don’t understand all the specifics of ZFS checksumming.

I have 1 local and 1 offsite backup of my main zpool, but I want to confirm what kind of risk I’m taking with 2-drive mirrors vs 3-drive mirrors in terms of data integrity (not redundancy).

All ZFS configurations have checksums, even single-disk pools. The checksumming is independent of the parity/mirroring and is just part of the file system metadata. The checksum lets the file system tell whether the block it’s reading is the same as it was when it was written, and if there is parity data or a mirror, it can fix an error it detects. If there isn’t any parity data or mirror, because it’s a single disk or because of too many disk failures, it will be able to tell you that there is an error but won’t be able to repair it.
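To make the detect-versus-repair distinction concrete, here is a minimal Python sketch. It is a toy model, not ZFS code: `checksum`, `read_block`, and the sample data are made-up names, and SHA-256 just stands in for whatever checksum the pool is configured with.

```python
import hashlib

def checksum(data: bytes) -> bytes:
    # Stand-in for the pool's checksum algorithm (an assumption for this sketch).
    return hashlib.sha256(data).digest()

def read_block(copies: list[bytearray], expected: bytes) -> bytes:
    """Return a copy that matches the stored checksum, healing bad copies from it."""
    good = next((bytes(c) for c in copies if checksum(bytes(c)) == expected), None)
    if good is None:
        # Corruption is detected, but there is nothing valid to repair from.
        raise OSError("checksum error: no valid copy available")
    for c in copies:
        if bytes(c) != good:
            c[:] = good  # overwrite the bad copy with the good one
    return good

data = b"hello zfs"
expected = checksum(data)  # stored with the metadata when the block was written

# Two-way mirror with one corrupted side: detected and repaired.
mirror = [bytearray(b"hellp zfs"), bytearray(data)]
repaired = read_block(mirror, expected)

# Single disk with the same corruption: detected, but not repairable.
single = [bytearray(b"hellp zfs")]
try:
    read_block(single, expected)
    single_error = None
except OSError as exc:
    single_error = str(exc)
```

In both cases the checksum catches the bad block; only the mirror has a second copy to fix it from.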

7 Likes

Short version of above. Checksums (actually sha256 hashes) are per block and have nothing to do with the redundancy level.

They just can’t be used to repair data without redundancy, but they still tell you if data is corrupt, even on a single disk.

Edit: yes, they use some disk capacity even on a single disk, but not much. But this is how ZFS detects errors and how it can reliably read at up to 2x speed on a two-drive mirror: it doesn’t need to cross-check both copies unless a checksum doesn’t add up.

2 Likes

Minor nit: the checksum algorithm is configurable, and sha256 is one option. There are situations where you might prefer a checksum over a hash because the guarantees are different (checksums guarantee being unique for a checksum size worth of data and can always detect a single bit-flip; hashes only offer a (vanishingly small) statistical probability).
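As a rough illustration of the single bit-flip point, here is a Python sketch of a Fletcher-style checksum with four running sums over 32-bit words. It is modeled on my understanding of how fletcher4 works, not taken from the ZFS source, and `fletcher4` and `flip_bit` are names I made up for the example.

```python
import struct

MASK64 = (1 << 64) - 1  # keep each accumulator at 64 bits

def fletcher4(data: bytes) -> tuple[int, int, int, int]:
    """Fletcher-style checksum: four cascaded sums over little-endian 32-bit words."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    a = b = c = d = 0
    for (word,) in struct.iter_unpack("<I", data):
        a = (a + word) & MASK64
        b = (b + a) & MASK64
        c = (c + b) & MASK64
        d = (d + c) & MASK64
    return a, b, c, d

def flip_bit(data: bytes, bit: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    out = bytearray(data)
    out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)

block = bytes(range(64))
baseline = fletcher4(block)

# Try every possible single bit-flip and record any the checksum fails to notice.
missed = [
    bit for bit in range(len(block) * 8)
    if fletcher4(flip_bit(block, bit)) == baseline
]
```

A single flipped bit changes one word by a power of two below 2^32, so the first sum always changes and `missed` stays empty. That is the structural guarantee being described, as opposed to a hash's statistical one.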

2 Likes

This tells me a lot. It says I don’t need 3-drive mirrors at all.

If data corruption is the only issue, I can copy that back from either of 2 backups if it occurs, and those two backups also have snapshots, so the good data should be there.

That brings up a different question, so I started a new thread to ask how I handle a situation where a single-disk checksum fails:

Resilvering is per block: each block is checked against its checksum and repaired if it is bad. It is not like traditional RAID.

So if an individual file can’t be recovered, you lose that file, not the whole filesystem. If the bit error on your recovery drive is in a different block from the bit error on your otherwise failed drive, each drive should hold a good copy of the other’s broken data, so both errors can be repaired.

Again, a ZFS resilver is very different from a regular RAID rebuild. If there’s only one bit error, it may only copy over/repair one single block.
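The case where each side of a mirror has a bad block in a different place can be sketched in Python like this. It is a toy model with made-up names (`scrub`, `side_a`, `side_b`), using SHA-256 as a stand-in checksum; real ZFS does this inside the pool, not in user code.

```python
import hashlib

def digest(block: bytes) -> bytes:
    # Stand-in checksum for this sketch.
    return hashlib.sha256(block).digest()

def scrub(side_a: list[bytes], side_b: list[bytes], sums: list[bytes]) -> list[int]:
    """Repair both mirror sides in place; return indexes of unrecoverable blocks."""
    lost = []
    for i, expected in enumerate(sums):
        a_ok = digest(side_a[i]) == expected
        b_ok = digest(side_b[i]) == expected
        if a_ok and not b_ok:
            side_b[i] = side_a[i]      # heal B from A, this block only
        elif b_ok and not a_ok:
            side_a[i] = side_b[i]      # heal A from B, this block only
        elif not a_ok and not b_ok:
            lost.append(i)             # both copies bad: only this block is lost
    return lost

blocks = [b"block-0", b"block-1", b"block-2", b"block-3"]
sums = [digest(b) for b in blocks]

# Each side is damaged in a different block.
side_a = list(blocks)
side_a[1] = b"garbage"
side_b = list(blocks)
side_b[3] = b"garbage"

lost = scrub(side_a, side_b, sums)
```

Here `lost` comes back empty and both sides end up identical to the original blocks. Only if the same block index were bad on both sides would that one block be reported as lost.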

It’s not per file though, right?

If a block is bad, and your block size is 128K, then you’re likely to have multiple files in that one block. I do coding, and a lot of my files are under 1K.

Is there a way to know which file(s) were damaged by those corrupt blocks?

If a file is smaller than the record size then only the size required is used. A record doesn’t contain data for multiple files.

Anyway, I’m pretty sure that running zpool status -v will list any corrupted files or other errors.
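If you want to script that, here is a small Python sketch that runs `zpool status -v` and pulls out the listed paths. The parsing relies on the "Permanent errors have been detected in the following files:" line that the command prints when it has files to report; the `SAMPLE` text is hand-written for illustration, not real output, and the function names are my own.

```python
import subprocess

MARKER = "Permanent errors have been detected in the following files:"

def parse_damaged_files(status_text: str) -> list[str]:
    """Return the paths listed after the permanent-errors marker, if any."""
    lines = status_text.splitlines()
    for i, line in enumerate(lines):
        if MARKER in line:
            # Everything non-blank after the marker is treated as a damaged path.
            return [l.strip() for l in lines[i + 1:] if l.strip()]
    return []

def damaged_files(pool: str) -> list[str]:
    """Run `zpool status -v <pool>` and parse its output (needs ZFS installed)."""
    result = subprocess.run(
        ["zpool", "status", "-v", pool],
        capture_output=True, text=True, check=True,
    )
    return parse_damaged_files(result.stdout)

# Hand-written sample for illustration only.
SAMPLE = """\
  pool: tank
 state: ONLINE
errors: Permanent errors have been detected in the following files:

        /tank/code/main.py
        /tank/code/util.py
"""

found = parse_damaged_files(SAMPLE)
clean = parse_damaged_files("errors: No known data errors\n")
```

Passing a single pool name keeps the output to one pool, so the lines after the marker are only that pool's files.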

2 Likes

Depends how big your files are, yes, but only the block is lost. If that includes multiple files, multiple files will be lost, otherwise…

I believe ZFS will refuse to return corrupt data so the files will be known corrupt because they won’t be returned.
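If that is right, you could find the affected files just by trying to read everything. Below is a Python sketch that walks a directory and collects files whose reads fail with an I/O error. It assumes the failure surfaces to applications as `EIO`, which is my assumption rather than something confirmed in this thread, and `unreadable_files` is a made-up helper name.

```python
import errno
import os

def unreadable_files(root: str) -> list[str]:
    """Read every file under root; return paths whose reads fail with EIO."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    # Read in 1 MiB chunks so every block of the file is touched.
                    while f.read(1 << 20):
                        pass
            except OSError as exc:
                if exc.errno == errno.EIO:
                    bad.append(path)   # assumed signature of an unrecoverable block
                else:
                    raise              # permissions etc. are a different problem
    return bad
```

Calling `unreadable_files("/tank/code")` (path is just an example) would return an empty list on a healthy dataset. A scrub plus `zpool status -v` is still the more direct way to get the same answer.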

1 Like