I'm looking for a way to monitor btrfs for checksum errors and run a script when they're detected. I know that I can periodically check the syslog, but I'm hoping for a way to do it in real time. ZFS has a daemon for this, and I was wondering if btrfs has something similar or if there is a third-party program which can do this.
You don't need that in btrfs, it has several automated mechanisms to prevent errors.
In practice, btrfs does the following to prevent bitrot:
- per-block checksums
- atomic COW snapshots
- asynchronous incremental replication
- self-healing RAID
Just like ZFS, it does all this out of the box.
Edit: of course, you would typically enhance the data reliability of a system by making a snapper profile for your data partition. You normally would have one for your root partition, and by default that covers everything on the system except, for instance, /var/log, since a snapshot revert would reset your system logs and you wouldn't be able to diagnose the event that led to you having to revert.
Although state of the art, btrfs is not a magical file system that can never fail and it's not guaranteed to recover everything. Add to that potential problems from the hardware and power loss and you'll see a need for such a monitoring and notification tool.
ZFS is just as good as btrfs and they provide such a tool.
sudo btrfs device stats <device>
no separate daemon, default functionality, btrfs is made for lower overhead than ZFS
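For anyone who wants to act on those counters from a script, here is a minimal sketch. It assumes that `btrfs device stats` prints one counter per line with the value as the second whitespace-separated field (for example `[/dev/sda].corruption_errs 0`); `nonzero_stats` is just a name made up for illustration.

```shell
#!/bin/sh
# Sketch only: filter `btrfs device stats` output down to non-zero counters.
# Assumes lines of the form "[/dev/sdX].<counter_name> <value>".

# Print every counter whose value is not zero.
# Exit status is 0 if at least one was found, 1 otherwise.
nonzero_stats() {
    awk '$2 + 0 != 0 { print; found = 1 } END { exit !found }'
}

# Usage (the mount point is a placeholder):
#   btrfs device stats /mnt/pool | nonzero_stats && echo "errors recorded"
```

This still has to be run on a schedule, so it is polling rather than a real-time notification.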
What I'm looking for is a way to run a script when a checksum error is detected. So I take it there is no daemon that can be used for this then?
Btrfs is fully integrated in modern system logging; a btrfs checksum error just pops up in dmesg. It just says btrfs error: csum failed something something (where on the device the error occurred).
Therefore you can get notifications from that. Just run a script that will grep the csum error in dmesg and if there is one, notify or mail it to you. You can do that with system notifications or with a cron script.
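If you want it closer to real time than a cron job, you can follow the kernel log instead of polling it. A rough sketch, with some assumptions: the exact wording of the kernel message varies between kernel versions, so the pattern below (a line containing "btrfs" in any case followed by "csum failed") may need adjusting, and `HANDLER` points at a hypothetical script path that you would replace with your own.

```shell
#!/bin/sh
# Sketch only: run a handler for every btrfs checksum error line in a log stream.
# HANDLER is a hypothetical path; point it at your own notification script.
HANDLER="${HANDLER:-/usr/local/bin/on-btrfs-csum-error}"

# Read log lines on stdin and call the handler once per matching line,
# passing the full line as its first argument.
watch_csum_errors() {
    while IFS= read -r line; do
        case "$line" in
            *[Bb][Tt][Rr][Ff][Ss]*"csum failed"*)
                # Detach stdin so the handler cannot swallow the log stream.
                "$HANDLER" "$line" </dev/null
                ;;
        esac
    done
}

# Usage (needs permission to read the kernel log):
#   dmesg --follow | watch_csum_errors
```

On systemd machines `journalctl -k -f` can feed the function in place of `dmesg --follow`. Either way this is just a pipe kept running in the background (from a service unit, for example), not a dedicated btrfs daemon.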
Yeah that's what I was thinking
There is a ready-made script for this available on one of the Ubuntu mailing lists.
RPM distros have a system notification option by filter in the GUI
I wrote this... I'm a total newbie, but I wanted the same thing. Don't laugh at my bash skills :)
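#!/bin/sh
# Save the current btrfs error counters and compare them with the last saved copy.
btrfs device stats /mnt/6fc928da-8b4c-428c-a973-33084b377b9d > /var/tmp/poolcurrent.log
if diff /var/tmp/poolstats.log /var/tmp/poolcurrent.log >/dev/null 2>&1 ; then
    # Nothing changed since the last run: discard the new copy.
    rm /var/tmp/poolcurrent.log
else
    # Counters changed (or this is the first run): keep the new copy as the baseline and mail it.
    mv /var/tmp/poolcurrent.log /var/tmp/poolstats.log
    mail -s "Pool errors" [email protected] < /var/tmp/poolstats.log
fi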
Change the mount point and the email address to your system's drive pool and email. I have this run in a cron job.