ZFS without UPS?

I’m planning on replacing my NAS (which is used purely for backup storage) with a Linux box, and was thinking of using ZFS because the backup software can do a few extra tricks with it. It would be 2 mirrored drives (mainly because the machine I’m thinking of using is a homemade rack case built inside an old switch case, and there’s only room for 2 drives :joy:).

But what I’m curious about is how well ZFS handles a power failure.
I’ve seen warnings in the past about needing a UPS because ZFS can be corrupted by power loss. But my Minecraft server has lost power many times in the last year or so it’s been running, and it’s on ZFS.

I’m OK with losing unwritten data since it’s only backups (and the software will restart the current backup if it loses power). What I’m mainly concerned about is the entire pool being corrupted or destroyed… Is that a common issue after a power failure, or just another ZFS bogeyman-type warning?

ZFS is very tolerant of power failures. My workstation (with a 64 TB disk shelf running ZFS attached) was fairly unstable a couple of months ago, and I regularly had to pull the plug on the system.

I never had the array come up damaged.

ZFS is designed specifically to be capable of handling power failures and disks dropping out of the array and whatnot. That’s part of the beauty of doing copy on write and using transactional writes via the intent log.
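To make the copy-on-write point concrete, here is a toy sketch in Python. It is not how ZFS is implemented; `ToyCowStore` and the `crash_before_commit` flag are made up for illustration. The only idea it shows is that new data goes to fresh blocks and the change becomes visible through a single pointer switch, so an interruption before that switch leaves the old state intact.

```python
class ToyCowStore:
    """Toy copy-on-write store (hypothetical): blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = {}    # block id -> contents, written once and never modified
        self.next_id = 0
        self.root = None    # id of the index block that is currently "live"

    def _alloc(self, data):
        # Always write to a fresh block instead of touching an existing one.
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = data
        return block_id

    def read_all(self):
        # Readers only ever follow the live root pointer.
        if self.root is None:
            return {}
        index = self.blocks[self.root]
        return {name: self.blocks[bid] for name, bid in index.items()}

    def write(self, name, data, crash_before_commit=False):
        # 1. Write the new data to a new block.
        data_id = self._alloc(data)
        # 2. Write a new index block that points at the new data.
        old_index = self.blocks[self.root] if self.root is not None else {}
        new_index = dict(old_index)
        new_index[name] = data_id
        index_id = self._alloc(new_index)
        # 3. Simulated power loss: the new blocks exist but nothing references them.
        if crash_before_commit:
            return False
        # 4. Commit: one pointer switch makes the new state live.
        self.root = index_id
        return True


store = ToyCowStore()
store.write("backup.tar", b"version 1")
store.write("backup.tar", b"version 2", crash_before_commit=True)
print(store.read_all())  # still the old, consistent contents
```

The interrupted write is simply lost (in-flight data), but what was committed earlier is untouched.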

I have no idea who told you that ZFS is any more sensitive than any other array software, but if anything, ZFS is more stable than other systems.

In my opinion, ZFS is the most reliable filesystem you can get your hands on without buying a multi-million dollar SAN. And some SANs even use ZFS.

I’ve had several power failures on my home setup, which runs a 32 TB array. I always make sure to run a scrub after an outage, and there has never been an issue.

A scrub doesn’t do anything ZFS won’t do anyway when it encounters an error; it simply makes ZFS aware of any errors earlier on.
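If you want to script the scrub-after-outage habit, a rough Python sketch could look like the one below. Assumptions: the `zpool` command is on the PATH, the script runs with enough privileges, and the pool name `tank` is just a placeholder. The helper names are made up. `zpool scrub` returns right away while the scrub runs in the background, so the health check is only meaningful once the scrub has finished.

```python
import subprocess


def is_healthy(status_x_output: str) -> bool:
    """Interpret the output of `zpool status -x` (optionally with a pool name).

    When nothing is wrong, it prints either "all pools are healthy" or
    "pool '<name>' is healthy". Anything else is treated as needing a look.
    """
    text = status_x_output.strip()
    return text == "all pools are healthy" or text.endswith("is healthy")


def start_scrub(pool: str) -> None:
    # Kicks off a scrub; the command itself returns immediately.
    subprocess.run(["zpool", "scrub", pool], check=True)


def check_pool(pool: str) -> bool:
    # Asks ZFS for the short health summary of one pool.
    result = subprocess.run(
        ["zpool", "status", "-x", pool],
        capture_output=True,
        text=True,
        check=True,
    )
    return is_healthy(result.stdout)


# Example use after an unclean shutdown (placeholder pool name):
#   start_scrub("tank")
#   ... later, once the scrub has completed ...
#   print("healthy" if check_pool("tank") else "check `zpool status -v tank`")
```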

It would report them if there were errors though, no?

I am running FreeNAS and have lost power many times, and it has come up just fine. I have one system that became unstable, and I pulled the plug on it many times while troubleshooting with no trouble.
ZFS seems to be very fault tolerant.

Yes, it would. ZFS reports any errors it encounters at any time.

One of the reasons ZFS is so slow is that it validates data before returning it: “return good data or none at all.”
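The “good data or none at all” idea can be sketched in a few lines of Python. This is a toy model of a two-way mirror, not ZFS code: the function names are made up, and SHA-256 is only a convenient stand-in for whatever checksum a real pool is configured with.

```python
import hashlib


class ChecksumError(Exception):
    """Raised when no copy of a block matches its stored checksum."""


def store_block(data: bytes):
    # Keep the checksum separately from the data, plus two copies (the "mirror").
    checksum = hashlib.sha256(data).digest()
    return checksum, [bytearray(data), bytearray(data)]


def read_block(checksum: bytes, copies):
    # Try each copy in turn and return the first one that verifies.
    for copy in copies:
        if hashlib.sha256(bytes(copy)).digest() == checksum:
            # Repair any copy that differs from the verified one.
            for other in copies:
                if other is not copy and bytes(other) != bytes(copy):
                    other[:] = copy
            return bytes(copy)
    # Nothing verified: refuse to return data rather than return garbage.
    raise ChecksumError("no copy matches the stored checksum")


checksum, copies = store_block(b"backup data")
copies[0][0] ^= 0xFF                  # simulate silent corruption on one disk
print(read_block(checksum, copies))   # the good copy is returned
print(bytes(copies[0]))               # and the bad copy has been repaired
```

With only one damaged copy the read succeeds and the mirror is fixed up; if both copies are damaged, the read fails loudly instead of handing back bad data.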

I’ve also never had an issue with power loss, unlike with other filesystems. I have a UPS, but I never fixed a timing issue in the halt sequence.
At least 10 cuts in 2 years.

I use my UPS for 2 things. Power filter, and enough time to berate the GF for using the hair dryer and flat iron at the same time when the breaker trips. (bathroom is on the same circuit as my office :/)

It won’t save me from anything more serious than that. I get about 12 minutes of runtime.

Same here (aside from the GF :sweat_smile:). I bought a high starting load breaker for the circuit because I couldn’t pull it up.

For something that’s not a power-hungry lab, I would recommend a small desk-size UPS for power filtering, flicker, and storm-fry prevention. You really don’t need much with ZFS. Just avoid MD RAID :slight_smile:

I have really horrible power that blows breakers (known building issue), and using a UPS blows my breaker instantly, so needless to say, my array goes off regularly. On power loss I’ve never lost any data that wasn’t in transit. I have since added a SLOG with power-loss protection to deal with it, though I didn’t have any detectable issues before. I don’t recommend pulling the plug, but my ZFS array has outlasted every other filesystem on this power grid.

OK, this all sounds more promising. I have had a box running ZFS for about 12 months and haven’t had any issues the 3 or 4 times it’s gone down uncleanly, but wondered if I just got lucky.

I do plan on having a UPS because we often get those pain in the arse cuts to the power that are just long enough to reset every electronic device in the house, but short enough that the UPS on the server doesn’t even start beeping. I just wondered how safe it will be in 2-3 years when the batteries have turned useless and it doesn’t run more than a few minutes, and the thing doesn’t shut down in time.

I’d honestly really recommend making a test array on drives you don’t care about and seeing how much you can abuse it before it dies. My test array survived hundreds of power pulls, literally pulling RAM while the system was live, etc. I got bored before any data at rest was harmed. Obviously in-flight data was lost, but absolutely 0 corruption. No ECC on my test box either. We all stress good practice, but ZFS can take a real beating and be fine.
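If you do try this, one way to tell whether data at rest survived is to write files with known checksums before the abuse and compare afterwards. Here is a rough Python sketch; the function names and file naming are made up, and the manifest should be kept somewhere other than the test array (for example dumped to JSON on another machine).

```python
import hashlib
import os
from pathlib import Path


def write_test_files(directory, count=10, size=1024 * 1024):
    """Write random files and return a manifest of {file name: SHA-256 hex digest}."""
    directory = Path(directory)
    manifest = {}
    for i in range(count):
        data = os.urandom(size)
        path = directory / f"test_{i:04d}.bin"
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to disk before moving on
        manifest[path.name] = hashlib.sha256(data).hexdigest()
    return manifest


def verify_test_files(directory, manifest):
    """Return the names of files that are missing or no longer match the manifest."""
    directory = Path(directory)
    bad = []
    for name, expected in manifest.items():
        path = directory / name
        if not path.exists():
            bad.append(name)
        elif hashlib.sha256(path.read_bytes()).hexdigest() != expected:
            bad.append(name)
    return bad
```

Run `write_test_files` on the test pool's mountpoint, pull the plug as often as you like, then run `verify_test_files` with the saved manifest after each reboot. An empty list means everything that had been fully written is still intact.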

My ZFS box at home has never been on UPS and has not lost anything in 9 years.

That includes surviving a couple of drive hardware failures.

As per @SgtAwesomesauce, because of the way ZFS operates (copy-on-write with an intent log), a power loss may result in in-flight data being lost, but anything that was persisted to disk will be just fine.

So in theory that means the only data at risk will be open files that had not been completely written at the application level by the end-user application. It would be the same as if you had an application crash due to a bug.
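Applications can protect themselves against that half-written-file case with the usual write-to-temp-then-rename pattern. A minimal Python sketch, assuming a POSIX system such as the Linux box discussed here (`atomic_write` is a made-up helper name):

```python
import os
import tempfile


def atomic_write(path, data: bytes) -> None:
    """Replace `path` with `data` so readers see the old file or the new one, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())      # push the file contents to disk first
        os.replace(tmp_path, path)    # then swap it into place in one step
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory too, so the rename itself is recorded on disk (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

If power is lost partway through, you are left with the previous version of the file (plus possibly a stray temp file), which is the same "in-flight data is lost, persisted data is fine" behaviour described above, just at the application level.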

The filesystem itself is robust.