Powering down HDD in ZFS RaidZ2

I am currently building a RaidZ2 pool on Ubuntu for scientific data storage for my PhD in physics and some VM use (cf. this thread (link)).
There are 10x 4TB WD Red (CMR) drives in the RaidZ2 with dedup and lz4 compression, as well as a WD Black NVMe for L2ARC.
The HDDs are connected via an HPE 240 in HBA mode.

At the moment I use it primarily for scientific data storage, aside from the occasional photo and video storage/editing and VM use.
Later I will also use it as a NAS, but right now I access the data only about once a day, mainly when storing results of number-crunching and loading new data.

I have come to the conclusion that it would be best to power down the HDDs (even after factoring in spin-up wear etc.). So my questions are:

What is the best practice for powering down the spinning rust in the RaidZ2 automatically?
Any major caveats?

How to implement that in Ubuntu 20.04?

Thanks a lot for your input! :grinning: :+1:

I use `sudo systemctl poweroff` to shut down my raidz array.

That powers off the whole system, not just the array.

If the OP’s hardware supports AHCI and hotplug, then he can just yank the drive after issuing the relevant ZFS commands to offline the degraded disk.

If hotplug is not supported, then it won’t be possible to yank drives while the system is powered on.
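The “relevant ZFS commands” part is basically offlining the disk first so ZFS stops expecting it; a minimal sketch, assuming a pool named tank and the dying disk at /dev/sdd:

```bash
# tell ZFS to stop using the disk before physically pulling it
sudo zpool offline tank /dev/sdd

# the pool should now show DEGRADED with that disk marked OFFLINE
zpool status tank
```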

My sarcastic response, I guess, was indicating that there’s no benefit to spinning down an array while leaving the system operational.

Constructively, if you stop the array, the eject command should power off the drive(s) … Not sure how you get them restarted?
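An alternative to eject for the power-off part is putting the drives into ATA standby with hdparm; a drive in standby spins back up on its own at the next read or write, so there is no explicit restart step. A rough sketch, assuming the disks show up as /dev/sd[a-j] and the controller passes ATA commands through:

```bash
# spin the drives down right now (standby); any later I/O wakes them again
for d in /dev/sd[a-j]; do
    sudo hdparm -y "$d"
done

# or let each drive drop into standby by itself after ~20 minutes idle
for d in /dev/sd[a-j]; do
    sudo hdparm -S 240 "$d"
done
```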

There is a benefit: being able to replace a drive without powering off a system. This is very important in the world of servers where you can’t just shut stuff off all the time.

Make sure your controller and physical mounting system support hot swap. If a failed drive is screwed into a chassis and plugged into cables, I’m shutting down.

A hot or cold spare drive will carry the torch until a maintenance window can be scheduled.
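If you go the spare route, adding one is a one-liner; a sketch with a hypothetical pool tank and a spare disk at /dev/sdk:

```bash
# attach a hot spare that can take over when a member disk faults
sudo zpool add tank spare /dev/sdk

# check that the spare is listed and shows as AVAIL
zpool status tank
```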

Anything that’s too important to ever shut down is designed incorrectly.

Best practice I’d suggest is “don’t do that!”.

The caveats being that

  • maybe ZFS or a RAID controller sees more drives disappear than the number of disk failures tolerated, and fails the entire pool due to insufficient replicas.
  • if any of your drives dies on power-up (which is a common time for disks to die), is it marked failed in ZFS because it was powered off, or is it really dead? Who knows?

However, if you insist on making trouble for yourself… before powering the drives off I’d be unmounting the pool, and maybe even exporting it, to make sure that ZFS isn’t expecting the drives to be up and running.
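If you do go down that road, something like this sequence is what I’d start from (a sketch only, assuming a pool named tank, disks at /dev/sd[a-j], and an HBA that passes ATA commands through):

```bash
# export the pool so ZFS stops issuing I/O to the disks
sudo zpool export tank

# put the now-idle drives into standby
for d in /dev/sd[a-j]; do
    sudo hdparm -y "$d"
done

# when the data is needed again, re-import; the drives spin up as they are read
sudo zpool import tank
```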

But that’s a pain in the rear. I’d suggest “don’t turn drives off”, because you’ll just create problems for yourself, whether you screw something up or hit an edge-case bug in ZFS. ZFS isn’t intended for this sort of thing; I suspect very few people are using it that way, and thus you’re putting yourself in a “not well supported” position here.

Powering a single failed drive off to replace it is different…


Thank you all for your answers!

unmounting the pool before putting the spinning rust to sleep seems reasonable :+1: :nerd_face:

I am not that well versed in ZFS yet. All I know is that exporting acts like “unmounting + removing it from some quick-access list” (I am paraphrasing and oversimplifying, of course).
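From what I have gathered so far, the two map roughly onto these commands (happy to be corrected; the pool and dataset names are just placeholders):

```bash
# unmount a single dataset; the pool itself stays imported and active
sudo zfs unmount tank/data

# export the whole pool: unmounts everything and removes it from the
# system’s list of active pools until it is imported again
sudo zpool export tank
sudo zpool import tank
```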

If someone could explain the difference between unmounting and exporting in a bit more detail, I would be happy. :grinning:

Is there an auto-unmount-if-idle feature for ZFS, or has anyone written a cron script for that?
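Something along these lines is what I have in mind (a completely untested sketch; pool name, mountpoint and disk paths are placeholders):

```bash
#!/bin/bash
# idle-spindown.sh: run from cron, exports the pool and spins the disks
# down if nothing currently has files open on it

POOL="tank"
MOUNTPOINT="/tank"
DISKS=(/dev/sd[a-j])

# bail out if anything still has files open on the pool
if lsof +D "$MOUNTPOINT" >/dev/null 2>&1; then
    exit 0
fi

# export the pool so ZFS stops issuing I/O to the disks
zpool export "$POOL" || exit 1

# put the drives into standby; re-import later with: zpool import tank
for d in "${DISKS[@]}"; do
    hdparm -y "$d"
done
```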

Thanks a lot!

Your drives may or may not be built for constant spin-up and spin-down. NAS and enterprise drives usually are, but in my experience they don’t take much of a life hit by staying powered on. It’s safe to assume that you are using drives of a proper class, but even then, powering down and up also causes wear, so it’s a tradeoff and not really a gain if they are spinning up once a day, every single day. I’ve had arrays running for 5+ years with no spin-down, and I have actually seen more drives fail when they were allowed to spin down (single disk), though that’s personal data, not scientific.

EDIT: You stated WD Red CMR; those can take the additional spin-up/spin-down cycles, but they are also built to spin all day in a RAID environment, assuming they are kept at safe temperatures.
