[TrueNAS / ZFS] Replacing a disk in an encrypted pool

Hey there

So I have my TrueNAS setup with 2 pools (VM storage and general storage) and everything is working perfectly so far. Both my pools are encrypted with keyfiles and get unlocked automatically on boot.

I was reading about the encryption when I stumbled over a thread in the TrueNAS forums about the procedure for replacing a disk in an encrypted pool, which led me down a rabbit hole with nothing at the end but confusion.

Apparently encryption comes with some extra steps for certain pool operations. I didn’t know about this and it worries me, as many people seem to lose data when they get those steps wrong.

So I started looking into replacing a failed disk in an encrypted pool, as that is something I will likely have to do at some point. There are a handful of threads discussing this, and they all point to different answers with no clear solution. Also, all of these threads are rather old and talk about FreeNAS, which used GELI for encryption, not the native ZFS encryption added with TrueNAS 12.

Can anybody explain to me what the correct and full procedure for replacing a disk in a ZFS encrypted pool on TrueNAS 12+ is? Is there anything related to encryption that I have to watch out for so that I don’t lose my pool?

As far as I can tell, the process should be rather simple:

  • Open the pool status
  • Set the disk offline
  • Shut down the server
  • Replace the failed disk with the new one
  • Reboot
  • Use the replace function on the offline disk
  • Select the newly installed disk
  • Wait for ZFS to finish resilvering
  • Done
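For reference, here is roughly what I understand those steps to correspond to on the command line. This is only a minimal sketch under my own assumptions: plain `zpool` commands rather than whatever the TrueNAS web UI does internally, and the pool name `tank` and device names `da3` / `da5` are made-up placeholders. The function only builds the commands and does not run anything.

```python
from typing import List


def replace_disk_commands(pool: str, old_dev: str, new_dev: str,
                          offline_first: bool = True) -> List[List[str]]:
    """Build (but do not run) the zpool commands for swapping one disk.

    pool, old_dev and new_dev are placeholders supplied by the caller.
    """
    cmds: List[List[str]] = []
    if offline_first:
        # Take the failing disk out of service before pulling it.
        cmds.append(["zpool", "offline", pool, old_dev])
    # Start the resilver from the remaining redundancy onto the new disk.
    cmds.append(["zpool", "replace", pool, old_dev, new_dev])
    # Check resilver progress; repeat until it reports completion.
    cmds.append(["zpool", "status", pool])
    return cmds


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in replace_disk_commands("tank", "da3", "da5"):
        print(" ".join(cmd))
```

Note that nothing in this list touches keys. Also, once the old disk is physically gone, it may have to be referred to by the GUID that `zpool status` shows for it instead of its old device name.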

Is this still accurate, or do I need to reset the keys, regenerate them, and then apply new recovery keys and all that? I am under the assumption that most of those quirks only applied to GELI encryption. The TrueNAS documentation also does not mention anything about encryption in the replacement steps, although many people say the documentation is incomplete in that regard.

As Wendell and some in the forum are ZFS wizards I figured it would be a good place to ask, as there are no threads on the TrueNAS forums.

Thanks in advance!

Replacing a drive in a native ZFS encrypted pool should (based on the documentation) work as you’ve described.
I can’t say for certain, since I haven’t tried it and I use FreeBSD, not TrueNAS. TrueNAS Core is based on FreeBSD though.

The thing I’ve always done before making a significant change to my zpool (like a major FreeBSD version upgrade or replacing disks) is to try it out in a VM. I did this because I didn’t have a backup and wanted to make sure I wouldn’t lose all my data.
I used the same version of the OS, same drive configuration with small virtual disks, copied over a few files and tried it out.
One other tip would be to make a VM snapshot after you configure it. With this you can easily reset it, in case the operation you try doesn’t work.
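A rehearsal along those lines could look something like the sketch below. To be clear about the assumptions: I'm using file-backed vdevs here as a stand-in for the small virtual disks, and the working directory, pool name and dataset name are all made up. The function only builds the command list so you can read it before running anything. The keyfile is not created by this sketch; it would need to exist beforehand and contain a passphrase of at least 8 characters.

```python
from typing import List


def rehearsal_commands(workdir: str = "/tmp/zfs-rehearsal",
                       pool: str = "testpool") -> List[List[str]]:
    """Build (but do not run) commands for a throwaway encrypted mirror.

    workdir, pool and the dataset name 'secret' are hypothetical.
    """
    disks = [f"{workdir}/disk{i}.img" for i in range(3)]
    keyfile = f"{workdir}/key.txt"  # must already exist with a passphrase
    cmds: List[List[str]] = [["mkdir", "-p", workdir]]
    # Three sparse files: two for the mirror, one as the replacement.
    cmds += [["truncate", "-s", "256M", d] for d in disks]
    cmds += [
        ["zpool", "create", pool, "mirror", disks[0], disks[1]],
        # Encryption is set on the dataset, not on the disks.
        ["zfs", "create", "-o", "encryption=on", "-o", "keyformat=passphrase",
         "-o", f"keylocation=file://{keyfile}", f"{pool}/secret"],
        # The operation being rehearsed: swap disk1 for disk2.
        ["zpool", "replace", pool, disks[1], disks[2]],
        ["zpool", "status", pool],
        # Confirm the dataset is still encrypted and its key still loaded.
        ["zfs", "get", "encryption,keystatus", f"{pool}/secret"],
        # Clean up the throwaway pool.
        ["zpool", "destroy", pool],
    ]
    return cmds


if __name__ == "__main__":
    for cmd in rehearsal_commands():
        print(" ".join(cmd))
```

Copying a few files into the dataset before the replace and checking them afterwards gives the same confidence as the VM test, just on a smaller scale.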


I’ve done at least three drive replacements with a ZFS on Linux pool using native ZFS encryption, no issues at all. So go crazy.

It also makes sense that it works. ZFS can be described as having two layers, the vdev/block storage layer, and the filesystem layer. Any operation with the “zfs” tool is operating on the filesystem layer, and any operation with the “zpool” tool is operating on the vdev/block storage layer.

Since encryption uses the “zfs” command, it can be described as a filesystem operation. So data is encrypted once, then handed to the vdev/block layer for striping/RAID/mirroring to be performed. So any operations with the “zpool” command should have no effect on encryption, including replacing a drive, adding a new vdev, etc.

(The above is butchering reality, so just use it as a thought experiment).
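If you want to see this for yourself, the encryption state lives in dataset properties, which you can read with `zfs get` before and after a replace and compare. Below is a small sketch of a parser for the tab-separated output of `zfs get -H -o name,property,value encryption,keystatus,keyformat <dataset>`. The helper name and the sample text are my own illustrations (the dataset `tank/data` is hypothetical), not output copied from a real system.

```python
from typing import Dict


def parse_zfs_get(output: str) -> Dict[str, Dict[str, str]]:
    """Parse `zfs get -H -o name,property,value ...` output.

    With -H the fields are separated by single tabs and there is no header.
    Returns {dataset: {property: value}}.
    """
    props: Dict[str, Dict[str, str]] = {}
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, prop, value = line.split("\t")
        props.setdefault(name, {})[prop] = value
    return props


# Illustrative sample only; dataset name and values are assumptions.
SAMPLE = (
    "tank/data\tencryption\taes-256-gcm\n"
    "tank/data\tkeystatus\tavailable\n"
    "tank/data\tkeyformat\thex\n"
)

if __name__ == "__main__":
    before = parse_zfs_get(SAMPLE)
    after = parse_zfs_get(SAMPLE)  # in practice: captured after the resilver
    print("encryption properties unchanged:", before == after)
```

If the two snapshots of the properties match, the drive replacement did not touch the encryption settings, which is what the layering argument predicts.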

Why does GELI encryption require re-encrypting a replaced disk? This is because each drive is a separate encrypted block device, and ZFS is installed on top of the encrypted block devices. This means for a mirror the data is encrypted twice, once for each drive. Whereas native ZFS encryption only needs to encrypt the data once, and writes the same data to both drives.
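To make the difference concrete, here is a toy model in Python. It is only a thought experiment: the "cipher" is an XOR with a SHA-256-derived keystream that I made up for illustration, and it is not real cryptography and not what either GELI or ZFS actually does internally. All function names and keys are hypothetical.

```python
import hashlib
from typing import List


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream. NOT real cryptography."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    # zip() stops at len(data), so extra keystream bytes are ignored.
    return bytes(a ^ b for a, b in zip(data, stream))


def geli_style_mirror(data: bytes, disk_keys: List[bytes]) -> List[bytes]:
    """Each disk is its own encrypted device: encrypt once per disk."""
    return [toy_encrypt(k, data) for k in disk_keys]


def native_style_mirror(data: bytes, dataset_key: bytes,
                        n_disks: int) -> List[bytes]:
    """Encrypt once at the dataset level, then write the same block to every disk."""
    block = toy_encrypt(dataset_key, data)
    return [block] * n_disks


if __name__ == "__main__":
    data = b"some file contents"
    geli = geli_style_mirror(data, [b"disk-a-key", b"disk-b-key"])
    native = native_style_mirror(data, b"dataset-key", 2)
    print("GELI-style copies identical:  ", geli[0] == geli[1])
    print("native-style copies identical:", native[0] == native[1])
```

In the GELI-style model a new disk needs its own key setup before it can join the mirror. In the native-style model the new disk just receives a copy of the already-encrypted blocks, so the replacement never involves the key.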


Fortunately I have another TrueNAS system that all my pools and datasets are replicated to, so I’m not concerned about losing data when updating and such.

Thanks a lot for the clarification. It is reassuring to know I am good to go and don’t have to care about regenerating and reapplying keys.

Now let’s hope I won’t have to replace drives regularly!

I bought some dodgy drives off eBay that kept giving me errors and even failing completely, so I’ve done this several times.

Attach a fresh drive first without removing the one you want to replace. If you don’t have a spare connector, then you have to remove the old drive first.
In the TrueNAS pool view, click on the drive to replace and select Replace. It will offer you the unused drive to replace it with. Confirm that.
It will take a couple of hours depending on the amount of data, but that’s it, job done.

There is no extra step due to encryption in what you have to do. It’s the simplest possible procedure. I replaced all my dodgy drives one at a time like this. I don’t have to erase the old drives because they’re encrypted.
