Newbie Q: Advice needed re. ZFS dataset layout with passphrase encryption in TrueNAS Scale

In my homelab I currently own a Synology NAS (DS1819+) with a 6-disk array in SHR-2 (equivalent to RAID6, i.e. 2-disk redundancy) with Btrfs as the file system. Soon (once the ordered hardware arrives) I’m going to set up a box with TrueNAS Scale (v24.10 on bare metal) with a 4-disk HDD vdev in RAIDz2 (and join the cult with integrity :wink:).

On my current Synology setup I use passphrase encryption for the shared folders (which seem to be the equivalent of ZFS datasets, except that they cannot be nested in Synology) and for certain system folders, such as the docker folder.

The only downside currently is that - upon reboot - I have to enter a passphrase 4 times (once for each encrypted shared folder) and it seems on TrueNAS/ZFS that won’t be a problem, if I use nested encrypted datasets.
And FWIW I currently also have to shut down and restart docker on my NAS, after unlocking the shared folders, since I’ve also encrypted the docker folder (and that is totally fine with me).

If I understand things correctly, there would be - in principle - two ways to set up passphrase encryption on TrueNAS Scale:

  1. Passphrase encryption on “root” dataset
  tank        -> Passphrase encrypted dataset (DS)
  tank/nas    -> DS inherits encryption (shared to clients)
  tank/media  -> DS inherits encryption (shared to clients)
  tank/docker -> DS inherits encryption
  ...
  2. Intermediate dataset as ancestor of all encrypted datasets
  tank              -> Encrypted or not (?)
  tank/vault        -> Passphrase encrypted DS as the ancestor of all encrypted datasets (not shared)
  tank/vault/nas    -> DS inherits encryption (shared to clients)
  tank/vault/media  -> DS inherits encryption (shared to clients)
  tank/vault/docker -> DS inherits encryption
  ...

If I understand things correctly, encrypting the “root” dataset with a passphrase requires moving the system dataset to another volume (e.g. the boot drive), whereas the second approach does not. It just has that extra parent node, which is only there to unlock all encrypted datasets with one action and a single passphrase.

So here are my questions:

  1. Since I won’t be mirroring the boot drive, there is of course the potential of the system dataset getting lost if I use option #1. Would a regular (external) backup of the system configuration be sufficient to mitigate this, or does the system dataset contain more than is contained in the backup of the system configuration?

  2. Also I have a pair of Intel Optane P1600X (58G) SSDs that I plan to use in mirrored mode as an SLOG vdev. Would it be possible to use those drives to additionally store the system dataset (out of the box, without any shenanigans that are not natively supported in TrueNAS) or is the SLOG vdev exclusively for SLOG usage in TrueNAS?

  3. And finally: Which of the two options (or perhaps a third one altogether) would you use and why? And if it’s #2 with the intermediate dataset, would you still encrypt the root dataset (with a key) or not?

I hope I provided all the pertinent info, but I’m new to this, so please let me know if I need to specify anything else.

Thanks in advance for any advice on this matter.

It seems to be generally not recommended to store data on the root dataset at all, so use option #2 (i.e. create what’s called an “encryption root” in ZFS terms). IIRC this makes the datasets under the encryption root all share the same encryption key (which is in turn unlocked by the passphrase).
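
To make the “encryption root” idea a bit more concrete, here is a minimal Python sketch of how a dataset finds its encryption root by walking up the tree. It is only a toy model and does not talk to ZFS; the layout dict, the `tank/public` dataset, and the function names are made up for illustration. On a real pool the read-only `encryptionroot` property reports the same information.

```python
# Toy model only: mimics how a dataset resolves its encryption root.
# Nothing here calls ZFS; names and layout are hypothetical.

# Layout from option #2. True = encryption was set explicitly on this dataset
# (it is its own encryption root), False = it inherits from its parent.
# "tank/public" is an assumed example of an unencrypted dataset.
LAYOUT = {
    "tank": False,
    "tank/vault": True,
    "tank/vault/nas": False,
    "tank/vault/media": False,
    "tank/vault/docker": False,
    "tank/public": False,
}


def encryption_root(name, layout):
    """Walk up the dataset path until a dataset with its own key is found."""
    current = name
    while True:
        if layout[current]:
            return current
        if "/" not in current:
            return None  # reached the pool's root dataset without finding a key
        current = current.rsplit("/", 1)[0]


def passphrases_needed(layout):
    """Each distinct encryption root needs one unlock action."""
    roots = {encryption_root(name, layout) for name in layout}
    roots.discard(None)
    return len(roots)


for name in LAYOUT:
    print(name, "->", encryption_root(name, LAYOUT))
print("unlocks needed:", passphrases_needed(LAYOUT))
```

In this model every dataset under `tank/vault` resolves to the same encryption root, so a single passphrase unlocks all of them, while `tank` and `tank/public` stay unencrypted.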

For your non-encrypted data, create additional datasets under the root dataset.

One of the reasons for not storing data directly in the root dataset is that it makes data migration (using zfs send/recv) more difficult - it’s not possible to overwrite a root dataset through zfs recv, so it’s not possible to keep the data layout when migrating a root dataset.

Also, properties are inherited from parent datasets in zfs, so you might want to use the root dataset for setting “sane defaults” and then tweaking things on the datasets where you actually store data.

@homeserver78 Thanks so much for your response.
It seems that I have expressed myself badly in my original post, or perhaps I am not quite understanding your point. I never wanted to store any data on the root dataset, even in option 1, just create datasets there (which in turn will contain the data).

My question was whether the root dataset should function as my encryption root (since it has a name it’s apparently a thing…) in option 1 or whether that encryption root should be a separate dataset right below the root dataset (option 2). And having a shared encryption key is indeed what I am looking for.

In option 2 I would turn on key-based encryption anyway on the root dataset (and even the datasets that I don’t need passphrase-based encryption on), so - if a drive ever goes bad and I need to junk it or RMA it - the data would be inaccessible. But that key should be stored in the system dataset and so the root dataset should be automatically unlocked (if I understand things correctly).

Learning the term “encryption root” was indeed very helpful, that has given me a great pointer for further research. So thank you again!

I’m the one who expressed myself badly; I kinda included any encryption settings in what I thought of as “data”, which wasn’t exactly clear! My point was that if you want to migrate your tree of datasets to another pool at some point, it’s easier if the encryption root is not the pool’s root dataset.

I’m not sure it’s a good idea to turn on encryption on the root dataset even if you want to encrypt all data on the pool. Again, the root dataset will then hold the encryption keys for any child datasets (except for the ones that are themselves encryption roots), which means you must include the root dataset in any later migrations, forcing what was the root dataset to become a child dataset on the new pool and thus changing your data layout.

Better to create multiple encryption roots under the root dataset if you want several groups of datasets with different encryption settings.
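
A rough Python sketch of the naming consequence, assuming (as described above) that a migrated subtree has to be received as a child of the destination pool’s root dataset. It only manipulates dataset names and makes no ZFS calls; `newpool` and the helper names are hypothetical.

```python
# Toy model of the layout change on migration; no ZFS calls are made.


def migrated_names(datasets, subtree_root, target_parent):
    """Map old names to new names when the subtree at `subtree_root`
    is received as a child of `target_parent` on another pool."""
    leaf = subtree_root.rsplit("/", 1)[-1]
    new_root = f"{target_parent}/{leaf}"
    mapping = {}
    for name in datasets:
        if name == subtree_root or name.startswith(subtree_root + "/"):
            mapping[name] = new_root + name[len(subtree_root):]
    return mapping


def relative_to_pool(name):
    """Path below the pool name, e.g. 'tank/vault/nas' -> 'vault/nas'."""
    return name.split("/", 1)[1] if "/" in name else ""


def layout_preserved(mapping):
    """True if every dataset keeps the same path below its pool."""
    return all(
        relative_to_pool(old) == relative_to_pool(new)
        for old, new in mapping.items()
    )


option1 = ["tank", "tank/nas", "tank/media", "tank/docker"]
option2 = ["tank", "tank/vault", "tank/vault/nas",
           "tank/vault/media", "tank/vault/docker"]

# "newpool" is a hypothetical destination pool.
moved1 = migrated_names(option1, "tank", "newpool")        # encryption root = pool root
moved2 = migrated_names(option2, "tank/vault", "newpool")  # encryption root = tank/vault

print(moved1["tank/nas"], layout_preserved(moved1))
print(moved2["tank/vault/nas"], layout_preserved(moved2))
```

With the pool’s root dataset as encryption root, `tank/nas` ends up one level deeper as `newpool/tank/nas`. With `tank/vault` as encryption root, `tank/vault/nas` becomes `newpool/vault/nas` and the layout below the pool stays the same.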

Here’s a thread over on the old TrueNAS forum from someone who didn’t ask the same question as you before setting up an encrypted root dataset:
