Manjaro - Root on ZFS with encryption

This guide can be improved and will be rewritten soon™. - Sapiens 13.06.2021

If you can adapt this yourself, have a look at these very good instructions for Arch: Overview — OpenZFS documentation


Following these instructions you can set up Manjaro with GRUB as the bootloader on an encrypted ZFS root pool. The boot pool will be unencrypted. This enables you to set up additional features such as remote unlock via SSH (dropbear) and rollback of the entire system, which can be useful when, for example, a kernel update has gone wrong. You should also be able to have different boot environments with this setup. Optional swap partitions will be encrypted using LUKS.
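To give you an idea of what such a rollback could look like once the system is installed, here is a minimal sketch. It assumes the pool and dataset names created later in this guide (replace $UUID with your actual pool suffix); the snapshot name pre-update is just an example:


# before a risky update: snapshot the root and boot datasets
sudo zfs snapshot rpool_$UUID/ROOT/default@pre-update
sudo zfs snapshot bpool_$UUID/BOOT/default@pre-update

# if the update breaks the system: boot the installation media,
# import the pools without mounting anything and roll both datasets back
sudo zpool import -f -N rpool_$UUID
sudo zpool import -f -N bpool_$UUID
sudo zfs load-key rpool_$UUID   # you will be prompted for the passphrase
sudo zfs rollback -r rpool_$UUID/ROOT/default@pre-update
sudo zfs rollback -r bpool_$UUID/BOOT/default@pre-update
sudo zpool export bpool_$UUID
sudo zpool export rpool_$UUID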

Prerequisites:

Now boot from the Manjaro installation media and open a terminal.

Secure erasure of the storage media (recommended):

Any data previously written to the storage media could still be readable after setting up our system. Erasing the storage media beforehand is therefore strongly recommended!
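The guide does not prescribe a specific method, but a simple approach is to discard the whole device on SSDs or overwrite it once on hard disks. The device path below is only a placeholder; double-check it before running anything, since this is irreversibly destructive:


# SSD/NVMe: discard all blocks (fast)
sudo blkdiscard /dev/disk/by-id/your-disk-here

# HDD: overwrite the whole disk once with random data (slow)
sudo dd if=/dev/urandom of=/dev/disk/by-id/your-disk-here bs=1M status=progress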

Installation:

Since Linux will need to boot off of the disks we use for installation, their paths have to be static. It is therefore required to use persistent block device naming identifiers, which you can get from the following command:


ls -al /dev/disk/by-id

Create a short handle to reference the disk array by a persistent block device naming identifier, e.g.:


DISK=(/dev/disk/by-id/disk1 /dev/disk/by-id/disk2)

for a single disk use:


DISK=(/dev/disk/by-id/disk1)

Now we partition each disk into a 1G EFI partition, a 4G ZFS boot pool partition, a ZFS root pool partition and a swap partition. The last partition on each disk will be used for a LUKS-encrypted swap partition. We resort to LUKS-encrypted swap partitions since using swap on ZFS is currently known to cause lockups under high memory pressure.

I strongly encourage the usage of swap partitions!

In the following variable, specify how much swap space in gigabytes you want to use in total, distributed across all disks, e.g.:


SWAP=16

After that, execute the following code block to partition the disks:

This will make all data currently on the disks inaccessible!


last=$(($SWAP / ${#DISK[@]} + $SWAP % ${#DISK[@]}))

for i in ${DISK[@]}; do
  sudo sgdisk --zap-all $i
  sudo sgdisk -n1:1M:+1G -t1:EF00 $i
  sudo sgdisk -n2:0:+4G -t2:BE00 $i
  if [ $last -gt 0 ]
  then
    sudo sgdisk -n3:0:-${last}G -t3:BF00 $i
    sudo sgdisk -n4:0:0 -t4:8308 $i
  else
    sudo sgdisk -n3:0:0 -t3:BF00 $i
  fi
done

Create a random identifier to append to the ZFS pool names, so that importing the pools on another computer, e.g. for rescue or backups, won't result in name conflicts:


UUID=$(dd if=/dev/urandom bs=1 count=100 2>/dev/null | tr -dc 'a-z0-9' | cut -c-6)

Create the boot pool. For a single-disk setup, omit the topology keyword mirror in the following command:


sudo zpool create \
    -o ashift=12 \
    -o autotrim=on \
    -d -o feature@async_destroy=enabled \
    -o feature@bookmarks=enabled \
    -o feature@embedded_data=enabled \
    -o feature@empty_bpobj=enabled \
    -o feature@enabled_txg=enabled \
    -o feature@extensible_dataset=enabled \
    -o feature@filesystem_limits=enabled \
    -o feature@hole_birth=enabled \
    -o feature@large_blocks=enabled \
    -o feature@lz4_compress=enabled \
    -o feature@spacemap_histogram=enabled \
    -O acltype=posixacl \
    -O canmount=off \
    -O compression=lz4 \
    -O devices=off \
    -O normalization=formD \
    -O relatime=on \
    -O xattr=sa \
    -O mountpoint=/boot \
    -R /mnt \
    bpool_$UUID \
    mirror \
    $(for i in ${DISK[@]}; do
        printf "$i-part2 ";
      done)

Create the root pool. This command will ask you for the encryption passphrase. Choose a strong passphrase! The passphrase only wraps the pool's master encryption key, so changing it later will not re-encrypt already written data. If you believe your passphrase has been compromised, only wiping the drives and setting up the pool anew will suffice:


sudo zpool create \
    -o ashift=12 \
    -o autotrim=on \
    -R /mnt \
    -O acltype=posixacl \
    -O canmount=off \
    -O compression=lz4 \
    -O dnodesize=auto \
    -O normalization=formD \
    -O relatime=on \
    -O xattr=sa \
    -O mountpoint=/ \
    -O encryption=aes-256-gcm \
    -O keyformat=passphrase \
    -O keylocation=prompt \
    rpool_$UUID \
    mirror \
    $(for i in ${DISK[@]}; do
        printf "$i-part3 ";
      done)
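Side note: should you ever want to change the passphrase later, zfs change-key re-wraps the master key under the new passphrase. As explained above, already written data is not re-encrypted by this; it is only a sketch of how a later passphrase change could look:


sudo zfs change-key -o keyformat=passphrase -o keylocation=prompt rpool_$UUID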

Now we are going to create the container datasets. Nothing is stored directly under rpool and bpool, hence: canmount=off!


sudo zfs create -o canmount=off -o mountpoint=none bpool_$UUID/BOOT

sudo zfs create -o canmount=off -o mountpoint=none rpool_$UUID/DATA

sudo zfs create -o canmount=off -o mountpoint=none rpool_$UUID/ROOT

Create the filesystem datasets. The canmount property is set to noauto for the root dataset, as advised:


sudo zfs create -o mountpoint=legacy -o canmount=noauto bpool_$UUID/BOOT/default

sudo zfs create -o mountpoint=/ -o canmount=off rpool_$UUID/DATA/default

sudo zfs create -o mountpoint=/ -o canmount=noauto rpool_$UUID/ROOT/default

Mount root and boot datasets:


sudo zfs mount rpool_$UUID/ROOT/default

sudo mkdir /mnt/boot

sudo mount -t zfs bpool_$UUID/BOOT/default /mnt/boot

Create datasets to separate user data from the root filesystem:


for i in {usr,var,var/lib};
do
  sudo zfs create -o canmount=off rpool_$UUID/DATA/default/$i
done

for i in {home,root,srv,usr/local,var/log,var/spool};
do
  sudo zfs create -o canmount=on rpool_$UUID/DATA/default/$i
done

sudo chmod 750 /mnt/root

Create optional user data datasets to exclude that data from possible rollbacks:


sudo zfs create -o canmount=on rpool_$UUID/DATA/default/var/games

sudo zfs create -o canmount=on rpool_$UUID/DATA/default/var/www

# for GNOME

sudo zfs create -o canmount=on rpool_$UUID/DATA/default/var/lib/AccountsService

# for Docker

sudo zfs create -o canmount=on rpool_$UUID/DATA/default/var/lib/docker

# for NFS

sudo zfs create -o canmount=on rpool_$UUID/DATA/default/var/lib/nfs

# for LXC

sudo zfs create -o canmount=on rpool_$UUID/DATA/default/var/lib/lxc

# for Libvirt

sudo zfs create -o canmount=on rpool_$UUID/DATA/default/var/lib/libvirt

Format and mount EFI system partitions:


for i in ${DISK[@]}; do
  sudo mkfs.vfat -n EFI ${i}-part1
  sudo mkdir -p /mnt/boot/efis/${i##*/}
  sudo mount -t vfat ${i}-part1 /mnt/boot/efis/${i##*/}
done

sudo mkdir -p /mnt/boot/efi

sudo mount -t vfat ${DISK[1]}-part1 /mnt/boot/efi

Configure the pool properties (boot filesystem and cache files) as advised:


sudo zpool set bootfs=bpool_$UUID/BOOT/default bpool_$UUID

sudo zpool set cachefile=/etc/zfs/zpool.cache bpool_$UUID

sudo zpool set cachefile=/etc/zfs/zpool.cache rpool_$UUID

Since Manjaro dropped support for the architect installer, we need to get it manually:


sudo pacman -Sy

git clone https://gitlab.manjaro.org/applications/manjaro-architect.git

cd manjaro-architect

git checkout 0.9.34

sudo pacman -S base-devel --noconfirm

sudo pacman -S f2fs-tools gptfdisk manjaro-architect-launcher manjaro-tools-base mhwd nilfs-utils pacman-mirrorlist parted --noconfirm

makepkg -si --noconfirm

cd ..

rm -rf manjaro-architect

Now the Manjaro installation procedure can start; run the setup command in the terminal to launch the interactive installer.


setup
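If you are unsure which kernel series the installation media currently offers, you can list the available kernel packages before starting setup. This assumes the mhwd-kernel tool, which normally ships on Manjaro live media:


mhwd-kernel -l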

In the following section I will give you hints on which menu entries you need to use. At the time of writing this guide, the most up-to-date Linux kernel is kernel 5.11 (package linux511); if you follow this tutorial at a later point, you will need to adjust some parts of the following instructions accordingly:


Select Language -> <your selection> -> OK


Prepare Installation -> Set Virtual Console -> change -> <your selection>


Install Desktop System -> Install Manjaro Desktop -> yay + base-devel; linux511 -> <your selection> -> Yes -> cryptsetup; linux511-zfs -> <your selection>


Install Desktop System -> Install Bootloader -> grub -> Yes -> Yes


Install Desktop System -> Configure Base -> Generate FSTAB -> fstabgen -U -p


Install Desktop System -> Configure Base -> Set Hostname -> <your selection>


Install Desktop System -> Configure Base -> Set System locale -> <your selection> -> <your selection>


Install Desktop System -> Configure Base -> Set Desktop Keyboard Layout -> <your selection>


Install Desktop System -> Configure Base -> Set Timezone and Clock -> <your selection> -> <your selection> -> Yes -> utc


Install Desktop System -> Configure Base -> Set Root Password


Install Desktop System -> Configure Base -> Add New User(s) -> <your selection> -> <your selection>


Done -> Yes -> No

Leaving the architect installer unmounts the ZFS datasets and the EFI partition, so we need to remount them:


sudo zfs mount rpool_$UUID/ROOT/default

sudo mount -t zfs bpool_$UUID/BOOT/default /mnt/boot

sudo zfs mount -a

sudo mount ${DISK[1]}-part1 /mnt/boot/efi

for i in ${DISK[@]}; do
  sudo mount -t vfat ${i}-part1 /mnt/boot/efis/${i##*/}
done

Modify /etc/default/grub:


sudo sed 's/GRUB_CMDLINE_LINUX=\"/GRUB_CMDLINE_LINUX=\"zfs_import_dir=\/dev\/disk\/by-id\ /' /mnt/etc/default/grub | sudo tee /mnt/etc/default/grub.new

sudo rm /mnt/etc/default/grub

sudo mv /mnt/etc/default/grub.new /mnt/etc/default/grub

Configuring the root filesystem:


sudo zpool set bootfs=rpool_$UUID/ROOT/default rpool_$UUID

sudo zpool set cachefile=/etc/zfs/zpool.cache rpool_$UUID

sudo zpool set cachefile=/etc/zfs/zpool.cache bpool_$UUID

Copy over the zpool.cache file to the new system:


sudo cp /etc/zfs/zpool.cache /mnt/etc/zfs/zpool.cache

Add the encrypted swap entries to /etc/crypttab and /etc/fstab:


if [ $last -gt 0 ]
then
  for i in ${DISK[@]}; do
    echo " " | sudo tee -a /mnt/etc/fstab
    echo swap-${i##*/} ${i}-part4 /dev/urandom swap,cipher=aes-cbc-essiv:sha256,size=256,discard | sudo tee -a /mnt/etc/crypttab > /dev/null
    echo /dev/mapper/swap-${i##*/} none swap defaults 0 0 | sudo tee -a /mnt/etc/fstab > /dev/null
  done
fi
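For reference, with a hypothetical disk id of ata-EXAMPLE_DISK_1 the loop above would produce entries along these lines:


# /mnt/etc/crypttab
swap-ata-EXAMPLE_DISK_1 /dev/disk/by-id/ata-EXAMPLE_DISK_1-part4 /dev/urandom swap,cipher=aes-cbc-essiv:sha256,size=256,discard

# /mnt/etc/fstab
/dev/mapper/swap-ata-EXAMPLE_DISK_1 none swap defaults 0 0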

Activate the ZFS services as advised:


sudo systemctl enable zfs.target --root=/mnt

sudo systemctl enable zfs-import-cache --root=/mnt

sudo systemctl enable zfs-mount --root=/mnt

sudo systemctl enable zfs-import.target --root=/mnt

When running ZFS on root, the machine’s hostid will not be available at the time the root filesystem is mounted. There are two solutions to this. You can either place your hostid in the kernel parameters of your boot loader, for example by adding spl.spl_hostid=0x12345678 (use the hostid command to get your number).

The other, and suggested, solution is to make sure that there is a hostid in /etc/hostid and then regenerate the initramfs image, which copies the hostid into it. To write the hostid file safely, use the zgenhostid command. The hostid is an 8-digit hexadecimal string.


sudo manjaro-chroot /mnt zgenhostid

sudo manjaro-chroot /mnt mkinitcpio -P
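If you prefer the kernel parameter approach mentioned above instead of /etc/hostid, a rough sketch would be to read the hostid from the chroot and add it to GRUB_CMDLINE_LINUX in /mnt/etc/default/grub before grub-mkconfig is run below (0x007f0101 is only an example value; use your own output):


# print the machine's hostid (an 8-digit hex value)
sudo manjaro-chroot /mnt hostid

# then extend the kernel command line in /mnt/etc/default/grub accordingly, e.g.:
# GRUB_CMDLINE_LINUX="spl.spl_hostid=0x007f0101 zfs_import_dir=/dev/disk/by-id ..."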

Now we install the boot manager GRUB and generate the GRUB menu.


echo 'export ZPOOL_VDEV_NAME_PATH=YES' | sudo tee -a /mnt/etc/profile

sudo manjaro-chroot /mnt grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=manjaro --removable

sudo manjaro-chroot /mnt grub-mkconfig -o /boot/grub/grub.cfg

echo 'Defaults env_keep += "ZPOOL_VDEV_NAME_PATH"' | sudo tee -a /mnt/etc/sudoers

If using a multi-disk setup, mirror the EFI system partitions:


sudo cp -r /mnt/boot/efi/EFI /tmp

sudo umount /mnt/boot/efi

for i in ${DISK[@]}; do
  sudo cp -r /tmp/EFI /mnt/boot/efis/${i##*/}
  sudo efibootmgr -cgp 1 -l "\EFI\BOOT\BOOTX64.EFI" \
    -L "manjaro-${i##*/}" -d ${i}-part1
done

sudo mount -t vfat ${DISK[1]}-part1 /mnt/boot/efi

Finally we unmount all the disks and then restart into our newly installed system:


sudo umount /mnt/boot/efi

for i in ${DISK[@]}; do
  sudo umount /mnt/boot/efis/${i##*/}
done

sudo umount /mnt/boot

sudo zfs umount -a

sudo zfs umount rpool_$UUID/ROOT/default

sudo zpool export rpool_$UUID

sudo zpool export bpool_$UUID
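After rebooting into the new system, a few quick read-only checks can confirm that the pools imported, the datasets mounted and the encrypted swap came up (the swap check only applies if you configured swap above):


zpool status      # both pools should show ONLINE
zfs mount         # lists the mounted ZFS datasets
swapon --show     # shows the active encrypted swap devices, if configured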


Hey thanks for posting this.

I’m about to upgrade (replace) my old 8700K that’s running ZoL (Ubuntu) on a two-drive zraid-1 (+ 1TB NVME L2ARC & optane SLOG) storage setup and was researching how to do it on Arch.

The Ubuntu root on ZFS instructions provided by OpenZFS were complex enough, Arch looks even more so, so I’m glad you posted this guide that distills it down.

One question / request if you don’t mind side-tracking this thread a little:

Background: About a year ago, this current ZoL system somehow made itself unbootable. Something updated grub but got confused and didn’t update the zfs boot partition correctly. I spent several long days trying to repair things but the ubuntu zsys generated snapshot rollback didn’t work, manual repair/regen attempts just made things worse (ubuntu apparmor, etc), and I had work to do. So I ended up copying off my data, reinstalling & restoring the same setup, and getting caught back on my deadlines. Took me down for a week and cost me that much vacation time. This made me realize that my backup system (basically cobbled together using syncoid+sanoid to another system) was inadequate - although I’m reasonably protected against hardware failure, I’m still pretty vulnerable to software fails. Everything was redundant & backed up, but I couldn’t restore a working system.

Since then, I haven’t solved the ‘backup/restore linux root on zfs boot’ problem but I sure would like to. Do you or anyone have a way to backup & restore these systems?

From what I understand the following would need to be captured in a backup and then restored properly:

  1. partition tables of drive(s) involved (l2arc and slog can be manually recreated)
  2. EFI partition (disk image?)
  3. ZFS boot partition (disk image?)
  4. swap partition (recreated?)
  5. main ZFS linux root partition
  6. etc

1-5 are of primary interest to me - once I have a basic system going again (without having to rebuild it manually!) I’m happy to do the rest of my restore process manually. It doesn’t have to be elaborate, just reliable.

Or is there some recommended way people do ZFS system backups? I’ve looked and haven’t found much at all. When I was stuck unbootable I asked the gurus on the OpenZFS forum and they didn’t seem to understand the question (I was a little desperate at the time though haha)

So if you or anyone has recommendations on how to set this up, I’d really like to implement it on this new system. If not, I’d be happy to help create one, although I’d need to be really careful - I’m more than a little gun shy after that last debacle and the data on my workstation is pretty valuable.