Let's fix Arch!

This will be a separate project entirely.

Stand by for numbers. EDIT: it's not happy with my NVMe disk. I'll play with it, but it might take some time.

2 Likes

Alright, so the way I see this, we’re going to need a few things to automate it:

  • ALPM hooks (for monitoring pacman; see the hook sketch below)
  • a script to handle dataset creation (I propose the name znapper for the time being; please suggest something better)
  • some way to notify GRUB of the additional datasets.
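For the ALPM piece, hooks are just files dropped into /usr/share/libalpm/hooks/ (or /etc/pacman.d/hooks/). A minimal pre-transaction hook might look like the following; znapper and its pre subcommand are only the proposed names here, nothing that exists yet:

# hypothetical hook file; znapper doesn't exist yet
cat > /usr/share/libalpm/hooks/00-znapper-pre.hook <<'EOF'
[Trigger]
Operation = Install
Operation = Upgrade
Operation = Remove
Type = Package
Target = *

[Action]
Description = Snapshotting boot environment before the transaction...
When = PreTransaction
Exec = /usr/bin/znapper pre
AbortOnFail
EOF

A matching PostTransaction hook would call a (equally hypothetical) znapper post to record the result.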

The best way I can see to notify GRUB would be another hook. How we determine which datasets to expose, though, will be interesting. I'm thinking we need a standard dataset configuration.

I propose the following:

We name the boot pool something reasonable and keep that in an env var somewhere (people can have multiple pools, after all).

We make a dataset off the pool called sys. In there go your home and root dataset control groups.

We can also create a dataset off the pool called data, for user data that we don't want versioned along with package updates.

I.e.:

${POOL}/sys/${HOSTNAME}/ROOT/default  # Root datasets
${POOL}/sys/${HOSTNAME}/home  # Home datasets.  Typically dotfiles don't like being messed with too much.  We'll snapshot this with the root partition, but we might want different ZFS options on /home.
${POOL}/sys/${HOSTNAME}/var  # Not sure about separate usr, var, opt, etc... datasets for now.  (I don't have them at the moment)
${POOL}/data/* # User Data datasets.  These don't get messed with
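To stand that skeleton up by hand, it'd be roughly the following (tank and myhost stand in for ${POOL} and ${HOSTNAME}):

# containers: never mounted themselves
zfs create -o mountpoint=none -o canmount=off tank/sys
zfs create -o mountpoint=none -o canmount=off tank/sys/myhost
zfs create -o mountpoint=none -o canmount=off tank/sys/myhost/ROOT
# the actual root fs: mounts at /, but only when explicitly asked
zfs create -o mountpoint=/ -o canmount=noauto tank/sys/myhost/ROOT/default
zfs create -o mountpoint=/home tank/sys/myhost/home
zfs create -o mountpoint=/var tank/sys/myhost/var
# user data lives outside the versioned tree
zfs create -o mountpoint=none tank/data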

From there, we can write a wrapper script for GRUB that picks these datasets up and generates menu entries for them.

We will also need a dataset that holds control data for the management script, since keeping that data inside the versioned boot environments is really not a good idea. For that, I propose creating a ${POOL}/znapper dataset. I think we'll just use sqlite for now, unless someone has a better option.
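Sketching that out (dataset name per the proposal above; the schema is just a strawman):

zfs create -o mountpoint=/var/lib/znapper tank/znapper
sqlite3 /var/lib/znapper/state.db \
  'CREATE TABLE IF NOT EXISTS snapshots (
     id INTEGER PRIMARY KEY,
     dataset TEXT NOT NULL,
     snapshot TEXT NOT NULL,
     created TEXT DEFAULT CURRENT_TIMESTAMP
   );'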

I’m going to write the initial script in Python, since there are ZFS bindings for it, so we don’t have to do a bunch of string parsing.

1 Like

For the GRUB bits, did you look at the previous posts here from other distros? Maybe there's something reusable there. This is aaawweesooomeeeeeeee :smiley:

3 Likes

libbe is C, and I don't think we need C. snapper is a good thought; I need to look at it more.

snap-pac is an Arch tool that hooks ALPM to take btrfs snapshots. I'm using it for some reference, but I'll need to write my own scripts for managing the datasets.


Expect a repo up soon-ish.

3 Likes

Boot Environments are already an integrated feature of ZFS; that's why I suggest taking advantage of them.

See e.g. https://github.com/evan-king/grub2-zfs-be

beadm is an alternative to bectl, if you’d prefer to stick to scripting.

Either way, there are existing tools to do this, and the “Not Invented Here” approach means that people who already know how to use the existing tools won’t be able to take advantage of their existing knowledge and infrastructure built on them. Fragmentation is not a feature.

1 Like

Thanks for the info. I’ll be looking into this then.

1 Like

(I’m slowly catching up with all the posts in this thread)

I have run Arch on ZFS root before; that's fairly well documented already. And FWIW, I've even shared a pool between FreeBSD and Ubuntu, with each installed to its own boot environment. Just to give an idea of the realm of possibilities here.

Here’s a little brain dump to help people who might not have used boot environments get up to speed.

The convention for boot environments is ${POOL}/ROOT/${BENAME}, where ${POOL}/ROOT has mountpoint=none and canmount=off, and each child dataset has canmount=noauto and mountpoint=/. Currently libbe only supports a single dataset for the root fs, but nested-dataset support is planned (if they didn't already add that while I wasn't paying attention). I think beadm only ever dealt with a single dataset per BE.
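In zfs terms, the convention is just this (using the pool name from the example below):

# the BE container is never mounted itself
zfs create -o mountpoint=none -o canmount=off system/ROOT
# each BE mounts at /, but only when explicitly asked for at boot
zfs create -o mountpoint=/ -o canmount=noauto system/ROOT/default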

Consider the following example pool named system:

NAME                          USED  AVAIL  REFER  MOUNTPOINT
system                        423G  16.7G    96K  none
system/ROOT                  62.8G  16.7G    96K  none
system/ROOT/11.2-RELEASE        8K  16.7G  19.0G  /
system/ROOT/11.2-RELEASE-p1     8K  16.7G  21.6G  /
system/ROOT/11.2-RELEASE-p9     8K  16.7G  43.7G  /
system/ROOT/default          62.8G  16.7G  50.0G  /
system/tmp                   3.20G  16.7G  3.20G  /tmp
system/home                  36.1G  16.7G  36.1G  /home
system/var                   3.64M  16.7G    96K  /var

The system/ROOT/default dataset is the default boot environment for the system. Before a change to the system, a snapshot of the boot environment is taken, and a clone is created from that snapshot to serve as a fallback boot environment in case the change leaves the system in a broken state. The change is then made to the default boot environment. If necessary, the snapshot can be rolled back to undo the changes, or the clone and snapshot can be destroyed if the change was determined to have succeeded.

You can create new datasets in system/ROOT to install a different distro, or even a different OS.

The bootloader ideally should respect the bootfs property of the pool, though I'm not sure if GRUB supports this or if it needs to be manually reconfigured to change the default boot environment. Normally you would leave bootfs as system/ROOT/default and only select a different BE from the boot menu if necessary.
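Done by hand, without bectl, that workflow is roughly this (snapshot and clone names are just examples):

# snapshot the live BE and keep a bootable fallback clone of it
zfs snapshot system/ROOT/default@pre-upgrade
zfs clone -o canmount=noauto -o mountpoint=/ \
    system/ROOT/default@pre-upgrade system/ROOT/pre-upgrade

# ...make the risky change to system/ROOT/default...

# if it worked, drop the fallback:
zfs destroy system/ROOT/pre-upgrade
zfs destroy system/ROOT/default@pre-upgrade

# if it broke, boot the fallback from the menu, or point the pool at it:
zpool set bootfs=system/ROOT/pre-upgrade system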

Here are the synopses of bectl and beadm:
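Roughly, from memory of the man pages (check bectl(8) and beadm(1) for the exact flags):

# subcommand overview only, not the full synopses
bectl  activate | create | destroy | export | import | jail |
       list | mount | rename | unjail | unmount
beadm  activate | create | destroy | list | mount | umount | rename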

This was not the case in my experience. I'm fairly certain it's not a GRUB issue, but one with Linux or systemd or something along those lines. :confused:

Okay, glad to know we’re on the same page here.

This is great information. Something I'm confused about here...

Are you using a normal GRUB2 EFI partition with vfat, or is the boot partition also ZFS?

I was kind of surprised earlier that people had used ZFS as the root partition, as I'd never heard of it being used in this manner, so I'm just wondering if I need even more education here. I've tried reading through the docs and I can't find much about doing this.

Thanks, I'm enjoying learning about this weird use case; it may be something I can test soon, too.

I'm not using GRUB at all here, but that is a system with an EFI partition (typical FAT32 filesystem). GRUB has limited ZFS feature support, and depending on the build you might need a small partition for GRUB to load a ZFS module from; I'm not sure.

Read this: https://github.com/zfsonlinux/zfs/wiki
e.g. https://github.com/zfsonlinux/zfs/wiki/Ubuntu-18.04-Root-on-ZFS

1 Like

Are you using LILO? Something different?

Edit: wanted to say thanks for the info!

If using the ArchZFS repository, do we just go through the installation as normal, and add the repo to /etc/pacman.d or whatever before running pacstrap?

As always, the wiki is super helpful on this :roll_eyes:

I think you're looking for:

  • You need to add the Arch ZFS repository to /etc/pacman.conf , sign its key and install zfs-linux (or zfs-linux-lts if you are running the LTS kernel) within the arch-chroot before you can update the ramdisk with ZFS support.

in

https://wiki.archlinux.org/index.php/Installing_Arch_Linux_on_ZFS
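Concretely, the pacman.conf bit boils down to something like this (same edit whether you do it on the live ISO before pacstrap or inside the arch-chroot; the repo URL is per the archzfs docs, and the signing key ID may have changed, so verify it on archzfs.com first):

# append the archzfs repo to pacman.conf
cat >> /etc/pacman.conf <<'EOF'
[archzfs]
Server = http://archzfs.com/$repo/x86_64
EOF

# import and locally sign the archzfs key (ID current as of this writing)
pacman-key --recv-keys F75D9D76
pacman-key --lsign-key F75D9D76
pacman -Sy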

Uh huh. The GitHub wiki was far more useful. Yet modprobe zfs still fails.

Oh well. It was worth a try.

I’m sorry, but there is no way you guys are just using the Arch wiki

Wiki instructions

Nowhere does it say anything about editing pacman.conf, installing packages, or mounting anything, if you follow it sequentially.

2 Likes

Installation

See ZFS#Installation for installing the ZFS packages. If installing Arch Linux onto ZFS from the archiso, it would be easier to use the archzfs repository.

in https://wiki.archlinux.org/index.php/Installing_Arch_Linux_on_ZFS:

Install from the [Arch User Repository](https://wiki.archlinux.org/index.php/Arch_User_Repository) or the archzfs repository:

When working with the Arch wiki, read from top to bottom.

  • Never jump to a section; this will almost always lead to missing something important.

I am, and none of that is working. Now gcc is failing to install from every mirror because the damn disk is full. Jesus, what a nightmare.

Regardless, with zfs-linux installed:

modprobe zfs
modprobe: FATAL: Module zfs not found in directory /lib/modules/5.0.5-arch1-1-ARCH

Apparently the trick is to have a 64GB installation medium.

2 Likes

Just throwing out an idea here: systemd-boot (formerly Gummiboot). The System76 people use it in combination with their home-grown kernelstub for various functionality in Pop!_OS, like a system recovery mechanism (upgradable) and factory-shipped full-disk encryption. Evidently some of the reasoning is speed and reduced complexity compared with GRUB.

Sorry everyone, I'm tapping out. I've spent a day on this. zfs-dkms is installed, but the zfs module still isn't found.

1 Like

If everything already worked, what would there be to fix? :thinking:

2 Likes