I’ve set up a backup box with Fedora 39 and OpenZFS, created a natively encrypted pool, and pulled some data from my main server. Everything worked until I rebooted to verify that all services come up as configured, and I was met with a failed zfs-mount.service. This was not the first reboot since setting up the pool.
The mount consistently fails with an Invalid argument error, but only on the dataset I use to hold my pulled backups. It did mount before. There were no updates in the meantime; this is the stock kernel from Fedora 39, and zfs-dkms was built once at installation.
root@tao:~# zfs --version
zfs-2.2.2-1
zfs-kmod-2.2.2-1
root@tao:~# uname -a
Linux tao 6.5.6-300.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Oct 6 19:57:21 UTC 2023 x86_64 GNU/Linux
Backups were synced in raw mode with syncoid and are encrypted with a separate key. They are children of the tao/syncoid/papacamayo dataset, and tao/syncoid is the one that fails to mount.
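The exact invocation isn’t in the thread; a raw pull with syncoid typically looks something like this (the source host and dataset names here are placeholders, not the real ones):

```shell
# Hypothetical reconstruction: --sendoptions=w makes syncoid use `zfs send -w`
# (raw send), so the encrypted blocks are replicated as-is with their own key.
# "root@papacamayo:tank/data" is a placeholder for the real source dataset.
syncoid --recursive --sendoptions=w \
    root@papacamayo:tank/data tao/syncoid/papacamayo
```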
root@tao:~# zfs list
NAME USED AVAIL REFER MOUNTPOINT
tao 4.04T 373G 232K /tao
tao/remote 272K 373G 192K /tao/remote
tao/syncoid 4.04T 373G 200K /tao/syncoid
tao/syncoid/papacamayo 4.04T 373G 192K /tao/syncoid/papacamayo
(list of datasets from the main server)
The top-level and sibling datasets did mount automatically via zfs-mount.service.
root@tao:~# zfs mount
tao /tao
tao/remote /tao/remote
root@tao:~# zfs get mountpoint tao/syncoid
NAME PROPERTY VALUE SOURCE
tao/syncoid mountpoint /tao/syncoid default
root@tao:~# ls -alsh /tao/syncoid/
total 17K
8.5K drwxr-xr-x. 2 root root 2 Feb 13 02:05 .
8.5K drwxr-xr-x 4 root root 4 Feb 14 20:53 ..
root@tao:~# zfs mount tao/syncoid
cannot mount 'tao/syncoid': Invalid argument
root@tao:~# zfs get canmount tao/syncoid
NAME PROPERTY VALUE SOURCE
tao/syncoid canmount on default
I started a scrub; no errors so far. I previously ran another scrub after replacing a failing drive in one of the vdevs, and there were no errors then either.
What is the problem here and how can I fix it?
Perhaps zfs mount -a? Wait: while the dataset is not mounted, before mounting it, try deleting the empty syncoid folder if it exists, then mount the dataset, then take ownership of the folder if you need to.
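In shell terms, the suggestion amounts to something like this (dataset name from the thread; assuming the leftover directory is empty so rmdir works):

```shell
# Make sure the dataset is not mounted, then clear any stale directory
# that may be occupying the mountpoint before trying to mount again.
zfs unmount tao/syncoid 2>/dev/null || true
rmdir /tao/syncoid            # only succeeds if the leftover dir is empty
zfs mount tao/syncoid
chown root:root /tao/syncoid  # adjust ownership afterwards if needed
```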
Unfortunately that’s not the problem: /tao is already empty.
root@tao:/# zfs unmount tao
root@tao:/# ll /tao/
total 0
root@tao:/# zfs mount -a
cannot mount 'tao/syncoid': Invalid argument
root@tao:/# ll /tao/
total 17
drwxr-xr-x 2 root root 2 Feb 14 20:51 remote
drwxr-xr-x 2 root root 2 Feb 17 23:53 syncoid
root@tao:/# ll /tao/syncoid/
total 0
And you are sure there is not a folder named syncoid inside the tao dataset, occupying the mountpoint?
I’ve checked; the directory /tao/syncoid only appears when I run zfs mount -a or just zfs mount tao. Note that the dataset tao/remote does mount; it’s just empty on the pool.
root@tao:/# zfs unmount -a
root@tao:/# ll tao/
total 0
root@tao:/# ls -alsh tao
total 0
0 drwxr-xr-x. 1 root root 0 Feb 13 00:47 .
0 dr-xr-xr-x. 1 root root 178 Feb 15 13:39 ..
root@tao:/# zfs mount tao
root@tao:/# ls -alsh tao
total 26K
8.5K drwxr-xr-x 4 root root 4 Feb 18 00:09 .
0 dr-xr-xr-x. 1 root root 178 Feb 15 13:39 ..
8.5K drwxr-xr-x. 2 root root 2 Feb 14 20:52 remote
8.5K drwxr-xr-x 2 root root 2 Feb 18 00:09 syncoid
root@tao:/# ls -alsh tao/*
tao/remote:
total 17K
8.5K drwxr-xr-x. 2 root root 2 Feb 14 20:52 .
8.5K drwxr-xr-x 4 root root 4 Feb 18 00:09 ..
tao/syncoid:
total 17K
8.5K drwxr-xr-x 2 root root 2 Feb 18 00:09 .
8.5K drwxr-xr-x 4 root root 4 Feb 18 00:09 ..
root@tao:/# zfs mount tao/remote
root@tao:/# ls -alsh tao/*
tao/remote:
total 17K
8.5K drwxr-xr-x 2 root root 2 Feb 14 20:51 .
8.5K drwxr-xr-x 4 root root 4 Feb 18 00:09 ..
tao/syncoid:
total 17K
8.5K drwxr-xr-x 2 root root 2 Feb 18 00:09 .
8.5K drwxr-xr-x 4 root root 4 Feb 18 00:09 ..
root@tao:/# zfs mount tao/syncoid
cannot mount 'tao/syncoid': Invalid argument
root@tao:/# ls -alsh tao/*
tao/remote:
total 17K
8.5K drwxr-xr-x 2 root root 2 Feb 14 20:51 .
8.5K drwxr-xr-x 4 root root 4 Feb 18 00:09 ..
tao/syncoid:
total 17K
8.5K drwxr-xr-x 2 root root 2 Feb 18 00:09 .
8.5K drwxr-xr-x 4 root root 4 Feb 18 00:09 ..
can you change the syncoid mountpoint slightly, just for testing?
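Something along these lines (the alternate path is arbitrary):

```shell
# Try the dataset on a fresh path to rule out the original mountpoint.
zfs set mountpoint=/tao/syncoid-test tao/syncoid
zfs mount tao/syncoid
# Revert to the inherited/default mountpoint afterwards:
zfs inherit mountpoint tao/syncoid
```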
I experimented a bit in the meantime; apparently tao/syncoid and tao/syncoid/papacamayo got cursed on this cursed server. I can almost hear the Cenobites ringing their bell. I moved the datasets around and everything is mountable now.
root@tao:/# zfs create tao/test
root@tao:/# zfs rename tao/syncoid/papacamayo tao/test/papacamayo
root@tao:/# zfs list
NAME USED AVAIL REFER MOUNTPOINT
tao 4.04T 373G 228K /tao
tao/remote 272K 373G 192K /tao/remote
tao/syncoid 200K 373G 200K /tao/syncoid
tao/test 4.04T 373G 192K /tao/test
tao/test/papacamayo 4.04T 373G 192K /tao/test/papacamayo
tao/test/papacamayo/hdd 4.04T 373G 340K /tao/test/papacamayo/hdd
(list of children datasets of hdd)
root@tao:/# zfs mount -a
cannot mount 'tao/syncoid': Invalid argument
cannot mount 'tao/test/papacamayo': Invalid argument
root@tao:/# zfs unmount tao/test
root@tao:/# zfs mount tao/test
root@tao:/# zfs create tao/test2
root@tao:/# zfs rename tao/test/papacamayo/hdd tao/test2/hdd
root@tao:/# zfs unmount -a
root@tao:/# zfs mount -a
cannot mount 'tao/test/papacamayo': Invalid argument
cannot mount 'tao/syncoid': Invalid argument
root@tao:/# zfs destroy tao/syncoid
root@tao:/# zfs list
NAME USED AVAIL REFER MOUNTPOINT
tao 4.04T 373G 268K /tao
tao/remote 272K 373G 192K /tao/remote
tao/test 392K 373G 200K /tao/test
tao/test/papacamayo 192K 373G 192K /tao/test/papacamayo
tao/test2 4.04T 373G 192K /tao/test2
tao/test2/hdd 4.04T 373G 340K /tao/test2/hdd
(list of children datasets of hdd)
root@tao:/# zfs destroy -r tao/test
root@tao:/# zfs list
NAME USED AVAIL REFER MOUNTPOINT
tao 4.04T 373G 252K /tao
tao/remote 272K 373G 192K /tao/remote
tao/test2 4.04T 373G 192K /tao/test2
tao/test2/hdd 4.04T 373G 340K /tao/test2/hdd
(list of children datasets of hdd)
root@tao:/# zfs unmount -a
root@tao:/# zfs mount -a
root@tao:/#
I’m still not sure what happened, but at least I know how to recover from this.
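Condensed, the recovery from the transcript above was: create fresh datasets, rename the affected children into them, destroy the unmountable parents, and remount.

```shell
# Move the data out of the "cursed" datasets into freshly created ones,
# then destroy the datasets that refuse to mount. Renames preserve
# raw/encrypted children.
zfs create tao/test2
zfs rename tao/test/papacamayo/hdd tao/test2/hdd
zfs destroy tao/syncoid
zfs destroy -r tao/test
zfs unmount -a && zfs mount -a
```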
EDIT: Thank you for the time you spent helping me.
dunno what was wrong, but well done sticking with it, and not just giving up or whatever
This post documents what I already went through while building this box: Cursed server build vent
I will not allow this cursed piece of circuitry from hell to defeat me!
To those interested: I’ve nailed down what’s causing the problem.
All datasets created from cockpit-zfs-manager have the same problem. Calling zfs create directly is fine, so the issue is within the dataset itself, not related to the mountpoint. It also appears only with a specific combination of kernel/zfs/cockpit-zfs-manager versions.
I’ll try to figure out what this plugin is doing under the hood and report the problem on the plugin’s GitHub.
I always disliked SELinux. The difference now is that I hate it with a passion beyond human understanding.
It was the freaking SELinux. The freaking SELinux that was supposedly disabled! Apparently SELINUX=disabled in /etc/selinux/config is not equivalent to actually disabling it in the kernel. That cursed piece of software keeps running in the kernel, just with no policy loaded. And even with no policy loaded, it still interferes with the system.
The real way to disable it is to pass selinux=0 on the kernel command line: grubby --update-kernel ALL --args selinux=0. Did that, rebooted, everything mounts.
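For anyone hitting the same thing, this is roughly how to apply and verify the fix (assuming stock Fedora tooling: grubby, sestatus):

```shell
# Check whether the kernel was booted with selinux=0:
grep -o 'selinux=[01]' /proc/cmdline || echo "no selinux= boot parameter"
# Add selinux=0 to every installed kernel entry:
grubby --update-kernel ALL --args selinux=0
# After a reboot, sestatus should report "SELinux status: disabled".
sestatus
```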