During my recent homelab migration to Proxmox I came across a bug that will prevent all your ZFS mount points from mounting at boot, and it's an even bigger pain in the ass if you host containers in those folders.
Cause of the problem: when you use a zpool other than the default rpool and set up a directory mount for PVE to use as an ISO datastore, vzdump target, etc., then if the ZFS mount points have not finished mounting at boot time, Proxmox will attempt to create the directory path structure itself.
The problem with creating a directory before the filesystem is mounted is that when zfs-mount.service runs and attempts to mount the ZFS mount points, you will get errors like these:
root@pve:~# systemctl status zfs-mount.service
● zfs-mount.service - Mount ZFS filesystems
Loaded: loaded (/lib/systemd/system/zfs-mount.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Fri 2017-06-30 18:10:21 PDT; 21s ago
Process: 6590 ExecStart=/sbin/zfs mount -a (code=exited, status=1/FAILURE)
Main PID: 6590 (code=exited, status=1/FAILURE)
Jun 30 18:10:19 pve systemd: Starting Mount ZFS filesystems...
Jun 30 18:10:20 pve zfs: cannot mount '/gdata/pve/subvol-102-disk-1': directory is not empty
Jun 30 18:10:20 pve zfs: cannot mount '/gdata/pve/subvol-106-disk-1': directory is not empty
Jun 30 18:10:20 pve zfs: cannot mount '/gdata/pve/subvol-109-disk-1': directory is not empty
Jun 30 18:10:21 pve systemd: zfs-mount.service: Main process exited, code=exited, status=1/FAILURE
Jun 30 18:10:21 pve systemd: Failed to start Mount ZFS filesystems.
Jun 30 18:10:21 pve systemd: zfs-mount.service: Unit entered failed state.
Jun 30 18:10:21 pve systemd: zfs-mount.service: Failed with result 'exit-code'.
Fixing the root of the problem: change how Proxmox deals with these mounts by editing /etc/pve/storage.cfg – you need to add “mkdir 0” and “is_mountpoint” to the directory storage definition. Example:
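A minimal sketch of such an entry (the storage name, path, and content types here are placeholders; adapt them to your own setup – the important lines are `mkdir 0`, which stops PVE from pre-creating the path, and `is_mountpoint 1`, which tells PVE the path is a mount point and must not be used until it is mounted):

```
dir: gdata-store
	path /gdata/pve
	content iso,vztmpl,backup
	mkdir 0
	is_mountpoint 1
```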
Now we need to do some system cleanup before we reboot and confirm the problem is fixed.
Let’s check which mount points have failed:
root@pve:~# zfs list -r -o name,mountpoint,mounted
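The output will look something like this (a hypothetical excerpt using the datasets from the errors above; the MOUNTED column is what you care about):

```
NAME                          MOUNTPOINT                     MOUNTED
gdata                         /gdata                         yes
gdata/pve/subvol-102-disk-1   /gdata/pve/subvol-102-disk-1   no
gdata/pve/subvol-106-disk-1   /gdata/pve/subvol-106-disk-1   no
gdata/pve/subvol-109-disk-1   /gdata/pve/subvol-109-disk-1   no
```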
Now let’s unmount all ZFS mount points (except rpool, of course – assuming the rootfs is on ZFS):
# zfs umount -a
After making sure the ZFS mount points are unmounted, we can delete the leftover empty folders. Recall the failed mount points from the zfs list output and delete them one by one like so:
# rm -rf /gdata/pve/subvol-102-disk-1
Do this for each folder that failed to mount. You then have a choice: remount everything with zfs mount -O -a, or better, reboot the system and confirm it's fixed. I like the latter better, so reboot.
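If several mount points failed, the cleanup can be scripted. Here is a sketch (the paths are the ones from the errors above; it uses rmdir instead of rm -rf because rmdir refuses to delete a non-empty directory, which would mean real data that deserves a manual look):

```shell
#!/bin/sh
# Remove leftover directories that block `zfs mount -a`.
# rmdir only deletes EMPTY directories, so anything holding real
# data is skipped instead of destroyed. Adjust paths to your system.
for mp in /gdata/pve/subvol-102-disk-1 \
          /gdata/pve/subvol-106-disk-1 \
          /gdata/pve/subvol-109-disk-1; do
  if rmdir "$mp" 2>/dev/null; then
    echo "removed $mp"
  else
    echo "skipped $mp (missing or not empty)"
  fi
done
```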
After it boots back up, check that the service was able to mount ZFS without issues:
# systemctl status zfs-mount.service
# zfs list -r -o name,mountpoint,mounted
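To spot any dataset that still failed to mount at a glance, you can filter on the mounted column. A sketch (the printf line just mimics the tab-separated output of `zfs list -H -o name,mounted` so the filter can be shown standalone; on the real host, pipe the zfs command into awk instead):

```shell
# Sample data standing in for `zfs list -H -o name,mounted`;
# replace the printf with the real zfs command on your host.
printf 'rpool\tyes\ngdata/pve/subvol-102-disk-1\tno\n' \
  | awk -F'\t' '$2 == "no" {print $1}'
# prints only the datasets that are not mounted
```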
That’s all folks… if you made the edit to storage.cfg and added the two options, this should not happen again. It was an annoying bug to deal with, but it's good to have found a cleaner solution than a startup script doing dirty tricks!