Zfs: Systemd zfs-mount.service requires network.target

Created on 10 May 2019 · 14 comments · Source: openzfs/zfs

System information


Type | Version/Name
--- | ---
Distribution Name | CentOS
Distribution Version | 7
Linux Kernel | 3.10.0-693.21.1.el7.x86_64
Architecture | x86_64
ZFS Version | version: 0.7.9-1
SPL Version | version: 0.7.9-1

Describe the problem you're observing

ZFS filesystems do not auto-mount at reboot on CentOS 7 with systemd unless After=network.target is added to the zfs-mount.service unit file.

Describe how to reproduce the problem

# zfs create <pool>/<filesystem>
# zfs mount <pool>/<filesystem>

# reboot

# mount

After reboot, the zfs filesystems do not mount automatically.

Add After=network.target to /usr/lib/systemd/system/zfs-mount.service and repeat the above steps. After reboot, zfs filesystems will automatically mount.
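
For anyone applying that workaround, a drop-in override keeps the change out of the vendor unit file. This is only a sketch of the standard systemd override mechanism, not part of the original report:

# mkdir -p /etc/systemd/system/zfs-mount.service.d
# cat > /etc/systemd/system/zfs-mount.service.d/override.conf <<'EOF'
[Unit]
After=network.target
EOF
# systemctl daemon-reload

A drop-in survives package upgrades, whereas edits made directly to /usr/lib/systemd/system/zfs-mount.service can be overwritten.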

Include any warning/errors/backtraces from the system logs


I've skimmed through journalctl and can't find any indication of an error. As far as I can tell, zfs-mount.service is a oneshot, which means it runs, does what it can, and then exits cleanly as long as zfs mount -a produced no errors. If there are logs generated elsewhere that are worth looking at, I'm happy to scan them and post any relevant information found.

Question

All 14 comments

The network.target thing seems like a red herring. It's probably working with that because you have delayed it until later in the boot process. It's far more likely that some other dependency is the real issue, as mounting ZFS filesystems does not require the network. Are you able to determine which additional units ran between when zfs-mount.service used to start and when it starts with After=network.target? See, for example: https://serverfault.com/questions/617398/is-there-a-way-to-see-the-execution-tree-of-systemd
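
For reference, the usual systemd tooling for answering that question (a generic sketch, not commands taken from this report):

# systemd-analyze critical-chain zfs-mount.service
# systemd-analyze plot > boot.svg
# systemctl list-dependencies --after zfs-mount.service

critical-chain prints the time-critical chain of units leading up to zfs-mount.service, plot renders the whole boot as an SVG timeline, and list-dependencies --after lists every unit that zfs-mount.service is ordered after.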

Yes, I've run systemd-analyze as part of my debugging. Sorry, I should have included that in the debug info. I have included the output of systemd-analyze critical-chain zfs-mount.service with and without the dependency on network.target below. My initial thought was that the dependency might have been at either paths.target or basic.target but I've attempted to change the dependency to each of the intermediate services and targets, down as far as sysinit.target, but it was not until setting it back to network.target that mounting at boot worked as expected.

With After=network.target

zfs-mount.service +1.739s
└─network.target @18.401s
  └─network.service @6.058s +12.342s
    └─basic.target @6.029s
      └─paths.target @6.029s
        └─brandbot.path @6.029s
          └─sysinit.target @6.028s
            └─systemd-udev-settle.service @2.971s +3.036s
              └─systemd-udev-trigger.service @194ms +2.386s
                └─systemd-udevd-control.socket @162ms
                  └─-.slice

Without After=network.target

zfs-mount.service +5ms
└─systemd-udev-settle.service @3.196s +2.878s
  └─systemd-udev-trigger.service @187ms +2.767s
    └─systemd-udevd-control.socket @165ms
      └─-.slice

I'm surprised that there is no dependency on networking considering that ZFS is able to manage NFS exports.

> I'm surprised that there is no dependency on networking considering that ZFS is able to manage NFS exports.

In terms of NFS, we should depend on remote-fs.target not network.target.

Depending on remote-fs.target is for when you want to ensure that your mounts of remote filesystems have completed. That's not applicable to the ZFS case: here, ZFS is the NFS server, not the NFS client.

Even for the NFS server, I doubt that network.target is appropriate. Why can't ZFS mount filesystems without the network, even if that involves configuring NFS to share them? Put differently, why does modifying the NFS _configuration_ require the network to be up?

Without network.target, what does systemctl list-dependencies zfs-mount.service say? Is it depending on something (scan or cache) to import the pool, and has that worked?

@rlaager

# systemctl list-dependencies zfs-mount.service
zfs-mount.service
● ├─system.slice
● └─zfs-import-cache.service

It looks like it does depend on zfs-import-cache.service but I don't know that I understand your question "and has that worked?". Has what worked?

Did zfs-import-cache succeed and import the pool?
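
One quick way to check that after a boot (a sketch; <pool> is a placeholder, as in the reproduction steps above):

# systemctl status zfs-import-cache.service
# zpool list
# zfs list -r -o name,mounted,mountpoint <pool>

If the pool shows up in zpool list but its datasets report mounted as no, the import succeeded and the problem is isolated to the mount step.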

This is the only information in the logs around ZFS at time of boot. Is there another location I should look or flags I can enable to try to log more verbosely? Looking through the docs I didn't see anything that stuck out.

May  2 10:33:26 xxxxx01 kernel: ZFS: Loaded module v0.7.9-1, ZFS pool version 5000, ZFS filesystem version 5
May  2 10:33:26 xxxxx01 systemd: Started udev Wait for Complete Device Initialization.
May  2 10:33:26 xxxxx01 kernel: type=1130 audit(1556793206.434:68): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-udev-settle comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
May  2 10:33:26 xxxxx01 systemd: Starting Import ZFS pools by cache file...
May  2 10:33:26 xxxxx01 systemd: Starting Mount ZFS filesystems...
May  2 10:33:26 xxxxx01 systemd: Started Mount ZFS filesystems.

Look at the status of the service with systemctl status zfs-mount.service, as this normally shows why a service is not starting properly.
Also check the current logs for anything related to zfs-mount.service and zfs-import-cache.service with journalctl -u zfs-mount.service -b 0 and journalctl -u zfs-import-cache.service -b 0.
To get live log updates you could open another terminal (or use tmux), run journalctl -u zfs-mount.service -b 0 -f, and in the other terminal run systemctl restart zfs-mount.service or zfs mount -a. If there are a lot of error messages, you can run journalctl -u zfs-mount.service -p err -b 0 -f to filter them.

@johnnyjacq16 Here's the output from journalctl as you described. No errors:

# journalctl -u zfs-mount.service -b 0
-- Logs begin at Tue 2018-12-25 23:01:49 GMT, end at Wed 2019-05-15 18:04:03 GMT. --
May 02 10:33:26 xxx.xxx.xxx systemd[1]: Starting Mount ZFS filesystems...
May 02 10:33:26 xxx.xxx.xxx systemd[1]: Started Mount ZFS filesystems.
# journalctl -u zfs-import-cache.service -b 0
-- Logs begin at Tue 2018-12-25 23:01:49 GMT, end at Wed 2019-05-15 18:04:09 GMT. --
May 02 10:33:26 xxx.xxx.xxx systemd[1]: Starting Import ZFS pools by cache file...
May 02 10:33:35 xxx.xxx.xxx systemd[1]: Started Import ZFS pools by cache file.

> To get live log updates you could open another terminal (or use tmux), run journalctl -u zfs-mount.service -b 0 -f, and in the other terminal run systemctl restart zfs-mount.service or zfs mount -a. If there are a lot of error messages, you can run journalctl -u zfs-mount.service -p err -b 0 -f to filter them.

Unfortunately, this issue does not reproduce unless the mount is attempted during the boot process. If there's an option I can pass to the daemon or the unit file to produce more verbose logging, perhaps that would yield more information?

@k4k
Instead of making changes to the /usr/lib/systemd/system/zfs-mount.service file by adding After=network.target manually, you could let systemd manage the change for you with systemctl edit --full zfs-mount.service and systemctl edit --full zfs-import-cache.service, at which point you can add the After=network.target line. Since this issue affects the boot process, it may also be a good idea to update the initramfs, as sketched below.
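
As a concrete sketch of that on CentOS 7, where dracut is the stock initramfs tool (inside the editor, the addition is the same After=network.target line under [Unit]):

# systemctl edit --full zfs-mount.service
# systemctl edit --full zfs-import-cache.service
# dracut -f
# reboot

systemctl edit --full saves the edited copy to /etc/systemd/system and reloads systemd when the editor exits, so it takes precedence over the unit shipped in /usr/lib/systemd/system, and dracut -f rebuilds the initramfs for the running kernel.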

To get debug information at early boot, power down the machine and then start it up. When the GRUB menu appears and your selected kernel is highlighted, press e and add systemd.debug-shell=1 systemd.log_level=debug systemd.log_target=kmsg log_buf_len=1M printk.devkmsg=on to the kernel parameters, then press F10 to boot with those changes. You should arrive at a shell, where you can run the journalctl commands for more debug information. Important: reboot the machine after viewing the debug information so that your system reverts to its previous state, since leaving the debug shell enabled is a security risk.
Refer to this page for more information about systemd debugging: https://freedesktop.org/wiki/Software/systemd/Debugging/
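
Collected in one place, the additions to the kernel line in the GRUB editor, followed by a couple of journalctl calls to read the captured output after that boot, would look roughly like this (a sketch; the exact kernel line varies per system):

systemd.debug-shell=1 systemd.log_level=debug systemd.log_target=kmsg log_buf_len=1M printk.devkmsg=on

# journalctl -b 0 -u zfs-import-cache.service -u zfs-mount.service
# journalctl -b 0 -k -p debug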

I've continued to debug this issue on my end, on and off, over the last few weeks, and here is what we've uncovered. It looks like the suspicion that network.target was required for zfs-mount.service was, in fact, a red herring. Adding network.target as a dependency of zfs-mount.service pushed its startup until later in the boot process, which allowed the ZFS filesystems to mount successfully, but that was only coincidental and had nothing to do with the actual networking services.

It looks like there were some substantial changes to the unit files in zfs-0.8.0 that I would like to test. We're working on incorporating the CentOS 7.6 repositories into our environment at this time. Once I've had a chance to test those packages I'll report back with any further problems or, hopefully, a report that the update resolved this issue.

Looking at your output from the following:
$ systemd-analyze critical-chain zfs-mount.service

zfs-mount.service +5ms
└─systemd-udev-settle.service @3.196s +2.878s
  └─systemd-udev-trigger.service @187ms +2.767s
    └─systemd-udevd-control.socket @165ms
      └─-.slice

@k4k Is zfs-mount.service not waiting on zfs-import.target and zfs-import-cache.service? I really expect zfs-import to be in that critical-chain. It is almost like your system has zfs-mount Wants=zfs-import-cache.service rather than After=.... The fix may be as simple as enabling zfs-import.target. zfs-mount.service needs to be After the import.
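
If that is what is happening, a minimal sketch of the suggested fix, assuming the unit files shipped with this ZFS version provide zfs-import.target as described above:

# systemctl enable zfs-import.target
# systemctl list-dependencies --after zfs-mount.service
# systemd-analyze critical-chain zfs-mount.service

After enabling the target and rebooting, zfs-import.target should appear in both outputs, ordered between zfs-import-cache.service and zfs-mount.service, as in the reference output below.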

For reference on my system:

$ systemd-analyze critical-chain zfs-mount.service
...
zfs-mount.service +140ms
└─zfs-import.target @3.884s
  └─zfs-import-cache.service @2.494s +1.389s
    └─systemd-udev-settle.service @222ms +2.268s
      └─systemd-udev-trigger.service @160ms +60ms
        └─systemd-udevd-control.socket @158ms
          └─-.mount @128ms

and

$ systemctl list-dependencies --after zfs-mount.service
zfs-mount.service
● ├─system.slice
● ├─systemd-journald.socket
● └─zfs-import.target
●   └─zfs-import-cache.service

I'm closing. We can reopen if further testing reconfirms this.

