Nixpkgs: Services depending on `keys.target` can cause hanging boots on NixOS containers

Created on 22 Aug 2019 · 5 Comments · Source: NixOS/nixpkgs

Describe the bug

When starting an imperative NixOS container that was deployed with the container backend from NixOps, with several secrets uploaded using the deployment.keys module and a dovecot2 unit from services.dovecot installed, the boot times out and the container fails to start. It waits indefinitely for keys.target, a systemd target that indicates whether all keys from NixOps were uploaded successfully.

This happens because several modules from nixpkgs (including dovecot) wait for keys.target by default but are wanted by multi-user.target, so the system waits until the unit has started (which is only supposed to happen after keys.target is reached).
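To illustrate the dependency shape described above, here is a minimal sketch of what such a unit definition looks like. It is inferred from the description in this issue and is not a copy of the dovecot module's actual source; the exact lists in the module may differ.

```nix
# Sketch only: illustrates the dependency shape, not the literal module code.
{ ... }: {
  systemd.services.dovecot2 = {
    # Started as part of the normal boot, so multi-user.target waits for it.
    wantedBy = [ "multi-user.target" ];
    # Pulls in keys.target and orders the service after it. If the keys are
    # never uploaded, keys.target is never reached and the service never starts.
    wants = [ "keys.target" ];
    after = [ "keys.target" "network.target" ];
  };
}
```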

The problem with NixOS containers is that, when using scripted networking, they don't have a proper uplink until container@<name>.service has completely started, as the ve-<name> interface on the host side is configured after the container has started up: https://github.com/NixOS/nixpkgs/blob/578d712af46c7569f6c7c02a0a7a1ca51a6b6d89/nixos/modules/virtualisation/containers.nix#L178-L235

Since the container is unreachable until start-up is done, it's impossible to send keys to it during an unattended reboot, so keys.target is never reached and the system keeps waiting for dovecot2.service, which currently depends on keys.target. The timeout of dovecot2 keeps the container from starting up properly.

In my case the issue wouldn't exist if dovecot2.service didn't depend on keys.target as I only deploy secrets for services.borgbackup currently, so it's completely unnecessary for dovecot2.service to wait for that target.

My current workaround looks like this:

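{ lib, ... }: {
  systemd.services.dovecot2 = {
    # Override the module's default wants so keys.target is no longer pulled in.
    wants = lib.mkForce [ ];
    # Keep only the ordering after network.target.
    after = lib.mkForce [ "network.target" ];
  };
}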

To Reproduce
Steps to reproduce the behavior:

  1. Deploy a container with deployment.targetEnv = "container";
  2. Deploy several secrets with deployment.keys and a dovecot instance using services.dovecot
  3. Try to reboot the container
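
The steps above could be expressed as a NixOps network file roughly like the following. This is a hedged sketch: the machine name `mail` and the key name `example-secret` are hypothetical, and `services.dovecot2.enable` is assumed to be what "services.dovecot" refers to here.

```nix
# Sketch of a reproduction setup; names are placeholders, not from the report.
{
  network.description = "keys.target boot hang reproduction (sketch)";

  # "mail" is a hypothetical machine name.
  mail = { ... }: {
    # Step 1: deploy the machine as a NixOS container.
    deployment.targetEnv = "container";

    # Step 2: upload at least one secret via deployment.keys ...
    deployment.keys.example-secret.text = "placeholder, not a real secret";

    # ... and enable a dovecot instance.
    services.dovecot2.enable = true;
  };
}
```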

Expected behavior

I originally expected that no service would wait for the keys on its own unless explicitly configured to do so, as the NixOps manual recommends using the <key-name>-key.service units and adding them explicitly to the units in question.
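
As a rough illustration of that explicit approach, a service that needs a specific key can depend on that key's unit instead of on keys.target. In this sketch the key name `example-secret` and the service `example-consumer` are hypothetical, and the key path assumes the default /run/keys destination.

```nix
# Sketch only: an explicit per-key dependency instead of keys.target.
{ ... }: {
  # Hypothetical key; NixOps provides "example-secret-key.service" for it.
  deployment.keys.example-secret.text = "placeholder, not a real secret";

  # Hypothetical service that actually needs the key.
  systemd.services.example-consumer = {
    wantedBy = [ "multi-user.target" ];
    # Wait only for the one key this service uses.
    wants = [ "example-secret-key.service" ];
    after = [ "example-secret-key.service" ];
    # Assumes the default key destination directory /run/keys.
    script = ''
      test -r /run/keys/example-secret
    '';
  };
}
```

With this pattern, services that use no secrets (such as dovecot2 in this report) are not delayed by keys they never read.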

However one might argue as well that the actual issue is the broken uplink for NixOS containers at boot, so I'd like to gather some opinions before filing a patch :)

Maintainer information:

# a list of nixpkgs attributes affected by the problem
attribute:
# a list of nixos modules affected by the problem
module:
  - systemd
  - services.dovecot
  - services.httpd
  - services.nsd
  - services.strongswan
  - services.strongswan-swanctl

CCing @edolstra @hrdinka (for dovecot2) and @lheckemann (as we talked about this earlier that day)

bug nixos-container nixos

All 5 comments

Hi,

Thanks for the detailed write-up. I added the keys.target dependency to dovecot a while back. My problem then was that dovecot (and with it the whole system) would not start because the key files were missing. In fact, every service could possibly depend on a key deployed by NixOps. Therefore a solution that fixes this problem for all services would be preferable.

While it would be great to have a proper replacement, finding one isn't easy for the reasons described above. Since NixOps now covers this in its documentation (it didn't back then), I am in favor of removing the keys.target dependency from all services. We should, however, add this to the NixOS release notes and wait for NixOS 19.09 rather than porting this to stable.

Absolutely agree that this shouldn't go into 19.03, but yeah I'm also in favour of making the change on master before the feature freeze (7th September) :)

Thanks for the feedback! I'll open a PR tomorrow which removes the dependencies on keys.target from the modules in <nixpkgs/nixos>.

However I'd keep this issue open after that until we've discussed whether keys.target should be declared in a module in NixOps.

The actual issue has been fixed for 19.09 already, so this should be closable now.

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/issues-using-nixos-container-to-set-up-an-etcd-cluster/8438/2
