See the following:
# systemctl status systemd-tmpfiles-setup.service
● systemd-tmpfiles-setup.service - Create Volatile Files and Directories
   Loaded: loaded (/nix/store/6j1c354bzz70ylbh4cd0c3x165ic6hh2-systemd-212/example/systemd/system/systemd-tmpfiles-setup.service)
   Active: active (exited) since Tue 2014-11-04 04:04:08 CET; 19min ago

# systemctl status postgresql.service
● postgresql.service - PostgreSQL Server
   Loaded: loaded (/nix/store/q5rk80hmxy57ndl4zh0lq50bw3bcwza8-unit-postgresql.service/postgresql.service)
   Active: active (running) since Tue 2014-11-04 04:04:06 CET; 20min ago
As you can see, PostgreSQL was started (04:04:06) _before_ systemd-tmpfiles-setup.service had finished (04:04:08).
As a result, PostgreSQL's Unix socket at /tmp/.s.PGSQL.5432 was deleted, and clients can no longer connect to it.
This is probably caused by this NixOS-specific patch: https://github.com/edolstra/systemd/commit/5f47c0ee21e46975115a885e666d2cb68f3e32e7
How should this be fixed?
I suppose the patch is now at https://github.com/NixOS/systemd/commit/5f47c0ee21e46975115a885e666d2cb68f3e32e7
From what I've read, systemd doesn't really allow services to run only at shutdown. One thing we could do, however, is run the cleanup in ExecStop instead, which has the desired effect. Would that be an acceptable workaround?
This also likely causes https://github.com/NixOS/nixpkgs/issues/12132#issuecomment-171284532; see also https://github.com/NixOS/nixpkgs/commit/b292e19fbddaddd4ada813965e3be3880f15fa7e (I've had no problems since I disabled boot.cleanTmpDir). I want to look at this issue later and implement a simple oneshot service that cleans up /tmp in ExecStop, if no one does so before me. @edolstra, would that be an acceptable fix for the problem?
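For reference, a minimal sketch of what such a oneshot service might look like. The unit name `clean-tmp-on-shutdown` is hypothetical, and the sketch assumes that deleting everything under /tmp when the unit is stopped is acceptable; it is untested and not necessarily the fix that would be merged:

```nix
{ pkgs, ... }:
{
  # Hypothetical unit name; nothing with this name exists in NixOS.
  systemd.services.clean-tmp-on-shutdown = {
    description = "Clean /tmp when the system shuts down (sketch)";
    wantedBy = [ "multi-user.target" ];
    serviceConfig = {
      Type = "oneshot";
      # Keep the unit "active" after ExecStart exits so that ExecStop
      # runs when the unit is stopped, e.g. at shutdown.
      RemainAfterExit = true;
      # Nothing to do at start; the real work happens in ExecStop.
      ExecStart = "${pkgs.coreutils}/bin/true";
      # Assumption: wiping all entries below /tmp (same filesystem only)
      # is the desired cleanup.
      ExecStop = "${pkgs.findutils}/bin/find /tmp -mindepth 1 -xdev -delete";
    };
  };
}
```

Note that systemd stops units in the reverse of their start order, so the ordering relative to services that still hold sockets in /tmp would need some thought.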
Probably the best fix is to drop that patch (https://github.com/NixOS/systemd/commit/5f47c0ee21e46975115a885e666d2cb68f3e32e7) and find some other way to fix the nixops send-keys issue (assuming it even exists anymore).
Whom can I ping for nixops knowledge? I noticed /tmp had grown over the top today and also got a report from another victim of cleanTmpDir.
@edolstra: you self-assigned – are you (still) working on this?
Not right now.
If we don't fix this in time for 16.09, maybe we want to remove this option or add a warning somewhere?
@domenkozar is it better to add a prominent warning to the documentation, or to just remove the option?
@abbradar I would prefer this option not to be removed, since I am using it and I have no plans to use nixops send-keys.
To prevent this bug, I am adding the following config options to configuration.nix:
{
  systemd.services = {
    systemd-tmpfiles-setup.before = [ "sysinit.target" ];
    systemd-update-utmp.after = [ "systemd-tmpfiles-setup.service" ];
  };
}
@wizeman Awesome; maybe we want to add this conditionally when boot.cleanTmpDir is enabled to work around the situation?
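A sketch of how the conditional variant could look, assuming the two ordering overrides quoted above are the whole workaround. `lib.mkIf` is the standard NixOS module helper; where this would live in nixpkgs is left open:

```nix
{ config, lib, ... }:
{
  # Apply the ordering overrides only when /tmp cleaning is enabled.
  systemd.services = lib.mkIf config.boot.cleanTmpDir {
    # Finish creating/cleaning volatile files before sysinit.target is reached.
    systemd-tmpfiles-setup.before = [ "sysinit.target" ];
    # Start the utmp update only after tmpfiles setup has run.
    systemd-update-utmp.after = [ "systemd-tmpfiles-setup.service" ];
  };
}
```

With boot.cleanTmpDir unset, the module contributes nothing and the patched ordering stays as it is.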
On second thought, we can just move away from systemd-tmpfiles; that should always work.
I have a draft of a possible fix here: https://github.com/rvl/nixpkgs/commit/70383fc3d18dd3247e431ae15744dbca6021ce53. I think we all agree, though, that it would be preferable to fix NixOps so that the NixOS systemd patch can be removed.
However, I can't reproduce the problem. Here is my deployment:
{
  network.description = "Test #4825";
  testserver =
    { config, pkgs, ... }:
    {
      deployment.targetEnv = "virtualbox";
      deployment.virtualbox.memorySize = 1024; # megabytes
      deployment.virtualbox.headless = true;
      virtualisation.virtualbox.guest.enable = true;
      # deploy the test key
      deployment.keys.secret.text = "secret";
      # uses systemd-tmpfiles
      boot.cleanTmpDir = true;
      # reverses nixos patch to systemd
      systemd.services = {
        systemd-tmpfiles-setup.before = [ "sysinit.target" ];
        systemd-update-utmp.after = [ "systemd-tmpfiles-setup.service" ];
      };
    };
}
After deploying this and ssh'ing into testserver, I see that the test key is correctly installed. What would I need to do to trigger the nixops send-keys problem? @edolstra
I think the problem occurs when you use the send-keys feature to send a LUKS key for an encrypted filesystem. Then you get a cycle in the unit dependency graph (since sshd depends on local-fs.target via sysinit.target, but the filesystems depend on sshd via keys.target).
However, we can probably just drop the systemd patch and, if necessary, relax the sshd dependencies in NixOps when any keys are specified.
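One way "relaxing the sshd dependencies" might be expressed is sketched below. It assumes that disabling sshd's default dependencies (which include the implicit ordering after sysinit.target) is enough to break the cycle; that is an untested assumption, and this is not what NixOps actually does:

```nix
{ config, lib, ... }:
{
  # Only touch sshd when the deployment actually defines keys.
  systemd.services.sshd.unitConfig = lib.mkIf (config.deployment.keys != { }) {
    # Assumption: dropping the implicit After=sysinit.target/basic.target
    # ordering lets sshd start before the encrypted filesystems are mounted.
    DefaultDependencies = false;
  };
}
```

Disabling default dependencies also removes the implicit shutdown ordering, so a real change would probably have to add that back explicitly.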
Is this issue still relevant?
AFAICS the buggy systemd patch is still present in NixOS 19.03 but no longer in the master branch, so this should be fixed in master but not in the stable branch.
That said, I don't know whether the nixops send-keys feature works correctly in master or if it's broken as a result of not having the systemd patch.
The patch seems to have been dropped in the 239 -> 242 systemd upgrade on purpose and it seems the nixops autoLuks option doesn't currently work on master. See issues #47550 and #62211.
Unless we are considering backporting the fix, it's probably safe to close this issue.
Awesome, I can finally drop my out-of-tree patches for this :3 I don't think we should backport this, though; people may rely on autoLuks on stable.