Nixpkgs: `boot.initrd.network.ssh.hostRSAKey` breaks activation if removed

Created on 25 Jan 2018 · 8 Comments · Source: NixOS/nixpkgs

Issue description

After first setting hostRSAKey and rebuilding the system, if the file is subsequently removed (and the setting commented out) then activation will fail.

It appears that all generations use the same initrd file instead of each getting its own, even when their initrds should differ. My best guess is that hostRSAKey is not included in the hash.

personal> closures copied successfully
saya...> cp: cannot stat '/run/keys/hostRSAKey': No such file or directory
saya...> Traceback (most recent call last):
saya...>   File "/nix/store/c5bnfxl43j0f5lfivg2pgrczvl7vh9iv-systemd-boot-builder.py", line 210, in <module>
saya...>     main()
saya...>   File "/nix/store/c5bnfxl43j0f5lfivg2pgrczvl7vh9iv-systemd-boot-builder.py", line 197, in main
saya...>     write_entry(*gen, machine_id)
saya...>   File "/nix/store/c5bnfxl43j0f5lfivg2pgrczvl7vh9iv-systemd-boot-builder.py", line 85, in write_entry
saya...>     subprocess.check_call([append_initrd_secrets, "/boot%s" % (initrd)])
saya...>   File "/nix/store/53dyjh7xjhnbibqllr7j27lk2h98n7j7-python3-3.6.4/lib/python3.6/subprocess.py", line 291, in check_call
saya...>     raise CalledProcessError(retcode, cmd)
saya...> subprocess.CalledProcessError: Command '['/nix/store/0xfvmgbafj9xxzzvba2pckd1w0i83qrs-append-initrd-secrets/bin/append-initrd-secrets', '/boot/efi/nixos/7k38fm34cq6xrca4nxb10zz2hk191zp1-initrd-initrd.efi']' returned non-zero exit status 1.

$ grep -r 7k38fm /boot
/boot/loader/entries/nixos-generation-66.conf:initrd /efi/nixos/7k38fm34cq6xrca4nxb10zz2hk191zp1-initrd-initrd.efi
/boot/loader/entries/nixos-generation-67.conf:initrd /efi/nixos/7k38fm34cq6xrca4nxb10zz2hk191zp1-initrd-initrd.efi
/boot/loader/entries/nixos-generation-68.conf:initrd /efi/nixos/7k38fm34cq6xrca4nxb10zz2hk191zp1-initrd-initrd.efi
/boot/loader/entries/nixos-generation-69.conf:initrd /efi/nixos/7k38fm34cq6xrca4nxb10zz2hk191zp1-initrd-initrd.efi
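
The grep above shows several generations pointing at one initrd file. A minimal sketch of the same check in Python follows, assuming the loader entries live in /boot/loader/entries and carry an `initrd <path>` line as in the output above. The function name `group_by_initrd` is made up for illustration and is not part of nixpkgs.

```python
import re
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def group_by_initrd(entries_dir: Path) -> Dict[str, List[str]]:
    """Map each initrd path to the boot entry files that reference it."""
    groups = defaultdict(list)
    for conf in sorted(entries_dir.glob("*.conf")):
        for line in conf.read_text().splitlines():
            # Loader entries use "initrd <path>", as seen in the grep output.
            match = re.match(r"\s*initrd\s+(\S+)", line)
            if match:
                groups[match.group(1)].append(conf.name)
    return dict(groups)


if __name__ == "__main__":
    # Assumed location of the systemd-boot entries; adjust if /boot differs.
    for initrd, confs in group_by_initrd(Path("/boot/loader/entries")).items():
        print(initrd, "<-", ", ".join(confs))
```

Any initrd listed with more than one entry file is shared between generations, which is the situation described in this issue.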

On a side note, while fixing the problem (using nix-collect-garbage -d), I ended up in a situation where the most recent GRUB boot entry referred to a system configuration that no longer existed. I'm not sure how.

Technical details

  • system: "x86_64-linux"
  • host os: Linux 4.14.14, NixOS, 18.03.git.d492cdc789c (Impala)
  • multi-user?: yes
  • sandbox: relaxed
  • version: nix-env (Nix) 1.11.16
  • channels(root): "nixos-18.03pre126063.95880aaf062"
  • nixpkgs: /nix/var/nix/profiles/per-user/root/channels/nixos/nixpkgs

All 8 comments

@Baughn could you paste the dropbear-specific config-section? :)

Sure, but it's a little complicated.

We talked this over on IRC. For anyone following along, http://ix.io/EG7 has the relevant configuration with the failed bits commented out, in emergency-shell.nix.

After a lot of digging, it seems the problem is here: https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/system/boot/loader/systemd-boot/systemd-boot-builder.py#L84

Specifically, write_entry gets called in a loop, once for every generation, and each call runs append-initrd-secrets again (see the traceback above). This fails when the host key has already been removed from the system. It's theoretically fixable by not updating the initrd unnecessarily, but it might be easier to document it and wait for secrets-in-nix-store to exist.

There's another bug which would block that fix. Assuming this initrd secret is the only difference between the configurations, their respective initrds will have the same hash -- and so the same filename in /boot. That wouldn't cause trouble for initrd ssh, but should be fixed anyway.
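
To make the failure mode concrete, here is a rough model of the loop described above. It is a sketch under stated assumptions, not the real systemd-boot-builder.py: `append_secrets`, `write_entries`, the `Generation` tuple and the `skip_installed` flag are all hypothetical names, and the "fix" shown is only the idea of not touching an initrd that is already in /boot.

```python
import os
import subprocess
from typing import List, Optional, Set, Tuple

# (generation number, initrd file name, path of the secret it needs or None)
# Hypothetical shape, chosen only for this illustration.
Generation = Tuple[int, str, Optional[str]]


def append_secrets(initrd: str, secret: str) -> None:
    """Stand-in for a generation's append-initrd-secrets script."""
    if not os.path.exists(secret):
        # Mirrors the `cp: cannot stat` failure seen in the log.
        raise subprocess.CalledProcessError(1, ["append-initrd-secrets", initrd])


def write_entries(gens: List[Generation], installed: Set[str],
                  skip_installed: bool) -> List[int]:
    """Return the generations whose initrd had secrets (re)appended."""
    touched = []
    for number, initrd, secret in gens:
        if skip_installed and initrd in installed:
            continue  # hypothetical fix: leave an existing initrd alone
        if secret is not None:
            append_secrets(initrd, secret)
            touched.append(number)
        installed.add(initrd)
    return touched


if __name__ == "__main__":
    missing = "/nonexistent/hostRSAKey"  # the key file was removed
    gens = [(66, "shared-initrd.efi", missing), (67, "shared-initrd.efi", missing)]
    try:
        write_entries(gens, {"shared-initrd.efi"}, skip_installed=False)
    except subprocess.CalledProcessError as err:
        print("activation fails with exit status", err.returncode)
    print(write_entries(gens, {"shared-initrd.efi"}, skip_installed=True))
```

Note that skipping by file name only works if distinct initrds get distinct names. With both generations sharing one name, as in the example data, the skip hides the second bug mentioned above rather than fixing it.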

From the looks of #8, we won't be getting a perfect solution anytime soon. That leaves the options of "fixing" systemd-boot (which would make it more fragile), or simply documenting the bug for hostRSAKey and related attributes. I'm inclined towards the latter.

I ran into this issue while setting up a new NixOS machine and was completely puzzled by the error until I figured out what was going on. I think at the very least the error could be more helpful.
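
As an illustration of what a more helpful error could look like, the sketch below wraps the `subprocess.check_call` seen in the traceback and re-raises with a hint about missing secret files. The wrapper name `run_secret_appender` and the message wording are assumptions for this example, not something nixpkgs provides.

```python
import subprocess
from typing import Sequence


def run_secret_appender(cmd: Sequence[str], generation: int) -> None:
    """Run a secret-appending command and explain a failure in plain terms."""
    try:
        subprocess.check_call(list(cmd))
    except (subprocess.CalledProcessError, FileNotFoundError) as err:
        # Keep the original exception chained so the raw details stay visible.
        raise RuntimeError(
            f"failed to append initrd secrets for generation {generation}; "
            "a secret file referenced by that generation may have been "
            "moved or removed"
        ) from err
```

A caller would pass the same command list that write_entry builds today, plus the generation number, so the message points at the generation whose secret is missing.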

Thank you for your contributions.

This has been automatically marked as stale because it has had no activity for 180 days.

If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.

Here are suggestions that might help resolve this more quickly:

  1. Search for maintainers and people that previously touched the related code and @ mention them in a comment.
  2. Ask on the NixOS Discourse.
  3. Ask on the #nixos channel on irc.freenode.net.

What’s the state of this issue today? I have not modified this setting recently on any of my systems, but hitting this issue while re-installing a system could lead to confusion.

Self reply: yesterday I reorganised the files in my config repository, moving my dropbear host RSA key and correctly updating the relevant configuration line, but forgetting about this issue. I lost something like 40 minutes trying to remember how to fix it. For others (and maybe my future self) who encounter this issue, here is a procedure that works for moving the key file:

  1. Keep the file at the same place, otherwise the initrd builder is unhappy.
  2. Comment out the boot.initrd.network.ssh section.
  3. sudo nixos-rebuild switch
  4. sudo nix-collect-garbage -d (yes, all your generations are gone…)
  5. sudo nixos-rebuild switch => this updates the /boot partition, effectively updating boot entries and the initrd.
  6. Move the key file (running sudo nixos-rebuild switch again without changing the config should then work, as the initrd referring to it is now gone).
  7. Uncomment the boot.initrd.network.ssh section and update hostRSAKey.
  8. sudo nixos-rebuild switch
  9. You should be fine.