Nixpkgs: RFC: Live environment with nix store included in initrd

Created on 8 Apr 2016 · 20 Comments · Source: NixOS/nixpkgs

Issue description

I'm working on a new Linux-based project that is currently using a different build system, and I'm evaluating Nix as a replacement.

Our system runs mostly from a stripped-down Linux image in memory. The full system is included in the initrd, which makes network booting very simple: there is no need to fetch an additional squashfs (and it's practical because our image is so lean). We then put a ZFS pool on the disks to use as storage for the workloads we run on the machines.

I've been flailing around (see https://github.com/NixOS/nixpkgs/compare/master...nshalman:cerana-test1) trying to get something that works and coming up short.

I think Nix/NixOS is a very good project for us to collaborate with. A much more elegant version of what I've been working on is something that I'd want to get merged into nixpkgs so that we can collaborate further with NixOS.

reporter feedback

All 20 comments

Maybe in this part:

+    # Individual files to be included
+    cerana.contents =
...
+        { source = config.system.build.initialRamdisk + "/initrd";
+          target = "/boot/initrd";
+        }
...
+      ];
+

instead of using config.system.build.initialRamdisk directly, you could create a new initrd based on it.

Thanks for the work! I still have to give this a try, but this looks promising. Will try tomorrow.

Also, related to #2100.

I've made some changes. Squashfs now mounts, but Stage 2 cannot be started because it cannot find init.

https://github.com/bobvanderlinden/nixpkgs/tree/cerana-test1

To build:

nix-build -A cerana_minimal nixos/release.nix

This results in the 3 files:

result/bzImage
result/initrd
result/kernelAppend

To run inside qemu:

qemu-system-x86_64 -kernel result/bzImage -initrd result/initrd -m 2G -nographic -serial mon:stdio -append "$(cat result/kernelAppend) console=ttyS0"

In stage 1, / is the contents of the initrd, and /mnt-root is what will become / for stage 2. The squashfs from the initrd is mounted into /mnt-root/nix. After that, stage-1-init.sh will switch_root into /mnt-root and execute the stage 2 init script.

The problem is (I think) that switch_root deletes all files from its current root (from initrd). That means the squashfs file will also be deleted. I don't know for sure whether this is the case, but somehow it doesn't find init of stage 2. That made me think, why not skip squashfs and put everything in initrd directly? That way switch_root isn't necessary anymore.

Not sure whether this is the case, but I couldn't figure it out yet.

Yes, switch_root deleting everything in the initrd sounds correct (as in it should do that by design). But I think it shouldn't matter that the squashfs image gets deleted as accessing unlinked files that are already open should work just fine in Linux. Probably something else is amiss.
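
The point about unlinked files can be checked outside of any initrd. Below is a minimal sketch, assuming a POSIX shell and mktemp; the function name read_after_unlink is made up for illustration. It only shows the general Linux behaviour (data stays readable through an already-open descriptor after the name is removed) and is not a reproduction of what stage 1 does with the squashfs image.

```shell
#!/bin/sh
# Illustration only: read a file through a descriptor that was opened
# before the file was unlinked.
read_after_unlink() {
  rau_tmp=$(mktemp)
  printf 'still readable\n' > "$rau_tmp"
  exec 3< "$rau_tmp"    # open the file for reading on fd 3
  rm -- "$rau_tmp"      # unlink it: the name is gone, the data is not
  rau_content=$(cat <&3)
  exec 3<&-             # close fd 3; only now can the data be freed
  printf '%s\n' "$rau_content"
}

read_after_unlink
```

A mounted image is held open in a similar way, which is why deleting the image file during switch_root would not by itself explain a missing init.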

@dezgeg Thanks for the tips, it was an incorrect path for the mountpoint :/

Anyway, it's now booting correctly in QEMU. Not sure whether it works the same for iPXE.

The next phase I was going to work on was actually removing use of the squashfs and just putting the /nix contents directly into the initrd. @bobvanderlinden, do you think that would be possible?

Not sure if that's worth the complexity of having extra code for something that's used really rarely.

@nshalman I have thought of this as well. I'm not sure. It would shorten the build time (creating the squashfs and then the initrd takes a bit of IO), but from reading online it seems that squashfs is still efficient inside an initrd. That said, I gave it a short try, but due to switch_root removing files from the initrd it didn't immediately work.

I'd go for cleaning things up, de-branding the code and making it ready for inclusion in NixOS.

@dezgeg I agree that the extra complexity might not be worth it, but do not underestimate the usefulness of @nshalman's changes. Having a separate kernel+initrd will allow various types of iPXE/PXE booting. That will allow netboot.xyz support as well as somewhat proper support for cloud providers like Digital Ocean (no need for nixos-in-place or nixos-assimilate).
If Hydra generated these, like it does the ISOs, they would be another medium for installation.

Naming things is hard. Does anyone have suggestions of naming to use for removing the "cerana" stuff from the code? PXE or netboot are logical options, I guess.

Even though it's just a standalone initrd, I guess netboot is a good one, since it'll be mostly used for network booting (whether PXE or iPXE or grub?). A .ipxe file and/or PXE services can be created in a second iteration.

@bobvanderlinden I have a de-branded version of this work in https://github.com/NixOS/nixpkgs/compare/master...nshalman:netboot-v1
That commit doesn't currently credit you; please let me know if/how you want to be credited in a version to be turned into a pull request.

For my purposes I definitely need to do further work to see how easy it would be to eliminate the use of squashfs. I'm guessing that if built correctly my variant wouldn't ever need to invoke switch_root in the first place.

Testing of that netboot-v1 branch is very welcome. Further tweaks to it (e.g. moving the nouveau blacklisting from netboot.nix to netboot-minimal.nix) are also welcome.

Nice. Also, don't worry about crediting.

I tried to boot this today over iPXE in qemu, however I couldn't make it work. I haven't done much with iPXE before, so this might be a problem on my side. That said, it's worth checking whether it actually boots over the network.
Here are the steps I used:

  • Build the bzImage, initrd files.
  • Create boot directory.
  • Create boot/boot.ipxe with:
#!ipxe
kernel bzImage init=/nix/store/3zrg47j9ihydsn026qzpwf7fs4ncx8f8-nixos-system-nixos-16.09pre56789.gfedcba/init loglevel=7
initrd initrd
boot
  • Symlink bzImage and initrd into boot/
  • Host boot/ over HTTP (I used darkhttpd).
  • Start qemu using -cdrom ipxe.iso and -m 2G
  • In qemu I pressed Ctrl+D and used: dhcp followed by chain --autofree http://192.168.0.197:8080/boot.ipxe.
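
The directory-preparation steps above could be scripted roughly as follows. This is a sketch, not part of the branch: the function name make_ipxe_dir is hypothetical, and it assumes that result/kernelAppend (the file passed to -append in the QEMU example earlier in the thread) carries the init=... argument, so the store path does not have to be typed by hand.

```shell
#!/bin/sh
# Hypothetical helper: lay out a directory that can be served over HTTP
# for iPXE.
#   $1 = build output containing bzImage, initrd and kernelAppend
#   $2 = directory to create (e.g. boot)
make_ipxe_dir() {
  mid_result=$(cd "$1" && pwd)   # absolute path, so the symlinks resolve
  mid_boot=$2
  mkdir -p "$mid_boot"
  ln -sf "$mid_result/bzImage" "$mid_boot/bzImage"
  ln -sf "$mid_result/initrd" "$mid_boot/initrd"
  {
    echo '#!ipxe'
    # Assumption: kernelAppend holds the kernel command line (init=...).
    echo "kernel bzImage $(cat "$mid_result/kernelAppend") loglevel=7"
    echo 'initrd initrd'
    echo 'boot'
  } > "$mid_boot/boot.ipxe"
}

# Usage (after the build): make_ipxe_dir result boot
```

Serving the resulting directory and chaining to boot.ipxe would then proceed as in the last three steps of the list.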

It loads the kernel, it loads initrd into memory, it boots the kernel, it unpacks initrd, but then it fails on executing init with the following error:

    [   22.824394] Kernel panic - not syncing: Requested init /nix/store/3zrg47j9ihydsn026qzpwf7fs4ncx8f8-nixos-system-nixos-16.09pre56789.gfedcba/init failed (error -2).
    [   22.824606] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.6 #1-NixOS
    [   22.824660] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-20160216_104851-anatol 04/01/2014
    [   22.824811]  0000000000000000 ffff880062c57eb8 ffffffff812b8abe ffffffff81703148
    [   22.824934]  ffff880062c57f48 ffff880062c57f38 ffffffff8114ba58 ffffffff00000018
    [   22.825042]  ffff880062c57f48 ffff880062c57ee0 0000000000000000 ffff88007ffda4c5
    [   22.825042] Call Trace:
    [   22.825042]  [<ffffffff812b8abe>] dump_stack+0x63/0x85
    [   22.825042]  [<ffffffff8114ba58>] panic+0xc4/0x1fc
    [   22.825042]  [<ffffffff811c7ee4>] ? putname+0x54/0x60
    [   22.825042]  [<ffffffff814e1d90>] ? rest_init+0x80/0x80
    [   22.825042]  [<ffffffff814e1e19>] kernel_init+0x89/0xe0
    [   22.825042]  [<ffffffff814e887f>] ret_from_fork+0x3f/0x70
    [   22.825042]  [<ffffffff814e1d90>] ? rest_init+0x80/0x80
    [   22.825042] Kernel Offset: disabled
    [   22.825042] ---[ end Kernel panic - not syncing: Requested init /nix/store/3zrg47j9ihydsn026qzpwf7fs4ncx8f8-nixos-system-nixos-16.09pre56789.gfedcba/init failed (error -2).

The error seems different from "init not found", so I don't know what is happening atm.

I've done some simple testing using the iPXE that is built into QEMU (see https://github.com/nshalman/nixpkgs/commit/e850edb7f88fbfa150e174026e1165b7ac8e439a) and it appears to work just fine.

@bobvanderlinden please confirm that it works for you as my example demonstrates.

Squashed down into https://github.com/NixOS/nixpkgs/compare/master...nshalman:netboot-v2

I think this is nearly ready to be turned into a PR...

Based on some testing, given how much of the nix store needs to live in the initrd anyway, I think that putting the nix store that ends up in the squashfs directly in the initrd will probably be a very effective way of further shrinking the final initrd size.

du -sh /nix/store /mnt-root/nix/store
302.2M  /nix/store
596.7M  /mnt-root/nix/store
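
To see how much of the store in the initrd is also present in the squashfs, the two directories from the du output could be compared by entry name. The helper below is only a sketch: common_store_paths is a made-up name, and matching on top-level names says nothing about compressed size.

```shell
#!/bin/sh
# Hypothetical helper: print the entries that exist in both directories.
#   $1, $2 = directories to compare (e.g. /nix/store /mnt-root/nix/store)
common_store_paths() {
  csp_a=$(mktemp)
  csp_b=$(mktemp)
  ls -1 "$1" | sort > "$csp_a"
  ls -1 "$2" | sort > "$csp_b"
  comm -12 "$csp_a" "$csp_b"   # keep only lines common to both lists
  rm -f "$csp_a" "$csp_b"
}

# Usage inside the booted system:
#   common_store_paths /nix/store /mnt-root/nix/store | wc -l
```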

Yes, that might be a more efficient way. However, it isn't as easy as bind-mounting / to /mnt-root, since switch_root will delete everything from /.

That said, I've added a boot-test for netboot: https://github.com/bobvanderlinden/nixpkgs/tree/netboot-v2
See whether it works and if you'd like to integrate it into your branch.

It's actually easier than I thought it would be. The essence of it is in the following change: https://github.com/nshalman/nixpkgs/commit/1ced0c949c34fe58cf3c7b8c9db71bdfe9268afb

We sidestep the need for stage1 by having /init be a link to the toplevel/stage2 init. A side effect of that is that the closure for the whole system ends up in the initrd so there's no need for the squashfs.
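
The idea can be pictured with a staging directory for the initrd contents. This is an illustration only, not the linked commit: stage_init_link and both paths are hypothetical, and a later comment in this thread reports that the approach did not work as-is.

```shell
#!/bin/sh
# Hypothetical sketch: make /init in an initrd staging tree point at the
# system's stage 2 init.
#   $1 = staging directory that will be packed into the initrd
#   $2 = system toplevel store path (placeholder, not a real path)
stage_init_link() {
  mkdir -p "$1"
  # The link target only resolves at boot if the toplevel's whole closure
  # is packed into the same initrd, which is the side effect described above.
  ln -sf "$2/init" "$1/init"
}

# Usage: stage_init_link ./initrd-root /nix/store/...-nixos-system
```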

I've squashed that change into a new branch which also includes @bobvanderlinden's test code (in its own commit for credit purposes :wink:) which is here: https://github.com/NixOS/nixpkgs/compare/master...nshalman:netboot-v3

Edit: Sorry for the false closure, accidentally hit the wrong button on the web page.

And it's currently broken. I've still got some work to do.

Okay, there's something weird with trying to sidestep the squashfs that's complicated enough that I think it should be separate work, either follow-on for an official netboot, or just something I use downstream and not in the official version at all.

Resolved by #14740
