USER_DATA is executed on every restart of an EC2 instance. This is contrary to the AWS documentation and general practice, and it caused me some big problems because I assumed it wouldn't happen.
Steps to reproduce: nixos-rebuild the machine with some different configuration, then reboot.
Expected: user data is not executed and machine state remains as it was before reboot.
Actual: machine configuration is rolled back to the user data version.
Please see "View and Update the Instance User Data" in https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html
Whoops! Not going to have time to look into this for a few days at least, so if you want to take a stab at it, most of the logic for this is in here.
Easiest solution is probably just to touch /root/.initialized and then skip the rebuild if it already exists. We do have a nice VM test for this functionality so it should also be fairly easy to make sure it's doing the right thing.
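A minimal sketch of the marker-file idea, not the actual init script. Only the `/root/.initialized` path comes from the comment above; the function names `run_once` and `apply_user_data` and the overridable `MARKER` variable are hypothetical, and `apply_user_data` is just a placeholder for fetching the user data and rebuilding.

```shell
#!/bin/sh
# Sketch only: guard a first-boot action with a marker file.
# MARKER defaults to the path suggested in the thread and can be
# overridden (assumption, for testing outside a real instance).
MARKER="${MARKER:-/root/.initialized}"

# Hypothetical stand-in for fetching user data and running nixos-rebuild.
apply_user_data() {
    echo "applying user data"
}

run_once() {
    # Skip entirely if a previous boot already completed initialization.
    if [ -e "$MARKER" ]; then
        echo "already initialized, skipping"
        return 0
    fi
    # Create the marker only after the action succeeds, so a failed
    # first boot is retried on the next one.
    apply_user_data && touch "$MARKER"
}
```

Calling `run_once` from the boot-time service would apply the user data on the first boot and do nothing on later reboots, so a manual nixos-rebuild would survive a restart. Whether the marker should be written before or after the rebuild is a design choice; this sketch writes it afterwards so that a failed rebuild is retried.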
Just use cloud-init, because then this logic doesn't need to be in NixOS anymore.
On this topic, I think we should also have recommendations as to how to use this feature if at all, because running nixos-rebuild can be a slow operation (not something you would want to do if you have 100s/1000s of machines).
Cloud-init is too bloated, see https://github.com/NixOS/nixpkgs/issues/39076#issuecomment-382385364.
I've also written plugins for cloud-init (which we'd need here) and it's kind of a miserable and undocumented project. I was not impressed. And of course we'd need to wrap our user-data in YAML and reimplement most of their existing YAML support, because it wouldn't work on our platform. For example, you can list users and such, and we'd need to translate that to our declarative config, because their default implementation is to just call useradd and the like.
Due to political considerations (Canonical creates cloud-init and likely cannot allocate people who could implement this with acceptable quality), I retract my suggestion for cloud-init.
> It caused me some big problems as I assumed this wouldn't happen.
INDEED
This hasn't been a problem for me recently, as I've not been restarting things, but has this been fixed? @copumpkin, do you know?
I confirmed that this is still an issue, by doing the following:
Ah, but there _is_ a way to control this: you can set `systemd.services.amazon-init.enable = false;` in `configuration.nix`.
Is this documented anywhere?