Currently, during cluster setup, if the first node enables password authentication but sets an empty password, other cluster nodes are unable to join.
This is easily done during lxd init by just accepting the defaults.
We should enforce that, if password authentication is enabled, a non-empty password is provided.
@tomponline Interested in working on this but just getting started poking around the codebase. How do you want this implemented? Some target questions:
init that seems to explicitly enable password authentication. Is this just a function of whether the trust-password is part of the config?
@komish so an example of the issue is this:
Setting up node 1 in a new cluster: (just press return when being asked for the trust password for new clients without specifying one).
root@v1:~# lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: yes
What name should be used to identify this node in the cluster? [default=v1]:
What IP address or DNS name should be used to reach this node? [default=10.143.131.8]:
Are you joining an existing cluster? (yes/no) [default=no]:
Setup password authentication on the cluster? (yes/no) [default=yes]:
Trust password for new clients:
Again:
We would like the validation to reject empty passwords at this stage, as the problem this causes is exemplified below:
Now on node 2, try joining the cluster with an empty trust password:
root@v2:~# lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: yes
What name should be used to identify this node in the cluster? [default=v2]:
What IP address or DNS name should be used to reach this node? [default=10.143.131.9]:
Are you joining an existing cluster? (yes/no) [default=no]: yes
IP address or FQDN of an existing cluster node: 10.143.131.8
Cluster fingerprint: b54e80cb34ebbd84adc16f40d22613875a4f931836f7c708f10c4ca6a486bae3
You can validate this fingerprint by running "lxc info" locally on an existing node.
Is this the correct fingerprint? (yes/no) [default=no]: yes
Cluster trust password:
All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
Error: Failed to setup trust relationship with cluster: Failed to add client cert to cluster: not authorized
The code asking this question is here: https://github.com/lxc/lxd/blob/master/lxd/main_init_interactive.go#L233
Right, looking at all the places where we use cli.AskPassword or cli.AskPasswordOnce, I think the easy fix for this is to simply make those functions refuse an empty string as a valid value.
So that would be a change directly to shared/cmd/ask.go to make both functions call invalidInput() when provided with an empty string and then have them ask again until they get a non-empty response.
@komish assigning the issue to you for now.
Understood - thanks for the guidance. More to come as I have it!
Just an update that I think I'm almost done with this. Just working on testing changes but I'm fighting pre-installed (via snap) lxd in cloud images of Ubuntu 20.04, and getting my source-built binary in place to test. I'll get more time on it going into the weekend.
@komish if it's easier, you can build your lxd/lxc go binaries and then move them into the snap temporarily, to avoid needing to install all the other dependencies, using:
sudo mv lxd /var/snap/lxd/common/lxd.debug
sudo mv lxc /var/snap/lxd/common/lxc.debug
sudo systemctl reload snap.lxd.daemon
Circling back in case anyone finds this discussion helpful in the future. I figured out that I can just run my built lxd binary and then run lxd init and it'll talk to the socket that's generated by lxd, even on a system that has it installed via snap already. That was enough to test the init workflow. 👍
I did have to install libsqlite3-dev as described here on a fresh 20.04 machine in AWS (lightsail, if it matters). Not sure why, but make deps spat out errors about finding sqlite3, which was installed as far as I could tell.
Thanks again for the guidance! Very helpful for someone who hasn't looked at the lxd codebase!