Has anyone experienced this issue when trying to bootstrap ACLs via the CLI? I have the below config in my consul.json:
"acl": {
"enabled": true,
"default_policy": "allow",
"down_policy": "extend-cache"
},
Then I execute "consul acl bootstrap" but receive the below error:
"Failed ACL bootstrapping: Unexpected response code: 500 (The ACL system is currently in legacy mode.)"
I'm running Consul v1.4.0
@marcuschaney The error message would indicate that the tokens you are using are from the Legacy ACL System and need to be migrated.
Have you taken a look at this page and followed the steps to migrate the tokens: https://www.consul.io/docs/guides/acl-migrate-tokens.html
@marcuschaney Is this when bootstrapping a single primary datacenter or when bootstrapping a secondary datacenter? Also are all servers in the cluster running v1.4.0?
The "currently in legacy mode" error should go away once Consul has detected that the cluster is able to use new ACLs. The primary datacenter will ensure that all servers in its dc are advertising new acl capabilities before enabling new ACLs. In secondary DCs the same check is done but additionally checks that the servers in the primary dc are also advertising new acl capabilities.
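One way to verify this precondition (a sketch, assuming a local agent with default settings) is to confirm that every server in the cluster is on 1.4.0+ by inspecting the member tags:

```shell
# List cluster members with all of their gossip tags; the "build" tag
# shows each agent's Consul version. All servers must report 1.4.0 or
# newer before the leader will transition out of legacy ACL mode.
consul members -detailed
```

If any server still advertises an older build, the primary datacenter will stay in legacy mode until that server is upgraded.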
There are some other optimizations for the primary datacenters case to ensure that when bootstrapping a new cluster it should immediately transition into new ACL mode once leadership is acquired by one of the servers (assuming all are on 1.4.0+)
Same problem.
I tried the ACL Token Migration document, but always get the "The ACL system is currently in legacy mode" error. It occurs when bootstrap_expect > 1 and there is more than one node in my cluster. I tried deleting the data files with rm -rf /opt/consul/* but got the same error.
Then I set bootstrap_expect = 1 and retried the ACL guide steps, and it worked. As the guide describes, I bootstrapped the ACLs successfully.
It seems that a server agent cannot enable the new ACL system with a bootstrap token while other server agents are running in the cluster. I don't know why, but this works.
@kingfree Was leader election successful? Running consul info or curling the /v1/agent/self endpoint will include that information.
Prior to leader election (which will be prevented until the expected number of servers are in the cluster) consul will continue to be in legacy ACL mode. Once leader election occurs, the leader will detect whether all the servers in the cluster are capable of using the new ACL system and then transition itself. Then after a time (from 0 to ~30s) the other servers will pick it up and transition themselves. Then after another period of time (again 0 to ~30s) the client agents will detect that all servers are capable of new ACLs and transition themselves.
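To check where a given agent is in that sequence (a sketch, assuming the default HTTP address 127.0.0.1:8500; jq is used only for readability), you can do:

```shell
# On a server: consul info includes a "raft" section whose "state"
# field shows Leader/Follower/Candidate, indicating whether leader
# election has completed.
consul info

# The same information is available over HTTP on any agent:
curl -s http://127.0.0.1:8500/v1/agent/self | jq '.Stats'
```

Once leader election has completed and all agents have transitioned, `consul acl bootstrap` should stop returning the "currently in legacy mode" error.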
Note that all servers in the cluster must be on 1.4.0+ in order for this transitioning to take place. So you can't just upgrade 1 server to 1.4.0 and then do the migration. I suspect that setting bootstrap_expect = 1 effectively made that node separate from your other servers.
Any more information about your environment like the number of servers/clients and their versions would be helpful in tracking down the issue
So the issue ended up being the encryption key. Even though the configuration file had the correct key, the keyring file did not. After clearing that up and restarting Consul, everything was good to go.
@marcuschaney That would do it. Auto-transitioning ACL modes works by each node advertising its ACL capabilities. This advertisement happens through serf/gossip so if the encryption keys were incorrect then those messages containing the advertised capabilities would never be seen.
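For anyone else hitting this, a quick way to compare the configured gossip key with what the agent is actually using (a sketch; the paths assume data_dir = /opt/consul and config under /etc/consul.d, as used elsewhere in this thread) is:

```shell
# The key the config file says to use:
grep -r encrypt /etc/consul.d/

# The keys the agent actually has loaded (persisted keyring):
cat /opt/consul/serf/local.keyring

# The keys installed across the whole cluster, as seen via gossip:
consul keyring -list
```

If the config key is missing from the keyring output, the agent is gossiping with a stale key and capability advertisements (among other things) will never be seen.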
@kingfree If this isn't also a solution to your problem. Please open a new issue and we can work on a solution there.
@mkeeler Thank you for solving my problem.
@marcuschaney can you share your consul configuration file. Having the same issue when bootstrapping a consul cluster with ACL system.
@lmayorga1980 Did you ensure the encryption key in your config matches the key under /opt/consul/serf/local.keyring?
@marcuschaney same issue here, were you able to fix this?
@pgold30 ensure the encryption key is the same in all your config files, and is also stored in /opt/consul/serf/local.keyring. Also, be sure to check the "acl" block in the config to ensure all servers have default_policy set to "deny".
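Putting that advice together, a minimal stanza (JSON form, matching the config at the top of the thread; the key value is a placeholder, not a real key) might look like:

```json
{
  "encrypt": "SAME-BASE64-KEY-ON-EVERY-NODE",
  "acl": {
    "enabled": true,
    "default_policy": "deny",
    "down_policy": "extend-cache",
    "enable_token_persistence": true
  }
}
```

The important parts are that "encrypt" is identical on every node (and matches the persisted keyring) and that every server agrees on the ACL settings.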
I have the same issue with the latest Consul version, 1.8.4, using Consul on Alpine Linux 3.12.0.
It installed successfully and the agent is running, but when I run consul acl bootstrap on my server I get this error:
Failed ACL bootstrapping: Unexpected response code: 500 (The ACL system is currently in legacy mode.)
The documentation says legacy mode is for Consul 1.3 and earlier: https://www.consul.io/docs/security/acl/acl-legacy
This is my server config:
data_dir = "/var/consul"
node_name = "consul-1-server"
bind_addr = "{{ GetInterfaceIP \"eth0\" }}"
client_addr = "127.0.0.1 {{ GetPrivateInterfaces | sort \"default\" | join \"address\" \" \" }}"
retry_join = ["provider=aws tag_key=consul tag_value=server"]
datacenter = "US-EAST-1"
encrypt = "xxxxxxxxxxxxxxxx"
raft_protocol = 3
protocol = 2
enable_syslog = true
log_level = "warn"
telemetry {
disable_hostname = true
prometheus_retention_time = "15m"
}
server = true
ui = true
acl = {
enabled = true
default_policy = "deny"
enable_token_persistence = true
}
Any advice will be appreciated.
$>sudo consul info
Error querying agent: Unexpected response code: 403 (ACL not found)
Looks like it's fixed. I updated the config and added the settings below; it only works with exactly these settings, and any small change breaks it.
client_addr = "0.0.0.0"
bootstrap_expect = 3