CouchDB: Cluster setup fails: unable to sync admin passwords

Created on 1 May 2020  ·  5 Comments  ·  Source: apache/couchdb

Followup to #2797

Description

Cluster setup fails when no request to / is made on the setup node before finish_cluster.

Steps to Reproduce

Assuming that 192.168.56.101 is the setup node:

#curl http://192.168.56.101:5984/
curl http://192.168.56.102:5984/
curl http://user:[email protected]:5984/_cluster_setup
curl --request POST \
  --url http://user:[email protected]:5984/_cluster_setup \
  --header 'content-type: application/json' \
  --data '{
    "action": "enable_cluster",
    "bind_address": "0.0.0.0",
    "username": "user",
    "password": "pass",
    "port": 5984,
    "node_count": 2,
    "remote_node": "192.168.56.102",
    "remote_current_user": "user",
    "remote_current_password": "pass"
}'
curl --request POST \
  --url http://user:[email protected]:5984/_cluster_setup \
  --header 'content-type: application/json' \
  --data '{
    "action": "add_node",
    "host": "192.168.56.102",
    "port": 5984,
    "username": "user",
    "password": "pass",
    "singlenode": false
}'
curl --request POST \
  --url http://user:[email protected]:5984/_cluster_setup \
  --header 'content-type: application/json' \
  --data '{ "action": "finish_cluster" }'
curl http://user:[email protected]:5984/_cluster_setup

Output:

{"couchdb":"Welcome","version":"3.0.0-ebdfbba","git_sha":"ebdfbba","uuid":"18b03233b7265d32443a8d576041a981","features":["access-ready","partitioned","pluggable-storage-engines","reshard","scheduler"],"vendor":{"name":"The Apache Software Foundation"}}
{"state":"cluster_enabled"}
{"ok":true}
{"ok":true}
{"error":"setup_error","reason":"Cluster setup unable to sync admin passwords"}
{"state":"cluster_enabled"}

Expected Behaviour

Cluster setup should succeed without visiting /.

Your Environment

Docker containers built with couchdb-docker/dev, running on two VirtualBox VMs.

Additional Context

Here are the corresponding logs of the setup node:
Without GET /
Lines of note:

[error] 2020-05-01T09:04:07.246153Z [email protected] <0.714.0> 3e471a2ab6 setup sync_admin results [{badrpc,{'EXIT',{badarg,[{config,set,5,[{file,"src/config.erl"},{line,186}]},{rpc,'-handle_call_call/6-fun-0-',5,[{file,"rpc.erl"},{line,197}]}]}}}] errors []
[notice] 2020-05-01T09:04:07.255094Z [email protected] <0.714.0> 3e471a2ab6 192.168.56.101:5984 192.168.56.101 user POST /_cluster_setup 500 ok 43

With GET /

All 5 comments

What happens if you add sleep 3 instead of the curl to /?

sleeping 3s
{"couchdb":"Welcome","version":"3.0.0-ebdfbba","git_sha":"ebdfbba","uuid":"8b6a1310bd6d5751724e6abaffd2da3a","features":["access-ready","partitioned","pluggable-storage-engines","reshard","scheduler"],"vendor":{"name":"The Apache Software Foundation"}}
{"state":"cluster_enabled"}
{"ok":true}
{"ok":true}
sleeping 3s
{"error":"setup_error","reason":"Cluster setup unable to sync admin passwords"}
{"state":"cluster_enabled"}

Still the same error:

[error] 2020-05-01T23:54:31.282087Z [email protected] <0.543.0> 671c155668 setup sync_admin results [{badrpc,{'EXIT',{badarg,[{config,set,5,[{file,"src/config.erl"},{line,186}]},{rpc,'-handle_call_call/6-fun-0-',5,[{file,"rpc.erl"},{line,197}]}]}}}] errors []
[notice] 2020-05-01T23:54:31.292029Z [email protected] <0.543.0> 671c155668 192.168.56.101:5984 192.168.56.101 user POST /_cluster_setup 500 ok 26

Full logs

Hey there, I stumbled upon the same issue using CouchDB 3.1.0 and I think #2797 should not have been closed with #2798.

The log message uuid set to ... caught my eye when using the workaround via GET /:

couchdb1_1  | [notice] 2020-07-24T20:46:41.473471Z [email protected] <0.88.0> -------- config: [couchdb] uuid set to 84a3191a25a2955130956d131232f98b for reason nil
couchdb1_1  | [notice] 2020-07-24T20:46:41.481822Z [email protected] <0.1868.0> 2bc6d8040a localhost:15984 172.19.0.1 admin GET / 200 ok 20

When manually setting a uuid on the coordinating node everything worked fine (no GET / or sleep 3 necessary).

So, an unset or empty uuid on the coordinating node seems to make the finish_cluster action fail. A fresh node even throws an error when trying to get its uuid:

curl --user admin:password 'http://localhost:5984/_node/_local/_config/couchdb/uuid'
{"error":"not_found","reason":"unknown_config_value"}

I still have exactly the same problem with CouchDB 3.1.1.

Thanks @gitsnaf.
3.1.1 does not include the fix. It seems that this change has not been backported to 3.x yet.
