Vault: Error initializing storage of type raft: failed to create fsm: timeout

Created on 30 Dec 2019 · 1 comment · Source: hashicorp/vault

**Describe the bug**
I've been playing with the raft storage in 1.3.1. In a 3-node cluster, after restarting the original node (to which the other two were joined), it started logging:

Error initializing storage of type raft: failed to create fsm: timeout

All nodes seem functional, however, and `vault operator raft configuration` doesn't show anything different about this node.

**To Reproduce**
Create 3 nodes, master01[abc], with Vault 1.3.1. These are running Alpine Linux with the vault package from the "edge" repo.

Initialize and configure master01a -- various auth mechanisms, secrets engines, policies, etc. -- I can detail it all if needed, but I have no reason to believe the details are relevant.

TLS is disabled.

Join master01b and master01c to master01a.
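
For context, the joins were done with the standard raft join command; a minimal sketch, assuming the addresses from the config file below (TLS disabled, so plain http):

```sh
# Run on master01b and master01c, pointing at master01a's API address:
vault operator raft join http://192.168.16.4:8200
```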

Some time later I restarted master01a, which was the leader at the time, and it began logging the message. However, it appears to be functional. `vault operator raft configuration` says:

```
Key       Value
---       -----
config    map[index:14343 servers:[map[address:192.168.16.4:8201 leader:false node_id:master01a protocol_version:3 voter:true] map[address:192.168.18.4:8201 leader:true node_id:master01c protocol_version:3 voter:true] map[address:192.168.17.4:8201 leader:false node_id:master01b protocol_version:3 voter:true]]]
```
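
The same data is available over the HTTP API if that's easier to inspect; a minimal sketch, assuming `VAULT_TOKEN` holds a valid token:

```sh
# Read the raft configuration from the sys endpoint:
curl -s -H "X-Vault-Token: $VAULT_TOKEN" \
  http://192.168.16.4:8200/v1/sys/storage/raft/configuration
```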


**Expected behavior**
No scary error messages are logged after a restart.

If there are error messages, they should provide enough detail to understand what to do to fix them, or they should be documented in this regard. I could find nothing useful in searches.
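
The closest thing to further debugging I know of is raising the server log level; a minimal sketch, assuming a hypothetical config path of /etc/vault/server.hcl:

```sh
# Restart with verbose logging to capture raft/storage detail
# around the "failed to create fsm" error:
vault server -config=/etc/vault/server.hcl -log-level=trace
```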

**Environment:**
* Vault Server Version (retrieve with `vault status`): 1.3.1
* Vault CLI Version (retrieve with `vault version`): Vault v1.3.1
* Server Operating System/Architecture: Alpine Linux 3.11.0, x86_64, on AWS t3a.nano instances

Vault server configuration file(s):

```hcl
storage "raft" {
  path    = "/data/vault/db"
  node_id = "master01a"
}

listener "tcp" {
  address     = "192.168.16.4:8200"
  tls_disable = true
}

listener "tcp" {
  address     = "127.0.0.1:8200"
  tls_disable = true
}

seal "awskms" {
  region     = "us-east-1"
  kms_key_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}

api_addr     = "http://192.168.16.4:8200"
cluster_addr = "http://192.168.16.4:8201"
pid_file     = "/data/vault/vault.pid"
```

01b and 01c are the same but in 192.168.17 and 192.168.18 respectively.
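
Concretely, master01b's differing stanzas look like this (a sketch reconstructed from the statement above; everything else matches master01a):

```hcl
storage "raft" {
  path    = "/data/vault/db"
  node_id = "master01b"
}

listener "tcp" {
  address     = "192.168.17.4:8200"
  tls_disable = true
}

api_addr     = "http://192.168.17.4:8200"
cluster_addr = "http://192.168.17.4:8201"
```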


bug storage/raft version/1.3.x

Most helpful comment

After restarting more recently, I can no longer reproduce this. I'm not sure what changed.

Still, it would be nice if the message and/or documentation added some guidance on what the problem was and anything that could be done to fix it or debug the issue further.
