Hi,
I have a 3-node cluster: consul1, consul2 and consul3.
consul1 (172.18.32.130) is down.
As expected, both reads and writes still succeed with a quorum of consul2 and consul3 up.
Now I stop consul2 and start it again.
As expected, consul3 enters the candidate state, but consul2 does not answer it, and consul3 starts suspecting consul2.
On the other side, consul2 keeps "refuting a suspect message" from consul3, but apparently with no success.
Running consul members on consul2 shows both nodes as "alive".
How can I debug this?
Log/consul3:
2015/03/02 12:07:50 [ERR] raft: Failed to make RequestVote RPC to 172.18.32.130:8300: dial tcp 172.18.32.130:8300: connection refused
2015/03/02 12:07:51 [INFO] memberlist: Suspect consul2 has failed, no acks received
2015/03/02 12:07:52 [WARN] raft: Election timeout reached, restarting election
2015/03/02 12:07:52 [INFO] raft: Node at 172.18.33.110:8300 [Candidate] entering Candidate state
2015/03/02 12:07:52 [INFO] memberlist: Suspect consul2 has failed, no acks received
2015/03/02 12:07:52 [ERR] raft: Failed to make RequestVote RPC to 172.18.32.130:8300: dial tcp 172.18.32.130:8300: connection refused
2015/03/02 12:07:53 [WARN] raft: Election timeout reached, restarting election
2015/03/02 12:07:53 [INFO] raft: Node at 172.18.33.110:8300 [Candidate] entering Candidate state
2015/03/02 12:07:53 [ERR] raft: Failed to make RequestVote RPC to 172.18.32.130:8300: dial tcp 172.18.32.130:8300: connection refused
2015/03/02 12:07:54 [INFO] memberlist: Suspect consul2 has failed, no acks received
2015/03/02 12:07:55 [WARN] raft: Election timeout reached, restarting election
2015/03/02 12:07:55 [INFO] raft: Node at 172.18.33.110:8300 [Candidate] entering Candidate state
2015/03/02 12:07:55 [ERR] raft: Failed to make RequestVote RPC to 172.18.32.130:8300: dial tcp 172.18.32.130:8300: connection refused
2015/03/02 12:07:56 [INFO] memberlist: Suspect consul2 has failed, no acks received
2015/03/02 12:07:57 [INFO] memberlist: Suspect consul2 has failed, no acks received
2015/03/02 12:07:57 [WARN] raft: Election timeout reached, restarting election
2015/03/02 12:07:57 [INFO] raft: Node at 172.18.33.110:8300 [Candidate] entering Candidate state
2015/03/02 12:07:57 [ERR] raft: Failed to make RequestVote RPC to 172.18.32.130:8300: dial tcp 172.18.32.130:8300: connection refused
2015/03/02 12:07:58 [WARN] raft: Election timeout reached, restarting election
2015/03/02 12:07:58 [INFO] raft: Node at 172.18.33.110:8300 [Candidate] entering Candidate state
2015/03/02 12:07:58 [ERR] raft: Failed to make RequestVote RPC to 172.18.32.130:8300: dial tcp 172.18.32.130:8300: connection refused
2015/03/02 12:07:58 [ERR] agent: failed to sync remote state: No cluster leader
2015/03/02 12:07:59 [INFO] memberlist: Suspect consul2 has failed, no acks received
Log/consul2:
2015/03/02 12:07:32 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:34 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:37 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:39 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:41 [WARN] memberlist: Refuting a suspect message (from: consul2)
2015/03/02 12:07:44 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:44 [ERR] agent: failed to sync remote state: No cluster leader
2015/03/02 12:07:48 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:51 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:54 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:56 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:58 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:08:01 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:08:03 [WARN] memberlist: Refuting a suspect message (from: consul3)
Version is 0.4.1 (although I thought I had upgraded to 0.5).
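One possible first debugging step, sketched here under assumptions rather than taken from the thread, is to check from each node whether its peers accept TCP connections on the ports Consul uses. Port 8300 appears in the logs above; 8301 is assumed to be the Serf LAN gossip port at its default value, so adjust both if your configuration differs. The helper names tcp_port_open and check_peers are hypothetical. Gossip probes also travel over UDP, which this check does not cover, so a passing result does not rule out a gossip problem.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def check_peers(peers, ports=(8300, 8301), timeout=2.0):
    """Map each (host, port) pair to whether a TCP connection succeeded.

    The default ports are an assumption: 8300 is the RPC port seen in the
    logs, 8301 is assumed to be the default Serf LAN port.
    """
    return {
        (host, port): tcp_port_open(host, port, timeout)
        for host in peers
        for port in ports
    }
```

For example, calling check_peers(["172.18.33.110"]) from consul2 would report whether consul3 accepts TCP connections on both ports; running the same check in the other direction needs consul2's address, which the thread does not give.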
Are you using Docker? My guess is that this is likely the ARP cache issue.
Hi,
I am facing the same issue. I am running the client in Docker and the servers on physical nodes.
2016/04/12 08:18:50 [WARN] memberlist: Refuting a dead message (from: server-251)
2016/04/12 08:18:56 [WARN] memberlist: Refuting a suspect message (from: server-252)
2016/04/12 08:19:34 [INFO] agent: Synced node info
2016/04/12 08:19:40 [WARN] memberlist: Refuting a suspect message (from: kpod2consul)
2016/04/12 08:20:50 [WARN] memberlist: Refuting a suspect message (from: server-17)
The client is on version 0.6.4 and the servers are on 0.6.3.
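To see whether the suspicion comes from one peer or from many, it can help to tally the refute messages by sender. The following is a minimal sketch that assumes the log layout shown in the excerpts above (date, time, bracketed level, subsystem, message); the function name tally_refutes is hypothetical and not part of Consul.

```python
import re
from collections import Counter

# Matches the log layout seen in the excerpts: date, time, [LEVEL], subsystem, message.
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"\[(?P<level>[A-Z]+)\] (?P<subsystem>[\w.-]+): (?P<message>.*)$"
)
# Matches memberlist refute messages and captures the kind and the sending peer.
REFUTE = re.compile(
    r"Refuting a (?P<kind>suspect|dead) message \(from: (?P<peer>[^)]+)\)"
)


def tally_refutes(lines):
    """Count refute messages per (kind, peer) pair; other lines are ignored."""
    counts = Counter()
    for line in lines:
        parsed = LOG_LINE.match(line.strip())
        if parsed is None:
            continue
        hit = REFUTE.search(parsed.group("message"))
        if hit:
            counts[(hit.group("kind"), hit.group("peer"))] += 1
    return counts
```

Feeding it the agent's log lines gives a count per peer. If most peers show up, the problem is more likely on the suspected node's side of the network than on any single server.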
There were some Docker-related issues that I think have been fixed in later versions of Docker. Closing this out for now, let us know if you are still seeing issues.
I'm seeing this as well with Consul 0.7.1 in Kubernetes (agent only; the server cluster is physical). The node status is also flapping between green and orange in the Consul UI.
Also seeing the same using Kubernetes 1.3 and Consul 0.7.5.