consul version for both Client and Server
Client: consul 1.0.1
Server: consul 1.0.1
consul info for both Client and Server
Client:
same as server
Server:
agent:
    check_monitors = 0
    check_ttls = 0
    checks = 32
    services = 45
build:
    prerelease =
    revision = 9564c29
    version = 1.0.1
consul:
    bootstrap = true
    known_datacenters = 7
    leader = true
    leader_addr = 10.0.66.150:8300
    server = true
raft:
    applied_index = 16074526
    commit_index = 16074526
    fsm_pending = 0
    last_contact = 0
    last_log_index = 16074526
    last_log_term = 15
    last_snapshot_index = 16070464
    last_snapshot_term = 15
    latest_configuration = [{Suffrage:Voter ID:386b24e2-c793-cd40-49dd-4116232b96bd Address:10.0.66.150:8300}]
    latest_configuration_index = 1
    num_peers = 0
    protocol_version = 3
    protocol_version_max = 3
    protocol_version_min = 0
    snapshot_version_max = 1
    snapshot_version_min = 0
    state = Leader
    term = 15
runtime:
    arch = amd64
    cpu_count = 8
    goroutines = 472
    max_procs = 8
    os = linux
    version = go1.9.2
serf_lan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 1
    event_time = 15
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 1
    members = 1
    query_queue = 0
    query_time = 1
serf_wan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 1
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 886
    members = 11
    query_queue = 0
    query_time = 1
Ubuntu 16.04.3 LTS, Docker 17.09
After upgrading Consul to version 1.0.1, the logs started to fill with messages like these:
a.b.c.d 2017/11/21 09:01:38 [ERR] serf: Rejected coordinate from HOST1: round trip time not in valid range, duration -206.486µs is not a positive value less than 10s
a.b.c.d 2017/11/21 09:02:14 [ERR] serf: Rejected coordinate from HOST2: round trip time not in valid range, duration -99.611868ms is not a positive value less than 10s
a.b.c.d 2017/11/21 09:04:28 [ERR] serf: Rejected coordinate from HOST3: round trip time not in valid range, duration -765.777µs is not a positive value less than 10s
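To gauge how much of a log is taken up by these entries, the messages above can be parsed with a short script. This is only a rough sketch: the regular expression is written against the three lines quoted here, and the names `REJECTED`, `UNIT_SECONDS`, and `rejected_rtts` are made up for illustration and are not part of Consul or serf.

```python
import re

# Matches the "Rejected coordinate" message as quoted in this issue and
# captures the host, the numeric value, and the duration unit.
REJECTED = re.compile(
    r"Rejected coordinate from (\S+): round trip time not in valid range, "
    r"duration (-?\d+(?:\.\d+)?)([^\d\s]+)\s"
)

# Simple duration units as they appear in the log lines ("\u00b5" is the micro sign).
UNIT_SECONDS = {"ns": 1e-9, "\u00b5s": 1e-6, "us": 1e-6, "ms": 1e-3, "s": 1.0}


def rejected_rtts(lines):
    """Return (host, round-trip time in seconds) for each rejected-coordinate line."""
    results = []
    for line in lines:
        match = REJECTED.search(line)
        if match is None:
            continue  # not one of the messages discussed here
        host, value, unit = match.groups()
        if unit not in UNIT_SECONDS:
            continue  # compound or unknown duration format; skipped in this sketch
        results.append((host, float(value) * UNIT_SECONDS[unit]))
    return results
```

Feeding the agent log to `rejected_rtts` line by line gives the reporting host and the (negative) round-trip time for each rejection, which makes it easy to count occurrences per host.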
Hi @mnuic, we tracked that down, but the fix didn't make it into this release cycle. We will pick it up in the next minor release of Consul via https://github.com/hashicorp/memberlist/pull/139. Sorry for the log noise; these messages can be safely ignored.
@slackpad thank you for the info! Will wait for the next release for production use.
I'm afraid this is more than just log noise. Consul 1.0.1 breaks our test environment, whereas v0.9.3 works flawlessly. The above-mentioned error messages are the only ones we see.
@sofax can you provide more details about what is broken for you?
@slackpad:
It may or may not be related to this issue - all I can say is that we don't see any other error messages.
Here is the scenario:
We have some integration tests for service health checks, e.g. one with two instances of service A, where initially both instances return an unhealthy state. Then service instance #2 is set to "healthy" (i.e. its health check resource returns a healthy state), which - as expected - makes it available via Consul. However, service instance #1 is suddenly available too, even though its health check resource still returns "unhealthy".
This does not happen with Consul 0.9.3.
@sofax thanks, that's definitely not related to this error. Can you please open a new issue with more details about how your test works, and we will take a look?
@slackpad:
Thanks - I think it turned out that the problem lies in our configuration (and in a misinterpretation of the documentation, or in a configuration example we found on the Internet that was based on Consul > 0.9.3). We had added the field id to the check definition in both instances, with the same value. v0.9.3 apparently/probably did not interpret that property at all, so it simply ignored it and assigned an automatic ID to the checks instead. v1.0.1 does interpret it, but instead of treating the ID as local to the service instance (which IMO makes more sense), it seems to give it global scope, so assigning the same ID to health checks for different service instances (of the same service) won't work.
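For anyone hitting the same thing, a quick way to spot the clash is to scan the service definitions for check IDs that are reused across instances. The sketch below is an assumption-laden illustration: it presumes the definitions are already loaded as Python dicts in the lowercase `service` / `check` / `id` layout described above, and the service names, ports, and URLs are invented. The helper `duplicate_check_ids` is hypothetical and not part of Consul.

```python
from collections import defaultdict


def duplicate_check_ids(service_definitions):
    """Map each check ID used by more than one service instance to those instances."""
    owners = defaultdict(list)
    for definition in service_definitions:
        service = definition["service"]
        check_id = service.get("check", {}).get("id")
        if check_id is not None:
            # Fall back to the service name when no explicit instance ID is set.
            owners[check_id].append(service.get("id", service["name"]))
    return {cid: ids for cid, ids in owners.items() if len(ids) > 1}


# Hypothetical definitions mirroring the setup described above:
# two instances of the same service, both using the check ID "health".
clashing = [
    {"service": {"name": "service-a", "id": "service-a-1", "port": 8081,
                 "check": {"id": "health", "http": "http://localhost:8081/health",
                           "interval": "10s"}}},
    {"service": {"name": "service-a", "id": "service-a-2", "port": 8082,
                 "check": {"id": "health", "http": "http://localhost:8082/health",
                           "interval": "10s"}}},
]

# Same definitions with one check ID per instance.
unique = [
    {"service": {**d["service"],
                 "check": {**d["service"]["check"],
                           "id": d["service"]["id"] + "-health"}}}
    for d in clashing
]
```

Here `duplicate_check_ids(clashing)` reports the shared ID together with both instances, while `duplicate_check_ids(unique)` comes back empty. Giving each check an ID derived from the instance ID, or omitting the field so that one is assigned automatically, avoids the collision.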