We brought up a cluster with three members, and once the setup is stable we see the logs below:
{"log":"2016-07-05 10:45:16.964197 W | etcdserver: failed to send out heartbeat on time (deadline exceeded for 1.85087351s)\n","stream":"stderr","time":"2016-07-05T10:45:16.964317922Z"}
{"log":"2016-07-05 10:45:16.964292 W | etcdserver: server is likely overloaded\n","stream":"stderr","time":"2016-07-05T10:45:16.964427649Z"}
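To quantify how late the heartbeats are, the warning lines above can be parsed mechanically. The following is a minimal Python sketch, assuming the logs are in Docker's JSON log format as pasted and that the delay is printed in `s` or `ms`; the name `heartbeat_delays` is made up for illustration and is not part of etcd.

```python
import json
import re

# Matches the delay reported in the "failed to send out heartbeat on time" warning.
# Assumption: the duration is printed with an "s" or "ms" suffix.
HEARTBEAT_DELAY = re.compile(r"deadline exceeded for ([0-9.]+)(ms|s)\)")


def heartbeat_delays(lines):
    """Yield (docker_timestamp, delay_in_seconds) for each late-heartbeat warning."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        match = HEARTBEAT_DELAY.search(record.get("log", ""))
        if match:
            value, unit = float(match.group(1)), match.group(2)
            yield record["time"], value / 1000.0 if unit == "ms" else value


# The two log lines from this report, kept as raw strings so the JSON escapes survive.
SAMPLE_LOG = [
    r'{"log":"2016-07-05 10:45:16.964197 W | etcdserver: failed to send out heartbeat on time (deadline exceeded for 1.85087351s)\n","stream":"stderr","time":"2016-07-05T10:45:16.964317922Z"}',
    r'{"log":"2016-07-05 10:45:16.964292 W | etcdserver: server is likely overloaded\n","stream":"stderr","time":"2016-07-05T10:45:16.964427649Z"}',
]

for timestamp, delay in heartbeat_delays(SAMPLE_LOG):
    print(timestamp, delay)
```

Feeding a whole container log file through this gives a list of timestamps and delays that can be lined up against CPU or disk activity on the host.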
I changed the heartbeat interval and the election timeout based on the RTT in my setup.
ping 10.18.3.148
PING 10.18.3.148 (10.18.3.148) 56(84) bytes of data.
64 bytes from 10.18.3.148: icmp_seq=1 ttl=64 time=0.554 ms
64 bytes from 10.18.3.148: icmp_seq=2 ttl=64 time=0.413 ms
64 bytes from 10.18.3.148: icmp_seq=3 ttl=64 time=0.396 ms
ping 10.18.3.151
PING 10.18.3.151 (10.18.3.151) 56(84) bytes of data.
64 bytes from 10.18.3.151: icmp_seq=1 ttl=64 time=0.466 ms
64 bytes from 10.18.3.151: icmp_seq=2 ttl=64 time=0.459 ms
64 bytes from 10.18.3.151: icmp_seq=3 ttl=64 time=0.526 ms
{"log":"2016-07-05 10:20:37.003826 I | etcdserver: heartbeat = 1000ms\n","stream":"stderr","time":"2016-07-05T10:20:37.003939995Z"}
{"log":"2016-07-05 10:20:37.003906 I | etcdserver: election = 5000ms\n","stream":"stderr","time":"2016-07-05T10:20:37.004101906Z"}
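As a rough cross-check of these timer values against the ping output above, here is a small Python sketch. It assumes the rule of thumb from the etcd tuning guide (heartbeat interval around 0.5 to 1.5 times the RTT between members, election timeout at least 10 times the RTT); the function names are hypothetical and the thresholds are that assumption, not something measured in this setup.

```python
import re

PING_TIME = re.compile(r"time=([0-9.]+) ms")


def rtt_samples_ms(ping_output):
    """Extract the per-packet round-trip times (ms) from ping output."""
    return [float(value) for value in PING_TIME.findall(ping_output)]


def timer_guidance(rtts_by_peer):
    """Apply the assumed rule of thumb to the worst per-peer average RTT."""
    worst_avg = max(sum(s) / len(s) for s in rtts_by_peer.values())
    return {
        "worst_avg_rtt_ms": worst_avg,
        "heartbeat_range_ms": (0.5 * worst_avg, 1.5 * worst_avg),
        "min_election_ms": 10 * worst_avg,
    }


# Ping results copied from this report.
PING_OUTPUT = {
    "10.18.3.148": """64 bytes from 10.18.3.148: icmp_seq=1 ttl=64 time=0.554 ms
64 bytes from 10.18.3.148: icmp_seq=2 ttl=64 time=0.413 ms
64 bytes from 10.18.3.148: icmp_seq=3 ttl=64 time=0.396 ms""",
    "10.18.3.151": """64 bytes from 10.18.3.151: icmp_seq=1 ttl=64 time=0.466 ms
64 bytes from 10.18.3.151: icmp_seq=2 ttl=64 time=0.459 ms
64 bytes from 10.18.3.151: icmp_seq=3 ttl=64 time=0.526 ms""",
}

guidance = timer_guidance(
    {peer: rtt_samples_ms(text) for peer, text in PING_OUTPUT.items()}
)
print(guidance)
```

With sub-millisecond RTTs, the configured 1000ms heartbeat and 5000ms election timeout are already far above what this rule of thumb asks for, which would suggest the late heartbeats are not caused by network latency.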
Stats on followers
curl http://127.0.0.1:4001/v2/stats/self | python -mjson.tool
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 350 100 350 0 0 119k 0 --:--:-- --:--:-- --:--:-- 170k
{
    "id": "d1496d0a12cd5f72",
    "leaderInfo": {
        "leader": "23d0b9d58779cdab",
        "startTime": "2016-07-05T10:21:19.556075327Z",
        "uptime": "32m43.572036434s"
    },
    "name": "etc150",
    "recvAppendRequestCnt": 11273,
    "recvBandwidthRate": 2398.514253044226,
    "recvPkgRate": 5.476935247743307,
    "sendAppendRequestCnt": 0,
    "startTime": "2016-07-05T10:20:36.701918706Z",
    "state": "StateFollower"
}
Leader
curl http://127.0.0.1:4001/v2/stats/leader | python -mjson.tool
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 440 100 440 0 0 141k 0 --:--:-- --:--:-- --:--:-- 214k
{
    "followers": {
        "65a2890a61e31a80": {
            "counts": {
                "fail": 0,
                "success": 11173
            },
            "latency": {
                "average": 0.008557420656940854,
                "current": 0.001534,
                "maximum": 1.391827,
                "minimum": 0.000549,
                "standardDeviation": 0.024860923502788825
            }
        },
        "d1496d0a12cd5f72": {
            "counts": {
                "fail": 0,
                "success": 11231
            },
            "latency": {
                "average": 0.0077906355622828745,
                "current": 0.001608,
                "maximum": 0.585447,
                "minimum": 0.000603,
                "standardDeviation": 0.017981316123206685
            }
        }
    },
    "leader": "23d0b9d58779cdab"
}
curl http://127.0.0.1:4001/v2/stats/self | python -mjson.tool
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 352 100 352 0 0 173k 0 --:--:-- --:--:-- --:--:-- 343k
{
    "id": "23d0b9d58779cdab",
    "leaderInfo": {
        "leader": "23d0b9d58779cdab",
        "startTime": "2016-07-05T10:21:19.930360147Z",
        "uptime": "33m36.131574527s"
    },
    "name": "etc151",
    "recvAppendRequestCnt": 165,
    "sendAppendRequestCnt": 22686,
    "sendBandwidthRate": 3932.7020115961404,
    "sendPkgRate": 10.999180555164079,
    "startTime": "2016-07-05T10:20:37.059879681Z",
    "state": "StateLeader"
}
Follower 2
curl http://127.0.0.1:4001/v2/stats/self | python -mjson.tool
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 353 100 353 0 0 310k 0 --:--:-- --:--:-- --:--:-- 344k
{
    "id": "65a2890a61e31a80",
    "leaderInfo": {
        "leader": "23d0b9d58779cdab",
        "startTime": "2016-07-05T10:21:19.805540361Z",
        "uptime": "33m59.848322369s"
    },
    "name": "etc148",
    "recvAppendRequestCnt": 11425,
    "recvBandwidthRate": 2012.7089388921465,
    "recvPkgRate": 5.061954249442668,
    "sendAppendRequestCnt": 335,
    "startTime": "2016-07-05T10:20:36.883783373Z",
    "state": "StateFollower"
}
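One way to read the leader stats above is to ask how far each follower's worst latency sits from its average. The sketch below does that in Python on the already parsed `/v2/stats/leader` JSON. The unit of the latency fields is not stated in the output, so the result is a unitless score; the function name and the idea of scoring by standard deviations are illustrative choices, not anything etcd provides.

```python
def latency_outlier_scores(leader_stats):
    """Return {follower_id: (maximum - average) / standardDeviation}."""
    scores = {}
    for follower_id, stats in leader_stats["followers"].items():
        latency = stats["latency"]
        spread = latency["standardDeviation"]
        if spread > 0:
            scores[follower_id] = (latency["maximum"] - latency["average"]) / spread
    return scores


# Latency figures copied from the leader stats in this report.
LEADER_STATS = {
    "followers": {
        "65a2890a61e31a80": {
            "latency": {
                "average": 0.008557420656940854,
                "maximum": 1.391827,
                "standardDeviation": 0.024860923502788825,
            }
        },
        "d1496d0a12cd5f72": {
            "latency": {
                "average": 0.0077906355622828745,
                "maximum": 0.585447,
                "standardDeviation": 0.017981316123206685,
            }
        },
    },
    "leader": "23d0b9d58779cdab",
}

for follower_id, score in latency_outlier_scores(LEADER_STATS).items():
    print(follower_id, round(score, 1))
```

Both followers show a maximum that is tens of standard deviations above the average, so the typical send is fast and the problem looks like rare, large stalls rather than a uniformly slow link.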
However, these warnings are still seen.
The cluster is mostly healthy, although there are a few instances when it flips to unhealthy:
sudo docker -H unix:///var/run/docker-bootstrap.sock exec cda5e65ec3b7 ./etcdctl cluster-health
member 23d0b9d58779cdab is healthy: got healthy result from http://10.18.3.151:4001
member 65a2890a61e31a80 is healthy: got healthy result from http://10.18.3.148:4001
member d1496d0a12cd5f72 is healthy: got healthy result from http://10.18.3.150:4001
cluster is healthy
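To catch the moments when the health flips, the `cluster-health` output can be collected periodically and parsed. This is a minimal Python sketch of the parsing step only; it assumes every member line has the form `member <id> is <state>` and the summary line has the form `cluster is <state>`, of which only the healthy variants shown above are confirmed by this report. `parse_cluster_health` is a hypothetical helper name.

```python
import re

MEMBER_LINE = re.compile(r"^member (\S+) is (\w+)")
CLUSTER_LINE = re.compile(r"^cluster is (\w+)")


def parse_cluster_health(output):
    """Return ({member_id: state}, cluster_state) from cluster-health output."""
    members, cluster = {}, None
    for line in output.splitlines():
        line = line.strip()
        member = MEMBER_LINE.match(line)
        if member:
            members[member.group(1)] = member.group(2)
            continue
        summary = CLUSTER_LINE.match(line)
        if summary:
            cluster = summary.group(1)
    return members, cluster


# Output copied from this report.
SAMPLE_HEALTH = """member 23d0b9d58779cdab is healthy: got healthy result from http://10.18.3.151:4001
member 65a2890a61e31a80 is healthy: got healthy result from http://10.18.3.148:4001
member d1496d0a12cd5f72 is healthy: got healthy result from http://10.18.3.150:4001
cluster is healthy"""

print(parse_cluster_health(SAMPLE_HEALTH))
```

Logging the parsed result with a timestamp on every run makes it possible to compare unhealthy moments with the late-heartbeat warnings in the server log.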
Also, there are no packet drops between these nodes.
A couple of questions:
1) What is the optimum configuration to avoid missed heartbeats? The setup is fairly simple, with a K8s master and around 15 pods created on 3 different nodes, including the master. etcd is running on all three nodes in clustered mode.
2) Why does the cluster health oscillate to unhealthy?
etcdctl version 2.3.7
What is your environment? Shared or dedicated? Enough CPU? SSD or HDD?
Have you checked your disk I/O? Is the etcd cluster under some workload? Can you try to get metrics from the /metrics endpoint?
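Since etcd syncs its write-ahead log to disk before acknowledging entries, slow fsync calls on a shared disk are one plausible source of stalls. The Python sketch below is a rough probe that times small write-plus-fsync cycles in a given directory. It is an illustration only: the write size, the iteration count and the name `fsync_latencies` are arbitrary assumptions, and it is not a substitute for a proper disk benchmark.

```python
import os
import tempfile
import time


def fsync_latencies(directory, writes=50, size=2048):
    """Time `writes` small write+fsync cycles in `directory`; returns seconds."""
    payload = b"\0" * size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data to stable storage before timing stops
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


# Probe the system temp directory here; point it at the etcd data disk in practice.
samples = fsync_latencies(tempfile.gettempdir(), writes=20)
print("max fsync latency (s):", max(samples))
```

Running it on the volume that holds the etcd data directory while the warnings appear shows whether individual syncs occasionally take hundreds of milliseconds or more.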
Sorry for the delay. I am new to etcd and wanted to understand whether Prometheus is the only way to collect metrics. Is there any REST endpoint I can query to get the values?
@maverick-racheal Which version of etcd are you running? Also can you answer the question I asked in the previous reply?
I am using 2.3.7. It's a shared setup with 2 vCPUs for the VMs that run etcd. The CPU usage is around 20-30% when the issue is seen. I will collect the disk I/O and metrics information and let you know.
The system is in a stable state with 29 watches on the etcd cluster: 27 from K8s and 2 from our local modules.
@maverick-racheal Can you try running
curl http://127.0.0.1:4001/metrics
Hi All,
I'm seeing log messages similar to the ones in the title during my local testing as well. My setup runs 3 etcd Docker containers in a VirtualBox VM for local testing. The VM is small: 2 vCPUs and 4 GB of memory. Below is the part of the docker-compose.yml file for one of the etcd instances (all 3 instances have a similar setup except for the ports):
version: "2"
services:
  etcd0:
    image: "quay.io/coreos/etcd:latest"
    container_name: "etcd0"
    hostname: "etcd0"
    volumes:
      - "/usr/share/ca-certificates/:/etc/ssl/certs"
    ports:
      - "4001:4001"
      - "2380:2380"
      - "2379:2379"
    command: "-name etcd0 \
      -advertise-client-urls http://etcd0:2379,http://etcd0:4001 \
      -listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \
      -listen-peer-urls http://0.0.0.0:2380 \
      -initial-advertise-peer-urls http://etcd0:2380 \
      -initial-cluster-token etcd-cluster-1 \
      -initial-cluster etcd0=http://etcd0:2380,etcd1=http://etcd1:2380,etcd2=http://etcd2:2380 \
      -initial-cluster-state new"
I have 2 Java processes running locally that poll the etcd0 container every 3 seconds to check or acquire a lock. I ran the test for a whole day and didn't see any errors in my application. Everything looks fine, except that the server log messages from the title appear occasionally, which seems odd.
Below is the output of curl http://etcd0:4001/metrics
# HELP etcd_http_failed_total Counter of handle failures of requests (non-watches), by method (GET/PUT etc.) and code (400, 500 etc.).
# TYPE etcd_http_failed_total counter
etcd_http_failed_total{code="412",method="PUT"} 1
# HELP etcd_http_received_total Counter of requests received into the system (successfully parsed and authd).
# TYPE etcd_http_received_total counter
etcd_http_received_total{method="PUT"} 196
# HELP etcd_http_successful_duration_second Bucketed histogram of processing time (s) of successfully handled requests (non-watches), by method (GET/PUT etc.).
# TYPE etcd_http_successful_duration_second histogram
etcd_http_successful_duration_second_bucket{method="PUT",le="0.0005"} 0
etcd_http_successful_duration_second_bucket{method="PUT",le="0.001"} 0
etcd_http_successful_duration_second_bucket{method="PUT",le="0.002"} 0
etcd_http_successful_duration_second_bucket{method="PUT",le="0.004"} 2
etcd_http_successful_duration_second_bucket{method="PUT",le="0.008"} 185
etcd_http_successful_duration_second_bucket{method="PUT",le="0.016"} 195
etcd_http_successful_duration_second_bucket{method="PUT",le="0.032"} 195
etcd_http_successful_duration_second_bucket{method="PUT",le="0.064"} 195
etcd_http_successful_duration_second_bucket{method="PUT",le="0.128"} 195
etcd_http_successful_duration_second_bucket{method="PUT",le="0.256"} 195
etcd_http_successful_duration_second_bucket{method="PUT",le="0.512"} 195
etcd_http_successful_duration_second_bucket{method="PUT",le="1.024"} 195
etcd_http_successful_duration_second_bucket{method="PUT",le="2.048"} 195
etcd_http_successful_duration_second_bucket{method="PUT",le="+Inf"} 195
etcd_http_successful_duration_second_sum{method="PUT"} 1.2068893850000002
etcd_http_successful_duration_second_count{method="PUT"} 195
# HELP etcd_rafthttp_message_sent_failed_total The total number of failed messages sent.
# TYPE etcd_rafthttp_message_sent_failed_total counter
etcd_rafthttp_message_sent_failed_total{msgType="MsgVote",remoteID="dcb68c82481661be",sendingType="pipeline"} 1
# HELP etcd_rafthttp_message_sent_latency_seconds message sent latency distributions.
# TYPE etcd_rafthttp_message_sent_latency_seconds histogram
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="99eab3685d8363a1",sendingType="msgappv2",le="0.0005"} 32651
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="99eab3685d8363a1",sendingType="msgappv2",le="0.001"} 32846
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="99eab3685d8363a1",sendingType="msgappv2",le="0.002"} 32853
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="99eab3685d8363a1",sendingType="msgappv2",le="0.004"} 32853
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="99eab3685d8363a1",sendingType="msgappv2",le="0.008"} 32853
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="99eab3685d8363a1",sendingType="msgappv2",le="0.016"} 32853
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="99eab3685d8363a1",sendingType="msgappv2",le="0.032"} 32853
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="99eab3685d8363a1",sendingType="msgappv2",le="0.064"} 32853
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="99eab3685d8363a1",sendingType="msgappv2",le="0.128"} 32853
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="99eab3685d8363a1",sendingType="msgappv2",le="0.256"} 32853
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="99eab3685d8363a1",sendingType="msgappv2",le="0.512"} 32853
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="99eab3685d8363a1",sendingType="msgappv2",le="1.024"} 32853
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="99eab3685d8363a1",sendingType="msgappv2",le="2.048"} 32853
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="99eab3685d8363a1",sendingType="msgappv2",le="+Inf"} 32853
etcd_rafthttp_message_sent_latency_seconds_sum{msgType="MsgApp",remoteID="99eab3685d8363a1",sendingType="msgappv2"} 2.6088587280000004
etcd_rafthttp_message_sent_latency_seconds_count{msgType="MsgApp",remoteID="99eab3685d8363a1",sendingType="msgappv2"} 32853
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="msgappv2",le="0.0005"} 36777
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="msgappv2",le="0.001"} 36978
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="msgappv2",le="0.002"} 36979
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="msgappv2",le="0.004"} 36979
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="msgappv2",le="0.008"} 36979
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="msgappv2",le="0.016"} 36979
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="msgappv2",le="0.032"} 36979
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="msgappv2",le="0.064"} 36979
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="msgappv2",le="0.128"} 36979
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="msgappv2",le="0.256"} 36979
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="msgappv2",le="0.512"} 36979
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="msgappv2",le="1.024"} 36979
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="msgappv2",le="2.048"} 36979
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="msgappv2",le="+Inf"} 36979
etcd_rafthttp_message_sent_latency_seconds_sum{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="msgappv2"} 3.3456857450000315
etcd_rafthttp_message_sent_latency_seconds_count{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="msgappv2"} 36979
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.0005"} 0
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.001"} 0
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.002"} 0
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.004"} 0
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.008"} 0
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.016"} 0
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.032"} 0
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.064"} 0
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.128"} 0
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.256"} 0
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.512"} 0
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="pipeline",le="1.024"} 0
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="pipeline",le="2.048"} 2
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="pipeline",le="+Inf"} 2
etcd_rafthttp_message_sent_latency_seconds_sum{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="pipeline"} 2.3041887990000003
etcd_rafthttp_message_sent_latency_seconds_count{msgType="MsgApp",remoteID="dcb68c82481661be",sendingType="pipeline"} 2
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="99eab3685d8363a1",sendingType="message",le="0.0005"} 65105
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="99eab3685d8363a1",sendingType="message",le="0.001"} 65354
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="99eab3685d8363a1",sendingType="message",le="0.002"} 65361
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="99eab3685d8363a1",sendingType="message",le="0.004"} 65364
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="99eab3685d8363a1",sendingType="message",le="0.008"} 65367
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="99eab3685d8363a1",sendingType="message",le="0.016"} 65367
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="99eab3685d8363a1",sendingType="message",le="0.032"} 65367
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="99eab3685d8363a1",sendingType="message",le="0.064"} 65367
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="99eab3685d8363a1",sendingType="message",le="0.128"} 65367
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="99eab3685d8363a1",sendingType="message",le="0.256"} 65367
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="99eab3685d8363a1",sendingType="message",le="0.512"} 65367
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="99eab3685d8363a1",sendingType="message",le="1.024"} 65367
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="99eab3685d8363a1",sendingType="message",le="2.048"} 65367
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="99eab3685d8363a1",sendingType="message",le="+Inf"} 65367
etcd_rafthttp_message_sent_latency_seconds_sum{msgType="MsgHeartbeat",remoteID="99eab3685d8363a1",sendingType="message"} 6.135820232999963
etcd_rafthttp_message_sent_latency_seconds_count{msgType="MsgHeartbeat",remoteID="99eab3685d8363a1",sendingType="message"} 65367
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="message",le="0.0005"} 65199
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="message",le="0.001"} 65359
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="message",le="0.002"} 65364
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="message",le="0.004"} 65365
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="message",le="0.008"} 65365
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="message",le="0.016"} 65365
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="message",le="0.032"} 65365
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="message",le="0.064"} 65365
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="message",le="0.128"} 65365
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="message",le="0.256"} 65365
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="message",le="0.512"} 65365
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="message",le="1.024"} 65365
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="message",le="2.048"} 65365
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="message",le="+Inf"} 65365
etcd_rafthttp_message_sent_latency_seconds_sum{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="message"} 4.3603451950000185
etcd_rafthttp_message_sent_latency_seconds_count{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="message"} 65365
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.0005"} 0
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.001"} 0
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.002"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.004"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.008"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.016"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.032"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.064"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.128"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.256"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="pipeline",le="0.512"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="pipeline",le="1.024"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="pipeline",le="2.048"} 2
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="pipeline",le="+Inf"} 2
etcd_rafthttp_message_sent_latency_seconds_sum{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="pipeline"} 1.115215818
etcd_rafthttp_message_sent_latency_seconds_count{msgType="MsgHeartbeat",remoteID="dcb68c82481661be",sendingType="pipeline"} 2
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="message",le="0.0005"} 7839
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="message",le="0.001"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="message",le="0.002"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="message",le="0.004"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="message",le="0.008"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="message",le="0.016"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="message",le="0.032"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="message",le="0.064"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="message",le="0.128"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="message",le="0.256"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="message",le="0.512"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="message",le="1.024"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="message",le="2.048"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="message",le="+Inf"} 7844
etcd_rafthttp_message_sent_latency_seconds_sum{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="message"} 0.4068390939999997
etcd_rafthttp_message_sent_latency_seconds_count{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="message"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="msgappv2",le="0.0005"} 7829
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="msgappv2",le="0.001"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="msgappv2",le="0.002"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="msgappv2",le="0.004"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="msgappv2",le="0.008"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="msgappv2",le="0.016"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="msgappv2",le="0.032"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="msgappv2",le="0.064"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="msgappv2",le="0.128"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="msgappv2",le="0.256"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="msgappv2",le="0.512"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="msgappv2",le="1.024"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="msgappv2",le="2.048"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="msgappv2",le="+Inf"} 7844
etcd_rafthttp_message_sent_latency_seconds_sum{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="msgappv2"} 0.7283043240000001
etcd_rafthttp_message_sent_latency_seconds_count{msgType="MsgLinkHeartbeat",remoteID="0",sendingType="msgappv2"} 7844
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgVote",remoteID="99eab3685d8363a1",sendingType="message",le="0.0005"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgVote",remoteID="99eab3685d8363a1",sendingType="message",le="0.001"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgVote",remoteID="99eab3685d8363a1",sendingType="message",le="0.002"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgVote",remoteID="99eab3685d8363a1",sendingType="message",le="0.004"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgVote",remoteID="99eab3685d8363a1",sendingType="message",le="0.008"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgVote",remoteID="99eab3685d8363a1",sendingType="message",le="0.016"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgVote",remoteID="99eab3685d8363a1",sendingType="message",le="0.032"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgVote",remoteID="99eab3685d8363a1",sendingType="message",le="0.064"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgVote",remoteID="99eab3685d8363a1",sendingType="message",le="0.128"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgVote",remoteID="99eab3685d8363a1",sendingType="message",le="0.256"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgVote",remoteID="99eab3685d8363a1",sendingType="message",le="0.512"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgVote",remoteID="99eab3685d8363a1",sendingType="message",le="1.024"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgVote",remoteID="99eab3685d8363a1",sendingType="message",le="2.048"} 1
etcd_rafthttp_message_sent_latency_seconds_bucket{msgType="MsgVote",remoteID="99eab3685d8363a1",sendingType="message",le="+Inf"} 1
etcd_rafthttp_message_sent_latency_seconds_sum{msgType="MsgVote",remoteID="99eab3685d8363a1",sendingType="message"} 0.000118848
etcd_rafthttp_message_sent_latency_seconds_count{msgType="MsgVote",remoteID="99eab3685d8363a1",sendingType="message"} 1
# HELP etcd_server_file_descriptors_used_total The total number of file descriptors used.
# TYPE etcd_server_file_descriptors_used_total gauge
etcd_server_file_descriptors_used_total 26
# HELP etcd_server_pending_proposal_total The total number of pending proposals.
# TYPE etcd_server_pending_proposal_total gauge
etcd_server_pending_proposal_total 0
# HELP etcd_server_proposal_durations_seconds The latency distributions of committing proposal.
# TYPE etcd_server_proposal_durations_seconds histogram
etcd_server_proposal_durations_seconds_bucket{le="0.001"} 0
etcd_server_proposal_durations_seconds_bucket{le="0.002"} 0
etcd_server_proposal_durations_seconds_bucket{le="0.004"} 5
etcd_server_proposal_durations_seconds_bucket{le="0.008"} 186
etcd_server_proposal_durations_seconds_bucket{le="0.016"} 196
etcd_server_proposal_durations_seconds_bucket{le="0.032"} 196
etcd_server_proposal_durations_seconds_bucket{le="0.064"} 196
etcd_server_proposal_durations_seconds_bucket{le="0.128"} 196
etcd_server_proposal_durations_seconds_bucket{le="0.256"} 196
etcd_server_proposal_durations_seconds_bucket{le="0.512"} 196
etcd_server_proposal_durations_seconds_bucket{le="1.024"} 197
etcd_server_proposal_durations_seconds_bucket{le="2.048"} 197
etcd_server_proposal_durations_seconds_bucket{le="4.096"} 197
etcd_server_proposal_durations_seconds_bucket{le="8.192"} 197
etcd_server_proposal_durations_seconds_bucket{le="+Inf"} 197
etcd_server_proposal_durations_seconds_sum 2.104242892999998
etcd_server_proposal_durations_seconds_count 197
# HELP etcd_server_proposal_failed_total The total number of failed proposals.
# TYPE etcd_server_proposal_failed_total counter
etcd_server_proposal_failed_total 0
# HELP etcd_snapshot_save_marshalling_durations_seconds The marshalling cost distributions of save called by snapshot.
# TYPE etcd_snapshot_save_marshalling_durations_seconds histogram
etcd_snapshot_save_marshalling_durations_seconds_bucket{le="0.001"} 0
etcd_snapshot_save_marshalling_durations_seconds_bucket{le="0.002"} 1
etcd_snapshot_save_marshalling_durations_seconds_bucket{le="0.004"} 1
etcd_snapshot_save_marshalling_durations_seconds_bucket{le="0.008"} 1
etcd_snapshot_save_marshalling_durations_seconds_bucket{le="0.016"} 1
etcd_snapshot_save_marshalling_durations_seconds_bucket{le="0.032"} 1
etcd_snapshot_save_marshalling_durations_seconds_bucket{le="0.064"} 1
etcd_snapshot_save_marshalling_durations_seconds_bucket{le="0.128"} 1
etcd_snapshot_save_marshalling_durations_seconds_bucket{le="0.256"} 1
etcd_snapshot_save_marshalling_durations_seconds_bucket{le="0.512"} 1
etcd_snapshot_save_marshalling_durations_seconds_bucket{le="1.024"} 1
etcd_snapshot_save_marshalling_durations_seconds_bucket{le="2.048"} 1
etcd_snapshot_save_marshalling_durations_seconds_bucket{le="4.096"} 1
etcd_snapshot_save_marshalling_durations_seconds_bucket{le="8.192"} 1
etcd_snapshot_save_marshalling_durations_seconds_bucket{le="+Inf"} 1
etcd_snapshot_save_marshalling_durations_seconds_sum 0.001200149
etcd_snapshot_save_marshalling_durations_seconds_count 1
# HELP etcd_snapshot_save_total_durations_seconds The total latency distributions of save called by snapshot.
# TYPE etcd_snapshot_save_total_durations_seconds histogram
etcd_snapshot_save_total_durations_seconds_bucket{le="0.001"} 0
etcd_snapshot_save_total_durations_seconds_bucket{le="0.002"} 1
etcd_snapshot_save_total_durations_seconds_bucket{le="0.004"} 1
etcd_snapshot_save_total_durations_seconds_bucket{le="0.008"} 1
etcd_snapshot_save_total_durations_seconds_bucket{le="0.016"} 1
etcd_snapshot_save_total_durations_seconds_bucket{le="0.032"} 1
etcd_snapshot_save_total_durations_seconds_bucket{le="0.064"} 1
etcd_snapshot_save_total_durations_seconds_bucket{le="0.128"} 1
etcd_snapshot_save_total_durations_seconds_bucket{le="0.256"} 1
etcd_snapshot_save_total_durations_seconds_bucket{le="0.512"} 1
etcd_snapshot_save_total_durations_seconds_bucket{le="1.024"} 1
etcd_snapshot_save_total_durations_seconds_bucket{le="2.048"} 1
etcd_snapshot_save_total_durations_seconds_bucket{le="4.096"} 1
etcd_snapshot_save_total_durations_seconds_bucket{le="8.192"} 1
etcd_snapshot_save_total_durations_seconds_bucket{le="+Inf"} 1
etcd_snapshot_save_total_durations_seconds_sum 0.00153799
etcd_snapshot_save_total_durations_seconds_count 1
# HELP etcd_storage_db_compaction_pause_duration_milliseconds Bucketed histogram of db compaction pause duration.
# TYPE etcd_storage_db_compaction_pause_duration_milliseconds histogram
etcd_storage_db_compaction_pause_duration_milliseconds_bucket{le="1"} 0
etcd_storage_db_compaction_pause_duration_milliseconds_bucket{le="2"} 0
etcd_storage_db_compaction_pause_duration_milliseconds_bucket{le="4"} 0
etcd_storage_db_compaction_pause_duration_milliseconds_bucket{le="8"} 0
etcd_storage_db_compaction_pause_duration_milliseconds_bucket{le="16"} 0
etcd_storage_db_compaction_pause_duration_milliseconds_bucket{le="32"} 0
etcd_storage_db_compaction_pause_duration_milliseconds_bucket{le="64"} 0
etcd_storage_db_compaction_pause_duration_milliseconds_bucket{le="128"} 0
etcd_storage_db_compaction_pause_duration_milliseconds_bucket{le="256"} 0
etcd_storage_db_compaction_pause_duration_milliseconds_bucket{le="512"} 0
etcd_storage_db_compaction_pause_duration_milliseconds_bucket{le="1024"} 0
etcd_storage_db_compaction_pause_duration_milliseconds_bucket{le="2048"} 0
etcd_storage_db_compaction_pause_duration_milliseconds_bucket{le="4096"} 0
etcd_storage_db_compaction_pause_duration_milliseconds_bucket{le="+Inf"} 0
etcd_storage_db_compaction_pause_duration_milliseconds_sum 0
etcd_storage_db_compaction_pause_duration_milliseconds_count 0
# HELP etcd_storage_db_compaction_total_duration_milliseconds Bucketed histogram of db compaction total duration.
# TYPE etcd_storage_db_compaction_total_duration_milliseconds histogram
etcd_storage_db_compaction_total_duration_milliseconds_bucket{le="100"} 0
etcd_storage_db_compaction_total_duration_milliseconds_bucket{le="200"} 0
etcd_storage_db_compaction_total_duration_milliseconds_bucket{le="400"} 0
etcd_storage_db_compaction_total_duration_milliseconds_bucket{le="800"} 0
etcd_storage_db_compaction_total_duration_milliseconds_bucket{le="1600"} 0
etcd_storage_db_compaction_total_duration_milliseconds_bucket{le="3200"} 0
etcd_storage_db_compaction_total_duration_milliseconds_bucket{le="6400"} 0
etcd_storage_db_compaction_total_duration_milliseconds_bucket{le="12800"} 0
etcd_storage_db_compaction_total_duration_milliseconds_bucket{le="25600"} 0
etcd_storage_db_compaction_total_duration_milliseconds_bucket{le="51200"} 0
etcd_storage_db_compaction_total_duration_milliseconds_bucket{le="102400"} 0
etcd_storage_db_compaction_total_duration_milliseconds_bucket{le="204800"} 0
etcd_storage_db_compaction_total_duration_milliseconds_bucket{le="409600"} 0
etcd_storage_db_compaction_total_duration_milliseconds_bucket{le="819200"} 0
etcd_storage_db_compaction_total_duration_milliseconds_bucket{le="+Inf"} 0
etcd_storage_db_compaction_total_duration_milliseconds_sum 0
etcd_storage_db_compaction_total_duration_milliseconds_count 0
# HELP etcd_storage_db_total_size_in_bytes Total size of the underlying database in bytes.
# TYPE etcd_storage_db_total_size_in_bytes gauge
etcd_storage_db_total_size_in_bytes 0
# HELP etcd_storage_delete_total Total number of deletes seen by this member.
# TYPE etcd_storage_delete_total counter
etcd_storage_delete_total 0
# HELP etcd_storage_events_total Total number of events sent by this member.
# TYPE etcd_storage_events_total counter
etcd_storage_events_total 0
# HELP etcd_storage_index_compaction_pause_duration_milliseconds Bucketed histogram of index compaction pause duration.
# TYPE etcd_storage_index_compaction_pause_duration_milliseconds histogram
etcd_storage_index_compaction_pause_duration_milliseconds_bucket{le="0.5"} 0
etcd_storage_index_compaction_pause_duration_milliseconds_bucket{le="1"} 0
etcd_storage_index_compaction_pause_duration_milliseconds_bucket{le="2"} 0
etcd_storage_index_compaction_pause_duration_milliseconds_bucket{le="4"} 0
etcd_storage_index_compaction_pause_duration_milliseconds_bucket{le="8"} 0
etcd_storage_index_compaction_pause_duration_milliseconds_bucket{le="16"} 0
etcd_storage_index_compaction_pause_duration_milliseconds_bucket{le="32"} 0
etcd_storage_index_compaction_pause_duration_milliseconds_bucket{le="64"} 0
etcd_storage_index_compaction_pause_duration_milliseconds_bucket{le="128"} 0
etcd_storage_index_compaction_pause_duration_milliseconds_bucket{le="256"} 0
etcd_storage_index_compaction_pause_duration_milliseconds_bucket{le="512"} 0
etcd_storage_index_compaction_pause_duration_milliseconds_bucket{le="1024"} 0
etcd_storage_index_compaction_pause_duration_milliseconds_bucket{le="+Inf"} 0
etcd_storage_index_compaction_pause_duration_milliseconds_sum 0
etcd_storage_index_compaction_pause_duration_milliseconds_count 0
# HELP etcd_storage_keys_total Total number of keys.
# TYPE etcd_storage_keys_total gauge
etcd_storage_keys_total 0
# HELP etcd_storage_pending_events_total Total number of pending events to be sent.
# TYPE etcd_storage_pending_events_total gauge
etcd_storage_pending_events_total 0
# HELP etcd_storage_put_total Total number of puts seen by this member.
# TYPE etcd_storage_put_total counter
etcd_storage_put_total 0
# HELP etcd_storage_range_total Total number of ranges seen by this member.
# TYPE etcd_storage_range_total counter
etcd_storage_range_total 0
# HELP etcd_storage_slow_watcher_total Total number of unsynced slow watchers.
# TYPE etcd_storage_slow_watcher_total gauge
etcd_storage_slow_watcher_total 0
# HELP etcd_storage_txn_total Total number of txns seen by this member.
# TYPE etcd_storage_txn_total counter
etcd_storage_txn_total 0
# HELP etcd_storage_watch_stream_total Total number of watch streams.
# TYPE etcd_storage_watch_stream_total gauge
etcd_storage_watch_stream_total 0
# HELP etcd_storage_watcher_total Total number of watchers.
# TYPE etcd_storage_watcher_total gauge
etcd_storage_watcher_total 0
# HELP etcd_store_expires_total Total number of expired keys.
# TYPE etcd_store_expires_total counter
etcd_store_expires_total 0
# HELP etcd_store_reads_total Total number of reads action by (get/getRecursive), local to this member.
# TYPE etcd_store_reads_total counter
etcd_store_reads_total{action="get"} 581
etcd_store_reads_total{action="getRecursive"} 2
# HELP etcd_store_watch_requests_total Total number of incoming watch requests (new or reestablished).
# TYPE etcd_store_watch_requests_total counter
etcd_store_watch_requests_total 384
# HELP etcd_store_watchers Count of currently active watchers.
# TYPE etcd_store_watchers gauge
etcd_store_watchers 2
# HELP etcd_store_writes_total Total number of writes (e.g. set/compareAndDelete) seen by this member.
# TYPE etcd_store_writes_total counter
etcd_store_writes_total{action="create"} 2
etcd_store_writes_total{action="set"} 3
etcd_store_writes_total{action="update"} 194
# HELP etcd_wal_fsync_durations_seconds The latency distributions of fsync called by wal.
# TYPE etcd_wal_fsync_durations_seconds histogram
etcd_wal_fsync_durations_seconds_bucket{le="0.001"} 0
etcd_wal_fsync_durations_seconds_bucket{le="0.002"} 6
etcd_wal_fsync_durations_seconds_bucket{le="0.004"} 2675
etcd_wal_fsync_durations_seconds_bucket{le="0.008"} 13159
etcd_wal_fsync_durations_seconds_bucket{le="0.016"} 13255
etcd_wal_fsync_durations_seconds_bucket{le="0.032"} 13274
etcd_wal_fsync_durations_seconds_bucket{le="0.064"} 13277
etcd_wal_fsync_durations_seconds_bucket{le="0.128"} 13277
etcd_wal_fsync_durations_seconds_bucket{le="0.256"} 13277
etcd_wal_fsync_durations_seconds_bucket{le="0.512"} 13277
etcd_wal_fsync_durations_seconds_bucket{le="1.024"} 13277
etcd_wal_fsync_durations_seconds_bucket{le="2.048"} 13277
etcd_wal_fsync_durations_seconds_bucket{le="4.096"} 13277
etcd_wal_fsync_durations_seconds_bucket{le="8.192"} 13277
etcd_wal_fsync_durations_seconds_bucket{le="+Inf"} 13277
etcd_wal_fsync_durations_seconds_sum 66.33083589699976
etcd_wal_fsync_durations_seconds_count 13277
# HELP etcd_wal_last_index_saved The index of the last entry saved by wal.
# TYPE etcd_wal_last_index_saved gauge
etcd_wal_last_index_saved 1.807006e+06
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 6.648400000000001e-05
go_gc_duration_seconds{quantile="0.25"} 0.00016350200000000002
go_gc_duration_seconds{quantile="0.5"} 0.000249463
go_gc_duration_seconds{quantile="0.75"} 0.000477232
go_gc_duration_seconds{quantile="1"} 0.0042545790000000005
go_gc_duration_seconds_sum 0.0333288
go_gc_duration_seconds_count 76
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 113
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 2.114048e+07
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 3.74153904e+08
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 1.473048e+06
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 5.414783e+06
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 882688
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 2.114048e+07
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 3.629056e+06
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 2.2552576e+07
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 77753
# HELP go_memstats_heap_released_bytes_total Total number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes_total counter
go_memstats_heap_released_bytes_total 1.572864e+06
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 2.6181632e+07
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.4679479927504718e+19
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 4762
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 5.492536e+06
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 2400
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 16384
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 113760
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 147456
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 3.4615386e+07
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 653024
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 1.081344e+06
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 1.081344e+06
# HELP go_memstats_sys_bytes Number of bytes obtained by system. Sum of all system allocations.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 3.0435576e+07
# HELP http_request_duration_microseconds The HTTP request latencies in microseconds.
# TYPE http_request_duration_microseconds summary
http_request_duration_microseconds{handler="prometheus",quantile="0.5"} 2281.415
http_request_duration_microseconds{handler="prometheus",quantile="0.9"} 2281.415
http_request_duration_microseconds{handler="prometheus",quantile="0.99"} 2281.415
http_request_duration_microseconds_sum{handler="prometheus"} 2281.415
http_request_duration_microseconds_count{handler="prometheus"} 1
# HELP http_request_size_bytes The HTTP request sizes in bytes.
# TYPE http_request_size_bytes summary
http_request_size_bytes{handler="prometheus",quantile="0.5"} 63
http_request_size_bytes{handler="prometheus",quantile="0.9"} 63
http_request_size_bytes{handler="prometheus",quantile="0.99"} 63
http_request_size_bytes_sum{handler="prometheus"} 63
http_request_size_bytes_count{handler="prometheus"} 1
# HELP http_requests_total Total number of HTTP requests made.
# TYPE http_requests_total counter
http_requests_total{code="200",handler="prometheus",method="get"} 1
# HELP http_response_size_bytes The HTTP response sizes in bytes.
# TYPE http_response_size_bytes summary
http_response_size_bytes{handler="prometheus",quantile="0.5"} 40192
http_response_size_bytes{handler="prometheus",quantile="0.9"} 40192
http_response_size_bytes{handler="prometheus",quantile="0.99"} 40192
http_response_size_bytes_sum{handler="prometheus"} 40192
http_response_size_bytes_count{handler="prometheus"} 1
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 98.67
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 27
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 3.3574912e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.4679415077e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 4.8164864e+07
"etcdctl cluster-health" also shows that the cluster is healthy
member 2c6c7b8d0d0933f7 is healthy: got healthy result from http://etcd0:2379
member 99eab3685d8363a1 is healthy: got healthy result from http://etcd1:2379
member dcb68c82481661be is healthy: got healthy result from http://etcd2:2379
cluster is healthy
@xloggerster Your disk I/O was slow: some fsyncs took longer than 50ms, which is why etcd complains. Try a better disk (one not shared by all 3 etcd members, or an SSD), or make the heartbeat interval longer.
@xiang90 Thanks for your prompt reply! May I know which metric indicated that the disk I/O was slow?
I'm already using an SSD... I will retest with a longer heartbeat interval.
etcd_wal_fsync_durations_seconds_bucket{le="0.004"} 2675
etcd_wal_fsync_durations_seconds_bucket{le="0.008"} 13159
etcd_wal_fsync_durations_seconds_bucket{le="0.016"} 13255
etcd_wal_fsync_durations_seconds_bucket{le="0.032"} 13274
etcd_wal_fsync_durations_seconds_bucket{le="0.064"} 13277
@xloggerster If you are running on an SSD, most fsyncs should take under 0.004s (4ms).
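For anyone reading the bucket lines above: Prometheus histogram buckets are cumulative, so each le="X" line counts every fsync that took X seconds or less. Below is a minimal Python sketch that turns those lines into fractions. The helper names parse_buckets and fraction_at_or_below are made up for illustration and are not part of etcd; the sample data is copied from the metrics posted earlier in this thread.

```python
import re

# Bucket lines copied from the etcd_wal_fsync_durations_seconds output in this thread.
SAMPLE = """\
etcd_wal_fsync_durations_seconds_bucket{le="0.001"} 0
etcd_wal_fsync_durations_seconds_bucket{le="0.002"} 6
etcd_wal_fsync_durations_seconds_bucket{le="0.004"} 2675
etcd_wal_fsync_durations_seconds_bucket{le="0.008"} 13159
etcd_wal_fsync_durations_seconds_bucket{le="0.016"} 13255
etcd_wal_fsync_durations_seconds_bucket{le="0.032"} 13274
etcd_wal_fsync_durations_seconds_bucket{le="0.064"} 13277
etcd_wal_fsync_durations_seconds_bucket{le="+Inf"} 13277
"""

# Matches the upper bound (le) and the cumulative count of one bucket line.
BUCKET_RE = re.compile(r'_bucket\{le="([^"]+)"\}\s+(\d+)')


def parse_buckets(text):
    """Return (upper_bound_seconds, cumulative_count) pairs sorted by bound."""
    buckets = []
    for bound, count in BUCKET_RE.findall(text):
        # float() accepts "+Inf", so the catch-all bucket sorts last.
        buckets.append((float(bound), int(count)))
    return sorted(buckets)


def fraction_at_or_below(buckets, threshold_seconds):
    """Fraction of observations in buckets whose bound is <= threshold_seconds."""
    total = buckets[-1][1]  # the +Inf bucket holds the total count
    if total == 0:
        return 0.0
    below = max((c for b, c in buckets if b <= threshold_seconds), default=0)
    return below / total


if __name__ == "__main__":
    buckets = parse_buckets(SAMPLE)
    print(f"<= 4ms: {fraction_at_or_below(buckets, 0.004):.1%}")
    print(f"<= 8ms: {fraction_at_or_below(buckets, 0.008):.1%}")
```

On the numbers above, only about 20% of fsyncs finished within 4ms, while about 99% finished within 8ms. Three of the 13277 fsyncs fall in the 32ms to 64ms bucket.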
I am closing this due to low activity. As I mentioned, I suspect the cause is slow disk I/O. You probably need to increase the timeout.
I hit the same issue when running 3 etcd processes on localhost. It's 100% reproducible with the latest etcd, v3.0.8. (I also tried the slightly older 3.0.4 and saw the same issue.) Here are the repro steps:
First, I wrote a simple bash script, etcd.sh:
#!/bin/bash
# Usage: ./etcd.sh INDEX [MODE]
INDEX=$1
MODE=$2
# Ports are built by string concatenation, e.g. INDEX=1 gives 3101 and 3100.
PEER_PORT=3${INDEX}01
CLIENT_PORT=3${INDEX}00
DATADIR=d.$INDEX
# If MODE is not given, use "existing" when this member already has data, else "new".
if [ -z "$MODE" ]; then
  if [ -d "$DATADIR/member" ]; then
    MODE=existing
  else
    MODE=new
  fi
fi
set -x
exec etcd --name "etcd$INDEX" \
  --initial-advertise-peer-urls "http://localhost:$PEER_PORT" \
  --listen-peer-urls "http://127.0.0.1:$PEER_PORT" \
  --listen-client-urls "http://127.0.0.1:$CLIENT_PORT" \
  --advertise-client-urls "http://localhost:$CLIENT_PORT" \
  --initial-cluster-token etcd \
  --initial-cluster etcd0=http://localhost:3001,etcd1=http://localhost:3101,etcd2=http://localhost:3201 \
  --initial-cluster-state "$MODE" \
  --data-dir "$DATADIR"
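To make the port scheme in the script explicit, here is a small Python sketch that mirrors how etcd.sh derives the ports from the index and shows that the hard-coded --initial-cluster value follows the same pattern. The function names ports_for and initial_cluster are hypothetical and exist only for this illustration.

```python
def ports_for(index):
    """Mirror PEER_PORT=3${INDEX}01 and CLIENT_PORT=3${INDEX}00 from etcd.sh."""
    # The script builds the ports by string concatenation, not arithmetic.
    peer_port = int(f"3{index}01")
    client_port = int(f"3{index}00")
    return peer_port, client_port


def initial_cluster(member_count):
    """Build the --initial-cluster value for members etcd0..etcd(N-1)."""
    parts = []
    for index in range(member_count):
        peer_port, _ = ports_for(index)
        parts.append(f"etcd{index}=http://localhost:{peer_port}")
    return ",".join(parts)


if __name__ == "__main__":
    print(initial_cluster(3))
```

For three members this reproduces the value passed to --initial-cluster in the script, so every member advertises the peer URL that the others expect.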
Then I opened 3 terminals and ran one of these in each, one after another:
./etcd.sh 0
./etcd.sh 1
./etcd.sh 2
After I started etcd.sh 0 and etcd.sh 1, the cluster had quorum, I could use etcdctl to access it, and the logs from both processes looked normal.
After I started etcd.sh 2, the whole cluster was up, but the log of process etcd0 kept printing:
2016-09-14 10:43:46.550891 W | etcdserver: failed to send out heartbeat on time (exceeded the 100ms timeout for 104.011472ms)
2016-09-14 10:43:46.550900 W | etcdserver: server is likely overloaded
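To see how often and by how much this warning fires, the two durations can be pulled out of each log line. The following is a rough Python sketch that assumes the exact message format shown above; the name parse_heartbeat_warning is made up for this example. It only extracts the values and does not interpret how etcd computes the reported duration. It handles only the "ms" and "s" suffixes that appear in this thread.

```python
import re

# Matches the v3.0.x warning format quoted in this thread.
WARNING_RE = re.compile(
    r"failed to send out heartbeat on time "
    r"\(exceeded the (?P<timeout>[\d.]+)ms timeout "
    r"for (?P<reported>[\d.]+)(?P<unit>ms|s)\)"
)


def parse_heartbeat_warning(line):
    """Return (timeout_ms, reported_ms) for a warning line, or None if no match."""
    match = WARNING_RE.search(line)
    if match is None:
        return None
    reported = float(match.group("reported"))
    if match.group("unit") == "s":
        reported *= 1000.0  # normalize seconds to milliseconds
    return float(match.group("timeout")), reported


if __name__ == "__main__":
    line = (
        "2016-09-14 10:43:46.550891 W | etcdserver: failed to send out "
        "heartbeat on time (exceeded the 100ms timeout for 104.011472ms)"
    )
    print(parse_heartbeat_warning(line))
```

Running this over a whole log file and collecting the reported values gives a quick picture of whether the delays are occasional spikes or constant.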
I'm running directly on a MacBook Pro (2013/2014) with an i7 processor (4 cores with HT), 16GB of memory, and an SSD. No other workload is running at the same time, and the etcd cluster is completely new, without any data.
Since then, I have been repeatedly running etcdctl -C ... member list, and it fails very frequently with this error:
Error: client: etcd cluster is unavailable or misconfigured
error #0: client: endpoint http://localhost:3100 exceeded header timeout
error #1: client: endpoint http://localhost:3000 exceeded header timeout
error #2: client: endpoint http://localhost:3200 exceeded header timeout
And with curl http://localhost:3000/metrics, I got:
# HELP etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds Bucketed histogram of db compaction pause duration.
# TYPE etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds histogram
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="1"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="2"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="4"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="8"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="16"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="32"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="64"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="128"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="256"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="512"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="1024"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="2048"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="4096"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="+Inf"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_sum 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_count 0
# HELP etcd_debugging_mvcc_db_compaction_total_duration_milliseconds Bucketed histogram of db compaction total duration.
# TYPE etcd_debugging_mvcc_db_compaction_total_duration_milliseconds histogram
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="100"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="400"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="800"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="1600"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="3200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="6400"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="12800"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="25600"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="51200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="102400"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="204800"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="409600"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="819200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="+Inf"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_sum 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_count 0
# HELP etcd_debugging_mvcc_db_total_size_in_bytes Total size of the underlying database in bytes.
# TYPE etcd_debugging_mvcc_db_total_size_in_bytes gauge
etcd_debugging_mvcc_db_total_size_in_bytes 0
# HELP etcd_debugging_mvcc_delete_total Total number of deletes seen by this member.
# TYPE etcd_debugging_mvcc_delete_total counter
etcd_debugging_mvcc_delete_total 0
# HELP etcd_debugging_mvcc_events_total Total number of events sent by this member.
# TYPE etcd_debugging_mvcc_events_total counter
etcd_debugging_mvcc_events_total 0
# HELP etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds Bucketed histogram of index compaction pause duration.
# TYPE etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds histogram
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="0.5"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="1"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="2"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="4"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="8"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="16"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="32"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="64"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="128"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="256"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="512"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="1024"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="+Inf"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_sum 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_count 0
# HELP etcd_debugging_mvcc_keys_total Total number of keys.
# TYPE etcd_debugging_mvcc_keys_total gauge
etcd_debugging_mvcc_keys_total 0
# HELP etcd_debugging_mvcc_pending_events_total Total number of pending events to be sent.
# TYPE etcd_debugging_mvcc_pending_events_total gauge
etcd_debugging_mvcc_pending_events_total 0
# HELP etcd_debugging_mvcc_put_total Total number of puts seen by this member.
# TYPE etcd_debugging_mvcc_put_total counter
etcd_debugging_mvcc_put_total 0
# HELP etcd_debugging_mvcc_range_total Total number of ranges seen by this member.
# TYPE etcd_debugging_mvcc_range_total counter
etcd_debugging_mvcc_range_total 0
# HELP etcd_debugging_mvcc_slow_watcher_total Total number of unsynced slow watchers.
# TYPE etcd_debugging_mvcc_slow_watcher_total gauge
etcd_debugging_mvcc_slow_watcher_total 0
# HELP etcd_debugging_mvcc_txn_total Total number of txns seen by this member.
# TYPE etcd_debugging_mvcc_txn_total counter
etcd_debugging_mvcc_txn_total 0
# HELP etcd_debugging_mvcc_watch_stream_total Total number of watch streams.
# TYPE etcd_debugging_mvcc_watch_stream_total gauge
etcd_debugging_mvcc_watch_stream_total 0
# HELP etcd_debugging_mvcc_watcher_total Total number of watchers.
# TYPE etcd_debugging_mvcc_watcher_total gauge
etcd_debugging_mvcc_watcher_total 0
# HELP etcd_debugging_snap_save_marshalling_duration_seconds The marshalling cost distributions of save called by snapshot.
# TYPE etcd_debugging_snap_save_marshalling_duration_seconds histogram
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.001"} 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.002"} 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.004"} 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.008"} 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.016"} 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.032"} 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.064"} 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.128"} 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.256"} 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.512"} 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="1.024"} 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="2.048"} 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="4.096"} 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="8.192"} 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="+Inf"} 0
etcd_debugging_snap_save_marshalling_duration_seconds_sum 0
etcd_debugging_snap_save_marshalling_duration_seconds_count 0
# HELP etcd_debugging_snap_save_total_duration_seconds The total latency distributions of save called by snapshot.
# TYPE etcd_debugging_snap_save_total_duration_seconds histogram
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.001"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.002"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.004"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.008"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.016"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.032"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.064"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.128"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.256"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.512"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="1.024"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="2.048"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="4.096"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="8.192"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="+Inf"} 0
etcd_debugging_snap_save_total_duration_seconds_sum 0
etcd_debugging_snap_save_total_duration_seconds_count 0
# HELP etcd_debugging_store_expires_total Total number of expired keys.
# TYPE etcd_debugging_store_expires_total counter
etcd_debugging_store_expires_total 0
# HELP etcd_debugging_store_reads_total Total number of reads action by (get/getRecursive), local to this member.
# TYPE etcd_debugging_store_reads_total counter
etcd_debugging_store_reads_total{action="getRecursive"} 5
# HELP etcd_debugging_store_watch_requests_total Total number of incoming watch requests (new or reestablished).
# TYPE etcd_debugging_store_watch_requests_total counter
etcd_debugging_store_watch_requests_total 0
# HELP etcd_debugging_store_watchers Count of currently active watchers.
# TYPE etcd_debugging_store_watchers gauge
etcd_debugging_store_watchers 0
# HELP etcd_debugging_store_writes_total Total number of writes (e.g. set/compareAndDelete) seen by this member.
# TYPE etcd_debugging_store_writes_total counter
etcd_debugging_store_writes_total{action="create"} 3
etcd_debugging_store_writes_total{action="set"} 5
# HELP etcd_disk_backend_commit_duration_seconds The latency distributions of commit called by backend.
# TYPE etcd_disk_backend_commit_duration_seconds histogram
etcd_disk_backend_commit_duration_seconds_bucket{le="0.001"} 0
etcd_disk_backend_commit_duration_seconds_bucket{le="0.002"} 0
etcd_disk_backend_commit_duration_seconds_bucket{le="0.004"} 0
etcd_disk_backend_commit_duration_seconds_bucket{le="0.008"} 6
etcd_disk_backend_commit_duration_seconds_bucket{le="0.016"} 7
etcd_disk_backend_commit_duration_seconds_bucket{le="0.032"} 7
etcd_disk_backend_commit_duration_seconds_bucket{le="0.064"} 7
etcd_disk_backend_commit_duration_seconds_bucket{le="0.128"} 7
etcd_disk_backend_commit_duration_seconds_bucket{le="0.256"} 7
etcd_disk_backend_commit_duration_seconds_bucket{le="0.512"} 7
etcd_disk_backend_commit_duration_seconds_bucket{le="1.024"} 7
etcd_disk_backend_commit_duration_seconds_bucket{le="2.048"} 9
etcd_disk_backend_commit_duration_seconds_bucket{le="4.096"} 9
etcd_disk_backend_commit_duration_seconds_bucket{le="8.192"} 9
etcd_disk_backend_commit_duration_seconds_bucket{le="+Inf"} 9
etcd_disk_backend_commit_duration_seconds_sum 3.510260482
etcd_disk_backend_commit_duration_seconds_count 9
# HELP etcd_disk_wal_fsync_duration_seconds The latency distributions of fsync called by wal.
# TYPE etcd_disk_wal_fsync_duration_seconds histogram
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.001"} 0
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.002"} 0
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.004"} 194
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.008"} 265
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.016"} 787
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.032"} 948
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.064"} 969
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.128"} 979
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.256"} 1012
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.512"} 1472
etcd_disk_wal_fsync_duration_seconds_bucket{le="1.024"} 1826
etcd_disk_wal_fsync_duration_seconds_bucket{le="2.048"} 1844
etcd_disk_wal_fsync_duration_seconds_bucket{le="4.096"} 1844
etcd_disk_wal_fsync_duration_seconds_bucket{le="8.192"} 1844
etcd_disk_wal_fsync_duration_seconds_bucket{le="+Inf"} 1844
etcd_disk_wal_fsync_duration_seconds_sum 454.6347694839992
etcd_disk_wal_fsync_duration_seconds_count 1844
# HELP etcd_network_client_grpc_received_bytes_total The total number of bytes received from grpc clients.
# TYPE etcd_network_client_grpc_received_bytes_total counter
etcd_network_client_grpc_received_bytes_total 0
# HELP etcd_network_client_grpc_sent_bytes_total The total number of bytes sent to grpc clients.
# TYPE etcd_network_client_grpc_sent_bytes_total counter
etcd_network_client_grpc_sent_bytes_total 0
# HELP etcd_network_peer_received_bytes_total The total number of bytes received from peers.
# TYPE etcd_network_peer_received_bytes_total counter
etcd_network_peer_received_bytes_total{From="0"} 60872
etcd_network_peer_received_bytes_total{From="750a0818d24251ea"} 420718
etcd_network_peer_received_bytes_total{From="7fd45661ffe5ecae"} 373535
# HELP etcd_network_peer_round_trip_time_seconds Round-Trip-Time histogram between peers.
# TYPE etcd_network_peer_round_trip_time_seconds histogram
etcd_network_peer_round_trip_time_seconds_bucket{To="750a0818d24251ea",le="0.0001"} 2
etcd_network_peer_round_trip_time_seconds_bucket{To="750a0818d24251ea",le="0.0002"} 3
etcd_network_peer_round_trip_time_seconds_bucket{To="750a0818d24251ea",le="0.0004"} 29
etcd_network_peer_round_trip_time_seconds_bucket{To="750a0818d24251ea",le="0.0008"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="750a0818d24251ea",le="0.0016"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="750a0818d24251ea",le="0.0032"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="750a0818d24251ea",le="0.0064"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="750a0818d24251ea",le="0.0128"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="750a0818d24251ea",le="0.0256"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="750a0818d24251ea",le="0.0512"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="750a0818d24251ea",le="0.1024"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="750a0818d24251ea",le="0.2048"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="750a0818d24251ea",le="0.4096"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="750a0818d24251ea",le="0.8192"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="750a0818d24251ea",le="+Inf"} 33
etcd_network_peer_round_trip_time_seconds_sum{To="750a0818d24251ea"} 0.011009753
etcd_network_peer_round_trip_time_seconds_count{To="750a0818d24251ea"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="7fd45661ffe5ecae",le="0.0001"} 5
etcd_network_peer_round_trip_time_seconds_bucket{To="7fd45661ffe5ecae",le="0.0002"} 5
etcd_network_peer_round_trip_time_seconds_bucket{To="7fd45661ffe5ecae",le="0.0004"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="7fd45661ffe5ecae",le="0.0008"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="7fd45661ffe5ecae",le="0.0016"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="7fd45661ffe5ecae",le="0.0032"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="7fd45661ffe5ecae",le="0.0064"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="7fd45661ffe5ecae",le="0.0128"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="7fd45661ffe5ecae",le="0.0256"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="7fd45661ffe5ecae",le="0.0512"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="7fd45661ffe5ecae",le="0.1024"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="7fd45661ffe5ecae",le="0.2048"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="7fd45661ffe5ecae",le="0.4096"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="7fd45661ffe5ecae",le="0.8192"} 33
etcd_network_peer_round_trip_time_seconds_bucket{To="7fd45661ffe5ecae",le="+Inf"} 33
etcd_network_peer_round_trip_time_seconds_sum{To="7fd45661ffe5ecae"} 0.008218936000000001
etcd_network_peer_round_trip_time_seconds_count{To="7fd45661ffe5ecae"} 33
# HELP etcd_network_peer_sent_bytes_total The total number of bytes sent to peers.
# TYPE etcd_network_peer_sent_bytes_total counter
etcd_network_peer_sent_bytes_total{To="750a0818d24251ea"} 671675
etcd_network_peer_sent_bytes_total{To="7fd45661ffe5ecae"} 663454
# HELP etcd_server_has_leader Whether or not a leader exists. 1 is existence, 0 is not.
# TYPE etcd_server_has_leader gauge
etcd_server_has_leader 1
# HELP etcd_server_leader_changes_seen_total The number of leader changes seen.
# TYPE etcd_server_leader_changes_seen_total counter
etcd_server_leader_changes_seen_total 1
# HELP etcd_server_proposals_applied_total The total number of consensus proposals applied.
# TYPE etcd_server_proposals_applied_total gauge
etcd_server_proposals_applied_total 1796
# HELP etcd_server_proposals_committed_total The total number of consensus proposals committed.
# TYPE etcd_server_proposals_committed_total gauge
etcd_server_proposals_committed_total 1796
# HELP etcd_server_proposals_failed_total The total number of failed proposals seen.
# TYPE etcd_server_proposals_failed_total gauge
etcd_server_proposals_failed_total 11
# HELP etcd_server_proposals_pending The current number of pending proposals to commit.
# TYPE etcd_server_proposals_pending gauge
etcd_server_proposals_pending 0
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0.00023858200000000001
go_gc_duration_seconds{quantile="0.25"} 0.00028397700000000004
go_gc_duration_seconds{quantile="0.5"} 0.000332335
go_gc_duration_seconds{quantile="0.75"} 0.00044774100000000004
go_gc_duration_seconds{quantile="1"} 0.000575489
go_gc_duration_seconds_sum 0.004898619
go_gc_duration_seconds_count 13
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 115
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 1.932664e+07
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 8.3761e+07
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 1.461664e+06
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 977117
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 995328
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 1.932664e+07
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 7.421952e+06
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 2.138112e+07
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 48640
# HELP go_memstats_heap_released_bytes_total Total number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes_total counter
go_memstats_heap_released_bytes_total 5.169152e+06
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 2.8803072e+07
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.4738752873928081e+19
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 6251
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 1.025757e+06
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 9600
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 16384
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 107280
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 196608
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 3.2374586e+07
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 2.17788e+06
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 1.605632e+06
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 1.605632e+06
# HELP go_memstats_sys_bytes Number of bytes obtained by system. Sum of all system allocations.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 3.5256568e+07
# HELP http_request_duration_microseconds The HTTP request latencies in microseconds.
# TYPE http_request_duration_microseconds summary
http_request_duration_microseconds{handler="prometheus",quantile="0.5"} NaN
http_request_duration_microseconds{handler="prometheus",quantile="0.9"} NaN
http_request_duration_microseconds{handler="prometheus",quantile="0.99"} NaN
http_request_duration_microseconds_sum{handler="prometheus"} 0
http_request_duration_microseconds_count{handler="prometheus"} 0
# HELP http_request_size_bytes The HTTP request sizes in bytes.
# TYPE http_request_size_bytes summary
http_request_size_bytes{handler="prometheus",quantile="0.5"} NaN
http_request_size_bytes{handler="prometheus",quantile="0.9"} NaN
http_request_size_bytes{handler="prometheus",quantile="0.99"} NaN
http_request_size_bytes_sum{handler="prometheus"} 0
http_request_size_bytes_count{handler="prometheus"} 0
# HELP http_response_size_bytes The HTTP response sizes in bytes.
# TYPE http_response_size_bytes summary
http_response_size_bytes{handler="prometheus",quantile="0.5"} NaN
http_response_size_bytes{handler="prometheus",quantile="0.9"} NaN
http_response_size_bytes{handler="prometheus",quantile="0.99"} NaN
http_response_size_bytes_sum{handler="prometheus"} 0
http_response_size_bytes_count{handler="prometheus"} 0
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 12.56
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1024
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 38
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 3.1346688e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.47387432684e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.0793181184e+10
I recommend re-opening the issue for investigation. This very likely looks like a bug.
@easeway
Your disk has some issues
etcd_disk_backend_commit_duration_seconds_bucket{le="0.064"} 7
etcd_disk_backend_commit_duration_seconds_bucket{le="0.128"} 7
etcd_disk_backend_commit_duration_seconds_bucket{le="0.256"} 7
etcd_disk_backend_commit_duration_seconds_bucket{le="0.512"} 7
etcd_disk_backend_commit_duration_seconds_bucket{le="1.024"} 7
etcd_disk_backend_commit_duration_seconds_bucket{le="2.048"} 9
etcd_disk_backend_commit_duration_seconds_bucket{le="4.096"} 9
etcd_disk_backend_commit_duration_seconds_bucket{le="8.192"} 9
etcd_disk_backend_commit_duration_seconds_bucket{le="+Inf"} 9
fsync can take > 1 second. I cannot reproduce this on my slow MacBook Air.
How much data is written with each fsync? I tested fsync on a 200MB file and it took only ~700ms. Here's my Go test code:
package main

import (
	"crypto/rand"
	"fmt"
	"log"
	"os"
	"syscall"
	"time"
)

// bufSize is the number of random bytes written before the sync is timed.
const bufSize = 200 * 1024 * 1024

func main() {
	// Fill the buffer with random data.
	buf := make([]byte, bufSize)
	_, err := rand.Read(buf)
	if err != nil {
		log.Fatalln(err)
	}
	f, err := os.Create("test-fsync.bin")
	if err == nil {
		_, err = f.Write(buf)
	}
	if err == nil {
		// Time only the fdatasync call, not the write that precedes it.
		start := time.Now()
		err = syscall.Fdatasync(int(f.Fd()))
		dur := time.Since(start)
		fmt.Printf("Duration %v\n", dur)
		f.Close()
		os.Remove("test-fsync.bin")
	}
	if err != nil {
		log.Fatalln(err)
	}
}
I copied the fsync logic from the etcd source. When I run it, it shows:
Duration 692.479816ms
I installed Ubuntu Linux directly on a MacBook Pro (erasing macOS). I'm curious why etcd constantly reports such high fsync latency while all my other programs look normal and test normal.
I found that the smaller the file is, the longer fsync takes. I changed the file size to 200K, and now it takes
Duration 1.318288485s
@easeway Can you try doing that repeatedly, e.g. fsync in a loose for loop? I suspect your disk performance is not stable.
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.001"} 0
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.002"} 0
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.004"} 194
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.008"} 265
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.016"} 787
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.032"} 948
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.064"} 969
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.128"} 979
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.256"} 1012
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.512"} 1472
etcd_disk_wal_fsync_duration_seconds_bucket{le="1.024"} 1826
etcd_disk_wal_fsync_duration_seconds_bucket{le="2.048"} 1844
etcd_disk_wal_fsync_duration_seconds_bucket{le="4.096"} 1844
etcd_disk_wal_fsync_duration_seconds_bucket{le="8.192"} 1844
etcd_disk_wal_fsync_duration_seconds_bucket{le="+Inf"} 1844
The other fsync-related metrics also agree that the disk is sometimes slow (10%?).
@easeway I am not sure why. But this does not seem like an etcd issue.
@xiang90 Thanks for your quick response! I will look further.
Is there any specific reason to use syscall.Fdatasync instead of File.Sync? I tested with File.Sync, and it took less than 100ms with 200K of data.
@easeway Fdatasync only needs to sync down the file blocks. Sync will sync down extra file metadata (e.g., mtime). The performance with Fdatasync should be the same or better than Sync. What filesystem are you using? Does the SSD's SMART data show anything suspicious?
I understand Fdatasync should be faster. However, in my tests Fdatasync is always much slower than File.Sync. I didn't see anything suspicious in the SMART data.
Anyway, I will change to a different environment and test again.
Thanks again for your kind help!
What is a reasonable max value for the heartbeat timeout? I'm also getting this on a VM that is not doing much of anything. It is a three-node etcd cluster; each VM has 2 cores and 4 GB of RAM. Metrics look like:
etcd_wal_fsync_durations_seconds_bucket{le="0.001"} 0
etcd_wal_fsync_durations_seconds_bucket{le="0.002"} 0
etcd_wal_fsync_durations_seconds_bucket{le="0.004"} 0
etcd_wal_fsync_durations_seconds_bucket{le="0.008"} 0
etcd_wal_fsync_durations_seconds_bucket{le="0.016"} 95999
etcd_wal_fsync_durations_seconds_bucket{le="0.032"} 236506
etcd_wal_fsync_durations_seconds_bucket{le="0.064"} 260727
etcd_wal_fsync_durations_seconds_bucket{le="0.128"} 276851
etcd_wal_fsync_durations_seconds_bucket{le="0.256"} 287385
etcd_wal_fsync_durations_seconds_bucket{le="0.512"} 293483
etcd_wal_fsync_durations_seconds_bucket{le="1.024"} 296013
etcd_wal_fsync_durations_seconds_bucket{le="2.048"} 296559
etcd_wal_fsync_durations_seconds_bucket{le="4.096"} 296600
etcd_wal_fsync_durations_seconds_bucket{le="8.192"} 296603
etcd_wal_fsync_durations_seconds_bucket{le="+Inf"} 296603
I know these machines don't have the fastest disk but since I can't control that, what is a reasonable upper limit on timeout value? What happens if you go over that?
etcd_wal_fsync_durations_seconds_bucket{le="1.024"} 296013
etcd_wal_fsync_durations_seconds_bucket{le="2.048"} 296559
etcd_wal_fsync_durations_seconds_bucket{le="4.096"} 296600
etcd_wal_fsync_durations_seconds_bucket{le="8.192"} 296603
Quite a few fsyncs took more than 1 second. EBS should not be that slow. I think your etcd box is actually overloaded. Try switching to a VM with more cores.
@xiang90 Can you define overloaded? The machines aren't that beefy (2 core / 4 gigs ram) but Prometheus doesn't show anything interesting with node1/5/15.
And if I read that metric correctly, that's saying 2500 (out of 300K) transactions took more than .5 second to do an fsync. What are the consequences of the fsyncs taking that long?
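Because Prometheus histogram buckets are cumulative, the number of fsyncs slower than a bound is the `+Inf` count minus that bucket's count. The sketch below works this out for the figures quoted above; the `bucket` type and `slowerThan` helper are illustrative names, not part of etcd or Prometheus.

```go
package main

import "fmt"

// bucket is one cumulative histogram bucket: count is the number of
// observations with a value <= le (seconds).
type bucket struct {
	le    float64
	count uint64
}

// slowerThan returns how many observations exceeded the bound le. It assumes
// le matches one of the bucket bounds exactly; ok is false otherwise.
func slowerThan(buckets []bucket, total uint64, le float64) (n uint64, ok bool) {
	for _, b := range buckets {
		if b.le == le {
			return total - b.count, true
		}
	}
	return 0, false
}

func main() {
	// Values copied from the etcd_wal_fsync_durations_seconds output above.
	buckets := []bucket{
		{0.256, 287385}, {0.512, 293483}, {1.024, 296013},
		{2.048, 296559}, {4.096, 296600}, {8.192, 296603},
	}
	const total = 296603 // the le="+Inf" bucket
	for _, le := range []float64{0.512, 1.024} {
		n, _ := slowerThan(buckets, total, le)
		fmt.Printf("> %.3fs: %d of %d (%.2f%%)\n", le, n, total, 100*float64(n)/float64(total))
	}
}
```

For these numbers, 296603 - 293483 = 3120 fsyncs (about 1.05%) took longer than 0.512 s, and 296603 - 296013 = 590 took longer than 1.024 s.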
@matthughes
I hope this helps: https://github.com/coreos/etcd/blob/master/Documentation/faq.md#what-does-the-etcd-warning-failed-to-send-out-heartbeat-on-time-mean
Try to keep p99 below 30ms or so (it is more than 100ms on your current setup).
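To see where p99 falls for a histogram like the one quoted above, the target rank can be located in the cumulative buckets and interpolated linearly inside the containing bucket, which is the same idea Prometheus's `histogram_quantile` uses. This is a simplified sketch: the `bucket` type and `quantile` function are illustrative names, and the result is only an estimate because the true distribution inside a bucket is unknown.

```go
package main

import "fmt"

// bucket is one cumulative histogram bucket.
type bucket struct {
	le    float64 // upper bound in seconds
	count float64 // cumulative count of observations <= le
}

// quantile estimates the q-quantile (0 < q < 1) from cumulative buckets
// sorted by le, interpolating linearly inside the bucket that contains the
// target rank. total is the le="+Inf" count. A rank beyond the last finite
// bucket returns that bucket's bound.
func quantile(q float64, buckets []bucket, total float64) float64 {
	rank := q * total
	prevLe, prevCount := 0.0, 0.0
	for _, b := range buckets {
		if b.count >= rank {
			if b.count == prevCount {
				return b.le
			}
			return prevLe + (b.le-prevLe)*(rank-prevCount)/(b.count-prevCount)
		}
		prevLe, prevCount = b.le, b.count
	}
	return prevLe
}

func main() {
	// Values copied from the etcd_wal_fsync_durations_seconds output above.
	buckets := []bucket{
		{0.001, 0}, {0.002, 0}, {0.004, 0}, {0.008, 0},
		{0.016, 95999}, {0.032, 236506}, {0.064, 260727},
		{0.128, 276851}, {0.256, 287385}, {0.512, 293483},
		{1.024, 296013}, {2.048, 296559}, {4.096, 296600},
		{8.192, 296603},
	}
	fmt.Printf("estimated p99: %.3fs\n", quantile(0.99, buckets, 296603))
}
```

For these figures the 99th-percentile rank (0.99 × 296603 ≈ 293637) lands in the 0.512 s to 1.024 s bucket, so the estimate is a little over half a second, far above the suggested 30 ms.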
I hit the same issue. My disk is an SSD and CPU usage on the system is very low. The etcd version is 3.0.12. How should I analyze the metrics data below?
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="1"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="2"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="4"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="8"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="16"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="32"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="64"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="128"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="256"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="512"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="1024"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="2048"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="4096"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="+Inf"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_sum 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_count 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="100"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="400"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="800"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="1600"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="3200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="6400"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="12800"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="25600"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="51200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="102400"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="204800"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="409600"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="819200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="+Inf"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_sum 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_count 0
etcd_debugging_mvcc_db_total_size_in_bytes 0
etcd_debugging_mvcc_delete_total 0
etcd_debugging_mvcc_events_total 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="0.5"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="1"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="2"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="4"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="8"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="16"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="32"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="64"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="128"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="256"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="512"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="1024"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="+Inf"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_sum 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_count 0
etcd_debugging_mvcc_keys_total 0
etcd_debugging_mvcc_pending_events_total 0
etcd_debugging_mvcc_put_total 0
etcd_debugging_mvcc_range_total 0
etcd_debugging_mvcc_slow_watcher_total 0
etcd_debugging_mvcc_txn_total 0
etcd_debugging_mvcc_watch_stream_total 0
etcd_debugging_mvcc_watcher_total 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.001"} 60
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.002"} 60
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.004"} 60
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.008"} 60
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.016"} 60
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.032"} 60
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.064"} 60
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.128"} 60
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.256"} 60
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.512"} 60
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="1.024"} 60
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="2.048"} 60
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="4.096"} 60
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="8.192"} 60
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="+Inf"} 60
etcd_debugging_snap_save_marshalling_duration_seconds_sum 0.028723111000000003
etcd_debugging_snap_save_marshalling_duration_seconds_count 60
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.001"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.002"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.004"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.008"} 9
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.016"} 59
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.032"} 59
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.064"} 59
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.128"} 60
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.256"} 60
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.512"} 60
etcd_debugging_snap_save_total_duration_seconds_bucket{le="1.024"} 60
etcd_debugging_snap_save_total_duration_seconds_bucket{le="2.048"} 60
etcd_debugging_snap_save_total_duration_seconds_bucket{le="4.096"} 60
etcd_debugging_snap_save_total_duration_seconds_bucket{le="8.192"} 60
etcd_debugging_snap_save_total_duration_seconds_bucket{le="+Inf"} 60
etcd_debugging_snap_save_total_duration_seconds_sum 0.6176025580000001
etcd_debugging_snap_save_total_duration_seconds_count 60
etcd_debugging_store_expires_total 33
etcd_debugging_store_reads_total{action="get"} 90194
etcd_debugging_store_reads_total{action="getRecursive"} 100
etcd_debugging_store_watch_requests_total 11358
etcd_debugging_store_watchers 0
etcd_debugging_store_writes_total{action="compareAndSwap"} 151
etcd_debugging_store_writes_total{action="delete"} 1414
etcd_debugging_store_writes_total{action="set"} 3019
etcd_debugging_store_writes_total{action="update"} 320226
etcd_disk_backend_commit_duration_seconds_bucket{le="0.001"} 0
etcd_disk_backend_commit_duration_seconds_bucket{le="0.002"} 0
etcd_disk_backend_commit_duration_seconds_bucket{le="0.004"} 69
etcd_disk_backend_commit_duration_seconds_bucket{le="0.008"} 141
etcd_disk_backend_commit_duration_seconds_bucket{le="0.016"} 144
etcd_disk_backend_commit_duration_seconds_bucket{le="0.032"} 144
etcd_disk_backend_commit_duration_seconds_bucket{le="0.064"} 146
etcd_disk_backend_commit_duration_seconds_bucket{le="0.128"} 146
etcd_disk_backend_commit_duration_seconds_bucket{le="0.256"} 146
etcd_disk_backend_commit_duration_seconds_bucket{le="0.512"} 146
etcd_disk_backend_commit_duration_seconds_bucket{le="1.024"} 146
etcd_disk_backend_commit_duration_seconds_bucket{le="2.048"} 146
etcd_disk_backend_commit_duration_seconds_bucket{le="4.096"} 146
etcd_disk_backend_commit_duration_seconds_bucket{le="8.192"} 146
etcd_disk_backend_commit_duration_seconds_bucket{le="+Inf"} 146
etcd_disk_backend_commit_duration_seconds_sum 0.7025643579999997
etcd_disk_backend_commit_duration_seconds_count 146
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.001"} 0
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.002"} 225231
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.004"} 571608
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.008"} 594081
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.016"} 595201
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.032"} 597843
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.064"} 599720
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.128"} 600062
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.256"} 600138
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.512"} 600192
etcd_disk_wal_fsync_duration_seconds_bucket{le="1.024"} 600212
etcd_disk_wal_fsync_duration_seconds_bucket{le="2.048"} 600218
etcd_disk_wal_fsync_duration_seconds_bucket{le="4.096"} 600219
etcd_disk_wal_fsync_duration_seconds_bucket{le="8.192"} 600219
etcd_disk_wal_fsync_duration_seconds_bucket{le="+Inf"} 600219
etcd_disk_wal_fsync_duration_seconds_sum 1570.179954888027
etcd_disk_wal_fsync_duration_seconds_count 600219
etcd_http_failed_total{code="400",method="GET"} 8110
etcd_http_failed_total{code="403",method="DELETE"} 14
etcd_http_failed_total{code="404",method="DELETE"} 59
etcd_http_failed_total{code="404",method="GET"} 299
etcd_http_failed_total{code="500",method="PUT"} 13
etcd_http_received_total{method="DELETE"} 406
etcd_http_received_total{method="GET"} 997
etcd_http_received_total{method="PUT"} 76533
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.0005"} 0
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.001"} 0
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.002"} 0
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.004"} 83
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.008"} 328
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.016"} 330
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.032"} 330
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.064"} 330
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.128"} 332
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.256"} 333
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.512"} 333
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="1.024"} 333
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="2.048"} 333
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="+Inf"} 333
etcd_http_successful_duration_seconds_sum{method="DELETE"} 2.140354297999999
etcd_http_successful_duration_seconds_count{method="DELETE"} 333
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.0005"} 698
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.001"} 698
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.002"} 698
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.004"} 698
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.008"} 698
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.016"} 698
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.032"} 698
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.064"} 698
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.128"} 698
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.256"} 698
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.512"} 698
etcd_http_successful_duration_seconds_bucket{method="GET",le="1.024"} 698
etcd_http_successful_duration_seconds_bucket{method="GET",le="2.048"} 698
etcd_http_successful_duration_seconds_bucket{method="GET",le="+Inf"} 698
etcd_http_successful_duration_seconds_sum{method="GET"} 0.037799214000000005
etcd_http_successful_duration_seconds_count{method="GET"} 698
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.0005"} 0
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.001"} 0
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.002"} 0
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.004"} 16408
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.008"} 73751
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.016"} 75499
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.032"} 75879
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.064"} 76201
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.128"} 76372
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.256"} 76448
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.512"} 76493
etcd_http_successful_duration_seconds_bucket{method="PUT",le="1.024"} 76519
etcd_http_successful_duration_seconds_bucket{method="PUT",le="2.048"} 76520
etcd_http_successful_duration_seconds_bucket{method="PUT",le="+Inf"} 76520
etcd_http_successful_duration_seconds_sum{method="PUT"} 446.16455357700147
etcd_http_successful_duration_seconds_count{method="PUT"} 76520
etcd_network_client_grpc_received_bytes_total 0
etcd_network_client_grpc_sent_bytes_total 0
etcd_network_peer_received_bytes_total{From="0"} 9.308544e+06
etcd_network_peer_received_bytes_total{From="6b1e7ec3a3fb02c3"} 1.19770393e+08
etcd_network_peer_received_bytes_total{From="f221546ff8120149"} 6.3104227e+07
etcd_network_peer_round_trip_time_seconds_bucket{To="6b1e7ec3a3fb02c3",le="0.0001"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="6b1e7ec3a3fb02c3",le="0.0002"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="6b1e7ec3a3fb02c3",le="0.0004"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="6b1e7ec3a3fb02c3",le="0.0008"} 1
etcd_network_peer_round_trip_time_seconds_bucket{To="6b1e7ec3a3fb02c3",le="0.0016"} 3435
etcd_network_peer_round_trip_time_seconds_bucket{To="6b1e7ec3a3fb02c3",le="0.0032"} 4642
etcd_network_peer_round_trip_time_seconds_bucket{To="6b1e7ec3a3fb02c3",le="0.0064"} 4642
etcd_network_peer_round_trip_time_seconds_bucket{To="6b1e7ec3a3fb02c3",le="0.0128"} 4643
etcd_network_peer_round_trip_time_seconds_bucket{To="6b1e7ec3a3fb02c3",le="0.0256"} 4677
etcd_network_peer_round_trip_time_seconds_bucket{To="6b1e7ec3a3fb02c3",le="0.0512"} 4710
etcd_network_peer_round_trip_time_seconds_bucket{To="6b1e7ec3a3fb02c3",le="0.1024"} 4710
etcd_network_peer_round_trip_time_seconds_bucket{To="6b1e7ec3a3fb02c3",le="0.2048"} 4710
etcd_network_peer_round_trip_time_seconds_bucket{To="6b1e7ec3a3fb02c3",le="0.4096"} 4710
etcd_network_peer_round_trip_time_seconds_bucket{To="6b1e7ec3a3fb02c3",le="0.8192"} 4710
etcd_network_peer_round_trip_time_seconds_bucket{To="6b1e7ec3a3fb02c3",le="+Inf"} 4710
etcd_network_peer_round_trip_time_seconds_sum{To="6b1e7ec3a3fb02c3"} 7.783577236999999
etcd_network_peer_round_trip_time_seconds_count{To="6b1e7ec3a3fb02c3"} 4710
etcd_network_peer_round_trip_time_seconds_bucket{To="f221546ff8120149",le="0.0001"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="f221546ff8120149",le="0.0002"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="f221546ff8120149",le="0.0004"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="f221546ff8120149",le="0.0008"} 1490
etcd_network_peer_round_trip_time_seconds_bucket{To="f221546ff8120149",le="0.0016"} 3463
etcd_network_peer_round_trip_time_seconds_bucket{To="f221546ff8120149",le="0.0032"} 4623
etcd_network_peer_round_trip_time_seconds_bucket{To="f221546ff8120149",le="0.0064"} 4624
etcd_network_peer_round_trip_time_seconds_bucket{To="f221546ff8120149",le="0.0128"} 4625
etcd_network_peer_round_trip_time_seconds_bucket{To="f221546ff8120149",le="0.0256"} 4626
etcd_network_peer_round_trip_time_seconds_bucket{To="f221546ff8120149",le="0.0512"} 4630
etcd_network_peer_round_trip_time_seconds_bucket{To="f221546ff8120149",le="0.1024"} 4690
etcd_network_peer_round_trip_time_seconds_bucket{To="f221546ff8120149",le="0.2048"} 4695
etcd_network_peer_round_trip_time_seconds_bucket{To="f221546ff8120149",le="0.4096"} 4710
etcd_network_peer_round_trip_time_seconds_bucket{To="f221546ff8120149",le="0.8192"} 4710
etcd_network_peer_round_trip_time_seconds_bucket{To="f221546ff8120149",le="+Inf"} 4710
etcd_network_peer_round_trip_time_seconds_sum{To="f221546ff8120149"} 13.468057980999994
etcd_network_peer_round_trip_time_seconds_count{To="f221546ff8120149"} 4710
etcd_network_peer_sent_bytes_total{To="6b1e7ec3a3fb02c3"} 1.28569087e+08
etcd_network_peer_sent_bytes_total{To="f221546ff8120149"} 8.8763795e+07
etcd_server_has_leader 1
etcd_server_leader_changes_seen_total 53
etcd_server_proposals_applied_total 1.126558e+06
etcd_server_proposals_committed_total 1.126558e+06
etcd_server_proposals_failed_total 13
etcd_server_proposals_pending 0
go_gc_duration_seconds{quantile="0"} 0.00015746
go_gc_duration_seconds{quantile="0.25"} 0.000193942
go_gc_duration_seconds{quantile="0.5"} 0.000226302
go_gc_duration_seconds{quantile="0.75"} 0.000335172
go_gc_duration_seconds{quantile="1"} 0.007048052000000001
go_gc_duration_seconds_sum 0.770003988
go_gc_duration_seconds_count 1218
go_goroutines 206
go_memstats_alloc_bytes 2.4550608e+07
go_memstats_alloc_bytes_total 9.800526208e+09
go_memstats_buck_hash_sys_bytes 1.611416e+06
go_memstats_frees_total 1.27239757e+08
go_memstats_gc_sys_bytes 1.583104e+06
go_memstats_heap_alloc_bytes 2.4550608e+07
go_memstats_heap_idle_bytes 2.0676608e+07
go_memstats_heap_inuse_bytes 2.7164672e+07
go_memstats_heap_objects 95306
go_memstats_heap_released_bytes_total 2.0267008e+07
go_memstats_heap_sys_bytes 4.784128e+07
go_memstats_last_gc_time_seconds 1.4851646956228925e+19
go_memstats_lookups_total 458351
go_memstats_mallocs_total 1.27335063e+08
go_memstats_mcache_inuse_bytes 2400
go_memstats_mcache_sys_bytes 16384
go_memstats_mspan_inuse_bytes 180480
go_memstats_mspan_sys_bytes 344064
go_memstats_next_gc_bytes 3.8949456e+07
go_memstats_other_sys_bytes 858720
go_memstats_stack_inuse_bytes 1.572864e+06
go_memstats_stack_sys_bytes 1.572864e+06
go_memstats_sys_bytes 5.3827832e+07
http_request_duration_microseconds{handler="prometheus",quantile="0.5"} 1532.788
http_request_duration_microseconds{handler="prometheus",quantile="0.9"} 1593.581
http_request_duration_microseconds{handler="prometheus",quantile="0.99"} 1593.581
http_request_duration_microseconds_sum{handler="prometheus"} 22243.968
http_request_duration_microseconds_count{handler="prometheus"} 6
http_request_size_bytes{handler="prometheus",quantile="0.5"} 63
http_request_size_bytes{handler="prometheus",quantile="0.9"} 63
http_request_size_bytes{handler="prometheus",quantile="0.99"} 63
http_request_size_bytes_sum{handler="prometheus"} 378
http_request_size_bytes_count{handler="prometheus"} 6
http_requests_total{code="200",handler="prometheus",method="get"} 6
http_response_size_bytes{handler="prometheus",quantile="0.5"} 29225
http_response_size_bytes{handler="prometheus",quantile="0.9"} 29244
http_response_size_bytes{handler="prometheus",quantile="0.99"} 29244
http_response_size_bytes_sum{handler="prometheus"} 175228
http_response_size_bytes_count{handler="prometheus"} 6
process_cpu_seconds_total 1263.23
process_max_fds 1024
process_open_fds 57
process_resident_memory_bytes 6.0010496e+07
process_start_time_seconds 1.48502346095e+09
process_virtual_memory_bytes 1.0811764736e+10
Hi,
I then switched to an HDD disk in another environment, and the metrics show a similar result.
Could you kindly help me analyze the data:
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="1"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="2"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="4"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="8"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="16"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="32"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="64"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="128"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="256"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="512"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="1024"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="2048"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="4096"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="+Inf"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_sum 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_count 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="100"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="400"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="800"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="1600"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="3200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="6400"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="12800"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="25600"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="51200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="102400"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="204800"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="409600"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="819200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="+Inf"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_sum 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_count 0
etcd_debugging_mvcc_db_total_size_in_bytes 0
etcd_debugging_mvcc_delete_total 0
etcd_debugging_mvcc_events_total 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="0.5"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="1"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="2"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="4"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="8"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="16"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="32"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="64"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="128"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="256"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="512"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="1024"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="+Inf"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_sum 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_count 0
etcd_debugging_mvcc_keys_total 0
etcd_debugging_mvcc_pending_events_total 0
etcd_debugging_mvcc_put_total 0
etcd_debugging_mvcc_range_total 0
etcd_debugging_mvcc_slow_watcher_total 0
etcd_debugging_mvcc_txn_total 0
etcd_debugging_mvcc_watch_stream_total 0
etcd_debugging_mvcc_watcher_total 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.001"} 54
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.002"} 54
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.004"} 54
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.008"} 54
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.016"} 54
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.032"} 54
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.064"} 54
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.128"} 54
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.256"} 54
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.512"} 54
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="1.024"} 54
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="2.048"} 54
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="4.096"} 54
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="8.192"} 54
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="+Inf"} 54
etcd_debugging_snap_save_marshalling_duration_seconds_sum 0.022645302000000003
etcd_debugging_snap_save_marshalling_duration_seconds_count 54
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.001"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.002"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.004"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.008"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.016"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.032"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.064"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.128"} 17
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.256"} 46
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.512"} 53
etcd_debugging_snap_save_total_duration_seconds_bucket{le="1.024"} 54
etcd_debugging_snap_save_total_duration_seconds_bucket{le="2.048"} 54
etcd_debugging_snap_save_total_duration_seconds_bucket{le="4.096"} 54
etcd_debugging_snap_save_total_duration_seconds_bucket{le="8.192"} 54
etcd_debugging_snap_save_total_duration_seconds_bucket{le="+Inf"} 54
etcd_debugging_snap_save_total_duration_seconds_sum 10.340719423000001
etcd_debugging_snap_save_total_duration_seconds_count 54
etcd_debugging_store_expires_total 2
etcd_debugging_store_reads_total{action="get"} 104186
etcd_debugging_store_reads_total{action="getRecursive"} 8
etcd_debugging_store_watch_requests_total 2551
etcd_debugging_store_watchers 3
etcd_debugging_store_writes_total{action="compareAndSwap"} 2
etcd_debugging_store_writes_total{action="create"} 3
etcd_debugging_store_writes_total{action="delete"} 2
etcd_debugging_store_writes_total{action="set"} 1023
etcd_debugging_store_writes_total{action="update"} 246400
etcd_disk_backend_commit_duration_seconds_bucket{le="0.001"} 0
etcd_disk_backend_commit_duration_seconds_bucket{le="0.002"} 0
etcd_disk_backend_commit_duration_seconds_bucket{le="0.004"} 0
etcd_disk_backend_commit_duration_seconds_bucket{le="0.008"} 0
etcd_disk_backend_commit_duration_seconds_bucket{le="0.016"} 0
etcd_disk_backend_commit_duration_seconds_bucket{le="0.032"} 0
etcd_disk_backend_commit_duration_seconds_bucket{le="0.064"} 3
etcd_disk_backend_commit_duration_seconds_bucket{le="0.128"} 46
etcd_disk_backend_commit_duration_seconds_bucket{le="0.256"} 61
etcd_disk_backend_commit_duration_seconds_bucket{le="0.512"} 62
etcd_disk_backend_commit_duration_seconds_bucket{le="1.024"} 62
etcd_disk_backend_commit_duration_seconds_bucket{le="2.048"} 62
etcd_disk_backend_commit_duration_seconds_bucket{le="4.096"} 62
etcd_disk_backend_commit_duration_seconds_bucket{le="8.192"} 62
etcd_disk_backend_commit_duration_seconds_bucket{le="+Inf"} 62
etcd_disk_backend_commit_duration_seconds_sum 7.031537422000001
etcd_disk_backend_commit_duration_seconds_count 62
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.001"} 0
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.002"} 0
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.004"} 0
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.008"} 0
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.016"} 85
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.032"} 113280
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.064"} 481014
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.128"} 539179
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.256"} 544414
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.512"} 547215
etcd_disk_wal_fsync_duration_seconds_bucket{le="1.024"} 547269
etcd_disk_wal_fsync_duration_seconds_bucket{le="2.048"} 547271
etcd_disk_wal_fsync_duration_seconds_bucket{le="4.096"} 547271
etcd_disk_wal_fsync_duration_seconds_bucket{le="8.192"} 547271
etcd_disk_wal_fsync_duration_seconds_bucket{le="+Inf"} 547271
etcd_disk_wal_fsync_duration_seconds_sum 25466.07978440011
etcd_disk_wal_fsync_duration_seconds_count 547271
etcd_http_failed_total{code="404",method="GET"} 118
etcd_http_received_total{method="DELETE"} 2
etcd_http_received_total{method="GET"} 1393
etcd_http_received_total{method="PUT"} 98850
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.0005"} 0
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.001"} 0
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.002"} 0
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.004"} 0
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.008"} 0
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.016"} 0
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.032"} 0
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.064"} 1
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.128"} 2
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.256"} 2
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="0.512"} 2
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="1.024"} 2
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="2.048"} 2
etcd_http_successful_duration_seconds_bucket{method="DELETE",le="+Inf"} 2
etcd_http_successful_duration_seconds_sum{method="DELETE"} 0.149054029
etcd_http_successful_duration_seconds_count{method="DELETE"} 2
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.0005"} 1275
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.001"} 1275
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.002"} 1275
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.004"} 1275
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.008"} 1275
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.016"} 1275
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.032"} 1275
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.064"} 1275
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.128"} 1275
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.256"} 1275
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.512"} 1275
etcd_http_successful_duration_seconds_bucket{method="GET",le="1.024"} 1275
etcd_http_successful_duration_seconds_bucket{method="GET",le="2.048"} 1275
etcd_http_successful_duration_seconds_bucket{method="GET",le="+Inf"} 1275
etcd_http_successful_duration_seconds_sum{method="GET"} 0.06790619099999984
etcd_http_successful_duration_seconds_count{method="GET"} 1275
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.0005"} 0
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.001"} 0
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.002"} 0
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.004"} 0
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.008"} 0
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.016"} 0
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.032"} 8666
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.064"} 69353
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.128"} 94777
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.256"} 97784
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.512"} 98781
etcd_http_successful_duration_seconds_bucket{method="PUT",le="1.024"} 98847
etcd_http_successful_duration_seconds_bucket{method="PUT",le="2.048"} 98849
etcd_http_successful_duration_seconds_bucket{method="PUT",le="+Inf"} 98850
etcd_http_successful_duration_seconds_sum{method="PUT"} 5914.484009911007
etcd_http_successful_duration_seconds_count{method="PUT"} 98850
etcd_network_client_grpc_received_bytes_total 0
etcd_network_client_grpc_sent_bytes_total 0
etcd_network_peer_received_bytes_total{From="0"} 1.01416e+07
etcd_network_peer_received_bytes_total{From="623d1c8a341085da"} 9.7989616e+07
etcd_network_peer_received_bytes_total{From="cb6d24e35038f349"} 9.419587e+07
etcd_network_peer_round_trip_time_seconds_bucket{To="623d1c8a341085da",le="0.0001"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="623d1c8a341085da",le="0.0002"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="623d1c8a341085da",le="0.0004"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="623d1c8a341085da",le="0.0008"} 1
etcd_network_peer_round_trip_time_seconds_bucket{To="623d1c8a341085da",le="0.0016"} 4489
etcd_network_peer_round_trip_time_seconds_bucket{To="623d1c8a341085da",le="0.0032"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="623d1c8a341085da",le="0.0064"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="623d1c8a341085da",le="0.0128"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="623d1c8a341085da",le="0.0256"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="623d1c8a341085da",le="0.0512"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="623d1c8a341085da",le="0.1024"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="623d1c8a341085da",le="0.2048"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="623d1c8a341085da",le="0.4096"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="623d1c8a341085da",le="0.8192"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="623d1c8a341085da",le="+Inf"} 5030
etcd_network_peer_round_trip_time_seconds_sum{To="623d1c8a341085da"} 5.854275826999988
etcd_network_peer_round_trip_time_seconds_count{To="623d1c8a341085da"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="cb6d24e35038f349",le="0.0001"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="cb6d24e35038f349",le="0.0002"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="cb6d24e35038f349",le="0.0004"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="cb6d24e35038f349",le="0.0008"} 202
etcd_network_peer_round_trip_time_seconds_bucket{To="cb6d24e35038f349",le="0.0016"} 2676
etcd_network_peer_round_trip_time_seconds_bucket{To="cb6d24e35038f349",le="0.0032"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="cb6d24e35038f349",le="0.0064"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="cb6d24e35038f349",le="0.0128"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="cb6d24e35038f349",le="0.0256"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="cb6d24e35038f349",le="0.0512"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="cb6d24e35038f349",le="0.1024"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="cb6d24e35038f349",le="0.2048"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="cb6d24e35038f349",le="0.4096"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="cb6d24e35038f349",le="0.8192"} 5030
etcd_network_peer_round_trip_time_seconds_bucket{To="cb6d24e35038f349",le="+Inf"} 5030
etcd_network_peer_round_trip_time_seconds_sum{To="cb6d24e35038f349"} 7.6719413499999956
etcd_network_peer_round_trip_time_seconds_count{To="cb6d24e35038f349"} 5030
etcd_network_peer_sent_bytes_total{To="623d1c8a341085da"} 1.37519233e+08
etcd_network_peer_sent_bytes_total{To="cb6d24e35038f349"} 1.38664671e+08
etcd_server_has_leader 1
etcd_server_leader_changes_seen_total 2
etcd_server_proposals_applied_total 549590
etcd_server_proposals_committed_total 549590
etcd_server_proposals_failed_total 0
etcd_server_proposals_pending 0
go_gc_duration_seconds{quantile="0"} 0.000102541
go_gc_duration_seconds{quantile="0.25"} 0.000170736
go_gc_duration_seconds{quantile="0.5"} 0.00031394100000000004
go_gc_duration_seconds{quantile="0.75"} 0.00037800100000000004
go_gc_duration_seconds{quantile="1"} 0.003890422
go_gc_duration_seconds_sum 0.46006146900000006
go_gc_duration_seconds_count 1261
go_goroutines 205
go_memstats_alloc_bytes 2.70812e+07
go_memstats_alloc_bytes_total 1.0358015488e+10
go_memstats_buck_hash_sys_bytes 1.569128e+06
go_memstats_frees_total 1.33924029e+08
go_memstats_gc_sys_bytes 1.419264e+06
go_memstats_heap_alloc_bytes 2.70812e+07
go_memstats_heap_idle_bytes 1.3205504e+07
go_memstats_heap_inuse_bytes 2.9196288e+07
go_memstats_heap_objects 116215
go_memstats_heap_released_bytes_total 1.2320768e+07
go_memstats_heap_sys_bytes 4.2401792e+07
go_memstats_last_gc_time_seconds 1.4852223399318383e+19
go_memstats_lookups_total 485447
go_memstats_mallocs_total 1.34040244e+08
go_memstats_mcache_inuse_bytes 4800
go_memstats_mcache_sys_bytes 16384
go_memstats_mspan_inuse_bytes 199200
go_memstats_mspan_sys_bytes 311296
go_memstats_next_gc_bytes 3.9975296e+07
go_memstats_other_sys_bytes 921488
go_memstats_stack_inuse_bytes 1.6384e+06
go_memstats_stack_sys_bytes 1.6384e+06
go_memstats_sys_bytes 4.8277752e+07
http_request_duration_microseconds{handler="prometheus",quantile="0.5"} NaN
http_request_duration_microseconds{handler="prometheus",quantile="0.9"} NaN
http_request_duration_microseconds{handler="prometheus",quantile="0.99"} NaN
http_request_duration_microseconds_sum{handler="prometheus"} 0
http_request_duration_microseconds_count{handler="prometheus"} 0
http_request_size_bytes{handler="prometheus",quantile="0.5"} NaN
http_request_size_bytes{handler="prometheus",quantile="0.9"} NaN
http_request_size_bytes{handler="prometheus",quantile="0.99"} NaN
http_request_size_bytes_sum{handler="prometheus"} 0
http_request_size_bytes_count{handler="prometheus"} 0
http_response_size_bytes{handler="prometheus",quantile="0.5"} NaN
http_response_size_bytes{handler="prometheus",quantile="0.9"} NaN
http_response_size_bytes{handler="prometheus",quantile="0.99"} NaN
http_response_size_bytes_sum{handler="prometheus"} 0
http_response_size_bytes_count{handler="prometheus"} 0
process_cpu_seconds_total 1088.62
process_max_fds 1024
process_open_fds 66
process_resident_memory_bytes 3.8129664e+07
process_start_time_seconds 1.48507151709e+09
process_virtual_memory_bytes 1.0806214656e+10
Hi,
I wonder whether the elapsed-time calculation is based on the system's real (wall-clock) time. If so, a jump in system time caused by an NTP sync would make the calculation incorrect. Could this be one possible cause?
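To illustrate the concern: a duration computed from wall-clock timestamps changes if the clock is stepped in the middle of the measurement, while one taken from a monotonic clock does not. Below is a minimal Python sketch of the difference. It only illustrates the general idea and makes no claim about how etcd itself measures the heartbeat delay; `measure` is a hypothetical helper.

```python
import time

def measure(work):
    """Run work() and return (wall_elapsed, monotonic_elapsed) in seconds."""
    wall_start = time.time()        # wall clock: can jump when NTP steps the time
    mono_start = time.monotonic()   # monotonic clock: never goes backwards
    work()
    wall_elapsed = time.time() - wall_start
    mono_elapsed = time.monotonic() - mono_start
    return wall_elapsed, mono_elapsed

wall, mono = measure(lambda: time.sleep(0.05))
print(f"wall-clock: {wall:.3f}s  monotonic: {mono:.3f}s")
```

On a quiet system both numbers agree. They diverge only if the system time is adjusted while `work()` runs.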
@nenzhang can you please provide the server logs?
I looked at the metrics; almost all fsyncs took more than 10ms. If you run an old version of etcd, it will complain. We loosened the timeout to 100ms to make etcd complain less on slow HDDs.
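For reference, a rough way to read such a histogram is sketched below in Python, using the `etcd_disk_wal_fsync_duration_seconds` values from the HDD dump above. The helper name `summarize_histogram` is made up for illustration. Because the buckets have no 10ms bound, the sketch uses the neighbouring 16ms bound as the threshold.

```python
def summarize_histogram(buckets, total_sum, threshold):
    """Summarize a Prometheus-style cumulative histogram.

    buckets: list of (upper_bound_seconds, cumulative_count), sorted by bound,
             with the +Inf bucket last.
    Returns (mean_seconds, fraction_of_samples_slower_than_threshold).
    """
    count = buckets[-1][1]          # the +Inf bucket holds the total count
    mean = total_sum / count
    at_or_below = 0
    for bound, cumulative in buckets:
        if bound <= threshold:      # keep the largest bound not above threshold
            at_or_below = cumulative
    return mean, 1 - at_or_below / count

# Values copied from the HDD metrics dump in this thread.
wal_fsync_buckets = [
    (0.001, 0), (0.002, 0), (0.004, 0), (0.008, 0), (0.016, 85),
    (0.032, 113280), (0.064, 481014), (0.128, 539179), (0.256, 544414),
    (0.512, 547215), (1.024, 547269), (2.048, 547271), (4.096, 547271),
    (8.192, 547271), (float("inf"), 547271),
]
mean, slower = summarize_histogram(wal_fsync_buckets, 25466.07978440011, 0.016)
print(f"mean fsync: {mean * 1000:.1f} ms, slower than 16 ms: {slower:.2%}")
```

The same function can be applied to `etcd_disk_backend_commit_duration_seconds` or the peer round-trip histogram by swapping in their bucket values.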
Hi Xiang,
Thanks. But in the previous SSD environment we also saw many warning logs, with the heartbeat interval configured to 200ms. Could you kindly help check the reason? Is it really a disk issue?
What do you mean by loosening the timeout to 100ms? I want to know whether the disk is really too slow for etcd and affects etcd performance, that is, whether it also affects how long etcd takes to handle requests from our functions.
Hi,
OK, here is the log collected from the SSD environment:
journalctl -b |grep -i etcd
Jan 24 10:35:00 SN-2 etcd[1609]: data dir = /mnt/etcd/SN-2
Jan 24 10:35:00 SN-2 etcd[1609]: member dir = /mnt/etcd/SN-2/member
Jan 24 10:35:00 SN-2 etcd[1609]: heartbeat = 200ms
Jan 24 10:35:00 SN-2 etcd[1609]: election = 1000ms
Jan 24 10:35:00 SN-2 etcd[1609]: snapshot count = 10000
Jan 24 10:35:00 SN-2 etcd[1609]: advertise client URLs = http://169.254.0.20:2379
Jan 24 10:35:00 SN-2 etcd[1609]: initial advertise peer URLs = http://169.254.0.20:2380
Jan 24 10:35:00 SN-2 etcd[1609]: initial cluster = SN-2=http://169.254.0.20:2380
Jan 24 10:35:00 SN-2 etcd[1609]: starting member 2eb57a48c2641c1d in cluster 53a2a5178aad20e1
Jan 24 10:35:00 SN-2 etcd[1609]: 2eb57a48c2641c1d became follower at term 0
Jan 24 10:35:00 SN-2 etcd[1609]: newRaft 2eb57a48c2641c1d [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
Jan 24 10:35:00 SN-2 etcd[1609]: 2eb57a48c2641c1d became follower at term 1
Jan 24 10:35:00 SN-2 etcd[1609]: starting server... [version: 3.0.12, cluster version: to_be_decided]
Jan 24 10:35:00 SN-2 etcd[1609]: added member 2eb57a48c2641c1d [http://169.254.0.20:2380] to cluster 53a2a5178aad20e1
Jan 24 10:35:00 SN-2 etcd[1609]: apply entries took too long [12.631153ms for 1 entries]
Jan 24 10:35:00 SN-2 etcd[1609]: avoid queries with large range/delete range!
Jan 24 10:35:01 SN-2 etcd[1609]: 2eb57a48c2641c1d is starting a new election at term 1
Jan 24 10:35:01 SN-2 etcd[1609]: 2eb57a48c2641c1d became candidate at term 2
Jan 24 10:35:01 SN-2 etcd[1609]: 2eb57a48c2641c1d received vote from 2eb57a48c2641c1d at term 2
Jan 24 10:35:01 SN-2 etcd[1609]: 2eb57a48c2641c1d became leader at term 2
Jan 24 10:35:01 SN-2 etcd[1609]: raft.node: 2eb57a48c2641c1d elected leader 2eb57a48c2641c1d at term 2
.....
Jan 24 14:38:41 SN-2 etcd[1609]: 2eb57a48c2641c1d is starting a new election at term 70
Jan 24 14:38:41 SN-2 etcd[1609]: 2eb57a48c2641c1d became candidate at term 71
Jan 24 14:38:41 SN-2 etcd[1609]: 2eb57a48c2641c1d received vote from 2eb57a48c2641c1d at term 71
Jan 24 14:38:41 SN-2 etcd[1609]: 2eb57a48c2641c1d [logterm: 67, index: 62340] sent vote request to eb6038ffd8be1f89 at term 71
Jan 24 14:38:41 SN-2 etcd[1609]: 2eb57a48c2641c1d [logterm: 67, index: 62340] sent vote request to 832b962ea502f7d0 at term 71
Jan 24 14:38:41 SN-2 etcd[1609]: 2eb57a48c2641c1d received vote from eb6038ffd8be1f89 at term 71
Jan 24 14:38:41 SN-2 etcd[1609]: 2eb57a48c2641c1d [quorum:2] has received 2 votes and 0 vote rejections
Jan 24 14:38:41 SN-2 etcd[1609]: 2eb57a48c2641c1d became leader at term 71
Jan 24 14:38:41 SN-2 etcd[1609]: raft.node: 2eb57a48c2641c1d elected leader 2eb57a48c2641c1d at term 71
Jan 24 14:42:31 SN-2 etcd[1609]: apply entries took too long [13.186775ms for 1 entries]
Jan 24 14:42:31 SN-2 etcd[1609]: avoid queries with large range/delete range!
Jan 24 14:45:48 SN-2 etcd[1609]: start to snapshot (applied: 70007, lastsnap: 60006)
Jan 24 14:45:48 SN-2 etcd[1609]: saved snapshot at index 70007
Jan 24 14:45:48 SN-2 etcd[1609]: compacted raft log at 65007
Jan 24 14:46:01 SN-2 etcd[1609]: purged file /mnt/etcd/SN-2/member/snap/0000000000000043-000000000000ea66.snap successfully
Jan 24 14:49:00 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 3.618403ms)
Jan 24 14:49:00 SN-2 etcd[1609]: server is likely overloaded
Jan 24 14:49:00 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 3.671987ms)
Jan 24 14:49:00 SN-2 etcd[1609]: server is likely overloaded
Jan 24 14:52:02 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 11.549135ms)
Jan 24 14:52:02 SN-2 etcd[1609]: server is likely overloaded
Jan 24 14:52:02 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 11.616418ms)
Jan 24 14:52:02 SN-2 etcd[1609]: server is likely overloaded
Jan 24 14:54:58 SN-2 etcd[1609]: start to snapshot (applied: 80008, lastsnap: 70007)
Jan 24 14:54:58 SN-2 etcd[1609]: saved snapshot at index 80008
Jan 24 14:54:58 SN-2 etcd[1609]: compacted raft log at 75008
Jan 24 14:55:01 SN-2 etcd[1609]: purged file /mnt/etcd/SN-2/member/snap/0000000000000047-0000000000011177.snap successfully
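To see how far past the deadline each heartbeat was, the warnings can be pulled out of the journal with a small script. The Python sketch below assumes the message format shown in the log above and handles only single-unit durations (`µs`, `ms`, `s`); `heartbeat_overshoots` is a hypothetical helper name.

```python
import re

# Duration units that may appear in the message, mapped to milliseconds.
UNIT_TO_MS = {"µs": 1e-3, "us": 1e-3, "ms": 1.0, "s": 1000.0}

PATTERN = re.compile(
    r"failed to send out heartbeat on time "
    r"\(exceeded the (\d+)ms timeout for ([\d.]+)(µs|us|ms|s)\)"
)

def heartbeat_overshoots(lines):
    """Return (timeout_ms, overshoot_ms) for every heartbeat warning in lines."""
    results = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            timeout_ms = int(match.group(1))
            overshoot_ms = float(match.group(2)) * UNIT_TO_MS[match.group(3)]
            results.append((timeout_ms, overshoot_ms))
    return results

# Lines copied from the journal output above.
sample = [
    "Jan 24 14:49:00 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 3.618403ms)",
    "Jan 24 14:49:00 SN-2 etcd[1609]: server is likely overloaded",
    "Jan 24 14:52:02 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 11.549135ms)",
]
overshoots = heartbeat_overshoots(sample)
print(overshoots)
```

Feeding it the full `journalctl -b` output gives one tuple per warning, which makes it easier to tell whether the overshoots are a few milliseconds, as here, or seconds.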
And here is the metrics data from this run:
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="1"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="2"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="4"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="8"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="16"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="32"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="64"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="128"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="256"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="512"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="1024"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="2048"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="4096"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="+Inf"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_sum 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_count 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="100"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="400"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="800"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="1600"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="3200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="6400"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="12800"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="25600"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="51200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="102400"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="204800"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="409600"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="819200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="+Inf"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_sum 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_count 0
etcd_debugging_mvcc_db_total_size_in_bytes 0
etcd_debugging_mvcc_delete_total 0
etcd_debugging_mvcc_events_total 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="0.5"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="1"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="2"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="4"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="8"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="16"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="32"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="64"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="128"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="256"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="512"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="1024"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="+Inf"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_sum 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_count 0
etcd_debugging_mvcc_keys_total 0
etcd_debugging_mvcc_pending_events_total 0
etcd_debugging_mvcc_put_total 0
etcd_debugging_mvcc_range_total 0
etcd_debugging_mvcc_slow_watcher_total 0
etcd_debugging_mvcc_txn_total 0
etcd_debugging_mvcc_watch_stream_total 0
etcd_debugging_mvcc_watcher_total 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.001"} 7
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.002"} 9
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.004"} 9
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.008"} 9
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.016"} 9
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.032"} 9
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.064"} 9
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.128"} 9
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.256"} 9
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.512"} 9
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="1.024"} 9
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="2.048"} 9
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="4.096"} 9
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="8.192"} 9
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="+Inf"} 9
etcd_debugging_snap_save_marshalling_duration_seconds_sum 0.006014972
etcd_debugging_snap_save_marshalling_duration_seconds_count 9
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.001"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.002"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.004"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.008"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.016"} 5
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.032"} 8
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.064"} 8
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.128"} 8
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.256"} 9
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.512"} 9
etcd_debugging_snap_save_total_duration_seconds_bucket{le="1.024"} 9
etcd_debugging_snap_save_total_duration_seconds_bucket{le="2.048"} 9
etcd_debugging_snap_save_total_duration_seconds_bucket{le="4.096"} 9
etcd_debugging_snap_save_total_duration_seconds_bucket{le="8.192"} 9
etcd_debugging_snap_save_total_duration_seconds_bucket{le="+Inf"} 9
etcd_debugging_snap_save_total_duration_seconds_sum 0.376242862
etcd_debugging_snap_save_total_duration_seconds_count 9
etcd_debugging_store_expires_total 5
etcd_debugging_store_reads_total{action="get"} 97940
etcd_debugging_store_reads_total{action="getRecursive"} 7
etcd_debugging_store_watch_requests_total 92427
etcd_debugging_store_watchers 3
etcd_debugging_store_writes_total{action="compareAndSwap"} 25
etcd_debugging_store_writes_total{action="create"} 3
etcd_debugging_store_writes_total{action="delete"} 9876
etcd_debugging_store_writes_total{action="set"} 24807
etcd_debugging_store_writes_total{action="update"} 26636
etcd_disk_backend_commit_duration_seconds_bucket{le="0.001"} 0
etcd_disk_backend_commit_duration_seconds_bucket{le="0.002"} 0
etcd_disk_backend_commit_duration_seconds_bucket{le="0.004"} 11
etcd_disk_backend_commit_duration_seconds_bucket{le="0.008"} 21
etcd_disk_backend_commit_duration_seconds_bucket{le="0.016"} 22
etcd_disk_backend_commit_duration_seconds_bucket{le="0.032"} 22
etcd_disk_backend_commit_duration_seconds_bucket{le="0.064"} 22
etcd_disk_backend_commit_duration_seconds_bucket{le="0.128"} 22
etcd_disk_backend_commit_duration_seconds_bucket{le="0.256"} 22
etcd_disk_backend_commit_duration_seconds_bucket{le="0.512"} 22
etcd_disk_backend_commit_duration_seconds_bucket{le="1.024"} 22
etcd_disk_backend_commit_duration_seconds_bucket{le="2.048"} 22
etcd_disk_backend_commit_duration_seconds_bucket{le="4.096"} 22
etcd_disk_backend_commit_duration_seconds_bucket{le="8.192"} 22
etcd_disk_backend_commit_duration_seconds_bucket{le="+Inf"} 22
etcd_disk_backend_commit_duration_seconds_sum 0.094174526
etcd_disk_backend_commit_duration_seconds_count 22
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.001"} 0
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.002"} 47532
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.004"} 87594
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.008"} 91006
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.016"} 91650
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.032"} 92642
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.064"} 93389
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.128"} 93572
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.256"} 93592
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.512"} 93596
etcd_disk_wal_fsync_duration_seconds_bucket{le="1.024"} 93597
etcd_disk_wal_fsync_duration_seconds_bucket{le="2.048"} 93597
etcd_disk_wal_fsync_duration_seconds_bucket{le="4.096"} 93598
etcd_disk_wal_fsync_duration_seconds_bucket{le="8.192"} 93598
etcd_disk_wal_fsync_duration_seconds_bucket{le="+Inf"} 93599
etcd_disk_wal_fsync_duration_seconds_sum 360.241949550997
etcd_disk_wal_fsync_duration_seconds_count 93599
etcd_http_failed_total{code="404",method="GET"} 36
etcd_http_failed_total{code="500",method="PUT"} 16
etcd_http_received_total{method="GET"} 77
etcd_http_received_total{method="PUT"} 5360
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.0005"} 41
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.001"} 41
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.002"} 41
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.004"} 41
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.008"} 41
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.016"} 41
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.032"} 41
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.064"} 41
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.128"} 41
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.256"} 41
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.512"} 41
etcd_http_successful_duration_seconds_bucket{method="GET",le="1.024"} 41
etcd_http_successful_duration_seconds_bucket{method="GET",le="2.048"} 41
etcd_http_successful_duration_seconds_bucket{method="GET",le="+Inf"} 41
etcd_http_successful_duration_seconds_sum{method="GET"} 0.002250607
etcd_http_successful_duration_seconds_count{method="GET"} 41
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.0005"} 1
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.001"} 3
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.002"} 6
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.004"} 2236
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.008"} 5085
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.016"} 5133
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.032"} 5195
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.064"} 5278
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.128"} 5327
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.256"} 5333
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.512"} 5334
etcd_http_successful_duration_seconds_bucket{method="PUT",le="1.024"} 5335
etcd_http_successful_duration_seconds_bucket{method="PUT",le="2.048"} 5337
etcd_http_successful_duration_seconds_bucket{method="PUT",le="+Inf"} 5344
etcd_http_successful_duration_seconds_sum{method="PUT"} 58.16143633899986
etcd_http_successful_duration_seconds_count{method="PUT"} 5344
etcd_network_client_grpc_received_bytes_total 0
etcd_network_client_grpc_sent_bytes_total 0
etcd_network_peer_received_bytes_total{From="0"} 1.082536e+06
etcd_network_peer_received_bytes_total{From="832b962ea502f7d0"} 1.3940261e+07
etcd_network_peer_received_bytes_total{From="eb6038ffd8be1f89"} 1.9952945e+07
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0001"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0002"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0004"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0008"} 131
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0016"} 393
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0032"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0064"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0128"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0256"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0512"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.1024"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.2048"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.4096"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.8192"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="+Inf"} 542
etcd_network_peer_round_trip_time_seconds_sum{To="832b962ea502f7d0"} 0.6662752810000002
etcd_network_peer_round_trip_time_seconds_count{To="832b962ea502f7d0"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0001"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0002"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0004"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0008"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0016"} 104
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0032"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0064"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0128"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0256"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0512"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.1024"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.2048"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.4096"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.8192"} 542
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="+Inf"} 542
etcd_network_peer_round_trip_time_seconds_sum{To="eb6038ffd8be1f89"} 1.0603192909999994
etcd_network_peer_round_trip_time_seconds_count{To="eb6038ffd8be1f89"} 542
etcd_network_peer_sent_bytes_total{To="832b962ea502f7d0"} 2.2467564e+07
etcd_network_peer_sent_bytes_total{To="eb6038ffd8be1f89"} 2.2573012e+07
etcd_server_has_leader 1
etcd_server_leader_changes_seen_total 5
etcd_server_proposals_applied_total 93716
etcd_server_proposals_committed_total 93716
etcd_server_proposals_failed_total 16
etcd_server_proposals_pending 0
go_gc_duration_seconds{quantile="0"} 0.000174181
go_gc_duration_seconds{quantile="0.25"} 0.00021255700000000002
go_gc_duration_seconds{quantile="0.5"} 0.000274232
go_gc_duration_seconds{quantile="0.75"} 0.000393224
go_gc_duration_seconds{quantile="1"} 0.007415469
go_gc_duration_seconds_sum 0.08618228300000001
go_gc_duration_seconds_count 170
go_goroutines 187
go_memstats_alloc_bytes 3.2484144e+07
go_memstats_alloc_bytes_total 1.95992204e+09
go_memstats_buck_hash_sys_bytes 1.546432e+06
go_memstats_frees_total 2.82752e+07
go_memstats_gc_sys_bytes 1.753088e+06
go_memstats_heap_alloc_bytes 3.2484144e+07
go_memstats_heap_idle_bytes 1.2271616e+07
go_memstats_heap_inuse_bytes 4.0517632e+07
go_memstats_heap_objects 188457
go_memstats_heap_released_bytes_total 0
go_memstats_heap_sys_bytes 5.2789248e+07
go_memstats_last_gc_time_seconds 1.4852416328930922e+19
go_memstats_lookups_total 50834
go_memstats_mallocs_total 2.8463657e+07
go_memstats_mcache_inuse_bytes 2400
go_memstats_mcache_sys_bytes 16384
go_memstats_mspan_inuse_bytes 393840
go_memstats_mspan_sys_bytes 442368
go_memstats_next_gc_bytes 4.4802903e+07
go_memstats_other_sys_bytes 806968
go_memstats_stack_inuse_bytes 1.47456e+06
go_memstats_stack_sys_bytes 1.47456e+06
go_memstats_sys_bytes 5.8829048e+07
http_request_duration_microseconds{handler="prometheus",quantile="0.5"} NaN
http_request_duration_microseconds{handler="prometheus",quantile="0.9"} NaN
http_request_duration_microseconds{handler="prometheus",quantile="0.99"} NaN
http_request_duration_microseconds_sum{handler="prometheus"} 0
http_request_duration_microseconds_count{handler="prometheus"} 0
http_request_size_bytes{handler="prometheus",quantile="0.5"} NaN
http_request_size_bytes{handler="prometheus",quantile="0.9"} NaN
http_request_size_bytes{handler="prometheus",quantile="0.99"} NaN
http_request_size_bytes_sum{handler="prometheus"} 0
http_request_size_bytes_count{handler="prometheus"} 0
http_response_size_bytes{handler="prometheus",quantile="0.5"} NaN
http_response_size_bytes{handler="prometheus",quantile="0.9"} NaN
http_response_size_bytes{handler="prometheus",quantile="0.99"} NaN
http_response_size_bytes_sum{handler="prometheus"} 0
http_response_size_bytes_count{handler="prometheus"} 0
process_cpu_seconds_total 205.86
process_max_fds 1024
process_open_fds 67
process_resident_memory_bytes 6.6473984e+07
process_start_time_seconds 1.48522529731e+09
process_virtual_memory_bytes 1.0816765952e+10
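For reading the dump above, here is a rough Python sketch that summarizes the etcd_disk_wal_fsync_duration_seconds histogram. The bucket values are copied from the dump; the function names are made up for illustration, and the quantile is reported only as a bucket upper bound (coarser than the linear interpolation Prometheus applies in histogram_quantile).

```python
# Cumulative (le, count) pairs copied from the
# etcd_disk_wal_fsync_duration_seconds histogram above; le is in seconds.
WAL_FSYNC_BUCKETS = [
    (0.001, 0), (0.002, 47532), (0.004, 87594), (0.008, 91006),
    (0.016, 91650), (0.032, 92642), (0.064, 93389), (0.128, 93572),
    (0.256, 93592), (0.512, 93596), (1.024, 93597), (2.048, 93597),
    (4.096, 93598), (8.192, 93598), (float("inf"), 93599),
]
WAL_FSYNC_SUM = 360.241949550997
WAL_FSYNC_COUNT = 93599


def mean_seconds(total, count):
    """Average observation, from the histogram's _sum and _count."""
    return total / count if count else 0.0


def quantile_upper_bound(buckets, q):
    """Smallest bucket bound whose cumulative count covers quantile q."""
    target = q * buckets[-1][1]
    for le, cumulative in buckets:
        if cumulative >= target:
            return le
    return float("inf")


def count_slower_than(buckets, le):
    """Number of observations that took longer than the bucket bound le."""
    total = buckets[-1][1]
    for bound, cumulative in buckets:
        if bound == le:
            return total - cumulative
    raise ValueError("no bucket with le=%r" % le)


if __name__ == "__main__":
    print("mean fsync (s):", mean_seconds(WAL_FSYNC_SUM, WAL_FSYNC_COUNT))
    for q in (0.5, 0.99, 0.999):
        print("q=%s is within le=%s" % (q, quantile_upper_bound(WAL_FSYNC_BUCKETS, q)))
    print("fsyncs slower than 1.024s:", count_slower_than(WAL_FSYNC_BUCKETS, 1.024))
```

With these numbers most fsyncs land in the low-millisecond buckets, but a few observations fall above 1 second; those outliers, not the average, are the ones to compare against the heartbeat warnings.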
Hi,
Later the delay became large in this SSD environment. We send write requests to etcd continuously, but serially: each request waits for the etcd response to the previous one before it is sent.
The data does not look large, as the listings below show.
total 125008
drwx------ 2 root root 4096 Jan 24 10:35 .
drwx------ 4 root root 4096 Jan 24 10:35 ..
-rw------- 1 root root 64000000 Jan 24 15:22 0000000000000000-0000000000000000.wal
-rw------- 1 root root 64000000 Jan 24 10:35 0.tmp
total 1180
drwx------ 2 root root 4096 Jan 24 15:13 .
drwx------ 4 root root 4096 Jan 24 10:35 ..
-rw-r--r-- 1 root root 1171976 Jan 24 15:13 000000000000004b-00000000000186aa.snap
-rw------- 1 root root 16805888 Jan 24 15:13 db
Jan 24 15:04:08 SN-2 etcd[1609]: start to snapshot (applied: 90009, lastsnap: 80008)
Jan 24 15:04:08 SN-2 etcd[1609]: saved snapshot at index 90009
Jan 24 15:04:08 SN-2 etcd[1609]: compacted raft log at 85009
Jan 24 15:04:31 SN-2 etcd[1609]: purged file /mnt/etcd/SN-2/member/snap/0000000000000047-0000000000013888.snap successfully
Jan 24 15:05:39 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 182.952307ms)
Jan 24 15:05:39 SN-2 etcd[1609]: server is likely overloaded
Jan 24 15:05:39 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 183.082168ms)
Jan 24 15:05:39 SN-2 etcd[1609]: server is likely overloaded
Jan 24 15:06:55 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 207.093527ms)
Jan 24 15:06:55 SN-2 etcd[1609]: server is likely overloaded
Jan 24 15:06:55 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 207.152151ms)
Jan 24 15:06:55 SN-2 etcd[1609]: server is likely overloaded
Jan 24 15:09:52 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 428.804817ms)
Jan 24 15:09:52 SN-2 etcd[1609]: server is likely overloaded
Jan 24 15:09:52 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 429.769516ms)
Jan 24 15:09:52 SN-2 etcd[1609]: server is likely overloaded
Jan 24 15:10:43 SN-2 etcd[1609]: 2eb57a48c2641c1d [logterm: 71, index: 97024, vote: 2eb57a48c2641c1d] ignored vote from 832b962ea502f7d0 [logterm: 71, index: 97024] at term 71: lease is not expired (remaining ticks: 4)
Jan 24 15:10:43 SN-2 etcd[1609]: sync duration of 1.117012795s, expected less than 1s
Jan 24 15:10:43 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 748.72307ms)
Jan 24 15:10:43 SN-2 etcd[1609]: server is likely overloaded
Jan 24 15:10:43 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 748.764407ms)
Jan 24 15:10:43 SN-2 etcd[1609]: server is likely overloaded
Jan 24 15:10:43 SN-2 etcd[1609]: 2eb57a48c2641c1d [term: 71] received a MsgAppResp message with higher term from 832b962ea502f7d0 [term: 72]
Jan 24 15:10:43 SN-2 etcd[1609]: 2eb57a48c2641c1d became follower at term 72
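To see how the heartbeat delay grows over time in a log like the one above, the warnings can be pulled out with a small script. This is only a sketch: the regular expression is written against the message format shown in this log, the helper names are hypothetical, and the duration parser handles single-unit values such as 207.093527ms or 1.117012795s but not compound ones such as 1m2s.

```python
import re

# Matches the warning format seen in the log above.
HEARTBEAT_RE = re.compile(
    r"failed to send out heartbeat on time "
    r"\(exceeded the (\S+) timeout for (\S+)\)"
)
DURATION_RE = re.compile(r"([0-9.]+)(ns|us|µs|ms|s)")
UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "µs": 1e-6, "ms": 1e-3, "s": 1.0}


def parse_duration(text):
    """Convert a single-unit duration string to seconds."""
    match = DURATION_RE.fullmatch(text)
    if match is None:
        raise ValueError("unsupported duration: %r" % text)
    return float(match.group(1)) * UNIT_SECONDS[match.group(2)]


def heartbeat_overruns(lines):
    """Return (timeout_seconds, reported_delay_seconds) per heartbeat warning."""
    rows = []
    for line in lines:
        match = HEARTBEAT_RE.search(line)
        if match:
            rows.append((parse_duration(match.group(1)),
                         parse_duration(match.group(2))))
    return rows


# A few lines copied from the log above.
SAMPLE = [
    "Jan 24 15:05:39 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 182.952307ms)",
    "Jan 24 15:05:39 SN-2 etcd[1609]: server is likely overloaded",
    "Jan 24 15:09:52 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 428.804817ms)",
    "Jan 24 15:10:43 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 748.72307ms)",
]

if __name__ == "__main__":
    for timeout, delay in heartbeat_overruns(SAMPLE):
        print("timeout=%.3fs reported delay=%.6fs" % (timeout, delay))
```

On the sample lines the reported delay rises from about 0.18s to about 0.75s in roughly five minutes, which lines up with the 1.117s sync duration warning and the leader change at 15:10:43.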
64MB is fine. We preallocate the file (fallocate) for fast writes. Are you using v2 auth at all? Do you have metrics covering the period of this server log?
Hi,
Do you mean the new metrics data from this leader? Here it is:
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="1"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="2"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="4"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="8"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="16"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="32"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="64"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="128"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="256"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="512"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="1024"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="2048"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="4096"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="+Inf"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_sum 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_count 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="100"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="400"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="800"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="1600"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="3200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="6400"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="12800"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="25600"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="51200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="102400"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="204800"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="409600"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="819200"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="+Inf"} 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_sum 0
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_count 0
etcd_debugging_mvcc_db_total_size_in_bytes 0
etcd_debugging_mvcc_delete_total 0
etcd_debugging_mvcc_events_total 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="0.5"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="1"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="2"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="4"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="8"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="16"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="32"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="64"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="128"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="256"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="512"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="1024"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="+Inf"} 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_sum 0
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_count 0
etcd_debugging_mvcc_keys_total 0
etcd_debugging_mvcc_pending_events_total 0
etcd_debugging_mvcc_put_total 0
etcd_debugging_mvcc_range_total 0
etcd_debugging_mvcc_slow_watcher_total 0
etcd_debugging_mvcc_txn_total 0
etcd_debugging_mvcc_watch_stream_total 0
etcd_debugging_mvcc_watcher_total 0
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.001"} 7
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.002"} 10
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.004"} 10
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.008"} 10
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.016"} 10
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.032"} 10
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.064"} 10
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.128"} 10
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.256"} 10
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.512"} 10
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="1.024"} 10
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="2.048"} 10
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="4.096"} 10
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="8.192"} 10
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="+Inf"} 10
etcd_debugging_snap_save_marshalling_duration_seconds_sum 0.007692212
etcd_debugging_snap_save_marshalling_duration_seconds_count 10
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.001"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.002"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.004"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.008"} 0
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.016"} 5
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.032"} 9
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.064"} 9
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.128"} 9
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.256"} 10
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.512"} 10
etcd_debugging_snap_save_total_duration_seconds_bucket{le="1.024"} 10
etcd_debugging_snap_save_total_duration_seconds_bucket{le="2.048"} 10
etcd_debugging_snap_save_total_duration_seconds_bucket{le="4.096"} 10
etcd_debugging_snap_save_total_duration_seconds_bucket{le="8.192"} 10
etcd_debugging_snap_save_total_duration_seconds_bucket{le="+Inf"} 10
etcd_debugging_snap_save_total_duration_seconds_sum 0.394149646
etcd_debugging_snap_save_total_duration_seconds_count 10
etcd_debugging_store_expires_total 5
etcd_debugging_store_reads_total{action="get"} 117215
etcd_debugging_store_reads_total{action="getRecursive"} 7
etcd_debugging_store_watch_requests_total 111162
etcd_debugging_store_watchers 4
etcd_debugging_store_writes_total{action="compareAndSwap"} 25
etcd_debugging_store_writes_total{action="create"} 3
etcd_debugging_store_writes_total{action="delete"} 12018
etcd_debugging_store_writes_total{action="set"} 30165
etcd_debugging_store_writes_total{action="update"} 29398
etcd_disk_backend_commit_duration_seconds_bucket{le="0.001"} 0
etcd_disk_backend_commit_duration_seconds_bucket{le="0.002"} 0
etcd_disk_backend_commit_duration_seconds_bucket{le="0.004"} 11
etcd_disk_backend_commit_duration_seconds_bucket{le="0.008"} 22
etcd_disk_backend_commit_duration_seconds_bucket{le="0.016"} 23
etcd_disk_backend_commit_duration_seconds_bucket{le="0.032"} 23
etcd_disk_backend_commit_duration_seconds_bucket{le="0.064"} 23
etcd_disk_backend_commit_duration_seconds_bucket{le="0.128"} 23
etcd_disk_backend_commit_duration_seconds_bucket{le="0.256"} 23
etcd_disk_backend_commit_duration_seconds_bucket{le="0.512"} 23
etcd_disk_backend_commit_duration_seconds_bucket{le="1.024"} 23
etcd_disk_backend_commit_duration_seconds_bucket{le="2.048"} 23
etcd_disk_backend_commit_duration_seconds_bucket{le="4.096"} 23
etcd_disk_backend_commit_duration_seconds_bucket{le="8.192"} 23
etcd_disk_backend_commit_duration_seconds_bucket{le="+Inf"} 23
etcd_disk_backend_commit_duration_seconds_sum 0.09842727799999999
etcd_disk_backend_commit_duration_seconds_count 23
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.001"} 0
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.002"} 56820
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.004"} 100384
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.008"} 104233
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.016"} 104941
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.032"} 106036
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.064"} 106857
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.128"} 107048
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.256"} 107071
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.512"} 107075
etcd_disk_wal_fsync_duration_seconds_bucket{le="1.024"} 107077
etcd_disk_wal_fsync_duration_seconds_bucket{le="2.048"} 107078
etcd_disk_wal_fsync_duration_seconds_bucket{le="4.096"} 107079
etcd_disk_wal_fsync_duration_seconds_bucket{le="8.192"} 107080
etcd_disk_wal_fsync_duration_seconds_bucket{le="+Inf"} 107081
etcd_disk_wal_fsync_duration_seconds_sum 399.93726846399517
etcd_disk_wal_fsync_duration_seconds_count 107081
etcd_http_failed_total{code="404",method="GET"} 36
etcd_http_failed_total{code="500",method="PUT"} 16
etcd_http_received_total{method="GET"} 86
etcd_http_received_total{method="PUT"} 5882
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.0005"} 50
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.001"} 50
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.002"} 50
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.004"} 50
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.008"} 50
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.016"} 50
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.032"} 50
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.064"} 50
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.128"} 50
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.256"} 50
etcd_http_successful_duration_seconds_bucket{method="GET",le="0.512"} 50
etcd_http_successful_duration_seconds_bucket{method="GET",le="1.024"} 50
etcd_http_successful_duration_seconds_bucket{method="GET",le="2.048"} 50
etcd_http_successful_duration_seconds_bucket{method="GET",le="+Inf"} 50
etcd_http_successful_duration_seconds_sum{method="GET"} 0.002676219
etcd_http_successful_duration_seconds_count{method="GET"} 50
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.0005"} 1
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.001"} 3
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.002"} 6
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.004"} 2364
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.008"} 5572
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.016"} 5631
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.032"} 5698
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.064"} 5786
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.128"} 5836
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.256"} 5842
etcd_http_successful_duration_seconds_bucket{method="PUT",le="0.512"} 5844
etcd_http_successful_duration_seconds_bucket{method="PUT",le="1.024"} 5846
etcd_http_successful_duration_seconds_bucket{method="PUT",le="2.048"} 5857
etcd_http_successful_duration_seconds_bucket{method="PUT",le="+Inf"} 5866
etcd_http_successful_duration_seconds_sum{method="PUT"} 80.15189647799988
etcd_http_successful_duration_seconds_count{method="PUT"} 5866
etcd_network_client_grpc_received_bytes_total 0
etcd_network_client_grpc_sent_bytes_total 0
etcd_network_peer_received_bytes_total{From="0"} 1.193304e+06
etcd_network_peer_received_bytes_total{From="832b962ea502f7d0"} 1.4694404e+07
etcd_network_peer_received_bytes_total{From="eb6038ffd8be1f89"} 2.3047582e+07
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0001"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0002"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0004"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0008"} 159
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0016"} 444
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0032"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0064"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0128"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0256"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.0512"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.1024"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.2048"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.4096"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="0.8192"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="832b962ea502f7d0",le="+Inf"} 597
etcd_network_peer_round_trip_time_seconds_sum{To="832b962ea502f7d0"} 0.7200944739999998
etcd_network_peer_round_trip_time_seconds_count{To="832b962ea502f7d0"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0001"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0002"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0004"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0008"} 0
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0016"} 106
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0032"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0064"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0128"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0256"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.0512"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.1024"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.2048"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.4096"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="0.8192"} 597
etcd_network_peer_round_trip_time_seconds_bucket{To="eb6038ffd8be1f89",le="+Inf"} 597
etcd_network_peer_round_trip_time_seconds_sum{To="eb6038ffd8be1f89"} 1.1749873839999994
etcd_network_peer_round_trip_time_seconds_count{To="eb6038ffd8be1f89"} 597
etcd_network_peer_sent_bytes_total{To="832b962ea502f7d0"} 2.3792544e+07
etcd_network_peer_sent_bytes_total{To="eb6038ffd8be1f89"} 2.4832411e+07
etcd_server_has_leader 1
etcd_server_leader_changes_seen_total 8
etcd_server_proposals_applied_total 107268
etcd_server_proposals_committed_total 107268
etcd_server_proposals_failed_total 16
etcd_server_proposals_pending 0
go_gc_duration_seconds{quantile="0"} 0.000174181
go_gc_duration_seconds{quantile="0.25"} 0.00021439200000000002
go_gc_duration_seconds{quantile="0.5"} 0.000279183
go_gc_duration_seconds{quantile="0.75"} 0.000405336
go_gc_duration_seconds{quantile="1"} 0.007415469
go_gc_duration_seconds_sum 0.09541828000000001
go_gc_duration_seconds_count 188
go_goroutines 195
go_memstats_alloc_bytes 3.0810848e+07
go_memstats_alloc_bytes_total 2.228108448e+09
go_memstats_buck_hash_sys_bytes 1.552136e+06
go_memstats_frees_total 3.2360274e+07
go_memstats_gc_sys_bytes 1.835008e+06
go_memstats_heap_alloc_bytes 3.0810848e+07
go_memstats_heap_idle_bytes 1.540096e+07
go_memstats_heap_inuse_bytes 3.9616512e+07
go_memstats_heap_objects 148563
go_memstats_heap_released_bytes_total 1.4983168e+07
go_memstats_heap_sys_bytes 5.5017472e+07
go_memstats_last_gc_time_seconds 1.48524320352104e+19
go_memstats_lookups_total 55626
go_memstats_mallocs_total 3.2508837e+07
go_memstats_mcache_inuse_bytes 2400
go_memstats_mcache_sys_bytes 16384
go_memstats_mspan_inuse_bytes 390000
go_memstats_mspan_sys_bytes 458752
go_memstats_next_gc_bytes 4.6997184e+07
go_memstats_other_sys_bytes 780784
go_memstats_stack_inuse_bytes 1.47456e+06
go_memstats_stack_sys_bytes 1.47456e+06
go_memstats_sys_bytes 6.1135096e+07
http_request_duration_microseconds{handler="prometheus",quantile="0.5"} 1661.35
http_request_duration_microseconds{handler="prometheus",quantile="0.9"} 1661.35
http_request_duration_microseconds{handler="prometheus",quantile="0.99"} 1661.35
http_request_duration_microseconds_sum{handler="prometheus"} 8566.711
http_request_duration_microseconds_count{handler="prometheus"} 2
http_request_size_bytes{handler="prometheus",quantile="0.5"} 63
http_request_size_bytes{handler="prometheus",quantile="0.9"} 63
http_request_size_bytes{handler="prometheus",quantile="0.99"} 63
http_request_size_bytes_sum{handler="prometheus"} 126
http_request_size_bytes_count{handler="prometheus"} 2
http_requests_total{code="200",handler="prometheus",method="get"} 2
http_response_size_bytes{handler="prometheus",quantile="0.5"} 27720
http_response_size_bytes{handler="prometheus",quantile="0.9"} 27720
http_response_size_bytes{handler="prometheus",quantile="0.99"} 27720
http_response_size_bytes_sum{handler="prometheus"} 55211
http_response_size_bytes_count{handler="prometheus"} 2
process_cpu_seconds_total 229.18
process_max_fds 1024
process_open_fds 65
process_resident_memory_bytes 5.4153216e+07
process_start_time_seconds 1.48522529731e+09
process_virtual_memory_bytes 1.0819072e+10
Hi
Are you using v2 auth at all?
--> Sorry, I don't quite follow this question.
etcdctl --version
etcdctl version: 3.0.12
API version: 2
@nabeken
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.032"} 106036
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.064"} 106857
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.128"} 107048
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.256"} 107071
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.512"} 107075
etcd_disk_wal_fsync_duration_seconds_bucket{le="1.024"} 107077
etcd_disk_wal_fsync_duration_seconds_bucket{le="2.048"} 107078
etcd_disk_wal_fsync_duration_seconds_bucket{le="4.096"} 107079
etcd_disk_wal_fsync_duration_seconds_bucket{le="8.192"} 107080
Some fsyncs still took more than 100ms to finish, and I am not sure why. You need to make sure no fsync takes more than 100ms, or you will see the warning. If you know your disk is not stable, then just ignore the warning.
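As a rough illustration of how that conclusion can be read off the histogram: the `_bucket` counters are cumulative, so the number of fsyncs slower than a given boundary is the total minus the count at that boundary. The sketch below is only an assumption-laden helper (the names `parse_buckets` and `slower_than` are made up for this example, and the last bucket shown is used as the total because the `+Inf` line is not in the paste). 100ms is not itself a bucket boundary, so the two neighbouring boundaries, 0.064 and 0.128, give a lower and upper estimate.

```python
import re

# Bucket lines copied from the dump above (cumulative counts, bounds in seconds).
SAMPLE = """\
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.032"} 106036
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.064"} 106857
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.128"} 107048
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.256"} 107071
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.512"} 107075
etcd_disk_wal_fsync_duration_seconds_bucket{le="1.024"} 107077
etcd_disk_wal_fsync_duration_seconds_bucket{le="2.048"} 107078
etcd_disk_wal_fsync_duration_seconds_bucket{le="4.096"} 107079
etcd_disk_wal_fsync_duration_seconds_bucket{le="8.192"} 107080
"""

BUCKET_RE = re.compile(r'_bucket\{le="([^"]+)"\}\s+(\S+)')


def parse_buckets(text):
    """Return sorted (upper_bound_seconds, cumulative_count) pairs."""
    pairs = []
    for bound, count in BUCKET_RE.findall(text):
        # Counts may be printed in scientific notation, so go through float.
        pairs.append((float(bound), int(float(count))))
    return sorted(pairs)


def slower_than(buckets, bound):
    """Count observations above `bound`, which must be an existing boundary."""
    total = buckets[-1][1]  # assumption: the last bucket shown covers everything
    return total - dict(buckets)[bound]


buckets = parse_buckets(SAMPLE)
print("fsyncs slower than 64ms: ", slower_than(buckets, 0.064))
print("fsyncs slower than 128ms:", slower_than(buckets, 0.128))
```

The number of fsyncs that took longer than 100ms lies somewhere between those two printed counts.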
Hi
How should I analyze this data, and why do you say it needs to stay under 100ms? Could you share a link to the docs on how to read the etcd metrics? Thanks.
Even if it is 100ms, why does etcd report a delay as large as the one below?
Jan 24 15:10:43 SN-2 etcd[1609]: failed to send out heartbeat on time (exceeded the 200ms timeout for 748.764407ms)
Also, our own SSD performance report looks good enough: write throughput reaches 109 MB/s in the random test and 414 MB/s in the sequential test, while random read throughput is 77 MB/s.
Hi,
I'm having the same heartbeat issue, though it comes and goes; these are my current cluster statistics. I'm having trouble understanding the floating point part (if I understand correctly, my stats are terribly appalling?).
# TYPE etcd_disk_wal_fsync_duration_seconds histogram
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.001"} 2.8872813e+07
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.002"} 3.2572623e+07
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.004"} 3.2683716e+07
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.008"} 3.2773303e+07
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.016"} 3.284899e+07
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.032"} 3.2899384e+07
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.064"} 3.2937582e+07
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.128"} 3.2954717e+07
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.256"} 3.2957131e+07
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.512"} 3.2957703e+07
etcd_disk_wal_fsync_duration_seconds_bucket{le="1.024"} 3.2957739e+07
etcd_disk_wal_fsync_duration_seconds_bucket{le="2.048"} 3.2957741e+07
etcd_disk_wal_fsync_duration_seconds_bucket{le="4.096"} 3.2957741e+07
etcd_disk_wal_fsync_duration_seconds_bucket{le="8.192"} 3.2957741e+07
etcd_disk_wal_fsync_duration_seconds_bucket{le="+Inf"} 3.2957741e+07
etcd_disk_wal_fsync_duration_seconds_sum 22704.826552452094
etcd_disk_wal_fsync_duration_seconds_count 3.2957741e+07
etcd_disk_backend_commit_duration_seconds_bucket{le="0.001"} 6.421544e+06
etcd_disk_backend_commit_duration_seconds_bucket{le="0.002"} 8.351543e+06
etcd_disk_backend_commit_duration_seconds_bucket{le="0.004"} 8.401466e+06
etcd_disk_backend_commit_duration_seconds_bucket{le="0.008"} 8.428063e+06
etcd_disk_backend_commit_duration_seconds_bucket{le="0.016"} 8.45627e+06
etcd_disk_backend_commit_duration_seconds_bucket{le="0.032"} 8.478932e+06
etcd_disk_backend_commit_duration_seconds_bucket{le="0.064"} 8.496687e+06
etcd_disk_backend_commit_duration_seconds_bucket{le="0.128"} 8.506733e+06
etcd_disk_backend_commit_duration_seconds_bucket{le="0.256"} 8.509377e+06
etcd_disk_backend_commit_duration_seconds_bucket{le="0.512"} 8.509859e+06
etcd_disk_backend_commit_duration_seconds_bucket{le="1.024"} 8.509883e+06
etcd_disk_backend_commit_duration_seconds_bucket{le="2.048"} 8.509886e+06
etcd_disk_backend_commit_duration_seconds_bucket{le="4.096"} 8.509886e+06
etcd_disk_backend_commit_duration_seconds_bucket{le="8.192"} 8.509886e+06
etcd_disk_backend_commit_duration_seconds_bucket{le="+Inf"} 8.509886e+06
etcd_disk_backend_commit_duration_seconds_sum 10687.892327041654
etcd_disk_backend_commit_duration_seconds_count 8.509886e+06
Many thanks!
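On the floating point part: values such as `3.2957741e+07` are just counters printed in scientific notation (here 32957741 observations). A minimal sketch of how the WAL fsync numbers above could be summarized, assuming the usual Prometheus histogram semantics (`_sum` is total seconds, `_count` is the number of observations, buckets are cumulative); the variable names are made up for this example:

```python
# Values copied from the metrics dump above.
fsync_count = int(float("3.2957741e+07"))     # total number of WAL fsyncs
fsync_sum = 22704.826552452094                # total seconds spent in fsync
fsync_le_1ms = int(float("2.8872813e+07"))    # cumulative count for le="0.001"
fsync_le_128ms = int(float("3.2954717e+07"))  # cumulative count for le="0.128"

# Average latency is the sum divided by the count.
mean_ms = fsync_sum / fsync_count * 1000

# Share of fsyncs that finished within 1ms.
share_within_1ms = fsync_le_1ms / fsync_count

# Buckets are cumulative, so the slow tail is total minus the bucket count.
slower_than_128ms = fsync_count - fsync_le_128ms

print(f"mean fsync latency: {mean_ms:.3f} ms")
print(f"finished within 1ms: {share_within_1ms:.1%}")
print(f"slower than 128ms: {slower_than_128ms} of {fsync_count}")
```

Read this way, the bulk of the fsyncs are fast and only a small tail is slow; that tail is what lines up with a warning that comes and goes.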
Hi
I am wondering whether the elapsed-time computation is based on the system's real-time (wall) clock. If so, when the system time jumps because of an NTP sync, the computed duration would be wrong as well. Could this be one possible cause?
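To make the concern concrete in general terms (this says nothing about how etcd itself measures the heartbeat delay, which would need to be checked in its source), here is a small sketch contrasting a duration taken from the wall clock with one taken from a monotonic clock. The helper `measure` is hypothetical and only for illustration.

```python
import time


def measure(work):
    """Run `work` and return (wall_elapsed, monotonic_elapsed) in seconds."""
    wall_start = time.time()       # wall clock: can jump if the system time is adjusted
    mono_start = time.monotonic()  # monotonic clock: never goes backwards
    work()
    wall_elapsed = time.time() - wall_start
    mono_elapsed = time.monotonic() - mono_start
    return wall_elapsed, mono_elapsed


wall, mono = measure(lambda: time.sleep(0.05))
print(f"wall clock elapsed: {wall:.4f}s")
print(f"monotonic elapsed:  {mono:.4f}s")
```

Normally the two values agree closely. If the system clock were stepped while `work` was running, only the wall-clock figure would be distorted, which is the situation described above.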