Title: get request errors: "no healthy upstream"
Description:
Envoy receives its dynamic configuration from a control plane via xDS.
Timeline:
2019-11-15 19:00:43: updated a cluster timeout and changed the config version.
2019-11-15 21:17:03: started getting a lot of request errors: grpc-status: 14, grpc-message: no healthy upstream
Of the 21 Envoy nodes deployed, two showed this error.
Restarting or reloading Envoy on an affected node returns it to normal.
Envoy version: 1.11.2
Config:
xds:
cache.NewSnapshotCache(false, xdHasher{}, xdLogger{})
envoy yaml:
admin:
  access_log_path: /data/log/envoybridge/admin_access.log
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901
dynamic_resources:
  ads_config:
    api_type: GRPC
    grpc_services:
    - envoy_grpc:
        cluster_name: xds_cluster
  cds_config:
    ads: {}
  lds_config:
    ads: {}
node:
  cluster: grpc-cluster
  id: grpc-node
static_resources:
  clusters:
  - name: xds_cluster
    connect_timeout: 1s
    type: strict_dns
    lb_policy: round_robin
    http2_protocol_options: {}
    load_assignment:
      cluster_name: xds_cluster
      endpoints:
      - lb_endpoints:
          endpoint:
            address:
              socket_address:
                address: x.x.x.x
                port_value: 9000
      - lb_endpoints:
          endpoint:
            address:
              socket_address:
                address: x.x.x.x
                port_value: 9000
      - lb_endpoints:
          endpoint:
            address:
              socket_address:
                address: x.x.x.x
                port_value: 9000
  - name: log_cluster
    type: EDS
    connect_timeout: 0.1s
    lb_policy: ROUND_ROBIN
    http2_protocol_options: {}
    eds_cluster_config:
      service_name: log_cluster
      eds_config:
        api_config_source:
          api_type: GRPC
          grpc_services:
            envoy_grpc:
              cluster_name: xds_cluster
Logs:
[2019-11-15 19:00:43.206][23415][warning][config] [bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:87] gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection termination
[2019-11-15 19:03:03.399][23415][warning][config] [bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:87] gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection termination
[2019-11-15 19:08:36.138][23415][warning][config] [bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:87] gRPC config stream closed: 13,
[2019-11-15 19:08:36.620][23415][info][upstream] [source/server/lds_api.cc:60] lds: add/update listener 'grpc-listener'
[2019-11-15 19:08:36.621][23415][info][upstream] [source/common/upstream/cluster_manager_impl.cc:495] add/update cluster app.xxx starting warming
[2019-11-15 19:08:36.625][23415][info][upstream] [source/common/upstream/cluster_manager_impl.cc:495] add/update cluster app.xxx starting warming
[2019-11-15 19:08:36.626][23415][info][upstream] [source/common/upstream/cluster_manager_impl.cc:495] add/update cluster app.xxx starting warming
[2019-11-15 19:08:36.626][23415][info][upstream] [source/common/upstream/cluster_manager_impl.cc:495] add/update cluster app.xxx starting warming
[2019-11-15 19:08:36.627][23415][info][upstream] [source/common/upstream/cluster_manager_impl.cc:495] add/update cluster app.xxx starting warming
[2019-11-15 19:08:36.627][23415][warning][misc] [source/common/protobuf/utility.cc:199] Using deprecated option 'envoy.api.v2.route.CorsPolicy.allow_origin_regex' from file route.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-11-15 19:08:36.627][23415][warning][misc] [source/common/protobuf/utility.cc:199] Using deprecated option 'envoy.api.v2.route.CorsPolicy.allow_origin_regex' from file route.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-11-15 19:08:36.627][23415][warning][misc] [source/common/protobuf/utility.cc:199] Using deprecated option 'envoy.api.v2.route.CorsPolicy.allow_origin_regex' from file route.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-11-15 19:08:36.627][23415][warning][misc] [source/common/protobuf/utility.cc:199] Using deprecated option 'envoy.api.v2.route.CorsPolicy.allow_origin_regex' from file route.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-11-15 19:08:36.627][23415][warning][misc] [source/common/protobuf/utility.cc:199] Using deprecated option 'envoy.api.v2.route.CorsPolicy.allow_origin_regex' from file route.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-11-15 19:08:52.511][23415][warning][config] [bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:87] gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection termination
[2019-11-15 21:01:22.203][23415][warning][config] [bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:87] gRPC config stream closed: 13,
[2019-11-15 21:17:03.937][23415][warning][config] [bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:87] gRPC config stream closed: 13,
[2019-11-15 21:17:03.937][23415][info][upstream] [source/common/upstream/cluster_manager_impl.cc:507] warming cluster app.xxx complete
[2019-11-15 21:17:03.938][23415][info][upstream] [source/common/upstream/cluster_manager_impl.cc:507] warming cluster app.xxx complete
[2019-11-15 21:17:03.938][23415][info][upstream] [source/common/upstream/cluster_manager_impl.cc:507] warming cluster app.xxx complete
[2019-11-15 21:17:03.938][23415][info][upstream] [source/common/upstream/cluster_manager_impl.cc:507] warming cluster app.xxx complete
[2019-11-15 21:17:03.939][23415][info][upstream] [source/common/upstream/cluster_manager_impl.cc:507] warming cluster app.xxx complete
[2019-11-15 21:41:03.451][3324][info][main] [source/server/server.cc:238] initializing epoch 7 (hot restart version=11.104)
[2019-11-15 21:41:03.451][3324][info][main] [source/server/server.cc:240] statically linked extensions:
[2019-11-15 21:41:03.451][3324][info][main] [source/server/server.cc:242] access_loggers: envoy.file_access_log,envoy.http_grpc_access_log
[2019-11-15 21:41:03.451][3324][info][main] [source/server/server.cc:245] filters.http: envoy.buffer,envoy.cors,envoy.csrf,envoy.ext_authz,envoy.fault,envoy.filters.http.dynamic_forward_proxy,envoy.filters.http.grpc_http1_reverse_bridge,envoy.filters.http.header_to_metadata,envoy.filters.http.jwt_authn,envoy.filters.http.original_src,envoy.filters.http.rbac,envoy.filters.http.tap,envoy.grpc_http1_bridge,envoy.grpc_json_transcoder,envoy.grpc_web,envoy.gzip,envoy.health_check,envoy.http_dynamo_filter,envoy.ip_tagging,envoy.lua,envoy.rate_limit,envoy.router,envoy.squash
[2019-11-15 21:41:03.451][3324][info][main] [source/server/server.cc:248] filters.listener: envoy.listener.original_dst,envoy.listener.original_src,envoy.listener.proxy_protocol,envoy.listener.tls_inspector
[2019-11-15 21:41:03.451][3324][info][main] [source/server/server.cc:251] filters.network: envoy.client_ssl_auth,envoy.echo,envoy.ext_authz,envoy.filters.network.dubbo_proxy,envoy.filters.network.mysql_proxy,envoy.filters.network.rbac,envoy.filters.network.sni_cluster,envoy.filters.network.thrift_proxy,envoy.filters.network.zookeeper_proxy,envoy.http_connection_manager,envoy.mongo_proxy,envoy.ratelimit,envoy.redis_proxy,envoy.tcp_proxy
[2019-11-15 21:41:03.452][3324][info][main] [source/server/server.cc:253] stat_sinks: envoy.dog_statsd,envoy.metrics_service,envoy.stat_sinks.hystrix,envoy.statsd
[2019-11-15 21:41:03.452][3324][info][main] [source/server/server.cc:255] tracers: envoy.dynamic.ot,envoy.lightstep,envoy.tracers.datadog,envoy.tracers.opencensus,envoy.zipkin
[2019-11-15 21:41:03.452][3324][info][main] [source/server/server.cc:258] transport_sockets.downstream: envoy.transport_sockets.alts,envoy.transport_sockets.tap,raw_buffer,tls
[2019-11-15 21:41:03.452][3324][info][main] [source/server/server.cc:261] transport_sockets.upstream: envoy.transport_sockets.alts,envoy.transport_sockets.tap,raw_buffer,tls
[2019-11-15 21:41:03.452][3324][info][main] [source/server/server.cc:267] buffer implementation: old (libevent)
[2019-11-15 21:41:03.458][23415][warning][main] [source/server/server.cc:574] shutting down admin due to child startup
[2019-11-15 21:41:03.458][23415][warning][main] [source/server/server.cc:580] terminating parent process
[2019-11-15 21:41:03.459][3324][info][main] [source/server/server.cc:322] admin address: 0.0.0.0:9901
[2019-11-15 21:41:03.460][3324][info][main] [source/server/server.cc:432] runtime: layers:
  - name: base
    static_layer:
      {}
  - name: admin
    admin_layer:
      {}
[2019-11-15 21:41:03.460][3324][warning][runtime] [source/common/runtime/runtime_impl.cc:497] Skipping unsupported runtime layer: name: "base"
static_layer {
}
[2019-11-15 21:41:03.460][3324][info][config] [source/server/configuration_impl.cc:61] loading 0 static secret(s)
[2019-11-15 21:41:03.460][3324][info][config] [source/server/configuration_impl.cc:67] loading 2 cluster(s)
[2019-11-15 21:41:03.462][3324][info][upstream] [source/common/upstream/cluster_manager_impl.cc:124] cm init: initializing secondary clusters
[2019-11-15 21:41:03.463][3324][info][config] [source/server/configuration_impl.cc:71] loading 0 listener(s)
[2019-11-15 21:41:03.463][3324][info][config] [source/server/configuration_impl.cc:96] loading tracing configuration
[2019-11-15 21:41:03.463][3324][info][config] [source/server/configuration_impl.cc:116] loading stats sink configuration
[2019-11-15 21:41:03.463][3324][info][main] [source/server/server.cc:516] starting main dispatch loop
[2019-11-15 21:41:03.468][3324][info][upstream] [source/common/upstream/cluster_manager_impl.cc:144] cm init: initializing cds
[2019-11-15 21:41:03.469][3324][info][upstream] [source/common/upstream/cluster_manager_impl.cc:489] add/update cluster app.xxx during init
[2019-11-15 21:41:03.470][3324][info][upstream] [source/common/upstream/cluster_manager_impl.cc:489] add/update cluster app.xxx during init
[2019-11-15 21:41:03.471][3324][info][upstream] [source/common/upstream/cluster_manager_impl.cc:489] add/update cluster app.xxx during init
[2019-11-15 21:41:03.471][3324][info][upstream] [source/common/upstream/cluster_manager_impl.cc:489] add/update cluster app.xxx during init
[2019-11-15 21:41:03.472][3324][info][upstream] [source/common/upstream/cluster_manager_impl.cc:489] add/update cluster app.xxx during init
[2019-11-15 21:41:03.472][3324][info][upstream] [source/common/upstream/cluster_manager_impl.cc:124] cm init: initializing secondary clusters
[2019-11-15 21:41:03.475][3324][info][upstream] [source/common/upstream/cluster_manager_impl.cc:148] cm init: all clusters initialized
[2019-11-15 21:41:03.475][3324][info][main] [source/server/server.cc:500] all clusters initialized. initializing init manager
[2019-11-15 21:41:03.479][3324][info][upstream] [source/server/lds_api.cc:60] lds: add/update listener 'grpc-listener'
[2019-11-15 21:41:03.480][3324][warning][misc] [source/common/protobuf/utility.cc:199] Using deprecated option 'envoy.api.v2.route.CorsPolicy.allow_origin_regex' from file route.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-11-15 21:41:03.480][3324][warning][misc] [source/common/protobuf/utility.cc:199] Using deprecated option 'envoy.api.v2.route.CorsPolicy.allow_origin_regex' from file route.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-11-15 21:41:03.480][3324][warning][misc] [source/common/protobuf/utility.cc:199] Using deprecated option 'envoy.api.v2.route.CorsPolicy.allow_origin_regex' from file route.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-11-15 21:41:03.481][3324][warning][misc] [source/common/protobuf/utility.cc:199] Using deprecated option 'envoy.api.v2.route.CorsPolicy.allow_origin_regex' from file route.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-11-15 21:41:03.481][3324][warning][misc] [source/common/protobuf/utility.cc:199] Using deprecated option 'envoy.api.v2.route.CorsPolicy.allow_origin_regex' from file route.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-11-15 21:41:03.481][3324][info][config] [source/server/listener_manager_impl.cc:761] all dependencies initialized. starting workers
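For triage, the "gRPC config stream closed" warnings in a log like the one above can be tallied by status code. The following is a minimal Go sketch, not part of the original report: the function name CountStreamCloses and the sample lines are illustrative assumptions, and the only facts it relies on are the message text seen in the log and the standard gRPC code names (13 is INTERNAL, 14 is UNAVAILABLE).

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
)

// closedRe matches the warning text seen in the log and captures the status code.
var closedRe = regexp.MustCompile(`gRPC config stream closed: (\d+)`)

// statusNames covers only the two gRPC codes that appear in this log.
var statusNames = map[int]string{13: "INTERNAL", 14: "UNAVAILABLE"}

// CountStreamCloses (hypothetical helper) returns how many times the xDS
// stream was reported closed, keyed by gRPC status code.
func CountStreamCloses(lines []string) map[int]int {
	counts := map[int]int{}
	for _, line := range lines {
		m := closedRe.FindStringSubmatch(line)
		if m == nil {
			continue // not a stream-closed warning
		}
		code, err := strconv.Atoi(m[1])
		if err != nil {
			continue
		}
		counts[code]++
	}
	return counts
}

func main() {
	// Abbreviated sample lines; "[...]" stands for the elided source-location field.
	lines := []string{
		"[2019-11-15 19:08:36.138][23415][warning][config] [...] gRPC config stream closed: 13,",
		"[2019-11-15 19:08:52.511][23415][warning][config] [...] gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection termination",
		"[2019-11-15 21:17:03.937][23415][warning][config] [...] gRPC config stream closed: 13,",
	}
	counts := CountStreamCloses(lines)

	// Sort the codes so the output order is stable.
	codes := make([]int, 0, len(counts))
	for c := range counts {
		codes = append(codes, c)
	}
	sort.Ints(codes)
	for _, c := range codes {
		name, ok := statusNames[c]
		if !ok {
			name = "other"
		}
		fmt.Printf("status %d (%s): %d\n", c, name, counts[c])
	}
}
```

Running this over the full log of a normal node and an abnormal node would show whether the affected nodes saw more stream resets than the healthy ones.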
Normal node: [screenshot not preserved]
Abnormal node: [screenshot not preserved]
This seems to be a race between DNS resolution/cluster warming and readiness. Do you mind attaching full snippets of your logs with debug level logging? I assume that the errors you are talking about are requests to the web.interface_cluster that you are pointing out in the screenshots?
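To check how long a cluster stayed in warming, the "starting warming" and "warming cluster ... complete" lines in the posted log can be paired up. The Go sketch below is only an illustration under stated assumptions: the helper name WarmingDurations is hypothetical, the timestamp layout is taken from the log prefix, and because the cluster names are redacted to app.xxx it pairs the first start with the first completion per name.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// Layout of the timestamp prefix used in the posted Envoy log lines.
const envoyTimeLayout = "2006-01-02 15:04:05.000"

var (
	tsRe    = regexp.MustCompile(`^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]`)
	startRe = regexp.MustCompile(`add/update cluster (\S+) starting warming`)
	doneRe  = regexp.MustCompile(`warming cluster (\S+) complete`)
)

// WarmingDurations (hypothetical helper) returns, per cluster name, the time
// between the first "starting warming" line and the first later
// "warming cluster ... complete" line.
func WarmingDurations(lines []string) map[string]time.Duration {
	started := map[string]time.Time{}
	result := map[string]time.Duration{}
	for _, line := range lines {
		m := tsRe.FindStringSubmatch(line)
		if m == nil {
			continue // continuation line or unrelated text
		}
		ts, err := time.Parse(envoyTimeLayout, m[1])
		if err != nil {
			continue
		}
		if s := startRe.FindStringSubmatch(line); s != nil {
			if _, seen := started[s[1]]; !seen {
				started[s[1]] = ts
			}
			continue
		}
		if d := doneRe.FindStringSubmatch(line); d != nil {
			t0, ok := started[d[1]]
			if _, done := result[d[1]]; ok && !done {
				result[d[1]] = ts.Sub(t0)
			}
		}
	}
	return result
}

func main() {
	// Two lines copied from the log in the issue description.
	lines := []string{
		"[2019-11-15 19:08:36.621][23415][info][upstream] [source/common/upstream/cluster_manager_impl.cc:495] add/update cluster app.xxx starting warming",
		"[2019-11-15 21:17:03.937][23415][info][upstream] [source/common/upstream/cluster_manager_impl.cc:507] warming cluster app.xxx complete",
	}
	for name, d := range WarmingDurations(lines) {
		fmt.Printf("%s warmed in %s\n", name, d) // prints "app.xxx warmed in 2h8m27.316s"
	}
}
```

For the two lines shown, the gap is a little over two hours, which lines up with the 21:17:03 time at which the request errors were reported.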
The error occurred unexpectedly. My service has been running in production for over a month. I updated the dynamic configuration, and the error occurred two hours later. Restarting the node returned it to normal.
I create the snapshot cache with ads set to false. Could that affect this?
Does the panic threshold need to be configured explicitly for it to take effect?
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted". Thank you for your contributions.
@tonyboxes Hello Tony, have you solved your issue? I'm currently facing the same problem.
Hi @tonyboxes, I'm facing the same issue. Is there any solution?
Can you show your gRPC code? I want to learn how to serve xDS configuration from a gRPC server.