I'm not sure if this is related to #326; I'm referencing that issue since it has the same error message, but on the face of it they seem to be different.
I'm trying to get the address field in a listener to work, and I'm unable to do that. I've written a simple shell script as a test harness (it will only work on Linux) that does the following -
It spins up an envoyct1 container with a default config, installs the curl and python packages in it, and starts a python server as the service/1 backend. When I docker exec into envoyct1 and fire curl <VIP1>/service/1, I expect to get a 404. But I see this error instead -
bash-4.3# curl 192.45.67.90/service/1
upstream connect error or disconnect/reset before headersbash-4.3#
If I spin up a python server on a different port, and curl the IP:port, it works -
bash-4.3# python -m SimpleHTTPServer 9002
Serving HTTP on 0.0.0.0 port 9002 ...
192.45.67.90 - - [22/Apr/2017 01:44:27] "GET / HTTP/1.1" 200 -
bash-4.3# curl 192.45.67.90:9002
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"><html>
<title>Directory listing for /</title>
<body>
<h2>Directory listing for /</h2>
<hr>
<ul>
<li><a href=".dockerenv">.dockerenv</a>
<li><a href="bin/">bin/</a>
<li><a href="dev/">dev/</a>
<li><a href="etc/">etc/</a>
<li><a href="home/">home/</a>
<li><a href="lib/">lib/</a>
<li><a href="lib64/">lib64/</a>
<li><a href="media/">media/</a>
<li><a href="mnt/">mnt/</a>
<li><a href="proc/">proc/</a>
<li><a href="root/">root/</a>
<li><a href="run/">run/</a>
<li><a href="sbin/">sbin/</a>
<li><a href="srv/">srv/</a>
<li><a href="sys/">sys/</a>
<li><a href="tmp/">tmp/</a>
<li><a href="usr/">usr/</a>
<li><a href="var/">var/</a>
</ul>
<hr>
</body>
</html>
bash-4.3#
So this doesn't look like a network configuration issue (the curl is being issued from inside the envoy container).
Is this an Envoy config issue, or something else?
When I tried to debug this using gdb and a debug Envoy build, it looked like the worker thread handling the connection request, somewhere in the connection_manager_impl.cc chain, sees a socket close event and therefore emits this error. I'm not sure why it should see a socket close event.
Am I doing something wrong with the config? Can someone please take a look?
BTW, it doesn't matter if I have one or two listeners in my config file. It's the same result. Also, it doesn't matter whether I plumb the VIPs or not - using a simple 127.0.0.10 loopback IP yields the same result.
I'm attaching the harness as a zip file. Unzip it and simply run ./setup_ifaces.sh, and it'll spin up an envoy alpine container and do the rest of the plumbing. If you fire ./setup_ifaces.sh ubuntu, it will pull the lyft/envoy ubuntu image instead and do the same stuff there.
So basically, this happens across ubuntu/alpine, loopback/eth0. Any pointers/help would be much appreciated.
Thanks!
"upstream connect error or disconnect/reset before headers" means that Envoy cannot connect to the upstream that is being routed to. Your listener config is probably fine. I would use a combination of the /stats and /clusters admin endpoint output to debug further, and verify that you can connect to your backend services from within the Envoy container.
@mattklein123 Thanks for taking a look! The text below is a bit long owing to the outputs I've pasted - thanks in advance for reading through them!
When I look at the /clusters output, I see service1 and service2 there, with a series of entries for 127.0.0.1:9001 (the python backend service), and its cx_connect_fail stat is 0 - if this were a connectivity issue on the Envoy side, that shouldn't be 0, correct?
bash-4.3# curl 127.0.0.10:8001/clusters
service1::default_priority::max_connections::1024
service1::default_priority::max_pending_requests::1024
service1::default_priority::max_requests::1024
service1::default_priority::max_retries::3
service1::high_priority::max_connections::1024
service1::high_priority::max_pending_requests::1024
service1::high_priority::max_requests::1024
service1::high_priority::max_retries::3
service1::127.0.0.1:9001::cx_active::0
service1::127.0.0.1:9001::cx_connect_fail::0
service1::127.0.0.1:9001::cx_total::0
service1::127.0.0.1:9001::rq_active::0
service1::127.0.0.1:9001::rq_timeout::0
service1::127.0.0.1:9001::rq_total::0
service1::127.0.0.1:9001::health_flags::healthy
service1::127.0.0.1:9001::weight::1
service1::127.0.0.1:9001::zone::
service1::127.0.0.1:9001::canary::false
service1::127.0.0.1:9001::success_rate::-1
service2::default_priority::max_connections::1024
service2::default_priority::max_pending_requests::1024
service2::default_priority::max_requests::1024
service2::default_priority::max_retries::3
service2::high_priority::max_connections::1024
service2::high_priority::max_pending_requests::1024
service2::high_priority::max_requests::1024
service2::high_priority::max_retries::3
bash-4.3#
In the /stats output, I see some counters that seem relevant here; I'm pasting only those below (and the complete output in a separate excerpt after that) -
cluster.service1.max_host_weight: 1
cluster.service1.membership_change: 1
cluster.service1.membership_healthy: 1
cluster.service1.membership_total: 1
cluster.service1.update_attempt: 49
cluster.service1.update_failure: 0
cluster.service1.update_success: 49
The membership_healthy value shows 1, which I infer means that Envoy is able to see the backend service1 host in the cluster - is that the case?
What are the update attempts referring to? They also seem to have gone through successfully 100% of the time (49 attempts).
complete output -
bash-4.3# curl 127.0.0.10:8001/stats | grep service1
cluster.service1.lb_healthy_panic: 0
cluster.service1.lb_local_cluster_not_ok: 0
cluster.service1.lb_recalculate_zone_structures: 0
cluster.service1.lb_zone_cluster_too_small: 0
cluster.service1.lb_zone_no_capacity_left: 0
cluster.service1.lb_zone_number_differs: 0
cluster.service1.lb_zone_routing_all_directly: 0
cluster.service1.lb_zone_routing_cross_zone: 0
cluster.service1.lb_zone_routing_sampled: 0
cluster.service1.max_host_weight: 1
cluster.service1.membership_change: 1
cluster.service1.membership_healthy: 1
cluster.service1.membership_total: 1
cluster.service1.update_attempt: 49
cluster.service1.update_failure: 0
cluster.service1.update_success: 49
cluster.service1.upstream_cx_active: 0
cluster.service1.upstream_cx_close_header: 0
cluster.service1.upstream_cx_connect_fail: 0
cluster.service1.upstream_cx_connect_timeout: 0
cluster.service1.upstream_cx_destroy: 0
cluster.service1.upstream_cx_destroy_local: 0
cluster.service1.upstream_cx_destroy_local_with_active_rq: 0
cluster.service1.upstream_cx_destroy_remote: 0
cluster.service1.upstream_cx_destroy_remote_with_active_rq: 0
cluster.service1.upstream_cx_destroy_with_active_rq: 0
cluster.service1.upstream_cx_http1_total: 0
cluster.service1.upstream_cx_http2_total: 0
cluster.service1.upstream_cx_max_requests: 0
cluster.service1.upstream_cx_none_healthy: 0
cluster.service1.upstream_cx_overflow: 0
cluster.service1.upstream_cx_protocol_error: 0
cluster.service1.upstream_cx_rx_bytes_buffered: 0
cluster.service1.upstream_cx_rx_bytes_total: 0
cluster.service1.upstream_cx_total: 0
cluster.service1.upstream_cx_tx_bytes_buffered: 0
cluster.service1.upstream_cx_tx_bytes_total: 0
cluster.service1.upstream_rq_active: 0
cluster.service1.upstream_rq_cancelled: 0
cluster.service1.upstream_rq_maintenance_mode: 0
cluster.service1.upstream_rq_pending_active: 0
cluster.service1.upstream_rq_pending_failure_eject: 0
cluster.service1.upstream_rq_pending_overflow: 0
cluster.service1.upstream_rq_pending_total: 0
cluster.service1.upstream_rq_per_try_timeout: 0
cluster.service1.upstream_rq_retry: 0
cluster.service1.upstream_rq_retry_overflow: 0
cluster.service1.upstream_rq_retry_success: 0
cluster.service1.upstream_rq_rx_reset: 0
cluster.service1.upstream_rq_timeout: 0
cluster.service1.upstream_rq_total: 0
cluster.service1.upstream_rq_tx_reset: 0
bash-4.3#
The thing is, I'm able to curl the backend service (it runs in the same container as Envoy) from within the Envoy container, on both VIPs and on localhost, without any issues -
Here's the netstat output to begin with -
bash-4.3# hostname
a24847dd0490
bash-4.3# netstat -apn
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:8001 0.0.0.0:* LISTEN 39/envoy
tcp 0 0 0.0.0.0:9001 0.0.0.0:* LISTEN 47/python
tcp 0 0 192.45.67.90:80 0.0.0.0:* LISTEN 39/envoy
tcp 0 0 192.45.67.89:80 0.0.0.0:* LISTEN 39/envoy
tcp 3 0 127.0.0.1:10000 0.0.0.0:* LISTEN 1/envoy
tcp 85 0 127.0.0.1:10000 127.0.0.1:43004 CLOSE_WAIT -
tcp 88 0 127.0.0.1:10000 127.0.0.1:43006 CLOSE_WAIT -
tcp 80 0 127.0.0.1:10000 127.0.0.1:43002 CLOSE_WAIT -
udp 0 0 172.17.0.3:55605 10.254.58.55:53 ESTABLISHED 39/envoy
udp 0 0 172.17.0.3:59536 10.241.16.126:53 ESTABLISHED 39/envoy
udp 0 0 172.17.0.3:47713 10.254.58.54:53 ESTABLISHED 39/envoy
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags Type State I-Node PID/Program name Path
unix 2 [ ] DGRAM 657299 39/envoy @envoy_domain_socket_1
unix 2 [ ] DGRAM 666235 1/envoy @envoy_domain_socket_0
bash-4.3#
Here's ps -
bash-4.3# ps aux
PID USER TIME COMMAND
1 root 0:00 /usr/local/bin/envoy -c /usr/local/conf/envoy/google_com_proxy.json
39 root 0:00 /usr/local/bin/envoy -c /usr/local/conf/envoy/envoy-multiple-listener-config.json --restart-epoch 1
47 root 0:00 python -m SimpleHTTPServer 9001
61 root 0:00 bash
83 root 0:00 ps aux
bash-4.3#
Now I'm hitting the backend python SimpleHTTPServer directly on its port 9001 via one of the VIPs (192.45.67.89) -
bash-4.3# curl 192.45.67.89:9001
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"><html>
<title>Directory listing for /</title>
<body>
<h2>Directory listing for /</h2>
<hr>
<ul>
<li><a href=".dockerenv">.dockerenv</a>
<li><a href="bin/">bin/</a>
<li><a href="dev/">dev/</a>
<li><a href="etc/">etc/</a>
<li><a href="home/">home/</a>
<li><a href="lib/">lib/</a>
<li><a href="lib64/">lib64/</a>
<li><a href="media/">media/</a>
<li><a href="mnt/">mnt/</a>
<li><a href="proc/">proc/</a>
<li><a href="root/">root/</a>
<li><a href="run/">run/</a>
<li><a href="sbin/">sbin/</a>
<li><a href="srv/">srv/</a>
<li><a href="sys/">sys/</a>
<li><a href="tmp/">tmp/</a>
<li><a href="usr/">usr/</a>
<li><a href="var/">var/</a>
</ul>
<hr>
</body>
</html>
bash-4.3#
Next, I'm hitting the backend python SimpleHTTPServer directly on port 9001 via the other VIP (192.45.67.90), again from within the Envoy container -
bash-4.3# curl 192.45.67.90:9001
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"><html>
<title>Directory listing for /</title>
<body>
<h2>Directory listing for /</h2>
<hr>
<ul>
<li><a href=".dockerenv">.dockerenv</a>
<li><a href="bin/">bin/</a>
<li><a href="dev/">dev/</a>
<li><a href="etc/">etc/</a>
<li><a href="home/">home/</a>
<li><a href="lib/">lib/</a>
<li><a href="lib64/">lib64/</a>
<li><a href="media/">media/</a>
<li><a href="mnt/">mnt/</a>
<li><a href="proc/">proc/</a>
<li><a href="root/">root/</a>
<li><a href="run/">run/</a>
<li><a href="sbin/">sbin/</a>
<li><a href="srv/">srv/</a>
<li><a href="sys/">sys/</a>
<li><a href="tmp/">tmp/</a>
<li><a href="usr/">usr/</a>
<li><a href="var/">var/</a>
</ul>
<hr>
</body>
</html>
bash-4.3#
But when I try to go via the VIP on port 80 -
bash-4.3# curl -vvv 192.45.67.90:80/service/1
* Trying 192.45.67.90...
* TCP_NODELAY set
* Connected to 192.45.67.90 (192.45.67.90) port 80 (#0)
> GET /service/1 HTTP/1.1
> Host: 192.45.67.90
> User-Agent: curl/7.52.1
> Accept: */*
>
< HTTP/1.1 503 Service Unavailable
< content-length: 57
< content-type: text/plain
< date: Sun, 23 Apr 2017 21:34:13 GMT
< server: envoy
<
* Curl_http_done: called premature == 0
* Connection #0 to host 192.45.67.90 left intact
upstream connect error or disconnect/reset before headersbash-4.3#
Why is Envoy returning a 503 when it should be able to reach the backend service?
Finally, the admin access log doesn't show any new entries when I issue a curl on the VIP/service/1 path - I'm guessing that is expected. Are there any other logs I can enable to view Envoy connection activity? (One option is sketched after the log excerpt below.)
bash-4.3# curl 192.45.67.90/service/1
upstream connect error or disconnect/reset before headersbash-4.3#
bash-4.3# cat /var/log/envoy/admin_access.log
[2017-04-22T00:08:14.583Z] "GET / HTTP/1.1" 404 - 0 530 0 - "172.17.0.3" "curl/7.52.1" "-" "127.0.0.10:8001" "-"
[2017-04-23T21:29:19.803Z] "GET /clusters HTTP/1.1" 200 - 0 1195 0 - "172.17.0.3" "curl/7.52.1" "-" "127.0.0.1:8001" "-"
[2017-04-23T21:29:35.917Z] "GET /admin HTTP/1.1" 404 - 0 530 0 - "172.17.0.3" "curl/7.52.1" "-" "127.0.0.1:8001" "-"
[2017-04-23T21:29:49.917Z] "GET /server_info HTTP/1.1" 200 - 0 36 0 - "172.17.0.3" "curl/7.52.1" "-" "127.0.0.1:8001" "-"
[2017-04-23T21:44:24.729Z] "GET / HTTP/1.1" 404 - 0 530 0 - "172.17.0.3" "curl/7.52.1" "-" "127.0.0.10:8001" "-"
[2017-04-23T21:44:34.330Z] "GET /clusters HTTP/1.1" 200 - 0 1195 0 - "172.17.0.3" "curl/7.52.1" "-" "127.0.0.10:8001" "-"
[2017-04-23T21:47:00.993Z] "GET /admin HTTP/1.1" 404 - 0 530 0 - "172.17.0.3" "curl/7.52.1" "-" "127.0.0.10:8001" "-"
[2017-04-23T21:47:03.843Z] "GET /stats HTTP/1.1" 200 - 0 9511 0 - "172.17.0.3" "curl/7.52.1" "-" "127.0.0.10:8001" "-"
[2017-04-23T21:47:12.991Z] "GET /stats HTTP/1.1" 200 - 0 9510 0 - "172.17.0.3" "curl/7.52.1" "-" "127.0.0.10:8001" "-"
[2017-04-23T21:47:57.548Z] "GET /stats HTTP/1.1" 200 - 0 9510 0 - "172.17.0.3" "curl/7.52.1" "-" "127.0.0.10:8001" "-"
bash-4.3#
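For what it's worth, the admin access log only records requests to the admin interface itself, so requests proxied through the listeners won't appear in it. One way to see per-connection activity (a sketch, using the binary and config paths from the ps output above) is to run Envoy with a more verbose log level via the -l/--log-level flag:

/usr/local/bin/envoy -c /usr/local/conf/envoy/envoy-multiple-listener-config.json -l debug

An access_log entry can also be added to the http_connection_manager filter config to log proxied requests to a file.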
I skimmed through this quickly and I don't see any calls to service1 at all in the stats above, so the requests are probably going to service2. I can't tell without seeing the full config, a full dump of /stats, and a full dump of the /clusters output. I won't be able to help you further in this issue. If someone else doesn't help, I would try Gitter for more interactive help. This is a configuration or Docker setup issue.
Np, thanks @mattklein123 ! I'll post this on gitter.
@mattklein123 This issue was caused by the fact that I plumbed subinterfaces but didn't configure any routing for them. Switching to the docker network create and connect commands resolved the connectivity issues, and we were able to bring up multiple listeners. Thanks for your help on this!
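For anyone hitting the same thing, a rough sketch of the docker network approach described above (the network, container, and image names here are placeholders, not the ones from the attached harness):

docker network create envoy-test-net                      # user-defined bridge with built-in routing and DNS
docker run -d --name envoyct1 --network envoy-test-net lyft/envoy
docker run -d --name backend1 --network envoy-test-net some-backend-image
docker network connect envoy-test-net existing-container  # or attach an already-running container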
@vijayendrabvs I am running into the same problem. My golang service is accessible from within the service container on port 9096 but not accessible through the envoy front-proxy container, with exactly the same response as you reported.
Can you provide any details on the resolution please?
I'm running into the same issue today.
I can access the service from its container using curl, but I'm not able to access it through the Envoy container via http://localhost:10000/symphony.
My envoy.yaml
static_resources:
listeners:
Same issue here - is there some way to solve this?
I can wget -qO- localhost:80/ping my service from within the container, but when I curl the ingress I get the error: upstream connect error or disconnect/reset before headers.
I resolved my issue by removing http2_protocol_options: {}
Where did you change that option?
Share your Envoy config file. I will take a look.
Envoy is used in my Istio container. But I don't know where to find that config file.
@danesavot ,
Where can I find the Envoy config file in the istio-proxy container, and how do I change it?
Do we need to modify it and build the istio-proxy container ourselves?
@AmerbankDavd Have you resolved this?
In my istio 0.5.1, there is no http2_protocol_options: {} at all.
kubectl exec -ti istio-pilot-676d495bf8-9c2px -c istio-proxy -n istio-system -- cat /etc/istio/proxy/envoy_pilot.json
{
"listeners": [
{
"address": "tcp://0.0.0.0:15003",
"name": "tcp_0.0.0.0_15003",
"filters": [
{
"type": "read",
"name": "tcp_proxy",
"config": {
"stat_prefix": "tcp",
"route_config": {
"routes": [
{
"cluster": "in.8080"
}
]
}
}
}
],
"bind_to_port": true
}
],
"admin": {
"access_log_path": "/dev/stdout",
"address": "tcp://127.0.0.1:15000"
},
"cluster_manager": {
"clusters": [
{
"name": "in.8080",
"connect_timeout_ms": 1000,
"type": "static",
"lb_type": "round_robin",
"hosts": [
{
"url": "tcp://127.0.0.1:8080"
}
]
}
]
}
}
I have added all the recommended changes to get the hello world example to run, in this repo: https://github.com/oinke/gprc-hello
The terminal still shows:
server_1 | E0302 08:41:34.225022613 7 http_server_filter.cc:271] GET request without QUERY
and when I browse to localhost:8080 I see
upstream connect error or disconnect/reset before headers. reset reason: remote reset
Running on macOS Mojave 10.14.2 with Docker version 18.09.2, build 6247962
@oinke It seems like your issue is not related to this one. I have posted a PR (https://github.com/oinke/gprc-hello/pull/1) to your repo.
@danesavot I also resolved this by commenting out the empty http2 options. Huge thanks!
# http2_protocol_options: {}
Outside of that, for everyone else: if you're running containers on the host, check out Docker networking: https://docs.docker.com/network/network-tutorial-standalone/
I created a custom Docker bridge network, ran the other containers with --network, then jumped into the Envoy container and made sure I could curl them by name.
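A rough sketch of that check, assuming a backend listening on port 9096 as mentioned earlier in the thread (the network, container, and image names are illustrative, and curl has to be installed in the Envoy image):

docker network create front-proxy-net
docker run -d --name backend --network front-proxy-net my-backend-image
docker run -d --name envoy --network front-proxy-net -p 10000:10000 my-envoy-image
docker exec -it envoy curl -v http://backend:9096/   # containers on the same user-defined bridge resolve each other by name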
The empty http2_protocol_options entry was from the Envoy tutorial; presumably it breaks things because setting it makes Envoy speak HTTP/2 to the upstream cluster, which an HTTP/1.1-only backend will reset.