I'm seeing something very strange with an app that I'm testing out in a service mesh environment. Here's what the setup looks like:
- Egress to *.github.com over TLS
- URLs in the code use the http://...:443 syntax

For reference you can assume that the service's code to get an OAuth token looks very similar to this. What's strange is that the exact same code using the same client_id, client_secret, and code (the one that GitHub gives you in the callback) works locally. I verified this by spinning up the Python REPL locally, copying and pasting the same code that was deployed to AWS into the REPL, providing the client_id, client_secret, and code (I grabbed the code from testing the live deployment as part of my debugging), and executing it all. I got a 200 response along with the OAuth token.
As the title says, this exact same code and execution flow fails with a 404 (404 Not Found) back to the app when it runs in the mesh. I guess you could say the code is not 100% the same as the local test I did, since the URLs need to be adjusted for the egress (for example, http://github.com:443/login/oauth/access_token instead of https://github.com/login/oauth/access_token).
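The URL adjustment described above can be sketched as a small helper. This is only an illustration of the rewrite (https://host/path becomes http://host:443/path); the function name `to_egress_url` is hypothetical and is not part of the app's actual code.

```python
from urllib.parse import urlsplit, urlunsplit


def to_egress_url(url: str) -> str:
    """Rewrite an https:// URL to plain http on port 443 (hypothetical helper).

    The sidecar is then expected to originate TLS to the external host.
    """
    parts = urlsplit(url)
    if parts.scheme != "https":
        return url  # only HTTPS URLs need the rewrite
    port = parts.port or 443
    netloc = f"{parts.hostname}:{port}"
    return urlunsplit(("http", netloc, parts.path, parts.query, parts.fragment))


print(to_egress_url("https://github.com/login/oauth/access_token"))
```

Note that the rewrite also changes the Host header the client sends (github.com:443 rather than github.com), which is what the proxy sees when it picks a route.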
Is there something that Envoy is not handling or is stripping out from these kinds of requests?
@aaronjwood your route rules are likely not set up correctly, but it's hard to say without looking at the configs. If this is via Istio, I'd recommend asking the Istio folks for help.
Sure, I'll reach out to them about this. I'll share my configs in this issue once I'm back on Monday too.
FWIW I'm only using an egress, no routing rules:
apiVersion: config.istio.io/v1alpha2
kind: EgressRule
metadata:
  name: github-egress
spec:
  destination:
    service: "*.github.com"
  ports:
    - port: 443
      protocol: https
Envoy should preserve any form data, headers, etc. that the app passes, right?
I've just confirmed that taking Envoy out of the equation solves the issue. Once Envoy is no longer in place and the URLs in the code are changed back to https://github.com/... everything works. I just tried launching everything with --debug --log_output_level debug and didn't see anything in the logs other than a libprotobuf info log and a normal HTTP failure:
[libprotobuf INFO external/mixerclient_git/src/check_cache.cc:155] Add a new Referenced for check cache: Absence-keys: request.headers[cookie], source.uid, Exact-keys: context.protocol, destination.ip, destination.service, destination.uid, request.path, source.ip,
[2018-02-26T18:11:18.305Z] "POST /login/oauth/access_token?client_secret=<MY_SECRET>&code=<MY_CODE>&client_id=<MY_ID> HTTP/1.1" 404 NR 0 0 2 - "-" "python-requests/2.18.4" "3a77c52a-bec7-970e-bab5-e4d417d76df3" "github.com:443" "-"
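The access log line above can be split into named fields to make it easier to read. The sketch below assumes the line follows Envoy's default access log format (start time, request line, response code, response flags, bytes received, bytes sent, duration, upstream service time, x-forwarded-for, user agent, request ID, authority, upstream host); the regex and field names are illustrative, not an Envoy API. In that format, the `NR` response flag means no route was configured for the request, and an upstream host of `-` means the request was never forwarded.

```python
import re

# Field order assumed from Envoy's default access log format.
LOG_LINE = re.compile(
    r'\[(?P<start_time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<code>\d+) (?P<flags>\S+) '
    r'(?P<bytes_received>\d+) (?P<bytes_sent>\d+) (?P<duration_ms>\d+) '
    r'(?P<upstream_time>\S+) '
    r'"(?P<forwarded_for>[^"]*)" "(?P<user_agent>[^"]*)" '
    r'"(?P<request_id>[^"]*)" "(?P<authority>[^"]*)" "(?P<upstream_host>[^"]*)"'
)

# The log line from the thread, with the same placeholders for secrets.
line = (
    '[2018-02-26T18:11:18.305Z] "POST /login/oauth/access_token'
    '?client_secret=<MY_SECRET>&code=<MY_CODE>&client_id=<MY_ID> HTTP/1.1" '
    '404 NR 0 0 2 - "-" "python-requests/2.18.4" '
    '"3a77c52a-bec7-970e-bab5-e4d417d76df3" "github.com:443" "-"'
)

fields = LOG_LINE.match(line).groupdict()
print(fields["code"], fields["flags"], fields["authority"], fields["upstream_host"])
```

Read this way, the 404 is generated by the proxy itself for the authority github.com:443 and does not come from GitHub.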
I spun up a local instance of Envoy and tested the OAuth flow manually via curl which worked. I pretty much followed this and modified the config:
admin:
  access_log_path: /tmp/admin_access.log
  address:
    socket_address: { address: 0.0.0.0, port_value: 9901 }
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address: { address: 0.0.0.0, port_value: 10000 }
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        config:
          stat_prefix: ingress_http
          codec_type: AUTO
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { host_rewrite: github.com, cluster: service_google }
          http_filters:
          - name: envoy.router
  clusters:
  - name: service_google
    connect_timeout: 0.25s
    type: LOGICAL_DNS
    # Comment out the following line to test on v6 networks
    dns_lookup_family: V4_ONLY
    lb_policy: ROUND_ROBIN
    hosts: [{ socket_address: { address: github.com, port_value: 443 }}]
    tls_context: { sni: github.com }
curl -d "client_id=MYID&client_secret=MYSECRET&code=MYCODE" -X POST http://localhost:10000/login/oauth/access_token
access_token=MYTOKEN&scope=read%3Auser&token_type=bearer
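The response body above is form-encoded, so it can be decoded with the standard library. A minimal sketch, using the placeholder values from the curl output rather than real credentials:

```python
from urllib.parse import parse_qs

# Placeholder response body copied from the curl output above.
body = "access_token=MYTOKEN&scope=read%3Auser&token_type=bearer"

# parse_qs returns a list per key; each key appears once here.
token_fields = {key: values[0] for key, values in parse_qs(body).items()}
print(token_fields)
```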
curl 127.0.0.1:9901/routes
[
]
curl localhost:15000/routes
...
{
"version_info": "hash_647cee278e53d1e3",
"route_config_name": "443",
"cluster_name": "rds",
"route_table_dump": {"name":"443","virtual_hosts":[{"name":"*.github.com:443","domains":["*.github.com","*.github.com:443"],"routes":[{"match":{"prefix":"/"},"route":{"cluster":"out.*.github.com|external-HTTPS-443","timeout":"0s"},"decorator":{"operation":"default-route"}}]},{"name":"public-api.adsbexchange.com:443","domains":["public-api.adsbexchange.com","public-api.adsbexchange.com:443"],"routes":[{"match":{"prefix":"/"},"route":{"cluster":"out.public-api.adsbexchange.com|external-HTTPS-443","timeout":"0s"},"decorator":{"operation":"default-route"}}]}]}
}
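To reason about which virtual host in the dump above would be selected for a given Host header, here is a simplified model of domain matching: exact match first, then suffix wildcards such as `*.github.com` (longest suffix wins), then a bare `*`. This is an assumption-laden sketch and not Envoy's implementation; prefix wildcards are omitted and the function name is hypothetical.

```python
def match_virtual_host(host, virtual_hosts):
    """Return the name of the matching virtual host, or None (simplified model)."""
    host = host.lower()
    # 1. Exact domain match.
    for vh in virtual_hosts:
        if host in vh["domains"]:
            return vh["name"]
    # 2. Suffix wildcard ("*.example.com"): the host must be strictly longer
    #    than the suffix, so the bare domain itself does not match.
    best = None
    for vh in virtual_hosts:
        for domain in vh["domains"]:
            if domain.startswith("*") and len(domain) > 1:
                suffix = domain[1:]
                if host.endswith(suffix) and len(host) > len(suffix):
                    if best is None or len(suffix) > len(best[0]):
                        best = (suffix, vh["name"])
    if best is not None:
        return best[1]
    # 3. Catch-all "*".
    for vh in virtual_hosts:
        if "*" in vh["domains"]:
            return vh["name"]
    return None


# Virtual hosts taken from the route_table_dump above (routes omitted).
vhosts = [
    {"name": "*.github.com:443",
     "domains": ["*.github.com", "*.github.com:443"]},
    {"name": "public-api.adsbexchange.com:443",
     "domains": ["public-api.adsbexchange.com", "public-api.adsbexchange.com:443"]},
]

for host in ("github.com:443", "api.github.com:443", "public-api.adsbexchange.com:443"):
    print(host, "->", match_virtual_host(host, vhosts))
```

Under this model, a request with Host: github.com:443 matches none of the listed domains, while api.github.com:443 matches the wildcard entry. Whether that fully explains the 404 in the deployed sidecar is not established by this sketch.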
This looks like the egress for https issue. Your best bet is to use includeIPRange
(https://istio.io/docs/tasks/traffic-management/egress.html#calling-external-services-directly)
so that outgoing traffic is not intercepted, unless you are able to change the client to use http on port 443.
I am able to change the client to do http over port 443. I have verified that the TLS handshake from Envoy to GitHub succeeds. Before when I hadn't changed the URLs to something like http://github.com:443/... I would get handshake errors.
A few small updates: I wrongly thought that I was using Istio 0.5.0 in my original deployment when it is in fact 0.4.0. Also, when I exec into the container where these issues are happening and try to curl any part of GitHub I get 404s. Why would Envoy return 404s for paths that actually exist (such as / on GitHub)?
curl -vv http://github.com:443
* Rebuilt URL to: http://github.com:443/
* Hostname was NOT found in DNS cache
* Trying 192.30.255.112...
* Connected to github.com (192.30.255.112) port 443 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.38.0
> Host: github.com:443
> Accept: */*
>
< HTTP/1.1 404 Not Found
< date: Mon, 26 Feb 2018 22:33:34 GMT
* Server envoy is not blacklisted
< server: envoy
< content-length: 0
<
* Connection #0 to host github.com left intact
Yes, you will get a 404 for everything that isn't configured (i.e. without egress in this case, unless you use includeIPRange as mentioned above).
Are you saying that the egress isn't applied properly in my case? I do have an egress config installed in the cluster (I posted what it looks like a few comments above) so I would think it should work...
the curl is from your injected pod ?
I have a similar rule that does work:
https://github.com/istio/admin-sites/blob/master/fortio/fortio.yaml#L11
I don't use a wildcard but I think wildcard should work - just to be sure though, mind dropping the wildcard ?
Yup, that's right. I actually tried dropping the wildcard earlier today and instead added two entries, one for github.com and another for api.github.com. I saw the same result.
which istio version?
0.4.0.
FWIW when I tried out a local test with a standalone instance of Envoy (https://github.com/envoyproxy/envoy/issues/2670#issuecomment-368642900) I was using the latest version of Envoy.
I don't think egress was working in 0.4; can you try our 0.6 release candidate:
http://gcsweb.istio.io/gcs/istio-prerelease/daily-build/0.6.0-pre20180222-22-57-06/
download the TESTONLY tar matching your client OS
Thanks, I'll give this a try and circle back.
@ldemailly was egress only partially working in 0.4.0? I have another microservice that has an egress defined:
apiVersion: config.istio.io/v1alpha2
kind: EgressRule
metadata:
  name: adsb-egress
spec:
  destination:
    service: public-api.adsbexchange.com
  ports:
    - port: 443
      protocol: https
and it works (I'm inside the pod here):
curl -s -i http://public-api.adsbexchange.com:443/VirtualRadar/AircraftList.json | more
HTTP/1.1 200 OK
date: Mon, 26 Feb 2018 23:33:06 GMT
content-type: application/json
content-length: 4671861
...
what's the difference between where it works and where it doesn't ? (wildcard maybe?)
I tried without the wildcard. I guess the only other difference is that the request is a POST with some headers to accept JSON back. The request I just tried was a basic GET.
I spoke with @rshriram a little about this and he mentioned that Istio is not setting the SNI field. Sounds like that may be the issue...
@aaronjwood friendly request. Do you mind potentially moving this issue over to the Istio repo? I don't think this has anything to do with Envoy specifically.
@mattklein123 @aaronjwood suspects it has to do with envoy not providing the SNI field when it makes the TLS connection; if that's confirmed it is an envoy issue no ?
Istio needs to configure upstream SNI if that is the issue: https://www.envoyproxy.io/docs/envoy/latest/api-v1/cluster_manager/cluster_ssl "sni"
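To illustrate what "configuring upstream SNI" changes on the wire, the sketch below uses Python's standard `ssl` module (not Envoy) to generate a TLS ClientHello in memory, with and without a server name. No network connection is made. The helper name `client_hello` is hypothetical, and hostname checking is disabled only so that the no-SNI case can be constructed.

```python
import ssl


def client_hello(server_hostname=None):
    """Return the raw ClientHello bytes a TLS client would send (no network I/O)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Only needed so a client without a server name can be built here.
    ctx.check_hostname = False
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the ClientHello has been written and the client is now
        # waiting for a server reply that never arrives.
        pass
    return outgoing.read()


with_sni = client_hello("github.com")
without_sni = client_hello()
print(b"github.com" in with_sni)
print(b"github.com" in without_sni)
```

The server name travels in the ClientHello, so a server that hosts several names behind one address can only pick the right certificate and backend if the client (here, the proxy) fills it in.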
thanks matt ! we can possibly close this issue in favor of the above istio one,
we seem to send an empty ssl_context for external services so I guess that's one issue
this being said, could we have an "automatic" mode where it uses the Authority/Host: ?
because for wildcard it will probably not work otherwise, would it ?
@mattklein123 Following https://github.com/envoyproxy/envoy/issues/2670#issuecomment-368726197. For orig_dst clusters, we cannot set SNI since we don't know the host in advance. Can Envoy set SNI automatically according to the Authority/Host headers in case the "sni" field is omitted?
@ldemailly @vadimeisenbergibm because of how Envoy does connection pooling, setting SNI based on host header is not trivial. I think we could likely figure something out but it will require a bunch of thinking. I'm going to go ahead and close this issue out. Please open a fresh issue requesting setting SNI based on host/authority and we can discuss there.
@mattklein123 I have created an example of using Envoy with NGINX, to solve this and other issues: https://github.com/vadimeisenbergibm/envoy-generic-forward-proxy . It shows how Envoy can function as a _generic_ forward proxy, _generic_ means being a proxy for arbitrary hosts.
Any comments are welcome.