Title: Envoy does not adhere to HTTP/2 RFC 7540
Description:
RFC 7540 Sections 9.1.1 and 9.1.2 specify that when a request arrives on a re-used HTTP/2 connection at a server that is authoritative for the hostname but is not the intended origin, the server should return a 421 (Misdirected Request) response. This can happen when two servers share an IP address and are distinguished by SNI: one (e.g., a.example.com) presents a wildcard certificate for *.example.com, while another (b.example.com) presents a non-wildcard certificate. Because the wildcard certificate also covers b.example.com, requests meant for b.example.com will be sent down the re-used HTTP/2 connection for a.example.com. In this situation a.example.com should send back a 421 to indicate the request was destined for b.example.com, which forces the browser to establish a new connection, re-negotiate SNI (and thus reach the correct backing server), and route the request to the correct origin.
@alyssawilk @PiotrSikora thoughts on this? I haven't read the relevant RFCs in detail to fully understand what is needed.
Intersection of H2 specs and TLS handshake? I eagerly anticipate Piotr sorting this out :-P
The gist is that browsers coalesce HTTP/2 connections pretty aggressively: when a browser opens a connection to www.example.com and is presented with a certificate for *.example.com during the TLS handshake, it will re-use this connection for all requests to *.example.com hostnames, as long as the hostname resolves to the same IP (and some browsers don't even care about that).
As long as all *.example.com hostnames are served by the same listener/filter chain, this shouldn't be an issue in Envoy, since routing happens on a per-request, not per-connection, basis (please correct me if I'm wrong).
However, if www.example.com (with the *.example.com certificate) is served by one listener/filter chain, and app.example.com is served by another, then we have an issue: connections are latched to a single listener/filter chain for the lifetime of the connection (again, please correct me if I'm wrong), so if the connection to www.example.com is established first, requests to app.example.com will be coalesced onto that connection, use the configuration for www.example.com, and be forwarded to the wrong backend.
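Concretely, this scenario corresponds to a listener whose filter chains are selected by SNI, roughly like the fragment below (a sketch: only the match criteria are shown, and the TLS contexts, certificates, and HTTP connection managers for each chain are assumed and omitted). The chain is chosen once, from the SNI sent during the TLS handshake, so a coalesced request for app.example.com riding on the existing www.example.com connection never reaches the second chain:

  "filter_chains": [
    {
      "filter_chain_match": { "server_names": ["www.example.com"] }
    },
    {
      "filter_chain_match": { "server_names": ["app.example.com"] }
    }
  ]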
One solution would be to send 421 Misdirected Request response to requests for hostnames that are not configured on a given listener/filter chain (but this wouldn't work if *.example.com is configured), or send 421 Misdirected Request response to requests for hostnames that are configured on other listeners/filter chains (but this requires a global list of all configured hostnames).
Another solution would be to use the HTTP/2 ORIGIN frame (RFC 8336) to advertise the allowed hostnames on a given listener/filter chain (but this requires a global list as well, and the extension is supported by only a few clients).
Is it possible to reprioritize this issue? We have a use case where we have thousands of services behind hundreds of FQDNs that are served by a set of identical Envoys (all using a wildcard TLS cert). We hit this exact issue when HTTP/2 is enabled, but not when enforcing usage of HTTP/1.1.
Here's the CVE for this vulnerability https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-11767
CC @envoyproxy/security-team
One solution would be to send 421 Misdirected Request response to requests for hostnames that are not configured on a given listener/filter chain (but this wouldn't work if *.example.com is configured), or send 421 Misdirected Request response to requests for hostnames that are configured on other listeners/filter chains (but this requires a global list of all configured hostnames).
I can think of 3 alternatives:
1. What if you could specify the HTTP response code for an RBAC filter DENY? Then the management server that configured the HCM could add an RBAC policy for the server names it allows on that HCM and generate a 421 on DENY.
2. The management server could program the SNI server name check into a Lua filter, generating a 421 when it doesn't match (a rough sketch follows below).
3. Add a dedicated filter that can be configured with the acceptable SNI server names.
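A rough sketch of alternative 2, assuming the standard Lua HTTP filter and that each filter chain is meant to serve exactly the hostname whose SNI selected it (a chain that legitimately serves several names would compare against a configured list instead):

  function envoy_on_request(request_handle)
    -- SNI that selected this filter chain during the TLS handshake.
    local sni = request_handle:streamInfo():requestedServerName()
    -- Hostname the request is actually addressed to.
    local authority = request_handle:headers():get(":authority") or ""
    local host = string.gsub(authority, ":%d+$", "")  -- drop an optional :port suffix
    if sni ~= nil and sni ~= "" and host ~= sni then
      -- Coalesced request for a hostname this chain is not meant to serve.
      request_handle:respond({[":status"] = "421"}, "misdirected request")
    end
  end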
What is the plan for fixing https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-11767 in envoy?
AFAIK there is no plan currently. Someone needs to own this issue and drive a resolution if they are passionate about fixing it.
I hacked up a 421 response when the virtual host lookup fails and this works for a simple case that I tried. If this is a reasonable approach I'd need a bit of help to polish it up and figure out how to write tests for it.
https://gist.github.com/jpeach/e01f5f752eed5ffd09ea1f18634d1fc5
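As a point of comparison (this is not what the gist does), a similar effect can be approximated purely in routing configuration by giving each listener a catch-all virtual host that answers 421 for any authority it is not meant to serve, using the standard direct_response route action. The hostnames and the www_backend cluster below are placeholders:

  "route_config": {
    "virtual_hosts": [
      {
        "name": "www",
        "domains": ["www.example.com"],
        "routes": [
          { "match": { "prefix": "/" }, "route": { "cluster": "www_backend" } }
        ]
      },
      {
        "name": "misdirected",
        "domains": ["*"],
        "routes": [
          { "match": { "prefix": "/" }, "direct_response": { "status": 421 } }
        ]
      }
    ]
  }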
I think I managed to find a workaround:
On the Envoy instance I added an "envoy.lua" HTTP filter that checks whether the response code is 404 (the code generated for a non-existent route) AND whether the "x-envoy-upstream-service-time" header is NOT present.
The Lua code:
function envoy_on_response(response_handle)
  if response_handle:headers():get(":status") == "404" and response_handle:headers():get("x-envoy-upstream-service-time") == nil then
    response_handle:headers():replace(":status", "421")
  end
end
Example configuration on Envoy (fetched by LDS):
"http_filters": [
{
"name": "envoy.lua",
"typed_config": {
"@type": "type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua",
"inline_code": `
function envoy_on_response(response_handle)
if response_handle:headers():get(":status") == "404" and response_handle:headers():get("x-envoy-upstream-service-time") == nil then
response_handle:headers():replace(":status", "421")
end
end`
}
},
{
"name": "envoy.filters.http.router"
}
]
Nice! That's a bit cleaner than my equivalent 👍
@jpeach @lunighty Is there a plan to have a fix in Envoy code or do you think the current approach with EnvoyFilter is sufficient?