Envoy v1.11.0 removing content-length = 0 header and adding transfer-encoding = chunked header

Created on 30 Jul 2019 · 34 comments · Source: envoyproxy/envoy

This happens when:

  1. Envoy's version is v1.11.0; this does not happen with v1.10.0
  2. A request comes in without a request body, i.e. it has content-length: 0 (so a POST request with a non-empty body does not show this behavior)
  3. Both the ratelimit and ext_authz filters are enabled; if only one of them is enabled, this does not happen
  4. The ratelimit service is called for the request using the envoy.rate_limit filter
  5. The ratelimit service responds with a 200 rather than a 429, i.e. the request does not exceed any rate limits

When all of the above hold, Envoy removes the content-length: 0 header and adds transfer-encoding: chunked to the request.
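For context, a minimal sketch of the kind of v1.11-era HTTP filter chain that hits this path (the cluster names and rate limit domain below are illustrative placeholders, not taken from any reporter's config):

http_filters:
  - name: envoy.ext_authz
    config:
      grpc_service:
        envoy_grpc:
          cluster_name: ext_authz_service    # illustrative cluster name
  - name: envoy.rate_limit
    config:
      domain: example_domain                 # illustrative rate limit domain
      rate_limit_service:
        grpc_service:
          envoy_grpc:
            cluster_name: ratelimit_service  # illustrative cluster name
  - name: envoy.router

With only one of the two filters in the chain, the forwarded request keeps its content-length: 0 header.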

Labels: area/http, help wanted, investigate

All 34 comments

When we upgraded to Envoy v1.11.0, we found that for a GET request with no body:

  • if ratelimit filter is enabled, "Transfer-Encoding: chunked" header is set
  • if ratelimit filter is not enabled, "Content-Length: 0" header is set

Having such a request with a "Transfer-Encoding: chunked" header causes issues for our upstream service. Rate limiting works for us with v1.10.0, so as far as we can tell this header was not set by Envoy before. Can someone please confirm what the intended behavior is?

(To clarify, I'm working with Sid. The details in the issue description above reflect additional debugging, so please refer to that.)

I also ran into this in the same way others have described. I dug in using a local reproducer and GDB and found that by the time the filter handling code got to the second filter, the code had started to buffer the request body due to end_stream=false in some of the continuation code.

cc @alyssawilk @soya3129 @snowp we can try to figure out which change caused this behavior difference. With that said, what Envoy is doing now is technically still spec compliant, though definitely not as optimal as what was done previously.

I'd hoped that mikehuston had a use case with only ratelimit, where I could at least quickly binary-search for the regression, but it turns out it does require both filters, and I'm not even seeing an authz integration test, so I don't think I can help out before I take off.

Rate limit only test here: https://pastebin.com/XRQAFK0G in case it's useful to whoever does pick this up.

Thanks esmet, mattklein123, and alyssawilk. Yes, I am using both extauthz and ratelimit filters. Please let me know if there is any more context I can share.

FWIW I had opened https://github.com/go-openapi/runtime/issues/148 with the go-openapi maintainers after running into this, since, per the points made here, the behavior is valid HTTP, though it is curious why it changed.

The behavior changed almost definitely due to a patch related to pause/continuation during filter operation, and we are now ending the stream with an empty data frame vs. a headers-only request. I'm going to leave this open as help wanted because I do want to understand why this changed and whether there is a way to get back the old behavior.
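In other words, the upstream now sees a chunked request that ends with an empty terminating chunk instead of a bare header block. An illustrative wire-level view (not a capture; host and path are placeholders):

v1.10.0 forwarded request:

GET /test HTTP/1.1
Host: upstream.example
Content-Length: 0

v1.11.0 forwarded request with both filters enabled:

GET /test HTTP/1.1
Host: upstream.example
Transfer-Encoding: chunked

0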

This is currently affecting a few Bottle (Python) based upstreams for us. It causes an "Error while parsing chunked transfer body" error (https://github.com/bottlepy/bottle/blob/eff4960d941b51629f8378b1bd9498ed2aec92c7/bottle.py#L1310).

We have both rate limiting and ext_authz enabled. It goes away if we disable ext_authz; I'm not sure about rate limiting, so I could be doing something wrong.

Does anyone have a workaround for this in the short term? I'm not sure how to work around Bottle's behaviour here.

I was able to properly reproduce it now. If I remove either ext_authz or ratelimit, it does not add the transfer-encoding header.

We've noticed this issue as well during our upgrade testing. We see it when using a custom HTTP filter and the rate limiting filter together. As a workaround, we found that applying a Lua filter to set the content-length based on the body seems to resolve it; see the sketch below.
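Roughly, that kind of workaround looks like the following sketch (untested here; envoy.lua and inline_code are the v1.11-era filter name and config field, and may need adjusting for newer versions):

- name: envoy.lua
  config:
    inline_code: |
      function envoy_on_request(request_handle)
        -- Buffer the request body (if any) and set an explicit content-length
        -- so the upstream request is not sent with transfer-encoding: chunked.
        local body = request_handle:body()
        local length = 0
        if body ~= nil then
          length = body:length()
        end
        request_handle:headers():replace("content-length", tostring(length))
      end

In this sketch the Lua filter would sit in http_filters after ext_authz/ratelimit and before envoy.router.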

Is this on the roadmap for a patch version of 1.12.x or will it be addressed in time for 1.13.0?

We need someone to investigate what is going on here.

Second pass at an integration test (hella hacky, but I believe it's running both filters) also fails to reproduce: the content length goes all the way through. I'm sure this is an actual problem given the number of disparate reports, but it'd really help if someone encountering the issue could repro it in a test somehow?

https://github.com/envoyproxy/envoy/compare/master...alyssawilk:7753

[ RUN ] IpVersionsClientType/ExtAuthzGrpcIntegrationTest.HTTP1DownstreamRequestWithBody/1
Waiting auths
Waiting ratelimit
Waiting response
':authority', 'host'
':path', '/test'
':method', 'GET'
'content-length', '0'
'x-forwarded-proto', 'http'
'x-request-id', '0cc517ab-74d4-4c67-8795-7dfbbf04a8b7'
'x-ratelimit-done', 'true'
'x-envoy-expected-rq-timeout-ms', '15000'
done
1
clean
[ OK ] IpVersionsClientType/ExtAuthzGrpcIntegrationTest.HTTP1DownstreamRequestWithBody/1 (388 ms)

I worked a bit on the integration test that @alyssawilk wrote and I couldn't get the issue to reproduce either. I also tried using a different set of filters, but that also didn't work. The only way I could recreate it was to change makeHeaderOnlyRequest to makeRequestWithBody with an empty body. Unfortunately, when I ported this modified test back to v1.10 it produced the same result, which was unexpected.

I have some cycles to look at this. If anyone is still able to repro this, is it possible to come up with a docker-compose example? I can probably manually set something up with the filters, but it would be easier if there was a pre-canned repro, even if it's manual.

I will try to throw something together if I get time this week, @mattklein123. I probably have some old docker-compose files lying around from when I was troubleshooting this last time.

@leosunmo great thanks that will help a lot.

Alright @mattklein123 , here we go: https://github.com/leosunmo/envoy-chunked-bug.
Please let me know if you need anything else.
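Running it is roughly along these lines (a sketch; the routes match the curl calls used further down in this thread):

git clone https://github.com/leosunmo/envoy-chunked-bug
cd envoy-chunked-bug
docker-compose up --build -d
# Route that goes through both ext_authz and ratelimit:
curl -s localhost:8000/headers -H "Authorization: Bearer foo"
# Route that skips rate limiting:
curl -s localhost:8000/nolimit/headers -H "Authorization: Bearer foo"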

@leosunmo awesome, thanks. I will take a look soon!

Any update on this @mattklein123 ? Do you need anything else from me? I have some spare time to help, but I don't have nearly enough C++ or Envoy codebase knowledge to help on that front.

1.11.0 is over 15 months old, with the current release being 1.15.2. Has this scenario been tested with a current release?

The above example uses envoyproxy/envoy-dev:latest, and I pulled that image fresh when I tested it on the 3rd of Sept, so I'd say so :shrug:

Sorry I haven't had time to look at this yet. It's on my list.

This is going to be very unsatisfying, but I just tried to repro this on current latest, and AFAICT it doesn't repro anymore. Has it been fixed in the last month? @leosunmo can you check me?

Haha, roger, I'll check when I get in to the office.

(There have been some relevant changes in the filter chain area over the last month that I could believe might impact this, cc @snowp @alyssawilk)

Later I can try to pull an old image to see if I can repro and maybe come up with a regression test.

I am still seeing the same problem. Am I doing something wrong here?

➜  curl -s localhost:8000/headers -H "Authorization: Bearer foo"
{
  "headers": {
    "Accept": "*/*",
    "Host": "localhost:8000",
    "Transfer-Encoding": "chunked",
    "User-Agent": "curl/7.68.0",
    "X-Envoy-Expected-Rq-Timeout-Ms": "15000",
    "X-Ext-Auth-Ratelimit": "LCa0a2j/xo/5m0U8HTBBNBNCLXBkg7+g+YpeiGJm564="
  }
}
➜  curl -s localhost:8000/nolimit/headers -H "Authorization: Bearer foo"
{
  "headers": {
    "Accept": "*/*",
    "Content-Length": "0",
    "Host": "localhost:8000",
    "User-Agent": "curl/7.68.0",
    "X-Envoy-Expected-Rq-Timeout-Ms": "15000",
    "X-Envoy-Original-Path": "/nolimit/headers",
    "X-Ext-Auth-Ratelimit": "LCa0a2j/xo/5m0U8HTBBNBNCLXBkg7+g+YpeiGJm564="
  }
}

Envoy Dockerfile

FROM envoyproxy/envoy:v1.16.0

COPY /front-envoy.yaml /etc/

RUN apt-get update && apt-get -q install -y \
    curl
CMD /usr/local/bin/envoy -c /etc/front-envoy.yaml -l debug --service-cluster front-proxy

I'm trying with docker compose and it's not reproing. I had to make one change due to binding issues, but here is the diff that I'm using, even going back to an image from 9/3:

ubuntu@ip-10-0-0-215:~/Source/envoy-chunked-bug$ git diff
diff --git a/docker-compose.yaml b/docker-compose.yaml
index 860726b..3c1f02b 100644
--- a/docker-compose.yaml
+++ b/docker-compose.yaml
@@ -7,10 +7,10 @@ services:
     networks:
       - envoymesh
     expose:
-      - "80"
+      - "8000"
       - "8001"
     ports:
-      - "8000:80"
+      - "8000:8000"
       - "8001:8001"

   httpbin:
diff --git a/front-envoy/front-envoy.Dockerfile b/front-envoy/front-envoy.Dockerfile
index f9a8136..1f0ded7 100644
--- a/front-envoy/front-envoy.Dockerfile
+++ b/front-envoy/front-envoy.Dockerfile
@@ -1,4 +1,4 @@
-FROM envoyproxy/envoy-dev:latest
+FROM envoyproxy/envoy-dev:3a5d691e582e360bb1a35272e4972984937e7d6a

 COPY /front-envoy.yaml /etc/

diff --git a/front-envoy/front-envoy.yaml b/front-envoy/front-envoy.yaml
index a0d5093..8a2382d 100644
--- a/front-envoy/front-envoy.yaml
+++ b/front-envoy/front-envoy.yaml
@@ -2,7 +2,7 @@ static_resources:
   listeners:
     - name: listener_0
       address:
-        socket_address: { address: 0.0.0.0, port_value: 80 }
+        socket_address: { address: 0.0.0.0, port_value: 8000 }
       filter_chains:
         - filters:
             - name: envoy.http_connection_manager
ubuntu@ip-10-0-0-215:~/Source/envoy$ curl -s localhost:8000/nolimit/headers -H "Authorization: Bearer foo"
{
  "headers": {
    "Accept": "*/*", 
    "Content-Length": "0", 
    "Host": "localhost:8000", 
    "User-Agent": "curl/7.58.0", 
    "X-Envoy-Expected-Rq-Timeout-Ms": "15000", 
    "X-Envoy-Original-Path": "/nolimit/headers", 
    "X-Ext-Auth-Ratelimit": "LCa0a2j/xo/5m0U8HTBBNBNCLXBkg7+g+YpeiGJm564="
  }
}
ubuntu@ip-10-0-0-215:~/Source/envoy$ curl -s localhost:8000/headers -H "Authorization: Bearer foo"
{
  "headers": {
    "Accept": "*/*", 
    "Content-Length": "0", 
    "Host": "localhost:8000", 
    "User-Agent": "curl/7.58.0", 
    "X-Envoy-Expected-Rq-Timeout-Ms": "15000", 
    "X-Ext-Auth-Ratelimit": "LCa0a2j/xo/5m0U8HTBBNBNCLXBkg7+g+YpeiGJm564="
  }
}


Ah I spotted a mistake! I wasn't properly rebuilding and getting cached stuff. This is now indeed working!
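Forcing a clean rebuild, e.g. something along these lines, avoids picking up the stale cached image:

docker-compose build --no-cache
docker-compose up -d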

➜  curl -s localhost:8000/nolimit/headers -H "Authorization: Bearer foo"
{
  "headers": {
    "Accept": "*/*",
    "Content-Length": "0",
    "Host": "localhost:8000",
    "User-Agent": "curl/7.68.0",
    "X-Envoy-Expected-Rq-Timeout-Ms": "15000",
    "X-Envoy-Original-Path": "/nolimit/headers",
    "X-Ext-Auth-Ratelimit": "LCa0a2j/xo/5m0U8HTBBNBNCLXBkg7+g+YpeiGJm564="
  }
}
➜  curl -s localhost:8000/headers -H "Authorization: Bearer foo"
{
  "headers": {
    "Accept": "*/*",
    "Content-Length": "0",
    "Host": "localhost:8000",
    "User-Agent": "curl/7.68.0",
    "X-Envoy-Expected-Rq-Timeout-Ms": "15000",
    "X-Ext-Auth-Ratelimit": "LCa0a2j/xo/5m0U8HTBBNBNCLXBkg7+g+YpeiGJm564="
  }
}

That's now working on envoyproxy/envoy:v1.16.0 _and_ envoyproxy/envoy:v1.15.2!

Very exciting! :D Good work mysterious bug fixer!

Hmm. I wonder when this was fixed. Let me do a little digging.

I just tried all of the major versions, and this was fixed somewhere between 1.11 and 1.12. Given how long ago this was, I'm not sure it's worth digging into this more?
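For reference, that kind of version sweep can be scripted against the repro repo roughly as follows (the tags, sleep, and sed pattern are illustrative):

for TAG in v1.11.0 v1.12.0 v1.13.0 v1.14.1 v1.15.2 v1.16.0; do
  # Point the repro's Envoy Dockerfile at this release tag.
  sed -i "s|^FROM envoyproxy/envoy.*|FROM envoyproxy/envoy:${TAG}|" front-envoy/front-envoy.Dockerfile
  docker-compose build --no-cache
  docker-compose up -d
  sleep 5
  echo "=== ${TAG} ==="
  # Check which header the upstream echo service received.
  curl -s localhost:8000/headers -H "Authorization: Bearer foo" \
    | grep -iE 'transfer-encoding|content-length'
done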

Yeah, I just came to the same conclusion using Docker images between 1.11 and 1.12. It does indeed look like 1.12 was where it was fixed! I'd say just close it.

OK glad this is resolved!
