Envoy: 413 Payload Too Large when using nginx and Envoy together

Created on 28 Mar 2018 · 18 comments · Source: envoyproxy/envoy

Title: 413 Payload Too Large when using nginx and Envoy together and posting a file bigger than 1 MB. If I post the file to nginx or Envoy by itself, the problem doesn't exist.

Description:
I deployed the simple app from https://github.com/rayh0001/gs-spring-boot-docker to my k8s cluster in IBM Cloud. I also have Istio running in my k8s cluster.

When I send the curl command to the service with a POST of a file over 1 MB, I'm getting a 413.

$ curl -svo  /dev/null -F '[email protected]' mycluster.us-east.containers.mybluemix.net/scan -H 'transactionId: 11111'  -H 'Cache-Control: no-cache'
*   Trying 169.60.83.14...
* TCP_NODELAY set
* Connected to mycluster.us-east.containers.mybluemix.net (169.60.83.14) port 80 (#0)
> POST /scan HTTP/1.1
> Host: mycluster.us-east.containers.mybluemix.net
> User-Agent: curl/7.54.0
> Accept: */*
> transactionId: 11111
> Cache-Control: no-cache
> Content-Length: 1449466
> Expect: 100-continue
> Content-Type: multipart/form-data; boundary=------------------------c65bbc05407833b1
>
< HTTP/1.1 100 Continue
} [154 bytes data]
< HTTP/1.1 413 Payload Too Large
< Date: Sat, 24 Mar 2018 00:32:08 GMT
< Content-Type: text/plain
< Content-Length: 17
< Connection: keep-alive
* HTTP error before end of send, stop sending
<
{ [17 bytes data]
* Closing connection 0

Chatted with @PiotrSikora, who suggested tweaking per_connection_buffer_limit_bytes to work around this issue. I developed a pilot webhook (https://github.com/linsun/istioluawebhook) and the issue is resolved when the webhook is running.

I wanted to open this issue to see whether this is something Envoy can fix, e.g. Envoy could slow down reading from the wire once it exceeds the soft watermark. As a user, I'd prefer this be handled dynamically by Envoy.

  • Logs:
[2018-03-13 01:39:55.026][350][debug][http] external/envoy/source/common/http/conn_manager_impl.cc:1310] [C36][S17322680420845703538] request data too large watermark exceeded
question

Most helpful comment

You can also write the filter such that it only buffers up till the decision is made, and then starts streaming. That is how our internal Lyft auth filters work, as well as the ratelimit filter in the public repo. But you will need to increase the buffer level to some amount that is mostly safe to account for service latency.

All 18 comments

cc @cmluciano

FWIW, I spent some time looking into this last week, but I couldn't replicate it myself, even with something as ridiculous as per_connection_buffer_limit_bytes: 1024.

cc @alyssawilk

By default Envoy is fully streaming and will apply back pressure. Most likely there is some filter that is buffering somewhere. Can you provide your full configuration? Are any of your installed filters buffering the request?

Yep, the http connection manager only sends a 413 for non-streaming (buffering) filters, so I suspect you've got one somewhere in your config. Well, or we have a bug, but if you post the config we can take a look :-)
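
For readers less familiar with the filter API: "buffering" here means a decoder filter whose decodeData() callback returns StopIterationAndBuffer instead of streaming the body through. A minimal sketch of such a filter, using Envoy's public C++ filter API (the class name is made up, and the PassThroughDecoderFilter base class and include paths vary between Envoy versions):

// Illustrative only -- not an actual Envoy or Istio filter.
#include "envoy/buffer/buffer.h"
#include "envoy/http/filter.h"

#include "extensions/filters/http/common/pass_through_filter.h"

namespace Example {

// A decoder filter that asks the connection manager to buffer the whole request
// body before letting it continue upstream.
class BufferingExampleFilter : public Envoy::Http::PassThroughDecoderFilter {
public:
  Envoy::Http::FilterDataStatus decodeData(Envoy::Buffer::Instance&, bool end_stream) override {
    if (!end_stream) {
      // The connection manager accumulates the body on the filter's behalf. Once the
      // buffered bytes exceed per_connection_buffer_limit_bytes, it logs
      // "request data too large watermark exceeded" and replies 413 locally --
      // the behavior reported above.
      return Envoy::Http::FilterDataStatus::StopIterationAndBuffer;
    }
    // Entire body seen; release it.
    return Envoy::Http::FilterDataStatus::Continue;
  }
};

} // namespace Example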

@mattklein123 @alyssawilk thanks for your comment. Just to make sure - are you asking for envoy configuration on the istio-ingress in this case?

The config is shipped from pilot, so it's not there in the bootstrap. Istio uses mixerfilter, fault filter, CORS filter, and the regular router. I'll need to double check if any of them are buffering.

On Wed, Mar 28, 2018 at 8:45 AM cmluciano notifications@github.com wrote:

I attached the config as a gist link

/usr/local/bin/envoy -c /etc/istio/proxy/envoy-rev18.json --restart-epoch 18 --drain-time-s 45 --parent-shutdown-time-s 60 --service-cluster istio-ingress --service-node ingress~~istio-ingress-779649ff5b-nl92l.istio-system~istio-system.svc.cluster.local --max-obj-name-len 189

envoy-rev18.json
https://gist.github.com/cmluciano/85424f51713d44460b922a23d56f2a30


Mixerfilter does buffer
https://github.com/istio/proxy/blob/master/src/envoy/http/mixer/filter.cc#L136


Yeah, so if you never want to 413 you have to either only use streaming filters, or configure infinite buffering (which opens you up to OOM attacks if you don't control the downstream traffic). We don't currently have anything which will clamp down on memory on a global level, only the per-connection limits.

BTW, here is my Istio ingress Envoy config for listeners (with per_connection_buffer_limit_bytes newly added by me via the pilot webhook). I'm not seeing the mixer filter buffering in there. @kyessenov I assume the code you pointed to earlier will overwrite the filter in the config below? Do we know why mixer buffers? Is it to improve performance? If so, what would be our recommendation for Istio users when they need to upload relatively large files to their services?

curl istio-pilot.istio-system:8080/v1/listeners/istio-ingress/ingress~~istio-ingress-779649ff5b-nl92l.istio-system~istio-system.svc.cluster.local
{
  "listeners": [
    {
      "address": "tcp://0.0.0.0:80",
      "bind_to_port": true,
      "filters": [
        {
          "config": {
            "access_log": [
              { "path": "/dev/stdout" }
            ],
            "codec_type": "auto",
            "filters": [
              {
                "config": {
                  "v2": {
                    "defaultDestinationService": "istio-ingress.istio-system.svc.cluster.local",
                    "forwardAttributes": {
                      "attributes": {
                        "source.uid": { "stringValue": "kubernetes://istio-ingress-779649ff5b-nl92l.istio-system" }
                      }
                    },
                    "mixerAttributes": {
                      "attributes": {
                        "destination.uid": { "stringValue": "kubernetes://istio-ingress-779649ff5b-nl92l.istio-system" }
                      }
                    },
                    "serviceConfigs": {
                      "istio-ingress.istio-system.svc.cluster.local": {
                        "mixerAttributes": {
                          "attributes": {
                            "destination.service": { "stringValue": "istio-ingress.istio-system.svc.cluster.local" }
                          }
                        }
                      }
                    },
                    "transport": {
                      "checkCluster": "mixer_check_server",
                      "reportCluster": "mixer_report_server"
                    }
                  }
                },
                "name": "mixer",
                "type": "decoder"
              },
              { "config": {}, "name": "cors", "type": "" },
              { "config": {}, "name": "router", "type": "decoder" }
            ],
            "generate_request_id": true,
            "rds": {
              "cluster": "rds",
              "refresh_delay_ms": 1000,
              "route_config_name": "80"
            },
            "stat_prefix": "http",
            "tracing": { "operation_name": "egress" },
            "use_remote_address": true
          },
          "name": "http_connection_manager",
          "type": "read"
        }
      ],
      "name": "http_0.0.0.0_80",
      "per_connection_buffer_limit_bytes": 2049264
    }
  ]
}

            "name": "mixer",
            "type": "decoder"

is all you need for 413s, given the line of code @kyessenov linked. Basically, any time state_ == Calling, you buffer limited data instead of pushing back on the downstream via H2 flow control / TCP window limits.

I can't speak to why that decision was made - it looks like the filter shouldn't pass data upstream until the check is complete, but if the body doesn't need to be completely read for the check to proceed, you might just be able to change the return at :136 and solve your problem. If there's documentation we can make clearer for future filters, do let us know; otherwise we'll probably close this off as working as intended.
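
To make that suggestion concrete, the kind of change being discussed looks roughly like the sketch below. It uses Envoy's public FilterDataStatus values but is not the actual istio/proxy code; the class, State enum, and state_ member are placeholders for the filter's real state machine:

#include "envoy/buffer/buffer.h"
#include "envoy/http/filter.h"

#include "extensions/filters/http/common/pass_through_filter.h"

namespace Example {

// Stand-in for a filter that calls out to a remote check while the body arrives.
class CheckingFilterSketch : public Envoy::Http::PassThroughDecoderFilter {
public:
  Envoy::Http::FilterDataStatus decodeData(Envoy::Buffer::Instance&, bool) override {
    if (state_ == State::Calling) {
      // Old: buffer in the connection manager's limited buffer; exceeding
      // per_connection_buffer_limit_bytes produces the local 413 seen above.
      //   return Envoy::Http::FilterDataStatus::StopIterationAndBuffer;

      // Suggested: keep holding data but push back on the downstream via H2 flow
      // control / the TCP window instead of replying 413.
      return Envoy::Http::FilterDataStatus::StopIterationAndWatermark;
    }
    return Envoy::Http::FilterDataStatus::Continue;
  }

private:
  enum class State { NotStarted, Calling, Complete };
  State state_{State::Calling}; // set by the (omitted) check-dispatch logic
};

} // namespace Example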

@qiwzhang is there a reason to buffer when state_ == Calling?

cc @lizan

Mixerfilter is similar to the ext_authz filter in that it requires an RPC to a remote server before passing through requests. It seems that buffering is required in this case. I think we should selectively change the mixerfilter behavior not to buffer unless authorization is required, and not use a remote authorization check for those paths.


You can also write the filter such that it only buffers up till the decision is made, and then starts streaming. That is how our internal Lyft auth filters work, as well as the ratelimit filter in the public repo. But you will need to increase the buffer level to some amount that is mostly safe to account for service latency.
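
A rough sketch of that buffer-until-decision pattern, using the same public Envoy C++ filter API as the sketches above, is below. The class, State enum, and startRemoteCheck()/onCheckDone() hooks are hypothetical stand-ins for a real check client; production filters such as the ratelimit filter also handle trailers, timeouts, and denial responses:

#include "envoy/buffer/buffer.h"
#include "envoy/http/filter.h"

#include "extensions/filters/http/common/pass_through_filter.h"

namespace Example {

// Buffer-until-decision, then stream: hold request data (with back pressure) while a
// remote check is pending, and stream everything once the answer arrives.
class BufferUntilDecisionFilter : public Envoy::Http::PassThroughDecoderFilter {
public:
  Envoy::Http::FilterHeadersStatus decodeHeaders(Envoy::Http::RequestHeaderMap&, bool) override {
    state_ = State::Calling;
    startRemoteCheck(); // hypothetical async RPC; it later invokes onCheckDone()
    return Envoy::Http::FilterHeadersStatus::StopIteration;
  }

  Envoy::Http::FilterDataStatus decodeData(Envoy::Buffer::Instance&, bool) override {
    if (state_ == State::Calling) {
      // Hold the data and push back on the downstream (H2 flow control / TCP window)
      // rather than filling the limited buffer and triggering a 413.
      return Envoy::Http::FilterDataStatus::StopIterationAndWatermark;
    }
    // Decision already made: stream the rest of the body untouched.
    return Envoy::Http::FilterDataStatus::Continue;
  }

  // Called by the hypothetical check client when the remote answer arrives.
  void onCheckDone(bool allowed) {
    state_ = State::Complete;
    if (allowed) {
      decoder_callbacks_->continueDecoding(); // resume the paused stream
    }
    // A real filter would send a local reply / reset the stream on denial.
  }

private:
  enum class State { NotStarted, Calling, Complete };
  void startRemoteCheck() {} // placeholder for dispatching the async check RPC
  State state_{State::NotStarted};
};

} // namespace Example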

It is likely the mixer filter is buffering the data; I filed https://github.com/istio/proxy/issues/1315. It does only buffer until the decision is made, but it can still buffer too much.

Thank you @lizan @mattklein123 @PiotrSikora @kyessenov @mandarjog @cmluciano @alyssawilk -- I appreciate everyone looking into this problem and trying to propose a solution. I agree this issue can be closed in Envoy, as we are not expecting any action from Envoy at the moment.

Leaving this closed but commenting for the record: if you change the return code, you're not telling Envoy "do not buffer the data"; you're telling Envoy "I can make forward progress without the entire request body being buffered".

A streaming filter will result in the current data being buffered, but Envoy will push back (again via TCP window or H2 flow control) and ask downstream to stop proxying data. Envoy will buffer any and all data until that flow control kicks in. The "Buffering" on return indicates the filter must buffer the full request body before it makes a decision (like it needs to do some checksum on the request body before it can send headers on), so it will not be able to make forward progress if downstream stops spooling data. Does that make sense?

Thanks for the explanation @alyssawilk; yeah, that matches my understanding. Before flow control support for filters, we didn't have StopIterationAndWatermark, so StopIterationAndBuffer was the only option.

I feel we need some kind of release notes for filter (or extension) developers to document this kind of C++ API change (in include/). While I regularly monitor diffs of include/, I somehow missed #1417 to migrate the istio filters. I will open another issue for this.
