Looking at http://docs.gunicorn.org/en/stable/settings.html: limit_request_field_size allows restricting header sizes. So does that mean a request can contain at most 819 000 bytes of headers (defaults of limit_request_fields: 100 * limit_request_field_size: 8190 bytes)?

It's 819204. Gunicorn adds space for \r\n between each header and at the end.
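For illustration, one back-of-the-envelope accounting that reproduces that figure (my assumption about where the extra bytes come from, not read from gunicorn's source):

```python
fields = 100       # limit_request_fields default
field_size = 8190  # limit_request_field_size default, bytes per header

crlf = 2           # "\r\n" after each header line
terminator = 4     # trailing "\r\n\r\n" closing the header block (assumed)

print(fields * (field_size + crlf) + terminator)  # -> 819204
```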
i. I'm running Flask now. When I get access to the request headers, the whole request has already been received (so it's too late). It might be different if I ran bare werkzeug, I don't know.
Unfortunately, I'm not that familiar with chunked encoding, but it seems Flask supports it somehow (https://github.com/pallets/flask/issues/367, https://github.com/pallets/werkzeug/pull/1198) even though it's not really supported by the WSGI spec. So, with chunked requests, keeping track of read bytes at the app level might work, but non-chunked requests would still be a problem.
ii. If WSGI middleware can do that, sure, why not. I can look into it (see the rough sketch below). But wouldn't there be some inconsistency then, as gunicorn would support limiting:
REQUEST LINE
Headers: this as well
...then why not also the body?
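For the record, here is roughly what I have in mind (hypothetical and untested; BodySizeLimitMiddleware and max_bytes are names I'm making up). It can only act on the declared Content-Length, so chunked requests slip through:

```python
from werkzeug.exceptions import RequestEntityTooLarge  # 413

class BodySizeLimitMiddleware:
    """Hypothetical WSGI middleware: reject requests whose declared
    Content-Length exceeds max_bytes. Chunked requests carry no
    Content-Length, so this check cannot see them."""

    def __init__(self, app, max_bytes=10 * 1024 * 1024):
        self.app = app
        self.max_bytes = max_bytes

    def __call__(self, environ, start_response):
        length = environ.get("CONTENT_LENGTH") or ""
        if length.isdigit() and int(length) > self.max_bytes:
            # werkzeug's HTTPExceptions are themselves WSGI apps
            return RequestEntityTooLarge()(environ, start_response)
        return self.app(environ, start_response)
```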
Use case: Running AWS ALB -> gunicorn -> flask with gevent. So there's no nginx to set a request body max size (and I wouldn't want to introduce yet another component into the chain).
Why limit max body size at all? To reduce the attack surface, in case someone tries to kill my app with 1 GB uploads or something like that.
There's documentation available from Werkzeug on limiting the content length. I think it probably applies to Flask, too: http://werkzeug.pocoo.org/docs/0.13/request_data/
I'm not opposed to the idea of letting Gunicorn limit the input size, but I'm always interested to understand how necessary a feature is before implementing it. The argument from symmetry is not bad.
I think you just want to use request.stream instead of request.data. And it seems you can set MAX_CONTENT_LENGTH in the Flask app configuration.
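Setting it is a one-liner (16 MiB here is an arbitrary example value):

```python
from flask import Flask

app = Flask(__name__)
# Werkzeug raises 413 Request Entity Too Large beyond this size.
app.config["MAX_CONTENT_LENGTH"] = 16 * 1024 * 1024
```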
Thanks for the pointers! Though I need to study WSGI a bit to understand what the werkzeug docs mean in practice 😄.
MAX_CONTENT_LENGTH is a nice finding, but does it really mean that Flask stops reading the request, or is the request first buffered and only then dropped? I can't see that mentioned.
If handling chunked requests requires reading request.stream, then my app doesn't work with chunked requests anyway. So not a problem there.
If I got it right, servers will interpret requests without Content-Length as chunked ones, but what happens if a client sends an invalid Content-Length value? Of course, if an attacker sends a shorter Content-Length than the actual payload (if that's even possible through proxies and load balancers), the rest is dropped, as gunicorn/werkzeug will trust the given length and never read the overflowing part anyway. So in that sense, checking and relying on Content-Length is reliable, no matter what?
Good questions.
Reading into the werkzeug code, it seems that the limiting only works when the content length is specified.
When Content-Length is longer than the actual content, most software probably blocks waiting for more content unless the client hangs up or a timeout is reached. In Gunicorn, the worker timeout would apply for sync workers.
When Content-Length is too short, the server will simply stop reading after a point. This only poses a problem when keep-alive is used by the client. In this case, the next request might fail to parse if the first bytes are actually the end of a previous request. This is not the server's business to mitigate, though.
It looks like Werkzeug will limit the input stream to the length of the content [1]. So, it seems like you could trust the Content-Length, if it's present. If it's not present and the WSGI server does not indicate that it marks the end of the input stream, then it seems the default for werkzeug is to throw it away for safety.
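A minimal demonstration of that wrapping, assuming Werkzeug's LimitedStream with its (stream, limit) constructor:

```python
import io
from werkzeug.wsgi import LimitedStream

# Pretend the client declared Content-Length: 5 but sent more.
raw = io.BytesIO(b"hello, world")
body = LimitedStream(raw, 5)

print(body.read())  # b'hello' - capped at the declared length
print(body.read())  # b''      - EOF; the overflow is never read
```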
There is a recent open issue for the wsgi.input_terminated flag, #1653. I propose we fold this discussion into there. If Gunicorn will support wsgi.input_terminated, I think it may make sense for Gunicorn to decide where to truncate and support a maximum content length configuration.
@tuukkamustonen mind if I close this issue and just continue the work in #1653?
it seems that the limiting only works when the content length is specified.
By that I assume you refer to werkzeug's max_content_length, which is probably what MAX_CONTENT_LENGTH in flask sets. I'll look into the code, but feel free to correct me.
When Content-Length is longer than the actual content, most software probably blocks waiting for more content unless the client hangs up or a timeout is reached. In Gunicorn, the worker timeout would apply for sync workers.
Hmm, that would actually provide a nice attack vector - send in 1000 requests with Content-Length of 2, while really having only 1 byte in body? It would be pretty instant and leave all workers hanging in there, until timeouts kick in...
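Something like this, I imagine (a hypothetical sketch against a local sync worker; host and port are assumed):

```python
import socket

# Declare 2 body bytes but send only 1, then keep the socket open.
s = socket.create_connection(("127.0.0.1", 8000))
s.sendall(
    b"POST / HTTP/1.1\r\n"
    b"Host: localhost\r\n"
    b"Content-Length: 2\r\n"
    b"\r\n"
    b"x"  # only 1 of the promised 2 bytes
)
# A sync worker now blocks reading the missing byte until its
# timeout (30s by default) fires. Repeated across all workers,
# this would starve the server.
```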
When Content-Length is too short, the server will simply stop reading after a point. This only poses a problem when keep-alive is used by the client. In this case, the next request might fail to parse if the first bytes are actually the end of a previous request. This is not the server's business to mitigate, though.
That's an interesting thought. Any idea what gunicorn does in that case - does it just drop the connection? (I believe to really understand this we would need to go deeper into TCP comms... let's not.)
It looks like Werkzeug will limit the input stream to the length of the content [1]. So, it seems like you could trust the Content-Length, if it's present. If it's not present and the WSGI server does not indicate that it marks the end of the input stream, then it seems the default for werkzeug is to throw it away for safety.
Yeah, and with wsgi.input_terminated supported in the WSGI server (be it gunicorn, werkzeug's dev server, whatever) the behavior changes so that chunked transfers are supported.
But I don't know, it sounds like when WSGI server sets wsgi.input_terminated it has already buffered the whole chunked body (and that's required for the WSGI server to ensure/limit body max size, if set... or is the stream rather a generator that checks it on the fly?), while the idea with chunked transfer would be to be able to "slowly" stream the body to the application. So that when the application reads the input, line by line, it can act immediately on new data, without waiting for the whole request to first get buffered. I don't know, I guess I need to re-read mitsuhiko's paper.
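To make the "slowly stream" idea concrete, this is roughly what I mean on the app side (a sketch; the route and chunk size are arbitrary):

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/ingest", methods=["POST"])
def ingest():
    total = 0
    while True:
        # Act on each chunk as it arrives; nothing here forces the
        # whole body to be buffered first.
        chunk = request.stream.read(8192)
        if not chunk:
            break
        total += len(chunk)
    return f"read {total} bytes\n"
```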
mind if I close this issue and just continue the work in #1653?
Yeah, let's do so. Funny coincidence that the ticket is just 10 days older - convenient!
Hmm, that would actually provide a nice attack vector - send in 1000 requests with Content-Length of 2, while really having only 1 byte in body? It would be pretty instant and leave all workers hanging in there, until timeouts kick in...
That's more or less the Slowloris. If you need to protect against that, use async workers or put a good reverse proxy in front.
That's an interesting thought. Any idea what gunicorn does in that case - does it just drop the connection?
If Gunicorn can't parse a request, it will drop the connection.
But I don't know, it sounds like when WSGI server sets wsgi.input_terminated it has already buffered the whole chunked body
Not necessarily. It just means that the server guarantees that eventually reading from the input object will return EOF. That could be because the server has cut off at some maximum request size, because the remote client indicated the end of the request (such as with a zero-length chunk in chunked transfer encoding), or because the Content-Length was provided and the server has wrapped the input object to only return that many bytes.
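A toy sketch of that last case (a hypothetical class, not Gunicorn's actual code):

```python
class LengthBoundedInput:
    """Wraps the raw socket file so read() returns EOF (b"") once
    the declared Content-Length has been consumed - no buffering."""

    def __init__(self, raw, content_length):
        self.raw = raw
        self.remaining = content_length

    def read(self, size=-1):
        if self.remaining <= 0:
            return b""
        if size < 0 or size > self.remaining:
            size = self.remaining
        data = self.raw.read(size)
        self.remaining -= len(data)
        return data

# A server could then advertise the guarantee like:
# environ["wsgi.input"] = LengthBoundedInput(sock_file, length)
# environ["wsgi.input_terminated"] = True
```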
Thanks for pulling on all these threads. These discussions are often helpful for others wondering the same thing.
That's more or less the Slowloris. If you need to protect against that, use async workers or put a good reverse proxy in front.
Good point, it's practically the same.
It just means that the server guarantees that eventually reading from the input object will return EOF
That's a nice way to put it.
I used bad wording earlier:
But I don't know, it sounds like when WSGI server sets wsgi.input_terminated it has already buffered the whole chunked body
I was just wondering if the WSGI server actually _buffers_ a chunked request before passing it on. But if I'm getting this right, the body is rather streamed with a LimitedStream-like construct that gives EOF once either condition (_"server has cut off at some maximum request size"_, _"the remote client indicated the end of the request"_) is hit. So chunked requests get _streamed_ (as request.stream) to the application, not _buffered_ like non-chunked ones (as request.data or request.form, maybe).
So a maximum size limit (with a default) is a requirement for wsgi.input_terminated. And the check for the maximum size is probably done on-the-fly within the stream code.
Thanks for pulling on all these threads. These discussions are often helpful for others wondering the same thing.
Humble thanks for taking the time to explain and educate on these.