Gunicorn: Clarifications on request size limits and setting max request body size

Created on 8 Dec 2017 · 11 comments · Source: benoitc/gunicorn

Looking at http://docs.gunicorn.org/en/stable/settings.html:

  1. limit_request_field_size allows restricting header sizes.

    1. Does it apply to the value only, or to the whole "Name: value"?

    2. Does it apply to each header separately?

    3. So, by default, gunicorn allows 819,000 bytes of headers (the defaults being limit_request_fields: 100 * limit_request_field_size: 8190 bytes)?

    4. As I understand it, it's not possible to set a total maximum size for all headers (names + values)? So that some field names/values could be longer and some shorter, as long as the total limit is not exceeded. What would you think about such a feature?

  2. I assume the connection is just dropped if a client tries to send a larger request than this?
  3. Limiting request body size:

    1. Is there a way to limit request body size?

    2. I don't really know, but I get the feeling that adding that wouldn't be a tremendous task ("just" _check the size against a conf option and reject if exceeded_, the same way as with headers)?

All 11 comments

  1. limit_request_field_size allows restricting header sizes.
    i. "Name: value" all together
    ii. Yes, separately to each header.
    iii. 819204 (100 * (8190 + 2) + 4). Gunicorn adds space for \r\n between each header and at the end.
    iv. Can you explain a need for this feature?
  2. I assume the connection is just dropped if a client tries to send a larger request than this? Yes.
  3. Limiting request body size:
    i. Check the Content-Length header in your application. If the transfer encoding is chunked, only consume the body iterator as much as you'd like to.
    ii. Maybe. For chunked transfers Gunicorn would have to track how many bytes were read. It does not do that right now, but it could. This is also something WSGI middleware could do (a rough sketch follows below), so I am not sure it belongs in Gunicorn.
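
A minimal sketch of what such middleware could look like, assuming a plain WSGI callable. The MaxBodySize name, the 1 MiB default, and the 413 response are illustrative choices for this thread, not something Gunicorn or Werkzeug ships:

    class MaxBodySize:
        # Hypothetical WSGI middleware: reject requests whose declared
        # Content-Length exceeds max_bytes before they reach the app.
        # Chunked requests carry no Content-Length, so they pass through
        # unchecked (which is exactly the gap discussed in this thread).
        def __init__(self, app, max_bytes=1024 * 1024):
            self.app = app
            self.max_bytes = max_bytes

        def __call__(self, environ, start_response):
            try:
                length = int(environ.get("CONTENT_LENGTH") or 0)
            except ValueError:
                length = 0
            if length > self.max_bytes:
                start_response("413 Request Entity Too Large",
                               [("Content-Type", "text/plain")])
                return [b"request body too large\n"]
            return self.app(environ, start_response)

Wiring it up is then just application = MaxBodySize(application) in the WSGI entry point.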

  1. i-iii. Ack.
    iv. No real need, just thinking that it probably makes more sense to limit the total size of headers rather than a maximum length for each. Use cases vary, for sure.
  2. Ack
  3. i. I'm running Flask now. When I get access to the request headers, the whole request has already been received (so it's too late). Might be different if I ran bare werkzeug, I don't know.

    Unfortunately, I'm not that familiar with chunked encoding, but it seems like Flask supports it somehow (https://github.com/pallets/flask/issues/367, https://github.com/pallets/werkzeug/pull/1198) while it's not really supported in the WSGI spec. So, maybe with chunked requests, keeping track of read bytes at the app level might work, but non-chunked would still be a problem.

    ii. If WSGI middleware can do that, sure, why not. I can look into it. But wouldn't there be some inconsistency then, as gunicorn would support limiting:

      REQUEST LINE
      Headers: these as well

      ...then why not also the body?

Use case: Running AWS ALB -> gunicorn -> flask with gevent. So there's no nginx to set request body max size (and I wouldn't want to introduce yet another component to the chain).

Why limit body max size at all? To reduce the attack surface in case someone wants to try to kill my app with 1GB uploads or something like that.

There's documentation available from Werkzeug on limiting the content length. I think it probably applies to Flask, too: http://werkzeug.pocoo.org/docs/0.13/request_data/

I'm not opposed to the idea of letting Gunicorn limit the input size, but I'm always interested to understand how necessary a feature is before implementing it. The argument from symmetry is not bad.

I think you just want to use request.stream instead of request.data.

And it seems you can set MAX_CONTENT_LENGTH on the flask app configuration.
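
For illustration, a small Flask sketch combining the two suggestions. The 16 MiB cap, the /upload route, and the 64 KiB chunk size are arbitrary example values, and the exact point at which Werkzeug enforces MAX_CONTENT_LENGTH (with a 413 response) depends on the Werkzeug version:

    from flask import Flask, request

    app = Flask(__name__)
    # Declared bodies larger than this are expected to be rejected with 413.
    app.config["MAX_CONTENT_LENGTH"] = 16 * 1024 * 1024

    @app.route("/upload", methods=["POST"])
    def upload():
        received = 0
        # request.stream reads the body incrementally, unlike request.data
        # or request.form, which buffer the whole body first.
        while True:
            chunk = request.stream.read(64 * 1024)
            if not chunk:
                break
            received += len(chunk)
        return "got {} bytes\n".format(received)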

Thanks for the pointers! Though I need to study WSGI a bit to understand what the werkzeug docs mean in practice 😄.

MAX_CONTENT_LENGTH is a nice finding, but does it really mean that Flask stops reading the request, or is the request first buffered and only then rejected? Can't see that mentioned.

If handling chunked requests needs reading request.stream, then my app doesn't work with chunked requests anyway. So not a problem there.

If I got it right, servers will interpret requests without Content-Length as chunked ones, but what happens if a client sends an invalid Content-Length value? Of course, if an attacker sends a shorter Content-Length than the actual payload (if that's even possible over proxies and load balancers), the rest is dropped, as gunicorn/werkzeug will trust the given length and never read the overflow part anyway. So in that sense, checking and relying on Content-Length is reliable, no matter what?

Good questions.

Reading into the werkzeug code it seems that the limiting only works when the content length is specified.

When Content-Length is longer than the actual content, most software probably blocks waiting for more content unless the client hangs up or a timeout is reached. In Gunicorn, the worker timeout would apply for sync workers.

When Content-Length is too short, the server will simply stop reading after a point. This only poses a problem when keep-alive is used by the client. In this case, the next request might fail to parse if the first bytes are actually the end of a previous request. This is not the server's business to mitigate, though.

It looks like Werkzeug will limit the input stream to the length of the content [1]. So, it seems like you could trust the Content-Length, if it's present. If it's not present and the WSGI server does not indicate that it marks the end of the input stream, then it seems the default for werkzeug is to throw it away for safety.

There is a recent open issue for the wsgi.input_terminated flag, #1653. I propose we fold this discussion into there. If Gunicorn will support wsgi.input_terminated I think it may make sense for Gunicorn to decide where to truncate and support a maximum content length configuration.

[1] https://github.com/pallets/werkzeug/blob/04eb249d04e1bf2a6fd6b77b44ff818d394f09ef/werkzeug/wsgi.py#L187
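
Roughly what [1] amounts to, sketched against werkzeug.wsgi.LimitedStream (constructor as in Werkzeug 0.13; take it as an illustration of the idea rather than Werkzeug's actual code path, and the limit_input name is made up):

    from werkzeug.wsgi import LimitedStream

    def limit_input(environ):
        # Cap wsgi.input at the declared Content-Length so the application
        # can never read past the body, even if the client keeps sending.
        try:
            length = int(environ.get("CONTENT_LENGTH") or 0)
        except ValueError:
            length = 0
        environ["wsgi.input"] = LimitedStream(environ["wsgi.input"], length)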

@tuukkamustonen mind if I close this issue and just continue the work in #1653?

it seems that the limiting only works when the content length is specified.

By that I assume you refer to werkzeug's max_content_length, which is probably what MAX_CONTENT_LENGTH in Flask sets. I'll look into the code, but feel free to correct me.

When Content-Length is longer than the actual content, most software probably blocks waiting for more content unless the client hangs up or a timeout is reached. In Gunicorn, the worker timeout would apply for sync workers.

Hmm, that would actually provide a nice attack vector - send in 1000 requests with Content-Length of 2, while really having only 1 byte in body? It would be pretty instant and leave all workers hanging in there, until timeouts kick in...

When Content-Length is too short, the server will simply stop reading after a point. This only poses a problem when keep-alive is used by the client. In this case, the next request might fail to parse if the first bytes are actually the end of a previous request. This is not the server's business to mitigate, though.

That's an interesting thought. Any idea what gunicorn does in that case - does it just drop the connection? (I believe to really understand this we would need to go deeper into TCP comms... let's not.)

It looks like Werkzeug will limit the input stream to the length of the content [1]. So, it seems like you could trust the Content-Length, if it's present. If it's not present and the WSGI server does not indicate that it marks the end of the input stream, then it seems the default for werkzeug is to throw it away for safety.

Yeah, and with wsgi.input_terminated supported in WSGI server (be it gunicorn, werkzeug's dev server, whatever) the behavior changes so that chunked transfers are supported.

But I don't know, it sounds like when WSGI server sets wsgi.input_terminated it has already buffered the whole chunked body (and that's required for the WSGI server to ensure/limit body max size, if set... or is the stream rather a generator that checks it on the fly...), while the idea with chunked transfer would be to be able to "slowly" stream the body to the application. So that when the application reads the input, line by line, it can act immediately on new data, without waiting for the whole request to be buffered first. I don't know, I guess I need to re-read mitsuhiko's paper.

mind if I close this issue and just continue the work in #1653?

Yeah, let's do so. Funny coincidence that the ticket is just 10 days older - convenient!

Hmm, that would actually provide a nice attack vector - send in 1000 requests with Content-Length of 2, while really having only 1 byte in body? It would be pretty instant and leave all workers hanging in there, until timeouts kick in...

That's more or less the Slowloris. If you need to protect against that, use async workers or put a good reverse proxy in front.

That's an interesting thought. Any idea what gunicorn does in that case - does it just drop the connection?

If Gunicorn can't parse a request, it will drop the connection.

But I don't know, it sounds like when WSGI server sets wsgi.input_terminated it has already buffered the whole chunked body

Not necessarily. It just means that the server guarantees that eventually reading from the input object will return EOF. That could be because the server has cut off at some maximum request size, the remote client indicated the end of the request (such as with a zero length chunk in chunked transfer encoding), or because the Content-Length was provided and the server has wrapped the input object to only return that many bytes.
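
Expressed as code, an application that trusts that guarantee can simply read until EOF. A sketch only, with illustrative names and chunk size:

    def read_body(environ, chunk_size=64 * 1024):
        # With wsgi.input_terminated set, reading until an empty bytes object
        # comes back is safe whatever the reason for EOF (server cut-off,
        # final chunk, or an exhausted Content-Length wrapper).
        if not environ.get("wsgi.input_terminated"):
            # Without the flag, fall back to trusting Content-Length.
            length = int(environ.get("CONTENT_LENGTH") or 0)
            return environ["wsgi.input"].read(length)
        chunks = []
        while True:
            chunk = environ["wsgi.input"].read(chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)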

Thanks for pulling on all these threads. These discussions are often helpful for others wondering the same thing.

That's more or less the Slowloris. If you need to protect against that, use async workers or put a good reverse proxy in front.

Good point, it's practically the same.

It just means that the server guarantees that eventually reading from the input object will return EOF

That's a nice way to put it.

I used bad wording earlier:

But I don't know, it sounds like when WSGI server sets wsgi.input_terminated it has already buffered the whole chunked body

I was just wondering if the WSGI server actually _buffers_ a chunked request before passing it on. But if I'm getting this right, the body is rather streamed with a LimitedStream-like construct that gives EOF once either condition (_"server has cut off at some maximum request size", "the remote client indicated the end of the request"_) is hit. So, chunked requests get _streamed_ (as request.stream) to the application, not _buffered_ like non-chunked ones (as request.data or request.form, maybe).

So a maximum size limit (with a default) is a requirement for wsgi.input_terminated. And the check for the maximum size is probably done on the fly within the stream code.
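
Something like the following hypothetical wrapper would be enough for that on-the-fly check, without ever buffering the chunked body (the CappedStream name and behaviour are guesses for illustration, not Gunicorn's or Werkzeug's implementation):

    class CappedStream:
        # Streams bytes through until max_bytes have been read, then behaves
        # as if EOF had been reached, i.e. the "server has cut off at some
        # maximum request size" case above.
        def __init__(self, stream, max_bytes):
            self.stream = stream
            self.remaining = max_bytes

        def read(self, size=-1):
            if self.remaining <= 0:
                return b""
            if size < 0 or size > self.remaining:
                size = self.remaining
            data = self.stream.read(size)
            self.remaining -= len(data)
            return data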

Thanks for pulling on all these threads. These discussions are often helpful for others wondering the same thing.

Humble thanks for taking the time to explain and educate on these.
