I think it would be beneficial to have a plugin that allows a user to configure that maximum size that a client can upload to the server. This would be similar to client_max_body_size in Nginx.
Maybe something like this in Caddyfile which would limit the body size to 5mb?
client_max_body 5M
How would this be enforced? One way of doing it: we would have to read the body up to the limit and see if we hit EOF by then, and if not, then reject the request I suppose... of course this would have a performance impact. Is this how nginx does it?
Hey @mholt,
There are a couple of different approaches
¯\_(ツ)_/¯Edit: BUT, we could do both. If Content-Length is bigger than allowed, we can just block the request immediately. If it's not, we can read the request body to verify. This way, you can't lie about a short Content-Length to get away with it. And since most clients don't lie, it would probably prove more efficient in most cases.
It would be worth trying implementing this with and without a buffer pool to see if a pool does indeed perform better.
Would like to take a look
It would be worth trying implementing this with and without a buffer pool to see if a pool does indeed perform better.
Does it mean that we need to read whole request before actual proxing?
P. S.
I was playing a little with replacing request Body with different kind of reader wrappers, but almost without success -- pkg/net/http takes control over error and as result 502 is returned.
@vchimishuk
Does it mean that we need to read whole request before actual proxying?
If the Content-Length is smaller than the max size, then we would need to read the body up to the max size and can terminate after that. (<--- be sure to consider that, @evvvvr - more efficient than reading whole body when all we need to do is read max+1 bytes.)
Will Caddy protect upstream if Content-Length is not valid (less than the actual body size)?
I'm not sure than you can read Body two times, which means that we need to read whole body in memory. Which is not good for very big requests.
Will Caddy protect upstream if Content-Length is not valid (less than the actual body size)?
As we mentioned above, if Content-Length is less than max size, we will read up to max+1 bytes to verify that Content-Length is not lying. So the limit here is max+1, not body size. If max+1 is very large, then yes, it will use a lot of memory. Not really sure how to get around that unless we don't verify until we start proxying, but that isn't really feasible yet. Perhaps it will be in the future when we look at ways to optimize.
As far as I understand @vchimishuk is right, entire request body has to be read into a buffer, because to preserve request body (which can't be read twice) we have to assign Reader for this buffer to Request.Body. @definitelycarter already did the same thing in caddyhttp/httpserver/replacer.go readRequestBody func.
Not sure about wrapping Request.Body into MaxBytesReader or some other kind of reader wrapper. As I understand, in this case check for exceeding body size limit should be performed in code that actually uses Request.Body.
As I understand, in this case check for exceeding body size limit should be performed in code that actually uses Request.Body.
That's quite a good idea. We could store this value in the SiteConfig and just have any interested plugins read the value during their setup functions. Then any reader of the body just needs to make LimitReader (or MaxBytesReader, although that might supersede our error handling?) using that value. So, no need for a new _middleware_, but a directive that sets that value. How does that sound?
Then any reader of the body just needs to make LimitReader (or MaxBytesReader, although that might supersede our error handling?) using that value
Sounds like a better option from performance point of view, even with concern that it implies implementation contract for interested plugins and that maximum request body size handling is up to plugin, instead of middleware which definitely will do this.
However I can't imagine any other normal alternative except these two.
@evvvvr Fair enough; middlewares already need to access the SiteConfig to get the Root path (if accessing local files), etc -- so I think it's okay to expect that of middlewares.
Okay, so here's what we should do. Somebody who has a few minutes: add a MaxRequestBodySize field to the SiteConfig struct, add a directive which sets the value (follow root or bind for a good example - except even simpler probably), then find the places where middlewares read the request body, and use a http.MaxBytesReader with the configured size (if > 0).
add a directive which sets the value
IMO it should be useful to support exact path for this directive.
Probably will do it today.
@evvvvr Cool, and the path could be optional maybe?
@mholt Yep, syntax will look like maxrequestbody [path] size
If no path specified - size limit is implied for any location and size is either a number (for bytes) or number with 'K' or 'M' suffix meaning kilobytes/megabytes.
Sorry, but I can't make it today :confused: probably only the next week, if anyone would like to do it - feel free to take this issue.
I would like to take on this issue :smile:
Learning purposes, to be honest. Also, I've been using Caddy for personal stuff for a while, and it's time I give back to the project.
As soon as I get to it (mangling uni/work/stuff), I'll update this. Probably later tonight.
@nubunto How's this going?
Hey - so sorry, things are quite messed up around here. I feel it would be best if I postponed this for now, so anyone can feel free to start this.
Really sorry about disappearing :disappointed:
Hey, I understand how it goes! :smile: I'll see if any Hacktoberfest participants want to help.
I'll take a look as part of the fest