It seems like gunicorn is vulnerable to a CL-TE attack when used with something like Amazon鈥檚 ECS service + a load balancer. It looks like gunicorn parses a malformed transfer-encoding header for this attack.
See this for an example https://nathandavison.com/blog/haproxy-http-request-smuggling
A suggested fix might be to add an option to fail a request if the request has Content-Length and Transfer-Encoding that is not identity.
An alternative approach would be to ignore Transfer-Encoding
header when Amazon Load Balancer ignores it, i.e. when the header value has a new line before it.
Thanks for the report. I will take care of it ASAP? We have a security mailing list for it why didn't you use it? http://docs.gunicorn.org/en/stable/community.html#security-issues
We are new to contributing to gunicorn. We are working on a fix in parallel. Our approach was to replicate Go team's solution by leaving the header names intact (we dropped rstrip call from message.py:93). We'll try this approach against a vulnerability scanner and report back. Thanks for looking into it.
@kovalevvlad thanks. would be cool to make a release over the we with a fix
We'll try to send a PR over the weekend (assuming it works)
Any progress? I will start working on it otherwise, because this is a critical vulnerability. I wonder if the new option (routing.http.drop_invalid_header_fields.enabled
) on AWS ALB might mitigate the issue.
@Sytten this is how far we got https://github.com/benoitc/gunicorn/compare/19.x...ryderdamen:19.x Feel free to take over. Will try to test today if this change or routing.http.drop_invalid_header_fields.enabled
fixes the issue.
This seems like a first step, but we need to test all the variations of the attack. I will enable this field in prod for now and I will work on the fix today. Do we need to backport it to v19 or we only do it for v20 @benoitc?
@Sytten it needs to be for 19 as well. Thanks for it. Thing we also need is a test showing the attack. I'm online if needed. I will wait for your fix to make a release. Can be done today :)
Some findings: The vanilla attack now seems to be blocked by AWS sometimes.
I was able to generate some errors:
[2019-11-18 20:05:45 +0000] [9] [ERROR] Socket error processing request.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/base_async.py", line 65, in handle
util.reraise(*sys.exc_info())
File "/usr/local/lib/python3.7/site-packages/gunicorn/util.py", line 546, in reraise
raise value
File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/base_async.py", line 48, in handle
req = next(parser)
File "/usr/local/lib/python3.7/site-packages/gunicorn/http/parser.py", line 35, in __next__
data = self.mesg.body.read(8192)
File "/usr/local/lib/python3.7/site-packages/gunicorn/http/body.py", line 215, in read
data = self.reader.read(1024)
File "/usr/local/lib/python3.7/site-packages/gunicorn/http/body.py", line 30, in read
self.buf.write(next(self.parser))
File "/usr/local/lib/python3.7/site-packages/gunicorn/http/body.py", line 72, in parse_chunked
raise ChunkMissingTerminator(rest[:2])
gunicorn.http.errors.ChunkMissingTerminator: Invalid chunk terminator is not '\r\n': b'\rP'
[2019-11-18 20:05:43 +0000] [10] [ERROR] Socket error processing request.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/gunicorn/http/body.py", line 90, in parse_chunk_size
chunk_size = int(chunk_size, 16)
ValueError: invalid literal for int() with base 16: b'Q'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/base_async.py", line 65, in handle
util.reraise(*sys.exc_info())
File "/usr/local/lib/python3.7/site-packages/gunicorn/util.py", line 546, in reraise
raise value
File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/base_async.py", line 48, in handle
req = next(parser)
File "/usr/local/lib/python3.7/site-packages/gunicorn/http/parser.py", line 35, in __next__
data = self.mesg.body.read(8192)
File "/usr/local/lib/python3.7/site-packages/gunicorn/http/body.py", line 215, in read
data = self.reader.read(1024)
File "/usr/local/lib/python3.7/site-packages/gunicorn/http/body.py", line 30, in read
self.buf.write(next(self.parser))
File "/usr/local/lib/python3.7/site-packages/gunicorn/http/body.py", line 73, in parse_chunked
(size, rest) = self.parse_chunk_size(unreader, data=rest[2:])
File "/usr/local/lib/python3.7/site-packages/gunicorn/http/body.py", line 92, in parse_chunk_size
raise InvalidChunkSize(chunk_size)
gunicorn.http.errors.InvalidChunkSize: Invalid chunk size: b'Q'
EDIT: For reference I am using https://github.com/portswigger/http-request-smuggler
I am trying each attack and see how it behaves.
let me know how鈥檌t goes. i wil make a release by tomorrow now
I have a working payload for a typical case.
First time it works and second requests returns a 405 (meaning I injected the final X in the next request which becomes XGET). Other attacks seems to be mitigated by AWS pretty well.
New payload causing a 404 for the next client requesting a resource
This confirms the CL-TE attack, the attack above was a TE-CL attack. So it seems we are vulnerable to both.
Ok we need to do at least those four things to be compliant:
No whitespace is allowed between the header field-name and colon
. Further more, it should return a 400 to compliant with the latest spec.Content-Length: 6
Transfer-Encoding : chunked
If a message is received with both a Transfer-Encoding header field and a Content-Length header field, the latter MUST be ignored.
. Gunicorn already does that partially.Content-Length: 6
Transfer-Encoding: chunked
Transfer-Encoding: cow
Because it iterates over all TE, it will discard the TE since it's not the last one. A bigger refactor needs to be done to support multiple TE (since its required by the spec) and probably raise an exception if we encounter an invalid one. The specs recommends to return a 501 if we do not support an encoding.
Content-Length: 6
Transfer-Encoding: chunked
Content-Length: 6
Transfer-Encoding:
chunked
This successfully returns chunked
(with the space included)
I will start a PR for all of those based on the one posted above by @snovity
Here is a thread about changes by AWS related to this from a week ago https://forums.aws.amazon.com/thread.jspa?messageID=923401
Did some testing, on our load balancer my experience was that with routing.http.drop_invalid_header_fields.enabled
default False
it was dropping headers with leading and trailing spaces, but wasn't dropping headers with spaces within headers. After making it True
it started dropping all bad headers, but, after I switched it back to False
, surprisingly, it started allowing everything and I could replicate the CL-TE attack. I couldn't cleanly prepend another user request with my header, but could break all subsequent user requests.
So to sum it up:
routing.http.drop_invalid_header_fields.enabled
helps with at least CL-TE varietyCouldn't replicate TE-CL. @Sytten not sure, but might be both attacks you did are CL-TE, seems like Content-Length is correct in both cases and used on balancer, Transfer-Encoding in both cases has a space before :
and is used only on gunicorn. Looks like to do TE-CL you need balancer to be more permissive than web server to make sure that it uses TE and web server does not.
Good to have your feedback on the AWS fix @snovity!
Yeah I realize later that I was mistaken, the ALB is pretty strict. I found another way to create a desync (fixed in my PR). If you send a chunked TE followed by an identity TE, AWS will use the chunked but gunicorn will use the CL.
thanks all for the useful discussion and @snovity @Sytten for the code.
@Sytten just merged your PR and backported it to 19.x .
Great work, thanks all!
It appears there was no CVE assigned for this issue - was this a conscious/explicit decision? It would certainly making tracking and propagating a resolution in various distributions etc.
Agreed there should have been one.
Cool, would you be up for making one now? I believe a number of distributions are vulnerable and don't really know it, and it's never really too late. :+1:
I have never done that before, I but I will try and report here 馃憤
Cool, let me know how you get on. :)
this should be discussed in a separate ticket. I noticed that security@ ml was disabled. That's fixed now.
Most helpful comment
Ok we need to do at least those four things to be compliant:
No whitespace is allowed between the header field-name and colon
. Further more, it should return a 400 to compliant with the latest spec.For example:
If a message is received with both a Transfer-Encoding header field and a Content-Length header field, the latter MUST be ignored.
. Gunicorn already does that partially.By that I mean that it handles the usual case, but I believe it might create a desync in case of duplicate TE:
Because it iterates over all TE, it will discard the TE since it's not the last one. A bigger refactor needs to be done to support multiple TE (since its required by the spec) and probably raise an exception if we encounter an invalid one. The specs recommends to return a 501 if we do not support an encoding.
This successfully returns
chunked
(with the space included)I will start a PR for all of those based on the one posted above by @snovity