Channels: Channel server responds with 400 Bad Request for multipart on Python 3.7+

Created on 30 Nov 2019  Â·  14Comments  Â·  Source: django/channels

Reproduction Repo

Minimal reproducible example with steps - https://github.com/iamareebjamal/channels-multipart-bad-request

Bug

Most probably relate to #1100

Consider a multipart request as such:

Body:

Content-Disposition: form-data; name="content"
Content-Type: multipart/form-data; charset=utf-8

Anything
--d66f495a-c4d1-487c-9277-9ab1a20001cc--

Header:
Content-Type: multipart/form-data; boundary=d66f495a-c4d1-487c-9277-9ab1a20001cc

And any simple handler/view in Django. For example, one that just returns Hello World without even processing the request data.

For Python 3.6 (tested with Python 3.6.9 and 3.6.6), the channels server (dev server [runserver] and daphne both) handle multipart requests correctly. The specified view is called and the request is handled and correct response with 200 OK status code is returned.

For the same channels and Django version, if the Python version is switched to 3.7+ (tested with Python 3.7.0 3.7.4 3.7.5 3.8.0), the server returns 400 Bad Request and empty response body. The request does not even reach any view or middleware regardless of the view processing the request or statically returning content like in the example repo above. No exception is raised even with django log level set to DEBUG. Although I have not tested if the channels logger is separate and logs anything else. At least it does not log in default settings or throw any exception.

As seen in the example repo, the entire environment is same except the python version. And hence, the behaviour of channels should be same across Python versions as well.

Environment

pip freeze

Identical for both versions

Python 3.7
asgiref==3.2.3
attrs==19.3.0
autobahn==19.11.1
Automat==0.8.0
certifi==2019.11.28
cffi==1.13.2
channels==2.3.1
chardet==3.0.4
constantly==15.1.0
cryptography==2.8
daphne==2.4.0
Django==2.2.7
hyperlink==19.0.0
idna==2.8
incremental==17.5.0
pyasn1==0.4.8
pyasn1-modules==0.2.7
pycparser==2.19
PyHamcrest==1.9.0
pyOpenSSL==19.1.0
pytz==2019.3
requests==2.22.0
service-identity==18.1.0
six==1.13.0
sqlparse==0.3.0
Twisted==19.10.0
txaio==18.8.1
urllib3==1.25.7
zope.interface==4.7.1
Python 3.6
asgiref==3.2.3
attrs==19.3.0
autobahn==19.11.1
Automat==0.8.0
certifi==2019.11.28
cffi==1.13.2
channels==2.3.1
chardet==3.0.4
constantly==15.1.0
cryptography==2.8
daphne==2.4.0
Django==2.2.7
hyperlink==19.0.0
idna==2.8
incremental==17.5.0
pyasn1==0.4.8
pyasn1-modules==0.2.7
pycparser==2.19
PyHamcrest==1.9.0
pyOpenSSL==19.1.0
pytz==2019.3
requests==2.22.0
service-identity==18.1.0
six==1.13.0
sqlparse==0.3.0
Twisted==19.10.0
txaio==18.8.1
urllib3==1.25.7
zope.interface==4.7.1
OS

Tested on:

  • Linux/Ubuntu 19.10
  • Linux/Mint 19.1

Most helpful comment

Ah, no, perhaps I was unclear. Django doesn't support _nested_ multipart, nor did cgi.parse_multipart() before Python 3.7. In Python 3.7 they changed cgi.parse_multipart() to cgi.FieldStorage, which does and therefore chokes on the bad Content-Type header. That change is causing Twisted to reject the request.

All 14 comments

Seems related to this: https://github.com/twisted/twisted/pull/1012

Twisted throws error if charset=utf8 whereas it handled it correctly in previous python versions. Our mobile clients send the header and it worked correctly previously (on Python 3.7-). Not sure what the course of action will be here to preserve the backward compatibility

Edit: Twisted is trying to parse an empty byte string as a valid boundary in Python 3.7+, and hence failing

Can you open an issue with Twisted? What do they say?

Will do

I looked into this as I was digging into an issue in the same part of the codebase and left some comments on the Twisted ticket. To summarize, it looks like this request is actually malformed — the boundary parameter is required for Content-Type: multipart/form-data. The change in behavior is actually due to a Python change to cgi.parse_multipart(): https://github.com/python/cpython/commit/cc3fa204d357be5fafc10eb8c2a80fe0bca998f1 .

What is generating the malformed request body? This is clearly a violation of RFC 7578, but if it's a widespread violation we may need to make Twisted cope.

OkHTTP - https://square.github.io/okhttp/

Which is the go to HTTP client on Android apps

BTW, the request header has boundary - Content-Type: multipart/form-data; boundary=d66f495a-c4d1-487c-9277-9ab1a20001cc, but multipart headers don't

Note that this requirement changed from RFC 2388 to RFC 7588 according to Appendix A:

"boundary" is now a required parameter in Content-Type.

I suggest that you file an issue against OkHttp. The Content-Type header it is applying to the part is nonsensical, violates the latest spec, and is a compatibility hazard. It is definitely incompatible with Python's cgi.FieldStorage.

I do think that we should fix this in Twisted by ignoring the Content-Type like we do on Python ≤3.5. I looked at Django's multipart implementation. It doesn't appear to support nested multipart. Given Twisted and Django have both done fine without it I conclude that nested multipart support isn't necessary in the real world. The complexity it adds is problematic. I think that we should ditch the cgi module.

Actually, Django's development server handles the request correctly, if that was something you were saying it doesn't support. Maybe I understood it incorrectly?

Ah, no, perhaps I was unclear. Django doesn't support _nested_ multipart, nor did cgi.parse_multipart() before Python 3.7. In Python 3.7 they changed cgi.parse_multipart() to cgi.FieldStorage, which does and therefore chokes on the bad Content-Type header. That change is causing Twisted to reject the request.

Even iOS Alamofire (we've tested with 4.9.1) has same problem, multipart header doesn't have boundary param and it fails silently with python3.7.

@rakesh-ridecell Could you provide a capture of the request?

(If the request has chunked transfer-encoding it would also be broken on 3.7+, see https://twistedmatrix.com/trac/ticket/9678 which is waiting for review.)

Thanks for the great explanation @twm. I'm going to close this as it's not something we can deal with here.

Correcting the request in the sample project allows the test to work as expected:

diff --git a/test.py b/test.py
index 16e473e..1d2eeac 100644
--- a/test.py
+++ b/test.py
@@ -2,7 +2,7 @@ import requests

 url = "http://localhost:8000/hello/"

-payload = """--d66f495a-c4d1-487c-9277-9ab1a20001cc
+payload = """--d66f495a-c4d1-487c-9277-9ab1a20001cc--
 Content-Disposition: form-data; name=\"content\"
 Content-Type: multipart/form-data; charset=utf-8
Was this page helpful?
0 / 5 - 0 ratings