Aiohttp: usage of multidict for response headers CamelCases names and results in information loss

Created on 31 Jan 2017 · 18 Comments · Source: aio-libs/aiohttp

Long story short

aiohttp uses a multidict.CIMultiDict to store the response headers; however, this class does not preserve the casing of the header names. Furthermore, raw_headers is not actually raw, because aiohttp calls upper() on the key names.

Expected behaviour

aiohttp headers should behave like a case-insensitive dict, but they should preserve the original case of the names; otherwise information is lost.

Actual behaviour

aiohttp's header names end up CamelCased (for example, location becomes Location).

Steps to reproduce

Make an aiohttp request to a server that sends lowercase header names and notice how the response header names get CamelCased.

Also note the following example:

>>> import multidict
>>> foo = multidict.CIMultiDict()
>>> foo['location'] = 1
>>> dict(foo)
{'Location': 1}

Your environment

all

outdated


thehesiod

👍2

All 18 comments

Note: raw_headers is NOT a workaround, as aiohttp .upper()'s all the names: https://github.com/KeepSafe/aiohttp/blob/master/aiohttp/protocol.py#L87 :(

thehesiod on 1 Feb 2017

@asvetlov why do we do two transformations on headers?

fafhrd91 on 1 Feb 2017

@thehesiod aiohttp does not alter raw_headers anymore. Is that enough?

fafhrd91 on 16 Feb 2017

👍1

thanks!

thehesiod on 16 Feb 2017

This is still broken: aiohttp is still .upper()'ing all the header names: https://github.com/KeepSafe/aiohttp/blob/master/aiohttp/protocol.py#L87

I think aiohttp should switch to a better CaseInsensitiveDict that preserves the case. For reference, this is what botocore uses: https://github.com/boto/botocore/blob/develop/botocore/vendored/requests/packages/urllib3/_collections.py#L107

which eventually gets passed to: https://github.com/boto/botocore/blob/develop/botocore/vendored/requests/structures.py#L14

I've fixed this below; however, I really think a better CIMultiDict implementation should be used.

thehesiod on 7 Mar 2017

BTW, let me know what you guys think about this. It turns out I don't need it for aiobotocore, as we have a workaround, so this is just a nice-to-have.

thehesiod on 8 Mar 2017

I think multidict should not change case.

fafhrd91 on 14 Mar 2017

👍1

@fafhrd91:

>>> dict(CIMultiDict({'FOO': 1, 'foo2': 2}))
{'Foo2': 2, 'Foo': 1}

Pretty "interesting" behavior. Not what I would expect.

thehesiod on 14 Mar 2017

I agree

fafhrd91 on 14 Mar 2017

We can implement a case-insensitive dict as a dict mapping UPPER_CASE_KEY => (Origin_Key, value), e.g.

{"CONTENT-TYPE": ("Content-Type", "text/plain")}

and override interfaces like __getitem__ and items() so that it acts like a normal dictionary. For hash lookups, use the upper-cased key; when iterating, return the original key.
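To make the idea above concrete, here is a minimal sketch in plain Python. It is not multidict's or aiohttp's implementation; the class name CasePreservingDict is hypothetical and chosen only for illustration. It stores each entry under the upper-cased key together with the key as originally written, and it holds a single value per name.

```python
from collections.abc import MutableMapping


class CasePreservingDict(MutableMapping):
    """Hypothetical sketch: case-insensitive lookup, original case on iteration."""

    def __init__(self, data=None):
        # Maps UPPER_CASE_KEY -> (original_key, value).
        self._store = {}
        if data is not None:
            self.update(data)

    def __setitem__(self, key, value):
        # The most recent spelling of the key replaces any earlier one.
        self._store[key.upper()] = (key, value)

    def __getitem__(self, key):
        # Lookups go through the upper-cased key.
        return self._store[key.upper()][1]

    def __delitem__(self, key):
        del self._store[key.upper()]

    def __iter__(self):
        # Iteration yields the keys as they were originally written.
        return (original for original, _ in self._store.values())

    def __len__(self):
        return len(self._store)


headers = CasePreservingDict({"Content-Type": "text/plain"})
headers["location"] = "/next"
print(headers["CONTENT-TYPE"])  # case-insensitive lookup
print(dict(headers))            # keys keep their original spelling
```

Inheriting from MutableMapping supplies items(), keys(), get() and update() on top of the five methods defined here, so only the storage layout needs to be written by hand.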

hubo1016 on 24 Mar 2017

That's a question for the multidict package, but remember that for HTTP headers we should be able to add multiple entries with the same key. But I agree we should preserve header case.

fafhrd91 on 24 Mar 2017

@fafhrd91 Hmm, I believe using the same technique on a MultiDict will do the trick, like
[("SET-COOKIE", ("Set-Cookie", "a=b")), ("SET-COOKIE", ("set-cookie", "c=d"))]. Then we override the getone() and getall() interfaces to do an extra [v[1] for v in items]. items() should return [("Set-Cookie", "a=b"), ("set-cookie", "c=d")] as expected.
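A rough sketch of that multi-valued variant follows. The class name CasePreservingMultiDict and the add() method are assumptions made for this illustration; only getone(), getall() and items() are named in the comment above, and this is not how multidict itself is implemented.

```python
class CasePreservingMultiDict:
    """Hypothetical sketch: several entries per name, original case kept."""

    def __init__(self):
        # List of (UPPER_CASE_KEY, (original_key, value)) pairs, in insertion order.
        self._items = []

    def add(self, key, value):
        # Append rather than overwrite, so repeated headers are all kept.
        self._items.append((key.upper(), (key, value)))

    def getall(self, key):
        wanted = key.upper()
        values = [v[1] for k, v in self._items if k == wanted]
        if not values:
            raise KeyError(key)
        return values

    def getone(self, key):
        # Return the first value stored under this name.
        wanted = key.upper()
        for k, v in self._items:
            if k == wanted:
                return v[1]
        raise KeyError(key)

    def items(self):
        # Expose (original_key, value) pairs, hiding the upper-cased key.
        return [pair for _, pair in self._items]


cookies = CasePreservingMultiDict()
cookies.add("Set-Cookie", "a=b")
cookies.add("set-cookie", "c=d")
print(cookies.getall("SET-COOKIE"))
print(cookies.items())
```

The linear scan in getall() and getone() keeps the sketch short; a real implementation would likely need an index or a C extension to be fast.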

hubo1016 on 24 Mar 2017

👍1

Let's move our conversation to the multidict repo.

fafhrd91 on 24 Mar 2017

Also, I would like to speed up multidict.

fafhrd91 on 24 Mar 2017

Is there a link to an issue against multidict? I think if multidict were implemented optimally, it shouldn't need Cython.

thehesiod on 5 Jun 2017

There is no multidict issue.
I don't believe a fast multidict implementation is possible without C extensions.

asvetlov on 5 Jun 2017

Multidict 3.0+ preserves key casing

asvetlov on 30 Jun 2017

👍1

This thread has been automatically locked since there has not been
any recent activity after it was closed. Please open a [new issue] for
related bugs.

If you feel like there are important points made in this discussion,
please include those excerpts in that [new issue].