Httpx: HTTPX changes order of headers sent, using OrderedDict

Created on 3 Dec 2020  ·  14 Comments  ·  Source: encode/httpx

With the requests Python library, it is possible to use an OrderedDict to send headers in a specified order. I tried the same with HTTPX.

import collections

import httpx

with httpx.Client() as client:
    ordered_headers = collections.OrderedDict()
    ordered_headers['X-First'] = '1'
    ordered_headers['X-Second'] = '2'
    ordered_headers['X-Third'] = '3'
    ordered_headers['Connection'] = 'Keep-Alive'
    ordered_headers['Accept'] = 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9'
    ordered_headers['Accept-Encoding'] = 'gzip, deflate, br'
    ordered_headers['Accept-Language'] = 'en-US,en;q=0.9'
    ordered_headers['X-Last'] = '999'
    r = client.get("http://x.x.x.x/raw", headers=ordered_headers)

This is the RAW HTTP request, as received by the server.

GET /raw HTTP/1.1
Host: x.x.x.x
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate, br
Connection: Keep-Alive
User-Agent: python-httpx/0.16.1
X-First: 1
X-Second: 2
X-Third: 3
Accept-Language: en-US,en;q=0.9
X-Last: 999

Any idea how to make sure the headers are sent in the same order as they are passed to the client.get() method?

question requests-compat

All 14 comments

Hello!

HTTPX doesn't ensure order across all headers sent to the remote host, but you'll notice that the order of headers you passed that are not among those set by default by HTTPX is maintained.
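
The ordering in the raw request above is consistent with a simple model: start from the client's default headers, overwrite or append the caller's headers, then put Host first. The sketch below reproduces that with a plain dict. It is an illustration only, not HTTPX's code, and the default header names and values are assumptions inferred from the output in this thread.

```python
# Plain-dict model of the observed ordering (not HTTPX's actual implementation).
# The defaults below are assumed from the raw request shown in this thread.
client_defaults = {
    "accept": "*/*",
    "accept-encoding": "gzip, deflate",
    "connection": "keep-alive",
    "user-agent": "python-httpx/0.16.1",
}
user_headers = {
    "x-first": "1",
    "x-second": "2",
    "x-third": "3",
    "connection": "Keep-Alive",
    "accept": "text/html",
    "accept-encoding": "gzip, deflate, br",
    "accept-language": "en-US,en;q=0.9",
    "x-last": "999",
}

merged = dict(client_defaults)
# dict.update() keeps the position of keys that already exist and
# appends new keys at the end, in the caller's order.
merged.update(user_headers)

# Host is placed ahead of everything else.
modeled_order = ["host"] + list(merged)
print(modeled_order)
```

In this model, the headers that overlap with the defaults (accept, accept-encoding, connection) stay in the defaults' positions, while the remaining caller headers keep their relative order at the end.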

Is this something that's blocking you in any way, or just a curiosity about why this differs from Requests behavior?

I need to make sure I'm sending requests with headers in a predetermined order.

Can you suggest where to look in the HTTPX source code to prevent it from rearranging the headers?

Hi,

Just came across this issue while scrolling through the issues tab. I would also benefit from being able to send ordered headers.
I was looking through HTTPX's source and HTTPCORE's source, but other than this I could not find where default headers are defined. If you could point me in the right direction, I might look into making a pull request.

Thanks.

@418Coffee That actually doesn't define any default headers. The default headers that @florimondmanca is referring to are actually here https://github.com/encode/httpx/blob/master/httpx/_client.py#L202-L212, and you may want to look at https://github.com/encode/httpx/blob/master/httpx/_client.py#L347-L356 as well. However, do note that headers are also modified upon encoding the request in several functions here: https://github.com/encode/httpx/blob/master/httpx/_content.py#L157, and possibly in other places as well.

You are on the right track with the Headers class though, I believe. Possibly modifying __setitem__ to delete the key if it exists and append it to the end of the list would work, but do test thoroughly, as you would be breaking the claim made here: https://github.com/encode/httpx/blob/master/httpx/_models.py#L711.
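
To make the __setitem__ suggestion concrete, here is a toy comparison of the two assignment strategies on a list of (name, value) pairs. OrderedHeaderList and its method names are hypothetical and exist only for this illustration; this is not the httpx.Headers class.

```python
class OrderedHeaderList:
    """Toy header container storing (name, value) pairs in a list."""

    def __init__(self, items=()):
        self._items = [(name.lower(), value) for name, value in items]

    def set_in_place(self, name, value):
        """Overwrite an existing header where it sits; append if absent."""
        name = name.lower()
        for index, (existing, _) in enumerate(self._items):
            if existing == name:
                self._items[index] = (name, value)
                return
        self._items.append((name, value))

    def set_move_to_end(self, name, value):
        """Drop any existing entry, then append, so the latest write is last."""
        name = name.lower()
        self._items = [(n, v) for n, v in self._items if n != name]
        self._items.append((name, value))

    def names(self):
        return [name for name, _ in self._items]


defaults = [("accept", "*/*"), ("user-agent", "demo")]

in_place = OrderedHeaderList(defaults)
in_place.set_in_place("x-first", "1")
in_place.set_in_place("accept", "text/html")
print(in_place.names())  # ['accept', 'user-agent', 'x-first']

moved = OrderedHeaderList(defaults)
moved.set_move_to_end("x-first", "1")
moved.set_move_to_end("accept", "text/html")
print(moved.names())  # ['user-agent', 'x-first', 'accept']
```

With the delete-and-append variant, the order of assignments decides the final order, which is the behavior being asked for here, at the cost of changing what assignment means for existing keys.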

Hi again,

I looked at it today and I might have done something (I don't know, tbh).
I made two changes. The first one is in _merge_headers:

def _merge_headers(
        self, headers: HeaderTypes = None
    ) -> typing.Optional[HeaderTypes]:
        """
        Merge a headers argument together with any headers on the client,
        to create the headers used for the outgoing request.
        """
        merged_headers = Headers(headers) # We set the user given headers first.
        merged_headers.update(self.headers) # Then update/append the default headers.
        return merged_headers

As you can see, I changed it so that headers are first set using the user-given headers and then updated with the default ones. However, this would overwrite already declared headers, so I added a check in the update function:

def update(self, headers: HeaderTypes = None) -> None:  # type: ignore
        headers = Headers(headers)
        for key, value in headers.raw:
            if key.decode(headers.encoding) in self: # If key already in headers, skip.
                continue
            self[key.decode(headers.encoding)] = value.decode(headers.encoding)

Right, I tested this and it seems to work fine, however the "Host" header is always shown as the first one in the response. I don't know what causes this.

> the "Host" header is always shown as the first one in the response. I don't know what causes this.

Looks like that's done in the _prepare method of Request: https://github.com/encode/httpx/blob/584a40513f820718b4356097fdf7452e154e6e99/httpx/_models.py#L843-L853

Hehe, I checked that already, and the Host header is only prepended if the headers don't already have "host" in them:

if not has_host and self.url.host:
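
A rough sketch of what that condition implies, assuming headers are held as a list of (name, value) byte pairs. The function with_host_first is hypothetical and only mirrors the quoted condition; it is not the library's code.

```python
def with_host_first(raw_headers, host):
    """Return headers with Host prepended, unless one is already present."""
    has_host = any(name.lower() == b"host" for name, _ in raw_headers)
    if not has_host and host:
        return [(b"Host", host.encode("ascii"))] + list(raw_headers)
    return list(raw_headers)


# No Host supplied: it is added at the front.
implicit = with_host_first([(b"x-first", b"1")], "localhost:8000")
print(implicit)  # [(b'Host', b'localhost:8000'), (b'x-first', b'1')]

# Host supplied by the caller: the list is left as given.
explicit = with_host_first(
    [(b"x-first", b"1"), (b"host", b"example.org")], "localhost:8000"
)
print(explicit)  # [(b'x-first', b'1'), (b'host', b'example.org')]
```

Under this reading, a caller who includes a host entry in their own headers controls where it appears.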

@418Coffee Rather than changing the behavior of .update() (which would make it behave differently than standard mappings, which is confusing), it sounds like we might want to consider .setdefault(), which does the "only set if not present" thing you seem to be looking for.

Okay sure, but we already call .update() on the default headers here, so wouldn't adding a .setdefault() just add unnecessary code? And what exactly do you mean by "which would make it behave differently than standard mappings, which is confusing"? Don't get me wrong, I'm fine with adding a .setdefault(), but I don't think I quite understand you. Thanks.

@418Coffee I meant that your _merge_headers() snippet might look something like this instead:

def _merge_headers(
        self, headers: HeaderTypes = None
    ) -> typing.Optional[HeaderTypes]:
        """
        Merge a headers argument together with any headers on the client,
        to create the headers used for the outgoing request.
        """
        merged_headers = Headers(headers)
        for name, value in self.headers.multi_items():
            merged_headers.setdefault(name, value)
        return merged_headers

I'm not sure this will help with your "keep headers ordered" use case, though.
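
For readers unsure about the "standard mappings" point, the difference can be seen with plain dicts: update() overwrites existing keys, while setdefault() only fills in missing ones. The header values below are made up for the illustration.

```python
user_headers = {"x-first": "1", "accept": "text/html"}
client_defaults = {"accept": "*/*", "user-agent": "demo-client"}

# update(): standard mapping semantics, later values overwrite earlier ones.
via_update = dict(user_headers)
via_update.update(client_defaults)

# setdefault(): only fills in names the caller did not supply.
via_setdefault = dict(user_headers)
for name, value in client_defaults.items():
    via_setdefault.setdefault(name, value)

print(via_update)      # the caller's "accept" is replaced by "*/*"
print(via_setdefault)  # the caller's "accept" stays "text/html"
```

Both results keep the caller's keys first. Only the setdefault() version also keeps the caller's values, without redefining what update() does.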

Could we maybe clarify what we expect the behavior to be…?

Right now this is what we have:

import httpx
from pprint import pprint

client = httpx.Client()
method = "GET"
url = "http://localhost:8000"
headers = {
    "x-first": "1",
    "x-second": "2",
    "x-third": "3",
    "connection": "keep-alive",
    "accept": "text/plain",
    "accept-encoding": "gzip, deflate",
    "accept-language": "en-US",
    "x-last": "999",
}
request = client.build_request(method, url, headers=headers)
pprint(request.headers.raw)

Output:

[(b'Host', b'localhost:8000'),
 (b'accept', b'text/plain'),
 (b'accept-encoding', b'gzip, deflate'),
 (b'connection', b'keep-alive'),
 (b'User-Agent', b'python-httpx/0.16.1'),
 (b'x-first', b'1'),
 (b'x-second', b'2'),
 (b'x-third', b'3'),
 (b'accept-language', b'en-US'),
 (b'x-last', b'999')]

How do we expect this to behave instead? Like this…?

[(b'x-first', b'1'),
 (b'x-second', b'2'),
 (b'x-third', b'3'),
 (b'accept', b'text/plain'),
 (b'connection', b'keep-alive'),
 (b'accept-encoding', b'gzip, deflate'),
 (b'accept-language', b'en-US'),
 (b'x-last', b'999'),
 (b'Host', b'localhost:8000'),
 (b'User-Agent', b'python-httpx/0.16.1')]

@got3nks There's actually a workaround that consists of building a Request instance and then setting its .headers explicitly to whatever you'd like, e.g. the OrderedDict:

import collections
from pprint import pprint

import httpx

client = httpx.Client()
request = client.build_request("GET", "https://example.org")

headers = collections.OrderedDict()
headers["x-first"] = "1"
headers["x-second"] = "2"
headers["connection"] = "keep-alive"
headers["accept"] = "text/html"
for k, v in request.headers.multi_items():
    headers.setdefault(k, v)

request.headers = httpx.Headers(list(headers.items()))

pprint(request.headers.raw)

Output:

[(b'x-first', b'1'),
 (b'x-second', b'2'),
 (b'connection', b'keep-alive'),
 (b'accept', b'text/html'),
 (b'host', b'example.org'),
 (b'accept-encoding', b'gzip, deflate, br'),
 (b'user-agent', b'python-httpx/0.16.1')]

Given how advanced this use case of ensuring the order of headers is (and I don't think the HTTP spec says anything about this?), it's also possible that you could get away with this more explicit style, without having to change anything in HTTPX…
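
One thing to watch with the workaround: the default names in the output above come back lower-cased, and OrderedDict.setdefault() compares keys case-sensitively, so a caller header spelled "Accept" would probably not suppress a default spelled "accept". The helper below is a small sketch that matches names case-insensitively. order_headers is a hypothetical name, not part of HTTPX.

```python
def order_headers(preferred, defaults):
    """Caller's headers first, in order; then defaults not already named."""
    ordered = list(preferred)
    seen = {name.lower() for name, _ in ordered}
    for name, value in defaults:
        # Compare names case-insensitively so "Accept" shadows "accept".
        if name.lower() not in seen:
            ordered.append((name, value))
    return ordered


preferred = [("X-First", "1"), ("Accept", "text/html")]
defaults = [
    ("host", "example.org"),
    ("accept", "*/*"),
    ("user-agent", "python-httpx/0.16.1"),
]
result = order_headers(preferred, defaults)
print(result)
```

The resulting list of pairs could then be wrapped in httpx.Headers(...) and assigned to request.headers, in the same way as the workaround above.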

That’s good enough for my use case, thanks @florimondmanca. I’ll give it a try soon!

Here's what RFC 7230 says about header order:

https://tools.ietf.org/html/rfc7230#section-3.2.2

The order in which header fields with differing field names are received is not significant. However, it is good practice to send header fields that contain control data first, such as Host on requests and Date on responses, so that implementations can decide when not to handle a message as early as possible.
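
As a rough illustration of that rule, two header lists can be compared by grouping values per field name, so that order across different names is ignored. The same RFC section also states that order is significant among fields sharing a name, so each group keeps its values in received order. The helper names below are hypothetical.

```python
from collections import defaultdict


def group_by_name(headers):
    """Map each lower-cased field name to its values, in received order."""
    grouped = defaultdict(list)
    for name, value in headers:
        grouped[name.lower()].append(value)
    return dict(grouped)


def same_meaning(first, second):
    """True if the lists differ only in the order of differing field names."""
    return group_by_name(first) == group_by_name(second)


a = [("Host", "example.org"), ("X-First", "1"), ("Accept", "text/html")]
b = [("Accept", "text/html"), ("Host", "example.org"), ("X-First", "1")]
print(same_meaning(a, b))  # True

c = [("Via", "1.1 proxy-a"), ("Via", "1.1 proxy-b")]
d = [("Via", "1.1 proxy-b"), ("Via", "1.1 proxy-a")]
print(same_meaning(c, d))  # False
```

A server that follows the RFC should treat a and b alike. The use case in this thread presumably involves receivers that look at the literal order anyway.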

Currently, HTTPX sticks to that by not caring particularly about header field order, and by prepending Host (instead of appending it).

Given that the spec is with us, and there's a workaround shown above if your particular use case requires preserving order, I'll close this off for now. :-)

Thanks all!
