With the requests Python library, it is possible to use an OrderedDict to send headers in a specified order. I tried the same with HTTPX.
import collections

import httpx

with httpx.Client() as client:
    # Insert the headers in the order they should appear on the wire.
    ordered_headers = collections.OrderedDict()
    ordered_headers['X-First'] = '1'
    ordered_headers['X-Second'] = '2'
    ordered_headers['X-Third'] = '3'
    ordered_headers['Connection'] = 'Keep-Alive'
    ordered_headers['Accept'] = 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9'
    ordered_headers['Accept-Encoding'] = 'gzip, deflate, br'
    ordered_headers['Accept-Language'] = 'en-US,en;q=0.9'
    ordered_headers['X-Last'] = '999'
    r = client.get("http://x.x.x.x/raw", headers=ordered_headers)
This is the RAW HTTP request, as received by the server.
GET /raw HTTP/1.1
Host: x.x.x.x
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate, br
Connection: Keep-Alive
User-Agent: python-httpx/0.16.1
X-First: 1
X-Second: 2
X-Third: 3
Accept-Language: en-US,en;q=0.9
X-Last: 999
Any idea how to make sure the headers are sent in the same order as they were passed to the client.get() method?
Hello!
HTTPX doesn't ensure order across all headers sent to the remote host, but you'll notice that the order of headers you passed that are not among those set by default by HTTPX is maintained.
Is this something that's blocking you in any way, or just a curiosity about why this differs from Requests behavior?
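The ordering described above can be reproduced with a small model. This is only a sketch, not HTTPX's actual code: it assumes the client starts from its default headers, overwrites matching names in place, appends new names in the order given, and prepends Host when it is missing. The function name merge_like_observed, the default header list, and the default values are assumptions made for illustration.

```python
def merge_like_observed(defaults, user, host):
    """Toy model of the observed ordering (not HTTPX's implementation)."""
    merged = list(defaults)
    for name, value in user:
        for i, (existing, _) in enumerate(merged):
            if existing.lower() == name.lower():
                # A name that is also a default keeps the default's slot.
                merged[i] = (name, value)
                break
        else:
            # A name that is not a default is appended in the order given.
            merged.append((name, value))
    if not any(n.lower() == "host" for n, _ in merged):
        merged.insert(0, ("Host", host))
    return merged


# Assumed defaults; the names and their order are a guess based on the raw request.
defaults = [
    ("Accept", "*/*"),
    ("Accept-Encoding", "gzip, deflate, br"),
    ("Connection", "keep-alive"),
    ("User-Agent", "python-httpx/0.16.1"),
]
user = [
    ("X-First", "1"),
    ("X-Second", "2"),
    ("X-Third", "3"),
    ("Connection", "Keep-Alive"),
    ("Accept", "text/html"),
    ("Accept-Encoding", "gzip, deflate, br"),
    ("Accept-Language", "en-US,en;q=0.9"),
    ("X-Last", "999"),
]
names = [name for name, _ in merge_like_observed(defaults, user, "x.x.x.x")]
print(names)
```

Under these assumptions the names come out in the same order as the raw request shown earlier: the custom X-* headers stay in their relative order, while headers that overlap with the defaults take the defaults' positions.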
I need to make sure I'm sending requests with headers in a predetermined order.
Can you suggest where to look in the HTTPX source code to prevent it from rearranging the headers?
Hi,
Just came across this issue while scrolling through the issues tab. I would also benefit from being able to send ordered headers.
I was looking through the HTTPX and HTTPCORE sources, but other than this I could not find where default headers are defined. If you could point me in the right direction, I might look into making a pull request.
Thanks.
@418Coffee That actually doesn't define any default headers. The default headers that @florimondmanca is referring to are actually here https://github.com/encode/httpx/blob/master/httpx/_client.py#L202-L212, and you may want to look at https://github.com/encode/httpx/blob/master/httpx/_client.py#L347-L356 as well. However, do note that headers are also modified when the request is encoded, in several functions here: https://github.com/encode/httpx/blob/master/httpx/_content.py#L157, and possibly in other places as well.
You are on the right track with the Headers class though, I believe. Possibly modifying __setitem__ to delete the key if it exists and append it to the end of the list would work, but do test thoroughly, as you would be breaking the claim made here: https://github.com/encode/httpx/blob/master/httpx/_models.py#L711.
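To make the "delete the key and append it to the end" idea concrete, here is a minimal sketch on a toy class. MoveToEndHeaders is a hypothetical name and is not httpx.Headers; it only illustrates the semantics being suggested, so a real change would still need testing against the rest of the library.

```python
class MoveToEndHeaders:
    """Toy ordered header list (hypothetical, not httpx.Headers)."""

    def __init__(self):
        self._items = []  # (name, value) pairs in send order

    def __setitem__(self, name, value):
        # Drop any existing entry with the same case-insensitive name...
        self._items = [
            (n, v) for n, v in self._items if n.lower() != name.lower()
        ]
        # ...then append, so the most recently set header goes last.
        self._items.append((name, value))

    def items(self):
        return list(self._items)


h = MoveToEndHeaders()
h["Accept"] = "*/*"
h["User-Agent"] = "demo"
h["accept"] = "text/html"  # replaces "Accept" and moves it to the end
print(h.items())  # [('User-Agent', 'demo'), ('accept', 'text/html')]
```

Note that this differs from a plain dict, where reassigning an existing key keeps the key in its original position.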
Hi again,
I looked at it today and I might have done something (I don't know, to be honest).
I made two changes. The first one is in _merge_headers:
def _merge_headers(
    self, headers: HeaderTypes = None
) -> typing.Optional[HeaderTypes]:
    """
    Merge a headers argument together with any headers on the client,
    to create the headers used for the outgoing request.
    """
    merged_headers = Headers(headers)  # We set the user-given headers first.
    merged_headers.update(self.headers)  # Then update/append the default headers.
    return merged_headers
As you can see, I changed it so that headers are first set from the user-given headers and then updated with the default ones. However, this would overwrite headers that were already declared, so I added a check in the update function:
def update(self, headers: HeaderTypes = None) -> None:  # type: ignore
    headers = Headers(headers)
    for key, value in headers.raw:
        if key.decode(headers.encoding) in self:  # If key already in headers, skip.
            continue
        self[key.decode(headers.encoding)] = value.decode(headers.encoding)
Right, I tested this and it seems to work fine, however the "Host" header is always shown as the first one in the response. I don't know what causes this.
the "Host" header is always shown as the first one in the response. I don't know what causes this.
Looks like that's done in the _prepare method of Request: https://github.com/encode/httpx/blob/584a40513f820718b4356097fdf7452e154e6e99/httpx/_models.py#L843-L853
Hehe, I checked that already, and the Host header is only prepended if the headers don't already have "host" in them.
if not has_host and self.url.host:
@418Coffee Rather than changing the behavior of .update() (which would make it behave differently than standard mappings, which is confusing), it sounds like we might want to consider .setdefault(), which does the "only set if not present" thing you seem to be looking for.
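As a small illustration of the difference, using plain dicts rather than HTTPX's Headers class (the header values here are made up for the example): update() overwrites a key that is already present, while setdefault() leaves it alone. In both cases the existing key keeps its position.

```python
user = {"x-first": "1", "accept": "text/html"}
defaults = {"accept": "*/*", "user-agent": "python-httpx/0.16.1"}

# update(): the default "accept" overwrites the user's value.
with_update = dict(user)
with_update.update(defaults)

# setdefault(): the user's "accept" wins; only missing names are added.
with_setdefault = dict(user)
for name, value in defaults.items():
    with_setdefault.setdefault(name, value)

print(with_update)      # accept is '*/*'
print(with_setdefault)  # accept is 'text/html'
```

This is why skipping existing keys inside update() would be surprising: callers familiar with standard mappings expect update() to overwrite.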
Okay sure, but we already call .update() on the default headers here, so wouldn't adding a .setdefault() just add unnecessary code? And what exactly do you mean by "which would make it behave differently than standard mappings, which is confusing"? Don't get me wrong, I'm fine with adding a .setdefault(), but I don't think I quite understand you. Thanks.
@418Coffee I meant that your _merge_headers() snippet might look something like this instead:
def _merge_headers(
    self, headers: HeaderTypes = None
) -> typing.Optional[HeaderTypes]:
    """
    Merge a headers argument together with any headers on the client,
    to create the headers used for the outgoing request.
    """
    merged_headers = Headers(headers)
    for name, value in self.headers.multi_items():
        merged_headers.setdefault(name, value)
    return merged_headers
I'm not sure this will help with your "keep headers ordered" use case, though.
Could we maybe clarify what we expect the behavior to be…?
Right now this is what we have:
import httpx
from pprint import pprint
client = httpx.Client()
method = "GET"
url = "http://localhost:8000"
headers = {
    "x-first": "1",
    "x-second": "2",
    "x-third": "3",
    "connection": "keep-alive",
    "accept": "text/plain",
    "accept-encoding": "gzip, deflate",
    "accept-language": "en-US",
    "x-last": "999",
}
request = client.build_request(method, url, headers=headers)
pprint(request.headers.raw)
Output:
[(b'Host', b'localhost:8000'),
(b'accept', b'text/plain'),
(b'accept-encoding', b'gzip, deflate'),
(b'connection', b'keep-alive'),
(b'User-Agent', b'python-httpx/0.16.1'),
(b'x-first', b'1'),
(b'x-second', b'2'),
(b'x-third', b'3'),
(b'accept-language', b'en-US'),
(b'x-last', b'999')]
How do we expect this to behave instead? Like this…?
[(b'x-first', b'1'),
(b'x-second', b'2'),
(b'x-third', b'3'),
(b'accept', b'text/plain'),
(b'connection', b'keep-alive'),
(b'accept-encoding', b'gzip, deflate'),
(b'accept-language', b'en-US'),
(b'x-last', b'999'),
(b'Host', b'localhost:8000'),
(b'User-Agent', b'python-httpx/0.16.1')]
@got3nks There's actually a workaround that consists of building a Request instance, and then setting its .headers explicitly to whatever you'd like, e.g. the OrderedDict:
import collections
from pprint import pprint

import httpx

client = httpx.Client()
request = client.build_request("GET", "https://example.org")
# Headers that must come first, in the desired order.
headers = collections.OrderedDict()
headers["x-first"] = "1"
headers["x-second"] = "2"
headers["connection"] = "keep-alive"
headers["accept"] = "text/html"
# Append the headers HTTPX set on the request, unless already present.
for k, v in request.headers.multi_items():
    headers.setdefault(k, v)
request.headers = httpx.Headers(list(headers.items()))
pprint(request.headers.raw)
Output:
[(b'x-first', b'1'),
(b'x-second', b'2'),
(b'connection', b'keep-alive'),
(b'accept', b'text/html'),
(b'host', b'example.org'),
(b'accept-encoding', b'gzip, deflate, br'),
(b'user-agent', b'python-httpx/0.16.1')]
Given how advanced this use case of ensuring order of headers is (and I don't think the HTTP spec says anything about this?), it's also possible that you could get away with this more explicit style, without having to change anything in HTTPX…
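One caveat with the workaround above: OrderedDict.setdefault() compares names case-sensitively, so a preferred "Accept" and an existing "accept" would both end up in the result. A possible variation, sketched here as a standalone helper, does the same merge case-insensitively. The name order_headers is hypothetical and not part of HTTPX; it works on plain (name, value) pairs, which could then be passed to httpx.Headers as a list, as in the snippet above.

```python
def order_headers(preferred, existing):
    """Return preferred pairs first, then existing pairs whose names are new.

    Names are compared case-insensitively. Hypothetical helper, not an HTTPX API.
    """
    preferred = list(preferred)
    taken = {name.lower() for name, _ in preferred}
    # Existing headers keep their relative order; repeated names among them are kept.
    extras = [(n, v) for n, v in existing if n.lower() not in taken]
    return preferred + extras


preferred = [("X-First", "1"), ("Accept", "text/html")]
existing = [
    ("host", "example.org"),
    ("accept", "*/*"),
    ("user-agent", "python-httpx/0.16.1"),
]
ordered = order_headers(preferred, existing)
print(ordered)
```

Here the existing "accept" is dropped in favour of the preferred "Accept", and "host" and "user-agent" follow the preferred headers.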
That’s good enough for my use case, thanks @florimondmanca. I’ll give it a try soon!
Here's what RFC 7230 says about header order:
https://tools.ietf.org/html/rfc7230#section-3.2.2
The order in which header fields with differing field names are received is not significant. However, it is good practice to send header fields that contain control data first, such as Host on requests and Date on responses, so that implementations can decide when not to handle a message as early as possible.
Currently, HTTPX sticks to that by not caring particularly about header field order, and pre-pending Host (instead of appending it).
Given that the spec is with us, and there's a workaround shown above if your particular use case requires preserving order, I'll close this off for now. :-)
Thanks all!
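A related point from the same RFC section: while the order of fields with differing names is not significant, the order of fields with the same name is, because a recipient may combine them into one comma-separated value in the order received. The sketch below illustrates that combining step; combine_fields is a hypothetical name, and it ignores special cases such as Set-Cookie, which the RFC notes cannot be combined this way.

```python
def combine_fields(fields):
    """Combine same-name header fields into comma-separated values, in order."""
    combined = {}
    for name, value in fields:
        key = name.lower()  # field names are case-insensitive
        if key in combined:
            combined[key] += ", " + value
        else:
            combined[key] = value
    return combined


received = [
    ("Accept-Language", "en-US"),
    ("X-Trace", "a"),
    ("accept-language", "fr;q=0.5"),
]
combined = combine_fields(received)
print(combined)
```

Swapping the two Accept-Language fields would change the combined value, whereas moving X-Trace would not change any value.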