Crystal: HTTP::Headers gets downcase

Created on 8 Aug 2019 · 16Comments · Source: crystal-lang/crystal

This shows it perfectly:

require "http"

headers = <<-EOF
Content-Length: 0
User-Agent: Me!12
\n
EOF

io_mem = IO::Memory.new
HTTP.parse_headers_and_body(IO::Memory.new(headers)) do |headers, body|
  HTTP.serialize_headers(io_mem, headers)
  pp body
end

io_mem.to_s # => "content-length: 0\r\nuser-agent: Me!12\r\n\r\n"

The expected result is to get back the same "case" for the headers, but instead it gets downcased

bug topicnetworking

Source

bararchy

Most helpful comment

Yeah for HTTP/2. There is no such rule for older protocols. But I guess we essentially want to use the same data structures for any HTTP protocol, so it's probably better to normalize downcased right away.

straight-shoota on 8 Aug 2019

👍3

All 16 comments

This is due to recent optimizations in #8002. It only affects common header names which are cached downcased for easier comparability (see https://github.com/crystal-lang/crystal/pull/8002/files#diff-520d933e69a34d18025788f47787c6f2R173).
If you use an uncommon header name (for example Foo: 0), it gets copied verbatim.

Maybe we should consider a different solution to caching common names (for example StringPool as suggested in https://github.com/crystal-lang/crystal/pull/8002#discussion_r308346901).

Because header names are case-insensitive this is not a very serious issue, though. But it can still be annoying.

straight-shoota on 8 Aug 2019

👍1

We are doing a lot of HTTP::Headers manipulation, and we digest and parse them raw.
This introduced a need to manually upcase to verify using specs.

bararchy on 8 Aug 2019

This introduced a need to manually upcase to verify using specs

But upcase shouldn't work here because you'd get CONTENT-TYPE instead of Content-Type?

In any case, I'll "fix" this by making it always return "Content-Type", "User-Agent", etc. Headers are case insensitive and that's a good default, I think.

asterite on 8 Aug 2019

@asterite I meant Capitalize :) my mistake.
we do something along the lines of .split("-").map {|h| h.capitalize}

bararchy on 8 Aug 2019

👍1

Headers in HTTP/2 are now always transmitted lowercase (followed in HTTP/3) and it's becoming common to now receive headers lowercased, even in HTTP/1 and it's clunky case-insensitiveness (with an HTTP/2 reverse proxy for example, which is now default in Google GKE for example).

I strongly advise to join the bandwagon. HTTP headers are now lowercase by default, and that's good.

@asterite there is nothing to fix, in fact all headers should always be lowercased.

ysbaddaden on 8 Aug 2019

👍2

@bararchy That doesn't work for ETag for example.

@ysbaddaden

there is nothing to fix, in fact all headers should always be lowercased.

Who says that?

Since case doesn't matter, IMHO the best behaviour is to not alter any names.

I could accept a general doctrine to consider all header names downcased though, if that's a future proof path. As long as we consequently downcase them everywhere (Headers.add for example), this is probably okay (sometimes it might be surprising though).

straight-shoota on 8 Aug 2019

I'm good as long as what goes in == what comes out, otherwise, there are suprises

bararchy on 8 Aug 2019

@bararchy Well, that's the question here: Should we keep everything as it comes in (which would imply possibly mixed styles in one place), or should we apply some kind of normalization to headers in order to have uniform representation.
I don't think that either way necessarily causes more issues than the other. But the status quo after #8002 is the least appreciated, but also not intended (common headers are normalized, others not).

straight-shoota on 8 Aug 2019

oops

straight-shoota on 8 Aug 2019

Source for the lowercasing headers:

https://tools.ietf.org/html/rfc7540#section-8.1.2

Just as in HTTP/1.x, header field names are strings of ASCII
characters that are compared in a case-insensitive fashion. However,
header field names MUST be converted to lowercase prior to their
encoding in HTTP/2. A request or response containing uppercase
header field names MUST be treated as malformed (Section 8.1.2.6).

Blacksmoke16 on 8 Aug 2019

straight-shoota on 8 Aug 2019

👍3

Maybe I can remove the optimization and leave the code as it was before.

asterite on 9 Aug 2019

A conservative change including optimization would be to compare common names case-sensitive and use capitalized strings (or both capitalized and downcased). That's strictly better than just reverting the optimization.

Alternatively we could just proceed forward and downcase everything as @ysbaddaden suggested.

straight-shoota on 9 Aug 2019

@straight-shoota You mean making the lookup hash be:

COMMON_HEADERS = {
  "content-length" => "content-length",
  "Content-Length" => "Content-Length",
  # etc.
}

asterite on 9 Aug 2019

Yes. But you wouldn't need a hash then, a list would suffice when there is no normalization.

straight-shoota on 9 Aug 2019

Makes sense. I like that approach. It wouldn't cover other cases like "Content-length", though we might also want to consider that... I'll later update this PR with all these variants. It's probably the best of all worlds: optimized and keeps the original casing in most common cases. Great idea!

asterite on 9 Aug 2019

👍1

Was this page helpful?

0 / 5 - 0 ratings