Envoy: In HTTP/1.1, envoy should pass-through header capitalization unmodified (without lower-casing)

Created on 2 Oct 2019  路  6Comments  路  Source: envoyproxy/envoy

Title: In HTTP/1.1, envoy should pass-through header capitalization unmodified (without lower-casing)

Description:
This ticket is related to https://github.com/envoyproxy/envoy/issues/6485, which was declined; with a renewed focus on the core point of discussion. HTTP/2 requires all header fields to be lower-cased. However, the HTTP/1.1 standard is far less opinionated on the issue. On the one hand, it specifies that "field names are case-insensitive"; and on the other, it states:

Applications ought to follow "common form", where one is known or indicated,
when generating HTTP constructs, since there might exist some implementations
that fail to accept anything beyond the common forms.

Which could be interpreted to mean Train-Case as a default, since that is the form most commonly referred to in documentation of various headers (e.g. Content-Type).

Even if we take the standard to be ambiguous in this respect, there are major tools that deliver headers exactly as they were sent without modifying their case (e.g. php request headers). This leads to real-world client code often being "broken" (in terms of API standard) by expecting headers in a specific casing.

While it may be reasonable to expect an organization to be able to internally fix its own client code, if Envoy is being used for APIs served externally to that organization, it will lead to avoidable -- sometimes show-stopping bugs. In our experience, we have found that trying to migrate to Envoy introduced a breaking change to our external APIs by hiding an optional header from certain client integrations (e.g. PHP). This failure case is silent and produces no errors on either side of the integration (uh-oh). As far as our client code is concerned, it's simply as if the header was never sent!

enhancement help wanted

Most helpful comment

I think one option to support these kind of legacy systems would be to allow transforming the header keys in the http1 codec from LowerCaseString into an arbitrary string before writing it to the connection.

Trying to plumb through the casing of the upstream response headers sounds like a ton of work given how HeaderMap works, but I think if you could specify either static mappings ("content-type => Content-Type") or a formatter (map everything to Train-Case) you could enable systems to preserve the casing of response headers from before Envoy was put in front.

This also applies better to situations in which multiple proxy hops are added where some of them are h/2: the edge Envoy might even have access to the original casing, as it might have been normalized somewhere between the edge and the original upstream. Specifying the formatting rules on the edge means that you don't actually need to know the original casing.

All 6 comments

This has come up several times over the years. In general I don't think it's worth the churn to change this behavior (it will be very complicated), but I understand why it is wanted. I think @jmarantz had thought about a way to actually do this so hopefully he can weigh in.

I think one option to support these kind of legacy systems would be to allow transforming the header keys in the http1 codec from LowerCaseString into an arbitrary string before writing it to the connection.

Trying to plumb through the casing of the upstream response headers sounds like a ton of work given how HeaderMap works, but I think if you could specify either static mappings ("content-type => Content-Type") or a formatter (map everything to Train-Case) you could enable systems to preserve the casing of response headers from before Envoy was put in front.

This also applies better to situations in which multiple proxy hops are added where some of them are h/2: the edge Envoy might even have access to the original casing, as it might have been normalized somewhere between the edge and the original upstream. Specifying the formatting rules on the edge means that you don't actually need to know the original casing.

As Matt mentioned, we had a DM-slack where I proposed a fairly non-intrusive mechanism for doing this, though it hasn't gotten bumped to the top of my stack yet.

@alyssawilk has some context as well.

I'm not a fan of actual header case preservation because even the non-intrusive version IMO is going to be hairy to survive H2 hops, header additions and header removals. I'd be really curious if real world data shows that Train-Case is sufficient for most "broken" users, or if folks actually need header preservation for CustomNon-TrainCase-Headers. I don't have much intuition on this one other than "it's going to be problematic"

Per @alyssawilk I would really like to avoid preserving case if at all possible because I think it will be difficult to implement and still not work in all cases. I like @snowp's idea of having a formatter option that is implemented on encoding. I think this could be implemented pretty easily and would have a near zero cost for those not using this feature.

We don't actually require true "preservation" -- so long as there's some way to make sure they're Train-Case on the way out for H1. Either by default (which may be problematic for those now expecting kebab-case headers), or giving us a tool to configure encoding one way or another ourselves.

Was this page helpful?
0 / 5 - 0 ratings