Node: multiple headers of the "same name" allowed in http

Created on 29 Oct 2015  路  53Comments  路  Source: nodejs/node

If { 'cookie: 'foo=bar;', 'Cookie: 'foo=meh;' } is passed, you have no idea which one will be used.

If a request relies on, for example, cookie information, a request may work seemingly at random and can be rather difficult to debug.

http stalled

Most helpful comment

This is still a major issue.

All 53 comments

It should be the last one. Headers are added in object insertion order, the last one wins out.

The behavior itself is most likely undocumented but I _think_ we have tests for this (not easy to grep for though.)

Should it allow multiple by the same at all? Is it too costly to check for this?

FYI -> http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2

Multiple message-header fields with the same field-name MAY be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list [i.e., #(values)]. It MUST be possible to combine the multiple header fields into one "field-name: field-value" pair, without changing the semantics of the message, by appending each subsequent field-value to the first, each separated by a comma. The order in which header fields with the same field-name are received is therefore significant to the interpretation of the combined field value, and thus a proxy MUST NOT change the order of these field values when a message is forwarded.

/cc @nodejs/http

Potentially there are a few solutions here:

1) Only use the first key when there are duplicate keys once the keys have been converted to lower-case.

Basically, this solution would be to say that since HTTP headers are not case-sensitive, if { cookie: 'foo=bar;', Cookie: 'foo=meh;' } is given, act on it as if the user had really written { cookie: 'foo=bar;', cookie: 'foo=meh;' }, which in JavaScript would have been converted to { cookie: 'foo=meh;' }:

$ node -pe "({ 'cookie': 'foo=bar;', 'cookie': 'foo=meh;' })"
{ cookie: 'foo=meh;' }

2) When given an object to set headers from (i.e. res.writeHead(status, obj)), if the obj contains duplicate keys once the keys have been converted to lower case, throw a TypeError.

This would signal to users that there is an issue in the given object, which was expected to only contains header-value pairs.

Option two sounds more appealing since if you can comma separated list syntax to define multiple headers otherwise.

Another option to consider is allowing an optional array syntax for that:

{ cookie: ['foo=bar', 'foo=meh'] }

The problem I saw in production was where one library was using cookie and one external source used Cookie and the headers were merged and one was lost.

Another option to consider is allowing an optional array syntax for that:

{ cookie: ['foo=bar', 'foo=meh'] }

I didn't mention it, because that is already supported; each element in the array will be a different header, making that response look like the following:

Cookie: foo=bar
Cookie: foo=meh

Assuming we are talking about res.writeHead. If you are referring to a different Node.js API, let me know, as I don't see what API is being used actually mentioned here.

Perfect :)

In that case I presume we can close an issue.

Why would you make that presumption?

FYI -> http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2

Each header field consists of a name followed by a colon (":") and the field value. Field names are case-insensitive.

Having a similar problem with X-Robots-Tag. I need to support the following

X-Robots-Tag: googlebot: nofollow
X-Robots-Tag: otherbot: noindex, nofollow

As specified here.

@cressie176 According to the documentation:

Use an array of strings here if you need to send multiple headers with the same name.

So I think this would work for your case:

response.setHeader('X-Robots-Tag', ['googlebot: nofollow', 'otherbot: noindex, nofollow']);

@Trott, thanks for the quick response. Unfortunately I've tried that - it generates a single header which I don't think conforms to googles spec.

x-robots-tag: 'googlebot: nofollow, otherbot: noindex, nofollow',

Furthermore if I add an additional directive to apply to all user agents, it gets concatenated with the last one in the list, i.e.

response.setHeader('X-Robots-Tag', ['googlebot: nofollow', 'otherbot: noindex, nofollow', 'noimageindex']);

Results in

x-robots-tag: 'googlebot: nofollow, otherbot: noindex, nofollow, noimageindex'

Ah, I see. Thanks for the clarification. (Wish I had more to add than just that.)

lol!

Try to use power of RegExp to split string to an array.

@bricss Sorry - not sure I follow. Are you suggesting I use regex to split the header sent in the response?

In this situation my code is running on the server and I'm using node (via express) to send a response header. Node is concatenating header values with the same name into a single string, then transmitting the response.

If I was writing the client I could split the string, but in this case the client is googlebot.

@cressie176 Ouh, sorry, now I see where is a problem.

@bricss no problem. Intentions were good.

X-Robots-Tag: nofollow
X-Robots-Tag: nocache
X-Robots-Tag: googlebot: noindex
X-Robots-Tag: unavailable_after: 1-Jan-3000 00:00:00 EST

should maybe result in something like:

{
  "x-robots-tag": "nofollow,nocache",
  "x-robots-tag: googlebot": "noindex",
  "x-robots-tag: unavailable_after": "1-Jan-3000 00:00:00 EST"
}

Also, how would we set such headers with setHeader() and writeHead() for Node to understand?

Edit: HTTP header field names do not allow the ":" character, so Google was never following the specification when they designed these headers. It should've been something like x-robots-tag|googlebot: noindex. Node.js aims to be spec-compliant, so this is a no-go.

Server Push feature of HTTP/2.0 also needs this feature, it will use multiple headers named Link
https://w3c.github.io/preload/#server-push-http-2

Closing as there does not appear to be anything do to here.

Hi @jasnell. closing because you don't think this is a bug / legit feature or closing because it's being tracked in another issue? If the former can you explain why?

Closing because the API already allows for multiple headers of the same name when those are allowed (some headers do not allow it). The existing support could potentially be better, but given that core intentionally does very little actual processing of header values, the best bet would be to look at a userland solution.

What userland solution? This is a poor choice IMO. See my original example, there is no way for the user to know which cookie will end being used in the headers.

The point of this ticket was that it should _not_ be allowed to have multiple headers of the same name in the object, as one of them will be discarded ( and it's not possible to tell which one is used last time i checked )

@jasnell thanks for the clarification. Can elaborate on how I can set multiple X-Robots-Tag headers via the API?, e.g.

X-Robots-Tag: googlebot: nofollow
X-Robots-Tag: otherbot: noindex, nofollow

As mentioned above

response.setHeader('X-Robots-Tag', ['googlebot: nofollow', 'otherbot: noindex, nofollow']);

did not work

@jasnell Can you explain why the ticket was closed. The issue wasn't resolved and your explanation doesn't make any sense. I don't see why this is a hardship to be implemented correct place.

Just error on options = { cookie: "a=b", Cookie: "a=c", CoOkIe: "a=d" } because there's no way to know what a will end up being set to.

If I'm incorrect in my assertion, that's one thing, but I don't believe I am

@jasnell wtf?

@jasnell - I and others believe this ticket was closed prematurely. Can we get it re-opened please?

I'd like to weigh in on, at least, the duplicate cookie header issue. According to RFC rfc6265:

When the user agent generates an HTTP request, the user agent MUST NOT attach more than one Cookie header field.

To me this means that Node should ignore the second cookie header when parsing the cookie header like it does for these:
https://github.com/nodejs/node/blob/master/lib/_http_incoming.js#L150-L167

To me this means that Node should ignore the second cookie header when parsing the cookie header like it does for these:

It does. but you won't know which capitalization it will chose. That's the problem. That's why I've been saying it shouldn't allow it.

For instance:

{
 headers: {
    'content-type': 'text/html',
    'Content-type': 'application/json',
    'CoNtEnT-TyPe': 'application/xml'
  }
}

This issue isn't about ignoring. It's a question of knowing which one doesn't get ignored.

@danielb2

Just error on options = { cookie: "a=b", Cookie: "a=c", CoOkIe: "a=d" } because there's no way to know what a will end up being set to.

According to RFC 6265#section-4.2.2:

Although cookies are serialized linearly in the Cookie header,
servers SHOULD NOT rely upon the serialization order. In particular,
if the Cookie header contains two cookies with the same name (e.g.,
that were set with different Path or Domain attributes), servers
SHOULD NOT rely upon the order in which these cookies appear in the
header.

So you can't ever rely that there is only a single cookie named "a".

The way I see it, Node will take the first of multiple headers. I'm fine with that honestly. The issue that we are currently facing is that a rouge app is sending two cookie headers. Each with different cookies. Now, if Node did the right things when concatenating or stopping at the first one them it wouldn't be an issue. But the fact that it concats them with a , causes the incoming header to be unusable. Example

Setting up a simple express app to echo the req.headers:

curl -H "Cookie: B=1234; A=12345;" -H "Cookie: B=5678;" http://127.0.0.1:3000/

Results in:

"cookie":"B=1234; A=12345;, B=5678;"

It would be different if it was returned as either of these:

"cookie":"B=1234; A=12345; B=5678;"
"cookie":"B=1234; A=12345;"

In the incoming message handler it's choosing several to make single headers and if there are multiple then separate them with a comma:
https://github.com/nodejs/node/blob/master/lib/_http_incoming.js#L176

To me, that means that cookie should be added to that list, as this results in a broken non-rfc compliant header.

Please take notice of the comments regarding X-Robots-Tag. This issue is not limited to cookies.

The way I see it, Node will take the first of multiple headers. I'm fine with that honestly.

When you have multiple possible code paths, and complex logic, you don't necessarily know what that's going to be.

What prompted the bug was part of one code was using one capitalization, and another part of the code was doing another capitalization.

Sure, if you KNOW about this behavior, you can look out for it, but when you don't know, you risk spending hours trying to find out what the problem is which is exactly what happened.

Code should be intuitive. You should know what to expect. I would not expect the library to simply discard one header because they have different capitalizations. They're being set for a reason.

This issue is confusing. In some places it talks about node parsing (receiving), in some places it talks about generating. Which is problematic?

As far as x-robots-tag goes, I'm confused, it's syntax violates https://github.com/nodejs/node/issues/3591#issuecomment-152731585, it appears, in that they cannot be concatenated with a ,. Which is wrong, x-robots, the quote from the spec, something else?

This issue is confusing. In some places it talks about node parsing (receiving),

This issue is about parsing input when making an http request.

@sam-github This issue seems to have grown. I see a couple of issues here.

Multiple incoming x-robots-tag header handling seems broken.
Multiple incoming cookie header handling seems broken.

There is also the general, how should it handle multiple headers of the same name. Which one should get parsed and used. Right now it appears to always accepts and uses the first one. It ignores some based on that case statement and others it concatenates with a ,. We probably need to break it into individual issues so that the discussion doesn't keep going circular :)

Thanks, @davglass, that helps a bit.

The node HTTP implementation doesn't have to have specific handling for all cases, but it does have to be sufficiently general and consistent in behaviour and docs that all useful cases can be implemented.

Maybe breaking into specific issues with example code would help.

Comments like https://github.com/nodejs/node/issues/3591#issuecomment-152772620 seem to describe not the parsing of headers received by node, but how the HTTP API accepts arguments to be formatted and sent _out_ on the wire.

AFAIK the original issue is in regards to sending out headers from node (which is why it talks about which keys to lookup in an object).

I completely disagree that this is a decision which should be made by node. Personally, I think the headers should be considered immutable data as far as node is concerned, and simply throw an error if duplicate headers are found. This can be accomplished by const headerType = Object.keys(headers).map(toLower); and checking headerType for duplicates, using an object as a dictionary and a hasOwnProperty check. If hasOwnProperty returns true, then the error is thrown, simple as that.

This approach allows the user to decide what the headers should be, and allows the library to decide what is and is not acceptable input, and can be implemented as a non-breaking change by adding a boolean property to the RequestOptions, perhaps named retainHeaders.

Right now I can't interact with a specific server I need to interact with because that server is expecting capitalized headers. Absurd, I know, but not every server is going to implement the spec to the T, and node needs a way to account for that.

The capitalized header issue is a blocker for me now, and it looks like the only way around the issue is to change node to allow me to decide what the headers will be, or use a non-node proxy server which can alter the headers back to the way I need them.

headers should be a Map.

Please add this to the v8.0.0 milestone. I think that it'd make sense to fix this along with the URL changes.

Edit: Apparently I can't @mention private groups I'm not a part of, heh. @jasnell

Sorry to dig this up (and I apologize for the ping) but I figured trying to round this thread to a conclusion first is a better idea than starting a new one. Skip to the last bullet points for the tldr.

So there's a few different issues here that are seemingly orthogonal to the same functionality:

  • should multiple case-insensitive headers be expected and acceptable behavior?
  • should http mutate headers as-given vs over the wire (with regards to casing)?
  • how should the decision of which header value(s) to use be reconciled when sending a request?

It seems the most important principle that continues to be brought up is not diverging from http spec to conform to broken api requirements, which seems reasonable.

It's possible the answer lies in a compromise between these things, given that the spec only proposes that headers are case-insensitive, which allows for some implementation leeway. We might gain some insight here by taking a look at the implementation or spec design decisions of other request clients based off the http protocol, more specifically whatwg (xhr and Fetch).

Specifically, this was the proposed plan as a solution in those specs:
https://github.com/whatwg/fetch/issues/304#issuecomment-238870561

The subsequent PR to the spec:
https://github.com/whatwg/fetch/pull/476

The PR aligning xhr's spec with fetch:
https://github.com/whatwg/xhr/pull/130

I'm not saying that Node.js should in any way be influenced by unrelated project's decisions.. and in fact many general use runtimes/languages have this implemented the same way (lowercase all headers over the wire), such as both ruby and python (as far as I was able to check anyway).

However, Node.js is in a slightly more unique position with its coupling to the npm ecosystem and has shown even in very recent history that we do want to comply with web standards and we do want to do it in core (the url module, as an example)

For full disclosure, I'm trying to champion this because as it stands the node-fetch lib cannot maintain spec compliance in this manner specifically because of this issue: https://github.com/bitinn/node-fetch/issues/260

So to summarize / tldr:

  • Preserve case for all headers over the wire (or optionally, all unknown headers)
  • Compare headers using case-insensitive search (so comparison works in both directions)
  • Appending or setting an existing header would preserve the case of the existing header, not create a duplicate with alternate casing
  • When using accessors to retrieve headers, always return lowercase (for backwards compatibility)
  • Optional: canonicalize case for certain known headers, per a table like this: WebKit HTTPHeaderNames, but possibly customized or limited more for Node's usage

Thoughts?

@jasnell sorry to ping you.. when I wrote my original comment above last week, I only tried to @ you via an edit, and I think it's possible this doesn't notify you. If indeed you did get it, please ignore this message :) Otherwise please see above and let me know if we should continue the discussion here or open a new issue with a broader scope. Thanks!

@jkantr ... my apologies! I've been a bit backed up and just forgot to get back to this one...

should multiple case-insensitive headers be expected and acceptable behavior?

Generally, yes. Specifically, all header field names are case insensitive. For instance, content-type is equivalent to Content-Type which is equivalent to CoNtEnT-tYpE, etc. That said, the specific requirements in RFC 7230 state that only header fields whose values are defined as comma-separated lists may appear more than once per message. There are a handful of header fields defined by the HTTP spec itself that are required to only appear once per message (content-type and content-length, for example). That said, header fields are extensible and unless your code knows exactly which headers are limited to single values, it's generally better to expect multiple header instances.

should http mutate headers as-given vs over the wire (with regards to casing)?

Implementations handle this differently. Proper implementations of the spec should be agnostic to case in header field names, which would make it just fine to mutate header field names. There are some buggy implementations out there, however, that do not handle this appropriately. Most modern implementations should be safe, however.

(It's a worthwhile data point that HTTP/2 requires that header field names be transmitted over the wire as ASCII lower-case.)

how should the decision of which header value(s) to use be reconciled when sending a request?

Generally, maintain a list of header fields that are known to be single value and accept all others as potentially multi-valued.

Is this not solvable with Headers?

Over a year with no update, so I'm going to close this.
Feel free to ping me if this should be reopened.

This is still a major issue.

@refack
The title and perhaps the original post may not be perfectly clear, but multiple headers having the same name/key should be allowed in the response.

I'm working on some proxy type code that requires appending domain to cookie values. I am getting multiple set-cookie headers in the response that I am proxying and I should be able to process them one at a time, and have multiple set-cookie headers in the response coming out of node, rather than having to maintain an array and concatenating them.

@refack please reopen.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

vsemozhetbyt picture vsemozhetbyt  路  3Comments

dfahlander picture dfahlander  路  3Comments

srl295 picture srl295  路  3Comments

sandeepks1 picture sandeepks1  路  3Comments

cong88 picture cong88  路  3Comments