Envoy: Switch from http-parser to llhttp

Created on 29 Nov 2018 · 55Comments · Source: envoyproxy/envoy

Apparently, http_parser is abandonware and node.js started work on a switching to llhttp.

cc @alyssawilk @htuch @mattklein123

aresecurity no stalebot priorithigh tech debt

Source

PiotrSikora

👍1

Most helpful comment

So far I have 73% of http1 codec tests passing, working through them. llhttp introduces slight deviations - stricter header parsing (i.e. this block is no longer required), slightly different handling of upgrade, etc.

Also, I'm going to do the URL utility library as a separate change if no one beats me to it. My updated branch leaves that as is for now.

derekargueta on 31 Oct 2019

👍3

All 55 comments

One concern around llhttp is it uses https://github.com/indutny/llparse and a bunch of TypeScript. I don't want to go down the language war path, but it's probably worth considering the increased build complexity (Bazelifying llparser and dependencies) and developer impedance matching when making the call. All things being equal, I'm a fan of higher level solutions to parsing, but switching to this is not totally obvious.

htuch on 29 Nov 2018

Well, the auto-generated llhttp was just added to node.js as the experimental-http-parser, and node.js is/was the only thing keeping the hand-written http-parser on a life-support, so the decision might be forced upon us sooner rather than later... I don't have a strong opinion until then.

As for the extra build-time dependencies, we just need the generated C code, see: https://github.com/indutny/llhttp/issues/10.

PiotrSikora on 29 Nov 2018

If I could be of any help, please do not hesitate to ask me directly.

indutny on 29 Nov 2018

🎉1

@indutny would you mind filing a tracking bug for the migration over on the node.js so interested envoy developers can follow it? If we opt to migrate we'd probably swap shortly after you, and if we don't it's still important to know when node.js stops using http_parser from a security perspective, so it's good to know either way :-)

alyssawilk on 29 Nov 2018

My opinion on this is that we should take a wait and see approach. TBH, HTTP/1.1 at this layer is basically done. I don't really see what changes we would need to make.

mattklein123 on 29 Nov 2018

@alyssawilk https://github.com/nodejs/node/issues/24730 - This might do.

indutny on 29 Nov 2018

👍1

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.

stale[bot] on 10 Feb 2019

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted". Thank you for your contributions.

stale[bot] on 17 Feb 2019

Reopening in light of https://github.com/envoyproxy/envoy/issues/6434. Since we didn't get any response from Node.JS security WG when reporting, it seems that http-parser should be considered unsupported. The community did drive a fix when we went public, see https://github.com/nodejs/http-parser/issues/468.

Since http-parser is a codec and hence exposed to untrusted traffic from clients and backends on the data plane, it has a higher standard to live up to than other external dependencies. I would like us to consider only including dependencies that have active security teams that are at least doing CVE management. This is further motivation for moving away from http-parser.

htuch on 10 Apr 2019

@htuch I'm sorry about the negative experience that you had. I wish we would have seen your security report on our channels before you submitted that github issue. What channel did you use? Was it HackerOne or the mailing list ([email protected])? I'm having a hard time finding your report...

I am part of Node.js community and one of the maintainers of http-parser. I believe that in our shared effort we managed to fix the issue very promptly after it was disclosed. Sorry again if this caused you any frustration!

indutny on 10 Apr 2019

As a matter of fact, I'm a maintainer of llhttp too :joy:

indutny on 10 Apr 2019

@indutny we sent two e-mails, one pre and one post disclosure to [email protected]. Happy to forward them again. Is HackerOne preferred? The title was "Vulnerability in http-parser & embedded NULL header handling".

Thanks again for the prompt fix once this went public, the comment above reflects a concern that we didn't have any effective method to coordinate prior to public disclosure.

htuch on 10 Apr 2019

Oh no! [email protected] should be forwarding reports to HackerOne. Apparently, something is not working as expected. Again, I'm terribly sorry for this. Will look into it momentarily.

indutny on 10 Apr 2019

@indutny Will https://hackerone.com/nodejs/hacktivity and https://groups.google.com/group/nodejs-sec be updated retroactively with the CVE?

@htuch it seems like the reporting system would have been or is broken for both http-parser and lhttp (independent of their health status as sub-projects).

moderation on 10 Apr 2019

@moderation that's a good question. We can definitely request a CVE for the issue. Let me see what I can do.

indutny on 11 Apr 2019

cc @rvagg @jasnell, perhaps?

indutny on 11 Apr 2019

@indutny we sent two e-mails, one pre and one post disclosure to [email protected]. Happy to forward them again. Is HackerOne preferred? The title was "Vulnerability in http-parser & embedded NULL header handling".

Hi, I work for @Hacker0x01 and act as our main liaison to the open source community. Hoping I can assist you.

When you e-mailed [email protected], you should have received an e-mail back from HackerOne stating "ACTION REQUIRED: Submit Vulnerability Report". Can you check to see if you received that? If so, you'll need to click that link in order to finalize your report submission.

In any case, you can always submit directly to https://hackerone.com/nodejs.

reedloden on 11 Apr 2019

@reedloden as discussed in https://github.com/nodejs/security-wg/issues/454, this was not received. I'm thinking that this is a process flaw if folks can opt to use e-mail or HackerOne and the e-mail path is not as reliable or requires steps that might result in a dropped report. Would it be better to just drop the e-mail option entirely and have folks use HackerOne directly to avoid potential lost reports?

htuch on 11 Apr 2019

I haven't been as active in the security triage recently but I don't recall this coming through. My apologies for having missed it.

jasnell on 11 Apr 2019

@htuch I figured out what happened, but until it's fixed (either on our side or on Node.js's), I would recommend that https://hackerone.com/nodejs be utilized. Even if e-mail was working, the web interface is preferred, as e-mail is not secure.

reedloden on 11 Apr 2019

@reedloden thanks for the analysis of the issue. @indutny are you going to followup with the NodeJS security WG on (1) CVE-2019-9900 and (2) how to deal with the e-mail redirect issues? I can do (1) if that helps by filing a HackerOne report.

htuch on 11 Apr 2019

Is picohttpparser still in the running as a replacement as well? It was mentioned in #2880 & #5536 as a potential perf enhancement for http1.1 .

cmluciano on 11 Apr 2019

👍2

May be time to revisit this. llhttp now the default in Node 12 and the team are looking to deprecate http-parser https://twitter.com/addaleax/status/1173966000175366144

moderation on 18 Sep 2019

Should we do an analysis between llhttp, picohttpparser, and home-brewed Ragel (another of @brian-pane's ideas) to decide what replacement to use?

derekargueta on 24 Oct 2019

👍2

Yep, it would be great to see a comparison, including feature support, standard compliance, performance, licenses, security process/maturity and so on. http-parser is basically EOLed.

htuch on 24 Oct 2019

Happy to see an analysis done, but IMO we should probably just use llhttp since it's roughly a drop in API replacement for http_parser.

I've used picohttp in the past and we would have to implement a substantial amount of stuff on top that http_parser/llhttp already implements.

mattklein123 on 24 Oct 2019

The only thing that slightly sketches me out about llhtttp is the way it pushes us towards having to live in Typescript land. I'm a fan of higher level languages for parsers and codecs, but not such a huge Javascript fan. We can just use the released C implementations though, so build integration shouldn't be an issue. FWIW, the concerns over appropriate handling of vulnerabilities in the Node.js community remains, it took a long time to resolve the http-parser issues I reported.

htuch on 25 Oct 2019

Started playing around with llhttp (diff so far), the parser API is pretty much the exact same which is great, but it doesn't expose the URL parsing library like http-parser which we use in a few places as a URL utility.

derekargueta on 27 Oct 2019

🚀1

@derekargueta can we move away from relying on these URL parsing utils and switch to the Chromium URL library as per https://github.com/envoyproxy/envoy/issues/6588? This would allow us to get behind a single source of URL parsing truth, which can only be a good thing for consistency; we've seen quite a few issues with different parts of Envoy not agreeing on things like URL semantics historically.

htuch on 27 Oct 2019

@htuch sure, I was unaware of sunsetting the Node URL parser.

Also, should the new parser should initially be gated behind a feature flag to allow opt-in before forcing everywhere, similar to when the buffer implementation changed?

derekargueta on 27 Oct 2019

@derekargueta I'd recommend a feature flag to allow folks to fall back as needed, but I'd advocate we flip it on eagerly, there isn't a ton of value in remaining with the extant URL parser.

htuch on 28 Oct 2019

👍2

Also, I'm going to do the URL utility library as a separate change if no one beats me to it. My updated branch leaves that as is for now.

derekargueta on 31 Oct 2019

👍3

@derekargueta please let me know if I can be of any help in this endeavor.

indutny on 1 Nov 2019

👍1

FWIW I highly suggest moving from http_parser to llhttp. There was even discussion about changing the defaults in 10.x during LTS due to lack of ability to continue to support the deprecated parser

MylesBorins on 7 Nov 2019

I will update the title of this issue. We are switching. @derekargueta is working on it.

mattklein123 on 7 Nov 2019

This change is about ready, however there's one missing feature in llhttp which is the lenient_http_headers option (used here in http-parser). This option toggles strict validation of HTTP header values by ensuring they are printable ASCII characters. We currently run http-parser in lenient mode (which is the default) then provide a runtime switch to enable strict header validation. However, llhttp defaults to strict validation and doesn't provide an option to disable it.

To move forward, we either need to

get a lenient option added in llhttp (there's a TODO for this cc @indutny) or
deprecate envoy.reloadable_features.strict_header_validation and have all traffic use strict header value checking. Based on some conversation on the PR that added strict_header_validation, it seems that we eventually wanted to move this way? If so, I think this option would be better.

derekargueta on 14 Nov 2019

https://github.com/envoyproxy/envoy/pull/8849 flips strict validation on by default and the option will be removed entirely (unless folks yell) at the next Envoy release. I assume we're going to support both parsers in parallel for a bit so I think going 100% strict is probably fine (we can revisit if changing the default causes problems in upcoming weeks)

alyssawilk on 14 Nov 2019

Awesome stuff. @derekargueta is there any way to support both parsers without a recompile? IMO this should be feature flagged and flippable at startup like we did with the buffer implementation?

mattklein123 on 14 Nov 2019

@derekargueta any update on status here?

htuch on 6 Mar 2020

@htuch I've done my part to encourage @derekargueta :wink: https://twitter.com/moderat10n/status/1230125590428053509

moderation on 6 Mar 2020

No status update at the moment - I have started catching my stale branches up to master. There's been some refactorings of the HTTP codecs that I need to work through

derekargueta on 9 Mar 2020

Started re-investigating this. @indutny one issue that came up is that the llhttp release artifact is that it's not built with strict mode enabled, and this option cannot be enabled from the C API since it is a variable in producing the C code. Are there plans to publish release builds with strict mode enabled, or be able to set strict mode through the C-API perhaps using -D like http-parser? This is causing quite a few tests to fail but I really don't want to introduce Typescript tooling in the Envoy build (for running npx to generate the C code with strict mode)

derekargueta on 4 May 2020

There is a bit of optimizing that has to be done on llhttp's code before compiling it down to C. I don't think that it'd be possible to introduce the configuration defines the same way as they are in http-parser. Runtime checks would likely affect performance so I'd avoid this too.

There are two things that could be done, however:

Bundling two separate builds and switching between them in a build system (Easy, might be aesthetically undesirable)
Combining two generated .c files from strict and loose mode builds into one guarded by a huge top-level #ifdef

I'll look into the latter to see how much churn this would involve in llhttp. Could you consider the former meanwhile?

indutny on 4 May 2020

See: https://github.com/nodejs/llhttp/pull/58

indutny on 4 May 2020

FWIW, that Pull Request has landed and was released in llhttp 2.1.0.

indutny on 19 Jul 2020

@derekargueta any update on progress here?

htuch on 20 Aug 2020

Now @derekargueta 's presentation at KubeConEU is done he should have some cycles :smile:

moderation on 20 Aug 2020

😄1

Yes I had a working implementation with @indutny's last release that fixed the strict-mode issue but was waiting for the exception-removal work in theHTTP codecs to settle. Currently working through some git conflicts :) should be done soon.

derekargueta on 21 Aug 2020

👀1

This commit on 2020-10-02 confirms that the last remaining maintainer for http-parser has moved on and http-parser is no longer maintained - https://github.com/nodejs/http-parser/commit/ec8b5ee63f0e51191ea43bb0c6eac7bfbff3141d.

From an @envoyproxy/dependency-shepherds POV it would be great if we could get https://github.com/envoyproxy/envoy/pull/9033 over the line /cc @derekargueta @htuch

moderation on 28 Oct 2020

I've tagged this as both priority and security in response to the above comment.

htuch on 9 Nov 2020

I have a "drop-in replacement" branch with everything passing _except_ the new functionality that was introduced via http-parser for allowing transfer-encoding: chunked + content-length: N. There's an llhttp issue for this where I've started a small investigation and the fix seems pretty straightforward: https://github.com/nodejs/llhttp/issues/69

derekargueta on 10 Nov 2020

👍1

FYI, nodejs/llhttp#69 was just closed with a fix in v3.0.0

pallas on 30 Nov 2020

I added a link to the release at the now stale WIP PR at https://github.com/envoyproxy/envoy/pull/9033#discussion_r532955677

moderation on 30 Nov 2020

Confirmed all tests pass with the new llhttp update 🎉 reviving the abstractions I initially had for supporting both parsers through a command line flag

derekargueta on 4 Dec 2020

🎉1

Woohoo

moderation on 4 Dec 2020

Was this page helpful?

0 / 5 - 0 ratings