Node: Different query parsing behavior between node and browsers for `URLSearchParams`

Created on 15 Jun 2020  路  5Comments  路  Source: nodejs/node

  • Version: 10.x.x, 12.x.x, 14.x.x (tested specifically in 10.20.1, 12.16.1, and 14.4.0)
  • Platform: macOS, ubuntu, windows, etc
  • Subsystem: N/A

What steps will reproduce the bug?

Attempt to parse query strings with either querystring.parse or URLSearchParams with a % and then a properly percent-encoded value.

// wrapper function to allow testing before Object.fromEntries
function parseQs(str) {
  const parsed = new URLSearchParams(str)
  const params = [...parsed.keys()].reduce((prev, key) => {
    prev[key] = parsed.get(key)
    return prev
  }, {})
  console.log(params);
}
parseQs('id=0&value=%') // expected from spec {id: "0", value: "%"}
parseQs('b=%2sf%2a') // expected from spec {b: "%2sf*"}
parseQs('b=%2%2af%2a') // expected from spec {b: "%2*f*"}
parseQs('a=%%2a') // expected from spec {a: "%*"}

or just

querystring.parse('id=0&value=%') // expected from spec {id: "0", value: "%"}
querystring.parse('b=%2sf%2a') // expected from spec {b: "%2sf*"}
querystring.parse('b=%2%2af%2a') // expected from spec {b: "%2*f*"}
querystring.parse('a=%%2a') // expected from spec {a: "%*"}

How often does it reproduce? Is there a required condition?

Whenever a query string with a % and then a properly percent-encoded value is attempted to be parsed using querystring.parse or URLSearchParams

What is the expected behavior?

The expected behavior from the spec https://url.spec.whatwg.org/#percent-encoded-bytes stating, if byte is 0x25 (%) and the next two bytes after byte in input are not in the ranges 0x30 (0) to 0x39 (9), 0x41 (A) to 0x46 (F), and 0x61 (a) to 0x66 (f), all inclusive, append byte to output.

When testing in latest Safari, Firefox, and Chrome it appears to follow that above spec and they return the expected values in the above snippets from the reproducing section.

What do you see instead?

> parseQs('id=0&value=%') // should be {id: "0", value: "%"}
{ id: '0', value: '%' }
undefined
> parseQs('b=%2sf%2a') // should be {b: "%2sf*"}
{ b: '%2sf*' }
undefined
> parseQs('b=%2%2af%2a') // should be {b: "%2*f*"}
{ b: '%2%2af*' }
> parseQs('a=%%2a') // should be {a: "%*"}
{ a: '%%2a' }

Additional information

Investigation started while trying to match querystring parsing between the browser and node in this Next.js PR https://github.com/vercel/next.js/pull/12154

It may be the opposite and the browser is not actually following the spec and node is and I'm mis-understanding and if this is the case I can open this issue elsewhere.

whatwg-url

Most helpful comment

@annevk I opened a PR against WPT adding the above tests, let me know if any changes are needed or there are any additional cases that should be covered

All 5 comments

@annevk ... would like your input on this issue if you're able. Just need to confirm what the correct behavior should be here.

I think @ijjk is correct. It would be nice to add those tests to WPT url/urlencoded-parser.any.js, url/urlsearchparams-constructor.any.js, and url/urlsearchparams-stringifier.any.js.

@annevk I opened a PR against WPT adding the above tests, let me know if any changes are needed or there are any additional cases that should be covered

JFYI: I'm thinking to fix this issue by the below two steps:

  1. Fixing unescape logic of querystring for the test cases: https://github.com/nodejs/node/pull/35013
  2. Implement serializer and update WPT
Was this page helpful?
0 / 5 - 0 ratings