Got: How is the URL build in got? Part 2

Created on 1 Dec 2020  路  14Comments  路  Source: sindresorhus/got

What would you like to discuss?

As I mentioned in #1545,

now I finally found out, that you use in your code

  • source/core/index.ts at line 1580
    // `options.searchParams`
    if ('searchParams' in options) {
        if (options.searchParams && options.searchParams !== defaults?.searchParams) {
            let searchParameters: URLSearchParams;

            if (is.string(options.searchParams) || (options.searchParams instanceof URLSearchParams)) {
                searchParameters = new URLSearchParams(options.searchParams);
            } else {
                validateSearchParameters(options.searchParams);

                searchParameters = new URLSearchParams();

                // eslint-disable-next-line guard-for-in
                for (const key in options.searchParams) {
                    const value = options.searchParams[key];

                    if (value === null) {
                        searchParameters.append(key, '');
                    } else if (value !== undefined) {
                        searchParameters.append(key, value as string);
                    }
                }
            }

            // `normalizeArguments()` is also used to merge options
            defaults?.searchParams?.forEach((value, key) => {
                // Only use default if one isn't already defined
                if (!searchParameters.has(key)) {
                    searchParameters.append(key, value);
                }
            });

            options.searchParams = searchParameters;
        }
    }
  • source/core/index.ts at line 1662
    // Support UNIX sockets
    let {protocol} = options.url;

    if (protocol === 'unix:') {
        protocol = 'http:';

        options.url = new URL(`http://unix${options.url.pathname}${options.url.search}`);
    }

    // Set search params
    if (options.searchParams) {
        // eslint-disable-next-line @typescript-eslint/no-base-to-string
        options.url.search = options.searchParams.toString();
    }

Why do you use these URL.searchParams instead of URL.search?

I asked also in https://github.com/nodejs/node/issues/36337

Hope this helps.

And don't be rude to me. 馃槈 馃槃

Checklist

  • [x] I have read the documentation.

Most helpful comment

@Giotino Node.js expects us to provide decoded URL, so we cannot pass an encoded version, as it will be encoding already encoded version.

That's not true, WHATWG expect path to be un-encoded, but the path from your URL is sent as it is, because it's not necessarily a WHATWG URL.

In fact the request from

got('http://127.0.0.1:8123/test%3Ftest')

ends up as

GET /test%3Ftest HTTP/1.1
user-agent: got (https://github.com/sindresorhus/got)
accept-encoding: gzip, deflate, br
Host: 127.0.0.1:8123
Connection: close

All 14 comments

And don't be rude to me.

We're not trying to be rude, but it's difficult to convey the tone of a speech through writing. Also, we are trying to help as much as we can, but we can't help everyone with every single problem.


We already had a long conversation about how URLs should be presented to the server (#1234 #1236).

TL; DR
A URL is as follow:
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

query is the same as search for WHATWG (search add the requirements of having couple of key-value and specific character encodings).

? is not a part of the query string.

If you want your server to receive /v1/users? as path you should URL-encode the ? as %3F.

BTW, a closed issue does't mean "we don't want talk about it", it's more like "we don't think this is an issue, but feel free to try to change our mind".

Hello @Giotino,

I tested your recent suggestion

If you want your server to receive /v1/users? as path you should URL-encode the ? as %3F.

and it didn't work. I get this as the result

URL {
  href: 'https://127.0.0.1/v1/users%3F',
  origin: 'https://127.0.0.1',
  protocol: 'https:',
  username: '',
  password: '',
  host: '127.0.0.1',
  hostname: '127.0.0.1',
  port: '',
  pathname: '/v1/users%3F',
  search: '',
  searchParams: URLSearchParams {},
  hash: ''
}

and it didn't work. I get this as the result

The WHATWG URL object doesn't represent an encoded version.

I still have a problem and I urgently needs to fix it.

I closed the issue because it's not a problem with Got. You just need to learn more about Node.js. Unfortunately our issue tracker is not the right place to do so.

Here's why the question mark doesn't get included:

  1. Got passes the WHATWG URL to Node.js
  2. The WHATWG URL gets normalized via https://github.com/nodejs/node/blob/2dc6bf0d121dcad48382f075bb3877aa9dc25936/lib/internal/url.js#L1272-L1291
  3. https://github.com/nodejs/node/blob/2dc6bf0d121dcad48382f075bb3877aa9dc25936/lib/_http_client.js#L212
  4. https://github.com/nodejs/node/blob/2dc6bf0d121dcad48382f075bb3877aa9dc25936/lib/_http_client.js#L351

As you can see it uses the url.search property, which is empty when the path ends with a question mark only.

https://example.com/? and https://example.com both point to the same exact endpoint. They both have empty query string.

now I finally found out, that you use in your code

source/core/index.ts at line 1580
source/core/index.ts at line 1662

That's just an option that if specified, overrides the query string provided in the URL.

Why do you use these URL.searchParams instead of URL.search?

.searchParams is a URLSearchParams instance, while .search is a string.

I asked also in nodejs/node#36337

It's an expected behavior. IMO it works correctly.

We can even consider the "parent spec" of WHATWG, the RFC 3986 which defines URIs.

Section 5.3

Parsed URI components can be recomposed to obtain the corresponding URI reference string.
Using pseudocode, this would be:

[...]

if defined(query) then
  append "?" to result;
  append query to result;
endif;

There's no ? if there's no query.

If you want your server to receive the ? you have to encode it (as i wrote before) so it's considered part of the path.

Issues like this better live on StackOverflow ;)

you have to encode it (as i wrote before)

@Giotino Node.js expects us to provide decoded URL, so we cannot pass an encoded version, as it will be encoding already encoded version.

I think got(url, {path: '/?'}) should give the result @tkoelpin expects. If not, the native Node.js HTTP client can be used instead.

@Giotino Node.js expects us to provide decoded URL, so we cannot pass an encoded version, as it will be encoding already encoded version.

That's not true, WHATWG expect path to be un-encoded, but the path from your URL is sent as it is, because it's not necessarily a WHATWG URL.

In fact the request from

got('http://127.0.0.1:8123/test%3Ftest')

ends up as

GET /test%3Ftest HTTP/1.1
user-agent: got (https://github.com/sindresorhus/got)
accept-encoding: gzip, deflate, br
Host: 127.0.0.1:8123
Connection: close

Pardon me, you're right. An encoded URL is still a valid URL therefore the WHATWG URL accepts it.

I got confused by the decodeURI usage in Got, which is used to make sure it throws on invalid URLs like https://github.com/sindresorhus/got/blob/ff918fb6dedb6d8b23421497ec890d43f45121b7/test/arguments.ts#L39-L46

I think got(url, {path: '/?'}) should give the result @tkoelpin expects. If not, the native Node.js HTTP client can be used instead.

That's really the answer which I was looked for. Thank you so much for your effort @szmarczak .

I really appreciate all of your answers. Thank you so much for helping me out, guys 馃槈

Was this page helpful?
0 / 5 - 0 ratings

Related issues

erfanium picture erfanium  路  3Comments

sindresorhus picture sindresorhus  路  3Comments

framerate picture framerate  路  4Comments

joolfe picture joolfe  路  3Comments

lukechu10 picture lukechu10  路  3Comments