Node: Support for base64url format

Created on 8 Mar 2019  ·  7 Comments  ·  Source: nodejs/node

new Buffer(str, "base64") officially accepts both the RFC 3548 (+, /) encoding and the RFC 4648 (-, _) encoding (see #5239 and #5243).

To base64 encode a buffer using RFC 3548 we can actually do buffer.toString("base64").

Would it be possible to also natively support RFC 4648 format using buffer.toString("base64url") ?

If yes, should we strip the trailing = characters, since they are unnecessary and potentially harmful when used in a URL?
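To illustrate the URL concern, here is a minimal sketch of what happens when standard base64 output is placed in a URL component. The sample bytes are arbitrary, chosen only so that the output contains +, / and =.

```javascript
// Arbitrary sample bytes whose standard base64 form contains '+', '/' and '='.
const buf = Buffer.from([0xfb, 0xff]);

const standard = buf.toString('base64');
console.log(standard); // "+/8="

// All three special characters have to be percent-encoded in a URL component.
const escaped = encodeURIComponent(standard);
console.log(escaped); // "%2B%2F8%3D"
```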

buffer feature request

All 7 comments

/cc @nodejs/buffer

I think this has come up a few times now – basically, I think so far the answer was that this is easy to implement in userland, and Node.js is generally careful about adding support for new encodings?

I understand, but on the other hand, Node.js already supports base64url when decoding a Buffer (implicitly, via the base64 encoding).

https://nodejs.org/api/buffer.html#buffer_buffers_and_character_encodings

base64 - Base64 encoding. When creating a Buffer from a string, this encoding will also correctly accept "URL and Filename Safe Alphabet" as specified in RFC4648, Section 5.

It would be better to have the explicit base64url format for this.

Also, many libraries rely on the built-in Node.js encodings, and often we cannot use base64url because it is not officially supported by Node.js. Here is an example with Webpack:

https://webpack.js.org/configuration/output/#outputhashdigest

All encodings from Node.JS' hash.digest are supported. Using base64 for filenames might be problematic since it has the character / in its alphabet.
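As an illustration of the filename problem, a userland workaround could look like the sketch below. The helper name urlSafeDigest is hypothetical, and SHA-256 is an assumption made for the example; this is not what Webpack itself does.

```javascript
const { createHash } = require('crypto');

// Hypothetical helper: hash the input, then make the base64 digest
// safe for filenames and URLs by swapping '+' and '/' and dropping '='.
function urlSafeDigest(data) {
  const digest = createHash('sha256').update(data).digest('base64');
  return digest
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, '');
}

console.log(urlSafeDigest('hello'));
```

The result contains only letters, digits, - and _, and it can still be decoded with Buffer.from(result, 'base64').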

This feature is easy to implement in userland, but I think we should consider adding it natively to Node.js because:

  • It's a basic feature on which many libraries rely (if Node.js doesn't support it, then the encoding doesn't exist).
  • It is pragmatic and especially useful for the Web.

My English is not very good, but I hope you get the point of my argument.

I have no strong feelings for or against the feature, but the Buffer API currently has asymmetric support for base64url (https://tools.ietf.org/html/rfc4648#section-5):

  • Node accepts it on input as a base64 variant
  • Node will not generate it
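The asymmetry can be seen in a short sketch. The sample bytes are arbitrary, picked so that every output character differs between the two alphabets.

```javascript
const bytes = Buffer.from([0xfb, 0xff, 0xbf]);

// Decoding: the 'base64' encoding accepts both alphabets.
const fromStandard = Buffer.from('+/+/', 'base64');
const fromUrlSafe = Buffer.from('-_-_', 'base64');
console.log(fromStandard.equals(bytes), fromUrlSafe.equals(bytes)); // true true

// Encoding: 'base64' only ever produces the standard alphabet.
const encoded = bytes.toString('base64');
console.log(encoded); // "+/+/"
```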

It would be easier to call it a "do it in userland" feature if Node didn't support it at all, but since we half-support it, it's a little less clear to me what the right thing to do is.

Base64url is highly pragmatic for web (and filesystem) applications so I do think there is a good argument for including it in node.

It is also not easy to implement _efficiently_ in userland (the usual approach is string search and replace with regexes); it would be much faster if the safe characters were simply part of the encoding alphabet.
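To make "part of the encoding alphabet" concrete, here is a minimal sketch of an encoder that emits the URL-safe characters directly in a single pass, with no search-and-replace step. The function name encodeBase64Url and its padding parameter are hypothetical. It is plain JavaScript and only illustrates the idea; it makes no claim to be faster than the built-in encoder, which is where the speed-up discussed above would come from.

```javascript
// URL-safe alphabet from RFC 4648, section 5: '-' and '_' replace '+' and '/'.
const ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_';

// Hypothetical helper: encode a Buffer straight into the URL-safe alphabet.
function encodeBase64Url(buf, padding = true) {
  let out = '';
  let i = 0;

  // Full 3-byte groups map to 4 output characters.
  for (; i + 2 < buf.length; i += 3) {
    const n = (buf[i] << 16) | (buf[i + 1] << 8) | buf[i + 2];
    out += ALPHABET[n >> 18] + ALPHABET[(n >> 12) & 63] +
           ALPHABET[(n >> 6) & 63] + ALPHABET[n & 63];
  }

  // One or two leftover bytes produce 2 or 3 characters plus optional padding.
  const rest = buf.length - i;
  if (rest === 1) {
    const n = buf[i] << 16;
    out += ALPHABET[n >> 18] + ALPHABET[(n >> 12) & 63] + (padding ? '==' : '');
  } else if (rest === 2) {
    const n = (buf[i] << 16) | (buf[i + 1] << 8);
    out += ALPHABET[n >> 18] + ALPHABET[(n >> 12) & 63] +
           ALPHABET[(n >> 6) & 63] + (padding ? '=' : '');
  }
  return out;
}

console.log(encodeBase64Url(Buffer.from([0xfb, 0xff, 0xbf]))); // "-_-_"
console.log(encodeBase64Url(Buffer.from([0xfb, 0xff]), false)); // "-_8"
```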

As for the trailing padding characters, technically I believe the padding would be required by the standard but in practice I suspect it would often not be useful for applications involving base64url. So there's a correctness vs pragmatism debate there.

It could perhaps be offered as a variant of base64url:

buf.toString('base64url'); // defaults to padding=true
buf.toString('base64url?padding=false');

One alternative would be if there were a more direct way to configure both the encoding alphabet and/or padding behavior of the base64 encoder, such as an object that could be passed into Buffer or toString or similar call:

buf.toString(new Base64("urlsafe", "nopad"))

or

buf.toString({
    encoding: 'base64url',
    padding:  false
});

I'm opposed to omitting the padding or changing the signature of any stable Buffer methods. The relevant RFCs make clear that the padding MUST NOT be omitted in the general case, even if the data length might be known to the caller; the Buffer.from(string, encoding) method doesn't require a "meaningful data" length and the buf.toString(encoding) signature doesn't return a data length, and they would break existing code if they did.

If you want to omit or sanitize the padding characters and know the data length, use the extended signature with start and end, or simply strip the trailing characters afterward, or use encodeURIComponent on the result.
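The "strip the trailing characters afterward" option could be sketched as follows. The helper names stripPadding and restorePadding are hypothetical. Restoring the padding needs only the string length, because a padded base64 string always has a length that is a multiple of 4.

```javascript
// Hypothetical helper: drop trailing '=' padding from a base64 string.
function stripPadding(encoded) {
  return encoded.replace(/=+$/, '');
}

// Hypothetical helper: add back the '=' characters removed above.
// Assumes valid input; a length of 4k + 1 cannot occur for valid base64.
function restorePadding(unpadded) {
  const missing = (4 - (unpadded.length % 4)) % 4;
  return unpadded + '='.repeat(missing);
}

const padded = Buffer.from('hi').toString('base64'); // "aGk="
const unpadded = stripPadding(padded);               // "aGk"

console.log(restorePadding(unpadded) === padded);          // true
console.log(Buffer.from(unpadded, 'base64').toString());   // "hi"
```

The last line relies on the existing decoder tolerating missing padding, so restoring it is only needed for stricter consumers.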

That said, Buffer.from() already violates RFC recommendations by silently ignoring invalid characters and accepting the base64url alphabet when 'base64' is specified. Keeping those behaviors but allowing a stricter 'base64url' encoding for both functions without making padding optional might be "easy to implement in userland" but is even easier to implement in this built-in module. It's something I'd like to personally explore in the very near future unless anybody has objections.

@hezedu It looks like it is trying to avoid the toLowerCase call; the pattern is repeated throughout the file.
