Node: btoa() and atob()

Created on 21 Oct 2015 · 52 comments · Source: nodejs/node

All major browsers expose the btoa and atob globals for encoding ASCII strings as base64 and vice versa. It'd be beneficial for "isomorphic" JavaScript if we provided these too:

As for the implementation, we can't just use Buffer('string').toString('base64'), because that would encode Unicode, while atob and btoa must, per spec, throw on char codes > 255. https://github.com/mathiasbynens/base64 looks like a solid implementation we could pretty much drop in.
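To illustrate the divergence described above, here is a small sketch (using the modern Buffer.from API) showing how a naive Buffer-based encoding differs from the spec's char-code-per-byte behavior:

```javascript
// Why Buffer.from(str).toString('base64') alone isn't spec-compliant btoa:
// it UTF-8-encodes the string, while btoa must treat each char code as a
// single byte and throw on char codes above 255.

const eAcute = '\u00e9'; // 'é', char code 233 — valid btoa input

console.log(Buffer.from(eAcute).toString('base64'));           // 'w6k=' (UTF-8 bytes 0xC3 0xA9)
console.log(Buffer.from(eAcute, 'latin1').toString('base64')); // '6Q==' (single byte 0xE9, what btoa emits)

// '\u2603' (snowman, char code > 255): spec-compliant btoa must throw here,
// but a naive Buffer-based version silently UTF-8-encodes it instead.
console.log(Buffer.from('\u2603').toString('base64'));         // '4piD'
```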

feature request

All 52 comments

I'd be up for writing a bunch of tests if people want to see this happen

While I understand the desire on this, it could just as easily be implemented as a userland module that registers the globals. I'm -0 on adding it in to core.

I think I'm -1 on this for the same reason.

function btoa(str) {
  if (Buffer.byteLength(str) !== str.length)
    throw new Error('bad string!');
  return Buffer(str, 'binary').toString('base64');
}

Think that will about cover it.

Update: actually, that's a bad length check and I'm too tired to work out the correct one. We'd probably have to expose v8::String::ContainsOnlyOneByte() for the fastest option.
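The byteLength check above rejects valid input: code points 128–255 take two bytes in UTF-8, so byteLength !== length even though btoa must accept them. A corrected check tests each char code individually; a minimal sketch using the modern Buffer.from API (btoaSketch is a hypothetical name, and this is the simple O(n) check, not the ContainsOnlyOneByte() fast path):

```javascript
// Sketch of a spec-shaped btoa: reject char codes above 255, then
// treat each remaining char code as one byte before base64-encoding.
function btoaSketch(str) {
  for (let i = 0; i < str.length; i++) {
    if (str.charCodeAt(i) > 255) {
      // Browsers throw an InvalidCharacterError DOMException here.
      throw new Error('InvalidCharacterError');
    }
  }
  return Buffer.from(str, 'latin1').toString('base64');
}

console.log(btoaSketch('hello'));  // 'aGVsbG8='
console.log(btoaSketch('\u00e9')); // '6Q==' — accepted, unlike with the byteLength check
```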

I'd rather not add more globals, ever if possible.

The fact that you are linking to a userland package that already gets over 1,000 downloads a week (via npm) is solid evidence that the need for this, in core at least, is very minimal.

By the way: Why doesn't v8 provide functions like these?

They're browser extensions; they're not part of the ECMAScript spec: https://html.spec.whatwg.org/multipage/webappapis.html#atob

Feel free to continue discussing, but given the -1's, I'm closing.

The fact that you are linking to a userland package that already gets over 1,000 downloads a week (via npm) is solid evidence that the need for this, in core at least, is very minimal.

In fact, that's because everyone uses the packages atob and btoa, e.g.:
https://www.npmjs.com/package/atob

~500k downloads per month

The fact that you are linking to a userland package that already gets over 1,000 downloads a week (via npm) is solid evidence that the need for this, in core at least, is very minimal.

Regarding the above. Isn't the fact that thousands of users are using a userland package for this a reason to include it rather than the reverse?

Perhaps I'm confused, but if the userland package had very few downloads, wouldn't you find yourself saying: "well, no one really seems to need this"? I know I would.

A bit lost on the reasoning.

The reasoning here is that the situation seems to work quite well for everyone _without_ something being provided by Node core.

Yes, but isn't the existence of a rarely used userland package also a valid reason to say it shouldn't go into core? So whether a userland package is used a lot (per your comment) or a little (per mine), it shouldn't be in Node core. How then does usage have any bearing on the question at all?

@kav You’re right, using download numbers of packages probably isn’t a good metric to determine what should go into Node core and what shouldn’t. A much more relevant one is how _hard_ it is to do something outside of Node core, e.g. whether some kind of native code is required for a certain task (at least that’s a personal opinion of mine).

👍 Solid. Thanks for taking the time. I know it may seem a bit pedantic but it helps everyone who wanders along build a model of what should go in core.

In my own humble opinion I'm 100% with you and would add that I generally don't love seeing stuff added to core just to get isomorphic behavior with browsers. node != a browser.

Note that the above-linked atob package is not compatible with browsers when encoding Unicode strings. If you want strict compatibility with browsers, the one linked in my original post is the way to go.

Overall, I'd recommend using the Buffer API in Node.js and something Unicode-aware like https://github.com/dankogai/js-base64 in the browser.
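For the Node.js side, the Buffer approach is a one-liner in each direction and has no 255-char-code restriction; a minimal round trip:

```javascript
// Unicode-safe base64 round trip with the Buffer API: encode the string
// as UTF-8 bytes, base64 them, then reverse both steps.
const original = 'snowman: \u2603';
const encoded = Buffer.from(original, 'utf8').toString('base64');
const decoded = Buffer.from(encoded, 'base64').toString('utf8');

console.log(decoded === original); // true
```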

@trevnorris Thank you sir.. It helped.

@trevnorris any update on exposing v8::String::ContainsOnlyOneByte()?

@akonwi I guess the easiest way to make that happen is to open a new issue and specifically ask for that (because this isn’t really about base64 anymore), or, if possible, work on a PR for that feature itself

Given recent decisions such as making URL global, is this something that you would like to reconsider @jasnell @MylesBorins ?

These APIs are named _very_ poorly. -1

If the goal is to align with web APIs, I don't think the quality of the name should matter. When this issue was created there wasn't a desire to do so, but this seems to have changed with some recent decisions. Good name or not, these are specced and useful within Node. https://html.spec.whatwg.org/multipage/webappapis.html#dom-btoa

also -1

I don't think we need to be entirely at the mercy of the web's ancient and terrible choices.

@Fishrock123 is there a better name the API could be given? Is the API itself not useful? Is there something we could do to engage with the WHATWG to improve the status quo?

Is this something that would be in a js standard lib if it were to exist?

Is this something that would be in a js standard lib if it were to exist?

I'd think it'd be awesome if there were a Unicode-aware base64 de/encode in the standard library.

Historically base64 has been limited to ASCII and there's even https://tools.ietf.org/html/rfc4648 which was written 3 years after the inception of UTF-8, still only speccing the encoding of ASCII values while it would be rather simple to encode multi-byte characters as individual bytes.

I hope it’s okay to provide my answers to some of the questions:

Is the API itself not useful?

Imo it’s problematic that btoa is two operations masquerading as one (encoding as Latin-1 + encoding as Base64): that’s always confusing for developers, because they think of it as a single operation when it is not, and the implicitness of using Latin-1 as the assumed character encoding is harmful in itself (especially in a time when UTF-8 is the de-facto default).

Node made the decision to move away from this concept of “binary” strings and use Buffer objects instead a long, long time ago, and the introduction of Uint8Array also reflects that thinking as well: Sequences of characters and sequences of bytes are conceptually different things, and conflating them is always going to be confusing.

So: these functions can be used if one knows how to do that, but the API is one that is bound to make people mess up. Providing the building blocks for it, like Node does, is the more sensible choice.
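The "two operations masquerading as one" point can be made concrete by spelling the steps out; a sketch assuming Node's Buffer API:

```javascript
// btoa conflates two steps; separating them makes the implicit
// Latin-1 assumption visible.

// Step 1: characters -> bytes (btoa implicitly picks Latin-1).
const bytes = Buffer.from('caf\u00e9', 'latin1');

// Step 2: bytes -> base64 text.
const b64latin1 = bytes.toString('base64');

// Keeping the steps separate lets you choose the encoding explicitly:
const b64utf8 = Buffer.from('caf\u00e9', 'utf8').toString('base64');

console.log(b64latin1 === b64utf8); // false — same string, different byte step
```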

Is this something that would be in a js standard lib if it were to exist?

I'd think it'd be awesome if there were a Unicode-aware base64 de/encode in the standard library.

@silverwind I’m a bit confused by your comment … What would that look like? Base64 itself isn’t really concerned with character encodings, is it?

What I see missing in the Web’s standard library is a way to encode Buffers/Uint8Arrays using Base64. (If such a thing exists: Sorry. At least I couldn’t find it.)

@addaleax I don't have an answer for any of your questions but maybe @ljharb @mathiasbynens @maggiepint or another regular at TC39 could chime in. What I'm picking up from this is that there is a browser-based API that has some sort of general appeal independent of the runtime. That API is not interesting to Node due to its design.

I'd like to see us try and find a way to drive some sort of standard that we could use, rather than simply responding -1 and maintaining the status quo... assuming there is a universal benefit across embedders for an api like this

@silverwind I’m a bit confused by your comment … What would that look like? Base64 itself isn’t really concerned with character encodings, is it?

Sorry for being unclear. Yes, it should not concern itself with encodings and should take any input value that can be converted to a stream of bytes (Uint8Array).
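In today's Node, that byte-oriented shape already works, since Buffer is a Uint8Array subclass; a sketch of base64 over a plain Uint8Array:

```javascript
// Byte-oriented base64: operate on Uint8Array rather than strings.
const input = new Uint8Array([72, 105, 33]); // the bytes of "Hi!"

// Encode: view the same bytes as a Buffer (no copy) and stringify.
const b64 = Buffer.from(input.buffer, input.byteOffset, input.byteLength)
  .toString('base64');

// Decode: back to a plain Uint8Array.
const output = new Uint8Array(Buffer.from(b64, 'base64'));

console.log(b64);         // 'SGkh'
console.log([...output]); // [ 72, 105, 33 ]
```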

That API is not interesting to Node due to its design.

@MylesBorins I’d go so far as to say that it shouldn’t be appealing to browser-based code either, it’s just that that API is there, so people use it.

I'd like to see us try and find a way to drive some sort of standard that we could use, rather than simply responding -1 and maintaining the status quo...

Yes! 👍 I just think that in that case, it’s the Web’s move to provide a proper Base64 API, and not something the language or Node would necessarily have to solve.

I’m against it as well. Unlike newer APIs such as the URL class, atob() and btoa() are by no means well-designed APIs.

I agree that atob and btoa are not the most useful APIs — they were standardized in HTML because they were shipping in browsers and became a de facto standard, not the other way around.

I made the module in OP for exactly the use case @silverwind describes (i.e. isomorphic JS) but if we’re going to add something new, we should take the opportunity to design something better.

Coming from browser development, I expected btoa and atob to be part of NodeJS as well. My assumption was wrong, which led me to this issue.

I can only underline @MylesBorins' motion to find a proper standardization process for this kind of legacy API.
Regarding @Fishrock123's comment:

I don't think we need to be entirely at the mercy of the web's ancient and terrible choices.

The effort put into standardizing across Node and browser JS has always strived to be as backwards compatible as possible.

Are there potentially any other APIs we would need to migrate if we followed along with being compatible with browser implementations? How have those been handled so far?
Seeing that atob/btoa potentially throw DOMException makes me think it is not 100% possible to have the exact same implementation, since Node has no Web API implementation.

I understand this is a discussion about adding this btoa functionality to Node, but in the interest of the universal JavaScript applications mentioned above: I have been researching this for a universal/isomorphic React app I have been building, and the package abab worked for me. In fact it was the only solution I could find that worked, rather than the Buffer method also mentioned (I had TypeScript issues with that).

(This package is used by jsdom, which in turn is used by the window package.)

Getting back to my point: based on this, perhaps since this functionality is 'already written' (and it does not use the Buffer method, but instead has its own algorithm based on the WHATWG spec), and people already use many npm packages for development, people could install and use the abab package rather than adding this to Node core, at least for universal apps.

(And perhaps this could be documented somewhere?)

I am 100% for making sure Node and browsers have twin features, but only to a point. atob and btoa are terribly named and only support ASCII-range characters, so I'd rather work with the WHATWG encodings team to come up with something instead of Node having to inherit these terrible methods.

Did someone propose adding base64 to the encoding standard yet?

@scoobster17 Yes, abab is the canonical reference implementation for atob() and btoa() in Node.js. We don't generally document third-party modules however (except in very select circumstances), and I don't really see a need to document this specifically.

@silverwind I'm not sure, but I would be against that idea. The Encoding Standard is about character encodings and has implications beyond just the TextDecoder/TextEncoder interfaces. For example, the same algorithms used in TextDecoder are used to interpret <meta charset> tags in HTML. As such, I don't believe Base64 to be in the scope of the standard.

Also it's important to note that atob() and btoa() are not proper Base64; rather they implement the forgiving Base64 algorithms.
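To make the "forgiving" part concrete, here is a rough sketch of that decode shape (not the full spec algorithm; the function name is illustrative): strip ASCII whitespace, drop trailing padding, validate the alphabet, then decode — returning null on failure, as abab does:

```javascript
// Rough sketch of a forgiving-base64 decode: tolerant of ASCII
// whitespace and missing padding, strict about the alphabet.
function forgivingBase64Decode(data) {
  data = data.replace(/[\t\n\f\r ]/g, '');                       // strip whitespace
  if (data.length % 4 === 0) data = data.replace(/={1,2}$/, ''); // drop padding
  if (data.length % 4 === 1) return null;                        // impossible length
  if (!/^[A-Za-z0-9+/]*$/.test(data)) return null;               // alphabet check
  return new Uint8Array(Buffer.from(data, 'base64'));
}

console.log(forgivingBase64Decode(' aGVs\nbG8= ')); // bytes of 'hello'
console.log(forgivingBase64Decode('a'));            // null
```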

@scoobster17

the package abab worked for me. In fact it was the only solution I could find that worked

Did you try the base-64 package? I would be surprised if abab worked but base-64 didn’t, given that both have the same goal of being a fully spec-compliant atob/btoa implementation.

Note: The package name is base-64. base64 also exists but doesn’t seem to match atob/btoa.

@mathiasbynens no, I don't believe I tried this package, but amongst the others I did, abab was the only one that did work. This base-64 package also looks promising, although I'll stick with abab for now. If it works, don't touch it :)

@TimothyGu fair enough, I just thought it may be easier for developers to be directed to other means of achieving this goal from the Node docs, e.g. a reference or small note, but if you want to keep them free of clutter, I understand that. I suppose they may come across this thread, in which case they have at least two potential options with abab and @mathiasbynens' base-64.

@scoobster17 There is a difference between the behavior of the two packages that you should take note of: while base64 throws InvalidCharacterError objects, abab returns null.

@TimothyGu So abab doesn’t match the spec?

Note to avoid confusion: the package name is base-64 instead of base64.

@mathiasbynens The behavior of abab matches the spec. The only difference is that abab doesn’t throw "InvalidCharacterError" DOMExceptions, which don’t quite exist on Node.js, but base-64 tries to emulate it.

How is not throwing an exception when the spec tells you to “matching the spec”?

abab never advertises to match the spec in that regard:

If passed a string with characters above U+00FF, btoa will return null. If atob is passed a string that is not base64-valid, it will also return null. In both cases when null is returned, the spec calls for throwing a DOMException of type InvalidCharacterError.

(from the README)

As I said, Node.js doesn’t have DOMException, so unless abab brings in a module like domexception (or uses an ad hoc noncompliant polyfill like base-64 does), it is simply impossible.

Thanks for the info. I'll investigate changing the package.

@mathiasbynens @TimothyGu started having weird encoding issues today (not sure why they started happening now) with the abab package. Spent a long time trying to debug, but switched to the base-64 package and it worked straight away. Definitely seemed to be down to the base64 algorithm.

For the purposes of universal apps, I think this base-64 package is fine, rather than having it in Node.js (to come back to the subject of this thread). It seems to be quite simple to achieve in Node, and it's pointless creating a function that simply calls another function.

@scoobster17 Would definitely appreciate a bug report at abab if you get around to filing it :)

It’s actually quite difficult to write code that runs in both the browser and Node.js that does base64 without eating the Buffer polyfill. This would go a long way to fixing that.

I just want to say thanks for not adding this to the native Node API. For anything that goes into the Node API, we have to deal with version hell and polyfills.

I had to shim atob and it was quite simple.

  global.window.atob = (b64Str: string) => Buffer.from(b64Str, `base64`).toString(`binary`);
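The matching btoa shim needs one extra step the spec requires: rejecting char codes above 255 before encoding. A sketch (not part of the comment above; names and the plain-Error stand-in for DOMException are assumptions):

```javascript
// Companion btoa shim: reject char codes above 255, then treat each
// remaining char code as one byte before base64-encoding.
global.btoa = (binStr) => {
  if (/[^\u0000-\u00ff]/.test(binStr)) {
    throw new Error('InvalidCharacterError'); // browsers throw a DOMException
  }
  return Buffer.from(binStr, 'binary').toString('base64');
};

console.log(global.btoa('hello')); // 'aGVsbG8='
```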

Reopening, in particular to discuss dealing with data: URLs in a cross-compatible way, e.g. code of the form:

new URL(`data:text/javascript;base64,${ strOfBase64 }`)

as @mikeal points out, writing this in a cross-compatible way is rather disruptive, since you start having to sniff for which base64 utils are available.
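On the Node side, the payload of such a data: URL can be produced and decoded with only Buffer and the WHATWG URL class; a sketch (the module source is illustrative):

```javascript
// Build a base64 data: URL, then extract and decode its payload.
const strOfBase64 = Buffer.from('export default 42', 'utf8').toString('base64');
const url = new URL(`data:text/javascript;base64,${strOfBase64}`);

// For data: URLs, url.pathname is 'text/javascript;base64,<payload>'.
const payload = url.pathname.split(',')[1];
const source = Buffer.from(payload, 'base64').toString('utf8');

console.log(source); // 'export default 42'
```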


@nojvek can you clarify the version and polyfill issue? Do you mean loading a polyfill if the function is missing? Because that seems to be exactly what your code above is doing. I'm curious why manually writing the polyfill is preferred to Node providing it. I'd expect you have polyfills for all sorts of APIs for older versions of Node, so what makes this API different?

Looking at the WHATWG spec, btoa/atob only handle strings of 8-bit char codes, so emoji etc. will throw errors; those APIs are likely not sufficient. But I do not see a Web API that does universal base64 transformation, except FileReader.prototype.readAsDataURL.

For base64 there is https://en.wikipedia.org/wiki/Base64#RFC_4648
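One common pattern that sidesteps the 8-bit limitation is to convert the string to UTF-8 bytes first and base64 those; a sketch using TextEncoder/TextDecoder (available in both browsers and modern Node) with Buffer for the environment-specific base64 step (function names are illustrative):

```javascript
// Unicode-safe encode/decode: string -> UTF-8 bytes -> base64, and back.
function encodeUnicode(str) {
  const bytes = new TextEncoder().encode(str); // Uint8Array of UTF-8 bytes
  return Buffer.from(bytes).toString('base64');
}

function decodeUnicode(b64) {
  return new TextDecoder().decode(Buffer.from(b64, 'base64'));
}

console.log(decodeUnicode(encodeUnicode('\u{1F389}'))); // emoji round-trips
```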

I have helped to maintain a library that has quite a bit of usage in this space

https://github.com/brianloveswords/base64url with 877k weekly downloads.

Could be worth looking at that for API inspiration
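For what it's worth, newer Node versions cover that package's URL-safe variant natively: Buffer understands a 'base64url' encoding alongside 'base64'. A minimal sketch:

```javascript
// URL-safe base64 via Buffer's built-in 'base64url' encoding:
// '+' becomes '-', '/' becomes '_', and padding is dropped.
const bytes = Buffer.from([251, 255, 190]);

console.log(bytes.toString('base64'));    // '+/++'
console.log(bytes.toString('base64url')); // '-_--'
```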
