While investigating the Buffer usage, I noticed one thing — in many (really, many) cases, Buffers are constructed just for the sake of base64-encoding a string.
base64-encodinng is used in many places, like auth headers, data uris, messages, etc.
With the new API, that would look like Buffer.from(from).toString(encoding).
With the old API, that would look like new Buffer(from).toString(encoding) (note that this lacks checks).
So I propose to add a small utility function, e.g.
Buffer.convert = function(value, targetEncoding, sourceEncoding) {
if (!Buffer.isEncoding(targetEncoding)) {
throw new TypeError('"targetEncoding" must be a valid string encoding');
}
return Buffer.from(value, sourceEncoding).toString(targetEncoding);
}
Note that the impl does not default to utf-8 and forces an encoding to be specified, this way it would be more clear what the userspace code does.
Sure, that could be done on userside, but this doesn't deserve a separate package, and re-implementing that in all the packages that do conversion also doesn't look very good to me.
Note that if we have had such method earlier, the usage of Buffer without new (and other Buffer-related issues) would have been measurably lower.
If everyone thinks that this is a good idea, I am willing to file a PR for it (impl/docs/tests).
Or perhaps Buffer.convert(value[, sourceEncoding], targetEncoding) would be better — I am not sure of that yet.
I’d be okay with it, although I would probably prefer it as buffer.convert directly on the buffer module, like we have buffer.transcode (to which convert would be the dual).
And you should probably specify both encodings explicitly. For this conversion to make sense, one of these has to be a binary-to-text encoding like base64, and I don’t think there’s any way of knowing whether it’s supposed to be encoding or decoding?
/cc @nodejs/buffer
@addaleax Ah, I entirely missed that one (from #9038).
Then we could use the same header, like buffer.convert(value, fromEnc, toEnc) and perpahs also just buffer.convert(value, toEnc) (defaulting fromEnc to 'utf-8').
@ChALkeR I am not sure you can compare the situation wrt default encodings here. For transcode, it makes sense to default to utf-8 for the arguments, because transcode only makes sense to be called for character encodings, not binary-to-text ones.
If you prefer it another way, I guess that’s fine, but I really think explicit encodings would be better here.
Not sure this is a good idea.
buffer.convert(fromStr, target, source) isn't much simpler than Buffer.from(fromStr, source).toString(target).buffer.transcode.@seishun Thanks.
buffer.convert(value[, fromEnc], toEnc), I did not update the impl, though. Does that look better?buffer.convert to strings only and remove Buffer support on input. This way, it couldn't be confused with .transcode, which accepts only Buffer objects.I have already changed the order in the title to be buffer.convert(value[, fromEnc], toEnc), I did not update the impl, though. Does that look better?
I didn't say that the order is better one or the other way. My concern is that people will likely get confused either way.
This way, it couldn't be confused with .transcode, which accepts only Buffer objects.
But they still both live in the buffer module. And it's not obvious from the names which does which.
@ChALkeR Leave this open? (I imagine so, but it's been inactive for long enough that I figure it's worth checking.)
This is essentially Buffer.transcode()... and can also be accomplished using the Encoding API implementation I've done.
This is essentially Buffer.transcode()
Not quite. Buffer.transcode converts a buffer to a buffer. This converts a string to a string.
so the proposal itself has a prototype code, not sure why it got stalled for 18 months. Calling for contributors!
I thought @seishun brought up some valid concerns. Also Buffer.from(fromStr, source).toString(target) is already very convenient. I don't really know how I feel about adding this... It's not like there's even that many people asking for this feature.
@apapirovski - fair point, while I also don't have strong opinion either way (not much demanded from users vs a convenient API abstraction) my motivation was to see how this can be moved towards a conclusion. If we have consensus on no action at this point, we can close this out.
/cc @ChALkeR @addaleax @seishun @jasnell
I personally think Buffer.from(fromStr, source).toString(target) is convenient enough.
The only compelling reason I can see is that it _might_ be possible to improve the performance of that operation. But that would be a much more complicated implementation than suggested and it is not clear how significant the difference would actually be.
@BridgeAR I disagree — people do that _a lot_, and some still use new Buffer constructor there.
Simplifying that usecase would be helpful, probably.
@seishun I am also not sure if this should be on the buffer module or somewhere else. Changing the order from to, from to from, to looks fine, I suppose.
@ChALkeR if they did not move away from new Buffer by now I doubt that they would change it to anything else, no matter if it is simpler or not.
There's been no activity on this in over 2 years and it can be accomplished using existing API. Closing.
I think that Node has no way to convert buffer encodings from one buffer to another without creating an intermediary string and doing associated GC?
@jorangreef buffer.transcode() does exactly what you describe.
This issue here seems to be about a string → Buffer → string conversion. As I pointed out above, this is not really useful unless one of the encodings is a binary-to-text encoding, i.e. hex or base64. Given that limited scope and the fact that there has been no activity here, I think closing this is the right call.
Thanks @addaleax
Most helpful comment
@jorangreef
buffer.transcode()does exactly what you describe.This issue here seems to be about a
string → Buffer → stringconversion. As I pointed out above, this is not really useful unless one of the encodings is a binary-to-text encoding, i.e.hexorbase64. Given that limited scope and the fact that there has been no activity here, I think closing this is the right call.