In today’s CTC meeting we discussed reverting the DeprecationWarning
for calling Buffer
without new
that was introduced in v7
(PR up here), and it became clear that we need to come up with a long-term plan on what exactly we want to achieve, how to do that and improve our messaging about it, both before and after our actions. I’ll try to sum up what exactly we are talking about; obviously, I am somewhat biased, having been involved in plenty of the previous discussion here. (This has still gotten pretty long btw, so I hope a lot of people will find the information in here useful enough to warrant a Wall of Text.)
The Buffer
constructor has the usability flaw that it accepts input with different type signatures, so new Buffer('abcdef')
and new Buffer(100)
will both return valid buffers, and in the latter case, the Buffer
will contain 100 bytes of unitialized memory. This is a security problem for two reasons:
Buffer
constructor where a string is expected but a number is actually passed, uninitialized memory will be returned:// This is a dangerous example of converting a string to Base64!
new Buffer(someStringReceivedFromTheUser).toString('base64')
Passing the value 100
here will return a slice of memory that may contain garbage, but generally can contain any value previously stored in memory, including credentials, source code, and much more. @ChALkeR has a pretty good write-up of this: https://github.com/ChALkeR/notes/blob/master/Buffer-knows-everything.md
Again, @ChALkeR has a very-good write-up on these security issues at https://github.com/ChALkeR/notes/blob/master/Lets-fix-Buffer-API.md. It predates the current Buffer.alloc()
/Buffer.from()
situation, but it contains a helpful FAQ with answers to questions like “Why not just make Buffer(number)
zero-fill everything by default?”.
So far, in Node v6.0.0 the safer Buffer.alloc()
/Buffer.from()
API was introduced and later backported to the v5.10.0 and v4.5.0 releases. Additionally, v6.0.0 came with a documentation-only deprecation of the old Buffer()
API.
In June, https://github.com/nodejs/node/pull/7152 was opened, which seeks to deprecate the old Buffer()
API using a runtime deprecation, i.e. printing a single warning per Node process when Buffer()
or new Buffer()
is executed for the first time. Currently, that PR is still open. A reduced version of it, https://github.com/nodejs/node/pull/8169, was landed as a semver-major change in v7.0.0, that emits and displays DeprecationWarning
s for uses of Buffer()
only, but excludes uses of new Buffer()
.
I had summarized the goals and possible actions before that decision was made in https://github.com/nodejs/node/pull/7152#issuecomment-241355246 Âą; And @jasnell has written a then-current long-term plan in https://github.com/nodejs/node/pull/7152#issuecomment-240753218 that would include runtime deprecations of new Buffer()
in v8.0.0 and later actual breaking changes to the Buffer
constructor.
The reason for this distinction was trying to keep the possibility of making Buffer
a proper ES6 class at some point in the far, far future open, which would mean that calling new Buffer()
may always work. (Effects of turning Buffer
into a class would be proper subclassability and breaking Buffer()
without new
. It is, however, completely possible to add a separate class to the API that would behave like the current Buffer
implementation does, only with these differences.)
As a result of that deprecation for Buffer()
without new
in v7.0.0, significant pushback from well-known members of the community ensued, both in the threads on https://github.com/nodejs/node/pull/7152 and https://github.com/nodejs/node/pull/8169. On the one hand, it became clear that we failed in our messaging to make clear that the primary motivation for that change was helping our users avoid serious security issues; on the other hand, the added deprecation warning seemed to be incongruent with the expectations of stability and backwards compatibility that module authors and consumers have, as far as Node core is concerned.
As a result of this, the CTC decided to consider reverting the deprecation warning, possibly temporarily, and the corresponding PR is in https://github.com/nodejs/node/pull/9529. The decision on that has yet to be made, but the desire has been expressed to reach a decision soon to limit the number of v7.x versions with possibly incongruent behaviour.
From following the discussions, it is obvious that the path forward is a contentious issue; right now, the opinions range from never introducing a runtime deprecation for any version of the Buffer
constructor, to applying one for all uses of it at the next semver-major release in v8.0.0.
The strongest and most frequently expressed argument for fully runtime-deprecating the Buffer
constructor soon remains that users may not be aware that parts of their application use an unsafe API and should be warned about that.
On the other side, the warning itself is perceived as a very disruptive change to the ecosystem, suggesting that it is definitely worth exploring alternative ways to reduce the usage of both Buffer()
and new Buffer()
.
/cc @nodejs/collaborators
¹ It may or may not be obvious from the way I articulate my thoughts here – I try to stick to stating facts – but in hindsight, I regret writing it this way.
More opinionated stuff from myself:
I do not think we should ever turn Buffer
into an ES6 class; we can create a separate API for that and make Buffer
a wrapper around that, as it has been discussed before, and nothing about Buffer()
ever needs to be truly broken. I know my comment linked above mentions a possible breakage as a motivation for deprecating Buffer()
earlier than new Buffer()
, but I definitely don’t agree with my past self on this anymore. I _do_ however see that there is significant security risk involved with their usage, and that that may warrant a full runtime deprecation at some point in the future.
Correspondingly, I would ask of others (although I am aware that I obviously can’t speak for everyone) to try and avoid giving “Subclassability” and “ES6 class” as reasons for the deprecations here; I think it is obvious that our messaging broke here, as reflected by the comments on https://github.com/nodejs/node/pull/7152 and https://github.com/nodejs/node/pull/8169. IMHO, this is primarily about security, and all other benefits of future changes pale in front of that.
@addaleax Thank you for the detailed write up. Much appreciated.
I'll suggest a way forward that have been mentioned by other people already.
Instead of deprecating anything start zero-filling buffers returned by the Buffer constructor, similar to Buffer.alloc. Back-port this change to old versions of node (I'd like to see this added all the way back to 0.10 but I know that isn't officially supported anymore).
A benefit of this approach is that old code will work just like before - no changes needed.
It also introduces an incentive for module authors to upgrade to the new API as the zero filling has a perceived performance penalty.
@mafintosh ... automatically zero-filling only addresses part of the issue. There are other aspects of the existing Buffer()
use that are problematic -- but that can be patched over in much the same way. Unfortunately, doing so makes it less obvious that users on older versions of Node.js are doing the wrong thing -- and transparently fixing the issue in new versions of Node.js could lead to developers being completely unaware that their older version of Node.js may have an issue. That's not to say that zero-filling by default and limiting the maximize size to avoid DOS is not a good approach, it's just that there are still ecosystem and usability issues that go along with it.
@mafintosh For more context on what @jasnell said, see https://github.com/nodejs/node/issues/4660#issuecomment-171262864 (from before Buffer.alloc
existed). The concern is that this would actually _create_ security issues, because people would stop using new Buffer(num).fill(0)
in modules, and then anyone using an older version of Node without the zerofill behavior would be vulnerable.
This might be more doable now that Buffer.alloc
exists. If we decide to go that route, I think we should at least keep Buffer()
and new Buffer()
soft-deprecated to avoid this issue and to encourage everyone to use Buffer.alloc
instead.
Just to clarify: I'm all for the soft deprecating of the constructor. I was referring to doing the zero-fill instead of hard-deprecating it.
I do not think we should ever turn Buffer into an ES6 class; we can create a separate API for that and make Buffer a wrapper around that
I'm going to disagree with that. The additive approach of "just add a new API" leads to a sprawling and uncohesive design. Newcomers to node already complain there is so much to take in, let's not make it worse.
Accidentally accepting large numeric values can very quickly increase resource usage
v6.x and v7.x solve that by throwing an exception when calling Buffer(1234, 'encoding')
(but not for Buffer('1234', 'encoding')
), we should discuss back-porting that to v4.x.
I'm in favor, it improves security and I can only see it breaking code that is already prone to getting broken by attackers. Better it's us than some black hat, am I right?
v6.x and v7.x solve that by throwing an exception when calling
Buffer(1234, 'encoding')
(but not forBuffer('1234', 'encoding')
)
They don’t when there’s only a single argument, Buffer(1234)
vs Buffer('1234')
.
I'm on record for supporting deprecation of Node.js APIs that don't make sense (https://www.youtube.com/watch?v=jJaIwea8r2A, https://gist.github.com/sam-github/4c5c019b92cf95fb6571), and I even support the deprecation of these Buffer APIs (slowly, properly communicated), but I think the pain of these kind of changes is not well appreciated.
For one, console messages have huge negative impacts downstream, because people who can do nothing about them see them, see https://github.com/nodejs/node/pull/9483
But for another, trivial changes become less and less trivial as they work their way up module dependencies.
For example, say a version of glob is deprecated because there is a security problem. A new major comes out. The only change is in an obscure corner case I don't care about, and most don't. But, its a functional change, it _must_ be a major. That's OK.
To update to the new major for packages that directly depend on glob is trivial, just bump the major in your package, the API didn't actually chnage, so no code changes. So far, so good. So packages that depend on glob do this, takes a couple minutes, republish, no problem.
But authors do this on only the head of their packages. Going back into history and doing a patch release of every major version that has ever used glob? That's a lot of work, so only the latest. And maybe they bump their package (EDIT: major) version because, hey, who knows, maybe some downstream dep depended on that particular glob corner case.
And the changes slowly work their way up, 3 levels up through other packages that have had major updates, used by tap.... and now the latest tap depends on the latest glob, which is A-OK.
But I have packages with unit tests I haven't touched in years... and now I need to bump to the latest tap to get ride of the messages about glob... and while glob's change was tiny and insignificant, tap has changed a _lot_, and all my tests are now failing, and I'm spending hours and hours fixing them, even though the original change in glob required _no js code changes for the update_!
This has been happening to me lately, with the graceful-fs and glob updates. I support those updates, and I'm doing the work, and I don't mind.
But its worth remembering that both those updates were at their root trivial, graceful-fs and glob had new versions that were drop-in replacements for the previous majors. So, upgrading should have been trivial, and it was for direct dependencies, but as the changes ripple up the dependency trees, the changes become less and less trivial, because they start to get bundled with changes that are not so trivial.
This is the kind of thing that will happen with the Buffer changes. Immediate code that uses Buffer will change trivially, but further up the tree, its going to hurt a lot more.
They don’t when there’s only a single argument, Buffer(1234) vs Buffer('1234').
Yes, and it's regrettable, but that can't be changed anytime soon. Back-porting the changes for the two argument case would at least mitigate @ChALkeR's hoek example.
I've been thinking a lot about this, and actually written a bit but haven't shared my thoughts yet because they're incomplete. But I wanted to share one of my ideas: perhaps the fundamental problem had nothing to do with Buffers or APIs at all, but how we deprecate things.
There are two values that we all share but right now they are in conflict: Security and Stability. Core wants to motivate module authors to fix two highly dangerous flaws that might exist in their code: leaking secrets via uninitialized buffers, and DDOS attacks via buffers constructed with unboundchecked ints. Having created the flawed API, it seems reasonable they assumed responsibility for trying to fix the mistake. But we can't always solve the problems we create on our own, and this is a case where core had dug itself in a hole, and would need module authors to help lift them out. Because the Buffer API was a one-way (lossy) function: you could easily convert Buffer.from, Buffer.allocate, and Buffer.allocUnsafe to the unsafe "Buffer()" form, but given a Buffer() call it is impossible to automatically choose a safe behavior based the arguments, because knowing whether it was being used safely or not requires analyzing the source code _in the calling function_ which the Buffer() function doesn't have access to.
When your only tool is a hammer, every problem looks like a nail. So they used the only tools they had: the docs website for node, and a console message in node. That's when we ran into a stability conflict. Because apparently the console output of Node is treated as part of it's API by test runners, which saw the deprecation message and freaked out. Which sort of succeeded at the original aim: get module authors attention so they can fix this flaw. However it flew in the face of another value: Stability. Module author's freaked out, thinking core was randomly changing the Buffer API, and complained loudly. And rightly so, because broken test cases, multiplied by every project that includes that somewhere as a dependency, creates chaos. (See leftpadgate)
The literal cause of the problem is not deprecating Buffer - it's how it was communicated, which turned out to break modules, much to core's surprise, I think. I certainly wouldn't have thought one little console log message would break builds. But apparently it does! So what can we do?
Core values stability just like module authors, I'm certain. Therefore it seems clear that deprecating things by printing to the console log is the wrong strategy, because that will always interfere with program output. I think that deprecating strategy... has to go. Should be removed from core's toolbox.
Now here's my actual novelty/contribution. If the goal of deprecation is to get module authors attention so they fix their code, node core is not the right venue for that. Npm is. I think, core should reach out to npm and see if they could help publicize deprecations. Place a warning banner on every module that uses the old unsafe API. This would have zero effect on how the code runs and therefore would not threaten the stability of the module ecosystem. Yes, developers might be pissed to be publicly called out for their module using insecure APIs. So take advantage of the fact that npm has the email address for all the authors and let them know in advance. Maybe just show a small warning that gets bigger over time. The point is, deprecation is fundamentally a social process done via communication, not a technical one that can be done by modifying the node engine.
Sorry if I rambled on a bit.
I certainly wouldn't have thought one little console log message would break builds. But apparently it does!
So far I've only heard of one module that was actually functionally broken by Buffer-without-new deprecation. All other cases of breakage I've encountered were in tests which look at stderr output.
Place a warning banner on every module that uses the old unsafe API.
Static analysis doesn't always work well in JavaScript.
All other cases of breakage I've encountered were in tests which look at stderr output.
A broken test is a broken test, is a broken test. It is disruptive, especially if you rely on Continuous Deployment and a breaking test halts deploying to production. But that's a hypothetical on my part. I am a little curious though about the particulars. Can someone (@mafintosh @substack etc) speak to what horrors rained down on them as a consequence of test suites breaking?
​Unexpected console output when using a node CLI program _is_ a functional
breakage.
I (along with @mafintosh) were the original reporters of this issue. After reading through this discussion (and dealing with many issues opened by users about this issue), I think the best way forward is to zero-fill buffers created with Buffer()
and new Buffer()
.
Proposal: Zero-fill buffers created with Buffer()
and new Buffer()
in v7 (as well as active LTS versions: v4 and v6)
This solves the most important issue that is at stake here: the possibility of accidental data leakage via an unintentional Buffer(num)
. The DOS issue of Buffer(really_large_num)
remains, but denial-of-service is far, far better than data leakage.
Since Buffer()
is already soft-deprecated in the docs, modules written in the future should not be using it. Existing code that uses Buffer()
will only benefit from the zero-filling, which may fix existing security issues.
The risk that module authors will start assuming that Buffer(num)
will zero-fill, leading to insecure modules on older Node versions is reduced if we ship zero-filling to all active LTS versions.
In conclusion, zero-filling fixes the most important issue – data leakage – without major breaking changes and without hard-deprecating the Buffer API.
This solution lets us support the legacy Buffer()
API for the foreseeable future, ensuring existing code continues to work, without it being a security risk for new code that happens to use Buffer()
.
It seems there is consensus that security issues in old modules are a real problem, but some people think it could be better solved by zero-filling new Buffer(num)
instead. While this alternative proposal has some merits, it also has its shortcomings, which in my opinion outweigh the merits. I'll try to summarize them below.
Solution A: one-time runtime warning (aka hard deprecation)
Buffer
without new
could eventually be removed, which allows Buffer
to become a proper ES6 class that can be subclassed.Solution B: zero-fill buffers created with Buffer(num)
and new Buffer(num)
[new ]Buffer(num)
manually since they won't have the equivalent of --trace-deprecation
to help them. The backport also won't help users who for some reason will continue using old versions of Node.js. (Note: I'm all for forcing people using outdated versions of node to upgrade, but silently giving them security vulnerabilities is a different matter.)new Buffer
is OK in old code, then it should be OK in new code, so what's the point of Buffer.alloc*
?"). Might create an impression of bad long-term planning.There will not be a solution that please everyone. I'm definitely 👎 on zero-filling Buffer
, and at the same time I am 👎 on hard deprecating anything. And I am 👎 on delivering insecure software. I am also aware that one of the above will happen. It's a bad situation.
Just to clarify, the major issue I see with the old API is Buffer(string)
and new Buffer(string)
, but not new Buffer('hello world')
and new Buffer(42)
. For old modules, the only problematic case is new Buffer(string)
, where string is a variable.
Emitting the warning _just_ for Buffer(str)
, but not Buffer(str, 'utf8')
and Buffer(int)
might also be a viable option. This should reduce the surface area for maintainers to just the cases that are problematic.
It might be a silly idea.
Can we start adding rules in the linting/security software to catch this? Maybe there is already, but it might be an easy way to get started.
Emitting the warning _just_ for
Buffer(str)
, but notBuffer(str, 'utf8')
andBuffer(int)
might also be a viable option. This should reduce the surface area for maintainers to just the cases that are problematic.
I also thought about that, but no, that won't work. Buffer(42, 'utf8')
does the bad thing on 4.x LTS, so Buffer(str, 'utf8')
is not universally safe. And adding a throw into v4.x LTS is hard to justify.
new Buffer(42)
could _probably_ be fine, though I am not yet sure, need to think about that a bit more.
Upd: at least one cons — if we keep it, we won't be able to zero-fill new Buffer(42)
for the reasons noted below.
@feross Re: zero-fill, the only viable time to zero-fill is in the same release that introduces a runtime deprecation warning for the API. E.g. when/if Buffer(42)
is runtime-deprecated with a clear message, we will be sure that no people would add that in _new_ code, thinking that it will zero-fill on all releases that they care about.
The problem with zero-fill is that if we introduce it in vN.0.0 but don't introduce a runtime-deprecation for Buffer(42)
in that release, then at least some library authors caring about only vN.0.0 would rely on Buffer(42)
being zero-filled, and when someone runs that code on a previous release, things will go even worse than they are now.
Also don't forget that zero-fill doesn't prevent DoS issues.
Re: zero-fill, the only viable time to zero-fill is in the same release that introduces a runtime deprecation warning for the API. E.g. when/if Buffer(42) is runtime-deprecated with a clear message, we will be sure that no people would add that in new code, thinking that it will zero-fill on all releases that they care about.
@ChALkeR Making things safe and deprecating an API are two separate things. By zero-filling security becomes guaranteed which is a different concern from migrating people onto a new API. I feel while the two might be related, they are by no means tied to each other.
@ChALkeR if I understand you correctly you try and make the point that if people don't migrate to a new API the npm ecosystem will be worse off than it currently is. Given that in the current situation nobody is using the new API this is not possible. Instead what we _can_ do is make all buffer code instantly safe by zero-filling, and move the ecosystem forward in a giant leap.
Then there's the point of moving people onto a new API. I feel NodeJS has a good outreach through release notes, blog posts talks and other channels. More than enough to make people aware of a changed API. Besides that there's also an incentive for developers to pick up the new API in the form of improved performance. If we do this well all developers that care about performance will have the means and reasons to move to the new API, all without needing to make maintainers sad.
I hope this sounds reasonable; I'm in favor of zero-filling by default as it seems like the least problematic way forward. Fwiw I'd also be keen to see the perf implications of zero-filling, as I don't recall seeing any numbers so far and having them would be great to incentivize people to move to a new API.
I hope this sounds reasonable; I'm in favor of zero-filling by default as it seems like the least problematic way forward. Fwiw I'd also be keen to see the perf implications of zero-filling, as I don't recall seeing any numbers so far and having them would be great to incentivize people to move to a new API.
console.time('new Buffer(1024)')
for (var i = 0; i < 10000000; i++) {
new Buffer(1024)
}
console.timeEnd('new Buffer(1024)')
Here it is, roughly 2x:
$ node --zero-fill-buffers test.js
new Buffer(1024): 4329.844ms
$ node test.js
new Buffer(1024): 2562.717ms
(node v6.9.1)
IMHO making an old API slower is worse than emitting a warning, because you are "forcing" people to move, with the additional problem that the culprit is not easy to track. On the good side, tests will not cause warnings to be emitted.
It might be ok as a semver-major change together with a warning, but not in v4, v6 or v7.
I am still 👎 on zero-filling, as I am 👎 on the deprecation warnings. However, if we go for a deprecation warning, zero-filling any new Buffer()
and Buffer()
call is acceptable.
Maybe we can warn something like:
"new Buffer() and Buffer are now zero filled for security reasons, see LINK". In LINK, we show the new APIs, and how to move to them. If we can write something like standard --fix
that does the job automatically, it would be of great help.
The above warning is annoying, but it might cause less worry than a full deprecation. It is not a generic "the api might go away", but it's positive: we have deprecated this api to make you safe, read this link to make this warning disappear.
Keep in mind that zero-filling is currently done in an unsophisticated way. Preallocating memory coupled with offloading zeroing to a separate thread should remove much of the overhead.
@bnoordhuis Keep in mind that zero-filling is currently done in an unsophisticated way.
Good to know! If there's only a negligible performance difference to the zero-filling, that seems the best option to me.
@seishun Most of the cons in your list aren't correct.
If this is not backported, it will create new security vulnerabilities, see @ChALkeR's summary
If we add zero-filling to supported release lines, then the impact will be minimal. Users on v4, v6, v7 (and 0.12 if this is resolved in time) will be covered.
As for users on 0.10 or other unsupported versions -- they're already playing with fire by running an unsupported version in product. Even so, they're unlikely to be affected by new code that uses Buffer()
unsafely IMO. Buffer()
is already deprecated in the docs. New code should not be using it. (Developers who are familiar with the old behavior and haven't read the docs recently will continue to assume that they are responsible for zeroing/overwriting the buffer, so no harm done.)
If this is backported, it will silently introduce non-trivial performance degradation to some users
This is not an issue. See @bnoordhuis's comment: https://github.com/nodejs/node/issues/9531#issuecomment-261773876
Raises the question why we didn't do this in the first place instead of introducing new APIs ("if new Buffer is OK in old code, then it should be OK in new code, so what's the point of Buffer.alloc*?"). Might create an impression of bad long-term planning.
The point of the new APIs (Buffer.alloc()
, etc.) is to split apart two very different behaviors into explicit functions. Buffer.alloc()
allocates num
bytes of memory. Buffer.from()
converts an object into a Buffer
.
@feross
Users on v4, v6, v7 (and 0.12 if this is resolved in time) will be covered.
No, they won't. You are assuming an immediate update, I expect the number of users that will gain problems from this (i.e. that don't update immediately to a latest version in the branch, or don't update fast enough before they install a package that relies on zero-filling) as non-zero and significant.
Moreover, I estimate this to be _more_ disturbing on the ecosystem than forcing maintainers of popular packages to do trivial updates.
Also, every time someone proposes zero-filling, they are ignoring the DoS issue.
Developers who are familiar with the old behavior and haven't read the docs recently will continue to assume that they are responsible for zeroing/overwriting the buffer, so no harm done.
Why do you think these developers won't make the same mistakes that triggered the creation of the new Buffer API in the first place?
This is not an issue. See @bnoordhuis's comment: #9531 (comment)
I wouldn't be so hasty with conclusions. Remember that people run Node.js on all kinds of hardware.
The point of the new APIs (Buffer.alloc(), etc.) is to split apart two very different behaviors into explicit functions.
No, splitting them apart was not the end goal. The end goal was to fix security and DoS vulnerabilities, and we agreed that it should be solved by introducing new API rather than changing the old one. Zero-filling now means backing down on that decision.
@ChALkeR forcing maintainers of popular packages to do trivial updates
The impact is way more serious than either you or @seishun understand.
You can't just send a few PRs, do a few npm publish
s, and be done with it. Lots of packages have dependencies on packages that are not the latest version. So, in order to make the warning go away in a top-level package, you'll need to upgrade past a semver major or two, and potentially rewrite a lot of code. It's not accurate to keep calling this "trivial", IMO.
The day that node.js deprecates Buffer()
will be the day that everyone sees a permanent deprecation error when they start node.
Maybe a deprecation message and forcing everyone to do a ton of work is the right way to go in the end, but you shouldn't minimize the HUGE and very real impact this will have on the community.
No, splitting them apart was not the end goal. The end goal was to fix security and DoS vulnerabilities
This is not correct at all. There was never any security or denial-of-service vulnerability in Node.js because of Buffer()
. The behavior of Buffer()
is well-documented, and the implementation has always worked as documented.
I opened the original issue because Buffer()
was a potential foot-gun. It does very different things depending on the type of the argument that is passed in. The changes were always about splitting apart the safe and "unsafe" methods so users can be very explicit when they want uninitialized memory.
Zero-filling now means backing down on that decision.
We can make the original, flawed Buffer()
API safer by zero-filling, while also recommending the newer, explicit APIs.
@seishun Why do you think these developers won't make the same mistakes that triggered the creation of the new Buffer API in the first place?
They might make that mistake. But in that case, Node users with zero-filled Buffers would be safe, while Node users without it wouldn't be. In such a scenario, we want as many users to have zero-filling as possible.
@feross
The impact is way more serious than either you or @seishun understand.
I do understand the impact. Moreover, I have built a dependency graph of packages from npm which takes versions into an account, especially for tracking update propagations. (An older version is available here, entry points are recent package versions). I have not yet built a model which combines that graph with Buffer()
usage and download counts, but I'm planning to do that soon, before the next deprecation lands, collecting and filing security issues still takes a lot of time.
Based on the current data and previous experience of propogating package updates in dependencies chains, I expect the total usage (per downloads) to decrease about 90% in several months. I hope to obtain a better estimation soon.
The day that node.js deprecates Buffer() will be the day that everyone sees a permanent deprecation error when they start node.
We can make the original, flawed Buffer() API safer by zero-filling, while also recommending the newer, explicit APIs.
Did you read my arguments agaist that above? The amount of setups whose security would be _lowered_ by such a change is going to be not zero. If you disagree with that statement — please explain.
Node users with zero-filled Buffers would be safe
No, they won't, because they will still have the DoS vulnerability in their projects.
No, splitting them apart was not the end goal. The end goal was to fix security and DoS vulnerabilities
This is not correct at all. There was never any security or denial-of-service vulnerability in Node.js because of
Buffer()
.
To the end user, it does not matter _where_ the vulnerability was, there is no point in blaming anyone or saying «that's not my problem» here. The thing that matters here is the ecosystem security (or end user security, whichever you prefer to call it), and there was definitely a problem there. And we are still having that problem.
@ChALkeR version is available here
Your link is dead.
That's not an error, it's a one-time (per launch) warning.
I should have said "warning" instead of "error", but my point still stands. Almost all node users will see warnings spew to the console at start.
That warning could be surpressed using a command-line flag.
Doesn't matter. 95% of users will not suppress the warnings, and maintainers will still get inundated with issues about code that worked fine before.
In a significant percentage of cases, that warning would hint a security problem somewhere in the dependencies. Which is now hidden for the same reasons you mentioned.
If instead of showing a warning we just zero-fill all buffers, then there is no data leakage security issue anymore.
Did you read my arguments agaist that above? The amount of setups whose security would be lowered by such a change is going to be not zero. If you disagree with that statement — please explain.
I do disagree with that statement.
Your argument, as I understand it, is that if we ship zero-filled Buffers, then new code might be written by module authors who assume zero-filling happens on all Node versions, making users on older versions less secure.
For that to happen, ALL the following things need to go wrong, which I find unlikely:
1) Developers would need to write new code using Buffer(num)
, despite the strong recommendation against doing so in the docs.
2) Developers would need to use Buffer(num)
in such a way that the buffer is not fully overwritten by new data before being exposed to a client.
3) Developers would need to install this new code and ship it to production. (This is very unlikely because users on v0.10 or v0.12 can't really upgrade their dependencies these days without some ES6 or ES7 sneaking in and breaking everything.)
4) Developers would need to be running an insecure version of Node.js in production. (This is a really bad idea, and any user/company that cares about security will be on a supported version to keep on top of critical OpenSSL fixes, etc.)
5) An attacker would need to identify a vulnerable service and create an exploit.
Security is a continuum. Different users are willing to tolerate different levels of risk. Users with the most stringent security requirements will run a supported version: 0.12, v4 LTS, v6 LTS, or v7. Users on these versions will get zero-filled buffers and be better off than before.
No, they won't, because they will still have the DoS vulnerability in their projects.
Neither solution fixes the "DoS vulnerability". Printing a deprecation warning doesn't prevent DoS.
Also, it's a stretch to the truth to call this a "DoS vulnerability". This is an API usability issue that made it easy to write code that might be DoS'd one day. Regular expressions are also notorious for the same problem -- are you willing to deprecate those too?
To the end user, it does not matter where the vulnerability was, there is no point in blaming anyone or saying «that's not my problem» here.
It actually does matter where the source of the problem is.
It's extremely easy to write a regex that has extreme DoS potential, for example (a+)+
or (a|aa)+
can be exploited to take a server offline. Is it Node's responsibility to prevent this? Of course not! You can't deprecate regex, and you can't deprecate 1000s of packages on npm because Buffer(large_num)
might slow down some sites.
The thing that matters here is the ecosystem security (or end user security, whichever you prefer to call it), and there was definitely a problem there. And we are still having that problem.
Yes, I can agree with that statement :) We're all trying to make it (1) easier for users to do the right thing, while (2) minimizing ecosystem disruption.
We just disagree on how much ecosystem disruption is acceptable. Node can't make huge backwards incompatible changes anymore. The Web which is WAY bigger than Node doesn't even make breaking changes anymore.
95% of users will not suppress the warnings, and maintainers will still get inundated with issues about code that worked fine before.
The important point is that the users have such an option if they don't want to be annoyed. And I'm not sure what you mean by "inundated". As far as I know, the deprecation of Buffer without new
didn't cause any project to receive more than one issue about the warning. Most modules I've seen also received PRs with fixes.
Neither solution fixes the "DoS vulnerability". Printing a deprecation warning doesn't prevent DoS.
It does, indirectly. By encouraging people to update their code or their dependencies.
This is an API usability issue that made it easy to write code that might be DoS'd one day.
We could have applied the exact same logic to new Buffer(num)
, but there's a reason we didn't.
Regular expressions are also notorious for the same problem -- are you willing to deprecate those too?
Only if their misuse was as widespread as that of Buffer
, and could be deep inside the dependency tree.
After thinking about it more I don't think deprecating the entire constructor is viable due to constructor inheritance.
Which is something that requiring new was supposed to open more of and since it opens more use-cases to our users it is something we are unlikely to back down on I think.
After thinking about it more I don't think deprecating the entire constructor is viable due to constructor inheritance.
Actually, you can add a check to prevent the deprecation warning if it's a child class. See here.
fwiw, @trevnorris' suggestion can be simplified just a bit to... if (new.target && new.target === Buffer) { /** **/ }
@feross
Your link is dead.
Fixed, thanks. I forgot http://
, that was quite obvious =).
For that to happen, ALL the following things need to go wrong, which I find unlikely:
«Unlikely» does not work that way. Assign probabilities to each one, multiply them, multiply them by the ecosystem size. It's the large ecosystem that makes the net effect non-zero here. If there would have been 10 users in the ecosystem, then this would not have been an issue, of course.
Developers would need to write new code using Buffer(num), despite the strong recommendation against doing so in the docs.
They already do. A lot of users ignore deprecation warning in the docs. _p > 0.1_.
Developers would need to use Buffer(num) in such a way that the buffer is not fully overwritten by new data before being exposed to a client.
They already do.
Developers would need to install this new code and ship it to production. (This is very unlikely because users on v0.10 or v0.12 can't really upgrade their dependencies these days without some ES6 or ES7 sneaking in and breaking everything.)
Why are you mentioning 0.10 and 0.12 only? Think v6.9.1, v7.1.0, v4.6.2. That's all the current node users, a significant amount of whom would install/update new packages without updating Node.js. _p > 0.3 (from the affected package users)._
Developers would need to be running an insecure version of Node.js in production. (This is a really bad idea, and any user/company that cares about security will be on a supported version to keep on top of critical OpenSSL fixes, etc.)
Insecure? E.g. v7.1.0? The fact that v7.1.0 is going to become an «insecure version» is only because of this change you propose.
An attacker would need to identify a vulnerable service and create an exploit.
When one gets here and this becomes their last hope, things are pretty bad already. If the service is vulnerable, this is a problem by itself, even if it was not actually attacked. One of the problems is that in many cases one can't be sure if the server was compromised or not. Also, I estimate the rate of hacking something vulnerable (which accepts typed input e.g. through JSON) using just a params fuzzler here is quite probable.
Printing a deprecation warning doesn't prevent DoS.
It does, though highligting potentially dangereous code and making the packages to migrate from that to safer code.
Alright, let's get some facts straight:
I feel we can have a productive discussion on what the best solutions are given these facts. But trying to argue against them again and again is quite unproductive. It makes people with possibly valuable perspectives grow tired and leave the thread, and I'm quite sure that turning a thread into an echo chamber is not the in the best interest of the project.
That said: I'm leaving this thread now. I'm tired.
I'd say that presenting opinions as facts isn't really productive either. I'm not sure if there's any point in responding, but I'll try to be terse anyway.
in no case is throwing a deprecation warning safer than zero-filling buffers
As mentioned, zero-filling doesn't help with DoS, and there is a non-zero chance that some devs will rely on it, creating security issues for users of older versions of node.
throwing a deprecation warning will be disruptive for the ecosystem and maintainers alike
It's indeed a fact that this creates annoyance for users and work for maintainers, but I'd reserve the term "disruption" for situations where things break and stop working, i.e. a whole different level of bad. (examples: realpath fiasco, removal of leftpad, etc)
Aside: "throwing" deprecation warnings is a misnomer. If that was true, your program would have unhandled exceptions. It does not unless you use a flag.
If you skipped over my first comment, go back and read it.
@feross @mafintosh @substack @sam-github I haven't authored any popular modules, so I haven't been "inundated" with Github issues. You'll have written hundreds of packages, and seem to think that the deprecation warning message causes the end of the world. Can you link to some of the flood of issues this deprecation has created, for the benefit of the rest of us (ie @seishun @ChALkeR @Fishrock123)? Because I'm genuinely curious. And comments like @Fishrock123 's seem to suggest they don't think deprecation warnings are a "real" error.
My 2 cents:
DoS simply means in the worst case, your server fails. Maybe you lose money. I'm in this position - apparently five different packages in production introduce a RegExp DoS vulnerability, even though I have the latest versions of all my own dependencies installed. Whatever.
Leaking secrets from the server memory? Worse case, you have a major
lawsuit on your hands from all the users whose passwords and sexual predilections
were just made public.
The first is merely an issue of availability. The second is an issue of privacy.
But everyone is missing the forest for the trees:
To the end user, it does not matter where the vulnerability was, there is no point in blaming anyone or saying «that's not my problem» here.
It actually does matter where the source of the problem is.
^This is the discussion that the Node TSC needs to discuss, elaborate, and resolve. Everybody wants safe code. But frankly, I think that core
is over-reaching in trying to protect users here. The TSC needs to have a contract specifying what to do in cases like this. Frankly, this vulnerability has been around for a long time, so by inaction you've basically chosen "backwards compatibility" over "forward security". But that needs to be discussed, agreed on, and written in stone somewhere. You need to explicitly decide where the buck stops in terms of security. Right now, core is trying to hoist that responsibility on module authors ("we'll break your existing code to motivate you to fix it"), and module authors are
trying to hoist it onto core ("just zero-fill the buffers already"), and neither side has even suggested that security is the user's responsibility. Which is fine until there's a lawsuit, and then I guarantee it's the user who will be held irresponsible for running outdated, insecure code.
@wmhilton spelunking through github issues is time consuming, and a lot of our issues come through private support channels, but here's somewhat related examples:
https://github.com/strongloop/strongloop/issues/296
scroll to the end, you can see the user saying "it worked"! You can also see our yeoman generator output:
$ slc loopback
(node:36574) fs: re-evaluating native module sources is not supported. If you are using the graceful-fs module, please update it to a more recent version.
loopback deprecated loopback.cookieParser is deprecated. Use `require('cookie-parser');` instead. ../../../../usr/local/lib/node_modules/strongloop/node_modules/loopback-workspace/server/server.js:45:17
_-----_
| | .--------------------------.
|--(o)--| | Let's create a LoopBack |
`---------´ | application! |
( _´U`_ ) '--------------------------'
/___A___\
| ~ |
__'.___.'__
´ ` |° ´ Y `
? What's the name of your application?
Ouch.... we fixed that. Note the command functions perfectly, it doesn't need those capabilities, but there isn't anything "soft" about those deprecations, they are VERY in-the-end-user's-face, even though they are only actionable by a developer.
Also, https://github.com/strongloop/strongloop/issues/200#issuecomment-76652267, this again isn't super specific, but you can see the user dumped a bunch of scary messages, assumed it was a problem, and we have to walk through and say "ok, but really, didn't it all actually work?", which is quite time consuming.
I tried to find an example of the glob "security deprecation" that came out during install of various tooling (even for CLIs that did not accept glob patterns), but didn't on github.
^--- and to follow on, we treat all npm scary output as critical bugs (not "soft" messages), because when something does go wrong, customers tend to not give us the full npm-debug.log as npm gently asks them to, but just a screen shot of the first scary output they saw, even though a more experienced node user would recognize the warnings as advisory, and not related to the eventual install failure. The back and forth with them to get the relevant info is painful for everyone, and not worth the support cost.
@wmhilton The tone of most replies in this issue is too stubborn / confrontational / aggressive for me to participate. I share @feross opinion from earlier and stand by my original reply. I think around 1/4 modules I have authored is using the Buffer constructor is some way, so any deprecation involved here impacts me a lot which is why I care about this issue. Feel free to ping me on IRC / Twitter if you have questions :)
Sorry @mafintosh, I'm going to keep engaging! :) So would I be right in saying the consensus of module developers is that it is NOT node
's job to ever print warnings to the console? Stdout and stderr are sacred and should not be used by node
core? (I tend to agree with this.)
@sam-github I think you may be conflating node
runtime api deprecation warnings with npm
installation deprecated package warnings. Although from a user's experience they may be perceptually similar, the team of developers responsible for node and for npm are completely independent AFAIK.
It's after 5 am, I've been up all night and currently watching some anime while working through issues. So forgive me if I missed something in this large thread.
I'd like to first solidify two things about any future changes to the Buffer
constructor.
Buffer
constructor can't be fully removed. We can force new
, print a warning or leave it as-is, but it can't be fully removed.Buffer
must accept are those that Uint8Array
also accepts.These are more facts of where we can go more than requests. So I'd like to solidify the main argument here is if Buffer
without new
will be deprecated and if so then how will we go about it?
@trevnorris It seems it's generally agreed that "ES6 classes" is insufficient justification for hard-deprecation. There's some agreement that hard-deprecation for security reasons is justified, but that implies hard-deprecating the Buffer
constructor entirely (not removing, mind you), which has a nice side effect of also hard-deprecating Buffer
without new
(with possibility of later removal).
@wmhilton You asked "how do deprecations effect node module authors", or did I misunderstand? It was a node semver-major change that broke graceful-fs, and it is node that introduced the warning about how the version of graceful-fs you are using needs to be updated, and did so before making the breaking change to give developers advance notice (but the message got seen by end-users and developers). If/when Buffer constructors are deprecated, it will cause a wave of npm modules deprecating older versions, so while the deprecation messages will come from npm, the root cause would be here, in node.
I'm considering making Buffer()
usage into an error in the standard
linter that I maintain. [[issue](https://github.com/feross/standard/issues/693)]
standard
's ruleset has 450K downloads/month, so this could be a good, opt-in way to discourage usage of Buffer()
. The other popular rulesets maintained by Airbnb, Google, etc. might also be receptive to a rule like this.
But, this investigation has shed more light for me on just how disruptive a change like this deprecation will be: Of the 300 repos in our test suite, around 70 of them fail due to Buffer()
usage. That's 25%!
Apart from all of this, there's a solution for allowing Buffer to be extended that I believe touches outside the security peripheral. Basically it would come down to:
function Buffer(arg, encodingOrOffset, length) {
if (typeof new.target === 'function' && new.target !== Buffer) {
// continue w/ the old API
}
// ...
}
I believe allowing the old API to be used, w/o a warning, is specific enough that it wouldn't be a security concern. So the only question in this case is, would it be a strange enough API discrepancy to have the existing API deprecated with simply new Buffer()
but not when it's extended.
@sam-github Ahh, that makes sense. I did not know that the root cause of graceful-fs being deprecated was a change in node. It has a ripple effect.
@feross Now that is an interesting idea. I wonder how many builds would break though because they rely on the linter passing? But I think this is the right kind of thinking. Who else could we involve in promoting the new APIs? Obviously npmjs. But maybe even popular bloggers. We could get "I rewrote my modules using the new Buffer API, and so should you" trending on Medium and Twitter. Or the Node Security Project. It's a social engineering problem at its core.
@trevnorris Several questions:
Buffer
?Buffer(string)
API?@ChALkeR I think I can answer these:
@seishun Here's the PoC:
'use strict';
const addon = require('./build/Release/addon');
class FastFoo extends Uint8Array { }
FastFoo.prototype.constructor = Foo;
Foo.prototype = FastFoo.prototype;
function Foo(arg) {
if (typeof new.target === 'function' && new.target !== Foo)
return addon.fromString(arg, new.target.prototype);
return new FastFoo(...arguments);
}
Object.setPrototypeOf(Foo, Uint8Array.prototype);
Object.setPrototypeOf(Foo.prototype, Uint8Array.prototype);
Foo.prototype.toString = function toString() {
return Buffer.from(this.buffer).toString();
};
// User's extending class.
class Bar extends Foo {
constructor() {
super(...arguments);
}
toStringHex() {
return Buffer.from(this.buffer).toString('hex');
}
}
const b = new Bar('a');
console.log(b);
console.log(b.toString());
console.log(b.toStringHex());
This is written as a native module, where fromString()
is:
void FromString(const FunctionCallbackInfo<Value>& args) {
Isolate* isolate = args.GetIsolate();
Local<Context> context = isolate->GetCurrentContext();
assert(args[0]->IsString() && "first argument must be a string");
assert(args[1]->IsObject() && "second argument must be a object");
Local<Object> buf = node::Buffer::New(isolate, args[0].As<String>()).ToLocalChecked();
char* data = node::Buffer::Data(buf);
size_t len = node::Buffer::Length(buf);
char* dup = new char[len];
strncpy(dup, data, len);
Local<ArrayBuffer> ab = ArrayBuffer::New(
isolate, dup, len, ArrayBufferCreationMode::kInternalized);
if (dup == nullptr)
ab->Neuter();
Local<Uint8Array> ui = Uint8Array::New(ab, 0, len);
ui->SetPrototype(context, args[1].As<Object>()).IsJust();
args.GetReturnValue().Set(ui);
}
I think this would allow extending from Buffer
correctly.
@ChALkeR
Buffer
had that capability. Since this can't possibly be a security issue ATM b/c it's not possible, and b/c the case is more specialized I think it would be reasonable to allow that. Though it might feel strange to allow the full set of parameters for the extended Buffer
, but not for the normal constructor. That's my only concern.@seishun add this to binding.gyp
in the same folders as those two:
{
"targets": [{
"target_name": "addon",
"sources": [ "main.cc" ]
}]
}
also wrap the native code with this:
#include <v8.h>
#include <node.h>
#include <node_buffer.h>
using namespace v8;
// place c++ code from above here
NODE_MODULE(addon, init)
Then build with:
$ node-gyp rebuild
If you're using a custom build of node then do this:
$ /path/to/node/node $(which node-gyp) rebuild --nodedir=/path/to/node
Let me know if that works for you.
@seishun mind gist'ing your code? or hit me on IRC and we'll work through it. (edit: see below)
@seishun i'm an idiot. First rename FromJust
to FromString
and then the end of the .cc
file should be:
void init(Handle<Object> exports) {
NODE_SET_METHOD(exports, "fromString", FromString);
}
NODE_MODULE(addon, init)
Sorry about that.
@trevnorris It seems it might work (although you'd have to add cases for all Uint8Array argument combinations to the C++ code), but:
@trevnorris @seishun wait… wait… wait… is this actually all that we need for proper subclassability:
class ExtensibleBuffer extends Buffer {
constructor(...args) {
super(...args);
Object.setPrototypeOf(this, new.target.prototype);
}
}
?
@seishun
- Are you 100% sure that it won't have any edge cases?
Nope. There are no apparent issues, but I'm sure the community would find something.
- Do you know what performance impact this will have compared to a regular ES6 class?
In node v6 switching new FastFoo(...arguments)
to new FastFoo(args0, args1, args2)
(I'm sure it's faster in newer V8) shows that any overhead is in the margin of error.
- I remember you mentioning that you wish to simplify the internal implementation of Buffer (and I totally share that sentiment). Adding such hacks seems to go in the opposite direction.
I have two goals. One is to make internal implementation simpler. The other is to make Buffer extendable. Since simplifying internals isn't in the short-term I figured it would be beneficial to go ahead and make it extendable so users can being to write their code expecting those simplifications to occur. This change would be future-proof of sorts.
@addaleax
wait… wait… wait… is this actually all that we need for proper subclassability:
More or less. Honestly I didn't realize this until writing the above example.
My proposal: leave Buffer alone. Gently nudge people to stop using it altogether. Add encodings for core methods such as fs.readFile() and streams: 'uint8', 'int16', 'arraybuffer' etc which provide unmodified Uint8Array etc objects.
The fact that node has a Buffer
at all is a historical accident because people needed to deal with binary data and v8 at that point did not yet have array buffer objects. Longer term, we should make all of core support native array buffers and leave Buffer
alone as a legacy API that we can gradually move away from. I don't see the point of extending Uint8Array when we already have Uint8Array. Core methods could return plain objects directly.
So, just to avoid any confusion here: All Buffer
objects are already Uint8Array
s, the only difference is an extended prototype.
Longer term, we should make all of core support native array buffers
:+1: Fwiw I’ve started to look for methods that can accept generic Uint8Array
s as input on one core module and am planning to do the same for other modules. The native bindings layer doesn’t even see any difference between Buffer
s and Uint8Array
s, so this work pretty well everywhere.
Regarding output from core methods, I am not sure adding more (pseudo-?)encodings to core methods is worth it, since converting between typed arrays is already pretty easy as it is. Also, I’m not sure what impact this kind of change would have on streams, so I will at least leave my own hands off that.
@substack AFAIK we are not currently moving forward with the decision to
deprecate anything. @nodejs/ctc can I get a confirmation on that
On Mon, Dec 26, 2016, 4:39 PM James Halliday notifications@github.com
wrote:
Upgrade to 7.2.1. This was reverted in node core: #9529
https://github.com/nodejs/node/pull/9529
Adding new won't get us anywhere because THAT will be deprecated in 8.0—
You are receiving this because you are on a team that was mentioned.
Reply to this email directly, view it on GitHub
https://github.com/nodejs/node/issues/9531#issuecomment-269243457, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAecVwJeMiUYKxqFp7v_SjtBixD0sYzOks5rMDQtgaJpZM4Kt-70
.
AFAIK we are not currently moving forward with the decision to deprecate anything
I don’t think we have consensus on anything right now, so, yeah. Basically.
Looking at the API in larger terms. I'd like to see us support all typed arrays, and ArrayBuffer
. One place specifically for the latter is when I want to read in a large file into memory all at once. This would be possible if I could pass an ArrayBuffer
to fs.read()
.
As for what happens to Buffer
. It has a lot of heavily optimized methods, and other goodies that make it much easier to work with data in node. So Buffer
itself won't be going anywhere. I would like to see that it require the use of new
eventually, but if that's not possible then eh. We can use the example snippet above to get proper support with class
.
A decision is going to have to be made about what (if anything) to change about new Buffer()
in 8.0.0 from a secure-programming perspective.
Given the...uh...lack of consensus, this decision is likely to have to be made by the CTC. (There's an additional question of what elements of any changes to backport to LTS lines, but that may not have to be decided by the CTC. The LTS folks may be able to sort that out once a decision is made for 8.0.0.)
This issue is long, and there are lots of external resources that supply info too. I guess it is probably good to at least list out the reasonable options (and omit ones no one wants) to at least have an idea what needs deciding.
With that in mind, I think these are basically the choices:
new Buffer()
new Buffer()
new Buffer()
Am I missing any other options?
I seem to recall @jasnell had an idea of how to do zero-fill without the perf impact that held us back from doing it previously.
I do not but I know that at one point @bnoordhuis has mentioned doing some form of initialization-on-demand where chunks of the allocated memory would be zero-filled incrementally or on read as necessary. I'm not sure if there was ever any effort put into investigating the feasibility of the idea, however.
At this point I am -1 on making any significant changes in the 7.x branch.
In 8.x, I would like to switch to zero-fill-by-default for Buffer(n)
and new Buffer(n)
. The associated performance hit would likely help encourage developers to move over to the new constructors faster.
In 9.x, I would like to see the runtime deprecation for both new Buffer()
and Buffer()
-without-new. That said, Buffer()
-without-new should continue to work.
@jasnell Can you explain why you're opposed to changes in 7.0.0 but want changes in 8.0.0? Not trying to persuade you otherwise or anything. Just curious what your reasons are. Is it just a general sense that 7.0.0 is coming along too soon? Or is there something more concrete going on?
We're only just over 3 months out from 8.0.0, 7.x is not going to be LTS, and I'd like to give more time for the ecosystem to prepare. In short, I see no reason to rush it.
Wow...long day or something... I was thinking 7.x was upcoming, but uh, yeah, obviously that's not the case. Updating my previous comment on what-the-options-are to say 8.0.0.
I support shipping zero-fill-by-default in 8.x. Ideally, we will backport zero-fill-by-default to supported LTS lines, so that new code written assuming Buffer(n)
zero-fills will run securely on older versions of Node.
Also, if we ship zero-fill-by-default, the primary security issue with the Buffer()
constructor (the information leak) is fixed, so I don't think a run-time deprecation in 9.x is necessary.
Docs-only deprecation has been sufficient to discourage new use of new code from using Buffer(n)
in my anecdotal experience. The docs are very, very clear that Buffer()
is to be avoided. And even if it does get used, we have zero-filling so no information leak is possible.
@jasnell If you support runtime-deprecating the Buffer constructor, then why do you propose to do it in the next major version after introducing zero-filling? Zero-filling should be introduced no earlier than the runtime deprecation, due to the security concerns discussed before.
Also, introducing a performance hit is not the nicest way to encourage developers to move over to the new constructors. It will likely result in one of two outcomes for a given developer:
new Buffer
. I bet they would prefer a deprecation warning that would inform them about the performance hit immediately after upgrading, coupled with the --trace-deprecation
flag that would help them find the exact lines they need to update.I hate to add to the ctc-review
pile but we need to make a decision about what is going to happen to the Buffer constructor in Node.js 8.0.0. Options appear to be:
As with the ES modules, none of the options are great so I guess we should provide a lot of time. If the modules discussions are any indicator, the multiple-mediocre-options discussions can take a lot of time to go over everything carefully.
Adding ctc-review label... @nodejs/ctc
@Trott I thought about that again, and I don't think that we should be bounded by those three options.
SlowBuffer
now (#8182), but that raises another issue — unsuspecting users of those packages and tools get those warnings, and think that something is broken, which it's most probably not. In most (>50%) cases the Buffer usage is fine, the original issue comes from the fact that it's not fine in _some_ percentage of the cases and the huge overall usage, combining into a lot of potentially unsafe places. But each distinct case is more likely to be fine than not, and the users don't want to be spammed with those warnings.Buffer(10).fill(0)
which would make _some_ users to rewrite that to just Buffer(10)
.So, how about this: target runtime deprecation (to the most possible extent) in 9.0 (or later), but announce that with 8.0 release, introducing a new flag in 8.0, e.g. --buffer-deprecation
that would enable that deprecation early so that users who _want_ to see those warnings will get them and report the issues to affected packages, giving them 6 months to migrate without haste.
Thoughts?
I like the idea of an opt-in flag. That looks like the --deprectate-soon
proposition from @jasnell which is currently closed.
I don't think a flag will do any good.
Few people will know it exists, fewer still will care enough to enable it.
The crucial part here is a separate section in the 8.0.0 release notes that states that the old methods will be runtime-deprecated soon and urges to stop using those, explains the background for that, and presents the flag as the helper tool to find the code before an actual runtime deprecations emerges that starts affecting all users.
I expect some users who build upon affected libraries to be running their code with that flag and reporting issues to affected libraries. That ususally works.
It's clear that some module authors don't care about docs-only deprecation and reject PRs fixing such usage.
We don't need to cover all the cases before the runtime deprecation lands, we need to just significantly reduce the impact of such a runtime warning for unsuspecting users and give module authors time to migrate.
The crucial part here is a separate section in the 8.0.0 release notes that states that the old methods will be runtime-deprecated soon and urges to stop using those, explains the background for that
The 6.0.0 release notes already mentioned that the constructor is deprecated, and the docs give a detailed explanation why. Would it make a huge difference if the release notes warned that there is going to be a warning in 6 months, and a tool was provided to find code that is going to cause the warning? Maybe, but it seems unlikely to me. People generally don't care about what happens to their code far in the future, especially if it's just a warning.
Even if the majority of users don't care about it, I think such a flag would be beneficial. I would personally use it on my applications to spot libraries that use deprecated APIs and open issues about it.
I agree that it would be beneficial, although the relevant PR didn't get much praise.
I disagree that the runtime-deprecation should be postponed in hopes that the flag will make a difference.
I think the issues with my proposed PR were mostly around the naming of the flags. The general concept seemed to have support and I can easily revisit it with a new PR that addresses those shortcomings. That said, in the case of Buffer
, my thinking is that we should be more aggressive about it. We now have mechanisms to silence warnings if necessary (env var, --redirect-warnings
and --no-warnings
flags) for those who really do not want to be bothered by such things. This is why we have a deprecation process to begin with.
This issue continues to go around-and-around in circles.
Runtime deprecation is not a good idea for all the reasons previously discussed. We even rolled back the runtime deprecation in the middle of the v7 line because it was way too disruptive. Before trying the exact same thing again, can we discuss why we think it will be any different this time?
@ChALkeR You continue to repeat the idea that zero-filling buffers will cause new security issues, but you haven't responded to my suggestion in this comment:
I support shipping zero-fill-by-default in 8.x. Ideally, we will backport zero-fill-by-default to supported LTS lines, so that new code written assuming
Buffer(n)
zero-fills will run securely on older versions of Node.
Also, if we ship zero-fill-by-default, the primary security issue with the
Buffer()
constructor (the information leak) is fixed, so I don't think a run-time deprecation in 9.x is necessary.
Adding zero-filling, and backporting it to all LTS releases causes no ecosystem disruption, and introduces no new security issues.
We even rolled back the runtime deprecation in the middle of the v7 line because it was way too disruptive. Before trying the exact same thing again, can we discuss why we think it will be any different this time?
No, it wasn't reverted because it was too disruptive. It was reverted because it wasn't sufficiently justified. This time, it will be.
@ChALkeR You continue to repeat the idea that zero-filling buffers will cause new security issues, but you haven't responded to my suggestion in this comment:
He has, though. https://github.com/nodejs/node/issues/9531#issuecomment-262003359
@seishun, @jasnell I will support runtime deprecation for 8.0, given that we won't revert that again to nowhere. But I don't think that it would be accepted by everyone, and I think that it might cause too much distruption for the end users atm (the reason why we had do revert even the buffer-without-new runtime deprecation, which is just just a part of what we should actually deprecate).
@seishun No, it wasn't reverted because it was too disruptive. It was reverted because it wasn't sufficiently justified. This time, it will be.
"Justifying" a huge breaking change won't make said breaking change any less disruptive to the ecosystem. The Web is WAY bigger than Node.js and it somehow manages to avoid making breaking changes. Node.js can and should do the same.
This thread is quite long and I am kind of lost so I have a few questions.
@mikeal ... as has been pointed out, zero fill by default carries risks for users on older versions if modules come to depend on assuming zero-filling is the default. We need to come up with a good way of nudging users to the new safer constructor APIs without (a) incurring additional risk and (b) incurring the wrath of developers due to overly noisy deprecation warnings.
For what it's worth, the next version of standard
will treat uses of the deprecated Buffer()
constructor as an error. I'm sure that maintainers of other popular style guides like Airbnb and Google could be convinced to add the same rule. This way, module developers are warned about the issue, without annoying regular users who can't even do anything to fix it.
zero fill by default carries risks for users on older versions if modules come to depend on assuming zero-filling is the default
Yes, but aren't you considering entirely breaking this API for those users anyway? Whatever behavior they might rely on here is sort of irrelevant if the alternative is breaking them anyway.
Yes, but aren't you considering entirely breaking this API for those users anyway?
No, runtime deprecation doesn't break anything. It displays a warning on first use.
who can't even do anything to fix it.
They can submit an issue or a PR or set NODE_NO_WARNINGS=1
.
No, runtime deprecation doesn't break anything. It displays a warning on first use.
Spewing unexpected text into the stderr of a Node CLI program most certainly does break things. Pretending otherwise is irresponsible.
They can submit an issue or a PR or set NODE_NO_WARNINGS=1.
If the deprecated usage is in a deeply-nested dependency, then PRs may need to also be submitted to every package in the chain of dependencies all the way to the top package.
It's common for packages to depend on a package that is one or two major versions behind. In that case, a big refactor may be required just to update to the latest dependency and make the warning go away.
And, NODE_NO_WARNINGS=1
is hardly a realistic solution for end-users.
Spewing unexpected text into the stderr of a Node CLI program most certainly does break things. Pretending otherwise is irresponsible.
Yes, it can cause breakage in certain cases (just like any other deprecation warning), but I think that's not what @mikeal referred to as "entirely breaking this API".
And, NODE_NO_WARNINGS=1 is hardly a realistic solution for end-users.
Why not? Could you clarify who you're referring to as "end-users"?
I think this is part of the problem:
"Deprecation is no big deal":
"Deprecation would be extremely disruptive":
There is a real cost that the deprecation advocates simply don't appreciate.
We all contribute to Node in different ways. I'm not implying that one type of contribution is more valuable than another. However, I would appreciate some acknowledgement of the real pain this change is likely to cause to others members of the community, instead of constant dismissals of our concerns.
To be absolutely certain: I never said that deprecation is "no big deal", so please do not characterize my position in that way. For me it's not about coming up with a good solution, it's about coming to an agreement around the least bad approach that will be effective at moving the ecosystem in the direction we need them to go without incurring additional risk.
To be absolutely certain: I never said that deprecation is "no big deal", so please do not characterize my position in that way.
Of course. The categories are probably better named "deprecation advocates" and "deprecation detractors".
A few things we need to understand in order to have a production conversation about tradeoffs.
I think the problem we're running into here is that some people think that this sort of a break is a grey area, that it effectively is not as extreme as "real" breaking changes. That is not true, and we cannot afford to pretend it is true.
Many people in this conversation understand this and may still think this break is worth it. We can have a productive conversation about the tradeoffs from that point forward.
What we can't do is continue to argue about the severity of this kind of deprecation. It's a breaking change, it breaks a bunch of stuff in the ecosystem, it will keep people from upgrading, and we need to admit that and plan appropriately if this is still something people believe is worth it.
it's about coming to an agreement around the least bad approach that will be effective at moving the ecosystem in the direction we need them to go without incurring additional risk.
Agreed.
I think that what @feross and a few others are saying is that the breaks due to runtime deprecation are actually greater than the breaks we'll see from zero-filling the API we want to deprecate. Both are a breaking change, but one is a bit more future proof and probably less destructive.
Of course. The categories are probably better named "deprecation advocates" and "deprecation detractors".
And just to be fair, everyone is advocating a breaking change to the API.
If we take the approach of zero-filling by default, then we would need to see a concerted effort on the part of the ecosystem to help protect users from the associated risks. For instance, modules that do assume zero-fill by default may want to proactively check that they are running on versions of Node.js that do zero-fill by default, and warn users accordingly when that requirement is not met. Version sniffing sucks but it would be the only way of knowing for certain that users are not being put at risk.
Aggressive linting would also be helpful. I had proposed to ESlint adding a rule around using the new construction methods but they declined given that the rule is very Node.js specific. It would be good to get some traction around that tho and I'm happy to see that standard
is moving in that direction.
Eventually I would like to see a runtime deprecation when using Buffer()
or new Buffer()
but that does not need to be any time soon so long as we see positive movement in the right direction here.
If we take the approach of zero-filling by default, then we would need to see a concerted effort on the part of the ecosystem to help protect users from the associated risks
So, we have @feross committed to adding a rule to standardjs. We can also reach out to AirBnB about adding the same to their widely adopted styleguide and tools.
We could reach out to Visual Studio Code and Atom to see if they can add similar warnings to their editor's default Node.js support.
We can also reach out to Snyk to see if they can add a warning when code and even dependencies do the same.
It's a well worn argument at this point that unexpected prints to stderr are a breaking change.
No one argues about that, but IMO the possible actual breakage caused by extra strerr output is a relatively minor pain point of the runtime deprecation, and not the one people are complaining about. People are mostly concerned about the amount of work that it would take to remove the warnings (which in most cases will cause annoyance, not breakage), which I do acknowledge.
If this is the only solution then the change is a breaking change. It should be treated as breaking all the modules that depend on the behavior as well as all the modules that depend on those modules and so on.
In most cases, simply ignoring the warning is also a solution. In most cases, the code will continue working as intended.
@seishun ... do not forget that there are many users who run with --throw-deprecation
. For those, a new deprecation warning is a new runtime error. It's currently not possible for us to determine just how many such users exist.
For instance, modules that do assume zero-fill by default may want to proactively check that they are running on versions of Node.js that do zero-fill by default,
I don't think that would work and we would have no means to check that in runtime.
A question: if we enable zero-filling by default, how would you estimate the percentage of library authors who _will_ assume zero-filling but _won't_ check the Node.js version?
Also note that such combination is not better and does not involve less code than just using the new API where it's available.
@mikeal, @jasnell re: linting and other means — note that the runtime deprecation in v7 significantly reduced the Buffer-without-new usage, but it was visibly _increasing_ ever since the deprecation was reverted. The doc-deprecation and whatever else means are curretly in action don't work well atm. _Note: I'm not saying that adding that rule to linters is a bad idea, it's a great idea._
Everyone, also please note the fact that zero-filling doesn't solve the quite popular Buffer(arg).toString('base64')
issue (it will still be a DoS), while deprecation does.
@ChALkeR ... yep. I tend to see it as ripping the bandaid off faster. However, we cannot discount the number of module developers with significant investment in the ecosystem who are telling us that runtime deprecation would be bad, so I'm trying to figure out a compromise path. If we go the route the ecosystem developers are asking us to take, then those ecosystem developers have to take on a certain amount of responsibility in solving the issue. If we take that action and we do not see measurable improvement, then going with the runtime deprecation is definitely a step we can take to help force the issue. You're absolutely right that we cannot reasonably estimate the risks incurred by switching to zero-fill by default.
Again, I do not see a good solution here. I see degrees of bad in each option. Basing the decision on a binary Will Work or Won't Work is not going to be very effective.
@mikeal, @jasnell re: linting and other means — note that the runtime deprecation in v7 significantly reduced the Buffer-without-new usage, but it was visibly increasing ever since the deprecation was reverted. The doc-deprecation and whatever else means are curretly in action don't work well atm. Note: I'm not saying that adding that rule to linters is a bad idea, it's a great idea.
I think that working these kinds of deprecations into the tooling is probably going to be even more effective than runtime deprecation. Anecdotally, people who won't use any tooling for linting or security are likely to just run without dep warnings or ignore them. For people that use this tooling this will actually be a bigger error, one they have to alter their code to get past, than the runtime warning.
@mikeal would such tooling help to find usage of deprecated API in dependencies? Because as far as I can tell, fixing it in your own code is trivial. It's figuring out which dependencies need to be updated and accommodating any major changes in them that will require the most work.
@jasnell It's not binary, I just simplified things a bit to that it would better fit into a comment.
Perform an estimation of the probability that a single random module developer would think about only new versions and will rely on zero-fill, multiply it by the chance that that developer won't care about version detection, then multiply that by the ecosystem size (well, the part that uses Buffer API, but that's still _very_ huge). I expect to see lots of modules that start doing bad stuff on versions where zero-filling isn't enabled if we choose the zero-fill path.
Also, the unsolved DoS.
That's basically why I said that it «won't work».
Approaches like pushing for more linter warnings are nice, but the problem is not so much new code being released as modules depending on that one tiny module that hasn't been updated in two years and therefore has no one looking at the code.
The only real way I can think of to deal with it is to get a dump of every module in npm, grep (or parse the AST, if one is so inclined) for any matching uses of the Buffer constructor in all of them and blanket contact all the module authors.
I just released standard
10.0.0-beta.0 which includes a check for deprecated Node.js APIs, including use of the Buffer()
constructor. Details here: https://github.com/feross/standard/issues/693#issuecomment-283592259
The final 10.0.0 release will happen one month from today, on April 1, 2017 after our usual one month testing period. You can follow the release progress here: https://github.com/feross/standard/issues/808
Note: I pushed this release out earlier than normal. I am eager to try this approach of using community tooling to discourage deprecated APIs far in advance of Node actually hard-deprecating those APIs. Let's see how this goes.
@ChALkeR How are you monitoring usage of the deprecated APIs? Can you keep us updated on how usage in the wild changes over the next few months?
Hi everyone! @seishun invited me to chime in with a real world use-case for extending Buffer that I've recently encountered. The EthereumJS project (https://github.com/ethereumjs) uses Buffer to encode values sent and received over the network. The Ethereum RPC protocol however, expects a specific kind of HEX encoding, different from that used by Buffer (https://github.com/ethereum/wiki/wiki/JSON-RPC#hex-value-encoding). One way to fix this inconsistency would be to extend the Buffer class and overwrite the toString
method to return the expected HEX value. This way the project which has come to rely heavily on Buffer would not need to be re-written using a replacement Buffer implementation.
Sample code:
module.exports = class EthBuffer extends Buffer {
constructor(value, encoding = 'utf8', type = 'unformatted') {
if (arguments.length === 2 && typeof value === 'string') {
super(value, encoding)
} else {
super(value)
}
if (arguments.length === 2 && typeof value !== 'string') {
type = encoding
}
this._type = type
}
toString() {
const args = Array.prototype.slice.call(arguments)
const val = super.toString.apply(this, args)
if (args[0] === 'hex' && this._type === 'quantity' && val[0] === '0') {
return val.substring(1)
} else {
return val
}
}
}
I hope this helps as you guys decide on how to proceed with the Buffer refactor! Good luck!
@ChALkeR @seishun @nodejs/ctc How would you feel about a variant on zero-filling that doesn’t have the danger of people coming to rely on it? For example, we could pick a pseudorandom uint8 at process startup that we use for filling all buffers. (Or maybe the lower bits of uv_now(loop)
, if that’s better. You get the idea.)
I know this sounds like a really weird thing to do, but it:
The only thing this would not address is the DoS issue.
"Slight" performance penalty? Have you measured this @addaleax ?
@addaleax That still has the same problem that I described in https://github.com/nodejs/node/issues/9531#issuecomment-273560353. Basically, the performance penalty might be significant for some, and they would have to spend time investigating where it's coming from.
"Slight" performance penalty? Have you measured this @addaleax ?
Not explicitly. It would be weird if it benchmarked very differently from Buffer.alloc
vs Buffer.allocUnsafe
. So: Yeah, if you only look at the Buffer()
calls itself, it would definitely be noticeable.
Basically, the performance penalty might be significant for some, and they would have to spend time investigating where it's coming from.
We are going to put whatever change we make into our release notes, so I wouldn’t feel too bad about that.
I’m bringing this up because it seems like zero-filling is actually something that a lot of CTC members would be on board with, and unless I’m missing something, this kind of filling would be a strictly better option than zero-filling.
We are going to put whatever change we make into our release notes, so I wouldn’t feel too bad about that.
Not everyone reads release notes though.
and unless I’m missing something, this kind of filling would be a strictly better option than zero-filling.
I suspect that on many platforms zero-filling is faster than garbage-filling.
I like @addaleax's approach because it's the least disruptive one that's been proposed so far (other than "do nothing"), while still accomplishing the primary goals that we're all attempting to solve.
Like @jasnell said:
I do not see a good solution here. I see degrees of bad in each option. Basing the decision on a binary Will Work or Won't Work is not going to be very effective.
We should measure the true performance impact to get the full picture, but if the worst con of this approach is a performance hit on a deprecated API (that can be fixed by changing Buffer()
to Buffer.from()
in most cases) then this certainly sounds like the "least bad" option to me.
because it's the least disruptive one that's been proposed so far
I'm not exactly sure about the definition of "disruptive" used in this context, but assuming it means "things not working the intended way causing lowered productivity", then introducing a performance penalty is also disruptive, and I'm not convinced that the number of people who rely on high performance of new Buffer()
is significantly lower than the number of people who rely on stderr output.
All else being equal, I'd say a disruption that's obvious is preferable to one that is hidden in release notes.
Note: I'm not dismissing non-stderr-related concerns regarding deprecation warning. I just disagree with using the term "disruption" to describe those.
I agree with @seishun.
The type of user who would be affected by a 10% (or whatever %) performance change in new Buffer()
is probably not the same type of user who skips reading the release notes. Indeed, I would be surprised if there is any overlap between these two groups.
Anyone running a performance-critical app will not only read the release notes, but test their app and it's performance before deploying a new version of Node.js in production.
But, yes, I concede that even this approach isn't perfect. Still, I believe it's the least disruptive one that has been proposed so far.
FYI: https://github.com/eslint/eslint/issues/5614#issuecomment-285517494
Regardless of what we do here, there will be disruption. That cannot be avoided. The question is about which kind of disruption is most acceptable (i.e. "least bad"). @addaleax's suggestion to fill with a randomly selected value is a very good option here. Sure, there may be some people who do not read the release notes, there will be some who do not read the documentation, there will be some who do not read the tweets, etc... there's not going to be much we can do for those people other than not make the situation worse. We will definitely need to measure the performance hit but Buffer.allocUnsafe()
is the solution for that.
It does not solve all of the issues, of course, so there will be more to do.
@addaleax Sorry, that won't work, it doesn't differ from zero-fill too much. When I said «authors will start relying on zero-fill» I mean not that they would start relying on the Buffers to be literally filled with zeroes, but that they would start to rely on Buffer(num)
being secure (when num
isn't too big). See details below.
There are (rougly) three types of problematic packages there:
Buffer(num)
and Buffer(string|array)
usage _and_ accept user input there.Example: ws
.
See https://github.com/websockets/ws/releases/tag/1.0.1 for details (though ws
was not the only package there).
Two types of problems:
Those types of packages are relatively rare, comparing to other issues described below.
Buffer(num)
, but fail to fill it in under some circuimstances.See https://github.com/ChALkeR/notes/blob/master/Buffer-knows-everything.md#how-to-avoid-leaks for an example of vulnerable code.
Example: ip
.
Before https://github.com/indutny/node-ip/commit/b2b4469255a624619bda71e52fd1f05dc0dd621f, ip.mask('::1', '0.0.0.0')
was returning uninitialized Buffer memory.
I can't tell exactly how common is that, but that's definitely less common than the example 3 below.
No large numbers could get into Buffer(number)
in that package, so DoS isn't possible there.
zerofill/randomfill would have fixed it, _and that's the problem here_.
The problem is that under zerofill/randomfill, that code is secure, and on all current Node.js versions, including 6.10.0 and 7.7.2, it wouldn't be secure. Once the people start relying on leaking Buffer(small_number)
results as not being a security issue (and some will do that immediately), things will get worse for users who did not update to latest patch version.
Also, the performance hit could be noticeable in both zerofill and randomfill, that is yet another problem here: you can't rely on all users to read the changelog carefully. What I expect to happen once we enable zero/randomfill without a deprecation, is that some users would update to the latest patch version, see a performance degradation, don't bother digging that, instead assume that it's a temporary bug and rollback to an unpatched version. Which would get worse.
Buffer(num)
and Buffer(string|array)
usage in their API, but don't directly deal with user input.Example: hoek
.
See https://github.com/hapijs/hoek/issues/177, fixed in [email protected]
.
Module authors commonly say that it's not their problem, and that passing a number
to their API is not supported (though they don't typecheck).
There is a large amount of those.
Those increase the scope of this problem and make this less tracable — if any user input ends up being passed to such an API in some other package or in some unpublished server code, it would become the same as in case 1 — at least a DoS and probably an unitialized memory leak (if there is a remote user, they usually have some means of observing the result).
zero/randomfill won't solve the DoS here (pretty much the same as in case 1), but it will make module authors even less willing to fix their packages, once they hear that the issue is somehow «fixed» in Node.js (though it won't actually be). If you think that they won't — note that even the discussion here often totally ignores the DoS.
They only way that will make things secure, fixing both uninitialized Buffer memory leaks and DoS issues in users code would be to migrate everything in ecosystem to Buffer.alloc/Buffer.from
, and I don't see any way around that that would work long-term.
@jasnell
Regardless of what we do here, there will be disruption.
Why is disruption inevitable? It seems to me that with a documentation-only deprecation, and by better documenting why various Buffer APIs are deprecated, there could be no disruption.
Currently, users have to jump to the top of the Buffer documentation to read about why these APIs are deprecated, and there's no link from each deprecated function's documentation to that section with further explanations.
Adding why these APIs are deprecated _in the same section where they are documented_ could help convey the message that these APIs have, among other problems, security issues.
@ChALkeR DoS, where the attacker could simply kill the server with a few packages. This would not be fixed by zero/random fill.
We need to stop bringing up DoS. It's out-of-scope and completely unrelated to the uninitialized memory disclosure issue.
We can't _solve_ security for users determined to hurt themselves. By your logic, Buffer.alloc(num)
is also vulnerable to a DoS attack because if a user doesn't check the size of num
a remote attacker could trigger a DoS. This is out-of-scope and not Node's job to worry about.
If we want to make progress here, we need to agree to focus on how to prevent the memory disclosure issue. That issue, not a DoS issue, is what originally prompted the new Buffer
APIs and it's the _only_ valid reason to migrate users from Buffer()
to these new APIs.
@ChALkeR Once the people start relying on leaking Buffer(small_number) results as not being a security issue
No reasonable user will treat random data being returned as "not an issue". Unlike returning a zeroed buffer, returning a random buffer _looks very, very weird_. The only solution in this case is for the user to manually zero it out or use fill(0)
.
I think the scenarios you're describing are getting increasingly ridiculous and so impractical that I'm actually more confident now that this is the right solution after all.
When I said «authors will start relying on zero-fill» I mean not that they would start relying on the Buffers to be literally filled with zeroes, but that they would start to rely on
Buffer(num)
being secure (whennum
isn't too big).
@ChALkeR I see your point but I don’t think that’s actually the case. For code authors, to rely on Buffer(num)
being secure would imply that they are aware of the issues with the Buffer constructors, so those are the people we don’t need to worry about as much.
They only way that will make things secure, fixing both uninitialized Buffer memory leaks and DoS issues in users code would be to migrate everything in ecosystem to
Buffer.alloc/Buffer.from
, and I don't see any way around that that would work long-term.
Right, I am afraid that’s something we’ll have to accept is just not going to happen.
We can't solve security for users determined to hurt themselves. By your logic, Buffer.alloc(num) is also vulnerable to a DoS attack because if a user doesn't check the size of num a remote attacker could trigger a DoS. This is out-of-scope and not Node's job to worry about.
It seems you misunderstand the DoS issue. It happens when there is a chain of calls between the user and new Buffer()
, where no one validates the input because they assume someone else will. As a result, new Buffer(num)
can get called instead of new Buffer(string)
. This wouldn't happen with Buffer.alloc()
.
I think the scenarios you're describing are getting increasingly ridiculous and so impractical that I'm actually more confident now that this is the right solution after all.
We understand that you disagree with @ChALkeR, but such remarks are demeaning and antagonizing. Let's keep the conversation rational and stick to the facts.
Let's definitely try to keep the conversation civil and constructive. At this point it seems like there's not a lot of convincing going on and just a lot of talking past one another. What we need to come together on is a path forward towards a solution. We're not going to solve the issue over night, nor are we going to solve in a single step.
There are three constituents to this problem that need to word in concert towards a solution: core, users and the tools ecosystem. There are thousands of modules out there that use Buffer()
and new Buffer()
. @addaleax is absolutely correct that we cannot simply make those modules change their code no matter what we choose to do here. Even if we simply stripped Buffer
out entirely, that doesn't mean those module authors would change their code. @feross is also correct in that we cannot "solve" security for these developers. There is a certain amount of responsibility they have to take on themselves.
That said, @seishun and @ChALkeR are also correct in that these issues aren't simply going to go away and cannot simply be ignored. The tools ecosystem can and will help significantly here. The changes to standard
, the linting rules, security vulnerability auditing, and education will go a long way but are certainly slow. We'll get there. @ChALkeR noted that there was a measurable decline in the uses of Buffer()
-without-new when we had the deprecation warning in there. That's a positive. Noisy works but it can also be counterproductive.
What I want to do right now is come to a compromise solution that I know will not make everyone happy or solve all of the issues immediately but will hopefully move things in the right direction.
We modify Buffer(num)
and new Buffer(num)
to fill with a random byte value selected at startup for v8.0.0.
We introduce an optional * deprecation warning that is *off by default in v8.0.0 and can be enabled using a command line switch and environment variable. This warning would be emitted when the Buffer()
or new Buffer()
constructor is used.
We switch the deprecation warning to on-by-default in v9.0.0
We continue to work with tool creators and education providers to get the word out about using the new Buffer construction methods.
We continue to work with the ecosystem by proactively submitting pull requests to replace found uses of Buffer()
and new Buffer()
in the ecosystem.
Again, I know this plan does not solve the problem immediately, and I know it introduces annoying strerr output that is potentially breaking. I'm not trying to solve all the problems and I'm not trying to make everyone happy. I'm trying to find a solution that would be workable.
@seishun It seems you misunderstand the DoS issue.
Thanks for explaining to me an issue that I discovered and reported. 🙄
It happens when there is a chain of calls between the user and new Buffer(), where no one validates the input because they assume someone else will. As a result, new Buffer(num) can get called instead of new Buffer(string). This wouldn't happen with Buffer.alloc().
Yes, the confusion between Buffer(num)
and Buffer(string)
is, in fact, the reason we created the new APIs, but _it had nothing to do with DoS_. You're inventing reasons ex post facto.
If you go back and read the original issue, you'll see that the rationale for the new APIs was _uninitialized memory disclosure_, a problem at least 100x worse than DoS.
If we can can prevent uninitialized memory disclosure, which @addaleax's proposal does, then we've solved the original issue that prompted the creation of these new Buffer
APIs in the first place. This would let us keep Buffer()
around for a bit longer, giving the ecosystem more time to migrate.
@jasnell Looks like we commented at the same time. I could get behind your proposal, though I think it still deprecates Buffer
too aggressively. v9 is ~8 months away.
With the memory disclosure issue solved by random byte filling, what is the urgency to deprecate? I think we could afford to wait longer to give the ecosystem more time to migrate.
Then perhaps this: the CTC can review the progress of migrating the ecosystem away from Buffer()
and new Buffer()
before cutting the 9.0.0 release in October. The CTC could decide to enable the deprecation message then or not. By 10.0.0 next year, however, the deprecation warning would definitely be switched on by default.
@jasnell having Node core API become a moving target is an excellent way to alienate the module authors that provide value to the platform. Don't deprecate primitives. Ever.
@jasnell, that differs from what I proposed by zerofilling with a random number, which I am still unsure about for the reasons stated in https://github.com/nodejs/node/issues/9531#issuecomment-285625233. Also if it lands to 8.0, what would happen to LTS? Will it not get randomfill?
Note that @addaleax expressed an opinion (in the table, if I understood it correctly) that for deprecation at any point in near future (i.e. one year won't be enough) we will get too much pressure and will have to revert.
I personally don't agree with that, though.
@feross No, the fact that you personally ignore the DoS issue doesn't make it disappear — I expect a significant number of setups to be vulnerable to DoS because of the Buffer(num)/Buffer(arg)
mixup in some package deep in the dependency chains. Yes, that mixup is the issue you reported, but the fact that you didn't mention DoS there doesn't mean that it is out of scope.
Btw, I am not sure how you can be against https://github.com/nodejs/node/issues/9531#issuecomment-283295518 / https://github.com/nodejs/node/issues/9531#issuecomment-283246696 but in favor of https://github.com/nodejs/node/issues/9531#issuecomment-285772835 which is mostly the same (except for the randomfill in 8.0). Or do I misunderstand something?
As for «increasingly ridiculous and so impractical» — I did acknowledge your expressive language here (no joking here, it's also an input source), but it would help more if you cited the exact statement (preferrably — the exact part) that you disagree with.
Yes, the confusion between Buffer(num) and Buffer(string) is, in fact, the reason we created the new APIs, but it was about it had nothing to do with DoS. You're inventing reasons ex post facto.
In addition to @ChALkeR's comment above, I'd like to point out that DoS _was_ mentioned in https://github.com/ChALkeR/notes/blob/master/Lets-fix-Buffer-API.md, which was a major basis for the new APIs.
And it seems you're ignoring mine and @jasnell's request for a more civil attitude.
I think we could afford to wait longer to give the ecosystem more time to migrate.
I'd agree if there was evidence that the usage of new Buffer()
in the ecosystem is going to drop on its own without runtime deprecation. So far I don't see it, linting isn't going to make people update their dependencies, and I really doubt many people would willingly enable an optional warning. I'd be happy to be proven wrong though, and I think it's something worth discussing.
My proposal: don't deprecate new Buffer(n)
or Buffer(n)
. Don't change it to an ES class. Turn on zero-filling by default.
Deprecation is a significant cost.
If the goal is to make the community safer, zero-filling satisfies that.
If the goal is to provide nudge the community towards faster APIs, zero-filling satisfies that for the ones who care, and the ones who don't care shouldn't be forced to care about it.
If the goal is to make Buffer
more extensible, well, it's already plenty extensible, and I don't see how "Buffer can be subclassed" makes Node.js a better platform. In fact, it's probably a bad idea. The API should gently communicate that Buffer _probably shouldn't be subclassed_, and leaving it as a factory function accomplishes that.
Anyone who _wants_ to subclass it, though, already can today with Node 7, and it's hacky enough that they're likely to avoid it.
If the goal is to warn folks off of a DoS from creating a very large buffer, it could print a warning when Buffer(n)
is called with a suitably large number so that they can debug it when there are problems. It's a lot better than memory disclosure.
If the goal is to make people in the ecosystem stop using new Buffer()
, well... why?
@jasnell, thinking about it again, https://github.com/nodejs/node/issues/9531#issuecomment-285772835 looks good to me. If we have a concrete short path to deprecation, random-fill is acceptable. I'm still not sure should it be backported or not, though — your plan doesn't include backporting. Perhaps that's the best — that way, the impact of «users start relying on randomfill» concern is reduced to the possible minimum, and 8.0 will be safe from uninitialized memory leaks.
It's still not neglectible, though — even now, module authors often just don't see case 3 from https://github.com/nodejs/node/issues/9531#issuecomment-285625233 as an issue in the module.
Sorry about being absent from this discussion for the last month or so. Can someone clarify if all the "runtime deprecation" options I was shown on the "Node.js Buffer options" spreadsheet mean completely deprecation of the Buffer
constructor?
I've voiced this several times in the past, but my opinion is that if any changes are to occur (outside of only zero-filling) then Buffer
parameters should become the same as or a super-set of Uint8Array
. The requiring of new
is also important. Is this reflected in any of those options?
To reiterate, new Buffer(number)
and similar should never be runtime deprecated.
If the goal is to make
Buffer
more extensible, well, it's already plenty extensible, and I don't see how "Buffer can be subclassed" makes Node.js a better platform. In fact, it's probably a bad idea. The API should gently communicate that Buffer _probably shouldn't be subclassed_, and leaving it as a factory function accomplishes that.
That's a strong opinion @isaacs, but I don't see any reasoning or support for it. In the past I've extended Buffer
with additional functionality that made the API much easier to work with. The key is being able to call functions on the instance, while also being able to access its data via array index. This is difficult to replicate, and it would make things much easier if I could simply use class Foo extends Buffer
.
Can someone clarify if all the "runtime deprecation" options I was shown on the "Node.js Buffer options" spreadsheet mean completely deprecation of the
Buffer
constructor?
Yeah, it does.
@trevnorris, note that the table has a «to the technically possible extent» sentence, which means that it shouldn't break code that doesn't use Buffer(arg)
explicitly. #7152 and #11808 have a work-around for that, and everything works as far as I am informed. That was done by @seishun, I believe. Could you provide a testcase that would be broken by any of those PRs? It's better to move the dicussion to the PRs, though.
Upgrading this one from ctc-review
to ctc-agenda
. We need to make a decision about what is and isn't going to happen in version 8.0.0.
Summing up where things are now, at least as I see them, and reading a lot into the spreadsheet @ChALkeR set up and that all CTC members were invited to fill out:
While not unanimous, there seems to be a consensus that we should do something--that ignoring the issue is not a wise option.
Deprecating in version 8.0.0 has considerable opposition and scant support.
Zero-filling has more support than opposition at this time. That said, there are almost as many people who are neutral about it as there are people that are for it. So a lot will depend on how those folks end up voting.
Scheduling a deprecation has significant support and slightly less opposition. Again, a lot of neutral folks on that one, so how that goes will depend how the undecideds end up voting.
Opt-in deprecate is expected to easily land in version 8.0.0. It has two people on the record as being opposed. One of them is opposed to anything other than updating the docs. If I recall correctly, the other had trouble remembering why they indicated opposition and it may have been an error.
For some folks, support or opposition to zero-fill depends on whether there is a commitment to run-time deprecating (for example, announcing that run-time deprecation will happen in version N.0.0) and whether it will be backported to LTS lines. Therefore, it may make sense to try to come to a decision on whether or not to schedule a deprecation before trying to decide whether or not to zero-fill.
Random-fill seems to have less support than zero-fill but is still a viable contender. Most of the things said about zero-fill
I think this has been resolved, at least for Node.js 8.0.0. We will want to revisit this in 6 months (if not sooner!) before the Node.js 9.0.0 release.
For now, the CTC has decided that:
There were no decisions that were going to please everyone. This is where things sit for now. As mentioned above, we'll surely be re-visiting this in the not-too-distant future to assess how things are working or not working.
I should also mention that there is an effort to get a rule into ESLint that will flag Buffer constructor usage. I would characterize the state of that proposal as likely to be adopted by ESLint, but not a sure thing at this time.
I'm going to close this issue, but feel free to re-open or comment if you think that's not the right thing to do at this time. Thanks!
FYI, I just released standard
10.0.0 which treats usage of deprecated Node.js APIs as a lint error. So we now have thousands of users (once they update) who will see warnings about Buffer()
being deprecated in their tests and in their CI pipelines.
From the 10.0.0 changelog entry:
- Disallow using deprecated Node.js APIs
- Ensures that code always runs without warnings on the latest versions of Node.js
- Ensures that safe Buffer methods (
Buffer.from()
,Buffer.alloc()
) are used instead ofBuffer()
It's hard to know exactly how many people use standard
, but our shareable eslint config is downloaded 670K times per month, so I hope this change will have some noticeable effect in the usage numbers. We'll see.
I think it would be great if we could lean on community tooling, like standard
and others, to help make these kinds of deprecations less painful in the future. @ChALkeR, it would be great if you could keep an eye on usage of Buffer()
to see how much it changes over the next 1-3 months.
Also on the tooling front: ESLint will be shipping with a no-buffer-constructor (that may not be the name of the rule, I'm just using that as a shorthand right now) rule in the foreseeable future. See https://github.com/eslint/eslint/issues/5614#issuecomment-291742518 and thank @not-an-aardvark and @jasnell and everyone else who got us to this point. (I don't know if the rule will have enormous impact or modest impact, but we don't know if we don't try!)
With the Node.js 9.0.0 release looming closer, I think it's time to revisit this. @ChALkeR could you evaluate how much the usage of Buffer()
has changed in the last 3 months?
@seishun Thanks for the reminder! I hope to do that in a few days. =)
@ChALkeR pinging once again...
Most helpful comment
My proposal: don't deprecate
new Buffer(n)
orBuffer(n)
. Don't change it to an ES class. Turn on zero-filling by default.Deprecation is a significant cost.
If the goal is to make the community safer, zero-filling satisfies that.
If the goal is to provide nudge the community towards faster APIs, zero-filling satisfies that for the ones who care, and the ones who don't care shouldn't be forced to care about it.
If the goal is to make
Buffer
more extensible, well, it's already plenty extensible, and I don't see how "Buffer can be subclassed" makes Node.js a better platform. In fact, it's probably a bad idea. The API should gently communicate that Buffer _probably shouldn't be subclassed_, and leaving it as a factory function accomplishes that.Anyone who _wants_ to subclass it, though, already can today with Node 7, and it's hacky enough that they're likely to avoid it.
If the goal is to warn folks off of a DoS from creating a very large buffer, it could print a warning when
Buffer(n)
is called with a suitably large number so that they can debug it when there are problems. It's a lot better than memory disclosure.If the goal is to make people in the ecosystem stop using
new Buffer()
, well... why?