Node: Debugging: name every function

Created on 3 Oct 2016 · 56Comments · Source: nodejs/node

Version: 4.4.7
Platform: linux
Subsystem: http

There are too many anonymous functions in the source code which makes heap debugging frustrating

This once('response') listener ( https://github.com/nodejs/node/blob/master/lib/_http_client.js#L235-L237 ) is anonymous.

When I try to debug why I am leaking response listeners in a heap snapshot

I see that the listener in the once closure is function () {} which gives me no information. I strongly suspect that it's the abort listener but i have no evidence for it.

There are many, many, many anonymous functions in node core, there should be zero.

good first issue help wanted http

Source

Raynos

👍11

Most helpful comment

can i give it a go?
im new to node but not js and this looks like a good way for me to learn node's innards :)

Imgur

solebox on 8 Oct 2016

😄9 👍5 🎉3 ❤1

All 56 comments

This applies to 4.x & master. I did not check 6.x, I assume 6.x and master are the same.

Raynos on 3 Oct 2016

We can't name arrow functions though. Since those are done for perf reasons you'd still need to live with them... :/

Fishrock123 on 6 Oct 2016

@Fishrock123 out of curiosity _(pardon if this venue isn't suited for inquiry)_; pre-assigning the arrow function before usage does retain the naming while --inspect:ing - is what you're referring to not to do due to performance hits?

fl0w on 6 Oct 2016

@fl0w you mean assigning to a variable means you get the variable name? You can't assign names to arrow functions like to regular functions though.

Fishrock123 on 6 Oct 2016

const a = () => 42;
a.name; // returns 'a'

works since V8 5.1, so that should be possible.

addaleax on 6 Oct 2016

👍10

@Fishrock123 @addaleax answered before me - I was wondering if that was an overhead you were referring to in your original comment.

https://github.com/koajs/koa/pull/805#r76954409 for pictures

fl0w on 6 Oct 2016

@Fishrock123

Do you have a link to a demonstration that arrows are more performant. I suspect named function declarations would be more performant then arrow functions.

Raynos on 6 Oct 2016

V8 doesn't have to worry about super and new.target when emitting code for arrow functions.

It's a minor thing and probably not much of an issue once the function makes it to the optimizing tier but it means that core should lean towards arrow functions, all other things being equal.

bnoordhuis on 7 Oct 2016

@Raynos anywhere where bind/this passing was/is necessary.

Other than that, they are only marginally different like @bnoordhuis said. The best rule of thumb is use them where you need the lexical this and regular functions elsewhere. (Which is exactly what we do.)

Fishrock123 on 7 Oct 2016

Thanks for input @Fishrock123 @bnoordhuis Agreed that bind is bad.

I wrote a quick benchmark ( https://gist.github.com/Raynos/93d275463a90306b4b0779fed308550c ).
I ran the benchmark with node 6.4.0 & v8 5.0.71 and performance of both arrows and closures is identical.

Having a named function declaration;

var self = this;
function someName() {
  self.wat()
}

Instead of () => self.wat() will still improve heap debugging.

I suspect using a technique that improves heap introspection is more valuable then saving a few lines of code. Especially if the allocation and calling performance of both is the same.

Raynos on 8 Oct 2016

In your benchmark the arrow/non-arrow functions are optimized almost right from the start but that's not very representative of long tail code (code that gets called periodically but not frequently enough to get optimized.)

var self = this is something I definitely recommend against. () => this.wat() captures just the lexical this but the function in your example can over-capture the enclosing lexical scope due to how V8 implements closures. If you had a var big = Buffer(1 << 28) in there, it could stay alive as long as the function.

It's not a theoretical concern either. Over the years I've fixed several memory leaks in core that were the result of over-capturing.

bnoordhuis on 8 Oct 2016

can i give it a go?
im new to node but not js and this looks like a good way for me to learn node's innards :)

Imgur

solebox on 8 Oct 2016

😄9 👍5 🎉3 ❤1

@soleboxy I’d suggest picking a file in lib/, doing it, opening a PR and seeing how that goes. A single pull request for all functions across all JS files would basically be a recipe for conflicts. If you’re unsure about anything, feel free to ask here or in #node-dev on Freenode! :)

addaleax on 8 Oct 2016

k thanks :+1:

solebox on 8 Oct 2016

@bnoordhuis Over capturing closure variables that leads to leak is a great point. I'm convinced arrow functions are worth using for that alone. Thanks for explaining the details.

Raynos on 10 Oct 2016

I can take lib/console.js for this

tylerbrazier on 12 Oct 2016

I lint the lib folder with func-names and this is the list of files that have anonymous functions. I also mark the ones that have open PR and didn't add the ones in the internal folder.

[ ] _debug_agent
[x] _debugger
[x] _http_agent
[x] _http_client
[x] _http_incoming
[x] _http_outgoing
[x] _http_server
[x] _stream_passthrought
[x] _stream_readable
[x] _stream_transform
[x] _stream_wrap
[x] _stream_writable
[x] _tls_common
[ ] _tls_legacy
[ ] _tls_wrap
[x] assert
[x] buffer
[x] child_process
[x] cluster
[x] console
[x] crypto
[ ] dgram
[x] dns
[x] domain
[ ] events
[x] fs
[x] http
[x] https
[ ] module
[x] net
[x] os
[x] querystring
[x] readline
[x] repl
[x] stream
[ ] string_decoder
[ ] timers
[x] tls
[ ] tty
[x] url
[ ] util
[x] v8
[x] vm
[x] zlib

maasencioh on 12 Oct 2016

👍8

I'm comfortable doing this to functions, but I don't think we should go back to self = this for arrow fns. Arrow fns assigned to variables _might_ be ok.

Fishrock123 on 12 Oct 2016

well i guess we have PRs for everything

solebox on 15 Oct 2016

@Trott This rule can be easily enforced/auto fixed in ESLint, right?

princejwesley on 24 Oct 2016

@princejwesley I'm sure that can be enforced with func-names, but no autofixed

maasencioh on 24 Oct 2016

As noticed in https://github.com/nodejs/node/pull/9252, its not always preferable to have named functions as V8's inferrer does a better job in some cases:

const fs = exports;
fs.func1 = function() {}; // shows as fs.func1
fs.func2 = function func2() {}; // shows just as func2

I think we should evaluate using this pattern in all modules.

silverwind on 24 Oct 2016

Full example:

"use strict";
const mod = exports;

(mod.a = function() { console.trace(); })();
(mod.b = function b() { console.trace(); })();

Trace
    at mod.a (/Users/silverwind/git/test/infer.js:4:31)
Trace
    at b (/Users/silverwind/git/test/infer.js:5:33)

silverwind on 24 Oct 2016

👍2

@silverwind

with prototypes tho:

"use strict"

var m = new Moo;

function Moo(){};

Moo.prototype.close = function close() {
    console.trace();
}
Moo.prototype.noname = function() {
    console.trace();
}
console.log(m.close);
console.log(m.noname);
m.close();

output:

[Function: close]
[Function]  
Trace
    at Moo.close (/home/sole/projects/test.js:8:13)

although it does look some what ugly.... i can say that (the source code)

solebox on 25 Oct 2016

/cc @nodejs/ctc I think we should discuss this in this weeks call

MylesBorins on 25 Oct 2016

👍5

@soleboxy While that example shows that it doesn't seem to harm anything on assignment to prototype properties, I'm still pretty -0 on that because:

It does not convey any benefit. It just results in no change in behavior. (Or am I missing a subtlety here?)
It _could_ do some (minimal) harm. If there's a typo in the function name, that's what gets used. For example:

Moo.prototype.close = function closet() {
  console.trace();
}

That will put Moo.closet in the trace rather than Moo.close even though .close() is what the user would have called.

Trott on 26 Oct 2016

@trott - the names have to match of course.
it was an answer to @silverwind's issue
the change will not secrifice the stack trace names (i.e mod.b to b as shown)
but yet will give you the additional .name field mentioned above by @addaleax
which will in return enable the repl/console.log name of the function to be visible if one is inclined to debug or log what he does that way. it just improves reflection that is all

as i have shown in the output:

[Function: close]
[Function]  
Trace
    at Moo.close (/home/sole/projects/test.js:8:13)

the definition of noname will still yield the same trace but not a .name for console.log + repl when used by the programmer.

if that has any value to the end user experience when making sense of the code i think it is worth thinking about at least.

solebox on 26 Oct 2016

I'm 👎 on adding names to functions in the case of a.prototype.myFunc = function () {}.

I think what is currently inferred is better. I'm all in in naming all the other functions/closures etc. This seems the case from @Raynos request: https://github.com/nodejs/node/blob/master/lib/_http_client.js#L234.

mcollina on 26 Oct 2016

👍1

i don't mind what people choose ill gladly do what the consensus dictates as long as there is one.
i just want everybody to be happy :)
for that i think we need to just list all the options in which there are anon functions and decide what to do for each case.

it looks like this will be discussed soon in the meeting anyways.

solebox on 26 Oct 2016

If there's a typo in the function name, that's what gets used

Typos like that could be linted for using func-name-matching.

silverwind on 26 Oct 2016

For now, we're going to remove the good first contribution label from this. We can re-add it once we've documented clearly what the cases are where an anonymous functions should _not_ be named.

Trott on 26 Oct 2016

👍6

When it comes to function name inference we need to fact check that it's available in all debugging artifacts including:

lldb / mdb stack traces.
lldb / mdb ::jsprint / ::findjsobjects
stack traces ( error.stack )
heap dump analysis
0x / perf / flamegraphs
v8 cpu profiler
v8 allocation tracer ( new in node6 )

Having access to the function name via functionInstance.name is not sufficient.

Raynos on 26 Oct 2016

👍2

@Raynos I've been using heap snapshots and stack traces, and kind of been assuming that if those provide a usable name, then other tools should as well. Is that probably more-or-less correct, or is that a bit naive and/or misguided?

Also, you write that the .name property is not sufficient. In your opinion, is it necessary? We've certainly seen that V8 can infer an ideal name that will show up in stack traces and heap snapshots, but the .name property is still not populated. Is that a problem? Or is .name more of a nice-to-have?

Trott on 27 Oct 2016

@Trott The .name field was a counter example. It's not necessary.

Raynos on 27 Oct 2016

@Raynos could you ellaborate on how to test for each artifact efficiently or guide in the general direction?
thanks

solebox on 27 Oct 2016

So no solution on how we can name those functions?

pvsousalima on 28 Oct 2016

I was going to try to put together something that evaluated all the situations where we might or might not want to name a function and try to figure out which situations helped, which situations were detrimental, and which were neither.

But I think we can probably proceed with a simpler approach, at least to get started. There is at least one situation where it is absolutely beneficial to name functions, and it is the situation that @Raynos used as an example when he opened this issue:

Anonymous functions that are used as listeners should be named.

So we should _not_ do this:

foo.on('bar', function() { 
  // do some stuff 
});

foo.once('boop', () => {
  // do some other stuff
});

Instead, we should do this:

foo.on('bar', function barCallback() {
  // do some stuff
});

const boopListener = () => {
  // do some other stuff 
});
foo.once('boop', boopListener);

I know that second one (assigning an arrow function to an identifier) has at least a sliver of skepticism from @Fishrock123 (anyone else?). If it's in any hot code paths, maybe we can benchmark the change. (And if you want to be cautious and/or avoid the PR stalling over it, you can leave the arrow functions out. Maybe put them in a separate PR so they can be discussed separately.)

Meanwhile, it would be great if the func-names rule in ESLint had an option to opt out for functions that are being assigned to variables. (Paging @not-an-aardvark...)

Thoughts?

Trott on 29 Oct 2016

(Removing the ctc-agenda label as it was discussed at the last meeting, and as far as I can tell, it doesn't look like it needs to be discussed at the next meeting, as long as discussion here is at least inching forward.)

Trott on 29 Oct 2016

👍2

I do agree with @Trott, the examples are what we need.

mcollina on 29 Oct 2016

Naming just the inline anonymous functions that are not assigned to a variable would be a good first step until we figure something out for top-level functions.

silverwind on 29 Oct 2016

👍1

Agreed 👍

pvsousalima on 29 Oct 2016

Tackling naming anonymous functions in child_process

brad-decker on 1 Dec 2016

Looking into the Buffer module, not seeing any functions that aren't being attached like so:

Buffer.concat = function(){// [...sic]}

Buffer.prototype.readInt16BE = function(){//[...sic]}

worth checking that module off the list in the comments above?

brad-decker on 1 Dec 2016

@brad-decker Thanks. I checked off buffer.

Trott on 2 Dec 2016

Taking the lib\utils.js

GoodnessSquad

KromDaniel on 21 Jun 2017

Taking the lib\module.js

GoodnessSquad

ranstyr on 21 Jun 2017

There are still some stragglers left in the codebase. This should make for a great first issue, there's even a checklist above that should help in location these. (Or search for function() {.)

apapirovski on 12 Apr 2018

Taking the lib/domain.js

dooven on 14 Apr 2018

Looking into lib/tty.js

js-kyle on 29 Apr 2018

Sorry, we should've really updated this issue. Only functions that are not on a prototype should be updated. Some of the unchecked files above are likely not an issue as a result.

apapirovski on 29 Apr 2018

👍1

hi, i would like to help out. Can someone point out a file please?
i know javaScript fairly well.

Trshant on 14 May 2018

This feature, coming to an ESLint release hopefully soon, might make checking for this more easily automated for Node.js core. https://github.com/eslint/eslint/issues/9926

Trott on 30 May 2018

AFAIK these are the remaining files in lib/ that need to be updated:

[ ] _http_client.js
[ ] _http_common.js
[ ] _http_outgoing.js
[ ] _stream_duplex.js
[ ] _stream_readable.js
[ ] _stream_writable.js
[ ] _tls_common.js
[ ] _tls_wrap.js
[ ] crypto.js
[ ] domain.js
[ ] events.js
[ ] fs.js
[ ] net.js
[x] punycode.js
[ ] readline.js
[ ] repl.js
[ ] timers.js
[ ] tls.js
[ ] zlib.js

grep 'function *(' *.js | grep -v prototype

transitive-bullshit on 5 Jul 2018

@trivikr I think a lot of the PRs currently being opened don’t actually improve the debugging situation – when a function is stored on a property on some object, the engine infers the name from it properly anyway…

addaleax on 11 Jul 2018

👍3

I checked off punycode.js because it's vendored in and not something we should patch in our source for this. If someone wants to open a PR upstream for it, the repository is https://github.com/bestiejs/punycode.js.

Trott on 11 Jul 2018

👍1

I agree with @addaleax and I think our situation improved significantly since this issue was opened. Not only have we named almost all functions throughout the code but the compiler will now infer the the names very well by now.

Therefore I am going to close this as resolved. If someone disagrees, please feel free to reopen. However, even if this is closed, it should of course not keep someone from opening a PR that provides names to functions that can not be inferred.