Node: Array.sort arguments order changed in Node.js 11

Created on 10 Nov 2018  Â·  28Comments  Â·  Source: nodejs/node

  • Version: 11.1.0
  • Platform: Darwin hubert.local 18.2.0 Darwin Kernel Version 18.2.0: Fri Oct 5 19:41:49 PDT 2018; root:xnu-4903.221.2~2/RELEASE_X86_64 x86_64 (macOS 10.14.1)
  • Subsystem:

Hi!

I've noticed a change with Array.sort(). In Node.js 11 order of arguments passed to compare function is reversed. While in most cases it's not a problem, in some cases it causes problems.

In my project I have quite large compare function for .sort(): https://github.com/hudochenkov/postcss-sorting/blob/944da947a628192b54448368c197f586fbbe0c10/lib/sorting.js#L10-L62. It relies on arguments order. It works in Node.js 4—10. I don't expect anyone to figure out what my function does, and I can't simplify it for this issue report yet.

I created a gist, which shows how arguments get to compare function in Node.js 10 and Node.js 11.

$ npx -p node@10 npx https://gist.github.com/hudochenkov/29b739f8dbdb4aa00a46b953f62dd0a6

a: 1, b: 2
a: 2, b: 3
a: 3, b: 4
[ 1, 2, 3, 4 ]

$ npx -p node@11 npx https://gist.github.com/hudochenkov/29b739f8dbdb4aa00a46b953f62dd0a6

a: 2, b: 1
a: 3, b: 2
a: 4, b: 3
[ 1, 2, 3, 4 ]

Is it a regression or intentional change?

I tried to understand how V8 7.0 changed .sort(), but it's to complex for me :(

V8 Engine regression wontfix

Most helpful comment

Imo it's a slippery slope if we started to cater to bugs in user code. Relevant xkcd: https://xkcd.com/1172/

All 28 comments

Hello @hudochenkov, and thank you for the report.
As you mention, this is a change node get by way of a update V8. There was some discussion on the implication of the new algorithm https://github.com/nodejs/node/pull/22754#issuecomment-423444051, but AFAIK this specific issue was reported for node

/CC @nodejs/v8 (esp. @targos @hashseed @mathiasbynens)

P.S. @hudochenkov can you provide some data that using your code sorts differently?

@hudochenkov stable sort means what if elements in the unsorted array were already in the correct order, they are guaranteed to be in the same order after sorting.

in any case, as long as you always return the correct number for two elements, no matter the order, the engine will properly handle it. if the engine switches the order of the arguments, it will also handle the switched return value.

Thank you for quick replies!

I'll simplify my sorting function as much as possible and will provide examples of how it sorts differently. I'll do it tomorrow.

I've forked your gist - https://gist.github.com/refack/8fc2bbdf084f9ad28f900ad5a4f80a14 to uncover the regression:
```js
$ npx -p node@10 npx https://gist.github.com/refack/8fc2bbdf084f9ad28f900ad5a4f80a14

a: { p: 'mottob', id: 1 }, b: { p: 'bottom', id: 2 } - 0
a: { p: 'bottom', id: 2 }, b: { p: 'mottob', id: 3 } - -1
a: { p: 'mottob', id: 3 }, b: { p: 'bottom', id: 4 } - 0
[ { p: 'mottob', id: 1 },
{ p: 'bottom', id: 2 },
{ p: 'mottob', id: 3 },
{ p: 'bottom', id: 4 } ]

$ npx -p node@11 npx https://gist.github.com/refack/8fc2bbdf084f9ad28f900ad5a4f80a14

a: { p: 'bottom', id: 2 }, b: { p: 'mottob', id: 1 } - -1
a: { p: 'mottob', id: 3 }, b: { p: 'bottom', id: 2 } - 0
a: { p: 'mottob', id: 3 }, b: { p: 'mottob', id: 1 } - 0
a: { p: 'bottom', id: 4 }, b: { p: 'mottob', id: 1 } - -1
a: { p: 'bottom', id: 4 }, b: { p: 'bottom', id: 2 } - -1
[ { p: 'bottom', id: 4 },
{ p: 'bottom', id: 2 },
{ p: 'mottob', id: 1 },
{ p: 'mottob', id: 3 } ]

simpler code:

const declarations = [
  {p: 'mottob', id: 1},
  {p: 'bottom', id: 2},
  {p: 'mottob', id: 3},
  {p: 'mottob', id: 4},
];

if (process.argv[2] === 'anti' ) {
  declarations.sort((b, a) => {
    const ret = sortDeclarations(a, b);
    console.log(`a: %o, b: %o - %d`, a, b, ret);

    return -ret
  });
} else {
  declarations.sort((a, b) => {
    const ret = sortDeclarations(a, b);
    console.log(`a: %o, b: %o - %d`, a, b, ret);

    return ret
  });
}

console.log(declarations);

function sortDeclarations(a, b) {
  if (b.p === 'bottom') {
    return 1;
  }
  return a.id - b.id;
}

Get me:

D:\code\prws>d:\bin\dev\node\node10.9.0.exe t.js
a: { p: 'mottob', id: 1 }, b: { p: 'bottom', id: 2 } - 1
a: { p: 'mottob', id: 1 }, b: { p: 'mottob', id: 3 } - -2
a: { p: 'mottob', id: 3 }, b: { p: 'mottob', id: 4 } - -1
[ { p: 'bottom', id: 2 },
  { p: 'mottob', id: 1 },
  { p: 'mottob', id: 3 },
  { p: 'mottob', id: 4 } ]

D:\code\prws>d:\bin\dev\node\node11.0.0.exe t.js
a: { p: 'bottom', id: 2 }, b: { p: 'mottob', id: 1 } - 1
a: { p: 'mottob', id: 3 }, b: { p: 'bottom', id: 2 } - 1
a: { p: 'mottob', id: 4 }, b: { p: 'mottob', id: 3 } - 1
[ { p: 'mottob', id: 1 },
  { p: 'bottom', id: 2 },
  { p: 'mottob', id: 3 },
  { p: 'mottob', id: 4 } ]

D:\code\prws>d:\bin\dev\node\node11.0.0.exe t.js anti
a: { p: 'mottob', id: 1 }, b: { p: 'bottom', id: 2 } - 1
a: { p: 'bottom', id: 2 }, b: { p: 'mottob', id: 3 } - -1
a: { p: 'mottob', id: 1 }, b: { p: 'mottob', id: 3 } - -2
a: { p: 'mottob', id: 1 }, b: { p: 'mottob', id: 4 } - -3
a: { p: 'mottob', id: 3 }, b: { p: 'mottob', id: 4 } - -1
[ { p: 'bottom', id: 2 },
  { p: 'mottob', id: 1 },
  { p: 'mottob', id: 3 },
  { p: 'mottob', id: 4 } ]
  • Array#sort is returning the correct result, just like before the change in V8. This report is not about a spec correctness issue.

  • The order in which the elements are compared during sorting (and similarly, how many times the comparison function is called in total) is _intentionally_ unspecified, and depends on the exact implementation. When the V8 implementation changed from QuickSort to TimSort, this order changed. But that’s the thing: this order wildly varies across implementations, and even within the same engine it can change at any point; developers shouldn’t rely on it. This is explicitly called out in https://v8.dev/blog/array-sort#accessors-prototype.

This is not a regression; it’s an implementation detail that shouldn’t be relied upon.

My above comment was in response to the gist in the OP, i.e. https://gist.github.com/hudochenkov/29b739f8dbdb4aa00a46b953f62dd0a6.

@refack, let me respond to your test case now. This program demonstrates something separate from the issue OP reported. The sort callback in this code example is what the spec calls an _inconsistent comparison function_. For example, assume the following values:

  • a = { p: 'x', id: 3 }
  • b = { p: 'bottom', id: 4 }

Now, given the way sortDeclarations is written, we can observe the following:

  • if you compare a with b, sortDeclarations returns 1
  • if you compare b with a, sortDeclarations returns 4 - 3 which is… also 1

The result of a sort with an inconsistent comparison function is implementation-defined, and cannot be relied upon. In other words, this too is WAI.

Thank you for investigation, @refack! And thank you for clarification, @mathiasbynens!

this order wildly varies across implementations, and even within the same engine it can change at any point; developers shouldn’t rely on it.

I guess I was just lucky with my sorting function all these years :) I'm going to rewrite it to be independent of arguments order.

_inconsistent comparison function_.

I agree on that. But we still have a user-land regression, due to a dependency on unspecified behaviour.

If timsort can call the the compare function in an anti-commutative way, and that will get some percentage of such code to work again, _while_ not breaking the spec, IMO it's a win-win.

I'm a bit ambivalent on this, but I'm still marking this regression since the array.sort implementation did not go through a deprecation cycle, and defacto we have user code that broke without enough notification. as per our policy - https://github.com/nodejs/node/blob/master/COLLABORATOR_GUIDE.md#when-breaking-changes-actually-break-things

P.S. in https://github.com/nodejs/node/pull/22754#issuecomment-423452575 I did not object to landing this, even though there was indication that user code was going to break (at least 300 lines in ~50 npm packages) due to using a comparison function that returns Boolean, because IMHO that was misuse of the public API.
Here we have a much subtler situation, where the user code _looks_ like it's correct (i.e. would pass compilation if it was strongly typed), and probably passes tests. Admittedly because it's relying on unspecified behaviour.

in #22754 (comment) I did not object to landing this, even though there was indication that user code was going to break (lt least 300 line in ~50 npm packages) due to using a comparison function that returns Boolean, because IMHO that was misuse of the public API.

Here we have a much subtler situation, where the user code looks like it's correct (i.e. would pass compilation if it was strongly typed), and probably passes tests. Admittedly because it's relying on unspecified behaviour.

I don’t see how these situations are different. They’re both examples of problems caused by inconsistent comparison functions. It’s not Node.js’s responsibility to provide support for something that’s explicitly unspecified behavior, IMHO.

A comparison function that returns a boolean true or false is no worse than a comparison function that returns only 1 or 0 (but never -1) — in fact, it’s equivalent per spec, since ToNumber is implicitly called on the result. Both situations are just as bad; they’re both inconsistent comparison functions.

they’re both inconsistent comparison functions.

True.
But bottom line we have users hurting, and I'm asking, can we mitigate the situation? IMO if calling the comparison in anti-commutative way has only benefits, we should consider that. If that's not possible, we need to think is there's something else we can do. If we can't do anything, then we admit it's a known-limitation/wontfix situation.

Anyway I think we should do a publicity push to get users aware of this situation, and how they should solve issue with inconsistent comparison functions.

IMO if calling the comparison in anti-commutative way has only benefits, we should consider that.

I don’t understand what you mean by this suggestion.

In https://github.com/v8/v8/blob/78f2610345fdd14ca401d920c140f8f461b631d1/third_party/v8/builtins/array-sort.tq#L362

instead of:

const result: Number = sortCompare(context, userCmpFn, x, y);

do

const result: Number = -sortCompare(context, userCmpFn, y, x);

AFAICT, that just shifts the problem to _other_ inconsistent comparison functions.

Imo it's a slippery slope if we started to cater to bugs in user code. Relevant xkcd: https://xkcd.com/1172/

If there's an XKCD for this I have to concede (praise be to Randall Munroe seer of truth)

In https://github.com/v8/v8/blob/78f2610345fdd14ca401d920c140f8f461b631d1/third_party/v8/builtins/array-sort.tq#L362

instead of:

const result: Number = sortCompare(context, userCmpFn, x, y);

do

const result: Number = -sortCompare(context, userCmpFn, y, x);

If I do that, then in Firefox browser it will be with the wrong order, because it uses the normal parameters order.

Take this example to get the admins at the beginning of the list:

const members = [
  { id: 1, isAdmin: true },
  { id: 2, isAdmin: false },
  { id: 3, isAdmin: true },
];

members.sort((a, b) => {
  if (a.isAdmin) return -1;
  if (b.isAdmin) return 1;
  return 0;
});

It will return order them by these IDs in Chrome 3, 1, 2 and by these in Firefox 1, 3, 2

@JJBocanegra That's because your comparison function is inconsistent. It returns -1 for both check(members[0], members[2]) and check(members[2], members[0]). A better comparison would be this:

(a, b) => {
  if (a.isAdmin == b.isAdmin) {
    return a.id - b.id;
  }
  if (a.isAdmin) {
    return -1;
  }
  return 1;
}

@devsnek But that's implying that I want to order by ID, but I do not, I want the initial order to be respected. Consider this:

const members = [
  { id: 3, isAdmin: true },
  { id: 1, isAdmin: false },
  { id: 2, isAdmin: true },
];

I want the final order to be

[
  { id: 3, isAdmin: true },
  { id: 2, isAdmin: true },
  { id: 1, isAdmin: false },
]

And with your suggestion it would fail too.

@JJBocanegra Can you open a new issue at https://github.com/nodejs/help/issues/ instead of discussing this one?

@addaleax sure, but isn't it better to have it here because is directly related to the issue?

@JJBocanegra As @devsnek said, this is a bug in your code. It’s not really related to this issue here.

@addaleax I don't think it's a bug in my code, what If I want to use the initial order as a secondary sorting like is the case? With the arguments inverted is not possible without modifying the initial data adding some kind of index because other browsers use the correct arguments order.

@JJBocanegra you could return 0 instead of return a.id - b.id The point is that if both items have the same isAdmin value, they need to compare consistently by some other metric.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

akdor1154 picture akdor1154  Â·  3Comments

loretoparisi picture loretoparisi  Â·  3Comments

stevenvachon picture stevenvachon  Â·  3Comments

srl295 picture srl295  Â·  3Comments

cong88 picture cong88  Â·  3Comments