Ecma262: Array#sort() inconsistent implementations (support for booleans)

Created on 20 Apr 2017  Â·  27Comments  Â·  Source: tc39/ecma262

Hi,

I'm not sure that this is the right place to report this and I'm quite sure it must've been discussed in the past, but let's see.

I found out today that:

[ 1, 4, 2, 6, 0, 6, 2, 6 ].sort( ( a, b ) => a > b )

gives a sorted array in Chrome and Firefox and does nothing in Edge and Safari.

Obviously, sort()'s callback is meant to return a number, but, apparently, two engines also support booleans.

Does it make Chrome's and Firefox's behaviour a bug? Would it be worth (and, first of all, possible) to align Chrome and Firefox to the spec?

Or, since that would be a backwards incompatible change and taken that sort() doesn't need to implement a stable sorting algorithm anyway, a support for booleans could be standardised?

Most helpful comment

In that case, is there anything actionable remaining in this issue, or should it be closed?

All 27 comments

A comparefn that does not return numeric values is not a _consistent comparison function_. In that case, the sort order is implementation-defined.

From https://tc39.github.io/ecma262/#sec-array.prototype.sort:

If comparefn is not undefined and is not a _consistent comparison function_ for the elements of this array (see below), the sort order is implementation-defined.

And below:

A function comparefn is a consistent comparison function for a set of values S if all of the requirements below are met […]
[…]
Furthermore, Type(v) is Number

You could file bugs trying to get the engines to agree on their behavior, after which the behavior could potentially be specified.

Thanks. I haven't even thought that Chrome and Firefox work within the spec's limit. I guess I could open tickets for Edge and Safari so they add support for booleans but I don't think that this has a big chances of passing.

The comparison function is supposed to return a number, and if it doesn’t, the result is coerced to a number: see step 4.a of SortCompare abstract operation. In particular, Boolean values will be coerced to 0 or 1. (Historical note: that coercion behaviour was added in ES6, because it was observed that it is what the implementations did.)

In the context of Array#sort, a _consistent_ comparison function is a comparison function that models correctly an order relation according to which the set of values will be sorted, as explained in more details at the end of Section Array.prototype.sort. A comparison function that produces 0 or 1 but never -1 is certainly not consistent.

@Reinmar what you probably want is [1, 4, 2, 6, 0, 6, 2, 6].sort((a, b) => a - b) (or b - a). For strings, a comparator can instead return a.localeCompare(b).

What does the spec say should happen with:

[1, NaN, 3].sort((a, b) => a - b)

?

On Thu, Apr 20, 2017 at 4:04 PM, Jordan Harband notifications@github.com
wrote:

@Reinmar https://github.com/Reinmar what you probably want is [1, 4, 2,
6, 0, 6, 2, 6].sort((a, b) => a - b) (or b - a). For strings, a
comparator can instead return a.localeCompare(b).

Unspecified because the argument is not a "consistent comparison function"

On Apr 20, 2017, 4:26 PM, at 4:26 PM, "Mark S. Miller" notifications@github.com wrote:

What does the spec say should happen with:

[1, NaN, 3].sort((a, b) => a - b)

?

On Thu, Apr 20, 2017 at 4:04 PM, Jordan Harband
notifications@github.com
wrote:

@Reinmar https://github.com/Reinmar what you probably want is [1,
4, 2,
6, 0, 6, 2, 6].sort((a, b) => a - b) (or b - a). For strings, a
comparator can instead return a.localeCompare(b).

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
https://github.com/tc39/ecma262/issues/902#issuecomment-295957282,
or mute
the thread

https://github.com/notifications/unsubscribe-auth/AAQtzCAHqZ2KokUJfEa5KfE88Qec0feRks5rx-SJgaJpZM4NC2o9
.

--
Cheers,
--MarkM

--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/tc39/ecma262/issues/902#issuecomment-295965036

Despite that, the steps for comparison functions (see https://tc39.github.io/ecma262/#sec-sortcompare step 4.b) treats it as +0, even though the consistency requirements require that it not return NaN.

Interesting. I didn't know about this conversion to +0. But even if we allowed comparefn to return NaN and treated it as if it had returned +0, (a, b) => a - b would still not be transitive, and so still not a consistent comparison function.

No action item here re NaN. I can't think of any realistic way to further improve the situation.

Despite that, the steps for comparison functions (see https://tc39.github.io/ecma262/#sec-sortcompare step 4.b) treats it as +0, even though the consistency requirements require that it not return NaN.

It is probably a bug that the consistency requirements still require a non-NaN Number when the value is coerced to a non-NaN number in SortCompare anyway.

Or, in other words, the consistency conditions should apply to SortCompare(a, b) rather than comparefn(a, b).

I agree; it seems that either NaN should be allowed in consistent comparison functions, or SortCompare shouldn't need to mention it.

Webkit aligned its implementation to that of Firefox and Chrome :) https://bugs.webkit.org/show_bug.cgi?id=47825

Webkit aligned its implementation to that of Firefox and Chrome :) https://bugs.webkit.org/show_bug.cgi?id=47825

And what is the semantics of Firefox and Chrome?

Their patch seems to special-case boolean results of the comparison function, but I’m not sure it is what Firefox or Chrome do. Indeed, consider the following testcases, where the comparison function does not return a Boolean:

[1,3,2,5,4].sort((a,b) => Number(b<a)) // [ 1, 2, 3, 4, 5 ] in both FF in Chrome
[1,3,2,5,4].sort((a,b) => Number(a<b))  // [ 5, 4, 3, 2, 1 ] in both FF in Chrome

A reasonable semantics to me would be the following: If compareFn is the comparison function and if ToNumber(compareFn(a, b)) evaluates to 0 (or NaN), evaluates ToNumber(compareFn(b, a)) in order to decide how a and b should be ordered. But what they really do, I don’t know: I’ve not read the source code of FF or Chrome.

Other testcases (same results in both FF and Chrome):

[1,3,2,5,4].sort((a,b) => 1.5*Number(b<a)) // [ 1, 2, 3, 4, 5 ]
[1,3,2,5,4].sort((a,b) => 1.5*Number(a<b))  // [ 5, 4, 3, 2, 1 ]

but (showing that they do not follow the “reasonable” semantics of my previous comment):

[1,3,2,5,4].sort((a,b) => -Number(b<a)) // [ 1, 3, 2, 5, 4 ] (not sorted)
[1,3,2,5,4].sort((a,b) => -Number(a<b)) // [ 1, 3, 2, 5, 4 ] (not sorted)

if some implementation uses
comparefn(a, b) < 0 internally, (a > b ? 1 : 0) < 0 gives false and probably the result will not be sorted.

Will it make implementations consistent if spec, that implementation should use > 0 for the result of comparefn(a, b) - comparefn(a, b) > 0 ?

Testcase:

function test(numValues, numDistinctValues, compareFn) {
    var arr = Array(numValues)
    for (let i = 0; i < numValues; i++) {
        arr[i] = Math.floor(Math.random()*numDistinctValues)
    }
    arr.sort(compareFn)
    for (let i = 1; i < numValues; i++) {
        if (arr[i] < arr[i - 1])
            return false
    }
    return true
}

Naturally, test(n, m, (a,b) => a - b) must return true for every n and m.

For test(n, m, (a,b) => a - b > 0), I get the following results:

  • Firefox: always true;
  • Chrome: true for n ≤ 10, usually false otherwise;
  • Edge and Safari: usually false.

While we’re at it, let’s test the stability of the sort algorithm (https://esdiscuss.org/topic/stable-sort-proposal):

function test(numValues, numDistinctValues, compareFn) {
    "use strict"

    let arr = Array(numValues)

    for (let i = 0; i < numValues; i++) {
        arr[i] = [ i, Math.floor(Math.random()*numDistinctValues) ]
    }

    arr.sort((a, b) => compareFn(a[1], b[1]))

    let correct = true
    let stable = true
    for (let i = 1; i < numValues; i++) {
        if (arr[i][1] < arr[i-1][1])
            correct = false
        if (arr[i][1] === arr[i-1][1] && arr[i][0] < arr[i-1][0])
            stable = false
    }
    return [ correct, stable ]
}

(To be tested with numDistinctValues small relatively to numValues for meaningful result about stability.)

Conclusion:

  • Firefox and Safari always use a stable sort;
  • Edge uses a stable sort for n ≤ 512 and an unstable sort for larger arrays;
  • Chrome uses a stable sort for n ≤ 10 and an unstable sort for larger arrays.

Is the next step here removing step 4.b of SortCompare, since a consistent comparison function can't return NaN?

cc @mathiasbynens has this issues' landscape changed at all now that sort is stable?

cc @szuend

Is the next step here removing step 4.b of SortCompare, since a consistent comparison function can't return NaN?

Maybe I misunderstand the comment, but the comparefn is still user-provided code, so it can return what-ever (including NaN).

Regarding the relationship between "stableness" and this issue: It is not really related. "stable" is a property of the sorting algorithm itself, while this issue is about how to establish an ordering between elements.

FYI: Any "working" sorting with a boolean predicate is purely accidental for V8.

IMHO: Overloading the semantics of the value returned by a comparison function does not add any value for the user. Sure, it might accidentally fix code that is wrong in the first place, but it will also make Array#sort more confusing to use, not to mention the additional runtime overhead.

FYI: Any "working" sorting with a boolean predicate is purely accidental for V8.

The test case in https://github.com/tc39/ecma262/issues/902#issue-223024520 doesn't even return a sorted array anymore in current V8 (, probably caused by the TimSort change.)

FYI: Any "working" sorting with a boolean predicate is purely accidental for V8.

The test case in #902 (comment) doesn't even return a sorted array anymore in current V8 (, probably caused by the TimSort change.)

Even before. V8 used quicksort (unstable) with an insertion sort (stable) fallback for arrays and sub-arrays with a smaller length (10 i think). So the stableness was dependent on the input size of the test case.

@szuend right, but if user code returns NaN, it's an inconsistent comparator, and that makes the sorting results implementation-defined, to my reading. In other words, removing that step shouldn't change anything since the algorithm doesn't apply to a comparator that returns NaN.

I believe that at the time that line was added, web reality was that browsers all performed that step when presented with a compare function that was otherwise a consistent comparator except for the potentially returning NaN..

It was added to ensure that browsers continue to conform to that reality.

@allenwb in that case, should the prose be altered so that a consistent comparison function includes one that returns NaN?

@ljharb I don't think it needs to be. The specification explicitly specifies only a few algorithm steps and leaves everything else up to the implementation.

The fact that SortCompare explicitly replaces a NaN returned from a comparefn with +0 doesn't mean that the comparefn is consistent. In particular, the last sentence of the first bullet of of the definition of a consistent comparison function may not be true of functions that return NaN.

In that case, is there anything actionable remaining in this issue, or should it be closed?

Was this page helpful?
0 / 5 - 0 ratings

Related issues

michaelficarra picture michaelficarra  Â·  3Comments

moonformeli picture moonformeli  Â·  3Comments

AlexSHoffman picture AlexSHoffman  Â·  3Comments

axross picture axross  Â·  4Comments

ursi picture ursi  Â·  4Comments