Crystal: Array#sort requires < and <= methods, not <=>

Created on 26 Aug 2018 · 7Comments · Source: crystal-lang/crystal

According to the docs for Array#sort, it:

Returns a new array with all elements sorted based on the return value of their comparison method #<=>

However, Array#sort only uses the #< and #<= methods; it does not use the #<=> method.

With <=>

class Foo
  property :number

  def initialize(@number : Int32)
  end

  def <=>(other : Foo)
    self.number <=> other.number
  end
end

arr = [Foo.new(10), Foo.new(20), Foo.new(5), Foo.new(15)]
arr.sort!
pp arr.map(&.number)

Compiling yields:

in /usr/share/crystal/src/array.cr:1872: undefined method '<' for Foo
 c -= 1 if a[c] < a[c - 1]
                ^

With < and <=

class Foo
  property :number

  def initialize(@number : Int32)
  end

  def <(other : Foo) 
    self.number < other.number
  end

  def <=(other : Foo)
    self.number <= other.number
  end
end

arr = [Foo.new(10), Foo.new(20), Foo.new(5), Foo.new(15)]
arr.sort!
pp arr.map(&.number)

Running this code yields:

[5, 10, 15, 20]

System Information

Operating System

Ubuntu 18.04.1 LTS

crystal -v

Crystal 0.26.0 [eeb53c506] (2018-08-09)

LLVM: 4.0.0
Default target: x86_64-unknown-linux-gnu

bug stdlib

Source

ghost

Most helpful comment

Cool, I managed to fix the performance problem. Well... for floats it'll be slower because of the NaN check, but for ints and generally things where comparison of <=> doesn't return nil it performs the same. I'll send a PR soon.

asterite on 26 Aug 2018

👍4

All 7 comments

@astupidnerd Thank you for reporting this!

I knew this for a while but since implementing <=> without also including Comparable is very strange, I'd say this is a very minor issue.

That said, we should fix it.

TL;DR

We should change Array#sort and Array#sort! to use <=> instead of <
We should check for nil when using <=> in the above methods, and raise in such case
(optionally) we should merge Comparable and PartialComparable, checking for nil in <, <=, > and >= and returning false in those cases.

(the above is just my opinion)

Long explanation for the optional point

I'm thinking that sort should check whether <=> returns nil, and in that case raise an error saying that the two elements can't be compared. Usually we can detect this at compile-time if <=> isn't defined for two types, but one case that's odd is when comparing NaN against other numeric values. For example in Ruby:

irb(main):002:0> [Float::NAN, 1, 2].sort
ArgumentError: comparison of Float with 1 failed
    from (irb):2:in `sort'
    from (irb):2
    from /usr/local/bin/irb:11:in `<main>'

Then:

irb(main):005:0> Float::NAN <=> 1
=> nil

However:

irb(main):006:0> Float::NAN < 1
=> false
irb(main):007:0> Float::NAN > 1
=> false
irb(main):008:0> Float::NAN == 1
=> false

That is, we can compare floats with <, > and ==, but <=> returns nil if the floats aren't comparable. Maybe it's just an edge case and we shouldn't worry about that, though...

In the standard library we have Comparable and PartialComparable. PartialComparable allows for the possibility of <=> returning nil, and if it returns nil the the automatically implemented methods (<, <=, > and >=) return false... whereas in Ruby those operations raise an exception:

# ruby
class Foo
  include Comparable

  def <=>(other)
    nil
  end
end

f = Foo.new
f < f

# bar.cr:10:in `<': comparison of Foo with Foo failed (ArgumentError)
#   from bar.cr:10:in `<main>'

But note that for Float, which defines <=> and includes Comparable, those method return false instead of raising an exception.

Maybe in our case it's fine to return false always when getting nil? I don't know. But maybe we should just merge Comparable and PartialComparable, checking for nil, and thus simplifying the standard library.

asterite on 26 Aug 2018

👍2

I have a branch will all of these changes.

The only problem is that Array#sort(&block) becomes about 3 times slower because it must now account for the block returning nil (the non-block version performance remains the same). I don't know if this is acceptable.

Maybe we can just change Array sorting methods to use <=>, but forget about the possibility of it returning nil... though of course you won't be able to sort something that is partial comparable, like Location in the compiler, which makes the whole PartialComparable a bit useless, unless you only plan to use <, >, etc., with them.

I usually prefer a language that behaves correctly even if that means sacrificing a bit of performance (Ruby <3). Our sort is pretty fast, so loosing a bit of that performance might not be that bad). Sorting numbers where one of the values is NaN should probably raise, in the same way that overflow should raise instead of silently behaving incorrectly.

Thoughts?

asterite on 26 Aug 2018

Actually, I have an idea to make it faster when nil can't be returned. I'll try it later.

asterite on 26 Aug 2018

👍4

Nice. I'm a little late to the party but... could Float#<=> throw an ArgumentError for NaN (itself)? Is the nil useful somewhere else I'm not aware of?

rdp on 25 Mar 2020

Oh wait I get it it's so that NaN <=> NaN can still return that they're "equal" (0 in this case). That's OK. Another option might be the java route where NaN gets sorted "as the highest number" (after positive infinity LOL): https://discuss.python.org/t/nan-breaks-min-max-and-sorting-functions-a-solution/2868/19 FWIW :)

rdp on 25 Mar 2020

It's not because NaN <=> NaN returns 0, it actually returns nil in both Ruby and Crystal. NaN is simply not comparable and not equal to anything else.

The reason nil is returned is so that <=> is only operator used for comparison, and it provides for partial and complete orders.

asterite on 25 Mar 2020

Was this page helpful?

0 / 5 - 0 ratings