Crystal: Segfault when trying to sort by unicode characters

Created on 4 May 2017  ·  8Comments  ·  Source: crystal-lang/crystal

How to reproduce: https://play.crystal-lang.org/#/r/1z6n

Summary

I'm define custom method

def sort_by_hint(a, b)
  return -1 if a.to_s[0] == 'Ч'
  return 1 if b.to_s[0] == 'Ч'
  0
end

And trying to sort an array of hashes like this:

sorted.sort { |a, b| sort_by_hint(a["hint"], b["hint"]) }

After running the code I have an Invalid memory access (signal 11) at address 0x8 error:

Invalid memory access (signal 11) at address 0x8
[0x45cce6] *CallStack::print_backtrace:Int32 +118
[0x45090d] __crystal_sigfault_handler +61
[0x7f36b1318fe0] ???
[0x4aaaa1] *Hash(String, Int8 | String) +17
[0x4ab04d] *Hash(String, Int8 | String) +29
[0x4aae30] *Hash(String, Int8 | String) +48
[0x4aaddd] *Hash(String, Int8 | String) +29
[0x450af6] ~procProc(Hash(String, Int8 | String), Hash(String, Int8 | String), Int32) +86
[0x4adc56] *Array(T)::partition_for_quick_sort!<Pointer(Hash(String, Int8 | String)), (Int32 | Int64), Proc(Hash(String, Int8 | String), Hash(String, Int8 | String), Int32)>:Pointer(Hash(String, Int8 | String)) +726
[0x4ab94b] *Array(T)::quick_sort_for_intro_sort!<Pointer(Hash(String, Int8 | String)), Int32, Int32, Proc(Hash(String, Int8 | String), Hash(String, Int8 | String), Int32)>:Nil +299
[0x4ab7f4] *Array(T)::intro_sort!<Pointer(Hash(String, Int8 | String)), Int32, Proc(Hash(String, Int8 | String), Hash(String, Int8 | String), Int32)>:Nil +84
[0x4aa05f] *Array(Hash(String, Int8 | String)) +47
[0x4a9f95] *Array(Hash(String, Int8 | String)) +37
[0x441523] ???
[0x450809] main +41
[0x7f36b0716511] __libc_start_main +241
[0x440bfa] _start +42
[0x0] ???

Bug reproduces on crystal versions:

  • Crystal 0.22.0 [3c71228] (2017-04-20) LLVM 3.5.0 (Ubuntu 16.04 LTS)
  • Crystal 0.22.0 (2017-04-20) LLVM 4.0.0 (MacOS 10.11.6)

Full Code

require "json"

class Payload
  JSON.mapping({
    payload: Array(Data)
  })
end

class Data
  JSON.mapping({
    hint: String,
    likes: Int8,
    title: String
  })
end

def sort_by_hint(a, b)
  return -1 if a.to_s[0] == 'Ч'
  return 1 if b.to_s[0] == 'Ч'
  0
end

raw = %({"payload": [{"hint":"Что?","likes":3,"title":"Отсутствие времени"},{"hint":"Что?","likes":8,"title":"Изменение утвержденного задания"},{"hint":"Что?","likes":7,"title":"Мало времени"},{"hint":"Что?","likes":9,"title":"мало кто следует тому алгоритму, который разработала Мария. На самом деле это позволило бы нивилировать почти все траблы по подготовке адекватного ТЗ"},{"hint":"Что?","likes":3,"title":"Отсутствие регламента"},{"hint":"Что?","likes":7,"title":"Отсутствие критериев приёмки"}, {"hint":"Что?","likes":4,"title":"Отсутствие приёмочных тестов на Gherkin (Cucumber)"},{"hint":"Что?","likes":8,"title":"отсутствие общего понимания, что такое четкое тз"},{"hint":"Что?","likes":2,"title":"не хватка времени"},{"hint":"Что?","likes":3,"title":"нехватка кадров"},{"hint":"Что?","likes":3,"title":"Не имеет  завершенного  вида"},{"hint":"Что?","likes":7,"title":"Изменение требований к продукту."},{"hint":"Что?","likes":2,"title":"Ограниченные сроки."},{"hint":"Что?","likes":8,"title":"не всегда полное и корректное ТЗ"},{"hint":"Что?","likes":9,"title":"отсутствие времени на четкую постановку ТЗ для скорейшей передачи в разработку"},{"hint":"Что?","likes":1,"title":"передача знаний"},{"hint":"Кто?","likes":6,"title":"Технический писатель"},{"hint":"Кто?","likes":8,"title":"Технический писатель"}]})

sorted = Payload.from_json(raw).payload.map { |p| { "hint" => p.hint, "likes": p.likes, "title": p.title } }

puts sorted.sort { |a, b| sort_by_hint(a["hint"], b["hint"]) }
bug stdlib

Most helpful comment

I think the solution below would be nice.
Compare the first and the last element with pivot before partition, and if it is not as expected, raise an exception.
This checking is at most O(n) times, so it doesn't affect the efficiency a lot.

# in class Array
protected def self.partition_for_quick_sort!(a, n, comp)
  v, l, r = a[n / 2], a + 1, a + n - 1
  raise "invalid block" if comp.call(v, a.value) < 0 || comp.call(r.value, v) < 0 # Here
  # ...
end

I am going to start a PR to fix this issue, but I don't know much about exception.
Which exception should be here? Or should I define a new exception for this situation?
Also a good and clear error message is important.
Need discussion, thanks.

All 8 comments

If both a and b have prefix 'Ч', we can't figure out which is smaller because sort_by_hint says they are smaller to each other.

sort_by_hint("Чa", "Чb") # => -1
sort_by_hint("Чb", "Чa") # => -1

This causes Array#sort behaves weird.

Thus, sort_by_hint should be something like this.

def sort_by_hint(a, b)
  return 0 if a.to_s[0] == 'Ч' && b.to_s[0] == 'Ч'
  return -1 if a.to_s[0] == 'Ч'
  return 1 if b.to_s[0] == 'Ч'
  0
end

Still sortshouldn't segfault on such an error but instead try to raise an exception maybe or just behave not deterministically without segfaulting or endlessly looping.

reduced

([1] * 17).sort { -1 }

@jhass yeah, code not ideal, but strange that it could lead to crash )

https://github.com/crystal-lang/crystal/blob/master/src/array.cr#L1931
This issue is caused because partition_for_quick_sort! doesn't check the array bounds.
center_median! forces the pivot (which is the local variable v in partition_for_quick_sort!) not smaller than the first element, so bounds checking is not necessary.
However, when wrong block is given, the pivot is still smaller than the first element even if center_median! has already checked this.

I think the solution below would be nice.
Compare the first and the last element with pivot before partition, and if it is not as expected, raise an exception.
This checking is at most O(n) times, so it doesn't affect the efficiency a lot.

# in class Array
protected def self.partition_for_quick_sort!(a, n, comp)
  v, l, r = a[n / 2], a + 1, a + n - 1
  raise "invalid block" if comp.call(v, a.value) < 0 || comp.call(r.value, v) < 0 # Here
  # ...
end

I am going to start a PR to fix this issue, but I don't know much about exception.
Which exception should be here? Or should I define a new exception for this situation?
Also a good and clear error message is important.
Need discussion, thanks.

I suppose that's

SortError.new "Partition failed due to unstable comparator block given."

I fixed this by checking the bounds. The sort will behave wrong, but no exception is raised. This is similar in other languages too.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

RX14 picture RX14  ·  62Comments

stugol picture stugol  ·  70Comments

akzhan picture akzhan  ·  67Comments

ezrast picture ezrast  ·  84Comments

rdp picture rdp  ·  112Comments