Crystal: Returning Iterator::Stop from a block passed to Iterator.of causes segfault when working with a StringScanner

Created on 17 May 2020 · 11 Comments · Source: crystal-lang/crystal

Code sample

require "string_scanner"

struct Scanner
  WORD_SEPARATORS = /\s|-/
  WHITESPACE      = /\s/

  @scanner : StringScanner
  @words : Iterator(String)

  def initialize(@scanner : StringScanner)
    @words = Iterator.of do
      @scanner.skip WHITESPACE
      next Iterator.stop if @scanner.eos?
      @scanner.scan_until(WORD_SEPARATORS)
        .try(&.strip(&.whitespace?)) || Iterator.stop
    end
  end

  def do_it(width : Number = 80)
    @words.each &->puts(String)
  end # def
end   # struct

scanner = StringScanner.new "some test text here"
puts Scanner.new(scanner).do_it

Note that, interestingly, this works:

str = ""
Iterator.of do
  str += "x"
  str.size > 5 ? Iterator.stop : str
end.each &->puts(String)

So this must be something to do with StringScanner in particular, but I'm not sure what.

bug, topic:codegen

All 11 comments

Invalid memory access (signal 11) at address 0x40
[0x55b01b369716] *CallStack::print_backtrace:Int32 +118
[0x55b01b35bc0e] __crystal_sigfault_handler +286
[0x7f0ee5cd8800] ???
[0x55b01b3c4e02] *StringScanner#match<Regex, Bool, Regex::Options>:(String | Nil) +114
[0x55b01b3c4d80] *StringScanner#scan<Regex>:(String | Nil) +16
[0x55b01b3c4d0d] *StringScanner#skip<Regex>:(Int32 | Nil) +29
[0x55b01b35bd35] ~procProc((Iterator::Stop | String)) +53
[0x55b01b3c59d0] *Iterator::SingletonProc(String) +96
[0x55b01b3c540f] *Scanner#do_it:Nil +591
[0x55b01b34db0f] __crystal_main +1263
[0x55b01b3c6b36] *Crystal::main_user_code<Int32, Pointer(Pointer(UInt8))>:Nil +6
[0x55b01b3c69cc] *Crystal::main<Int32, Pointer(Pointer(UInt8))>:Int32 +44
[0x55b01b3590d6] main +6
[0x7f0ee5aa4023] __libc_start_main +243
[0x55b01b34d54e] _start +46
[0x0] ???

Not sure what's happening there. Are we losing the match data reference, so that the GC or libpcre cleans it up?

I find it highly unlikely that we're trying to consistently dereference a pointer at address 0x40. It seems more likely to me that this is a case of something being interpreted as a reference/pointer that's an integer constant as opposed to a pointer-like value.

(0x40 is 64 by the way... that seems oddly coincidental)

Significantly reduced. So no, it's not about Iterator or StringScanner (or libpcre)

struct Foo
  def initialize(@a : Array(Int32))
    ->{
      @a[0]
    }.call
  end
end

Foo.new([5])
Invalid memory access (signal 11) at address 0x4

From what I see, this happens when all of these conditions hold:

Inside a struct,
inside an initialize,
make a closure referring to an instance variable
which is a class,
and use the closure
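
A possible way to sidestep the conditions above, as a minimal sketch only: copy the instance variable into a local before building the proc, so the closure captures the local rather than `self`. The `first` getter is a hypothetical addition for illustration, and this is an assumption based on the list above, not something confirmed against the affected compiler version.

```crystal
# Hypothetical workaround sketch (assumption, not a confirmed fix):
# the proc closes over the local `a`, so it never touches `self`
# while the struct is still being initialized.
struct Foo
  # `first` is an illustrative name, not part of the original repro.
  getter first : Int32

  def initialize(@a : Array(Int32))
    a = @a                     # local copy of the array reference
    @first = ->{ a[0] }.call   # closure refers to `a`, not `@a`
  end
end

puts Foo.new([5]).first
```

Breaking any other single condition (for example, making `Foo` a class) would presumably avoid the crash as well.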

Also for your curiosity, the 0x40 == offsetof(StringScanner, @str)
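
For anyone who wants to check the offsets on their own build, a small sketch using Crystal's `offsetof` expression. The printed values depend on the compiler and stdlib version, so none are claimed here. The `@size` line is an assumption that the 0x4 address in the reduced repro corresponds to the array's size field.

```crystal
require "string_scanner"

# Byte offset of @str inside a StringScanner instance
# (reported in this thread as 0x40 for the affected version).
puts offsetof(StringScanner, @str)

# Byte offset of @size inside an Array(Int32) instance
# (assumed here to be what the 0x4 address in the reduced repro points at).
puts offsetof(Array(Int32), @size)
```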

Thank you @oprypin I had no idea what was going on here

Just to add to @oprypin's findings: the segfault is raised when dealing with Array, so the code below works.

struct Foo
  def initialize(@a : Int32)
    ->{
      @a
    }.call
  end
end

Foo.new(5) # => Works

main culprit is inside _*Array(Int32)@Indexable(T)#[]<Int32>:Int32

->  cmpl    4(%rcx), %eax
    setl    %dl
    movb    %dl, 3(%rsp)
    jmp LBB1806_11

I don't know what exactly that adds, as this is certainly not specific to arrays. You just removed 2 crucial details from my repro:
The instance variable needs to be a class; and the instance variable needs to actually be used (its fields accessed)

👍
my bad and sorry for the confusion

looks like #7771

Indeed, closing as duplicate.
