Crystal: Cannot detect OutOfMemory error using `Pointer(Void).malloc(size)`

Created on 27 Apr 2017  路  9Comments  路  Source: crystal-lang/crystal

Version: Crystal 0.22.0 (2017-04-22) LLVM 4.0.0

It's not possible using Pointer(T).malloc(size) to get a Pointer(T).null pointer.

I made a simple program that eat all memory until OOM (Out Of Memory) error, and I've found what seems to be a bug when there is OOM error from GC:

  • Using GC.malloc(size) it returns a null pointer (Pointer(Void) which ptr.address == 0).
  • Using Pointer(Void).malloc(size) (which should do the same thing as above I think), it crashes with a Invalid memory access (signal 11) at address 0x0.

I used this program: (I won't put it on carc.in as it's doing mallocs until OOM)

# file: eat_memory.cr
Kilobyte = 1024
Megabyte = 1024 * Kilobyte

GC.disable

nb_mega = 0

DIRECT_MALLOC = if ARGV[0]?
                  ARGV[0] == "direct"
                else
                  false
                end

def my_malloc(size)
  if DIRECT_MALLOC
    GC.malloc(size)
  else
    Pointer(Void).malloc(size)
  end
end

while (mem = my_malloc(Megabyte)) && !mem.null?
  nb_mega += 1
end

puts "Allocated memory: #{nb_mega}M"

Then using Pointer(Void).malloc(size):

$ cr eat_memory.cr
GC Warning: Failed to expand heap by 8388608 bytes
GC Warning: Failed to expand heap by 8388608 bytes
GC Warning: Failed to expand heap by 8388608 bytes
GC Warning: Failed to expand heap by 8388608 bytes
GC Warning: Failed to expand heap by 8388608 bytes
GC Warning: Failed to expand heap by 1052672 bytes
GC Warning: Out of Memory! Heap size: 6015 MiB. Returning NULL!
Invalid memory access (signal 11) at address 0x0
... stacktrace ...

And using GC.malloc(size):

$ cr eat_memory.cr direct
GC Warning: Failed to expand heap by 8388608 bytes
GC Warning: Failed to expand heap by 8388608 bytes
GC Warning: Failed to expand heap by 8388608 bytes
GC Warning: Failed to expand heap by 8388608 bytes
GC Warning: Failed to expand heap by 1052672 bytes
GC Warning: Out of Memory! Heap size: 6022 MiB. Returning NULL!
Allocated memory: 5773M

I don't understand why the Pointer(Void).malloc(size) segfault like this, because from my understanding, it calls malloc (a crystal primitive), which during codegen is replaced by a call to __crystal_malloc, which is defined here as LibGC.malloc(size).

So the 2 should be the same (except that using Pointer, there is a check on the negativity of size).

(I tried to debug the segfault using lldb, but it just freeze everything, with no error messages not even from the GC..)

bug compiler

Most helpful comment

Well, every malloc calls should be verified, otherwise the program will segfault, and it's bad. Maybe it should be the responsibility of the __crystal_malloc and __crystal_realloc functions?

All 9 comments

With help from IRC/gitter, here is the segfaulting code: codegen.cr#L1852.
The malloc primitive, after calling __crystal_malloc, initialize the buffer (with memset) to 0, without verifying that the pointer is not 0.

We should add a condition around the memset, at least to check it won't segfault without any possible rescue (as catching segfaults doesn't work atm).

Someone on IRC said that this check would slow down everything, but I really don't think a simple condition (an equality check against 0 is too easy and fast bit-wise) could slow down anything..

@asterite you wrote that code, WDYT?

Well, every malloc calls should be verified, otherwise the program will segfault, and it's bad. Maybe it should be the responsibility of the __crystal_malloc and __crystal_realloc functions?

__crystal_malloc and __crystal_realloc already return a null pointer correctly in this case. I'd argue that Pointer.malloc and #realloc should also return a null pointer. Or at least have a version (.malloc? / #realloc??) which do. It should be possible to recover from or detect a OOM situation in application code without fighting the stdlib.

As @RX14 said:

__crystal_malloc and __crystal_realloc already return a null pointer correctly in this case

I think the only issue here about Pointer.malloc (& Pointer.realloc) is in the generated code for the malloc primitive, where it initialize the malloc-ed buffer to 0 (memset in codegen.cr#L1852), which should be guarded with a If codegen, checking the pointer.

Side note: is there a reason why you can't use calloc or GC_alloc (which zeros the memory anyway)?

@RX14 I don't consider returning a null pointer to be correct. It may be correct in C land, but we're in Crystal land and we expect to be safe.

Developers aren't supposed to be using these functions, so maybe it's not the right place to add checks, but any generated code that uses these functions must check for null pointers anyway, otherwise: segfaults (and attack vectors).

I think raising an OutOfMemory error from __crystal_malloc and __crystal_realloc is the right thing to do. And the performance impact should be negligible thanks to branch prediction of modern processors.

The implementation might be a little bit trickier that it seems. We could try first to create an instance of OutOfMemory with backtrace and everything and if that fails (because there is not even enough memory for that) then throw a pre-allocated instance without backtrace info. Now the real issue is how to "try" to create the error instance without ending in an infinite loop. Doable, but tricky.

I think that it should be possible for carefully constructed code which uses Pointer.malloc to be able to handle low-memory situations gracefully, reporting an error to the user instead of crashing or raising. I think that raising for .new is a great idea, but for Pointer.malloc and such low-level methods, I think you should have enough control to rescue from that situation.

I've made a little _hacky_ test that raises OOMError on low memory, and FatalOOMError on very low memory (I cheated to allow the last one to be raised when less than 1Mb is available, for my test).
My test is available here: https://gist.github.com/bew/20a45cb3c6cbeb3fb88fcf48c383e4ea

The interesting bits are:

class FatalOOMError < Exception
  def initialize
    super "pre-allocated OOM"
  end
end

class OOMError < Exception
  @@oom_pending = false
  @@fatal_oom = FatalOOMError.new # pre-allocated error

  def self.try_new
    if @@oom_pending
      @@oom_pending = false
      return @@fatal_oom
    end

    @@oom_pending = true
    oom = new
    @@oom_pending = false

    oom
  end

  def initialize
    super "dynamically-allocated OOM"
  end
end

fun __crystal_malloc(size : UInt32) : Void*
  ptr = LibGC.malloc size
  if ptr.null?
    raise OOMError.try_new
  end
  ptr
end

I'm sure it's not perfect, but it's maybe a starting point.
I think something like this could be used to handle the infinite loop of raises you mentioned @waj !

Was this page helpful?
0 / 5 - 0 ratings

Related issues

oprypin picture oprypin  路  3Comments

pbrusco picture pbrusco  路  3Comments

asterite picture asterite  路  3Comments

cjgajard picture cjgajard  路  3Comments

costajob picture costajob  路  3Comments