Crystal: out of memory error when there is tons of free memory

Created on 13 Jun 2018  ·  28Comments  ·  Source: crystal-lang/crystal

Environment: Google Cloud Function / 2 GB in-ram file system / unknown (probably debian) 64 bit linux / crystal binaries called via Node.js.

I'm running into the fabled boehm-gc error:

GC Warning: Out of memory - trying to allocate less
Insufficient memory for GC_all_nils

when running crystal language binaries (or just crystal --version or shards install etc) from within a Google Cloud Function, and when there is tons of free ram (generally 1.7 gigs). GCF uses a 2 GB in-ram file system, meaning if I create a 20 MB file, I am really using up 20 MB of ram, etc. Running crystal --version should not require 2 GB of ram, so something fishy is going on. I've checked the output of free -h right before I run crystal --version and indeed there are like 1.8 gigs of free ram.

For some reason no matter what I do, boehm-gc is throwing this error via crystal. I am wondering if maybe this is a bug?

I can run binaries or code you give me to try to debug this in GCF.

I also posted this on the boehm-gc repo: https://github.com/ivmai/bdwgc/issues/225

Most helpful comment

All 28 comments

For anyone wondering why I care about this: I am working on a shard that will automatically package and deploy a crystal app into a google cloud function. Right now I have this working if I do the compilation locally on my linux machine, however I would like to compile binaries in Google Cloud Function so they are completely optimized for that runtime environment (and for OSX users).

If your're compiling, it might be very likely that the compiler consumes this 1.8 GB. Before jumping to conclusions that this is must be a bug, you should verify the actual memory usage while the program runs, not just before you start it. You could either poll free or add --stats to the compiler command.

@straight-shoota please read the issue description. Running any crystal binary such as crystal --version or shards --version triggers the issue! It sounds a lot like BDW is somehow incapable to detect the actual available memory.

This was in response to the last comment which talked explicitly about compiling.
Anyways it would be good to know the actual memory consumption, be it for compiling or crystal --version.

unfortunately the best I can get is the memory consumption of all processes and the total available for the system immediately before I run crystal --version and immediately after crystal --version crashes with the GC error. (the whole system is typically using something like 200 mb of ram at that point). I would second that this sounds like BDW is incorrectly estimating the actual available memory.

@sam0x17 could you run the build command with --stats option? That will show some progress and memory consumption in each stage.

I don't fully understand what code you are trying to compile and then run in GCF. How huge it is? Is it shareable?

@bcardiff I will do that and get back to you. edit: I actually can't because crystal crashes regardless of whether I am compiling or not, so --stats is irrelevant

At the moment I'm not even trying to compile anything -- just running the pre-compiled crystal binary and doing crystal --version, or shards install, etc.. The next step is compiling a tiny hello world program (e.g. puts "hello world"). This is definitely something to do with BDW and not the size of what I'm trying to do.

The folks over at BDW have replied as follows, so I will do some research on nmap etc and see if I can debug the problem that way:

This means GC cannot grab memory from OS (GC uses mmap or sbrk to get more memory depending on the config).

My advice is to write a program in C that links to bdw-gc and run that. If that fails, it's a problem with bdw-gc and not Crystal.

bdwgc doesn't "estimate the available memory", it asks for what it needs with mmap and clearly recieves an error.

I'd be wary of the output of free here: can you post the full output of free? Is free reporting the memory used by the in-ram filesystem properly? Perhaps the process is running in a memory cgroup and the amount of available memory isn't visible to the process.

all right guys here is the output from a full run of the cloud function so you can better see what's going on. I put in some free -h calls before and after every operation that _could_ have some effect on memory usage, along with a command that prints all running processes and their memory usage (the code for this function is as follows):

function procMemUsage() {
  cmd('free -h');
  cmd('ps aux  | awk \'{print $6/1024 " MB\t\t" $11}\'  | sort -n');
}

Here is the full output (scroll to the right):

D      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:38.434  Function execution started
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:38.492  $ free -h
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:38.523               total       used       free     shared    buffers     cached
                                                                      Mem:          2.0G        79M       1.9G         0B         0B        57M
                                                                      -/+ buffers/cache:        22M       2.0G
                                                                      Swap:           0B         0B         0B
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:38.523  $ ps aux  | awk '{print $6/1024 " MB      " $11}'  | sort -n
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:38.567  0 MB      COMMAND
                                                                      0.769531 MB       /bin/sh
                                                                      1.51172 MB        awk
                                                                      2.40625 MB        /bin/sh
                                                                      3.21484 MB        ps
                                                                      4.29297 MB        /bin/bash
                                                                      57.1875 MB        /nodejs/bin/node
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:38.567  preparing to compile...
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:38.578
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:38.578  injecting pcre into environment...
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:38.578  $ tar -xzf ./pcre.tar.gz -C /tmp
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:38.761  added /tmp/pcre/bin to PATH
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:38.773  added /tmp/pcre/lib to LD_LIBRARY_PATH
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:38.782  added /tmp/pcre/lib to LIBRARY_PATH
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:38.792  added /tmp/pcre/include to C_INCLUDE_PATH
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:38.801  added /tmp/pcre/include to CPP_INCLUDE_PATH
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:38.801
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:38.801  injecting libevent into environment...
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:38.801  $ tar -xzf ./libevent.tar.gz -C /tmp
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:39.076  added /tmp/libevent/bin to PATH
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:39.090  added /tmp/libevent/lib to LD_LIBRARY_PATH
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:39.115  added /tmp/libevent/lib to LIBRARY_PATH
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:39.129  added /tmp/libevent/include to C_INCLUDE_PATH
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:39.143  added /tmp/libevent/include to CPP_INCLUDE_PATH
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:39.143  $ free -h
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:39.210               total       used       free     shared    buffers     cached
                                                                      Mem:          2.0G        91M       1.9G        12M         0B        70M
                                                                      -/+ buffers/cache:        21M       2.0G
                                                                      Swap:           0B         0B         0B
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:39.210  $ ps aux  | awk '{print $6/1024 " MB      " $11}'  | sort -n
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:39.268  0 MB      COMMAND
                                                                      0.4375 MB     awk
                                                                      1.74609 MB        /bin/sh
                                                                      3.14453 MB        ps
                                                                      3.38281 MB        /bin/sh
                                                                      4.29297 MB        /bin/bash
                                                                      58.1875 MB        /nodejs/bin/node
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:39.268
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:39.272  crystal version not specified, finding latest version...
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:39.813  latest version: 0.24.2
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:39.813
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:39.813  downloading linux x64 binary for Crystal 0.24.2
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:40.566  $ curl -O -L --silent https://github.com/crystal-lang/crystal/releases/download/0.24.2/crystal-0.24.2-1-linux-x86_64.tar.gz
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:46.372
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:46.390  extracting "crystal-0.24.2-1-linux-x86_64.tar.gz"...
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:46.390  $ tar -xzf *.tar.gz
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:48.840
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:48.840  $ free -h
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:48.867               total       used       free     shared    buffers     cached
                                                                      Mem:          2.0G       273M       1.7G       194M         0B       252M
                                                                      -/+ buffers/cache:        21M       2.0G
                                                                      Swap:           0B         0B         0B
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:48.867  $ ps aux  | awk '{print $6/1024 " MB      " $11}'  | sort -n
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:48.935  0 MB      COMMAND
                                                                      1.82422 MB        awk
                                                                      1.87891 MB        /bin/sh
                                                                      3.33984 MB        ps
                                                                      3.51562 MB        /bin/sh
                                                                      4.29297 MB        /bin/bash
                                                                      58.1875 MB        /nodejs/bin/node
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:48.947  added /tmp/tmp-2UghNIpd6dGe8/crystal-0.24.2/bin to PATH
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:48.947
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:48.947  testing that crystal binary is installed and runnable...
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:48.947  $ free -h
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:48.962               total       used       free     shared    buffers     cached
                                                                      Mem:          2.0G       274M       1.7G       194M         0B       252M
                                                                      -/+ buffers/cache:        21M       2.0G
                                                                      Swap:           0B         0B         0B
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:48.962  $ ps aux  | awk '{print $6/1024 " MB      " $11}'  | sort -n
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:49.043  0 MB      COMMAND
                                                                      2.05859 MB        awk
                                                                      2.05859 MB        sort
                                                                      2.98438 MB        /bin/sh
                                                                      3.02734 MB        ps
                                                                      4.29297 MB        /bin/bash
                                                                      58.1875 MB        /nodejs/bin/node
I      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:49.043  $ crystal --version
E      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:49.155  Error: Command failed: crystal --version
                                                                      GC Warning: Out of memory - trying to allocate less
                                                                      Insufficient memory for GC_all_nils

                                                                          at checkExecSyncError (child_process.js:473:13)
                                                                          at Object.execSync (child_process.js:513:13)
                                                                          at silentCmd (/user_code/compile.js:15:24)
                                                                          at cmd (/user_code/compile.js:20:13)
                                                                          at compile (/user_code/compile.js:95:14)
                                                                          at exports.init (/user_code/compile.js:128:3)
                                                                          at /var/tmp/worker/worker.js:684:7
                                                                          at /var/tmp/worker/worker.js:668:9
                                                                          at _combinedTickCallback (internal/process/next_tick.js:73:7)
                                                                          at process._tickDomainCallback (internal/process/next_tick.js:128:9)
D      compile-crystal-zd08sy  l7rsrnwtot6j  2018-06-14 00:01:49.167  Function execution took 10735 ms, finished with status: 'crash'

Reading the bdwgc, allocating GC_all_nils is pretty much the first thing the GC does, so it really does point to it being an interaction between bdwgc and the environment. Getting a strace of crystal would be useful.

@RX14 ok I'll try to get strace to compile statically so I can run it in the cloud function and get back to you

@sam0x17 looks like one for the bdwgc folks, you should post your strace on that issue.

@sam0x17 But please upload the strace somewhere (gist or whatever) and post a link in the issue. This is way to long for a comment.

@straight-shoota @RX14 changed to a pastebin: https://pastebin.com/hqyuZgS8

Interestingly, I did an experiment filling an array with tons of random ints to see if I could get a statically compiled crystal binary to crash due to dynamic memory allocation, and it actually does not crash. So either this is a problem that affects crystal and shards but not crystal programs themselves, or something about how I set up my environment is different that is making things work. For my purposes this is enough to accomplish what I was trying to accomplish, but it may be of interest to continue to investigate why these memory errors were appearing before.

Reproducible example with directions is here: https://github.com/sam0x17/dynamic_memory_test
TLDR: clone the repo and run the build_deploy script, assuming that you already have gcloud installed and logged in via gcloud init and you already have docker set up to run as a non root user.

https://github.com/sam0x17/dynamic_memory_test

@RX14 @straight-shoota

I'm going to call this closed at this point. There is really no reason to be compiling in a cloud function now that we have confirmed that static binaries work, and that is the only reason anyone would ever encounter this error based on my experiment.

This error might as well affect any Crystal binary, so I'd keep it open. It's still not clear how this bug works.

@straight-shoota It's a problem with bdw-gc and that particular cloud. Crystal has nothing to do here...

Until we either:

  • find out the cause or
  • get a report that isn't in a weird environment
    there's not much to do.

As a matter of fact, can you please run df -h before you run crystal? Your tmpfs might be full (and tmpfs counts out of ram - and is fairly invisible, I hit this myself recently).

@RX14 agree that we should close this unless it becomes a problem for someone else.

That said, df -h shows that the tmpfs had plenty of free space, oddly.

Based on the strace, GCF does not support mmap(/dev/zero). So, I will change to use mmap(MAP_ANONYMOUS) instead for this target. Could you please let me know how the BDWGC was built (I mean configure/make options)? Is the target host (GCF) is Linux/amd64?

Could you please let me know how the BDWGC was built (I mean configure/make options)? Is the target host (GCF) is Linux/amd64?

@RX14 @straight-shoota maybe you guys might be able to shed some light on this? I do not know where the bdwgc compilation flags exist in this project

FWIW I'm also seeing this when running the Docker image via rootless podman:

 ✘ ryan@fedora  ~/code/kgk   master ●  podman run --net host --rm crystallang/crystal crystal
GC Warning: Out of memory - trying to allocate less
Insufficient memory for GC_all_nils
 ✘ ryan@fedora  ~/code/kgk   master ●  

Might be a podman bug but still something worth noting.

I'm also seeing this when running the Docker image via rootless podman:
GC Warning: Out of memory - trying to allocate less
Insufficient memory for GC_all_nils
Might be a podman bug but still something worth noting.

@kirbyfan64, If possible, please recheck it with the above fix (https://github.com/ivmai/bdwgc/commit/eed796d6d6fda97354ccbe6a39dce5506d9c6762)

Was this page helpful?
0 / 5 - 0 ratings