Playframework: Play2 OOM not dumping .hprof log

Created on 25 Feb 2014 · 5 comments · Source: playframework/playframework

I'm currently running a Play! 2.2.1 instance on an EC2 micro instance. I've run into an issue where the Linux OOM killer runs and terminates the Play! instance. I expected the process to dump a .hprof file for me to inspect, but this doesn't seem to be the case.

Does Play! require a special JVM option or flag to dump a .hprof file? I'm currently adding -J-XX:+HeapDumpOnOutOfMemoryError, but it's still not writing the file.

Below are more details...

I am packaging my Play! distribution using:

sudo play dist

To start my Play! instance I'm running as follows:

nohup sudo sh ./bin/play-server -mem 512 -J-server -Dconfig.resource=application-prod.conf -Dlogger.resource=application-logger-prod.xml -Dhttp.port=9000 -J-javaagent:/web/newrelic/newrelic.jar -Dnewrelic.bootstrap_classpath=true -J-XX:+HeapDumpOnOutOfMemoryError &

I can see the log from the OOM killer in /var/log/messages:

Feb 24 06:33:55 ip-10-138-72-107 kernel: [562140.124986] java invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Feb 24 06:33:55 ip-10-138-72-107 kernel: [562140.125007] java cpuset=/ mems_allowed=0
Feb 24 06:33:55 ip-10-138-72-107 kernel: [562140.125015] Pid: 26972, comm: java Not tainted 3.4.73-64.112.amzn1.x86_64 #1
Feb 24 06:33:55 ip-10-138-72-107 kernel: [562140.125022] Call Trace:
...
Feb 24 06:33:55 ip-10-138-72-107 kernel: [562140.127374] Out of memory: Kill process 26921 (java) score 891 or sacrifice child
Feb 24 06:33:55 ip-10-138-72-107 kernel: [562140.127383] Killed process 26921 (java) total-vm:2050524kB, anon-rss:558636kB, file-rss:0kB
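For anyone reading a similar log, the final "Killed process" line carries the useful numbers. Here is a minimal Java sketch that pulls out the PID and the resident memory. The class and method names are made up for illustration, and the regex is written only against the line format quoted above, which may differ between kernel versions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: parses the kernel's "Killed process" line as quoted above.
final class OomKillLine {
    // Assumption: the line looks like the 3.4-era kernel output in this issue.
    private static final Pattern KILLED = Pattern.compile(
        "Killed process (\\d+) \\((\\S+)\\) total-vm:(\\d+)kB, anon-rss:(\\d+)kB, file-rss:(\\d+)kB");

    final int pid;
    final String command;
    final long totalVmKb;
    final long anonRssKb;

    private OomKillLine(int pid, String command, long totalVmKb, long anonRssKb) {
        this.pid = pid;
        this.command = command;
        this.totalVmKb = totalVmKb;
        this.anonRssKb = anonRssKb;
    }

    // Returns null when the line is not a "Killed process" line.
    static OomKillLine parse(String line) {
        Matcher m = KILLED.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new OomKillLine(
            Integer.parseInt(m.group(1)),
            m.group(2),
            Long.parseLong(m.group(3)),
            Long.parseLong(m.group(4)));
    }

    public static void main(String[] args) {
        String line = "Feb 24 06:33:55 ip-10-138-72-107 kernel: [562140.127383] "
            + "Killed process 26921 (java) total-vm:2050524kB, anon-rss:558636kB, file-rss:0kB";
        OomKillLine k = parse(line);
        // anon-rss is the memory actually resident in RAM when the process was killed.
        System.out.println(k.command + " pid=" + k.pid + " anon-rss=" + (k.anonRssKb / 1024) + " MB");
    }
}
```

For the line above, anon-rss works out to roughly 545 MB of resident memory for the java process.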

All 5 comments

OOM killer is a bitch. It has nothing to do with the Java heap, so setting heap dump on OOME won't help. Basically, when you ask Linux for memory, it gives you a pointer, even if it doesn't actually have that much memory to give you. It doesn't actually allocate the memory until you use it. This is what lets it do things like have a stack size of 2 megabytes per thread and have 10000 threads, which in theory should take 20GB of memory, even though you don't have that memory on the machine. Most threads don't use anywhere near 2MB of stack, so it never allocates the memory that they don't use.

But then what happens when everything does start to use the memory it was assigned, but no memory is available to be lazily allocated? Enter OOM killer. Linux selects a process to kill that, once killed, will free up enough memory to allocate. Because Java processes use a lot of memory, very often it selects a Java process. It'll send a SIGKILL, which means no core dump and no chance for the JVM to do anything to try to clean up or shut down gracefully.
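The thread-stack arithmetic can be written out as a small Java sketch. The 10000 threads and 2 MB stacks come from the paragraph above. The 64 KB that each thread "actually touches" is an assumed figure chosen only to illustrate the gap between reserved and used memory, and the class name is hypothetical.

```java
// Worked arithmetic for the thread-stack example above.
final class StackReservation {
    // Virtual memory reserved if every thread gets a full-size stack, in MB.
    static long reservedMb(long threads, long stackMbPerThread) {
        return threads * stackMbPerThread;
    }

    // Memory that must really be backed by RAM if each thread only touches
    // touchedKbPerThread of its stack, in MB.
    static long touchedMb(long threads, long touchedKbPerThread) {
        return threads * touchedKbPerThread / 1024;
    }

    public static void main(String[] args) {
        long threads = 10000;
        System.out.println("reserved: " + reservedMb(threads, 2) + " MB");
        // Assumption for illustration only: each thread touches 64 KB of stack.
        System.out.println("touched:  " + touchedMb(threads, 64) + " MB");
    }
}
```

Under that assumption the kernel has promised about 20 GB but only needs to find about 625 MB, which is why overcommitting usually works until the processes start using what they were promised.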

This works well for things like PHP apps where each request corresponds to a process. Kill the biggest Apache worker. Problem solved. For Java apps, it's the stupidest feature ever.

What can you do about this? Not much. There are some things you can tune in the Linux kernel, but you can't turn it off altogether, and everything I've read about it sounds like it's simply not practical to try to get Linux not to do this.

Here's what you need to do:

  • 512 MB is too much. A micro instance only has 600 MB or so all up, the OS needs some of that, and the JVM uses more than just its heap. Set your heap size to 256 MB. This should be fine for a small Play app.
  • Make sure that there is nothing else on the server that could be consuming a lot of memory. If there is, fix it. If you're running Apache or something like that, it's a good place to start looking.
  • I'm not sure if it's possible to add a swap file to an EC2 micro instance. If it is, this will give you a buffer before the OOM killer strikes.
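To make the first point concrete, here is a rough budget in Java. The 600 MB total is the approximate figure quoted above rather than a measured value, and the class name is hypothetical. The result is what remains for the OS and for everything the JVM needs outside the heap (thread stacks, loaded classes, the JVM itself) once the heap has grown to its maximum.

```java
// Rough memory budget for the instance discussed above.
final class MemoryBudget {
    // Memory left over once the heap is fully committed, in MB.
    static long headroomMb(long instanceMb, long maxHeapMb) {
        return instanceMb - maxHeapMb;
    }

    public static void main(String[] args) {
        long instanceMb = 600; // approximate total from the comment, not measured
        System.out.println("heap 512 MB leaves " + headroomMb(instanceMb, 512) + " MB");
        System.out.println("heap 256 MB leaves " + headroomMb(instanceMb, 256) + " MB");
    }
}
```

That is 88 MB of headroom versus 344 MB. With the start command from the question, this would presumably mean passing -mem 256 instead of -mem 512.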

Anyway, this has nothing to do with Play, so I'm closing.

I have just noticed that my Play app goes down from time to time, but I couldn't find any information about why.
Then I noticed the Linux OOM killer too.
Thanks @jroper for that information.
I think this information should be added to the Play User Guide.

Thanks for all the detail in your response. This will definitely help prevent another OOM, but it doesn't settle my concern that there could be something in my code that is consuming too much heap space. I thought a .hprof file would be dumped if I set the -J-XX:+HeapDumpOnOutOfMemoryError parameter. The .hprof file would be really helpful for investigating all the threads and their stack traces at the time the OOM killer struck.

As I said before, the OOM you saw is NOT a Java OOM, it's a Linux thing. Java's OOME and this OOM that you're seeing are completely different. Unless you see in your application logs that Java has thrown an OutOfMemoryError, then you don't have a problem with your program using too much heap.
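One way to tell the two apart is to look at which message shows up in which log. The sketch below is illustrative only: the class name and return values are invented, the kernel marker is the text from the /var/log/messages excerpt in the question, and java.lang.OutOfMemoryError is the class name the JVM reports when the heap really is exhausted.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical classifier for the two unrelated kinds of "out of memory".
final class OomKind {
    // Class name that appears in application logs when the JVM throws an OOME.
    static final String JAVA_OOME = "java.lang.OutOfMemoryError";
    // Kernel message as quoted in the question; wording may vary by kernel version.
    static final String KERNEL_KILL = "Out of memory: Kill process";

    // Returns the kind of the first matching line, or "none" if nothing matches.
    static String classify(List<String> logLines) {
        for (String line : logLines) {
            if (line.contains(JAVA_OOME)) {
                return "java-oome";
            }
            if (line.contains(KERNEL_KILL)) {
                return "kernel-oom-killer";
            }
        }
        return "none";
    }

    public static void main(String[] args) {
        List<String> kernelLog = Arrays.asList(
            "kernel: [562140.127374] Out of memory: Kill process 26921 (java) score 891 or sacrifice child");
        System.out.println(classify(kernelLog));
    }
}
```

Only the "java-oome" case is one where -XX:+HeapDumpOnOutOfMemoryError has a chance to write a .hprof file.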

The thing that is creating too much heap space is that you're configuring the JVM to use 512 MB. The JVM is a bachelor who has just moved out of home: it doesn't clean up the garbage on its heap until the heap is full of garbage. If you tell it to have a maximum heap size of 512 MB, it will use that. Even if it only needs 10 MB, it will keep on increasing the heap size up to the maximum, and only then will it do a GC. And even after it does a garbage collection, from the OS's perspective the JVM is still using that memory. It doesn't release it back to the OS; it hangs on to it. That's the way the JVM heap works: it takes that much memory from the OS and keeps it.

If you want to find out how much heap your application actually uses, connect jconsole to it, go to the memory tab, hit the button to do a full GC, and then see what memory usage the heap goes down to.
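If attaching jconsole is awkward, a rough in-process equivalent can be sketched with the standard java.lang.management API. The class name is hypothetical. MemoryMXBean.gc() is only a request to the JVM, much like System.gc(), so the numbers are an estimate. "used" approximates what the application still holds after collection, while "committed" is what the JVM has taken from the OS.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Minimal sketch: request a GC, then read heap usage from inside the JVM.
final class HeapAfterGc {
    static MemoryUsage heapAfterGc() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        memory.gc(); // a request; effectively the same as System.gc()
        return memory.getHeapMemoryUsage();
    }

    public static void main(String[] args) {
        MemoryUsage heap = heapAfterGc();
        long mb = 1024 * 1024;
        System.out.println("used after GC: " + (heap.getUsed() / mb) + " MB");
        System.out.println("committed:     " + (heap.getCommitted() / mb) + " MB");
        // getMax() returns -1 when no maximum is defined.
        System.out.println("max (bytes):   " + heap.getMax());
    }
}
```

If "used" settles far below the configured maximum, that is a sign the heap setting can be lowered.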

Awesome explanation. I was totally unaware of all the GC complexities. Thanks for this!
