Elasticsearch version (bin/elasticsearch --version): 6.3.1
Plugins installed: [xpack]
JVM version (java -version): 10
OS version (uname -a if on a Unix-like system): centos
Description of the problem including expected versus actual behavior:
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
but my OS has 80 GB of free memory.
I am using the docker.elastic.co/elasticsearch/elasticsearch:6.3.1 image.
JVM config:
-Xms32g
-Xmx32g
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 514880
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 65536
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Provide logs (if relevant):
[67329.555s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
[67329.557s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
[2018-07-12T02:47:53,549][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [e11redis28.mercury.corp] fatal error in thread [elasticsearch[e11redis28.mercury.corp][refresh][T#2]], exiting
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
    at java.lang.Thread.start0(Native Method) ~[?:?]
    at java.lang.Thread.start(Thread.java:813) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:944) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1012) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
    at java.lang.Thread.run(Thread.java:844) [?:?]
@TrumanDu can you please provide stats about that specific JVM, like
the number of threads: ps -o nlwp -p $PID (where $PID is the pid of that JVM; you can find the pid with jps -lvm)
and the memory details of the process: top -p $PID
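For anyone following along: the thread count that ps -o nlwp reports can also be read straight from /proc. This is a minimal sketch, assuming a Linux host with /proc mounted. The function name thread_count is made up for this example, and its optional second argument is a hypothetical override so the parsing can be tried on a saved copy of a status file.

```shell
#!/bin/sh
# thread_count PID [STATUS_FILE]
# Print the number of threads of a process, taken from the "Threads:" line
# of /proc/PID/status (the same number that `ps -o nlwp -p PID` shows).
# STATUS_FILE is a hypothetical override, useful for testing on a saved copy.
thread_count() {
    pid=$1
    status_file=${2:-/proc/$pid/status}
    awk '/^Threads:/ { print $2 }' "$status_file"
}
```

Usage would be thread_count "$PID", with $PID taken from jps -lvm as described above.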
@vladimirdolzhenko thanks for your reply!
top
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9566 1000 20 0 2.624t 0.036t 2.631g S 120.3 29.3 20:04.17 java
ps -o nlwp -p
This is the current situation.
After running for a while, the node goes down.
The data on this node is probably about 2.6 TB.
Pinging @elastic/es-core-infra
I think OutOfMemoryError is a somewhat misleading choice by the JVM in this case. You are hitting a limit on the number of processes that this user is allowed to run (on Linux, each thread counts toward that limit), which is almost certainly unrelated to the amount of free memory in your case. As limits can be configured in a variety of places, it can take a bit of time to find out exactly what is causing it.
A while back we had a similar issue in our CI environment, and our blog post "We are out of memory" provides pointers on what you should check.
That said, this issue is related to the configuration of the environment, so I think we should close it and take further discussion to our Discuss forum.
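As a rough illustration of "a variety of places", the sketch below prints a few Linux settings that can cap thread creation. It is not an exhaustive list and not necessarily what the blog post covers. The function names are invented for this example. The cgroup paths assume the script runs inside the container with the cgroup filesystem mounted at /sys/fs/cgroup; other setups may use different paths.

```shell
#!/bin/sh
# soft_max_processes [LIMITS_FILE]
# Print the soft "Max processes" limit from a /proc/<pid>/limits-style file.
soft_max_processes() {
    awk '/^Max processes/ { print $3 }' "${1:-/proc/self/limits}"
}

# print_if_readable LABEL FILE
# Print "LABEL: <contents>" when FILE exists and is readable; otherwise print nothing.
print_if_readable() {
    if [ -r "$2" ]; then
        printf '%s: %s\n' "$1" "$(cat "$2")"
    fi
}

# thread_limit_report PID
# Collect several limits that apply to the given process (hypothetical helper).
thread_limit_report() {
    pid=$1
    printf 'max user processes (soft): %s\n' "$(soft_max_processes "/proc/$pid/limits")"
    print_if_readable 'kernel.threads-max' /proc/sys/kernel/threads-max
    print_if_readable 'kernel.pid_max' /proc/sys/kernel/pid_max
    # cgroup pids controller: cgroup v2 path first, then a common cgroup v1 path (assumed mounts)
    print_if_readable 'cgroup pids.max' /sys/fs/cgroup/pids.max
    print_if_readable 'cgroup pids.max (v1)' /sys/fs/cgroup/pids/pids.max
}
```

Calling thread_limit_report "$PID" and comparing the values with the thread count of the JVM may show which limit is closest to being hit. Limits set elsewhere, for example by the service manager, would not appear here.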
@vladimirdolzhenko @danielmitterdorfer Thanks for writing this! It helped me debug a similar (but likely unrelated) issue in our app, which uses the ES client. For whatever reason, it had gone berserk during the weekend and spawned 9400 threads, which made new thread creation fail for the same user account.
ps -o nlwp,pid -fe helped me spot this, so I could kill the bad process and get the system back to a usable state. Greatly appreciated!
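A small sketch of the same idea, for spotting the process with the most threads more quickly. The helper name top_by_threads is made up here. It only assumes input rows that start with a thread count, as in the output of ps with nlwp as the first column.

```shell
#!/bin/sh
# top_by_threads [N]
# Read rows of the form "NLWP PID ..." on stdin (a header row is fine) and
# print the N rows with the highest thread count, largest first (default 5).
top_by_threads() {
    # Keep only rows whose first column is a number, then sort numerically, descending.
    awk '$1 ~ /^[0-9]+$/' | sort -k1,1nr | head -n "${1:-5}"
}
```

For example, ps -eo nlwp,pid,user,comm | top_by_threads 10 would list the ten processes with the most threads on the machine.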
Thanks for the kind feedback @perlun and glad to hear this has helped you. :)