Elasticsearch version (bin/elasticsearch --version): 7.2.0 (official Docker images)
Description of the problem including expected versus actual behavior:
GC and fatal error logs from Elasticsearch in Docker are written inside the container, by default to /usr/share/elasticsearch/logs. I can see the value of keeping these separate from the Elasticsearch logs (including deprecation and access logs), so if this is by design, this enhancement issue would be to document how to override this behaviour with an environment variable.
Steps to reproduce:
Run an Elasticsearch container and observe GC logs in the /usr/share/elasticsearch/logs folder, as configured by the JVM options:
docker run -d --name es720 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.2.0
docker exec -it es720 /bin/bash
[root@7485ae364541 elasticsearch]# ls logs
gc.log gc.log.00
[root@7485ae364541 elasticsearch]# ps aux | more
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
elastic+ 1 42.3 12.6 3700084 1294116 ? Ssl 16:13 0:31 /usr/share/elasticsearch/jdk/bin/java -Xms1g -Xmx1g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.io.tmpdir=/tmp/elasticsearch-4624909543270639519 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=data -XX:ErrorFile=logs/hs_err_pid%p.log -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m -Djava.locale.providers=COMPAT -Des.cgroups.hierarchy.override=/ -Dio.netty.allocator.type=unpooled -XX:MaxDirectMemorySize=536870912 -Des.path.home=/usr/share/elasticsearch -Des.path.conf=/usr/share/elasticsearch/config -Des.distribution.flavor=default -Des.distribution.type=docker -Des.bundled_jdk=true -cp /usr/share/elasticsearch/lib/* org.elasticsearch.bootstrap.Elasticsearch -Ediscovery.type=single-node
How to use stdout
I did not find a way to nullify the -Xlog:gc*,gc+age= option via an environment variable, though I can override -XX:ErrorFile=/dev/stdout via the ES_JAVA_OPTS environment variable.
The solution is therefore to overwrite jvm.options (via a volume mount on the container or a custom Dockerfile). For example, commenting out the lines that write to files, this is how the file looks in 7.2.0 (the JVM outputs GC to stdout by default, so not specifying a location should work):
-Xmx1g
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-Des.networkaddress.cache.ttl=60
-Des.networkaddress.cache.negative.ttl=10
-XX:+AlwaysPreTouch
-Xss1m
-Djava.awt.headless=true
-Dfile.encoding=UTF-8
-Djna.nosys=true
-XX:-OmitStackTraceInFastThrow
-Dio.netty.noUnsafe=true
-Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0
-Dlog4j.shutdownHookEnabled=false
-Dlog4j2.disable.jmx=true
-Djava.io.tmpdir=${ES_TMPDIR}
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=data
## -XX:ErrorFile=logs/hs_err_pid%p.log
-XX:ErrorFile=/dev/stdout
8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
## 8:-Xloggc:logs/gc.log
## 8:-XX:+UseGCLogFileRotation
## 8:-XX:NumberOfGCLogFiles=32
## 8:-XX:GCLogFileSize=64m
## 9-:-Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m
9-:-Djava.locale.providers=COMPAT
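To make the override concrete, here is a sketch of generating such a trimmed jvm.options. The file name custom-jvm.options and the subset of options shown are illustrative, not the full default file; the point is that the 9-: prefixed -Xlog line is omitted, so the JVM falls back to its default GC log destination instead of logs/gc.log, and the fatal error log is pointed at stdout.

```shell
# Write a trimmed jvm.options (illustrative subset; keep whatever
# heap, netty and log4j options you need from the default file).
# Note there is no "9-:-Xlog:gc*..." line, so GC logging is no longer
# forced into logs/gc.log.
cat > custom-jvm.options <<'EOF'
-Xms1g
-Xmx1g
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=data
-XX:ErrorFile=/dev/stdout
9-:-Djava.locale.providers=COMPAT
EOF

# Confirm the fatal error log now targets stdout
grep 'ErrorFile=/dev/stdout' custom-jvm.options
```

The file can then be bind-mounted over the default, e.g. docker run -v "$PWD/custom-jvm.options:/usr/share/elasticsearch/config/jvm.options:ro" ... (the config path as used by the official image).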
Pinging @elastic/es-core-infra
To add a new GC file location via an env var, you can specify -Xlog:gc:<the_path> in ES_JAVA_OPTS,
e.g.
docker run --rm --name es01 -e 'discovery.type=single-node' -e 'ES_JAVA_OPTS=-Xlog:gc*,gc+age=trace,safepoint:file=/var/tmp/gc.log:utctime,pid,tags:filecount=32,filesize=64m' docker.elastic.co/elasticsearch/elasticsearch:7.2.0
$ docker exec -u 1000:0 -ti es01 tail /var/tmp/gc.log
[2019-07-05T07:55:06.146+0000][1][safepoint ] Leaving safepoint region
[2019-07-05T07:55:06.146+0000][1][safepoint ] Total time for which application threads were stopped: 0.0002322 seconds, Stopping threads took: 0.0000546 seconds
[2019-07-05T07:55:06.188+0000][1][safepoint ] Application time: 0.0419649 seconds
[2019-07-05T07:55:06.188+0000][1][safepoint ] Entering safepoint region: BulkRevokeBias
[2019-07-05T07:55:06.188+0000][1][safepoint ] Leaving safepoint region
[2019-07-05T07:55:06.188+0000][1][safepoint ] Total time for which application threads were stopped: 0.0001994 seconds, Stopping threads took: 0.0000521 seconds
[2019-07-05T07:55:06.210+0000][1][safepoint ] Application time: 0.0223417 seconds
[2019-07-05T07:55:06.210+0000][1][safepoint ] Entering safepoint region: CollectForMetadataAllocation
[2019-07-05T07:55:06.211+0000][1][safepoint ] Leaving safepoint region
[2019-07-05T07:55:06.211+0000][1][safepoint ] Total time for which application threads were stopped: 0.0001615 seconds, Stopping threads took: 0.0000414 seconds
Note that the gc.log is also written under the default /usr/share/elasticsearch/logs
so you'd still need to comment out the relevant line in jvm.options
and bind mount the file.
According to JEP 271 and JEP 158 output can be:
output := 'stderr'
        | 'stdout'
        | [file=]<filename>
so in theory one could:
docker run --rm --name es01 -e 'discovery.type=single-node' -e 'ES_JAVA_OPTS=-Xlog:gc*,gc+age=trace,safepoint:file=stdout:utctime,pid,tags:' docker.elastic.co/elasticsearch/elasticsearch:7.2.0
but this doesn't seem to show anything on stdout. I will discuss this with @pgomulka.
However, do note that even if this worked, you'd end up with interleaved JSON-structured output (from ES) and non-JSON output (from GC) on your stdout, so I'm not sure how helpful that is.
The comment by @dliappis is excellent.
docker run --rm --name es01 -e 'discovery.type=single-node' -e 'ES_JAVA_OPTS=-Xlog:gc*,gc+age=trace,safepoint:file=stdout:utctime,pid,tags:' docker.elastic.co/elasticsearch/elasticsearch:7.2.0
This command works, but because file=stdout is specified, you end up with a file literally named stdout in your Elasticsearch directory. If you just want to use that command line and output to standard output, simply remove the file= part:
```
docker run --rm --name es01 -e 'discovery.type=single-node' -e 'ES_JAVA_OPTS=-Xlog:gc*,gc+age=trace,safepoint:stdout:utctime,pid,tags:' docker.elastic.co/elasticsearch/elasticsearch:7.2.0
```
This is because the grammar for output in JEP 158 is defined as
```
output := 'stderr'
        | 'stdout'
        | [file=]<filename>
```
@jguay also bear in mind which Java version is used in the Docker image. These GC logging options changed from Java 8 to 9+.
I also don't know if we can expect any decorators other than [ ] to simplify JSON parsing.
JEP 158 states that this is an open issue, but I couldn't find anything concrete on the mailing lists:
Should decorations be surrounded by something else than [] to make it easier to parse the output?
http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2015-September/014791.html
@dliappis @jguay do you think the solutions suggested in comments here are good enough?
There is no way to format GC logs as JSON, and apart from possibly making them visible on the console interleaved with the JSON logs, there is nothing I can think of to do (bear in mind this would likely break Filebeat).
@pgomulka Your earlier comment solves this issue perfectly IMO. The only question is whether we should document this in the Elasticsearch Docker docs page as it's not very easy to find.
An issue caused without this implemented: GC logs consume disk space, which for smallish filesystems can fill the disk. You may have your data volume mapped to another filesystem, but the GC logs may not be, and would be written to the container filesystem instead.
I'm not OK with this proposition, which isn't working in my case for 7.1.1:
docker run -d -p 9201:9200 -p 9301:9300 -v es-7.1.1-data:/usr/share/elasticsearch/data -e "discovery.type=single-node" -e 'ES_JAVA_OPTS=-Xlog:gc*,gc+age=trace,safepoint:stdout:utctime,pid,tags:' --name es-7.1.1 docker.elastic.co/elasticsearch/elasticsearch-oss:7.1.1
With your proposition, both -Xlog:gc options end up on the command line, and gc.log files are still created:
/usr/share/elasticsearch/jdk/bin/java -Xms1g -Xmx1g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.io.tmpdir=/tmp/elasticsearch-3576427146310754484 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=data -XX:ErrorFile=logs/hs_err_pid%p.log -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m -Djava.locale.providers=COMPAT -Dio.netty.allocator.type=unpooled -Des.cgroups.hierarchy.override=/ -Xlog:gc*,gc+age=trace,safepoint:stdout:utctime,pid,tags: -Des.path.home=/usr/share/elasticsearch -Des.path.conf=/usr/share/elasticsearch/config -Des.distribution.flavor=oss -Des.distribution.type=docker -Des.bundled_jdk=true -cp /usr/share/elasticsearch/lib/* org.elasticsearch.bootstrap.Elasticsearch -Ediscovery.type=single-node
I was able to suppress the normal GC logging with:
ES_JAVA_OPTS="-Xlog:disable -Xlog:gc=debug:stderr"
Essentially, I disabled logging entirely, then re-enabled it pointing to standard error. By doing that, the GC logs are separated from the regular Elasticsearch logs e.g. I could do:
docker logs my-container-id >/dev/null
...and only stderr is displayed. Note that logs/gc.log was still created, but it stayed empty while log information was being written to stderr.
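The stream separation this relies on can be sketched without Docker at all; here emit_logs is a stand-in for the Elasticsearch process, not a real command. Redirecting stdout to /dev/null leaves only stderr visible, which is exactly what docker logs my-container-id >/dev/null does for the container's two streams.

```shell
# Stand-in for an ES process started with
# ES_JAVA_OPTS="-Xlog:disable -Xlog:gc=debug:stderr":
# JSON application logs go to stdout, GC lines to stderr.
emit_logs() {
  echo '{"type":"server","message":"started"}'              # stdout
  echo '[gc] Pause Young (Normal) 12M->4M(64M) 1.2ms' >&2   # stderr
}

# Capture only the GC stream: fd2 is pointed at the current stdout
# (the capture pipe) first, then fd1 is discarded.
gc_only=$( { emit_logs; } 2>&1 >/dev/null )
echo "$gc_only"   # prints only the [gc] line
```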
That being said, I wonder whether we should reconsider the defaults. The default GC settings allow for a maximum of 32 files of at most 64mb, which is potentially 2GB of GC logs. That feels like a lot. We could reduce these values so that we consume less space in the container's filesystem, or we could log only to stderr and let the container runtime / orchestrator worry about what to do with the logs. That's what [nginx] and [apache] do.
@dliappis / @pgomulka what do you think?
+1 with @pugnascotia on the default configuration. It should redirect everything to stderr and let the container runtime handle the logs.
@pugnascotia I like the idea of directing logs to stderr. We should still be able to distinguish the streams of logs: stdout contains logs in JSON format with a type field, so GC being on a different stream would not interfere.
It might be worth discussing changing the default, but using stderr sounds better to me
We discussed this with the team and came to the conclusion that stderr should not be used. It should not be forgotten that stderr is intended for exceptional messages like JVM errors and similar.
The default options, which can end up taking 2GB of disk, are not excessive, and we should stick with them. Users are still able to limit the amount of logs retained by changing the JVM options as per https://openjdk.java.net/jeps/158. If this is a frequent problem, maybe we should document it?
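For reference, retention is controlled by the filecount and filesize parameters of the -Xlog output options mentioned above, so a user who wants far less than the default 32 files of 64m could put something like the following in jvm.options (the values 4 and 16m are illustrative, not a recommendation):

```
9-:-Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=4,filesize=16m
```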
I'll update the documentation
Tackling this through documentation, in my eyes, is a practical solution. Thanks both @pugnascotia and @pgomulka for picking it up.