What did you do?
Deployed cluster using manifests:
What did you expect to see?
A stable cluster with no failing pods while there is no load on the Elasticsearch cluster
What did you see instead? Under which circumstances?
After ~5-10 minutes of running the cluster with no load, pods started to fail: some were OOMKilled, others went into CrashLoopBackOff
Environment
On top of AWS, built with kops
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.2", GitCommit:"66049e3b21efe110454d67df4fa62b08ea79a19b", GitTreeState:"clean", BuildDate:"2019-05-16T18:55:03Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.9", GitCommit:"16236ce91790d4c75b79f6ce96841db1c843e7d2", GitTreeState:"clean", BuildDate:"2019-03-25T06:30:48Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}
apiVersion: elasticsearch.k8s.elastic.co/v1alpha1
kind: Elasticsearch
metadata:
name: es-poc
labels:
controller-tools.k8s.io: "1.0"
app: elasticsearch
component: elasticsearch-cluster
release: poc-1.0.0
spec:
# Cluster setup
version: 7.1.0
nodes:
# Master Nodes
- nodeCount: 3
config:
node.master: true
node.data: false
node.ingest: false
node.ml: false
cluster.remote.connect: false
xpack.monitoring.collection.enabled: true
podTemplate:
spec:
serviceAccountName: elasticsearch
containers:
- name: elasticsearch
resources:
requests:
memory: "4Gi"
meta:
labels:
app: elasticsearch
component: master-node
release: poc-1.0.0
# ----------------
# Coordinating Nodes
- nodeCount: 3
config:
node.master: false
node.data: false
node.ingest: false
node.ml: false
xpack.monitoring.collection.enabled: true
podTemplate:
spec:
serviceAccountName: elasticsearch
containers:
- name: elasticsearch
resources:
requests:
memory: "4Gi"
meta:
labels:
app: elasticsearch
component: coordinating-node
release: poc-1.0.0
# ----------------
# Machine Learning Nodes
- nodeCount: 1
config:
node.master: false
node.data: false
node.ingest: false
node.ml: true
xpack.monitoring.collection.enabled: true
podTemplate:
spec:
serviceAccountName: elasticsearch
containers:
- name: elasticsearch
resources:
requests:
memory: "8Gi"
meta:
labels:
app: elasticsearch
component: ml-node
release: poc-1.0.0
# ----------------
# Data Ingest Nodes
- nodeCount: 3
config:
node.master: false
node.data: true
node.ingest: true
node.ml: false
xpack.monitoring.collection.enabled: true
podTemplate:
spec:
serviceAccountName: elasticsearch
containers:
- name: elasticsearch
resources:
requests:
memory: "8Gi"
meta:
labels:
app: elasticsearch
component: data-node
release: poc-1.0.0
{"level":"error","ts":1560497089.0797975,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"elasticsearch-controller","request":"dev-telemetry/es-poc","error":"Get http://100.106.0.7:8001/csr: dial tcp 100.106.0.7:8001: connect: no route to host","errorCauses":[{"error":"Get http://100.106.0.7:8001/csr: dial tcp 100.106.0.7:8001: connect: no route to host"}],"stacktrace":"github.com/elastic/cloud-on-k8s/operators/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:217\ngithub.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}
{"level":"error","ts":1560497132.3934348,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"elasticsearch-controller","request":"dev-telemetry/es-poc","error":"unable to delete /_cluster/voting_config_exclusions: 503 Service Unavailable: ","errorCauses":[{"error":"unable to delete /_cluster/voting_config_exclusions: 503 Service Unavailable: unknown","errorVerbose":"503 Service Unavailable: unknown\nunable to delete /_cluster/voting_config_exclusions\ngithub.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/client.(*clientV7).DeleteVotingConfigExclusions\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/client/v7.go:53\ngithub.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/version/version7.UpdateZen2Settings\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/version/version7/zen2.go:36\ngithub.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/driver.(*defaultDriver).Reconcile\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/driver/default.go:397\ngithub.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch.(*ReconcileElasticsearch).internalReconcile\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/elasticsearch_controller.go:283\ngithub.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch.(*ReconcileElasticsearch).Reconcile\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/elasticsearch_controller.go:228\ngithub.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:215\ngithub.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/e
lastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1333"}],"stacktrace":"github.com/elastic/cloud-on-k8s/operators/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:217\ngithub.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pk
g/util/wait.Until\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}
{"level":"error","ts":1560498437.5513227,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"elasticsearch-controller","request":"dev-telemetry/es-poc","error":"unable to delete /_cluster/voting_config_exclusions: Delete https://es-poc-es.dev-telemetry.svc.cluster.local:9200/_cluster/voting_config_exclusions?wait_for_removal=false: EOF","errorCauses":[{"error":"unable to delete /_cluster/voting_config_exclusions: Delete https://es-poc-es.dev-telemetry.svc.cluster.local:9200/_cluster/voting_config_exclusions?wait_for_removal=false: EOF","errorVerbose":"Delete https://es-poc-es.dev-telemetry.svc.cluster.local:9200/_cluster/voting_config_exclusions?wait_for_removal=false: EOF\nunable to delete /_cluster/voting_config_exclusions\ngithub.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/client.(*clientV7).DeleteVotingConfigExclusions\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/client/v7.go:53\ngithub.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/version/version7.UpdateZen2Settings\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/version/version7/zen2.go:36\ngithub.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/driver.(*defaultDriver).Reconcile\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/driver/default.go:397\ngithub.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch.(*ReconcileElasticsearch).internalReconcile\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/elasticsearch_controller.go:283\ngithub.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch.(*ReconcileElasticsearch).Reconcile\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/elasticsearch_controller.go:228\ngithub.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/
src/github.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:215\ngithub.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1333"}],"stacktrace":"github.com/elastic/cloud-on-k8s/operators/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:217\ngithub.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.g
o:133\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}
{"level":"error","ts":1560847166.8442552,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"elasticsearch-controller","request":"dev-telemetry/es-poc","error":"unable to delete /_cluster/voting_config_exclusions: Delete https://es-poc-es.dev-telemetry.svc.cluster.local:9200/_cluster/voting_config_exclusions?wait_for_removal=false: EOF","errorCauses":[{"error":"unable to delete /_cluster/voting_config_exclusions: Delete https://es-poc-es.dev-telemetry.svc.cluster.local:9200/_cluster/voting_config_exclusions?wait_for_removal=false: EOF","errorVerbose":"Delete https://es-poc-es.dev-telemetry.svc.cluster.local:9200/_cluster/voting_config_exclusions?wait_for_removal=false: EOF\nunable to delete /_cluster/voting_config_exclusions\ngithub.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/client.(*clientV7).DeleteVotingConfigExclusions\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/client/v7.go:53\ngithub.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/version/version7.UpdateZen2Settings\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/version/version7/zen2.go:36\ngithub.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/driver.(*defaultDriver).Reconcile\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/driver/default.go:397\ngithub.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch.(*ReconcileElasticsearch).internalReconcile\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/elasticsearch_controller.go:283\ngithub.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch.(*ReconcileElasticsearch).Reconcile\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/pkg/controller/elasticsearch/elasticsearch_controller.go:228\ngithub.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/
src/github.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:215\ngithub.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1333"}],"stacktrace":"github.com/elastic/cloud-on-k8s/operators/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:217\ngithub.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.g
o:133\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/elastic/cloud-on-k8s/operators/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}
Node logs
Exception in thread "main" org.elasticsearch.bootstrap.BootstrapException: java.nio.file.FileAlreadyExistsException: /usr/share/elasticsearch/config/elasticsearch.keystore.tmp
Likely root cause: java.nio.file.FileAlreadyExistsException: /usr/share/elasticsearch/config/elasticsearch.keystore.tmp
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:94)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:219)
at java.base/java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:478)
at java.base/java.nio.file.Files.newOutputStream(Files.java:222)
at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:411)
at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:407)
at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:255)
at org.elasticsearch.common.settings.KeyStoreWrapper.save(KeyStoreWrapper.java:462)
at org.elasticsearch.bootstrap.Bootstrap.loadSecureSettings(Bootstrap.java:232)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:289)
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159)
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150)
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124)
at org.elasticsearch.cli.Command.main(Command.java:90)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:115)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92)
Refer to the log for complete error details.
{"level":"info","ts":1560847565.3037622,"logger":"keystore-updater","msg":"Deleted keystore","keystore-path":"/usr/share/elasticsearch/config/elasticsearch.keystore"}
{"level":"info","ts":1560847565.3038428,"logger":"keystore-updater","msg":"Creating keystore","keystore-path":"/usr/share/elasticsearch/config/elasticsearch.keystore"}
{"type": "server", "timestamp": "2019-06-18T08:46:07,824+0000", "level": "INFO", "component": "o.e.x.m.p.NativeController", "cluster.name": "es-poc", "node.name": "es-poc-es-ss7pqk8x8n", "message": "Native controller process has stopped - no new native processes can be started" }
{"level":"info","ts":1560847568.4866347,"logger":"process-manager","msg":"Update process state","action":"terminate","id":"es","state":"failed","pid":15}
{"level":"info","ts":1560847568.5039022,"logger":"process-manager","msg":"HTTP server closed"}
{"level":"info","ts":1560847568.5046427,"logger":"process-manager","msg":"Exit","reason":"process failed","code":-1}
Limits and requests were increased to try to resolve this:
# Master Nodes
- nodeCount: 3
  config:
    node.master: true
    node.data: false
    node.ingest: false
    node.ml: false
    cluster.remote.connect: false
    xpack.monitoring.collection.enabled: true
  podTemplate:
    spec:
      containers:
      - name: elasticsearch
        resources:
          requests:
            memory: "4096Mi"
          limits:
            memory: "4096Mi"
# -------------
# Data Ingest Nodes
- nodeCount: 3
  config:
    node.master: false
    node.data: true
    node.ingest: true
    node.ml: false
    xpack.monitoring.collection.enabled: true
  podTemplate:
    spec:
      containers:
      - name: elasticsearch
        resources:
          requests:
            memory: "4096Mi"
          limits:
            memory: "4096Mi"
With the new config, the cluster still crashes: pods get OOMKilled and then go into CrashLoopBackOff after 45-60 minutes
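When comparing the memory values across the two manifests, it may help to normalize the quantities to bytes first. The sketch below is only an illustration: `memory_to_bytes` is a hypothetical helper, not part of any Kubernetes client, and it assumes plain integers with the binary suffixes Ki/Mi/Gi/Ti only.

```python
# Binary suffixes used by Kubernetes memory quantities (powers of 1024).
_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def memory_to_bytes(quantity: str) -> int:
    """Convert a quantity such as "4096Mi" or "8Gi" to bytes.

    Assumption: only integer values with Ki/Mi/Gi/Ti (or no suffix) are handled.
    """
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: already a byte count


print(memory_to_bytes("4Gi") == memory_to_bytes("4096Mi"))  # True
print(memory_to_bytes("8Gi") // memory_to_bytes("4096Mi"))  # 2
```

Read this way, 4096Mi is the same amount as the 4Gi in the first manifest, and half of the 8Gi requested there for the data nodes.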
@iekulyk thanks for reporting this.
It looks like there are 2 different types of problems here: OOM kill on one hand, CrashLoopBackOff on the other hand.
Regarding the OOM Kill, could you share the last logs (kubectl logs <pod_name>) and spec (kubectl get pod <pod_name>) of a pod that was killed? I would like to double-check it correctly has the 4096Mi memory limit applied.
Btw, only resource limits can be applied on release 0.8; release 0.9 will also allow you to set resource requests.
For pods in a CrashLoopBackOff state, do you observe the same logs as before (Exception in thread "main" org.elasticsearch.bootstrap.BootstrapException: java.nio.file.FileAlreadyExistsException: /usr/share/elasticsearch/config/elasticsearch.keystore.tmp)?
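One possible way to gather the two facts asked for above (the applied memory limit and why the container last stopped) is to read them from the pod object. This is a rough sketch, not an official tool: `summarize_pod` and `fetch_pod` are hypothetical helper names, the container name `elasticsearch` is taken from the manifest, and `kubectl` is assumed to be on the PATH and pointed at the right cluster.

```python
import json
import subprocess


def summarize_pod(pod: dict, container_name: str = "elasticsearch") -> dict:
    """Return the memory limit and last termination reason of one container."""
    limit = None
    for container in pod.get("spec", {}).get("containers", []):
        if container.get("name") == container_name:
            limit = container.get("resources", {}).get("limits", {}).get("memory")
    reason = None
    for status in pod.get("status", {}).get("containerStatuses", []):
        if status.get("name") == container_name:
            reason = status.get("lastState", {}).get("terminated", {}).get("reason")
    return {"memory_limit": limit, "last_termination_reason": reason}


def fetch_pod(pod_name: str, namespace: str) -> dict:
    """Fetch the pod object via `kubectl get pod <name> -n <ns> -o json`."""
    out = subprocess.run(
        ["kubectl", "get", "pod", pod_name, "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)
```

Usage would look like `summarize_pod(fetch_pod("<pod_name>", "dev-telemetry"))`, where the namespace is the one that appears in the operator log; a pod killed for memory would be expected to report `OOMKilled` as the reason.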
@sebgl, thanks for the response.
I've reduced the cluster size:
2 data nodes, 1 master.
Currently 2 nodes are in CrashLoopBackOff.
{"level":"info","ts":1560932013.8047047,"logger":"process-manager","msg":"Update process state","action":"initialization","id":"es","state":"failed","pid":0}
{"level":"info","ts":1560932013.80478,"logger":"process-manager","msg":"Starting..."}
{"level":"info","ts":1560932013.8063214,"logger":"process-manager","msg":"Update process state","action":"start","id":"es","state":"started","pid":14}
{"level":"info","ts":1560932013.806349,"logger":"process-manager","msg":"Started"}
{"level":"info","ts":1560932013.8104854,"logger":"keystore-updater","msg":"Waiting for Elasticsearch to be ready"}
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
{"type": "server", "timestamp": "2019-06-19T08:13:36,516+0000", "level": "INFO", "component": "o.e.e.NodeEnvironment", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/nvme0n1p1)]], net usable_space [123.2gb], net total_space [127.9gb], types [xfs]" }
{"type": "server", "timestamp": "2019-06-19T08:13:36,520+0000", "level": "INFO", "component": "o.e.e.NodeEnvironment", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "heap size [3.9gb], compressed ordinary object pointers [true]" }
{"type": "server", "timestamp": "2019-06-19T08:13:36,529+0000", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "node name [es-poc-es-zlqnp85q6m], node ID [75Pl3S5SQgWyVAgcNCvCtg], cluster name [es-poc]" }
{"type": "server", "timestamp": "2019-06-19T08:13:36,529+0000", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "version[7.1.0], pid[14], build[default/docker/606a173/2019-05-16T00:43:15.323135Z], OS[Linux/3.10.0-862.14.4.el7.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/12.0.1/12.0.1+12]" }
{"type": "server", "timestamp": "2019-06-19T08:13:36,530+0000", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "JVM home [/usr/share/elasticsearch/jdk]" }
{"type": "server", "timestamp": "2019-06-19T08:13:36,530+0000", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch-16847378613771449580, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Djava.locale.providers=COMPAT, -Dio.netty.allocator.type=unpooled, -Des.cgroups.hierarchy.override=/, -Xms4096M, -Xmx4096M, -Djava.security.properties=/usr/share/elasticsearch/config/managed/security.properties, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/usr/share/elasticsearch/config, -Des.distribution.flavor=default, -Des.distribution.type=docker, -Des.bundled_jdk=true]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,335+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [aggs-matrix-stats]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,336+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [analysis-common]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,336+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [ingest-common]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,336+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [ingest-geoip]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,336+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [ingest-user-agent]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,337+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [lang-expression]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,337+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [lang-mustache]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,337+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [lang-painless]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,337+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [mapper-extras]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,337+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [parent-join]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,338+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [percolator]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,338+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [rank-eval]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,338+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [reindex]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,338+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [repository-url]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,338+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [transport-netty4]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,339+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [x-pack-ccr]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,339+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [x-pack-core]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,339+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [x-pack-deprecation]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,339+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [x-pack-graph]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,339+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [x-pack-ilm]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,340+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [x-pack-logstash]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,340+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [x-pack-ml]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,340+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [x-pack-monitoring]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,340+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [x-pack-rollup]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,340+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [x-pack-security]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,341+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [x-pack-sql]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,341+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded module [x-pack-watcher]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,341+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded plugin [repository-gcs]" }
{"type": "server", "timestamp": "2019-06-19T08:13:39,342+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "loaded plugin [repository-s3]" }
{"type": "deprecation", "timestamp": "2019-06-19T08:13:41,496+0000", "level": "WARN", "component": "o.e.d.c.s.Settings", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "[discovery.zen.minimum_master_nodes] setting was deprecated in Elasticsearch and will be removed in a future release! See the breaking changes documentation for the next major version." }
{"type": "deprecation", "timestamp": "2019-06-19T08:13:41,499+0000", "level": "WARN", "component": "o.e.d.c.s.Settings", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "[discovery.zen.hosts_provider] setting was deprecated in Elasticsearch and will be removed in a future release! See the breaking changes documentation for the next major version." }
{"type": "server", "timestamp": "2019-06-19T08:13:42,963+0000", "level": "INFO", "component": "o.e.x.s.a.s.FileRolesStore", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "parsed [2] roles from file [/usr/share/elasticsearch/config/roles.yml]" }
{"type": "server", "timestamp": "2019-06-19T08:13:43,349+0000", "level": "INFO", "component": "o.e.x.m.p.l.CppLogMessageHandler", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "[controller/90] [Main.cc@109] controller (64 bit): Version 7.1.0 (Build a8ee6de8087169) Copyright (c) 2019 Elasticsearch BV" }
{"type": "server", "timestamp": "2019-06-19T08:13:43,713+0000", "level": "DEBUG", "component": "o.e.a.ActionModule", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "Using REST wrapper from plugin org.elasticsearch.xpack.security.Security" }
{"level":"info","ts":1560932023.8366818,"logger":"keystore-updater","msg":"Waiting for Elasticsearch to be ready"}
{"type": "server", "timestamp": "2019-06-19T08:13:43,967+0000", "level": "INFO", "component": "o.e.d.DiscoveryModule", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "using discovery type [zen] and seed hosts providers [settings, file]" }
{"type": "server", "timestamp": "2019-06-19T08:13:44,645+0000", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "initialized" }
{"type": "server", "timestamp": "2019-06-19T08:13:44,645+0000", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "starting ..." }
{"type": "server", "timestamp": "2019-06-19T08:13:45,070+0000", "level": "INFO", "component": "o.e.x.m.p.NativeController", "cluster.name": "es-poc", "node.name": "es-poc-es-zlqnp85q6m", "message": "Native controller process has stopped - no new native processes can be started" }
{"level":"info","ts":1560932025.370031,"logger":"process-manager","msg":"Update process state","action":"terminate","id":"es","state":"failed","pid":14}
{"level":"info","ts":1560932025.3862207,"logger":"process-manager","msg":"HTTP server closed"}
{"level":"info","ts":1560932025.3861277,"logger":"process-manager","msg":"Exit","reason":"process failed","code":-1}
apiVersion: v1
kind: Pod
metadata:
  annotations:
    update.k8s.elastic.co/timestamp: "2019-06-19T04:08:29.370994917Z"
  creationTimestamp: "2019-06-18T15:22:22Z"
  labels:
    common.k8s.elastic.co/type: elasticsearch
    elasticsearch.k8s.elastic.co/cluster-name: es-poc
    elasticsearch.k8s.elastic.co/node-data: "true"
    elasticsearch.k8s.elastic.co/node-ingest: "true"
    elasticsearch.k8s.elastic.co/node-master: "false"
    elasticsearch.k8s.elastic.co/node-ml: "true"
    elasticsearch.k8s.elastic.co/pod-name: es-poc-es-zlqnp85q6m
    elasticsearch.k8s.elastic.co/version: 7.1.0
  name: es-poc-es-zlqnp85q6m
  namespace: dev-telemetry
  ownerReferences:
  - apiVersion: elasticsearch.k8s.elastic.co/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Elasticsearch
    name: es-poc
    uid: dd4bc77e-91dc-11e9-8bac-0ae689a1e762
  resourceVersion: "113279087"
  selfLink: /api/v1/namespaces/dev-telemetry/pods/es-poc-es-zlqnp85q6m
  uid: de88877e-91dc-11e9-8bac-0ae689a1e762
spec:
  automountServiceAccountToken: false
  containers:
  - command:
    - /mnt/elastic/process-manager/process-manager
    env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: POD_IP
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: status.podIP
    - name: ES_JAVA_OPTS
      value: -Xms4096M -Xmx4096M -Djava.security.properties=/usr/share/elasticsearch/config/managed/security.properties
    - name: READINESS_PROBE_PROTOCOL
      value: https
    - name: PROBE_USERNAME
      value: elastic-internal-probe
    - name: PROBE_PASSWORD_FILE
      value: /mnt/elastic/probe-user/elastic-internal-probe
    - name: PM_PROC_NAME
      value: es
    - name: PM_PROC_CMD
      value: /usr/local/bin/docker-entrypoint.sh
    - name: PM_TLS
      value: "true"
    - name: PM_CERT_PATH
      value: /usr/share/elasticsearch/config/node-certs/cert.pem
    - name: PM_KEY_PATH
      value: /usr/share/elasticsearch/config/private-key/node.key
    - name: KEYSTORE_SOURCE_DIR
      value: /mnt/elastic/secure-settings
    - name: KEYSTORE_RELOAD_CREDENTIALS
      value: "true"
    - name: KEYSTORE_ES_USERNAME
      value: elastic-internal-reload-creds
    - name: KEYSTORE_ES_PASSWORD_FILE
      value: /mnt/elastic/reload-creds-user/elastic-internal-reload-creds
    - name: KEYSTORE_ES_CA_CERTS_PATH
      value: /usr/share/elasticsearch/config/node-certs/ca.pem
    - name: KEYSTORE_ES_ENDPOINT
      value: https://127.0.0.1:9200
    - name: KEYSTORE_ES_VERSION
      value: 7.1.0
    image: docker.elastic.co/elasticsearch/elasticsearch:7.1.0
    imagePullPolicy: IfNotPresent
    name: elasticsearch
    ports:
    - containerPort: 9200
      name: http
      protocol: TCP
    - containerPort: 9300
      name: transport
      protocol: TCP
    - containerPort: 8080
      name: process-manager
      protocol: TCP
    readinessProbe:
      exec:
        command:
        - bash
        - -c
        - "\n#!/usr/bin/env bash\n# Consider a node to be healthy if it responds to
          a simple GET on \"/\"\nCURL_TIMEOUT=3\n\n# setup basic auth if credentials
          are available\nif [ -n \"${PROBE_USERNAME}\" ] && [ -f \"${PROBE_PASSWORD_FILE}\"
          ]; then\n PROBE_PASSWORD=$(<$PROBE_PASSWORD_FILE)\n BASIC_AUTH=\"-u ${PROBE_USERNAME}:${PROBE_PASSWORD}\"\nelse\n
          \ BASIC_AUTH=''\nfi\n\n# request Elasticsearch\nstatus=$(curl -o /dev/null
          -w \"%{http_code}\" --max-time $CURL_TIMEOUT -XGET -s -k ${BASIC_AUTH} ${READINESS_PROBE_PROTOCOL:-https}://127.0.0.1:9200)\n\n#
          ready if status code 200\nif [[ $status == \"200\" ]]; then\n\texit 0\nelse\n\texit
          1\nfi\n"
      failureThreshold: 3
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 3
      timeoutSeconds: 5
    resources:
      limits:
        memory: 8Gi
      requests:
        memory: 8Gi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /usr/share/elasticsearch/config
      name: config-volume
    - mountPath: /usr/share/elasticsearch/plugins
      name: plugins-volume
    - mountPath: /usr/share/elasticsearch/bin
      name: bin-volume
    - mountPath: /usr/share/elasticsearch/data
      name: data
    - mountPath: /usr/share/elasticsearch/logs
      name: logs
    - mountPath: /usr/share/elasticsearch/config/private-key
      name: private-key-volume
    - mountPath: /mnt/elastic/process-manager
      name: local-bin-volume
    - mountPath: /mnt/elastic/secrets
      name: users
      readOnly: true
    - mountPath: /usr/share/elasticsearch/config/managed
      name: es-poc
      readOnly: true
    - mountPath: /mnt/elastic/unicast-hosts
      name: es-poc-unicast-hosts
      readOnly: true
    - mountPath: /mnt/elastic/probe-user
      name: probe-user
      readOnly: true
    - mountPath: /usr/share/elasticsearch/config/extrafiles
      name: extrafiles
      readOnly: true
    - mountPath: /usr/share/elasticsearch/config/node-certs
      name: node-certificates
      readOnly: true
    - mountPath: /mnt/elastic/reload-creds-user
      name: reload-creds-user
      readOnly: true
    - mountPath: /mnt/elastic/secure-settings
      name: secure-settings
      readOnly: true
    - mountPath: /mnt/elastic/es-config
      name: es-config
      readOnly: true
  dnsPolicy: ClusterFirst
  initContainers:
  - command:
    - sysctl
    - -w
    - vm.max_map_count=262144
    image: docker.elastic.co/elasticsearch/elasticsearch:7.1.0
    imagePullPolicy: IfNotPresent
    name: tweak-os-settings
    resources: {}
    securityContext:
      privileged: true
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
  - command:
    - bash
    - -c
    - "#!/usr/bin/env bash -eu\n\n\tES_DIR=\"/usr/share/elasticsearch\"\n\tCONFIG_DIR=$ES_DIR/config\n\tPLUGIN_BIN=$ES_DIR/bin/elasticsearch-plugin\n\tKEYSTORE_BIN=$ES_DIR/bin/elasticsearch-keystore
      \n\n\t# compute time in seconds since the given start time\n\tfunction duration()
      {\n\t\tlocal start=$1\n\t\tend=$(date +%s)\n\t\techo $((end-start))\n\t}\n\n\t######################\n\t#
      \ START #\n\t######################\n\n\tscript_start=$(date +%s)\n\n\techo
      \"Starting init script\"\n\n\t######################\n\t# Plugins #\n\t######################\n\n\tplugins_start=$(date
      +%s)\n\t# Install extra plugins\n\t\n\t\techo \"Installing plugin repository-s3\"\n\t\t#
      Using --batch accepts any user prompt (y/n)\n\t\t$PLUGIN_BIN install --batch
      repository-s3\n\t\n\t\techo \"Installing plugin repository-gcs\"\n\t\t# Using
      --batch accepts any user prompt (y/n)\n\t\t$PLUGIN_BIN install --batch repository-gcs\n\t\n\n\techo
      \"Installed plugins:\"\n\t$PLUGIN_BIN list\n\n\techo \"Plugins installation
      duration: $(duration $plugins_start) sec.\"\n\n\t######################\n\t#
      \ Config linking #\n\t######################\n\n\t# Link individual files
      from their mount location into the config dir\n\t# to a volume, to be used by
      the ES container\n\tln_start=$(date +%s)\n\t\n\t\techo \"Linking /mnt/elastic/secrets/users
      to /usr/share/elasticsearch/config/users\"\n\t\tln -sf /mnt/elastic/secrets/users
      /usr/share/elasticsearch/config/users\n\t\n\t\techo \"Linking /mnt/elastic/secrets/roles.yml
      to /usr/share/elasticsearch/config/roles.yml\"\n\t\tln -sf /mnt/elastic/secrets/roles.yml
      /usr/share/elasticsearch/config/roles.yml\n\t\n\t\techo \"Linking /mnt/elastic/secrets/users_roles
      to /usr/share/elasticsearch/config/users_roles\"\n\t\tln -sf /mnt/elastic/secrets/users_roles
      /usr/share/elasticsearch/config/users_roles\n\t\n\t\techo \"Linking /mnt/elastic/es-config/elasticsearch.yml
      to /usr/share/elasticsearch/config/elasticsearch.yml\"\n\t\tln -sf /mnt/elastic/es-config/elasticsearch.yml
      /usr/share/elasticsearch/config/elasticsearch.yml\n\t\n\t\techo \"Linking /mnt/elastic/unicast-hosts/unicast_hosts.txt
      to /usr/share/elasticsearch/config/unicast_hosts.txt\"\n\t\tln -sf /mnt/elastic/unicast-hosts/unicast_hosts.txt
      /usr/share/elasticsearch/config/unicast_hosts.txt\n\t\n\techo \"File linking
      duration: $(duration $ln_start) sec.\"\n\n\n\t######################\n\t# Files
      persistence #\n\t######################\n\n\t# Persist the content of bin/,
      config/ and plugins/\n\t# to a volume, to be used by the ES container\n\tmv_start=$(date
      +%s)\n\t\n\t\techo \"Moving /usr/share/elasticsearch/config/* to /volume/config/\"\n\t\tmv
      /usr/share/elasticsearch/config/* /volume/config/\n\t\n\t\techo \"Moving /usr/share/elasticsearch/plugins/*
      to /volume/plugins/\"\n\t\tmv /usr/share/elasticsearch/plugins/* /volume/plugins/\n\t\n\t\techo
      \"Moving /usr/share/elasticsearch/bin/* to /volume/bin/\"\n\t\tmv /usr/share/elasticsearch/bin/*
      /volume/bin/\n\t\n\t\techo \"Moving /usr/share/elasticsearch/data/* to /volume/data/\"\n\t\tmv
      /usr/share/elasticsearch/data/* /volume/data/\n\t\n\t\techo \"Moving /usr/share/elasticsearch/logs/*
      to /volume/logs/\"\n\t\tmv /usr/share/elasticsearch/logs/* /volume/logs/\n\t\n\techo
      \"Files copy duration: $(duration $mv_start) sec.\"\n\n\t######################\n\t#
      \ Volumes chown #\n\t######################\n\n\t# chown the data and logs
      volume to the elasticsearch user\n\tchown_start=$(date +%s)\n\t\n\t\techo \"chowning
      /volume/data to elasticsearch:elasticsearch\"\n\t\tchown -v elasticsearch:elasticsearch
      /volume/data\n\t\n\t\techo \"chowning /volume/logs to elasticsearch:elasticsearch\"\n\t\tchown
      -v elasticsearch:elasticsearch /volume/logs\n\t\n\techo \"chown duration: $(duration
      $chown_start) sec.\"\n\n\t######################\n\t# End #\n\t######################\n\n\techo
      \"Init script successful\"\n\techo \"Script duration: $(duration $script_start)
      sec.\"\n"
    image: docker.elastic.co/elasticsearch/elasticsearch:7.1.0
    imagePullPolicy: IfNotPresent
    name: prepare-fs
    resources: {}
    securityContext:
      privileged: false
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /volume/config
      name: config-volume
    - mountPath: /volume/plugins
      name: plugins-volume
    - mountPath: /volume/bin
      name: bin-volume
    - mountPath: /volume/data
      name: data
    - mountPath: /volume/logs
      name: logs
  - command:
    - bash
    - -c
    - "\n\t\t#!/usr/bin/env bash -eu\n\t\tcp process-manager $LOCAL_BIN\n"
    env:
    - name: LOCAL_BIN
      value: /volume/bin
    image: docker.elastic.co/eck/eck-operator:0.8.0
    imagePullPolicy: IfNotPresent
    name: inject-process-manager
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /volume/bin
      name: local-bin-volume
  - command:
    - /root/cert-initializer
    image: docker.elastic.co/eck/eck-operator:0.8.0
    imagePullPolicy: Always
    name: cert-initializer
    ports:
    - containerPort: 8001
      name: csr
      protocol: TCP
    resources: {}
    securityContext:
      privileged: false
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /usr/share/elasticsearch/config/node-certs
      name: node-certificates
      readOnly: true
    - mountPath: /mnt/elastic/private-key
      name: private-key-volume
  nodeName: ip-172-19-75-225.eu-west-1.compute.internal
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 120
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - emptyDir: {}
    name: config-volume
  - emptyDir: {}
    name: plugins-volume
  - emptyDir: {}
    name: bin-volume
  - emptyDir: {}
    name: data
  - emptyDir: {}
    name: logs
  - emptyDir: {}
    name: private-key-volume
  - emptyDir: {}
    name: local-bin-volume
  - name: users
    secret:
      defaultMode: 420
      optional: false
      secretName: es-poc-es-roles-users
  - configMap:
      defaultMode: 420
      name: es-poc
      optional: false
    name: es-poc
  - configMap:
      defaultMode: 420
      name: es-poc-unicast-hosts
      optional: false
    name: es-poc-unicast-hosts
  - name: probe-user
    secret:
      defaultMode: 420
      items:
      - key: elastic-internal-probe
        path: elastic-internal-probe
      optional: false
      secretName: es-poc-internal-users
  - name: extrafiles
    secret:
      defaultMode: 420
      optional: false
      secretName: es-poc-extrafiles
  - name: reload-creds-user
    secret:
      defaultMode: 420
      items:
      - key: elastic-internal-reload-creds
        path: elastic-internal-reload-creds
      optional: false
      secretName: es-poc-internal-users
  - name: secure-settings
    secret:
      defaultMode: 420
      optional: false
      secretName: es-poc-secure-settings
  - name: elasticsearch-data
    persistentVolumeClaim:
      claimName: es-poc-es-zlqnp85q6m-elasticsearch-data
  - name: node-certificates
    secret:
      defaultMode: 420
      optional: false
      secretName: es-poc-es-zlqnp85q6m-certs
  - name: es-config
    secret:
      defaultMode: 420
      optional: false
      secretName: es-poc-es-zlqnp85q6m-config
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2019-06-19T04:08:29Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2019-06-19T06:19:26Z"
    message: 'containers with unready status: [elasticsearch]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: null
    message: 'containers with unready status: [elasticsearch]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2019-06-19T04:07:05Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://c08200420b7f38fb5e15e9327529f40e4462d63588478500cc442f03c2441c51
    image: docker.elastic.co/elasticsearch/elasticsearch:7.1.0
    imageID: docker-pullable://docker.elastic.co/elasticsearch/elasticsearch@sha256:802b6a299260dbaf21a9c57e3a634491ff788a1ea13a51598d4cd105739509c4
    lastState:
      terminated:
        containerID: docker://c08200420b7f38fb5e15e9327529f40e4462d63588478500cc442f03c2441c51
        exitCode: 255
        finishedAt: "2019-06-19T08:13:45Z"
        reason: OOMKilled
        startedAt: "2019-06-19T08:13:33Z"
    name: elasticsearch
    ready: false
    restartCount: 26
    state:
      waiting:
        message: Back-off 5m0s restarting failed container=elasticsearch pod=es-poc-es-zlqnp85q6m_dev-telemetry(de88877e-91dc-11e9-8bac-0ae689a1e762)
        reason: CrashLoopBackOff
  hostIP: 172.19.75.225
  initContainerStatuses:
  - containerID: docker://f32c5fad8be28c766807df8313fb49ea3077cd6464ff239e17a4df56ddd014be
    image: docker.elastic.co/elasticsearch/elasticsearch:7.1.0
    imageID: docker-pullable://docker.elastic.co/elasticsearch/elasticsearch@sha256:802b6a299260dbaf21a9c57e3a634491ff788a1ea13a51598d4cd105739509c4
    lastState: {}
    name: tweak-os-settings
    ready: true
    restartCount: 0
    state:
      terminated:
        containerID: docker://f32c5fad8be28c766807df8313fb49ea3077cd6464ff239e17a4df56ddd014be
        exitCode: 0
        finishedAt: "2019-06-19T04:07:44Z"
        reason: Completed
        startedAt: "2019-06-19T04:07:44Z"
  - containerID: docker://9f65e6caee0b96c44ad4b65e03b6659413c8176337ed1391a6f03bf9ada8fc35
    image: docker.elastic.co/elasticsearch/elasticsearch:7.1.0
    imageID: docker-pullable://docker.elastic.co/elasticsearch/elasticsearch@sha256:802b6a299260dbaf21a9c57e3a634491ff788a1ea13a51598d4cd105739509c4
    lastState: {}
    name: prepare-fs
    ready: true
    restartCount: 0
    state:
      terminated:
        containerID: docker://9f65e6caee0b96c44ad4b65e03b6659413c8176337ed1391a6f03bf9ada8fc35
        exitCode: 0
        finishedAt: "2019-06-19T04:07:54Z"
        reason: Completed
        startedAt: "2019-06-19T04:07:45Z"
  - containerID: docker://14e47c5e8d1e36a9ad85df1e60503b83f58f009e3d9df5d1b31dd9f1a263ed45
    image: docker.elastic.co/eck/eck-operator:0.8.0
    imageID: docker-pullable://docker.elastic.co/eck/eck-operator@sha256:1e910d2502690f9007d103f89cef92ef3c4f1115c08819edb3b7409481b291a3
    lastState: {}
    name: inject-process-manager
    ready: true
    restartCount: 0
    state:
      terminated:
        containerID: docker://14e47c5e8d1e36a9ad85df1e60503b83f58f009e3d9df5d1b31dd9f1a263ed45
        exitCode: 0
        finishedAt: "2019-06-19T04:08:01Z"
        reason: Completed
        startedAt: "2019-06-19T04:08:01Z"
  - containerID: docker://51f49b912ac17934d4d8be14730163975de5be79c89010f341075cd2f8f9db35
    image: docker.elastic.co/eck/eck-operator:0.8.0
    imageID: docker-pullable://docker.elastic.co/eck/eck-operator@sha256:1e910d2502690f9007d103f89cef92ef3c4f1115c08819edb3b7409481b291a3
    lastState: {}
    name: cert-initializer
    ready: true
    restartCount: 0
    state:
      terminated:
        containerID: docker://51f49b912ac17934d4d8be14730163975de5be79c89010f341075cd2f8f9db35
        exitCode: 0
        finishedAt: "2019-06-19T04:08:29Z"
        reason: Completed
        startedAt: "2019-06-19T04:08:04Z"
  phase: Running
  podIP: 100.103.128.4
  qosClass: Burstable
  startTime: "2019-06-19T04:07:05Z"
{"level":"info","ts":1560932215.7750769,"logger":"process-manager","msg":"Update process state","action":"initialization","id":"es","state":"failed","pid":0}
{"level":"info","ts":1560932215.7751539,"logger":"process-manager","msg":"Starting..."}
{"level":"info","ts":1560932215.7765884,"logger":"process-manager","msg":"Update process state","action":"start","id":"es","state":"started","pid":17}
{"level":"info","ts":1560932215.776638,"logger":"process-manager","msg":"Started"}
{"level":"info","ts":1560932215.779489,"logger":"keystore-updater","msg":"Waiting for Elasticsearch to be ready"}
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
{"type": "server", "timestamp": "2019-06-19T08:16:58,975+0000", "level": "INFO", "component": "o.e.e.NodeEnvironment", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/nvme0n1p1)]], net usable_space [91.7gb], net total_space [127.9gb], types [xfs]" }
{"type": "server", "timestamp": "2019-06-19T08:16:58,980+0000", "level": "INFO", "component": "o.e.e.NodeEnvironment", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "heap size [1.9gb], compressed ordinary object pointers [true]" }
{"type": "server", "timestamp": "2019-06-19T08:16:58,993+0000", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "node name [es-poc-es-nckxn896sn], node ID [gq-eYTn-QBWQUuB1mDtI4A], cluster name [es-poc]" }
{"type": "server", "timestamp": "2019-06-19T08:16:58,994+0000", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "version[7.1.0], pid[17], build[default/docker/606a173/2019-05-16T00:43:15.323135Z], OS[Linux/3.10.0-862.14.4.el7.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/12.0.1/12.0.1+12]" }
{"type": "server", "timestamp": "2019-06-19T08:16:58,994+0000", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "JVM home [/usr/share/elasticsearch/jdk]" }
{"type": "server", "timestamp": "2019-06-19T08:16:58,995+0000", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch-1370492985762362908, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Djava.locale.providers=COMPAT, -Dio.netty.allocator.type=unpooled, -Des.cgroups.hierarchy.override=/, -Xms2048M, -Xmx2048M, -Djava.security.properties=/usr/share/elasticsearch/config/managed/security.properties, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/usr/share/elasticsearch/config, -Des.distribution.flavor=default, -Des.distribution.type=docker, -Des.bundled_jdk=true]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,874+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [aggs-matrix-stats]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,874+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [analysis-common]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,874+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [ingest-common]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,874+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [ingest-geoip]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,874+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [ingest-user-agent]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,875+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [lang-expression]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,875+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [lang-mustache]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,875+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [lang-painless]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,875+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [mapper-extras]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,875+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [parent-join]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,875+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [percolator]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,876+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [rank-eval]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,876+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [reindex]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,876+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [repository-url]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,876+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [transport-netty4]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,876+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [x-pack-ccr]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,876+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [x-pack-core]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,877+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [x-pack-deprecation]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,877+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [x-pack-graph]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,877+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [x-pack-ilm]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,877+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [x-pack-logstash]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,877+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [x-pack-ml]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,877+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [x-pack-monitoring]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,878+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [x-pack-rollup]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,878+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [x-pack-security]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,878+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [x-pack-sql]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,878+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded module [x-pack-watcher]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,879+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded plugin [repository-gcs]" }
{"type": "server", "timestamp": "2019-06-19T08:17:01,879+0000", "level": "INFO", "component": "o.e.p.PluginsService", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "loaded plugin [repository-s3]" }
{"type": "deprecation", "timestamp": "2019-06-19T08:17:04,368+0000", "level": "WARN", "component": "o.e.d.c.s.Settings", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "[discovery.zen.minimum_master_nodes] setting was deprecated in Elasticsearch and will be removed in a future release! See the breaking changes documentation for the next major version." }
{"type": "deprecation", "timestamp": "2019-06-19T08:17:04,371+0000", "level": "WARN", "component": "o.e.d.c.s.Settings", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "[discovery.zen.hosts_provider] setting was deprecated in Elasticsearch and will be removed in a future release! See the breaking changes documentation for the next major version." }
{"level":"info","ts":1560932225.8303845,"logger":"keystore-updater","msg":"Waiting for Elasticsearch to be ready"}
{"type": "server", "timestamp": "2019-06-19T08:17:05,970+0000", "level": "INFO", "component": "o.e.x.s.a.s.FileRolesStore", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "parsed [2] roles from file [/usr/share/elasticsearch/config/roles.yml]" }
{"type": "server", "timestamp": "2019-06-19T08:17:06,540+0000", "level": "INFO", "component": "o.e.x.m.p.l.CppLogMessageHandler", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "[controller/89] [Main.cc@109] controller (64 bit): Version 7.1.0 (Build a8ee6de8087169) Copyright (c) 2019 Elasticsearch BV" }
{"type": "server", "timestamp": "2019-06-19T08:17:07,031+0000", "level": "DEBUG", "component": "o.e.a.ActionModule", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "Using REST wrapper from plugin org.elasticsearch.xpack.security.Security" }
{"type": "server", "timestamp": "2019-06-19T08:17:07,542+0000", "level": "INFO", "component": "o.e.x.m.p.NativeController", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "Native controller process has stopped - no new native processes can be started" }
{"type": "server", "timestamp": "2019-06-19T08:17:07,622+0000", "level": "INFO", "component": "o.e.d.DiscoveryModule", "cluster.name": "es-poc", "node.name": "es-poc-es-nckxn896sn", "message": "using discovery type [zen] and seed hosts providers [settings, file]" }
{"level":"info","ts":1560932228.4319546,"logger":"process-manager","msg":"Update process state","action":"terminate","id":"es","state":"failed","pid":17}
{"level":"info","ts":1560932228.4471412,"logger":"process-manager","msg":"HTTP server closed"}
{"level":"info","ts":1560932228.4470956,"logger":"process-manager","msg":"Exit","reason":"process failed","code":-1}
kind: Pod
metadata:
  annotations:
    update.k8s.elastic.co/timestamp: "2019-06-19T04:03:31.870642156Z"
  creationTimestamp: "2019-06-19T04:01:52Z"
  labels:
    common.k8s.elastic.co/type: elasticsearch
    elasticsearch.k8s.elastic.co/cluster-name: es-poc
    elasticsearch.k8s.elastic.co/node-data: "false"
    elasticsearch.k8s.elastic.co/node-ingest: "false"
    elasticsearch.k8s.elastic.co/node-master: "true"
    elasticsearch.k8s.elastic.co/node-ml: "true"
    elasticsearch.k8s.elastic.co/version: 7.1.0
  name: es-poc-es-nckxn896sn
  namespace: dev-telemetry
  ownerReferences:
  - apiVersion: elasticsearch.k8s.elastic.co/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Elasticsearch
    name: es-poc
    uid: dd4bc77e-91dc-11e9-8bac-0ae689a1e762
  resourceVersion: "113279912"
  selfLink: /api/v1/namespaces/dev-telemetry/pods/es-poc-es-nckxn896sn
  uid: f849eb50-9246-11e9-8bac-0ae689a1e762
spec:
  automountServiceAccountToken: false
  containers:
  - command:
    - /mnt/elastic/process-manager/process-manager
    env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: POD_IP
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: status.podIP
    - name: ES_JAVA_OPTS
      value: -Xms2048M -Xmx2048M -Djava.security.properties=/usr/share/elasticsearch/config/managed/security.properties
    - name: READINESS_PROBE_PROTOCOL
      value: https
    - name: PROBE_USERNAME
      value: elastic-internal-probe
    - name: PROBE_PASSWORD_FILE
      value: /mnt/elastic/probe-user/elastic-internal-probe
    - name: PM_PROC_NAME
      value: es
    - name: PM_PROC_CMD
      value: /usr/local/bin/docker-entrypoint.sh
    - name: PM_TLS
      value: "true"
    - name: PM_CERT_PATH
      value: /usr/share/elasticsearch/config/node-certs/cert.pem
    - name: PM_KEY_PATH
      value: /usr/share/elasticsearch/config/private-key/node.key
    - name: KEYSTORE_SOURCE_DIR
      value: /mnt/elastic/secure-settings
    - name: KEYSTORE_RELOAD_CREDENTIALS
      value: "true"
    - name: KEYSTORE_ES_USERNAME
      value: elastic-internal-reload-creds
    - name: KEYSTORE_ES_PASSWORD_FILE
      value: /mnt/elastic/reload-creds-user/elastic-internal-reload-creds
    - name: KEYSTORE_ES_CA_CERTS_PATH
      value: /usr/share/elasticsearch/config/node-certs/ca.pem
    - name: KEYSTORE_ES_ENDPOINT
      value: https://127.0.0.1:9200
    - name: KEYSTORE_ES_VERSION
      value: 7.1.0
    image: docker.elastic.co/elasticsearch/elasticsearch:7.1.0
    imagePullPolicy: IfNotPresent
    name: elasticsearch
    ports:
    - containerPort: 9200
      name: http
      protocol: TCP
    - containerPort: 9300
      name: transport
      protocol: TCP
    - containerPort: 8080
      name: process-manager
      protocol: TCP
    readinessProbe:
      exec:
        command:
        - bash
        - -c
- "\n#!/usr/bin/env bash\n# Consider a node to be healthy if it responds to
a simple GET on \"/\"\nCURL_TIMEOUT=3\n\n# setup basic auth if credentials
are available\nif [ -n \"${PROBE_USERNAME}\" ] && [ -f \"${PROBE_PASSWORD_FILE}\"
]; then\n PROBE_PASSWORD=$(<$PROBE_PASSWORD_FILE)\n BASIC_AUTH=\"-u ${PROBE_USERNAME}:${PROBE_PASSWORD}\"\nelse\n
\ BASIC_AUTH=''\nfi\n\n# request Elasticsearch\nstatus=$(curl -o /dev/null
-w \"%{http_code}\" --max-time $CURL_TIMEOUT -XGET -s -k ${BASIC_AUTH} ${READINESS_PROBE_PROTOCOL:-https}://127.0.0.1:9200)\n\n#
ready if status code 200\nif [[ $status == \"200\" ]]; then\n\texit 0\nelse\n\texit
1\nfi\n"
failureThreshold: 3
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 3
timeoutSeconds: 5
resources:
limits:
memory: 4Gi
requests:
memory: 4Gi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /usr/share/elasticsearch/config
name: config-volume
- mountPath: /usr/share/elasticsearch/plugins
name: plugins-volume
- mountPath: /usr/share/elasticsearch/bin
name: bin-volume
- mountPath: /usr/share/elasticsearch/data
name: data
- mountPath: /usr/share/elasticsearch/logs
name: logs
- mountPath: /usr/share/elasticsearch/config/private-key
name: private-key-volume
- mountPath: /mnt/elastic/process-manager
name: local-bin-volume
- mountPath: /mnt/elastic/secrets
name: users
readOnly: true
- mountPath: /usr/share/elasticsearch/config/managed
name: es-poc
readOnly: true
- mountPath: /mnt/elastic/unicast-hosts
name: es-poc-unicast-hosts
readOnly: true
- mountPath: /mnt/elastic/probe-user
name: probe-user
readOnly: true
- mountPath: /usr/share/elasticsearch/config/extrafiles
name: extrafiles
readOnly: true
- mountPath: /usr/share/elasticsearch/config/node-certs
name: node-certificates
readOnly: true
- mountPath: /mnt/elastic/reload-creds-user
name: reload-creds-user
readOnly: true
- mountPath: /mnt/elastic/secure-settings
name: secure-settings
readOnly: true
- mountPath: /mnt/elastic/es-config
name: es-config
readOnly: true
dnsPolicy: ClusterFirst
initContainers:
- command:
- sysctl
- -w
- vm.max_map_count=262144
image: docker.elastic.co/elasticsearch/elasticsearch:7.1.0
imagePullPolicy: IfNotPresent
name: tweak-os-settings
resources: {}
securityContext:
privileged: true
runAsUser: 0
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
- command:
- bash
- -c
- "#!/usr/bin/env bash -eu\n\n\tES_DIR=\"/usr/share/elasticsearch\"\n\tCONFIG_DIR=$ES_DIR/config\n\tPLUGIN_BIN=$ES_DIR/bin/elasticsearch-plugin\n\tKEYSTORE_BIN=$ES_DIR/bin/elasticsearch-keystore
\n\n\t# compute time in seconds since the given start time\n\tfunction duration()
{\n\t\tlocal start=$1\n\t\tend=$(date +%s)\n\t\techo $((end-start))\n\t}\n\n\t######################\n\t#
\ START #\n\t######################\n\n\tscript_start=$(date +%s)\n\n\techo
\"Starting init script\"\n\n\t######################\n\t# Plugins #\n\t######################\n\n\tplugins_start=$(date
+%s)\n\t# Install extra plugins\n\t\n\t\techo \"Installing plugin repository-s3\"\n\t\t#
Using --batch accepts any user prompt (y/n)\n\t\t$PLUGIN_BIN install --batch
repository-s3\n\t\n\t\techo \"Installing plugin repository-gcs\"\n\t\t# Using
--batch accepts any user prompt (y/n)\n\t\t$PLUGIN_BIN install --batch repository-gcs\n\t\n\n\techo
\"Installed plugins:\"\n\t$PLUGIN_BIN list\n\n\techo \"Plugins installation
duration: $(duration $plugins_start) sec.\"\n\n\t######################\n\t#
\ Config linking #\n\t######################\n\n\t# Link individual files
from their mount location into the config dir\n\t# to a volume, to be used by
the ES container\n\tln_start=$(date +%s)\n\t\n\t\techo \"Linking /mnt/elastic/secrets/users
to /usr/share/elasticsearch/config/users\"\n\t\tln -sf /mnt/elastic/secrets/users
/usr/share/elasticsearch/config/users\n\t\n\t\techo \"Linking /mnt/elastic/secrets/roles.yml
to /usr/share/elasticsearch/config/roles.yml\"\n\t\tln -sf /mnt/elastic/secrets/roles.yml
/usr/share/elasticsearch/config/roles.yml\n\t\n\t\techo \"Linking /mnt/elastic/secrets/users_roles
to /usr/share/elasticsearch/config/users_roles\"\n\t\tln -sf /mnt/elastic/secrets/users_roles
/usr/share/elasticsearch/config/users_roles\n\t\n\t\techo \"Linking /mnt/elastic/es-config/elasticsearch.yml
to /usr/share/elasticsearch/config/elasticsearch.yml\"\n\t\tln -sf /mnt/elastic/es-config/elasticsearch.yml
/usr/share/elasticsearch/config/elasticsearch.yml\n\t\n\t\techo \"Linking /mnt/elastic/unicast-hosts/unicast_hosts.txt
to /usr/share/elasticsearch/config/unicast_hosts.txt\"\n\t\tln -sf /mnt/elastic/unicast-hosts/unicast_hosts.txt
/usr/share/elasticsearch/config/unicast_hosts.txt\n\t\n\techo \"File linking
duration: $(duration $ln_start) sec.\"\n\n\n\t######################\n\t# Files
persistence #\n\t######################\n\n\t# Persist the content of bin/,
config/ and plugins/\n\t# to a volume, to be used by the ES container\n\tmv_start=$(date
+%s)\n\t\n\t\techo \"Moving /usr/share/elasticsearch/config/* to /volume/config/\"\n\t\tmv
/usr/share/elasticsearch/config/* /volume/config/\n\t\n\t\techo \"Moving /usr/share/elasticsearch/plugins/*
to /volume/plugins/\"\n\t\tmv /usr/share/elasticsearch/plugins/* /volume/plugins/\n\t\n\t\techo
\"Moving /usr/share/elasticsearch/bin/* to /volume/bin/\"\n\t\tmv /usr/share/elasticsearch/bin/*
/volume/bin/\n\t\n\t\techo \"Moving /usr/share/elasticsearch/data/* to /volume/data/\"\n\t\tmv
/usr/share/elasticsearch/data/* /volume/data/\n\t\n\t\techo \"Moving /usr/share/elasticsearch/logs/*
to /volume/logs/\"\n\t\tmv /usr/share/elasticsearch/logs/* /volume/logs/\n\t\n\techo
\"Files copy duration: $(duration $mv_start) sec.\"\n\n\t######################\n\t#
\ Volumes chown #\n\t######################\n\n\t# chown the data and logs
volume to the elasticsearch user\n\tchown_start=$(date +%s)\n\t\n\t\techo \"chowning
/volume/data to elasticsearch:elasticsearch\"\n\t\tchown -v elasticsearch:elasticsearch
/volume/data\n\t\n\t\techo \"chowning /volume/logs to elasticsearch:elasticsearch\"\n\t\tchown
-v elasticsearch:elasticsearch /volume/logs\n\t\n\techo \"chown duration: $(duration
$chown_start) sec.\"\n\n\t######################\n\t# End #\n\t######################\n\n\techo
\"Init script successful\"\n\techo \"Script duration: $(duration $script_start)
sec.\"\n"
image: docker.elastic.co/elasticsearch/elasticsearch:7.1.0
imagePullPolicy: IfNotPresent
name: prepare-fs
resources: {}
securityContext:
privileged: false
runAsUser: 0
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /volume/config
name: config-volume
- mountPath: /volume/plugins
name: plugins-volume
- mountPath: /volume/bin
name: bin-volume
- mountPath: /volume/data
name: data
- mountPath: /volume/logs
name: logs
- command:
- bash
- -c
- "\n\t\t#!/usr/bin/env bash -eu\n\t\tcp process-manager $LOCAL_BIN\n"
env:
- name: LOCAL_BIN
value: /volume/bin
image: docker.elastic.co/eck/eck-operator:0.8.0
imagePullPolicy: IfNotPresent
name: inject-process-manager
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /volume/bin
name: local-bin-volume
- command:
- /root/cert-initializer
image: docker.elastic.co/eck/eck-operator:0.8.0
imagePullPolicy: Always
name: cert-initializer
ports:
- containerPort: 8001
name: csr
protocol: TCP
resources: {}
securityContext:
privileged: false
runAsUser: 0
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /usr/share/elasticsearch/config/node-certs
name: node-certificates
readOnly: true
- mountPath: /mnt/elastic/private-key
name: private-key-volume
nodeName: ip-172-19-74-179.eu-west-1.compute.internal
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: default
serviceAccountName: default
terminationGracePeriodSeconds: 120
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- emptyDir: {}
name: config-volume
- emptyDir: {}
name: plugins-volume
- emptyDir: {}
name: bin-volume
- emptyDir: {}
name: data
- emptyDir: {}
name: logs
- emptyDir: {}
name: private-key-volume
- emptyDir: {}
name: local-bin-volume
- name: users
secret:
defaultMode: 420
optional: false
secretName: es-poc-es-roles-users
- configMap:
defaultMode: 420
name: es-poc
optional: false
name: es-poc
- configMap:
defaultMode: 420
name: es-poc-unicast-hosts
optional: false
name: es-poc-unicast-hosts
- name: probe-user
secret:
defaultMode: 420
items:
- key: elastic-internal-probe
path: elastic-internal-probe
optional: false
secretName: es-poc-internal-users
- name: extrafiles
secret:
defaultMode: 420
optional: false
secretName: es-poc-extrafiles
- name: reload-creds-user
secret:
defaultMode: 420
items:
- key: elastic-internal-reload-creds
path: elastic-internal-reload-creds
optional: false
secretName: es-poc-internal-users
- name: secure-settings
secret:
defaultMode: 420
optional: false
secretName: es-poc-secure-settings
- name: node-certificates
secret:
defaultMode: 420
optional: false
secretName: es-poc-es-nckxn896sn-certs
- name: es-config
secret:
defaultMode: 420
optional: false
secretName: es-poc-es-nckxn896sn-config
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2019-06-19T04:03:32Z"
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: "2019-06-19T04:46:40Z"
message: 'containers with unready status: [elasticsearch]'
reason: ContainersNotReady
status: "False"
type: Ready
- lastProbeTime: null
lastTransitionTime: null
message: 'containers with unready status: [elasticsearch]'
reason: ContainersNotReady
status: "False"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: "2019-06-19T04:01:52Z"
status: "True"
type: PodScheduled
containerStatuses:
- containerID: docker://a7c44a01d0af5579049c1bb4fb83c70242d7e87aa7832fdb79bbf22b2dda35b3
image: docker.elastic.co/elasticsearch/elasticsearch:7.1.0
imageID: docker-pullable://docker.elastic.co/elasticsearch/elasticsearch@sha256:802b6a299260dbaf21a9c57e3a634491ff788a1ea13a51598d4cd105739509c4
lastState:
terminated:
containerID: docker://a7c44a01d0af5579049c1bb4fb83c70242d7e87aa7832fdb79bbf22b2dda35b3
exitCode: 255
finishedAt: "2019-06-19T08:17:08Z"
reason: OOMKilled
startedAt: "2019-06-19T08:16:55Z"
name: elasticsearch
ready: false
restartCount: 44
state:
waiting:
message: Back-off 5m0s restarting failed container=elasticsearch pod=es-poc-es-nckxn896sn_dev-telemetry(f849eb50-9246-11e9-8bac-0ae689a1e762)
reason: CrashLoopBackOff
hostIP: 172.19.74.179
initContainerStatuses:
- containerID: docker://ffc4e521a27e759341a55ce1458362af378e9072987547c90847e2e92bbd5aa3
image: docker.elastic.co/elasticsearch/elasticsearch:7.1.0
imageID: docker-pullable://docker.elastic.co/elasticsearch/elasticsearch@sha256:802b6a299260dbaf21a9c57e3a634491ff788a1ea13a51598d4cd105739509c4
lastState: {}
name: tweak-os-settings
ready: true
restartCount: 0
state:
terminated:
containerID: docker://ffc4e521a27e759341a55ce1458362af378e9072987547c90847e2e92bbd5aa3
exitCode: 0
finishedAt: "2019-06-19T04:01:58Z"
reason: Completed
startedAt: "2019-06-19T04:01:58Z"
- containerID: docker://311a2b78650f8d21e71542af3f2d2cfdae13c1b0fd44be4021d26f77aa1d4090
image: docker.elastic.co/elasticsearch/elasticsearch:7.1.0
imageID: docker-pullable://docker.elastic.co/elasticsearch/elasticsearch@sha256:802b6a299260dbaf21a9c57e3a634491ff788a1ea13a51598d4cd105739509c4
lastState: {}
name: prepare-fs
ready: true
restartCount: 0
state:
terminated:
containerID: docker://311a2b78650f8d21e71542af3f2d2cfdae13c1b0fd44be4021d26f77aa1d4090
exitCode: 0
finishedAt: "2019-06-19T04:02:12Z"
reason: Completed
startedAt: "2019-06-19T04:01:59Z"
- containerID: docker://6b882580d792367426046df37e8969e5be8ffb33c6e45955c01935afd97d5e13
image: docker.elastic.co/eck/eck-operator:0.8.0
imageID: docker-pullable://docker.elastic.co/eck/eck-operator@sha256:1e910d2502690f9007d103f89cef92ef3c4f1115c08819edb3b7409481b291a3
lastState: {}
name: inject-process-manager
ready: true
restartCount: 0
state:
terminated:
containerID: docker://6b882580d792367426046df37e8969e5be8ffb33c6e45955c01935afd97d5e13
exitCode: 0
finishedAt: "2019-06-19T04:02:14Z"
reason: Completed
startedAt: "2019-06-19T04:02:14Z"
- containerID: docker://1b4cb60bd27cac59115cdce2696ebab3138d7847b786f09900b16f8520af0bee
image: docker.elastic.co/eck/eck-operator:0.8.0
imageID: docker-pullable://docker.elastic.co/eck/eck-operator@sha256:1e910d2502690f9007d103f89cef92ef3c4f1115c08819edb3b7409481b291a3
lastState: {}
name: cert-initializer
ready: true
restartCount: 0
state:
terminated:
containerID: docker://1b4cb60bd27cac59115cdce2696ebab3138d7847b786f09900b16f8520af0bee
exitCode: 0
finishedAt: "2019-06-19T04:03:32Z"
reason: Completed
startedAt: "2019-06-19T04:03:31Z"
phase: Running
podIP: 100.119.0.14
qosClass: Burstable
startTime: "2019-06-19T04:01:52Z"
There are no logs containing Exception in thread "main" org.elasticsearch.bootstrap.BootstrapException: java.nio.file.FileAlreadyExistsException: /usr/share/elasticsearch/config/elasticsearch.keystore.tmp
and kubectl logs <pod_name> --previous does not show it either.
I'll redeploy the cluster with the original configuration to see whether they appear again.
Looking at node 1:
lastState:
terminated:
containerID: docker://c08200420b7f38fb5e15e9327529f40e4462d63588478500cc442f03c2441c51
exitCode: 255
finishedAt: "2019-06-19T08:13:45Z"
reason: OOMKilled
startedAt: "2019-06-19T08:13:33Z"
The pod seems to have been killed by Kubernetes for running out of memory. According to the spec it should be running with an 8Gi memory limit. The JVM heap size is correctly set to 4Gi.
Do you have a lot of shards in that cluster, @iekulyk? Maybe a 4Gi JVM heap is not enough. GET /_cat/health?v should display the number of shards, if you can manage to reach the ES endpoint.
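The heap-versus-limit comparison above can be checked mechanically. Here is a minimal Python sketch, assuming only binary Kubernetes suffixes (Ki/Mi/Gi) and the usual -Xmx syntax. The helper names are hypothetical, and the sample values are the ES_JAVA_OPTS and memory limit from the pod dump earlier in this thread.

```python
import re
from typing import Optional

# Binary suffixes used in Kubernetes memory quantities (only these are handled here).
_K8S_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}
# JVM -Xmx suffixes (case-insensitive); no suffix means bytes.
_JVM_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}


def k8s_memory_to_bytes(quantity: str) -> int:
    """Convert a quantity such as '4Gi' to bytes (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)(Ki|Mi|Gi)?", quantity.strip())
    if match is None:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    number, unit = match.groups()
    return int(number) * _K8S_UNITS.get(unit, 1)


def max_heap_bytes(java_opts: str) -> Optional[int]:
    """Extract the -Xmx value from a JVM options string, or None if absent."""
    match = re.search(r"-Xmx(\d+)([kKmMgG])?", java_opts)
    if match is None:
        return None
    number, unit = match.groups()
    return int(number) * _JVM_UNITS.get((unit or "").lower(), 1)


def heap_share_of_limit(java_opts: str, memory_limit: str) -> float:
    """Return the fraction of the container memory limit taken by the JVM heap."""
    heap = max_heap_bytes(java_opts)
    if heap is None:
        raise ValueError("no -Xmx flag found")
    return heap / k8s_memory_to_bytes(memory_limit)


if __name__ == "__main__":
    # Values copied from the pod dump in this thread.
    opts = ("-Xms2048M -Xmx2048M "
            "-Djava.security.properties=/usr/share/elasticsearch/config/managed/security.properties")
    print(f"heap is {heap_share_of_limit(opts, '4Gi'):.0%} of the container limit")
```

A heap at half of the container limit leaves the other half for off-heap usage, so a heap that is too large for the limit is unlikely to be the cause here.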
@sebgl, there is no data on that cluster,
only the samples imported from the Kibana UI:
shards - 4
documents - 1,033
data - 985.0 K
Hi, could you provide the output of the following command on the K8s node: sudo dmesg -T ?
Thank you
If the output is too verbose, please just copy/paste the part with the details about the latest OOMKiller activity.
[Wed Jun 19 09:15:41 2019] Memory cgroup out of memory: Kill process 30572 (elasticsearch[e) score 1499 or sacrifice child
[Wed Jun 19 09:15:41 2019] Killed process 30025 (java) total-vm:4770836kB, anon-rss:1298548kB, file-rss:21304kB, shmem-rss:0kB
[Wed Jun 19 09:15:41 2019] process-manager invoked oom-killer: gfp_mask=0x50, order=0, oom_score_adj=869
[Wed Jun 19 09:15:41 2019] IPv4: martian source 100.127.0.9 from 100.101.128.0, on dev datapath
[Wed Jun 19 09:15:41 2019] ll header: 00000000: ff ff ff ff ff ff 8e 88 3a bb 95 49 08 06 ........:..I..
[Wed Jun 19 09:15:41 2019] process-manager cpuset=0b7821b76fccf45f9b6e383269be1012b4afb3281c7bf22df74162b345acf20f mems_allowed=0
[Wed Jun 19 09:15:41 2019] CPU: 3 PID: 30024 Comm: process-manager Tainted: G ------------ T 3.10.0-862.14.4.el7.x86_64 #1
[Wed Jun 19 09:15:41 2019] Hardware name: Amazon EC2 m5.xlarge/, BIOS 1.0 10/16/2017
...
Task in / killed as a result of limit of /kubepods/burstable/pod996fb235-926f-11e9-8bac-0ae689a1e762
[Wed Jun 19 09:15:41 2019] memory: usage 809536kB, limit 2097152kB, failcnt 159697
[Wed Jun 19 09:15:41 2019] memory+swap: usage 809980kB, limit 9007199254740988kB, failcnt 0
[Wed Jun 19 09:15:41 2019] kmem: usage 783896kB, limit 9007199254740988kB, failcnt 0
[Wed Jun 19 09:15:41 2019] Memory cgroup stats for /kubepods/burstable/pod996fb235-926f-11e9-8bac-0ae689a1e762: cache:0KB rss:0KB rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
[Wed Jun 19 09:15:41 2019] Memory cgroup stats for /kubepods/burstable/pod996fb235-926f-11e9-8bac-0ae689a1e762/40c8643ac456f5957d1f3609ecfc428cec46915af0f71400ce004c2c4942c950: cache:0KB rss:40KB rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:40KB inactive_file:0KB active_file:0KB unevictable:0KB
And I can still see Exception in thread "main" org.elasticsearch.bootstrap.BootstrapException for some pods:
Exception in thread "main" org.elasticsearch.bootstrap.BootstrapException: java.nio.file.FileAlreadyExistsException: /usr/share/elasticsearch/config/elasticsearch.keystore.tmp
Likely root cause: java.nio.file.FileAlreadyExistsException: /usr/share/elasticsearch/config/elasticsearch.keystore.tmp
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:94)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:219)
at java.base/java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:478)
at java.base/java.nio.file.Files.newOutputStream(Files.java:222)
at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:411)
at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:407)
at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:255)
at org.elasticsearch.common.settings.KeyStoreWrapper.save(KeyStoreWrapper.java:462)
at org.elasticsearch.bootstrap.Bootstrap.loadSecureSettings(Bootstrap.java:232)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:289)
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159)
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150)
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124)
at org.elasticsearch.cli.Command.main(Command.java:90)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:115)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92)
Refer to the log for complete error details.
3.10.0-862.14.4.el7.x86_64
It looks like you are using CentOS 7 or a distribution based on the Red Hat kernel.
Note that this kernel is pretty old and there are some known issues regarding the way the "kernel memory" is accounted within cgroups.
I have been able to reproduce your experience on a CentOS 7 distribution, pods were killed just after a few minutes of activity.
To solve this problem you have to use Docker 18.09.x and patch the kubelet as described here (don't forget to reboot). I did it myself and the OOMKiller has not been triggered so far.
Of course you also have the option to use another distribution.
Also note that disabling the kernel memory accounting by passing the cgroup.memory=nokmem argument to the kernel is not enough as the kubelet will enable it at runtime if not patched.
Regarding the issue with the keystore temp file, there is another issue open.
As a side note here is the command to disable kernel memory accounting at startup on CentOS 7:
sudo /sbin/grubby --update-kernel=ALL --args='cgroup.memory=nokmem'
You still have to patch the kubelet though.
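To see how much of the charged memory is kernel memory, the counters in the OOM report can be compared directly. The following Python sketch parses the three counter lines copied from the dmesg output posted earlier in this thread. The function names are hypothetical, and the ratio assumes that, with cgroup v1 kernel memory accounting enabled, kmem usage is also charged to the memory counter.

```python
import re

# Counter lines copied from the dmesg output posted earlier in this thread.
DMESG_SAMPLE = """\
[Wed Jun 19 09:15:41 2019] memory: usage 809536kB, limit 2097152kB, failcnt 159697
[Wed Jun 19 09:15:41 2019] memory+swap: usage 809980kB, limit 9007199254740988kB, failcnt 0
[Wed Jun 19 09:15:41 2019] kmem: usage 783896kB, limit 9007199254740988kB, failcnt 0
"""

# Matches the per-cgroup counter lines of a kernel OOM report.
_COUNTER = re.compile(
    r"\] (memory|memory\+swap|kmem): usage (\d+)kB, limit (\d+)kB, failcnt (\d+)"
)


def parse_cgroup_counters(text: str) -> dict:
    """Return {counter name: {usage_kb, limit_kb, failcnt}} (hypothetical helper)."""
    counters = {}
    for match in _COUNTER.finditer(text):
        name, usage, limit, failcnt = match.groups()
        counters[name] = {
            "usage_kb": int(usage),
            "limit_kb": int(limit),
            "failcnt": int(failcnt),
        }
    return counters


def kmem_share(counters: dict) -> float:
    """Fraction of the charged memory that is kernel memory."""
    return counters["kmem"]["usage_kb"] / counters["memory"]["usage_kb"]


if __name__ == "__main__":
    counters = parse_cgroup_counters(DMESG_SAMPLE)
    print(f"kmem share of charged memory: {kmem_share(counters):.1%}")
```

In the posted report almost all of the charged memory is kernel memory rather than the JVM's own pages, which fits the kernel memory accounting problem described above.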
Hi @barkbay, thanks for the advice.
Since changes to our k8s cluster cannot be made quickly, I tried deploying an Elasticsearch cluster with the official Helm charts, and it is stable so far:
3 master - 30 GB storage, 2 GB RAM
3 client - 30 GB storage, 2 GB RAM
3 data - 30 GB storage, 2 GB RAM
We already have a number of deployments in the same k8s cluster (including an Elasticsearch cluster we deployed with our own Helm charts 1.5 years ago), and we don't have such problems with any of them.
Is there any chance that we installed the operator in a way that corrupted it, and that this is now causing these issues?
Thanks
@barkbay Any idea why the eck-operator encounters this issue after a short period of time with minimal activity and the official ES helm charts do not? Is the eck-operator misbehaving in some way to cause the issue?
Hi,
I just did some tests with our master branch and it seems that the problem has been solved.
My guess is that in 0.8.x we were using a small additional program in the container, which is specific to ECK, and I think that this program was triggering the bug in the CentOS kernel.
Release 0.9 which is expected in a few days should fix the problem.
Closing for now, 0.9 should improve the situation, please reopen if you still have this issue.
Please reopen this!
I can confirm that with the mentioned CentOS kernel this happens on version 0.9.0 exactly as before.
With the CentOS kernel updated to the latest el-stable (5ish) and the kubelet rebuilt with kmem disabled, the goddamn ES continued to gobble up all the RAM it found.
It also happens on Ubuntu 18.04 with the 4.15 kernel.
Reproduced repeatedly over many nights of ingesting little data (300 MB to 30 GB) on Kubernetes versions 14 and 15 (installed via kubespray).
The constant RAM growth was largely unaffected by whether the cluster ingested a total of 300 MB or 30 GB overnight.
To make it clear: this problem with ES occurs only with the operator. A normal version 7.3 StatefulSet deployment via Helm does not cause problems.
@strzelecki-maciek Until #1716 is merged and released, you can apply a small env variable change to your current Elasticsearch resource (details here). I'm using this setting myself in production with success.
Closing as #1716 has been merged.