Elasticsearch version: 5.0.0-alpha5 (RC4)
Plugins installed: none
JVM version: 1.8.0_60
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
OS version: OS X El Capitan 10.11.6 (15G31)
Description of the problem including expected versus actual behavior:
_local_
I'm starting a new blank version of alpha5 with this config:
network.host: _local_
Everything starts correctly:
[2016-08-08 16:40:23,452][INFO ][node ] [] initializing ...
[2016-08-08 16:40:23,520][INFO ][env ] [7kX1QFh] using [1] data paths, mounts [[/ (/dev/disk1)]], net usable_space [30.1gb], net total_space [464.7gb], spins? [unknown], types [hfs]
[2016-08-08 16:40:23,520][INFO ][env ] [7kX1QFh] heap size [1.9gb], compressed ordinary object pointers [true]
[2016-08-08 16:40:23,521][INFO ][node ] [7kX1QFh] node name [7kX1QFh] derived from node ID; set [node.name] to override
[2016-08-08 16:40:23,522][INFO ][node ] [7kX1QFh] version[5.0.0-alpha5], pid[38688], build[d327dd4/2016-08-04T08:59:39.568Z], OS[Mac OS X/10.11.6/x86_64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_60/25.60-b23]
[2016-08-08 16:40:24,488][INFO ][io.netty.util.internal.PlatformDependent] Your platform does not provide complete low-level API for accessing direct buffers reliably. Unless explicitly requested, heap buffer will always be preferred to avoid potential system unstability.
[2016-08-08 16:40:24,492][INFO ][plugins ] [7kX1QFh] loaded module [aggs-matrix-stats]
[2016-08-08 16:40:24,492][INFO ][plugins ] [7kX1QFh] loaded module [ingest-common]
[2016-08-08 16:40:24,492][INFO ][plugins ] [7kX1QFh] loaded module [lang-expression]
[2016-08-08 16:40:24,493][INFO ][plugins ] [7kX1QFh] loaded module [lang-groovy]
[2016-08-08 16:40:24,493][INFO ][plugins ] [7kX1QFh] loaded module [lang-mustache]
[2016-08-08 16:40:24,493][INFO ][plugins ] [7kX1QFh] loaded module [lang-painless]
[2016-08-08 16:40:24,493][INFO ][plugins ] [7kX1QFh] loaded module [percolator]
[2016-08-08 16:40:24,493][INFO ][plugins ] [7kX1QFh] loaded module [reindex]
[2016-08-08 16:40:24,493][INFO ][plugins ] [7kX1QFh] loaded module [transport-netty3]
[2016-08-08 16:40:24,493][INFO ][plugins ] [7kX1QFh] loaded module [transport-netty4]
[2016-08-08 16:40:24,494][INFO ][plugins ] [7kX1QFh] no plugins loaded
[2016-08-08 16:40:26,363][INFO ][node ] [7kX1QFh] initialized
[2016-08-08 16:40:26,364][INFO ][node ] [7kX1QFh] starting ...
[2016-08-08 16:40:26,489][INFO ][transport ] [7kX1QFh] publish_address {127.0.0.1:9300}, bound_addresses {[fe80::1]:9300}, {[::1]:9300}, {127.0.0.1:9300}
[2016-08-08 16:40:26,494][WARN ][bootstrap ] [7kX1QFh] initial heap size [268435456] not equal to maximum heap size [2147483648]; this can cause resize pauses and prevents mlockall from locking the entire heap
[2016-08-08 16:40:26,494][WARN ][bootstrap ] [7kX1QFh] please set [discovery.zen.minimum_master_nodes] to a majority of the number of master eligible nodes in your cluster
[2016-08-08 16:40:29,553][INFO ][cluster.service ] [7kX1QFh] new_master {7kX1QFh}{7kX1QFhPSxaGAHLjRzxeXQ}{I3OnGNGsRGu92A57qXLIoQ}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2016-08-08 16:40:29,573][INFO ][http ] [7kX1QFh] publish_address {127.0.0.1:9200}, bound_addresses {[fe80::1]:9200}, {[::1]:9200}, {127.0.0.1:9200}
[2016-08-08 16:40:29,573][INFO ][node ] [7kX1QFh] started
[2016-08-08 16:40:29,580][INFO ][gateway ] [7kX1QFh] recovered [0] indices into cluster_state
_local_,_eth0:ipv4_
Then I change the network configuration with a bad network card name eth0
(does not exist on my Mac):
network.host: _local_,_eth0:ipv4_
It stops nicely with this expected message:
[2016-08-08 16:41:40,677][INFO ][node ] [7kX1QFh] starting ...
[2016-08-08 16:41:40,742][WARN ][bootstrap ] [] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupError: java.lang.IllegalArgumentException: No interface named 'eth0' found, got [name:lo0 (lo0), name:en0 (en0), name:awdl0 (awdl0)]
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:105)
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:96)
at org.elasticsearch.cli.SettingCommand.execute(SettingCommand.java:54)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:88)
at org.elasticsearch.cli.Command.main(Command.java:54)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:75)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:68)
Caused by: java.lang.IllegalArgumentException: No interface named 'eth0' found, got [name:lo0 (lo0), name:en0 (en0), name:awdl0 (awdl0)]
at org.elasticsearch.common.network.NetworkUtils.getAddressesForInterface(NetworkUtils.java:232)
at org.elasticsearch.common.network.NetworkService.resolveInternal(NetworkService.java:261)
at org.elasticsearch.common.network.NetworkService.resolveInetAddresses(NetworkService.java:220)
at org.elasticsearch.common.network.NetworkService.resolveBindHostAddresses(NetworkService.java:130)
at org.elasticsearch.transport.TcpTransport.bindServer(TcpTransport.java:569)
at org.elasticsearch.transport.netty4.Netty4Transport.doStart(Netty4Transport.java:181)
at org.elasticsearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:68)
at org.elasticsearch.transport.TransportService.doStart(TransportService.java:177)
at org.elasticsearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:68)
at org.elasticsearch.node.Node.start(Node.java:469)
at org.elasticsearch.bootstrap.Bootstrap.start(Bootstrap.java:193)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:257)
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:101)
... 6 more
[2016-08-08 16:41:41,699][INFO ][node ] [7kX1QFh] stopping ...
[2016-08-08 16:41:41,701][INFO ][node ] [7kX1QFh] stopped
[2016-08-08 16:41:41,701][INFO ][node ] [7kX1QFh] closing ...
[2016-08-08 16:41:41,710][INFO ][node ] [7kX1QFh] closed
_local_,_en0:ipv4_
Now, I change the configuration to:
network.host: _local_,_en0:ipv4_
When I start, I'm getting this error message which is confusing:
[2016-08-08 16:43:37,203][INFO ][node ] [7kX1QFh] starting ...
[2016-08-08 16:43:37,336][INFO ][transport ] [7kX1QFh] publish_address {192.168.0.48:9300}, bound_addresses {[fe80::1]:9300}, {[::1]:9300}, {127.0.0.1:9300}, {192.168.0.48:9300}
[2016-08-08 16:43:37,340][INFO ][bootstrap ] [7kX1QFh] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2016-08-08 16:43:37,343][WARN ][bootstrap ] [] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupError: java.lang.RuntimeException: bootstrap checks failed
initial heap size [268435456] not equal to maximum heap size [2147483648]; this can cause resize pauses and prevents mlockall from locking the entire heap
please set [discovery.zen.minimum_master_nodes] to a majority of the number of master eligible nodes in your cluster
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:105)
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:96)
at org.elasticsearch.cli.SettingCommand.execute(SettingCommand.java:54)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:88)
at org.elasticsearch.cli.Command.main(Command.java:54)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:75)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:68)
Caused by: java.lang.RuntimeException: bootstrap checks failed
initial heap size [268435456] not equal to maximum heap size [2147483648]; this can cause resize pauses and prevents mlockall from locking the entire heap
please set [discovery.zen.minimum_master_nodes] to a majority of the number of master eligible nodes in your cluster
at org.elasticsearch.bootstrap.BootstrapCheck.check(BootstrapCheck.java:132)
at org.elasticsearch.bootstrap.BootstrapCheck.check(BootstrapCheck.java:85)
at org.elasticsearch.bootstrap.BootstrapCheck.check(BootstrapCheck.java:65)
at org.elasticsearch.bootstrap.Bootstrap$5.validateNodeBeforeAcceptingRequests(Bootstrap.java:178)
at org.elasticsearch.node.Node.start(Node.java:471)
at org.elasticsearch.bootstrap.Bootstrap.start(Bootstrap.java:193)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:257)
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:101)
... 6 more
Suppressed: java.lang.IllegalStateException: initial heap size [268435456] not equal to maximum heap size [2147483648]; this can cause resize pauses and prevents mlockall from locking the entire heap
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at org.elasticsearch.bootstrap.BootstrapCheck.check(BootstrapCheck.java:133)
... 13 more
Suppressed: java.lang.IllegalStateException: please set [discovery.zen.minimum_master_nodes] to a majority of the number of master eligible nodes in your cluster
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at org.elasticsearch.bootstrap.BootstrapCheck.check(BootstrapCheck.java:133)
... 13 more
[2016-08-08 16:43:37,345][INFO ][node ] [7kX1QFh] stopping ...
[2016-08-08 16:43:37,360][INFO ][node ] [7kX1QFh] stopped
[2016-08-08 16:43:37,360][INFO ][node ] [7kX1QFh] closing ...
[2016-08-08 16:43:37,370][INFO ][node ] [7kX1QFh] closed
It looks strange to me that:
en0
IP V4 address: 192.168.0.48
@dadoonet it says in the log why it fails to start up:
bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
So it refuses to start up since it is bound to en0. Not sure where you are seeing that it's a WARN? It's only a WARN log message in the first example, when bound to localhost only.
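For future readers, the rule in that INFO line can be illustrated with a small sketch. This is not Elasticsearch's own code; it only assumes, from the wording of the log message, that the checks are enforced as soon as any bound or published transport address is neither loopback nor link-local. The helper names are hypothetical.

```python
import ipaddress


def is_loopback_or_link_local(addr: str) -> bool:
    """True for loopback (127.0.0.0/8, ::1) or link-local (169.254.0.0/16, fe80::/10)."""
    ip = ipaddress.ip_address(addr)
    return ip.is_loopback or ip.is_link_local


def enforces_bootstrap_checks(bound_addresses, publish_address) -> bool:
    """Hypothetical helper: True if any address is reachable beyond the local machine.

    Assumption taken from the log wording, not from the Elasticsearch source.
    """
    candidates = list(bound_addresses) + [publish_address]
    return not all(is_loopback_or_link_local(a) for a in candidates)


# Addresses taken from the two transport log lines in the report above.
local_only = ["fe80::1", "::1", "127.0.0.1"]
print(enforces_bootstrap_checks(local_only, "127.0.0.1"))  # False
print(enforces_bootstrap_checks(local_only + ["192.168.0.48"], "192.168.0.48"))  # True
```

With `network.host: _local_` every address is loopback or link-local, so the failed checks are only logged as warnings; adding `_en0:ipv4_` brings in 192.168.0.48 and the same checks become fatal.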
Oh! I see. Thanks. I missed the INFO part of the logs...
I wonder though if we should add somewhere an ERROR or FATAL log like "Bootstrap checks failed. Prevent from starting..."
Something along those lines.
I'm closing for now as everything works as expected.
Thanks @dakrone !
It says there's a startup error:
org.elasticsearch.bootstrap.StartupError: java.lang.RuntimeException: bootstrap checks failed
@jasontedor Sure. All the information is here and this is fine.
My point is that it's "just" a WARN. To me a WARN means that the service is working but might have unexpected behavior.
Here, we know that the service won't ever work. That's why I suggested changing the log level to ERROR or FATAL, which for the latter means that elasticsearch won't survive this fatal error.
But again, not a big deal to me as the information is here and we know perfectly what is happening.
The warn in
[2016-08-08 16:43:37,343][WARN ][bootstrap ] [] uncaught exception in thread [main]
is from elsewhere, it's from the uncaught exception handler. And that's the best that we can say there, right? We definitely can not call all uncaught exceptions fatal?
Worked on my AWS Linux Machine
cluster.name: myES_Cluster
node.name: ESNODE_CYR
node.master: true
node.data: true
transport.host: localhost
transport.tcp.port: 9300
http.port: 9200
network.host: 0.0.0.0
discovery.zen.minimum_master_nodes: 2
I tried this in elasticsearch.yml (key: value) and it worked fine for me. But it took 2 days to fix it :wink: :slight_smile: , working through the ES docs is so tough.
@cyrilcyril70: You added this comment on discuss and on at least 4 issues.
What is your intention?
@cyrilcyril70 You saved my life bro. I have been struggling with this annoying issue for days. The ES docs don't explain this clearly.
Your settings worked perfectly for me.
I just had to add
transport.host: localhost
transport.tcp.port: 9300
to my elasticsearch.yml
@neuronring I believe that we cover this situation in the bootstrap check docs. Do you feel that the explanation there is not adequate, or that there would be a more suitable place?
@jasontedor Yes Jason. I looked into that section, Development vs. Production, several times. To my knowledge it is not clear at all (maybe for a lot of others too; that's probably why I see so many questions about how to fix "bootstrap checks failed"). Especially for people migrating from 2.4 to 5.x it is like a nightmare. If you changed the docs and made the wording easy enough to understand, that would attract more users. In my org, we are lately finding ES difficult to use because the new versions lack clarity in the docs.
Just my opinion.
@neuronring I appreciate the feedback and I'm sorry that you struggled with this. Yet, I think I need some further guidance on what it is that you found problematic about the docs. We say this:
Note that HTTP can be configured independently of transport via http.host and transport.host; this can be useful for configuring a single instance to be reachable via HTTP for testing purposes without triggering production mode.
Can you help me understand how that can be improved?
@jasontedor Sure. I have difficulty understanding whether this setting is for a single-node Elasticsearch or for a cluster:
transport.host: localhost
transport.tcp.port: 9300
If I use the same setting on both nodes, they do not form a cluster. Both run as standalone nodes, and neither node identifies the other. Here is my elasticsearch.yml file. I have the same settings on both nodes:
cluster.name: elasticsearch
cluster.routing.allocation.awareness.force.cloud.values: zone
node.name: Elkprod1
node.master: true
node.data: true
node.max_local_storage_nodes: 1
path.conf: /usr/local/etc/elasticsearch
path.data: /data/elasticsearch
path.logs: /usr/local/var/log/elasticsearch
network.host: 0.0.0.0
http.port: 9200
transport.host: localhost
transport.tcp.port: 9300
gateway.expected_nodes: 1
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["10.1.1.10","10.1.1.11"]
@neuronring If you bind transport to localhost (transport.host), then you cannot form a cluster across machines. Therefore, if you want to form a cluster between these two nodes, you have to remove the transport.host setting. In this case, your node will be subject to the bootstrap checks and you have to address them. Your logs will tell you what the problems are.
Ugh.. then it is a deadlock scenario.. if I remove transport.host, ES won't start up.
I want to form a cluster and, at the same time, escape the bootstrap checks. Please help. Here is my log. Again.. bootstrap checks failed !!!
[2017-03-01T17:05:09,618][INFO ][o.e.p.PluginsService ] [opselk-197323431-1-198091806.Elkprod.apps-ops.com] no plugins loaded
[2017-03-01T17:05:11,358][INFO ][o.e.n.Node ] [opselk-197323431-1-198091806.Elkprod.apps-ops.com] initialized
[2017-03-01T17:05:11,358][INFO ][o.e.n.Node ] [opselk-197323431-1-198091806.Elkprod.apps-ops.com] starting ...
[2017-03-01T17:05:11,442][WARN ][i.n.u.i.MacAddressUtil ] Failed to find a usable hardware address from the network interfaces; using random bytes: bd:e2:6b:bb:af:90:b9:e9
[2017-03-01T17:05:11,497][INFO ][o.e.t.TransportService ] [opselk-197323431-1-198091806.Elkprod.apps-ops.com] publish_address {10.1.1.10:9300}, bound_addresses {0.0.0.0:9300}
[2017-03-01T17:05:11,501][INFO ][o.e.b.BootstrapChecks ] [opselk-197323431-1-198091806.Elkprod.apps-ops.com] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-03-01T17:05:11,503][ERROR][o.e.b.Bootstrap ] [opselk-197323431-1-198091806.Elkprod.apps-ops.com] node validation exception
bootstrap checks failed
max file descriptors [64000] for elasticsearch process is too low, increase to at least [65536]
system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk
[2017-03-01T17:05:11,505][INFO ][o.e.n.Node ] [opselk-197323431-1-198091806.Elkprod.apps-ops.com] stopping ...
[2017-03-01T17:05:11,583][INFO ][o.e.n.Node ] [opselk-197323431-1-198091806.Elkprod.apps-ops.com] stopped
[2017-03-01T17:05:11,584][INFO ][o.e.n.Node ] [opselk-197323431-1-198091806.Elkprod.apps-ops.com] closing ...
[2017-03-01T17:05:11,604][INFO ][o.e.n.Node ] [opselk-197323431-1-198091806.Elkprod.apps-ops.com] closed
Your issues are explained in the logs:
max file descriptors [64000] for elasticsearch process is too low, increase to at least [65536]
system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk
You need to increase the number of file descriptors, and disable system call filters if your kernel does not support the seccomp features that we need. This is covered in the docs that I linked to previously.
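For readers hitting the first of those two messages, here is a minimal sketch of how one might compare the current file descriptor limit with the threshold quoted in the log. It assumes a Unix-like system (Python's `resource` module is not available on Windows) and only reflects the limits of the shell or user it runs under, which may differ from what the Elasticsearch JVM actually reports. The helper name `fd_limit_ok` is made up for illustration.

```python
import resource

REQUIRED_FDS = 65536  # threshold quoted in the bootstrap check message above


def fd_limit_ok(soft_limit: int, required: int = REQUIRED_FDS) -> bool:
    """Hypothetical helper: True if the given limit meets the required minimum."""
    if soft_limit == resource.RLIM_INFINITY:
        # An unlimited descriptor count always satisfies the check.
        return True
    return soft_limit >= required


# Read the limits that apply to the current process (and its children).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft={soft} hard={hard} ok={fd_limit_ok(soft)}")
```

With the value from the log above, `fd_limit_ok(64000)` is false, which matches the failing check; the limit itself still has to be raised through the operating system's own mechanism for the user running Elasticsearch.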
same error......can not start cluster
I am also getting the same error.
The error was clearly described by users in many issues. Why are you looking at the number of file descriptors or everything else, but not at the "transport" issue?!
What should we do so that master-eligible nodes can see each other and elect their master?
Thanks
I think you misunderstand that the bootstrap checks apply if the node can form a cluster with another node: see the docs.
You have to address the issues the checks are reporting.
What I described in my comment was the "Transport" module, not the "Bootstrap checks" (deactivating the latter was just to see if we could find a workaround for the main issue, which is the "Transport configuration" for a cluster).
I understood that the "bootstrap checks" verify the ES configuration (such as the Transport module), not whether or not it is possible to form a cluster. In fact, my reading was based on the official elastic docs, like the Transport module page and also the link you've provided.
See development vs. production mode in the docs.
Here is what I did:
I added http.bind_host to my elasticsearch.yml; this enabled me to have a single node reachable on the network.
My config ...
http.port: 9201
http.bind_host: 192.168.1.172
other configs:
in /etc/hosts
192.168.1.172 myTestNode
I was able to curl the node from another local box:
hope this helps
192.168.1.171 templogger
pedro@templogger:~$ curl -vk 192.168.1.172:9201
> GET / HTTP/1.1
> User-Agent: curl/7.38.0
> Host: 192.168.1.172:9201
> Accept: */*
>
< HTTP/1.1 200 OK
< content-type: application/json; charset=UTF-8
< content-length: 327
<
{
"name" : "FUc5C5P",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "-eC4NGb_Qa-hCUsmmWK0mg",
"version" : {
"number" : "5.6.0",
"build_hash" : "781a835",
"build_date" : "2017-09-07T03:09:58.087Z",
"build_snapshot" : false,
"lucene_version" : "6.6.0"
},
"tagline" : "You Know, for Search"
}
* Connection #0 to host 192.168.1.172 left intact
http.port: 9201
http.publish_host: 192.168.1.172 #by itself does not work
http.host: 192.168.1.172 #works alone
I think I am a bit confused here too. I need to be able to cluster two servers with elasticsearch installed. If I set transport.publish_host to a loopback address, then it is unreachable outside that server. If I set it to a non-loopback address, then as far as I have been able to tell you have to set discovery.type: single-node to get it to boot at all.
I understand that it sets it to production mode, and thus bootstrap checks are enforced, but why is it that the bootstrap checks fail when it is assigned to a public IP?
Thanks!
@AddoSolutions, @pedrosk, and future readers, please note that this was closed many months ago, and is unlikely to receive further attention (this reply notwithstanding). The various issues above are all apparently covered by this reply:
You have to address the issues the checks are reporting
If you would like help with interpreting the output from the bootstrap checks, or any other log messages, please ask a question in the discussion forum, including a full copy of the log, and we'll be able to assist. If you have ideas for improvements to the messages and/or the documentation pertaining to bootstrap checks then please open a PR. In either case, this thread isn't the best place for further discussion.