Upgrading our logging stack from v1.5.1 to v3.6.0 through Ansible runs to completion, but the ES containers do not deploy successfully.
I thought the data might be corrupted, so I wiped the storage for the ES containers and tried again, which made no difference. I then tried a fresh install and hit the same problem. v1.5.1 works fine, but I would like to keep the logging stack aligned with the cluster version.
I'm not sure if this is the right place to report this, or if someone else maintains the v3.6.0 logging images (if the problem lies there). Any help would be appreciated.
$ ansible --version
ansible 2.3.2.0
config file = /Users/hef/work/openshift-ansible/ansible.cfg
configured module search path = Default w/o overrides
python version = 2.7.13 (default, Jul 18 2017, 09:17:00) [GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42)]
Upgrade from v1.5.1 to v3.6.0 (with or without existing data), or a fresh install of v3.6.0.
git checkout release-3.6
git pull --rebase (to update)
ansible-playbook playbooks/byo/openshift-cluster/openshift-logging.yml
(Also tried on the master branch, with no luck.)
A successful install and/or upgrade of the container images in the logging project to v3.6.0, plus any other changes necessary to bring logging up to a v3.6.0 cluster.
The ES containers do not come up (they crash-loop), with the following output:
[2017-09-21 19:09:40,650][INFO ][container.run ] Begin Elasticsearch startup script
[2017-09-21 19:09:40,663][INFO ][container.run ] Comparing the specified RAM to the maximum recommended for Elasticsearch...
[2017-09-21 19:09:40,664][INFO ][container.run ] Inspecting the maximum RAM available...
[2017-09-21 19:09:40,668][INFO ][container.run ] ES_HEAP_SIZE: '1024m'
[2017-09-21 19:09:40,669][INFO ][container.run ] Setting heap dump location /elasticsearch/persistent/heapdump.hprof
[2017-09-21 19:09:40,672][INFO ][container.run ] Checking if Elasticsearch is ready on https://localhost:9200
Exception in thread "main" java.lang.IllegalArgumentException: Unknown Discovery type [kubernetes]
at org.elasticsearch.discovery.DiscoveryModule.configure(DiscoveryModule.java:100)
at <<<guice>>>
at org.elasticsearch.node.Node.<init>(Node.java:213)
at org.elasticsearch.node.Node.<init>(Node.java:140)
at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:143)
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:194)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:286)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:45)
Refer to the log for complete error details.
$ cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)
@rhefner Can you provide the output from oc get configmap/logging-elasticsearch -o yaml?
cc @portante @richm
@ewolinetz: Yes, sir. Here you go: https://gist.github.com/rhefner/3d949f7b4074920636a69f6688c121bf
By the way, after crash-looping for a while, the pods output this:
Comparing the specificed RAM to the maximum recommended for ElasticSearch...
Inspecting the maximum RAM available...
ES_JAVA_OPTS: '-Dmapper.allow_dots_in_name=true -Xms128M -Xmx1024m'
Exception in thread "main" java.lang.IllegalArgumentException: Could not resolve placeholder 'HAS_DATA'
at org.elasticsearch.common.property.PropertyPlaceholder.parseStringValue(PropertyPlaceholder.java:128)
at org.elasticsearch.common.property.PropertyPlaceholder.replacePlaceholders(PropertyPlaceholder.java:81)
at org.elasticsearch.common.settings.Settings$Builder.replacePropertyPlaceholders(Settings.java:1179)
at org.elasticsearch.node.internal.InternalSettingsPreparer.initializeSettings(InternalSettingsPreparer.java:131)
at org.elasticsearch.node.internal.InternalSettingsPreparer.prepareEnvironment(InternalSettingsPreparer.java:100)
at org.elasticsearch.common.cli.CliTool.<init>(CliTool.java:107)
at org.elasticsearch.common.cli.CliTool.<init>(CliTool.java:100)
at org.elasticsearch.bootstrap.BootstrapCLIParser.<init>(BootstrapCLIParser.java:48)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:242)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:45)
Refer to the log for complete error details.
Checking if Elasticsearch is ready on https://localhost:9200 ..
I'm having the same issue here. As far as I know, this setting in elasticsearch.yml:
discovery.type: kubernetes
should be:
discovery.zen.hosts_provider: kubernetes
I believe this comes from a change in the kubernetes plugin.
After changing this, I got further, but I now get searchguard errors saying it was not initialized.
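For anyone who wants to script the edit described above, here is a minimal Python sketch. It assumes the setting appears as a flat dotted key (`discovery.type: kubernetes`), exactly as quoted in this comment; if the config uses a nested YAML form instead, a real YAML parser would be needed. The function name `use_zen_hosts_provider` is made up for illustration, and this only performs the text substitution, it does not confirm that the change is the right fix for the v3.6.0 images.

```python
import re

# Matches a line of the form "discovery.type: kubernetes", keeping any
# leading indentation. Only the flat dotted-key form is handled (assumption).
_DISCOVERY_TYPE = re.compile(
    r"^([ \t]*)discovery\.type:[ \t]*kubernetes[ \t]*$", re.MULTILINE
)


def use_zen_hosts_provider(config_text: str) -> str:
    """Return config_text with the discovery setting rewritten.

    Lines that do not match (for example a different discovery type)
    are left untouched.
    """
    return _DISCOVERY_TYPE.sub(
        r"\1discovery.zen.hosts_provider: kubernetes", config_text
    )


if __name__ == "__main__":
    sample = "cluster.name: logging-es\ndiscovery.type: kubernetes\n"
    print(use_zen_hosts_provider(sample))
```

The input would be the elasticsearch.yml text taken from the configmap; the result still has to be written back and the pods redeployed by hand.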
@ttindell2: Good call, changing the discovery type to zen.hosts_provider got me further as well. Now, ES is just timing out for me:
[2017-09-21 21:56:36,810][INFO ][container.run ] Begin Elasticsearch startup script
[2017-09-21 21:56:36,826][INFO ][container.run ] Comparing the specified RAM to the maximum recommended for Elasticsearch...
[2017-09-21 21:56:36,827][INFO ][container.run ] Inspecting the maximum RAM available...
[2017-09-21 21:56:36,831][INFO ][container.run ] ES_HEAP_SIZE: '1024m'
[2017-09-21 21:56:36,833][INFO ][container.run ] Setting heap dump location /elasticsearch/persistent/heapdump.hprof
[2017-09-21 21:56:36,837][INFO ][container.run ] Checking if Elasticsearch is ready on https://localhost:9200
[2017-09-21 22:02:06,481][ERROR][container.run ] Timed out waiting for Elasticsearch to be ready
cat: elasticsearch_connect_log.txt: No such file or directory
@rhefner
A very useful log is on the pod at /elasticsearch/${CLUSTER_NAME}/logs/
There are a few logs in there, but only one will have any content. Could you check what that log says?
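If it helps to pick out the one log that actually has content, here is a small Python sketch. It assumes the logs directory is readable from wherever the script runs (for example after copying it off the pod), and the helper name `non_empty_logs` is hypothetical.

```python
from pathlib import Path


def non_empty_logs(log_dir):
    """Return (path, size_in_bytes) for each non-empty regular file
    directly inside log_dir, largest first."""
    entries = [
        (path, path.stat().st_size)
        for path in Path(log_dir).iterdir()
        if path.is_file()
    ]
    # Drop empty files, then sort so the log with the most content is first.
    return sorted(
        (entry for entry in entries if entry[1] > 0),
        key=lambda entry: entry[1],
        reverse=True,
    )


if __name__ == "__main__":
    import sys

    # Usage: python find_logs.py <path-to-copied-logs-directory>
    for path, size in non_empty_logs(sys.argv[1]):
        print(f"{size:>10}  {path}")
```

The first entry printed is the file worth reading; empty files are skipped.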
@ttindell2 Looks like some searchguard issues; not sure if they're the same as yours.
sh-4.2$ cat /elasticsearch/logging-es/logs/logging-es.log
[2017-09-22 00:41:40,797][INFO ][node ] [logging-es-data-master-bk8ocbgu] version[2.4.4], pid[1], build[fcbb46d/2017-01-03T11:33:16Z]
[2017-09-22 00:41:40,798][INFO ][node ] [logging-es-data-master-bk8ocbgu] initializing ...
[2017-09-22 00:41:42,415][INFO ][plugins ] [logging-es-data-master-bk8ocbgu] modules [reindex, lang-expression, lang-groovy], plugins [openshift-elasticsearch, cloud-kubernetes], sites []
[2017-09-22 00:41:42,530][INFO ][env ] [logging-es-data-master-bk8ocbgu] using [1] data paths, mounts [[/elasticsearch/persistent (/dev/loop0)]], net usable_space [99.9gb], net total_space [99.9gb], spins? [possibly], types [xfs]
[2017-09-22 00:41:42,530][INFO ][env ] [logging-es-data-master-bk8ocbgu] heap size [989.8mb], compressed ordinary object pointers [true]
[2017-09-22 00:41:43,545][INFO ][http ] [logging-es-data-master-bk8ocbgu] Using [org.elasticsearch.http.netty.NettyHttpServerTransport] as http transport, overridden by [search-guard2]
[2017-09-22 00:41:43,816][INFO ][transport ] [logging-es-data-master-bk8ocbgu] Using [com.floragunn.searchguard.transport.SearchGuardTransportService] as transport service, overridden by [search-guard2]
[2017-09-22 00:41:43,817][INFO ][transport ] [logging-es-data-master-bk8ocbgu] Using [com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] as transport, overridden by [search-guard-ssl]
[2017-09-22 00:41:48,511][INFO ][io.fabric8.elasticsearch.plugin.kibana.IndexMappingLoader] Trying to load Kibana mapping for io.fabric8.elasticsearch.kibana.mapping.app from plugin: /usr/share/elasticsearch/index_patterns/com.redhat.viaq-openshift.index-pattern.json
[2017-09-22 00:41:48,516][INFO ][io.fabric8.elasticsearch.plugin.kibana.IndexMappingLoader] Trying to load Kibana mapping for io.fabric8.elasticsearch.kibana.mapping.ops from plugin: /usr/share/elasticsearch/index_patterns/com.redhat.viaq-openshift.index-pattern.json
[2017-09-22 00:41:48,517][INFO ][io.fabric8.elasticsearch.plugin.kibana.IndexMappingLoader] Trying to load Kibana mapping for io.fabric8.elasticsearch.kibana.mapping.empty from plugin: /usr/share/elasticsearch/index_patterns/com.redhat.viaq-openshift.index-pattern.json
[2017-09-22 00:41:48,698][INFO ][node ] [logging-es-data-master-bk8ocbgu] initialized
[2017-09-22 00:41:48,698][INFO ][node ] [logging-es-data-master-bk8ocbgu] starting ...
[2017-09-22 00:41:48,921][INFO ][discovery ] [logging-es-data-master-bk8ocbgu] logging-es/ENzlPG2kTy2jfumf_9u80w
[2017-09-22 00:42:18,922][WARN ][discovery ] [logging-es-data-master-bk8ocbgu] waited for 30s and no initial state was set by the discovery
[2017-09-22 00:42:19,105][INFO ][http ] [logging-es-data-master-bk8ocbgu] publish_address {10.1.0.245:9200}, bound_addresses {[::]:9200}
[2017-09-22 00:42:19,105][INFO ][node ] [logging-es-data-master-bk8ocbgu] started
[2017-09-22 00:42:19,125][WARN ][discovery.zen.ping.unicast] [logging-es-data-master-bk8ocbgu] failed to send ping to [{#zen_unicast_6#}{::1}{[::1]:9300}]
SendRequestTransportException[[][[::1]:9300][internal:discovery/zen/unicast]]; nested: NodeNotConnectedException[[][[::1]:9300] Node not connected];
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:340)
at com.floragunn.searchguard.transport.SearchGuardTransportService.sendRequest(SearchGuardTransportService.java:88)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPingRequestToNode(UnicastZenPing.java:440)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:426)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:240)
at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:106)
at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:84)
at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:945)
at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:360)
at org.elasticsearch.discovery.zen.ZenDiscovery.access$4400(ZenDiscovery.java:96)
at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1296)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: NodeNotConnectedException[[][[::1]:9300] Node not connected]
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:1141)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:830)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:329)
... 13 more
[2017-09-22 00:42:31,213][WARN ][discovery.zen.ping.unicast] [logging-es-data-master-bk8ocbgu] failed to send ping to [{#zen_unicast_6#}{::1}{[::1]:9300}]
SendRequestTransportException[[][[::1]:9300][internal:discovery/zen/unicast]]; nested: NodeNotConnectedException[[][[::1]:9300] Node not connected];
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:340)
at com.floragunn.searchguard.transport.SearchGuardTransportService.sendRequest(SearchGuardTransportService.java:88)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPingRequestToNode(UnicastZenPing.java:440)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:426)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:240)
at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:106)
at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:84)
at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:945)
at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:360)
at org.elasticsearch.discovery.zen.ZenDiscovery.access$4400(ZenDiscovery.java:96)
at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1296)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: NodeNotConnectedException[[][[::1]:9300] Node not connected]
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:1141)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:830)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:329)
... 13 more
[2017-09-22 00:42:48,922][ERROR][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] Failure while checking .searchguard.logging-es-data-master-bk8ocbgu index MasterNotDiscoveredException[null]
MasterNotDiscoveredException[null]
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$5.onTimeout(TransportMasterNodeAction.java:234)
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:236)
at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:816)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:43:10,305][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:43:18,927][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:43:19,323][WARN ][discovery.zen.ping.unicast] [logging-es-data-master-bk8ocbgu] failed to send ping to [{#zen_unicast_6#}{::1}{[::1]:9300}]
SendRequestTransportException[[][[::1]:9300][internal:discovery/zen/unicast]]; nested: NodeNotConnectedException[[][[::1]:9300] Node not connected];
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:340)
at com.floragunn.searchguard.transport.SearchGuardTransportService.sendRequest(SearchGuardTransportService.java:88)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPingRequestToNode(UnicastZenPing.java:440)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:426)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:240)
at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:106)
at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:84)
at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:945)
at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:360)
at org.elasticsearch.discovery.zen.ZenDiscovery.access$4400(ZenDiscovery.java:96)
at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1296)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: NodeNotConnectedException[[][[::1]:9300] Node not connected]
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:1141)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:830)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:329)
... 13 more
[2017-09-22 00:43:51,929][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:44:24,931][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:44:46,504][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:44:57,933][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:45:01,527][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:45:30,934][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:46:03,936][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:46:16,681][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:46:36,937][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:46:52,738][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:47:09,939][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:47:25,788][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:47:28,792][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:47:31,796][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:47:42,940][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
[2017-09-22 00:47:46,822][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:48:04,846][ERROR][com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] [logging-es-data-master-bk8ocbgu] SSL Problem renegotiation attempted by peer; closing the connection
javax.net.ssl.SSLException: renegotiation attempted by peer; closing the connection
at org.jboss.netty.handler.ssl.SslHandler.handleRenegotiation(SslHandler.java:1368)
at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1248)
at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2017-09-22 00:48:15,944][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-bk8ocbgu] index '.searchguard.logging-es-data-master-bk8ocbgu' not healthy yet, we try again ... (Reason: no response)
@rhefner
Yep, same exact issue here. I haven't found any solution to it yet. Rolling back to v1.5.1 didn't help either.
@wozniakjan any ideas?
You have an updated openshift-ansible but an old ES image. If you are getting the ES image from https://hub.docker.com/r/openshift/origin-logging-elasticsearch/tags/, it looks like only latest has been updated. I recommend not changing anything in the ES config map and pulling the latest ES image.
If you want some background on what exactly is happening, or a solution other than updating the ES images, read on. In September, we introduced a new type of master discovery in the ES images, by label and port, because discovery by service didn't work well with the readiness probe.
It has relevant changes in:
1) openshift-ansible - https://github.com/openshift/openshift-ansible/pull/5209
2) ES image - https://github.com/openshift/origin-aggregated-logging/pull/609
If you don't want to update the ES image then you need to:
1) oc edit dc logging-es-data-master-... for each ES DeploymentConfig and remove the part starting with readinessProbe:
2) oc edit cm logging-elasticsearch and change
    cloud:
      kubernetes:
        pod_label: ${POD_LABEL}
        pod_port: 9300
        namespace: ${NAMESPACE}
to
    cloud:
      kubernetes:
        service: ${SERVICE_DNS}
        namespace: ${NAMESPACE}
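For anyone who wants to script the config map change rather than edit it by hand, here is a minimal sketch of the same transformation on an already-parsed config. It assumes the elasticsearch.yml content has been loaded into a plain Python dict by some YAML parser; the function name to_service_discovery is hypothetical and is not part of openshift-ansible or the ES image.

```python
import copy


def to_service_discovery(es_config):
    """Return a copy of es_config that uses service-based discovery.

    Hypothetical helper: mirrors the manual edit described above by
    dropping pod_label/pod_port and adding the service key.
    """
    updated = copy.deepcopy(es_config)  # leave the caller's dict untouched
    k8s = updated.setdefault("cloud", {}).setdefault("kubernetes", {})
    k8s.pop("pod_label", None)
    k8s.pop("pod_port", None)
    k8s["service"] = "${SERVICE_DNS}"
    k8s.setdefault("namespace", "${NAMESPACE}")
    return updated


# Label-and-port discovery block, as written by the newer openshift-ansible.
new_style = {
    "cloud": {
        "kubernetes": {
            "pod_label": "${POD_LABEL}",
            "pod_port": 9300,
            "namespace": "${NAMESPACE}",
        }
    }
}

old_style = to_service_discovery(new_style)
print(old_style)
```

This only covers the config map half of the workaround; the readinessProbe section still has to be removed from each ES DeploymentConfig separately.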
Reverted elasticsearch.yml back to what it was and pulled latest ES container. Cluster is now in yellow state. Cluster is ok now.
@rhefner does the suggestion above resolve the issue you are seeing as well?
@ewolinetz Yes, it looks like it did, indeed.
Closing, since pulling the latest image resolved the issue.
@ewolinetz will there be an updated tagged image? I'm not happy about running latest in production :(
@mhutter yes, there will be. you will not be expected to run with the latest tag in production.
@mhutter you can now use v3.6.1 https://hub.docker.com/r/openshift/origin-logging-elasticsearch/tags/
Just tried a new origin deployment switching to the v3.6.1 images and ES is failing to start.
This was done with the ansible installer with these definitions:
openshift_release=v3.6
openshift_hosted_logging_deployer_version=v3.6.1
This is what is seen in the logs of the logging-es-data-master pod
[2017-11-01 15:10:02,491][INFO ][container.run ] Begin Elasticsearch startup script
[2017-11-01 15:10:02,498][INFO ][container.run ] Comparing the specified RAM to the maximum recommended for Elasticsearch...
[2017-11-01 15:10:02,499][INFO ][container.run ] Inspecting the maximum RAM available...
[2017-11-01 15:10:02,503][INFO ][container.run ] ES_HEAP_SIZE: '4096m'
[2017-11-01 15:10:02,506][INFO ][container.run ] Setting heap dump location /elasticsearch/persistent/heapdump.hprof
[2017-11-01 15:10:02,509][INFO ][container.run ] Checking if Elasticsearch is ready on https://localhost:9200
Exception in thread "main" java.lang.IllegalArgumentException: Unknown Discovery type [kubernetes]
at org.elasticsearch.discovery.DiscoveryModule.configure(DiscoveryModule.java:100)
at <<<guice>>>
at org.elasticsearch.node.Node.<init>(Node.java:213)
at org.elasticsearch.node.Node.<init>(Node.java:140)
at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:143)
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:194)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:286)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:45)
Refer to the log for complete error details.
@tdudgeon thanks! Two details:
1) openshift_hosted_logging_deployer_version was deprecated in https://github.com/openshift/openshift-ansible/pull/5176; please try openshift_logging_image_version=v3.6.1 instead
2) unfortunately, our release engineers may have pushed 3.6.0 as 3.6.1 (judging from the identical sha256), so the only usable tag remains latest
We have introduced and released a new tag, v3.6. This tag will be updated regularly, so you will no longer have to wait for release engineers to push a new image. More info here: https://github.com/openshift/origin-aggregated-logging/pull/758
This should be working now:
openshift_logging_image_version=v3.6
Reproducible on v3.7:
[root@master ~]# oc version
oc v3.7.0+7ed6862
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://openshift2.example.com
openshift v3.7.0+7ed6862
kubernetes v1.7.6+a08f5eeb62
The Elasticsearch config map was not showing the latest discovery algorithm; it contained cloud: kubernetes: service: ${SERVICE_DNS}, namespace: ${NAMESPACE}.
After manually updating the config map, the container had the following logs:
sh-4.2$ tail -f /elasticsearch/logging-es/logs/logging-es.log
[2018-01-23 10:28:35,897][INFO ][node ] [logging-es-data-master-96f1ifqf] started
[2018-01-23 10:29:05,798][ERROR][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-96f1ifqf] Failure while checking .searchguard.logging-es-data-master-96f1ifqf index MasterNotDiscoveredException[null]
MasterNotDiscoveredException[null]
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$5.onTimeout(TransportMasterNodeAction.java:234)
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:236)
at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:816)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2018-01-23 10:29:35,807][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-96f1ifqf] index '.searchguard.logging-es-data-master-96f1ifqf' not healthy yet, we try again ... (Reason: no response)
[2018-01-23 10:30:08,808][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-96f1ifqf] index '.searchguard.logging-es-data-master-96f1ifqf' not healthy yet, we try again ... (Reason: no response)
[2018-01-23 10:30:41,810][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-96f1ifqf] index '.searchguard.logging-es-data-master-96f1ifqf' not healthy yet, we try again ... (Reason: no response)
[2018-01-23 10:31:03,307][WARN ][discovery.zen.ping.unicast] [logging-es-data-master-96f1ifqf] failed to send ping to [{#zen_unicast_6#}{::1}{[::1]:9300}]
SendRequestTransportException[[][[::1]:9300][internal:discovery/zen/unicast]]; nested: NodeNotConnectedException[[][[::1]:9300] Node not connected];
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:340)
at com.floragunn.searchguard.transport.SearchGuardTransportService.sendRequest(SearchGuardTransportService.java:88)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPingRequestToNode(UnicastZenPing.java:440)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:426)
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:240)
at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:106)
at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:84)
at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:945)
at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:360)
at org.elasticsearch.discovery.zen.ZenDiscovery.access$4400(ZenDiscovery.java:96)
at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1296)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: NodeNotConnectedException[[][[::1]:9300] Node not connected]
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:1141)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:830)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:329)
... 13 more
[2018-01-23 10:31:14,812][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-96f1ifqf] index '.searchguard.logging-es-data-master-96f1ifqf' not healthy yet, we try again ... (Reason: no response)
[2018-01-23 10:31:47,814][WARN ][com.floragunn.searchguard.action.configupdate.TransportConfigUpdateAction] [logging-es-data-master-96f1ifqf] index '.searchguard.logging-es-data-master-96f1ifqf' not healthy yet, we try again ... (Reason: no response)