Elasticsearch: Reindex task encounters NPE in org.apache.http.nio.pool.RouteSpecificPool

Created on 12 Jul 2017 · 9 Comments · Source: elastic/elasticsearch

Elasticsearch version: 5.4.3

Plugins installed: [x-pack]

JVM version (java -version):
OpenJDK 64-Bit Server VM (build 25.131-b11, mixed mode)

OS version (uname -a if on a Unix-like system):
Linux ip-10-124-1-211 4.9.32-15.41.amzn1.x86_64 #1 SMP Thu Jun 22 06:20:54 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:
After making roughly 200 calls to the remote reindex API:

POST _reindex
{
  "dest": {
    "index": "some-index",
    "version_type": "external"
  },
  "source": {
    "index": "some-index",
    "remote": {
      "host": "http://some-server.com:9200"
    }
  }
}

I started seeing the following response:

{"error":{"root_cause":[{"type":"connect_exception","reason":null}],"type":"connect_exception","reason":null},"status":500}

Looking in the log, I see:

[2017-07-12T15:22:16,214][WARN ][r.suppressed             ] path: /_reindex, params: {}
java.net.ConnectException: null
    at org.apache.http.nio.pool.RouteSpecificPool.timeout(RouteSpecificPool.java:168) ~[?:?]
    at org.apache.http.nio.pool.AbstractNIOConnPool.requestTimeout(AbstractNIOConnPool.java:561) ~[?:?]
    at org.apache.http.nio.pool.AbstractNIOConnPool$InternalSessionRequestCallback.timeout(AbstractNIOConnPool.java:822) ~[?:?]
    at org.apache.http.impl.nio.reactor.SessionRequestImpl.timeout(SessionRequestImpl.java:183) ~[?:?]
    at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processTimeouts(DefaultConnectingIOReactor.java:210) ~[?:?]
    at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:155) ~[?:?]
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348) ~[?:?]
    at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192) ~[?:?]
    at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) ~[?:?]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]

If I replace the hostname in the remote server URL with its IP, it starts working again. It appears that ES is caching the connection/session to the remote host (via apache.http) and that the cached state has become invalid. It's also worth noting that the server is behind a load balancer (ELB), whose IP can change over time.
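One way to check the stale-address theory is to print what a JVM currently resolves for the remote host and compare it with the address that works when used directly. This is only a minimal diagnostic sketch: the class name `ResolveCheck` and the method `resolve` are hypothetical, and a freshly started JVM begins with an empty DNS cache, so its answer shows the current DNS state rather than whatever the running Elasticsearch node has cached.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class ResolveCheck {
    // Hypothetical helper: returns every address the JVM resolves for the host.
    static List<String> resolve(String host) throws UnknownHostException {
        List<String> addresses = new ArrayList<>();
        for (InetAddress address : InetAddress.getAllByName(host)) {
            addresses.add(address.getHostAddress());
        }
        return addresses;
    }

    public static void main(String[] args) throws UnknownHostException {
        // Pass the remote host as the first argument, e.g. some-server.com.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(host + " -> " + resolve(host));
    }
}
```

If the addresses printed here differ from the one the node keeps failing against, that would be consistent with a cached DNS answer, though it does not prove it.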

Labels: :Core/Features/Java Low Level REST Client, >bug

agouldsfi

👍4

All 9 comments

Lovely. I'm initially classifying this as a bug in the low level rest client because that is the thing we use to make that connection. That is the bit that'd have to modify how it sets up the http client.

nik9000 on 12 Jul 2017

I see this same message if the elasticsearch instance is stopped and restarted. The rest client will reconnect and even write to elastic but these errors continue to be thrown. I am calling the bulk API.

ajrnz on 28 Jul 2017

I ran into this same error as well. The issue appears to have arisen when a node left the cluster (and was subsequently replaced) during the reindexing process. Now any subsequent calls to the reindex API return the same error.

rlvoyer on 11 Aug 2017

I experience the same problem on 6.1.3 while reindexing from an AWS-managed cluster to a self-managed one.

After each reindex I have to restart the service in order to issue another reindex.

vvucetic on 17 Feb 2018

👍1

I'm seeing the same errors when trying to migrate from 2.x to 5.x. We're not using a hostname but an IP address.

ju5t on 22 Mar 2018

I'm seeing this error on an AWS hosted ES 6.2 instance. But it's not specific to re-indexing. I'm encountering it while running integration tests, and in the process am deleting and re-creating my test indexes.

Before switching to the ES REST client API, I was just constructing the headers/body by hand and calling ES via HttpURLConnection, but that was just while we were experimenting. It was working fine! But of course, it's not scalable to do things that way, so I switched to the REST API.

It seems this bug is taking a very long time to get addressed? The first report on this was 10 months ago.

atulsudhalkar on 31 May 2018

👍1

I think it's hindering the use of Amazon ELBs as the remote address for reindex operations, since the IPs behind an ELB keep changing.

palashkulsh on 1 Jun 2018

👍4

We hit this issue trying to reindex a logging cluster (ES 5.6.2) sitting behind an ELB to a new ES 6.3.2 cluster, and we had to restart ES on the target client node to get things sorted. Let's hope this gets fixed sometime soon, as tracking down this issue was a doozy.

jdoss on 21 Jan 2019

I fixed this problem by setting networkaddress.cache.ttl=60 (was 0 - never expire) in $JAVA_HOME/jre/lib/security/java.security. My ES cluster was behind an AWS ELB, whose IP changes often.
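For a standalone Java client that cannot edit java.security, the same security property can also be set in code. The following is a minimal sketch, assuming it runs before the JVM performs its first hostname lookup, since the JDK reads its address cache policy once; the class name `DnsCacheTtl` and the helper `setPositiveTtl` are hypothetical. As a side note on the values: in the JDK's documented semantics, -1 means cache forever and 0 means do not cache.

```java
import java.security.Security;

public class DnsCacheTtl {
    // Hypothetical helper: sets the JVM-wide TTL (in seconds) for successful
    // DNS lookups and returns the value now stored in the security property.
    static String setPositiveTtl(int seconds) {
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(seconds));
        return Security.getProperty("networkaddress.cache.ttl");
    }

    public static void main(String[] args) {
        // 60 seconds matches the value reported to work in the comment above.
        System.out.println("networkaddress.cache.ttl=" + setPositiveTtl(60));
    }
}
```

This only changes how long resolved addresses are cached; it does not by itself reset connections that the HTTP client pool already holds.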

vvucetic on 1 Feb 2019

👍1
