Elasticsearch: transport_serialization_exception in _nodes API with gcs.client.default.credentials_file

Created on 8 May 2018 · 5 Comments · Source: elastic/elasticsearch

Elasticsearch version (bin/elasticsearch --version):

sudo /usr/share/elasticsearch/bin/elasticsearch --version
Version: 5.5.2, Build: b2f0c09/2017-08-14T12:33:14.154Z, JVM: 1.8.0_144

Plugins installed: [repository-gcs]

JVM version (java -version):
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)

OS version (uname -a if on a Unix-like system):

4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:

The _nodes API returns a failed_node_exception. I expect info from the remote nodes, but it is missing from the response.

{
  "_nodes": {
    "total": 2,
    "successful": 1,
    "failed": 1,
    "failures": [
      {
        "type": "failed_node_exception",
        "reason": "Failed node [gqyW3xM3QYazVDnvsN6g4g]",
        "caused_by": {
          "type": "transport_serialization_exception",
          "reason": "Failed to deserialize response of type [org.elasticsearch.action.admin.cluster.node.info.NodeInfo]",
          "caused_by": {
            "type": "illegal_state_exception",
            "reason": "unexpected byte [0x03]"
          }
        }
      }
    ]
  },
  "cluster_name": "elasticsearch",
  "nodes": {
... everything normal from here

Steps to reproduce:

  1. Install a plain elasticsearch cluster with at least two nodes.
  2. Install the repository-gcs plugin with a _Service Account_ as described in https://www.elastic.co/guide/en/elasticsearch/plugins/current/repository-gcs-usage.html
  3. Obtain the credentials file and add it on each node with elasticsearch-keystore add-file gcs.client.default.credentials_file credentials.json
  4. Start the elasticsearch nodes.
  5. Run curl http://localhost:9200/_nodes and see the transport_serialization_exception.

Everything else works fine as far as I can tell, even snapshot+restore with the repository-gcs plugin. The _nodes API is not critical in itself, but the transport_serialization_exception is worrying.
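
The check in step 5 can also be automated instead of reading the curl output by eye. The following is a minimal sketch, not part of the original report: the class name NodesFailureCheck and the method failedNodeCount are made up for illustration, it assumes Java 11+ for java.net.http, and it pulls the "failed" counter out of the "_nodes" header with a regular expression rather than a real JSON parser, which only works while that header has the flat shape shown above.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reports how many nodes failed to answer a _nodes request.
class NodesFailureCheck {

    // Matches the "failed" counter inside the "_nodes" header object.
    // Assumption: "failed" appears before any nested object in that header,
    // as in the response shown in this report.
    private static final Pattern FAILED =
            Pattern.compile("\"_nodes\"\\s*:\\s*\\{[^}]*?\"failed\"\\s*:\\s*(\\d+)");

    // Returns the failed-node count, or -1 if the header was not found.
    static int failedNodeCount(String body) {
        Matcher m = FAILED.matcher(body);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) throws Exception {
        // Same URL as in step 5; pass another one as the first argument if needed.
        String url = args.length > 0 ? args[0] : "http://localhost:9200/_nodes";
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        String body = client.send(request, HttpResponse.BodyHandlers.ofString()).body();

        int failed = failedNodeCount(body);
        System.out.println("failed nodes: " + failed);
        if (failed != 0) {
            System.exit(1); // non-zero exit so a script or monitor can react
        }
    }
}
```

A non-zero exit status then flags both a failed node and an unparseable response.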

Provide logs (if relevant):

[2018-05-08T08:03:02,776][DEBUG][o.e.a.a.c.n.i.TransportNodesInfoAction] [apa31] failed to execute on node [gqyW3xM3QYazVDnvsN6g4g]
org.elasticsearch.transport.RemoteTransportException: [Failed to deserialize response of type [org.elasticsearch.action.admin.cluster.node.info.NodeInfo]]
Caused by: org.elasticsearch.transport.TransportSerializationException: Failed to deserialize response of type [org.elasticsearch.action.admin.cluster.node.info.NodeInfo]
        at org.elasticsearch.transport.TcpTransport.handleResponse(TcpTransport.java:1431) [elasticsearch-5.5.2.jar:5.5.2]
        at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1403) [elasticsearch-5.5.2.jar:5.5.2]
        at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:74) [transport-netty4-5.5.2.jar:5.5.2]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310) [netty-codec-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:297) [netty-codec-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:413) [netty-codec-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) [netty-codec-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334) [netty-transport-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926) [netty-transport-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134) [netty-transport-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644) [netty-transport-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:544) [netty-transport-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498) [netty-transport-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458) [netty-transport-4.1.11.Final.jar:4.1.11.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [netty-common-4.1.11.Final.jar:4.1.11.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: java.lang.IllegalStateException: unexpected byte [0x03]
        at org.elasticsearch.common.io.stream.StreamInput.readBoolean(StreamInput.java:409) ~[elasticsearch-5.5.2.jar:5.5.2]
        at org.elasticsearch.common.io.stream.StreamInput.readBoolean(StreamInput.java:399) ~[elasticsearch-5.5.2.jar:5.5.2]
        at org.elasticsearch.common.io.stream.StreamInput.readOptionalWriteable(StreamInput.java:710) ~[elasticsearch-5.5.2.jar:5.5.2]
        at org.elasticsearch.action.admin.cluster.node.info.NodeInfo.readFrom(NodeInfo.java:208) ~[elasticsearch-5.5.2.jar:5.5.2]
        at org.elasticsearch.transport.TcpTransport.handleResponse(TcpTransport.java:1428) ~[elasticsearch-5.5.2.jar:5.5.2]
        at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1403) [elasticsearch-5.5.2.jar:5.5.2]
        at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:74) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:297) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:413) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926) ~[?:?]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:544) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458) ~[?:?]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) ~[?:?]
        at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_144]
:Distributed/Discovery-Plugins >bug

All 5 comments

Thanks for reporting. I can confirm this is a problem in 5.2.2 and it reproduces nicely (thanks!). I can also confirm this isn't an issue any more with 5.6.5. I tried to find the change that fixed this but failed. These things are very tricky, and the problem is typically far from where the exception occurs. I'm going to close this as fixed. I suggest you upgrade to avoid this. If it keeps occurring, please feel free to reopen the issue.

Thanks @DaveCTurner, I saw it, but I'm not sure, as the settings are read a few fields before the point where it explodes:

        if (in.readBoolean()) {
            settings = Settings.readSettingsFromStream(in);
        }
        os = in.readOptionalWriteable(OsInfo::new);
        process = in.readOptionalWriteable(ProcessInfo::new);
        jvm = in.readOptionalWriteable(JvmInfo::new);
        threadPool = in.readOptionalWriteable(ThreadPoolInfo::new);
        transport = in.readOptionalWriteable(TransportInfo::new); // <-- here

Obviously I can check it. I'll try with 5.5.3 and report.
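
Why a mismatch in an earlier field can surface a few fields later can be illustrated with a simplified model. This is a sketch under stated assumptions, not the Elasticsearch implementation: it uses plain java.io streams instead of StreamInput/StreamOutput, the class OptionalFieldDemo and its methods are hypothetical, and the two stray bytes stand in for whatever the sender wrote that the reader did not consume. Each optional field is a presence flag (0 or 1) followed by its payload, and the flag read is strict, so a reader that is out of step eventually lands on a payload byte where it expects a flag.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical model of "flag byte + payload" optional fields.
class OptionalFieldDemo {

    // Strict boolean read: only 0 and 1 are valid flag bytes.
    static boolean readFlag(DataInputStream in) throws IOException {
        int b = in.readUnsignedByte();
        if (b == 0) return false;
        if (b == 1) return true;
        throw new IllegalStateException(String.format("unexpected byte [0x%02x]", b));
    }

    // Optional value: presence flag first, payload only if the flag is set.
    static String readOptional(DataInputStream in) throws IOException {
        return readFlag(in) ? in.readUTF() : null;
    }

    static void writeOptional(DataOutputStream out, String value) throws IOException {
        out.writeBoolean(value != null);
        if (value != null) {
            out.writeUTF(value);
        }
    }

    // The sender writes some bytes the reader does not know about
    // (the mismatch), followed by two optional fields.
    static byte[] encode(byte[] unknownToReader, String os, String transport) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.write(unknownToReader);
            writeOptional(out, os);
            writeOptional(out, transport);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The reader only expects the two optional fields.
    static String[] decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String os = readOptional(in);
            String transport = readOptional(in);
            return new String[] {os, transport};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Reader and sender agree on the layout: prints "linux, tcp".
        System.out.println(String.join(", ", decode(encode(new byte[0], "linux", "tcp"))));

        // Sender wrote two extra bytes. The first (0x00) is silently taken as
        // "os absent"; the second is rejected one field later.
        try {
            decode(encode(new byte[] {0x00, 0x03}, "linux", "tcp"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints "unexpected byte [0x03]"
        }
    }
}
```

In the second call the first stray byte is accepted as a valid "absent" flag, so the exception is raised while reading the following field, not at the point where the two sides first disagreed.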

@DaveCTurner confirmed. 5.5.3 fixes it. Thanks for the link.

Pinging @elastic/es-distributed
