Elasticsearch version: 5.2.1
Plugins installed: -/-
JVM version: OpenJDK 1.8.0_111
OS version: Ubuntu 14.04, Kernel 4.4.0
Description of the problem including expected versus actual behavior:
In an environment with extensive use of the scroll API, the signed long scrollTimeInMillis defined in org.elasticsearch.index.search.stats.SearchStats can overflow Long.MAX_VALUE, causing StreamOutput.writeVLong, used in SearchStats$Stats.writeTo, to throw java.lang.IllegalStateException: Negative longs unsupported, use writeLong or writeZLong.
This breaks the monitoring endpoints on data nodes.
The expected behavior is, of course, not to break. Time counters could be implemented as doubles, accepting the overhead of floating-point arithmetic and a gradual loss of precision as the value grows. Alternatively, counters could simply stop incrementing at Long.MAX_VALUE and report "inf" or "NaN" instead, so that only the monitoring value is lost. I also doubt that millisecond precision is useful here, but changing it to seconds would only defer the problem.
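To illustrate the saturating-counter idea, here is a minimal sketch, purely as an assumption of how it could look rather than the actual SearchStats code:

```java
// Hypothetical sketch of a cumulative time counter that clamps at Long.MAX_VALUE
// instead of wrapping into negative territory. Not the actual SearchStats code.
final class SaturatingTimeCounter {
    private long millis; // always >= 0

    void add(long deltaMillis) {
        // Assumes deltaMillis >= 0; a wrap-around of the signed long shows up
        // as the sum becoming smaller than the previous value.
        long sum = millis + deltaMillis;
        millis = (deltaMillis > 0 && sum < millis) ? Long.MAX_VALUE : sum;
    }

    long get() {
        return millis;
    }
}
```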
Provide logs (if relevant):
[2017-03-21T13:32:56,002][WARN ][o.e.a.a.c.n.s.TransportNodesStatsAction] [elastic-node1] not accumulating exceptions, excluding exception from response
org.elasticsearch.action.FailedNodeException: Failed node [QjF2heCMSImu6bPfZ-HzIg]
at org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction.onFailure(TransportNodesAction.java:247) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction.access$300(TransportNodesAction.java:160) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction$1.handleException(TransportNodesAction.java:219) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1024) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.TcpTransport.lambda$handleException$17(TcpTransport.java:1411) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.common.util.concurrent.EsExecutors$1.execute(EsExecutors.java:109) [elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.TcpTransport.handleException(TcpTransport.java:1409) [elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.TcpTransport.handlerResponseError(TcpTransport.java:1401) [elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1345) [elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:74) [transport-netty4-5.2.1.jar:5.2.1]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:280) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:396) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:129) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:642) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:527) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:481) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:441) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [netty-common-4.1.7.Final.jar:4.1.7.Final]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111-internal]
Caused by: org.elasticsearch.transport.RemoteTransportException: [elastic-node2][a.b.c.d:9300][cluster:monitor/nodes/stats[n]]
Caused by: java.lang.IllegalStateException: Negative longs unsupported, use writeLong or writeZLong for negative numbers [-1426090323412]
at org.elasticsearch.common.io.stream.StreamOutput.writeVLong(StreamOutput.java:219) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.index.search.stats.SearchStats$Stats.writeTo(SearchStats.java:211) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.index.search.stats.SearchStats.writeTo(SearchStats.java:353) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.common.io.stream.StreamOutput.writeOptionalStreamable(StreamOutput.java:723) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.action.admin.indices.stats.CommonStats.writeTo(CommonStats.java:255) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.indices.NodeIndicesStats.writeTo(NodeIndicesStats.java:175) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.action.admin.cluster.node.stats.NodeStats.writeTo(NodeStats.java:235) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.TcpTransport.buildMessage(TcpTransport.java:1179) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.TcpTransport.sendResponse(TcpTransport.java:1123) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.TcpTransport.sendResponse(TcpTransport.java:1101) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.TcpTransportChannel.sendResponse(TcpTransportChannel.java:67) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.TcpTransportChannel.sendResponse(TcpTransportChannel.java:61) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.DelegatingTransportChannel.sendResponse(DelegatingTransportChannel.java:58) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.RequestHandlerRegistry$TransportChannelWrapper.sendResponse(RequestHandlerRegistry.java:111) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.action.support.nodes.TransportNodesAction$NodeTransportHandler.messageReceived(TransportNodesAction.java:270) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.action.support.nodes.TransportNodesAction$NodeTransportHandler.messageReceived(TransportNodesAction.java:266) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1488) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:596) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.2.1.jar:5.2.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_111-internal]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_111-internal]
... 1 more
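For context, writeVLong uses a variable-length (varint-style) encoding that only supports non-negative values, which is why an overflowed, now negative, counter surfaces as the exception above. A simplified sketch of such a writer, not the exact Elasticsearch implementation:

```java
import java.io.DataOutput;
import java.io.IOException;

// Simplified illustration of a varint-style writer that rejects negative input,
// analogous to the check that throws in StreamOutput.writeVLong.
final class VLongWriter {
    static void writeVLong(DataOutput out, long value) throws IOException {
        if (value < 0) {
            throw new IllegalStateException("Negative longs unsupported, use writeLong or writeZLong "
                    + "for negative numbers [" + value + "]");
        }
        while ((value & ~0x7FL) != 0L) {
            out.writeByte((byte) ((value & 0x7F) | 0x80)); // low 7 bits plus continuation bit
            value >>>= 7;
        }
        out.writeByte((byte) value); // final byte, continuation bit clear
    }
}
```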
How are the scroll times that large to begin with? That's a cumulative scroll time of at least 292,277,266 years. That is, maybe there is something else going on here?
Impressive!
@jasontedor I know that the number is insanely large, and I doubt it can be reached in regular operation. Still, the exception clearly indicates a negative value, and assuming that scrollTimeInMillis is always positive, I can give no better explanation than a numeric overflow.
This is the stacked Grafana graph of scrollTimeInMillis for our 12-node setup. The overflow happens at 292 years (approx at 9222842861407 ms).

The metrics are polled from the first node, which is still able to return the negative value. But the inter-cluster communication that syncs the local stats between the nodes is obviously broken by the exception thrown in SearchStats$Stats.writeTo.
Still, the exception clearly indicates a negative value, and assuming that scrollTimeInMillis is always positive, I can give no better explanation than a numeric overflow.
Yes, I understand the bug, and it's clear how to address it. My concern is that there is a deeper issue here, so I don't want to fix the bug until I understand where these overflowed values are coming from; addressing the serialization issue would just paper over the potentially deeper issue.
The overflow happens at 292 years (approx at 9222842861407 ms).
This is off by a factor of 10^6. If this is right, it might indicate a unit conversion bug somewhere. I'll dig.
Are you sure about the units on your chart?
Do you also track scroll_total historically? If so, can you share it on the same timeline that you shared scroll_time_in_millis? Can you also share the current output of /_stats/search?level=shards?
I do not see any unit conversion bugs behind this (I do think you might have a unit conversion problem in your chart; that doesn't explain the overflow though).
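For reference, the arithmetic behind the factor-of-10^6 observation, as a quick sanity check (class and variable names below are just for illustration):

```java
// Quick sanity check of the units discussed above.
public class OverflowUnits {
    public static void main(String[] args) {
        long msPerYear = 365L * 24 * 60 * 60 * 1000;          // 31,536,000,000 ms in a 365-day year
        System.out.println(Long.MAX_VALUE / msPerYear);        // ~292 million years if the counter is in ms
        System.out.println(9222842861407L / msPerYear);        // ~292 years for the charted overflow value
        System.out.println(Long.MAX_VALUE / 9222842861407L);   // ~1,000,057, i.e. roughly a factor of 10^6
    }
}
```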
Here's the graph for elastic-node131, with the plain numbers, to rule out any potential Grafana-related unit conversion issue.

That's the output of /_stats/search?level=shards, and it matches the numbers from the graphs. It's also again close to the overflow situation (it might happen tonight):
Your stats are rather interesting. For example, you show 502,385,516 scrolls executed in total, and 579,306 currently open scrolls. This seems abnormally high; does it sound right to you? What is your workload? Why so many open scrolls? Are you closing your scrolls when you're done with them? Your shards all show ~50,000 scrolls open each. Why? The average time a scroll is open seems reasonable (which leads me to believe that you're not forgetting to close them, but it's worth checking, especially given the number of currently open scrolls).

However, with your average scroll duration of 200,000 milliseconds, you need only 46 trillion scrolls of this duration to overflow. Yeah, only. It reminds me of the joke where someone attending an astronomy lecture hears the speaker say "in a billion years, our star, the Sun, will run out of fuel and die", raises his hand and asks, panicking, "how long did you say we had?", and when the lecturer repeats "a billion years", replies "I was getting worried, I thought you said a million years!"
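The back-of-the-envelope figure above can be checked directly; a quick sketch using the quoted average duration:

```java
// Rough check of the "46 trillion scrolls" figure.
public class ScrollOverflowCheck {
    public static void main(String[] args) {
        long avgScrollMillis = 200_000L; // average scroll duration quoted above
        // Number of scrolls of that duration needed to exceed a signed long millisecond counter:
        System.out.println(Long.MAX_VALUE / avgScrollMillis); // 46,116,860,184,273 ≈ 46 trillion
    }
}
```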
Hello,
These numbers are all expected: under normal operations, each node has 45-50k open contexts. On the current cluster, this adds up to ~600k open contexts on 12 nodes.
We do not close scrolls ourselves: we just wait until they expire, and, according to what @jasontedor is saying, they are expiring correctly.
Even with the optimizations that we have planned, which would lower this to 30k-40k open contexts in total on average, we might still hit the limits that we are seeing.
Where we now reach them in a few days, we would then reach them in a few weeks. I would still consider this a bug.
These numbers are all expected: under normal operations, each node has 45-50k open contexts.
Why so many?
We do not close scrolls ourselves
You should close these when you're done with them.
I would still consider this a bug.
You say this like this is under dispute. This is not under dispute. Instead, I am trying to understand the situation.
You say this like this is under dispute. This is not under dispute. Instead, I am trying to understand the situation.
So, to be clear here, it can overflow, that is not open to dispute. Instead, we need to understand the circumstances under which it can before concluding what to do about it (for which there are many options).
You should close these when you're done with them.
I maintain that you should be doing this and I want to see what impact that has on your cumulative scroll times.
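For what it's worth, a minimal sketch of explicitly releasing a scroll once you are done with it, assuming a 5.x Client (for example a TransportClient); the helper and variable names are illustrative, not taken from this thread:

```java
import org.elasticsearch.action.search.ClearScrollResponse;
import org.elasticsearch.client.Client;

// Hypothetical helper: free a scroll context explicitly instead of waiting for it to expire.
public final class ScrollCleanup {
    public static void release(Client client, String scrollId) {
        ClearScrollResponse response = client.prepareClearScroll()
                .addScrollId(scrollId)
                .get();
        if (!response.isSucceeded()) {
            // The context will still expire on its own eventually; just note the failure.
            System.err.println("clear scroll failed for scroll id " + scrollId);
        }
    }
}
```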
These numbers are all expected: under normal operations, each node has 45-50k open contexts.
Why so many?
Given how our search is designed and our load, this is expected. Again, we plan on optimizing this to get down to 30k-40k open contexts in total.
We do not close scrolls ourselves
You should close these when you're done with them.
Due to how the application is designed, we cannot really know when we are "done" with them. That is why we are relying on the expire time (we do have reducing this expire time on the roadmap, though).
I would still consider this a bug.
You say this like this is under dispute. This is not under dispute. Instead, I am trying to understand the situation.
I apologize; in fact, we really appreciate the effort that you are putting into figuring out what is happening here. Thank you :-)
This will be fixed by #27068.
Looking forward to seeing it fixed!
Thanks a lot @jasontedor
Closed by #27068