Sarama: EOF error while fetching new messages

Created on 17 Jul 2017 · 5 comments · Source: Shopify/sarama

Versions

Please specify real version numbers or git SHAs, not just "Latest" since that changes fairly regularly.
Sarama Version: c01858abb625b73a3af51d0798e4ad42c8147093
Sarama Cluster: 045a96d0acad1692fa63ca3cfb15c452115516ae
Kafka Version: 0.10.0.1
Go Version: go1.8 darwin/amd64

Configuration

I'm using github.com/bsm/sarama-cluster to configure the consumer with mostly default configuration:

config := cluster.NewConfig()
config.Consumer.Offsets.Initial = sarama.OffsetOldest
config.Consumer.Return.Errors = true
config.Group.Return.Notifications = true
consumer, err := cluster.NewConsumer(brokers, consumerGroup, []string{topic}, config)
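For context, a minimal consumption loop with this configuration might look like the sketch below. The broker addresses, group, and topic names are illustrative, and the error/notification handling is just one reasonable shape; it needs a reachable Kafka cluster to actually run.

```go
package main

import (
	"log"

	"github.com/Shopify/sarama"
	cluster "github.com/bsm/sarama-cluster"
)

func main() {
	config := cluster.NewConfig()
	config.Consumer.Offsets.Initial = sarama.OffsetOldest
	config.Consumer.Return.Errors = true
	config.Group.Return.Notifications = true

	brokers := []string{"node1:9092"} // illustrative
	consumer, err := cluster.NewConsumer(brokers, "my-group", []string{"core.v1"}, config)
	if err != nil {
		log.Fatalf("failed to create consumer: %v", err)
	}
	defer consumer.Close()

	for {
		select {
		case msg, ok := <-consumer.Messages():
			if !ok {
				return
			}
			log.Printf("%s/%d@%d: %s", msg.Topic, msg.Partition, msg.Offset, msg.Value)
			consumer.MarkOffset(msg, "") // commit the offset after processing
		case err := <-consumer.Errors():
			// errors like "error while consuming core.v1/10: EOF" surface here
			// because Consumer.Return.Errors is enabled
			log.Printf("consumer error: %v", err)
		case ntf := <-consumer.Notifications():
			// rebalance notifications, enabled via Group.Return.Notifications
			log.Printf("rebalanced: %+v", ntf)
		}
	}
}
```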
Logs

logs immediately before the error happens (all of them are logged by sarama):
{"level":"info","msg":"client/metadata fetching metadata for all topics from broker node1:9092\n","time":"2017-07-15T15:27:05.057+0000"}
{"level":"info","msg":"client/metadata fetching metadata for all topics from broker node4:9092\n","time":"2017-07-15T15:27:05.547+0000"}
{"level":"info","msg":"consumer/broker/1 abandoned subscription to core.v1/10 because kafka server: Tried to send a message to a replica that is not the leader for some partition. Your metadata is out of date.\n","time":"2017-07-15T15:30:14.451+0000"}
{"level":"info","msg":"consumer/core.v1/10 finding new broker\n","time":"2017-07-15T15:30:16.451+0000"}
{"level":"info","msg":"client/metadata fetching metadata for [core.v1] from broker node2:9092\n","time":"2017-07-15T15:30:16.452+0000"}
{"level":"info","msg":"client/metadata got error from broker while fetching metadata: EOF","time":"2017-07-15T15:30:16.452+0000"}
{"level":"info","msg":"Closed connection to broker node2:9092\n","time":"2017-07-15T15:30:16.452+0000"}
{"level":"info","msg":"client/brokers deregistered broker #2 at node2:9092","time":"2017-07-15T15:30:16.452+0000"}
{"level":"info","msg":"client/metadata fetching metadata for [core.v1] from broker node4:9092\n","time":"2017-07-15T15:30:16.452+0000"}
{"level":"info","msg":"client/brokers registered new broker #2 at node2:9092","time":"2017-07-15T15:30:16.453+0000"}
{"level":"info","msg":"consumer/broker/5 added subscription to core.v1/10\n","time":"2017-07-15T15:30:16.453+0000"}
{"level":"info","msg":"consumer/broker/5 disconnecting due to error processing FetchRequest: EOF\n","time":"2017-07-15T15:30:16.453+0000"}
{"level":"info","msg":"Closed connection to broker node5:9092\n","time":"2017-07-15T15:30:16.453+0000"}

After that I'm getting an error on consumer's Errors() channel
{"level":"error","msg":"Failure in Green messages consumption: kafka: error while consuming core.v1/10: EOF","time":"2017-07-15T15:30:16.453+0000"}

Problem Description

I've noticed that some consumers periodically fail with FetchRequest: EOF errors. I cannot attribute this to a particular Kafka cluster state: it happens with different topics at different times, and the service is able to pick up where it left off after a restart. The log pattern looks the same every time.
I'm having a hard time understanding what this error indicates.

Most helpful comment

For anyone bumping into this, quick repeated EOFs on requests can also simply be a mismatch between Config.Version and the Kafka brokers.

All 5 comments

mark

EOF means that something is terminating the TCP connection. Is there anything in the broker logs when this occurs which would suggest why the broker is closing the connection? Is your network connection particularly flaky?

@dim any thoughts on possible causes for this within sarama-cluster?

@eapache I believe this thread actually started in the sarama-cluster issue tracker. Unfortunately, there is nothing in sarama-cluster that would cause an EOF; it simply propagates the Errors() it receives from the underlying PartitionConsumer instances.

Based on my experience with Kafka, it's usually some sort of timeout on the server side. Maybe the server's heartbeat requirements are too strict, which would cause the broker to prematurely close client connections. We have also sometimes seen per-topic settings be the culprit, e.g. very aggressive expiration, among other things. I doubt it's an issue on the client side.

Closing as stale. Please reopen if more information is available.

For anyone bumping into this, quick repeated EOFs on requests can also simply be a mismatch between Config.Version and the Kafka brokers.
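To rule that out, the client's protocol version can be pinned to match the brokers. A sketch, assuming brokers running Kafka 0.10.0.1 as reported in this issue (cluster.Config embeds sarama.Config, so the field is set the same way in both libraries):

```go
config := cluster.NewConfig()
// Pin the wire-protocol version to what the brokers actually speak;
// a mismatch can surface as quick repeated EOFs rather than a clear error.
config.Version = sarama.V0_10_0_1
```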
