On some boxes producer fails to produce messages and gets stuck in an infinite loop with the following message:
Debug|2018-08-23 13:06:23.288|KafkaProducerTest.exe#producer-1|TOPPAR| [thrd:192
.168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: my.topicname [0] 1
message(s) in xmit queue (0 added from partition queue)
Also afterwards I can see high CPU usage caused by librdkafka.
EDIT: Even using a very basic example it hangs on Windows Server 2012.
It's rather tricky because it works fine on some boxes while does not on others. The code is very simple:
var producer = new Producer<Empty, SomeType>(settings, new ProtoSerializer<Empty>(), new ProtoSerializer<SomeType>());
await producer.ProduceAsync("my.topic", new Message<Empty, SomeType> {Value = new SomeType
{
PropertyX =...,
PropertyY= ...
}
});
Please provide the following information:
{
"batch.num.messages":1,
"socket.nagle.disable":"true",
"socket.blocking.max.ms":50,
"queue.buffering.max.ms":1,
"log.connection.close":"false",
"group.id":"KafkaProducerTest.exe",
"client.id":"KafkaProducerTest.exe",
"enable.auto.commit":"false",
"acks":1,
"bootstrap.servers":"192.168.XXX.UUU:9092, 192.168.XXX.ZZZ:9092, 192.168.XXX.YYY:9092",
"request.required.acks":1,
"debug":"all"
}
Debug|2018-08-23 13:06:16.200|KafkaProducerTest.exe#producer-1|STATE| [thrd:192.
168.XXX.UUU:9092/bootstrap]: 192.168.XXX.UUU:9092/bootstrap: Broker changed state
APIVERSION_QUERY -> UP
Debug|2018-08-23 13:06:16.200|KafkaProducerTest.exe#producer-1|BROADCAST| [thrd:
192.168.XXX.UUU:9092/bootstrap]: Broadcasting state change
Debug|2018-08-23 13:06:16.200|KafkaProducerTest.exe#producer-1|METADATA| [thrd:1
92.168.XXX.UUU:9092/bootstrap]: Skipping metadata refresh of 1 topic(s): connecte
d: already being requested
Debug|2018-08-23 13:06:16.197|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.ZZZ:9092/bootstrap: ClusterId: x2DnsDHoRUGMd2-kONsKeg, Control
lerId: 1
Debug|2018-08-23 13:06:16.201|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.ZZZ:9092/bootstrap: 3 brokers, 1 topics
Debug|2018-08-23 13:06:16.196|KafkaProducerTest.exe#producer-1|APIVERSION| [thrd
:192.168.XXX.YYY:9092/bootstrap]: 192.168.XXX.YYY:9092/bootstrap: Enabling feature
SaslHandshake
Debug|2018-08-23 13:06:16.201|KafkaProducerTest.exe#producer-1|APIVERSION| [thrd
:192.168.XXX.YYY:9092/bootstrap]: 192.168.XXX.YYY:9092/bootstrap: Feature LZ4: Gr
oupCoordinator (0..0) supported by broker
Debug|2018-08-23 13:06:16.201|KafkaProducerTest.exe#producer-1|APIVERSION| [thrd
:192.168.XXX.YYY:9092/bootstrap]: 192.168.XXX.YYY:9092/bootstrap: Enabling feature
LZ4
Debug|2018-08-23 13:06:16.202|KafkaProducerTest.exe#producer-1|APIVERSION| [thrd
:192.168.XXX.YYY:9092/bootstrap]: 192.168.XXX.YYY:9092/bootstrap: Feature OffsetT
ime: Offset (1..1) supported by broker
Debug|2018-08-23 13:06:16.202|KafkaProducerTest.exe#producer-1|APIVERSION| [thrd
:192.168.XXX.YYY:9092/bootstrap]: 192.168.XXX.YYY:9092/bootstrap: Enabling feature
OffsetTime
Debug|2018-08-23 13:06:16.202|KafkaProducerTest.exe#producer-1|APIVERSION| [thrd
:192.168.XXX.YYY:9092/bootstrap]: 192.168.XXX.YYY:9092/bootstrap: Feature : Creat
eTopics (0..0) supported by broker
Debug|2018-08-23 13:06:16.202|KafkaProducerTest.exe#producer-1|APIVERSION| [thrd
:192.168.XXX.YYY:9092/bootstrap]: 192.168.XXX.YYY:9092/bootstrap: Feature : Delet
eTopics (0..0) supported by broker
Debug|2018-08-23 13:06:16.203|KafkaProducerTest.exe#producer-1|APIVERSION| [thrd
:192.168.XXX.YYY:9092/bootstrap]: 192.168.XXX.YYY:9092/bootstrap: Enabling feature
Debug|2018-08-23 13:06:16.203|KafkaProducerTest.exe#producer-1|FEATURE| [thrd:19
2.168.XXX.YYY:9092/bootstrap]: 192.168.XXX.YYY:9092/bootstrap: Updated enabled pro
tocol features to MsgVer1,ApiVersion,BrokerBalancedConsumer,ThrottleTime,Sasl,Sa
slHandshake,BrokerGroupCoordinator,LZ4,OffsetTime,MsgVer2
Debug|2018-08-23 13:06:16.203|KafkaProducerTest.exe#producer-1|STATE| [thrd:192.
168.XXX.YYY:9092/bootstrap]: 192.168.XXX.YYY:9092/bootstrap: Broker changed state
APIVERSION_QUERY -> UP
Debug|2018-08-23 13:06:16.203|KafkaProducerTest.exe#producer-1|BROADCAST| [thrd:
192.168.XXX.YYY:9092/bootstrap]: Broadcasting state change
Debug|2018-08-23 13:06:16.203|KafkaProducerTest.exe#producer-1|METADATA| [thrd:1
92.168.XXX.YYY:9092/bootstrap]: Skipping metadata refresh of 1 topic(s): connecte
d: already being requested
Debug|2018-08-23 13:06:16.201|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.ZZZ:9092/bootstrap: Broker #0/3: 192.168.XXX.UUU:9092 NodeId
2
Debug|2018-08-23 13:06:16.204|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.ZZZ:9092/bootstrap: Broker #1/3: 192.168.XXX.YYY:9092 NodeId
1
Debug|2018-08-23 13:06:16.204|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.ZZZ:9092/bootstrap: Broker #2/3: 192.168.XXX.ZZZ:9092 NodeId
3
Debug|2018-08-23 13:06:16.205|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.ZZZ:9092/bootstrap: Topic #0/1: my.topicname with 1
partitions
Debug|2018-08-23 13:06:16.205|KafkaProducerTest.exe#producer-1|STATE| [thrd:main
]: Topic my.topicname changed state unknown -> exists
Debug|2018-08-23 13:06:16.205|KafkaProducerTest.exe#producer-1|PARTCNT| [thrd:ma
in]: Topic my.topicname partition count changed from 0 to 1
Debug|2018-08-23 13:06:16.205|KafkaProducerTest.exe#producer-1|TOPPARNEW| [thrd:
main]: NEW my.topicname [0] 00000062A60D3580 (at rd_kafka_topic_partitio
n_cnt_update:587)
Debug|2018-08-23 13:06:16.206|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: Topic my.topicname partition 0 Leader 3
Debug|2018-08-23 13:06:16.206|KafkaProducerTest.exe#producer-1|BRKDELGT| [thrd:m
ain]: my.topicname [0]: delegate to broker (none) (rktp 00000062A60D3580
, term 0, ref 2, remove 0)
Debug|2018-08-23 13:06:16.206|KafkaProducerTest.exe#producer-1|BRKDELGT| [thrd:m
ain]: my.topicname [0]: broker :0/internal is now leader for partition w
ith 0 messages (0 bytes) queued
Debug|2018-08-23 13:06:16.206|KafkaProducerTest.exe#producer-1|BRKMIGR| [thrd:ma
in]: Migrating topic my.topicname [0] 00000062A60D3580 from (none) to :0
/internal (sending PARTITION_JOIN to :0/internal)
Debug|2018-08-23 13:06:16.206|KafkaProducerTest.exe#producer-1|PARTCNT| [thrd:ma
in]: Partitioning 1 unassigned messages in topic my.topicname to 1 parti
tions
Debug|2018-08-23 13:06:16.206|KafkaProducerTest.exe#producer-1|TOPBRK| [thrd::0/
internal]: :0/internal: Topic my.topicname [0]: joining broker (rktp 000
00062A60D3580)
Debug|2018-08-23 13:06:16.207|KafkaProducerTest.exe#producer-1|FETCHADD| [thrd::
0/internal]: :0/internal: Added my.topicname [0] to active list (1 entri
es, opv 0)
Debug|2018-08-23 13:06:16.207|KafkaProducerTest.exe#producer-1|BROADCAST| [thrd:
:0/internal]: Broadcasting state change
Debug|2018-08-23 13:06:16.207|KafkaProducerTest.exe#producer-1|UAS| [thrd:main]:
1/1 messages were partitioned in topic my.topicname
Debug|2018-08-23 13:06:16.208|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.ZZZ:9092/bootstrap: 1/1 requested topic(s) seen in metadata
Debug|2018-08-23 13:06:16.208|KafkaProducerTest.exe#producer-1|CLUSTERID| [thrd:
main]: 192.168.XXX.ZZZ:9092/bootstrap: ClusterId update "" -> "x2DnsDHoRUGMd2-kO
NsKeg"
Debug|2018-08-23 13:06:16.209|KafkaProducerTest.exe#producer-1|CONTROLLERID| [th
rd:main]: 192.168.XXX.ZZZ:9092/bootstrap: ControllerId update -1 -> 1
Debug|2018-08-23 13:06:16.209|KafkaProducerTest.exe#producer-1|BROADCAST| [thrd:
main]: Broadcasting state change
Debug|2018-08-23 13:06:17.125|KafkaProducerTest.exe#producer-1|QRYLEADER| [thrd:
main]: Topic my.topicname [0]: leader is internal: re-query
Debug|2018-08-23 13:06:17.125|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: Requesting metadata for 1/1 topics: refresh unavailable topics
Debug|2018-08-23 13:06:17.126|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.ZZZ:9092/bootstrap: Request metadata for 1 topic(s): refresh u
navailable topics
Debug|2018-08-23 13:06:17.203|KafkaProducerTest.exe#producer-1|UPDATE| [thrd:192
.168.XXX.YYY:9092/bootstrap]: 192.168.XXX.YYY:9092/bootstrap: NodeId changed from
-1 to 1
Debug|2018-08-23 13:06:17.203|KafkaProducerTest.exe#producer-1|UPDATE| [thrd:192
.168.XXX.YYY:9092/bootstrap]: 192.168.XXX.YYY:9092/1: Name changed from 192.168.12
8.73:9092/bootstrap to 192.168.XXX.YYY:9092/1
Debug|2018-08-23 13:06:17.203|KafkaProducerTest.exe#producer-1|LEADER| [thrd:192
.168.XXX.YYY:9092/bootstrap]: 192.168.XXX.YYY:9092/1: Mapped 0 partition(s) to bro
ker
Debug|2018-08-23 13:06:17.203|KafkaProducerTest.exe#producer-1|STATE| [thrd:192.
168.XXX.YYY:9092/bootstrap]: 192.168.XXX.YYY:9092/1: Broker changed state UP -> UP
DATE
Debug|2018-08-23 13:06:17.204|KafkaProducerTest.exe#producer-1|BROADCAST| [thrd:
192.168.XXX.YYY:9092/bootstrap]: Broadcasting state change
Debug|2018-08-23 13:06:17.204|KafkaProducerTest.exe#producer-1|STATE| [thrd:192.
168.XXX.YYY:9092/bootstrap]: 192.168.XXX.YYY:9092/1: Broker changed state UPDATE -
> UP
Debug|2018-08-23 13:06:17.204|KafkaProducerTest.exe#producer-1|BROADCAST| [thrd:
192.168.XXX.YYY:9092/bootstrap]: Broadcasting state change
Debug|2018-08-23 13:06:17.290|KafkaProducerTest.exe#producer-1|UPDATE| [thrd:192
.168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/bootstrap: NodeId changed fro
m -1 to 3
Debug|2018-08-23 13:06:17.290|KafkaProducerTest.exe#producer-1|UPDATE| [thrd:192
.168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: Name changed from 192.168.
XXX.ZZZ:9092/bootstrap to 192.168.XXX.ZZZ:9092/3
Debug|2018-08-23 13:06:17.290|KafkaProducerTest.exe#producer-1|TOPICUPD| [thrd:1
92.168.XXX.ZZZ:9092/bootstrap]: Topic my.topicname [0] migrated from bro
ker -1 to 3
Debug|2018-08-23 13:06:17.290|KafkaProducerTest.exe#producer-1|BRKDELGT| [thrd:1
92.168.XXX.ZZZ:9092/bootstrap]: my.topicname [0]: delegate to broker 192
.168.XXX.ZZZ:9092/3 (rktp 00000062A60D3580, term 0, ref 2, remove 0)
Debug|2018-08-23 13:06:17.291|KafkaProducerTest.exe#producer-1|BRKDELGT| [thrd:1
92.168.XXX.ZZZ:9092/bootstrap]: my.topicname [0]: broker :0/internal no
longer leader
Debug|2018-08-23 13:06:17.291|KafkaProducerTest.exe#producer-1|BRKDELGT| [thrd:1
92.168.XXX.ZZZ:9092/bootstrap]: my.topicname [0]: broker 192.168.XXX.ZZZ
:9092/3 is now leader for partition with 1 messages (74 bytes) queued
Debug|2018-08-23 13:06:17.291|KafkaProducerTest.exe#producer-1|BRKMIGR| [thrd:19
2.168.XXX.ZZZ:9092/bootstrap]: Migrating topic my.topicname [0] 00000062
A60D3580 from :0/internal to 192.168.XXX.ZZZ:9092/3 (sending PARTITION_LEAVE to
:0/internal)
Debug|2018-08-23 13:06:17.291|KafkaProducerTest.exe#producer-1|LEADER| [thrd:192
.168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: Mapped 1 partition(s) to b
roker
Debug|2018-08-23 13:06:17.291|KafkaProducerTest.exe#producer-1|STATE| [thrd:192.
168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: Broker changed state UP ->
UPDATE
Debug|2018-08-23 13:06:17.292|KafkaProducerTest.exe#producer-1|BROADCAST| [thrd:
192.168.XXX.ZZZ:9092/bootstrap]: Broadcasting state change
Debug|2018-08-23 13:06:17.292|KafkaProducerTest.exe#producer-1|SEND| [thrd:192.1
68.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: Sent MetadataRequest (v2, 61
bytes @ 0, CorrId 3)
Debug|2018-08-23 13:06:17.292|KafkaProducerTest.exe#producer-1|STATE| [thrd:192.
168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: Broker changed state UPDATE
-> UP
Debug|2018-08-23 13:06:17.292|KafkaProducerTest.exe#producer-1|BROADCAST| [thrd:
192.168.XXX.ZZZ:9092/bootstrap]: Broadcasting state change
Debug|2018-08-23 13:06:17.292|KafkaProducerTest.exe#producer-1|RECV| [thrd:192.1
68.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: Received MetadataResponse (v
2, 186 bytes, CorrId 3, rtt 0.60ms)
Debug|2018-08-23 13:06:17.293|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.ZZZ:9092/3: ===== Received metadata (for 1 requested topics):
refresh unavailable topics =====
Debug|2018-08-23 13:06:17.293|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.ZZZ:9092/3: ClusterId: x2DnsDHoRUGMd2-kONsKeg, ControllerId: 1
Debug|2018-08-23 13:06:17.293|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.ZZZ:9092/3: 3 brokers, 1 topics
Debug|2018-08-23 13:06:17.293|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.ZZZ:9092/3: Broker #0/3: 192.168.XXX.UUU:9092 NodeId 2
Debug|2018-08-23 13:06:17.291|KafkaProducerTest.exe#producer-1|TOPBRK| [thrd::0/
internal]: :0/internal: Topic my.topicname [0]: leaving broker (0 messag
es in xmitq, next leader 192.168.XXX.ZZZ:9092/3, rktp 00000062A60D3580)
Debug|2018-08-23 13:06:17.294|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.ZZZ:9092/3: Broker #1/3: 192.168.XXX.YYY:9092 NodeId 1
Debug|2018-08-23 13:06:17.294|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.ZZZ:9092/3: Broker #2/3: 192.168.XXX.ZZZ:9092 NodeId 3
Debug|2018-08-23 13:06:17.294|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.ZZZ:9092/3: Topic #0/1: my.topicname with 1 partitio
ns
Debug|2018-08-23 13:06:17.294|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: Topic my.topicname partition 0 Leader 3
Debug|2018-08-23 13:06:17.294|KafkaProducerTest.exe#producer-1|FETCHADD| [thrd::
0/internal]: :0/internal: Removed my.topicname [0] from active list (0 e
ntries, opv 0)
Debug|2018-08-23 13:06:17.295|KafkaProducerTest.exe#producer-1|BROADCAST| [thrd:
:0/internal]: Broadcasting state change
Debug|2018-08-23 13:06:17.295|KafkaProducerTest.exe#producer-1|BRKDELGT| [thrd:m
ain]: my.topicname [0]: delegate to broker 192.168.XXX.ZZZ:9092/3 (rktp
00000062A60D3580, term 0, ref 3, remove 0)
Debug|2018-08-23 13:06:17.295|KafkaProducerTest.exe#producer-1|BRKDELGT| [thrd:m
ain]: my.topicname [0]: broker 192.168.XXX.ZZZ:9092/3 is now leader for
partition with 1 messages (74 bytes) queued
Debug|2018-08-23 13:06:17.296|KafkaProducerTest.exe#producer-1|PARTCNT| [thrd:ma
in]: Partitioning 0 unassigned messages in topic my.topicname to 1 parti
tions
Debug|2018-08-23 13:06:17.296|KafkaProducerTest.exe#producer-1|UAS| [thrd:main]:
0/0 messages were partitioned in topic my.topicname
Debug|2018-08-23 13:06:17.296|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.ZZZ:9092/3: 1/1 requested topic(s) seen in metadata
Debug|2018-08-23 13:06:17.302|KafkaProducerTest.exe#producer-1|UPDATE| [thrd:192
.168.XXX.UUU:9092/bootstrap]: 192.168.XXX.UUU:9092/bootstrap: NodeId changed from
-1 to 2
Debug|2018-08-23 13:06:17.302|KafkaProducerTest.exe#producer-1|UPDATE| [thrd:192
.168.XXX.UUU:9092/bootstrap]: 192.168.XXX.UUU:9092/2: Name changed from 192.168.12
8.72:9092/bootstrap to 192.168.XXX.UUU:9092/2
Debug|2018-08-23 13:06:17.302|KafkaProducerTest.exe#producer-1|LEADER| [thrd:192
.168.XXX.UUU:9092/bootstrap]: 192.168.XXX.UUU:9092/2: Mapped 0 partition(s) to bro
ker
Debug|2018-08-23 13:06:17.302|KafkaProducerTest.exe#producer-1|STATE| [thrd:192.
168.XXX.UUU:9092/bootstrap]: 192.168.XXX.UUU:9092/2: Broker changed state UP -> UP
DATE
Debug|2018-08-23 13:06:17.302|KafkaProducerTest.exe#producer-1|BROADCAST| [thrd:
192.168.XXX.UUU:9092/bootstrap]: Broadcasting state change
Debug|2018-08-23 13:06:17.303|KafkaProducerTest.exe#producer-1|STATE| [thrd:192.
168.XXX.UUU:9092/bootstrap]: 192.168.XXX.UUU:9092/2: Broker changed state UPDATE -
> UP
Debug|2018-08-23 13:06:17.303|KafkaProducerTest.exe#producer-1|BROADCAST| [thrd:
192.168.XXX.UUU:9092/bootstrap]: Broadcasting state change
Debug|2018-08-23 13:06:18.127|KafkaProducerTest.exe#producer-1|QRYLEADER| [thrd:
main]: Topic my.topicname [0]: leader is unavailable: re-query
Debug|2018-08-23 13:06:18.127|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: Requesting metadata for 1/1 topics: refresh unavailable topics
Debug|2018-08-23 13:06:18.127|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.UUU:9092/2: Request metadata for 1 topic(s): refresh unavailabl
e topics
Debug|2018-08-23 13:06:18.292|KafkaProducerTest.exe#producer-1|TOPBRK| [thrd:192
.168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: Topic my.topicname
[0]: joining broker (rktp 00000062A60D3580)
Debug|2018-08-23 13:06:18.292|KafkaProducerTest.exe#producer-1|FETCHADD| [thrd:1
92.168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: Added atech.depositreque
st [0] to active list (1 entries, opv 0)
Debug|2018-08-23 13:06:18.292|KafkaProducerTest.exe#producer-1|BROADCAST| [thrd:
192.168.XXX.ZZZ:9092/bootstrap]: Broadcasting state change
Debug|2018-08-23 13:06:18.302|KafkaProducerTest.exe#producer-1|SEND| [thrd:192.1
68.XXX.UUU:9092/bootstrap]: 192.168.XXX.UUU:9092/2: Sent MetadataRequest (v2, 61 b
ytes @ 0, CorrId 2)
Debug|2018-08-23 13:06:18.302|KafkaProducerTest.exe#producer-1|RECV| [thrd:192.1
68.XXX.UUU:9092/bootstrap]: 192.168.XXX.UUU:9092/2: Received MetadataResponse (v2,
186 bytes, CorrId 2, rtt 0.37ms)
Debug|2018-08-23 13:06:18.302|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.UUU:9092/2: ===== Received metadata (for 1 requested topics): r
efresh unavailable topics =====
Debug|2018-08-23 13:06:18.303|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.UUU:9092/2: ClusterId: x2DnsDHoRUGMd2-kONsKeg, ControllerId: 1
Debug|2018-08-23 13:06:18.303|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.UUU:9092/2: 3 brokers, 1 topics
Debug|2018-08-23 13:06:18.303|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.UUU:9092/2: Broker #0/3: 192.168.XXX.UUU:9092 NodeId 2
Debug|2018-08-23 13:06:18.303|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.UUU:9092/2: Broker #1/3: 192.168.XXX.YYY:9092 NodeId 1
Debug|2018-08-23 13:06:18.303|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.UUU:9092/2: Broker #2/3: 192.168.XXX.ZZZ:9092 NodeId 3
Debug|2018-08-23 13:06:18.303|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.UUU:9092/2: Topic #0/1: my.topicname with 1 partition
s
Debug|2018-08-23 13:06:18.304|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: Topic my.topicname partition 0 Leader 3
Debug|2018-08-23 13:06:18.304|KafkaProducerTest.exe#producer-1|METADATA| [thrd:m
ain]: 192.168.XXX.UUU:9092/2: 1/1 requested topic(s) seen in metadata
Debug|2018-08-23 13:06:19.291|KafkaProducerTest.exe#producer-1|TOPPAR| [thrd:192
.168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: my.topicname [0] 1
message(s) in xmit queue (1 added from partition queue)
Debug|2018-08-23 13:06:20.290|KafkaProducerTest.exe#producer-1|TOPPAR| [thrd:192
.168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: my.topicname [0] 1
message(s) in xmit queue (0 added from partition queue)
Debug|2018-08-23 13:06:21.290|KafkaProducerTest.exe#producer-1|TOPPAR| [thrd:192
.168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: my.topicname [0] 1
message(s) in xmit queue (0 added from partition queue)
Debug|2018-08-23 13:06:22.289|KafkaProducerTest.exe#producer-1|TOPPAR| [thrd:192
.168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: my.topicname [0] 1
message(s) in xmit queue (0 added from partition queue)
Debug|2018-08-23 13:06:23.288|KafkaProducerTest.exe#producer-1|TOPPAR| [thrd:192
.168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: my.topicname [0] 1
message(s) in xmit queue (0 added from partition queue)
Debug|2018-08-23 13:06:24.287|KafkaProducerTest.exe#producer-1|TOPPAR| [thrd:192
.168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: my.topicname [0] 1
message(s) in xmit queue (0 added from partition queue)
Debug|2018-08-23 13:06:25.286|KafkaProducerTest.exe#producer-1|TOPPAR| [thrd:192
.168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: my.topicname [0] 1
message(s) in xmit queue (0 added from partition queue)
Debug|2018-08-23 13:06:26.286|KafkaProducerTest.exe#producer-1|TOPPAR| [thrd:192
.168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: my.topicname [0] 1
message(s) in xmit queue (0 added from partition queue)
Debug|2018-08-23 13:06:27.285|KafkaProducerTest.exe#producer-1|TOPPAR| [thrd:192
.168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: my.topicname [0] 1
message(s) in xmit queue (0 added from partition queue)
Debug|2018-08-23 13:06:28.284|KafkaProducerTest.exe#producer-1|TOPPAR| [thrd:192
.168.XXX.ZZZ:9092/bootstrap]: 192.168.XXX.ZZZ:9092/3: my.topicname [0] 1
message(s) in xmit queue (0 added from partition queue)
^C
Not that much critical but it's a production issue.
please try 1.0-experimental-10. In particular, #545 has been resolved since 1.0-experimental-4.
[labeled as question as I believe this is most likely an issue that's been resolved in a later build]
@mhowlett Cool, I will try it tomorrow.
@mhowlett I've got some bad news. I've just tested it with 1.0-experimental-10 and the issue is still there.
Could you try to reproduce when commenting out the following config properties?:
"batch.num.messages":1,
"socket.nagle.disable":"true",
"socket.blocking.max.ms":50,
"queue.buffering.max.ms":1,
"log.connection.close":"false",
Also note that the following properties are for the consumer and are ignored for the producer:
"group.id":"KafkaProducerTest.exe",
"enable.auto.commit":"false",
@edenhill no success. However, I've noticed that the code (without your proposed config changes) works on Windows Server 2016 but it does not on Windows Server 2012 (regardless with or without the proposed config changes). The consumer config is OK as I've got both producer and consumer in my test app. Thanks for a very prompt response.
Do any of the producer examples here work?
@edenhill Let me try.
@edenhill I've just tried this one. It hangs too.
@edenhill @mhowlett Ok, my initial hypothesis about the issue being OS related fails. I have found a Windows Server 2012 machine where it works just fine. I am very confused now.
@edenhill ~For some reason~ rd_clock() is returning negative values on some of my boxes. It looks like it's due to an overflow when reading QueryPerformanceCounter values. After rebooting the offending box it started working (QueryPerformanceCounter got reset). I've put the following code in rd_kafka_broker_producer_serve just after now is set:
rd_rkb_dbg(rkb, QUEUE, "CLOCK", "now %"PRId64"", now);
Output from a working box:
Debug|2018-08-29 14:54:25.898|rdkafka#producer-1|CLOCK| [thrd:zzz.zzz.zzz.zzz:9092/bootstrap]: zzz.zzz.zzz.zzz:9092/3:
now 14951693103
Output from not working box:
Debug|2018-08-29 14:53:49.300|rdkafka#producer-1|CLOCK| [thrd:xxx.xxx.xxx.xxx:9092/bootstrap]: xxx.xxx.xxx.xxx:9092/2:
now -102503745224
EDIT: QueryPerformanceCounter values from a non-working box:
Before reboot:
QueryPerformanceCounter: 219923348897693
QueryPerformanceFrequency: 14318180
After reboot:
QueryPerformanceCounter: 10132389946
QueryPerformanceFrequency: 14318180
This issue is also apparent in 0.11.5 but not in 0.11.4 and below. Rebooting the machine did resolve the problem but not sure if I should continue to use 0.11.5.
@edenhill @mhowlett @ecparra I believe the issue is in the following piece of code from rdtime.h:
static RD_INLINE rd_ts_t rd_clock (void) {
//some other code
return (now.QuadPart * 1000000) / freq.QuadPart;
//some other code
}
QuadPart is a long long so the max value for that is 9223372036854775807 which means the max value before it overflows on the multiplication by 1000000 can be 9223372036854. With a freq of 14318180 (that was on my failing box) it corresponds to slightly more than 7 days:
//counter value / counter freq / seconds in an hour / hours in a day
9223372036854 / 14318180 / 3600 / 24 = about 7.45 days
In 0.11.4 it was implemented using return (rd_ts_t)GetTickCount64() * 1000LLU; that's why it wouldn't have that bug. Does this make sense?
~EDIT: Could we just change the order of the calculation to: (now.QuadPart / freq.QuadPart) * 1000000 ;~ ignore it. According to Microsoft this does not make sense:
//
// We now have the elapsed number of ticks, along with the
// number of ticks-per-second. We use these values
// to convert to the number of elapsed microseconds.
// To guard against loss-of-precision, we convert
// to microseconds *before* dividing by ticks-per-second.
//
@queil Great analysis! Will look into it, thank you!
@edenhill Thank you very much. It's a real show-stopper for us at the moment.
@edenhill - we also have a number of bug reports consumer side where restarting the windows server is the only thing that fixes the problem (#610 is this, #584 is this). those issues appear in 0.11.5 and weren't a problem in 0.11.4. could this also be that? superficially, seems likely to me.
thanks @queil - I think you might have tracked down a really important issue.
For reference, Microsoft QueryPerformanceCounter.
(now.QuadPart / freq.QuadPart) * 1000000 ;
I believe this will limit the precision to 1 second, we need microsecond (or at least sub-millisecond) precision.
You're right. I edited my comment above.
I believe now.QuadPart / ((double)freq.QuadPart/1000000.0)) will retain precision
or you could stick to integer arithmetic:
clock_coarse = now.QuadPart/freq.QuadPart * 1000000
ticks_fine = ticks - clock_coarse*freq.QuadPart
clock_fine = (ticks_fine * 1000000)/freq.QuadPart
return clock_coarse + clock_fine
quick calc: double has 52 bits of fraction precision -> largest int without any loss of precision = 4503599600000000000. equivalent to ~10 years using tick value from @queil. after that gradual loss of precision. so double calculation would be fine, and probably faster (though there's a couple of casts in there as well which could be costly). i think i prefer sticking to integers though.
where is ticks coming from?
oh sorry ticks is now.QuadPart
The double-version is two instructions, so it should be fine, and is easier to reason about.
0.11.6-RC1 is now on nuget.org which should resolve this.
This problem still seems to be there in 0.11.6-RC1. It hangs at ProduceAsync part. I am using .NET Framework 4.6.1. It works when I use 0.11.2.
var config = new Dictionary
{
{ "bootstrap.servers", "localhost:9092" }
};
using (var producer = new Producer
{
var dr = producer.ProduceAsync("my-topic", null, "test message text").Result;
Console.WriteLine($"Delivered '{dr.Value}' to: {dr.TopicPartitionOffset}");
Console.ReadKey();
}
This looks like some race condition issue, I was using 0.11.5 and everything was working okay, suddenly today it stopped working in ProduceAsync part.
@sshaky2 - this sounds like a different issue (unless i'm mistaken, the produce call wasn't blocking in this issue). how long are you waiting? try setting message.timeout.ms to say 10000 and see if the delivery report returns a failure. It defaults to 5 minutes. My guess is message delivery failed for some reason. if the problem persists, please open a different issue and include some debug:all logs.
@mhowlett @edenhill It looks like 0.11.6-RC1 does not have the fix for this issue. Is that intended? I've built qpcfix and it works but the 0.11.6-RC1 nuget does not.
Sorry, that was a mistake on my side, the qpcfix PR is not in RC1.
Will create an RC2 today, this time with the fix in..
@edenhill Cool, thanks!
I've tried 0.11.6-RC2 today - the bug is gone now. Thanks all!
@queil Thank you for all your work on this issue, finding the root cause, suggesting a fix, verifying, all very valuable!
@mhowlett & @queil
We are facing the same issue as @sshaky2.
Library used: 0.11.5 and higher
Server : Windows Server 2012 standard
Configuration:
var producerConfig = new Dictionary
{
{"bootstrap.servers", "server:9093"},
{"schema.registry.url", "url:8081"},
// optional schema registry client properties:
{"schema.registry.connection.timeout.ms", 5000},
{"schema.registry.max.cached.schemas", 10},
// optional avro serializer properties:
{"avro.serializer.buffer.bytes", 50},
{"avro.serializer.auto.register.schemas", true },
{"debug", "all" }
};
Log clearly mentions that its looping with "TOPPAR'
|2018-09-13 11:49:02.699|rdkafka#producer-1|BROADCAST| [thrd:ssl://server:9093/395]: Broadcasting state change
7|2018-09-13 11:49:03.698|rdkafka#producer-1|TOPPAR| [thrd:ssl://server:9093/395]: ssl://server:9093/39581: topicname [1] 1 message(s) in xmi
t queue (1 added from partition queue)
7|2018-09-13 11:49:04.696|rdkafka#producer-1|TOPPAR| [thrd:ssl://server:9093/395]: ssl://server:9093/39581: topicname [1] 1 message(s) in xmi
t queue (0 added from partition queue)
7|2018-09-13 11:49:05.695|rdkafka#producer-1|TOPPAR| [thrd:ssl://server:9093/395]: ssl://server:9093/39581: topicname [1] 1 message(s) in xmi
t queue (0 added from partition queue)
7|2018-09-13 11:49:06.693|rdkafka#producer-1|TOPPAR| [thrd:ssl://server:9093/395]: ssl://server:9093/39581: topicname [1] 1 message(s) in xmi
t queue (0 added from partition queue)
7|2018-09-13 11:49:07.691|rdkafka#producer-1|TOPPAR| [thrd:ssl://server:9093/395]: ssl://server:9093/39581: topicname [1] 1 message(s) in xmi
t queue (0 added from partition queue)
Working fine with following :
Library used: 0.11.4
Server : Windows Server 2012 standard
Library used: 0.11.5 and higher
Server : Windows 8.1
@sshaky2 Are you able to resolve this?
Thanks
@ravi-sontale - did you use 0.11.6-RC2? that is the only version in which this is currently fixed.
@mhowlett
Yes. I have used 0.11.6-RC2 as well and having same issue.
@ravi-sontale Can you reproduce this on v0.11.6-RC2 with debug=all and provide the logs?
@edenhill
Following is the log :
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature MsgVer2: Produce (3..3) supported
by broker
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature MsgVer2: Fetch (4..4) supported b
y broker
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Enabling feature MsgVer2
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature ApiVersion: ApiVersion (0..0) sup
ported by broker
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Enabling feature ApiVersion
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature BrokerGroupCoordinator: GroupCoor
dinator (0..0) supported by broker
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Enabling feature BrokerGroupCoordinator
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature BrokerBalancedConsumer: GroupCoor
dinator (0..0) supported by broker
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature BrokerBalancedConsumer: OffsetCom
mit (1..2) supported by broker
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature BrokerBalancedConsumer: OffsetFet
ch (1..1) supported by broker
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature BrokerBalancedConsumer: JoinGroup
(0..0) supported by broker
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature BrokerBalancedConsumer: SyncGroup
(0..0) supported by broker
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature BrokerBalancedConsumer: Heartbeat
(0..0) supported by broker
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature BrokerBalancedConsumer: LeaveGrou
p (0..0) supported by broker
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Enabling feature BrokerBalancedConsumer
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature ThrottleTime: Produce (1..2) supp
orted by broker
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature ThrottleTime: Fetch (1..2) suppor
ted by broker
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Enabling feature ThrottleTime
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature Sasl: JoinGroup (0..0) supported
by broker
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Enabling feature Sasl
7|2018-09-13 16:26:24.006|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature SaslHandshake: SaslHandshake (0..
0) supported by broker
7|2018-09-13 16:26:24.021|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Enabling feature SaslHandshake
7|2018-09-13 16:26:24.021|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature LZ4: GroupCoordinator (0..0) supp
orted by broker
7|2018-09-13 16:26:24.021|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Enabling feature LZ4
7|2018-09-13 16:26:24.021|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature OffsetTime: Offset (1..1) support
ed by broker
7|2018-09-13 16:26:24.021|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Enabling feature OffsetTime
7|2018-09-13 16:26:24.021|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature : CreateTopics (0..0) supported b
y broker
7|2018-09-13 16:26:24.021|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Feature : DeleteTopics (0..0) supported b
y broker
7|2018-09-13 16:26:24.021|rdkafka#producer-1|APIVERSION| [thrd:ssl://broker2:9093/3]: ssl://broker2
:9093/35514: Enabling feature
7|2018-09-13 16:26:24.021|rdkafka#producer-1|FEATURE| [thrd:ssl://broker2:9093/3]: ssl://broker2:9093/35514: Updated enabled protocol features to MsgVer1,
ApiVersion,BrokerBalancedConsumer,ThrottleTime,Sasl,SaslHandshake,BrokerGroupCoo
rdinator,LZ4,OffsetTime,MsgVer2
7|2018-09-13 16:26:24.021|rdkafka#producer-1|STATE| [thrd:ssl://broker2:9093/3]: ssl://broker2:9093/35514: Broker changed state APIVERSION_QUERY -> UP
7|2018-09-13 16:26:24.021|rdkafka#producer-1|BROADCAST| [thrd:ssl://broker2:9093/3]: Broadcasting state change
7|2018-09-13 16:26:24.021|rdkafka#producer-1|METADATA| [thrd:ssl://broker2:9093/3]: ssl://broker2:9093/35514: Request metadata for brokers only: connected
7|2018-09-13 16:26:24.021|rdkafka#producer-1|SEND| [thrd:ssl://broker2:9093/3]: ssl://bbroker2:9093/35514: Sent MetadataRequest (v2, 25 bytes @ 0, CorrId 2
)
7|2018-09-13 16:26:24.021|rdkafka#producer-1|RECV| [thrd:ssl://broker2:9093/3]: ssl://broker2:9093/35514: Received MetadataResponse (v2, 284 bytes, CorrId
2, rtt 4.01ms)
7|2018-09-13 16:26:24.021|rdkafka#producer-1|METADATA| [thrd:main]: ssl://broker2:9093/35514: ===== Received metadata
: connected =====
7|2018-09-13 16:26:24.021|rdkafka#producer-1|METADATA| [thrd:main]: ssl://broker2:9093/35514: ClusterId: WG7ZA5ZVSra5
apVUIWCa9Q, ControllerId: 37930
7|2018-09-13 16:26:24.021|rdkafka#producer-1|METADATA| [thrd:main]: ssl://broker2:9093/35514: 4 brokers, 0 topics
7|2018-09-13 16:26:24.021|rdkafka#producer-1|METADATA| [thrd:main]: ssl://broker2:9093/35514: Broker #0/4: broker2:9093 NodeId 35514
7|2018-09-13 16:26:24.021|rdkafka#producer-1|METADATA| [thrd:main]: ssl://broker2:9093/35514: Broker #1/4: broker2:9093 NodeId 32670
7|2018-09-13 16:26:24.021|rdkafka#producer-1|METADATA| [thrd:main]: ssl://broker2:9093/35514: Broker #2/4: broker2:9093 NodeId 37930
7|2018-09-13 16:26:24.021|rdkafka#producer-1|METADATA| [thrd:main]: ssl://broker2:9093/35514: Broker #3/4: broker2:9093 NodeId 37547
7|2018-09-13 16:26:24.255|rdkafka#producer-1|TOPIC| [thrd:app]: New local topic:
topic-name
7|2018-09-13 16:26:24.255|rdkafka#producer-1|TOPPARNEW| [thrd:app]: NEW topic-name [-1] 000000B8DAB46AE0 (at rd_kafka_topic_new0:362)
7|2018-09-13 16:26:24.739|rdkafka#producer-1|NOINFO| [thrd:main]: Topic topic-name partition count is zero: should refresh metadata
7|2018-09-13 16:26:24.739|rdkafka#producer-1|METADATA| [thrd:main]: Requesting m
etadata for 1/1 topics: refresh unavailable topics
7|2018-09-13 16:26:24.739|rdkafka#producer-1|METADATA| [thrd:main]: ssl://node1:9093/bootstrap: Request metada
ta for 1 topic(s): refresh unavailable topics
7|2018-09-13 16:26:24.817|rdkafka#producer-1|SEND| [thrd:ssl://node1:9]: ssl://node1:9093/bootstrap: Sent MetadataRequest (v2, 51 bytes @ 0,
CorrId 3)
7|2018-09-13 16:26:24.817|rdkafka#producer-1|RECV| [thrd:ssl://node1:9]: ssl://node1:9093/bootstrap: Received MetadataResponse (v2, 401 byte
s, CorrId 3, rtt 4.65ms)
7|2018-09-13 16:26:24.817|rdkafka#producer-1|METADATA| [thrd:main]: ssl://node1:9093/bootstrap: ===== Received
metadata (for 1 requested topics): refresh unavailable topics =====
7|2018-09-13 16:26:24.817|rdkafka#producer-1|METADATA| [thrd:main]: ssl://node1:9093/bootstrap: ClusterId: WG7
ZA5ZVSra5apVUIWCa9Q, ControllerId: 37930
7|2018-09-13 16:26:24.817|rdkafka#producer-1|METADATA| [thrd:main]: ssl://node1:9093/bootstrap: 4 brokers, 1 t
opics
7|2018-09-13 16:26:24.817|rdkafka#producer-1|METADATA| [thrd:main]: ssl://node1:9093/bootstrap: Broker #0/4:
int-evh-broker3-bw-stdev.:9093 NodeId 32670
7|2018-09-13 16:26:24.817|rdkafka#producer-1|METADATA| [thrd:main]: ssl://node1:9093/bootstrap: Broker #1/4:
broker2:9093 NodeId 35514
7|2018-09-13 16:26:24.817|rdkafka#producer-1|METADATA| [thrd:main]: ssl://node1:9093/bootstrap: Broker #2/4:
int-evh-broker4-bw-stdev.:9093 NodeId 37930
7|2018-09-13 16:26:24.817|rdkafka#producer-1|METADATA| [thrd:main]: ssl://node1:9093/bootstrap: Broker #3/4:
int-evh-broker1-bw-stdev.:9093 NodeId 37547
7|2018-09-13 16:26:24.817|rdkafka#producer-1|METADATA| [thrd:main]: ssl://node1:9093/bootstrap: Topic #0/1:
topic-name with 2 partitions
7|2018-09-13 16:26:24.817|rdkafka#producer-1|STATE| [thrd:main]: Topic topic-name changed state unknown -> exists
7|2018-09-13 16:26:24.817|rdkafka#producer-1|PARTCNT| [thrd:main]: Topic topic-name partition count changed from 0 to 2
7|2018-09-13 16:26:24.817|rdkafka#producer-1|TOPPARNEW| [thrd:main]: NEW topic-name [0] 000000B8DAB47AB0 (at rd_kafka_topic_partition_cnt_update:5
87)
7|2018-09-13 16:26:24.817|rdkafka#producer-1|TOPPARNEW| [thrd:main]: NEW topic-name [1] 000000B8DAB47FB0 (at rd_kafka_topic_partition_cnt_update:5
87)
7|2018-09-13 16:26:24.817|rdkafka#producer-1|METADATA| [thrd:main]: Topic topic-name partition 0 Leader 32670
7|2018-09-13 16:26:24.817|rdkafka#producer-1|BRKDELGT| [thrd:main]: topic-name [0]: delegate to broker ssl://broker3:9093/32670 (rktp 000000B8DAB47AB0, term 0, ref 2, remove 0)
7|2018-09-13 16:26:24.817|rdkafka#producer-1|BRKDELGT| [thrd:main]: topic-name [0]: broker ssl://broker3:9093/32670 is now leader for partition with 0 messages (0 bytes) queued
7|2018-09-13 16:26:24.817|rdkafka#producer-1|BRKMIGR| [thrd:main]: Migrating top
ic topic-name [0] 000000B8DAB47AB0 from (none) to ssl://broker3:9093/32670 (sending PARTITION_JOIN to s
sl://broker3:9093/32670)
7|2018-09-13 16:26:24.817|rdkafka#producer-1|METADATA| [thrd:main]: Topic topic-name partition 1 Leader 35514
7|2018-09-13 16:26:24.817|rdkafka#producer-1|BRKDELGT| [thrd:main]: topic-name [1]: delegate to broker ssl://broker2:9093/35514 (rktp 000000B8DAB47FB0, term 0, ref 2, remove 0)
7|2018-09-13 16:26:24.817|rdkafka#producer-1|BRKDELGT| [thrd:main]: topic-name [1]: broker ssl://broker2:9093/35514 is now leader for partition with 0 messages (0 bytes) queued
7|2018-09-13 16:26:24.817|rdkafka#producer-1|BRKMIGR| [thrd:main]: Migrating top
ic topic-name [1] 000000B8DAB47FB0 from (none) to ssl://broker2:9093/35514 (sending PARTITION_JOIN to s
sl://broker2:9093/35514)
7|2018-09-13 16:26:24.817|rdkafka#producer-1|PARTCNT| [thrd:main]: Partitioning
1 unassigned messages in topic topic-name to 2 partitions
7|2018-09-13 16:26:24.817|rdkafka#producer-1|UAS| [thrd:main]: 1/1 messages were
partitioned in topic topic-name
7|2018-09-13 16:26:24.817|rdkafka#producer-1|METADATA| [thrd:main]: ssl://node1:9093/bootstrap: 1/1 requested
topic(s) seen in metadata
7|2018-09-13 16:26:25.004|rdkafka#producer-1|TOPBRK| [thrd:ssl://broker3:9093/3]: ssl://broker3:9093/32670: Topic topic-name [0]: joining br
oker (rktp 000000B8DAB47AB0)
7|2018-09-13 16:26:25.004|rdkafka#producer-1|FETCHADD| [thrd:ssl://broker39093/3]: ssl://broker3:9093/32670: Added topic-name [0] to active
list (1 entries, opv 0)
7|2018-09-13 16:26:25.004|rdkafka#producer-1|BROADCAST| [thrd:ssl://broker3:9093/3]: Broadcasting state change
7|2018-09-13 16:26:25.035|rdkafka#producer-1|TOPBRK| [thrd:ssl://broker3:9093/3]: ssl://broker2:9093/35514: Topic topic-name [1]: joining br
oker (rktp 000000B8DAB47FB0)
7|2018-09-13 16:26:25.035|rdkafka#producer-1|FETCHADD| [thrd:ssl://broker2:9093/3]: ssl://broker2:9093/35514: Added topic-name [1] to active
list (1 entries, opv 0)
7|2018-09-13 16:26:25.035|rdkafka#producer-1|BROADCAST| [thrd:ssl://broker2:9093/3]: Broadcasting state change
7|2018-09-13 16:26:26.003|rdkafka#producer-1|TOPPAR| [thrd:ssl://broker3:9093/3]: ssl://broker3:9093/32670: topic-name [0] 1 message(s) in x
mit queue (1 added from partition queue)
7|2018-09-13 16:26:27.017|rdkafka#producer-1|TOPPAR| [thrd:ssl://broker3:9093/3]: ssl://broker3:9093/32670: topic-name [0] 1 message(s) in x
mit queue (0 added from partition queue)
7|2018-09-13 16:26:28.015|rdkafka#producer-1|TOPPAR| [thrd:ssl://broker3:9093/3]: ssl://broker3:9093/32670: topic-name [0] 1 message(s) in x
mit queue (0 added from partition queue)
7|2018-09-13 16:26:29.014|rdkafka#producer-1|TOPPAR| [thrd:ssl://broker3:9093/3]: ssl://broker3:9093/32670: topic-name [0] 1 message(s) in x
mit queue (0 added from partition queue)
7|2018-09-13 16:26:30.012|rdkafka#producer-1|TOPPAR| [thrd:ssl://broker3:9093/3]: ssl://broker3:9093/32670: topic-name [0] 1 message(s) in x
mit queue (0 added from partition queue)
7|2018-09-13 16:26:31.010|rdkafka#producer-1|TOPPAR| [thrd:ssl://broker3:9093/3]: ssl://broker3:9093/32670: topic-name [0] 1 message(s) in x
mit queue (0 added from partition queue)
7|2018-09-13 16:26:32.009|rdkafka#producer-1|TOPPAR| [thrd:ssl://broker3:9093/3]: ssl://broker3:9093/32670: topic-name [0] 1 message(s) in x
mit queue (0 added from partition queue)
7|2018-09-13 16:26:33.007|rdkafka#producer-1|TOPPAR| [thrd:ssl://broker3:9093/3]: ssl://broker3:9093/32670: topic-name [0] 1 message(s) in x
mit queue (0 added from partition queue)
7|2018-09-13 16:26:34.006|rdkafka#producer-1|TOPPAR| [thrd:ssl://broker3:9093/3]: ssl://broker3:9093/32670: topic-name [0] 1 message(s) in x
mit queue (0 added from partition queue)
7|2018-09-13 16:26:35.004|rdkafka#producer-1|TOPPAR| [thrd:ssl://broker3:9093/3]: ssl://broker3:9093/32670: topic-name [0] 1 message(s) in x
mit queue (0 added from partition queue)
7|2018-09-13 16:26:36.003|rdkafka#producer-1|TOPPAR| [thrd:ssl://broker3:9093/3]: ssl://broker3:9093/32670: topic-name [0] 1 message(s) in x
mit queue (0 added from partition queue)
7|2018-09-13 16:26:37.001|rdkafka#producer-1|TOPPAR| [thrd:ssl://broker3:9093/3]: ssl://broker3:9093/32670: topic-name [0] 1 message(s) in x
mit queue (0 add
@ravi-sontale And that's really on v0.11.6-RC2 (not RC1!)? :/
@edenhill
Yes. Today only I have downloaded the 0.11.6-RC2.
And tested using avroseriliser program.
Can you verify that the first line (or thereabout) of output says the version is v0.11.6-RC2?
E.g.:
RDKAFKA-7-INIT: rdkafka#producer-1: [thrd:app]: librdkafka v0.11.6-RC2 (0xb06ff) rdkafka#producer-1 initialized (debug 0xffff)
@edenhill
The output says 0.11.5
7|2018-09-13 18:07:16.283|rdkafka#producer-1|INIT| [thrd:app]: librdkafka **v0.11.
5** (0xb05ff) rdkafka#producer-1 initialized (debug 0xffff)
The problem was I used the AvroSpecific example and copied all the dlls and exe to server. As the bin for this project doesnt have librdkafa folder. I copied it from previous version thinking its the same for 0.11.6-RC2.
I then replaces the librdkafka dlls from 0.11.6-RC2 to AvroSpecific projects bin folder. And then executed on server and it worked :)
I'll try updating my application to 0.11.6-RC2 and see if it works.
Thanks for your prompt response and help.
Phew, that is good news :)
@ravi-sontale I had to downgrade it to 0.11.2 to make it work. I haven't tested 0.11.6-RC2 yet.
Downgraded to 0.11.4 and its working through my application as well.
This bug was introduced in v0.11.5 and fixed in v0.11.6-RC2
Any idea when the official v0.11.6 release is planned?
soon. there's a last minute issue related to loading interceptor plugins that's blocking this.
@queil thanks for the analysis. Completely perplexed with this issue, it did work for a few days after the upgrade to 11.5 then it stated to fail on Windows 2012 R2 but Windows 10 no issues at all. This post saved us. Downgrading to 11.4 works.
Is it fixed in 11.6? I am confused that issue is not closed.
@klesta490 - yes
This issue has popped up again with us in v 11.6. We are running on Windows Server 2012 are are seeing this log when the issue occurs:
INFO - 2019-02-19 11:41:52,139 [21] Ascend.ATI.VATI.Moodle.Framework.MessagingPublisher - {"Level":2,"Message":"[thrd:sasl_plaintext://xxxx:9093/103]: sasl_plaintext:/xxxx:9093/103: ATI.VATI.EnrollmentChanges.STG [1] 6 message(s) in xmit queue (0 added from partition queue)"}
When publishes successfully we'll see some number before the "added from partition queue"
The steps to reproduce are the same as listed above. The only difference I can note are we are using SASL.
Same problem here with version 0.11.6.
Is there a workaround to avoid this? The producer stops producing without warning.
Most helpful comment
@klesta490 - yes