Pulsar: Consumer fails to retrieve messages when using multi-host connection with Avro schema and one broker is down in cluster.

Created on 20 Sep 2019 · 3 Comments · Source: apache/pulsar

Describe the bug
When the Java client uses a multi-host connection together with a schema from the schema registry (for example, an Avro schema), it cannot retrieve messages while one broker in the cluster is down. This does not happen without a schema, e.g. when using byte[] as the message content.

To Reproduce
Steps to reproduce the behavior:

  1. Set up a Pulsar cluster with multiple brokers.
  2. Send a few messages to a topic using an Avro schema
  3. Kill 1 broker
  4. Try to retrieve messages from same topic with same schema
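
After step 3 it can help to confirm which broker in the multi-host service URL is actually unreachable before starting the consumer. The following is only a minimal sketch using plain JDK sockets, not the Pulsar client: the class name BrokerProbe and its helper methods are hypothetical, the default URL is the one that appears in the log of this report, and the sketch assumes every entry in the URL carries an explicit port. A successful TCP connect says nothing about TLS or the broker's health; it only shows that the port accepts connections.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of the Pulsar client library.
final class BrokerProbe {

    /** Splits "scheme://h1:p1,h2:p2" into ["h1:p1", "h2:p2"]. */
    static List<String> splitHosts(String serviceUrl) {
        int idx = serviceUrl.indexOf("://");
        String hostPart = idx >= 0 ? serviceUrl.substring(idx + 3) : serviceUrl;
        List<String> hosts = new ArrayList<>();
        for (String entry : hostPart.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                hosts.add(trimmed);
            }
        }
        return hosts;
    }

    /** Returns true if a plain TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String hostPort, int timeoutMs) {
        int colon = hostPort.lastIndexOf(':');
        if (colon < 0) {
            return false; // assumption: every entry has an explicit port
        }
        String host = hostPort.substring(0, colon);
        int port;
        try {
            port = Integer.parseInt(hostPort.substring(colon + 1));
        } catch (NumberFormatException e) {
            return false;
        }
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused", timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Default is the service URL taken from the log in this report.
        String url = args.length > 0
                ? args[0]
                : "pulsar+ssl://host-nr-1:6651,host-nr-2:6651,host-nr-3:6651";
        for (String hostPort : splitHosts(url)) {
            System.out.println(hostPort + " reachable=" + isReachable(hostPort, 2000));
        }
    }
}
```

If exactly the killed broker reports reachable=false while the others report true, the setup matches the scenario described in this report.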

Expected behavior
The consumer starts retrieving messages.

Screenshots
2019-09-20 17:16:38 WARN ConnectionPool:200 - Failed to open connection to host-nr-3:6651 : org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: syscall:getsockopt(..) failed: Connection refused: host-nr-3/host-nr-3-IP:6651
2019-09-20 17:16:38 WARN PulsarClientImpl:269 - [test-event] Failed to get partitioned topic metadata: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: syscall:getsockopt(..) failed: Connection refused: host-nr-3/host-nr-3-IP:6651
2019-09-20 17:16:38 INFO PulsarClientImpl:531 - Client closing. URL: pulsar+ssl://host-nr-1:6651,host-nr-2:6651,host-nr-3:6651
org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.ExecutionException: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: syscall:getsockopt(..) failed: Connection refused: host-nr-3/host-nr-3-IP:6651
at org.apache.pulsar.client.api.PulsarClientException.unwrap(PulsarClientException.java:297)
at org.apache.pulsar.client.impl.ProducerBuilderImpl.create(ProducerBuilderImpl.java:88)
at eu.leapin.eventbus.pulsarproducer.PulsarProducer.doMain(PulsarProducer.java:84)
at eu.leapin.eventbus.pulsarproducer.PulsarProducer.main(PulsarProducer.java:67)
Caused by: java.util.concurrent.ExecutionException: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: syscall:getsockopt(..) failed: Connection refused: host-nr-3/host-nr-3-IP:6651
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at org.apache.pulsar.client.impl.ProducerBuilderImpl.create(ProducerBuilderImpl.java:86)
... 2 more
Caused by: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: syscall:getsockopt(..) failed: Connection refused: host-nr-3/host-nr-3-IP:6651
at org.apache.pulsar.client.impl.ConnectionPool.lambda$null$9(ConnectionPool.java:202)
at org.apache.pulsar.shade.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at org.apache.pulsar.shade.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
at org.apache.pulsar.shade.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:335)
at org.apache.pulsar.shade.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909)
at org.apache.pulsar.shade.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.CompletionException: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: syscall:getsockopt(..) failed: Connection refused: host-nr-3/host-nr-3-IP:6651
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:647)
at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at org.apache.pulsar.client.impl.ConnectionPool.lambda$connectToAddress$17(ConnectionPool.java:275)
at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:511)
at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:504)
at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:483)
at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:424)
at org.apache.pulsar.shade.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:121)
at org.apache.pulsar.shade.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.fulfillConnectPromise(AbstractEpollChannel.java:629)
at org.apache.pulsar.shade.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:648)
at org.apache.pulsar.shade.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:522)
at org.apache.pulsar.shade.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:423)
at org.apache.pulsar.shade.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:330)
... 3 more
Caused by: org.apache.pulsar.shade.io.netty.channel.AbstractChannel$AnnotatedConnectException: syscall:getsockopt(..) failed: Connection refused: host-nr-3/host-nr-3-IP:6651
at org.apache.pulsar.shade.io.netty.channel.unix.Socket.finishConnect(..)(Unknown Source)
Caused by: org.apache.pulsar.shade.io.netty.channel.unix.Errors$NativeConnectException: syscall:getsockopt(..) failed: Connection refused
... 1 more
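
The trace above wraps the actual network error in several layers (PulsarClientException, ExecutionException, CompletionException). When catching such an exception in application code, a small helper that walks the cause chain makes the innermost error easier to log. This is a minimal sketch in plain Java: the class name RootCause is hypothetical, and java.net.ConnectException is used only as a stand-in for the shaded Netty exception seen in the log.

```java
import java.net.ConnectException;
import java.util.concurrent.ExecutionException;

// Hypothetical helper; not part of the Pulsar client library.
final class RootCause {

    /** Follows getCause() until the innermost throwable is reached. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-in for the nested exception chain shown in the log above.
        Exception nested = new RuntimeException("client error",
                new ExecutionException(new ConnectException("Connection refused")));
        Throwable root = rootCause(nested);
        // Prints: java.net.ConnectException: Connection refused
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
    }
}
```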

Desktop (please complete the following information):

  • OS: ubuntu 18.04 LTS

Additional context
The same problem occurs when producing messages.
I'm using Pulsar 2.4.1, and the Java client library is also 2.4.1.

type/bug

All 3 comments

I couldn't reproduce the issue anymore.

How did you solve it?

It suddenly started working, and I never had this issue again.
