Kafka-node: What is the difference between HighLevelProducer and Producer?

Created on 21 Nov 2015 · 13 Comments · Source: SOHU-Co/kafka-node

question

pravin-d

👍12

Most helpful comment

A HighLevelProducer writes to all available partitions in a topic on a round-robin basis. For example, for

Topic: name: myTopic1, number of partitions: 20

a HighLevelProducer will first create a message for partition 0, then 1, then 2, then 3 ... then 18, then 19, then 0, then 1 ... and so on.
This leads to an even distribution of the messages across partitions.
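
The rotation described above can be sketched in a few lines of plain JavaScript. This is only an illustrative model of round-robin selection, not kafka-node's internal code; the names createCyclicPartitioner, pickPartition and topicPartitions are made up for the example, and a 3-partition topic is assumed to keep the output short.

```javascript
// Illustrative model of round-robin partition selection.
// Not kafka-node's internal code; all names here are hypothetical.
function createCyclicPartitioner() {
  let counter = 0;
  return function pickPartition(partitions) {
    // Walk through the partition list and wrap around at the end.
    const partition = partitions[counter % partitions.length];
    counter += 1;
    return partition;
  };
}

const pickPartition = createCyclicPartitioner();
const topicPartitions = [0, 1, 2]; // hypothetical topic with 3 partitions
const chosen = [];
for (let i = 0; i < 7; i += 1) {
  chosen.push(pickPartition(topicPartitions));
}
console.log(chosen.join(', ')); // 0, 1, 2, 0, 1, 2, 0
```

Each partition receives roughly the same number of messages, which is where the even distribution comes from.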

A "normal" producer sends a message to a specified partition. You can specify it in the payload:

{
   topic: 'topicName',
   messages: ['message body'], // multiple messages should be an array; a single message can be just a string or a KeyedMessage instance
   partition: 1, // default: 0
   attributes: 2 // default: 0
}

If you omit the partition property from the payload, the message goes to partition 0 by default. If the topic has only one partition, the Producer and the HighLevelProducer do the same thing.

Pretty much the same is true for HighLevelConsumers vs normal Consumers. A HighLevelConsumer will receive messages from all available partitions, whereas the normal Consumer only receives messages from the specified partition when adding the topic, or by default just from partition 0.

peterjuras on 6 Jan 2016

👍60 ❤13 🎉12

All 13 comments

Anybody know the answer to this? Same with Consumer vs. HighLevelConsumer. In one setup I'm working with, Producer works fine, but Consumer does not. I had to use HighLevelConsumer.

Would be great if the difference was documented in the README.

carlosrymer on 14 Dec 2015

If anybody is wondering, while doing some trial and error, there's at least one difference I can surface between Consumer and HighLevelConsumer.

While Consumer allows you to set an offset, HighLevelConsumer ignores it. To use an offset with Consumer, you must have fromOffset set to true.

In addition, when you retrieve the offset, make sure you subtract one from it if you want to start receiving new messages from the moment you create your consumer.
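
A rough sketch of what that could look like, assuming you have already looked up the latest offset for the partition (the lookup itself is not shown). The helper buildConsumerArgs and the sample values are made up for illustration; the payload fields (topic, partition, offset) and the fromOffset option are the ones the Consumer takes. Whether the subtraction is needed may depend on the kafka-node version, so treat it as the observation above rather than a rule.

```javascript
// Hypothetical helper: builds the payloads and options you would hand to
// a kafka-node Consumer, following the advice in the comment above.
function buildConsumerArgs(topic, partition, latestOffset) {
  // Subtract one from the retrieved offset, but never go below 0.
  const offset = Math.max(latestOffset - 1, 0);
  return {
    payloads: [{ topic, partition, offset }],
    // Without fromOffset: true, the offset in the payload is not used.
    options: { fromOffset: true }
  };
}

const consumerArgs = buildConsumerArgs('myTopic1', 0, 42); // sample values
// Roughly: new Consumer(client, consumerArgs.payloads, consumerArgs.options)
console.log(JSON.stringify(consumerArgs));
```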

carlosrymer on 15 Dec 2015

👍1

When I diff the code for the high level producer and the producer, they're almost identical. The only significant difference appears to be the default partitioner type, which is configurable on both anyway. Am I missing something?

cressie176 on 19 Jul 2016

@cressie176 you are correct. There's duplicated code for a small difference. It can be cleaned up. PRs are welcome!
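
To make that concrete, here is a small model of how the effective partitioner type might be resolved. It assumes the numbering from the kafka-node README (default = 0, random = 1, cyclic = 2, keyed = 3, custom = 4) and assumes Producer defaults to the default type while HighLevelProducer defaults to cyclic. The function effectivePartitionerType is hypothetical and not part of the library.

```javascript
// Partitioner type numbers as listed in the kafka-node README.
const PARTITIONER_TYPES = { default: 0, random: 1, cyclic: 2, keyed: 3, custom: 4 };

// Hypothetical model, not library code: an explicit partitionerType option
// wins; otherwise each class falls back to its own (assumed) default.
function effectivePartitionerType(producerKind, options = {}) {
  const classDefault = producerKind === 'HighLevelProducer'
    ? PARTITIONER_TYPES.cyclic
    : PARTITIONER_TYPES.default;
  return options.partitionerType !== undefined
    ? options.partitionerType
    : classDefault;
}

console.log(effectivePartitionerType('Producer')); // 0
console.log(effectivePartitionerType('HighLevelProducer')); // 2
console.log(effectivePartitionerType('Producer', { partitionerType: 2 })); // 2
```

Under those assumptions, a Producer created with partitionerType: 2 would pick partitions the same way a HighLevelProducer does out of the box.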

hyperlink on 19 Jul 2016

Thanks for the quick response @hyperlink. Would consider submitting a PR, but unless I've misread, there's no reliable way to throttle incoming messages (been reading some of the pause() issues). This is likely to be important for us, so I'm going to investigate alternative clients.

cressie176 on 19 Jul 2016

We throttle outside of the module using async.queue. We pause() in our message handler and push the message to the queue. Once the queue is drained we call resume().

It has worked well so far. Good luck!
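
For anyone wanting to try the pattern, here is a minimal self-contained sketch. It does not use the async library: createQueue is a tiny stand-in for async.queue with a concurrency of one, and throttleConsumer is a made-up helper name. The only things assumed about the consumer are that it emits 'message' events and has pause() and resume() methods.

```javascript
// Tiny stand-in for async.queue (concurrency 1): runs one task at a time
// and calls onDrain once the last queued task has finished.
function createQueue(worker, onDrain) {
  const tasks = [];
  let running = false;

  function next() {
    if (tasks.length === 0) {
      running = false;
      onDrain();
      return;
    }
    running = true;
    worker(tasks.shift(), next); // the worker calls next() when it is done
  }

  return {
    push(task) {
      tasks.push(task);
      if (!running) next();
    }
  };
}

// Hypothetical helper: pause the consumer on every incoming message, queue
// the message for processing, and resume once the queue has drained.
function throttleConsumer(consumer, handleMessage) {
  const queue = createQueue(handleMessage, () => consumer.resume());
  consumer.on('message', (message) => {
    consumer.pause();
    queue.push(message);
  });
  return queue;
}
```

handleMessage receives the message and a done callback, so slow processing (a database write, say) holds back further fetching until the backlog is cleared.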

hyperlink on 19 Jul 2016

So would it be correct to say that the high-level versions should only be used when you want to "randomly" split the load across partitions for load balancing, and the order in which messages are processed does not matter?

tony-kerz on 16 Jan 2017

👍1

@tony-kerz that sounds reasonable.

hyperlink on 19 Jan 2017

@hyperlink, @peterjuras
How do I specify the number of partitions while creating topics through HighLevelProducer? I see that we can specify it in the normal Producer's createTopic method, but not in the createTopics method of HighLevelProducer.

MUI-Pop on 14 Jun 2018

I think you have to statically specify the number of partitions in the kafka config of each node and can't do that dynamically on the creation of a new topic. I might be wrong though.

Also, dynamic creation of topics might not always be what you want (see e.g. https://stackoverflow.com/questions/43563977/can-a-kafka-producer-create-topics-and-partitions/43625219)

peterjuras on 14 Jun 2018

@peterjuras
That makes sense. Thank you.

MUI-Pop on 15 Jun 2018

@MUI-Pop not published yet, but #958 added the ability to create topics using Kafka's admin protocol and gives you control over the number of replicas and partitions.

hyperlink on 28 Jun 2018
