Sarama: HashPartitioner cast uint32 to int32

Created on 26 Apr 2018  Â·  5Comments  Â·  Source: Shopify/sarama

Versions

Sarama Version: e8552c0 (latest master)
Kafka Version: 1.0.0
Go Version: 1.10

Configuration

Standard Configuration.

Logs

No specific logs.

Problem Description

https://github.com/Shopify/sarama/blob/e8552c0dbf502cd7ec9186b57897e6cf9f15fb34/partitioner.go#L126

The p.hasher.Sum32() is being cast to int32. The default hasher sarama uses is FNV-1. In golang, Sum32() for FNV-1 returns uint32, and it can overflow if cast to int32.

In my case, I'm using custom hash (xxHash32) and with the key="123", the hash value="3062191159" which overflows int32. It would be nice to not have the p.hasher.Sum32() cast to int32.

-( producer

Most helpful comment

Ah, this is true... hmm.

The main problem with fixing it is that it's basically a breaking change; we need to maintain the consistency of the hash function between versions for existing users. I think the best we can do is implement a NewUnsignedHashPartitioner or something which people can manually configure.

If we ever do a sarama 2.0 with breaking changes we could fix it for the default at that point. Added to https://github.com/Shopify/sarama/wiki/Ideas-that-will-break-backwards-compatibility

All 5 comments

It's not clear to me how this is a problem? The partition chosen is still valid, consistent, and well-distributed.

The problem is the result is different when integrating with software written in other languages. We have a program that is written in Rust and using Rust Kafka Crate. They are using uint32 for Sum32, so the result is different comparing with golang Sarama.

By casting to signed int32 and making the result non-negative, you are essentially changing the hash algorithm, which is a problem when expecting a hash partitioner to behave the same way across the implementations.

Ah, this is true... hmm.

The main problem with fixing it is that it's basically a breaking change; we need to maintain the consistency of the hash function between versions for existing users. I think the best we can do is implement a NewUnsignedHashPartitioner or something which people can manually configure.

If we ever do a sarama 2.0 with breaking changes we could fix it for the default at that point. Added to https://github.com/Shopify/sarama/wiki/Ideas-that-will-break-backwards-compatibility

Just to follow up: we've already had to adjust this once for different implementations of the absolute-value function (https://github.com/Shopify/sarama/commit/6967cdb516216f65c91b9ae11431901c8b12ecf5).

I would also note that the HashPartitioner code is fairly simple and the interface is public, so you're free to implement the entirely new partitioner behaviour yourself.

Thanks for the feedback. We are now using the entirely new custom hash partitioner in our project.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

thomaslee picture thomaslee  Â·  7Comments

qiuyesuifeng picture qiuyesuifeng  Â·  3Comments

ruixule picture ruixule  Â·  5Comments

chandradeepak picture chandradeepak  Â·  3Comments

blackfox1983 picture blackfox1983  Â·  3Comments