Google-cloud-go: pubsub: pull message slow after update

Created on 30 Jun 2020 · 14 Comments · Source: googleapis/google-cloud-go

Client

PubSub v1.4.0

Environment

GKE

Go Environment

$ go version
go version go1.14.4 linux/amd64

Code

I used the following configuration with v1.2.0, and it read all messages with high throughput. With the same configuration on Pub/Sub v1.4.0 (I upgraded recently), read throughput is very slow.

subscription.ReceiveSettings.MaxOutstandingMessages: 10
subscription.ReceiveSettings.NumGoroutines: 1

Expected behavior

The message read rate is very high, which leads to high throughput.

Actual behavior

The message read rate is very slow, and some of the subscriptions are not pulling any messages.

pubsub p2 investigating bug

All 14 comments

Hi, thanks for taking the time to file an issue.

What's your publish rate, and how long does it take for you to handle an individual message?

Also, there's been a change recently where Pub/Sub version is no longer tied to the general cloud.google.com/go package version. The latest version of cloud.google.com/go/pubsub is actually v1.4.0. Can you check to see if you've pulled that in?

I am really sorry for the confusion. Yes, I am using pubsub 1.4. My cloud.google.com/go version was 0.53.0 which is now v0.59.0.

The publishing rate is around 150k-200k messages per second. We run 3 processes, each with 4 CPUs, but none of the processes is using its full CPU or memory. Handling is good at the start, but after a while it processes only around half of the messages.

Can you try increasing MaxOutstandingMessages? The default is 1000, so 10 is sort of low. MaxOutstandingMessages controls how many messages can be pulled in by the client at once, as well as the number of callback functions spawned to handle messages.

@hongalex We've recently had a similar situation, where an update to the latest pubsub version (v1.4.0, although we saw the same in v1.3.1) caused our subscribers to stop pulling messages. We've experienced this in services which have a reasonably high number of subscribers (> 70). In these cases, subscription.Receive (in async mode) would block forever without receiving any messages.

We then tried manually pulling messages (via a standalone SubscriptionClient), making the grpc calls ourselves. In this case, we were always receiving a fraction of the messages we were asking for.

Eventually, we have found out that by increasing the number of connections available to the SubscriberClient (bigger connection pool), the issue goes away (probably meaning the connection was saturated). Which brings us to this. Why is the library leaving the SubscriberClient with a single connection while the PublisherClient can have up to 3 (assuming numConns=4)? Is there any reason behind this?

The way we've worked around this is by instantiating:

  • a pubsub.Client with pubsub.NewClient (which we use to do things like managing topics, subscriptions and publishing)
  • a pubsub.SubscriberClient with pubsub.NewSubscriberClient (which we exclusively use to call Pull to get new messages on a loop, building the request message ourselves)

Would you consider this a typical use case? Or would you expect most people to stick with just the pubsub.Client for all operations? If so, why does the subscriber only get one of all the connections?

@jesushernandez Yes we have around 200+ subscriptions and I can see that it is not pulling from some of the subscriptions at all.

@jesushernandez could you please tell me whether you also needed to raise the settings below after changing the number of connections?

subscription.ReceiveSettings.MaxOutstandingMessages: 10
subscription.ReceiveSettings.NumGoroutines: 1

@smit-aterlo We are not using the subscription.Receive API anymore. Instead, we're using the grpc endpoint to pull messages. Here's our code https://github.com/lileio/pubsub/blob/master/providers/google/google.go

It used to be much simpler with the streaming pull via subscription.Receive but we were having the issues we discussed above with it.

OK thank you very much for the help @jesushernandez

No problem. Treat that code as inspiration only: it is still under active development and still lacks proper testing.

In any case, this is an interim solution. I'm still waiting to know if there's anything else we could have tried or we're doing wrong.

Yes, that is true. Meanwhile, @hongalex, if you have a solution that works with the subscription.Receive API, please let us know.

Hiya, apologies for the delay. Can y'all try updating to cloud.google.com/go/pubsub v1.6.0 to see if this fixes your issue?

I can confirm our subscriptions are now pulling messages much faster and no messages are stuck. Thank you, @hongalex!

Yes, it resolved the issue we were having. Thank you for the help
