Google-cloud-python: PubSub Error Handling/Health check

Created on 18 Dec 2017 · 5 Comments · Source: googleapis/google-cloud-python

We're experiencing issues where our subscription client stops pulling messages from topics that still have messages. No error messages come through to explain what is happening. We're running the subscription client as a Docker container in a Kubernetes cluster. Since the subscriber runs in background threads, the main thread's time.sleep(30) loop never fails, so Kubernetes doesn't know to kill the pod and restart it.

https://cloud.google.com/pubsub/docs/pull#subscriber-error-listener-go doesn't have a Python code example. What do you recommend for listening for errors, and what should I use as a health check for Kubernetes?

Labels: pubsub, investigating

All 5 comments

@rwillard I'd like to help more. Could you share a few more details:

  • Output of pip show google-cloud-pubsub
  • Could you provide a code snippet showing what you're doing, what you expect to happen and what actually happens?

google-cloud-pubsub==0.29.0

import logging
import time
from google.cloud import pubsub_v1


class Subscriber():
    def __init__(self, project, topic, subscription, callback, max_messages=1):
        self.subscriber = pubsub_v1.SubscriberClient()
        self.subscription_path = self.subscriber.subscription_path(project, subscription)
        self.flow_control = pubsub_v1.types.FlowControl(max_messages=max_messages)
        self.logger = logging.getLogger(__name__)
        self.callback = callback

    def run(self):
        self.logger.info('Initializing subscriber...')
        self.subscriber.subscribe(self.subscription_path, callback=self.callback, flow_control=self.flow_control)

        while True:
            time.sleep(60)

The problem was that the subscriber was no longer pulling messages. Previously, with blocking calls, an unhandled exception from a pull (or anything else) would crash the application. Now that we're using this library, an error in a background thread doesn't stop the main process. How can I programmatically verify that the background threads are working correctly, in this case actually pulling from Pub/Sub?

@rwillard First, let me recommend upgrading to google-cloud-pubsub==0.30.0. There have been 5 releases since 0.29.0, each of which has important bugfixes.

As for "How can I programmatically verify that the background threads are working correctly":

Instead of calling

self.subscriber.subscribe(
    self.subscription_path, 
    callback=self.callback,
    flow_control=self.flow_control)

you could call

policy = self.subscriber.subscribe(
    self.subscription_path, 
    flow_control=self.flow_control)
future = policy.open(self.callback)

and then you can check whether the subscriber is still working by checking future.done(). If future.done() is True, the subscriber has stopped, and future.exception() will tell you what the failure was. The only way future.done() can be True in the case of "success" is if you call policy.close(). (This was not true before 0.29.3/0.29.4 due to a bug.)
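
As a rough sketch of how that check might replace the bare sleep loop in run(): the helpers below poll any object that exposes done() and exception(), which is what the future returned by policy.open() provides. The names check_subscriber and run_until_failure, and the injectable sleep parameter, are hypothetical and not part of google-cloud-pubsub.

```python
import time


def check_subscriber(future):
    """Return (healthy, error) for a subscriber future.

    The future counts as healthy while done() is False. Once done() is
    True, exception() is the failure, or None if close() was called.
    """
    if not future.done():
        return True, None
    return False, future.exception()


def run_until_failure(future, poll_interval=30.0, sleep=time.sleep):
    """Block until the future is done, then return its exception (or None).

    `sleep` is injectable only so the loop can be exercised without waiting.
    """
    while True:
        healthy, error = check_subscriber(future)
        if not healthy:
            return error
        sleep(poll_interval)
```

In run(), this could stand in for the while True: time.sleep(60) loop. Logging the returned error and then exiting with a nonzero status (for example sys.exit(1)) would let Kubernetes restart the container under the pod's restart policy.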

You could also do:

policy = self.subscriber.subscribe(
    self.subscription_path, 
    callback=self.callback,
    flow_control=self.flow_control)
future = policy.future

though policy.future (a read-only @property backed by the policy._future attribute) is subject to change. (If you call close(), it will be set to None.)
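
For the Kubernetes health check part of the question, one option that is not spelled out in this thread is a heartbeat file: the main thread touches a file for as long as the future is not done, and an exec liveness probe fails once the file goes stale. This is a minimal sketch under that assumption. The function names, the file path and the intervals are all hypothetical.

```python
import os
import time


def touch_heartbeat(path):
    """Create the file if needed and set its modification time to now."""
    with open(path, 'a'):
        os.utime(path, None)


def heartbeat_is_fresh(path, max_age_seconds, now=None):
    """Return True if `path` was touched within the last `max_age_seconds`."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
    except OSError:
        return False  # no heartbeat file yet, or it was removed
    return age <= max_age_seconds


def beat_while_running(future, path, interval=10.0, sleep=time.sleep):
    """Touch `path` every `interval` seconds until `future` is done.

    Returns future.exception(), which is None if close() was called.
    """
    while not future.done():
        touch_heartbeat(path)
        sleep(interval)
    return future.exception()
```

The probe side could be a small script that calls heartbeat_is_fresh('/tmp/subscriber-heartbeat', 60) and exits nonzero when it returns False. Kubernetes treats a nonzero exit from an exec liveness probe as a failure and restarts the container.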

As for "The problem was the subscriber was no longer pulling messages":

This is likely because flow control was essentially totally broken before 0.29.4.

I am going to pre-emptively close this issue, since it seems like flow control was the problem.

If you re-run your code on the latest version (0.30.0) and the problem still persists, let's re-open this issue and continue the discussion.

Cheers and thanks for taking the time to share details!

@dhermes I appreciate your speedy, helpful and thorough responses. Thank you!
