Full context at https://github.com/GoogleCloudPlatform/google-cloud-python/issues/4841#issuecomment-386988007 downwards.
gist with logs: https://gist.github.com/psalaberria002/e91956d94e82d706d65da7111a40ca24
Possible culprits:
- send() and recv() expectedly error on next(self.call), but if that was a retryable error that somehow re-bubbled, it would not trigger any callbacks.
- bidi uses rpc._wrapped because api_core doesn't appropriately expose add_done_callback. This means that bidi unwraps all of the retry logic.
- This doesn't explain why retrying a perfectly normal-looking error somehow still leads us to a broken stream, unless the newly opened stream also has an error. But it's unclear why that wouldn't happen when calling the initial RPC.
Can anyone tell me how I can automatically kill the subscriber when this happens? Restarting the process works, so killing it whenever this event happens and starting it again will do the trick.
I just need to know how my code can tell that this exception was raised.
As noted above, the core issue here is that the lower-level code is not surfacing this error in a way that can be handled by the user. There is no workaround for now. When this is fixed, the future will appropriately raise the exception.
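For reference, once that fix lands and the error is surfaced through the future, a restart loop along these lines should be enough (a minimal sketch, assuming the fixed client re-raises the stream error from future.result(); project, subscription and callback are placeholders):

```
from google.api_core import exceptions
from google.cloud import pubsub_v1


def listen_forever(project, subscription_name, callback):
    subscriber = pubsub_v1.SubscriberClient()
    subscription_path = subscriber.subscription_path(project, subscription_name)

    while True:
        future = subscriber.subscribe(subscription_path, callback)
        try:
            # Blocks until the streaming pull terminates; with the fix in
            # place this should raise the error that killed the stream.
            future.result()
        except KeyboardInterrupt:
            future.cancel()
            break
        except exceptions.GoogleAPICallError as exc:
            # The stream died with an API error; log it and open a new one.
            print("Stream terminated ({}), restarting subscriber.".format(exc))
```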
@theacodes Any update on this? It's stopping every 2 hours now. Very weird. Could you explain the meaning of the log? I just couldn't understand why it's happening. I'm curious.
The log:
Thread-ConsumeBidirectionalStream caught error 503 The service was unable to fulfill your request. Please try again. [code=8a75] and will exit. Generally this is due to the RPC itself being cancelled and the error will be surfaced to the calling code.
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py", line 439, in _thread_main
response = self._bidi_rpc.recv()
File "/usr/local/lib/python2.7/dist-packages/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py", line 365, in recv
super(ResumableBidiRpc, self).recv)
File "/usr/local/lib/python2.7/dist-packages/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py", line 357, in _recoverable
return method(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py", line 258, in recv
return next(self.call)
File "/usr/local/lib/python2.7/dist-packages/grpc/_channel.py", line 350, in next
return self._next()
File "/usr/local/lib/python2.7/dist-packages/google/api_core/grpc_helpers.py", line 56, in error_remapped_callable
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "/usr/local/lib/python2.7/dist-packages/six.py", line 737, in raise_from
raise value
ServiceUnavailable: 503 The service was unable to fulfill your request. Please try again. [code=8a75]
[2018-05-15 14:39:07] {/usr/local/lib/python2.7/dist-packages/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py:455} INFO - Thread-ConsumeBidirectionalStream exiting
[2018-05-15 14:39:08] {/usr/local/lib/python2.7/dist-packages/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py:110} DEBUG - Empty queue and inactive call, exiting request generator.
[2018-05-15 14:39:08] {/usr/local/lib/python2.7/dist-packages/google/cloud/pubsub_v1/subscriber/_protocol/leaser.py:164} INFO - Thread-LeaseMaintainer exiting.
I find the subscriber in 0.32.0 more stable than subscriber_experimental in 0.34.0. It stops too, but only after a day or two, unlike subscriber_experimental, which goes offline every two hours.
I'm encountering exactly the same issue. What is currently the recommended approach for building an application that continuously awaits messages and processes them, until the exact cause is identified and solved? Are there older versions we can use?
Hi folks - I am actively working on this. Please keep in mind that this is a beta library that is using a beta feature of the Pub/Sub API. Issues like this are why we haven't yet declared this library (and the endpoint) GA. We appreciate your feedback and testing!
In the meantime, you can try using the google-api-python-client and the non-streaming pull method. You won't get anywhere near the throughput you would with streaming pull, but for many use cases this is totally acceptable.
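For anyone who wants to try that in the meantime, a rough sketch of the non-streaming (synchronous) pull via google-api-python-client might look like this (assuming application default credentials; the project and subscription names are placeholders, and message data comes back base64-encoded):

```
import base64

from googleapiclient import discovery


def pull_loop(project, subscription_name):
    service = discovery.build('pubsub', 'v1')
    sub_path = 'projects/{}/subscriptions/{}'.format(project, subscription_name)

    while True:
        # Synchronous pull: returns up to maxMessages messages per call.
        response = service.projects().subscriptions().pull(
            subscription=sub_path,
            body={'returnImmediately': False, 'maxMessages': 10},
        ).execute()

        ack_ids = []
        for received in response.get('receivedMessages', []):
            data = base64.b64decode(received['message'].get('data', ''))
            print(data)
            ack_ids.append(received['ackId'])

        if ack_ids:
            service.projects().subscriptions().acknowledge(
                subscription=sub_path,
                body={'ackIds': ack_ids},
            ).execute()
```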
Hi @theacodes, thank you very much. I understand the beta state; thanks for the links. I'm looking forward to testing the latest updates.
Hi @theacodes ,
I tested 0.35.0 this morning, using the following code:
```
subscriber = pubsub.SubscriberClient()
subscription = subscriber.subscription_path(options.project, options.subscription)
print("Starting")

while True:
    try:
        future = subscriber.subscribe(subscription, callback)
        future.result()
    except KeyboardInterrupt as e:
        future.cancel()
        break
    except Exception as e:
        print("Reloading subscriber due to faillure ({0}).".format(e.__class__.__name__))
```
The goal is to re-open the subscription whenever something goes wrong. Unfortunately, after about 3 minutes the following occurs, which is the same issue encountered with 0.34.0 and reported in #4841:
```
[D 180528 14:36:29 bidi:466] Thread-ConsumeBidirectionalStream caught error 503 The service was unable to fulfill your request. Please try again. [code=8a75] and will exit. Generally this is due to the RPC itself being cancelled and the error will be surfaced to the calling code.
Traceback (most recent call last):
File "/home/javdrher/.virtualenvs/recommender-api-backend/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 76, in next
return six.next(self._wrapped)
File "/home/javdrher/.virtualenvs/recommender-api-backend/lib/python3.6/site-packages/grpc/_channel.py", line 341, in __next__
return self._next()
File "/home/javdrher/.virtualenvs/recommender-api-backend/lib/python3.6/site-packages/grpc/_channel.py", line 335, in _next
raise self
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, The service was unable to fulfill your request. Please try again. [code=8a75])>
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/javdrher/.virtualenvs/recommender-api-backend/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py", line 366, in _recoverable
return method(*args, **kwargs)
File "/home/javdrher/.virtualenvs/recommender-api-backend/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py", line 258, in recv
return next(self.call)
File "/home/javdrher/.virtualenvs/recommender-api-backend/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 78, in next
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "<string>", line 3, in raise_from
google.api_core.exceptions.ServiceUnavailable: 503 The service was unable to fulfill your request. Please try again. [code=8a75]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/javdrher/.virtualenvs/recommender-api-backend/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 76, in next
return six.next(self._wrapped)
File "/home/javdrher/.virtualenvs/recommender-api-backend/lib/python3.6/site-packages/grpc/_channel.py", line 341, in __next__
return self._next()
File "/home/javdrher/.virtualenvs/recommender-api-backend/lib/python3.6/site-packages/grpc/_channel.py", line 335, in _next
raise self
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, The service was unable to fulfill your request. Please try again. [code=8a75])>
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/javdrher/.virtualenvs/recommender-api-backend/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py", line 457, in _thread_main
response = self._bidi_rpc.recv()
File "/home/javdrher/.virtualenvs/recommender-api-backend/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py", line 383, in recv
super(ResumableBidiRpc, self).recv)
File "/home/javdrher/.virtualenvs/recommender-api-backend/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py", line 375, in _recoverable
return method(*args, **kwargs)
File "/home/javdrher/.virtualenvs/recommender-api-backend/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py", line 258, in recv
return next(self.call)
File "/home/javdrher/.virtualenvs/recommender-api-backend/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 78, in next
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "<string>", line 3, in raise_from
google.api_core.exceptions.ServiceUnavailable: 503 The service was unable to fulfill your request. Please try again. [code=8a75]
[I 180528 14:36:29 bidi:473] Thread-ConsumeBidirectionalStream exiting
[D 180528 14:36:29 bidi:110] Empty queue and inactive call, exiting request generator.
[I 180528 14:36:30 leaser:164] Thread-LeaseMaintainer exiting.
```
No exception seems to reach the handler.
Python 3.6.3, libraries:
google-api-core==1.2.0
google-auth==1.4.1
google-cloud-pubsub==0.35.0
googleapis-common-protos==1.5.3
grpc-google-iam-v1==0.11.4
grpcio==1.12.0
Thank you for testing this, @anmol1vw13! I still haven't been able to reproduce this myself, but it would be super helpful if you could enable debug-level logging and provide the full log of everything. If it's too big to send over a GitHub gist, please feel free to email it to me.
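For anyone wondering how to do that, debug logging for the subscriber internals can be turned on with something like this (a minimal sketch):

```
import logging

# Log everything, including the bidi/streaming_pull_manager debug output.
logging.basicConfig(level=logging.DEBUG)

# Or scope it to the Pub/Sub client internals only:
logging.getLogger('google.cloud.pubsub_v1').setLevel(logging.DEBUG)
```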
Note to (future) self:
This is due to https://github.com/GoogleCloudPlatform/google-cloud-python/blob/26f1a97ec06b5c48df7a3c9eec00ae3985657164/pubsub/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py#L375 assuming that the method will always succeed. For some reason, it can fail as well.
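Roughly, the pattern in question looks like this (a simplified standalone sketch, not the actual library code, with should_recover/reopen/finalize standing in for the library's internals):

```
def recoverable_call(method, should_recover, reopen, finalize, *args, **kwargs):
    """Simplified sketch of the resumable-call pattern described above."""
    try:
        return method(*args, **kwargs)
    except Exception as exc:
        if not should_recover(exc):
            finalize(exc)
            raise
        reopen()
        # The retried call is assumed to always succeed; if it raises again,
        # the error escapes without the done callbacks ever firing.
        return method(*args, **kwargs)
```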
Assuming you meant to refer to me: that log was created with debugging on. The full log can be found here:
https://gist.github.com/javdrher/90ebab7a171466f9b9c457f28d95ac09
It occurs frequently; each time I run it, this happens after approximately 3 minutes.
@javdrher interesting. It seems like in your case you never received any messages - is that expected?
Can you tell me more about your environment? What credentials are you using?
This is indeed the case; I wasn't sending any messages through. It's currently running in a development environment with no live requests passing through it yet. The client was running on my laptop, with standard gcloud authentication.
I also have an implementation that is now being integrated into our system, running on a Kubernetes cluster with a service account with the Pub/Sub publisher role. That implementation currently runs 0.34.0, with the following code:
```
future = subscriber.subscribe_experimental(subscription, callback)

while True:
    if not future._manager._consumer.is_active:
        logging.info("Reloading subscriber due to known faillure.")
        future = subscriber.subscribe_experimental(subscription, loader)
    time.sleep(60)
```
The log of that pod is as follows (it runs at logging level INFO, but I know from previous runs that the debug log looks the same as the one I posted):
sales-loader | May 28, 2018, 7:23:54 PM | [I 180528 17:23:54 loader:39] Reloading subscriber due to known faillure.
sales-loader | May 28, 2018, 7:23:28 PM | [I 180528 17:23:28 leaser:164] Thread-LeaseMaintainer exiting.
sales-loader | May 28, 2018, 7:23:22 PM | [I 180528 17:23:22 bidi:455] Thread-ConsumeBidirectionalStream exiting
sales-loader | May 28, 2018, 7:19:52 PM | [I 180528 17:19:52 loader:39] Reloading subscriber due to known faillure.
sales-loader | May 28, 2018, 7:19:36 PM | [I 180528 17:19:36 leaser:164] Thread-LeaseMaintainer exiting.
sales-loader | May 28, 2018, 7:19:31 PM | [I 180528 17:19:31 bidi:455] Thread-ConsumeBidirectionalStream exiting
sales-loader | May 28, 2018, 7:15:51 PM | [I 180528 17:15:51 loader:39] Reloading subscriber due to known faillure.
sales-loader | May 28, 2018, 7:14:56 PM | [I 180528 17:14:56 leaser:164] Thread-LeaseMaintainer exiting.
sales-loader | May 28, 2018, 7:14:56 PM | [I 180528 17:14:56 bidi:455] Thread-ConsumeBidirectionalStream exiting
sales-loader | May 28, 2018, 7:12:50 PM | [I 180528 17:12:50 loader:39] Reloading subscriber due to known faillure.
sales-loader | May 28, 2018, 7:11:58 PM | [I 180528 17:11:58 leaser:164] Thread-LeaseMaintainer exiting.
sales-loader | May 28, 2018, 7:11:57 PM | [I 180528 17:11:57 bidi:455] Thread-ConsumeBidirectionalStream exiting
sales-loader | May 28, 2018, 7:08:49 PM | [I 180528 17:08:49 loader:39] Reloading subscriber due to known faillure.
sales-loader | May 28, 2018, 7:08:10 PM | [I 180528 17:08:10 leaser:164] Thread-LeaseMaintainer exiting.
sales-loader | May 28, 2018, 7:08:07 PM | [I 180528 17:08:07 bidi:455] Thread-ConsumeBidirectionalStream exiting
sales-loader | May 28, 2018, 7:04:48 PM | [I 180528 17:04:48 loader:39] Reloading subscriber due to known faillure.
sales-loader | May 28, 2018, 7:04:22 PM | [I 180528 17:04:22 leaser:164] Thread-LeaseMaintainer exiting.
sales-loader | May 28, 2018, 7:04:20 PM | [I 180528 17:04:20 bidi:455] Thread-ConsumeBidirectionalStream exiting
...
It's very strange to see it fail that often, but that is on 0.34, so I'd be interested to see how it behaves in your cluster with 0.35. On my end, I've literally run the thing for 4 days without any fatal errors, so this is a fun one.
If you try using service account key authentication locally, does that make a difference?
Modified the code as follows for 0.35.0:
```
future = subscriber.subscribe(subscription, loader)

while True:
    if not future._manager._consumer.is_active:
        print("Reloading subscriber due to known faillure.")
        future = subscriber.subscribe(subscription, callback)
    time.sleep(60)
```
With the service account locally:
https://gist.github.com/javdrher/df44f6521ae352cde463aa7af7d97607
On Kubernetes (GKE, version 1.8.10-gke.0), no signs of improvement either:
sales-loader | May 29, 2018, 10:25:42 AM | [I 180529 08:25:42 loader:38] Reloading subscriber due to known faillure.
sales-loader | May 29, 2018, 10:25:05 AM | [I 180529 08:25:05 leaser:164] Thread-LeaseMaintainer exiting.
sales-loader | May 29, 2018, 10:25:05 AM | [I 180529 08:25:05 bidi:473] Thread-ConsumeBidirectionalStream exiting
sales-loader | May 29, 2018, 10:22:42 AM | [I 180529 08:22:42 loader:38] Reloading subscriber due to known faillure.
sales-loader | May 29, 2018, 10:22:05 AM | [I 180529 08:22:05 leaser:164] Thread-LeaseMaintainer exiting.
sales-loader | May 29, 2018, 10:21:58 AM | [I 180529 08:21:58 bidi:473] Thread-ConsumeBidirectionalStream exiting
sales-loader | May 29, 2018, 10:18:42 AM | [I 180529 08:18:42 loader:38] Reloading subscriber due to known faillure.
sales-loader | May 29, 2018, 10:18:17 AM | [I 180529 08:18:17 leaser:164] Thread-LeaseMaintainer exiting.
sales-loader | May 29, 2018, 10:18:15 AM | [I 180529 08:18:15 bidi:473] Thread-ConsumeBidirectionalStream exiting
sales-loader | May 29, 2018, 10:14:42 AM | [I 180529 08:14:42 loader:38] Reloading subscriber due to known faillure.
sales-loader | May 29, 2018, 10:14:30 AM | [I 180529 08:14:30 leaser:164] Thread-LeaseMaintainer exiting.
sales-loader | May 29, 2018, 10:14:24 AM | [I 180529 08:14:24 bidi:473] Thread-ConsumeBidirectionalStream exiting
sales-loader | May 29, 2018, 10:11:44 AM | [I 180529 08:11:44 leaser:164] Thread-LeaseMaintainer exiting.
sales-loader | May 29, 2018, 10:11:42 AM | [I 180529 08:11:42 loader:38] Reloading subscriber due to known faillure.
sales-loader | May 29, 2018, 10:11:41 AM | [I 180529 08:11:41 bidi:473] Thread-ConsumeBidirectionalStream exiting
sales-loader | May 29, 2018, 10:08:41 AM | [I 180529 08:08:41 loader:38] Reloading subscriber due to known faillure.
sales-loader | May 29, 2018, 10:08:22 AM | [I 180529 08:08:22 leaser:164] Thread-LeaseMaintainer exiting.
sales-loader | May 29, 2018, 10:08:16 AM | [I 180529 08:08:16 bidi:473] Thread-ConsumeBidirectionalStream exiting
I also see this issue in v0.35.0. It happens during a period where no message is published to the subscribed topic for roughly 3 to 3.5 minutes. Once the error occurs, no further messages are received until the process is manually killed and restarted. No error bubbles up to the future, and likewise future.done() always reports False, leaving no simple way of detecting the scenario.
In our use case, we expect there to be periods of very low activity (particularly overnight), and one topic usually has messages published at a rate of about one every five minutes. Subscribers to this topic would fail immediately, and subscribers to other topics with higher activity during the day would fail at the end of the day when activity falls.
Logs are at https://gist.github.com/sjagoe/8b360af13e9c4edb3e76f1da735950ce
The code I'm using is taken from the main example in the documentation.
```
from google.cloud import pubsub_v1


def callback(message):
    message.ack()


def run():
    subscriber = pubsub_v1.SubscriberClient()
    subscription_path = subscriber.subscription_path('my-project', 'my-subscription')
    future = subscriber.subscribe(subscription_path, callback)
    future.result()
```
Thanks, this should mostly fix the propagation of the exception? When it's released I'll check the stability as well.
Yep
@theacodes This seems to have introduced an error. I now have this at the tail of the logs. Otherwise, the behaviour is the same; the error does not bubble up to the future and the subscriber hangs.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py", line 456, in _thread_main
response = self._bidi_rpc.recv()
File "/usr/local/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py", line 382, in recv
super(ResumableBidiRpc, self).recv)
File "/usr/local/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py", line 373, in _recoverable
self._finalize(exc)
File "/usr/local/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py", line 323, in _finalize
callback(result)
File "/usr/local/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/streaming_pull_manager.py", line 424, in _on_rpc_done
self.close(reason=future)
File "/usr/local/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/streaming_pull_manager.py", line 321, in close
self._consumer.stop()
File "/usr/local/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/bidi.py", line 493, in stop
self._thread.join()
File "/usr/local/lib/python3.6/threading.py", line 1053, in join
raise RuntimeError("cannot join current thread")
RuntimeError: cannot join current thread
2018-05-30 06:04:12,489 DEBUG [google.cloud.pubsub_v1.subscriber._protocol.bidi:120] Cleanly exiting request generator.
2018-05-30 06:04:12,490 INFO [google.cloud.pubsub_v1.subscriber._protocol.bidi:472] Thread-ConsumeBidirectionalStream exiting
2018-05-30 06:04:12,582 INFO [google.cloud.pubsub_v1.subscriber._protocol.leaser:164] Thread-LeaseMaintainer exiting.
I'm investigating this now.
Great news, everyone - I was not only able to reproduce this issue (finally!) and fix the exception bubbling and the thread-join issue, but I also discovered the underlying cause, so hopefully you won't see as many top-level exceptions.
I need to get one more PR in before we can release. :)
Thanks, @theacodes! Looking forward to trying the next release.
I can confirm that this appears fixed in v0.35.2.
2018-05-31 07:42:03,438 INFO [root:436] Observed recoverable stream error 504 Deadline Exceeded
2018-05-31 07:42:03,439 INFO [root:436] Observed recoverable stream error 504 Deadline Exceeded
2018-05-31 07:42:03,440 INFO [google.cloud.pubsub_v1.subscriber._protocol.bidi:361] Re-established stream
And I'm still receiving messages after this.
> I can confirm that this appears fixed in v0.35.2.
Hooray! Thank you for testing.
Locally it runs perfectly. On GKE I get:
sales-loader | May 31, 2018, 9:45:48 AM | [I 180531 07:45:48 bidi:361] Re-established stream
sales-loader | May 31, 2018, 9:45:48 AM | [I 180531 07:45:48 bidi:361] Re-established stream
sales-loader | May 31, 2018, 9:45:48 AM | [I 180531 07:45:48 streaming_pull_manager:436] Observed recoverable stream error 504 Deadline Exceeded
sales-loader | May 31, 2018, 9:45:48 AM | [I 180531 07:45:48 streaming_pull_manager:436] Observed recoverable stream error 504 Deadline Exceeded
sales-loader | May 31, 2018, 9:45:48 AM | [I 180531 07:45:48 streaming_pull_manager:436] Observed recoverable stream error 504 Deadline Exceeded
sales-loader | May 31, 2018, 9:45:48 AM | [I 180531 07:45:48 streaming_pull_manager:436] Observed recoverable stream error 504 Deadline Exceeded
sales-loader | May 31, 2018, 9:35:49 AM | [I 180531 07:35:49 bidi:361] Re-established stream
sales-loader | May 31, 2018, 9:35:49 AM | [I 180531 07:35:49 bidi:361] Re-established stream
sales-loader | May 31, 2018, 9:35:49 AM | [I 180531 07:35:49 streaming_pull_manager:436] Observed recoverable stream error 504 Deadline Exceeded
sales-loader | May 31, 2018, 9:35:49 AM | [I 180531 07:35:49 streaming_pull_manager:436] Observed recoverable stream error 504 Deadline Exceeded
but all seems to be managed perfectly.
Thanks!
Those are all normal! Thank you for testing.
I am still experiencing some issues. In the logs you guys posted, I cannot see this log entry: Thread-ConsumeBidirectionalStream exiting.
I experienced the same behaviour as you are all describing:
INFO | 13:25:18 | root | Observed recoverable stream error 504 Deadline Exceeded
INFO | 13:25:18 | root | Observed recoverable stream error 504 Deadline Exceeded
INFO | 13:25:18 | root | Observed recoverable stream error 504 Deadline Exceeded
INFO | 13:25:18 | root | Observed recoverable stream error 504 Deadline Exceeded
INFO | 13:25:18 | root | Observed recoverable stream error 504 Deadline Exceeded
INFO | 13:25:18 | root | Observed recoverable stream error 504 Deadline Exceeded
INFO | 13:25:18 | root | Observed recoverable stream error 504 Deadline Exceeded
INFO | 13:25:18 | root | Observed recoverable stream error 504 Deadline Exceeded
INFO | 13:25:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Re-established stream
INFO | 13:25:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Re-established stream
INFO | 13:25:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Re-established stream
INFO | 13:25:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Re-established stream
After that, messages were still being received, so that is good.
But a few minutes later this happened instead:
INFO | 13:35:17 | root | Observed recoverable stream error 504 Deadline Exceeded
INFO | 13:35:17 | root | Observed recoverable stream error 504 Deadline Exceeded
INFO | 13:35:17 | root | Observed recoverable stream error 504 Deadline Exceeded
INFO | 13:35:17 | root | Observed recoverable stream error 504 Deadline Exceeded
INFO | 13:35:17 | root | Observed recoverable stream error 504 Deadline Exceeded
INFO | 13:35:17 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Thread-ConsumeBidirectionalStream exiting
INFO | 13:35:17 | root | Observed recoverable stream error 504 Deadline Exceeded
INFO | 13:35:17 | root | Observed recoverable stream error 504 Deadline Exceeded
INFO | 13:35:17 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Re-established stream
INFO | 13:35:17 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Re-established stream
INFO | 13:35:17 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Re-established stream
INFO | 13:35:19 | google.cloud.pubsub_v1.subscriber._protocol.heartbeater | Thread-Heartbeater exiting.
INFO | 13:35:21 | google.cloud.pubsub_v1.subscriber._protocol.leaser | Thread-LeaseMaintainer exiting
After that, messages were not received anymore; the Thread-ConsumeBidirectionalStream had exited.
The heartbeat was still being sent, though.
I can see that this now occurs more often in version 0.35.x with the new subscriber. With DEBUG log level I can also see that this happens after the Thread-ConsumeBidirectionalStream exits:
INFO | 11:57:11 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Re-established stream
DEBUG | 11:57:11 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Empty queue and inactive call, exiting request generator.
DEBUG | 11:57:11 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Empty queue and inactive call, exiting request generator.
DEBUG | 11:57:11 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Empty queue and inactive call, exiting request generator.
DEBUG | 11:57:11 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Empty queue and inactive call, exiting request generator.
The way I am running my subscriber is this:
```
def receive_messages(self, subscription_name):
    subscription_path = self.subscriber.subscription_path(
        self.project, subscription_name
    )
    future = self.subscriber.subscribe(
        subscription_path, callback=self.callback, flow_control=self.flow_control
    )
    self.logger.info(
        'Listening for messages on %s subscription for topic %s',
        subscription_path, self.topic_name
    )
    try:
        future.result()
        self.logger.error('I should not get here')
    except Exception as exc:
        self.logger.error('Pub/Sub Error. Subscription %s stopped working.', subscription_name)
        raise exc
```
I have four subscriptions, which is why there are that many repetitions in the logs. Each subscription lives on a separate thread.
None of those logger.error() calls is ever printed, so the subscriber keeps hanging at future.result(), which means no exceptions are being propagated/finalised. Maybe I am doing something wrong, or using the wrong version?
google-api-core==1.2.0
google-auth==1.4.2
google-cloud-pubsub==0.35.2
googleapis-common-protos==1.5.3
grpc-google-iam-v1==0.11.4
grpcio==1.12.0
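For context, a minimal sketch of how a runner like the one described above might drive several subscriptions from separate threads (receive_messages and the per-thread layout come from the description; the client object and subscription list are assumptions):

```
import threading


def start_subscribers(client, subscription_names):
    threads = []
    for name in subscription_names:
        # Each subscription lives on its own thread, as described above.
        thread = threading.Thread(
            target=client.receive_messages, args=(name,), daemon=True
        )
        thread.start()
        threads.append(thread)
    return threads
```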
It would be really helpful to reduce it to one subscriber and see if this still shows up; that way we can make some sense of the logs. Otherwise the noise from the other ones makes it difficult to ascertain what's going on.
Good point.
Here is the tail of the log with one subscription:
DEBUG | 16:37:44 | google.cloud.pubsub_v1.subscriber._protocol.bidi | recved response.
DEBUG | 16:37:44 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager| Scheduling callbacks for 3 messages.
DEBUG | 16:37:44 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager| Sent request(s) over unary RPC.
DEBUG | 16:37:44 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 2 batched requests
DEBUG | 16:37:44 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager| Sent request(s) over unary RPC.
DEBUG | 16:37:44 | google.cloud.pubsub_v1.subscriber._protocol.bidi | waiting for recv.
DEBUG | 16:37:44 | clients.policies.base| Callback log (EDITED)
DEBUG | 16:37:44 | clients.policies.base| Callback log (EDITED)
DEBUG | 16:37:44 | clients.policies.base| Callback log (EDITED)
DEBUG | 16:37:44 | google.cloud.pubsub_v1.subscriber._protocol.bidi | recved response.
DEBUG | 16:37:44 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager| Scheduling callbacks for 1 messages.
INFO | 16:37:44 | root | Observed recoverable stream error 504 Deadline Exceeded
DEBUG | 16:37:44 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager| Sent request(s) over unary RPC.
DEBUG | 16:37:44 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Empty queue and inactive call, exiting request generator.
DEBUG | 16:37:44 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager| Sent request(s) over unary RPC.
INFO | 16:37:44 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Thread-ConsumeBidirectionalStream exiting
DEBUG | 16:37:44 | clients.policies.base| Callback log (EDITED)
INFO | 16:37:45 | google.cloud.pubsub_v1.subscriber._protocol.heartbeater | Thread-Heartbeater exiting.
INFO | 16:37:49 | google.cloud.pubsub_v1.subscriber._protocol.leaser | Thread-LeaseMaintainer exiting.
In fact, nothing else is being logged after that; in the previous logs I pasted, the other messages must have been coming from the other subscribers, as you were suggesting.
But the subscriber is hanging there...
This seems to happen every 10 minutes; maybe this is a configuration issue?
@jam182 can you give a few more details about your environment? Where are you running? How many messages are you receiving?
The subscriber app is running on a GCP instance within a Docker container, on Python 3.6.3.
Authentication is done by pointing GOOGLE_APPLICATION_CREDENTIALS at the JSON service account key. Subscriptions have a 10-second acknowledge deadline. At the moment I am testing with ~10 messages per second. All the subscriber does with each message is log it to a file and send an ack.
I am also testing locally with PyCharm and a virtualenv.
What else could be useful to know?
That should be enough to go on. I'm able to reproduce the "Deadline exceeded" but not the behavior of the consumer exiting; here's what my log looks like:
INFO:google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager:Observed recoverable stream error 504 Deadline Exceeded
INFO:google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager:Observed recoverable stream error 504 Deadline Exceeded
INFO:google.cloud.pubsub_v1.subscriber._protocol.bidi:Re-established stream
DEBUG:google.cloud.pubsub_v1.subscriber._protocol.bidi:Empty queue and inactive call, exiting request generator.
DEBUG:google.cloud.pubsub_v1.subscriber._protocol.heartbeater:Sent heartbeat.
The key difference here is the "Re-established stream" log.
I have been doing more tests. The consumer thread exiting does not always happen, so it is harder to reproduce. But it does happen. I can see that the Observed recoverable stream error 504 Deadline Exceeded happens exactly every 10 minutes.
In my current test run, for the first time, I noticed this has increased to happening exactly every two minutes, and I can now see a traceback:
INFO | 14:17:10 | root | Observed recoverable stream error 503 OS Error
INFO | 14:17:10 | root | Observed recoverable stream error 503 OS Error
ERROR | 14:17:10 | google.cloud.pubsub_v1.subscriber._protocol.helper_threads | Error in queue callback worker: 503 OS Error
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 54, in error_remapped_callable
return callable_(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/grpc/_channel.py", line 500, in __call__
return _end_unary_response_blocking(state, call, False, None)
File "/usr/lib/python3.6/site-packages/grpc/_channel.py", line 434, in _end_unary_response_blocking
raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, OS Error)>
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/helper_threads.py", line 112, in __call__
self._callback(items)
File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/dispatcher.py", line 101, in dispatch_callback
self.ack(batched_commands.pop(requests.AckRequest))
File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/dispatcher.py", line 121, in ack
self._manager.send(request)
File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/streaming_pull_manager.py", line 264, in send
self._send_unary_request(request)
File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/streaming_pull_manager.py", line 243, in _send_unary_request
ack_ids=list(request.ack_ids))
File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/_gapic.py", line 42, in <lambda>
fx = lambda self, *a, **kw: wrapped_fx(self.api, *a, **kw) # noqa
File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/gapic/subscriber_client.py", line 646, in acknowledge
self._acknowledge(request, retry=retry, timeout=timeout)
File "/usr/lib/python3.6/site-packages/google/api_core/gapic_v1/method.py", line 139, in __call__
return wrapped_func(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/google/api_core/retry.py", line 260, in retry_wrapped_func
on_error=on_error,
File "/usr/lib/python3.6/site-packages/google/api_core/retry.py", line 177, in retry_target
return target()
File "/usr/lib/python3.6/site-packages/google/api_core/timeout.py", line 206, in func_with_timeout
return func(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 56, in error_remapped_callable
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "<string>", line 3, in raise_from
google.api_core.exceptions.ServiceUnavailable: 503 OS Error
INFO | 14:17:10 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Re-established stream
This is the only other stack trace I was able to see. I am testing with 60 messages a second, still on v0.35.2.
Those logs seem fine. "Re-established stream" means that things are recovering as expected. It's when we see "Thread-ConsumeBidirectionalStream exiting" without an associated error on the future that is problematic.
Yeah, I see. It seems the folks above were initially having issues with the subscriber being idle for the most part and becoming unresponsive because the exception was not being propagated. Maybe they are not seeing it anymore because they have a very low message rate.
In fact, I can see that at a low traffic rate the Thread-ConsumeBidirectionalStream does not exit, so it is fine.
However, at a somewhat higher traffic rate the thread does sometimes exit. Not always, but we have long-running tests and we see it happening eventually. The traffic rate we are testing with is ~60 messages a second; I don't know if that helps to reproduce the behaviour. But at higher message rates something makes the consumer thread crash, and that exception is not being surfaced or logged.
What we see is that every 10 minutes the Observed recoverable stream error 504 Deadline Exceeded occurs, and most of the time it recovers with Re-established stream, but from time to time the consumer thread dies.
When that happens do you see any exception logged at all?
The only stack traces I got are the ones I pasted on this thread (and on https://github.com/GoogleCloudPlatform/google-cloud-python/issues/5392), but they don't say much. We also enabled debug-level logging on the root logger. I don't know how to get more logging, to be honest.
Alright. I'll see what I can do to try to reproduce.
Hey @jam182! I've been running the subscriber for 2 days and haven't been able to see this specific error. However, I have cut a new release, 0.35.3, that adds some more logging. Could you give it a try and share the logs if this recurs?
Hey, thanks!
Here are some new logs. I could reproduce the error three times; it's much harder in DEBUG mode:
1.
```
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.bidi | waiting for recv.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 3 batched requests
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.leaser | The current p99 value is 10 seconds.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.leaser | Renewing lease for 3 ack IDs.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 3 batched requests
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.leaser | Snoozing lease management for 3.072772 seconds.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.bidi | recved response.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Scheduling callbacks for 4 messages.
INFO | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Observed recoverable stream error 504 Deadline Exceeded
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
ERROR | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.bidi | The bidirectional RPC unexpectedly exited. <google.api_core.grpc_helpers._StreamingResponseIterator object at 0x7fa07d4248d0>
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 4 batched requests
INFO | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Thread-ConsumeBidirectionalStream exiting
DEBUG | 12:47:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Empty queue and inactive call, exiting request generator.
```
2.
```
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | waiting for recv.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 5 batched requests
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 5 batched requests
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | recved response.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Scheduling callbacks for 5 messages.
INFO | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Observed recoverable stream error 504 Deadline Exceeded
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
ERROR | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | The bidirectional RPC unexpectedly exited.
INFO | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | RPC termination has signaled streaming pull manager shutdown.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping consumer.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 5 batched requests
INFO | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Thread-ConsumeBidirectionalStream exiting
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping scheduler.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Cleanly exiting request generator.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping leaser.
INFO | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.leaser | Thread-LeaseMaintainer exiting.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping dispatcher.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.helper_threads | Exiting the QueueCallbackWorker.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping heartbeater.
INFO | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.heartbeater | Thread-Heartbeater exiting.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Finished stopping manager.
```
3.
```
DEBUG | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.bidi | waiting for recv.
DEBUG | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 6 batched requests
DEBUG | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 6 batched requests
DEBUG | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.bidi | recved response.
DEBUG | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Scheduling callbacks for 8 messages.
INFO | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Observed recoverable stream error 504 Deadline Exceeded
DEBUG | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
DEBUG | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
ERROR | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.bidi | The bidirectional RPC unexpectedly exited.
INFO | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | RPC termination has signaled streaming pull manager shutdown.
DEBUG | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 9 batched requests
DEBUG | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping consumer.
DEBUG | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Cleanly exiting request generator.
INFO | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Thread-ConsumeBidirectionalStream exiting
DEBUG | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping scheduler.
DEBUG | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping leaser.
INFO | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.leaser | Thread-LeaseMaintainer exiting.
DEBUG | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping dispatcher.
DEBUG | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
ERROR | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.helper_threads | Error in queue callback worker: 'NoneType' object has no attribute 'remove'
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/helper_threads.py", line 112, in __call__
self._callback(items)
File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/dispatcher.py", line 101, in dispatch_callback
self.ack(batched_commands.pop(requests.AckRequest))
File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/dispatcher.py", line 124, in ack
self.drop(items)
File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/dispatcher.py", line 132, in drop
self._manager.leaser.remove(items)
AttributeError: 'NoneType' object has no attribute 'remove'
DEBUG | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.helper_threads | Exiting the QueueCallbackWorker.
DEBUG | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping heartbeater.
INFO | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.heartbeater | Thread-Heartbeater exiting.
DEBUG | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Finished stopping manager.
```
Thanks, @jam182. Let's continue this over at #5444.