Azure-sdk-for-python: ServiceBus AMQP - idle_timeout is not used with NEXT_AVAILABLE session

Created on 10 Jan 2020  路  10Comments  路  Source: Azure/azure-sdk-for-python

  • Package Name: azure-servicebus
  • Package Version: 0.50.2
  • Operating System: Windows 10
  • Python Version: 3.8.0

Describe the bug
I am using azure.servicebus.ServiceBusClient with session-enabled topics.
I create a Receiver with a certain idle_timeout (in seconds) and then call fetch_next with the same number for the argument timeout.

I am working on a topic where there is no message currently circulating, only me trying to listen.

  • If I provide to the receiver a specific session ID (any GUID), fetch_next returns after the specified timeout.

  • If instead I give NEXT_AVAILABLE for the session ID, fetch_next blocks for a minutem no matter the value of the timeout I provide. Also I receive a VendorLinkDetach('com.microsoft:timeout:...').

To Reproduce
Steps to reproduce the behavior:

  1. Create a SubscriptionClient
  2. Create a receiver receiver: Receiver = subscription_client.get_receiver(mode=ReceiveSettleMode.PeekLock, session=NEXT_AVAILABLE, idle_timeout=4)
  3. Call receiver.fetch_next(1, timeout=4)

The method will return/throw after 1 minute instead of 4 seconds. It will behave correctly (use the 4 seconds timeout) if you pass a GUID instead of NEXT_AVAILABLE.

Expected behavior
Behavior consistent with GUID session IDs, fetch_next (or __enter__ when using the context manager) returning after the timeout that is specified.

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
Here is a snippet of code that can be used to reproduce the problem, and the output I get :

import logging
import time
import uuid
from datetime import timedelta

from azure.servicebus import ServiceBusClient, ReceiveSettleMode, NoActiveSession, NEXT_AVAILABLE
from azure.servicebus.receive_handler import Receiver

CONN_STRING = "Endpoint=sb://secret.servicebus.windows.net/;SharedAccessKeyName=agent;SharedAccessKey=secret"
logging.getLogger().setLevel(logging.WARNING)

if __name__ == "__main__":
    bus_client = ServiceBusClient.from_connection_string(CONN_STRING)
    subscription_client = bus_client.get_subscription("secret", "secret")
    for session_id in ("", NEXT_AVAILABLE):
        print("///////////////////////////////////////////////")
        print(f" USING SESSION {'NEXT_AVAILABLE' if session_id else 'GUID'}")
        print("///////////////////////////////////////////////")

        for i in range(1, 5):
            _session = session_id or str(uuid.uuid4())
            print("    -----------------------------------------------")
            print(f"    TIMEOUT SET TO {i}")
            start = time.monotonic_ns()
            receiver: Receiver = subscription_client.get_receiver(mode=ReceiveSettleMode.PeekLock,
                                                                  session=_session, idle_timeout=i)
            print("    Got receiver")
            try:
                receiver.fetch_next(1, timeout=i)
            except NoActiveSession:
                pass
            except Exception as err:
                print(err)
            finally:
                receiver.close()

            duration_ns = time.monotonic_ns() - start
            duration = timedelta(microseconds=duration_ns/1000)
            print(f"    EXECUTED IN {duration_ns} ns")
            print(f"    TIMEOUT {i} -- EXECUTED IN {duration}")

    print("Finished")
///////////////////////////////////////////////
 USING SESSION GUID
///////////////////////////////////////////////
    -----------------------------------------------
    TIMEOUT SET TO 1
    Got receiver
    EXECUTED IN 1765000000 ns
    TIMEOUT 1 -- EXECUTED IN 0:00:01.765000
    -----------------------------------------------
    TIMEOUT SET TO 2
    Got receiver
    EXECUTED IN 2766000000 ns
    TIMEOUT 2 -- EXECUTED IN 0:00:02.766000
    -----------------------------------------------
    TIMEOUT SET TO 3
    Got receiver
    EXECUTED IN 3765000000 ns
    TIMEOUT 3 -- EXECUTED IN 0:00:03.765000
    -----------------------------------------------
    TIMEOUT SET TO 4
    Got receiver
    EXECUTED IN 4782000000 ns
    TIMEOUT 4 -- EXECUTED IN 0:00:04.782000
///////////////////////////////////////////////
 USING SESSION NEXT_AVAILABLE
///////////////////////////////////////////////
    -----------------------------------------------
    TIMEOUT SET TO 1
    Got receiver
VendorLinkDetach('com.microsoft:timeout: The operation did not complete within the allocated time 00:01:00 for object attach. For more information on exception types and proper exception handling, please refer to http://go.microsoft.com/fwlink/?LinkId=
761101 TrackingId:secret, SystemTracker:gateway5, Timestamp:2020-01-10T17:14:54')
    EXECUTED IN 62640000000 ns
    TIMEOUT 1 -- EXECUTED IN 0:01:02.640000
    -----------------------------------------------
    TIMEOUT SET TO 2
    Got receiver
VendorLinkDetach('com.microsoft:timeout: The operation did not complete within the allotted timeout of 00:01:00. The time allotted to this operation may have been a portion of a longer timeout. For more information on exception types and proper excepti
on handling, please refer to http://go.microsoft.com/fwlink/?LinkId=761101 TrackingId:secret, SystemTracker:secret:Topic:secret|secret, Timestamp:2020-01-10T17:15:55 TrackingId:secret, SystemTracker:gateway5, Timestamp:2020-01-10T17:15:55')
    EXECUTED IN 60703000000 ns
    TIMEOUT 2 -- EXECUTED IN 0:01:00.703000
    -----------------------------------------------
    TIMEOUT SET TO 3
    Got receiver
VendorLinkDetach('com.microsoft:timeout: The operation did not complete within the allocated time 00:01:00 for object attach. For more information on exception types and proper exception handling, please refer to http://go.microsoft.com/fwlink/?LinkId=
761101 TrackingId:secret, SystemTracker:gateway5, Timestamp:2020-01-10T17:16:57')
    EXECUTED IN 62594000000 ns
    TIMEOUT 3 -- EXECUTED IN 0:01:02.594000
    -----------------------------------------------
    TIMEOUT SET TO 4
    Got receiver
VendorLinkDetach('com.microsoft:timeout: The operation did not complete within the allotted timeout of 00:01:00. The time allotted to this operation may have been a portion of a longer timeout. For more information on exception types and proper excepti
on handling, please refer to http://go.microsoft.com/fwlink/?LinkId=761101 TrackingId:secret, SystemTracker:secret:Topic:secret|secret, Timestamp:2020-01-10T17:17:58 TrackingId:secret, SystemTracker:gateway5, Timestamp:2020-01-10T17:17:58')
    EXECUTED IN 60797000000 ns
    TIMEOUT 4 -- EXECUTED IN 0:01:00.797000
Finished

Client Service Bus bug customer-reported

All 10 comments

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jfggdl

@mayurid ,
This seems to be a client side bug.
Could you please assign it to the appropriate team for investigating this on the python client side.

@K3UL sorry for missing the issue here.
We're looking into the issue and will update you if we make any progress on this.
Thanks for your patience in advance.

Hi, any ETA on this ?
Currently it is impossible to use a real timeout AND we have to use specific code to ignore that specific WARNING so it does not pollute our logs

Hey @K3UL , apologize for not getting to you sooner.

After investigation, I found that this is an expected outcome when you specify session id to "NEXT_AVAILABLE" while no session is available, it works differently from specifying a certain session_id value.

  • NEXT_AVAILABLE: specifying "NEXT_AVAILABLE" requires the service side to look for any available/free session that is ready to be received from -- it's a service side setting. If service side can't find any in one minute, it would return error VendorLinkDetach('com.microsoft:timeout: The operation did not complete within the allocated time 00:01:00 for object attach. which means client doesn't connect to any session.
  • idle_timeout: idle_timeout is the timeout between received messages. It come into effect only when the client side has already connected to a queue or a session (when you're connecting to a sessionful entity) and starts receiving.

So in your case of no available session, as the client in fact doesn't connect to any session, idle_timeout won't intervene.

If you find that warning logging annoying, you could eliminate that by escalating uamqp log level:

uamqp_logger = logging.getLogger("uamqp")
uamqp_logger.setLevel(logging.ERROR)
# or uamqp_logger.setLevel(logging.CRITICAL)

Let me know whether this helps solving your problem.

@yunhaoling Thank you for looking into the issue. Silencing the logger was not really difficult, it is more a problem for me because I DO WANT the real Warnings (i.e. Deprecations ?) to show up.
I see the workflow you're describing and I understand why this happens, but I wonder then how to change that one minute timeout ? It is not critical to do so, but I don't like the idea of Azure controlling this for me.
For me the main problem is a lack of consistency in the behavior if I use a GUID or NEXT_AVAILABLE.

Hey @K3UL ,thanks for the feedback, I get your point on the "consistency in the behavior".

Right now we're having internal discussion on this topic and you've made your point quite well. I'll bring your concern to the team.

The reason why we choose not to override the behavior of 60s timeout for no-available-session as it's a service side setting, there could be service side concern on this -- e.g. it's expensive to re-establish the underlying connection so just wait for a while to see if any free session would be showing up.

I'll keep you updated on the topic, thanks for sharing your thoughts.

@yunhaoling I understand what you mean about the need to NOT re-establish connections too often (although I still think it should be my choice), but I was actually more thinking about making it a longer timeout. In my case I get new messages after more than a minute almost always.

And tell me if I'm wrong, but service bus being charged per-operation and per-request, I could optimize our costing a little by having a longer timeout, and so less connections, am I right ?

Hi @K3UL , We respect your thoughts on this and has internally started drafting API proposal for our latest library -- azure-servicebus v7.0.0.

However, I'm sorry to tell you that the change won't go into the old preview library v0.50, v0.50 is about to be deprecated.

Currently we're putting all our efforts into azure-servicebus v7.0.0 (currently in preview status) to create a more user friendly and productive library. This "SessionReceiver opening timeout" is a non-trivial change to the v0.50 library and we have limited cycles so we decided not to support that for the old preview library.

I have created an issue for this topic for the v7 lib: https://github.com/Azure/azure-sdk-for-python/issues/11505.
Please use the issue to track our working progress.

Apologize for the inconvenience we bring to you and thanks for your understanding.

Hi @yunhaoling,
That is fine by me, I try to get rid of the legacy SDK as much as I can actually. I only use the old one for subscriptions management, but one of your colleagues told me you have a new management SDK brewingm that will be a little less "intrusive" on the credentials needed :)

Thank you for your time and for adding this issue on v7 ! I'll be watching out for it

Was this page helpful?
0 / 5 - 0 ratings