Sagemaker-python-sdk: Bug: change: use regional endpoint when creating AWS STS client #1026

Created on 9 Sep 2019 · 22 Comments · Source: aws/sagemaker-python-sdk

System Information

  • Python 3.6.9
  • Sagemaker SDK 1.39.0

Describe the problem

PR #1026 introduced a bug: the regional STS endpoint it builds has no URL scheme.

Minimal repro / logs

sagemaker.get_execution_role(sagemaker_session)

This leads to the following stack trace:

  File "/Library/Caches/virtualenvs/gocentral-ml-sagemaker-0XKWRhha-py3.6/lib/python3.6/site-packages/sagemaker/session.py", line 1386, in get_caller_identity_arn
    "sts", endpoint_url=sts_regional_endpoint(self.boto_region_name)
  File "/Library/Caches/virtualenvs/gocentral-ml-sagemaker-0XKWRhha-py3.6/lib/python3.6/site-packages/boto3/session.py", line 263, in client
    aws_session_token=aws_session_token, config=config)
  File "/Library/Caches/pypoetry/virtualenvs/gocentral-ml-sagemaker-0XKWRhha-py3.6/lib/python3.6/site-packages/botocore/session.py", line 839, in create_client
    client_config=config, api_version=api_version)
  File "/Library/Caches/pypoetry/virtualenvs/gocentral-ml-sagemaker-0XKWRhha-py3.6/lib/python3.6/site-packages/botocore/client.py", line 86, in create_client
    verify, credentials, scoped_config, client_config, endpoint_bridge)
  File "/Library/Caches/pypoetry/virtualenvs/gocentral-ml-sagemaker-0XKWRhha-py3.6/lib/python3.6/site-packages/botocore/client.py", line 328, in _get_client_args
    verify, credentials, scoped_config, client_config, endpoint_bridge)
  File "/Library/Caches/pypoetry/virtualenvs/gocentral-ml-sagemaker-0XKWRhha-py3.6/lib/python3.6/site-packages/botocore/args.py", line 85, in get_client_args
    client_cert=new_config.client_cert)
  File "/Library/Caches/pypoetry/virtualenvs/gocentral-ml-sagemaker-0XKWRhha-py3.6/lib/python3.6/site-packages/botocore/endpoint.py", line 261, in create_endpoint
    raise ValueError("Invalid endpoint: %s" % endpoint_url)
ValueError: Invalid endpoint: sts.us-west-2.amazonaws.com

endpoint.py requires a scheme, and none is provided by the sts_regional_endpoint method.
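To illustrate the point about the missing scheme, here is a minimal sketch using only the standard library. The names `has_scheme` and `regional_sts_url` are hypothetical helpers written for this example; they are not the SDK's or botocore's code, and botocore's own validation is likely more thorough than this check.

```python
from urllib.parse import urlsplit


def has_scheme(endpoint_url):
    """Return True if the URL carries an explicit http or https scheme."""
    return urlsplit(endpoint_url).scheme in ("http", "https")


def regional_sts_url(region):
    """Hypothetical helper: build a regional STS endpoint that includes a scheme."""
    # Assumption: the standard "aws" partition; other partitions use other domains.
    return "https://sts.{}.amazonaws.com".format(region)


bare = "sts.us-west-2.amazonaws.com"  # the value shown in the stack trace
fixed = regional_sts_url("us-west-2")

print(has_scheme(bare))   # False
print(has_scheme(fixed))  # True
```

Whether the eventual fix builds the URL exactly this way is not stated in this thread; the sketch only shows why a bare host name fails a scheme check.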

Labels: pending release, bug

All 22 comments

More details: the code at https://github.com/boto/botocore/blob/develop/botocore/utils.py#L832-L853 in botocore prevents this change from working.

Thanks for the detailed bug report, and apologies for the trouble this has caused. I've submitted a PR to fix this: https://github.com/aws/sagemaker-python-sdk/pull/1035

Hey, has the fix for this bug already been merged and deployed?

Not deployed yet; I will do so tomorrow morning. In the meantime, please downgrade your SDK version to 1.38.6. Sorry for the inconvenience!

I am trying to use a Jupyter notebook and it is throwing the same error. Will this issue be fixed once you deploy the fix?

How does one downgrade the SDK in the jupyter notebook? Just spent a chunk of time on this issue trying to get the demo working.

Yeah, I am blocked now by the same thing and cannot use Jupyter notebooks even for the demo notebooks. This issue is a headache; it's totally crazy that a company like Amazon has this type of bug.

The default sagemaker package (1.39) still has the issue.

But it seems it's working if you downgrade to 1.38.6:
pip install sagemaker==1.38.6

I'm getting the same error even after downgrading to 1.38.6.

Is there an estimated time for a fix?

It's working for me:
1) open the terminal and activate the right conda env used in jupyter
2) pip install sagemaker==1.38.6
3) reload the kernel in jupyter and import sagemaker package
4) confirm that the 1.38 version is loaded
5) the get_execution_role() should be working now.
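Step 4 above can be sketched as a small check run after restarting the kernel. The helper names `parse_version` and `is_affected` are made up for this example, and it assumes a plain `major.minor.patch` version string; 1.39.0 is treated as the only affected release because that is the only one this thread reports as broken.

```python
AFFECTED = (1, 39, 0)  # the release reported as broken in this thread


def parse_version(text):
    """Turn a string like '1.39.0' into the tuple (1, 39, 0)."""
    # Assumption: the first three dot-separated parts are plain integers.
    return tuple(int(part) for part in text.split(".")[:3])


def is_affected(version_text):
    """Return True if the given version is the one reported as broken."""
    return parse_version(version_text) == AFFECTED


try:
    import sagemaker

    print(sagemaker.__version__, "affected:", is_affected(sagemaker.__version__))
except ImportError:
    print("sagemaker is not installed in this environment")
```

If this still reports 1.39.0 after the install, the kernel may be using a different environment from the one pip installed into.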

Apparently the second time is the charm. Downgraded to 1.38.6 again and now it works.

v1.39.1 has just been released, and should contain the fix.

@bscholesboogie Could you describe, step by step, how to downgrade the sagemaker package in a Jupyter notebook instance, please?

Sure. In your Jupyter notebook, insert a blank (Code) cell at the top and shift-enter the following:
!pip install sagemaker==1.38.6

Give it a minute, and it will uninstall v.1.39.0 and install 1.38.6 over it. Be sure to restart the kernel before you begin running the rest of your code.
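As a variation on the cell above, the following sketch builds the pip command from the interpreter the kernel is actually running, which is one common way to make sure the package lands in the kernel's own environment. `pip_pin_command` is a hypothetical helper for this example, and the install itself is left commented out so the cell is safe to run as-is.

```python
import sys


def pip_pin_command(package, version):
    """Build a pip command that targets the current interpreter's environment."""
    return [sys.executable, "-m", "pip", "install", "{}=={}".format(package, version)]


command = pip_pin_command("sagemaker", "1.38.6")
print(" ".join(command))

# To actually install, uncomment the next two lines, then restart the kernel:
# import subprocess
# subprocess.check_call(command)
```

This is only an alternative way of issuing the same install; the `!pip install` cell above is what was reported to work in this thread.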

It seems newly spawned SageMaker instances still get the 1.39.0 package by default. To resolve the error, you have to either manually downgrade the sagemaker package to 1.38.6 or upgrade it to 1.39.1.

Thank you @ivenzor and @bscholesboogie for your help here! I've also reached out to the team that owns SageMaker Notebook Instances about new instances still using the buggy SDK version.

Hi, I'm trying to train some models on SageMaker notebooks today and it's not working at all. By the look of this thread, I think the Service Health Dashboard should have been updated to provide details on this downtime.

The fix is relatively easy (just run !pip install sagemaker==1.38.6 within a notebook instance), but I don't think AWS users should be expected to seek out fixes in GitHub comment sections. Some information in SageMaker itself would have been preferable.

Instances spawned today already have the correct package (1.39.1) by default.

I'm still getting the error as of now in eu-west-1.
When you say "spawned instance", does that mean deleting and recreating my notebook, or is simply shutting down the notebook instance and opening it again enough? If it's the latter, I have already tried that.

!pip install -U sagemaker
!pip install -U boto3

I just ran this in a SageMaker notebook instance based in Ireland.
I had to restart the kernel, but it is now working.
Hopefully this works for you as well.

Sorry for the delay. By "spawned instance" I meant that when you start your SageMaker notebook instances, they already have the fixed sagemaker package (1.39.1).

sh-4.2$ conda list | grep sagemaker
sagemaker 1.39.1
