Sagemaker-python-sdk: An easier equivalent to the removed update_endpoint argument

Created on 23 Sep 2020  ·  8 Comments  ·  Source: aws/sagemaker-python-sdk

Describe the feature you'd like

A direct/simple way to update an existing endpoint to a new model version (created e.g. by Model() constructor or Estimator.fit()).

Per the SDK v2 migration doc, Estimator.deploy() and Model.deploy() have had their update_endpoint argument removed and raise an error when called with an existing endpoint name. Users are advised to use Predictor.update_endpoint() instead.

The problem is the update_endpoint() method takes an existing SageMaker Model name as parameter and, per #1094, I'm not aware of an easy/SDK way to register a Model in the API given a Model object or a trained Estimator.

How would this feature be used? Please describe.

When a user has re-trained an Estimator or created a new Model object in the SDK, they'll be able to easily update an existing endpoint - like they would have done in v1 with Model.deploy(..., update_endpoint=True).

Describe alternatives you've considered

The implementation could maybe proceed as:

  • Re-instate the update_endpoint parameter to enable the old one-line flow
  • Add a method on Model (and maybe Estimator too?) to register the Model in the SageMaker API.
  • Something else?

Additional context

As used in, for example, the amazon-sagemaker-analyze-model-predictions sample.

It'd be great to know if I'm just missing an easy way to use Predictor.update_endpoint() for this!

contributions welcome feature request

All 8 comments

An example flow I got working for now, which uses private/internal functions and repeats the instance type way too much:

sagemaker_model._init_sagemaker_session_if_does_not_exist('ml.m5.xlarge')
sagemaker_model._create_sagemaker_model('ml.m5.xlarge')
predictor.update_endpoint(
    model_name=sagemaker_model.name,
    initial_instance_count=1,
    instance_type='ml.m5.xlarge',
)

...Speaking of which, it seems weird to me that initial_instance_count and instance_type are required params on the predictor call when the model_name is specified, but not otherwise? Can't it just default to the existing endpoint instance params as it would in the case where model_name wasn't changed?
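Until the SDK defaults these itself, one possible workaround is to read the current values back from the endpoint's configuration. The sketch below assumes a boto3 SageMaker client (created with boto3.client("sagemaker")) and an endpoint with a single instance-backed production variant; the helper name current_instance_params is made up for illustration and is not part of the SDK.

```python
def current_instance_params(sm_client, endpoint_name):
    """Return (instance_type, instance_count) for the endpoint's first variant.

    sm_client is assumed to be a boto3 SageMaker client, e.g.
    boto3.client("sagemaker"). Hypothetical helper, not part of the SDK.
    """
    # Look up which endpoint configuration the endpoint currently uses.
    endpoint = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config = sm_client.describe_endpoint_config(
        EndpointConfigName=endpoint["EndpointConfigName"]
    )
    # Assumption: a single production variant; multi-variant endpoints
    # would need to pick the right entry instead of the first one.
    variant = config["ProductionVariants"][0]
    return variant["InstanceType"], variant["InitialInstanceCount"]
```

The returned pair could then be passed as instance_type and initial_instance_count to predictor.update_endpoint(). Note that this reads the count recorded in the endpoint configuration, which may differ from the live instance count if the endpoint has been scaled since.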

I am also encountering a similar issue. However, I am actually having a hard time finding the model name when using the sdk. How have you gone about doing this? I was not able to locate a place where the estimator, or associated training jobs keep track of the created model at all unfortunately, but I may just be missing it.

@kenanzh when you call Estimator.deploy() it actually wraps around creating 3 things in the back-end, that you can see in the SageMaker Console: Model, Endpoint Configuration, and Endpoint.

In my example I was explicitly creating an SDK 'Model' object. I think you should be able to get the equivalent of my sagemaker_model by calling Estimator.create_model(...).

Note that creating a PyTorchModel (or the equivalent for other frameworks) in the SDK does not actually register it in the SageMaker API, which is why I called the internal sagemaker_model._create_sagemaker_model('ml.m5.xlarge') above. Creating the "real model" in the SageMaker API requires knowing the instance type (because most frameworks have different images for GPU vs CPU), so normally it happens when you call Model.transformer() or Model.deploy(). The sagemaker_model.name property will be empty until the "real model" has been created in the API.

@athewsey Thanks for using our product and the suggestion. We will have a discussion about this feature request.

This is a missing feature, and a very important one; please update.

Any update/workaround regarding this one?

+1. Whatever good reason there is behind removing the update_endpoint arg, the migration doc reads like "go figure it out yourself", which does not really help with the transition from v1 to v2. I would expect at least an example of how to perform the same function in v2, or a revert of this change if it is not really necessary.

Here is how to create a model and update an existing endpoint

Create model using sagemaker session

You can create the model using a SageMaker session. Depending on whether the model is BYO (bring your own) or comes from an existing training job, choose one of the two methods below to create the container definition.

BYO- Create model

The model was trained outside SageMaker, e.g. a pretrained model.

Step 0 - Prerequisite for BYO: package your model correctly. Make sure the archive that model_data_url points to is packaged according to create-the-directory-structure-for-your-model-files and uploaded to S3. Also, thanks to Joao Moura: set the SAGEMAKER_PROGRAM environment variable so that SageMaker knows the entry point.
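The packaging step could be sketched roughly as follows. This is only an illustration: the layout (model file at the archive root, inference script under code/) is my assumption about what the packaging guide describes for recent PyTorch containers, and package_model is a hypothetical helper, so check the guide for your framework version before relying on it.

```python
import tarfile
from pathlib import Path


def package_model(model_file, entry_point, output_path="model.tar.gz"):
    """Bundle model weights and an inference script into a gzipped tar archive.

    Assumed layout (verify against the packaging guide for your framework):
        <model file>          at the archive root
        code/<entry point>    the inference script
    """
    model_file, entry_point = Path(model_file), Path(entry_point)
    with tarfile.open(output_path, "w:gz") as archive:
        # arcname controls the path inside the archive, independent of
        # where the files live on the local disk.
        archive.add(str(model_file), arcname=model_file.name)
        archive.add(str(entry_point), arcname=f"code/{entry_point.name}")
    return output_path
```

The resulting archive still has to be uploaded to S3 (for example with the session's upload_data() method), and the S3 URI of the uploaded file is what model_data_url refers to below.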

import datetime
import sagemaker

# Placeholders you must define first: model_data_url (S3 URI of the model archive),
# my_inference_entry_point (inference script file name) and role (IAM role ARN).

# Retrieve the inference image URI for a GPU instance for PyTorch 1.4.0
image_uri = sagemaker.image_uris.retrieve("pytorch", "us-east-2", version="1.4.0", py_version="py3",
                              instance_type="ml.p3.2xlarge", accelerator_type=None, image_scope="inference",
                              container_version=None, distribution=None, base_framework_version=None)

# Define the container definition; SAGEMAKER_PROGRAM tells the container which script to run
container_def = sagemaker.session.container_def(image_uri, model_data_url, env={"SAGEMAKER_PROGRAM": my_inference_entry_point})

# Create the model in SageMaker under a unique, timestamped name
new_model_name = "my-new-model-{}".format(datetime.datetime.now().strftime("%Y%m%d%H%M%S"))
sm_session = sagemaker.session.Session()
sm_session.create_model(new_model_name, role, container_def)

Existing training job - Create model

import datetime
import sagemaker
from sagemaker.pytorch.estimator import PyTorch

# Placeholders you must define first: training_job_name, my_inference_entry_point
# (inference script file name) and role (IAM role ARN).

# Retrieve the inference image URI for a GPU instance for PyTorch 1.4.0
image_uri = sagemaker.image_uris.retrieve("pytorch", "us-east-2", version="1.4.0", py_version="py3",
                              instance_type="ml.p3.2xlarge", accelerator_type=None, image_scope="inference",
                              container_version=None, distribution=None, base_framework_version=None)

# Attach to the existing training job
estimator = PyTorch.attach(training_job_name)

# Construct the PyTorch model object (SDK side only; nothing is registered in SageMaker yet)
new_model_name = "my-new-model-{}".format(datetime.datetime.now().strftime("%Y%m%d%H%M%S"))
model = estimator.create_model(name=new_model_name, entry_point=my_inference_entry_point, image_uri=image_uri)

# Prepare the container definition, which packages the entry point file and the model
container_def = model.prepare_container_def()

# Create the model in SageMaker
sm_session = sagemaker.session.Session()
sm_session.create_model(new_model_name, role, container_def)

Update Endpoint

Once the model is created, update the existing endpoint

import sagemaker

# existing_endpoint_name is a placeholder; new_model_name is the model created above
predictor = sagemaker.pytorch.model.PyTorchPredictor(existing_endpoint_name)
predictor.update_endpoint(initial_instance_count=1, instance_type="ml.p3.2xlarge", model_name=new_model_name)
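For reference, the same update can also be expressed against the low-level API: create a new endpoint configuration that points at the new model, then switch the endpoint over to it. A minimal sketch, assuming a boto3 SageMaker client (boto3.client("sagemaker")); the helper name, the config naming scheme and the variant name "AllTraffic" are illustrative choices, not something taken from this thread.

```python
import datetime


def switch_endpoint_to_model(sm_client, endpoint_name, model_name,
                             instance_type, instance_count=1):
    """Point an existing endpoint at model_name via a new endpoint config.

    sm_client is assumed to be boto3.client("sagemaker").
    Hypothetical helper, not part of the SageMaker Python SDK.
    """
    # Endpoint config names must be unique, so append a timestamp.
    stamp = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
    config_name = f"{endpoint_name}-config-{stamp}"
    sm_client.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",  # illustrative name
            "ModelName": model_name,
            "InitialInstanceCount": instance_count,
            "InstanceType": instance_type,
        }],
    )
    # Starts the (asynchronous) switch to the new configuration.
    sm_client.update_endpoint(
        EndpointName=endpoint_name, EndpointConfigName=config_name
    )
    return config_name
```

Unlike predictor.update_endpoint(), this sketch does not wait for the update to finish, and it leaves the previous endpoint configuration in place for you to delete or keep as a rollback target.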