Sagemaker-python-sdk: Deploy method doesn't work with "content_type" keyword arg

Created on 19 Nov 2019 · 4 Comments · Source: aws/sagemaker-python-sdk

System Information

  • Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans): Scikit Surprise
  • Framework Version: 1.1.0
  • Python Version: 3.6.5 (SageMaker notebook conda_python3 kernel)
  • CPU or GPU: CPU
  • Python SDK Version: 1.43.1
  • Are you using a custom image: No - SDK SKLearn estimator in script mode with a subprocess.call([sys.executable, "-m", "pip", "install", "surprise"]) command to install the surprise library on load.

Describe the problem

Possibly related to #1120 and #623

The Estimator.deploy() docs say that additional kwargs are passed through to create_model().

The Estimator.create_model() docs say that content_type is a supported keyword argument.

In practice, passing content_type to deploy() on an SKLearn estimator raises TypeError: __init__() got an unexpected keyword argument 'content_type', as shown in the traceback below.

Minimal repro / logs

Create a SKLearn estimator with a custom script:

estimator = sagemaker.sklearn.estimator.SKLearn(
    entry_point=my_cool_script_path,
    train_instance_type="ml.c4.xlarge",
    role=role,
    sagemaker_session=sagemaker_session,
    output_path="s3://{}/{}".format(bucket, outdir),
    train_use_spot_instances=True,
    train_max_run=60*5, # 5 minutes
    train_max_wait=60*10 # 10 minutes
)

This script actually wants JSON inference request payloads, not dataframes:

# Raises error:
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m4.xlarge",
    content_type=sagemaker.content_types.CONTENT_TYPE_JSON
)

The docs do note that implementations may customize create_model, but I don't see why they would remove or suppress base parameters like this.

Full traceback:

TypeError                                 Traceback (most recent call last)
<ipython-input-71-a58115be62f0> in <module>()
      2     initial_instance_count=1,
      3     instance_type="ml.m4.xlarge",
----> 4     content_type=sagemaker.content_types.CONTENT_TYPE_JSON
      5 )

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/estimator.py in deploy(self, initial_instance_count, instance_type, accelerator_type, endpoint_name, use_compiled_model, update_endpoint, wait, model_name, kms_key, **kwargs)
    549         else:
    550             kwargs["model_kms_key"] = self.output_kms_key
--> 551             model = self.create_model(**kwargs)
    552         model.name = model_name
    553         return model.deploy(

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/sklearn/estimator.py in create_model(self, model_server_workers, role, vpc_config_override, **kwargs)
    174             vpc_config=self.get_vpc_config(vpc_config_override),
    175             enable_network_isolation=self.enable_network_isolation(),
--> 176             **kwargs
    177         )
    178 

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/sklearn/model.py in __init__(self, model_data, role, entry_point, image, py_version, framework_version, predictor_cls, model_server_workers, **kwargs)
     99         """
    100         super(SKLearnModel, self).__init__(
--> 101             model_data, image, role, entry_point, predictor_cls=predictor_cls, **kwargs
    102         )
    103 

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/model.py in __init__(self, model_data, image, role, entry_point, source_dir, predictor_cls, env, name, enable_cloudwatch_metrics, container_log_level, code_location, sagemaker_session, dependencies, git_config, **kwargs)
    726             name=name,
    727             sagemaker_session=sagemaker_session,
--> 728             **kwargs
    729         )
    730         self.entry_point = entry_point

TypeError: __init__() got an unexpected keyword argument 'content_type'

All 4 comments

Hello @athewsey,

Apologies for the late response.

Thank you for bringing this issue up to us.

I will first attempt to reproduce the error that you are seeing and will report ASAP.

Thank you!

@athewsey you can set the content type after you've created the predictor:

predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m4.xlarge"
)

predictor.serializer = sagemaker.predictor.json_serializer
predictor.deserializer = sagemaker.predictor.json_deserializer
predictor.content_type = sagemaker.content_types.CONTENT_TYPE_JSON
predictor.accept = sagemaker.content_types.CONTENT_TYPE_JSON

predictor.predict(...)

SKLearn doesn't actually inherit from Estimator; both inherit from a class called EstimatorBase, which declares create_model() only as an abstract method.
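To make the inheritance point concrete, here is a minimal sketch using toy stand-in classes rather than the real SDK. The names ToyModel, GenericEstimator, FrameworkEstimator and try_deploy are hypothetical and exist only for illustration; the assumption is that the SKLearn create_model() forwards unknown kwargs to a model constructor that has no content_type parameter, which is what the traceback in the issue suggests.

```python
class EstimatorBase:
    """Stand-in for the shared base class: create_model() is abstract here."""

    def create_model(self, **kwargs):
        raise NotImplementedError

    def deploy(self, initial_instance_count, instance_type, **kwargs):
        # Extra keyword arguments are forwarded untouched to create_model().
        return self.create_model(**kwargs)


class ToyModel:
    """Hypothetical model whose constructor has no content_type parameter."""

    def __init__(self, role=None):
        self.role = role


class GenericEstimator(EstimatorBase):
    """Stand-in for Estimator: create_model() names content_type explicitly."""

    def create_model(self, content_type=None, **kwargs):
        model = ToyModel(**kwargs)
        model.content_type = content_type
        return model


class FrameworkEstimator(EstimatorBase):
    """Stand-in for SKLearn: unknown kwargs go straight to the model constructor."""

    def create_model(self, role=None, **kwargs):
        return ToyModel(role=role, **kwargs)


def try_deploy(estimator):
    """Return "ok" on success, or the TypeError message on failure."""
    try:
        estimator.deploy(1, "ml.m4.xlarge", content_type="application/json")
        return "ok"
    except TypeError as exc:
        return str(exc)


generic_result = try_deploy(GenericEstimator())
framework_result = try_deploy(FrameworkEstimator())
print(generic_result)
print(framework_result)
```

The two subclasses share deploy() but not create_model(), so a keyword that one of them documents is not guaranteed to be accepted by the other.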

The solution above no longer works with SageMaker Python SDK 2.x, because these methods were removed.

Instead, pass the serializer and deserializer directly to the deploy method.

Here is some sample code:

from sagemaker.deserializers import JSONDeserializer
from sagemaker.serializers import JSONSerializer

predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m4.xlarge",
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer()
)
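As a rough illustration of what such a serializer and deserializer pair does for a JSON endpoint, here is a sketch built only on the standard library. ToyJSONSerializer, ToyJSONDeserializer and fake_endpoint are hypothetical stand-ins, not the SDK's classes, and the request and response fields are made up for the example.

```python
import io
import json


class ToyJSONSerializer:
    """Hypothetical stand-in: turns a Python object into a JSON request body."""

    content_type = "application/json"

    def serialize(self, data):
        return json.dumps(data)


class ToyJSONDeserializer:
    """Hypothetical stand-in: parses a JSON response stream back into Python."""

    accept = "application/json"

    def deserialize(self, stream, content_type):
        try:
            return json.loads(stream.read().decode("utf-8"))
        finally:
            stream.close()


def fake_endpoint(body, content_type):
    """Pretend endpoint: checks the content type and returns a JSON byte stream."""
    if content_type != "application/json":
        raise ValueError("unsupported content type: " + content_type)
    request = json.loads(body)
    response = {"user": request["user"], "rating": 4.0}
    return io.BytesIO(json.dumps(response).encode("utf-8"))


serializer = ToyJSONSerializer()
deserializer = ToyJSONDeserializer()

body = serializer.serialize({"user": "u1", "item": "i1"})
stream = fake_endpoint(body, serializer.content_type)
result = deserializer.deserialize(stream, deserializer.accept)
print(result)
```

With the pair attached at deploy time, the predictor can encode the request and decode the response itself, so nothing needs to be set on the predictor afterwards.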

Agree this issue is no longer relevant with SageMaker SDK v2.x released and propagated to SageMaker notebook kernels. Closing!
