If I try to deploy a pre-built model like so:
```python
from sagemaker.tensorflow.model import TensorFlowModel

sagemaker_model = TensorFlowModel(
    model_data='s3://' + sagemaker_session.default_bucket() + '/model/model0100.tar.gz',
    role=role,
    framework_version='1.13',
    py_version='py3',
    entry_point='train.py',
)
```
It fails upon deploying:
```python
predictor = sagemaker_model.deploy(
    initial_instance_count=1,
    instance_type='ml.p2.xlarge',
)
```
I receive:
```
ValueError: Error hosting endpoint sagemaker-tensorflow-2019-07-07-11-50-45-473: Failed Reason: The image '520713654638.dkr.ecr.eu-west-1.amazonaws.com/sagemaker-tensorflow:1.13-gpu-py3' does not exist.
```
I can get past this error by specifying the image (which is not well-documented - took a lot of digging to find a link that worked):
```python
sagemaker_model = TensorFlowModel(
    model_data='s3://' + sagemaker_session.default_bucket() + '/model/model0100.tar.gz',
    role=role,
    framework_version='1.13',
    py_version='py3',
    entry_point='train.py',
    # Explicit image URI overrides the default lookup that fails above.
    image='763104351884.dkr.ecr.eu-west-1.amazonaws.com/tensorflow-inference:1.13-gpu',
)
```
Any idea how to solve this?
Hi @NoahDolev, thank you for using SageMaker! From the code you provided, it seems you want to train your model with train.py?
To use TensorFlow script mode to train your model (and then deploy it), start with the TensorFlow estimator class: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/estimator.py#L188
Set either script_mode=True or py_version="py3" to enable script mode.
Hi @ChuyangDeng ,
I am not sure that has anything to do with the issue I posted. I am reporting that the Docker image which SageMaker searches for by default is not correct for eu-west-1. Also, script_mode is not a valid flag of TensorFlowModel. To the best of my knowledge, this flag exists only in the TensorFlow estimator class.
Best,
Noah
Hi @NoahDolev,
Are you trying to do training or hosting here? Our TensorFlow script mode is only supported for training, and the TensorFlowModel class is for hosting; that's why the Docker image URI is not correct (cannot be found).
If you are training your model, you should use the TensorFlow estimator class so that you can train with our script mode image.
If you are deploying your trained model, you will use the TensorFlowModel class, but script mode is not supported for deployment.
@NoahDolev @ChuyangDeng I hit the same error when following this blog post:
https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/
to deploy a different pre-trained model in SageMaker. Since my model uses py3, I specified py_version like this:
```python
sagemaker_model = TensorFlowModel(
    model_data='s3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
    role=role,
    py_version='py3',
    framework_version='1.12',
    entry_point='train.py',
)
predictor = sagemaker_model.deploy(
    initial_instance_count=1,
    instance_type='ml.p2.xlarge',
)
```
```
ValueError: Error hosting endpoint sagemaker-tensorflow-2019-07-10-05-06-02-075: Failed Reason: The image '520713654638.dkr.ecr.us-east-2.amazonaws.com/sagemaker-tensorflow:1.12-gpu-py3' does not exist.
```
When I delete py_version='py3' there is no error anymore.
Hi @yuchuang1979 ,
Precisely what I am referring to. I am trying to deploy a model I trained elsewhere. You can also specify the image to solve the problem. My point, however, is that the default is pointing to the wrong docker image. It's a bug.
Best,
Noah
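For reference, the two image naming patterns quoted in this thread can be put side by side. This is a minimal sketch that only reproduces the URIs appearing in the error messages and in the workaround above; the helper names are hypothetical, and this is not the SDK's own lookup logic.

```python
# Hypothetical helpers: they only rebuild the URI patterns quoted in this thread.
# The account IDs and tag formats are copied from the posts above and are not
# guaranteed to exist for every region or framework version.

def legacy_image_uri(region, framework_version, device, py_version):
    """Pattern looked up by default for TensorFlowModel in the errors above."""
    return (
        f"520713654638.dkr.ecr.{region}.amazonaws.com/"
        f"sagemaker-tensorflow:{framework_version}-{device}-{py_version}"
    )


def tfs_image_uri(region, framework_version, device):
    """Pattern of the image passed explicitly in the workaround above."""
    return (
        f"763104351884.dkr.ecr.{region}.amazonaws.com/"
        f"tensorflow-inference:{framework_version}-{device}"
    )


# The default lookup that failed, and the image that worked, for eu-west-1.
failing = legacy_image_uri("eu-west-1", "1.13", "gpu", "py3")
working = tfs_image_uri("eu-west-1", "1.13", "gpu")
print(failing)
print(working)
```

Note that the second pattern has no Python version in its tag, which matches the explanation later in the thread that the two serving stacks use different containers.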
@NoahDolev thanks for pointing out that there is another route by specifying the image. I am totally new to SageMaker and just began the work several days ago.
How could you create the image before specifying it in the function?
Just some context.
There are two TensorFlow solutions that handle serving in the Python SDK. They have different class representations and documentation, as shown here.
1. TensorFlowModel - https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/model.py#L47
   - Doc: https://github.com/aws/sagemaker-python-sdk/tree/v1.12.0/src/sagemaker/tensorflow#deploying-directly-from-model-artifacts
   - Key difference: Uses a proxy gRPC client to send requests
   - Container impl: https://github.com/aws/sagemaker-tensorflow-container/blob/master/src/tf_container/serve.py
2. Model - https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/serving.py#L96
   - Doc: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/deploying_tensorflow_serving.rst
   - Key difference: Utilizes the TensorFlow Serving REST API
   - Container impl: https://github.com/aws/sagemaker-tensorflow-serving-container/blob/master/container/sagemaker/serve.py
Python 3 isn't supported with the TensorFlowModel object. Its container uses the TensorFlow Serving API library together with the gRPC client to make inferences, and the TensorFlow Serving API isn't officially supported in Python 3, so only Python 2 versions of the containers exist for the TensorFlowModel object.
If you need Python 3, you will need to use the Model object defined in #2 above. The inference script format will change if you need to handle pre- and post-processing: https://github.com/aws/sagemaker-tensorflow-serving-container#prepost-processing
Also, your inference requests will need to follow the TFS REST API.
https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/deploying_tensorflow_serving.rst#making-predictions-against-a-sagemaker-endpoint
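As a rough illustration of that request format: the TensorFlow Serving REST predict API accepts a JSON object with an `instances` list (the "row" format) and an optional `signature_name`. The sketch below only builds such a body; the helper name `build_predict_request` is hypothetical and not part of the SDK, and the input values are placeholders.

```python
import json


def build_predict_request(instances, signature_name=None):
    """Serialize inputs as a TensorFlow Serving REST 'row format' request body."""
    body = {"instances": list(instances)}
    if signature_name is not None:
        # Only needed when the SavedModel exposes more than one signature.
        body["signature_name"] = signature_name
    return json.dumps(body)


# One input row with three placeholder features.
payload = build_predict_request([[1.0, 2.0, 5.0]])
print(payload)
```

The resulting string would presumably be sent to the endpoint with content type `application/json`; the linked doc describes how the SDK predictor handles this.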
Since you trained externally, you'll need to make sure your model artifacts follow the correct format: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/deploying_tensorflow_serving.rst#deploying-more-than-one-model-to-your-endpoint
Here is an example that does, for the most part, what you're trying to do: https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/tensorflow_serving_container/tensorflow_serving_container.ipynb
Sorry for the confusion and wall of text and links. Please let me know if there is anything I can clarify.
Thanks!
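For a model trained outside SageMaker, the artifact format mentioned above mostly comes down to how the SavedModel is laid out inside `model.tar.gz`. The following is a minimal sketch, assuming the layout is `<model_name>/<version>/saved_model.pb` with a sibling `variables/` directory, as TensorFlow Serving expects numeric version directories; check the linked doc for the exact layout your container version requires. The function name and default arguments are hypothetical.

```python
import os
import tarfile


def package_saved_model(saved_model_dir, output_path, model_name="model", version="1"):
    """Pack a SavedModel directory into a tar.gz as <model_name>/<version>/...

    Assumes saved_model_dir directly contains saved_model.pb (and usually
    a variables/ directory), i.e. it is the export directory itself.
    """
    if not os.path.isfile(os.path.join(saved_model_dir, "saved_model.pb")):
        raise FileNotFoundError("saved_model.pb not found in " + saved_model_dir)
    if not version.isdigit():
        raise ValueError("version directory must be numeric, got " + repr(version))
    with tarfile.open(output_path, "w:gz") as tar:
        # arcname rewrites the top-level path inside the archive;
        # the directory contents are added recursively.
        tar.add(saved_model_dir, arcname=f"{model_name}/{version}")
    return output_path
```

The resulting archive can then be uploaded to S3 and passed as `model_data`, as in the snippets earlier in this thread.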
@ChoiByungWook This is quite clear. Thanks!
@ChoiByungWook Thanks for your introduction! I am wondering when TF 1.14 will be supported for serving.
I tried the CPU, GPU and elastic variants, but it seems none of the corresponding images are available:
```
The image '763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:1.14-cpu' does not exist.
The image '763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:1.14-gpu' does not exist.
```
I used your second one:
```python
from sagemaker import get_execution_role
from sagemaker.tensorflow.serving import Model

role = get_execution_role()
sagemaker_model = Model(
    model_data='s3://sagemaker-hover/Models/zulu/tpu/model.tar.gz',
    role=role,
    framework_version='1.14',
)
predictor = sagemaker_model.deploy(
    initial_instance_count=1,
    instance_type='ml.p2.xlarge',
    endpoint_name='test-001',
)
```
Also, the TensorFlowModel class seems to support only up to TF 1.12.
We have to use the proxy server with circle to run this.
Did the format for specifying images change after TensorFlow 2 support was added? Or are there just no pre-built images for TensorFlow frameworks 2.0 and 2.1? I get
```
UnexpectedStatusException: Error hosting endpoint sagemaker-tensorflow-2020-04-13-14-02-35-992: Failed. Reason: The image '520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tensorflow:2.1.0-cpu-py2' does not exist..
UnexpectedStatusException: Error hosting endpoint sagemaker-tensorflow-2020-04-13-14-02-35-992: Failed. Reason: The image '520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tensorflow:2.1.0-gpu-py2' does not exist..
UnexpectedStatusException: Error hosting endpoint sagemaker-tensorflow-2020-04-13-14-02-35-992: Failed. Reason: The image '520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tensorflow:2.1.0-cpu-py3' does not exist..
UnexpectedStatusException: Error hosting endpoint sagemaker-tensorflow-2020-04-13-14-02-35-992: Failed. Reason: The image '520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tensorflow:2.1.0-gpu-py3' does not exist..
```
when trying to specify
```python
from sagemaker.tensorflow.model import TensorFlowModel

sagemaker_model = TensorFlowModel(
    model_data='s3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
    role=role,
    framework_version='2.1.0',
    entry_point='train.py',
)
```
in the sample notebook available at https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/.
@ChoiByungWook The container implementation code locations given above (for TensorFlowModel and Model) are outdated. Can you please point to the current implementations?
@keelerh @ratulray I believe the class you're looking for is sagemaker.tensorflow.serving.Model (the second one that @ChoiByungWook mentioned): https://sagemaker.readthedocs.io/en/stable/sagemaker.tensorflow.html#tensorflow-serving-model. That class should retrieve the correct image URI for the TF 2.x images.
If you have any further questions, please open a new issue (it'll help with our internal tracking).
Thanks Lauren for your response. Actually my question was not that. I opened a new issue https://github.com/aws/sagemaker-python-sdk/issues/1472

What should I do?
@abdelhamednouh you're commenting on an old, closed issue with an unrelated error message - can you open a new issue?