Describe the bug
I am using the template project and want to create and deploy a Keras model that has a custom preprocessing script. To do that, I put the preprocessing logic inside inference.py, as described in https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/deploying_tensorflow_serving.html. The resulting model works as expected when I deploy it directly through TensorFlowModel, but when I deploy from the model package ARN that the pipeline registered, it does not behave as expected.
To reproduce
My pipeline looks like this
- step_data_ingest
- step_feature_engineering
- step_data_validation
- step_train (shown below)
output_path = f"{experiment_package_s3_dir}/model"
xx_estimator = TensorFlow(
entry_point='train.py',
source_dir='pipelines/abalone/src', # this should be just "source" for your code
role=role,
image_uri="763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-training:2.3.1-cpu-py37-ubuntu18.04",
instance_count=1,
model_dir=False,
instance_type=training_instance_type,
output_path = output_path, # all training step output (include debug etc.)
sagemaker_session=sagemaker_session,
container_log_level=10, # 10 debug 20 info 30 warning 40 error
base_job_name=f"{base_job_prefix}-model-train",
hyperparameters={
"epochs": 5,
"batch_size": 256,
"early_stop_patience": 10,
"country": country
}
)
step_train = TrainingStep(
name="TrainxxModel",
estimator=xx_estimator,
inputs={
"train": TrainingInput(
s3_data=step_data_ingest.properties.ProcessingOutputConfig.Outputs[
"train"
].S3Output.S3Uri,
content_type=None,
),
"validation": TrainingInput(
s3_data=step_data_ingest.properties.ProcessingOutputConfig.Outputs[
"validation"
].S3Output.S3Uri,
content_type=None,
),
"encoders": TrainingInput(
s3_data=step_feature_engineering.properties.ProcessingOutputConfig.Outputs[
"encoders"
].S3Output.S3Uri,
content_type=None,
),
},
)
- step_model_eval
- step_register
step_register = RegisterModel(
name="RegisterxxModel",
estimator=xx_estimator,
model_data=step_train.properties.ModelArtifacts.S3ModelArtifacts,
image_uri="763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.3.1-cpu-py37-ubuntu18.04",
content_types=["application/json", "text/csv"],
response_types=["application/json", "text/csv"],
inference_instances=["ml.m5.large"],
transform_instances=["ml.m5.large"],
model_package_group_name=model_package_group_name,
approval_status=model_approval_status,
source_dir='pipelines/abalone/src',
entry_point="inference.py",
model_metrics=salary_model_metrics,
role=role,
)
Inside inference.py, I do several setup steps outside of the handler method. It looks like this:
import xxx
setup encoders
load model
def handler
    process input
    model predict
    processing output
def _processing_input
    encoders do feature transform
def _processing_output
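For reference, the outline above could be fleshed out roughly as follows. This is only a minimal sketch, assuming the `input_handler`/`output_handler` hooks described in the sagemaker-tensorflow-serving-container README; `ENCODERS`, `_process_input`, and the `records`, `country`, and `years` field names are hypothetical stand-ins, since the original post does not show the real encoders or payload.

```python
import json

# Hypothetical module-level setup: this runs once, when the container imports
# inference.py. A real script would load the fitted encoders from disk here.
ENCODERS = {"country": {"US": 0, "CA": 1}}
print("[inference.py] encoders loaded")


def _process_input(record):
    """Apply the (hypothetical) categorical encoding to one input record."""
    return [ENCODERS["country"].get(record["country"], -1), float(record["years"])]


def input_handler(data, context):
    """Turn the raw request body into the JSON that TensorFlow Serving expects."""
    if context.request_content_type != "application/json":
        raise ValueError(f"Unsupported content type: {context.request_content_type}")
    payload = json.loads(data.read().decode("utf-8"))
    instances = [_process_input(record) for record in payload["records"]]
    return json.dumps({"instances": instances})


def output_handler(response, context):
    """Return the TensorFlow Serving response with the requested content type."""
    if response.status_code != 200:
        raise ValueError(response.content.decode("utf-8"))
    return response.content, context.accept_header
```

If the module-level `print` never appears in the endpoint logs, that would suggest the container did not import the script at all, which matches the symptom described for B and C.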
Expected behavior
After being deployed, the registered model should take the input, log it, execute the feature transform, and return a proper prediction.
Screenshots or logs
## A. The working one
xx_model = TensorFlowModel(
name="xxDebugModel",
entry_point="inference.py",
source_dir="../pipelines/abalone/src",
image_uri=image_uri,
model_data=model_data,
sagemaker_session=sagemaker.Session(),
container_log_level=0,
role=role,
)
display(xx_model.__dict__)
## B. deploy from package created from xx_model.register
model_package = xx_model.register(
content_types=["application/json", "text/csv"],
response_types=["application/json", "text/csv"],
inference_instances=["ml.m5.large"],
transform_instances=["ml.m5.large"],
model_package_group_name="xxPipeModelGroup",
approval_status="Approved",
model_metrics = xx_model_metrics,
description=f"Registered from XX Model Experiement Pack: {experiment_package_s3_dir}"
)
model_package.deploy(
initial_instance_count=1,
instance_type="ml.m5.large",
endpoint_name="xxDebugModelPackDeploy"
)
## C. deploy from the package registered by the pipeline; these two methods show the same behavior
model_package_from_arn = sagemaker.ModelPackage(
role=role,
model_package_arn = "arn:aws:sagemaker:us-east-1:104436464649:model-package/xxx-us/5"
)
model_package_from_arn.deploy(
initial_instance_count=1,
instance_type="ml.m5.large",
endpoint_name="xxDebugModelPackARNDeploy"
)
A shows logs like this
// [from tensorflow serving] loading model
// [from inference.py] load encoders
// [from inference.py] setting up
// [from tensorflow serving] entering event loop (ping...ping...)
B and C show similar logs, without the inference.py lines
// [from tensorflow serving] loading model
// [from tensorflow serving] entering event loop (ping...ping...)
System information
Additional context
Other issues found:
Someone can help?
Hey @ruyyi0323,
Apologies on the late response.
For clarification purposes, are you following https://docs.aws.amazon.com/sagemaker/latest/dg/model-registry.html for setting up your pipeline for registering and deploying your Keras model?
Hi @ChoiByungWook
Thanks for helping!
I am doing it this way
# A.
salary_model = TensorFlowModel(
name="SalaryDebugModel",
entry_point="inference.py",
source_dir=base_dir,
image_uri="763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.3.1-cpu-py37-ubuntu18.04",
model_data=model_data,
sagemaker_session=sagemaker.Session(),
container_log_level=10,
role=role,
)
display(salary_model.__dict__)
salary_model.deploy(
instance_type="ml.m5.2xlarge",
initial_instance_count=1,
endpoint_name="SalaryDebugModel",
update_endpoint=True,
tags=None
)
# B.
model_package = None
model_package = salary_model.register(
content_types=["application/json", "text/csv"],
response_types=["application/json", "text/csv"],
inference_instances=["ml.m5.4xlarge"],
transform_instances=["ml.m5.4xlarge"],
model_package_group_name="SalaryDebugModelGroup",
approval_status="Approved",
description="DEBUG"
)
model_package.__dict__
model_package.deploy(
initial_instance_count=1,
instance_type="ml.m5.4xlarge",
endpoint_name="SalaryDebugModelPackARNDeploy"
)
A works perfectly fine and B does not; it seems that B didn't load the entry point somehow.
B doesn't have the log in the red box and its predictions do not work, whereas A gives the correct predictions.

Plus: B hits a duplicate Tag issue; I have to run the same cell twice before the model gets deployed.
@ruyyi0323,
I apologize for the bad experience.
I believe you are running into these issues as the SageMaker deep learning frameworks containers dynamically load userscripts (inference.py) in conjunction with environment variables mapping to a S3 object. Your model package gets pulled from S3 and your inference.py file gets packed into a new tar file, as shown here: https://github.com/aws/sagemaker-python-sdk/blob/e08c04e6ed0fdfb7e9e873d119769509f3ed74de/src/sagemaker/utils.py#L362. Thus, in your A version of the deployment, this all passes, as inference.py is inside the container as expected: https://github.com/aws/sagemaker-tensorflow-serving-container#prepost-processing.
I believe if you attempt to use the repacked model data of the successfully deployed A version model in your registration it should work.
// Version A
xx_model = TensorFlowModel()
xx_model.deploy()
repacked_model_data = xx_model.repacked_model_data
// Version B
xx_model.model_data = repacked_model_data
xx_model.register()
...
// Create new Model object
new_xx_model = TensorFlowModel(model_data=repacked_model_data, ...)
new_xx_model.register()
If you don't like that approach, you can also attempt to repack the model on your local and create the new model object as shown above. You would need to follow the code directory shown in:
// Example, with a tar file called model.tar.gz
model.tar.gz
|------ model
|       |------ saved_model.pb
|------ code
        |------ inference.py
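A layout like this can be produced locally with the standard library. The following is a rough sketch rather than the SDK's own repacking logic; `build_model_archive` and its parameters are hypothetical names, and the `<model_name>/<version>/` nesting is an assumption based on the `./model/1/...` listing reported later in this thread.

```python
import tarfile
from pathlib import Path


def build_model_archive(saved_model_dir, inference_script, archive_path,
                        model_name="model", version="1"):
    """Pack a SavedModel directory and inference.py into one model.tar.gz."""
    saved_model_dir = Path(saved_model_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # The SavedModel files go under <model_name>/<version>/ ...
        tar.add(saved_model_dir, arcname=f"{model_name}/{version}")
        # ... and the handler script goes under code/.
        tar.add(inference_script, arcname="code/inference.py")
    return archive_path
```

The resulting file would then be uploaded to S3 and passed as `model_data` when creating the new model object.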
If you would prefer baking the scripts into the image instead of loading them from your model package, you would have to either extend or commit a new Docker image with your inference.py script baked in.
Quick overview for creating a new algorithm container in SageMaker: https://github.com/aws/amazon-sagemaker-examples/blob/master/aws_marketplace/creating_marketplace_products/Bring_Your_Own-Creating_Algorithm_and_Model_Package.ipynb
For extending an existing image see: https://github.com/aws/amazon-sagemaker-examples/blob/master/advanced_functionality/pytorch_extending_our_containers/pytorch_extending_our_containers.ipynb
The image to extend would be the inference image used for deployment: 763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.3.1-cpu-py37-ubuntu18.04
Hi @ChoiByungWook ,
Thanks for the explanation. I am currently trying out the SageMaker workflow pipeline, so I am doing my best to mimic the repack-and-register behavior to see why the RegisterModel step failed to build a model with the corresponding prediction behavior. That is the C approach I mentioned in the original post.
Does the utils.py approach you mentioned basically replicate the repack model step? If so, I can try that out.
Yes, you should be able to call the repack_model function, however it does have a few parameters. When you do deploy with the TensorFlowModel object it calls that as well: https://github.com/aws/sagemaker-python-sdk/blob/e08c04e6ed0fdfb7e9e873d119769509f3ed74de/src/sagemaker/tensorflow/model.py#L320-L328
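As a rough illustration of those parameters, the sketch below collects them in one place. The keyword names are taken from the `repack_model` definition in the utils.py revision linked earlier in this thread and may differ in other SDK versions; `repack_kwargs` and `run_repack` are hypothetical helper names, and the S3 URIs in the usage comment are placeholders.

```python
def repack_kwargs(model_uri, repacked_model_uri, entry_point,
                  source_dir=None, dependencies=None):
    """Collect the arguments for sagemaker.utils.repack_model in a dict."""
    return {
        "inference_script": entry_point,           # e.g. "inference.py"
        "source_directory": source_dir,            # local directory holding the script
        "dependencies": dependencies or [],        # extra local paths to bundle
        "model_uri": model_uri,                    # model.tar.gz from the training job
        "repacked_model_uri": repacked_model_uri,  # where the repacked archive is written
    }


def run_repack(kwargs, sagemaker_session=None):
    """Call the SDK helper; imported lazily so the dict builder works without the SDK."""
    from sagemaker.session import Session
    from sagemaker.utils import repack_model

    repack_model(sagemaker_session=sagemaker_session or Session(), **kwargs)


# Usage (placeholder URIs, needs AWS credentials):
# run_repack(repack_kwargs("s3://my-bucket/train/model.tar.gz",
#                          "s3://my-bucket/repacked/model.tar.gz",
#                          "inference.py", source_dir="pipelines/abalone/src"))
```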
Thanks so much, will do a quick PoC tomorrow
Awesome! For reference, feel free to pull down the model object that is associated with your successful endpoint (version A) in the AWS console. When you download the tar file from S3, it should show you your pb model and inference.py inside.
AWS Console -> SageMaker -> Endpoint -> Endpoint configuration settings -> Model Name -> Container 1 -> Model data location
Or you can check out the repacked_model_data object in your localhost.
// Version A
print(xx_model.repacked_model_data)
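Once the tar file has been downloaded from S3, a quick local check along these lines can confirm whether the script was packed. This is a small sketch using only the standard library; `find_inference_script` is a hypothetical helper name.

```python
import tarfile


def find_inference_script(archive_path, script_name="inference.py"):
    """Return the archive member paths whose file name equals script_name."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return [
            member.name
            for member in tar.getmembers()
            if member.isfile() and member.name.split("/")[-1] == script_name
        ]
```

An empty list means the archive has no inference.py, so an endpoint created from it would serve the bare TensorFlow Serving model without the custom pre/post-processing.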
Hi @ChoiByungWook, I am trying to inspect repacked_model_data after calling model.deploy(..), but I am getting None. Is this a bug?

Hey @ruyyi0323,
My apologies.
It looks like we have some inconsistency with how we track the repacked_model_data within the base Model class: https://github.com/aws/sagemaker-python-sdk/blob/c4d71f5639697d833e5578016a3f45402a4a80de/src/sagemaker/model.py#L1131
and the TensorFlowModel class: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/model.py#L294-L332
I believe to find the repacked model you will need to look at the model defined in your endpoint in the AWS Console.
Hi @ChoiByungWook, I have run a couple of experiments to see what's going on. I'm not sure whether this helps you or anyone else, but I will put the information here for reference.
Illustration of whole PoC

[Success] A1. create TensorFlowModel object using the model.tar.gz from TrainingStep output and deploy
./code/inference.py
./code/{other files}
./model/1/{model_assets, including pb or whatever}
salary_model = TensorFlowModel(
name="SalaryDebugModel",
entry_point="inference.py",
source_dir=base_dir,
image_uri="763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.3.1-cpu-py37-ubuntu18.04",
model_data=model_data,
sagemaker_session=sagemaker.Session(),
container_log_level=10,
role=role,
)
salary_model.deploy(..)
from sagemaker.utils import (
repack_model
)
repacked_salary_model = TensorFlowModel(
name="SalaryDebugModelRepack",
image_uri="763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.3.1-cpu-py37-ubuntu18.04",
model_data=salary_model.repacked_model_data,
sagemaker_session=sagemaker.Session(),
container_log_level=10,
role=role
)
repacked_salary_model.deploy(..)
repacked_salary_model = TensorFlowModel(
name="SalaryDebugModelRepack",
image_uri="763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.3.1-cpu-py37-ubuntu18.04",
model_data="s3://sagemaker-us-east-1-104436464649/SalaryDebugModel/model.tar.gz", # repacked file
sagemaker_session=sagemaker.Session(),
container_log_level=10,
role=role
)
repacked_salary_model.deploy(..)
model_package_1 = salary_model.register(
content_types=["application/json", "text/csv"],
response_types=["application/json", "text/csv"],
inference_instances=["ml.m5.4xlarge"],
transform_instances=["ml.m5.4xlarge"],
model_package_group_name="SalaryDebugModelGroup",
approval_status="Approved",
description="DEBUG"
)
model_package_1.deploy(
initial_instance_count=1,
instance_type="ml.m5.4xlarge",
endpoint_name="SalaryDebugModelPackARNDeploy",
wait=False
)
model_package_1 = salary_model.register(
content_types=["application/json", "text/csv"],
response_types=["application/json", "text/csv"],
inference_instances=["ml.m5.4xlarge"],
transform_instances=["ml.m5.4xlarge"],
model_package_group_name="SalaryDebugModelGroup",
approval_status="Approved",
description="DEBUG"
)
model_package_1.deploy(
initial_instance_count=1,
instance_type="ml.m5.4xlarge",
endpoint_name="SalaryDebugModelPackARNDeploy",
wait=False
)
./code/{both train and inference scripts}
./code/{other scripts}
./{model_assets}
./{other reference files}
salary_estimator = TensorFlow(
entry_point='train.py',
source_dir="pipelines/abalone/src", # this should be just "source" for your code
role=role,
image_uri="763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-training:2.3.1-gpu-py37-cu110-ubuntu18.04",
instance_count=1,
instance_type=training_instance_type,
output_path = get_projection_s3_dir(experiment_name, "model"),
model_dir = False,
sagemaker_session=sagemaker_session,
container_log_level=10, # 10 debug 20 info 30 warning 40 error
volume_size=160,
base_job_name=f"{base_job_prefix}-model-train",
hyperparameters={
"epochs": train_epochs,
"batch_size":train_batch_size,
"early_stop_patience": early_stop_tolerance,
},
)
# trainstep
step_register = RegisterModel(
name="RegisterSalaryModel",
estimator=salary_estimator,
model_data=step_train.properties.ModelArtifacts.S3ModelArtifacts,
image_uri="763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.3.1-cpu-py37-ubuntu18.04",
content_types=["application/json", "text/csv"],
response_types=["application/json", "text/csv"],
inference_instances=_available_inference_instances,
transform_instances=_available_transform_instances,
model_package_group_name=model_package_group_name,
approval_status=model_approval_status,
model_metrics=salary_model_metrics,
role=role,
)
Summary
/opt/ml/model/code folder, suggest the src to have both training code and inference code
Other Issues Came Across
.deploy function, and I have no idea how to force name or remove the tags)

@ruyyi0323,
Thank you so much for all of this!
For option A2, if you're calling repack_model directly, you can use the parameter you passed in for repacked_model_uri as your model_data parameter in your TensorFlowModel constructor.
Thanks @ChoiByungWook. I haven't tried that out yet, but I believe it would work as well.
Hello. I tried to follow your _[Success] B2_ recipe and it did not work for me. The estimator has both entry_point and source_dir parameters. The source dir contains both training and inference py files. The RegisterModel step uses both the estimator and the training S3 model artefact. However, this resulting S3 model artefact has no code source dir with inference.py. It was not packaged into the model during the training step. Do I need a model step before registering? When does repackaging happen?
tf_estimator = TensorFlow(
entry_point='tf_train.py',
source_dir='code',
role=role,
framework_version='2.4.1',
model_dir=False,
py_version='py37',
instance_type='ml.m5.large',
instance_count=1,
output_path=output_path,
)
register_step = RegisterModel(
name="RegisterModel",
estimator=tf_estimator,
model_data=training_step.properties.ModelArtifacts.S3ModelArtifacts,
image_uri=f"763104351884.dkr.ecr.eu-west-1.amazonaws.com/tensorflow-inference:2.3-cpu",
content_types=["application/json"],
response_types=["application/json"],
inference_instances=["ml.t2.medium", "ml.m5.xlarge"],
transform_instances=["ml.m5.xlarge"],
model_package_group_name=model_name_param,
)
Hi,
Yeah, you need to add that part in your training step, or initiate another training job to pack your model. The model artifact file that you use for the final registration should contain both the code file and the model files.
From: Ievgen Goichuk @.>
Sent: Thursday, April 22, 2021 9:25:38 AM
To: aws/sagemaker-python-sdk *@.>
Cc: Chen Liang @.>; State change @.*>
Subject: Re: [aws/sagemaker-python-sdk] RegisterModel or TensorFlowModel.register lost the inference behavior (#2123)
From the given code snippet, if you look at the model.tar.gz from output_path or from your TrainStep output, you should be able to see code+model file in your model.tar.gz.
From: Chen Liang @.>
Sent: Thursday, April 22, 2021 11:29:03 AM
To: aws/sagemaker-python-sdk *@.>; aws/sagemaker-python-sdk @.>
Cc: State change @.*>
Subject: Re: [aws/sagemaker-python-sdk] RegisterModel or TensorFlowModel.register lost the inference behavior (#2123)