I am trying to deploy a previously trained SKLearn model. Training works fine with the SDK, but when I use the SKLearnModel class, the base image is resolved with a different account ID. Deployment then fails with this error:
botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the CreateModel operation: Role <my_role_arn> cannot pull 520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-scikit-learn:0.20.0-cpu-py3. Ensure that the role exists and the image was granted pull permission.
from sagemaker.sklearn.estimator import SKLearn
from sagemaker import s3_input
channel = {
    'training': s3_input(s3_training_key),
}
skl = SKLearn(
    entry_point='./model.py',
    role=role,
    train_instance_type="ml.c4.xlarge",
    train_instance_count=1,
    output_path=s3_output_prefix,
    base_job_name=job_name,
)
skl.fit(
    inputs=channel,
    job_name=job_name,
)
import sagemaker
from sagemaker.sklearn.model import SKLearnModel
estim = sagemaker.estimator.Estimator.attach(job_name)
estim_data = str(estim.model_data)
model = SKLearnModel(
    model_data=estim_data,
    role=role,
    entry_point='./model.py',
    name=job_name,
)
model.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
    endpoint_name=endpoint_name,
)
The deploy call is what raises the error. I think the problem comes from an account inconsistency between fw_utils and fw_registry: the Estimator class seems to use fw_registry.default_framework_uri, whereas the SKLearnModel class uses fw_utils.create_image_uri. The error points to account ID 520713654638, which is the default value in create_image_uri. I then found #624.
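To make the mismatch easier to see, the registry account can be pulled out of the image URI in the error and compared with the image the training job used. This is only a rough sketch: parse_ecr_uri is a hypothetical helper I wrote for illustration, not part of the SageMaker SDK, and the URI is the one copied from the error above.

```python
import re

# Hypothetical helper (not part of the SageMaker SDK): split an ECR image URI
# into its registry account, region, repository, and tag.
ECR_URI = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>[^:]+):(?P<tag>.+)$"
)


def parse_ecr_uri(uri):
    """Return a dict with the account, region, repository, and tag of an ECR URI."""
    match = ECR_URI.match(uri)
    if match is None:
        raise ValueError("not an ECR image URI: %r" % uri)
    return match.groupdict()


# Image URI taken verbatim from the CreateModel error message.
failing = parse_ecr_uri(
    "520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-scikit-learn:0.20.0-cpu-py3"
)
print(failing["account"], failing["repository"], failing["tag"])
```

Running the same helper on the training image of the attached estimator (I believe estim.train_image() returns it in the 1.x SDK, but I have not checked every version) should show whether the two accounts differ.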
Apologies for the inconvenience. #624 has now been merged and will be released with the next version.
This has been released in https://github.com/aws/sagemaker-python-sdk/releases/tag/v1.18.3
Please update your library.
pip install --upgrade sagemaker
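After upgrading, it may be worth confirming that the environment really picked up v1.18.3 or later. The snippet below is a minimal sketch: version_tuple and has_fix are hypothetical helper names, and the parsing only looks at the leading numeric parts of the version string.

```python
# Hypothetical helpers for checking whether an installed SDK version
# includes the release mentioned above (v1.18.3).
FIXED_IN = (1, 18, 3)


def version_tuple(version):
    """Convert e.g. "1.18.3" into (1, 18, 3), ignoring any non-numeric suffix."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_fix(version):
    """Return True if the given version string is at least 1.18.3."""
    return version_tuple(version) >= FIXED_IN


# Typical use, assuming the sagemaker package is installed:
#   import sagemaker
#   print(has_fix(sagemaker.__version__))
print(has_fix("1.18.2"), has_fix("1.18.3"))
```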