Sagemaker-python-sdk: Local mode is not working in the latest sdk

Created on 16 May 2018 · 2 Comments · Source: aws/sagemaker-python-sdk

System Information

  • Framework: Tensorflow
  • Framework Version: 1.8.0
  • Python Version: 3.5
  • CPU or GPU: CPU
  • Python SDK Version: latest
  • Are you using a custom image: No

Describe the problem

Local mode is not working. This problem seems similar to #144.

I installed the latest sagemaker-python-sdk with: pip install git+https://github.com/aws/sagemaker-python-sdk
I'm trying to run the following code from the provided example (https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/tensorflow_distributed_mnist/tensorflow_local_mode_mnist.ipynb):

import sagemaker
import utils
from tensorflow.contrib.learn.python.learn.datasets import mnist
import tensorflow as tf

data_sets = mnist.read_data_sets('data', dtype=tf.uint8, reshape=False, validation_size=5000)

utils.convert_to(data_sets.train, 'train', 'data')
utils.convert_to(data_sets.validation, 'validation', 'data')
utils.convert_to(data_sets.test, 'test', 'data')

import boto3
session = boto3.Session(profile_name='my_profile')
s3 = session.resource('s3')
for bucket in s3.buckets.all():
    print(bucket.name)

#### Entering MFA code here

sagemaker_session = sagemaker.Session(boto_session=session)
role=sagemaker.get_execution_role(sagemaker_session=sagemaker_session)
inputs = sagemaker_session.upload_data(path='data', key_prefix='data/mnist', bucket='sagemaker-my-bucket')

from sagemaker.tensorflow import TensorFlow

mnist_estimator = TensorFlow(entry_point='mnist.py',
                             role=role,
                             training_steps=10, 
                             evaluation_steps=10,
                             train_instance_count=2,
                             train_instance_type='local',
                             sagemaker_session=sagemaker_session)

mnist_estimator.fit(inputs)

I get the following error:

Minimal repro / logs

ClientError: An error occurred (ValidationException) when calling the CreateTrainingJob operation: 1 validation error detected: Value 'local' at 'resourceConfig.instanceType' failed to satisfy constraint: Member must satisfy enum value set: [ml.p2.xlarge, ml.m5.4xlarge, ml.m4.16xlarge, ml.p3.16xlarge, ml.m5.large, ml.p2.16xlarge, ml.c4.2xlarge, ml.c5.2xlarge, ml.c4.4xlarge, ml.c5.4xlarge, ml.c4.8xlarge, ml.c5.9xlarge, ml.c5.xlarge, ml.c4.xlarge, ml.c5.18xlarge, ml.p3.2xlarge, ml.m5.xlarge, ml.m4.10xlarge, ml.m5.12xlarge, ml.m4.xlarge, ml.m5.24xlarge, ml.m4.2xlarge, ml.p2.8xlarge, ml.m5.2xlarge, ml.p3.8xlarge, ml.m4.4xlarge]

All 2 comments

Hi @Stanpol

The TensorFlow estimator uses whatever SageMaker Session you pass to it, and creates a LocalSession only if none is passed in. The session determines whether calls go to the SageMaker service or to local mode. Because you pass a regular Session, the 'local' instance type is sent to the service's CreateTrainingJob operation, which rejects it. Remove this argument from the TensorFlow constructor:
sagemaker_session=sagemaker_session

Let us know if you have any more questions.

Thanks!
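
To make the behavior described in this comment concrete, here is a minimal sketch of the session selection it describes. This is a simplified model, not the SDK's code: FakeRemoteSession, FakeLocalSession and choose_session are hypothetical names, and the "ml." prefix check only stands in for the service-side enum validation seen in the error above.

```python
# Simplified, hypothetical model of the behavior described in the comment.
# None of these names come from the SageMaker SDK.


class FakeRemoteSession:
    """Stands in for a regular Session that talks to the SageMaker service."""

    def create_training_job(self, instance_type):
        # Assumption: the service only accepts "ml.*" instance types,
        # mirroring the enum listed in the ValidationException.
        if not instance_type.startswith("ml."):
            raise ValueError(
                "Value %r at 'resourceConfig.instanceType' failed validation"
                % instance_type
            )
        return "submitted to the service on " + instance_type


class FakeLocalSession:
    """Stands in for a local-mode session that runs containers locally."""

    def create_training_job(self, instance_type):
        return "running in local containers"


def choose_session(instance_type, session=None):
    """An explicitly passed session always wins over the default."""
    if session is not None:
        return session
    if instance_type == "local":
        return FakeLocalSession()
    return FakeRemoteSession()


# No session passed: 'local' resolves to the local-mode stand-in.
print(choose_session("local").create_training_job("local"))

# A remote session passed explicitly: 'local' reaches the service check.
try:
    choose_session("local", FakeRemoteSession()).create_training_job("local")
except ValueError as err:
    print(err)
```

The second call reproduces the shape of the reported failure: the instance type is 'local', but the explicitly supplied session still routes the request to the service.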

I'm going to close this for now -- please feel free to reopen this if you have more questions. Thanks!
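
For reference, the suggested fix can be sketched as follows. The constructor arguments are copied from the original report with sagemaker_session removed; the helper names local_estimator_kwargs and train_locally are hypothetical, and role and inputs are assumed to be computed as in the report. This is a sketch against the SDK version discussed in this thread, not a verified recipe.

```python
def local_estimator_kwargs(role):
    """Constructor arguments from the report, without sagemaker_session."""
    return {
        "entry_point": "mnist.py",
        "role": role,
        "training_steps": 10,
        "evaluation_steps": 10,
        "train_instance_count": 2,
        "train_instance_type": "local",
        # Deliberately no "sagemaker_session" key, so the estimator is
        # left to create its own local-mode session.
    }


def train_locally(role, inputs):
    """Build the estimator and fit it; needs the sagemaker SDK and Docker."""
    # Imported lazily so this module loads where the SDK is not installed.
    from sagemaker.tensorflow import TensorFlow

    estimator = TensorFlow(**local_estimator_kwargs(role))
    estimator.fit(inputs)
    return estimator
```

Calling train_locally(role, inputs) with the values from the report should start training in local containers instead of calling CreateTrainingJob. One caveat, stated as an assumption: with no session passed, the estimator presumably falls back to the default AWS credentials rather than the 'my_profile' profile used earlier in the report.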
