When calling TensorFlow from the SDK, we are limited in the size of the hyperparameters:
ClientError: An error occurred (ValidationException) when calling the CreateTrainingJob operation: 1 validation error detected: Value '{sagemaker_requirements="", batch_size=32, evaluation_steps=null, ... sagemaker_job_name="train-image-nature-2018-07-26-11-05-33-968", epochs=10, training_steps=3450}' at 'hyperParameters' failed to satisfy constraint: Map value must satisfy constraint: [Member must have length less than or equal to 256, Member must have length greater than or equal to 0]
256 is small, in particular if you send a list of labels or have many parameters.
.../envs/sagemaker_tf_27/lib/python2.7/site-packages/sagemaker/session.pyc in train(self, image, input_mode, input_config, role, job_name, output_config, resource_config, hyperparameters, stop_condition, tags)
262 LOGGER.info('Creating training-job with name: {}'.format(job_name))
263 LOGGER.debug('train request: {}'.format(json.dumps(train_request, indent=4)))
--> 264 self.sagemaker_client.create_training_job(**train_request)
265
266 def tune(self, job_name, strategy, objective_type, objective_metric_name,
.../envs/sagemaker_tf_27/lib/python2.7/site-packages/botocore/client.pyc in _api_call(self, *args, **kwargs)
312 "%s() only accepts keyword arguments." % py_operation_name)
313 # The "self" in this scope is referring to the BaseClient.
--> 314 return self._make_api_call(operation_name, kwargs)
315
316 _api_call.__name__ = str(py_operation_name)
.../envs/sagemaker_tf_27/lib/python2.7/site-packages/botocore/client.pyc in _make_api_call(self, operation_name, api_params)
610 error_code = parsed_response.get("Error", {}).get("Code")
611 error_class = self.exceptions.from_code(error_code)
--> 612 raise error_class(parsed_response, operation_name)
613 else:
614 return parsed_response
Hi @PedroCardoso ,
For each hyperparameter in the map, both the key and the value are limited to a length of at most 256.
As for having many hyperparameters: the number of entries does not count against this limit, as long as each key and each value stays within 256. If a single map value is a long list, though, it might be a problem.
So could you give me a specific example? Then we can either recommend a better practice to you or increase the limit to a more reasonable number.
Thanks
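To make the per-entry limit concrete, here is a minimal sketch that flags which hyperparameters would exceed it before a job is submitted. It assumes non-string values are serialized to JSON strings before being sent, which may not match every SDK version; the helper name `oversized_hyperparameters` is hypothetical and not part of the SDK.

```python
import json

# Per-key and per-value length limit quoted in the ValidationException above.
MAX_LEN = 256


def oversized_hyperparameters(hyperparameters, max_len=MAX_LEN):
    """Return the keys whose key or serialized value is longer than max_len.

    Assumption: non-string values are JSON-serialized before being sent.
    """
    too_long = []
    for key, value in hyperparameters.items():
        serialized = value if isinstance(value, str) else json.dumps(value)
        if len(str(key)) > max_len or len(serialized) > max_len:
            too_long.append(key)
    return too_long


# Many short entries are fine; one long list is not.
hp = {
    "batch_size": 32,
    "epochs": 10,
    "labels": ["label_%02d" % i for i in range(40)],
}
print(oversized_hyperparameters(hp))
```

With 40 short labels the serialized list is already several hundred characters long, so only the `labels` entry is reported.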
Hi @yangaws
I believe that my particular problem is with sending a list of labels as a parameter. I do need those to build the Estimator.
As an example, think of a parameter that contains a list of 30 or 40 string objects.
@PedroCardoso
I am not confident that we will increase that limit any time soon. I can put in a feature request here. If we keep receiving such issues, we will definitely prioritize this feature.
For now my suggestion is, for your list of 30-40 labels, specify all the labels as a separate channel in some common format like JSON.
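A rough sketch of that suggestion, assuming the labels are written to a `labels.json` file that is uploaded to S3 and passed as an extra channel. In File mode, SageMaker makes each channel available inside the training container under `/opt/ml/input/data/<channel_name>`; the channel name `labels`, the file name, and both helper functions are hypothetical. A temporary directory stands in for the channel directory so the round trip can be tried locally.

```python
import json
import os
import tempfile


def write_labels(labels, directory, filename="labels.json"):
    """Write the label list to a JSON file and return its path."""
    path = os.path.join(directory, filename)
    with open(path, "w") as f:
        json.dump(labels, f)
    return path


def read_labels(channel_dir, filename="labels.json"):
    """Read the label list back from a channel directory."""
    with open(os.path.join(channel_dir, filename)) as f:
        return json.load(f)


# Local round trip. Inside a training job, channel_dir would instead be
# something like /opt/ml/input/data/labels (assumed channel name).
channel_dir = tempfile.mkdtemp()
labels = ["label_%02d" % i for i in range(40)]
write_labels(labels, channel_dir)
restored = read_labels(channel_dir)
```

Because the list travels as a file, its size is no longer bounded by the 256-character limit on hyperparameter values.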
Is the channel information available in the parameters passed to estimator_fn()?
Hello,
I don't think the channel information is exposed to estimator_fn(), as is evident here: https://github.com/aws/sagemaker-tensorflow-container/blob/master/src/tf_container/trainer.py#L92
I believe only the train_input_fn and eval_input_fn have access to the channels.
https://github.com/aws/sagemaker-tensorflow-container/blob/master/src/tf_container/trainer.py#L116
https://github.com/aws/sagemaker-tensorflow-container/blob/master/src/tf_container/trainer.py#L153
A workaround for this is to use the hyperparameters to store the channel metadata. Like...
hp = {'my_channel': 's3://url/labels.json'}
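One way the receiving side of this workaround might look: the hyperparameter carries only a short S3 URI, and the code that needs the labels downloads and parses the file itself. This is a sketch under assumptions, not the container's own mechanism: it targets Python 3 (`urllib.parse`), assumes `boto3` and S3 read permissions are available where it runs, and the helper names `split_s3_uri` and `load_labels` as well as the local path are hypothetical.

```python
import json
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key); reject anything else."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("not a valid S3 URI: %r" % uri)
    return parsed.netloc, parsed.path.lstrip("/")


def load_labels(hyperparameters, key="my_channel", local_path="/tmp/labels.json"):
    """Download the JSON file referenced by a hyperparameter and parse it."""
    import boto3  # imported lazily so split_s3_uri works without boto3

    bucket, object_key = split_s3_uri(hyperparameters[key])
    boto3.client("s3").download_file(bucket, object_key, local_path)
    with open(local_path) as f:
        return json.load(f)
```

Validating the URI up front turns a malformed value into an immediate `ValueError` instead of a less obvious download failure later.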
Closing due to inactivity. Feel free to reopen if necessary.
Just hit this issue, using a custom docker container to train a model and I can't specify the features I want to train on. :-1:
Hitting the same thing too. It's odd that this notebook shows a value larger than 256 in the hyperparameters, but it's actually not supported.
For those hitting this: my solution was to pass the big parameters as a JSON file and have it sent to the job with a manifest file.
do you have a sample for that?
I too am interested in learning about this, since I'm currently using the hyperparams file for all my image annotation labels in an object recognition case, and there are too many labels apparently.
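The comment above does not include its code, so the following is only one possible reading of it: put the large parameters in a JSON file in S3 and reference that file from a manifest in the S3 `ManifestFile` layout (a JSON array whose first element is a `prefix` object, followed by keys relative to that prefix). The bucket name, key names, and the `build_manifest` helper are hypothetical.

```python
import json


def build_manifest(prefix, relative_keys):
    """Build manifest content: a prefix entry followed by relative keys."""
    if not prefix.endswith("/"):
        prefix += "/"
    return json.dumps([{"prefix": prefix}] + list(relative_keys))


# The big parameters live in their own file instead of in hyperparameters.
params = {"labels": ["label_%02d" % i for i in range(40)]}
params_body = json.dumps(params)

# Manifest that points the job at s3://my-bucket/config/params.json
# (hypothetical location).
manifest_body = build_manifest("s3://my-bucket/config", ["params.json"])
print(manifest_body)
```

Both files would then be uploaded to S3, and the manifest would be given as the data source of an extra input channel, so the training code can read `params.json` from that channel's directory.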
I'm also stuck here. My use case is that I need to set the SAGEMAKER_SPARKML_SCHEMA environment variable when using the https://github.com/aws/sagemaker-sparkml-serving-container (required for CSV input) and I also have ~40 features to pass. I don't think this is an uncommon pattern