Sagemaker-python-sdk: Question about deploying model

Created on 31 May 2018 · 10 Comments · Source: aws/sagemaker-python-sdk

Hi there,

I was wondering if it was possible to send a batch of inputs to predict from the endpoint? In my limited experience I can only send one image at a time, which takes quite some time if I have many images to predict.

documentation

All 10 comments

The SageMaker InvokeEndpoint API allows you to send up to 5MB at a time in one request. As long as the image you are running in your endpoint supports batches, then you can send batches up to that size.
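
As a rough illustration of staying under that limit, here is a minimal sketch that splits a list of inputs into batches whose JSON encoding fits in one request. The helper name `chunk_by_payload_size` is hypothetical, JSON is assumed as the request format, and reading 5MB as 5 * 1024 * 1024 bytes is also an assumption.

```python
import json

MAX_PAYLOAD_BYTES = 5 * 1024 * 1024  # assumed reading of the 5MB request limit


def chunk_by_payload_size(samples, max_bytes=MAX_PAYLOAD_BYTES):
    """Yield lists of samples whose JSON encoding fits within max_bytes."""
    batch, batch_bytes = [], 2  # 2 bytes for the enclosing "[" and "]"
    for sample in samples:
        sample_bytes = len(json.dumps(sample).encode("utf-8"))
        if sample_bytes + 2 > max_bytes:
            raise ValueError("a single sample exceeds the payload limit")
        # json.dumps separates list items with ", " (2 bytes)
        extra = sample_bytes + (2 if batch else 0)
        if batch and batch_bytes + extra > max_bytes:
            yield batch
            batch, batch_bytes = [], 2
            extra = sample_bytes
        batch.append(sample)
        batch_bytes += extra
    if batch:
        yield batch


# Tiny limit so the splitting is visible; real requests would use the default.
samples = [[0.0] * 10 for _ in range(5)]
batches = list(chunk_by_payload_size(samples, max_bytes=120))
```

Each batch could then be sent in its own predictor.predict call, provided the container's inference code actually returns one prediction per input.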

Are you planning on using one of our official images? They should all support batch inputs, but it'd help to know which one.

I am actually using the MXNet Cifar10 example from this repo: https://github.com/awslabs/amazon-sagemaker-examples (exact link: https://github.com/awslabs/amazon-sagemaker-examples/tree/master/sagemaker-python-sdk/mxnet_gluon_cifar10 ) Maybe I should raise this issue on that repo instead. I run the code as is, but once the model is deployed and I try to predict all the images in the "image_data" variable, I only get "8.0" as output. Here is my only change to the code:

image_data = np.array(image_data)
image_data = image_data.reshape(10, 3, 32, 32)
predictor.predict(image_data)

8.0

Hi @mklissa ,

The example notebook you refer to defines its own transform_fn function (https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/mxnet_gluon_cifar10/cifar10.py#L169), so prediction follows exactly what is defined there. From that function, I can see that it always returns a single prediction in response_body. If you want multiple predictions to be returned, you can modify this transform_fn to do that.

Our documentation seems to lack an explanation of transform_fn. It currently recommends that users define input_fn, predict_fn and output_fn to control the prediction logic. (https://github.com/aws/sagemaker-python-sdk#model-serving)
If you provide input_fn, predict_fn and output_fn yourself, you don't need to provide a transform_fn; the default transform_fn simply executes input_fn, predict_fn and output_fn in order.
(https://github.com/aws/sagemaker-mxnet-containers/blob/ab0495f7e2f0a7df08b149058fa49b1b859135e6/src/mxnet_container/serve/transformer.py#L129)

For your case, updating the current transform_fn would be a quick solution, but we are still deciding whether this approach should be encouraged in the future. I will mark this issue as a 'doc update'; best practices for this should be recorded in future documentation.

Thanks for using sagemaker! If you still have problems, feel free to respond here.
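
A minimal sketch of the change suggested above, assuming the transform_fn signature from the linked example (model, request data, input content type, output content type) and a JSON request body. To keep it runnable without MXNet, `net` is treated as any callable that returns one list of class scores per input; in the real script it would be the Gluon network and the argmax would use MXNet operations. `fake_net` is a purely hypothetical stand-in.

```python
import json


def transform_fn(net, data, input_content_type, output_content_type):
    """Return one predicted class index per input, not just the first."""
    batch = json.loads(data)  # assumed: JSON list with one entry per input
    scores = net(batch)       # assumed: one list of class scores per input
    predictions = [row.index(max(row)) for row in scores]
    return json.dumps(predictions), output_content_type


def fake_net(batch):
    """Hypothetical stand-in model: scores class i highest for input value i."""
    return [[1.0 if cls == x else 0.0 for cls in range(3)] for x in batch]


body, content_type = transform_fn(
    fake_net, "[2, 0, 1]", "application/json", "application/json"
)
print(body)  # [2, 0, 1]
```

The key point is that the whole list of predictions is serialized into the response, rather than only its first element.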

Our MXNet documentation now includes transform_fn: https://sagemaker.readthedocs.io/en/stable/using_mxnet.html#use-transform-fn. (At this time, it is the only one of our frameworks that supports transform_fn.)

I'm facing the same problem after migrating my SageMaker Keras model from file mode to script mode. Invoking the endpoint with a multidimensional array containing multiple inputs still returns only a single prediction. Can someone help me figure out what I'm missing here?

@aninoy can you open a new issue? (it'll help with our internal tracking)

@laurenyu I actually got it to work by serializing to jsonlines so this is not a bug or issue per se. I was just confused why the default json serialization with say 5 input samples only returns the prediction result for the first input.
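
For readers unfamiliar with the format, here is a minimal sketch of JSON Lines serialization: each input becomes its own JSON document on its own line. The helper names are hypothetical, and the content type to send with the request (for example application/jsonlines) and the shape of the response depend on the serving container, which this thread does not show.

```python
import json


def to_jsonlines(samples):
    """Serialize each sample as its own JSON document, one per line."""
    return "\n".join(json.dumps(sample) for sample in samples)


def from_jsonlines(text):
    """Parse a JSON Lines string back into a list of Python objects."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


samples = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
payload = to_jsonlines(samples)
print(payload)
```

With this encoding the server can treat every line as a separate input, instead of interpreting one JSON array as a single instance.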

@aninoy I believe the original expectation was that an endpoint would be used for one input at a time and batch transform would be used for multiple inputs. If you want, you can open an issue over in https://github.com/aws/sagemaker-mxnet-inference-toolkit/issues. If you have any interest in contributing a change, the relevant code starts at https://github.com/aws/sagemaker-mxnet-inference-toolkit/blob/master/src/sagemaker_mxnet_serving_container/default_inference_handler.py#L148-L149

@laurenyu thanks for the clarification. I'm actually trying to do this for a Keras TensorFlow model in script mode. If you could point me to the right code for that, I'd be happy to see if a change request would make sense.
