I'm trying to use the SageMaker Python SDK in Lambda to trigger train and deploy steps. I packaged the dependencies along with the function code, and creating the Lambda function fails with the error 'Unzipped size must be smaller than 262144000 bytes'.
I know this is a Lambda service limit rather than an SDK bug, but is there any way I can reduce the size of the dependencies?
I tried removing boto3 and botocore from the function zip file, since Lambda provides these libraries, but that led to a different issue: 'expecting python-dateutil<2.8.1,>=2.1'.
AWS Lambda error 'Unzipped size must be smaller than 262144000 bytes'
Similarly, when I packaged the code with its dependencies and uploaded the zip file directly to the Lambda function instead of using a Layer, I received the error 'Unzipped size must be smaller than 262144000 bytes'.
Appreciate your help.
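To see how far a package is from the limit before uploading, the unzipped size can be computed locally. This is a minimal sketch, assuming the limit is exactly the 262144000 bytes quoted in the error message; the function names are made up for illustration.

```python
import io
import zipfile

# Limit taken from the error message quoted above.
LAMBDA_UNZIPPED_LIMIT_BYTES = 262144000


def unzipped_size(zip_source):
    """Return the total uncompressed size, in bytes, of all entries in a zip archive."""
    with zipfile.ZipFile(zip_source) as archive:
        return sum(info.file_size for info in archive.infolist())


def fits_in_lambda(zip_source, limit=LAMBDA_UNZIPPED_LIMIT_BYTES):
    """True if the unzipped size is strictly smaller than the limit."""
    return unzipped_size(zip_source) < limit


if __name__ == "__main__":
    # Build a small in-memory archive so the example runs without files on disk.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("python/example.py", "x = 1\n" * 1000)
    print(unzipped_size(buffer), fits_in_lambda(buffer))
```

Both functions accept either a path to a zip file or a file-like object.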
Hi @nemalipuri !
Unfortunately, running sagemaker-python-sdk in AWS Lambda is not currently supported. This is a pain point that we're aware of and for which we are working on prioritizing a solution.
I would normally recommend pinning python-dateutil to 2.8.0 to resolve the conflict, but I actually experimented locally and found that, even without boto3, the zip (55MB) is still over the 50MB zipped limit for Lambda.
An alternative is to remove numpy and scipy dependencies entirely for specific sagemaker installations, as they account for ~73% of the installation size.
In order for me to gauge the solution's viability, can you tell me if you will need numpy/scipy functionality when running sagemaker-python-sdk in AWS Lambda?
Similarly, what are your sagemaker-python-sdk AWS Lambda use-cases?
Thanks!
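For anyone who wants to check which packages dominate their own installation (the ~73% figure for numpy and scipy above), here is a rough sketch. It assumes the packages were installed into a single target directory; the helper names are hypothetical.

```python
import os


def directory_size(path):
    """Sum the sizes of all regular files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def size_share_by_package(install_dir):
    """Map each top-level entry in install_dir to its fraction of the total size."""
    sizes = {}
    for entry in os.listdir(install_dir):
        full = os.path.join(install_dir, entry)
        if os.path.isdir(full):
            sizes[entry] = directory_size(full)
        else:
            sizes[entry] = os.path.getsize(full)
    total = sum(sizes.values())
    if total == 0:
        return {}
    return {name: size / total for name, size in sizes.items()}


if __name__ == "__main__":
    # Print the entries of the current directory, largest share first.
    shares = size_share_by_package(".")
    for name, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
        print(f"{share:6.1%}  {name}")
```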
@knakad Thanks for looking into this.
Almost a year ago I used the SageMaker Python SDK in Lambda without any issues; the version was 1.18.0 and the package was smaller.
Another use case has come up now, and when I tried to pull the latest package its size exceeded the unzipped limit (250 MB). The use case is to build an ML model with a custom container and implement Lambda functions that create the training job and the endpoint. Step Functions will invoke these Lambda functions at scheduled times to automate the workflow.
I am not using scipy in my client code.
Even in sagemaker-python-sdk library I see scipy used at one place only(src/sagemaker/amazon/common.py).
I did try without boto3, botocore and scipy, but Lambda failed with the error "No module named 'numpy.core._multiarray_umath'".
Steps I executed:
Cloned sagemaker-python-sdk repo v1.49.0
Removed "scipy>=0.19.0" in setup.py
pip install into a directory (ex. pip install . -t ./python -c ../requirements.txt)
Zipped and uploaded into S3
Created a Layer and attached this layer to Lambda
'import sagemaker' failed with No module named 'numpy.core._multiarray_umath'.
If you could provide a workaround, that would be great; otherwise the plain SDK (via boto3) is the only option I have for calling the SageMaker APIs from Lambda.
Thank you.
Until sagemaker-python-sdk is officially supported in AWS Lambda, here's a workaround that removes a bit of bloat from the installation, allowing it to fit in lambda without sacrificing any functionality:
pip install sagemaker --target sagemaker-installation
cd sagemaker-installation
find . -type d -name "tests" -exec rm -rfv {} +
find . -type d -name "__pycache__" -exec rm -rfv {} +
zip -r sagemaker_lambda_light.zip .
I was able to upload the resulting zip along with a simple handler that called import sagemaker and ran some very basic validation.
This solution also doesn't require you to fork any of the code, so you can more easily run the latest sagemaker-python-sdk with the latest features/bug fixes.
Please try it out and let me know if you run into any issues =)
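The two find commands above can also be written in Python, for example on a machine without a POSIX shell. This is only a sketch of the same pruning step; prune_install is a hypothetical name, and the directory names are the ones used in the commands above.

```python
import os
import shutil

# Directory names removed by the find commands above.
PRUNE_DIR_NAMES = frozenset({"tests", "__pycache__"})


def prune_install(install_dir, names=PRUNE_DIR_NAMES):
    """Delete every directory under install_dir whose name is in names.

    Returns the list of removed paths.
    """
    removed = []
    for root, dirs, _files in os.walk(install_dir, topdown=True):
        for name in list(dirs):
            if name in names:
                target = os.path.join(root, name)
                shutil.rmtree(target)
                # Drop it from dirs so os.walk does not descend into a deleted tree.
                dirs.remove(name)
                removed.append(target)
    return removed


if __name__ == "__main__":
    # Assumes the packages were installed into ./sagemaker-installation as above.
    for path in prune_install("sagemaker-installation"):
        print("removed", path)
```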
Perfect, it worked after executing the above steps. Thank you so much!
Anytime! Leaving this issue open to track the workaround and the feature request.
@knakad This looks like a great solution and I'd like to implement it. I followed the steps you listed out, created a layer and attached it to my Lambda function, but I still get the error when I try and import sagemaker package in my lambda function:
"errorMessage": "Unable to import module 'lambda_function': No module named 'sagemaker'"
Any idea what could be causing the issue? I don't get any hints in the logs in CloudWatch and it just looks like the function is not able to find the sagemaker package from the attached layer.
Thanks for your help on this, in advance.
Is there any date decided for the support in AWS Lambda for sagemaker-python-sdk ?
After facing the issue myself, I read through the documentation and found that the zip file must follow a specific path layout. There are two options: python or python/lib/python3.8/site-packages. I installed the sagemaker package into a python folder, deleted the tests and __pycache__ folders, then zipped it up, uploaded it to S3 and created a layer. After that, import sagemaker worked from the Lambda function with the layer attached.
Documentation for ease of reference: https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html
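A quick way to catch the "No module named 'sagemaker'" problem before creating the layer is to check the paths inside the zip. This is a small sketch, assuming the simpler of the two layouts (everything under a top-level python folder); the function name is made up.

```python
import zipfile


def entries_outside_python_dir(zip_source):
    """Return archive entries that are not under the top-level python/ folder."""
    with zipfile.ZipFile(zip_source) as archive:
        return [name for name in archive.namelist() if not name.startswith("python/")]
```

An empty list means every file sits under python/. A zip built from inside the installation folder itself would list entries such as sagemaker/__init__.py here.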
+1 to all of this! Looking forward to using SageMaker in Lambda once this is resolved.
@knakad Do we need to zip the sagemaker installation together with handler.py and upload it to S3 manually? Also, how will the Lambda function pick up the new zip file? It would be helpful if you could list the steps to do this. Thanks!
In order to create a valid sagemaker SDK layer it is important to create the layer using an AWS compatible numpy version (since some numpy packages are binary). Here is a slightly updated version of the above that has proved to work for me:
mkdir sagemaker-layer
cd sagemaker-layer
mkdir python
# Install the sagemaker modules in the python folder
pip install sagemaker --target ./python
# Remove tests and cache stuff (to reduce size)
find ./python -type d -name "tests" -exec rm -rfv {} +
find ./python -type d -name "__pycache__" -exec rm -rfv {} +
# Remove the python/numpy* folders since it will contain a numpy version for your host machine
rm -rf python/numpy*
# Download an AWS Linux compatible numpy package
# Navigate to https://pypi.org/project/numpy/#files.
# Search for and download newest *manylinux1_x86_64.whl package for your Python version (I have Python 3.7)
curl "https://files.pythonhosted.org/packages/9b/04/c3846024ddc7514cde17087f62f0502abf85c53e8f69f6312c70db6d144e/numpy-1.19.2-cp37-cp37m-manylinux2010_x86_64.whl" -o "numpy-1.19.2-cp37-cp37m-manylinux2010_x86_64.whl"
unzip numpy-1.19.2-cp37-cp37m-manylinux2010_x86_64.whl -d python
# Zip only the python folder so the downloaded .whl file is not bundled into the layer
zip -r sagemaker_lambda.zip python
# When zip file is ready, upload it to S3
aws s3 cp sagemaker_lambda.zip s3://ai4iot-lambda/sagemaker_lambda_light.zip
# When the upload is complete, go to Lambda layers to create a layer from the uploaded zip file.
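The last step can also be scripted with boto3 instead of the console. The sketch below uses the bucket and key from the aws s3 cp command above; the layer name "sagemaker-sdk" is a placeholder, and the python3.7 runtime is an assumption based on the numpy wheel chosen above. It needs AWS credentials with permission to publish layers.

```python
def build_layer_request(bucket, key, layer_name, runtime):
    """Build the keyword arguments for the Lambda publish_layer_version call."""
    return {
        "LayerName": layer_name,
        "Content": {"S3Bucket": bucket, "S3Key": key},
        "CompatibleRuntimes": [runtime],
    }


def publish_layer(request):
    """Publish the layer and return the ARN of the new layer version."""
    # Imported here so the request builder can be used without boto3 installed.
    import boto3

    client = boto3.client("lambda")
    response = client.publish_layer_version(**request)
    return response["LayerVersionArn"]


if __name__ == "__main__":
    request = build_layer_request(
        bucket="ai4iot-lambda",
        key="sagemaker_lambda_light.zip",
        layer_name="sagemaker-sdk",  # placeholder name, not from the thread
        runtime="python3.7",  # assumed to match the cp37 numpy wheel
    )
    print(publish_layer(request))
```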
@arne-munch-ellingsen Thank you for the lead. I tried your code but when testing the lambda function I got the following error:
Response:
{
"errorMessage": "Unable to import module 'lambda_function': cannot import name '_ccallback_c' from 'scipy._lib' (/opt/python/scipy/_lib/__init__.py)",
"errorType": "Runtime.ImportModuleError"
}
My local machine (where I ran your code) runs macOS. Any idea what I am missing?
@shlomi-schwartz Are you trying to import scipy in your Lambda function? If that is the case you will have to add scipy to your layer as well using the same "trick" that I used to add the AWS Lambda Python 3.7 specific numpy library. The Sagemaker SDK does not include scipy.
@arne-munch-ellingsen Thanks for the tip, I was not calling scipy, it was one of the dependencies for sagemaker==1.71.1, but I used your trick and downloaded the .whl file, it works now!
Thanks again 👍
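Import errors like the scipy one above usually mean the layer contains binaries built for the machine that ran pip (macOS here) rather than for Lambda's Linux runtime. A rough heuristic for spotting them before upload is to scan the layer folder for file names that look like macOS or Windows builds. This is only a sketch based on common naming conventions (for example a "darwin" tag in .so file names), so it can miss cases; the function name is made up.

```python
import os


def foreign_binaries(install_dir):
    """List files under install_dir that look like macOS or Windows binaries."""
    suspects = []
    for root, _dirs, files in os.walk(install_dir):
        for name in files:
            lowered = name.lower()
            # macOS extension modules typically carry a "darwin" tag before ".so".
            mac_build = lowered.endswith(".so") and "darwin" in lowered
            # .pyd is the Windows extension suffix, .dylib a macOS shared library.
            if mac_build or lowered.endswith((".pyd", ".dylib")):
                suspects.append(os.path.join(root, name))
    return sorted(suspects)


if __name__ == "__main__":
    # Assumes the layer layout used above, with packages under ./python.
    for path in foreign_binaries("python"):
        print("not built for Linux:", path)
```

Any package reported here is a candidate for the same replacement trick: delete its folder and unzip the matching manylinux wheel in its place.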
This worked for me, but I am looking forward to the actual support for SageMaker SDK in Lambda.
mkdir lambda_deployment
cd lambda_deployment
touch lambda_function.py
Write the logic in the lambda_function.py file.
pip install sagemaker --target sagemaker-installation
cd sagemaker-installation
find . -type d -name "tests" -exec rm -rfv {} +
find . -type d -name "__pycache__" -exec rm -rfv {} +
zip -r ../lambda-deployment.zip .
cd ..
zip -g lambda-deployment.zip lambda_function.py
Then upload lambda-deployment.zip to Lambda
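For the lambda_function.py step, a minimal handler along the lines of the "simple handler that called import sagemaker" mentioned earlier might look like the sketch below. The handler name lambda_handler matches the Lambda default for a file called lambda_function.py; the shape of the returned dictionary is made up for illustration.

```python
import importlib


def lambda_handler(event, context):
    """Report whether the sagemaker package can be imported in this environment."""
    try:
        sagemaker = importlib.import_module("sagemaker")
    except ImportError as exc:
        # Return the error instead of raising so the cause is visible in the response.
        return {"imported": False, "error": str(exc)}
    return {"imported": True, "version": getattr(sagemaker, "__version__", "unknown")}
```

Invoking the function with an empty test event is enough to confirm that the package (or layer) is being picked up before adding real training or deployment logic.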
Further to @arne-munch-ellingsen's post, you can skip downloading the numpy whl and instead use the AWSLambda-Python37-SciPy1x layer provided by AWS (arn:aws:lambda:eu-west-2:142628438157:layer:AWSLambda-Python37-SciPy1x:35).