Sagemaker-python-sdk: v2.0.0 Release Plans

Created on 6 May 2020 · 23 Comments · Source: aws/sagemaker-python-sdk

With the sunsetting of Python 2 earlier this year, we’re taking this opportunity to work on v2.0.0 and include some breaking changes that we have been considering. Our approach is that this isn’t meant to be a revolutionary overhaul. We don’t plan on rewriting everything entirely, and intend to keep the core experience largely the same as it was before. While we do have an opportunity to make breaking changes here, I do want to make sure each breaking change is meaningful, and not simply creating a different user experience for the sake of doing so.

I realize the following lists might look like a lot, but I did attempt to keep the list of changes relatively short. There are always going to be lots of changes we wish we could make to the Python SDK, but there are two categories of changes I deliberately excluded: (1) smaller improvements that can be done without incurring a breaking change and (2) bigger improvements/features that deserve their own dedicated planning.

6/4/20 edit: Due to how things have been going (and resourcing), some items have been deprioritized. I've also edited the timeline.
7/10/20 edit: Updated the timeline and added one item due to some changes in team planning.
7/28/20 edit: Updated last two dates in timeline.

Timeline

We are targeting the second half of July for a release, with release candidates starting at the end of May. More concrete dates to come as we get further along in implementation.

Timeframe | Milestone
---|---
5/8/20 | create branch for v2 development
5/18/20 | start including warnings in v1 for upcoming changes
mid June | start releasing release candidates
7/29/20 | last v1 release. After this, merges to master are limited to critical/urgent changes.
8/3/20 | v2.0.0 release

Major Changes

Description | Issue
---|---
Deprecate Python 2 support for the Python SDK | #1461
Deprecate TensorFlow "legacy mode" (in favor of "script mode") | #1462
Require framework_version for frameworks. For frameworks with multiple Python versions supported, require py_version. | #1465
Create new resources each time with deploy() and transformer() | #1470
Create a single module for generating image URIs | #1464
Separate the functionality of printing logs from attach() | #1405
Refactor Session | #1463
Create a script for upgrading from v1 to v2 code | #1478
Move SerDe to dedicated modules | #1694
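
To make the framework_version / py_version requirement (#1465) more concrete, here is a minimal sketch of the kind of check it implies. This is an illustration only, not the SDK's implementation: the function name `validate_versions`, the set `MULTI_PY_FRAMEWORKS`, and the framework name in it are all hypothetical.

```python
# Hypothetical sketch: none of these names come from the SDK itself.
# The set of frameworks that support several Python versions is made up here.
MULTI_PY_FRAMEWORKS = {"example_framework"}


def validate_versions(framework, framework_version=None, py_version=None):
    """Raise ValueError when a required version argument is missing."""
    if framework_version is None:
        # No silent default: the caller has to pick a framework version.
        raise ValueError("framework_version is required")
    if framework in MULTI_PY_FRAMEWORKS and py_version is None:
        # Only frameworks with more than one supported Python need py_version.
        raise ValueError("py_version is required for " + framework)
    return framework_version, py_version
```

With a check like this, a missing version fails immediately at construction time instead of falling back to a default that may change between releases.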

Smaller Changes

Description | Issue
---|---
Remove scipy from required dependencies | #1471
Clean up util modules | #1466
Rename unclear/inconsistent names | #1473
Remove **kwargs | #1474
Deprecate CLI | #1476
Deprecate unused parameters | #1475

Other Possibilities

This is the shortlist of items that almost made the cut. Some may end up being included, time permitting.

  • Allow framework entry points to not be in the top-level directory of source_dir (#941)
  • Revisit how hyperparameters are encoded (#613)
  • Introduce KMS support for Session.default_bucket() and use that for framework sourcedir.tar.gz files (#1124)
  • Default wait=True for HyperparameterTuning and Transform Jobs
  • Match some of ScriptProcessor's interface to be more like framework estimators, e.g. source_dir/code_location (#1248)

Project Board

https://github.com/aws/sagemaker-python-sdk/projects/1

All 23 comments

Suggest adding Python SDK support for Neo to compile for target devices other than “ml_” instance types, e.g., Jetson Nano, DeepLens, Raspberry Pi.

Can we have the option to not require an s3 path when creating SageMaker models, but just an image in ECR? While it will take extra time to update those in production by re-deploying the image, rather than just the artifact, customers love developing quickly by creating the container with the model files actually stored in the image itself.

Can we add lambda as a supported endpoint for inference? (currently only 'instance_type' and 'local' are supported)

model.deploy(instance_type='lambda')

working example, for reference: https://aws.amazon.com/blogs/machine-learning/build-test-and-deploy-your-amazon-sagemaker-inference-models-to-aws-lambda/

What will be the timelines for deprecation of the existing package?

@BenHamm I believe the SDK doesn't strictly prevent a Model.compile with a non "ml_" target device, but simply prints out a warning (source). Have you been encountering bugs with that? or am I misunderstanding your feature request?

@EmilyWebber is this something you're able to currently do with boto3? if so, then making model_data optional is something we could pretty easily do. (if not, I'll need to check in with the SageMaker Hosting team.)

@ezeeetm have you looked into Neo? there is a 'lambda' target device

@maddy2u the last release for v1 will be sometime in mid/late June, shortly before the release of v2

Since sagemaker experiments is essentially part of sagemaker, are there any plans to merge sagemaker-experiments into sagemaker-python-sdk?

@LiutongZhou that's a good question! We've started a conversation with the team that owns sagemaker-experiments to discuss merging the two libraries, but don't have an ETA (yet).

Will local execution of processing jobs be included? I also found some issues when trying to wrap local execution in a process pool.

@perdasilva at this time, SageMaker is focusing on the Studio experience. There's an open feature request for local processing jobs at https://github.com/aws/sagemaker-python-sdk/issues/1278

  • support entry point that is not in the root folder. if needed one can provide additional paths to the PYTHONPATH
  • support custom (shell) script to be run on the code prior to train, e.g. protobuf compilation

Will I be able to use TensorFlow 2.2.0 as Framework Version?

> support entry point that is not in the root folder. if needed one can provide additional paths to the PYTHONPATH

@litaws thanks for bringing this up! we've had some on and off requests for this, and I'd like to include it if we have time.

> support custom (shell) script to be run on the code prior to train, e.g. protobuf compilation

this should already be supported, e.g. https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/tensorflow_script_mode_using_shell_commands/tensorflow_script_mode_using_shell_commands.ipynb

> Will I be able to use TensorFlow 2.2.0 as Framework Version?

@ArrichM TF 2.2 is supported for training already: https://aws.amazon.com/releasenotes/available-deep-learning-containers-images. Inference support for 2.2 is on the way. (Support for newer framework versions happens separately from work in this repository.)

> > support entry point that is not in the root folder. if needed one can provide additional paths to the PYTHONPATH
>
> @litaws thanks for bringing this up! we've had some on and off requests for this, and I'd like to include it if we have time.

that would be great!

> > support custom (shell) script to be run on the code prior to train, e.g. protobuf compilation
>
> this should already be supported, e.g.

So I can set a .sh file as an entry point 🤔...
Does that include PyTorch?

> So I can set a .sh file as an entry point 🤔...
> Does that include PyTorch?

@litaws yep!

Any sign of refactors to Model to pick up things like #1094 in particular, and more generally the ability to do things like explicitly create and describe() "Models" in SageMaker API?

@athewsey that was definitely something I thought about, but it's not on my priority list for this particular project due to it not necessarily being a breaking change. Methods like describe() and attach() could be added and the private (by Python convention) _create_sagemaker_model() method could be made public without requiring a major version bump. (I may find some other time to tackle a couple of those items, though.)
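
As a rough illustration of why this would not need a major version bump, the sketch below exposes a formerly private method under a public name while keeping the old name as an alias. The class `ExampleModel` and its methods are hypothetical stand-ins, not the SDK's `Model` class or its real method bodies.

```python
class ExampleModel:
    """Hypothetical stand-in used only to illustrate the idea."""

    def __init__(self, name):
        self.name = name
        self.created = False

    def create(self):
        """Public method that now holds the real logic."""
        self.created = True
        return self.name

    # The old private name stays as an alias, so existing callers keep working.
    _create = create
```

Because both names resolve to the same function, code written against the private name behaves exactly as before, which is what makes the change additive rather than breaking.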

also just want to update here that v2.0.0.rc0 has been released: https://sagemaker.readthedocs.io/en/v2.0.0.rc0/v2.html. Would love to hear feedback from anyone who tries it!

In the future, can we add deprecation warnings only after there is a way to resolve them? I suggest releasing the new functions first, then announcing the intent to deprecate along with a specified grace period. Right now my notebooks are lit up with warnings with no way to address them. The warnings also give me no information on when I need to fix these issues or what grace period (if any) I have before my code breaks.
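
A minimal sketch of the kind of warning being asked for, one that names both the replacement and the release in which the old name goes away. It uses only Python's standard `warnings` module. The helper name `warn_renamed` and the argument values in the usage note are hypothetical, not part of the SDK.

```python
import warnings


def warn_renamed(old_name, new_name, removed_in):
    """Emit a DeprecationWarning that says what to use instead, and by when."""
    message = (
        f"{old_name} is deprecated and will be removed in {removed_in}; "
        f"use {new_name} instead."
    )
    # stacklevel=2 attributes the warning to the caller's line, not this helper.
    warnings.warn(message, DeprecationWarning, stacklevel=2)
    return message
```

For example, `warn_renamed("old_param", "new_param", "v2.0.0")` would tell the reader both the fix and the deadline in a single line.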

v2.0.0.rc1 is now released: https://sagemaker.readthedocs.io/en/v2.0.0.rc1/v2.html. Again, would very much appreciate feedback!

This change includes:

  • a lot of renames, so you can start trying those out to remove the various warnings
  • a change in name generation for inference resources, so that if you have to retry a deployment with some different configuration, those changes in configuration are honored
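
To picture the name-generation change, here is a sketch of one possible approach, assuming a time-plus-random suffix is appended to a base name so that every deployment attempt gets fresh resource names. The helper `unique_resource_name` and the 63-character default limit are assumptions made for illustration, not the SDK's actual implementation.

```python
import time
import uuid


def unique_resource_name(base, max_length=63):
    """Return base plus a UTC timestamp and short random suffix (hypothetical helper)."""
    suffix = time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime()) + "-" + uuid.uuid4().hex[:6]
    # Trim the base so that base + "-" + suffix fits within max_length.
    trimmed = base[: max_length - len(suffix) - 1]
    return f"{trimmed}-{suffix}"
```

Since each call yields a different name, a retried deployment creates new resources with the new configuration instead of silently reusing ones created under the old configuration.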

Hi Lauren,

the 'pip' command listed at https://sagemaker.readthedocs.io/en/v2.0.0.rc1/v2.html#installation didn't work for me:
_pip install [email protected]:aws/sagemaker-python-sdk.[email protected]_

This worked instead:
_pip install git+git://github.com/aws/[email protected]_

v2.0.0 has been released.

it seems that readme doesn't reflect it yet?

also release notes are currently: merge v2 changes into master

@aleksandersumowski my bad - thank you! I've opened up https://github.com/aws/sagemaker-python-sdk/pull/1808 and also fixed the release notes
