Azure-docs: Can't load the pickle file back in to predict test data

Created on 4 Jun 2019 · 14 Comments · Source: MicrosoftDocs/azure-docs

Using our own cloud-based notebook VM, in the Predict Test Data step, running the line clf = joblib.load(os.path.join(os.getcwd(), 'sklearn_mnist_model.pkl')) fails with KeyError: 0 at the line unpickler.load(). Any idea why this happens? We have reproduced it on two machines, both using cloud-based notebooks. The sklearn_mnist_model.pkl file does exist.



All 14 comments

@tchmiel Thank you for your feedback, we will investigate it and get back to you soon.

@tchmiel Hi, could you please send us an email at [email protected] with your Azure subscription ID and URL of this thread? We would like to investigate it deeper as this is not a document issue.
We will now proceed to close this thread. If there are further questions regarding this matter, please respond here and mention @YutongTie-MSFT, and we will gladly continue the discussion.

@YutongTie-MSFT - email sent. Thank you.

@YutongTie-MSFT I also ran into this problem, but was able to work around it by replacing the line from sklearn.externals import joblib with import joblib. It resulted in "UserWarning: Trying to unpickle estimator LogisticRegression from version 0.21.2 when using version 0.20.3," but I was still able to successfully run the rest of the notebook tutorial.
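The import swap described above can be sketched as a small fallback helper. This is only an illustration, not the tutorial's code: find_joblib and load_model are hypothetical names, and the sketch assumes that either the standalone joblib package or the older sklearn.externals.joblib copy is available in the environment.

```python
import importlib
import os


def find_joblib(candidates=("joblib", "sklearn.externals.joblib")):
    """Return the first importable module from `candidates`.

    The standalone `joblib` package is tried first; the copy bundled with
    older scikit-learn releases is only a fallback (assumption: one of the
    two is installed).
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:  # also covers ModuleNotFoundError
            continue
    raise ImportError(
        "No joblib module found; installing the 'joblib' package may help."
    )


def load_model(filename="sklearn_mnist_model.pkl"):
    """Load the pickled model from the current working directory."""
    joblib = find_joblib()
    return joblib.load(os.path.join(os.getcwd(), filename))
```

Calling clf = load_model() would then replace the original joblib.load line. A version-mismatch UserWarning may still appear, as noted above.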

I can confirm this is still an issue, and the workaround suggested by @kemichi works.

@coljac @kemichi Thank you guys.

@sdgilley Hi, could you please take a look at this issue and update the document as necessary? Thank you.

@tchmiel, @coljac, @kemichi - can you provide more info on when the Notebook VM you are using was created and what SDK version it is using?

I'm trying to reproduce this error with the documented line: clf = joblib.load( os.path.join(os.getcwd(), 'sklearn_mnist_model.pkl'))

  • On a Notebook VM created last month, running SDK 1.0.43, I see a UserWarning but the model works.
  • On a Notebook VM created a few days ago, running SDK 1.0.45, I see no error or warning.

Can you provide more information about when your Notebook VM was created and what SDK version it is using so I can reproduce?

Thanks,
Sheri

Original issue was resolved after the SDK was updated to 1.0.43.

Since the code issue is now resolved, there is no need to update the document. #please-close

I am having the same issue again with Azure ML SDK Version: 1.0.62.
Providing a dedicated scikit-learn version in part 1 of the tutorial for the estimator helped:
cd = CondaDependencies.create(pip_packages=['azureml-sdk','scikit-learn==0.20.3','azureml-dataprep[pandas,fuse]>=1.1.14'])
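A minimal sketch of building that package list so the pinned scikit-learn version matches whatever is installed where the model will be loaded. The helper names installed_version and build_pip_packages are hypothetical, and importlib.metadata requires Python 3.8 or later (the notebook VMs in this thread appear to run Python 3.6, so treat this as an assumption).

```python
from importlib import metadata  # assumption: Python 3.8+


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def build_pip_packages(sklearn_version):
    """Build the pip package list from the comment above, pinning scikit-learn."""
    return [
        "azureml-sdk",
        f"scikit-learn=={sklearn_version}",
        "azureml-dataprep[pandas,fuse]>=1.1.14",
    ]


# Possible usage (not executed here; CondaDependencies comes from the Azure ML SDK):
# version = installed_version("scikit-learn") or "0.20.3"
# cd = CondaDependencies.create(pip_packages=build_pip_packages(version))
```

The idea is that the training environment and the notebook environment then pickle and unpickle with the same scikit-learn release.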

I am also getting the same issue as above.
Azure ML SDK Version: 1.0.62
VM: STANDARD_D2_V2
sklearn.__version__: 0.20.3

I tried the workaround suggested above (using "import joblib") and get the warning below:

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/sklearn/base.py:253: UserWarning: Trying to unpickle estimator LogisticRegression from version 0.21.3 when using version 0.20.3. This might lead to breaking code or invalid results. Use at your own risk.
UserWarning)
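To see at a glance which versions disagree, the warning text can be parsed. This is a rough sketch assuming the message keeps the wording shown above; parse_unpickle_warning and PickleVersions are hypothetical names, not part of scikit-learn.

```python
import re
from typing import NamedTuple, Optional


class PickleVersions(NamedTuple):
    estimator: str
    pickled_with: str   # scikit-learn version that saved the model
    running: str        # scikit-learn version doing the loading


# Assumes the message format quoted in the warning above.
_PATTERN = re.compile(
    r"unpickle estimator (\w+) from version (\d+(?:\.\w+)*) "
    r"when using version (\d+(?:\.\w+)*)"
)


def parse_unpickle_warning(message: str) -> Optional[PickleVersions]:
    """Extract the estimator name and both versions, or None if no match."""
    match = _PATTERN.search(message)
    return PickleVersions(*match.groups()) if match else None


msg = (
    "UserWarning: Trying to unpickle estimator LogisticRegression from "
    "version 0.21.3 when using version 0.20.3. This might lead to breaking "
    "code or invalid results. Use at your own risk."
)
info = parse_unpickle_warning(msg)
print(info.pickled_with, info.running)
```

Here the model was saved with 0.21.3 and loaded with 0.20.3, which suggests the training and notebook environments are out of sync.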

Hi All,
I am facing the same issue with Azure ML SDK Version: 1.0.74. The suggested workaround (import joblib) is also not working.

I am using my own cloud-based notebook VM. In the "Feed the test dataset to the model to get predictions" step, running the line clf = joblib.load(os.path.join(os.getcwd(), 'sklearn_mnist_model.pkl')) fails with KeyError: 0 at the line unpickler.load() and at the line obj = _unpickle(fobj, filename, mmap_mode), with ModuleNotFoundError: No module named 'joblib' at the bottom. Any idea why this happens?

I am facing the same issue with Azure ML SDK Version: 1.0.83. The suggested workaround (import joblib) is also not working.

The basic thing you guys can at least do is verify that the code works before publishing the document for it. It's really sad and pathetic service from the Azure team.

Hi - I am facing the same issue.
Azure ML SDK Version: 1.2.0.
scikit-learn version: 0.20.3.

Neither from sklearn.externals import joblib nor import joblib works.
I created the model using a Compute instance (not a Notebook VM) 24 hours ago.
