Passing dataset.as_mount() to ScriptRunConfig the way this page shows raises the exception TypeError: Object of type 'DataReference' is not JSON serializable. The script needs one more step to make the argument serializable in the ScriptRunConfig, as elaborated in this Stack Overflow response.
The change is to call as_named_input on the dataset before calling as_mount:
arguments=['--input-data-dir', dataset.as_named_input('input').as_mount()]
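The error message itself is the one Python's standard json module raises for any object it does not know how to encode. The sketch below reproduces that mechanism with the standard library only. The DataReference class here is a hypothetical stand-in, not the azureml-core class, and the sketch makes no claim about how the SDK serializes a run configuration internally.

```python
import json


class DataReference:
    """Hypothetical stand-in for an arbitrary object; NOT the azureml-core class."""

    def __init__(self, path):
        self.path = path


def try_serialize(arguments):
    """Return the JSON text for `arguments`, or the TypeError message on failure."""
    try:
        return json.dumps(arguments)
    except TypeError as err:
        return str(err)


# A list of plain strings serializes without trouble.
ok = try_serialize(["--input-data-dir", "/mnt/data"])

# An arbitrary object in the list makes json.dumps raise a TypeError.
failed = try_serialize(["--input-data-dir", DataReference("/mnt/data")])

print(ok)
print(failed)
```

On Python 3.7 and later the second message has the same wording as the one reported in this thread; older interpreters put quotes around the type name.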
Hi @classicboyir
dataset.as_mount() should work now if you upgrade to the latest SDK version. Can you confirm?
Thanks
It still does not work.
azureml-core version: '1.2.0'
TypeError: Object of type DataReference is not JSON serializable
This worked perfectly while I was using estimators. Why did you move to ScriptRunConfig and break the behaviour?
1.2.0 is relatively old; the current version is 1.18.0, and new versions are released every other week.
The recommendation is to use a Dataset.File.from_* method with a ScriptRunConfig rather than a DataReference or another method.
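A minimal sketch of that recommendation, assuming a recent azureml-core is installed and a workspace object is already available. The function name, the script name train.py, the --input-data-dir flag's consumer, and all parameter values are placeholders chosen for illustration, not taken from the docs page.

```python
def build_script_run_config(workspace, datastore_name, data_path, compute_name):
    """Sketch: mount a FileDataset as a named input of a ScriptRunConfig."""
    # Imported lazily so this module loads even where azureml-core is missing.
    from azureml.core import Dataset, Datastore, ScriptRunConfig

    # Look up a datastore that is already registered in the workspace.
    datastore = Datastore.get(workspace, datastore_name)

    # Build a FileDataset from a path on that datastore.
    dataset = Dataset.File.from_files(path=(datastore, data_path))

    # Name the input, then ask for it to be mounted on the compute target.
    return ScriptRunConfig(
        source_directory=".",
        script="train.py",  # placeholder training script
        arguments=["--input-data-dir", dataset.as_named_input("input").as_mount()],
        compute_target=compute_name,
    )
```

The returned configuration would then be submitted with Experiment(workspace, name).submit(config), and train.py would read the mount path from its --input-data-dir argument.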
Upgrading worked. But datasets only support read-only files. How do I mount a writable storage location? Example scenario: writing periodic checkpoints from a long-running training job.
I believe @mx-iao and @aminsaied are building a best practices checkpointing example over at https://github.com/Azure/azureml-examples (see https://github.com/Azure/azureml-examples/issues/249)
@MayMSFT what is the recommendation for writable mount? Use blobfuse directly? Datasets?
Hi @krishansubudhi, you can configure a dataset as an output of your ScriptRun. Sample notebook here: https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/work-with-data/datasets-tutorial/scriptrun-with-data-input-output/how-to-use-scriptrun.ipynb
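For the checkpoint scenario, a rough sketch of an output dataset might look like the following. It assumes an azureml-core version that ships OutputFileDatasetConfig; the function name, the output name "checkpoints", the --checkpoint-dir flag, and train.py are hypothetical placeholders. The linked notebook remains the authoritative sample.

```python
def build_checkpoint_run_config(workspace, datastore_name, output_path, compute_name):
    """Sketch: pass a writable output location to a ScriptRunConfig."""
    # Imported lazily so this module loads even where azureml-core is missing.
    from azureml.core import Datastore, ScriptRunConfig
    from azureml.data import OutputFileDatasetConfig

    # Look up a datastore that is already registered in the workspace.
    datastore = Datastore.get(workspace, datastore_name)

    # Describe where files written by the script should end up.
    output = OutputFileDatasetConfig(
        name="checkpoints",  # placeholder output name
        destination=(datastore, output_path),
    )

    # The script receives a local directory path and writes checkpoints there.
    return ScriptRunConfig(
        source_directory=".",
        script="train.py",  # placeholder training script
        arguments=["--checkpoint-dir", output.as_mount()],
        compute_target=compute_name,
    )
```

Inside train.py, the value of --checkpoint-dir would be treated as an ordinary directory. Whether mount or upload mode suits periodic checkpoints better depends on the job, so this is a starting point rather than a recommendation.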
Should you still have questions regarding this issue, please comment here and reopen it. We're closing it for now. #please-close