MachineLearningNotebooks: PermissionError: [Errno 13] Permission denied: '.\NTUSER.DAT' when trying to run ML pipeline

Created on 29 Dec 2019 · 4 Comments · Source: Azure/MachineLearningNotebooks

In short, when I try to submit an Azure ML pipeline run (an Azure ML pipeline, not an Azure pipeline) from a Jupyter notebook, I get PermissionError: [Errno 13] Permission denied: '.\NTUSER.DAT'. More details:

Relevant code:

import logging

from azureml.core import Experiment
from azureml.pipeline.core import Pipeline
from azureml.train.automl import AutoMLConfig
from azureml.train.automl.runtime import AutoMLStep

# ws, compute_target, conda_run_config and financeforecast_dataset
# are defined earlier in the notebook.
automl_settings = {
    "iteration_timeout_minutes": 20,
    "experiment_timeout_minutes": 30,
    "n_cross_validations": 3,
    "primary_metric": 'r2_score',
    "preprocess": True,
    "max_concurrent_iterations": 3,
    "max_cores_per_iteration": -1,
    "verbosity": logging.INFO,
    "enable_early_stopping": True,
    "time_column_name": "DateTime"
}

automl_config = AutoMLConfig(task='forecasting',
                             debug_log='automl_errors.log',
                             path=".",  # the current working directory
                             compute_target=compute_target,
                             run_configuration=conda_run_config,
                             training_data=financeforecast_dataset,
                             label_column_name='TotalUSD',
                             **automl_settings
                             )

automl_step = AutoMLStep(
    name='automl_module',
    automl_config=automl_config,
    allow_reuse=False)

training_pipeline = Pipeline(
    description="training_pipeline",
    workspace=ws,
    steps=[automl_step])

# Submitting the pipeline is the call that raises the PermissionError.
training_pipeline_run = Experiment(ws, 'test').submit(training_pipeline)

The training_pipeline step runs for approximately 20 seconds, and then I get a long traceback ending in:

~\AppData\Local\Continuum\anaconda2\envs\forecasting\lib\site-packages\azureml\pipeline\core\_module_builder.py in _hash_from_file_paths(hash_src)
    100             hasher = hashlib.md5()
    101             for f in hash_src:
--> 102                 with open(str(f), 'rb') as afile:
    103                     buf = afile.read()
    104                     hasher.update(buf)

PermissionError: [Errno 13] Permission denied: '.\\NTUSER.DAT'
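
The traceback shows the SDK opening each file it was handed and feeding the contents to an MD5 hasher, so a single unreadable file is enough to abort the submission. The sketch below is not the SDK's code. It is a standalone approximation (the name hash_directory is made up for illustration) that walks a folder, hashes what it can read, and collects the files that cannot be opened instead of raising:

```python
import hashlib
import os


def hash_directory(root):
    """Hash every readable file under root and report the unreadable ones.

    Returns a tuple (md5_hexdigest, unreadable), where unreadable is a list
    of (path, reason) pairs for files that could not be opened.
    """
    hasher = hashlib.md5()
    unreadable = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as afile:
                    hasher.update(afile.read())
            except OSError as exc:  # PermissionError is a subclass of OSError
                unreadable.append((path, exc.strerror))
    return hasher.hexdigest(), unreadable
```

Calling hash_directory(".") from the notebook's working directory should list .\NTUSER.DAT among the unreadable files if that directory is the one being snapshotted. NTUSER.DAT is the user's registry hive, which Windows normally keeps locked while the user is logged on, so loosening folder permissions would not be expected to make it readable.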

According to Azure's docs on this topic, submitting a pipeline uploads a "snapshot" of the "source directory" you specified. Initially, I hadn't specified a source directory, so, to test that out, I added default_source_directory="testing" as a parameter of the training_pipeline object, but saw the same behavior when I then tried to run it. I'm not sure whether that is the same source directory the documentation is referring to. The docs also say that if no source directory is specified, the "current local directory" is uploaded. I used print(os.getcwd()) to get the working directory and gave "Everyone" full control permissions on that directory (I'm working in a Windows environment).
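
Because the error names '.\NTUSER.DAT', the directory being read is most likely the Windows user profile folder, which is where that file lives. A minimal sketch for checking this from the notebook, using only the standard library (describe_snapshot_dir is a hypothetical helper name, not part of the Azure ML SDK):

```python
import os


def describe_snapshot_dir(path="."):
    """Summarise the directory that a relative path such as "." resolves to."""
    root = os.path.abspath(path)
    home = os.path.abspath(os.path.expanduser("~"))
    entries = sorted(os.listdir(root))
    return {
        "root": root,
        # True when the directory is the user's home/profile folder
        "is_home_dir": os.path.normcase(root) == os.path.normcase(home),
        # True when the locked registry hive sits directly in this folder
        "has_ntuser_dat": any(e.upper() == "NTUSER.DAT" for e in entries),
        "entry_count": len(entries),
    }
```

If is_home_dir or has_ntuser_dat comes back True for ".", the notebook is running from the profile folder, and everything in it is a candidate for the snapshot.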

All the preceding code works fine, and I can submit an experiment if I use a ScriptRunConfig and run it on attached compute rather than using a pipeline/training cluster.

Any ideas? Thanks in advance to anyone who tries to help.

Pipelines awaiting-product-team-response cxp product-question triaged

casieo

Most helpful comment

@sanpil That was it, thank you. After posting this question but before hearing from you, I had tried specifying the script's full path in the AutoMLConfig object by adding the data_script line below, but that did not work.

automl_config = AutoMLConfig(task='forecasting',
                             debug_log='automl_errors.log',
                             path=".",
                             # the line I added; raw string so "\u" is not read as an escape
                             data_script=r"c:\users\me\script.py",
                             compute_target=compute_target,
                             run_configuration=conda_run_config,
                             training_data=financeforecast_dataset,
                             label_column_name='TotalUSD',
                             **automl_settings
                             )

Here is the config that finally worked:

automl_config = AutoMLConfig(task='forecasting',
                             debug_log='automl_errors.log',
                             compute_target=compute_target,
                             run_configuration=conda_run_config,
                             # explicit folder instead of "."; raw string so "\u" is not read as an escape
                             path=r"c:\users\me",
                             data_script="script.py",
                             # training_data=financeforecast_dataset,
                             # label_column_name='TotalUSD',
                             **automl_settings
                             )

casieo on 2 Jan 2020
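
A note on the Windows paths in the configs above: in an ordinary Python 3 string literal, \u begins a Unicode escape, so a path such as c:\users\me needs a raw string, doubled backslashes, or forward slashes to parse. A minimal standard-library sketch, using the placeholder path and script name from the configs:

```python
from pathlib import PureWindowsPath

# A raw string keeps the backslashes literal; it equals the doubled-backslash form.
project_folder = r"c:\users\me"
assert project_folder == "c:\\users\\me"

# PureWindowsPath joins with Windows separators on any operating system.
script_path = PureWindowsPath(project_folder) / "script.py"
script_path_text = str(script_path)
```

PureWindowsPath is used here only so the example behaves the same on any OS. On the Windows machine itself, pathlib.Path or os.path.join would do the same job.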

All 4 comments

@casieo What is your compute target? Is it AML Compute? What does your source directory contain? It's recommended that you keep a separate folder as the step's source directory, containing only the files (such as scripts) that the step needs.

purnesh42H on 31 Dec 2019
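
One way to follow this recommendation is to copy just the needed scripts into a folder of their own and point the step at that folder. A minimal sketch with the standard library (make_step_source_dir and the folder name in the usage line are illustrative assumptions, not SDK names):

```python
import os
import shutil


def make_step_source_dir(files, target_dir):
    """Copy the given files into target_dir and return the copied paths.

    The folder is created if it does not exist; nothing else is put in it,
    so a snapshot of target_dir contains only what the step needs.
    """
    os.makedirs(target_dir, exist_ok=True)
    copied = []
    for src in files:
        dst = os.path.join(target_dir, os.path.basename(src))
        shutil.copy2(src, dst)  # copy2 also preserves timestamps
        copied.append(dst)
    return copied
```

For example, make_step_source_dir(["script.py"], "automl_step_src") would leave a folder holding only script.py, whose path can then be passed wherever the source directory is configured.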

Can you specify the source directory in the path parameter of AutoMLConfig instead of using path = "."?

sanpil on 2 Jan 2020

@casieo
We will now proceed to close this thread. If you have further questions regarding this matter, please reply here and mention @YutongTie-MSFT, and we will gladly continue the discussion.