Mlflow: Can't log artifact to remote server: Permission Denied on local machine

Created on 15 Jan 2019 · 4 Comments · Source: mlflow/mlflow

System information

  • Have I written custom code (as opposed to using a stock example script provided in MLflow): no
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): fedora 28
  • MLflow installed from (source or binary): pip
  • MLflow version (run mlflow --version): 0.8.1
  • Python version: 3.7.1
  • Exact command to reproduce: MLFLOW_TRACKING_URI=http://myaws:5000 python mlflow/examples/quickstart/mlflow_tracking.py

Problem

I have an AWS-based remote server up and running.
On my local machine I run the example file, which logs some metrics and adds a simple artifact.

Expected: no errors, UI shows metadata and artifact
Actual: artifact error (see below), UI shows only metadata and no artifact

On the server, the /home/ec2-user/mlruns folder is created and contains the metadata.
The UI shows the metrics nicely.

Source code / logs

Running mlflow_tracking.py
Traceback (most recent call last):
  File "mlflow_tracking.py", line 23, in <module>
    log_artifacts("outputs")
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/site-packages/mlflow/tracking/fluent.py", line 227, in log_artifacts
    MlflowClient().log_artifacts(run_id, local_dir, artifact_path)
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/site-packages/mlflow/tracking/client.py", line 174, in log_artifacts
    artifact_repo.log_artifacts(local_dir, artifact_path)
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/site-packages/mlflow/store/local_artifact_repo.py", line 31, in log_artifacts
    mkdir(artifact_dir)
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/site-packages/mlflow/utils/file_utils.py", line 105, in mkdir
    raise e
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/site-packages/mlflow/utils/file_utils.py", line 102, in mkdir
    os.makedirs(target)
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/os.py", line 211, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/os.py", line 211, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/os.py", line 211, in makedirs
    makedirs(head, exist_ok=exist_ok)
  [Previous line repeated 1 more time]
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/os.py", line 221, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/home/ec2-user'

Why the heck does the local client try to create the same folder as on the server, when it has already created the outputs folder to upload the file from?

The local client should not care at all which path is used on the remote server.
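
For readers following the traceback: it ends in os.makedirs, which creates any missing parent directories before the target itself, so the failure is reported for the first directory that cannot be created ('/home/ec2-user' on the client). Below is a minimal standard-library sketch of finding the existing directory whose permissions decide the outcome. The function names and the run ID "abc123" in the example path are hypothetical, and none of this is MLflow code.

```python
import os


def nearest_existing_ancestor(path: str) -> str:
    """Walk up from `path` until a path that exists on this machine is found."""
    current = os.path.abspath(path)
    while not os.path.exists(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return current


def can_create(path: str) -> bool:
    """True if the current user could create `path`, including missing parents."""
    ancestor = nearest_existing_ancestor(path)
    # Creating a subdirectory needs write and search permission on the parent.
    return os.path.isdir(ancestor) and os.access(ancestor, os.W_OK | os.X_OK)


# Hypothetical artifact path modelled on the server-side folder from this report.
target = "/home/ec2-user/mlruns/0/abc123/artifacts"
print(nearest_existing_ancestor(target), can_create(target))
```

On a client machine with no ec2-user account, the nearest existing ancestor would typically be /home, which an ordinary user cannot write to.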

All 4 comments

What was the exact command line used to start the server?

@hanyucui simply that:

[ec2-user@ip-10-0-0-148 ~]$ pwd
/home/ec2-user

[ec2-user@ip-10-0-0-148 ~]$ mlflow server --host 0.0.0.0
[2019-01-21 10:33:02 +0000] [2259] [INFO] Starting gunicorn 19.9.0
[2019-01-21 10:33:02 +0000] [2259] [INFO] Listening at: http://0.0.0.0:5000 (2259)
[2019-01-21 10:33:02 +0000] [2259] [INFO] Using worker: sync
[2019-01-21 10:33:02 +0000] [2262] [INFO] Booting worker with pid: 2262
[2019-01-21 10:33:02 +0000] [2263] [INFO] Booting worker with pid: 2263
[2019-01-21 10:33:02 +0000] [2265] [INFO] Booting worker with pid: 2265
[2019-01-21 10:33:02 +0000] [2268] [INFO] Booting worker with pid: 2268

and then from client:

quickstart$ MLFLOW_TRACKING_URI=http://***.compute.amazonaws.com:5000 python mlflow_tracking.py

......

PermissionError: [Errno 13] Permission denied: '/home/ec2-user'

@budmitr Looks like this is expected behavior. According to the documentation, the client logs directly to the artifact store. And since it can't access '/home/ec2-user', it bails out. This is the original text:

The _artifact store_ ... and is where _clients_ log their artifact output (for example, models).
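
To make the quoted behavior concrete, here is a minimal sketch that assumes the client chooses where to write purely from the scheme of the artifact URI the server hands back. The helper name, the returned labels, and the run ID "abc123" are hypothetical; MLflow's real dispatch differs in detail.

```python
from urllib.parse import urlparse


def pick_artifact_backend(artifact_uri: str) -> str:
    """Return a label for the storage backend implied by the URI scheme.

    Hypothetical helper for illustration only, not an MLflow API.
    """
    scheme = urlparse(artifact_uri).scheme
    if scheme in ("", "file"):
        # No scheme means a plain filesystem path, which is resolved on the
        # machine running the client, not on the tracking server.
        return "local filesystem (client side)"
    if scheme == "s3":
        return "S3 bucket"
    return f"other remote store ({scheme})"


# A server started without --default-artifact-root reports a plain path.
print(pick_artifact_backend("/home/ec2-user/mlruns/0/abc123/artifacts"))
# With an S3 artifact root, the same run would resolve to a bucket instead.
print(pick_artifact_backend("s3://my-mlflow-bucket/0/abc123/artifacts"))
```

Under that assumption, a bare path such as /home/ec2-user/... is treated as a directory on the client, which matches the PermissionError in the report.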

In your case, you probably want to set the artifact URI (different from the tracking URI) to an S3 bucket. The documentation has an example:

mlflow server \
    --file-store /mnt/persistent-disk \
    --default-artifact-root s3://my-mlflow-bucket/ \
    --host 0.0.0.0

@hanyucui thank you very much for this explanation, I definitely missed this part in the docs.

Still, it looks very counter-intuitive to me and like a kind of bug, but I understand the reasons behind it.

If anyone from the dev team is reading this, I'd suggest making a client-server-storage approach available as well.
In some environments (like mine) there are restrictions that prevent clients from accessing the storage directly.
