MLflow version (mlflow --version): 0.8.1
I have an AWS-based remote server up and running.
On my local machine I run the example file, which logs some metrics and adds a simple artifact.
Expected: no errors; the UI shows the metadata and the artifact.
Actual: an artifact error (see below); the UI shows only the metadata and no artifact.
On the server, the /home/ec2-user/mlruns folder is created and contains the metadata.
The UI shows the metrics nicely.
Running mlflow_tracking.py
Traceback (most recent call last):
  File "mlflow_tracking.py", line 23, in <module>
    log_artifacts("outputs")
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/site-packages/mlflow/tracking/fluent.py", line 227, in log_artifacts
    MlflowClient().log_artifacts(run_id, local_dir, artifact_path)
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/site-packages/mlflow/tracking/client.py", line 174, in log_artifacts
    artifact_repo.log_artifacts(local_dir, artifact_path)
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/site-packages/mlflow/store/local_artifact_repo.py", line 31, in log_artifacts
    mkdir(artifact_dir)
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/site-packages/mlflow/utils/file_utils.py", line 105, in mkdir
    raise e
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/site-packages/mlflow/utils/file_utils.py", line 102, in mkdir
    os.makedirs(target)
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/os.py", line 211, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/os.py", line 211, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/os.py", line 211, in makedirs
    makedirs(head, exist_ok=exist_ok)
  [Previous line repeated 1 more time]
  File "/home/localuser/miniconda3/envs/mflow/lib/python3.7/os.py", line 221, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/home/ec2-user'
Why the heck does the local client try to create the same folder as on the server, given that it has already created the outputs folder to upload the file?
The local client should not care at all which path is used on the remote server.
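For anyone reading the traceback: os.makedirs recurses upward until it reaches a directory that already exists, then creates the missing ones on the way back down. A failure at '/home/ec2-user' therefore suggests that its parent (presumably /home) exists on the local machine but is not writable by the local user. Below is a minimal sketch of that check using only the standard library. The helper names nearest_existing_ancestor and can_create are hypothetical, not part of MLflow, and the directory layout under the scratch folder is a placeholder.

```python
import os
import tempfile


def nearest_existing_ancestor(path):
    """Walk upward from path until a directory that exists is found."""
    current = os.path.abspath(path)
    while not os.path.exists(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return current


def can_create(path):
    """Return True if the current user could create path with os.makedirs."""
    ancestor = nearest_existing_ancestor(path)
    # Creating a child requires write and search permission on the ancestor.
    return os.access(ancestor, os.W_OK | os.X_OK)


# Demonstrate on a scratch directory; the nested layout is a placeholder.
scratch = os.path.abspath(tempfile.mkdtemp())
target = os.path.join(scratch, "mlruns", "0", "RUN_ID", "artifacts")
print(nearest_existing_ancestor(target) == scratch)
print(can_create(target))
```

Calling can_create on the artifact path from the error on the client machine would be expected to return False, which matches the PermissionError.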
What was the exact command line used to start the server?
@hanyucui simply that:
[ec2-user@ip-10-0-0-148 ~]$ pwd
/home/ec2-user
[ec2-user@ip-10-0-0-148 ~]$ mlflow server --host 0.0.0.0
[2019-01-21 10:33:02 +0000] [2259] [INFO] Starting gunicorn 19.9.0
[2019-01-21 10:33:02 +0000] [2259] [INFO] Listening at: http://0.0.0.0:5000 (2259)
[2019-01-21 10:33:02 +0000] [2259] [INFO] Using worker: sync
[2019-01-21 10:33:02 +0000] [2262] [INFO] Booting worker with pid: 2262
[2019-01-21 10:33:02 +0000] [2263] [INFO] Booting worker with pid: 2263
[2019-01-21 10:33:02 +0000] [2265] [INFO] Booting worker with pid: 2265
[2019-01-21 10:33:02 +0000] [2268] [INFO] Booting worker with pid: 2268
and then from client:
quickstart$ MLFLOW_TRACKING_URI=http://***.compute.amazonaws.com:5000 python mlflow_tracking.py
......
PermissionError: [Errno 13] Permission denied: '/home/ec2-user'
@budmitr Looks like this is expected behavior. According to the documentation, the client logs directly to the artifact store. Since the client can't access '/home/ec2-user' on its own machine, it bails out. This is the original text:
The _artifact store_ ... and is where _clients_ log their artifact output (for example, models).
In your case, you probably want to set the artifact URI (different from the tracking URI) to an S3 bucket. The documentation has an example:
mlflow server \
--file-store /mnt/persistent-disk \
--default-artifact-root s3://my-mlflow-bucket/ \
--host 0.0.0.0
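To illustrate the difference, here is a small sketch that classifies an artifact URI by its scheme. A bare path or a file:// URI is resolved on whichever machine runs the client, which is why the run in this thread tried to create /home/ec2-user locally; an s3:// URI points at storage that both sides can reach. The helper artifact_store_kind is hypothetical (not an MLflow API), and the 0/RUN_ID/artifacts layout is a placeholder.

```python
from urllib.parse import urlparse


def artifact_store_kind(artifact_uri):
    """Classify an artifact URI by scheme (hypothetical helper, not an MLflow API)."""
    scheme = urlparse(artifact_uri).scheme
    # No scheme (a bare path) or file:// means a filesystem path,
    # which the client resolves on its own machine.
    return "local" if scheme in ("", "file") else scheme


# Placeholder paths for illustration only.
print(artifact_store_kind("/home/ec2-user/mlruns/0/RUN_ID/artifacts"))  # local
print(artifact_store_kind("s3://my-mlflow-bucket/0/RUN_ID/artifacts"))  # s3
```

With an S3 artifact root the client still needs its own credentials for the bucket, since it uploads there directly rather than through the tracking server.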
@hanyucui thank you very much for this explanation, I definitely missed this part in the docs.
It still looks very counter-intuitive to me, and like a kind of bug, but I understand the reasons behind it.
If anyone from the dev team is reading this, I'd suggest making a client-server-storage approach available as well.
In some environments (like mine), clients have to be restricted or prevented from accessing the storage directly.