MLflow: log_artifact is not storing artifacts

Created on 17 Jul 2018 · 5 Comments · Source: mlflow/mlflow

System information

  • Linux Ubuntu 16.04
  • MLflow installed from pip
  • MLflow version 0.2.1
  • Python version 2:

When using the following code:

with mlflow.start_run(experiment_id=EXPERIMENT_ID):
    fit = model.fit(x=train_images, y=train_labels, batch_size=64, epochs=5)
    test_loss, test_acc = model.evaluate(test_images, test_labels)
    mlflow.log_param('batch_size', 64)
    mlflow.log_param('epochs', 5)
    mlflow.log_metric('accuracy', test_acc)
    model.save('/dbfs/scripts/mnist_model.h5')
    mlflow.log_artifact('/dbfs/scripts/mnist_model.h5')

the artifacts are not stored on the tracking server. The server is set up using the following:

mlflow server --host 0.0.0.0 --file-store /mlruns &

The artifacts folder (/mlruns/1/943d1fe8f6e64f458d9c626532715652/artifacts/) on the tracking server is empty. I confirmed that the model gets saved to /dbfs/scripts/mnist_model.h5.

What am I doing wrong?

Most helpful comment

I encountered a similar issue too. I am unable to store, view, or retrieve artifacts in MLflow. The artifact folder stays empty even after creating a new experiment and assigning a proper experiment name and location.

Server: mlflow server --backend-store-uri mlruns/ --default-artifact-root mlruns/ --host 0.0.0.0 --port 5000

Create an Experiment: mlflow.create_experiment(exp_name, artifact_location='mlruns/')

with mlflow.start_run():
    mlflow.log_metric("mse", float(binary))
    mlflow.log_artifact(data_path, "data")
    # log model
    mlflow.keras.log_model(model, "models")

The code runs without errors, but no artifacts are recorded. The run has an mlflow.log-model.history file but not the model.h5.

All 5 comments

Hi @samuel100, I suspect this is related to #158 (the artifact URI for the run is generated from the artifact URI for its parent experiment, rather than the artifact URI of the current tracking server). See this comment for more info.

One way to fix this for now is:

  • Create an experiment against your tracking server with an artifact root of /mlruns - you can SSH into the tracking server & run mlflow experiments create --artifact-root /mlruns [experiment-name], or call the Python mlflow.create_experiment API after running mlflow.set_tracking_uri(<your_server_uri>).
  • Run your code snippet above (with the ID of the newly-created experiment), being sure to call mlflow.set_tracking_uri(<your_server_uri>) beforehand. The created run should get its artifact URI from the experiment, which should now have the desired artifact root.
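The two steps above can be sketched in Python roughly as follows. This is a minimal sketch, not the maintainers' code: it assumes an MLflow release in which mlflow.create_experiment accepts an artifact_location argument, and the helper name log_artifact_with_root and the example values in the usage comment are hypothetical.

```python
def log_artifact_with_root(tracking_uri, experiment_name, artifact_root, local_path):
    """Create an experiment with an explicit artifact root and log one file to it.

    Hypothetical helper; assumes mlflow.create_experiment supports artifact_location.
    """
    # Imported lazily so this module can be loaded without MLflow installed.
    import mlflow

    # Point the client at the tracking server before creating anything.
    mlflow.set_tracking_uri(tracking_uri)

    # The run inherits its artifact URI from the experiment's artifact root.
    experiment_id = mlflow.create_experiment(
        experiment_name, artifact_location=artifact_root
    )

    with mlflow.start_run(experiment_id=experiment_id):
        mlflow.log_artifact(local_path)
        # Return the run's artifact URI so the caller can check where files went.
        return mlflow.get_artifact_uri()


# Example call (values are illustrative only):
# log_artifact_with_root("http://localhost:5000", "mnist", "/mlruns",
#                        "/dbfs/scripts/mnist_model.h5")
```

Note that when the artifact root is a plain local path such as /mlruns, the client writes the files to that path on its own filesystem, so the path likely needs to be shared with (or be on the same machine as) the tracking server for the files to show up there.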

Thanks @smurching. Unfortunately the mlflow experiments create command does not work as anticipated - despite setting the environment variable for the tracking URI to http://localhost:5000, it creates the files in /home/mlruns rather than /mlruns. It appears to create the default structure rather than follow how my server is set up. In addition, the Python mlflow.create_experiment API does not have a parameter to set the artifact root.

@samuel100 sorry for the slow response - PR #232 (released with MLflow 0.4.2) added the ability to pass an artifact root to mlflow.create_experiment / specify an artifact root via the experiments CLI - would you be able to give that a try? If it doesn't work, it'd be great if you could share the contents of the experiment metadata file at mlruns/<experiment-id>/meta.yaml & I'll dig in further - thanks & sorry again for the delay!
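To check which artifact root an experiment actually recorded, the metadata file mentioned above can be inspected directly. The sketch below assumes meta.yaml is a flat list of "key: value" lines containing an artifact_location key; the helper name is hypothetical, and a real YAML parser would be more robust than this line-based reader.

```python
from pathlib import Path


def read_artifact_location(meta_path):
    """Return the artifact_location value from a meta.yaml file, or None.

    Assumes a flat "key: value" layout; nested YAML is not handled.
    """
    for line in Path(meta_path).read_text().splitlines():
        # Split on the first colon only, so URIs such as s3://bucket survive.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "artifact_location":
            return value.strip().strip("'\"")
    return None


# Example call (path is illustrative only):
# print(read_artifact_location("/mlruns/1/meta.yaml"))
```

If the returned value is not the directory the server was expected to use, runs in that experiment will keep inheriting the wrong artifact URI.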

Hi, is there a way to store the artifacts on the MLflow server?

The code below creates the artifacts in the directory:

import os
import mlflow
import mlflow.pyfunc

class AddN(mlflow.pyfunc.PythonModel):

    def __init__(self, n):
        self.n = n

    def predict(self, context, model_input):
        return model_input.apply(lambda column: column + self.n)

model_path = "add_n_model"
add5_model = AddN(n=5)

tracking_uri="http://localhost:5000"
mlflowClient = mlflow.tracking.MlflowClient(tracking_uri)

experiment_id=mlflowClient.create_experiment("firstExperiment","/mlflowserver_default_artifact_root")

experiment_to_run=mlflowClient.get_experiment_by_name("firstExperiment")
created_run=mlflowClient.create_run(experiment_to_run.experiment_id)

os.environ['MLFLOW_TRACKING_URI'] = tracking_uri
os.environ['MLFLOW_ARTIFACT_URI'] = tracking_uri

mlflow.pyfunc.log_model(
    artifact_path="artifacts",
    python_model=add5_model,
)

However, when I run the following:

import os
import mlflow
import mlflow.pyfunc

class AddN(mlflow.pyfunc.PythonModel):

    def __init__(self, n):
        self.n = n

    def predict(self, context, model_input):
        return model_input.apply(lambda column: column + self.n)

model_path = "add_n_model"
add5_model = AddN(n=5)

tracking_uri="http://localhost:5000"
mlflowClient = mlflow.tracking.MlflowClient(tracking_uri)

experiment_to_run=mlflowClient.get_experiment_by_name("firstExperiment")
created_run=mlflowClient.create_run(experiment_to_run.experiment_id)

os.environ['MLFLOW_TRACKING_URI'] = tracking_uri
os.environ['MLFLOW_ARTIFACT_URI'] = tracking_uri

mlflow.pyfunc.log_model(
    artifact_path="artifacts",
    python_model=add5_model,
)

I get the following error:

/home/cyril/anaconda3/lib/python3.7/site-packages/parso/python/tree.py:46: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Mapping
Traceback (most recent call last):
  File "temp.py", line 29, in <module>
    python_model=add5_model,
  File "/home/cyril/anaconda3/lib/python3.7/site-packages/mlflow/pyfunc/__init__.py", line 664, in log_model
    conda_env=conda_env)
  File "/home/cyril/anaconda3/lib/python3.7/site-packages/mlflow/models/__init__.py", line 78, in log
    mlflow.tracking.fluent.log_artifacts(local_path, artifact_path)
  File "/home/cyril/anaconda3/lib/python3.7/site-packages/mlflow/tracking/fluent.py", line 277, in log_artifacts
    MlflowClient().log_artifacts(run_id, local_dir, artifact_path)
  File "/home/cyril/anaconda3/lib/python3.7/site-packages/mlflow/tracking/client.py", line 253, in log_artifacts
    artifact_repo = get_artifact_repository(run.info.artifact_uri)
  File "/home/cyril/anaconda3/lib/python3.7/site-packages/mlflow/store/artifact_repository_registry.py", line 99, in get_artifact_repository
    return _artifact_repository_registry.get_artifact_repository(artifact_uri)
  File "/home/cyril/anaconda3/lib/python3.7/site-packages/mlflow/store/artifact_repository_registry.py", line 67, in get_artifact_repository
    artifact_uri, list(self._registry.keys())
mlflow.exceptions.MlflowException: Could not find a registered artifact repository for: http://localhost:5000/6a8e2424eb2a490bb7aec43a63c4f87e/artifacts. Currently registered schemes are: ['', 'file', 's3', 'gs', 'wasbs', 'ftp', 'sftp', 'dbfs', 'hdfs', 'runs']
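The exception says the run's artifact URI uses the http scheme, which is not among the registered artifact repository schemes. A minimal sketch of a pre-flight check is shown below; the scheme list is copied from the error message above and will differ between MLflow versions, and the helper name is hypothetical.

```python
from urllib.parse import urlparse

# Copied from the error message above; varies by MLflow version.
REGISTERED_SCHEMES = ['', 'file', 's3', 'gs', 'wasbs', 'ftp', 'sftp', 'dbfs', 'hdfs', 'runs']


def artifact_scheme_is_registered(artifact_uri, schemes=REGISTERED_SCHEMES):
    """Return True if the URI's scheme appears in the given scheme list."""
    # A plain filesystem path has an empty scheme, which is in the list.
    return urlparse(artifact_uri).scheme in schemes


print(artifact_scheme_is_registered(
    "http://localhost:5000/6a8e2424eb2a490bb7aec43a63c4f87e/artifacts"))
print(artifact_scheme_is_registered("/mlflowserver_default_artifact_root"))
```

Under these assumptions the first call reports an unregistered scheme and the second a registered one, which is consistent with the tracking server URL not being usable as an artifact location in this setup.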
