I am unable to log my runs to the tracking server. The client reports this error:
2019/04/24 11:35:28 ERROR mlflow.utils.rest_utils: API request to http://mlflow-tracking-service.core:5000/api/2.0/preview/mlflow/runs/get failed with code 500 != 200, retrying up to 2 more times. API response body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>500 Internal Server Error</title>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>
Error log at the tracking server:
[2019-04-24 11:35:44,347] ERROR in app: Exception on /ajax-api/2.0/preview/mlflow/experiments/get [GET]
Traceback (most recent call last):
File "/opt/conda/lib/python2.7/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/opt/conda/lib/python2.7/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/opt/conda/lib/python2.7/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/opt/conda/lib/python2.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/opt/conda/lib/python2.7/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/opt/conda/lib/python2.7/site-packages/mlflow/server/handlers.py", line 81, in wrapper
return func(*args, **kwargs)
File "/opt/conda/lib/python2.7/site-packages/mlflow/server/handlers.py", line 143, in _get_experiment
response_message.runs.extend([r.to_proto() for r in run_info_entities])
File "/opt/conda/lib/python2.7/site-packages/mlflow/entities/run_info.py", line 150, in to_proto
proto.experiment_id = self.experiment_id
TypeError: 0 has type int, but expected one of: bytes, unicode
Python code for testing:
from contextlib import contextmanager

from mlflow.tracking import MlflowClient


@contextmanager
def run_context(client, experiment_id, user_id=None, run_name=None):
    curr_run = client.create_run(experiment_id, user_id, run_name)
    yield curr_run
    client.set_terminated(curr_run.info.run_uuid)


mlflow_client = MlflowClient()
current_experiment = mlflow_client.get_experiment_by_name('chris_testing')
if current_experiment is None:
    current_experiment = mlflow_client.create_experiment('chris_testing')
else:
    current_experiment = current_experiment.experiment_id

with run_context(mlflow_client,
                 current_experiment,
                 'chrissng') as run:
    print(run)
    mlflow_client.log_param(run.info.run_uuid, "a", 1)
    mlflow_client.log_metric(run.info.run_uuid, "b", 3)
    with open("output.txt", "w") as f:
        f.write("Hello world!")
    mlflow_client.log_artifact(run.info.run_uuid, "output.txt")
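A side note on the test script: if any logging call inside the `with` block raises (as it does here once the server returns 500), the context manager never reaches `set_terminated`, so the run is left open. Below is a minimal sketch of a variant that always closes the run. `FakeClient` is a hypothetical stub so the example runs without a tracking server, and passing a status string to `set_terminated` is an assumption about the real client, not something confirmed in this thread.

```python
from contextlib import contextmanager


class FakeClient:
    """Hypothetical stub standing in for MlflowClient; it only records calls."""

    def __init__(self):
        self.terminated = []

    def create_run(self, experiment_id, user_id=None, run_name=None):
        # The real client returns a run object; a plain ID is enough here.
        return "run-1"

    def set_terminated(self, run_id, status=None):
        self.terminated.append((run_id, status))


@contextmanager
def run_context(client, experiment_id, user_id=None, run_name=None):
    run_id = client.create_run(experiment_id, user_id, run_name)
    try:
        yield run_id
    except Exception:
        # Mark the run as failed, then let the original error propagate.
        client.set_terminated(run_id, "FAILED")
        raise
    else:
        client.set_terminated(run_id, "FINISHED")
```

With this shape, an exception raised inside the `with` body still reaches the caller, but the run is marked as failed first instead of being left unterminated.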
Thanks for raising this, @chrissng, @drewmcdonald, and @MeisterUrian. This is indeed a bug in the logic for deserializing run metadata from the SQL backend (because it stores experiment IDs as ints). I'll work on a PR to fix this. For now, unfortunately, I think the only workaround is to downgrade your client and server to MLflow 0.9.0, or to use the file-backed tracking store.
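To illustrate the mismatch described above, here is a small sketch that reproduces the same kind of `TypeError` without MLflow or protobuf installed. `StrictStringField`, `FakeRunInfoProto`, and `to_proto` are hypothetical stand-ins for a string-typed proto field and the conversion step; this is not the actual MLflow code or the eventual fix, only a picture of why an int experiment ID from a SQL row is rejected and how coercing it to a string avoids the error.

```python
class StrictStringField:
    """Hypothetical stand-in for a string-typed proto field.

    Like the field in the traceback, it rejects anything that is not
    bytes or str at assignment time.
    """

    def __set_name__(self, owner, name):
        self._name = "_" + name

    def __get__(self, obj, objtype=None):
        return getattr(obj, self._name, "")

    def __set__(self, obj, value):
        if not isinstance(value, (bytes, str)):
            raise TypeError(
                "%r has type %s, but expected one of: bytes, str"
                % (value, type(value).__name__)
            )
        setattr(obj, self._name, value)


class FakeRunInfoProto:
    """Hypothetical message with a single string field."""

    experiment_id = StrictStringField()


def to_proto(experiment_id, coerce=False):
    """Copy an experiment ID into the message, optionally coercing to str."""
    proto = FakeRunInfoProto()
    proto.experiment_id = str(experiment_id) if coerce else experiment_id
    return proto
```

Calling `to_proto(0)` raises a `TypeError` much like the one in the server log, while `to_proto(0, coerce=True)` stores the ID as a string and succeeds.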
@smurching Thanks for the prompt response and the fix! I will be looking forward to the next release!
Hi,
I am having the same issue. I am using version 0.9.0, and I also tried the latest one.
2019/06/19 10:12:53 ERROR mlflow.utils.rest_utils: API request to https://mlflow-tracking.staging.travix.com/api/2.0/preview/mlflow/runs/log-parameter failed with code 500 != 200, retrying up to 1 more times. API response body:
The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.
2019/06/19 10:12:56 ERROR mlflow.utils.rest_utils: API request to https://mlflow-tracking.staging.travix.com/api/2.0/preview/mlflow/runs/log-parameter failed with code 500 != 200, retrying up to 0 more times. API response body:
The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.
Traceback (most recent call last):
File "/anaconda/envs/emailforecasting/bin/mlflow", line 10, in <module>
sys.exit(cli())
File "/anaconda/envs/emailforecasting/lib/python3.7/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/anaconda/envs/emailforecasting/lib/python3.7/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/anaconda/envs/emailforecasting/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/anaconda/envs/emailforecasting/lib/python3.7/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/anaconda/envs/emailforecasting/lib/python3.7/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/anaconda/envs/emailforecasting/lib/python3.7/site-packages/mlflow/cli.py", line 141, in run
run_id=run_id,
File "/anaconda/envs/emailforecasting/lib/python3.7/site-packages/mlflow/projects/__init__.py", line 209, in run
storage_dir=storage_dir, block=block, run_id=run_id)
File "/anaconda/envs/emailforecasting/lib/python3.7/site-packages/mlflow/projects/__init__.py", line 81, in _run
tracking.MlflowClient().log_param(active_run.info.run_uuid, key, value)
File "/anaconda/envs/emailforecasting/lib/python3.7/site-packages/mlflow/tracking/client.py", line 162, in log_param
self.store.log_param(run_id, param)
File "/anaconda/envs/emailforecasting/lib/python3.7/site-packages/mlflow/store/rest_store.py", line 182, in log_param
self._call_endpoint(LogParam, req_body)
File "/anaconda/envs/emailforecasting/lib/python3.7/site-packages/mlflow/store/rest_store.py", line 64, in _call_endpoint
host_creds=host_creds, endpoint=endpoint, method=method, json=json_body)
File "/anaconda/envs/emailforecasting/lib/python3.7/site-packages/mlflow/utils/rest_utils.py", line 73, in http_request_safe
response = http_request(host_creds=host_creds, endpoint=endpoint, **kwargs)
File "/anaconda/envs/emailforecasting/lib/python3.7/site-packages/mlflow/utils/rest_utils.py", line 58, in http_request
(url, retries))
mlflow.exceptions.MlflowException: API request to https://mlflow-tracking.staging.travix.com/api/2.0/preview/mlflow/runs/log-parameter failed to return code 200 after 3 tries