Mlflow: I cannot see the model signature and input_example in MLmodel file of a spark flavor model

Created on 22 Jul 2020 · 4 Comments · Source: mlflow/mlflow

Hi, I am wondering whether I have missed something when logging a model with a signature and an input example. Can anyone help me, please? Thank you!

My tracking-uri and backend-store-uri are the same MySQL database URI,
and default-artifact-root is ./mlruns.

I am trying to train an ALS model with the PySpark 2 ml package and log it with MLflow,
but in the MLflow UI at http://:, I cannot see the model signature and input_example in the MLmodel file.

I also don't see any difference between the model I logged without a signature and the one I logged with it when going through "train -> models serve -> send data in a JSON file and get a prediction". Could someone tell me how I can see the difference?
The model signature and input_example information should be there, as I see it for a model of the sklearn flavor.
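
One rough way to check whether the metadata was written is to look at the top-level keys of the MLmodel file: MLflow records the signature under `signature` and the input example under `saved_input_example_info`. The sketch below uses only the standard library and a naive line scan rather than a real YAML parser. The helper names and the sample file contents are invented for illustration, not taken from an actual run.

```python
def top_level_keys(mlmodel_text):
    """Return the set of top-level keys in MLmodel-style YAML text.

    Naive scan: a top-level key is a non-indented, non-comment line that
    contains a colon. This is not a full YAML parser.
    """
    keys = set()
    for line in mlmodel_text.splitlines():
        if not line or line[0] in " \t#-":
            continue
        key, sep, _ = line.partition(":")
        if sep:
            keys.add(key.strip())
    return keys


def has_signature_metadata(mlmodel_text):
    """Report whether the signature and input example entries are present."""
    keys = top_level_keys(mlmodel_text)
    return {
        "signature": "signature" in keys,
        "input_example": "saved_input_example_info" in keys,
    }


# Invented sample contents (assumption), not copied from a real run.
WITHOUT_METADATA = """\
artifact_path: small_ALS
flavors:
  spark:
    model_data: sparkml
"""

WITH_METADATA = WITHOUT_METADATA + """\
saved_input_example_info:
  artifact_path: input_example.json
signature:
  inputs: '[{"name": "user", "type": "long"}]'
"""

print(has_signature_metadata(WITHOUT_METADATA))
print(has_signature_metadata(WITH_METADATA))
```

Running the same check on the MLmodel file of each logged model would show whether the two runs actually differ in their stored metadata.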

Labels: area/models, priority/important-soon

All 4 comments

@Astonzzh Thanks for filing this issue. I was able to reproduce the issue.

from pyspark.sql import SparkSession
from pyspark.ml.recommendation import ALS

import mlflow
import mlflow.spark
from mlflow.models.signature import infer_signature
from pyspark.ml import Pipeline

spark = (
    SparkSession
    .builder
    .getOrCreate()
)

# prepare train and test data
train = spark.createDataFrame(
    [(0, 0, 4.0),
     (0, 1, 2.0),
     (1, 1, 3.0)],
    ["user", "item", "rating"],
)

test = spark.createDataFrame(
    [(0, 2), (1, 0), (2, 0)],
    ["user", "item"]
)

# train model
als = ALS(rank=10, maxIter=10, regParam=0.1, userCol="user", itemCol="item", ratingCol="rating")
pipeline = Pipeline(stages=[als])
model = pipeline.fit(train)

# create signature and example
signature = infer_signature(test, model.transform(test))
example_dict = {'user': 0, 'item': 1}

# log model
mlflow.spark.log_model(
    model,
    'small_ALS',
    signature=signature,
    input_example=example_dict
)

I found that signature and input_example are not passed to Model.log.

https://github.com/mlflow/mlflow/blob/c2708b5735354e10ec052164253eb74243c5603f/mlflow/spark.py#L174-L177

I have confirmed that adding the missing input_example and signature arguments solves this issue.

    if is_local_uri(run_root_artifact_uri):
        return Model.log(artifact_path=artifact_path, flavor=mlflow.spark, spark_model=spark_model,
                         conda_env=conda_env, dfs_tmpdir=dfs_tmpdir, sample_input=sample_input,
                         registered_model_name=registered_model_name,
+                        signature=signature,
+                        input_example=input_example)
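
The general pattern behind this bug can be sketched in isolation: a wrapper accepts optional arguments but does not forward them, so the inner call silently falls back to its defaults. All function names below are stand-ins made up for illustration. This is not MLflow's actual code.

```python
def model_log(artifact_path, signature=None, input_example=None):
    """Stand-in for a generic logging function; returns what it would record."""
    return {
        "artifact_path": artifact_path,
        "signature": signature,
        "input_example": input_example,
    }


def log_model_buggy(artifact_path, signature=None, input_example=None):
    # Accepts signature and input_example but never forwards them,
    # so the inner function sees its defaults (None) and no error is raised.
    return model_log(artifact_path=artifact_path)


def log_model_fixed(artifact_path, signature=None, input_example=None):
    # Forwards both optional arguments explicitly.
    return model_log(
        artifact_path=artifact_path,
        signature=signature,
        input_example=input_example,
    )


example = {"user": 0, "item": 1}
buggy = log_model_buggy("small_ALS", signature="sig", input_example=example)
fixed = log_model_fixed("small_ALS", signature="sig", input_example=example)
print(buggy["signature"], buggy["input_example"])
print(fixed["signature"], fixed["input_example"])
```

Because the dropped arguments are optional, nothing fails at call time. The only symptom is the missing metadata in the output, which matches what was reported here.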

Cool! Thank you!
