Hi team,
I am trying to create reusable components that persist UI metadata and metrics. It seems that I need to be explicit about outputting metadata and metrics in the component.yaml file. For example, the confusion_matrix component:
name: Confusion matrix
description: Calculates confusion matrix
inputs:
- {name: Predictions, type: GCSPath, description: 'GCS path of prediction file pattern.'} # type: {GCSPath: {data_type: CSV}}
- {name: Target lambda, type: String, default: '', description: 'Text of Python lambda function which computes target value. For example, "lambda x: x[''a''] + x[''b'']". If not set, the input must include a "target" column.'}
- {name: Output dir, type: GCSPath, description: 'GCS path of the output directory.'} # type: {GCSPath: {path_type: Directory}}
outputs:
- {name: MLPipeline UI metadata, type: UI metadata}
- {name: MLPipeline Metrics, type: Metrics}
implementation:
  container:
    image: gcr.io/ml-pipeline/ml-pipeline-local-confusion-matrix:1.0.0
    command: [python2, /ml/confusion_matrix.py]
    args: [
      --predictions, {inputValue: Predictions},
      --target_lambda, {inputValue: Target lambda},
      --output, {inputValue: Output dir},
    ]
    fileOutputs:
      MLPipeline UI metadata: /mlpipeline-ui-metadata.json
      MLPipeline Metrics: /mlpipeline-metrics.json
However, the component specification says that fileOutputs is a legacy property and should not be used.
The two contradict each other. Also, I could not find any documentation saying that we need to use fileOutputs in the component.yaml file to persist UI metadata and metrics.
So what is the correct way to output UI metadata and metrics?
Thanks
Bin
It mentions that fileOutputs is a Legacy property and should not be used.
They are contradicting each other.
Yes, the confusion_matrix component still uses a legacy way to describe outputs.
fileOutputs works, but we really discourage its usage.
Before:
args: [
  --predictions, {inputValue: Predictions},
  --target_lambda, {inputValue: Target lambda},
  --output, {inputValue: Output dir},
]
fileOutputs:
  MLPipeline UI metadata: /mlpipeline-ui-metadata.json
  MLPipeline Metrics: /mlpipeline-metrics.json
After:
args: [
  --predictions, {inputValue: Predictions},
  --target_lambda, {inputValue: Target lambda},
  --output, {inputValue: Output dir},
  --metadata-path, {outputPath: MLPipeline UI metadata},
  --metrics-path, {outputPath: MLPipeline Metrics},
]
The confusion matrix component will be fixed at some point in the future: https://github.com/kubeflow/pipelines/pull/580/files#diff-a65b15d94e67f77a796e4940651e5c62
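For the outputPath version to work, the program itself has to write to whatever paths it is given instead of the hard-coded /mlpipeline-ui-metadata.json and /mlpipeline-metrics.json. Below is a minimal sketch of that program side, not the actual confusion_matrix.py. It assumes Python 3, reuses the --metadata-path and --metrics-path flag names from the "After" args, and uses placeholder payloads: the JSON shapes, the gs://example-bucket source, and the accuracy value are illustrative assumptions and should be checked against the output viewer and metrics documentation.

```python
import argparse
import json
import os


def write_json(path, payload):
    """Write payload as JSON to path, creating parent directories first.

    The parent directory of a system-generated output path may not exist yet,
    so it is created here before the file is opened.
    """
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(path, "w") as f:
        json.dump(payload, f)


def main(argv=None):
    parser = argparse.ArgumentParser()
    # Flag names follow the "After" args; the values are filled in by the
    # system through the {outputPath: ...} placeholders.
    parser.add_argument("--metadata-path", required=True)
    parser.add_argument("--metrics-path", required=True)
    args = parser.parse_args(argv)

    # Placeholder UI metadata (assumed shape, hypothetical source path).
    ui_metadata = {
        "outputs": [
            {
                "type": "confusion_matrix",
                "format": "csv",
                "source": "gs://example-bucket/confusion_matrix.csv",
            }
        ]
    }
    # Placeholder metrics (assumed shape, made-up value).
    metrics = {
        "metrics": [
            {"name": "accuracy-score", "numberValue": 0.9, "format": "PERCENTAGE"}
        ]
    }

    write_json(args.metadata_path, ui_metadata)
    write_json(args.metrics_path, metrics)
```

The sketch only defines main; a real script would call main() at the bottom. The point of the design is that the component no longer decides where the files live: it only promises to write them to the paths passed in.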
Thanks for the example! I think it would be nice to add this information to the documentation, as it does not have a clear example of how to output the UI metadata and metrics in the component.yaml file.
Thank you. We'll try to improve the documentation.