When trying to convert the output PredictedLabel from key back to value, the following exception is thrown:
System.InvalidOperationException
HResult=0x80131509
Message=Metadata KeyValues does not exist
Source=Microsoft.ML.Core
StackTrace:
at Microsoft.ML.Runtime.Contracts.Check(IExceptionContext ctx, Boolean f, String msg) in C:\MLDotNet2\src\Microsoft.ML.Core\Utilities\Contracts.cs:line 497
Code to repro:
```C#
var mlContext = new MLContext();
var textLoaderOptions = new TextLoader.Options()
{
    Columns = new[]
    {
        new TextLoader.Column("Label", DataKind.Single, 0),
        new TextLoader.Column("Row", DataKind.Single, 1),
        new TextLoader.Column("Column", DataKind.Single, 2),
    },
    HasHeader = true,
    Separators = new[] { '\t' }
};
var textLoader = mlContext.Data.CreateTextLoader(textLoaderOptions);
var data = textLoader.Load(@"C:\MLDotNet2\test\data\trivial-train.tsv");
var ap = mlContext.BinaryClassification.Trainers.AveragedPerceptron();
var ova = mlContext.MulticlassClassification.Trainers.OneVersusAll(ap);
var pipeline = mlContext.Transforms.Conversion.MapValueToKey("Label")
    .Append(mlContext.Transforms.Concatenate("Features", "Row", "Column"))
    .Append(ova)
    .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));
var model = pipeline.Fit(data);
```
Replace C:\MLDotNet2\ with the path to the ML.NET repo on your local machine.
The problem here is that the Annotations are not carried from the Label column to the PredictedLabel column.
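One way to confirm this diagnosis is to inspect the schema of the transformed data before `MapKeyToValue` is applied. The following is a minimal sketch, not part of the original repro: the class and method names are hypothetical, and the annotation name `"KeyValues"` is assumed from the exception message above.

```C#
using Microsoft.ML;

public static class KeyValueAnnotationCheck
{
    // Hypothetical helper: returns true when the named column exists and
    // carries an annotation called "KeyValues" (name assumed from the
    // exception message), false otherwise.
    public static bool HasKeyValuesAnnotation(DataViewSchema schema, string columnName)
    {
        DataViewSchema.Column? column = schema.GetColumnOrNull(columnName);
        if (!column.HasValue)
            return false;

        // Annotations are exposed as their own schema, one column per annotation kind.
        return column.Value.Annotations.Schema.GetColumnOrNull("KeyValues").HasValue;
    }
}
```

To use it, fit the pipeline from the repro without the final `MapKeyToValue("PredictedLabel")` step, call `Transform(data)`, and pass the resulting `Schema` together with `"Label"` and then `"PredictedLabel"`. If the diagnosis is right, the check should succeed for `Label` and fail for `PredictedLabel`.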
@Ivanidzo4ka @shauheen
This is a regression from 0.11 and is currently blocking the AutoML multiclass scenario.
Can I please request that we prioritize this issue?
Thanks in advance
I also hit this issue during the bug bash while upgrading XamlBrewer.Uwp.MachineLearningSample to the latest ML.NET build.
```C#
_pipeline = MLContext.Transforms.Conversion.MapValueToKey("Label")
    .Append(MLContext.Transforms.Text.FeaturizeText("Features", "Text"))
    // Main algorithm
    // .Append(MLContext.MulticlassClassification.Trainers.StochasticDualCoordinateAscent())
    // or
    .Append(MLContext.MulticlassClassification.Trainers.LogisticRegression())
    // or
    // .Append(MLContext.MulticlassClassification.Trainers.NaiveBayes()) // yields weird metrics...
    // Convert the predicted value back into a language.
    .Append(MLContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));
```
This doesn't appear to be OVA-specific.
Still reproducible in ML.NET 1.4.
I have 18 records in an IDataView for training.
Executing CrossValidate on dataView.TestSet always raises
System.InvalidOperationException: 'Metadata KeyValues does not exist'.
Performing CrossValidate on the entire dataView works fine.
Performing CrossValidate on dataView.TrainSet works only if numberOfFolds is greater than 3; otherwise it throws the same exception.
public dynamic GetCrossMetrics(IEstimator<ITransformer> pipeline, IEstimator<ITransformer> estimator, IDataView dataView)
{
    var inputs = pipeline.Fit(dataView).Transform(dataView);
    var metrics = Predictor.MulticlassClassification.CrossValidate(inputs, pipeline.Append(estimator), numberOfFolds: 2, labelColumnName: "Emotion");
    return metrics;
}

var inputs = GetInputs(); // Read data from TSV file
var pipeline = GetPipeline(); // Create "Label" column, concatenate "Features", and NormalizeMinMax
var estimator = GetEstimator(); // MultiClass.LightGbm
var inputSets = Predictor.Data.TrainTestSplit(inputs, testFraction: 0.2);
var model = pipeline.Append(estimator).Fit(inputSets.TrainSet);
var x1 = GetCrossMetrics(pipeline, estimator, inputs); // always works
var x2 = GetCrossMetrics(pipeline, estimator, inputSets.TrainSet); // works only for numberOfFolds > 3
var x3 = GetCrossMetrics(pipeline, estimator, inputSets.TestSet); // always fails, TestSet contains 2 items