Machinelearning: Exception when converting PredictedLabel from Key To Value

Created on 26 Mar 2019 · 4 Comments · Source: dotnet/machinelearning

When converting the output PredictedLabel column from key back to value, the following exception is thrown:

System.InvalidOperationException
  HResult=0x80131509
  Message=Metadata KeyValues does not exist
  Source=Microsoft.ML.Core
  StackTrace:
   at Microsoft.ML.Runtime.Contracts.Check(IExceptionContext ctx, Boolean f, String msg) in C:\MLDotNet2\src\Microsoft.ML.Core\Utilities\Contracts.cs:line 497

Code to repro:
```C#
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext();

// Three numeric columns read from a tab-separated file with a header row.
var textLoaderOptions = new TextLoader.Options()
{
    Columns = new[]
    {
        new TextLoader.Column("Label", DataKind.Single, 0),
        new TextLoader.Column("Row", DataKind.Single, 1),
        new TextLoader.Column("Column", DataKind.Single, 2),
    },
    HasHeader = true,
    Separators = new[] { '\t' }
};
var textLoader = mlContext.Data.CreateTextLoader(textLoaderOptions);
var data = textLoader.Load(@"C:\MLDotNet2\test\data\trivial-train.tsv");

// One-versus-all multiclass trainer built on a binary averaged perceptron.
var ap = mlContext.BinaryClassification.Trainers.AveragedPerceptron();
var ova = mlContext.MulticlassClassification.Trainers.OneVersusAll(ap);

// Map Label to a key, build Features, train, then map PredictedLabel back to a value.
var pipeline = mlContext.Transforms.Conversion.MapValueToKey("Label")
    .Append(mlContext.Transforms.Concatenate("Features", "Row", "Column"))
    .Append(ova)
    .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));

var model = pipeline.Fit(data);
```

Replace C:\MLDotNet2\ with the path to the ML.NET repo on your local machine.

bug

All 4 comments

The problem here is that the Annotations are not carried from the Label column to the PredictedLabel column.
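
One way to confirm this is to look for a `KeyValues` annotation on the columns of the scored data. The helper below is only a sketch: `KeyValuesCheck` and `HasKeyValuesAnnotation` are hypothetical names, not part of ML.NET, and it assumes the post-0.11 `DataViewSchema` API, where metadata is exposed as annotations.

```C#
using Microsoft.ML;

// Hypothetical helper (not part of ML.NET) for inspecting a schema.
public static class KeyValuesCheck
{
    // Returns true when the named column exists and carries a "KeyValues" annotation.
    public static bool HasKeyValuesAnnotation(DataViewSchema schema, string columnName)
    {
        // GetColumnOrNull returns null when the schema has no such column.
        DataViewSchema.Column? column = schema.GetColumnOrNull(columnName);
        if (!column.HasValue)
            return false;

        // Annotations are described by their own schema; look the kind up by name.
        return column.Value.Annotations.Schema.GetColumnOrNull("KeyValues").HasValue;
    }
}
```

Fitting the repro pipeline without the final `MapKeyToValue` step and passing the `Schema` of the transformed data to this helper, once for `Label` and once for `PredictedLabel`, should show whether the annotation is present on one column but missing on the other.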

@Ivanidzo4ka @shauheen
This is a regression from 0.11 and is currently blocking the AutoML multiclass scenario.
Can we please prioritize this issue?
Thanks in advance.

I also hit this issue during the bug bash while upgrading XamlBrewer.Uwp.MachineLearningSample to the latest ML.NET build.

https://github.com/XamlBrewer/UWP-MachineLearning-Sample/blob/150b11ed4941451e0aa48c7f71500069efa2c2fb/XamlBrewer.Uwp.MachineLearningSample/Models/MulticlassClassification/MulticlassClassificationModel.cs#L19-L30

```C#
_pipeline = MLContext.Transforms.Conversion.MapValueToKey("Label")
    .Append(MLContext.Transforms.Text.FeaturizeText("Features", "Text"))
    // Main algorithm
    // .Append(MLContext.MulticlassClassification.Trainers.StochasticDualCoordinateAscent())
    // or
    .Append(MLContext.MulticlassClassification.Trainers.LogisticRegression())
    // or
    // .Append(MLContext.MulticlassClassification.Trainers.NaiveBayes()) // yields weird metrics...

    // Convert the predicted value back into a language.
    .Append(MLContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));
```
This doesn't appear to be OVA specific.

Still reproducible in ML.NET 1.4.
I have 18 records in the IDataView used for training.
Executing CrossValidate on dataView.TestSet always raises
System.InvalidOperationException: 'Metadata KeyValues does not exist'.
Performing CrossValidate on the entire dataView works fine.
Performing CrossValidate on dataView.TrainSet works only if numberOfFolds is greater than 3; otherwise the same exception is thrown.

public dynamic GetCrossMetrics(IEstimator<ITransformer> pipeline, IEstimator<ITransformer> estimator, IDataView dataView)
{
  var inputs = pipeline.Fit(dataView).Transform(dataView);
  var metrics = Predictor.MulticlassClassification.CrossValidate(inputs, pipeline.Append(estimator), numberOfFolds: 2, labelColumnName: "Emotion");

  return metrics;
}

var inputs = GetInputs();        // Read data from TSV file
var pipeline = GetPipeline();    // Create "Label" column, concatenate "Features", and NormalizeMinMax
var estimator = GetEstimator();  // MultiClass.LightGbm
var inputSets = Predictor.Data.TrainTestSplit(inputs, testFraction: 0.2);
var model = pipeline.Append(estimator).Fit(inputSets.TrainSet);

var x1 = GetCrossMetrics(pipeline, estimator, inputs);              // always works
var x2 = GetCrossMetrics(pipeline, estimator, inputSets.TrainSet);  // works only for numberOfFolds > 3
var x3 = GetCrossMetrics(pipeline, estimator, inputSets.TestSet);   // always fails, TestSet contains 2 items