Machinelearning: Label and Score

Created on 15 May 2018 · 7 Comments · Source: dotnet/machinelearning

I'm not sure this is the best place for this question, but how do I get the labels back when getting the score? I remember from the Build 2018 video that the demo used PredictedLabelColumnOriginalValueConverter to get the label back, instead of the index, for the PredictedLabel. Is there a way to apply this converter to all the values in the Score? If so, how?

I also remember that they used ClassificationMetrics.ConfusionMatrix.ClassNames to get the labels back from the score, but this isn't really doable if you load the output in an app and don't need or want to test your data.

I found the file Terms.txt in the output zip archive and I can definitely parse it to find the labels but I'm sure there's a better way to do it.

question

All 7 comments

Getting the class names for the other scores (not just the top score) is currently available in ML.NET by going through IDV (IDataView), but not in the high-level APIs (e.g. LearningPipeline). This is something we definitely need to expose!

@codemzs, @glebuk, @TomFinley, any thoughts on how we can enable getting this type of data in the high-level APIs?

Is there any available documentation/example on how to use IDataView vs LearningPipeline? Also, is it safe to assume that the float[] from Score will have the same order as the labels in the Terms.txt file? Is there a public API to access the files once they have been loaded with PredictionModel.ReadAsync()?

@cheesemacfly, just wanted to quickly follow up to see if #239 addresses your question. Take a look at TryGetScoreLabelNames here, which gives you the class names (in the same order as the scores).

@GalOshri, it definitely looks like this would address my question. Is there a pre-release 0.2 version of the NuGet package available somewhere so I can test it?

Yes, the daily NuGet builds are available here.

@GalOshri, it works as expected, thanks.

Also, I'm now wondering: is there a reason why we couldn't get the labels directly from the result of the Predict() call? For example, with something like [ColumnName("MappedScore")] (keeping [ColumnName("Score")] as it is), we would get back a Tuple<float, string>[]. That would be easier to read than a second method call on the PredictionModel object.

Just to recap and for future reference, because I've been looking for this, the code that gets the label for the scores is this:

string[] scoreLabels;
model.TryGetScoreLabelNames(out scoreLabels);

The order of the labels matches the order of the scores in the model!
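As a rough sketch of the Tuple<float, string>[] idea raised earlier in the thread, the score array and the label names from TryGetScoreLabelNames can be paired by index in plain C#. The class and method names below (ScoreLabelPairing, PairScoresWithLabels) are hypothetical and not part of ML.NET. The only assumption taken from the thread is that both arrays share the same order.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of ML.NET.
public static class ScoreLabelPairing
{
    // Pairs each score with the label at the same index, then sorts the
    // pairs from highest to lowest score.
    // Assumes scores[i] belongs to labels[i], as described in the thread.
    public static Tuple<float, string>[] PairScoresWithLabels(float[] scores, string[] labels)
    {
        if (scores == null) throw new ArgumentNullException(nameof(scores));
        if (labels == null) throw new ArgumentNullException(nameof(labels));
        if (scores.Length != labels.Length)
            throw new ArgumentException("Scores and labels must have the same length.");

        return scores
            .Zip(labels, (score, label) => Tuple.Create(score, label))
            .OrderByDescending(pair => pair.Item1)
            .ToArray();
    }
}
```

For instance, passing the scores { 0.1f, 0.7f, 0.2f } with the labels { "a", "b", "c" } returns the pair for "b" first, since it has the highest score. In an application, the scores would come from the Score column of the prediction and the labels from the scoreLabels array above.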
