Docs: multi class classification?

Created on 16 Jun 2018 · 23 Comments · Source: dotnet/docs

Which method can I use for multi-class classification?


product-question

All 23 comments

@bs6523 the get started example uses StochasticDualCoordinateAscentClassifier

Other multiclass classification trainers in ML.NET are

The topic that describes ML.NET trainers is under construction; see PR #5698

Do you have any tutorials for this?

To my knowledge, there is only the one linked earlier.

That one doesn't work in my case. I want to classify sentiment as neutral, positive, or negative. Do you have any idea how to do this?

What makes you think it doesn't work in your case?

The training file in the tutorial contains numbers, but in my case it will be text with a label value.
Example:
"Hi how are you" 0
"I am sad" 1
"I am good " 2

0 - neutral
1 - negative
2 - positive

The training file in the tutorial has the same structure as in your example, except the two columns are swapped (value - text).

Or are we looking at different files?

Data
Yes, the data set you provided is from the tutorial that uses a binary classifier, and it works for me.
But I am trying StochasticDualCoordinateAscentClassifier from the https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet/get-started/windows
tutorial, and the training data in that tutorial is different.

What happens if you try to use the sentiment analysis tutorial with StochasticDualCoordinateAscentClassifier?

It gives an error because the formats are different; the tutorial is for FastTreeBinaryClassifier.

@bs6523 oh! that's becoming interesting.

Try this one in the Train method:

pipeline.Add(new Dictionarizer("Label"));
pipeline.Add(new TextFeaturizer("Features", "SentimentText"));
pipeline.Add(new StochasticDualCoordinateAscentClassifier());
pipeline.Add(new PredictedLabelColumnOriginalValueConverter() { PredictedLabelColumn = "PredictedLabel" });

And also update the SentimentPrediction class to use float type for the field:

public class SentimentPrediction
{
    [ColumnName("PredictedLabel")]
    public float Sentiment;
}
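
Because the predicted label comes back as a float, it usually has to be turned back into a readable sentiment name. A minimal sketch of that step, assuming the 0/1/2 coding given earlier in the thread; SentimentLabels and ToName are hypothetical helper names, not part of ML.NET:

```csharp
using System;

public static class SentimentLabels
{
    // Label coding taken from the question: 0 = neutral, 1 = negative, 2 = positive.
    private static readonly string[] Names = { "neutral", "negative", "positive" };

    // Hypothetical helper: converts the float label produced by the model into a name.
    public static string ToName(float predictedLabel)
    {
        int index = (int)Math.Round(predictedLabel);
        if (index < 0 || index >= Names.Length)
            throw new ArgumentOutOfRangeException(nameof(predictedLabel), "Unknown sentiment label.");
        return Names[index];
    }
}

public static class Program
{
    public static void Main()
    {
        // Prints "positive" for label 2.
        Console.WriteLine(SentimentLabels.ToName(2f));
    }
}
```

With a helper like this, a prediction method could return SentimentLabels.ToName(prediction.Sentiment) instead of the raw number.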

Of course, you should update the Evaluate method as well:

var testData = new TextLoader(_testDataPath).CreateFrom<SentimentData>();
var evaluator = new ClassificationEvaluator();
ClassificationMetrics metrics = evaluator.Evaluate(model, testData);

That would break the code that displays metrics, because ClassificationMetrics has other properties. Check its API to decide what metrics you would like to see.

I hope that helps.

```
using System;
using Microsoft.ML.Models;
using Microsoft.ML.Runtime;
using Microsoft.ML.Runtime.Api;
using Microsoft.ML.Trainers;
using Microsoft.ML.Transforms;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;

namespace FMSforTelicom.Models
{
    public class SentimentData
    {
        [Column(ordinal: "0")]
        public string SentimentText;
        [Column(ordinal: "1", name: "Label")]
        public float Sentiment;
    }

    public class SentimentPrediction
    {
        [ColumnName("PredictedLabel")]
        public float Sentiment;
    }

    public static class Globals
    {
        public static PredictionModel<SentimentData, SentimentPrediction> model_Sentiment;
    }

    class SentimentAnalysis
    {
        const string _dataPathNeutral = @"..\Sentiment_Training_Data\sentiment labelled sentences\Training_NeuNonneu.txt";
        const string _dataPathPosNeg = @"..\Sentiment_Training_Data\sentiment labelled sentences\Training_PosNeg.txt";
        const string _testDataPathPosNeg = @"..\Sentiment_Training_Data\sentiment labelled sentences\Evaluate_PosNeg.txt";
        const string _testDataPathNeuNonneu = @"..\Sentiment_Training_Data\sentiment labelled sentences\Evaluate_NeuNonneu.txt";

        static void Main(string[] args)
        {
            SentimentAnalysis.Training();
            String Sentiment = SentimentAnalysis.Predict("how are you doing, i am not good today");
            Console.Read();
        }

        public static void Training()
        {
            var pipeline = new LearningPipeline();
            pipeline.Add(new TextLoader<SentimentData>(_dataPathNeutral, useHeader: false, separator: "tab"));
            pipeline.Add(new Dictionarizer("Label"));
            pipeline.Add(new TextFeaturizer("Features", "SentimentText"));
            pipeline.Add(new StochasticDualCoordinateAscentClassifier());
            pipeline.Add(new PredictedLabelColumnOriginalValueConverter() { PredictedLabelColumn = "PredictedLabel" });

            //  var testData = new TextLoader(_testDataPath).CreateFrom<SentimentData>();
            //  var evaluator = new ClassificationEvaluator();
            //  ClassificationMetrics metrics = evaluator.Evaluate(model, testData);

            Globals.model_Sentiment = pipeline.Train<SentimentData, SentimentPrediction>();

            //training end
        }

        public static String Predict(String text)
        {
            String _Sentiment = "Null";
            String _SentimentText = text;

            IEnumerable<SentimentData> sentiments = new[]{
                new SentimentData{
                    SentimentText = _SentimentText,
                    Sentiment = 0
                },
            };

            IEnumerable<SentimentPrediction> predictions = Globals.model_Sentiment.Predict(sentiments);

            Console.WriteLine();
            Console.WriteLine("Sentiment Predictions");
            Console.WriteLine("---------------------");

            Console.WriteLine("Text: " + predictions.ToList()[0].Sentiment);

            return _Sentiment;
        }
    }
}
```

This is my code. I am getting this error:

System.Reflection.TargetInvocationException: 'Exception has been thrown by the target of an invocation.'
InvalidOperationException: Source column 'Label' is required but not found

I saw this and got this multiclass language classification to work. I also had an exception on my first try; these exceptions are really not helpful unless you know what they mean :-)

https://github.com/Dirkster99/ML/tree/master/source/MultibleClasses

(Even though I have quite a few classification examples now, I am still not sure how input and output data types are mapped or related.)

@bs6523 are you using the latest version of the library, 0.2?

yes

<ItemGroup>
  <PackageReference Include="Microsoft.ML" Version="0.2.0" />
</ItemGroup>

https://github.com/Dirkster99/ML/blob/master/source/MultibleClasses/MultibleClasses.csproj

pipeline.Add(new TextLoader<SentimentData>(_dataPathNeutral, useHeader: false, separator: "tab"));
Is there a replacement for this line after updating to Microsoft.ML 0.2.0? (It stopped working after the update.)

Yes, there is a syntax change between 0.1 and 0.2. Consider something like this:
new TextLoader(dataPath).CreateFrom<Digit>(separator: ',', allowQuotedStrings:false)

from here:
https://github.com/Dirkster99/ML/blob/bbda50bee5b8f0d086fce50d7b28f61a00a21c17/source/IrisDataset/Models/PredictDigits/PredictDigit.cs

Your solution is probably something like this (useHeader: false is the default):
new TextLoader(_dataPathNeutral).CreateFrom<SentimentData>(separator: '\t')

Try new pipeline.Add(new TextLoader(_dataPathNeutral).CreateFrom\

@pkulikov the code is working after updating to 0.2.0.
@Dirkster99 @JRAlexander thanks for the code, it's working fine.

Does anyone know how StochasticDualCoordinateAscentClassifier() works? Is there any explanation of it? I can't understand how it works in the background.
Is it a neural net model or some other model?

@bs6523

The current docs don't explain learners in detail. The mentioned learner might be that one. In any case, you are more likely to get an answer if you ask this question in the ML.NET repo itself.
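
For orientation: as far as I know, stochastic dual coordinate ascent (SDCA) is an optimization method for training linear models, not a neural network. Under that assumption, prediction amounts to one linear score per class followed by a softmax. The sketch below is not ML.NET's implementation; the class name LinearMulticlassSketch, the weights, and the feature values are all made up for illustration:

```csharp
using System;
using System.Linq;

public static class LinearMulticlassSketch
{
    // Computes softmax(W x + b): one linear score per class, normalized to probabilities.
    public static double[] Predict(double[][] weights, double[] bias, double[] features)
    {
        double[] scores = weights
            .Select((row, k) => row.Zip(features, (w, x) => w * x).Sum() + bias[k])
            .ToArray();
        double max = scores.Max(); // subtract the max score for numerical stability
        double[] exp = scores.Select(s => Math.Exp(s - max)).ToArray();
        double total = exp.Sum();
        return exp.Select(e => e / total).ToArray();
    }

    public static void Main()
    {
        // Made-up weights for 3 classes over 2 features; a real trainer would learn these.
        double[][] weights = { new[] { 0.1, 0.0 }, new[] { -1.0, 2.0 }, new[] { 1.5, -0.5 } };
        double[] bias = { 0.0, 0.0, 0.0 };
        double[] probabilities = Predict(weights, bias, new[] { 1.0, 0.0 });
        int best = Array.IndexOf(probabilities, probabilities.Max());
        Console.WriteLine($"Predicted class: {best}"); // class 2 has the highest score here
    }
}
```

In a text pipeline the feature vector would come from the featurizer step rather than being typed in by hand.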

@pkulikov
Do you know how to get these values

 Accuracy:
 Auc:
 F1Score:
 Negative Precision:
 Positive Precision:
 Negative Recall:
 Positive Recall:

using var evaluator = new ClassificationEvaluator();?

I used BinaryClassificationEvaluator and got these scores, but it doesn't work with ClassificationEvaluator.

@bs6523 multiclass classification has a different set of evaluation metrics. Check the properties of the ClassificationMetrics class. For example, you can get the confusion matrix from the ClassificationMetrics.ConfusionMatrix property. In the case of binary classification metrics, the confusion matrix is represented by the precision and recall values.
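
To illustrate how per-class precision and recall relate to a multiclass confusion matrix, here is a minimal sketch. It assumes the counts have already been copied into a plain int[,] array indexed as [actual, predicted]; that layout, the class name ConfusionMatrixSketch, and the counts are assumptions for illustration and say nothing about how ML.NET exposes the matrix:

```csharp
using System;

public static class ConfusionMatrixSketch
{
    // Precision for a class: correct predictions of that class / all predictions of that class.
    public static double Precision(int[,] matrix, int cls)
    {
        int predictedTotal = 0;
        for (int actual = 0; actual < matrix.GetLength(0); actual++)
            predictedTotal += matrix[actual, cls];
        return predictedTotal == 0 ? 0.0 : (double)matrix[cls, cls] / predictedTotal;
    }

    // Recall for a class: correct predictions of that class / all actual examples of that class.
    public static double Recall(int[,] matrix, int cls)
    {
        int actualTotal = 0;
        for (int predicted = 0; predicted < matrix.GetLength(1); predicted++)
            actualTotal += matrix[cls, predicted];
        return actualTotal == 0 ? 0.0 : (double)matrix[cls, cls] / actualTotal;
    }

    public static void Main()
    {
        // Made-up counts for classes 0 = neutral, 1 = negative, 2 = positive.
        int[,] matrix =
        {
            { 8, 1, 1 },
            { 2, 6, 2 },
            { 0, 3, 7 }
        };
        for (int cls = 0; cls < 3; cls++)
            Console.WriteLine($"class {cls}: precision {Precision(matrix, cls):F2}, recall {Recall(matrix, cls):F2}");
    }
}
```

With only two classes this reduces to the positive and negative precision and recall values reported for binary classification.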

Closing this as it appears resolved.
