Machinelearning: More trainer-related naming alignment

Created on 13 Mar 2019 · 4 Comments · Source: dotnet/machinelearning

Take another pass over the trainers and the model parameter types, and align them, because now we have:

LogisticRegressionMulticlassClassificationTrainer but MulticlassLogisticRegressionModelParameters.

I think applying the same naming principles to the model parameter types would make them easier to relate to their trainers, so change MulticlassLogisticRegressionModelParameters to LogisticRegressionMulticlassModelParameters.

API

All 4 comments

@shauheen would this be a candidate for Project 13?

In general, it seems there isn't a 1:1 mapping between trainer and model parameter types. For example:

  • SdcaNonCalibratedBinaryClassificationTrainer uses LinearBinaryModelParameters
  • SdcaCalibratedBinaryClassificationTrainer uses CalibratedModelParametersBase as the type of its model parameters

However, some naming discrepancies exist. Whenever possible, we should align the trainer and model parameter type names.

Here are some of the model parameter types where we can fix this:

  • MulticlassLogisticRegressionModelParameters
  • BinaryClassificationGamModelParameters
  • RegressionGamModelParameters
  • MultiClassNaiveBayesModelParameters
  • PrincipleComponentModelParameters
  • OrdinaryLeastSquaresRegressionModelParameters
  • FastForestClassificationModelParameters

Update:

  • To keep the class names of Trainers and ModelParameters consistent, we will not use the word "Classification" in either the Trainer class or the ModelParameter class.
  • We feel it's OK to drop the word "Classification" for two main reasons:

    • We have sufficient context with just the suffix (BinaryTrainer, BinaryModelParameters, etc.) without the word "Classification".

    • Adding the word "Classification" leads to long names.
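As a rough illustration of the rule above, here is a minimal Python sketch that drops the word "Classification" from a class name. The helper `drop_classification` is hypothetical and not part of ML.NET; the input names are the older names quoted elsewhere in this thread.

```python
def drop_classification(name: str) -> str:
    """Return the class name with the word 'Classification' removed.

    Hypothetical helper for illustration only; it is not an ML.NET API.
    """
    return name.replace("Classification", "")


# Older names quoted in this thread, shortened by the rule.
print(drop_classification("SdcaNonCalibratedBinaryClassificationTrainer"))
print(drop_classification("LogisticRegressionMulticlassClassificationTrainer"))
```

Dropping the word is not always enough on its own: for a name such as BinaryClassificationGamModelParameters this helper yields BinaryGamModelParameters, whereas the summary lists GamBinaryModelParameters, so some names also need their words reordered by hand.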

Here is a summary of the Trainer and ModelParameter class names

| Updated Trainer Class Name |
| ------------------------ |
| FastTreeBinaryTrainer |
| FastTreeRegressionTrainer |
| FastTreeRankingTrainer |
| FastTreeTweedieTrainer |
| FastForestRegressionTrainer |
| FastForestBinaryTrainer |
| MatrixFactorizationTrainer |
| GamBinaryTrainer |
| GamRegressionTrainer |
| LogisticRegressionBinaryTrainer |
| LogisticRegressionMulticlassTrainer (see comment below by @wschin) |
| AveragedPerceptronTrainer |
| OnlineGradientDescentTrainer |
| PoissonRegressionTrainer |
| KMeansTrainer |
| OlsTrainer |
| PriorTrainer |
| PairwiseCouplingTrainer |
| OneVersusAllTrainer |
| NaiveBayesMulticlassTrainer |
| SgdCalibratedTrainer |
| SgdNonCalibratedTrainer |
| FieldAwareFactorizationMachineTrainer |
| SymbolicSgdTrainer |
| LightGbmRegressionTrainer |
| LightGbmBinaryTrainer |
| LightGbmRankingTrainer |
| LightGbmMulticlassTrainer |
| LinearSvmTrainer |
| RandomizedPcaTrainer |
| SdcaCalibratedBinaryTrainer |
| SdcaNonCalibratedBinaryTrainer |
| SdcaMulticlassTrainer |
| SdcaRegressionTrainer |
| SdcaTrainerBase |
| LbfgsTrainerBase |
| GamTrainerBase |

| Updated ModelParameter Class Name |
| ------------------------------------------ |
| GamBinaryModelParameters |
| GamRegressionModelParameters |
| LightGbmBinaryModelParameters |
| LightGbmRankingModelParameters |
| LightGbmRegressionModelParameters |
| FastTreeBinaryModelParameters |
| FastTreeRegressionModelParameters |
| FastTreeRankingModelParameters |
| FastTreeTweedieModelParameters |
| FastForestBinaryModelParameters |
| FastForestRegressionModelParameters |
| NaiveBayesMulticlassModelParameters |
| OlsModelParameters |
| PcaModelParameters |
| FieldAwareFactorizationMachineModelParameters |
| KMeansModelParameters |
| PoissonRegressionModelParameters |
| MatrixFactorizationModelParameters |
| OneVersusAllModelParameters |
| PairwiseCouplingModelParameters |
| PriorModelParameters |
| LinearRegressionModelParameters |
| RegressionModelParameters |
| LinearBinaryModelParameters |
| CalibratedModelParametersBase |
| GamModelParametersBase |
| TreeEnsembleModelParameters |
| TreeEnsembleModelParametersBasedOnQuantileRegressionTree |
| TreeEnsembleModelParametersBasedOnRegressionTree |
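One way to see where the two tables line up is to swap the "Trainer" suffix for "ModelParameters" and check whether the result appears in the second table. The sketch below does this in Python for a small subset of the names above. The helper names are hypothetical, and a name match is only a heuristic: it says nothing about which model parameter type a trainer actually produces.

```python
# Subsets of the two tables above, chosen for illustration.
TRAINERS = [
    "FastTreeBinaryTrainer",
    "GamRegressionTrainer",
    "KMeansTrainer",
    "SdcaCalibratedBinaryTrainer",
    "SdcaNonCalibratedBinaryTrainer",
    "RandomizedPcaTrainer",
]
MODEL_PARAMETERS = {
    "FastTreeBinaryModelParameters",
    "GamRegressionModelParameters",
    "KMeansModelParameters",
    "PcaModelParameters",
    "LinearBinaryModelParameters",
    "CalibratedModelParametersBase",
}


def expected_model_parameters(trainer: str) -> str:
    """Swap the 'Trainer' suffix for 'ModelParameters' (hypothetical helper)."""
    suffix = "Trainer"
    if not trainer.endswith(suffix):
        raise ValueError(f"{trainer!r} does not end with {suffix!r}")
    return trainer[: -len(suffix)] + "ModelParameters"


def split_by_alignment(trainers, model_parameters):
    """Split trainers by whether a same-stem ModelParameters name exists."""
    aligned, unaligned = [], []
    for trainer in trainers:
        if expected_model_parameters(trainer) in model_parameters:
            aligned.append(trainer)
        else:
            unaligned.append(trainer)
    return aligned, unaligned


aligned, unaligned = split_by_alignment(TRAINERS, MODEL_PARAMETERS)
print("aligned:  ", aligned)
print("unaligned:", unaligned)
```

In this subset the Sdca binary trainers and RandomizedPcaTrainer have no same-stem entry, which is consistent with the earlier observation that the mapping between trainers and model parameter types is not 1:1.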

For the multiclass LR trainer and its model, we will use different names in #2976. It looks like they don't need "Multiclass" in their names.

  • (rename) LogisticRegressionMulticlassClassificationTrainer ---> LbfgsMaximumEntropyTrainer
  • (rename) MulticlassLogisticRegressionModelParameters ---> MaximumEntropyModelParameters

We can NOT use the name LogisticRegressionMulticlass, because logistic regression is for binary classification only.
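For code that still refers to the old names, the two renames above could be applied mechanically. The following is a minimal Python sketch, assuming a plain whole-identifier text substitution is acceptable; `RENAMES` and `apply_renames` are hypothetical names, not part of any ML.NET tooling.

```python
import re

# The two renames listed above (old name -> new name).
RENAMES = {
    "LogisticRegressionMulticlassClassificationTrainer": "LbfgsMaximumEntropyTrainer",
    "MulticlassLogisticRegressionModelParameters": "MaximumEntropyModelParameters",
}

# Match only whole identifiers, so longer names that merely contain
# an old name are left untouched.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def apply_renames(source: str) -> str:
    """Replace each old identifier in `source` with its new name."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


print(apply_renames("MulticlassLogisticRegressionModelParameters model;"))
```

A text substitution like this does not understand namespaces or comments, so its output would still need review.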
