Machinelearning: More trainer related naming alignment

Created on 13 Mar 2019 · 4 Comments · Source: dotnet/machinelearning

Take another pass over the trainers and the model parameter types, and align them, because now we have:

LogisticRegressionMulticlassClassificationTrainer but MulticlassLogisticRegressionModelParameters.

I think following the same naming principles for the model parameter types would make the two easier to relate, so change MulticlassLogisticRegressionModelParameters to LogisticRegressionMulticlassModelParameters.

API

All 4 comments

@shauheen would this be a candidate for Project 13?

In general, it seems there isn't a 1:1 mapping between trainer and model parameter types.

  • SdcaNonCalibratedBinaryClassificationTrainer uses LinearBinaryModelParameters as its model parameter type.
  • SdcaCalibratedBinaryClassificationTrainer uses CalibratedModelParametersBase as its model parameter type.

However, some naming discrepancies do exist. Whenever possible, we should align the names of the trainer and model parameter types.

Some of the model parameter types where we can fix this:

  • MulticlassLogisticRegressionModelParameters
  • BinaryClassificationGamModelParameters
  • RegressionGamModelParameters
  • MultiClassNaiveBayesModelParameters
  • PrincipleComponentModelParameters
  • OrdinaryLeastSquaresRegressionModelParameters
  • FastForestClassificationModelParameters

Update:

  • To keep the class names of Trainers and ModelParameters consistent, we will not use the word "Classification" in either the Trainer class or the ModelParameter class.
  • We feel it's OK to drop the word "Classification" for 2 main reasons:

    • we have sufficient context to use just a suffix (BinaryTrainer, BinaryModelParameters, etc.) without the word "Classification"
    • adding the word "Classification" leads to long names

Here is a summary of the Trainer and ModelParameter class names:

| Updated Trainer Class Name |
| ------------------------ |
| FastTreeBinaryTrainer |
| FastTreeRegressionTrainer |
| FastTreeRankingTrainer |
| FastTreeTweedieTrainer |
| FastForestRegressionTrainer |
| FastForestBinaryTrainer |
| MatrixFactorizationTrainer |
| GamBinaryTrainer |
| GamRegressionTrainer |
| LogisticRegressionBinaryTrainer |
| LogisticRegressionMulticlassTrainer (see comment below by @wschin ) |
| AveragedPerceptronTrainer |
| OnlineGradientDescentTrainer |
| PoissonRegressionTrainer |
| KMeansTrainer |
| OlsTrainer |
| PriorTrainer |
| PairwiseCouplingTrainer |
| OneVersusAllTrainer |
| NaiveBayesMulticlassTrainer |
| SgdCalibratedTrainer |
| SgdNonCalibratedTrainer |
| FieldAwareFactorizationMachineTrainer |
| SymbolicSgdTrainer |
| LightGbmRegressionTrainer |
| LightGbmBinaryTrainer |
| LightGbmRankingTrainer |
| LightGbmMulticlassTrainer |
| LinearSvmTrainer |
| RandomizedPcaTrainer |
| SdcaCalibratedBinaryTrainer |
| SdcaNonCalibratedBinaryTrainer |
| SdcaMulticlassTrainer |
| SdcaRegressionTrainer |
| SdcaTrainerBase |
| LbfgsTrainerBase |
| GamTrainerBase |

| Updated ModelParameter Class Name |
| ------------------------------------------ |
| GamBinaryModelParameters |
| GamRegressionModelParameters |
| LightGbmBinaryModelParameters |
| LightGbmRankingModelParameters |
| LightGbmRegressionModelParameters |
| FastTreeBinaryModelParameters |
| FastTreeRegressionModelParameters |
| FastTreeRankingModelParameters |
| FastTreeTweedieModelParameters |
| FastForestBinaryModelParameters |
| FastForestRegressionModelParameters |
| NaiveBayesMulticlassModelParameters |
| OlsModelParameters |
| PcaModelParameters |
| FieldAwareFactorizationMachineModelParameters |
| KMeansModelParameters |
| PoissonRegressionModelParameters |
| MatrixFactorizationModelParameters |
| OneVersusAllModelParameters |
| PairwiseCouplingModelParameters |
| PriorModelParameters |
| LinearRegressionModelParameters |
| RegressionModelParameters |
| LinearBinaryModelParameters |
| CalibratedModelParametersBase |
| GamModelParametersBase |
| TreeEnsembleModelParameters |
| TreeEnsembleModelParametersBasedOnQuantileRegressionTree |
| TreeEnsembleModelParametersBasedOnRegressionTree |
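
As a rough illustration of the alignment goal, the convention can be checked mechanically: strip the "Trainer" and "ModelParameters" suffixes and compare what remains. The sketch below is not part of ML.NET. The helper names (`stem`, `split_by_alignment`) are hypothetical, and the sample names are a small subset taken from the tables in this thread. Trainers that intentionally share a model parameter type (for example the Sdca binary trainers) will show up as unaligned, which is expected.

```python
TRAINER_SUFFIX = "Trainer"
PARAMS_SUFFIX = "ModelParameters"


def stem(name, suffix):
    """Return name without the given suffix, or None if it lacks that suffix."""
    if name.endswith(suffix) and len(name) > len(suffix):
        return name[: -len(suffix)]
    return None


def split_by_alignment(trainers, params):
    """Split trainers into those with a same-stem ModelParameters type and the rest."""
    param_stems = {stem(p, PARAMS_SUFFIX) for p in params} - {None}
    aligned, unaligned = [], []
    for trainer in trainers:
        if stem(trainer, TRAINER_SUFFIX) in param_stems:
            aligned.append(trainer)
        else:
            unaligned.append(trainer)
    return aligned, unaligned


# Sample names copied from the tables in this thread (not the full lists).
trainers = [
    "FastTreeBinaryTrainer",
    "GamRegressionTrainer",
    "OlsTrainer",
    "KMeansTrainer",
    "RandomizedPcaTrainer",
    "SdcaCalibratedBinaryTrainer",
]
params = [
    "FastTreeBinaryModelParameters",
    "GamRegressionModelParameters",
    "OlsModelParameters",
    "KMeansModelParameters",
    "PcaModelParameters",
    "LinearBinaryModelParameters",
]

aligned, unaligned = split_by_alignment(trainers, params)
print("aligned:  ", aligned)
print("unaligned:", unaligned)
```

Names ending in "Base" (such as GamTrainerBase) do not end with either suffix, so this sketch treats them as unaligned rather than trying to pair them.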

For the multiclass LR trainer and its model, we will use different names in #2976. It looks like they don't need "Multiclass" in the name.

  • (rename) LogisticRegressionMulticlassClassificationTrainer ---> LbfgsMaximumEntropyTrainer
  • (rename) MulticlassLogisticRegressionModelParameters ---> MaximumEntropyModelParameters

We can NOT have LogisticRegressionMulticlass because LogisticRegression is binary classification only.
