Machinelearning: Namespace reorg for the public surface

Created on 27 Feb 2019  ·  5 Comments  ·  Source: dotnet/machinelearning

Related to #2326. Feedback on the ML.NET public API surface recommended the following changes:

1 - Everything under Microsoft.ML.Trainers.* should move into Microsoft.ML.Trainers, except FastTree.
2 - Everything under Microsoft.ML.Transforms.* should move into Microsoft.ML.Transforms, with the exception of Text and Images.

cc @Ivanidzo4ka, @eerhardt, @TomFinley

All 5 comments

There is more to do on this:
remove:
Microsoft.ML.Transforms.TimeSeries
Microsoft.ML.Transforms.FeatureSelection
Microsoft.ML.Transforms.TensorFlow
Microsoft.ML.Trainers.Recommender

Well, the issue is about the Microsoft.ML package, and none of these namespaces are in that package.

Proposal for the final namespace landscape of the public surface:

  • Microsoft.ML.ImageAnalytics - contains both the ImageLoader and the other image transforms.
    The ImageLoader gets moved into ML.Data, where TextLoader is. The other image transforms get placed into Microsoft.ML.Transforms.Image (consistent with Microsoft.ML.Transforms.Text).

  • Microsoft.ML.LightGBM -> changes to Microsoft.ML.Trainers.LightGBM*

  • Microsoft.ML.Trainers.Recommender -> stays as is, if we follow the pattern that external trainer code wrapped by ML.NET gets its own namespace (e.g. LightGBM).

  • Microsoft.ML.Transforms.FeatureSelection -> moves to Microsoft.ML.Transforms
    There will be no Microsoft.ML.Transforms.FeatureSelection.

  • There are only 3 public types inside Microsoft.ML.Model.
    Move IHaveFeatureWeights and ModelParametersBase
    into Microsoft.ML.Trainers (or should they go under Microsoft.ML, together with ICanSaveModel?)

The proposal above is my synthesis of the discussions we have had in the past about trainers.
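
As a rough illustration, the moves proposed above can be written down as a lookup table. This is only a sketch in Python, not anything that exists in ML.NET: `NAMESPACE_MOVES`, `TYPE_MOVES`, and `proposed_name` are names made up for this example, `HypotheticalTrainer` is a placeholder type, and putting the two Microsoft.ML.Model types under Microsoft.ML.Trainers assumes the first of the two options raised for them.

```python
# Whole-namespace moves from the proposal (old namespace -> new namespace).
NAMESPACE_MOVES = {
    "Microsoft.ML.LightGBM": "Microsoft.ML.Trainers.LightGBM",
    "Microsoft.ML.Transforms.FeatureSelection": "Microsoft.ML.Transforms",
    # Image transforms other than ImageLoader.
    "Microsoft.ML.ImageAnalytics": "Microsoft.ML.Transforms.Image",
}

# Individual types that move differently from the rest of their namespace.
TYPE_MOVES = {
    "Microsoft.ML.ImageAnalytics.ImageLoader": "Microsoft.ML.Data.ImageLoader",
    # Assumption: the Microsoft.ML.Trainers option is chosen for these two.
    "Microsoft.ML.Model.IHaveFeatureWeights": "Microsoft.ML.Trainers.IHaveFeatureWeights",
    "Microsoft.ML.Model.ModelParametersBase": "Microsoft.ML.Trainers.ModelParametersBase",
}


def proposed_name(full_name: str) -> str:
    """Return the fully qualified name a type would have under the proposal."""
    # Type-level exceptions take priority over namespace-level moves.
    if full_name in TYPE_MOVES:
        return TYPE_MOVES[full_name]
    namespace, _, type_name = full_name.rpartition(".")
    # Namespaces not listed in the table stay where they are.
    new_namespace = NAMESPACE_MOVES.get(namespace, namespace)
    return f"{new_namespace}.{type_name}" if new_namespace else type_name


if __name__ == "__main__":
    for name in (
        "Microsoft.ML.ImageAnalytics.ImageLoader",
        "Microsoft.ML.LightGBM.HypotheticalTrainer",
        "Microsoft.ML.Trainers.Recommender.HypotheticalTrainer",
    ):
        print(name, "->", proposed_name(name))
```

Namespaces that are absent from the table, such as Microsoft.ML.Trainers.Recommender, come back unchanged, which matches the "stays as is" entries.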

My personal opinion is that we should not have Microsoft.ML.Trainers.FastTree and Microsoft.ML.Trainers.LightGBM. We can fold them both into Microsoft.ML.Trainers.Trees and call the linear ones Microsoft.ML.Trainers.Linear (ensembles can go into Microsoft.ML.Trainers.Ensembles when the time comes).

move to Microsoft.ML.Transforms.Onnx.


The other namespaces that are not changing are:

  • Microsoft.ML
  • Microsoft.ML.Calibrators
  • Microsoft.ML.Data
  • Microsoft.ML.EntryPoints
  • Microsoft.ML.Model
  • Microsoft.ML.StaticPipe
  • Microsoft.ML.Trainers
  • Microsoft.ML.Trainers.FastTree
  • Microsoft.ML.Transforms
  • Microsoft.ML.Transforms.TensorFlow
  • Microsoft.ML.Transforms.Text
  • Microsoft.ML.Transforms.TimeSeries

@Ivanidzo4ka @TomFinley @eerhardt @wschin @shauheen @CESARDELATORRE

I am not sure whether Microsoft.ML.Trainers.Linear is overcooked. If we do Microsoft.ML.Trainers.Linear, shall we also do Microsoft.ML.Trainers.KernelMachines and Microsoft.ML.Trainers.FactorizationMachines? My feeling is that machine learning things are essentially ambiguous, and we often end up with super long names to make them formally defined. Could all trainers stay in ML.Trainers unless there is a special reason to separate one?

I agree. I also feel that external libraries like LightGBM (and TensorFlow) already create a natural separation by their name alone.
