Machinelearning: Inventory of the Microsoft.ML public types: what to rename, move, hide there.

Created on 6 Feb 2019 · 8 Comments · Source: dotnet/machinelearning

This is the list of the types in the Microsoft.ML namespace, as seen from the docs site. Taking a first pass at what doesn't need to be there, what needs to be hidden, and what needs to be moved to a different namespace.

Microsoft.ML namespace type | Rename/Move/Hide |
--------------------------- | --------- |
BinaryClassificationCatalog | |
BinaryClassificationCatalog.BinaryClassificationTrainers| |
BinaryClassificationMetricsStatistics| |
BinaryLoaderSaverCatalog| |
CategoricalCatalog| |
ChannelMessage | move to Microsoft.ML.Data |
ChannelMessageKind| move to Microsoft.ML.Data |
ClusteringCatalog| |
ClusteringCatalog.ClusteringTrainers| |
ComponentCatalog| move to Microsoft.ML.Core |
ConversionsCatalog| |
ConversionsExtensionsCatalog| |
CustomMappingCatalog| |
DataOperationsCatalog| |
DataReaderExtensions| |
DebuggerExtensions| |
ExplainabilityCatalog | |
ExpLoss | move to Microsoft.ML.Trainers.Loss |
ExpLoss.Arguments | move to Microsoft.ML.Trainers.Loss |
ExtensionsCatalog | |
FactorizationMachineExtensions | |
FeatureSelectionCatalog | |
HalLearnersCatalog | |
HingeLoss | move to Microsoft.ML.Trainers.Loss |
HingeLoss.Arguments | move to Microsoft.ML.Trainers.Loss |
IChannel | move to Microsoft.ML.Data |
IChannelProvider | move to Microsoft.ML.Data |
IClassificationLoss | move to Microsoft.ML.Trainers.Loss |
IComponentFactory | move to Microsoft.ML.Data |
IComponentFactory | move to Microsoft.ML.Data |
IComponentFactory | move to Microsoft.ML.Data |
IComponentFactory | move to Microsoft.ML.Data |
IComponentFactory | move to Microsoft.ML.Data |
IExceptionContext | move to Microsoft.ML.Core |
IFileHandle | move to Microsoft.ML.Data |
IHost | move to Microsoft.ML.Data |
IHostEnvironment | move to Microsoft.ML.Data |
ILossFunction | move to Microsoft.ML.Trainers.Loss |
ImageEstimatorsCatalog | |
IParameterValue | move to Microsoft.ML.Sweeper|
IParameterValue | move to Microsoft.ML.Sweeper|
IPipe | move to Microsoft.ML.Data |
IPredictionTransformer | |
IPredictor | |
IPredictorProducing | |
IProgressChannel | move to Microsoft.ML.Data |
IProgressChannelProvider | move to Microsoft.ML.Data |
IProgressEntry | move to Microsoft.ML.Data |
IRegressionLoss | move to Microsoft.ML.Trainers.Loss |
IRunResult | move to Microsoft.ML.Sweeper |
IRunResult | move to Microsoft.ML.Sweeper |
IScalarOutputLoss | move to Microsoft.ML.Trainers.Loss |
ISingleFeaturePredictionTransformer | |
ISupportClassificationLossFactory | move to Microsoft.ML.Trainers.Loss|
ISupportRegressionLossFactory | move to Microsoft.ML.Trainers.Loss|
ISupportSdcaClassificationLoss | move to Microsoft.ML.Trainers.Loss|
ISupportSdcaClassificationLossFactory | move to Microsoft.ML.Trainers.Loss|
ISupportSdcaRegressionLossFactory | move to Microsoft.ML.Trainers.Loss|
ISweeper | move to Microsoft.ML.Sweeper|
ISweepResultEvaluator | move to Microsoft.ML.Sweeper|
IValueGenerator | move to Microsoft.ML.Sweeper|
KMeansClusteringExtensions | |
LearningPipelineExtensions | |
LightGbmExtensions | |
LoggingEventArgs | |
LogLoss | move to Microsoft.ML.Trainers.Loss |
LogLossFactory | move to Microsoft.ML.Trainers.Loss |
MessageSensitivity | move to Microsoft.ML.Data |
MetricsStatisticsBase | move to Microsoft.ML.Data |
MetricStatistics | move to Microsoft.ML.Data |
MLContext | |
ModelOperationsCatalog | |
ModelOperationsCatalog.ExplainabilityTransforms | |
ModelOperationsCatalog.SubCatalogBase | |
MulticlassClassificationCatalog | |
MulticlassClassificationCatalog.MulticlassClassificationTrainers | |
MultiClassClassifierMetricsStatistics | move to Microsoft.ML.Data |
NormalizerCatalog | |
OnnxCatalog | |
OnnxExportExtensions | |
ParameterSet | move to Microsoft.ML.Sweeper |
PcaCatalog | |
PermutationFeatureImportanceExtensions | move to Microsoft.ML.Data |
PoissonLoss | move to Microsoft.ML.Trainers.Loss |
PoissonLossFactory | move to Microsoft.ML.Trainers.Loss |
PredictionEngine | |
PredictionEngineBase | |
PredictionEngineExtensions | |
PredictionKind | |
ProgressHeader | move to Microsoft.ML.Data |
ProjectionCatalog | |
QuantileStatistics | move to Microsoft.ML.Data |
RankerMetricsStatistics | move to Microsoft.ML.Data |
RankingCatalog | |
RankingCatalog.RankingTrainers | |
RecommendationCatalog | |
RecommendationCatalog.RecommendationTrainers | |
RecommenderCatalog | |
RegressionCatalog | |
RegressionCatalog.RegressionTrainers | |
RegressionMetricsStatistics | move to Microsoft.ML.Data |
RunMetric | move to Microsoft.ML.Sweeper |
RunResult | move to Microsoft.ML.Sweeper |
SignatureClassificationLoss | move to Microsoft.ML.Trainers.Loss |
SignatureRegressionLoss | move to Microsoft.ML.Trainers.Loss |
SignatureSuggestedSweepsParser | move to Microsoft.ML.Sweeper |
SignatureSweeper | move to Microsoft.ML.Sweeper |
SignatureSweepResultEvaluator | move to Microsoft.ML.Sweeper |
SimpleFileHandle | move to Microsoft.ML.Data |
SmoothedHingeLoss | move to Microsoft.ML.Trainers.Loss |
SmoothedHingeLoss.Arguments | move to Microsoft.ML.Trainers.Loss |
SquaredLoss | move to Microsoft.ML.Trainers.Loss |
SquaredLossFactory | move to Microsoft.ML.Trainers.Loss |
StandardLearnersCatalog | |
TensorflowCatalog | |
TextCatalog | |
TextLoaderSaverCatalog | |
TrainCatalogBase | |
TrainCatalogBase.CatalogInstantiatorBase | hide |
TrainerInfo | |
TransformExtensionsCatalog | |
TransformsCatalog | |
TransformsCatalog.CategoricalTransforms | |
TransformsCatalog.ConversionTransforms | |
TransformsCatalog.FeatureSelectionTransforms | |
TransformsCatalog.ProjectionTransforms | |
TransformsCatalog.SubCatalogBase | |
TransformsCatalog.TextTransforms | |
TreeExtensions | |
TweedieLoss | move to Microsoft.ML.Trainers.Loss |
TweedieLoss.Arguments | move to Microsoft.ML.Trainers.Loss |

cc @yaeldekel @TomFinley @glebuk see if any of my suggestions need to change.
For the cases marked with move, where should they live?
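To see how the proposed moves are distributed, the table above can be tallied by destination namespace. The following is only a minimal sketch in Python, assuming the table is available as markdown text in the same two-column layout; the function name `tally_moves` and the handful of sample rows (copied from the table) are illustrative and not part of the ML.NET repository.

```python
from collections import Counter


def tally_moves(table_text):
    """Count proposed actions in a two-column markdown table.

    Rows marked "move to X" are counted under X, rows marked "hide" are
    counted under "hide". Header, separator and unannotated rows are skipped.
    """
    counts = Counter()
    for line in table_text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) < 2:
            continue  # blank line or a line without a column separator
        action = cells[1]
        if action.startswith("move to "):
            counts[action[len("move to "):]] += 1
        elif action == "hide":
            counts["hide"] += 1
    return counts


# A few rows copied from the table above, header included.
sample = """\
Microsoft.ML namespace type | Rename/Move/Hide |
--------------------------- | --------- |
ChannelMessage | move to Microsoft.ML.Data |
ComponentCatalog| move to Microsoft.ML.Core |
ExpLoss | move to Microsoft.ML.Trainers.Loss |
HingeLoss | move to Microsoft.ML.Trainers.Loss |
ISweeper | move to Microsoft.ML.Sweeper|
MLContext | |
TrainCatalogBase.CatalogInstantiatorBase | hide |
"""

for destination, count in tally_moves(sample).most_common():
    print(f"{destination}: {count}")
```

Running the same function over the full table gives a quick view of which destination namespaces would receive the most types, which may help when deciding whether a dedicated namespace is justified.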

Labels: API, documentation

All 8 comments

Reading through your list, I notice two main themes among your "moves": losses, and things relating to IHostEnvironment.

Losses seem most relevant to trainers (e.g., they are often moved there), so if not in ML.NET they should live somewhere there, I'd expect? Maybe Microsoft.ML.Trainers, since the concept of loss is so general? But they're also useful in evaluation, so we might want to consider the possibility that loss is so fundamental a concept that it deserves its own special namespace, though I hardly insist on this. Another note about losses: ISupportSdcaClassificationLossFactory and suchlike ought to be hidden, and the public API should not expose them. See #1973.

You did not mention moving IHost. Anything related to IHostEnvironment (including ProgressHeader, ChannelMessage, ComponentCatalog, and other such things) should live in some special namespace that users won't be bothered with. (I am somewhat indifferent as to the choice of name; just something users won't use.) This would also include, I suppose, any other types that have to be public for "reasons" but that are nonetheless only useful to component authors. Those definitely shouldn't be in people's faces.

Also why do we have the sweeper DLL part of Microsoft.ML nuget? 😄 That's not right.

Also also: this is necessarily a breaking change to API, so why is it filed under documentation and not under project 13 and API? I hope you don't mind, going to add those labels, since those strike me as more appropriate.

One thing I noticed is that IHostEnvironment and its friends are still in the main Microsoft.ML namespace. While MLContext must be in Microsoft.ML, IHostEnvironment should be somewhere else, somewhere that user code not involved in component authoring is unlikely to run into it. I'm somewhat indifferent as to where.

> Also why do we have the sweeper DLL part of Microsoft.ML nuget?

It is no longer after #2690.

I might prefer that IHostEnvironment and its friends not be in any namespace that a typical user would stumble upon. This seems to disqualify not only Microsoft.ML but also Microsoft.ML.Data. @eerhardt jokingly suggested Microsoft.ML.Runtime, but might we want to make his joke real? That's not such a problematic namespace really, considering that it deals with component authorship. (I don't insist on that name, I just think it should be a namespace that only component authors would tend to be in.)

Otherwise at first glance things look pretty good. Some of these will become internalized and hidden w.r.t. #1973 and related downstream work being done by @ganik.

Closed with #2885
