A trained normalizer does not have public methods or properties that allows access to the normalization parameters. For example, a minmax normalizer doesn't have the min, max, or zero-level available.
I would expect that the trained transformer would have a .Model property, similar to a trained learner.
This is required V1 Scenario (#2498):
@shauheen @eerhardt @TomFinley I marked this as P13, but it could be a V1.1 as it does not change APIs, but needs new APIs. This is similar to LDA (#2197), and I think we have a more global problem where trained transforms don't expose a .Model property similar to our learners.
I think it is actually possible to extract the normalizer parameters, but it is not well exposed and requires some casting. The fact that this issue was initially posted highlights that it is quite confusing.
The NormalizingTransformer has a Columns field which exposes ColumnOptions.
These are not the usual ColumnOptions, we actually have different ones that are used as the arguments of the estimator. These ColumnOptions are meant to expose the the NormalizerParameters and have the following field:
As you will notice NormalizerModelParametersBase needs to be casted to the correct sealed class: AffineNormalizerModelParameters, CdfNormalizerModelParameters or BinNormalizerModelParameters.
Once that is done it's possible to access the parameters.
I found a sample that uses the property:
https://github.com/dotnet/machinelearning/blob/master/docs/samples/Microsoft.ML.Samples/Dynamic/Normalizer.cs
Wait, is the issue that we inappropriately named something ColumnOptions, even though it is not actually an Options object, per se? That is, #2709 got a little too happy with its renamings, even if 95% of the renamings might have been correct?
I think something like this that has info should not be named options.
This suggests @shauheen that this belongs in project 13.
@artidoro Thanks for investigating this!
This is a bit confusing. I can't imagine that we really want to do something like:
var modelParams = transformer.Columns
.First(x => x.Name == "Induced")
.ModelParameters as NormalizingTransformer.AffineNormalizerModelParameters<float>;
I do realize that we have to get the parameters out per column, but the searching and casting is a bit awkward. Can we have something like:
var normalizerParameters = normalizer.Columns["MyColumn"].Model;
Where the type for normalizerParameters is given automatically? (Note that the Columns["MyColumn"] is also not possible.)
I originally made a recommendation not to put this in Project13 because I didn't know about the Columns option. I agree with @TomFinley that this needs a breaking change (although I think we still need to talk it through). I'll add Project 13 back on.
Where the type for
normalizerParametersis given automatically? (Note that theColumns["MyColumn"]is also not possible.)
Without having separate transform types for each sort, this is not possible. So I have a normalizer transform, one column normalized via CDF, the other via affine, the other some other way like binning or whatever. The type is impossible. You must cast, the way it is designed.
If it had been designed that a single application of the estimator could only normalize a single way, and there were different ITransformer implementing classes for each of those ways, then what you say could be possible, but that's not how this thing was designed. Not saying that's good or bad (I actually would prefer it in this strongly typed way), but it's not a simple fix, at all. Otherwise we would have done that already as we already did with, say, things like calibrated vs. uncalibrated predictors.
So, we are doomed to somewhat weak typing here, sorry to say.
I can at least rename ColumnOptions to something more intuitive.
How does ColumnNormalizationInfos sound?
Actually, I don't have a problem with Columns here. It's more that you have to search it for a name and that you have to do a manual cast. I'd suggest
normalizer.Columns["MyColumn"]ModelParameters classes. .AffineNormalizerModelParameters<float> might be a bit much? It's not that clear that that's the class you want to cast to from the name.Hi @rogancarr, since I misinterpreted your words, perhaps you could provide more detail. What you said is this.
var normalizerParameters = normalizer.Columns["MyColumn"].Model;
According to my best interpretation now, I believe now that you are under the impression that it is somehow possible for, at compile time, the type of normalizerParameters to be known. Is my interpretation of your belief correct? I did not think this was what you were saying originally, but if you are I can provide a different answer.
Well, I am simply describing what I want to have happen. I realize that we can't know the type because it's an option to the learner we haven't trained yet. That said, I don't want to have to cast :) I think manual casting is a huge barrier for someone getting started, especially when it's not clear what the cast needs to be.
So maybe what I want is helper extensions for each type of normalization.
we should probably change the title of the issue, since the parameters are actually publicly available.
As point by @TomFinley above, we have weak typing when it comes to getting the normalization parameters.
For v1.0 the expectation is that users will :
GetNormalizerModelParameters API to get the NormalizerModelParametersBaseAffineNormalizerModelParameters, BinNormalizerModelParameters etc to get the parameters@rogancarr . It seems we would have to re-design this transform to enable the example above (of what you would have wanted)
Columns property in the public surfaceColumns is an ImmutableArray . We would have to enable support to get Column by name.Seems this is something we should probably move to backlog or perhaps close ? It certainly looks out of scope for Project 13
@abgoswam Let's move this to the backlog. Let's see how it lands with v1.0 and see if we want to make improvements for v1.1.
I think the solution for v1.1 is to make lots of explicit samples showing the casting.