Hi all,
I have been doing a deep dive into some of my models to better understand feature relevance. My results from running feature explanatory analysis for binary classification are as follows:
2020-01-08 11:34:03.813 +00:00 [INF] BinaryFastTreeParameters
2020-01-08 11:34:03.815 +00:00 [INF] Bias: 0
2020-01-08 11:34:03.816 +00:00 [INF] Feature Weights:
2020-01-08 11:34:03.843 +00:00 [INF] Feature: CloseWeight: 0.1089412
2020-01-08 11:34:03.931 +00:00 [INF] Feature: OpenWeight: 0.3691619
2020-01-08 11:34:03.932 +00:00 [INF] Feature: HighWeight: 0.06676193
2020-01-08 11:34:03.933 +00:00 [INF] Feature: LowWeight: 0.1926264
2020-01-08 11:34:03.934 +00:00 [INF] Feature: STO_FastStochWeight: 0.19846
2020-01-08 11:34:03.938 +00:00 [INF] Feature: STO_StochKWeight: 0.5019926
2020-01-08 11:34:03.941 +00:00 [INF] Feature: STO_StochDWeight: 0.3781931
2020-01-08 11:34:03.942 +00:00 [INF] Feature: STOWeight: 0
2020-01-08 11:34:03.943 +00:00 [INF] Feature: CCI_TypicalPriceAvgWeight: 0.131141
2020-01-08 11:34:03.944 +00:00 [INF] Feature: CCI_TypicalPriceMADWeight: 0.1299266
2020-01-08 11:34:03.946 +00:00 [INF] Feature: CCIWeight: 1
2020-01-08 11:34:03.947 +00:00 [INF] Feature: RSIDownWeight: 0.4761779
2020-01-08 11:34:03.948 +00:00 [INF] Feature: RSIUpWeight: 0.1249975
2020-01-08 11:34:03.951 +00:00 [INF] Feature: RSIWeight: 0.2877662
2020-01-08 11:34:03.952 +00:00 [INF] Feature: MOMWeight: 0.1822069
2020-01-08 11:34:03.953 +00:00 [INF] Feature: ADX_PositiveDirectionalIndexWeight: 0.2435836
2020-01-08 11:34:03.954 +00:00 [INF] Feature: ADX_NegativeDirectionalIndexWeight: 0.4263106
2020-01-08 11:34:03.955 +00:00 [INF] Feature: ADXWeight: 0.1899773
2020-01-08 11:34:03.956 +00:00 [INF] Feature: CMOWeight: 0.2601428
But for PFI I have the following:
2020-01-08 11:34:09.369 +00:00 [INF] Calculating Binary Classification Feature PFI
2020-01-08 11:34:09.371 +00:00 [INF] Feature PFI for learner:BinaryFastTree
2020-01-08 11:34:09.383 +00:00 [INF] Close| 0.000000
2020-01-08 11:34:09.384 +00:00 [INF] Open| 0.000000
2020-01-08 11:34:09.385 +00:00 [INF] High| 0.000000
2020-01-08 11:34:09.386 +00:00 [INF] Low| 0.000000
2020-01-08 11:34:09.391 +00:00 [INF] STO_FastStoch| 0.000000
2020-01-08 11:34:09.400 +00:00 [INF] STO_StochK| 0.000000
2020-01-08 11:34:09.401 +00:00 [INF] STO_StochD| 0.000000
2020-01-08 11:34:09.402 +00:00 [INF] STO| 0.000000
2020-01-08 11:34:09.404 +00:00 [INF] CCI_TypicalPriceAvg| 0.000000
2020-01-08 11:34:09.406 +00:00 [INF] CCI_TypicalPriceMAD| 0.000113
2020-01-08 11:34:09.408 +00:00 [INF] CCI| 0.000000
2020-01-08 11:34:09.414 +00:00 [INF] RSIDown| 0.000221
2020-01-08 11:34:09.416 +00:00 [INF] RSIUp| 0.000000
2020-01-08 11:34:09.431 +00:00 [INF] RSI| 0.000000
2020-01-08 11:34:09.443 +00:00 [INF] MOM| -0.003003
2020-01-08 11:34:09.457 +00:00 [INF] ADX_PositiveDirectionalIndex| 0.000000
2020-01-08 11:34:09.467 +00:00 [INF] ADX_NegativeDirectionalIndex| 0.000000
2020-01-08 11:34:09.470 +00:00 [INF] ADX| 0.000000
2020-01-08 11:34:09.479 +00:00 [INF] CMO| 0.000000
My question is essentially: what, if anything, should I read into the zero values for PFI? The evaluation score is also worth noting:
2020-01-08 11:34:17.135 +00:00 [INF] Score: -4.640871
2020-01-08 11:34:17.138 +00:00 [INF] Probability: 0.1351293
I would appreciate any thoughts you may have on using such information to improve the model's quality.
Thank you
Fig
Can you please share the code you used to print those values? I'd like to check a couple of things.
Pleasure and thank you for your help!
The logging functions:
private void LogModelWeights(LinearBinaryModelParameters subModel, string name)
{
    var weights = subModel.Weights.ToList();

    // Log the model parameters.
    Logger.Info(name + "Parameters");
    Logger.Info("Bias: " + subModel.Bias);
    Logger.Info("Feature Weights:");

    // 1. Feature weights
    for (int i = 0; i < features.Length; i++)
    {
        contributions[i].Weight = weights[i];
        // The contribution will be assigned by the prediction engine,
        // using CalculateFeatureContribution (below).
        contributions[i].Contribution = 0;
        Logger.Info("    Feature: " + contributions[i].Name + " Weight: " + contributions[i].Weight);
    }
}
private void LogPermutationMetics(IDataView transformedData,
    ImmutableArray<BinaryClassificationMetricsStatistics> permutationMetrics)
{
    var allFeatureNames = GetColumnNamesUsedForPFI(transformedData);
    var mapFields = new List<string>();

    for (int i = 0; i < allFeatureNames.Count(); i++)
    {
        var slotField = new VBuffer<ReadOnlyMemory<char>>();
        if (transformedData.Schema[allFeatureNames[i]].HasSlotNames())
        {
            transformedData.Schema[allFeatureNames[i]].GetSlotNames(ref slotField);
            for (int j = 0; j < slotField.Length; j++)
            {
                mapFields.Add(allFeatureNames[i]);
            }
        }
        else
        {
            mapFields.Add(allFeatureNames[i]);
        }
    }

    // Now let's look at which features are most important to the model overall.
    // Get the feature indices sorted by their impact on AUC. The importance, i.e.
    // the absolute average decrease in AUC calculated by
    // PermutationFeatureImportance, can then be ordered from most important to
    // least important.
    var sortedIndices = permutationMetrics
        .Select((metrics, index) => new { index, metrics.AreaUnderRocCurve })
        .OrderByDescending(feature => Math.Abs(feature.AreaUnderRocCurve.Mean));

    Console.WriteLine("Feature indices sorted by their impact on AUC:");
    foreach (var feature in sortedIndices)
    {
        Console.WriteLine($"{mapFields[feature.index],-20}|\t{Math.Abs(feature.AreaUnderRocCurve.Mean):F6}");
    }

    Console.WriteLine("PFI AUC logged as the following:");
    // Combine metrics with feature names and format for display.
    for (int i = 0; i < permutationMetrics.Length; i++)
    {
        Logger.Info($"{importances[i].Name}|\t{permutationMetrics[i].AreaUnderRocCurve.Mean:F6}");
        importances[i].AUC = permutationMetrics[i].AreaUnderRocCurve.Mean;
    }
}
Hi @lefig - can you share the code that generates the objects passed to these logging functions?
LinearBinaryModelParameters subModel
IDataView transformedData
ImmutableArray<BinaryClassificationMetricsStatistics> permutationMetrics
Please also share code for any data processing and model training.
PFI values of 0 for features mean that permuting those features' values did not change the AreaUnderRocCurve much. This is not the same as the weight learned by the model being 0: a feature can have a non-zero weight that is not statistically significant, and you can still end up with a PFI of 0 for it.
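To make the weight-vs-PFI distinction concrete, here is a toy sketch (not ML.NET, and all names and numbers are invented for illustration): a fixed linear "model" carries a clearly non-zero weight on a noise feature, yet permuting that feature barely moves the accuracy metric, so its PFI comes out near zero.

```csharp
using System;
using System.Linq;

class PfiToySketch
{
    // Accuracy of a fixed linear scorer: predict 1 when w0*x0 + w1*x1 > 0.
    static double Accuracy(double[] x0, double[] x1, int[] y, double w0, double w1)
    {
        int correct = 0;
        for (int i = 0; i < y.Length; i++)
            if (((w0 * x0[i] + w1 * x1[i] > 0) ? 1 : 0) == y[i]) correct++;
        return (double)correct / y.Length;
    }

    public static double ComputeToyPfi()
    {
        var rng = new Random(42);
        int n = 2000;
        var x0 = new double[n]; var x1 = new double[n]; var y = new int[n];
        for (int i = 0; i < n; i++)
        {
            x0[i] = rng.NextDouble() * 2 - 1; // informative feature
            x1[i] = rng.NextDouble() * 0.01;  // near-constant noise feature
            y[i] = x0[i] > 0 ? 1 : 0;         // label depends only on x0
        }

        // A "model" with a clearly non-zero weight (0.5) on the noise feature.
        double baseline = Accuracy(x0, x1, y, w0: 1.0, w1: 0.5);

        // PFI for x1: average metric drop over several permutations of x1.
        double drop = 0;
        int permutations = 10;
        for (int p = 0; p < permutations; p++)
        {
            var x1Perm = x1.OrderBy(_ => rng.Next()).ToArray();
            drop += baseline - Accuracy(x0, x1Perm, y, w0: 1.0, w1: 0.5);
        }
        return drop / permutations;
    }

    static void Main()
    {
        // Near zero, even though the weight on x1 is 0.5.
        Console.WriteLine($"PFI(x1) ~ {ComputeToyPfi():F6}");
    }
}
```

Averaging over several permutations is also why a very small permutationCount can make PFI estimates noisy or degenerate.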
Note that PFI value is just one indicator of feature importance, not a conclusive statement of feature importance. That said, so many features having PFI of 0 warrants some further investigation. Here are a few reasons I can think of that can possibly explain this.
1. The permutationCount used for calculating PFI is 1 (or a small number). Please double check that the value of this argument is something reasonable (try something like 10 or 30).
2. The change in AreaUnderRocCurve isn't very large when a feature is permuted. What is the actual AreaUnderRocCurve of this model evaluated on the training and test data? An AreaUnderRocCurve of ~0.5 or ~0.6 would indicate a particularly poor model, which you would expect to be about as poor when a feature is permuted, hence no change in AreaUnderRocCurve.
3. Are you calculating the ImmutableArray<BinaryClassificationMetricsStatistics> permutationMetrics on a very small dataset? That could give rise to 0 change in AreaUnderRocCurve.

Hi @najeeb-kazmi
Thank you for your kind help. The code that generates the metrics is as follows (this is an example of one such learner that requires a calibrator).
private void CalculateGamCalibratedClassificationPermutationFeatureImportance(MLContext mlContext,
    IDataView transformedData, ITransformer trainedModel, string learner)
{
    // Extract the trainer (last transformer in the model).
    var singleTrainerModel = trainedModel as BinaryPredictionTransformer<
        CalibratedModelParametersBase<GamBinaryModelParameters, PlattCalibrator>>;

    // Calculate permutation feature importance.
    ImmutableArray<BinaryClassificationMetricsStatistics> permutationMetrics =
        mlContext.BinaryClassification.PermutationFeatureImportance(
            predictionTransformer: singleTrainerModel,
            data: transformedData,
            labelColumnName: "Label",
            numberOfExamplesToUse: 100,
            permutationCount: 50);

    Logger.Info("Calculating Binary Classification Feature PFI");
    Logger.Info("Feature PFI for learner:" + learner);
    LogPermutationMetics(transformedData, permutationMetrics);
}
I tend to think (your point 2) that the model is poor and needs some features removed. Hence I was hoping to gain some insight into which features those are, so that I can proceed with changing the model.
Best wishes
Fig
@lefig
What is the AUC of this model?
Maybe using only 100 rows is the reason you are not seeing non-zero PFI. Try using the entire dataset.
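A sketch of what that could look like, reusing the mlContext, singleTrainerModel, and transformedData names from your snippet above (I'm assuming numberOfExamplesToUse can simply be omitted so that PFI runs over every row; please verify against your ML.NET version):

```csharp
// Sketch: same call as before, but without numberOfExamplesToUse (use all rows)
// and with a larger permutationCount to average out noise in the metric deltas.
ImmutableArray<BinaryClassificationMetricsStatistics> permutationMetrics =
    mlContext.BinaryClassification.PermutationFeatureImportance(
        predictionTransformer: singleTrainerModel,
        data: transformedData,
        labelColumnName: "Label",
        permutationCount: 30);
```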
@lefig any update on this and the information I requested? Also, did any of my suggestions help in debugging this?
I'm curious to see why this is happening, as it is quite unusual. As I mentioned, it's not clear which model is giving you 0 PFI, the GAM or the linear one. It would be nice to see a reproducible example so I can debug this (a small snippet of the data and the actual code for training the model and calculating PFI).
Hi @najeeb-kazmi
I really appreciate your time and help with this. Please let me generate some further test data and I will get back to you.
@lefig if this is still an issue, please feel free to reopen.