If I understand correctly, when using DeepExplainer on multivariate time series classification data, the SHAP values should sum to the model's output value for that test example. If I sum these SHAP values over all features at each time-step and run the result through a bounded 0-1 function such as sigmoid, does this represent the probability of classification at each time-step, or is this an incorrect way to go about it?
The use case for this is to understand the importance of full time-steps rather than individual features at each time-step. I would appreciate all thoughts on this matter!
If you sum the SHAP values of all the features for a specific time slice, you get the importance of that time slice, but running that summed value through a sigmoid would be a fair bit of approximation. If you do, just make sure you include the base value (explainer.expected_value) as well before sending it through the function.
@slundberg Thanks for the insight -- by including the base value, do you mean subtracting the expected value from the time-step sum before sending through the function?
I meant adding explainer.expected_value to the sum. This is because SHAP values encode the difference from the expected value caused by the features.
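To make the arithmetic concrete, here is a minimal sketch with made-up numbers rather than real DeepExplainer output. It assumes the SHAP values for one example and one output have shape (timesteps, features), and that the explained output is a logit (pre-sigmoid); the function name timestep_importance and all the numbers are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def timestep_importance(shap_values, expected_value):
    """Aggregate SHAP values per time-step for a single example.

    shap_values: array of shape (timesteps, features), assumed layout.
    expected_value: scalar base value (e.g. explainer.expected_value
        for the output being explained).
    """
    # Sum over the feature axis: one contribution per time-step.
    per_step = shap_values.sum(axis=1)
    # Base value plus a single time-step's contribution, squashed to (0, 1).
    # This is an approximation, not a true per-time-step probability.
    per_step_score = sigmoid(expected_value + per_step)
    # Base value plus ALL contributions: by additivity this should match
    # the model output for the example (up to approximation error).
    reconstructed_output = expected_value + per_step.sum()
    return per_step, per_step_score, reconstructed_output

# Made-up SHAP values: 3 time-steps, 3 features.
shap_values = np.array([
    [0.2, -0.1, 0.4],
    [-0.3, 0.0, 0.1],
    [0.5, 0.5, -0.2],
])
expected_value = -0.4  # made-up base value

per_step, per_step_score, reconstructed_output = timestep_importance(
    shap_values, expected_value
)
print("per-time-step contribution:", per_step)
print("sigmoid(base + contribution):", per_step_score)
print("base + all contributions:", reconstructed_output)
```

Note that only the base value plus the contributions of all time-steps together recovers the model output; the per-time-step sigmoid values are a rough ranking aid. If the explained output is already a probability, the SHAP values are in probability units and the sigmoid step would not apply.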
thanks for the help!