Shap: what's the difference between feature_perturbation="interventional" and feature_perturbation="tree_path_dependent"

Created on 13 Mar 2020 · 4 Comments · Source: slundberg/shap

I tried feature_perturbation="interventional" with background data and feature_perturbation="tree_path_dependent" without background data in TreeExplainer. The first run was very slow and eventually failed with an error, but the second worked very well. What is the difference between these two settings, and how does SHAP work with each of them?

mrlittleyo

Most helpful comment

Finally, the error you are getting with the interventional approach has also been reported in issues #1096, #887, #941 and #903. I assume you also get the error message "Additivity check failed in TreeExplainer! Please report this on Github"?

Release 0.34.0 was supposed to fix the overly tight check_additivity tolerance in TreeExplainer (which I guess is why issues #887, #941 and #903 are closed), but I think there is still a problem. I have noticed that the additivity check fails only when feature_perturbation = 'interventional'; I get no error with feature_perturbation = 'tree_path_dependent'. Hopefully @slundberg will be able to help us out :)

LEMTideman on 16 Mar 2020

👍3

All 4 comments

The difference between feature_perturbation = 'interventional' and feature_perturbation = 'tree_path_dependent' is explained in detail in the Methods section of Lundberg's Nature Machine Intelligence paper, _From local explanations to global understanding with explainable AI for trees_.

Quoting Lundberg, “Shapley values are computed by introducing each feature, one at a time, into a conditional expectation function of the model’s output, f_x(S) = E[f(X) | do(X_S = x_S)], and attributing the change produced at each step to the feature that was introduced; then averaging this process over all feature orderings.” The subset X_S is the coalition of present features (i.e. we know what values these features take for data instance x). Lundberg follows Janzing in using Pearl's do-notation: E[f(X) | do(X_S = x_S)] is the expectation when we intervene to set X_S = x_S (rather than observe that X_S = x_S). If you want the background, Judea Pearl's work on causality covers the distinction between interventional and observational conditional probabilities.
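
To make the quoted procedure concrete, here is a minimal brute-force sketch in plain Python. It is not how TreeExplainer computes anything internally (that algorithm is far more efficient); it only spells out "introduce features one at a time, average over all orderings" with an interventional value function. The names `model`, `interventional_value`, `shapley_values` and the toy `background` rows are all made up for illustration.

```python
from itertools import permutations

def model(x):
    # Hypothetical stand-in for a fitted model: a tiny depth-2 decision rule.
    if x[0] < 0.5:
        return 1.0
    return 3.0 if x[1] < 0.5 else 7.0

def interventional_value(f, x, background, present):
    """Estimate E[f(X) | do(X_S = x_S)]: keep the features in `present`
    at x's values, fill the others from each background row, and average."""
    total = 0.0
    for row in background:
        hybrid = [x[i] if i in present else row[i] for i in range(len(x))]
        total += f(hybrid)
    return total / len(background)

def shapley_values(f, x, background):
    """Average each feature's marginal contribution over all orderings."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        present = set()
        prev = interventional_value(f, x, background, present)
        for i in order:
            present.add(i)
            cur = interventional_value(f, x, background, present)
            phi[i] += cur - prev  # credit the change to the feature just added
            prev = cur
    return [p / len(orderings) for p in phi]

# Toy background data (assumed, not from any real dataset).
background = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
x = [1.0, 1.0]

phi = shapley_values(model, x, background)
base = interventional_value(model, x, background, set())
```

With these toy numbers the attributions add up to `model(x) - base`, which is the additivity property that TreeExplainer's check_additivity verifies. The cost grows factorially with the number of features, so this is only useful for building intuition.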

The background dataset is used for marginalization. The complementary subset of features X_C is missing (i.e. we do not know what values these features take for data instance x). According to the documentation, the absence of a feature is simulated by replacing the feature with one of the values it takes in the background dataset. To use feature_perturbation = 'interventional', you must have access to background data (for example, the training dataset). You can use feature_perturbation = 'tree_path_dependent' when no background data is provided, because it infers the background distribution from the structure of the model: the tree-path dependent approach follows the decision trees and uses the number of training instances that ended up in each leaf to represent the background distribution. I guess that feature_perturbation = 'tree_path_dependent' is also useful when your aim is to understand the decision-making process of a model whose training data is unavailable (for example a proprietary black-box ML model).
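
As a rough illustration of how the tree-path dependent idea can work without background data, the sketch below walks a single toy tree and, whenever it hits a split on a "missing" feature, averages both branches weighted by the training counts stored in the tree. The nested-dict tree format, the `cover` field name and the function `path_dependent_value` are assumptions made for this example; they are not shap's internal representation.

```python
def path_dependent_value(node, x, present):
    """Expected tree output when only the features in `present` are known.

    Splits on known features follow x; splits on unknown features are
    averaged over both children, weighted by how many training instances
    went down each branch (the `cover` counts stored in the tree).
    """
    if "value" in node:  # leaf
        return node["value"]
    left, right = node["left"], node["right"]
    feature = node["feature"]
    if feature in present:
        child = left if x[feature] < node["threshold"] else right
        return path_dependent_value(child, x, present)
    w_left = left["cover"] / node["cover"]
    w_right = right["cover"] / node["cover"]
    return (w_left * path_dependent_value(left, x, present)
            + w_right * path_dependent_value(right, x, present))

# Toy tree (assumed): 10 training instances at the root, 6 go left, 4 go right.
tree = {
    "feature": 0, "threshold": 0.5, "cover": 10,
    "left": {"value": 1.0, "cover": 6},
    "right": {
        "feature": 1, "threshold": 0.5, "cover": 4,
        "left": {"value": 3.0, "cover": 1},
        "right": {"value": 7.0, "cover": 3},
    },
}

x = [1.0, 1.0]
nothing_known = path_dependent_value(tree, x, set())      # roughly 3.0
only_f0_known = path_dependent_value(tree, x, {0})        # 6.0
everything_known = path_dependent_value(tree, x, {0, 1})  # 7.0
```

Note that the only "data" used here are the counts already stored in the tree, which is why no separate background dataset is needed in this mode.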

As for the difference between the interventional and tree-path dependent approaches, I suggest you check out issue #882 and https://github.com/christophM/interpretable-ml-book/issues/142. One important difference is that the interventional approach treats the feature subsets X_S and X_C as independent. This is because the intervention on X_S effectively breaks its dependency on the complementary subset X_C. So we replace the conditional expectation with an unconditional (marginal) expectation over X_C, and this is justified from a causal perspective (check _Feature relevance quantification in explainable AI: a causal problem_ by Janzing).

Another important difference is the “true to the data” versus “true to the model” issue discussed in https://github.com/christophM/interpretable-ml-book/issues/142. If I understand correctly, the tree-path dependent approach is “true to the data” because it avoids basing the computation of SHAP values on unrealistic data instances. It achieves this by constraining the sampling of “unknown” features (those not belonging to the coalition under study) to a range of values (i.e. a partition of the feature space) allowed by the decision tree, effectively conditioning on the features that split prior nodes. However, you then get an explanation that risks not being “true to the model” because of feature correlation (and other higher-order dependencies): groups of correlated features may be assigned little importance because of their shared contribution to the predictive task; conversely, uninformative features may be assigned high importance, not because they are actually useful to the model but because they are highly correlated with another useful feature. On the other hand, the interventional approach is “true to the model” but not “true to the data”.
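
A small toy example may help with the "true to the model" versus "true to the data" distinction. Below, a hypothetical model ignores feature 1 entirely, but feature 1 is a perfect copy of feature 0 in the data. The `observational` function conditions exactly on matching rows; that is a stand-in for the general "true to the data" view and is not the tree-path dependent algorithm itself. All names and data here are invented for illustration.

```python
def model(x):
    # Hypothetical model that uses only feature 0 and ignores feature 1.
    return 2.0 * x[0]

# Toy data (assumed) in which feature 1 is a perfect copy of feature 0.
data = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]]

def interventional(f, x, data, present):
    # do(X_S = x_S): overwrite known features, keep the rest from each row.
    vals = [f([x[i] if i in present else row[i] for i in range(len(x))])
            for row in data]
    return sum(vals) / len(vals)

def observational(f, x, data, present):
    # E[f(X) | X_S = x_S]: average only over rows that agree with x on S.
    rows = [row for row in data if all(row[i] == x[i] for i in present)]
    return sum(f(row) for row in rows) / len(rows)

def shapley_two_features(value, f, x, data):
    # Exact Shapley values for two features (both orderings averaged).
    v = lambda s: value(f, x, data, set(s))
    phi0 = 0.5 * (v([0]) - v([])) + 0.5 * (v([0, 1]) - v([1]))
    phi1 = 0.5 * (v([1]) - v([])) + 0.5 * (v([0, 1]) - v([0]))
    return [phi0, phi1]

x = [1.0, 1.0]
phi_interventional = shapley_two_features(interventional, model, x, data)
phi_observational = shapley_two_features(observational, model, x, data)
```

In this toy setup the interventional attributions come out as `[1.0, 0.0]` (the unused feature gets nothing, matching what the model actually computes), while the observational ones come out as `[0.5, 0.5]` (the credit is shared with the correlated but unused feature).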

Edit August 2020: have a look at the paper entitled "True to the Model or True to the Data?" by Hugh Chen https://arxiv.org/abs/2006.16234

LEMTideman on 16 Mar 2020

❤3

LEMTideman on 16 Mar 2020

👍3

Just chiming in to say that I am also getting the

"shap.common.SHAPError: Additivity check failed in TreeExplainer!"

error when feature_perturbation = 'interventional' even in the latest release. I do not get it when feature_perturbation = 'tree_path_dependent'.

CanML on 19 May 2020

I have the opposite problem: I am calling TreeExplainer with the default feature_perturbation = 'tree_path_dependent' but with a different data set than the one the model was trained on.
Why is it a problem if the background data do not reach all leaves? The following error message also seems to suggest that the interventional mode does not require a background data set, contrary to the comments above?

_AssertionError: The background dataset you provided does not cover all the leaves in the model, so TreeExplainer cannot run with the feature_perturbation="tree_path_dependent" option! Try providing a larger background dataset, or using feature_perturbation="interventional"._

Thanks!
Markus