Hi Scott,
I have a question regarding the usage of the package in the following scenario:
What is the correct way to accomplish this?
I can think of two ways:
1.
explainer = shap.TreeExplainer(deployed_model)
shap_values = explainer.shap_values(X_new_sample)
or
2.
explainer = shap.TreeExplainer(deployed_model)
shap_values = explainer.shap_values(X_training_samples_and_new_sample)
Thanks in advance,
Alex
It seems like the only relevant difference is what you do with 'shap_values' afterwards. If you inspect or plot only the row for the new sample, you are accomplishing your second use. If you plot the whole array, you are looking at the population (say, the feature importance or the dependence plot) for the model, plus one single new sample.
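To make that point concrete, here is a minimal sketch using only the standard library. It is not shap's TreeExplainer algorithm: it is a brute-force Shapley computation on a hypothetical toy tree (`toy_tree`), with a made-up `background` set and made-up helper names (`shapley_row`, `shapley_batch`). It only illustrates that, when each row is explained independently, the values for the new sample come out the same whether it is explained alone or appended to a larger batch.

```python
from itertools import combinations
from math import factorial


def toy_tree(x):
    """Hypothetical stand-in for a deployed tree model (two splits, three leaves)."""
    if x[0] < 0.5:
        return 1.0
    return 3.0 if x[1] < 2.0 else 7.0


def shapley_row(model, x, background):
    """Exact (brute-force) Shapley values for one row against a fixed background."""
    n = len(x)

    def value(subset):
        # Features in `subset` are taken from x, the rest from each background row.
        total = 0.0
        for b in background:
            z = [x[i] if i in subset else b[i] for i in range(n)]
            total += model(z)
        return total / len(background)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contrib = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                contrib += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(contrib)
    return phi


def shapley_batch(model, rows, background):
    """Explain each row on its own; other rows in the batch play no role."""
    return [shapley_row(model, r, background) for r in rows]


# Assumed toy data, purely for illustration.
background = [[0.2, 1.0], [0.8, 1.5], [0.9, 3.0]]
x_new = [0.7, 2.5]

# Approach 1: explain the new sample alone.
alone = shapley_batch(toy_tree, [x_new], background)[0]
# Approach 2: explain it together with other rows, then pick out its row.
together = shapley_batch(toy_tree, background + [x_new], background)[-1]

base_value = sum(toy_tree(b) for b in background) / len(background)
print(alone == together)
```

In this sketch the last row of the batch result matches the single-row result, and the values plus `base_value` add up to the model output for `x_new`. The batch only matters for population-level views such as summary or dependence plots.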
As @BrianMiner mentioned, if you are just explaining a single new prediction, you should probably use the first approach (1). Then take the SHAP values for that prediction and use them however you like (for example, by passing them to shap.force_plot).