Shap: NaN SHAP values in the XGBoost implementation

Created on 12 Jul 2018 · 17 Comments · Source: slundberg/shap

Can you please review https://github.com/dmlc/xgboost/issues/3333 ?
When unique_path[i].pweight == 0, did you mean for a contribution of 0 to be added?

todo

All 17 comments

Thanks for pointing that out. I'll try and fix it soon.

I just ran into this issue. Just confirming: is it safe to replace the NaNs with 0?

@sergeyf Unfortunately not. The only option right now is to fall back to the approximate Saabas method (approximate=True). Other deadlines have prevented me from tackling this yet...
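
The thread does not spell out why zero-filling is unsafe. One plausible reason is sketched below, assuming the usual SHAP local-accuracy property (the base value plus the per-feature contributions equals the model's raw output). All numbers and the helper names `additivity_gap` and `zero_fill` are hypothetical and exist only for illustration; they are not part of shap or XGBoost.

```python
import math


def additivity_gap(contribs, bias, margin):
    """Absolute difference between (bias + sum of contributions) and the raw model output."""
    return abs(bias + sum(contribs) - margin)


def zero_fill(contribs):
    """Replace NaN contributions with 0.0 (the workaround asked about in the thread)."""
    return [0.0 if math.isnan(c) else c for c in contribs]


# Hypothetical values for a single observation, for illustration only.
bias = 0.5                                  # base (expected) value
margin = 1.7                                # raw model output for this row
true_contribs = [0.9, -0.2, 0.5]            # what a correct computation might return
buggy_contribs = [0.9, float("nan"), 0.5]   # same row with one NaN contribution

gap_true = additivity_gap(true_contribs, bias, margin)
gap_filled = additivity_gap(zero_fill(buggy_contribs), bias, margin)

print(f"gap with correct values: {gap_true:.3f}")
print(f"gap after zero-filling:  {gap_filled:.3f}")
```

In this made-up case the feature whose value came back as NaN actually contributed -0.2, so writing 0 in its place leaves the explanation short of the model output by that amount. A check like this on real output (contributions plus base value against the raw prediction) is one way to see whether a given workaround preserves the explanation.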

Gotcha, thank you.

I ran into this issue as well, but adding approximate=True still resulted in some NaN values.

@ddearauj if you have a small sharable example of that it would be helpful.

@slundberg I will try to anonymize my dataset. What is the best way to send this?

If you don't want it online you can just email it to me directly (email is on my papers). I am away for the next two weeks, but I'll see if I can reproduce it if you share an example.

@slundberg My lab at Mount Sinai is hoping to use SHAP with XGBoost for variable selection and interpretation in a new paper, but this bug is turning out to be a bit of a showstopper. We end up with models where up to 92% of observations have NaN for the SHAP value of at least one feature. I'm looking into fixing the bug using the hints at dmlc/xgboost#3333. Is there anything more I should know before diving into XGBoost's SHAP code? I know R and Python well and I'm halfway-competent in C++ (I know C better), but I haven't worked on R–C++ or Python–C++ interfaces before. My lab mostly uses R.
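
A figure like the one quoted above (the share of observations with a NaN SHAP value for at least one feature) can be measured with a few lines of plain Python. This is a minimal sketch that assumes the SHAP values are available as a list of rows, one list of floats per observation; the function name `nan_row_fraction` and the sample data are hypothetical.

```python
import math


def nan_row_fraction(shap_rows):
    """Fraction of rows that contain at least one NaN SHAP value."""
    if not shap_rows:
        return 0.0
    affected = sum(1 for row in shap_rows if any(math.isnan(v) for v in row))
    return affected / len(shap_rows)


# Made-up SHAP matrix: 4 observations x 2 features.
nan = float("nan")
example_rows = [
    [0.1, nan],
    [0.2, 0.3],
    [nan, nan],
    [0.0, 0.1],
]

fraction = nan_row_fraction(example_rows)
print(f"{fraction:.0%} of observations have at least one NaN SHAP value")
```

Running the same count before and after switching XGBoost builds is a simple way to confirm whether the problem has gone away for a particular model.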

Hi, is there any update/fix on this issue? Kind of stuck in my progress. Can anyone explain why 0 is not a suitable replacement for the NaN values?

A fix was already merged into the latest XGBoost master. Not sure if it is in a released version yet or not.

For those looking at this issue, XGBoost 0.81 does not include the fix, but it will presumably be available in the next release.

Thanks, I was able to use the fix through the XGBoost master available here.

Hi astrixg,
I am also facing the same problem and am looking for a solution. Could you please let me know the steps you followed to fix this issue?

Thanks in advance.

Here are the steps:

https://xgboost.readthedocs.io/en/latest/build.html

First you build the library and then install the package.

Great! I am going to close this now :)
