Operating System:
MacOS
CPU/GPU model:
CPU
C++/Python/R version:
Python
LightGBM version or commit hash:
LightGBM==2.2.3
JSONDecodeError Traceback (most recent call last)
<ipython-input-45-9dde935b11fe> in <module>
2
3 shap.initjs()
----> 4 explainer = shap.TreeExplainer(model, xtest)
5 print("explained done!")
6 shap_values = explainer.shap_values(xtest)
~/env/lib/python3.7/site-packages/shap/explainers/tree.py in __init__(self, model, data, model_output, feature_dependence)
100 self.feature_dependence = feature_dependence
101 self.expected_value = None
--> 102 self.model = TreeEnsemble(model, self.data, self.data_missing)
103
104 assert feature_dependence in feature_dependence_codes, "Invalid feature_dependence option!"
~/env/lib/python3.7/site-packages/shap/explainers/tree.py in __init__(self, model, data, data_missing)
599 self.model_type = "lightgbm"
600 self.original_model = model
--> 601 tree_info = self.original_model.dump_model()["tree_info"]
602 try:
603 self.trees = [Tree(e, data=data, data_missing=data_missing) for e in tree_info]
~/env/lib/python3.7/site-packages/lightgbm/basic.py in dump_model(self, num_iteration, start_iteration)
2151 ctypes.byref(tmp_out_len),
2152 ptr_string_buffer))
-> 2153 ret = json.loads(string_buffer.value.decode())
2154 ret['pandas_categorical'] = json.loads(json.dumps(self.pandas_categorical,
2155 default=json_default_with_numpy))
/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
346 parse_int is None and parse_float is None and
347 parse_constant is None and object_pairs_hook is None and not kw):
--> 348 return _default_decoder.decode(s)
349 if cls is None:
350 cls = JSONDecoder
/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/decoder.py in decode(self, s, _w)
335
336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
338 end = _w(s, end).end()
339 if end != len(s):
/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/decoder.py in raw_decode(self, s, idx)
351 """
352 try:
--> 353 obj, end = self.scan_once(s, idx)
354 except StopIteration as err:
355 raise JSONDecodeError("Expecting value", s, err.value) from None
JSONDecodeError: Expecting ',' delimiter: line 9 column 110 (char 277)
My SHAP version is 0.30.1.
I have tried all of the approaches on the following issues:
https://github.com/slundberg/shap/issues/620#issue-450191481
https://github.com/microsoft/LightGBM/issues/1935#issue-395612381
Please try the latest master branch.
@guolinke I have just tried out the latest branch.
JSONDecodeError: Expecting ',' delimiter: line 9 column 110 (char 277)
The same error persists. Please let me know if you need more information.
@billydentsu Please provide any repro for creating your model, which you pass to shap.TreeExplainer. Or you can dump/pickle it and attach to the message here.
@StrikerRUS hi, thanks for the reply.
Here is the trained model:
ts_fresh_model_v1.txt
Please let me know if you have any other questions.
@billydentsu That file doesn't actually seem to be in the txt format. How did you produce it?
@StrikerRUS Sorry, I had used joblib for that one.
Here is the model.txt file produced by LightGBM:
ts_fresh_model_v1.txt
@StrikerRUS Maybe we can add a test that checks the JSON format?
@billydentsu I found the root cause:
you have the " symbol in your feature names, which is not supported by the JSON decoder.
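To illustrate why a " in a feature name would produce exactly this kind of error, here is a minimal sketch using only the standard library. It assumes the dumped model text places feature names between quotes without escaping them; the feature name and the `feature_names` key are made up for the example and are not taken from the attached model.

```python
import json

# Hypothetical feature name containing a double quote (illustration only).
feature_name = 'value__agg_"mean"'

# Naive serialization: the name is pasted between quotes without escaping,
# so the embedded quote terminates the JSON string early.
naive_json = '{"feature_names": ["' + feature_name + '"]}'

naive_error = None
try:
    json.loads(naive_json)
except json.JSONDecodeError as err:
    naive_error = err.msg
    print("naive dump failed:", naive_error)

# Escaping the name (here via json.dumps) makes the round trip succeed.
escaped_json = json.dumps({"feature_names": [feature_name]})
round_trip = json.loads(escaped_json)["feature_names"][0]
print("escaped dump round-trips:", round_trip == feature_name)
```

The decoder treats the text after the embedded quote as a new token, which is why it complains about a missing delimiter rather than about the quote itself.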
@StrikerRUS I think we should check feature_name and disallow the characters [],{}":
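Until such a check exists in the library, something like the following could be run on the Python side before training. This is only a sketch: `find_unsafe_feature_names` is a hypothetical helper, not a LightGBM function, and the character set is the one listed in the comment above.

```python
# Characters named in the comment above as unsafe for the JSON dump.
JSON_SPECIAL_CHARS = set('[],{}":')


def find_unsafe_feature_names(feature_names):
    """Return {name: sorted offending characters} for names that contain any."""
    unsafe = {}
    for name in feature_names:
        bad = sorted(set(name) & JSON_SPECIAL_CHARS)
        if bad:
            unsafe[name] = bad
    return unsafe


# Example with made-up column names.
report = find_unsafe_feature_names(["a", 'b"c', "d[0]"])
for name, chars in report.items():
    print(repr(name), "contains", chars)
```

With a pandas DataFrame, the column names could be passed in as `list(df.columns)` before the data is handed to the model.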
@guolinke Great! Now everything works. Thanks!
@guolinke
I think we should check the feature_name, to avoid the [],{}": characters.
Can that be done by the JSON library on the C++ side?
UPD: Maybe it can be done here, along with the non-ASCII check? https://github.com/microsoft/LightGBM/commit/0d59859c670b9de37bffa8a6e536497c88d9f25d
@guolinke Maybe we should combine it with
https://github.com/microsoft/LightGBM/blob/dc5840709909bc18df437cddf62677f7be2915a4/include/LightGBM/utils/common.h#L49-L56
That can only remove the leading and trailing quotes.
I don't think simply removing them is a good idea. We may need to replace them or just throw an error.
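The two options mentioned here (replace or throw) could look roughly like this on the user side. It is a sketch under assumptions, not the library's behavior: `sanitize_feature_names`, its `on_unsafe` parameter, and the underscore replacement are all hypothetical choices made for illustration.

```python
import re

# Matches any of the characters [ ] , { } " :
_UNSAFE = re.compile(r'[\[\],{}":]')


def sanitize_feature_names(feature_names, on_unsafe="replace", replacement="_"):
    """Replace unsafe characters in feature names, or raise if asked to.

    on_unsafe="replace" substitutes each unsafe character with `replacement`;
    on_unsafe="raise" raises ValueError on the first offending name.
    """
    if on_unsafe not in ("replace", "raise"):
        raise ValueError("on_unsafe must be 'replace' or 'raise'")
    cleaned = []
    for name in feature_names:
        if _UNSAFE.search(name):
            if on_unsafe == "raise":
                raise ValueError("unsafe character in feature name: %r" % name)
            name = _UNSAFE.sub(replacement, name)
        cleaned.append(name)
    # Replacement can make two different names collide; refuse silently merging them.
    if len(set(cleaned)) != len(cleaned):
        raise ValueError("sanitizing produced duplicate feature names")
    return cleaned


print(sanitize_feature_names(['a"b', "c"]))
```

Replacing keeps existing pipelines running but changes the names the user sees afterwards, while raising forces the user to pick the new names. The duplicate check guards against the replace mode mapping two distinct columns to the same name.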