LightGBM: bug/segfault when using add_features_from and somewhat sparse data

Created on 12 Nov 2020 · 3 Comments · Source: microsoft/LightGBM

How are you using LightGBM?

LightGBM component: Python package

Environment info

Operating System: Ubuntu 16.04.6 LTS

CPU/GPU model: Intel(R) Core(TM) i7-3840QM CPU @ 2.80GHz

C++ compiler version: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609

CMake version: 3.5.1

Python version: 3.6.3

Comments: also tried on other machines running Debian with newer Python, gcc, and CMake

LightGBM version or commit hash: I tried commit 1bc27939a43c414c3424f339994d0aa11f3aa3b1 and 3.0.0 from PyPI

Error message and / or logs

with commit 1bc27939a43c414c3424f339994d0aa11f3aa3b1:

```
/home/ilya/anaconda3/lib/python3.6/site-packages/lightgbm/basic.py:1971: UserWarning: Cannot add features from NoneType type of raw data to NoneType type of raw data.
Set free_raw_data=False when construct Dataset to avoid this
  warnings.warn(err_msg)
/home/ilya/anaconda3/lib/python3.6/site-packages/lightgbm/basic.py:1973: UserWarning: Reseting categorical features.
You can set new categorical features via ``set_categorical_feature`` method
  warnings.warn("Reseting categorical features.\n"
[LightGBM] [Debug] Dataset::GetMultiBinFromSparseFeatures: sparse rate 0.800036
[LightGBM] [Info] Total Bins 10200
[LightGBM] [Info] Number of data points in the train set: 1000000, number of used features: 40
Segmentation fault (core dumped)
```


with 3.0.0 from PyPI:
```
[LightGBM] [Fatal] Bug. There should be only one multi-val group.
Traceback (most recent call last):
  File "lgb_test.py", line 28, in <module>
    m = lgb.train(pars, datasets[0], num_boost_round=3)
  File "/home/ilya/anaconda3/lib/python3.6/site-packages/lightgbm/engine.py", line 231, in train
    booster = Booster(params=params, train_set=train_set)
  File "/home/ilya/anaconda3/lib/python3.6/site-packages/lightgbm/basic.py", line 1991, in __init__
    ctypes.byref(self.handle)))
  File "/home/ilya/anaconda3/lib/python3.6/site-packages/lightgbm/basic.py", line 55, in _safe_call
    raise LightGBMError(decode_string(_LIB.LGBM_GetLastError()))
lightgbm.basic.LightGBMError: Bug. There should be only one multi-val group.
```

Reproducible example(s)

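```python
import lightgbm as lgb
import numpy as np

r = np.random.RandomState(42)
ROWS = 1000000
COLUMNS = 20
N = 2          # number of datasets to build and merge
tags = 'ab'    # one feature-name prefix per dataset

# Dense random matrices with about 80% of the entries zeroed out.
data = [r.rand(ROWS, COLUMNS) for _ in range(N)]
for i in range(N):
    data[i][data[i] < 0.8] = 0
label = r.rand(ROWS)

def construct(i):
    # Feature names are a0..a19 for the first dataset and b0..b19 for the second.
    return lgb.Dataset(data[i], feature_name=[tags[i] + str(n) for n in range(COLUMNS)], label=label).construct()

datasets = [construct(i) for i in range(N)]

# Merge the features of every other dataset into the first one.
for i in range(1, N):
    datasets[0].add_features_from(datasets[i])

# On 3.0.0 the merged feature names are set by hand.
if lgb.__version__ == '3.0.0':
    datasets[0].feature_name = sum((d.feature_name for d in datasets), [])

pars = {'verbosity': 2, 'seed': 42, 'force_col_wise': True}

# Training is where the error (3.0.0) or the segfault (master) occurs.
m = lgb.train(pars, datasets[0], num_boost_round=3)
```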

Steps to reproduce

python <above code>

Thoughts and partial workaround

With the PyPI version (3.0.0), setting pars['force_row_wise'] = True instead of pars['force_col_wise'] = True makes training work, but the master tip still segfaults. My guess is that in 3.0.0 some bin appears more than once?
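To make the parameter swap concrete, here is a minimal sketch of the two parameter sets. `make_params` is a hypothetical helper written for this report, not part of LightGBM. It sets only one of the two layout flags, since LightGBM rejects both being true at once.

```python
def make_params(row_wise):
    """Return the reproducer's training parameters with exactly one layout flag set.

    `make_params` is a hypothetical helper for illustration, not a LightGBM API.
    """
    params = {'verbosity': 2, 'seed': 42}
    # Set only one of the two flags: LightGBM does not accept both as True.
    flag = 'force_row_wise' if row_wise else 'force_col_wise'
    params[flag] = True
    return params


failing = make_params(row_wise=False)    # same parameters as `pars` in the reproducer
workaround = make_params(row_wise=True)  # the variant that trains on 3.0.0
print(workaround)
```

Passing `workaround` instead of `pars` to `lgb.train(..., datasets[0], num_boost_round=3)` is the change that avoids the error on 3.0.0. It does not help on the master tip.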

Separately, the warnings in master tip don't make sense to me: I think it is perfectly reasonable to free raw data, so why this check?

Many thanks for any help!

bug
