LightGBM: bug/segfault when using add_features_from with somewhat sparse data

Created on 12 Nov 2020 · 3 Comments · Source: microsoft/LightGBM

How are you using LightGBM?

LightGBM component: Python package

Environment info

Operating System: Ubuntu 16.04.6 LTS

CPU/GPU model: Intel(R) Core(TM) i7-3840QM CPU @ 2.80GHz

C++ compiler version: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609

CMake version: 3.5.1

Python version: 3.6.3

Comments: also tried on other machines running Debian with newer Python, gcc, and CMake

LightGBM version or commit hash: I tried commit 1bc27939a43c414c3424f339994d0aa11f3aa3b1 and 3.0.0 from PyPI

Error message and / or logs

With commit 1bc27939a43c414c3424f339994d0aa11f3aa3b1:

```
/home/ilya/anaconda3/lib/python3.6/site-packages/lightgbm/basic.py:1971: UserWarning: Cannot add features from NoneType type of raw data to NoneType type of raw data.
Set free_raw_data=False when construct Dataset to avoid this
  warnings.warn(err_msg)
/home/ilya/anaconda3/lib/python3.6/site-packages/lightgbm/basic.py:1973: UserWarning: Reseting categorical features.
You can set new categorical features via ``set_categorical_feature`` method
  warnings.warn("Reseting categorical features.\n"
[LightGBM] [Debug] Dataset::GetMultiBinFromSparseFeatures: sparse rate 0.800036
[LightGBM] [Info] Total Bins 10200
[LightGBM] [Info] Number of data points in the train set: 1000000, number of used features: 40
Segmentation fault (core dumped)
```


With 3.0.0 from PyPI:

```
[LightGBM] [Fatal] Bug. There should be only one multi-val group.
Traceback (most recent call last):
  File "lgb_test.py", line 28, in <module>
    m = lgb.train(pars, datasets[0], num_boost_round=3)
  File "/home/ilya/anaconda3/lib/python3.6/site-packages/lightgbm/engine.py", line 231, in train
    booster = Booster(params=params, train_set=train_set)
  File "/home/ilya/anaconda3/lib/python3.6/site-packages/lightgbm/basic.py", line 1991, in __init__
    ctypes.byref(self.handle)))
  File "/home/ilya/anaconda3/lib/python3.6/site-packages/lightgbm/basic.py", line 55, in _safe_call
    raise LightGBMError(decode_string(_LIB.LGBM_GetLastError()))
lightgbm.basic.LightGBMError: Bug. There should be only one multi-val group.
```

Reproducible example(s)

import lightgbm as lgb
import numpy as np

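r = np.random.RandomState(42)
ROWS = 1000000
COLUMNS = 20
N = 2  # number of feature blocks to merge
tags = 'ab'  # one feature-name prefix per block

# Random blocks with roughly 80% of the entries zeroed out
data = [r.rand(ROWS, COLUMNS) for _ in range(N)]
for i in range(N):
    data[i][data[i] < 0.8] = 0
label = r.rand(ROWS)

def construct(i):
    # Build and construct one Dataset per block (raw data is freed by default)
    return lgb.Dataset(data[i], feature_name=[tags[i] + str(n) for n in range(COLUMNS)], label=label).construct()

datasets = [construct(i) for i in range(N)]

# Append the features of every other block to the first Dataset
for i in range(1, N):
    datasets[0].add_features_from(datasets[i])

# On 3.0.0 the merged feature names are set by hand
if lgb.__version__ == '3.0.0':
    datasets[0].feature_name = sum((d.feature_name for d in datasets), [])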

pars = {'verbosity': 2, 'seed': 42, 'force_col_wise': True}

m = lgb.train(pars, datasets[0], num_boost_round=3)

Steps to reproduce

python <above code>

Thoughts and partial workaround

On the PyPI version (3.0.0), setting pars['force_row_wise'] = True instead of pars['force_col_wise'] makes training work, but the tip of master still segfaults. My guess is that in 3.0.0 some bin appears more than once?
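The partial workaround above can be sketched as a small parameter patch. `with_row_wise` is a hypothetical helper name, not part of LightGBM; only `force_row_wise` and `force_col_wise` are real LightGBM parameters. As reported, this helped only on 3.0.0 from PyPI, not on the tip of master.

```python
def with_row_wise(params):
    """Return a copy of params that asks for row-wise histogram building.

    Hypothetical helper for illustration; it is not a LightGBM function.
    """
    patched = dict(params)  # leave the caller's dict untouched
    # Assumption: the two force_* options should not both be set to True,
    # so the column-wise flag is dropped before the row-wise one is added.
    patched.pop("force_col_wise", None)
    patched["force_row_wise"] = True
    return patched


pars = {"verbosity": 2, "seed": 42, "force_col_wise": True}
row_wise_pars = with_row_wise(pars)
print(row_wise_pars)
```

The patched dict would then replace `pars` in the reproduction script, as in `lgb.train(row_wise_pars, datasets[0], num_boost_round=3)`.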

Separately, the warnings on the tip of master don't make sense to me: freeing raw data seems perfectly reasonable, so why is this check there?
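For reference, the warning text itself suggests passing `free_raw_data=False` when building each Dataset. A minimal sketch of that change follows; `dataset_kwargs` is a hypothetical helper name, and whether keeping the raw data has any effect on the crash is not established in this report.

```python
def dataset_kwargs(tag, columns, keep_raw=True):
    """Keyword arguments for lgb.Dataset for one feature block.

    Hypothetical helper for illustration; it is not a LightGBM function.
    """
    return {
        # Prefixed names such as a0, a1, ... matching the reproduction script
        "feature_name": [f"{tag}{n}" for n in range(columns)],
        # free_raw_data=False keeps the raw matrix attached to the Dataset,
        # which is what the UserWarning asks for before add_features_from
        "free_raw_data": not keep_raw,
    }


print(dataset_kwargs("a", 3))
```

In the reproduction script, `construct(i)` would then become `lgb.Dataset(data[i], label=label, **dataset_kwargs(tags[i], COLUMNS)).construct()`. Keeping the raw data means both input matrices stay in memory alongside the constructed Datasets.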

Many thanks for any help!

bug

All 3 comments

@shiyu1994 since you are changing the related code, can you also fix this in your PR (or after it is merged)?

I'll open another PR for this.

@shiyu1994 any progress on the fix?
