LightGBM component: Python package
Operating System: Fedora 29
CPU/GPU model: Intel(R) Xeon(R) Gold 5120 CPU
C++ compiler version: g++ (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2)
CMake version: 3.14.5
Java version: OpenJDK Runtime Environment (build 1.8.0_232-b09)
Python version: Python 3.6.10 -- on conda
R version: N/A
Other:
LightGBM version or commit hash: 2.3.2 / 2e2757f183a17c2998c1002531fbbf27d0010783
When using the auc_mu metric with a large dataset, the validation score varies wildly outside the range [0, 1].
Reported validation metrics on cv were floating around -27, or 4.5 in some cases (at least for my data).
Running the example described below, this is the first output:
[1] cv_agg's auc_mu: 14.9488 + 0.0245611
It only takes two lines (plus imports):
import numpy as np
import lightgbm as lgb
data = lgb.Dataset(np.random.randn(3000000, 15), np.random.randint(0, 9, 3000000))
lgb.cv({'objective': 'multiclass', 'num_class': 9, 'metric': 'auc_mu'}, data, verbose_eval=True)
Interestingly, running the same code with 10x less data gives the expected result of ~0.5 AUC:
data = lgb.Dataset(np.random.randn(300000, 15), np.random.randint(0, 9, 300000))
lgb.cv({'objective': 'multiclass', 'num_class': 9, 'metric': 'auc_mu'}, data, verbose_eval=True)
yields [1] cv_agg's auc_mu: 0.499315 + 0.000975809
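One hypothesis for why the bug only appears at the larger size (this is my speculation, not a confirmed diagnosis): auc_mu compares scores between every pair of classes, and with 3,000,000 roughly balanced samples over 9 classes, the number of pairwise comparisons between two classes exceeds the signed 32-bit integer range, so a count accumulated in a 32-bit int would overflow and push the metric outside [0, 1]. A quick back-of-the-envelope check:

```python
# Rough overflow check (assumption: pairwise counts drive the metric's denominator).
n_samples = 3_000_000
n_classes = 9
per_class = n_samples // n_classes       # ~333,333 samples per class
pairs = per_class * per_class            # comparisons for one class pair
print(pairs, pairs > 2**31 - 1)          # ~1.1e11, which overflows int32

# At the smaller size the count stays in range, matching the observed behavior.
small_pairs = (300_000 // n_classes) ** 2
print(small_pairs, small_pairs > 2**31 - 1)
```

This would be consistent with the 300,000-row run producing a sane value while the 3,000,000-row run does not.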
@btrotta Can you please take a look?
@thadeuluiz Thanks for reporting this and providing a reproducible example. I'll look into it.