Problem:
Pool label doesn't work
As I understand from the docs:
https://catboost.ai/docs/concepts/python-reference_pool.html
the label parameter should set the labels for the dataset, but the labels I pass are ignored.
catboost version:
0.15.2
Operating System:
Mac Mojave 10.14.5 (18F203)
2.6 GHz Intel Core i7
Intel UHD Graphics 630 1536 MB
sandbox:
https://repl.it/repls/PassionateKeyScale
from catboost import Pool
# pool-label-test.csv data
# L,a,b,c,d
# 0,3,5,7,9
# 1,4,6,8,10
pool = Pool(data='pool-label-test.csv', label=['zero', 'one'], delimiter=',', has_header=True)
print(pool.get_label())
# ['0', '1']
Thank you
If data is a file, then the labels must be in that file. If you want to pass labels separately, you have to pass a matrix as data.
We'll point this out in the docs and raise an exception in this case. Thanks for reporting this!
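For readers who hit the same problem, here is a minimal sketch of the matrix route described above. It assumes the two-row pool-label-test.csv from the report; read_features and build_pool are hypothetical helper names, not part of catboost. The file is parsed with the standard csv module, the L column is dropped, and the labels are passed separately.

```python
import csv


def read_features(path, label_column="L"):
    """Read a headered CSV and return (feature_names, rows) without label_column.

    Hypothetical helper for illustration; assumes every remaining column is numeric.
    """
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        names = [n for n in reader.fieldnames if n != label_column]
        rows = [[float(row[n]) for n in names] for row in reader]
    return names, rows


def build_pool(path, labels):
    """Build a Pool from an in-memory matrix so that `label` is actually used."""
    # Imported here so read_features can be used without catboost installed.
    from catboost import Pool

    names, rows = read_features(path)
    return Pool(data=rows, label=labels, feature_names=names)


if __name__ == "__main__":
    pool = build_pool("pool-label-test.csv", ["zero", "one"])
    print(pool.get_label())
```

A pandas DataFrame would work as data in the same way; the csv module is used here only to keep the sketch dependency-free.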
@annaveronika wouldn't it be better to be able to pass a column name to the label parameter?
e.g. when the label is not in the first column of the CSV file
Currently you either use a file that contains everything, or you use matrices and arrays. That is usually enough for everyone. It would be possible to allow files for some parts of the data and lists and matrices for other parts. We don't plan to implement this in the near future, but we might do it at some point.
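For the "file with everything" route where the label is not in the first column, a column description file passed through the column_description parameter of Pool can mark which column holds the label. The sketch below assumes the tab-separated "index, type" layout and the Label type name from the CatBoost docs; the helper names and the pool-label-test.cd path are hypothetical, and it is worth checking the docs for the installed version.

```python
def write_column_description(cd_path, label_index):
    """Write a one-line column description marking the label column.

    Each line is "<zero-based column index><TAB><column type>". Columns that
    are not listed are assumed to be treated as numeric features.
    """
    with open(cd_path, "w") as f:
        f.write(f"{label_index}\tLabel\n")
    return cd_path


def build_pool_from_file(data_path, cd_path):
    """Load data and labels from the same CSV file, guided by the description file."""
    # Imported here so write_column_description can be used without catboost installed.
    from catboost import Pool

    return Pool(
        data=data_path,
        column_description=cd_path,
        delimiter=",",
        has_header=True,
    )


if __name__ == "__main__":
    # Assumption: the label sits in the third column (index 2) of the CSV.
    cd = write_column_description("pool-label-test.cd", label_index=2)
    pool = build_pool_from_file("pool-label-test.csv", cd)
    print(pool.get_label())
```

This still selects the column by index, not by name, so it does not cover the request above to pass a column name.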
@annaveronika got you :)