This is an interesting topic and may further accelerate the training speed of LightGBM.
@chivee @xuehui1991 Please investigate this.
yes, we still need to investigate the bottleneck of GPU computing (or gather some statistics about it), such as moving data over PCIe vs. computing gradients. However, the multiprocessor feature will definitely help accelerate the computation.
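To make the transfer-vs-compute question concrete, here is a back-of-envelope sketch (not a measurement; the bandwidth, throughput, and dataset figures are all illustrative assumptions) comparing the time to push a binned feature matrix over PCIe against the time to accumulate histograms on the GPU:

```python
# Rough cost model for GPU histogram construction (all figures are
# hypothetical assumptions, not benchmarks of LightGBM itself).

def transfer_seconds(num_rows, num_features, bytes_per_value=1.0,
                     pcie_gbps=8.0):
    """Time to move the binned feature matrix over PCIe: bytes / bandwidth."""
    total_bytes = num_rows * num_features * bytes_per_value
    return total_bytes / (pcie_gbps * 1e9)

def compute_seconds(num_rows, num_features, flops_per_cell=4.0,
                    gpu_gflops=500.0):
    """Time to accumulate gradient/hessian histograms on the GPU."""
    total_flops = num_rows * num_features * flops_per_cell
    return total_flops / (gpu_gflops * 1e9)

rows, feats = 10_000_000, 100  # hypothetical dataset size
t_xfer = transfer_seconds(rows, feats)
t_comp = compute_seconds(rows, feats)
print(f"transfer ~{t_xfer:.3f}s, compute ~{t_comp:.3f}s")
```

Under these assumed numbers the PCIe transfer dominates, which is why keeping the data resident on the GPU across iterations (rather than re-uploading it) matters so much.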
according to benchmarks in the xgboost project, the GPU can speed up tree building several times (i7-6700K vs. Titan X (Pascal)), which is impressive..
https://github.com/dmlc/xgboost/pull/1848/commits/f0c61666a7041f4159d966b9ade8fa27e44eb786
https://github.com/huanzhang12/lightgbm-gpu/issues/1, a community version of GPU support
@huanzhang12, glad to see you have a plan to create a PR back. We can have the discussion here and work together to enable GPU training.
@chivee Thanks! I am working on finalizing my GPU code. One thing I need to do right now is to update my branch to the upstream LightGBM master. Since there are a lot of changes in v2 (commit 4f77bd28), I need to make sure I don't break anything. I will let you guys know if I have any questions or concerns. Thanks!
@huanzhang12 Sure, feel free to ask if you need any help.
@guolinke @chivee Some quick updates: I have mostly finished upgrading my GPU code to v2: https://github.com/huanzhang12/lightgbm-gpu/tree/v2. I am currently working on implementing Dense4bitsBin on the GPU (which is new in v2) and doing more testing. I hope I can get the code ready for review and merge by the end of this week.
@huanzhang12 awesome, looking forward to your contribution.