Do you have a plan to support Java, like xgboost4j?
I need it too for one of our highly important Microsoft customers...
predict4j does not work with the current version (it fails because default_values is missing in the 2.0.7 model).
PMML conversion does not work either; it fails on a double -> int cast somewhere in the model.
The Spark version will be released in mmlspark (https://github.com/Azure/mmlspark) soon.
ping @rmhasan
The PMML export seems to be broken for the newer version. Could you help bring it back?
Exporting LightGBM to PMML, and then scoring using a Java PMML engine should count as a viable option in the meantime.
I've updated my JPMML-LightGBM exporter library to be fully compatible with the latest LightGBM v2.0.7 (including the handling of categorical/binary features and missing values). Better yet, it provides some custom functionality such as limiting the number of trees (similar to the num_iteration parameter of LightGBM's Scikit-Learn API), and compacting individual trees.
Tree compaction involves 1) expanding LightGBM-style binary splits into PMML-style multi-way splits, 2) eliminating half of the terminal nodes (aka leaves) and 3) eliminating redundant tree splitting predicates. It leads to a >50% reduction in PMML file size.
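To make step 3 a bit more concrete, here is a toy Java sketch of what eliminating a redundant split predicate can look like. It is not the JPMML-LightGBM code; the class `RedundantSplitPruning`, its `Node` type, and the `prune` method are hypothetical names made up for this illustration. It assumes purely numeric splits of the form `x[feature] <= threshold` and ignores missing values and categorical features. The idea: while walking down the tree, track the interval each feature is already confined to by the ancestors; if a split's outcome is fully determined by that interval, replace the split with the branch that is always taken.

```java
import java.util.Arrays;

public class RedundantSplitPruning {

    /** Toy tree node: a leaf (left == null) or a split "x[feature] <= threshold". */
    static final class Node {
        final int feature;
        final double threshold;
        final double value;   // leaf value; unused for split nodes
        final Node left;      // taken when x[feature] <= threshold
        final Node right;     // taken otherwise

        Node(double value) {
            this(-1, Double.NaN, value, null, null);
        }

        Node(int feature, double threshold, Node left, Node right) {
            this(feature, threshold, 0.0, left, right);
        }

        private Node(int feature, double threshold, double value, Node left, Node right) {
            this.feature = feature;
            this.threshold = threshold;
            this.value = value;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null;
        }
    }

    /** Removes splits whose outcome is already implied by the ancestor splits. */
    static Node prune(Node root, int numFeatures) {
        double[] lower = new double[numFeatures];
        double[] upper = new double[numFeatures];
        Arrays.fill(lower, Double.NEGATIVE_INFINITY);
        Arrays.fill(upper, Double.POSITIVE_INFINITY);
        return prune(root, lower, upper);
    }

    // Invariant: on the current path, lower[f] < x[f] <= upper[f] for every feature f.
    private static Node prune(Node node, double[] lower, double[] upper) {
        if (node.isLeaf()) {
            return node;
        }
        int f = node.feature;
        double t = node.threshold;
        if (upper[f] <= t) {
            return prune(node.left, lower, upper);   // test is always true here
        }
        if (lower[f] >= t) {
            return prune(node.right, lower, upper);  // test is always false here
        }
        double savedUpper = upper[f];
        upper[f] = t;                                // left branch: x[f] <= t
        Node left = prune(node.left, lower, upper);
        upper[f] = savedUpper;

        double savedLower = lower[f];
        lower[f] = t;                                // right branch: x[f] > t
        Node right = prune(node.right, lower, upper);
        lower[f] = savedLower;

        return new Node(f, t, left, right);
    }

    static double predict(Node node, double[] x) {
        while (!node.isLeaf()) {
            node = x[node.feature] <= node.threshold ? node.left : node.right;
        }
        return node.value;
    }

    static int countSplits(Node node) {
        return node.isLeaf() ? 0 : 1 + countSplits(node.left) + countSplits(node.right);
    }

    public static void main(String[] args) {
        // x0 <= 5 ? (x0 <= 10 ? 1.0 : 2.0) : 3.0  (the inner test can never be false)
        Node tree = new Node(0, 5.0,
                new Node(0, 10.0, new Node(1.0), new Node(2.0)),
                new Node(3.0));
        Node pruned = prune(tree, 1);
        System.out.println(countSplits(tree) + " splits before, " + countSplits(pruned) + " after");
    }
}
```

In the example tree the inner `x0 <= 10` test sits under `x0 <= 5`, so it is always true and gets dropped, while predictions stay the same. A real exporter would also have to account for missing-value handling and categorical splits, which this sketch leaves out.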
@vladimir-vilinski @liulhdarks @vruusmann I've checked in the code to generate SWIG Java wrappers to the LightGBM repo.
To build, you just need to run:
mkdir build ; cd build
cmake -DUSE_SWIG=ON ..
make -j4
The jar file is also available in maven central:
https://repo.maven.apache.org/maven2/com/microsoft/ml/lightgbm/lightgbmlib/
You can import it with sbt via:
"com.microsoft.ml.lightgbm" % "lightgbmlib" % "2.0.120"
I also have a PR open to add LightGBM to MMLSpark, a package for the Apache Spark distributed data processing framework:
https://github.com/Azure/mmlspark/pull/235
If you have any suggestions for how to improve the SWIG wrappers or have any general questions please let me know.
Thanks @imatiach-msft so much!
@guolinke, let's keep this open for further discussion.
@chivee @guolinke - @drdarshan suggested that the Java bindings could be improved by using SWIG typemaps. This would be more customized code but it would remove the need for developers to deal with SWIG pointer types. I think this is an improvement that we could add in the future for developers who use the Java bindings directly (and not our spark-based learners).
@liulhdarks @imatiach-msft
Does this[1] solve the problem?
@spkaplan yes, that is the package that I am maintaining. However, the Java interface still needs more improvement, as I mentioned above. I am open to suggestions from the community. Right now the autogenerated wrappers are mainly used from Scala code in mmlspark, but anyone can use the package.
@imatiach-msft Thank you for the quick reply! I had accidentally overlooked your previous comment regarding the jar available in maven central. Thank you for pointing that out!
Closed in favor of tracking this in #2302. We decided to keep all feature requests in one place.
Contributions of this feature are welcome! Please re-open this issue (or post a comment if you are not the topic starter) if you are actively working on implementing this feature.