Xgboost: Javascript version of the Library

Created on 10 Aug 2016 · 17 Comments · Source: dmlc/xgboost

I am opening this issue to see if anyone is interested in contributing.

As XGBoost has minimal dependencies, it is possible to use amalgamation and emscripten to build a JavaScript version of the library. See https://github.com/dmlc/mxnet.js for an example from our deep learning project.

This could be fun for some use cases, such as running XGBoost in the browser to provide some cool demos.

help wanted

All 17 comments

@tqchen I can try to participate in it

sounds good, @Jabher Let me know if you need any help on this, mxnet.js could be a good starting point

Cool, thanks!

@tqchen I personally think that implementing predict is more important than implementing train in this case. I can only think of toy examples where a JavaScript train function would be helpful, such as web ML demos. However, there are a lot of cases where computing XGBoost predictions client-side from a pre-built XGBoost model would be very practical. I've had to do this for clients before using other ML models. A typical workflow might be:

  1. Train locally on a research machine using an existing XGBoost API (Scala, R, Python, ...).
  2. Export the trees as a JSON object with xgb.dump.
  3. Read the JSON model into a JavaScript environment and generate predictions.

The advantage of implementing only the predict functionality is that it can be done in vanilla JS. I could also develop the predict functionality quite quickly, whereas a complete JavaScript API for XGBoost would be a larger-scale undertaking. What do you think? Happy to contribute a PR for this.
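Step 3 of the workflow above could look roughly like the following. This is a minimal sketch, not an existing API: it assumes the dumped trees use the field names `nodeid`, `split`, `split_condition`, `yes`, `no`, `missing`, `children` and `leaf`, and that a value below `split_condition` follows the `yes` branch. The function names `predictTree` and `predictMargin` and the `baseScore` parameter are hypothetical.

```typescript
// Node shapes as assumed for XGBoost's JSON dump (an assumption, see above).
interface LeafNode { nodeid: number; leaf: number; }
interface SplitNode {
  nodeid: number;
  split: string;           // feature name, e.g. "f0"
  split_condition: number; // threshold
  yes: number;             // child nodeid when value < threshold
  no: number;              // child nodeid otherwise
  missing: number;         // child nodeid when the feature is absent
  children: TreeNode[];
}
type TreeNode = LeafNode | SplitNode;
type Features = Record<string, number | undefined>;

function isLeaf(node: TreeNode): node is LeafNode {
  return "leaf" in node;
}

// Walk one tree from the root to a leaf and return the leaf value.
function predictTree(root: TreeNode, x: Features): number {
  let node: TreeNode = root;
  while (!isLeaf(node)) {
    const value = x[node.split];
    const nextId =
      value === undefined || Number.isNaN(value)
        ? node.missing
        : value < node.split_condition ? node.yes : node.no;
    const next = node.children.find((c) => c.nodeid === nextId);
    if (next === undefined) throw new Error(`node ${nextId} not found`);
    node = next;
  }
  return node.leaf;
}

// Sum the leaf values of all trees; baseScore is a hypothetical offset.
function predictMargin(trees: TreeNode[], x: Features, baseScore = 0): number {
  return trees.reduce((sum, tree) => sum + predictTree(tree, x), baseScore);
}
```

The result is a raw margin. For a logistic objective it would still need a sigmoid applied, and the right base score depends on how the model was trained, so both are left to the caller here.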

@AndrewHannigan FYI there is a project to build a compiler for boosting models (it transforms the model into if/else instructions in C++ and then compiles it). It will be released soon. What you are describing may become part of this project.
More info here : https://github.com/dmlc/xgboost/issues/2551

@pommedeterresautee sounds very intriguing, but why take C++ source code as input for the compiler? Wouldn't JSON be easier to parse, more portable, widely supported by other languages, etc.? Seems to me the forest should be represented as data, not code.

performance !! for industrial deployment

Oh I see, so this is just going to produce a binary executable. Sounds cool! Can you share any other info on the project, besides #2551? There might be pieces of the pipeline before the if/else generation step that would overlap with this project.

@hcho3 may give you more info.
The model parsing and the internal representation may be useful for both projects, plus treelite also handles models from LightGBM...

@AndrewHannigan treelite is currently in public beta. I am adding more documentation before an official release. The idea here is to produce a binary executable for optimized prediction. Right now, treelite internally produces a C program to be fed into a C compiler (e.g. gcc), but we could certainly produce a JS program instead.
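To illustrate what "producing a JS program" from a tree model could mean, here is a rough sketch that turns each tree into nested if/else source and compiles it at runtime. It is not treelite's actual code generator or data model: the `GenNode` shape and the names `emitTree`, `emitForest` and `compileForest` are all hypothetical.

```typescript
// Minimal tree shape for this sketch (hypothetical, not treelite's model).
type GenNode =
  | { leaf: number }
  | { feature: number; threshold: number; defaultLeft: boolean;
      left: GenNode; right: GenNode };

// Emit nested if/else source for one tree; `x` is a feature array.
function emitTree(node: GenNode, indent = "  "): string {
  if ("leaf" in node) return `${indent}return ${node.leaf};\n`;
  const v = `x[${node.feature}]`;
  const missing = `(${v} === undefined || Number.isNaN(${v}))`;
  // Missing values follow the default direction; others compare to the threshold.
  const goLeft = node.defaultLeft
    ? `${missing} || ${v} < ${node.threshold}`
    : `!${missing} && ${v} < ${node.threshold}`;
  return (
    `${indent}if (${goLeft}) {\n` +
    emitTree(node.left, indent + "  ") +
    `${indent}} else {\n` +
    emitTree(node.right, indent + "  ") +
    `${indent}}\n`
  );
}

// Emit one function per tree plus a predict function that sums them.
function emitForest(trees: GenNode[]): string {
  const fns = trees.map((t, i) => `function tree${i}(x) {\n${emitTree(t)}}\n`);
  const sum = trees.map((_, i) => `tree${i}(x)`).join(" + ") || "0";
  return `${fns.join("")}return function predict(x) { return ${sum}; };`;
}

// Compile the generated source into a callable predictor.
function compileForest(
  trees: GenNode[],
): (x: Array<number | undefined>) => number {
  const factory = new Function(emitForest(trees));
  return factory() as (x: Array<number | undefined>) => number;
}
```

In practice the string returned by `emitForest` would more likely be written to a file at build time. Compiling with `new Function` is only for demonstration, and it is blocked on pages whose Content Security Policy disallows eval-style code.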

Has there been any progress on this?

Hello, I now have a version of the library that works in Node and in the browser (using WebAssembly).

https://github.com/mljs/xgboost

Currently you can train models, save them, and load them. I'm also trying to load models saved from other languages, but I'm getting unexpected behavior. I don't know if somebody is able to help me
(the current implementation is on the load-external-files branch)

Hello @tqchen,

I have a JS version working (using WebAssembly), so it can run in the browser. I'm also able to load models saved from other programming languages (using the C API).

mljs/xgboost

Tell me if you are interested in including it here.

Nice! If you are interested in porting back the JS binding, we can put it under xgboost/web.

We cannot host the emscripten-generated JS file directly in the repo (as it can be big), but we can host the build flow there and put the emcc-generated file in a separate repo.

cc @hcho3

For most of the C++ changes, we can check the EMSCRIPTEN macro to detect whether the project is being compiled with emcc.

@tqchen I'd suppose we need to update both Makefile and CMakeLists.txt to accommodate the JS build?
@JeffersonH44 Thanks! This is great. It will be best to incorporate your work into the main repository. (Treelite ended up taking a different focus for the time being)
