LightGBM: [Request] Register official wheel on PyPI

Created on 2 Jun 2017 · 50 Comments · Source: microsoft/LightGBM

It will be very useful if the wheel package is registered in PyPI.

After running cmake and make, I built the wheel package with the following command, using Anaconda Python 3.5:

python setup.py bdist_wheel

I found that the resulting wheel package works fine with both Python 3.5 and Python 3.6 (Anaconda).

I only tried Ubuntu 16.04, but the same procedure should be valid for other platforms like Windows and Mac. I'd be glad if you could look into it.

All 50 comments

A customer asked for this. @guolinke, it would be great to have this :-)

I think we can put the CPU version first.

One problem is: should we put the pre-compiled dll/so into the package, or re-compile every time a user installs?

@wxchan thoughts ?

Maybe we can keep multiple versions? I am not quite familiar with this.

@wxchan
Keeping multiple versions takes a lot of effort, since we have to generate these DLLs every time we change the code.
Re-compiling is much easier.

@wxchan
If we change to the re-compile method, we need a new python-package build solution like the current solution for R: https://github.com/Microsoft/LightGBM/blob/master/R-package/src/install.libs.R .

I don't see many Python packages doing things like this. I can do some investigation.

Maybe we can ship a DLL built from a stable version like v2.0 for now. Then we don't need to rebuild it every time the code changes. I think the features we sometimes add are not used very frequently.

@wxchan Refer to https://github.com/chainer/chainer/tree/v1.24.0
It builds from source code every time the package is installed.

I didn't see it building from source code with pip install:

Collecting chainer
  Downloading chainer-2.0.0.tar.gz (318kB)
    100% |████████████████████████████████| 327kB 926kB/s
Collecting filelock (from chainer)
  Downloading filelock-2.0.10.tar.gz
Requirement already satisfied: nose in ./miniconda3/lib/python3.5/site-packages (from chainer)
Requirement already satisfied: numpy>=1.9.0 in ./miniconda3/lib/python3.5/site-packages (from chainer)
Requirement already satisfied: protobuf>=2.6.0 in ./miniconda3/lib/python3.5/site-packages (from chainer)
Requirement already satisfied: six>=1.9.0 in ./miniconda3/lib/python3.5/site-packages (from chainer)
Requirement already satisfied: setuptools in ./miniconda3/lib/python3.5/site-packages/setuptools-27.2.0-py3.5.egg (from protobuf>=2.6.0->chainer)
Building wheels for collected packages: chainer, filelock
  Running setup.py bdist_wheel for chainer ... done
  Stored in directory: /Users/wxchan/Library/Caches/pip/wheels/89/56/eb/e821cea4cd834b24428e29eb9b62e0c0c27c2fc6c987995ffc
  Running setup.py bdist_wheel for filelock ... done
  Stored in directory: /Users/wxchan/Library/Caches/pip/wheels/2f/63/bd/8a321da392fae487e7a5d92abb868d957426f9bb01a258a547
Successfully built chainer filelock
Installing collected packages: filelock, chainer
Successfully installed chainer-2.0.0 filelock-2.0.10

@wxchan
It was removed in the latest version (2.9).
See the code here: https://github.com/chainer/chainer/blob/v1.24.0/chainer_setup_build.py

I see, it's using ext_modules in setup(). But it seems to use the code uploaded to PyPI, not the code on GitHub, so we would still need to upload every time changes are made. The only good thing may be that we don't need multiple versions for different OSes.

We can refer to ibayer/fastFM, where Travis deploys binaries for different OSes.

@wxchan, @henry0312, any updates? Do we have any issues submitting it to PyPI?

@guolinke xgboost has a script to build its Python package: https://github.com/dmlc/xgboost/blob/master/python-package/xgboost/build-python.sh. We may need something like this, but I am not sure how to write it for all platforms.

@wxchan We can write the build script in Python instead. That is easier to support on all platforms.

@wxchan
It is obvious that xgboost does not support pip install for Windows.
https://github.com/dmlc/xgboost/blob/master/python-package/setup_pip.py#L17

@lucidfrontier45 I know. It would be better if we could support Windows.

@guolinke
We can pack the Python and C++ code into a source distribution (https://docs.python.org/3.5/distutils/sourcedist.html), but I can't get it to compile with pip.

I'd rather leave this to a more experienced user, because if you upload a version that doesn't work well, you cannot upload the same version number again, even if you delete it. Honestly, I already uploaded one bad version (I thought it could be deleted).

@wxchan
Can't we test the installation of the built package offline?

@wxchan
I think the process is:

  1. Build a python-package that can be submitted to PyPI.
  2. Test the installation offline and make sure everything is okay.
  3. Upload to PyPI.

For the offline installation, I remember it can use the "wheel" file.
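
The three steps above can be sketched as command lists, as a rough illustration rather than the project's actual release script. The archive name reuses the lightgbm-0.2.tar.gz mentioned later in this thread, and twine as the upload tool is an assumption; the thread does not say which tool was used.

```python
import sys
from pathlib import Path


def build_sdist_command():
    """Step 1: build a source distribution (run inside the package folder)."""
    return [sys.executable, "setup.py", "sdist"]


def offline_install_command(archive):
    """Step 2: install a local archive without contacting any package index.

    --no-index keeps pip away from PyPI; --no-deps skips dependencies,
    which would otherwise have to be available locally as well.
    """
    return [sys.executable, "-m", "pip", "install",
            "--no-index", "--no-deps", str(archive)]


def upload_command(archive):
    """Step 3: upload the tested archive (twine is an assumption here)."""
    return [sys.executable, "-m", "twine", "upload", str(archive)]


# Hypothetical archive path, for illustration only.
archive = Path("dist") / "lightgbm-0.2.tar.gz"
release_steps = [
    build_sdist_command(),
    offline_install_command(archive),
    upload_command(archive),
]
```

Each list can be passed to subprocess.check_call. Running step 2 in a fresh virtual environment avoids the test being masked by an existing install; depending on the pip version, build requirements such as setuptools may also need to be available locally.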

A wheel is a binary distribution (bdist), so we need to build it on each platform separately.

@wxchan
Isn't there an offline source distribution?

There will be a lightgbm-0.2.tar.gz file after running python setup.py sdist, but I don't know how it's processed when running pip install.

@wxchan Can you create a repo for the python-package build, so that I can try it as well?

I think we can release binary distributions on PyPI and GitHub with Travis (and Docker).
(If we need binary distributions only for OSX and Ubuntu, we don't need Docker.)

For Windows, we need to use AppVeyor.

@henry0312 And OSX?

@guolinke Travis CI provides OSX and Ubuntu environments.
https://docs.travis-ci.com/user/ci-environment/
https://docs.travis-ci.com/user/osx-ci-environment/

By the way, can you use Azure instances to deploy?
I guess that we can release binary distribution with Jenkins and Docker on Azure.
(I've already used Jenkins and Docker for binary distribution for my Ubuntu)

https://github.com/wxchan/LightGBM/commit/b3290823627fa78474ccf13ccc9c177aaaa44094

I copied the src and include folders into the python-package folder, ran python setup.py sdist, and got lightgbm-0.2.tar.gz. We need to write something in setup.py to get it compiled.
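
One possible shape for that "something in setup.py" is a helper that drives CMake before the Python files are installed. This is only a sketch under assumptions: the function names and folder names are hypothetical, and it presumes a CMakeLists.txt is shipped alongside the copied sources.

```python
import subprocess
from pathlib import Path


def cmake_build_commands(source_dir):
    """Return the two CMake invocations, meant to run inside the build folder."""
    return [
        ["cmake", str(Path(source_dir).resolve())],  # configure
        ["cmake", "--build", "."],                   # compile with the native toolchain
    ]


def compile_native(source_dir, build_dir, dry_run=False):
    """Configure and build the C++ library; with dry_run, only report the commands."""
    commands = cmake_build_commands(source_dir)
    if not dry_run:
        build_path = Path(build_dir)
        build_path.mkdir(parents=True, exist_ok=True)
        for command in commands:
            # check_call raises CalledProcessError if cmake or the compiler fails.
            subprocess.check_call(command, cwd=str(build_path))
    return commands


# Hypothetical folder names, shown without running anything.
planned = compile_native("compile", "build_cpp", dry_run=True)
```

A helper like this could be called from a custom setuptools install command before the default install logic runs, so that pip install from the sdist triggers the compilation.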

https://pypi.python.org/pypi/xgboost
You can see what is inside xgboost-0.6a2.tar.gz: they put src and include in the xgboost folder.

@wxchan

Check the branch "python-package".
Now it can build the source distribution, and the built package can be installed by pip (only tested on Windows).

@guolinke Should I run python setup.py sdist to test? I got an error on Mac:

ld: can't open output file for writing: ../lightgbm, errno=21 for architecture x86_64
collect2: error: ld returned 1 exit status
make[2]: *** [../lightgbm] Error 1
make[1]: *** [CMakeFiles/lightgbm.dir/all] Error 2
make: *** [all] Error 2

@wxchan It seems Linux also has this error. I will check it.

@wxchan Can you help fix the getopt errors? The error is very strange.

Now we are on PyPI 😄

@guolinke @wxchan Thanks so much for this! The pypi distribution makes it much easier to make this a requirement for our production environment.

I'm very impressed by the engineering on this project so far, building in GPU support so rapidly, supporting bindings for so many different languages, and having the pypi distribution. I'm very excited to be able to use lightgbm with the drop of a pip install lightgbm hat.

Thanks guys!

Can you also kindly supply a binary wheel version, like tensorflow does?
It would be a great help, since I would no longer need to install compiler suites and deployment with Docker would become easier.

Binary distributions have higher priority than source distributions on PyPI, so maybe we need to find somewhere else to host those wheels.

@wxchan
What do you mean by "somewhere"? PyPI itself can contain wheels of the same version for different platforms.

@wxchan @lucidfrontier45
Do we need to use another name if we host on PyPI again?

@lucidfrontier45 pip prefers wheels over sdists. If we publish wheels, the sdist won't be used, which is bad when the binary wheels don't work.
@guolinke But I found there is a --no-binary option, so I think it's OK to put them under the same name.

refer to:

Pip prefers Wheels where they are available. To disable this, use the --no-binary flag for pip install.

https://pip.pypa.io/en/stable/user_guide/#installing-from-wheels
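
As a small illustration of the option quoted above, the helper below builds a pip command that either accepts wheels or forces a source build for one package. The function name is hypothetical; --no-binary itself is the pip flag from the quoted documentation and takes a package name (or :all:).

```python
import sys


def pip_install_command(package, force_source=False):
    """Build a pip install command, optionally ignoring wheels for `package`."""
    command = [sys.executable, "-m", "pip", "install"]
    if force_source:
        # Tell pip not to use binary wheels for this package, so the sdist is built.
        command += ["--no-binary", package]
    command.append(package)
    return command


default_install = pip_install_command("lightgbm")
source_install = pip_install_command("lightgbm", force_source=True)
```

With this, users whose platform wheel does not work could still fall back to the source distribution published under the same name.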

@guolinke

No. For example, take a look at this tensorflow repository.
https://pypi.python.org/pypi/tensorflow/1.2.0
They put a wheel for each combination of (OS, Python version) in the same repository.

PyPI allows you to upload multiple wheels to the same repository.
All you need to do is name the wheels correctly.
https://www.python.org/dev/peps/pep-0491/#file-name-convention
I think this can be controlled with the plat-name and python-tag options when you run bdist_wheel.

$ python setup.py bdist_wheel --help
Install lib_lightgbm from: ['../lib_lightgbm.so', '../../../.conda/envs/kawlu/lightgbm/lib_lightgbm.so']
Common commands: (see '--help-commands' for more)

  setup.py build      will build the package underneath 'build/'
  setup.py install    will install the package

Global options:
  --verbose (-v)  run verbosely (default)
  --quiet (-q)    run quietly (turns verbosity off)
  --dry-run (-n)  don't actually do anything
  --help (-h)     show detailed help message
  --no-user-cfg   ignore pydistutils.cfg in your home directory

Options for 'bdist_wheel' command:
  --bdist-dir (-b)  temporary directory for creating the distribution
  --plat-name (-p)  platform name to embed in generated filenames (default:
                    linux-x86_64)
  --keep-temp (-k)  keep the pseudo-installation tree around after creating
                    the distribution archive
  --dist-dir (-d)   directory to put final built distributions in
  --skip-build      skip rebuilding everything (for testing/debugging)
  --relative        build the archive using relative paths(default: false)
  --owner (-u)      Owner name used when creating a tar file [default: current
                    user]
  --group (-g)      Group name used when creating a tar file [default: current
                    group]
  --universal       make a universal wheel (default: false)
  --python-tag      Python implementation compatibility tag (default: py3)

usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
   or: setup.py --help [cmd1 cmd2 ...]
   or: setup.py --help-commands
   or: setup.py cmd --help
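
To make the naming convention concrete, here is a minimal parser for wheel file names of the form {distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl. It is a sketch, not pip's own logic, and the LightGBM file name below is a hypothetical example.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename):
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        distribution, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the python tag.
        distribution, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected number of components: " + filename)
    return WheelName(distribution, version, build,
                     python_tag, abi_tag, platform_tag)


# Hypothetical LightGBM wheel name and a tensorflow-style name.
pure_tagged = parse_wheel_filename("lightgbm-0.2-py2.py3-none-win_amd64.whl")
cpython_tagged = parse_wheel_filename(
    "tensorflow-1.2.0-cp35-cp35m-manylinux1_x86_64.whl")
```

The platform_tag field is what --plat-name influences, and python_tag corresponds to --python-tag in the help output above.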

So it needs [Python version] x [Windows, Linux, OSX] wheels?

@guolinke
I think LightGBM's code is independent of the Python version.
So you only need three wheels (Windows, Linux, OSX).
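
The counting argument can be sketched as follows: with a version-independent tag, the number of wheels equals the number of platforms, while interpreter-specific tags multiply it. The platform and interpreter tags below are illustrative assumptions, not the tags the project settled on.

```python
from itertools import product

# Illustrative platform tags, one per operating system.
PLATFORMS = ["win_amd64", "manylinux1_x86_64", "macosx_10_9_x86_64"]


def wheel_filenames(name, version, interpreter_tags, platforms):
    """List the wheel file names for every (python tag, abi tag) x platform pair."""
    return [
        "{}-{}-{}-{}-{}.whl".format(name, version, python_tag, abi_tag, platform)
        for (python_tag, abi_tag), platform in product(interpreter_tags, platforms)
    ]


# Code that does not depend on the Python version needs one tag pair only.
independent = wheel_filenames("lightgbm", "0.2", [("py2.py3", "none")], PLATFORMS)

# Version-specific builds would need one wheel per interpreter and platform.
specific = wheel_filenames(
    "lightgbm", "0.2",
    [("cp27", "cp27m"), ("cp35", "cp35m"), ("cp36", "cp36m")],
    PLATFORMS,
)
```

Here len(independent) is 3 and len(specific) is 9, which is the difference between the two proposals in this exchange.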

I think we can use something like https://github.com/joerick/cibuildwheel to build these wheels in CI.
But we need to enable the OSX build in Travis first. @wxchan can you help with this?

And it seems Travis can deploy to PyPI as well.

@guolinke Currently, Travis on Mac takes forever to start. I will open a PR later to see if it works.

By the way, can you reopen this thread? It's not easy to find when it's closed.

Now the wheels can be built via CI and pushed back to the GitHub release.

@wxchan Can you help test the wheel installation on OSX? I am not sure about the package naming convention on OSX.
And can it run without installing gcc with OpenMP?

I found that the wheel on PyPI works fine on my Ubuntu 16.04.
Thank you so much.

@guolinke It can be installed on my Mac. Without gcc, it doesn't seem to work. See the logs here: https://travis-ci.org/wxchan/LightGBM/jobs/247779798.

@wxchan Hi there, I got the same error message you mentioned in a previous comment. Can you please share how you fixed it? Thanks in advance!

ld: can't open output file for writing: ../lightgbm, errno=21 for architecture x86_64
collect2: error: ld returned 1 exit status
make[2]: *** [../lightgbm] Error 1
make[1]: *** [CMakeFiles/lightgbm.dir/all] Error 2
make: *** [all] Error 2
