Spacy: ๐Ÿ’ซ Continuous deployment of dev builds, wheels etc.

Created on 15 Jan 2018  ยท  13Comments  ยท  Source: explosion/spaCy

This issue details some of the changes we want to make to the spaCy build process, to be able to ship updates faster, and make it easier for developers to test new changes. Introducing spacy-nightly for v2.0 last year was the first step in that direction. We've also started using a better system of dedicated feature branches and pull requests (e.g. #1408, #1424, #1792), with our ๐Ÿ’ซ emoji indicating issues and PRs by @honnibal or me.

Continuous deployment of dev builds

The new build process should trigger on each push to master, run the tests, build spaCy and, if successful, upload it to spacy-nightly.

For example, let's say you've contributed a bug fix or a new feature. Once your PR is merged and the master build is successful, it will be automatically published as a .dev version on spacy-nightly. This lets you โ€“ and others โ€“ use and test it immediately, without having to build spaCy from source. Production users can test their application against the latest state of development at any time. It also makes it easier to debug new features and helps ensure stability of the official releases.

Development releases are specified by adding .dev[N] to the version identifier, for example, 2.0.6.dev0. They'll only be installed if you tell pip to do so explicitly:

pip install spacy-nightly==2.0.6.dev0
pip install spacy-nightly --pre

Development versions sort before all other versions โ€“ so 2.0.6.dev0 < 2.0.6. Also see the PEP 440 spec on public version identifiers.

Publishing the dev builds on spacy-nightly should be easy to achieve by adding a condition to the setup.py, which replaces the package name if it's run with a certain flag indicating a dev build. The model compatibility check is also set up to support development versions.

Wheels

Travis, Appveyor and Circle CI also support building and uploading binary wheels. We'd like to have wheel deployment also enabled both for the nightly releases and full versions. Wheels are much faster to install and do not require a compiler to be installed on the target machine.

Required changes

Apart from the advantages of being able to install the latest master version easily, nothing will change for spaCy users and contributors. But here's what we're going to be doing:

  • Ensure all updates to the documentation and other non-source parts of the repository are committed with [ci skip] to avoid builds.
  • Keep using feature branches and pull requests for updates we publish โ€“ never push directly to master (duh).
  • Integrate the dev builds into our current CI workflow, make all required changes to the setup.py and add documentation.
enhancement help wanted meta

All 13 comments

I'd be happy to help out with some of this, especially pushing wheels as any downstream projects using spacy have to endure the time spent compiling spacy for their CI!

@willprice Thanks!

The question is, what's the best way to get you working on this? Maybe a good start would be if you fork one of our small libraries such as cymem, and get the wheels building on your fork?

If this subgoal seems like a waste of time and you can just get a fork of spaCy working, by all means go ahead!

Hi @honnibal, I think forking sounds like a good idea, I need a safe proving ground where I can fiddle without causing issues ;). I won't be able to do a proper 'github fork' as you can't easily have separate CI for forks (at least on travis), so I'll just clone the repo and push up my own version and link it here.

Questions:

  1. I'm not sure it makes sense for each CI platform to push wheels to PyPI, so do we just want one platform to push wheels? However in the event that the chose CI platform goes down then we lose wheel pushes, so we might want a way of having multiple platforms push wheels. Thoughts?

  2. To add the .dev[N] suffix to versions we'll need to store the current N version in the repo, are you happy having CI committing the change to the repo?

  3. With the [ci skip] tag in commit messages, do you want anything done at all, or just skip everything (including building docs)?

  4. Can you elaborate on the last point "
    Integrate the dev builds into our current CI workflow, make all required changes to the setup.py and add documentation." as I'm not sure what this means.

In the mean time I'll get stuck in...

Some of the external dependencies also need to build a binary dist for spacy also need to be built:

Starred packages are those managed by explosion so could feasibly be changed to build bdist wheels too. thinc seems to have one of the longest compile times.

Great action on this -- thanks.

  1. I'm not sure it makes sense for each CI platform to push wheels to PyPi

Unfortunately I think it's necessary to use at least appveyor and Travis. I'm not sure why conda-forge uses CircleCI as well, but I wouldn't be surprised if there's a good reason based on some missing requirement on Travis. Another potential problem is that sometimes builds queue for days on Travis for OSX.

For now I think it probably makes sense to start with Travis, and then do Appveyor for the Windows wheels.

  1. To add the .dev[N] suffix to versions we'll need to store the current N version in the repo, are you happy having CI committing the change to the repo?

Could the CI query PyPi to find the latest version number? Assuming the CI has to make at least a git tag for each release, we could also use that to find the next version.

These are possibly bad ideas --- I just want to enumerate the options.

  1. With the [ci skip] tag in commit messages, do you want anything done at all, or just skip everything (including building docs)?

@ines ?

  1. Can you elaborate on the last point "
    Integrate the dev builds into our current CI workflow, make all required changes to the setup.py and add documentation." as I'm not sure what this means.

@ines ?

Windows might in fact be the easiest to do as the appveyor config is nice and standalone and building binary wheels is substantially easier on windows than it is for linux (which requires this whole docker centos 5 setup). Probably won't have time until the weekend to make much progress but will keep you updated :)

Re dependencies:

I think if we reach out to the developers of wrapt and cytoolz they would be keen to adopt a solution we develop. I'm not sure how actively maintained ujson is. I'd like to remove the dependency on the regex library to improve the speed of the tokenizer.

About that msgpack library: we need an extension library msgpack-numpy. I think this was why I used the msgpack-python library instead of msgpack, but maybe I just messed up?

With the [ci skip] tag in commit messages, do you want anything done at all, or just skip everything (including building docs)?

This should already be handled by default (at least by Travis, and I'm pretty sure Appveyor as well). So if we use a build as the trigger or first step, we shouldn't have to do anything here. (@honnibal and I just have to remember to tag the commit messages of docs PRs accordingly when we merge.)

Can you elaborate on the last point " Integrate the dev builds into our current CI workflow, make all required changes to the setup.py and add documentation." as I'm not sure what this means.

Don't worry about the documentation at this point โ€“ we'll take care of this later. Aside from that, I mostly meant changing the CI config files and setting up the dev builds (e.g. @honnibal's suggestions above) and changing the setup.py to allow an additional flag (or similar) to indicate the dev builds. If it's a dev build, spacy-nightly is used as the package name instead of spacy, so it'll be available as a different package on pip.

This should already be handled by default (at least by Travis, and I'm pretty sure Appveyor as well). So if we use a build as the trigger or first step, we shouldn't have to do anything here. (@honnibal and I just have to remember to tag the commit messages of docs PRs accordingly when we merge.)

You're absolutely right, both Travis and Appveyor support [skip ci] tags. I've noted this above.

Don't worry about the documentation at this point โ€“ we'll take care of this later. Aside from that, I mostly meant changing the CI config files and setting up the dev builds (e.g. @honnibal's suggestions above) and changing the setup.py to allow an additional flag (or similar) to indicate the dev builds. If it's a dev build, spacy-nightly is used as the package name instead of spacy, so it'll be available as a different package on pip.

Excellent, I've not found an easy way of overriding the package name in setuptools yet, I might have to use sed to update spacy/about.py for dev builds.

Excellent, I've not found an easy way of overriding the package name in setuptools yet,

I thought we might just be able to do this here?

https://github.com/explosion/spaCy/blob/7e6dc283dbecf4bf9674559198b4a2c06e9f4c2e/setup.py#L173-L174

And then do something like this:

package_name = about['__title_nightly__'] if IS_DEV_BUILD else about['__title__']

But maybe I'm wrong and haven't thought this through โ€“ I'm definitely not a setuptools expert.

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

Was this page helpful?
0 / 5 - 0 ratings