This post reviews the status of the sdist shipped with scikit-image in light of how Python packaging has developed as of late 2019.
A few recent developments have been:
In the latest 0.16.0 release, we had a hiccup due to an old version of Cython being used to generate the C files released in the sdist. @stefanv mentioned that we should consider not shipping C files with the sdist.
NumPy actually recommends, quite aggressively, that downstream packagers use recent versions of Cython. Maybe we can consider moving our pinning up to the bleeding edge?
The major change in the Python community has been the ease with which users on the major platforms can install scikit-image. Wheels are provided for all our core dependencies, and this greatly reduces the complexity of installing (and building from source) scikit-image.
To summarize:
I'm +1 on both. We should consider that sdist install will now fail until the right version of Cython is available, and print a helpful error message, similar to what we do for NumPy.
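A minimal sketch of the kind of check and error message described above, assuming a build script that inspects the installed Cython before compiling. The minimum version `0.29.13` and the helper names are hypothetical and not taken from the project.

```python
import re

# Hypothetical minimum version, used only for illustration.
MIN_CYTHON = "0.29.13"


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"cannot parse version: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def installed_cython_version():
    """Return the installed Cython version string, or None if it is missing."""
    try:
        import Cython
    except ImportError:
        return None
    return Cython.__version__


def cython_problem(installed, minimum=MIN_CYTHON):
    """Return a helpful error message if `installed` is unusable, else None."""
    hint = f"Try: pip install 'Cython>={minimum}'"
    if installed is None:
        return (
            f"Cython >= {minimum} is required to build from the sdist, "
            f"but it is not installed. {hint}"
        )
    if parse_version(installed) < parse_version(minimum):
        return (
            f"Cython >= {minimum} is required to build from the sdist, "
            f"but version {installed} was found. {hint}"
        )
    return None
```

A build script could call `cython_problem(installed_cython_version())` early and abort with the returned message instead of failing later with a compiler error.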
@hmaarrfk I feel like you left out
Note that the old Cython version was not due to an old version on the builder's machine, but rather, to not rebuilding the Cython files when using sdist. imho sdist should always result in a clean build of everything. It might be possible to hook in a make clean to the sdist command?
I am having a lot of issues with scikit-image and Python 3.8. This is because for scikit-image to compile from source numpy needs to be installed before running pip install scikit-image.
This is the problem that PEP 517/518 are designed to fix. I have been involved in a lot of the planning to migrate the astropy ecosystem to using pyproject and more modern tooling. From that exercise what I would suggest is that you use pyproject to pin numpy to the oldest version you support (which have wheels for all platforms) and use pyproject to ensure that the desired version of Cython is used to generate the C files at compile time.
I would be happy to get the ball rolling on this if there is generally positive feelings about it.
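To make the suggestion concrete, here is a rough sketch that renders the `[build-system]` table such a `pyproject.toml` might contain. The version numbers and the helper name `build_system_table` are hypothetical; the real pins would have to come from the oldest NumPy release with wheels for each supported Python version.

```python
def build_system_table(requires, backend="setuptools.build_meta"):
    """Render a PEP 518 [build-system] table as TOML text."""
    lines = ["[build-system]", "requires = ["]
    lines += [f'    "{requirement}",' for requirement in requires]
    lines += ["]", f'build-backend = "{backend}"']
    return "\n".join(lines) + "\n"


# Hypothetical pins, for illustration only.
EXAMPLE_REQUIRES = [
    "setuptools",
    "wheel",
    "Cython>=0.29.13",
    "numpy==1.14.5; python_version=='3.7'",
]

if __name__ == "__main__":
    print(build_system_table(EXAMPLE_REQUIRES))
```

With a table like this, pip creates an isolated build environment containing exactly these packages, so every source build generates its C files with a known Cython and compiles against a known NumPy.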
@Cadair I'd be very happy to get a PR for this!
I'm very happy to say that installing scikit image from source for python 3.8 should not be a smooth process in Nov 2019.
Pinning to old versions of numpy can cause other problems, and in either case, we would not have released the correct pinnings for python 3.8.
In my mind, for Python 3.8 you will have to wait at least a few months before using it. If you are impatient, conda-forge has made some progress. That said, they do not use pyproject.toml... so....
I'm very happy to say that installing scikit image from source for python 3.8 should not be a smooth process in Nov 2019.
Can you explain what you mean by this?
It's not just Python 3.8, pyproject.toml provides a way of allowing reliable and predictable builds from source for all unsupported Python versions (i.e. new ones) and also on architectures where binaries are not available.
@stefanv I mean, I think that for now, python 3.8 is unsupported by scikit image, and at least until the end of November, it will remain that way until our dependencies create 3.8 wheels and we can find a time to cut a new release.
The thing is, while it is nice to say that things on different platforms will Just Work (TM), they often don't. What version pinning should we have put for Python 3.8? 1.14? The wheel is likely not going to get built.
A bit of manual work is needed to build packages, and having a human read through the dependencies seems like a reasonable middle ground.
Honestly, I don't doubt that pyproject.toml has improved, but last year, it gave me great headaches. My only point is that the 3.8 use case doesn't convince me.
What version pinning should we have put for python 3.8? 1.14?
I suggest we take this particular discussion over to the PR.
My only point is that the 3.8 use case doesn't convince me.
New Python versions are an obvious pain point for this, but in general I think it makes sense to make compiling from source as easy as possible.
Honestly, I don't doubt that pyproject.toml has improved, but last year, it gave me great headaches.
SunPy has happily been using it for at least a year, there have been a few bumps along the way (the pip install -e bug being by far the worst) but it has allowed us to make a lot of improvements to our release pipeline (mainly using the build isolation to generate the sdist).
Python setup -e is precisely why I removed pyproject.toml a while back.
Installing from source is simply not something I'm too keen on supporting. We require a C compiler, which is the one dependency that may never become pip installable....
Most people creating new distributions will be using their own dependency control. Debian packages will use their own, and conda packages will use their own.
What use case is pushing you to WANT to install from source?
Installing from source is simply not something I'm too keen on supporting. We require a C compiler [...]
Some people will always need to install from source, and having a functioning toolchain is not a high bar for people using scikit-image on CI services or boxes used for development in general. You can take what you want from the fact that I have come here wanting to patch this because I have been cornered into compiling scikit-image from source on a load of CI builds for SunPy...
cornered into compiling scikit-image from source
TLDR: Python 3.8 isn't supported by scikit-image today. My suggestion to you, is to disable Python 3.8 CI tests and simply to wait.
I know it is exciting to want to move on to the shiny new Python 3.8, but personally, I think it might be a little premature for scikit-image since matplotlib (a dependency I have been trying to remove) still hasn't released something for Python 3.8. We've really been trying to build for Python 3.8 https://github.com/scikit-image/scikit-image-wheels/pulls
As for people with CIs, this means that you are working with unstable software that hasn't yet been tested by the project's upstream developers. I can personally say that I've been involved in migrating conda-forge's software to Python 3.8, and without new releases from the official developers, we've had to skip many tests, or do some creative patching. If you want to bear this cost, that is in fact your choice, but unfortunately, this kind of problem doesn't magically go away by adding a pyproject.toml. There will always be a transition period.
Pyproject.toml has in fact been a non-obvious decision for scipy https://github.com/scipy/scipy/issues/8734, which also has a numpy dependency and a compiler requirement.
I'm not saying I don't want to modernize the build process, but these "fully isolated environments on the end user's computer" don't always benefit the scientific community that has compiler dependencies.
For example, we've found there to be an incompatibility with things compiled on Numpy 1.13.3 and used on numpy 1.15.4. https://github.com/scikit-image/scikit-image/issues/4270
https://github.com/scikit-image/scikit-image-wheels/pull/31 If we had provided a pyproject.toml file, anybody using numpy 1.15.4 would be unable to build and use the pyproject.toml file on Windows.
Personally, I would much rather focus on moving to ManyLinux2010, which SunPy seems to have done, your help there would go a long way, and potentially adding ARM64/ARM32 builds.
We could also be tempted to release nightly on test.pypi.org
Removing the generated C files seems like a good modernization that we can do since installing Cython is now easier than ever. These are the kinds of modernizations I am in favour of.
but unfortunately, this kind of problem doesn't magically go away by adding a pyproject.toml
I am not suggesting for a moment that adding pyproject.toml will magically fix all the Python 3.8, or 3.9 or whatever bugs. I haven't actually run into any issues with 3.8 in any of SunPy's (many) dependencies. What I have run into is the fact that skimage doesn't compile from source in a clean environment, which is just really frustrating.
For example, we've found there to be an incompatibility with things compiled on Numpy 1.13.3 and used on numpy 1.15.4
This is frustrating, and there are always going to be bugs, adding pyproject lets you have more control over the build time dependencies, and gives you another tool in the box to deal with weird issues like this. Even if it doesn't fix them it at least makes them predictable and the same for everyone.
Removing the generated C files seems like a good modernization that we can do since installing Cython is now easier than ever.
Which is made a lot easier by using pyproject.toml because you as the packager can be sure you know what version of Cython everyone is generating their C with, meaning that it should be the same for everyone who installs from source (or builds wheels) without having to include it in the package to achieve that result.
For reference I have been compiling the master branch of skimage on the master branch of cpython for about a year now and it seems to work OK.
I do aggressively remove pyproject.toml files, but that is because it makes an isolated environment with _old_ (read: not master branch) versions of things.
I strongly agree about not including the c files in the sdist as it negates one of the selling points of cython which is "cython handles changes to the cpython c-API for you", but I think "rebuild them if you have cython" is also a reasonable position to take.
I investigated this recently: we had a working version of this with earlier versions of scikit-image, but at some point switched to relying on Cython for caching. At this point, pre-generated C files became a real problem, because they would not get re-generated. I think we may have to go back to our old system of caching with .md5 files, which perfectly handled the source problem (or, just remove them, perhaps easier and nearly as effective).
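For illustration, a minimal sketch of hash-based caching along these lines. This is not the project's earlier implementation; the file layout (`foo.pyx`, `foo.c`, `foo.md5` side by side) and the function names are assumptions.

```python
import hashlib
from pathlib import Path


def file_md5(path):
    """Return the hex MD5 digest of a file's contents."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def needs_regeneration(pyx_path):
    """True if the .c file is missing or was generated from different source."""
    pyx_path = Path(pyx_path)
    c_path = pyx_path.with_suffix(".c")
    md5_path = pyx_path.with_suffix(".md5")
    if not c_path.exists() or not md5_path.exists():
        return True
    # Compare the hash recorded at generation time with the current source.
    return md5_path.read_text().strip() != file_md5(pyx_path)


def record_generation(pyx_path):
    """Store the hash of the .pyx source next to the freshly generated .c file."""
    pyx_path = Path(pyx_path)
    pyx_path.with_suffix(".md5").write_text(file_md5(pyx_path))
```

The build would run Cython only when `needs_regeneration` returns True and call `record_generation` afterwards, so a stale pre-generated C file is detected by content rather than by timestamp.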
@hmaarrfk W.r.t. matplotlib 3.8 wheels, we have those available and could also build them from source, so that should not be a blocker on generating skimage wheels for 3.8.
@stefanv, we may choose to release a wheel for 3.8 by doing what you said but:
So, we can take shortcuts, creating packages that can't be installed by end users. Or, we can just wait and, if we have extra time, put our efforts into helping matplotlib's builds get released.
Not having a wheel for 3.8 does us no good.
Here's an alternative plan: @tacaswell, when can you have binary wheels available on PyPi for 3.8? Any chance you can add 3.8 wheels for an existing release?
It is not our concern how the user gets matplotlib installed.
Our users should get matplotlib exactly the same way they get scikit-image. If they use pip + PyPi, matplotlib should be installed there as well. Otherwise, as both packages use C/Python ABIs, they will have a broken install. This is crucial for getting a working version of python + scikit-image + matplotlib installed on your system.
So many packaging systems will actually explicitly disable build isolation:
This is what comes out of pip config list inside a conda build environment
:env:.cache-dir='/home/mark/miniconda3/conda-bld/scikit-image_1574378830596/pip_cache'
:env:.ignore-installed='True'
:env:.no-build-isolation='False'
:env:.no-dependencies='True'
:env:.no-index='True'
and notice how the logic is inverted.....
https://github.com/pypa/pip/issues/5735
Our users should get matplotlib exactly the same way they get scikit-image. If they use pip + PyPi, matplotlib should be installed there as well. Otherwise, as both packages use C/Python ABIs, they will have a broken install. This is crucial for getting a working version of python + scikit-image + matplotlib installed on your system.
I don't think that is true; you can mix Debian packages with pip and have a fully working system.
You really shouldn't. msarahan explains it much better than me in his talk:
https://youtu.be/Kamld5Z-xx0?t=970
Get to 16:46.
The rest of his talk is excellent too.
You really shouldn't. msarahan explains it much better than me in his talk:
https://youtu.be/Kamld5Z-xx0?t=970
So what he is describing is a very different situation, where someone distributes a wheel labeled as manylinux1 compatible while it is not. In the case I am describing, Debian is installing something that is definitely compatible with the current system (it's a Debian system with Debian packages), and manylinux1 wheels play just fine with that: they vendor the libraries they use, and they are compiled against older versions of the standard libraries than what are present.
@tacaswell, I guess you and I can simply start to use this:
pip config set global.no-build-isolation False
@stefanv I did push the wheels yesterday (it was already on my to-do list) for 3.1.2 (which is a bug-fix release for the 3.1 series).
Will try to read the rest of the conversation and catch up with the rest in a bit.
Amazing, that rolls a big stone out of the way for us. Thanks, @tacaswell!
@Cadair how are your builds doing since we released the wheels? I assume all is good?
Yeah, much easier now :smile:
@hmaarrfk What is the status of this issue?
I think we successfully moved up the Cython pinning.
Do you still want to remove the generated c files from the sdist?
I'm still -1 on adding back pyproject.toml.
I am still :+1: on removing the pre-generated .c files
I don't really know how to do that.
I couldn't find it in the numpy docs...
I think you just have to remove them from the MANIFEST.in, and then ensure that tools/check_sdist.py passes.
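As a rough way to verify the result, the sketch below lists any Cython-generated C files that remain in a built sdist. It is not the project's `tools/check_sdist.py`; it assumes a generated file can be recognized as a `.c` file sitting next to a `.pyx` file of the same name, and the function names are hypothetical.

```python
import tarfile
from pathlib import PurePosixPath


def generated_c_files(member_names):
    """Return the .c members that have a sibling .pyx with the same stem."""
    names = set(member_names)
    return sorted(
        name
        for name in names
        if name.endswith(".c")
        and str(PurePosixPath(name).with_suffix(".pyx")) in names
    )


def check_sdist(sdist_path):
    """List Cython-generated C files still present in an sdist tarball."""
    with tarfile.open(sdist_path) as archive:
        return generated_c_files(archive.getnames())
```

After editing MANIFEST.in and running `python setup.py sdist`, calling `check_sdist` on the resulting tarball should return an empty list. Hand-written C sources without a matching `.pyx` are left alone.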
I just thought I would come back around and say that the inability to build skimage from source in a clean Python env has bitten me on SunPy's CI again. I am trying to set up a 32-bit Linux test build, because the Debian package maintainers have found a few 32-bit-specific bugs in the past. As there are no wheels, I can't do this cleanly.
I really believe adopting new standard python packaging things like pyproject.toml makes things easier for the users who still build from source (which people do still do).
We release wheels for 32-bit Linux. At least for 0.16.2 we did. Did you need Python 3.8 + linux32?
Would you want to help us get CI testing running on linux32?