What happened:
Twice in the last two months, LightGBM's continuous integration has been broken by the following situation:
- distributed changes in a way that makes it incompatible with older versions of dask
- distributed is published to anaconda's main channels several days before the corresponding dask version
- conda install -y dask distributed results in an environment with incompatible versions of dask and distributed

I've documented the most recent instance of this problem in https://github.com/microsoft/LightGBM/issues/4285.
We ended up with an environment like this:
dask-2021.4.0 | pyhd3eb1b0_0 5 KB
dask-core-2021.4.0 | pyhd3eb1b0_0 670 KB
distributed-2021.4.1 | py37h06a4308_0 1.0 MB
And saw all Dask tests in that project fail with this error:
> from distributed.protocol.core import dumps_msgpack
E ImportError: cannot import name 'dumps_msgpack' from 'distributed.protocol.core' (/root/miniconda/envs/test-env/lib/python3.7/site-packages/distributed/protocol/core.py)
This happens because distributed.protocol.core.dumps_msgpack() was removed in distributed 2021.4.1 (#4677), but dask 2021.4.0 still relies on it.
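A downstream project could fail fast on this kind of mismatch before running its test suite. The following is only a minimal sketch: it assumes that matching version numbers are a reasonable proxy for compatibility between dask and distributed, and the helper names (release_tuple, versions_match, check_installed) are hypothetical, not part of either library. It also relies on importlib.metadata, which needs Python 3.8 or newer.

```python
import re
from importlib import metadata


def release_tuple(version: str) -> tuple:
    """Turn a version string like '2021.04.1' into (2021, 4, 1).

    Any non-numeric suffix (for example a local build tag) is ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def versions_match(dask_version: str, distributed_version: str) -> bool:
    """Return True when both packages report the same release numbers.

    Assumption: equal release numbers mean the two packages were
    released together and are compatible.
    """
    return release_tuple(dask_version) == release_tuple(distributed_version)


def check_installed() -> None:
    """Raise if the installed dask and distributed versions differ.

    Hypothetical CI guard: call this before running any Dask tests.
    """
    dask_version = metadata.version("dask")
    distributed_version = metadata.version("distributed")
    if not versions_match(dask_version, distributed_version):
        raise RuntimeError(
            f"dask {dask_version} and distributed {distributed_version} "
            "do not match; the environment is probably inconsistent"
        )
```

With the versions from the environment above, versions_match("2021.4.0", "2021.4.1") returns False, so check_installed() would stop the job with a clear message instead of an ImportError deep inside the tests.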
What you expected to happen:
I expected that since dask and distributed are so tightly connected to each other, new versions of these libraries would be published to the main anaconda channels at the same time.
Minimal Complete Verifiable Example:
It's hard to create an MCVE for this since it relies on external state in a package manager, but as of 12 hours ago the steps at https://github.com/microsoft/LightGBM/issues/4285#issuecomment-841000102 could reproduce this issue.
If you need more details than that please let me know and I can try to produce a tighter reproducible example.
Anything else we need to know?:
Environment:
Thanks for bringing this up @jameslamb. We don't have much input on the main anaconda channel.
Still, @seibert, do you know who we should talk to about updating the main channel, since the current versions are incompatible with one another? FWIW we are planning a release today: https://github.com/dask/community/issues/155
Thanks for reporting @jameslamb! FWIW some folks also ran into this with the 2021.04.1 release on conda-forge (see the discussion starting here https://github.com/dask/community/issues/150#issuecomment-826844711). I think the core issue here is that we don't specify maximum allowed versions for our dask and distributed dependencies.
Over in https://github.com/dask/community/issues/155#issuecomment-841278326 I'm proposing we start pinning dask and distributed more tightly to avoid these types of version inconsistency issues. If you have any thoughts on the topic, please feel free to engage over in that issue.
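Until tighter pins exist in the packages themselves, a consumer can apply the same idea on their side by requesting one exact version for both packages. This is a rough sketch of that workaround, not the pinning scheme proposed in the linked issue; the function names are hypothetical, and the only conda feature it assumes is the == exact-match form in a package spec.

```python
def pinned_specs(version, packages=("dask", "distributed")):
    """Build exact-match conda specs so every package gets the same version."""
    return [f"{name}=={version}" for name in packages]


def conda_install_command(version):
    """Return the argument list for a conda install with matching pins.

    The list can be passed to subprocess.run() or joined for a shell script.
    """
    return ["conda", "install", "-y", *pinned_specs(version)]


if __name__ == "__main__":
    # Print the command a CI script would run for a given release.
    print(" ".join(conda_install_command("2021.4.1")))
```

With both packages pinned this way, a channel that has published only one of the two should make the install fail at solve time rather than produce a mismatched environment.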
Ok sure, will do! You can close this issue then if you'd like, to keep the discussion focused over in dask/community.
Yeah, I think tighter pinnings as James proposed should address this going forward.
cc @anaconda-pkg-build (for awareness)
Closing as discussion moved over to the dask/community issue tracker and the relevant folks have been pinged here for visibility