https://pypi.org/project/pip-tools/
pip-tools has several features; the most relevant one for us is automatically generating requirements.txt from setup.py.
I'm not sure how to incorporate it into our workflow, but it's worth thinking about.
I love this. It generates a requirements.txt with fully resolved, pinned versions that are compatible with the constraints in setup.py. For example, if you run pip-compile on AllenNLP it generates the following requirements.txt, which the pip-tools docs recommend committing to source control.
#
# This file is autogenerated by pip-compile
# To update, run:
#
# pip-compile
#
alabaster==0.7.12 # via sphinx
attrs==19.3.0 # via pytest
babel==2.8.0 # via sphinx
blis==0.4.1 # via spacy, thinc
boto3==1.10.45
botocore==1.13.45 # via boto3, s3transfer
catalogue==0.2.0 # via spacy
certifi==2019.11.28 # via requests
chardet==3.0.4 # via requests
click==7.0 # via sacremoses
conllu==1.3.1
cycler==0.10.0 # via matplotlib
cymem==2.0.3 # via preshed, spacy, thinc
docutils==0.15.2 # via botocore, sphinx
editdistance==0.5.3
flaky==3.6.1
ftfy==5.6
h5py==2.10.0
idna==2.8 # via requests
imagesize==1.2.0 # via sphinx
importlib-metadata==1.3.0 # via catalogue, pluggy, pytest
jinja2==2.10.3 # via numpydoc, sphinx
jmespath==0.9.4 # via boto3, botocore
joblib==0.14.1 # via sacremoses, scikit-learn
jsonpickle==1.2
kiwisolver==1.1.0 # via matplotlib
markupsafe==1.1.1 # via jinja2
matplotlib==3.1.2
more-itertools==8.0.2 # via pytest, zipp
murmurhash==1.0.2 # via preshed, spacy, thinc
nltk==3.4.5
numpy==1.18.0
numpydoc==0.9.2
overrides==2.0
packaging==19.2 # via pytest, sphinx
parsimonious==0.8.1
plac==1.1.3 # via spacy, thinc
pluggy==0.13.1 # via pytest
preshed==3.0.2 # via spacy, thinc
protobuf==3.11.2 # via tensorboardx
py==1.8.1 # via pytest
pygments==2.5.2 # via sphinx
pyparsing==2.4.6 # via matplotlib, packaging
pytest==5.3.2
python-dateutil==2.8.0
pytorch-pretrained-bert==0.6.2
pytz==2019.3
regex==2019.12.20 # via pytorch-pretrained-bert, transformers
requests==2.22.0
responses==0.10.9
s3transfer==0.2.1 # via boto3
sacremoses==0.0.35 # via transformers
scikit-learn==0.22.1
scipy==1.4.1
sentencepiece==0.1.85 # via transformers
six==1.13.0 # via cycler, h5py, nltk, packaging, parsimonious, protobuf, python-dateutil, responses, sacremoses, tensorboardx
snowballstemmer==2.0.0 # via sphinx
spacy==2.2.3
sphinx==2.3.1 # via numpydoc
sphinxcontrib-applehelp==1.0.1 # via sphinx
sphinxcontrib-devhelp==1.0.1 # via sphinx
sphinxcontrib-htmlhelp==1.0.2 # via sphinx
sphinxcontrib-jsmath==1.0.1 # via sphinx
sphinxcontrib-qthelp==1.0.2 # via sphinx
sphinxcontrib-serializinghtml==1.1.3 # via sphinx
sqlparse==0.3.0
srsly==1.0.0 # via spacy, thinc
tensorboardx==2.0
thinc==7.3.1 # via spacy
torch==1.3.1
tqdm==4.41.1
transformers==2.3.0
urllib3==1.25.7 # via botocore, requests
wasabi==0.5.0 # via spacy, thinc
wcwidth==0.1.8 # via ftfy, pytest
word2number==1.1
zipp==0.6.0 # via importlib-metadata
# The following packages are considered to be unsafe in a requirements file:
# setuptools
That said--I'm rather inexpert with the Python ecosystem so I'm hesitant to recommend anything.
@dirkgr @brendan-ai2 I'm curious what you two think.
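For anyone who wants to inspect a generated file like the one above, here is a minimal sketch of reading its pins. It assumes the "name==version # via parent, parent" layout shown in that output; the parse_pins helper and the three-line sample are made up for illustration and are not part of pip-tools. Lines without a "# via" comment appear to be the packages listed directly in setup.py, but I haven't verified that for every case.

```python
import re

# Matches lines such as "attrs==19.3.0 # via pytest" or "boto3==1.10.45".
PIN_RE = re.compile(
    r"^(?P<name>[A-Za-z0-9._-]+)==(?P<version>\S+)"
    r"(?:\s+#\s*via\s+(?P<via>.+))?$"
)


def parse_pins(text):
    """Return {name: (version, [parents])}; comment and blank lines are skipped."""
    pins = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = PIN_RE.match(line)
        if match is None:
            continue  # not a simple "name==version" pin
        via = match.group("via")
        parents = [p.strip() for p in via.split(",")] if via else []
        pins[match.group("name").lower()] = (match.group("version"), parents)
    return pins


# Hypothetical excerpt in the same layout as the pip-compile output.
sample = """\
# This file is autogenerated by pip-compile
attrs==19.3.0 # via pytest
boto3==1.10.45
botocore==1.13.45 # via boto3, s3transfer
"""

pins = parse_pins(sample)
# Pins with no "via" parents (assumed here to be the direct dependencies).
direct = sorted(name for name, (_, parents) in pins.items() if not parents)
print(direct)
```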
It would be lovely to not have the duplication between requirements.txt and setup.py, but my understanding is that requirements.txt is for working on the library, and setup.py is for using it, so the goals are a little different.
Also, this pins all the versions. Didn't you conclude that pinning is bad, because it breaks conflict resolution for users?
We did conclude that--but since it's requirements.txt (local development) and not setup.py (what our users interact with) it seems like it's probably a good thing. It would mean we all develop in the same environment but have looser pins for our users who may need to use AllenNLP with additional software packages.
Yes, the goals for requirements.txt and setup.py are different, but I believe we list the same dependencies in each.
If we recommend pip install -e . for development, do we need requirements.txt at all?
Sorry, didn't mean to close.
That's an even better suggestion. We can remove requirements.txt and ask people to install locally by running pip install -e . instead. If someone needs a requirements.txt they can create one with pip-compile.
The only trade-off is that it wouldn't ensure we all have the same dependencies in our development environments, but that's not a new problem--it's our present one.
pip-compile is bundled with pip-sync so you can keep your environment up to date with what's explicitly specified in the repo.
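To make the idea of syncing concrete, here is a rough conceptual sketch of what "keep your environment up to date" means: install or change whatever differs from the pins and remove whatever is not pinned. This is not pip-sync's actual implementation; plan_sync and both dictionaries are hypothetical.

```python
def plan_sync(installed, pinned):
    """Compare {name: version} maps and return (to_install, to_uninstall)."""
    # Anything pinned that is missing or at a different version gets (re)installed.
    to_install = {
        name: version
        for name, version in pinned.items()
        if installed.get(name) != version
    }
    # Anything installed that is not pinned at all gets removed.
    to_uninstall = sorted(name for name in installed if name not in pinned)
    return to_install, to_uninstall


# Made-up environment and pins, for illustration only.
installed = {"requests": "2.21.0", "six": "1.13.0", "flask": "1.1.1"}
pinned = {"requests": "2.22.0", "six": "1.13.0", "tqdm": "4.41.1"}

to_install, to_uninstall = plan_sync(installed, pinned)
print(to_install)
print(to_uninstall)
```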
Dropping requirements.txt loses the dev dependencies on things like black and flake8, but those are the only two I can think of, and it's not a big deal to just pip install them manually.
If we care, we could have a separate dev-requirements.txt that only has things that aren't in setup.py.
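As a sketch of what "only things that aren't in setup.py" could mean in practice, the snippet below filters a list of dev tools against the names in install_requires. Both lists and the helper names are hypothetical, not AllenNLP's real dependency lists, and the name extraction is deliberately simplistic (it does not handle every requirement-specifier form).

```python
import re


def requirement_name(requirement):
    """Strip version specifiers, extras, and markers from a requirement string."""
    return re.split(r"[<>=!~\[; ]", requirement, maxsplit=1)[0].strip().lower()


def dev_only(dev_tools, install_requires):
    """Return the dev tools that setup.py does not already pull in."""
    runtime = {requirement_name(req) for req in install_requires}
    return sorted(
        tool for tool in dev_tools if requirement_name(tool) not in runtime
    )


# Hypothetical inputs, for illustration only.
install_requires = ["torch>=1.2.0", "numpy", "pytest"]
dev_tools = ["black", "flake8", "pytest"]

# These names would be the candidate contents of dev-requirements.txt.
print("\n".join(dev_only(dev_tools, install_requires)))
```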
As @dirkgr says, I think we probably don't want pins by default. Honestly, it's unclear to me if there's an easy solution here. We want to keep requirements as loose as possible for our users, but that means having a very broad surface for testing.
dev-requirements.txt sounds good to me. One possible issue, however, is that we use pip install -r requirements.txt in our Dockerfile to get caching for our dependencies. I'm not sure whether pip install -e . is a drop-in replacement for that specific case.
@brendan-ai2 I don't think there would be any problem pinning development dependencies. That said--the dev-requirements.txt approach seems better.
Also, I'm less keen on using pip-compile and pip-sync since they are not compatible with Conda. I got some weird conda errors when running pip-sync, which convinced me the tooling isn't worth much for us.
See https://github.com/allenai/allennlp/pull/3580 for the dev-requirements.txt approach.
Closing due to #3580