To prevent Python 2/3 compatibility problems and improve performance on Windows, we need a better way of managing compatibility checks. Copying over my comment from #982:
For spaCy v2.0, we definitely want to come up with a better way of standardising the compatibility checks across the code base. This is all pretty messy at the moment. Maybe a
spacy.compathelper would be a good solution for this. All functions that require different versions depending on the platform or Python version could then be imported directly from there.
sys (unideal)json.dumps using six for Python version check (good, should be extended and standardised)Introduce a spacy.compat module that includes replacements for all functions that require different versions depending on Python version or platform. Those functions should always be imported from that module instead of using them directly.
# compat.py
import six
import json
# version 1
if six.PY2:
json_dumps = lambda data: json.dumps(data, indent=2).decode("utf8")
elif six.PY3:
json_dumps = lambda data: json.dumps(data, indent=2)
# version 2
def json_dumps(data):
if six.PY2:
return json.dumps(data, indent=2).decode("utf8")
elif six.PY3:
return json.dumps(data, indent=2)
from spacy.compat import json_dumps
some_data = {'key': 'value'}
json_data = json_dumps(some_data)
spacy.compat could also expose a more general function that returns True / False for specific configurations. This should only be used in selected cases, for example to print specific error messages or warnings in spacy.cli.
from spacy.compat import is_config
if is_config(python=2, windows=True):
print("Specific message for Python 2 on Windows")
unicode stuff, pathlib transforms, ...Hi @ines , I don't know if you'd consider this a "good" example ๐
but I've used a separate compat.py module in textacy and imported anything with PY2/3 differences from there. When overriding builtins, I usually add an underscore to the name to make it obvious that it's not the default: e.g. unicode_ and bytes_. FWIW, I think it's worked well.
Thanks, this is looks good! I like the underscore naming conventions, I think we should probably adopt this concept.
Here's some of my work in progress btw: https://github.com/explosion/spaCy/blob/develop/spacy/compat.py
It's been working pretty well so far โ and I finally get to tidy up some of the old code ๐
Already implemented in v1.8.0 โ there's probably more to do and to fix, but for now, it's already working pretty well.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Most helpful comment
Thanks, this is looks good! I like the underscore naming conventions, I think we should probably adopt this concept.
Here's some of my work in progress btw: https://github.com/explosion/spaCy/blob/develop/spacy/compat.py
It's been working pretty well so far โ and I finally get to tidy up some of the old code ๐