Spacy: ๐Ÿ’ซ Improve compatibility checks for platform and Python version

Created on 14 Apr 2017  ยท  4Comments  ยท  Source: explosion/spaCy

To prevent Python 2/3 compatibility problems and improve performance on Windows, we need a better way of managing compatibility checks. Copying over my comment from #982:

For spaCy v2.0, we definitely want to come up with a better way of standardising the compatibility checks across the code base. This is all pretty messy at the moment. Maybe a spacy.compat helper would be a good solution for this. All functions that require different versions depending on the platform or Python version could then be imported directly from there.

Current state

  • Manual checks across the code base (bad)
  • Util functions to check for Windows and Python2 matching string returned by sys (unideal)
  • Overwriting specific functions like json.dumps using six for Python version check (good, should be extended and standardised)

Suggested solution

Introduce a spacy.compat module that includes replacements for all functions that require different versions depending on Python version or platform. Those functions should always be imported from that module instead of using them directly.

# compat.py
import six
import json

# version 1
if six.PY2:
   json_dumps = lambda data: json.dumps(data, indent=2).decode("utf8")
elif six.PY3:
    json_dumps = lambda data: json.dumps(data, indent=2)

# version 2
def json_dumps(data):
    if six.PY2:
        return json.dumps(data, indent=2).decode("utf8")
   elif six.PY3:
        return json.dumps(data, indent=2)
from spacy.compat import json_dumps

some_data = {'key': 'value'}
json_data = json_dumps(some_data)

spacy.compat could also expose a more general function that returns True / False for specific configurations. This should only be used in selected cases, for example to print specific error messages or warnings in spacy.cli.

from spacy.compat import is_config

if is_config(python=2, windows=True):
    print("Specific message for Python 2 on Windows")

Questions / Considerations

  • How are other (similar) libraries solving this? Any good examples we could take inspiration from?
  • Which functions need to be covered? E.g. unicode stuff, pathlib transforms, ...
enhancement help wanted (easy) ๐ŸŒ™ nightly

Most helpful comment

Thanks, this is looks good! I like the underscore naming conventions, I think we should probably adopt this concept.

Here's some of my work in progress btw: https://github.com/explosion/spaCy/blob/develop/spacy/compat.py
It's been working pretty well so far โ€“ and I finally get to tidy up some of the old code ๐Ÿ˜„

All 4 comments

Hi @ines , I don't know if you'd consider this a "good" example ๐Ÿ˜… but I've used a separate compat.py module in textacy and imported anything with PY2/3 differences from there. When overriding builtins, I usually add an underscore to the name to make it obvious that it's not the default: e.g. unicode_ and bytes_. FWIW, I think it's worked well.

Thanks, this is looks good! I like the underscore naming conventions, I think we should probably adopt this concept.

Here's some of my work in progress btw: https://github.com/explosion/spaCy/blob/develop/spacy/compat.py
It's been working pretty well so far โ€“ and I finally get to tidy up some of the old code ๐Ÿ˜„

Already implemented in v1.8.0 โ€“ there's probably more to do and to fix, but for now, it's already working pretty well.

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

Was this page helpful?
0 / 5 - 0 ratings