Numba: Python 3.9 Support

Created on 10 Oct 2020 · 12 Comments · Source: numba/numba

Reporting a bug

  • [X] I have tried using the latest released version of Numba (the most recent is
    visible in the change log: https://github.com/numba/numba/blob/master/CHANGE_LOG).
  • [X] I have included below a minimal working reproducer (if you are unsure how
    to write one see http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports).

I'm seeing this warning pop up for a clean installation of numba with Python 3.9:

python3 -m pip install numba
Collecting numba
  Using cached numba-0.51.2.tar.gz (2.1 MB)
Processing ./.cache/pip/wheels/40/08/53/26580f3607587bd3fa1a18619841d1dcfedcabf2be52f8e2cd/llvmlite-0.34.0-cp39-cp39-linux_x86_64.whl
Processing ./.cache/pip/wheels/a3/17/dd/f2dba23a35bb6008732772ccfb13d3d0e537fbc6919ce6862b/numpy-1.19.2-cp39-cp39-linux_x86_64.whl
Requirement already satisfied: setuptools in /usr/local/lib/python3.9/site-packages (from numba) (50.3.0)
Building wheels for collected packages: numba
  Building wheel for numba (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: /usr/local/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-t2pl9xsf/numba/setup.py'"'"'; __file__='"'"'/tmp/pip-install-t2pl9xsf/numba/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-3k5ws828
       cwd: /tmp/pip-install-t2pl9xsf/numba/
  Complete output (7 lines):
  Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "/tmp/pip-install-t2pl9xsf/numba/setup.py", line 354, in <module>
      metadata['ext_modules'] = get_ext_modules()
    File "/tmp/pip-install-t2pl9xsf/numba/setup.py", line 87, in get_ext_modules
      import numpy.distutils.misc_util as np_misc
  ModuleNotFoundError: No module named 'numpy'
  ----------------------------------------
  ERROR: Failed building wheel for numba
  Running setup.py clean for numba
  ERROR: Command errored out with exit status 1:
   command: /usr/local/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-t2pl9xsf/numba/setup.py'"'"'; __file__='"'"'/tmp/pip-install-t2pl9xsf/numba/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' clean --all
       cwd: /tmp/pip-install-t2pl9xsf/numba
  Complete output (7 lines):
  Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "/tmp/pip-install-t2pl9xsf/numba/setup.py", line 354, in <module>
      metadata['ext_modules'] = get_ext_modules()
    File "/tmp/pip-install-t2pl9xsf/numba/setup.py", line 87, in get_ext_modules
      import numpy.distutils.misc_util as np_misc
  ModuleNotFoundError: No module named 'numpy'
  ----------------------------------------
  ERROR: Failed cleaning build dir for numba
Failed to build numba
Installing collected packages: llvmlite, numpy, numba
    Running setup.py install for numba ... done
  DEPRECATION: numba was installed using the legacy 'setup.py install' method, because a wheel could not be built for it. pip 21.0 will remove support for this functionality. A possible replacement is to fix the wheel build issue reported above. You can find discussion regarding this at https://github.com/pypa/pip/issues/8368.
Successfully installed llvmlite-0.34.0 numba-0.51.2 numpy-1.19.2

I can see on the README that these versions are currently recommended:

  • Python versions: 3.6-3.8
  • llvmlite 0.33.*

Are Python 3.9 and llvmlite 0.34.* not supported? Pip currently warns about a wheel build failure, but numba still installs.
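A minimal sketch of a guard against this situation, assuming only the supported range quoted from the README above (Python 3.6 to 3.8 for Numba 0.51.x). The helper name `numba_supports` and the two constants are hypothetical and not part of Numba; the check would need updating for later releases.

```python
import sys

# Supported range as quoted from the README above (Numba 0.51.x).
# These constants are illustrative assumptions, not values exported by Numba.
SUPPORTED_MIN = (3, 6)
SUPPORTED_MAX = (3, 8)


def numba_supports(version_info=sys.version_info):
    """Return True if the (major, minor) version lies in the quoted range."""
    major_minor = tuple(version_info[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


if not numba_supports():
    print("This interpreter is outside the range the README lists as supported.")
```

Checking before installation avoids ending up with an install that completes but is not expected to work.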

feature_request

Most helpful comment

I've started work on this and have discovered that, due to a couple of changes to bytecode instruction sequences, it is going to be a lot more work than was initially envisaged. The patch for Python 3.9 is probably going to be large and carry associated risks (it needs to rewrite swathes of bytecode sequences). It is therefore the view of the core developers that this change warrants a full new release, so that it goes through the full release candidate and community testing process. Unless we can figure out a way to avoid this, which is not looking likely at present, Python 3.9 support will most likely land in Numba 0.53, which will be tagged for RC early in 2021.

Sorry to have to bring this news and thank you all for your understanding.

Technical details:

The main issues stem from some bytecode generation changes that arose as a result of:
https://bugs.python.org/issue39320
Essentially, the patch to CPython is pushing work from the interpreter into the compiler, which helps CPython performance. However, this results in an awkward problem for Numba.

An example:

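import dis

import numpy as np

def foo(a, b):
    t = 10
    z = (1, 2, *a, 3, t, *b, 4)
    return z

# dis.dis() prints the disassembly itself and returns None,
# hence the "None" line in the output below.
print(dis.dis(foo))
print(foo((100, np.zeros(4)), (300, 400)))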

On Python < 3.9:

 65           0 LOAD_CONST               1 (10)
              2 STORE_FAST               2 (t)

 66           4 LOAD_CONST               6 ((1, 2))
              6 LOAD_FAST                0 (a)
              8 LOAD_CONST               4 (3)
             10 LOAD_FAST                2 (t)
             12 BUILD_TUPLE              2
             14 LOAD_FAST                1 (b)
             16 LOAD_CONST               7 ((4,))
             18 BUILD_TUPLE_UNPACK       5
             20 STORE_FAST               3 (z)

 67          22 LOAD_FAST                3 (z)
             24 RETURN_VALUE
None
(1, 2, 100, array([0., 0., 0., 0.]), 3, 10, 300, 400, 4)

On Python 3.9:

 65           0 LOAD_CONST               1 (10)
              2 STORE_FAST               2 (t)

 66           4 LOAD_CONST               2 (1)
              6 LOAD_CONST               3 (2)
              8 BUILD_LIST               2
             10 LOAD_FAST                0 (a)
             12 LIST_EXTEND              1
             14 LOAD_CONST               4 (3)
             16 LIST_APPEND              1
             18 LOAD_FAST                2 (t)
             20 LIST_APPEND              1
             22 LOAD_FAST                1 (b)
             24 LIST_EXTEND              1
             26 LOAD_CONST               5 (4)
             28 LIST_APPEND              1
             30 LIST_TO_TUPLE
             32 STORE_FAST               3 (z)

 67          34 LOAD_FAST                3 (z)
             36 RETURN_VALUE
None
(1, 2, 100, array([0., 0., 0., 0.]), 3, 10, 300, 400, 4)

On Python < 3.9, BUILD_TUPLE and BUILD_TUPLE_UNPACK are used to assemble the tuple, and the parts of the tuple that are defined as constants are themselves stored as tuples. For Numba this is relatively easy to handle: it is just a question of looking at how many tuples are on the stack and then generating a sequence of binary additions to compute the result.
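To make the "sequence of binary additions" concrete, here is a plain-Python sketch of the equivalence. It is not Numba's internal code. The function names are made up for illustration, and it assumes `a` and `b` are tuples, as in the example above.

```python
def build_with_unpacking(a, b):
    # The original expression from the example.
    t = 10
    return (1, 2, *a, 3, t, *b, 4)


def build_with_binary_adds(a, b):
    # The five operands that BUILD_TUPLE_UNPACK 5 sees on the stack,
    # joined by four binary tuple additions. Assumes a and b are tuples.
    t = 10
    return (1, 2) + a + (3, t) + b + (4,)


args = ((100, "x"), (300, 400))
print(build_with_unpacking(*args) == build_with_binary_adds(*args))
```

Each operand is a tuple whose length and element types are known individually, which is why the type of the sum can be worked out statically.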

On Python 3.9, the tuple is built by creating a list, extending/appending to it, and finally using the new opcode LIST_TO_TUPLE to turn the list into a tuple. The problem is that Numba cannot magically turn a list into a tuple, because the tuple type in Numba must have both its size and the types of all its elements known at compile time. Also, lists in Numba must be homogeneous in type, so even if a list-to-tuple converter were possible, it would fail unless all the elements of the list were of the same type and the size of the list were known. As a result, this new bytecode sequence needs analysing and then rewriting as a composition of tuple-based expressions, so that the type of the resulting tuple can be statically determined. Rewriting bytecode sequences such as the above is inherently complicated and risky, and it is this risk that warrants a new release tag with full release candidates so that the changes can be well tested.
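The kind of rewrite described here can be sketched as a toy. This is not Numba's implementation: the function `rewrite_list_to_tuple` and the symbolic `(opname, arg)` format are assumptions made for illustration. It walks the straight-line Python 3.9 sequence from the listing above and emits an equivalent expression made only of tuple additions.

```python
def rewrite_list_to_tuple(instructions):
    """Toy rewrite of BUILD_LIST ... LIST_TO_TUPLE into tuple additions.

    `instructions` is a list of (opname, arg) pairs. Stack entries are either
    expression strings or a Python list of tuple-valued expression strings
    standing in for the list under construction.
    """
    stack = []
    for op, arg in instructions:
        if op == "LOAD_CONST":
            stack.append(repr(arg))
        elif op == "LOAD_FAST":
            stack.append(arg)
        elif op == "BUILD_LIST":
            items = [stack.pop() for _ in range(arg)][::-1]
            stack.append(["(" + ", ".join(items) + ",)"] if items else [])
        elif op == "LIST_APPEND":
            item = stack.pop()
            stack[-arg].append("(" + item + ",)")  # one element becomes a 1-tuple
        elif op == "LIST_EXTEND":
            seq = stack.pop()
            stack[-arg].append(seq)  # assumed to already be a tuple
        elif op == "LIST_TO_TUPLE":
            parts = stack.pop()
            stack.append(" + ".join(parts) if parts else "()")
        else:
            raise NotImplementedError(op)
    return stack.pop()


# The Python 3.9 sequence for z = (1, 2, *a, 3, t, *b, 4), written symbolically.
SEQUENCE = [
    ("LOAD_CONST", 1), ("LOAD_CONST", 2), ("BUILD_LIST", 2),
    ("LOAD_FAST", "a"), ("LIST_EXTEND", 1),
    ("LOAD_CONST", 3), ("LIST_APPEND", 1),
    ("LOAD_FAST", "t"), ("LIST_APPEND", 1),
    ("LOAD_FAST", "b"), ("LIST_EXTEND", 1),
    ("LOAD_CONST", 4), ("LIST_APPEND", 1),
    ("LIST_TO_TUPLE", None),
]

print(rewrite_list_to_tuple(SEQUENCE))
```

A real rewrite also has to cope with control flow, nesting, and operands that are not tuples. The toy ignores all of these, which gives some sense of where the risk mentioned above comes from.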

All 12 comments

For reference, the checklist in #6332 describes the numba 0.52 and 0.53 release cycle plans.

@mjsteinbaugh thanks for asking about this. The Numba stack, which includes llvmlite, currently does not support being executed on Python 3.9. So, I have modified the title of this issue accordingly and rephrased it as a feature request. We may, if everything goes well, support Python 3.9 with the next patch release before the end of the year.

I should also note that the modifications to the bytecode in Python 3.9 mean that, even if you manage to install/compile Numba and llvmlite, it is unlikely to work correctly when used.

OK thanks for the update @esc . I figured that was the case, and I'll stick with Python 3.8.6 for the time being.

👍

FWIW, llvmlite appears to work ok with Python 3.9 (at least, the test suite passes) but the numba test suite seems to hang: https://buildd.debian.org/status/fetch.php?pkg=numba&arch=amd64&ver=0.51.2-1%2Bb1&stamp=1602743150&raw=0

@mwhudson thanks for testing. I've started looking at 3.9 for Numba. There are a few changes but nothing huge, and I expect a fair bit of Numba will still work OK.

Is there a rough ETA for this?

I raised this yesterday at the Numba core developer meeting (minutes). The aim is to ship a 0.52.1, which is functionally identical to 0.52.0 but with Python 3.9 support added, mid-December or before.

For those reacting to https://github.com/numba/numba/issues/6345#issuecomment-738696458 with the "confused" (:confused:) emoji, some explanation...

It's really important to note that Numba uses Python _bytecode_ as input, not Python _source code_. There's some information about the basics of how Numba works in the developer guide here, which will hopefully help explain why bytecode changes cause Numba problems! If anyone has questions about this topic, please feel free to post them on Numba's discourse forum.
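As a small illustration of the distinction, the standard-library `dis` module shows the bytecode that CPython compiles a function into, which is the kind of input a bytecode-based tool works from. The helper `opnames` and the function `add` are made up for this sketch. The exact opcode names printed vary between Python versions, which is precisely the issue discussed in this thread.

```python
import dis


def add(a, b):
    return a + b


def opnames(func):
    """Return the opcode names of func's compiled bytecode, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


# The raw bytes of the compiled function; the source text is not involved.
raw = add.__code__.co_code
print(type(raw).__name__, len(raw))

# Human-readable opcode names; these differ between CPython releases.
print(opnames(add))
```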

This PR contains the progress so far on Python 3.9. The majority of the bytecode changes are accommodated through peephole rewrites of new bytecodes/bytecode sequences, but some more engineering effort is needed to convert "working" into "reliable and tested". Remaining issues include, but are not limited to, dealing with changes to the AST representation which is used in some tests, and some problems with new control flow altering bytecodes.

Thank you all for your support and patience whilst we figure this out! :slightly_smiling_face:

> I should also note that the modifications to the bytecode in Python 3.9 mean that, even if you manage to install/compile Numba and llvmlite, it is unlikely to work correctly when used.

This. I was able to work around the llvmlite install, but I get errors when attempting to use pandas_profiling.
