Numba: Initializing a NumPy structured array inside a jitclass constructor throws an error

Created on 2 Jun 2020 · 3 Comments · Source: numba/numba

This works:

from numba.experimental import jitclass
from numba import from_dtype
import numpy as np

values_dtype = np.dtype([
    ('one', 'U10'),
    ('two', 'f8')
])

class My_test:
    def __init__(self):
        self.values = values = np.empty(2, dtype=values_dtype)

my_test = My_test()
print(my_test.values)

But when jitting the class, it throws an error:

from numba.experimental import jitclass
from numba import from_dtype
import numpy as np

values_dtype = np.dtype([
    ('one', 'U10'),
    ('two', 'f8')
])

@jitclass([
    ('values', from_dtype(values_dtype))
])
class My_test:
    def __init__(self):
        self.values = values = np.empty(2, dtype=values_dtype)

my_test = My_test()
print(my_test.values)

Cannot cast unaligned array(Record(one[type=[unichr x 10];offset=0],two[type=float64;offset=40];48;False), 1d, C) to Record(one[type=[unichr x 10];offset=0],two[type=float64;offset=40];48;False): %".96" = load {i8*, i8*, i64, i64, [48 x i8]*, [1 x i64], [1 x i64]}, {i8*, i8*, i64, i64, [48 x i8]*, [1 x i64], [1 x i64]}* %"$12call_function_kw.5"

Any workarounds would be greatly appreciated!

Thanks :)

numba: 0.49.1
numpy: 1.18.4
python: 3.7.1

  • [x] I am using the latest released version of Numba (most recent is visible in
    the change log (https://github.com/numba/numba/blob/master/CHANGE_LOG)).
  • [x] I have included below a minimal working reproducer (if you are unsure how
    to write one see http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports).

All 3 comments

Thanks for the issue report - it looks to me like the code creates an array of records with np.empty(2, dtype=values_dtype) and assigns it to self.values, which the jitclass spec declares as a single record type. This works in Python because a variable can be rebound to a value of any type, but compiled code needs the assigned type to match the declared one. The error message is perhaps easier to interpret if you examine the types that it's casting from:

array(Record(one[type=[unichr x 10];offset=0],two[type=float64;offset=40];48;False), 1d, C)

and to:

Record(one[type=[unichr x 10];offset=0],two[type=float64;offset=40];48;False)

on separate lines.

However, my quick attempt to fix this, changing the jitclass values type with:

diff --git a/repro.py b/repro.py
index 2e1ba82..d8a43fd 100644
--- a/repro.py
+++ b/repro.py
@@ -8,7 +8,7 @@ values_dtype = np.dtype([
 ])

 @jitclass([
-    ('values', from_dtype(values_dtype))
+    ('values', from_dtype(values_dtype)[:])
 ])
 class My_test:
     def __init__(self):

to declare values as an array rather than a record, results in another traceback:

Traceback (most recent call last):
  File "repro.py", line 18, in <module>
    print(my_test.values)
  File "/home/gmarkall/miniconda3/envs/numba/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 1506, in _array_str_implementation
    return array2string(a, max_line_width, precision, suppress_small, ' ', "")
  File "/home/gmarkall/miniconda3/envs/numba/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 712, in array2string
    return _array2string(a, options, separator, prefix)
  File "/home/gmarkall/miniconda3/envs/numba/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 484, in wrapper
    return f(self, *args, **kwargs)
  File "/home/gmarkall/miniconda3/envs/numba/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 517, in _array2string
    lst = _formatArray(a, format_function, options['linewidth'],
  File "/home/gmarkall/miniconda3/envs/numba/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 838, in _formatArray
    return recurser(index=(),
  File "/home/gmarkall/miniconda3/envs/numba/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 794, in recurser
    word = recurser(index + (-i,), next_hanging_indent, next_width)
  File "/home/gmarkall/miniconda3/envs/numba/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 748, in recurser
    return format_function(a[index])
  File "/home/gmarkall/miniconda3/envs/numba/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 1294, in __call__
    str_fields = [
  File "/home/gmarkall/miniconda3/envs/numba/lib/python3.8/site-packages/numpy/core/arrayprint.py", line 1294, in <listcomp>
    str_fields = [
UnicodeDecodeError: 'utf-32-le' codec can't decode bytes in position 0-3: code point not in range(0x110000)

I'm looking into this next traceback now.

Ah, never mind - the traceback is because it was printing uninitialized memory from np.empty. The following works:

from numba.experimental import jitclass
from numba import from_dtype
import numpy as np

values_dtype = np.dtype([
    ('one', 'U10'),
    ('two', 'f8')
])

@jitclass([
    ('values', from_dtype(values_dtype)[:])
])
class My_test:
    def __init__(self):
        self.values = values = np.zeros(2, dtype=values_dtype)

my_test = My_test()
print(my_test.values)

producing:

[('', 0.) ('', 0.)]
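The np.empty versus np.zeros point can also be checked in plain NumPy, outside any jitclass. This is only a sketch; the variable names values and scratch and the sample field values are made up for illustration:

```python
import numpy as np

values_dtype = np.dtype([
    ('one', 'U10'),
    ('two', 'f8')
])

# np.zeros gives every field a defined starting value:
# an empty string for 'one' and 0.0 for 'two'.
values = np.zeros(2, dtype=values_dtype)

# Fields can then be filled in element by element.
values['one'][0] = 'first'
values['two'][0] = 1.5
print(values)

# np.empty leaves the memory uninitialized, so it is only safe to
# read (or print) the array once every field has been written.
scratch = np.empty(2, dtype=values_dtype)
scratch['one'] = ''
scratch['two'] = 0.0
print(scratch)
```

If the array is going to be fully overwritten anyway, np.empty avoids the extra zero-fill; otherwise np.zeros is the safer default for dtypes with string fields.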

@gmarkall Thank you so much for your quick reply and help! :)
