Dali: Inconsistent Edge create types for feed_input (int64)

Created on 12 Jun 2019 · 7 Comments · Source: NVIDIA/DALI

I was using ExternalSource to add data to the pipeline. During the feed_input step, it wraps the NumPy array in an Edge.TensorListCPU or Edge.TensorCPU.
However, I get the following error:

RuntimeError: [/opt/dali/dali/python/backend_impl.cc:94] Cannot create type for unknow format string: l
Stacktrace (67 entries):

Minimal example that fails:

Edge.TensorCPU(np.array([1], dtype=np.int64))

Example that works:

Edge.TensorCPU(np.array([1], dtype=np.longlong))

Which is surprising since on this machine:

>>> np.int64
<class 'numpy.int64'>
>>> np.longlong
<class 'numpy.int64'>

nvidia.dali.__version__
'0.10.0'

bug

Mmdixon

All 7 comments

It seems that the py::buffer object created for the failing case carries format type 'l', and being unable to parse it further seems to be a pybind11 limitation. It supports only these formats: https://github.com/pybind/pybind11/blob/v2.3.0/include/pybind11/detail/common.h#L700 (described at https://docs.python.org/3/library/struct.html#format-characters). Will dig deeper.

JanuszL on 13 Jun 2019

It seems that Python's buffer protocol returns type l for numpy.int64. This sounds like a NumPy or Python problem (less likely pybind11) and is not related to DALI, which relies on the format returned from Python (I have reproduced it without DALI as well).
Anyway, I just reported it to the pybind11 developers to make sure that it is definitely not their fault.

JanuszL on 13 Jun 2019

I wouldn't say the dtypes are wrong per se, if you break down their properties.
For np.longlong, it is pretty standard and matches what the docs say:

base:dtype('int64')
byteorder:'='
char:'q'
isnative:True
itemsize:8
kind:'i'
name:'int64'
str:'<i8'
type:<class 'numpy.int64'>

For np.int64:

base:dtype('int64')
byteorder:'='
char:'l'
isnative:True
itemsize:8
kind:'i'
name:'int64'
str:'<i8'
type:<class 'numpy.int64'>

Note that the _standard_ size of l is 4, but here it represents the _native_ size of a long, which is going to be sizeof(long) == 8.

The representations of both agree: <i8, that is, little-endian, integer kind, size 8.
pybind11 probably avoids creating format descriptors with l because it is one of the more ambiguous ones, but it does parse l correctly if given one.
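
The property listings above can be reproduced with a short script. This is only a minimal sketch: `describe` is a hypothetical helper name introduced here, and the `char` and buffer format values it reports depend on the platform and the NumPy build (the l versus q split shown above is what the machine in this thread reported).

```python
import numpy as np


def describe(dtype_like):
    """Collect the dtype attributes compared in this thread (hypothetical helper)."""
    dt = np.dtype(dtype_like)
    return {
        "char": dt.char,          # struct-style type code, platform-dependent
        "itemsize": dt.itemsize,  # size of one element in bytes
        "kind": dt.kind,          # 'i' means signed integer
        "str": dt.str,            # byte order + kind + size, e.g. '<i8'
        # Format string exported through Python's buffer protocol.
        "buffer_format": memoryview(np.zeros(1, dtype=dt)).format,
    }


for name in ("int64", "longlong"):
    print(name, describe(getattr(np, name)))
```

If the two dictionaries differ only in `char` and `buffer_format`, the arrays have the same memory layout and only the label attached to it differs.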

Mmdixon on 13 Jun 2019

I think the Python documentation about the character representation of the underlying format is rather unequivocal: https://docs.python.org/3/library/struct.html#format-characters and https://docs.python.org/2/library/struct.html#format-characters. If you get 'l' you expect signed 4-byte data, while 'q' is signed 8-byte data.
I imagine we could add a workaround for data that claims to be of type 'l' but has an 8-byte item size, but I don't know whether that would break anything else.

JanuszL on 14 Jun 2019

For Format Strings, the docs linked say

The ‘Standard size’ column refers to the size of the packed value in bytes when using standard size; that is, when the format string starts with one of '<', '>', '!' or '='. When using native size, the size of the packed value is platform-dependent.

Then under https://docs.python.org/3/library/struct.html#byte-order-size-and-alignment

If the first character is not one of these, '@' is assumed.

Where @ means native size. So l means @l, which is a native long, here an 8-byte integer.
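
The difference between standard and native size can be checked directly with the struct module. This is a small illustrative sketch; the native result depends on the platform's C compiler, so no fixed value is claimed for it.

```python
import struct

# Standard size: fixed by the struct module whenever a byte-order prefix is given.
print(struct.calcsize("=l"))  # 4 on every platform
print(struct.calcsize("=q"))  # 8 on every platform

# Native size: no prefix means '@', so the result follows the C `long` type.
print(struct.calcsize("l"))   # platform-dependent
print(struct.calcsize("@l") == struct.calcsize("l"))  # True
```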

I think the confusion comes from the fact that in this case l is equivalent to q; however,

pybind11::format_descriptor<long>::format()
pybind11::format_descriptor<long long>::format()
pybind11::format_descriptor<int64_t>::format()

are all going to return q (because why not; the docs don't mention a canonical form that makes format strings unique when there are multiple correct answers). So in a switch statement that matches only on those characters, l will fall through the cracks.
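
One way to avoid that kind of fall-through is to resolve an integer format code by its signedness and native size instead of by the character alone. The sketch below illustrates the idea in Python; it is not the code used by DALI or pybind11. `canonical_int_format` and `_CANONICAL_INT` are hypothetical names, and the table assumes the usual native sizes (1, 2, 4 and 8 bytes for b, h, i and q).

```python
import struct

# Hypothetical lookup: (is_signed, size in bytes) -> one canonical format code.
_CANONICAL_INT = {
    (True, 1): "b", (False, 1): "B",
    (True, 2): "h", (False, 2): "H",
    (True, 4): "i", (False, 4): "I",
    (True, 8): "q", (False, 8): "Q",
}


def canonical_int_format(fmt):
    """Map a native integer format code to a size-based canonical code."""
    code = fmt.lstrip("@")        # '@' (native) is the default prefix
    if len(code) != 1 or code not in "bBhHiIlLqQ":
        raise ValueError(f"unsupported format string: {fmt!r}")
    size = struct.calcsize(code)  # native size of this code on this platform
    # Lowercase codes are signed, uppercase codes are unsigned.
    return _CANONICAL_INT[(code.islower(), size)]


print(canonical_int_format("l"))  # same code as 'q' where long is 8 bytes
print(canonical_int_format("q"))
```

With a normalization step like this, l and q land in the same branch on platforms where both are 8 bytes wide, while l still maps to the 4-byte case where a native long is 4 bytes.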

Mmdixon on 14 Jun 2019

You are right. I made some modifications to make DALI more flexible in such cases: https://github.com/NVIDIA/DALI/pull/985.

JanuszL on 17 Jun 2019

It should be included in 0.12.0

JanuszL on 6 Aug 2019
