Dali: Extend DALIClassificationIterator to standard [images, labels] return value for pytorch compatibility

Created on 28 Apr 2020 · 4 Comments · Source: NVIDIA/DALI

I'm trying to create a standardized interface to use a DALIClassificationIterator with PyTorch and have something like this:

from copy import deepcopy

from nvidia.dali.plugin.pytorch import DALIClassificationIterator

class DALIClassificationIteratorLikePytorch(DALIClassificationIterator):
    def __next__(self):
        """Override this to return things like pytorch."""
        if self._first_batch is None:
            print("first batch")
            return super(DALIClassificationIteratorLikePytorch, self).__next__()

        sample = super(DALIClassificationIteratorLikePytorch, self).__next__()

        if sample is not None and len(sample) > 0:
            images = deepcopy(sample[0]["data"])
            labels = deepcopy(sample[0]["label"])
            # images = sample[0]["data"]
            # labels = sample[0]["label"]
            print("returning!")
            return images, labels

I'm seeing an issue where the branch for self._first_batch is triggered multiple times, which causes problems downstream. Under what scenarios is DALIClassificationIterator._first_batch used for PyTorch? Looking at the code, it doesn't seem to serve a purpose.

question

All 4 comments

Hi,
__init__ calls self.next() to set self._first_batch. At that point self._first_batch is None, so self.next() computes a value and __init__ assigns it to self._first_batch.
The next call to self.next() returns the value stored in self._first_batch and sets self._first_batch to None.
The third call to self.next() computes the next value and returns it.
I don't think you need a special code path for self._first_batch being None in your code. DALIClassificationIterator will handle it.
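
The call sequence described above can be reproduced without DALI. The sketch below is a simplified stand-in, not DALI's actual implementation: PrefetchingIterator, its computed counter, and the string batches are all hypothetical, and only the _first_batch handling mirrors what the comment describes.

```python
class PrefetchingIterator:
    """Hypothetical stand-in mimicking the _first_batch pattern described above."""

    def __init__(self, batches):
        self._source = iter(batches)
        self.computed = 0  # how many batches were actually computed
        self._first_batch = None
        # Call 1: _first_batch is None, so a batch is computed and cached.
        self._first_batch = self.next()

    def __iter__(self):
        return self

    def __next__(self):
        if self._first_batch is not None:
            # Call 2: hand out the cached batch and clear the cache.
            batch, self._first_batch = self._first_batch, None
            return batch
        # Call 1 and call 3 onwards: compute a fresh batch.
        self.computed += 1
        return next(self._source)

    def next(self):
        return self.__next__()


it = PrefetchingIterator(["batch0", "batch1", "batch2"])
computed_after_init = it.computed   # one batch computed during __init__
first = next(it)                    # served from the cache
computed_after_first = it.computed  # unchanged: nothing new was computed
second = next(it)                   # computed on demand
```

The first user-visible next() call therefore does no new work; it only returns what __init__ already fetched, which is why a subclass does not need its own branch for it.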

Thanks for the quick response, that was the first thing I tried, i.e.:

class DALIClassificationIteratorLikePytorch(DALIClassificationIterator):
    def __next__(self):
        """Override this to return things like pytorch."""
        sample = super(DALIClassificationIteratorLikePytorch, self).__next__()

        if sample is not None and len(sample) > 0:
            images = deepcopy(sample[0]["data"])
            labels = deepcopy(sample[0]["label"])
            print("returning!")
            return images, labels

But this triggers an error:

dali_imagefolder.py", line 209, in __next__
    images = sample[0]["data"]
TypeError: new(): invalid data type 'str'

But I guess I didn't look closely enough at the alternation. It is now fixed! Working example below:

class DALIClassificationIteratorLikePytorch(DALIClassificationIterator):
    def __next__(self):
        """Override this to return things like pytorch."""
        sample = super(DALIClassificationIteratorLikePytorch, self).__next__()

        if sample is not None and len(sample) > 0:
            if isinstance(sample[0], dict):
                images = sample[0]["data"]
                labels = sample[0]["label"]
            else:
                images, labels = sample

            return images, labels

Thanks again.

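> But this triggers an error:
>
> dali_imagefolder.py", line 209, in __next__
>     images = sample[0]["data"]
> TypeError: new(): invalid data type 'str'

That is because __init__ calls your __next__, which strips the dicts, so self._first_batch stores a plain (images, labels) pair. On the following call that cached pair comes back from super().__next__(), and sample[0]["data"] ends up indexing a tensor with a string.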
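
To see the failure and the fix side by side, here is a minimal sketch that runs without DALI. FakeIterator is a hypothetical stand-in for DALIClassificationIterator that yields a list holding one dict per batch, with plain strings in place of tensors. Because strings are used, the naive override raises a TypeError with a different message than the torch one quoted above.

```python
class FakeIterator:
    """Hypothetical base: caches the first batch in __init__ via self.next()."""

    def __init__(self, n_batches):
        self._batches = iter(
            [{"data": f"img{i}", "label": f"lbl{i}"}] for i in range(n_batches)
        )
        self._first_batch = None
        self._first_batch = self.next()  # dispatches to the subclass __next__

    def __iter__(self):
        return self

    def __next__(self):
        if self._first_batch is not None:
            batch, self._first_batch = self._first_batch, None
            return batch
        return next(self._batches)

    def next(self):
        return self.__next__()


class NaiveOverride(FakeIterator):
    def __next__(self):
        sample = super().__next__()
        # Breaks on the cached batch, which is already an (images, labels) pair.
        return sample[0]["data"], sample[0]["label"]


class TolerantOverride(FakeIterator):
    def __next__(self):
        sample = super().__next__()
        if isinstance(sample[0], dict):
            return sample[0]["data"], sample[0]["label"]
        return sample  # cached batch, already unpacked


try:
    next(NaiveOverride(3))
    naive_failed = False
except TypeError:
    naive_failed = True

pairs = list(TolerantOverride(3))
```

Here NaiveOverride fails on its first user-visible call, while TolerantOverride yields every batch as an (images, labels) pair, the cached one included.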

Yup, makes sense, thanks! I missed the alternation between _first_batch and batch.
