Hi, I ran into issues when using DALI to replace the PyTorch DataLoader. My task is to feed mel spectrograms with several labels into a training model. I first hoped to pass a Python class with labels (including IDs and strings) and GPU tensors into the pipeline, but I read that ExternalSource accepts input only on the CPU (via numpy arrays), so I changed the data format into several numpy arrays. Then I found that the iterators (DALIGenericIterator and DALIClassificationIterator) seem to support only one or two outputs from a single pipeline, but a batch in my model contains at least 6 (mel_inputs, input_lengths, mel_target, output_lengths, speaker_id, gate_padded), and each mel spectrogram has a different length.
How can I feed this data into a GPU-based training model? Do I need to write a custom function to support this? If so, could I have some guidance?
Here is my code:
class ExternalInputIterator(object):
    def __init__(self, batch_size, data_folder, target_speaker):
        self.batch_size = batch_size
        self.filepaths_pair = load_files_from_path(data_folder, target_speaker)
        self.dataset_len = len(self.filepaths_pair)

    def __iter__(self):
        self.i = 0
        shuffle(self.filepaths_pair)
        return self

    def __next__(self):
        inputs = []
        if self.i >= self.dataset_len:
            raise StopIteration
        for i in range(self.batch_size):
            file = self.get_mel_pair(self.filepaths_pair[i])
            inputs.append(file)
        batch = self.collate_fn(inputs)
        self.i = (self.i + 1) % self.dataset_len
        return batch

    @property
    def size(self,):
        return self.dataset_len

    def get_mel_pair(self, files):
        ...
        return files

    def load_mel(self, filename):
        melspec = np.load(filename)
        return melspec

    def collate_fn(self, batch):
        ...
        return (mel_inputs, input_lengths, mel_targets, output_lengths, gate, input_voice, target_voice, speaker_id)

    next = __next__
class ExternalSourcePipeline(Pipeline):
    def __init__(self, batch_size, num_threads, device_id, external_data):
        super(ExternalSourcePipeline, self).__init__(batch_size, num_threads, device_id, seed=12)
        self.mel_inputs = ops.ExternalSource()
        self.input_lengths = ops.ExternalSource()
        self.mel_targets = ops.ExternalSource()
        self.output_lengths = ops.ExternalSource()
        self.speaker_id = ops.ExternalSource()
        self.gate_padded = ops.ExternalSource()
        self.input_voice = ops.ExternalSource()
        self.target_voice = ops.ExternalSource()
        self.pad = ops.Pad(fill_value=0)
        self.gate_pad = ops.Pad(fill_value=1)
        self.external_data = external_data
        self.iterator = iter(self.external_data)

    def define_graph(self):
        self.mel_inputs = self.mel_inputs()
        # mel_inputs = self.mel_inputs
        mel_inputs = self.pad(self.mel_inputs)
        self.mel_targets = self.mel_targets()
        # mel_targets = self.mel_targets
        mel_targets = self.pad(self.mel_targets)
        self.gate_padded = self.gate_padded()
        # gate_padded = self.gate_padded
        gate_padded = self.gate_pad(self.gate_padded)
        self.input_lengths = self.input_lengths()
        self.output_lengths = self.output_lengths()
        self.speaker_id = self.speaker_id()
        self.input_voice = self.input_voice()
        self.target_voice = self.target_voice()
        return (mel_inputs, self.input_lengths, mel_targets, self.output_lengths, gate_padded,
                self.input_voice, self.target_voice, self.speaker_id)

    def iter_setup(self):
        try:
            (mel_inputs, input_lengths, mel_targets, output_lengths, gate_padded, input_voice, target_voice, speaker_id) = self.iterator.next()
            self.feed_input(self.mel_inputs, mel_inputs)
            self.feed_input(self.input_lengths, input_lengths)
            self.feed_input(self.mel_targets, mel_targets)
            self.feed_input(self.output_lengths, output_lengths)
            self.feed_input(self.speaker_id, speaker_id)
            self.feed_input(self.gate_padded, gate_padded)
            self.feed_input(self.input_voice, input_voice)
            self.feed_input(self.target_voice, target_voice)
        except StopIteration:
            self.iterator = iter(self.external_data)
            raise StopIteration
if __name__ == '__main__':
    trainset_loader = ExternalInputIterator(batch_size=4, data_folder=hparams.data_folder, target_speaker=hparams.target_speaker)
    pipe = ExternalSourcePipeline(batch_size=4, num_threads=2, device_id=0, external_data=trainset_loader)
    dali_iter = DALIGenericIterator([pipe], ['mel_inputs', 'input_lengths', 'mel_targets', 'output_lengths', 'gate_padded', 'input_voice', 'target_voice', 'speaker_id'],
                                    size=trainset_loader.size, auto_reset=True, last_batch_padded = True)
    print('dataset size:{}, batch size:{}'.format(trainset_loader.size, 4))
    for e in range(10):
        for i, data in enumerate(dali_iter):
            if i % 10 == 0:
                print('epoch {}, iteration {}'.format(e, i))
                print('mel_inputs:', data[0]['mel_inputs'].shape)
                print('input_lengths:', list(data[0]['input_lengths']))
                print('mel_targets:', data[0]['mel_targets'].shape)
                print('output_lengths:', list(data[0]['output_lengths']))
                print('speaker_id:', list(data[0]['speaker_id']))
        dali_iter.reset()
Hi,
ExternalSource accepts CPU input only: numpy arrays or anything that supports array_interface.
There is ongoing work to enable GPU input as well: https://github.com/NVIDIA/DALI/pull/1997.
Regarding the number of DALIGenericIterator outputs, it can be any number. The example shows how to use two, but you can add more in the same way: just add more values to the output_map.
Hi @JanuszL, thank you for your reply. When I run the code, I get this error:
Traceback (most recent call last):
File "/media/zzy/D/Programs/pycharm-2019.2.5/helpers/pydev/pydevd.py", line 1415, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "/media/zzy/D/Programs/pycharm-2019.2.5/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/media/zzy/E/DL/torch-code/parrotron/test.py", line 15, in <module>
size=trainset_loader.size, auto_reset=True, fill_last_batch = True, last_batch_padded = False)
File "/home/zzy/anaconda3/envs/StarGAN-VC/lib/python3.6/site-packages/nvidia/dali/plugin/pytorch.py", line 162, in __init__
self._first_batch = self.next()
File "/home/zzy/anaconda3/envs/StarGAN-VC/lib/python3.6/site-packages/nvidia/dali/plugin/pytorch.py", line 259, in next
return self.__next__()
File "/home/zzy/anaconda3/envs/StarGAN-VC/lib/python3.6/site-packages/nvidia/dali/plugin/pytorch.py", line 190, in __next__
category_tensors[category] = out.as_tensor()
RuntimeError: [/opt/dali/dali/pipeline/data/tensor_list.h:435] Assert on "this->IsDenseTensor()" failed: All tensors in the input TensorList must have the same shape and be densely packed.
In a batch the data shape is like np.array([[80 * 218], [80 * 156], [80 * 131], [80 * 109]]). It seems the iterator doesn't support variable-length data. I tried to use self.pad = ops.Pad(fill_value=0), but I got another error:
RuntimeError: Critical error in pipeline: [/opt/dali/dali/operators/generic/pad.cc:172] Unsupported data type: 10
...
Current pipeline object is no longer valid.
That is true, DALI doesn't support variable-length data in PyTorch. However, PaddlePaddle and MXNet (Gluon) have such support. TensorFlow supports this only for CPU.
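Since the PyTorch iterator needs every tensor in a batch to have the same shape, one possible workaround is to pad the spectrograms to a common length on the CPU before feeding them to ExternalSource. The following is only a minimal sketch in plain NumPy, not DALI API: `pad_to_longest` is a hypothetical helper, and the random arrays stand in for real mel spectrograms with the shapes mentioned above.

```python
import numpy as np


def pad_to_longest(mels, fill_value=0.0):
    """Pad (n_mels, T_i) arrays along time into one dense (B, n_mels, T_max) float32 batch.

    Hypothetical helper for illustration; it is not part of DALI.
    """
    lengths = np.array([m.shape[1] for m in mels], dtype=np.int32)
    n_mels = mels[0].shape[0]
    batch = np.full((len(mels), n_mels, int(lengths.max())), fill_value, dtype=np.float32)
    for i, m in enumerate(mels):
        # Copy the real frames; everything after m.shape[1] keeps fill_value.
        batch[i, :, : m.shape[1]] = m
    return batch, lengths


# Stand-in data with the time lengths from the batch described above.
mels = [np.random.rand(80, t).astype(np.float32) for t in (218, 156, 131, 109)]
padded, lengths = pad_to_longest(mels)
print(padded.shape, lengths.tolist())
```

Returning the original lengths alongside the padded batch lets the model mask out the padded frames later.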
Another question: the ops.Pad function cannot be used to pad the variable-length data. As described above:
In a batch the data shape is like np.array([[80 * 218], [80 * 156], [80 * 131], [80 * 109]]). I tried to use self.pad = ops.Pad(fill_value=0), but I got another error:
RuntimeError: Critical error in pipeline: [/opt/dali/dali/operators/generic/pad.cc:172] Unsupported data type: 10
...
Current pipeline object is no longer valid.
What's the reason for this error? @JanuszL
It seems that you are trying to pad a data type that is not supported by the Pad operator.
@jantonguirao any thoughts?
@Approximetal Most of our operators don't support float64. I'd recommend using 32-bit floats (float) to feed your external source inputs instead.
Hi @jantonguirao, I checked the data and it is float32. The format is like this: [np.array(shape=(80,216)), np.array(shape=(80,209)), np.array(shape=(80,193)), np.array(shape=(80,116))]
I use the pad operator self.pad = ops.Pad(fill_value=0, axes=(2,)) and it still returns the error above. Is there any issue in my code?
The documentation says the parameter is "data (TensorList) – Input to the operator", but the error shows TypeError: expected np.ndarray. Isn't that confusing?
@Approximetal The error message you shared with us before points to the fact that the input of Pad operator was in float64 format, which is also the default in numpy:
>>> arr = np.zeros(shape=(2, 2))
>>> print(arr.dtype)
float64
vs
>>> arr = np.zeros(shape=(2, 2), dtype=np.float32)
>>> print(arr.dtype)
float32
The other TypeError message is probably not coming from Pad operator but from somewhere else.
If you are able to share a reproducible code sample (with some sample data) we could analyze and find what's the problem.
@Approximetal
The documentation says the parameter is "data (TensorList) – Input to the operator", but the error shows TypeError: expected np.ndarray. Isn't that confusing?
Do you refer to:
RuntimeError: Critical error in pipeline: [/opt/dali/dali/operators/generic/pad.cc:172] Unsupported data type: 10
...
Current pipeline object is no longer valid.
error?
Not this one. I mean that I didn't know why the pad function could not be used, so I checked the documentation, which says the data format should be a TensorList. So I used torch.from_numpy to convert the data and then got TypeError: expected np.ndarray. Maybe I just misunderstood.
I don't get it. torch.from_numpy creates a Torch tensor, and DALI cannot handle it as an input. Maybe you want to use the .numpy method?
I just want to confirm that the TensorList in the documentation means a list of numpy arrays, right? If DALI can only handle numpy arrays, that's okay. I just misunderstood, it doesn't matter. Thanks for the reply~
Sure, it can be either a list of batch-size numpy arrays, where each array corresponds to one tensor, or one numpy array whose outermost dimension corresponds to the batch size.
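The two layouts described above can be sketched in plain NumPy as follows. This only illustrates the shapes, assuming a batch size of 4 and one of the (80, 216) spectrogram shapes mentioned earlier; it does not call DALI.

```python
import numpy as np

batch_size = 4

# Layout 1: a list with `batch_size` entries, one array per sample.
as_list = [np.zeros((80, 216), dtype=np.float32) for _ in range(batch_size)]

# Layout 2: a single array whose outermost dimension is the batch size.
as_array = np.stack(as_list, axis=0)

print(len(as_list), as_list[0].shape)
print(as_array.shape)
```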
@jantonguirao I found that the dtype of the ID info was int64. I changed it to float32, and it works. Thanks~
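A quick way to catch this kind of mismatch is to inspect the dtype of every field before feeding it. The sketch below is plain NumPy with a hypothetical helper, `cast_fields_to_float32`, and made-up field contents. Note that float32 represents integers exactly only up to 2**24, so casting IDs this way is safe only for small values.

```python
import numpy as np


def cast_fields_to_float32(fields):
    """Cast every array in a {name: array} dict to float32.

    Returns the converted dict and a list of (name, original dtype) pairs for
    the fields that were not float32. Hypothetical helper, not part of DALI.
    """
    converted = []
    out = {}
    for name, arr in fields.items():
        arr = np.asarray(arr)
        if arr.dtype != np.float32:
            converted.append((name, str(arr.dtype)))
        out[name] = arr.astype(np.float32)
    return out, converted


# Made-up batch fields: one already float32, one float64, one int64.
fields = {
    "mel_inputs": np.zeros((4, 80, 216), dtype=np.float32),
    "gate_padded": np.zeros((4, 216)),  # NumPy's default float dtype is float64
    "speaker_id": np.array([3, 7, 1, 5], dtype=np.int64),
}
fields32, converted = cast_fields_to_float32(fields)
print(converted)
```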
Hi, I ran into a new error: RuntimeError: [/opt/dali/dali/python/backend_impl.cc:138] Assert on "info.strides[i] == info.itemsize*dim_prod" failed: Strided data not supported. Detected on dimension 1
The code is like:
input_voice = np.concatenate([frame for frame, _ in input_voice], axis=0)
where frame is a float32 numpy array with shape=[60,80] and input_voice is the concatenated numpy array.
What does this error mean? How should I change the data format?
It looks like your NumPy array is strided, so its memory is not contiguous. You can try copying the array and checking whether the strides have changed; for more info, see for example this place.
I used np.ascontiguousarray() to solve the problem.
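For anyone hitting the same assert, here is a small NumPy-only sketch of what a strided array looks like and how np.ascontiguousarray() changes it. The stepped slice is just one illustrative way to get a non-contiguous view; it is not necessarily how the array in this thread became strided.

```python
import numpy as np

frames = np.arange(60 * 80, dtype=np.float32).reshape(60, 80)

# Taking every second column gives a view into the original buffer,
# so consecutive elements are no longer adjacent in memory.
view = frames[:, ::2]
print(view.flags["C_CONTIGUOUS"], view.strides)

# np.ascontiguousarray copies the data into a densely packed buffer.
fixed = np.ascontiguousarray(view)
print(fixed.flags["C_CONTIGUOUS"], fixed.strides)
```

If the array is already C-contiguous, np.ascontiguousarray() returns it without copying, so calling it defensively before feed_input is cheap.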