I have an existing pipeline (DALI v0.7.0, installed from the wheel files) that runs all its augmentations in a for-loop. I find this very convenient since the code is concise and I do not have to modify define_graph() to add or remove an operator. Below is some pseudocode to give an idea of how my pipeline works:
    import nvidia.dali.ops as ops
    from nvidia.dali.pipeline import Pipeline

    class Example(Pipeline):
        def __init__(self, batch_size, num_threads, device_id):
            super(Example, self).__init__(batch_size, num_threads, device_id)
            self.decoder = ...
            self.input_jpegs = ops.ExternalSource()
            # Each entry pairs an operator with the generators for its named arguments
            self.augmentations = {}
            self.augmentations['contrast'] = {
                'operation': ops.Contrast(),
                'generators': [ops.Uniform(range=(0.4, 1.6))],
                'args': ['contrast']}
            self.augmentations['brightness'] = {
                'operation': ops.Brightness(),
                'generators': [ops.Uniform(range=(0.6, 1.4))],
                'args': ['brightness']}

        def define_graph(self):
            """Modify images based on the pipeline's augmentations dictionary."""
            self.jpegs = self.input_jpegs()
            images = self.decoder(self.jpegs)
            augs = list(self.augmentations.values())
            for aug in augs:
                # Map each argument name to the output of its generator
                kwargs = {a: g() for a, g in zip(aug['args'], aug['generators'])}
                images = aug['operation'](images, **kwargs)
            return images

        def feed_input(self):
            ...
I'd like to add the Slice operator to this pipeline to enable differently-sized, random crops for each image in the batch. However, the difficulty is that Slice requires three positional tensor inputs: images, crop_begin, and crop_size. For a small pipeline like the example, it wouldn't be too much trouble to call the augmentations one-by-one in define_graph() instead of using a for-loop. However, my actual pipeline has a much longer list of augmentations that will continue growing, so I'd like to preserve the for-loop structure if possible.
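One possible way to keep the for-loop while supporting operators that need extra positional inputs is to give each dictionary entry an optional list of input generators. The sketch below shows only the loop structure in plain Python. The functions brightness and slice_1d, the 'inputs' key, and the helper run_augmentations are hypothetical stand-ins and not DALI APIs, and whether suitable generators for Slice's crop_begin and crop_size exist in DALI v0.7.0 is not something this sketch settles.

```python
# Stand-ins for DALI operators so the sketch runs without DALI installed.
# They are hypothetical and only mimic the call pattern used in define_graph().
def brightness(images, brightness):
    """Scale every pixel value of every image by the same factor."""
    return [[p * brightness for p in img] for img in images]


def slice_1d(images, begin, size):
    """Cut img[begin:begin + size] out of each image, one (begin, size) per sample."""
    return [img[b:b + s] for img, b, s in zip(images, begin, size)]


def run_augmentations(images, augmentations):
    """Apply every entry in order.

    'inputs' is an optional (assumed) key holding generators whose outputs
    are passed positionally after the images.
    """
    for aug in augmentations.values():
        positional = [g() for g in aug.get('inputs', [])]
        kwargs = {a: g() for a, g in zip(aug['args'], aug['generators'])}
        images = aug['operation'](images, *positional, **kwargs)
    return images


# Two 1-D "images" keep the example small.
batch = [[1, 2, 3, 4, 5, 6], [10, 20, 30, 40, 50, 60]]
augmentations = {
    'brightness': {'operation': brightness,
                   'generators': [lambda: 2],
                   'args': ['brightness']},
    'slice': {'operation': slice_1d,
              'inputs': [lambda: [1, 2], lambda: [3, 2]],  # begin, size per sample
              'generators': [],
              'args': []},
}
result = run_augmentations(batch, augmentations)
print(result)
```

Entries without an 'inputs' key behave exactly as before, so existing augmentations would not need to change.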
Looking at the source code, I saw that Slice is mainly a wrapper around Crop and achieves the individual crops by modifying the protected members per_sample_dimensions_, crop_width_, and crop_height_. I tried several custom operator approaches that would allow me to run Slice with images as the only positional input:
1. Copy the Slice code into MySlice, a custom operator which also inherits from Crop. Use named argument inputs instead of positional inputs to assign values to the protected members mentioned above.
2. Use InstantiateOperator(OpSpec("Slice")) to get a pointer to DALI's Slice operator within MySlice. Then allocate the crop_begin and crop_size tensors within MySlice, pass them to the Slice pointer via OpSpec.AddInput(), and call slice_ptr->Run(ws).
3. Use InstantiateOperator(OpSpec("Crop")) to access the protected members mentioned above.
With each approach I ran into issues:
1. MySlice compiles, but loading the .so with the plugin manager produces missing symbol errors, particularly for CropAttr::CheckShapes.
2. The crop_begin and crop_size tensors are supposed to contain an (x, y) pair for each batch sample, but I am unsure how to allocate this data structure from scratch.
3. The operator returned by InstantiateOperator doesn't have access to the internal Crop members.
Approach 2 seems the most promising to me, but perhaps I have overlooked something. Any solutions or alternative approaches would be appreciated. Thanks!
Clarification: based on the call to the tensor method of crop_begin in slice.cu, I realized that crop_begin and crop_size should be allocated as type TensorList, not Tensor as I mentioned in approach 2 above.
However, I am still unsure how to allocate and assign values to these TensorLists.
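To make the layout concrete, here is a minimal plain-Python sketch of the values involved: one (x, y) begin pair and one (w, h) size pair per batch sample. It does not build a DALI TensorList; the values would still have to be copied into whatever container Slice expects. The function name random_crop_params, the min_frac parameter, and the use of pixel units are assumptions made for illustration, since the thread does not state which units Slice expects.

```python
import random


def random_crop_params(image_sizes, min_frac=0.5, seed=None):
    """Return (crop_begin, crop_size) with one pair per sample.

    image_sizes is a list of (width, height) tuples in pixels (assumed units).
    """
    rng = random.Random(seed)
    crop_begin, crop_size = [], []
    for width, height in image_sizes:
        # Pick a crop size between min_frac and 100% of each dimension.
        w = rng.randint(max(1, int(width * min_frac)), width)
        h = rng.randint(max(1, int(height * min_frac)), height)
        # Pick the top-left corner so the crop stays inside the image.
        x = rng.randint(0, width - w)
        y = rng.randint(0, height - h)
        crop_begin.append((x, y))
        crop_size.append((w, h))
    return crop_begin, crop_size


sizes = [(640, 480), (320, 240), (1024, 768)]
begin, size = random_crop_params(sizes, seed=0)
for (x, y), (w, h), (width, height) in zip(begin, size, sizes):
    print(f"image {width}x{height}: begin=({x}, {y}) size=({w}, {h})")
```

Each list has exactly one entry per image in the batch, which is the per-sample structure described in issue 2 above.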
@addisonklinke Having per-sample crop dimensions in Crop is very reasonable, and we are considering adding it to our built-in Crop operator. Let me evaluate it with the team and I'll come back to you with an answer.
@jantonguirao That sounds great. To clarify, would the new crop operator take these per-sample dimensions as named argument inputs? That way they can be generated with ops.Uniform() and the crop operator will only require images as a positional input.
@addisonklinke Yes, that's what I had in mind
@addisonklinke We've implemented support for argument tensors to specify crop dimensions (it's in the master branch now):
    def __init__(self, ...):
        ...
        self.rgn_crop_h = ops.Uniform(range=(100,120))
        self.rgn_crop_w = ops.Uniform(range=(100,220))
        self.crop = ops.Crop()

    def define_graph(self):
        ...
        images = ...
        crop_h = self.rgn_crop_h()
        crop_w = self.rgn_crop_w()
        crop_output = self.crop(images, crop_h=crop_h, crop_w=crop_w)
        return crop_output
You can still use the old API.
It was added in https://github.com/NVIDIA/DALI/pull/637 and will be available in DALI 0.8.
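Because crop_h and crop_w are named arguments, a crop entry should fit the dictionary-driven loop from the original post without changing the loop body. The sketch below checks that structure in plain Python so it runs without DALI. The Uniform and Crop classes are hypothetical stand-ins, not the real ops.Uniform and ops.Crop, and the center-crop placement is an assumption made for illustration.

```python
import random


class Uniform:
    """Stand-in for ops.Uniform: returns one random value per sample when called."""

    def __init__(self, value_range, batch_size, seed=None):
        self.low, self.high = value_range
        self.batch_size = batch_size
        self.rng = random.Random(seed)

    def __call__(self):
        return [self.rng.uniform(self.low, self.high)
                for _ in range(self.batch_size)]


class Crop:
    """Stand-in for ops.Crop: crops each image to its own (crop_h, crop_w).

    Centering the crop window is an assumption of this sketch.
    """

    def __call__(self, images, crop_h, crop_w):
        out = []
        for img, h, w in zip(images, crop_h, crop_w):
            h, w = int(h), int(w)
            top = (len(img) - h) // 2
            left = (len(img[0]) - w) // 2
            out.append([row[left:left + w] for row in img[top:top + h]])
        return out


batch_size = 3
# Three 8x10 "images" stored as nested lists of pixel values.
images = [[[0] * 10 for _ in range(8)] for _ in range(batch_size)]

augmentations = {
    'crop': {'operation': Crop(),
             'generators': [Uniform((3, 6), batch_size, seed=1),
                            Uniform((4, 9), batch_size, seed=2)],
             'args': ['crop_h', 'crop_w']},
}

# Same loop body as define_graph() in the original post.
for aug in augmentations.values():
    kwargs = {a: g() for a, g in zip(aug['args'], aug['generators'])}
    images = aug['operation'](images, **kwargs)

shapes = [(len(img), len(img[0])) for img in images]
print(shapes)
```

With the real operators, the equivalent entry would presumably use ops.Crop() as the operation and two ops.Uniform generators for 'crop_h' and 'crop_w', but that has not been tested here.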