Dali: The pixels generated from DALI are all 0

Created on 18 Dec 2019 · 6 Comments · Source: NVIDIA/DALI

I'm trying to run the example from the official website: https://docs.nvidia.com/deeplearning/sdk/dali-developer-guide/docs/examples/video/video_reader_label_example.html It is an example of sampling frames from videos in a set of folders.
I downloaded "DALI_extra" from https://github.com/NVIDIA/DALI_extra except for the videos in DALI_extra/db/video/sintel/labelled_videos/, because I couldn't access them. I replaced the original videos with some videos from the Kinetics-400 dataset. The folder structure is as follows:

DALI_extra/db/video/sintel/labelled_videos/
├── 0
│   ├── video1.mp4
│   └── video2.mp4
├── 1
│   ├── video3.mp4
│   └── video4.mp4
└── 2
    ├── video5.mp4
    └── video6.mp4

My code differs only slightly from the released code. I added os.environ['DALI_EXTRA_PATH']='DALI_extra' to set the root directory, and print(sequences_out) as the last line to show the values of the output data.

from __future__ import print_function
from __future__ import division
import os
import numpy as np
os.environ['DALI_EXTRA_PATH']='DALI_extra'
from nvidia.dali.pipeline import Pipeline
import nvidia.dali.ops as ops
import nvidia.dali.types as types
print(os.listdir(os.environ['DALI_EXTRA_PATH']))
batch_size=2
sequence_length=8
initial_prefetch_size=11
video_directory = os.path.join(os.environ['DALI_EXTRA_PATH'], "db", "video", "sintel", "labelled_videos")
shuffle=True
n_iter=6
class VideoPipe(Pipeline):
    def __init__(self, batch_size, num_threads, device_id, data, shuffle):
        super(VideoPipe, self).__init__(batch_size, num_threads, device_id, seed=16)
        self.input = ops.VideoReader(device="gpu", file_root=data, sequence_length=sequence_length,
                                     shard_id=0, num_shards=1,
                                     random_shuffle=shuffle, initial_fill=initial_prefetch_size)
    def define_graph(self):
        output, labels = self.input(name="Reader")
        return output, labels
pipe = VideoPipe(batch_size=batch_size, num_threads=2, device_id=0, data=video_directory, shuffle=shuffle)
pipe.build()
for i in range(n_iter):
    sequences_out, labels = pipe.run()
    sequences_out = sequences_out.as_cpu().as_array()
    labels = labels.as_cpu().as_array()
    print(sequences_out.shape)
    print(labels.shape)
    print(sequences_out)

When I run the code, I get zeros (the following is part of the output):

   [[0 0 0]
    [0 0 0]
    [0 0 0]
    ...
    [0 0 0]
    [0 0 0]
    [0 0 0]]

   ...

   [[0 0 0]
    [0 0 0]
    [0 0 0]
    ...
    [0 0 0]
    [0 0 0]
    [0 0 0]]

   [[0 0 0]
    [0 0 0]
    [0 0 0]
    ...
    [0 0 0]
    [0 0 0]
    [0 0 0]]

   [[0 0 0]
    [0 0 0]
    [0 0 0]
    ...
    [0 0 0]
    [0 0 0]
    [0 0 0]]]


  [[[0 0 0]
    [0 0 0]
    [0 0 0]
    ...
    [0 0 0]
    [0 0 0]
    [0 0 0]]

   [[0 0 0]
    [0 0 0]
    [0 0 0]
    ...
    [0 0 0]
    [0 0 0]
    [0 0 0]]

   [[0 0 0]
    [0 0 0]
    [0 0 0]
    ...
    [0 0 0]
    [0 0 0]
    [0 0 0]]

   ...

   [[0 0 0]
    [0 0 0]
    [0 0 0]
    ...
    [0 0 0]
    [0 0 0]
    [0 0 0]]

   [[0 0 0]
    [0 0 0]
    [0 0 0]
    ...
    [0 0 0]
    [0 0 0]
    [0 0 0]]

   [[0 0 0]
    [0 0 0]
    [0 0 0]
    ...
    [0 0 0]
    [0 0 0]
    [0 0 0]]]]]

Also, sometimes the program hangs at sequences_out, labels = pipe.run().
I don't know why I get wrong results. I have tried some other examples from the official website, including the image-loading ones, and the output is still 0.
My development environment:

Python 3.6.7(a virtual environment in anaconda)            
future          0.18.2                               
matplotlib      3.1.2              
numpy           1.17.3             
nvidia-dali     0.16.0   (installed with the official pip command)
opencv-python   4.1.2.30                   
pip             19.3.1             
protobuf        3.11.1                                      
setuptools      42.0.2.post20191201                       
torch           1.1.0              
torchvision     0.3.0                        
wheel           0.33.6

Hardware:

4 x Nvidia GeForce RTX 2080 Ti. GPU-2 and GPU-3 are currently in use. The DALI code runs on GPU-0.

I have searched all the existing issues but did not find one that mentions this problem. Thanks a lot for your reply!!

bug

All 6 comments

Hi,
DALI_extra uses git lfs, which is probably why you could not access the videos.
Regarding hang - can you narrow this problem down to a particular file and share it with us so we can check it? I can only guess that it is caused by VFR (variable frame rate) video, which DALI currently doesn't support (we have some heuristics to detect this kind of video and show a warning, but they are not 100% accurate).
Regarding 0 - can you provide a file that reproduces the problem?
@a-sansanwal - have I missed anything?
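
One way to narrow a hang down to a particular file is to process each video in its own child process with a timeout, so a stuck decode is killed instead of blocking the whole run. The sketch below is generic and does not use any DALI API: find_hanging_inputs and build_command are hypothetical names, and the caller supplies the actual command that decodes a single file.

```python
import subprocess
import sys


def find_hanging_inputs(paths, build_command, timeout_s=60.0):
    """Run one child process per input and sort inputs by outcome.

    build_command is a caller-supplied function that maps a path to an
    argv list (hypothetical here; it would launch a script that decodes
    exactly one video). Returns (hung, failed): inputs that exceeded the
    timeout and inputs whose process exited with a non-zero status.
    """
    hung, failed = [], []
    for path in paths:
        try:
            result = subprocess.run(
                build_command(path),
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising this exception.
            hung.append(path)
            continue
        if result.returncode != 0:
            failed.append(path)
    return hung, failed
```

For example, with a hypothetical script decode_one.py that builds the pipeline for a folder containing only one video, build_command could be lambda p: [sys.executable, "decode_one.py", p]. A timeout alone does not prove a hang; a slow but healthy decode can also exceed it, so the limit should be generous.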

@zugexiaodui It seems the Kinetics-400 dataset is taken from YouTube. It's very likely that DALI cannot decode these videos correctly because they have a variable frame rate, and that the heuristic DALI uses failed to detect this; otherwise DALI would have warned about VFR videos in the dataset.
Please post the videos you used; it will help improve DALI.
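
A rough way to check a file for variable frame rate outside DALI is to compare the two frame rates that ffprobe reports for the video stream. This is a sketch of a common heuristic, not the detection DALI itself uses (that is not shown in this thread). It assumes ffprobe from FFmpeg is installed; the function names and the 1% tolerance are illustrative choices.

```python
import subprocess
from fractions import Fraction


def parse_rates(ffprobe_output):
    """Parse lines such as 'r_frame_rate=30/1' into a dict of Fractions."""
    rates = {}
    for line in ffprobe_output.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep or key not in ("r_frame_rate", "avg_frame_rate"):
            continue
        try:
            rates[key] = Fraction(value)
        except (ValueError, ZeroDivisionError):
            # ffprobe can report '0/0' when a rate is unknown; skip it.
            continue
    return rates


def looks_variable_frame_rate(rates, tolerance=0.01):
    """Return True/False, or None when the rates are missing or zero."""
    base = rates.get("r_frame_rate")
    average = rates.get("avg_frame_rate")
    if base is None or average is None or base == 0:
        return None
    # A clear gap between the two rates suggests VFR.
    return abs(base - average) / base > tolerance


def probe_rates(path):
    """Ask ffprobe for both frame rates of the first video stream."""
    output = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=r_frame_rate,avg_frame_rate",
         "-of", "default=noprint_wrappers=1", path],
        stdout=subprocess.PIPE, universal_newlines=True, check=True,
    ).stdout
    return parse_rates(output)
```

Calling looks_variable_frame_rate(probe_rates("video1.mp4")) gives a hint only: matching rates do not guarantee a constant frame rate, and a mismatch can have other causes.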

@a-sansanwal @JanuszL Thank you for your timely answer!
Regarding 0 - That was my mistake: I used plt.save to save an image, which produced a blank image. When I save the image with cv2.imwrite instead, it is not empty. The print output probably showed only 0 because NumPy does not display every element of a large array, and the elements that were displayed happened to all be 0. This problem is solved.
Regarding hang - This problem is not solved yet. As you mentioned, the code runs correctly with the original 'sintel_trailer' videos (I have downloaded them now), apart from some warnings about audio. The warnings don't matter: I checked the frames saved with cv2.imwrite and they are correct. If I replace the original videos with Kinetics-400 (YouTube) videos, the program hangs. I have uploaded the code and videos: mycode_1.zip
Thanks for your help!
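
The point about truncated print output can be checked directly: printing a large NumPy array shows only a few elements at the edges of each dimension, so summary statistics are a more reliable test for an all-zero result. In this minimal sketch the frames array is a synthetic stand-in for sequences_out (its shape is an assumption chosen for illustration), and summarize is a hypothetical helper.

```python
import numpy as np


def summarize(array):
    """Collect statistics that reveal whether an array is really all zero."""
    return {
        "shape": array.shape,
        "dtype": str(array.dtype),
        "min": int(array.min()),
        "max": int(array.max()),
        "nonzero": int(np.count_nonzero(array)),
    }


# Synthetic stand-in for one batch: black borders around a bright centre.
# Assumed layout: (batch, frames, height, width, channels).
frames = np.zeros((2, 8, 64, 64, 3), dtype=np.uint8)
frames[:, :, 16:48, 16:48, :] = 200

print(frames[0, 0])       # abbreviated by NumPy; only edge pixels are shown
print(summarize(frames))  # max and nonzero count show the data is not empty
```

With real DALI output, the same check would be summarize(sequences_out) after the as_cpu().as_array() conversion in the script above.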

@JanuszL @zugexiaodui Okay, so I analyzed the second issue mentioned in this thread (the hang). It is not related to VFR as we speculated. It is caused by a scenario that we happened never to test. The fix is easy, and I will send a PR soon, after testing and verifying it more thoroughly.

OK! Thank you! @a-sansanwal

Let us keep that open until the fix is merged.
