DALI: Can't process large images

Created on 4 Jul 2018 · 6 Comments · Source: NVIDIA/DALI

Hi,
I want to use DALI to process images and run a TensorFlow model. I used a high-quality image data set (about 4 MB per picture), and then this error occurred:

2018-07-04 19:11:03.454375: E tensorflow/stream_executor/cuda/cuda_dnn.cc:455] could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2018-07-04 19:11:03.454419: E tensorflow/stream_executor/cuda/cuda_dnn.cc:427] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
2018-07-04 19:11:03.454433: F tensorflow/core/kernels/conv_ops.cc:713] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)

I found that this was likely an out-of-memory error (note that the same code runs fine on a smaller data set).
So I converted the data set to lower quality (about 400 KB per picture), but the images are still large (width and height are mostly 4000+ pixels).
This returns a new error:

InternalError (see above for traceback): DALI Output(&pipe_handle_) failed: Critical error in pipeline: Error in thread 1: [/opt/dali/dali/pipeline/operators/decoder/nvjpeg_decoder.h:324] NVJPEG error "8"

It's an nvJPEG error, and the error code is "8", which means (according to the "nvJPEG Library Documentation" PDF):

NVJPEG_STATUS_INTERNAL_ERROR (8) Error during the execution of the device tasks.
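
When reading messages like the one above, it can help to map the numeric code back to its symbolic name. The following is a minimal Python sketch of such a lookup. The table reflects the `nvjpegStatus_t` values as I understand them from the nvJPEG header, so treat it as an assumption and check it against the documentation of the installed version. The helper name `nvjpeg_status_name` is hypothetical and not part of DALI or nvJPEG.

```python
# Assumed mapping of nvjpegStatus_t values; verify against your nvjpeg.h.
NVJPEG_STATUS = {
    0: "NVJPEG_STATUS_SUCCESS",
    1: "NVJPEG_STATUS_NOT_INITIALIZED",
    2: "NVJPEG_STATUS_INVALID_PARAMETER",
    3: "NVJPEG_STATUS_BAD_JPEG",
    4: "NVJPEG_STATUS_JPEG_NOT_SUPPORTED",
    5: "NVJPEG_STATUS_ALLOCATOR_FAILURE",
    6: "NVJPEG_STATUS_EXECUTION_FAILED",
    7: "NVJPEG_STATUS_ARCH_MISMATCH",
    8: "NVJPEG_STATUS_INTERNAL_ERROR",
    9: "NVJPEG_STATUS_IMPLEMENTATION_NOT_SUPPORTED",
}


def nvjpeg_status_name(code):
    """Return the symbolic name for a numeric nvJPEG status code.

    Accepts an int or a numeric string such as the "8" printed by DALI.
    """
    value = int(code)
    return NVJPEG_STATUS.get(value, "unknown status %d" % value)


# The code reported in the DALI error message above.
print(nvjpeg_status_name("8"))
```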

So, does DALI/nvJPEG have a limit on image size or image quality? If so, what is the limit?

bug


jxmelody

All 6 comments

Update:
When I use large images to run the DALI + TensorFlow model, GPU utilization rises to almost 100%, so I use

import tensorflow as tf

# Cap TensorFlow at 70% of GPU memory so the rest stays available to DALI
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.7
sess = tf.Session(config=config)

It works.

jxmelody on 5 Jul 2018


Hi, thanks for posting the issue. Would it be possible to get an image that triggers the internal error in nvJPEG? This looks like a bug in either nvJPEG or DALI's integration of nvJPEG.

On the memory issue - TensorFlow by default takes pretty much all of the available GPU memory, so there is not much left for DALI to work with. There is ongoing work to integrate with TensorFlow's memory manager (so that DALI could use the memory TF allocated), but currently the solution is (as you figured out) to limit the amount of memory that TF uses with per_process_gpu_memory_fraction.
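
To get a rough feel for the numbers, the sketch below estimates how much GPU memory stays outside TensorFlow's share for a given per_process_gpu_memory_fraction, and how large a batch of decoded images is. The 16 GiB card, the batch size of 32 and the 4000 x 4000 RGB images are illustrative assumptions, and both function names are hypothetical. DALI's real footprint also includes intermediate buffers that are not modelled here, so this is only a lower bound.

```python
GIB = 1024 ** 3


def memory_left_for_dali(total_bytes, tf_fraction):
    """Bytes not claimed by TensorFlow when it is capped at tf_fraction."""
    if not 0.0 < tf_fraction <= 1.0:
        raise ValueError("tf_fraction must be in (0, 1]")
    return int(total_bytes * (1.0 - tf_fraction))


def decoded_batch_bytes(width, height, channels=3, batch_size=1):
    """Size of a batch of decoded uint8 images (one byte per channel value)."""
    return width * height * channels * batch_size


total = 16 * GIB  # assumed GPU memory size, not taken from the issue
left = memory_left_for_dali(total, 0.7)
batch = decoded_batch_bytes(4000, 4000, channels=3, batch_size=32)  # assumed batch

print("left for DALI: %.2f GiB" % (left / GIB))
print("decoded batch: %.2f GiB" % (batch / GIB))
```

Note that the decoded size depends only on the pixel dimensions, so re-encoding the files at lower quality (about 400 KB each) does not shrink it.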