VideoReader segmentation fault on long videos when using video_reader backend. This issue is a continuation of #2259.
Torchvision segfaults when reading an entire test video.
I used to believe this was an issue with long videos only, but it happens on the test videos we provide as well, suggesting it might be related to the FFmpeg version installed on the system (the fact that the tests don't catch it points the same way).
Steps to reproduce the behavior:
vframes, _, _ = torchvision.io.read_video(path, pts_unit="sec")
where path = $TVDIR/test/assets/videos/TrumanShow_wave_f_nm_np1_fr_med_26.avi

The backtrace suggests it's an issue in libswscale:
#0 0x00007fff88224cf2 in ?? () from /home/bjuncek/miniconda3/envs/vb/lib/libswscale.so.5
#1 0x00007fff88223bb4 in ?? () from /home/bjuncek/miniconda3/envs/vb/lib/libswscale.so.5
#2 0x00007fff881f9af4 in sws_scale () from /home/bjuncek/miniconda3/envs/vb/lib/libswscale.so.5
I've previously found this can be caused by conflicting inputs (note: this _might_ be due to a new/different FFmpeg version?).
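For completeness, a self-contained reproduction sketch (it assumes the torchvision checkout lives at `$TVDIR`, as above):

```python
import os

import torchvision

# Test asset inside the torchvision checkout; $TVDIR is assumed to be set.
path = os.path.join(
    os.environ["TVDIR"], "test", "assets", "videos",
    "TrumanShow_wave_f_nm_np1_fr_med_26.avi",
)

# Select the C++ video_reader backend, where the segfault occurs.
torchvision.set_video_backend("video_reader")

# Crashes inside libswscale with an affected FFmpeg build.
vframes, _, _ = torchvision.io.read_video(path, pts_unit="sec")
print(vframes.shape)
```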
Expected behavior: the video is read without segfaulting.
Collecting environment information...
PyTorch version: 1.6.0
Is debug build: False
CUDA used to build PyTorch: 10.2
OS: Ubuntu 18.04.4 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: Could not collect
Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: Quadro RTX 8000
GPU 1: Quadro RTX 8000
Nvidia driver version: 440.33.01
cuDNN version: Probably one of the following:
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libcudnn.so.7
/usr/local/cuda-10.2.89/targets/x86_64-linux/lib/libcudnn.so.7
Versions of relevant libraries:
[pip3] numpy==1.19.1
[pip3] torch==1.6.0
[pip3] torchvision==0.7.0a0+78ed10c
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.2.89 hfd86e86_1
[conda] mkl 2020.2 256
[conda] mkl-service 2.3.0 py38he904b0f_0
[conda] mkl_fft 1.1.0 py38h23d657b_0
[conda] mkl_random 1.1.1 py38hcb8c335_0 conda-forge
[conda] numpy 1.19.1 py38hbc911f0_0
[conda] numpy-base 1.19.1 py38hfa32c7d_0
[conda] pytorch 1.6.0 py3.8_cuda10.2.89_cudnn7.6.5_0 pytorch
[conda] torchvision 0.7.0a0+78ed10c pypi_0 pypi
Removing the hidden inputs (specifically size/aspect ratio/crop) from the _read_video op could in principle fix this, but it _might_ be backwards-compatibility (BC) breaking if users are setting these manually in their code.
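For context, a sketch of where the size-related hidden inputs surface, assuming the parameter names of the internal `_read_video_from_file` helper in torchvision 0.7's `torchvision.io._video_opt`:

```python
from torchvision.io import _video_opt

# Internal helper (torchvision 0.7); the arguments below are the size/aspect
# ratio inputs in question. 0 means "keep the source dimension"; non-zero
# values make the decoder rescale frames through libswscale.
vframes, aframes, info = _video_opt._read_video_from_file(
    "TrumanShow_wave_f_nm_np1_fr_med_26.avi",
    read_video_stream=True,
    video_width=0,           # target output width
    video_height=0,          # target output height
    video_min_dimension=0,   # aspect-ratio-preserving resize
)
```

Note that, per the backtrace above, sws_scale is also invoked for the YUV420→RGB24 pixel-format conversion itself, not only for resizing.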
cc @bjuncek
More information:
Program received signal SIGSEGV, Segmentation fault.
0x00007f4d206b7fc2 in ff_yuv_420_rgb24_ssse3.loop0 ()
at libswscale/x86/yuv_2_rgb.asm:376
376 libswscale/x86/yuv_2_rgb.asm: No such file or directory.
Missing separate debuginfos, use: debuginfo-install glibc-2.17-292.el7.x86_64
(gdb) bt
#0 0x00007f4d206b7fc2 in ff_yuv_420_rgb24_ssse3.loop0 ()
at libswscale/x86/yuv_2_rgb.asm:376
#1 0x00007f4d206b6e84 in yuv420_rgb24_ssse3 (c=0x564bca2e59c0, src=0x7ffd52b1c9e0,
srcStride=0x7ffd52b1c9c0, srcSliceY=0, srcSliceH=256, dst=0x7ffd52b1ca00,
dstStride=0x7ffd52b1c9d0) at libswscale/x86/yuv2rgb_template.c:177
#2 0x00007f4d2068eb45 in sws_scale (c=<optimized out>, srcSlice=<optimized out>,
srcStride=<optimized out>, srcSliceY=<optimized out>, srcSliceH=256,
dst=<optimized out>, dstStride=0x7ffd52b1cd10) at libswscale/swscale.c:969
#3 0x00007f4d21f62d2b in ffmpeg::(anonymous namespace)::transformImage (
context=0x564bca2e59c0, srcSlice=0x564bca35f100, srcStride=0x564bca35f140,
inFormat=..., outFormat=...,
out=0x564bca4c5b60 "\025\016\030\025\016\030\025\016\030"..., planes=0x7ffd52b1cd20, lines=0x7ffd52b1cd10)
at /root/vision/torchvision/csrc/cpu/decoder/video_sampler.cpp:46
#4 0x00007f4d21f639a8 in ffmpeg::VideoSampler::sample (this=0x564bca5a4220,
srcSlice=0x564bca35f100, srcStride=0x564bca35f140, out=0x564bca4c3490)
at /root/vision/torchvision/csrc/cpu/decoder/video_sampler.cpp:182
#5 0x00007f4d21f63c1e in ffmpeg::VideoSampler::sample (this=0x564bca5a4220,
frame=0x564bca35f100, out=0x564bca4c3490)
The segfaults only occur when MMX/SSE/AVX optimizations are enabled in FFmpeg.
I believe this issue might be a bug in FFmpeg introduced in https://github.com/FFmpeg/FFmpeg/commit/fc6a5883d6af8cae0e96af84dda0ad74b360a084 and since fixed in https://github.com/FFmpeg/FFmpeg/commit/ba3e771a42c29ee02c34e7769cfc1b2dbc5c760a.
The upstream bug report for this issue is https://trac.ffmpeg.org/ticket/8747.
If that's the case, then recompiling FFmpeg (or upgrading to a build that includes the fix) would solve the issue.
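As a quick diagnostic, one can check which FFmpeg build an environment actually resolves (a plain Python sketch, nothing torchvision-specific):

```python
import ctypes.util
import subprocess

# Print the FFmpeg banner; 4.3 builds predating commit ba3e771 are affected,
# while 4.2.x is known good.
banner = subprocess.run(
    ["ffmpeg", "-version"], capture_output=True, text=True
).stdout.splitlines()[0]
print(banner)

# Show which libswscale would be picked up from the current environment.
print("libswscale:", ctypes.util.find_library("swscale"))
```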
Effectively, this issue is directly related to the regression introduced in FFmpeg 4.3 and fixed in https://github.com/FFmpeg/FFmpeg/commit/ba3e771a42c29ee02c34e7769cfc1b2dbc5c760a. On FFmpeg 4.2, the video reader tests pass.
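In the meantime, a possible workaround sketch is to switch to the pyav backend (requires `pip install av`; PyAV's binary wheels bundle their own FFmpeg build, so they may not be affected by the system libswscale), or to pin FFmpeg below 4.3 in the environment:

```python
import torchvision

# Workaround sketch: decode through PyAV instead of the torchvision C++
# decoder that links against the broken system libswscale.
torchvision.set_video_backend("pyav")
vframes, _, _ = torchvision.io.read_video("video.avi", pts_unit="sec")
```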
Given that this was a known FFmpeg issue and is fixed by using a different FFmpeg version, I'm closing this issue.