VideoClips with video_reader backend fails at loading clip with idx=0 if clip_length_in_frames=1 #2184

Closed
@surisdi

🐛 Bug

The VideoClips class fails to load the clip with idx=0 when clip_length_in_frames=1 and the video backend is video_reader.

To Reproduce

import torchvision
from torchvision.datasets.video_utils import VideoClips

video_path = '/path/to/some/video.mp4'
torchvision.set_video_backend('video_reader')


def load_clip(frames_per_clip, idx):
    video_clips = VideoClips(video_paths=[video_path], clip_length_in_frames=frames_per_clip)
    _ = video_clips.get_clip(idx)


# This raises an error
try:
    load_clip(frames_per_clip=1, idx=0)
except Exception as e:
    print(e)

# This works fine
load_clip(frames_per_clip=1, idx=1)

# This works fine
load_clip(frames_per_clip=2, idx=0)

The output is the following:

UserWarning: The pts_unit 'pts' gives wrong results and will be removed in a follow-up version. Please use pts_unit 'sec'.

100%|███████████████████| 1/1 [00:00<00:00,  5.37it/s]
torch.Size([252, 360, 480, 3]) x 1  # This is the error message
100%|███████████████████| 1/1 [00:00<00:00,  5.62it/s]
100%|███████████████████| 1/1 [00:00<00:00,  5.84it/s]
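The message has the form "<decoded shape> x <expected frame count>", which reads like a failed length check: 252 frames came back where 1 was requested. The following is only a sketch of such a check, not torchvision's actual code; the helper name `check_clip_length` is hypothetical, and a plain tuple stands in for `torch.Size`.

```python
def check_clip_length(video_shape, expected_frames):
    """Raise AssertionError if the decoded clip has the wrong number of frames.

    Hypothetical helper for illustration only. video_shape is (T, H, W, C).
    """
    assert video_shape[0] == expected_frames, "{} x {}".format(
        video_shape, expected_frames
    )


message = None
try:
    # 252 frames were decoded, but a single-frame clip was requested.
    check_clip_length((252, 360, 480, 3), 1)
except AssertionError as err:
    message = str(err)

print(message)
```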

Expected behavior

It should load the first frame of the video (which is equivalent to the first clip of the video).

Environment

Collecting environment information...
PyTorch version: 1.6.0.dev20200504
Is debug build: No
CUDA used to build PyTorch: 10.2

OS: Ubuntu 18.04.4 LTS
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CMake version: version 3.10.2

Python version: 3.8
Is CUDA available: Yes
CUDA runtime version: 10.2.89
GPU models and configuration: 
GPU 0: GeForce RTX 2080 Ti
GPU 1: GeForce RTX 2080 Ti
GPU 2: GeForce RTX 2080 Ti
GPU 3: GeForce RTX 2080 Ti
GPU 4: GeForce RTX 2080 Ti
GPU 5: GeForce RTX 2080 Ti
GPU 6: GeForce RTX 2080 Ti
GPU 7: GeForce RTX 2080 Ti

Nvidia driver version: 440.33.01
cuDNN version: /usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudnn.so.7

Versions of relevant libraries:
[pip3] numpy==1.18.2
[pip3] numpydoc==0.9.2
[conda] blas                      1.0                         mkl  
[conda] cudatoolkit               10.2.89              hfd86e86_1  
[conda] mkl                       2020.0                      166  
[conda] mkl-service               2.3.0            py38he904b0f_0  
[conda] mkl_fft                   1.0.15           py38ha843d7b_0  
[conda] mkl_random                1.1.0            py38h962f231_0  
[conda] numpy                     1.18.1           py38h4f9e942_0  
[conda] numpy-base                1.18.1           py38hde5b4d6_1  
[conda] pytorch                   1.6.0.dev20200504 py3.8_cuda10.2.89_cudnn7.6.5_0    pytorch-nightly
[conda] torchvision               0.7.0a0+c7147af          pypi_0    pypi

Additional context

The function _read_video_from_file in torchvision/io/_video_opt.py returns the whole video (not just the first frame) when video_pts_range = (0, 0).

The specific problem occurs when calling torch.ops.video_reader.read_video_from_file(...).
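To make the expected semantics concrete, here is a toy model in plain Python of an inclusive pts range, under which (0, 0) selects only the frame at pts 0. It is a sketch of the behavior this report expects, not of how the video_reader backend is implemented; the function name `frames_in_pts_range` and the pts values are made up for illustration.

```python
from typing import List, Sequence


def frames_in_pts_range(
    frame_pts: Sequence[int], start_pts: int, end_pts: int
) -> List[int]:
    """Return indices of frames whose pts lies in [start_pts, end_pts], inclusive.

    Hypothetical helper: models the expected range semantics only.
    """
    return [i for i, pts in enumerate(frame_pts) if start_pts <= pts <= end_pts]


# Toy timeline: four frames with made-up pts values.
frame_pts = [0, 512, 1024, 1536]

# A one-frame clip at idx=0 starts and ends at pts 0.
first_clip = frames_in_pts_range(frame_pts, 0, 0)
print(first_clip)
```

Under this reading, a range whose start and end are both 0 is an ordinary one-frame request and should not be treated as "read everything".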

Note that using 'sec' instead of 'pts', as the UserWarning suggests, might resolve the error, and there may already be work under way in that direction (moving VideoClips to pts_unit='sec'). However, right now I don't think I can switch the units to seconds without modifying the library.

cc @pmeier @bjuncek
