Description
🐛 Bug
The VideoClips class fails to load the clip with idx=0 when clip_length_in_frames=1 and the video backend is video_reader.
To Reproduce
import torchvision
from torchvision.datasets.video_utils import VideoClips
video_path = '/path/to/some/video.mp4'
torchvision.set_video_backend('video_reader')
def load_clip(frames_per_clip, idx):
    video_clips = VideoClips(video_paths=[video_path], clip_length_in_frames=frames_per_clip)
    _ = video_clips.get_clip(idx)

# This raises an error
try:
    load_clip(frames_per_clip=1, idx=0)
except Exception as e:
    print(e)
# This works fine
load_clip(frames_per_clip=1, idx=1)
# This works fine
load_clip(frames_per_clip=2, idx=0)
The output is the following:
UserWarning: The pts_unit 'pts' gives wrong results and will be removed in a follow-up version. Please use pts_unit 'sec'.
100%|███████████████████| 1/1 [00:00<00:00, 5.37it/s]
torch.Size([252, 360, 480, 3]) x 1 # This is the error message
100%|███████████████████| 1/1 [00:00<00:00, 5.62it/s]
100%|███████████████████| 1/1 [00:00<00:00, 5.84it/s]
Expected behavior
It should load the first frame of the video (which is equivalent to the first clip of the video).
Environment
Collecting environment information...
PyTorch version: 1.6.0.dev20200504
Is debug build: No
CUDA used to build PyTorch: 10.2
OS: Ubuntu 18.04.4 LTS
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CMake version: version 3.10.2
Python version: 3.8
Is CUDA available: Yes
CUDA runtime version: 10.2.89
GPU models and configuration:
GPU 0: GeForce RTX 2080 Ti
GPU 1: GeForce RTX 2080 Ti
GPU 2: GeForce RTX 2080 Ti
GPU 3: GeForce RTX 2080 Ti
GPU 4: GeForce RTX 2080 Ti
GPU 5: GeForce RTX 2080 Ti
GPU 6: GeForce RTX 2080 Ti
GPU 7: GeForce RTX 2080 Ti
Nvidia driver version: 440.33.01
cuDNN version: /usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudnn.so.7
Versions of relevant libraries:
[pip3] numpy==1.18.2
[pip3] numpydoc==0.9.2
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.2.89 hfd86e86_1
[conda] mkl 2020.0 166
[conda] mkl-service 2.3.0 py38he904b0f_0
[conda] mkl_fft 1.0.15 py38ha843d7b_0
[conda] mkl_random 1.1.0 py38h962f231_0
[conda] numpy 1.18.1 py38h4f9e942_0
[conda] numpy-base 1.18.1 py38hde5b4d6_1
[conda] pytorch 1.6.0.dev20200504 py3.8_cuda10.2.89_cudnn7.6.5_0 pytorch-nightly
[conda] torchvision 0.7.0a0+c7147af pypi_0 pypi
Additional context
The function _read_video_from_file in torchvision/io/_video_opt.py returns the whole video (not just the first frame) when video_pts_range = (0, 0). The problem occurs specifically in the call to torch.ops.video_reader.read_video_from_file(...).
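Until the decoder honors the (0, 0) range, one possible workaround is to filter the decoded frames by their pts after the call returns. The following is only a minimal sketch: the helper name frames_in_pts_range is hypothetical, the pts values are made up for illustration, and it assumes the per-frame pts values are available alongside the decoded frames.

```python
from typing import List, Sequence, Tuple


def frames_in_pts_range(frame_pts: Sequence[int], pts_range: Tuple[int, int]) -> List[int]:
    """Return the indices of frames whose pts lies in the inclusive range.

    Hypothetical helper, not part of torchvision.
    """
    start, end = pts_range
    return [i for i, pts in enumerate(frame_pts) if start <= pts <= end]


# Made-up pts values for a short video (not taken from this report).
decoded_pts = [0, 512, 1024, 1536]

# With the range (0, 0), only the first frame should survive.
keep = frames_in_pts_range(decoded_pts, (0, 0))
print(keep)  # [0]
```

The resulting indices could then be used to select the matching frames from the decoded tensor, for example frames[keep].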
Please note that the error might be resolved by using 'sec' instead of 'pts', as the UserWarning suggests, and there may already be work in that direction (moving VideoClips to pts_unit='sec'). However, right now I don't think I can switch the units to seconds without modifying the library.
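For reference, the two units are related through the stream's time base: a timestamp in seconds is the pts multiplied by the time base. The sketch below illustrates that relation only. The function names and the time base of 1/15360 are assumptions for the example, and the real time base would have to come from the video's metadata.

```python
from fractions import Fraction


def pts_to_sec(pts: int, time_base: Fraction) -> float:
    """Convert a presentation timestamp to seconds."""
    return float(pts * time_base)


def sec_to_pts(sec: float, time_base: Fraction) -> int:
    """Convert seconds to the nearest presentation timestamp."""
    return round(Fraction(sec) / time_base)


# Assumed time base for illustration; not taken from this report.
time_base = Fraction(1, 15360)

print(pts_to_sec(15360, time_base))  # 1.0
print(sec_to_pts(0.5, time_base))    # 7680
```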