[video reader] inception commit #1303
I am continuing the discussion from the original PR (#1279) below.
From @fmassa
Thanks a lot for the PR Zhicheng! The first thing I need to figure out before we can merge this is how we will be adding ffmpeg as a dependency for torchvision, and if it will be a soft or hard dependency. A few options: use ffmpeg from conda-forge
Another thing I need to do is to get CI working for Windows and OSX in torchvision, so that we can make sure that this PR compiles and works nicely in the other OS that torchvision supports. I'll be looking into both the CI and ffmpeg dependency from an OSS perspective.
Option 1 will serve well. In virtual env of Anaconda (say
We need to add |
From @soumith
I think it would be good to start with (1), i.e. the ffmpeg from conda-source or the system package manager (brew install ffmpeg / apt install ffmpeg). Also, by ffmpeg I presume you mean libav? For binaries, we will figure out how to ship ffmpeg the right way ourselves. Just building ffmpeg from source is not sufficient btw, because you need to build it with codec support, and there are tons of codecs we need to build it with.
I agree building ffmpeg from source is not a smooth process (see lengthy instructions here: https://trac.ffmpeg.org/wiki/CompilationGuide/Centos). On the other side,
…t __init__ method
@stephenyan1231 I've fixed lint and made the Can you look into the final build failures that happen for some specific versions of gcc? |
…e as key type of std::unordered_map
Codecov Report
@@ Coverage Diff @@
## master #1303 +/- ##
==========================================
- Coverage 65.47% 65.04% -0.43%
==========================================
Files 75 76 +1
Lines 5827 5902 +75
Branches 892 901 +9
==========================================
+ Hits 3815 3839 +24
- Misses 1742 1795 +53
+ Partials 270 268 -2
Which ffmpeg version are you using?
I got some errors...
Summary: Pull Request resolved: #62
Current dependency torchvision 0.4.0 was released in August. It missed quite a few PRs that are merged after that, and that are needed for video classification, such as
- pytorch/vision#1437
- pytorch/vision#1431
- pytorch/vision#1423
- pytorch/vision#1418
- pytorch/vision#1408
- pytorch/vision#1376
- pytorch/vision#1363
- pytorch/vision#1353
- pytorch/vision#1303
This will fail the CI test when a diff uses changes made in those PRs. Before a new official version of TorchVision is released, we can temporarily use the nightly torchvision to get all the recent PRs, and unblock the PR merging. We plan to use a fixed version of TorchVision later.
Reviewed By: vreis
Differential Revision: D17944239
fbshipit-source-id: 86ff540e3fc4f08ef767e84ef103525db5158201
* [video reader] inception commit
* add method save_metadata to class VideoClips in video_utils.py
* add load_metadata() method to VideoClips class
* add Exception to not catch unexpected events such as memory erros, interrupt
* fix bugs in video_plus.py
* [video reader]remove logging. update setup.py
* remove time measurement in test_video_reader.py
* Remove glog and try making ffmpeg finding more robust
* Add ffmpeg to conda build
* Add ffmpeg to conda build [again]
* Make library path finding more robust
* Missing import
* One more missing fix for import
* Py2 compatibility and change package to av to avoid version conflict with ffmpeg
* Fix for python2
* [video reader] support to decode one stream only (e.g. video/audio stream)
* remove argument _precomputed_metadata_filepath
* remove save_metadata method
* add get_metadata method
* expose _precomputed_metadata and frame_rate arguments in video dataset __init__ method
* remove ssize_t
* remove size_t to pass CI check on Windows
* add PyInit__video_reader function to pass CI check on Windows
* minor fix to define PyInit_video_reader symbol
* Make c++ video reader optional
* Temporarily revert changes to test_io
* Revert changes to python files
* Rename files to make it private
* Fix python lint
* Fix C++ lint
* add a functor object EnumClassHash to make Enum class instances usable as key type of std::unordered_map
* fix cpp format check
* Fix cherry-pick conflict for 0.4.2 release
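The EnumClassHash commit in the list above refers to a common C++ workaround: some older standard library versions do not provide std::hash for enum class types, so an std::unordered_map keyed by an enum fails to compile unless a hash functor is supplied. The following is a minimal sketch of that technique, not the code from this PR. The enum MediaType and the helper makeStreamNames are hypothetical names chosen for illustration; only the functor name EnumClassHash comes from the commit message.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <type_traits>
#include <unordered_map>

// Hypothetical enum used only for illustration; it is not taken from the PR.
enum class MediaType { Video, Audio, Subtitle };

// Hash functor that converts an enum value to its underlying integer type
// and hashes that integer with the standard std::hash specialization.
struct EnumClassHash {
  template <typename T>
  std::size_t operator()(T value) const {
    static_assert(std::is_enum<T>::value, "EnumClassHash expects an enum type");
    using Underlying = typename std::underlying_type<T>::type;
    return std::hash<Underlying>()(static_cast<Underlying>(value));
  }
};

// Hypothetical helper: builds a small lookup table keyed by the enum.
// The third template argument supplies the hash explicitly, so the map
// does not rely on std::hash<MediaType> being available.
inline std::unordered_map<MediaType, std::string, EnumClassHash>
makeStreamNames() {
  return {
      {MediaType::Video, "video"},
      {MediaType::Audio, "audio"},
  };
}
```

Passing the functor as the map's third template argument keeps the workaround local to the map type, which is presumably why it helps with the gcc-specific build failures mentioned in the review thread.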
This PR implements a C++ video decoder, referred to below as the TorchVision (TV) video reader.
Attention
This PR replaces the original PR (#1279), which was polluted by unrelated commits.
Main features
- … AV_PIX_FMT_RGB24)
- … AVSampleFormat (default: AV_SAMPLE_FMT_FLT)
- … load_metadata() and save_metadata() to class VideoClips in video_utils.py
APIs
The main API includes
- FfmpegDecoder::decodeFile(....): decode frames from a given video file. This is useful for both OSS and FB research projects, where videos reside in a file folder.
- FfmpegDecoder::decodeMemory(....): decode frames from a given compressed video byte array. This is useful for decoding everstore videos.
Sanity check
unit tests
Changes to TorchVision installation
The video reader depends on ffmpeg 4. To install it, use conda.
conda install -c conda-forge ffmpeg=4.0.2
conda install -c conda-forge av
which will automatically install the ffmpeg dependency.
Benchmark
We use several videos from HMDB-51, UCF-101, and Kinetics-400 for benchmarking and unit tests. The test videos are listed below.
RATRACE_wave_f_nm_np1_fr_goo_37.avi
SchoolRulesHowTheyHelpUs_wave_f_nm_np1_ba_med_0.avi
TrumanShow_wave_f_nm_np1_fr_med_26.avi
v_SoccerJuggling_g23_c01.avi
v_SoccerJuggling_g24_c01.avi
R6llTwEh07w.mp4
SOX5yA1l24A.mp4
WUzgd7C1pWA.mp4
Unit test
The unit test results are attached.
[torchvision video reader unit test.log]
torchvision.video.reader.unit.test.log
Comparison with PyAV