add HMDB51 and UCF101 datasets as well as prototype for new style video decoding #5335
Conversation
💊 CI failures summary and remediations

As of commit 07d78b2 (more details on the Dr. CI page):

🕵️ 1 new failure recognized by patterns. The following CI failures do not appear to be due to upstream breakages:
```python
    f"{url_root}/hmdb51_org.rar",
    sha256="9e714a0d8b76104d76e932764a7ca636f929fff66279cda3f2e326fa912a328e",
)
videos._preprocess = self._extract_videos_archive
```
@NicolasHug The archive is a rar of rars, so a single `extract=True` won't cut it. We need the full extraction, since reading from rar archives is rather slow, and with this we get a significant performance increase.

Another option would be to use this "recursive extraction" by default when setting `extract=True`.
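For illustration, here is a minimal sketch of what such a recursive extraction could look like. It assumes the third-party `rarfile` package; the helper name and directory layout are hypothetical and not torchvision's actual internals.

```python
import pathlib

import rarfile  # third-party package; needs an unrar backend installed


def recursively_extract_rar(archive: pathlib.Path, target_dir: pathlib.Path) -> None:
    """Hypothetical helper: extract a .rar and then every .rar it contained."""
    target_dir.mkdir(parents=True, exist_ok=True)
    rarfile.RarFile(str(archive)).extractall(str(target_dir))

    # HMDB51 ships as a "rar of rars": extract each inner archive as well so the
    # videos can later be read straight from disk instead of from the archives.
    for inner in target_dir.rglob("*.rar"):
        rarfile.RarFile(str(inner)).extractall(str(inner.parent))
```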
@pmeier I see that you closed this; should I switch to the new PR?
This probably happened because I deleted the branch that this was supposed to be merged into. I'll fix it; keep working on this.
For whatever reason, GitHub does not let me re-open this. I've sent the PR again in #5422.
@pmeier It's because the revamp branch is deleted.
Yeah, I figured that, but it doesn't let me select a new merge target like …
This adds HMDB51 as the first video dataset (see #4541). It is used for prototyping the video decoder datapipes initially implemented by @bjuncek in #4838. You can test with the following snippet.
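Something along these lines should work, assuming the prototype datasets API exposes `torchvision.prototype.datasets.load`; the dataset name, `split` option, and sample keys below are assumptions rather than the exact snippet from the original PR:

```python
from torchvision.prototype import datasets

# Load the prototype HMDB51 dataset as a datapipe and look at one sample.
# "hmdb51" and split="train" are assumed registration details.
dataset = datasets.load("hmdb51", split="train")

sample = next(iter(dataset))
print(sample.keys())
```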
Apart from that, the biggest change is the added `meta` dictionary attribute on the `EncodedData` feature. The rationale here is that each dataset might provide very different metadata for each file, and this is hard to standardize. We might be able to have some common attributes like `path`, but I would still leave the option open for arbitrary metadata.
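As a rough sketch of the idea (not the actual torchvision implementation; the class layout and field names here are assumptions), an encoded-data feature carrying an open-ended meta dictionary might look like this:

```python
from typing import Any, Dict, Optional

import torch


class EncodedData(torch.Tensor):
    """Hypothetical illustration of a tensor feature with arbitrary per-file metadata."""

    meta: Dict[str, Any]

    def __new__(cls, data, *, meta: Optional[Dict[str, Any]] = None):
        tensor = torch.as_tensor(data, dtype=torch.uint8).as_subclass(cls)
        tensor.meta = dict(meta) if meta is not None else {}
        return tensor


# A dataset attaches whatever metadata it has for a file, e.g. the path
# plus video-specific information such as the frame rate.
video = EncodedData([0x00, 0x01], meta={"path": "brush_hair/clip_001.avi", "fps": 30})
print(video.meta["fps"])
```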