Added SceneFLow variant datasets #6345

TeodorPoncu · 2022-08-02T11:05:57Z

This is a continuation of the PR split (#6311, #6269) which contains the SceneFlow dataset variants.

NicolasHug

Thanks @TeodorPoncu, just a few comments, but it looks great already.

NicolasHug · 2022-08-02T11:46:34Z

test/test_datasets.py

+            "final": "frames_finalpass",
+        }
+
+        num_examples = 1


Suggested change

num_examples = 1

NicolasHug · 2022-08-02T11:50:32Z

torchvision/datasets/_stereo_matching.py

+
+    Args:
+        root (string): Root directory where SceneFlow is located.
+        split (string): Which dataset variant to use, "FlyingThings3D" (default), "Monkaa" or "Driving".


Following up on our discussion in #6269 (comment), it looks like we don't really need to expose this dataset. There's no obvious use-case yet, so I'm happy to drop it.

If we really want to allow users to control this, I think we should just call this parameter variant instead of split, as split is usually reserved for train / test / val. But perhaps the dataset authors have a dedicated name already?

In the CREStereo paper, they have a vague formulation from which one could deduce that they are using ALL mentioned datasets (about 6 synthetic datasets if I recall correctly).

I agree that variant might be a better naming scheme.

NicolasHug · 2022-08-02T11:53:22Z

test/test_datasets.py

+class SceneFlowStereoTestCase(datasets_utils.ImageDatasetTestCase):
+    DATASET_CLASS = datasets.SceneFlowStereo
+    ADDITIONAL_CONFIGS = datasets_utils.combinations_grid(
+        split=("FlyingThings3D", "Driving", "Monkaa"), pass_name=("clean", "final")


Suggested change

split=("FlyingThings3D", "Driving", "Monkaa"), pass_name=("clean", "final")

split=("FlyingThings3D", "Driving", "Monkaa"), pass_name=("clean", "final", "both")
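For context, the suggested grid expands into one test configuration per (split, pass_name) pair. A minimal sketch of that expansion follows, assuming `combinations_grid` behaves like a Cartesian product over its keyword arguments. The helper below is a stand-in written for illustration, not the `datasets_utils` implementation.

```python
import itertools


def combinations_grid(**kwargs):
    """Illustrative stand-in: one dict per combination of the given option tuples."""
    keys = list(kwargs)
    return [dict(zip(keys, values)) for values in itertools.product(*kwargs.values())]


configs = combinations_grid(
    split=("FlyingThings3D", "Driving", "Monkaa"),
    pass_name=("clean", "final", "both"),
)

# Three variants times three pass names.
print(len(configs))
```

Adding "both" to `pass_name` therefore grows the grid from six configurations to nine under this assumption.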

NicolasHug

Thanks @TeodorPoncu, LGTM. Just one final question regarding the variant parameter.

NicolasHug · 2022-08-03T12:27:34Z

torchvision/datasets/_stereo_matching.py

+
+        root = Path(root) / "SceneFlow"
+
+        verify_str_arg(variant, "variant", valid_values=("FlyingThings3D", "Driving", "Monkaa"))


I'm a bit confused: I thought we were relying on all variants for the training? Do we want to allow "all" here as well?

Well, in the Optical Flow datasets we directly provide access to FlyingThings3D. Some authors use all the variants, others just a subset of them. An "all" split could work as well. However, if a user requires some combination of variants, I guess they could opt for ConcatDataset, without requiring us to provide a way of handling variant permutations.
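To make the ConcatDataset idea concrete, here is a rough sketch of the index arithmetic involved. It is a stdlib-only stand-in so that it runs without PyTorch; `ToyVariant` and `Concat` are hypothetical names invented for this illustration, and the sample strings are placeholders rather than real SceneFlow data.

```python
import bisect
from itertools import accumulate


class ToyVariant:
    """Hypothetical stand-in for one dataset variant; holds labelled sample ids."""

    def __init__(self, variant, size):
        self.samples = [f"{variant}/{i}" for i in range(size)]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        return self.samples[index]


class Concat:
    """Stdlib-only stand-in for the kind of indexing ConcatDataset provides."""

    def __init__(self, datasets):
        self.datasets = list(datasets)
        # Cumulative sizes, e.g. [3, 5] for datasets of length 3 and 2.
        self.cumulative = list(accumulate(len(d) for d in self.datasets))

    def __len__(self):
        return self.cumulative[-1] if self.cumulative else 0

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Find the first dataset whose cumulative size exceeds the index.
        which = bisect.bisect_right(self.cumulative, index)
        offset = self.cumulative[which - 1] if which else 0
        return self.datasets[which][index - offset]


combined = Concat([ToyVariant("FlyingThings3D", 3), ToyVariant("Monkaa", 2)])
print(len(combined), combined[3])
```

With PyTorch available, `torch.utils.data.ConcatDataset` over the per-variant dataset instances would presumably play the role of `Concat` here, which is why no dedicated handling of variant combinations would be needed on the dataset side.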

If I recall correctly, we settled on removing variants for the CREStereo dataset since the only existing use-case is to use it with all its variants (#6351, #6269 (comment)).

> If I recall correctly, we settled on removing variants for the CREStereo dataset since the only existing use-case is to use it with all its variants (#6351, #6269 (comment)).

Is this the case here as well?

Honestly, I'm fine with either at this point. I'm just surprised that we don't need "all", because this is what we needed for CREStereo.

We could add an "all" split as well. As a user, my general preference is to have access to a "smaller" version of a dataset for quick experimentation / validation, similar to the ImageNet vs. ImageNette scenario.

As far as I am aware, RAFT-Stereo uses all 3 variants as well. However, I do not have an exhaustive list to guarantee that this is a universally used approach.

OK, well let's just leave it as is and figure out whether not having it complicates things in the training reference. Thanks for the details

For instance, here is a configuration example for a dataset chain schedule, similar to how RAFT-Stereo is trained: the datasets are used in sequential order, and optimisation runs on samples from each dataset for the specified number of steps. Adding the "all" variant or removing the variant argument altogether would result in:

train_dataset: ["instereo-2k", "sintel", "sceneflow"]
dataset_steps: [200_000, 150_000, 100_000]

Having the current version would yield:

train_dataset: ["instereo-2k", "sintel", "flyingthings3d", "monkaa", "driving"]
dataset_steps: [200_000, 150_000, 33_000, 33_000, 33_000]
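The relationship between the two configurations can be sketched as a small expansion step. This is only an illustration under stated assumptions: `expand_schedule` and the `groups` mapping are hypothetical names, not part of the training reference, and the even split uses integer division, which gives 33_333 steps per variant rather than the rounded 33_000 quoted above.

```python
def expand_schedule(names, steps, groups):
    """Replace each grouped dataset with its members, splitting its steps evenly."""
    expanded_names, expanded_steps = [], []
    for name, budget in zip(names, steps):
        members = groups.get(name, [name])
        share = budget // len(members)  # integer division; any remainder is dropped
        for member in members:
            expanded_names.append(member)
            expanded_steps.append(share)
    return expanded_names, expanded_steps


# Hypothetical grouping: "sceneflow" stands for its three variants.
groups = {"sceneflow": ["flyingthings3d", "monkaa", "driving"]}

names, steps = expand_schedule(
    ["instereo-2k", "sintel", "sceneflow"],
    [200_000, 150_000, 100_000],
    groups,
)
print(names)
print(steps)
```

A helper along these lines would let the shorter configuration be written by hand while the per-variant schedule is derived from it, which is one way the absence of an "all" variant could be absorbed in the training script.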

NicolasHug

LGTM, thanks @TeodorPoncu

github-actions · 2022-08-03T15:34:04Z

Hey @TeodorPoncu!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

Summary:
* added SceneFLow variant datasets
* Changed split name to variant name
* removed trailing commented code line

Reviewed By: datumbox

Differential Revision: D38824231

fbshipit-source-id: 14dc283f11df26287fe6446946b441f51eb82181

added SceneFLow variant datasets

c057189

facebook-github-bot added the cla signed label Aug 2, 2022

NicolasHug reviewed Aug 2, 2022

TeodorPoncu added 2 commits August 2, 2022 13:50

Changed split name to variant name

996df9b

removed trailing commented code line

9f85dc1

NicolasHug reviewed Aug 3, 2022

Merge branch 'main' into add-stereo-flyingthings

62bbe44

NicolasHug approved these changes Aug 3, 2022

TeodorPoncu merged commit 96aa3d9 into main Aug 3, 2022

NicolasHug added module: datasets new feature labels Aug 3, 2022
