Add CarlaStereo, Kitti2012Stereo, and Kitti2015Stereo datasets for Stereo(#6269) #6311


Merged
merged 12 commits into main on Aug 2, 2022

Conversation

TeodorPoncu
Contributor

@TeodorPoncu TeodorPoncu commented Jul 25, 2022

Broken-down PR (#6269), as per @NicolasHug's suggestion.

Member

@NicolasHug NicolasHug left a comment

Thanks @TeodorPoncu ,

I have a few comments but they're mostly minor or about conventions. The PR looks great in general, nice job!

Apart from the comments below, we'll need to add these classes to datasets.rst in order to look at the rendered docs.

(
    imgs,
    dsp_maps,
    valid_masks,
) = self.transforms(imgs, dsp_maps, valid_masks)
Member

Following up on our previous discussion #6269 (comment) : I'm still not sure this is the signature we want, but I can see how flattening all of it would be a bit verbose as well.

Perhaps the best way forward here is to keep it as-is for now, and re-evaluate when we look at the implementation of the transforms.

CC @datumbox perhaps you already have an opinion about this?

Contributor Author

I actually mostly wrapped up writing the transform pipeline yesterday, and I have mixed feelings about the signature now. Since tuples are immutable, sometimes you've got to do a lot of unpacking and repacking inside the transform, so I'd actually be more inclined to use the flattened signature, even though it comes at the cost of increased verbosity.

Contributor

@NicolasHug I'm missing context to provide an educated opinion. I understand that you are dealing with the legacy datasets and you have to "hack them together" to progress with the Depth Perception work, so in that sense I agree with your "keep as-is and re-evaluate" proposal.

I suspect that there are some incompatibilities between the current approach and the new Transforms API, but that's not because you have multiple images/masks/maps; it's because the left/right inputs need to be processed together rather than independently. I will probably need to understand better how this works for this task, but it's an interesting use-case that we will need to support, and it might challenge some of the assumptions we made about the transforms, especially the one that says each input can be transformed independently. cc @pmeier @vfdev-5

Contributor Author

@TeodorPoncu TeodorPoncu Jul 28, 2022

To provide some context as to the specific use-case:

from typing import Optional, Tuple

import torch
from torch import Tensor
import torchvision.transforms.functional as F


class RandomHorizontalFlip(torch.nn.Module):
    def __init__(self, p: float = 0.5) -> None:
        super().__init__()
        self.p = p

    def forward(
        self,
        images: Tuple[Tensor, Tensor],
        disparities: Tuple[Tensor, Optional[Tensor]],
        masks: Tuple[Optional[Tensor], Optional[Tensor]],
    ) -> Tuple[Tuple, Tuple, Tuple]:

        img_left, img_right = images
        dsp_left, dsp_right = disparities
        mask_left, mask_right = masks

        # flip with probability p (only possible when the right disparity exists)
        if dsp_right is not None and torch.rand(1) < self.p:
            # horizontal flip at pixel level
            img_left, img_right = F.hflip(img_left), F.hflip(img_right)
            dsp_left, dsp_right = F.hflip(dsp_left), F.hflip(dsp_right)
            # mask flipping flow is heavily skewed towards how we handle
            # generating pseudo-masks via the transform compose
            if mask_left is not None and mask_right is not None:
                mask_left, mask_right = F.hflip(mask_left), F.hflip(mask_right)
            # swap the two stereo channels
            return (
                (img_right, img_left),
                (dsp_right, dsp_left),
                (mask_right, mask_left),
            )

        return images, disparities, masks

The transform requires some inputs that for some datasets might not exist, namely the right disparity map. Because we perform a horizontal pixel flip, the new target is not the transformed version of the original target but the transformed target from the opposite channel. Therefore the transform needs access to all the information.

Were it not for this transform, we could have just followed the previous flow template.
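The channel swap above can be illustrated with a tiny pure-Python sketch (schematic only, not the actual transform; `hflip` here just reverses a 1-D "image"):

```python
# Flipping both views of a stereo pair horizontally turns the left
# camera into the right camera, so the channels must also be swapped.

def hflip(row):
    # stand-in for F.hflip on a 1-D "image"
    return row[::-1]

img_left = [1, 2, 3, 4]
img_right = [5, 6, 7, 8]

# flip at pixel level, then swap the stereo channels
new_left, new_right = hflip(img_right), hflip(img_left)
print(new_left, new_right)  # [8, 7, 6, 5] [4, 3, 2, 1]
```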

Contributor Author

For the bigger picture view, here is some extra context:

class Compose(torch.nn.Module):
    def __init__(self, transforms):
        super().__init__()
        self.transforms = transforms

    def forward(self, images, disparities, masks):
        for t in self.transforms:
            images, disparities, masks = t(images, disparities, masks)
        return images, disparities, masks

Flattening everything would require making transforms that support the following forward scheme:

    def forward(self, img_left, img_right, dsp_left, dsp_right, mask_left, mask_right):
        for t in self.transforms:
            img_left, img_right, dsp_left, dsp_right, mask_left, mask_right = t(img_left, img_right, dsp_left, dsp_right, mask_left, mask_right)
        return img_left, img_right, dsp_left, dsp_right, mask_left, mask_right

Contributor Author

Truthfully, a disparity map is actually just a flow map with a single channel / 0-flow on the y-axis. The same constraints that apply to the transforms in our optical flow reference scripts apply to these as well.

Bilinear resizing will still work as intended because both flows and disparities are measured in pixels and not some other real-world distance metric.
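Since disparities are in pixel units, they scale linearly with image width, so a resize typically also multiplies the values by the width ratio. A toy nearest-neighbour sketch (hypothetical helper, 1-D for brevity; a real pipeline would do the same with bilinear interpolation on 2-D tensors):

```python
def resize_disparity(disp, new_w):
    """Nearest-neighbour resize of a 1-D disparity row, rescaling the
    pixel-valued disparities by the width ratio (illustration only)."""
    scale = new_w / len(disp)
    return [disp[int(i / scale)] * scale for i in range(new_w)]

print(resize_disparity([2.0, 2.0], 4))            # [4.0, 4.0, 4.0, 4.0]
print(resize_disparity([4.0, 4.0, 4.0, 4.0], 2))  # [2.0, 2.0]
```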

Judging by what several other paper repos provide in terms of augmentation pipelines (1, 2), it seems the same approach is being followed.

Typically the same problems are posed in monocular depth estimation, and although Lambertian surfaces are an important issue, the assumption that color shifts should affect the depth map is not made. In both the unsupervised and supervised settings we consider such alterations to be something that should not affect the true world model / estimate; they are solely camera-related / a regularization method.

Contributor

In any case I agree that we can merge this PR to unblock and re-visit this considering the new transforms API.

Contributor

OK, it sounds like the new API in principle supports both multiple inputs (named or not) and special types like disparity maps. I don't immediately see an incompatibility, but we should keep an eye out for similar requests, as this is useful feedback for the new API.

Contributor Author

@datumbox @NicolasHug since this PR is non-blocking for the model training, I could try dog-fooding the new transforms API and include parts of the reference script in this PR, so that we get a clearer idea of how everything would mix with the new API and a better picture of how the dataset should be structured.

Is that ok with you guys?

Contributor

We literally just merged a bunch of updates today in #6305. It might be a bit premature to dogfood it (we'll definitely take you up on your offer a bit later though), but you can already have a look at how it works on the main branch at torchvision.prototype.transforms.

Contributor

@jdsgomes jdsgomes left a comment

LGTM. Regarding the open discussion, I think we agree that this can be merged and revisited. I would just wait for @NicolasHug's approval, since he made most of the comments/suggestions.

Member

@NicolasHug NicolasHug left a comment

Thanks for addressing the comments @TeodorPoncu , I made another pass below

Would you mind adding the datasets to datasets.rst so that we can check the rendered docs as well? Thanks!

Comment on lines +48 to +49
self._images = [] # type: ignore
self._disparities = [] # type: ignore
Member

Do we really need to add type: ignore here? I'd suggest typing the rest, or not typing anything at all (which is by far my preferred way).

Contributor Author

I've left the # type: ignore because mypy starts throwing errors about empty lists, since it cannot infer their element type.

torchvision/datasets/_stereo_matching.py:48: error: Need type annotation for "_images" (hint: "_images: List[<type>] = ...") [var-annotated] self._images = []

torchvision/datasets/_stereo_matching.py:49: error: Need type annotation for "_disparities" (hint: "_disparities: List[<type>] = ...") [var-annotated] self._disparities = []
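The var-annotated errors can alternatively be silenced by annotating the attributes instead of ignoring them; a minimal sketch (class name and `str` element type are assumed here for illustration):

```python
from typing import List


class StereoMatchingDataset:
    def __init__(self) -> None:
        # Annotating the element type lets mypy infer the lists'
        # types, so no "# type: ignore" is needed:
        self._images: List[str] = []
        self._disparities: List[str] = []
```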

Member

@NicolasHug NicolasHug left a comment

Thanks @TeodorPoncu, one last cosmetic comment regarding the docs, but other than that LGTM! I'll approve now so that you can merge once it's addressed. Nice work!

Comment on lines 257 to 260
"""Return example at given index.
Args:
index(int): The index of the example to retrieve
Returns:
Member

This does not render properly in the docs: https://output.circle-artifacts.com/output/job/4dcd20b4-ad55-48a3-a0c9-474bfaae3aa3/artifacts/0/docs/generated/torchvision.datasets.Kitti2015Stereo.html#torchvision.datasets.Kitti2015Stereo.__getitem__

I think you just need to skip lines like so:

        """Return example at given index.

        Args:
            index(int): The index of the example to retrieve

        Returns:

Same for the other datasets :)

Comment on lines 176 to 177
"""Return example at given index.
Args:
Member

The line skip should be between these two as well:

Suggested change:

        """Return example at given index.
        Args:

becomes:

        """Return example at given index.

        Args:

@TeodorPoncu TeodorPoncu merged commit 6bea4ef into main Aug 2, 2022

github-actions bot commented Aug 2, 2022

Hey @TeodorPoncu!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

@NicolasHug NicolasHug changed the title Splitting Stereo Dataset PR(#6269) Add CarlaStereo, Kitti2012Stereo, and Kitti2015Stereo datasets for Stereo(#6269) Aug 2, 2022
facebook-github-bot pushed a commit that referenced this pull request Aug 2, 2022
Summary:
* Broken down PR(#6269). Added an additional dataset

* Removed some types. Store None instead of "". Merged test util functions.

* minor mypy fixes. minor doc fixes

* reformated docstring

* Added additional line-skips

Reviewed By: NicolasHug

Differential Revision: D38351752

fbshipit-source-id: 376714fcdd49cb474670ce8e6e959507a517ee46