Add crestereo dataset #6269
Conversation
Thanks for the PR @TeodorPoncu , I just gave it a very brief look, I'll do a more in-depth review later
def read_pfm_file(file_path: str) -> np.array:
Could we use or modify our existing https://github.com/pytorch/vision/blob/main/torchvision/datasets/_optical_flow.py#L477:L477 for this?
Well, technically, based on the header information, .pfm files can store 1 or 3 channels of (H, W) data. During some initial tests I saw that they usually store just 1 channel, so the reshape in _optical_flow should be dynamic.
There's also the data slicing. Technically, if correct data is provided and the correct files are linked to the dataset, slicing an extra channel with data[:2, :, :] would cause no issues. However, the default in the test util make_pfm_file implicitly creates .pfm files with 3 channels.
data = data.reshape(h, w, 3).transpose(2, 0, 1) # <--- move to something like data.reshape(h, w, c)
data = np.flip(data, axis=1) # flip on h dimension
data = data[:2, :, :]
return data.astype(np.float32)
I believe we could add an additional argument to the _read_pfm from stereo, something like slice_channels such that data = data[:slice_channels, :, :], and set its default value to 2 so that we do not break backwards compatibility.
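A minimal sketch of what a channel-aware reader with a slice_channels argument could look like. This is illustrative only: the function name, defaults, and the in-memory fake file are my assumptions, not the actual torchvision util; the parsing follows the standard PFM header convention ("PF" = 3 channels, "Pf" = 1).

```python
import io
import numpy as np

def read_pfm(fh, slice_channels: int = 2) -> np.ndarray:
    # "PF" headers store 3 channels per pixel, "Pf" headers store 1
    header = fh.readline().rstrip()
    n_channels = 3 if header == b"PF" else 1
    w, h = map(int, fh.readline().split())
    scale = float(fh.readline())
    endian = "<" if scale < 0 else ">"  # negative scale means little-endian
    data = np.frombuffer(fh.read(), dtype=endian + "f4")
    data = data.reshape(h, w, n_channels).transpose(2, 0, 1)  # (C, H, W)
    data = np.flip(data, axis=1)  # PFM rows are stored bottom-to-top
    return data[:slice_channels].astype(np.float32)

# a fake single-channel 4x2 file, standing in for what make_pfm_file produces
buf = io.BytesIO(b"Pf\n4 2\n-1.0\n" + np.arange(8, dtype="<f4").tobytes())
disparity = read_pfm(buf)
print(disparity.shape)  # (1, 2, 4): the reshape adapts to the header
```

With slice_channels=2 as the default, 3-channel files are cut down exactly as the current [:2, :, :] slice does, so existing callers keep working.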
It seems that we can modify the existing then, if so that would be preferable.
If you go down this path, please do the change in a separate PR. @NicolasHug would it make sense to move the util out of the optical_flow module?
Thanks for the details - I agree, let's just do the necessary changes and move it to dataset/utils.py
As discussed offline with @TeodorPoncu we decided to move the read_pfm_file to datasets.utils in a separate PR and add an extra argument in a BC way to handle the different scenarios. Would you agree with this approach @NicolasHug ?
sure
lol didn't see your comment and basically suggested what you did :D
test/datasets_utils.py
-@test_all_configs
+@ test_all_configs
Looks like there are a few formatting issues like these. We try to avoid formatting changes in PRs like this, because they add noise to git blame.
As you can see there are linting issues: https://app.circleci.com/pipelines/github/pytorch/vision/18822/workflows/40d854f9-fd3b-43cb-9471-9eb90d4c93ac/jobs/1522929
We have instructions for applying formatters here: https://github.com/pytorch/vision/blob/main/CONTRIBUTING.md#formatting
Hi @TeodorPoncu , thanks for the PR, you added quite a lot of stereo datasets, which is nice!
My TLDR for this review is:
- For the download functionality, we may want to only download what is needed, and also check whether the file exists before downloading
- For some of the file paths, we should avoid using replace
- Some suggestions on where we might be able to refactor a bit
test/test_datasets.py
for split in ("tree", "shapenet", "reflective", "hole"):
    with self.create_dataset(split=split) as (dataset, _):
        for left, right, disparity, valid_mask in dataset:
            left_array = np.array(left)
The test logic inside the for loop seems very similar across all the stereo matching datasets; could we have a shared method for this to minimize duplicated code?
Yes, most definitely. Would datasets_utils.py in the test directory be a good place to add that kind of method?
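A hypothetical sketch of such a shared helper, checking the shape contract described in this PR (images match, disparity is (1, H, W), valid mask is (H, W)). The name and exact signature are illustrative, not what was eventually merged.

```python
import numpy as np

def assert_stereo_sample(left, right, disparity, valid_mask):
    # shared shape checks for one stereo sample, reusable across dataset tests
    left, right = np.asarray(left), np.asarray(right)
    assert left.shape == right.shape, "left/right image size mismatch"
    if disparity is not None:
        disparity = np.asarray(disparity)
        assert disparity.ndim == 3 and disparity.shape[0] == 1  # (1, H, W)
    if valid_mask is not None:
        assert np.asarray(valid_mask).ndim == 2  # (H, W)

# usage with fake arrays standing in for a dataset sample
assert_stereo_sample(
    np.zeros((32, 32, 3)), np.zeros((32, 32, 3)),
    np.zeros((1, 32, 32)), np.ones((32, 32), dtype=bool),
)
```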
disparities is a Tuple of (``np.ndarray``, ``np.ndarray``) with shape (1, H, W)
valid_masks is a Tuple of (``np.ndarray``, ``np.ndarray``) with shape (H, W)

In some cases, when a dataset does not provide disparties, the ``disparities`` and
NIT small typo disparties -> disparities
Args:
    root (str): Root directory of the dataset.
    split (str): The split of the dataset to use. One of ``"tree"``, ``"shapenet"``, ``"reflective"``, ``"hole"``
Question: In my experience, the split parameter options are usually train, test, val, or others that indicate the functional split of the dataset. In this case it is more of a filter on the image type, so I'm just wondering whether we should still use the term split or another parameter name.
Any opinion on this @NicolasHug ?
I agree it's best to restrict split to the usual values (test, train, val, ...).
Maybe this question will remove the problem: do we actually need to provide this parameter? Or can we just provide the entire dataset with all "categories"?
As a side note, I can't find a reference to these in the paper or the repo, so I can't suggest a name ATM. Do you have thoughts @TeodorPoncu ?
"""Synthetic dataset used in training the `CREStereo <https://arxiv.org/pdf/2203.11483.pdf>`_ architecture."""
I think it should've been laid out in the first line of the docstring. I don't think we necessarily have to provide it, as the authors use it just for training and nothing more. However, there are some use cases in transfer learning / domain adaptation where I believe people would be interested in finer control over the splits.
There's an "all" split, which basically includes all 4 splits. I believe we could rename that to training if we don't ditch the granular split approach?
My take on all this is that unless we have a clear, specific, and immediate need for a feature in torchvision, we don't need to implement it now. I agree the granularity might be useful eventually, but if we don't need it ATM, it's best to leave that kind of problem for the future (should it ever pop up).
So, in order to simplify this work further, I'd suggest dropping it altogether. At least for now.
I have no strong opinion on whether or not to remove it. But if we don't remove it, I think we should give it a different name like categories or something else, but not split.
raise FileNotFoundError("No images found in {}".format(root))

if split == "train":
    disparity_maps_left = sorted(glob(str(root / "disp_noc" / "*.png")))
Question: Just wondering if we can refactor by having something like get_disparity_map_files in the parent class, so the child class only needs to pass the path.
I definitely think we can do something like that. For images as well. I'll check whether there are any edge cases where that might not work.
I think this commit should solve those concerns @YosuaMichael. It also tackles the .replace cases.
if not os.path.exists(file_path):
    return None, None

# disparity decoding as per Sintel instructions
Could you give some link / reference regarding these instructions?
disparity_maps_left = [file_path.replace(p, "disparity").replace(".png", ".pfm") for file_path in imgs_left]
disparity_maps_right = [
    file_path.replace(p, "disparity").replace(".png", ".pfm") for file_path in imgs_right
I think we should try to avoid replace as much as possible for a path.
Like you suggested above, I think we can wrap that up in the parent class somehow and avoid replace altogether.
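One hedged sketch of how the parent class could pair files without str.replace: scan the image directory and derive each disparity path from directory and suffix arguments. All names here (paired_paths, the directory layout) are hypothetical, not the refactor that was actually merged.

```python
import tempfile
from pathlib import Path

def paired_paths(root, img_dir: str, disp_dir: str, disp_suffix: str = ".pfm"):
    # subclasses pass directory names and a suffix; no string surgery on paths
    root = Path(root)
    imgs = sorted((root / img_dir).glob("*.png"))
    disps = [root / disp_dir / p.with_suffix(disp_suffix).name for p in imgs]
    return imgs, disps

# demo on a throwaway directory layout
root = Path(tempfile.mkdtemp())
(root / "image_left").mkdir()
(root / "image_left" / "0001.png").touch()
imgs, disps = paired_paths(root, "image_left", "disparity")
print(disps[0].name)  # 0001.pfm
```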
intrinsics = json.load(f)
fx = intrinsics["camera_settings"][0]["intrinsic_settings"]["fx"]
# inverse of depth-from-disparity equation
disparity = (fx * 6.0 * 100) / depth.astype(np.float32)
NIT: could we give names to these constants 6.0 and 100? (by assigning variables to them)
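Something along these lines. Note the constant names, and what 6.0 and 100 physically mean, are my guesses from the snippet and should be checked against the FallingThings documentation:

```python
import numpy as np

BASELINE = 6.0     # assumed: stereo baseline of the FallingThings camera rig
UNIT_SCALE = 100.0  # assumed: unit-conversion factor from the original snippet

def depth_to_disparity(depth: np.ndarray, fx: float) -> np.ndarray:
    # inverse of the depth-from-disparity equation: disparity = fx * B / depth
    return (fx * BASELINE * UNIT_SCALE) / depth.astype(np.float32)

print(depth_to_disparity(np.array([600.0]), fx=1.0))  # [1.]
```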
self._images = imgs

disparity_maps_left = list(p.replace("left", "left_disp") for p in imgs_left)
disparity_maps_right = list(p.replace("right", "right_disp") for p in imgs_left)
Similar comment as before: we should avoid using replace as much as possible for file paths.
test/datasets_utils.py
left_array = np.array(left)
right_array = np.array(right)
I think we could rely on transforms.functional.get_dimensions() to avoid converting to numpy arrays. It can handle both tensors and PIL images.
test/datasets_utils.py
assert len(disparity.shape) == 3
assert len(valid_mask.shape) == 2
NIT or just FYI: this is equivalent to array.ndim
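For instance, with arrays shaped like the ones in this dataset:

```python
import numpy as np

disparity = np.zeros((1, 8, 8), dtype=np.float32)
valid_mask = np.zeros((8, 8), dtype=bool)
# ndim is the number of axes, i.e. exactly len(arr.shape)
assert len(disparity.shape) == disparity.ndim == 3
assert len(valid_mask.shape) == valid_mask.ndim == 2
```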
test/datasets_utils.py
left_array = np.array(left)
right_array = np.array(right)
same here regarding get_dimensions()
torchvision/datasets/__init__.py
StereoETH3D,
StereoFallingThings,
StereoKitti2012,
StereoKitti2015,
StereoMiddlebury2014,
StereoSceneFlow,
StereoSintel,
CREStereo,
InStereo2k,
This is something we can bikeshed on at the very end but I'm commenting so that we don't forget:
Historically we had the Kitti dataset (for classification) and then I added KittiFlow for optical flow. Following that (arbitrary) convention, the new datasets should probably be KittiStereo, SintelStereo, etc.
    imgs,
    dsp_maps,
    valid_masks,
) = self.transforms(imgs, dsp_maps, valid_masks)
Are we certain we want the transforms to have this signature? In optical flow, we flatten all the input instead of passing tuples
Flattening everything would result in very verbose function signatures for transforms: (left_img, right_img, left_disp, right_disp, left_mask, right_mask).
I don't have any strong feelings about this. The trade-off for tuples is that you have to do tuple unpacking / manipulation inside the transform.
I think the current transforms will be somewhat temporary (put in references like optical_flow), and I have no strong opinion on this.
For the future transforms API, I think it would be good to package left and right together (either stacked as a tensor or maybe a tuple). The reason is that the new transforms accept *input as shown in https://github.com/pytorch/vision/blob/main/torchvision/prototype/transforms/_transform.py#L22 and in that case we can't apply a transform separately to the left and right image.
img_right = self._read_img(self._images[index][1])

dsp_map_left, valid_mask_left = self._read_disparity(self._disparities[index][0])
dsp_map_right, valid_mask_right = self._read_disparity(self._disparities[index][1])
From our discussions in #6259 (comment) I was under the impression that we don't have a clear idea about the usefulness of the right disparity map and valid_mask. Does it really make sense to keep and handle them here then?
In CREStereo there's an augmentation which horizontally flips the image / disparity pixels and then swaps the left and right channels. So in order to reproduce that augmentation procedure, we'd need access to the right disparity map as well in order to perform the switch.
I made a mock implementation of that transform; if the right mask doesn't exist, the augmentation just returns its inputs.
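A rough numpy sketch of that augmentation as I understand it from the description above. This is not the actual CREStereo code; the no-op behaviour on missing right-view targets follows the mock implementation mentioned.

```python
import numpy as np

def flip_and_swap(left, right, disp_left, disp_right):
    # horizontally flip both views and disparities, then swap left/right;
    # returns the inputs unchanged when the right-view targets are missing
    if disp_right is None:
        return left, right, disp_left, disp_right
    flip = lambda a: np.flip(a, axis=-1)  # flip along the width axis
    return flip(right), flip(left), flip(disp_right), flip(disp_left)

l, r = np.array([[1, 2]]), np.array([[3, 4]])
dl, dr = np.array([[5, 6]]), np.array([[7, 8]])
new_l, new_r, new_dl, new_dr = flip_and_swap(l, r, dl, dr)
print(new_l)  # [[4 3]]: the flipped right view becomes the new left view
```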
    valid_masks,
) = self.transforms(imgs, dsp_maps, valid_masks)

return imgs[0], imgs[1], dsp_maps[0], valid_masks[0]
In optical flow, we don't return the valid_mask if the dataset doesn't have a built-in one (unless it gets generated by a transform).
I'm not saying this is the best design by any means, but there is value in consistency across the library for such things. WDYT?
I believe it would just streamline transform definition / writing for the end user. The input and output signatures stay identical; users only have to handle how the input is processed inside the transform.
For this, I also prefer the dataset to always return the same number of outputs (always have valid_mask, although it can be None). It feels less "unexpected" to me.
One design question I have for valid_mask: when the dataset doesn't provide any valid_mask file, should we generate the valid_mask in _read_disparity_files, or should we just leave it None? (in that case we let the transforms define the valid_mask)
The proposal I've included in the PR is to make a mask full of ones when we don't get any information. Since the mask's job is to indicate which pixels of the disparity map the loss should be computed on, a None would be equivalent to evaluating the loss on the entire output.
One design question I have for valid_mask is when the dataset dont provide any file on valid_mask
I think we should let the transform define it, like we do for OF datasets.
I agree that we should try to let the dataset always return the same number of outputs. This is why we let flow or disparity be None in "test" splits, where it's not available, as it's only available in "train".
However, in the case of valid_mask, which may not always exist, it doesn't seem expected to always return it if it is not defined by the dataset. Taken in isolation, why should a dataset return something it doesn't actually have? This was the reason we decided not to return the valid_flow_masks in datasets that don't provide them (again, I'm not claiming this is a perfect solution).
As pointed out by @TeodorPoncu , it does make the writing of the transforms a tiny bit trickier. But this is why we provide the presets in our training recipes, which handle all of this for the user, so it is mostly transparent and they don't need to write if/else-y code.
I would still err on the side of consistency with the rest of our API, as this is one of the most important aspects of a frictionless user experience.
Thanks @TeodorPoncu , these are all good points TBH.
Speaking from anecdotal experience most of the people that I know who use datasets from torchvision don't even look at the training reference scripts / know that they exist.
Ah, that's a shame. Perhaps this is because we didn't do a great job of documenting these in the past. Hopefully the new doc revamp will help raise awareness of them.
Dataset centric way or Task centric way
I agree with your analysis. For OF datasets we opted for the dataset-centric way. I don't think that there is a clear winner a-priori, but now that we already went this way for OF datasets, honouring consistency is critical.
To me personally, transforms imply modifying / altering data, not necessarily creating or extracting information and as a personal preference I'd avoid using transforms that transform data types / generate information.
I tend to agree as well. We still decided to generate the fake data in the transforms rather than in the datasets, because generating it in the datasets has various drawbacks:
- it binds the user to one specific generation, which may not be the one they want. For example, in optical flow we generate a fake valid_mask depending on a given threshold, but the default threshold may not suit all users (we could make this a parameter of the dataset class, but it's not ideal either)
- it generates a lot of useless data and thus uses memory, while some users may not actually need the valid_mask at all. So it's best to avoid generating it on their behalf.
Another argument I have for consistent output shapes from __getitem__ is that you can guarantee transform associativity as often as possible.
I agree this is important, and we can have that regardless of the design :) In the OF datasets, we don't return (or generate fake) values that don't exist, but we still require all the transforms to have the same signature: img1, img2, flow, valid_mask:
- https://github.com/pytorch/vision/blob/add-crestereo-dataset/torchvision/datasets/_optical_flow.py#L65:L65
vision/torchvision/datasets/_optical_flow.py
Lines 242 to 245 in 1dd1753
transforms (callable, optional): A function/transform that takes in ``img1, img2, flow, valid_flow_mask`` and returns a transformed version. ``valid_flow_mask`` is expected for consistency with other datasets which return a built-in valid mask, such as :class:`~torchvision.datasets.KittiFlow`.
My only comment would be that it would be difficult to wrap multiple datasets together under some sort of wrapper if they don't have a unified return shape.
Multiset training is something that is used in CREStereo. So if users want to use other dataset combinations, we would still force them to generate masks for compatibility with other datasets / loss masking.
I agree that binding users to one way of generating their data is not ideal, however I think that when we force users to make their own data we should also provide out-of-the-box, idiomatic ways to do so.
Another thing that crosses my mind is that by having it in the dataset we provide a "standard" way of interpreting the data, which would allow better reproducibility among users?
Maybe something like the target_transform args in other datasets would be a good fit? In our case mask_generation_transform?
I believe all of these concerns are covered by the existing design we have for optical flow datasets. I provided references above but I'm happy to provide more details offline/in person if you wish.
No, it's not necessary. I do agree that datasets should remain pure and we should not make any assumptions regarding masks. My final question would be w.r.t. datasets (such as Kitti2015) for which some authors (i.e. RAFT-Stereo) have basic methods of creating a mask.
If paper / dataset authors provide a method for computing a valid mask (i.e. CREStereo), should we consider that a valid way of generating masks and add it to our dataset implementations, or, if there's no occlusion / valid map file, should we just not return anything at all?
Both of these mask generations seem to be relevant at the "application/task" level rather than at the dataset level, so I would suggest providing the same logic, but in the transforms of our training references instead (i.e. the dataset would not return it).
We made a similar decision for OF models: in the original code from the authors' repo, the mask was generated within the dataset class, but we decided to move that to the transforms to keep the datasets clear of any task-related assumptions:
vision/references/optical_flow/transforms.py
Lines 29 to 40 in 1dd1753
class MakeValidFlowMask(torch.nn.Module):
    # This transform generates a valid_flow_mask if it doesn't exist.
    # The flow is considered valid if ||flow||_inf < threshold
    # This is a noop for Kitti and HD1K which already come with a built-in flow mask.
    def __init__(self, threshold=1000):
        super().__init__()
        self.threshold = threshold

    def forward(self, img1, img2, flow, valid_flow_mask):
        if flow is not None and valid_flow_mask is None:
            valid_flow_mask = (flow.abs() < self.threshold).all(axis=0)
        return img1, img2, flow, valid_flow_mask
warnings.warn(
    "\nSplit 'test' has only no calibration settings, ignoring calibration argument.", RuntimeWarning
)
else:
    if split != "test":
        calibration = "perfect"
        warnings.warn(
            f"\nSplit '{split}' has calibration settings, however None was provided as an argument."
            f"\nSetting calibration to 'perfect' for split '{split}'. Available calibration settings are: 'perfect', 'imperfect', 'both'.",
            RuntimeWarning,
        )
For these kinds of user mistakes, we typically raise ValueError instead of warning. It's more obvious to users, and it prevents them from doing things they might not understand.
Thanks @TeodorPoncu , I made another pass but I didn't review everything yet (e.g. the tests).
Considering how big this PR is, I wonder if it would make sense to split it into smaller ones to ease reviewing? E.g. perhaps submit a first PR with 1 or 2 datasets and their corresponding tests, that would be representative of the rest of the implementation. Then, reviewing the remaining datasets would be a lot easier.
Also, I just saw that we seem to already have some datasets for stereo: LFWPairs and PhotoTour. Have you looked into the design of these?
Also it looks like the docs are missing: we'll need to add the classes to docs/source/datasets.rst so that we can see the rendered docs.
Thank you!
self._disparities += disparities

def _read_disparity(self, file_path: str) -> Tuple:
    disparity = np.array(Image.open(file_path), dtype=np.float32)
Here and everywhere else where there is a similar use of np.array(pil_img): we should try to use np.asarray so that the input is only copied when needed.
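The difference in one line: np.array copies by default, while np.asarray returns the input itself when it is already an ndarray of a compatible dtype (a PIL image is not an ndarray, so np.asarray still converts it, just without the unconditional copy):

```python
import numpy as np

a = np.zeros((2, 2), dtype=np.float32)
# np.array makes a fresh copy; np.asarray passes the existing array through
assert np.array(a) is not a
assert np.asarray(a) is a
```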
PhotoTour is a slightly different task, but I'll have a look into it. Same for LFWPairs.
At a glance it seems PhotoTour doesn't have to apply any transforms on the target / gt. LFWPairs has a separate transform for targets, which I believe is not necessarily the best way to go for the CREStereo augmentation pipeline requirements.
test/test_datasets.py
pfm_path = os.path.join(scene_dir, "disp0GT.pfm")
datasets_utils.make_fake_pfm_file(h=100, w=100, file_name=pfm_path)
paths.append(pfm_path)
return paths
You don't need these paths, or to return anything.
Just a first skim through the tests
test/test_datasets.py
num_examples = 2 if config["split"] == "train" else 3

split_name = "two_view_training" if config["split"] == "train" else "two_view_test"
split_dir = os.path.join(eth3d_dir, split_name)
Nit: you might consider using pathlib. see example here: https://github.com/pytorch/vision/blob/main/test/test_datasets.py#L119
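For reference, the joins above written with pathlib's / operator (equivalent to the os.path.join version):

```python
import os
from pathlib import Path

eth3d_dir = Path("eth3d")
split_dir = eth3d_dir / "two_view_training"
# Path and os.path.join produce the same separator on each platform
assert str(split_dir) == os.path.join("eth3d", "two_view_training")
```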
Summary:
* Broken down PR (#6269). Added an additional dataset
* Removed some types. Store None instead of "". Merged test util functions.
* Minor mypy fixes. Minor doc fixes
* Reformatted docstring
* Added additional line-skips

Reviewed By: NicolasHug
Differential Revision: D38351752
fbshipit-source-id: 376714fcdd49cb474670ce8e6e959507a517ee46
I'm closing this PR since it has already been replaced by a few smaller PRs.
Added a Stereo Matching Dataset interface similar to the Optical Flow one. I believe we will need a renaming of datasets based on the task, at least for 2.5D-related ones, as we might have naming clashes if we plan on adding Depth Estimation as well, since datasets such as Kitti, FallingThings, or SceneFlow can be used for multiple of these tasks.
I've also set the outputs of the disparity map and valid mask to (1, H, W) and (H, W) respectively, to be aligned with the way the Flow datasets output a flow of shape (2, H, W) and a mask of shape (H, W) where possible.