Port RASampler functionality to iterable datasets #6018
Comments
The RASampler is currently implemented by dispatching all copies of the same sample to different GPUs. It's fairly hacky: it relies on the existence of DDP, and on the assumption that different GPUs use different RNGs. That is usually true, but there is no general guarantee for arbitrary user code. I think we should be able to implement a datapipe-compatible RASampler by simply not sharding across GPUs (or only partially sharding). Still hacky, but not much more than what we currently have.
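The "not sharding across GPUs" idea above can be sketched in plain Python (names like `unsharded_repeated_stream` are illustrative, not part of torchvision): every rank iterates the same shuffled stream, so each sample is seen by all ranks, and per-rank random transforms produce the differently-augmented repeated copies.

```python
import random

def unsharded_repeated_stream(samples, rank, base_seed=0):
    # Hypothetical sketch: every rank iterates the SAME shuffled order
    # (shared seed, no sharding), so each sample reaches all ranks;
    # a per-rank RNG stands in for the random augmentations that make
    # the copies differ.
    order_rng = random.Random(base_seed)  # identical on every rank
    order = list(samples)
    order_rng.shuffle(order)
    aug_rng = random.Random(base_seed + 1 + rank)  # per-rank augmentation RNG
    for sample in order:
        yield sample, aug_rng.random()  # (sample, stand-in "augmentation")
```

Note that this deliberately ties the augmentation difference to the rank rather than to an implicit RNG divergence, which avoids the "different GPUs use different RNGs" assumption criticized above.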
@NicolasHug Agreed that the previous implementation is hacky and can be simplified/improved. I've asked Philip to create a separate ticket so that I can remove the feature from the Batteries Included 2 project. Feel free to edit the description to outline potential changes to the implementation. As long as the new class provides the same results as the previous one, we should be OK to make changes to its implementation.
One thing that came to mind while thinking about this is that we will be wasting quite a lot of I/O resources when using this functionality. The prototype datasets read the raw bytes and store them in a tensor for later decoding. Since we will throw away the majority of the samples using this technique, all these reads are wasted. I don't know if that is significant, since we only read and don't decode, but we should watch out for it.
Another sub-optimal thing is that we would be reading and decoding the same sample multiple times instead of once. That's already a problem with our current implementation, though.
@NicolasHug Are you sure it's a problem in the current implementation? Doesn't the decoding happen after the indexing op? My understanding is that it's not a problem, but I could be wrong.
The decoding happens after the indexing, but the indexes are the same across GPUs - at least according to the docstring. So multiple GPUs will receive the same index, hence decoding the same sample (and then transforming it differently) |
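The "same indices across GPUs" behavior described above can be illustrated with a small stand-in (this uses Python's `random` module and made-up names like `ra_sampler_indices`; the real RASampler in `sampler.py` differs in details): all ranks shuffle with a shared seed, repeat each index, then subsample round-robin, so the ranks end up with overlapping index sets and therefore decode the same samples.

```python
import math
import random

def ra_sampler_indices(dataset_len, num_replicas, rank, repetitions=3, seed=0):
    # Illustrative sketch of repeated-augmentation index generation:
    # 1. shuffle with a seed shared by ALL ranks,
    # 2. repeat each index `repetitions` times,
    # 3. pad to a multiple of num_replicas and subsample round-robin.
    rng = random.Random(seed)  # same seed on every rank
    indices = list(range(dataset_len))
    rng.shuffle(indices)
    repeated = [i for i in indices for _ in range(repetitions)]
    num_samples = math.ceil(len(repeated) / num_replicas)
    total = num_samples * num_replicas
    repeated += repeated[: total - len(repeated)]  # pad if needed
    return repeated[rank:total:num_replicas]
```

With `repetitions=3` and two replicas, the consecutive copies of each index straddle the round-robin split, so both ranks receive every index at least once: the same sample is decoded on multiple GPUs and only the augmentation differs.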
I see what you mean: we decode the same thing over and over. That's expected, and I agree it was happening in the previous implementation. For a moment I thought you meant that the new API would require us to decode the part of the dataset that we throw away, which would obviously be a problem. Sorry for the confusion.
Our current classification references use the "Repeated Augment Sampler" (RASampler) from #5051: vision/references/classification/sampler.py, line 7 in d585f86.

Since after the revamp we will have iterable- rather than map-style datasets, samplers are no longer supported. Given that the RASampler increases accuracy, we need to support the same functionality going forward. It can probably be achieved by appending a custom RepeatedAugmentIterDataPipe to the dataset graph, but we need to make sure it works correctly with the shuffling and sharding.

cc @pmeier @YosuaMichael @datumbox @vfdev-5 @bjuncek