diff --git a/docs/source/datasets.rst b/docs/source/datasets.rst
index 8366a41180..06a9fdd3e0 100644
--- a/docs/source/datasets.rst
+++ b/docs/source/datasets.rst
@@ -42,7 +42,7 @@ torchtext.datasets
    - All workers (DDP workers *and* DataLoader workers) see a different part
      of the data. The datasets are already wrapped inside `ShardingFilter `_
-     and you may need to call ``dp.apply_sharing(num_shards, shard_id)`` in order to shard the
+     and you may need to call ``dp.apply_sharding(num_shards, shard_id)`` in order to shard the
      data across ranks (DDP workers) and DataLoader workers. One way to do this
      is to create ``worker_init_fn`` that calls ``apply_sharding`` with appropriate
      number of shards (DDP workers * DataLoader workers) and shard id (inferred through rank