
Commit c551256

documentation: add TF 2.4.1 support to sm distributed data parallel docs and other updates (#2179)
1 parent 2904cd3 commit c551256

6 files changed, +17 -13 lines


doc/api/training/sdp_versions/v1.0.0/smd_data_parallel_pytorch.rst

Lines changed: 1 addition & 1 deletion
```diff
@@ -155,7 +155,7 @@ PyTorch API
 
 **Supported versions:**
 
-- PyTorch 1.6
+- PyTorch 1.6.0
 
 
 .. function:: smdistributed.dataparallel.torch.distributed.is_available()
```
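For orientation (not part of this commit), here is a minimal sketch of how the v1.0.0 `smdistributed.dataparallel` PyTorch API referenced above is typically initialized. The model and tensor shapes are placeholders, and the script assumes it runs inside a SageMaker training job with the library installed:

```python
import torch
import smdistributed.dataparallel.torch.distributed as dist
from smdistributed.dataparallel.torch.parallel.distributed import (
    DistributedDataParallel as DDP,
)

dist.init_process_group()   # start the SDP backend across all GPUs/nodes
assert dist.is_available()  # the function documented in this file
assert dist.is_initialized()

torch.cuda.set_device(dist.get_local_rank())  # pin each process to its own GPU
model = DDP(torch.nn.Linear(10, 10).cuda())   # placeholder model wrapped for data parallelism
```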

doc/api/training/sdp_versions/v1.0.0/smd_data_parallel_tensorflow.rst

Lines changed: 2 additions & 2 deletions
```diff
@@ -414,7 +414,7 @@ TensorFlow API
 
 .. function:: smdistributed.dataparallel.tensorflow.DistributedOptimizer
 
-Applicable if you use the ``tf.estimator`` API in TensorFlow 2.x (2.3).
+Applicable if you use the ``tf.estimator`` API in TensorFlow 2.x (2.3.1).
 
 Construct a new ``DistributedOptimizer`` , which uses TensorFlow
 optimizer under the hood for computing single-process gradient values
@@ -489,7 +489,7 @@ TensorFlow API
 
 .. function:: smdistributed.dataparallel.tensorflow.BroadcastGlobalVariablesHook
 
-Applicable if you use the ``tf.estimator`` API in TensorFlow 2.x (2.3).
+Applicable if you use the ``tf.estimator`` API in TensorFlow 2.x (2.3.1).
 
 
 ``SessionRunHook`` that will broadcast all global variables from root
```
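For context (not part of the commit), a hedged sketch of how `DistributedOptimizer` and `BroadcastGlobalVariablesHook` fit together with the `tf.estimator` API; the optimizer choice, learning rate, and the estimator itself are placeholders:

```python
import tensorflow as tf
import smdistributed.dataparallel.tensorflow as sdp

sdp.init()  # initialize the SageMaker distributed data parallel library

# Wrap a stock TensorFlow optimizer so gradients are averaged across workers.
optimizer = sdp.DistributedOptimizer(
    tf.compat.v1.train.AdamOptimizer(learning_rate=0.001)
)

# SessionRunHook that broadcasts initial global variables from rank 0,
# so every worker starts from the same state.
hooks = [sdp.BroadcastGlobalVariablesHook(0)]

# The wrapped optimizer is used inside model_fn; the hook is passed at
# training time, e.g. (estimator and train_input_fn are placeholders):
# estimator.train(input_fn=train_input_fn, hooks=hooks)
```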

doc/api/training/smd_model_parallel_release_notes/smd_model_parallel_change_log.md

Lines changed: 9 additions & 5 deletions
```diff
@@ -8,14 +8,18 @@
 
 ### PyTorch
 
-#### Add support for PyTorch 1.7
+#### Add support for PyTorch 1.7.1
 
-- Adds support for `gradient_as_bucket_view` (PyTorch 1.7 only), `find_unused_parameters` (PyTorch 1.7 only) and `broadcast_buffers` options to `smp.DistributedModel`. These options behave the same as the corresponding options (with the same names) in
+- Adds support for `gradient_as_bucket_view` (PyTorch 1.7.1 only), `find_unused_parameters` (PyTorch 1.7.1 only) and `broadcast_buffers` options to `smp.DistributedModel`. These options behave the same as the corresponding options (with the same names) in
 `torch.DistributedDataParallel` API. Please refer to the [SageMaker distributed model parallel API documentation](https://sagemaker.readthedocs.io/en/stable/api/training/smd_model_parallel_pytorch.html#smp.DistributedModel) for more information.
 
-- Adds support for `join` (PyTorch 1.7 only) context manager, which is to be used in conjunction with an instance of `smp.DistributedModel` to be able to train with uneven inputs across participating processes.
+- Adds support for `join` (PyTorch 1.7.1 only) context manager, which is to be used in conjunction with an instance of `smp.DistributedModel` to be able to train with uneven inputs across participating processes.
 
-- Adds support for `_register_comm_hook` (PyTorch 1.7 only) which will register the callable as a communication hook for DDP. NOTE: Like in DDP, this is an experimental API and subject to change.
+- Adds support for `_register_comm_hook` (PyTorch 1.7.1 only) which will register the callable as a communication hook for DDP. NOTE: Like in DDP, this is an experimental API and subject to change.
+
+### Tensorflow
+
+- Adds support for Tensorflow 2.4.1
 
 ## Bug Fixes
 
@@ -32,7 +36,7 @@ regular dicts.
 
 ### PyTorch
 
-- A performance regression was observed when training on SMP with PyTorch 1.7.1 compared to 1.6. The rootcause was found to be the slowdown in performance of `.grad` method calls in PyTorch 1.7.1 compared to 1.6. Please see the related discussion: https://github.com/pytorch/pytorch/issues/50636.
+- A performance regression was observed when training on SMP with PyTorch 1.7.1 compared to 1.6.0. The rootcause was found to be the slowdown in performance of `.grad` method calls in PyTorch 1.7.1 compared to 1.6.0. Please see the related discussion: https://github.com/pytorch/pytorch/issues/50636.
 
 
 # Sagemaker Distributed Model Parallel 1.1.0 Release Notes
```
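To make the PyTorch changelog items above concrete (not part of the commit), a minimal sketch of passing the newly supported options to `smp.DistributedModel`; the model and option values are illustrative only, and per the notes `gradient_as_bucket_view` and `find_unused_parameters` require PyTorch 1.7.1:

```python
import torch
import smdistributed.modelparallel.torch as smp

smp.init()

model = smp.DistributedModel(
    torch.nn.Linear(10, 10),  # placeholder model
    # Forwarded to torch.nn.parallel.DistributedDataParallel; the release
    # notes say the first two require PyTorch 1.7.1 and DDP mode enabled.
    gradient_as_bucket_view=True,
    find_unused_parameters=True,
    broadcast_buffers=True,
)
```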

doc/api/training/smp_versions/v1.1.0/smd_model_parallel_tensorflow.rst

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,7 +1,7 @@
 TensorFlow API
 ==============
 
-**Supported version: 2.3**
+**Supported version: 2.3.1**
 
 **Important**: This API document assumes you use the following import statement in your training scripts.
 
```

doc/api/training/smp_versions/v1.2.0/smd_model_parallel_pytorch.rst

Lines changed: 3 additions & 3 deletions
```diff
@@ -6,7 +6,7 @@
 PyTorch API
 ===========
 
-**Supported versions: 1.7.1, 1.6**
+**Supported versions: 1.7.1, 1.6.0**
 
 This API document assumes you use the following import statements in your training scripts.
 
@@ -159,7 +159,7 @@ This API document assumes you use the following import statements in your training scripts.
 This parameter is forwarded to the underlying ``DistributedDataParallel`` wrapper.
 Please see: `broadcast_buffer <https://pytorch.org/docs/stable/generated/torch.nn.parallel.DistributedDataParallel.html#torch.nn.parallel.DistributedDataParallel>`__.
 
-- ``gradient_as_bucket_view (PyTorch 1.7 only)`` (default: False): To be
+- ``gradient_as_bucket_view (PyTorch 1.7.1 only)`` (default: False): To be
 used with ``ddp=True``. This parameter is forwarded to the underlying
 ``DistributedDataParallel`` wrapper. Please see `gradient_as_bucket_view <https://pytorch.org/docs/stable/generated/torch.nn.parallel.DistributedDataParallel.html#torch.nn.parallel.DistributedDataParallel>`__.
 
@@ -257,7 +257,7 @@ This API document assumes you use the following import statements in your training scripts.
 
 .. function:: join( )
 
-**Available for PyTorch 1.7 only**
+**Available for PyTorch 1.7.1 only**
 
 A context manager to be used in conjunction with an instance of
 ``smp.DistributedModel`` to be able to train with uneven inputs across
```
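A hedged sketch (not part of the commit) of the `join()` context manager described above, wrapped around a training loop so ranks that exhaust their batches earlier than others do not hang; the model, loss, and per-rank data are illustrative placeholders:

```python
import torch
import smdistributed.modelparallel.torch as smp

smp.init()
model = smp.DistributedModel(torch.nn.Linear(10, 10))  # placeholder model
optimizer = smp.DistributedOptimizer(torch.optim.SGD(model.parameters(), lr=0.1))

# Illustrative per-rank data; in practice each rank may see a different
# number of batches, which is exactly what join() is meant to tolerate.
dataloader = [torch.randn(4, 10) for _ in range(10 + smp.rank())]

@smp.step
def train_step(model, data):
    loss = model(data).sum()  # illustrative loss
    model.backward(loss)      # SMP's backward call inside an smp.step function
    return loss

with model.join():  # available for PyTorch 1.7.1 only
    for data in dataloader:
        optimizer.zero_grad()
        train_step(model, data)
        optimizer.step()
```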

doc/api/training/smp_versions/v1.2.0/smd_model_parallel_tensorflow.rst

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,7 +1,7 @@
 TensorFlow API
 ==============
 
-**Supported version: 2.3**
+**Supported version: 2.4.1, 2.3.1**
 
 **Important**: This API document assumes you use the following import statement in your training scripts.
 
```
