docs/source/advanced.rst
10 additions & 10 deletions
@@ -14,19 +14,19 @@ We could also launch multiple functions (e.g. train on many GPUs, test on one GP
     func=train,
     hostnames=["node1", "node2"],
     workers_per_host=8
-).value(rank=0)
+).rank(0)

 accuracy = trx.launch(
     func=test,
-    func_kwargs={'model': model},
+    func_args=(trained_model,),
     hostnames=["localhost"],
     workers_per_host=1
-).value(rank=0)
+).rank(0)

 print(f'Accuracy: {accuracy}')


-:mod:`torchrunx.launch` is self-cleaning: all processes are terminated (and the used memory is completely released) after each invocation.
+:mod:`torchrunx.launch` is self-cleaning: all processes are terminated (and the used memory is completely released) before the subsequent invocation.

 Launcher class
 --------------
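
A note on the `func_args=(trained_model,)` line in the hunk above: the trailing comma is what makes the value a one-element tuple. The sketch below uses plain Python only. `evaluate` and `call_with` are hypothetical helpers that mimic how positional and keyword arguments are commonly forwarded to a function; they are not torchrunx code, and the library's own forwarding may differ.

```python
def evaluate(model, threshold=0.5):
    """Hypothetical stand-in for the `test` function in the diff."""
    return f"evaluating {model} at threshold {threshold}"


def call_with(func, func_args=(), func_kwargs=None):
    """Hypothetical helper: unpack positional and keyword arguments.

    This is an assumption about how such arguments are typically forwarded,
    not the library's actual implementation.
    """
    return func(*func_args, **(func_kwargs or {}))


# Parentheses alone do not create a tuple; the trailing comma does.
not_a_tuple = ("trained_model")
one_tuple = ("trained_model",)

# Both call styles reach the same parameter of `evaluate`.
positional = call_with(evaluate, func_args=one_tuple)
keyword = call_with(evaluate, func_kwargs={"model": "trained_model"})
```

Passing `("trained_model")` without the comma would hand over a plain string, which `*func_args` would then unpack character by character.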
@@ -85,9 +85,9 @@ Raises a ``RuntimeError`` if ``hostnames="slurm"`` or ``workers_per_host="slurm"
 Propagating exceptions
 ----------------------

-Exceptions that are raised in Workers will be raised by the launcher process.
+Exceptions that are raised in workers will be raised by the launcher process.

-A :mod:`torchrunx.AgentKilledError` will be raised if any agent dies unexpectedly (e.g. if force-killed by the OS, due to segmentation faults or OOM).
+A :mod:`torchrunx.AgentFailedError` or :mod:`torchrunx.WorkerFailedError` will be raised if any agent or worker dies unexpectedly (e.g. if sent a signal from the OS, due to segmentation faults or OOM).

 Environment variables
 ---------------------
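
The propagation behaviour described in the "Propagating exceptions" hunk (an exception raised in a worker resurfaces in the caller) can be illustrated with the standard library alone. This is only an analogy: it uses threads as stand-ins for workers, `failing_worker` and `run_workers` are hypothetical names, and nothing here reflects how torchrunx implements propagation across hosts.

```python
from concurrent.futures import ThreadPoolExecutor


def failing_worker(rank):
    """Hypothetical worker function: rank 1 fails, the others succeed."""
    if rank == 1:
        raise ValueError(f"worker {rank} failed")
    return rank


def run_workers(num_workers):
    """Run all workers and re-raise the first worker exception in the caller."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(failing_worker, r) for r in range(num_workers)]
        # Future.result() re-raises any exception raised inside the worker.
        return [f.result() for f in futures]


try:
    run_workers(2)
    outcome = "ok"
except ValueError as exc:
    # The caller sees the same exception type the worker raised.
    outcome = f"caught: {exc}"
```

With a real launcher, the same `try`/`except` shape would presumably wrap the launch call, with the error classes named in the diff handled alongside the function's own exceptions.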
@@ -100,14 +100,14 @@ Environment variables in the launcher process that match the ``default_env_vars`
 Custom logging
 --------------

-We forward all logs (i.e. from ``logging`` and ``stdio``) from workers and agents to the Launcher. By default, the logs from the first agent and its first worker are printed into the Launcher's ``stdout`` stream. Logs from all agents and workers are written to files in ``$TORCHRUNX_LOG_DIR`` (default: ``./torchrunx_logs``) and are named by timestamp, hostname, and local_rank.
+We forward all logs (i.e. from :mod:`logging` and :mod:`sys.stdin`/:mod:`sys.stdout`) from workers and agents to the launcher. By default, the logs from the first agent and its first worker are printed into the launcher's ``stdout`` stream. Logs from all agents and workers are written to files in ``$TORCHRUNX_LOG_DIR`` (default: ``./torchrunx_logs``) and are named by timestamp, hostname, and local_rank.

-``logging.Handler`` objects can be provided via the ``log_handlers`` argument to provide further customization (mapping specific agents/workers to custom output streams).
+:mod:`logging.Handler` objects can be provided via the ``log_handlers`` argument to provide further customization (mapping specific agents/workers to custom output streams).
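
As a rough illustration of "mapping specific agents/workers to custom output streams", the sketch below builds a standard `logging.Handler` that only accepts records from one worker. It relies on the standard library alone. The `hostname` and `local_rank` record attributes and the `RankFilter` class are assumptions made for this example; the attributes torchrunx actually attaches to forwarded records, and the exact form `log_handlers` expects, should be checked against the library.

```python
import io
import logging


class RankFilter(logging.Filter):
    """Pass only records tagged with a given hostname and local rank.

    The `hostname` and `local_rank` record attributes are assumptions made
    for this sketch; check the library for the attributes it actually sets.
    """

    def __init__(self, hostname, local_rank):
        super().__init__()
        self.hostname = hostname
        self.local_rank = local_rank

    def filter(self, record):
        return (
            getattr(record, "hostname", None) == self.hostname
            and getattr(record, "local_rank", None) == self.local_rank
        )


# Route records from node1, local rank 0, to an in-memory stream.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(hostname)s[%(local_rank)s]: %(message)s"))
handler.addFilter(RankFilter("node1", 0))

logger = logging.getLogger("rank_filter_demo")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger
logger.addHandler(handler)

# Simulate forwarded records by tagging them through `extra`.
logger.info("kept", extra={"hostname": "node1", "local_rank": 0})
logger.info("dropped", extra={"hostname": "node2", "local_rank": 3})

captured = stream.getvalue()
```

Swapping `io.StringIO()` for a `logging.FileHandler` target would give one file per selected worker using the same filter.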