@@ -6,6 +6,17 @@ User-Level Fault Mitigation (ULFM)
 This chapter documents the features and options specific to the **User
 Level Failure Mitigation (ULFM)** Open MPI implementation.
 
+TL;DR
+-----
+This is an extremely terse summary of how to use ULFM:
+
+.. code-block::
+
+   ./configure --with-ft=ulfm [...options...]
+   make [-j N] all install
+   mpicc my-ft-program.c -o my-ft-program
+   mpiexec -n 4 --with-ft ulfm my-ft-program
+
 Features
 --------
 
@@ -100,11 +111,11 @@ Available from: https://journals.sagepub.com/doi/10.1177/1094342013488238.
 Building ULFM support in Open MPI
 ---------------------------------
 
-In Open MPI |ompi_ver|, ULFM support is **enabled by default** |mdash|
-when you build Open MPI, unless you specify ``--without-ft``, ULFM
+In Open MPI |ompi_ver|, ULFM support is **built-in by default** |mdash|
+that is, when you build Open MPI, unless you specify ``--without-ft``, ULFM
 support will automatically be built.
 
-Optionally, you can specify ``--with-ft`` to ensure that ULFM support
+Optionally, you can specify ``--with-ft ulfm`` to ensure that ULFM support
 is definitely built.
 
 Support notes
@@ -215,7 +226,7 @@ Running your application
 
 You can launch your application with fault tolerance by simply using
 the normal Open MPI ``mpiexec`` launcher, with the
-``--with-ft ulfm`` CLI option:
+``--with-ft ulfm`` CLI option (or its synonym ``--with-ft mpi``):
 
 .. code-block::
 
@@ -234,6 +245,11 @@ you use ``mpiexec`` within an allocation (e.g., ``salloc``,
 Run-time tuning knobs
 ^^^^^^^^^^^^^^^^^^^^^
 
+The main control for enabling/disabling fault tolerance at runtime
+is the ``--with-ft ulfm`` (or its synonym ``--with-ft mpi``) ``mpiexec``
+CLI option. This option will set up multiple subsystems of Open MPI
+to enable fault tolerance.
+
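As a rough illustration of the launch control described above, the following Python sketch assembles an ``mpiexec`` command line with ``--with-ft ulfm`` and optional ``--mca`` overrides. The helper name ``build_mpiexec_argv`` and its parameters are hypothetical and not part of Open MPI; only the ``-n``, ``--with-ft`` and ``--mca`` options are taken from this chapter.

```python
from typing import Dict, List, Optional


def build_mpiexec_argv(program: str,
                       nprocs: int,
                       fault_tolerant: bool = True,
                       mca: Optional[Dict[str, str]] = None) -> List[str]:
    """Assemble an mpiexec argument list (hypothetical helper, not an Open MPI API)."""
    argv = ["mpiexec", "-n", str(nprocs)]
    if fault_tolerant:
        # The main runtime switch documented in this chapter.
        argv += ["--with-ft", "ulfm"]
    # Each override is passed as: --mca <name> <value>
    for name, value in (mca or {}).items():
        argv += ["--mca", name, str(value)]
    argv.append(program)
    return argv


if __name__ == "__main__":
    # Print the command line instead of running it, so no MPI install is needed.
    print(" ".join(build_mpiexec_argv("my-ft-program", 4,
                                      mca={"mpi_ft_verbose": "1"})))
```

The returned list could be handed to ``subprocess.run`` on a machine where Open MPI was built with ULFM support.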
 ULFM comes with a variety of knobs for controlling how it runs. The
 default parameters are sane and should result in good performance in
 most cases. You can change the default settings with ``--mca
@@ -243,9 +259,10 @@ errmgr_detector_bar <value>`` for PRTE options.
 PRTE level options
 ~~~~~~~~~~~~~~~~~~
 
-* ``prrte_enable_recovery <true|false> (default: false)`` controls
+* ``prte_enable_recovery <true|false> (default: false)`` controls
   automatic cleanup of apps with failed processes within
-  mpirun. Enabling this option also enables ``mpi_ft_enable``.
+  mpirun. This option is automatically set to ``true`` when using
+  ``--with-ft ulfm``.
 * ``errmgr_detector_priority <int> (default: 1005)`` selects the
   PRRTE-based failure detector. Only available when
   ``prte_enable_recovery`` is ``true``. You can set this to ``0`` when
@@ -263,17 +280,29 @@ PRTE level options
 Open MPI level options
 ~~~~~~~~~~~~~~~~~~~~~~
 
-* ``mpi_ft_enable <true|false> (default: same as
-  prrte_enable_recovery)`` permits turning on/off fault tolerance at
-  runtime. When false, failure detection is disabled; Interfaces
-  defined by the fault tolerance extensions are substituted with dummy
-  non-fault tolerant implementations (e.g., ``MPIX_Comm_agree`` is
-  implemented with ``MPI_Allreduce``); All other controls below become
-  irrelevant.
+Default values are applied to some Open MPI parameters when using
+``mpiexec --with-ft ulfm``. These defaults are obtained from the ``ft-mpi``
+aggregate MCA param file
+``$installdir/share/openmpi/amca-param-sets/ft-mpi``. You can tune the
+runtime behavior with ULFM by either setting or unsetting variables in
+this file, or by overriding the variable on the command line (e.g.,
+``--mca btl ofi,self``). Note that if fault tolerance is not enabled at
+runtime (that is, when not using ``--with-ft ulfm``), this param file is
+not loaded, which may change which components are selected (this in turn
+may change observed performance when comparing with and without fault
+tolerance).
+
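The precedence described above (defaults from the ``ft-mpi`` file, overridden on the command line, and no file at all without ``--with-ft ulfm``) can be sketched in Python as follows. This is a minimal model, assuming the param file uses ``name = value`` lines with ``#`` comments; the function names and the sample excerpt are hypothetical, and only the ``mpi_ft_enable`` entry is taken from this chapter.

```python
from typing import Dict


def parse_param_text(text: str) -> Dict[str, str]:
    """Parse 'name = value' lines, ignoring blanks and '#' comments (assumed format)."""
    params: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


def effective_params(file_params: Dict[str, str],
                     cli_params: Dict[str, str],
                     ft_enabled: bool) -> Dict[str, str]:
    """Model the precedence: file defaults apply only with fault tolerance on,
    and command-line values always win."""
    merged = dict(file_params) if ft_enabled else {}
    merged.update(cli_params)
    return merged


# Hypothetical excerpt; the real ft-mpi file may contain other entries.
SAMPLE = """
# defaults applied by --with-ft ulfm
mpi_ft_enable = true
"""

if __name__ == "__main__":
    defaults = parse_param_text(SAMPLE)
    print(effective_params(defaults, {"btl": "ofi,self"}, ft_enabled=True))
    print(effective_params(defaults, {"btl": "ofi,self"}, ft_enabled=False))
```

The second call shows why runs with and without fault tolerance may select different components: the file defaults simply never enter the picture.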
+* ``mpi_ft_enable <true|false> (default: false)``
+  permits turning on/off fault tolerance at runtime. This option is
+  automatically set to ``true`` from the aggregate MCA param file
+  ``ft-mpi`` loaded when using ``--with-ft ulfm``. When false, failure
+  detection is disabled; interfaces defined by the fault tolerance extensions
+  are substituted with dummy non-fault tolerant implementations (e.g.,
+  ``MPIX_Comm_agree`` is implemented with ``MPI_Allreduce``); all other
+  controls below become irrelevant.
 * ``mpi_ft_verbose <int> (default: 0)`` increases the output of the
   fault tolerance activities. A value of 1 will report detected
-  failures.
-* ``mpi_ft_detector <true|false> (default: false)``, **EXPERIMENTAL**
+  failures.
+* ``mpi_ft_detector <true|false> (default: false)``, **DEPRECATED**
   controls the activation of the Open MPI level failure detector. When
   this detector is turned off, all failure detection is delegated to
   PRTE (see above). The Open MPI level fault detector is
@@ -291,13 +320,16 @@ Open MPI level options
   latency (typically 1us increase). * You may want to **enable this
   option if you experience false positive** processes incorrectly
   reported as failed with the Open MPI failure detector.
+  This option is only relevant when ``mpi_ft_detector`` is ``true``.
 * ``mpi_ft_detector_period <float> (default: 3e0 seconds)`` heartbeat
   period. Recommended value is 1/3 of the timeout. _Values lower than
   100us may impart a noticeable effect on latency (typically a 3us
   increase)._
+  This option is only relevant when ``mpi_ft_detector`` is ``true``.
 * ``mpi_ft_detector_timeout <float> (default: 1e1 seconds)`` heartbeat
   timeout (i.e. failure detection speed). Recommended value is 3 times
   the heartbeat period.
+  This option is only relevant when ``mpi_ft_detector`` is ``true``.
 
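The relationship between the heartbeat period and timeout above can be checked with a small calculation. The Python sketch below derives the period as one third of a chosen timeout and flags periods under the 100us mark mentioned above; the function name and the returned dictionary layout are hypothetical, and the values only matter when ``mpi_ft_detector`` is ``true``.

```python
LOW_PERIOD_THRESHOLD_S = 100e-6  # periods below 100us may affect latency


def detector_settings_from_timeout(timeout_s: float) -> dict:
    """Suggest a heartbeat period of 1/3 of the timeout (illustrative helper)."""
    if timeout_s <= 0:
        raise ValueError("timeout must be positive")
    period_s = timeout_s / 3.0
    return {
        "mpi_ft_detector_timeout": timeout_s,
        "mpi_ft_detector_period": period_s,
        # True when the derived period is short enough to risk a latency hit.
        "latency_warning": period_s < LOW_PERIOD_THRESHOLD_S,
    }


if __name__ == "__main__":
    # The documented default timeout is 1e1 seconds.
    print(detector_settings_from_timeout(1e1))
```

With the default timeout of 1e1 seconds this gives a period of roughly 3.33 seconds, close to the documented default of 3e0 seconds.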
302
334
Known Limitations in ULFM
303
335
^^^^^^^^^^^^^^^^^^^^^^^^^