Commit c36a9c8

Update docs to mention DataSizeBased aggregation (#4639)
1 parent 7e015b3 commit c36a9c8

2 files changed

+9
-7
lines changed

docs/user_guide/source/advanced/aggregation.rst

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,8 @@ There are two implementations of aggregation in BP5, none of them is the same as
1818

1919
**EveryoneWrites** is the same strategy as the previous except that every process immediately writes its own data to its designated file. Since it basically implements an N-to-N write pattern, this method does not scale, so only use it up to a moderate number of processes (1-4 process * number of file system servers). At small scale, as long as the file system can deal with the on-rush of the write requests, this method can provide the fastest I/O.
2020

21+
**DataSizeBased** is also similar to *EveryoneWritesSerial*, except that before writing any timestep, writer ranks are first partitioned to balance the amount of data written to each subfile. The current greedy partitioning strategy is fast and "best effort", and likely won't produce subfiles of exactly equal size. Once writer chains are built from the partitioned ranks, writing proceeds exactly as in *EveryoneWritesSerial*. In this aggregator, use *NumSubFiles* to control the number of subfiles, as *NumAggregators* is ignored.
22+
2123
**TwoLevelShm** has a subset of processes that actually write to disk (*NumAggregators*). There must be at least one process per compute node, which creates a shared-memory segment for other processes on the node to send their data. The aggregator process basically serializes the writing of data from this subset of processes (itself and the processes that send data to it). TwoLevelShm performs similarly to EveryoneWritesSerial on Lustre, and is the only good option on Summit's GPFS.
2224

2325
The number of files (*NumSubFiles*) can be smaller than *NumAggregators*, and then multiple aggregators will write to one file concurrently. Such a setup becomes useful when the number of nodes is many times more than the number of file servers.
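
The *DataSizeBased* description above mentions a fast, "best effort" greedy partitioning of writer ranks. The sketch below is only an illustration of what such a greedy balancer could look like. It is not taken from the ADIOS2 source, and the function name ``partition_ranks`` and the largest-first heuristic are assumptions made for this example.

```python
import heapq


def partition_ranks(sizes, num_subfiles):
    """Greedily assign ranks to subfiles so per-subfile byte totals stay close.

    Hypothetical helper for illustration only; the actual ADIOS2
    partitioning strategy may differ.
    sizes[r] is the number of bytes rank r will write.
    Returns a list of rank lists, one per subfile.
    """
    if num_subfiles < 1:
        raise ValueError("num_subfiles must be >= 1")

    groups = [[] for _ in range(num_subfiles)]
    # Min-heap of (current load in bytes, subfile index).
    heap = [(0, i) for i in range(num_subfiles)]
    heapq.heapify(heap)

    # Visit ranks from largest to smallest payload (ties broken by rank id)
    # and always place the next rank in the currently lightest subfile.
    for rank in sorted(range(len(sizes)), key=lambda r: (-sizes[r], r)):
        load, idx = heapq.heappop(heap)
        groups[idx].append(rank)
        heapq.heappush(heap, (load + sizes[rank], idx))

    return groups


sizes = [8, 7, 6, 5, 4]
groups = partition_ranks(sizes, 2)
loads = [sum(sizes[r] for r in g) for g in groups]
print(groups, loads)
```

With these sizes the greedy pass ends with totals of 17 and 13 bytes, although a 15/15 split exists. That gap is the kind of imbalance the "best effort" wording allows for.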

docs/user_guide/source/engines/bp5.rst

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -62,14 +62,14 @@ This engine allows the user to fine tune the buffering operations through the fo
6262

6363
#. Aggregation
6464

65-
#. **AggregationType**: *TwoLevelShm*, *EveryoneWritesSerial* and
66-
*EveryoneWrites* are three data aggregation strategies. See :ref:`Aggregation in BP5`. The default is *TwoLevelShm*.
65+
#. **AggregationType**: *TwoLevelShm*, *EveryoneWritesSerial*, *DataSizeBased*, and
66+
*EveryoneWrites* are four data aggregation strategies. See :ref:`Aggregation in BP5`. The default is *TwoLevelShm*.
6767

68-
#. **NumAggregators**: The number of processes that will ever write data directly to storage. The default is set to the number of compute nodes the application is running on (i.e. one process per compute node). TwoLevelShm will select a fixed number of processes *per compute-node* to get close to the intention of the user but does not guarantee the exact number of aggregators.
68+
#. **NumAggregators**: The number of processes that will ever write data directly to storage. The default is set to the number of compute nodes the application is running on (i.e. one process per compute node). TwoLevelShm will select a fixed number of processes *per compute-node* to get close to the intention of the user but does not guarantee the exact number of aggregators. *DataSizeBased* will ignore this configuration setting and set the value to *NumSubFiles*.
6969

7070
#. **AggregatorRatio**: An alternative option to NumAggregators to pick every nth process as aggregator. The number of aggregators will be automatically kept to be within 1 and total number of processes no matter what bad number is supplied here. Moreover, TwoLevelShm will select a fixed number of processes *per compute-node* to get close to the intention of this ratio but does not guarantee the exact number of aggregators.
7171

72-
#. **NumSubFiles**: The number of data files to write to in the *.bp/* directory. Only used by *TwoLevelShm* aggregator, where the number of files can be smaller then the number of aggregators. The default is set to *NumAggregators*.
72+
#. **NumSubFiles**: The number of data files to write to in the *.bp/* directory. Used by *TwoLevelShm* and *DataSizeBased* aggregators. For *TwoLevelShm* the number of files can be smaller than the number of aggregators, while for *DataSizeBased*, the number of aggregators is ignored and set equal to this value. The default is set to *NumAggregators*.
7373

7474
#. **StripeSize**: The data blocks of different processes are aligned to this size (default is 4096 bytes) in the files. Its purpose is to avoid multiple processes to write to the same file system block and potentially slow down the write.
7575

@@ -160,10 +160,10 @@ This engine allows the user to fine tune the buffering operations through the fo
160160
=============================== ===================== ===========================================================
161161
OpenTimeoutSecs float **0** for *ReadRandomAccess* mode, **3600** for *Read* mode, ``10.0``, ``5``
162162
BeginStepPollingFrequencySecs float **1**, 10.0
163-
AggregationType string **TwoLevelShm**, EveryoneWritesSerial, EveryoneWrites
164-
NumAggregators integer >= 1 **0 (one file per compute node)**
163+
AggregationType string **TwoLevelShm**, EveryoneWritesSerial, DataSizeBased, EveryoneWrites
164+
NumAggregators integer >= 1 **0 (one file per compute node)**, ignored when *AggregationType=DataSizeBased*
165165
AggregatorRatio integer >= 1 not used unless set
166-
NumSubFiles integer >= 1 **=NumAggregators**, only used when *AggregationType=TwoLevelShm*
166+
NumSubFiles integer >= 1 **=NumAggregators**, used when *AggregationType=TwoLevelShm* or *AggregationType=DataSizeBased*
167167
StripeSize integer+units **4KB**
168168
MaxShmSize integer+units **4294762496**
169169
BufferVType string **chunk**, malloc
