Poor performance when constructing edge_node, face_edge and face_face connectivity #1196

@philipc2

Description

I've observed very poor performance when constructing `Grid.edge_node_connectivity` and `Grid.face_edge_connectivity`.

The timings below were taken on a single NCAR Derecho CPU node.

  • AMD EPYC™ 7763 Milan processors
  • Dual-socket nodes, 64 cores per socket
  • 256 GB DDR4 memory per node

| Resolution | Nodes | Faces | Edges | Grid Load Time (s) | Connectivity Construction Time (s) | Total Time (s) |
|---|---|---|---|---|---|---|
| 30km | 1,310,720 | 655,362 | 1,966,080 | 2.023 | 9.37 | 11.393 |
| 15km | 5,242,880 | 2,621,442 | 7,864,320 | 7.673 | 39.987 | 47.66 |
| 7.5km | 20,971,520 | 10,485,762 | 31,457,280 | 28.716 | 99.309 | 128.025 |
| 3.75km | 83,886,080 | 41,943,042 | 125,829,120 | 113.943 | 406.8 | 520.743 |

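The timing for the 15km grid seems inconsistent with the others. Each halving of the grid spacing quadruples the number of nodes, faces, and edges, so the construction time should scale by about 4x between resolutions. The 30km to 15km step (about 4.3x) and the 7.5km to 3.75km step (about 4.1x) follow this trend, but the 15km to 7.5km step is only about 2.5x.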
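The step-to-step ratios can be checked directly from the table. The snippet below is only a quick arithmetic sketch using the edge counts and construction times reported above; the helper name `step_ratios` is made up for illustration.

```python
# Edge counts and connectivity construction times copied from the table above.
resolutions = ["30km", "15km", "7.5km", "3.75km"]
n_edges = [1_966_080, 7_864_320, 31_457_280, 125_829_120]
construction_s = [9.37, 39.987, 99.309, 406.8]


def step_ratios(values):
    """Ratio of each entry to the previous one (hypothetical helper)."""
    return [fine / coarse for coarse, fine in zip(values, values[1:])]


edge_ratios = step_ratios(n_edges)
time_ratios = step_ratios(construction_s)

# Print one line per refinement step so the outlier is easy to spot.
steps = zip(resolutions, resolutions[1:])
for (coarse, fine), e, t in zip(steps, edge_ratios, time_ratios):
    print(f"{coarse} -> {fine}: edges x{e:.2f}, construction time x{t:.2f}")
```

The edge count grows by exactly 4x at every step, so any deviation in the time ratio comes from the timing itself rather than from the grid sizes.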

Currently, we have the following implementation.

def _build_edge_node_connectivity(face_nodes, n_face, n_max_face_nodes):
    """Constructs the UGRID connectivity variable (``edge_node_connectivity``)
    and stores it within the internal (``Grid._ds``) and through the attribute
    (``Grid.edge_node_connectivity``).

    Additionally, the attributes (``inverse_indices``) and
    (``fill_value_mask``) are stored for constructing other
    connectivity variables.

    Parameters
    ----------
    repopulate : bool, optional
        Flag used to indicate if we want to overwrite the existed `edge_node_connectivity` and generate a new
        inverse_indices, default is False
    """

    padded_face_nodes = close_face_nodes(face_nodes, n_face, n_max_face_nodes)

    # array of empty edge nodes where each entry is a pair of indices
    edge_nodes = np.empty((n_face * n_max_face_nodes, 2), dtype=INT_DTYPE)

    # first index includes starting node up to non-padded value
    edge_nodes[:, 0] = padded_face_nodes[:, :-1].ravel()

    # second index includes second node up to padded value
    edge_nodes[:, 1] = padded_face_nodes[:, 1:].ravel()

    # sorted edge nodes
    edge_nodes.sort(axis=1)

    # unique edge nodes
    edge_nodes_unique, inverse_indices = np.unique(
        edge_nodes, return_inverse=True, axis=0
    )

    # find all edge nodes that contain a fill value
    fill_value_mask = np.logical_or(
        edge_nodes_unique[:, 0] == INT_FILL_VALUE,
        edge_nodes_unique[:, 1] == INT_FILL_VALUE,
    )

    # all edge nodes that do not contain a fill value
    non_fill_value_mask = np.logical_not(fill_value_mask)
    edge_nodes_unique = edge_nodes_unique[non_fill_value_mask]

    # Update inverse_indices accordingly
    indices_to_update = np.where(fill_value_mask)[0]

    remove_mask = np.isin(inverse_indices, indices_to_update)
    inverse_indices[remove_mask] = INT_FILL_VALUE

    # Compute the indices where inverse_indices exceeds the values in indices_to_update
    indexes = np.searchsorted(indices_to_update, inverse_indices, side="right")

    # subtract the corresponding indexes from `inverse_indices`
    for i in range(len(inverse_indices)):
        if inverse_indices[i] != INT_FILL_VALUE:
            inverse_indices[i] -= indexes[i]

    return edge_nodes_unique, inverse_indices, fill_value_mask
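One likely hot spot in the function above is the final Python-level `for` loop, which runs once per entry of `inverse_indices` (that is, `n_face * n_max_face_nodes` iterations). The sketch below shows how that loop could be replaced with a boolean-mask update. It has not been profiled on the grids in the table. The function names `renumber_inverse_indices_loop` and `renumber_inverse_indices_vectorized` are hypothetical, and the `INT_DTYPE` and `INT_FILL_VALUE` definitions are assumptions standing in for the library's own constants.

```python
import numpy as np

# Assumptions: stand-ins for the library's constants, used only in this sketch.
INT_DTYPE = np.int64
INT_FILL_VALUE = np.iinfo(INT_DTYPE).min


def renumber_inverse_indices_loop(inverse_indices, fill_value_mask):
    """Reference version: same logic as the tail of the function above."""
    inverse_indices = inverse_indices.copy()
    indices_to_update = np.where(fill_value_mask)[0]
    remove_mask = np.isin(inverse_indices, indices_to_update)
    inverse_indices[remove_mask] = INT_FILL_VALUE
    indexes = np.searchsorted(indices_to_update, inverse_indices, side="right")
    for i in range(len(inverse_indices)):
        if inverse_indices[i] != INT_FILL_VALUE:
            inverse_indices[i] -= indexes[i]
    return inverse_indices


def renumber_inverse_indices_vectorized(inverse_indices, fill_value_mask):
    """Same result, with the per-element loop replaced by a masked update."""
    inverse_indices = inverse_indices.copy()
    indices_to_update = np.where(fill_value_mask)[0]
    remove_mask = np.isin(inverse_indices, indices_to_update)
    inverse_indices[remove_mask] = INT_FILL_VALUE
    indexes = np.searchsorted(indices_to_update, inverse_indices, side="right")
    # Only shift entries that still point at a kept (non-fill) unique edge.
    valid = inverse_indices != INT_FILL_VALUE
    inverse_indices[valid] -= indexes[valid]
    return inverse_indices


# Tiny made-up input: five unique edges, of which edges 1 and 4 contain a fill value.
example_inverse = np.array([0, 3, 1, 4, 2, 3], dtype=INT_DTYPE)
example_mask = np.array([False, True, False, False, True])

loop_result = renumber_inverse_indices_loop(example_inverse, example_mask)
vectorized_result = renumber_inverse_indices_vectorized(example_inverse, example_mask)
assert np.array_equal(loop_result, vectorized_result)
```

The masked update does the same work as the loop in a single pass inside NumPy. Whether it accounts for most of the construction time would still need to be confirmed by profiling, since the `np.unique(..., axis=0)` call may also be expensive.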

Labels

scalability: Related to scalability & performance efforts

Status

🏗 In progress
