PIXIE3D ADIOS2 restart aborts due to unreasonable amount of memory required (follow up to issue #2042) #2992

@lchacon-LANL

Describe the bug
ADIOS2 restart (opening the BP file read-only) in PIXIE3D exhausts memory and aborts with the error:

std::bad_alloc
adios2_open_new_comm

Memory per MPI rank at fresh start is ~0.6 GB. Memory per MPI rank at restart (after ~4000 time steps on a 64x128x64 mesh) is > 4 GB, independent of the number of MPI ranks. With 32 MPI ranks, this overwhelms node memory (128 GB) and the code aborts.
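As a rough check of the numbers above, here is a minimal sketch of the node-level arithmetic. It assumes all ranks share a single 128 GB node and treats the quoted "> 4 GB" as a lower bound; the function name is illustrative and not part of PIXIE3D or ADIOS2.

```python
# Figures quoted in the report; the restart value is a lower bound (> 4 GB).
NODE_MEMORY_GB = 128.0
FRESH_PER_RANK_GB = 0.6
RESTART_PER_RANK_GB = 4.0


def node_footprint_gb(per_rank_gb, ranks):
    """Total memory on one node when per-rank usage does not depend on rank count."""
    return per_rank_gb * ranks


# Assumption: every rank runs on the same node.
for ranks in (8, 16, 32):
    fresh = node_footprint_gb(FRESH_PER_RANK_GB, ranks)
    restart = node_footprint_gb(RESTART_PER_RANK_GB, ranks)
    status = "reaches or exceeds node memory" if restart >= NODE_MEMORY_GB else "fits"
    print(f"{ranks:2d} ranks: fresh ~{fresh:.1f} GB, restart >= {restart:.1f} GB ({status})")
```

Because the per-rank restart footprint does not shrink as ranks are added, the node total grows linearly with the rank count and reaches the 128 GB limit at 32 ranks.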

Memory at fresh start with 16 ranks:
[Screenshot: fresh 16-core PIXIE3D run]

Memory at restart (from a BP file with ~4000 time slices) with 16 ranks:
[Screenshot: restarted 16-core PIXIE3D run]

Memory per MPI rank at restart seems independent of the number of MPI ranks. Compare with 8 ranks:
[Screenshot: restarted 8-core PIXIE3D run]

To Reproduce
I am at the latest master-branch commit, ff8326c. It is very hard to provide a "simple" example, as both PIXIE3D and a (very large) PIXIE3D BP restart file are required. The problem seems related to the number of time steps in the restart file, so providing a smaller restart file would not be very helpful. I am happy to provide access to both upon request.

Expected behavior
Memory at restart should be inversely proportional to the number of MPI ranks to avoid overwhelming node memory.
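To make the expected scaling concrete, here is a small sketch of the inverse-proportional model. It assumes the total restart footprint is fixed and divided evenly among ranks; the baseline of 4 GB per rank at 8 ranks is taken from the observations above purely for illustration, and the function name is hypothetical.

```python
def expected_per_rank_gb(baseline_per_rank_gb, baseline_ranks, ranks):
    """Per-rank memory if a fixed total footprint is split evenly across ranks."""
    total_gb = baseline_per_rank_gb * baseline_ranks
    return total_gb / ranks


# Illustrative baseline: ~4 GB per rank observed at 8 ranks (32 GB in total).
for ranks in (8, 16, 32):
    per_rank = expected_per_rank_gb(4.0, 8, ranks)
    print(f"{ranks:2d} ranks: expected ~{per_rank:.1f} GB per rank, {per_rank * ranks:.0f} GB in total")
```

Under this model the node total stays constant as ranks are added, in contrast to the roughly constant per-rank usage reported above.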

Desktop (please complete the following information):

  • OS/Platform: Linux ba170.localdomain 3.10.0-1160.45.1.1chaos.ch6.x86_64
  • Build: gcc 9.4.0, openmpi 3.1.6, shared libs

Additional context
The problem seems related to the number of time steps stored in the restart file: PIXIE3D has no issue restarting from BP files with the same underlying 3D mesh but fewer time steps.

Following up
Was the issue fixed? Please report back.
