Description
Related to IN:04155266 on Legion, but this is a general issue, as our most recent OpenMPI installs are 3.1.4 and 3.1.5 (beta module). OpenMPI 3.x releases later than 3.1.1 have a bug where vader_segment.x shared memory files are left behind in /dev/shm (only/mostly on an aborted run?).
If they exist, a new run on those nodes will fail with this:
node-o08a-029: Unable to allocate shared memory for intra-node messaging.
node-o08a-029: Delete stale shared memory files in /dev/shm.
Note that /dev/shm is not full in this case.
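To check whether a node is affected, something like the following could list leftover segment files. This is only a sketch: the `vader_segment.*` glob is an assumption based on the file name mentioned above, the function name `find_vader_segments` is made up for illustration, and the script cannot tell a stale file from one belonging to a job that is still running, so it only lists candidates and deletes nothing.

```python
import os
import stat
from pathlib import Path


def find_vader_segments(shm_dir="/dev/shm", pattern="vader_segment.*", uid=None):
    """List regular files in shm_dir matching pattern and owned by uid.

    The glob pattern is an assumption about how the segment files are named.
    uid defaults to the current user.
    """
    if uid is None:
        uid = os.getuid()
    matches = []
    for path in sorted(Path(shm_dir).glob(pattern)):
        try:
            info = path.stat()
        except FileNotFoundError:
            continue  # file vanished between listing and stat
        if stat.S_ISREG(info.st_mode) and info.st_uid == uid:
            matches.append(path)
    return matches


if __name__ == "__main__":
    # Print candidates only; check that none of your MPI jobs are
    # running on this node before removing anything by hand.
    for path in find_vader_segments():
        print(path)
```

Run on the failing node as the job owner, any paths it prints are candidates for manual removal once no job of yours is active there.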
OpenMPI 4.0.2 and later fix a number of vader issues and use PMIx 3 rather than PMIx 2; PMIx 3 has better hooks for doing job shutdown cleanup.
Note: 4.0.x deprecates the openib BTL in favour of UCX.
https://www.open-mpi.org/software/ompi/major-changes.php
https://www.open-mpi.org/faq/?category=openfabrics#run-ucx
https://www.open-mpi.org/faq/?category=building#build-p2p
The FAQ also suggests building with --without-verbs when using UCX.
See open-mpi/ompi#6322 and open-mpi/ompi#7220 for the bug reports.