Description
I am using a self-compiled Open MPI from the 5.0.x development branch on a standard desktop system running openSUSE Tumbleweed. My Git submodules are up to date, and I ran autogen.pl before compilation.
In commit 43b5d8c, ompi_osc_rdma_component_query was changed to always return OMPI_ERR_RMA_SHARED when shared memory functionality is queried. Previously, the function returned -1. This change, however, causes unnecessary failures of component selection in ompi_osc_base_select. The latter function fails when any of the available one-sided communication components returns OMPI_ERR_RMA_SHARED, even though other components would work perfectly fine.
To give an example, I tested compilation and execution of the following program:
#include <mpi.h>
#include <stdio.h>

int main (int argc, char* argv[])
{
    MPI_Win win;
    int *ptr, nproc, rank, size = sizeof(int), disp = 1;

    // The processes allocate a contiguous shared memory segment.
    // Each process controls a chunk the size of one integer.
    // Each process writes its rank into the shared memory.
    // The rank-0 process then prints the contents of the whole shared memory (= all rank IDs).

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Win_allocate_shared(size, disp, MPI_INFO_NULL, MPI_COMM_WORLD, &ptr, &win);

    *ptr = rank;

    MPI_Win_fence(0, win);

    if (rank == 0)
    {
        for (int i = 0; i < nproc; i++)
        {
            printf("%d ", ptr[i]);
        }
        printf("\n");
    }

    MPI_Win_free(&win);
    MPI_Finalize();

    return 0;
}
When I compile the program with the current 5.0.x version and attempt to run it, I get:
[yunipher:00000] *** An error occurred in MPI_Win_allocate_shared
[yunipher:00000] *** reported by process [3017211905,0]
[yunipher:00000] *** on communicator MPI_COMM_WORLD
[yunipher:00000] *** MPI_ERR_RMA_SHARED: Memory cannot be shared
[yunipher:00000] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[yunipher:00000] *** and MPI will try to terminate your MPI job as well)
This can be avoided either by using a pre-43b5d8c version of the code or by manually excluding the broken "rdma" osc component (only "sm" is then considered):
$ mpiexec -n 1 --mca osc ^rdma ./test.x
I believe that the code in ompi_osc_base_select is overreacting. It should not pass through the error status OMPI_ERR_RMA_SHARED from a single component unless all available components are unusable for shared memory.
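To make the proposal concrete, here is a minimal standalone sketch of the selection behavior I have in mind. It is not the actual Open MPI code: the type sketch_component_t, the function sketch_select, and the SKETCH_* error values are hypothetical stand-ins, and each component's query is reduced to a precomputed result (a non-negative priority or a negative error code). The point is only that a shared-memory error from one component is remembered rather than returned immediately, and is reported only if no component turns out to be usable.

```c
#include <stddef.h>

/* Hypothetical error codes for illustration; not Open MPI's real values. */
enum { SKETCH_ERR_NOT_FOUND = -1, SKETCH_ERR_RMA_SHARED = -2 };

/* Hypothetical, simplified view of an osc component: its name and the
 * result its query function produced (>= 0 is a priority, < 0 an error). */
typedef struct {
    const char *name;
    int query_result;
} sketch_component_t;

/* Returns the index of the usable component with the highest priority.
 * If no component is usable, returns SKETCH_ERR_RMA_SHARED when at least
 * one component reported that error, and SKETCH_ERR_NOT_FOUND otherwise. */
static int sketch_select(const sketch_component_t *components, size_t count)
{
    int best_index = -1;
    int best_priority = -1;
    int saw_shared_error = 0;

    for (size_t i = 0; i < count; i++) {
        int result = components[i].query_result;

        if (result < 0) {
            /* Remember the shared-memory error, but keep looking at
             * the remaining components instead of failing right away. */
            if (result == SKETCH_ERR_RMA_SHARED) {
                saw_shared_error = 1;
            }
            continue;
        }

        if (result > best_priority) {
            best_priority = result;
            best_index = (int) i;
        }
    }

    if (best_index >= 0) {
        return best_index;
    }
    return saw_shared_error ? SKETCH_ERR_RMA_SHARED : SKETCH_ERR_NOT_FOUND;
}
```

With this logic, a component list in which "rdma" reports the shared-memory error and "sm" reports some valid priority would select "sm", while a list containing only the failing "rdma" component would still surface the shared-memory error to the caller.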