User's PMIx call fails after MPI_Init + UCX/HCOLL #6982

Open
@angainor


@artpol84 FYI. As discussed with @rhc54 in an issue reported in the PMIx repo, something in the Open MPI runtime seems to break the PMIx infrastructure: user keys are not distributed if PMIx_Put + PMIx_Commit are called after MPI_Init. In that case PMIx_Get fails on the clients with error -46. If the user's code sets the custom key before MPI_Init, the code works as expected.
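For readers decoding the status: as far as I can tell, -46 corresponds to PMIX_ERR_NOT_FOUND in pmix_common.h, which would mean the key was not found for the requested rank. Below is a minimal stand-alone sketch of a lookup helper for log output. The `sketch_` names are hypothetical and not part of PMIx, and the numeric value is an assumption that should be checked against the header of the PMIx installation in use.

```c
#include <stdio.h>

/* ASSUMPTION: values as recalled from pmix_common.h; verify them against
 * the header shipped with your PMIx before relying on this table. */
enum {
    SKETCH_PMIX_SUCCESS       = 0,
    SKETCH_PMIX_ERR_NOT_FOUND = -46
};

/* Hypothetical helper (not a PMIx API): map a status code to a name. */
static const char *sketch_status_name(int rc)
{
    switch (rc) {
    case SKETCH_PMIX_SUCCESS:       return "PMIX_SUCCESS";
    case SKETCH_PMIX_ERR_NOT_FOUND: return "PMIX_ERR_NOT_FOUND";
    default:                        return "unknown (see pmix_common.h)";
    }
}
```

In code that already links against PMIx, PMIx_Error_string(rc) returns the library's own description of a status and is the better choice than a hand-written table.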

What's puzzling is that I only observe this problem when the UCX PML and HCOLL are enabled. I compiled the code attached at the end of this post against OMPI master with its internal PMIx, but I see the same behavior with OMPI 4.0.1 + PMIx 2.1.4:

$ mpirun -map-by node ./pmixtest
PMIx initialized
PMIx_Put on test-key
PMIx_Put on test-key
Tue Sep 17 14:47:15 2019 ERROR: pmixtest.c:55  Client ns 39583745 rank 1: PMIx_Get test-key: -46
Tue Sep 17 14:47:15 2019 ERROR: pmixtest.c:55  Client ns 39583745 rank 0: PMIx_Get test-key: -46

If I turn off UCX and HCOLL, things work as expected:

$ mpirun -mca pml ^ucx -mca coll_hcoll_enable 0 -map-by node ./pmixtest
PMIx initialized
PMIx_Put on test-key
PMIx_Put on test-key
PMIx_get test-key returned 256 bytes
0: obtained data "rank 1"
PMIx_get test-key returned 256 bytes
1: obtained data "rank 0"
PMIx finalized

Here is the reproducing code. To compile it, pass the include and link paths of the PMIx installation used by Open MPI. I'd appreciate any insight.

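#include <stdio.h>
#include <stdlib.h>   /* exit */
#include <string.h>   /* strlen, strncpy */
#include <stdint.h>   /* uint32_t */
#include <time.h>     /* time, ctime */
#include <pmix.h>
#include <mpi.h>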

static pmix_proc_t allproc = {};
static pmix_proc_t myproc = {};

/* Print a timestamped error message to stderr and exit. */
#define ERR(msg, ...)                                                   \
    do {                                                                \
        time_t tm = time(NULL);                                         \
        char *stm = ctime(&tm);                                         \
        stm[strlen(stm)-1] = 0; /* strip the newline added by ctime */  \
        fprintf(stderr, "%s ERROR: %s:%d  " msg "\n", stm, __FILE__, __LINE__, ## __VA_ARGS__); \
        exit(1);                                                        \
    } while(0)


int pmi_set_string(const char *key, void *data, size_t size)
{
    int rc;
    pmix_value_t value;

    PMIX_VALUE_CONSTRUCT(&value);
    value.type = PMIX_BYTE_OBJECT;
    value.data.bo.bytes = data;
    value.data.bo.size  = size;
    if (PMIX_SUCCESS != (rc = PMIx_Put(PMIX_GLOBAL, key, &value))) {
        ERR("Client ns %s rank %d: PMIx_Put failed: %d", myproc.nspace, myproc.rank, rc);
    }

    if (PMIX_SUCCESS != (rc = PMIx_Commit())) {
        ERR("Client ns %s rank %d: PMIx_Commit failed: %d", myproc.nspace, myproc.rank, rc);
    }

    /* detach the caller's buffer so PMIX_VALUE_DESTRUCT does not free it */
    value.data.bo.bytes = NULL;
    value.data.bo.size  = 0;
    PMIX_VALUE_DESTRUCT(&value);
    printf("PMIx_Put on %s\n", key);

    return 0;
}

int pmi_get_string(uint32_t peer_rank, const char *key, void **data_out, size_t *data_size_out)
{
    int rc;
    pmix_proc_t proc;
    pmix_value_t *pvalue;

    PMIX_PROC_CONSTRUCT(&proc);
    (void)strncpy(proc.nspace, myproc.nspace, PMIX_MAX_NSLEN);
    proc.rank = peer_rank;
    if (PMIX_SUCCESS != (rc = PMIx_Get(&proc, key, NULL, 0, &pvalue))) {
        ERR("Client ns %s rank %d: PMIx_Get %s: %d", myproc.nspace, myproc.rank, key, rc);
    }
    if (pvalue->type != PMIX_BYTE_OBJECT) {
        ERR("Client ns %s rank %d: PMIx_Get %s: got wrong data type", myproc.nspace, myproc.rank, key);
    }
    *data_out = pvalue->data.bo.bytes;
    *data_size_out = pvalue->data.bo.size;

    /* hand the bytes over to the caller so PMIX_VALUE_RELEASE does not free them */
    pvalue->data.bo.bytes = NULL;
    pvalue->data.bo.size = 0;
    PMIX_VALUE_RELEASE(pvalue);
    PMIX_PROC_DESTRUCT(&proc);

    printf("PMIx_get %s returned %zu bytes\n", key, *data_size_out);

    return 0;
}

int main(int argc, char *argv[])
{
    char data[256] = {};
    char *data_out;
    size_t size_out;
    int rc;
    pmix_value_t *pvalue;

    // if MPI_Init is executed before PMIx_Put/PMIx_Commit, PMIx_Get fails with -46
    MPI_Init(&argc, &argv);

    if (PMIX_SUCCESS != (rc = PMIx_Init(&myproc, NULL, 0))) {
        ERR("PMIx_Init failed: %d", rc);
    }
    if(myproc.rank == 0) printf("PMIx initialized\n");

    snprintf(data, sizeof(data), "rank %d", myproc.rank);
    pmi_set_string("test-key", data, sizeof(data));

    // if MPI_Init is executed after PMIx_Put/PMIx_Commit, PMIx_Get works fine
    // MPI_Init(&argc, &argv);

    // fetch the key published by rank 1 or rank 0 (requires at least 2 processes)
    pmi_get_string((myproc.rank+1)%2, "test-key", (void**)&data_out, &size_out);
    printf("%d: obtained data \"%s\"\n", myproc.rank, data_out);

    if (PMIX_SUCCESS != (rc = PMIx_Finalize(NULL, 0))) {
        ERR("Client ns %s rank %d: PMIx_Finalize failed: %d", myproc.nspace, myproc.rank, rc);
    }
    if(myproc.rank == 0) printf("PMIx finalized\n");

    MPI_Finalize();
    return 0;
}
