Closed
Description
A number of the Jenkins systems have started to see errors that look like this:
[ompi:32674] listen_thread: accept() failed: Invalid argument (22).
LANL, IBM, and Mellanox have all seen this issue on occasion when running MPI applications as simple as hello_world. Below is the example from the IBM Jenkins:
timeout --preserve-status -k 22s 20s mpirun -np 2 -mca btl tcp,vader,sm,self hello_c
[p10a602:65201] listen_thread: accept() failed: Invalid argument (22).
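For context on the error code itself: errno 22 is EINVAL, and one documented way for accept() to return it is being called on a socket that is not listening. The sketch below only demonstrates that generic socket behaviour on Linux. It does not establish that this is what happens inside mpirun, and the helper name `accept_errno_without_listen` is made up for illustration.

```python
import errno
import socket


def accept_errno_without_listen():
    """Call accept() on a bound TCP socket that never called listen().

    Hypothetical helper for illustration only; it is not Open MPI or PMIx code.
    Returns the errno reported by the failed accept(), or None if it succeeded.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Bind to an ephemeral loopback port, but deliberately skip listen().
        sock.bind(("127.0.0.1", 0))
        try:
            sock.accept()
        except OSError as exc:
            return exc.errno
    return None


code = accept_errno_without_listen()
print(code, errno.errorcode.get(code))
```

If the listener socket in the failing runs were closed, shut down, or not yet listening when the listen thread called accept(), the same error code would be expected, which would be consistent with a startup race. That remains an assumption to verify.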
It has been suggested that this is a PMIx issue, with the failure coming from mpirun. It does not happen on every run, so it may be a timing issue during startup.
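Since the failure is intermittent, it may help to measure how often it occurs. The following is a minimal sketch, assuming the error text always contains `listen_thread: accept() failed`. The function name `count_failures` and the `MPIRUN_CMD` constant are hypothetical, and the command is copied from the IBM Jenkins example above.

```python
import subprocess

# Substring of the error line reported in this issue.
ERROR_MARKER = "listen_thread: accept() failed"

# Command from the IBM Jenkins example; adjust for the local install.
MPIRUN_CMD = [
    "timeout", "--preserve-status", "-k", "22s", "20s",
    "mpirun", "-np", "2", "-mca", "btl", "tcp,vader,sm,self", "hello_c",
]


def count_failures(cmd, runs, marker=ERROR_MARKER):
    """Run cmd `runs` times and count the runs whose output contains marker."""
    hits = 0
    for _ in range(runs):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # the error may be written to stderr
            text=True,
        )
        if marker in proc.stdout:
            hits += 1
    return hits
```

For example, `count_failures(MPIRUN_CMD, 100)` would report how many of 100 launches printed the error, giving a rough failure rate to compare across machines or builds.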