Closed
Description
Using Open MPI 2.1.2 and the Intel MPI Benchmarks suite (https://software.intel.com/sites/default/files/managed/76/6c/IMB_2017_Update2.tgz) on x86 systems (multiple SUSE versions),
running IMB-MPI1 with the vader BTL segfaults during the Sendrecv benchmark:
mpirun -np 2 --mca btl vader,self /usr/lib/mpi/gcc/openmpi2/tests/IMB/IMB-MPI1
[snip...]
#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 2
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
0 1000 0.60 0.60 0.60 0.00
1 1000 0.59 0.59 0.59 3.38
2 1000 0.55 0.55 0.55 7.31
4 1000 0.62 0.62 0.62 12.81
8 1000 0.64 0.64 0.64 24.94
16 1000 0.76 0.76 0.76 41.94
32 1000 0.56 0.56 0.56 114.88
64 1000 0.64 0.64 0.64 200.60
128 1000 0.65 0.65 0.65 396.01
256 1000 1.10 1.10 1.10 463.57
512 1000 1.49 1.50 1.49 684.71
1024 1000 1.82 1.82 1.82 1122.39
2048 1000 2.07 2.07 2.07 1979.64
4096 1000 2.63 2.63 2.63 3113.39
8192 1000 2.74 2.74 2.74 5986.21
16384 1000 4.42 4.42 4.42 7410.65
[portia:25305] *** Process received signal ***
[portia:25305] Signal: Segmentation fault (11)
[portia:25305] Signal code: Address not mapped (1)
[portia:25305] Failing at address: 0x56dc0730
[portia:25305] [ 0] linux-gate.so.1(__kernel_rt_sigreturn+0x0)[0xf77bdf70]
[portia:25305] [ 1] /usr/lib/mpi/gcc/openmpi2/lib/openmpi/mca_btl_vader.so(mca_btl_vader_poll_handle_frag+0x133)[0xf5882e33]
[portia:25305] [ 2] /usr/lib/mpi/gcc/openmpi2/lib/openmpi/mca_btl_vader.so(+0x4251)[0xf5883251]
[portia:25305] [ 3] /usr/lib/mpi/gcc/openmpi2/lib/libopen-pal.so.20(opal_progress+0x70)[0xf7377720]
[portia:25305] [ 4] /usr/lib/mpi/gcc/openmpi2/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_send+0x925)[0xf54571b5]
[portia:25305] [ 5] /usr/lib/mpi/gcc/openmpi2/lib/libmpi.so.20(MPI_Sendrecv+0x299)[0xf77279e9]
[portia:25305] [ 6] /usr/lib/mpi/gcc/openmpi2/tests/IMB/IMB-MPI1(+0xbee9)[0x5663bee9]
[portia:25305] [ 7] /usr/lib/mpi/gcc/openmpi2/tests/IMB/IMB-MPI1(+0x65c8)[0x566365c8]
[portia:25305] [ 8] /usr/lib/mpi/gcc/openmpi2/tests/IMB/IMB-MPI1(+0x1f02)[0x56631f02]
[portia:25305] [ 9] /lib/libc.so.6(__libc_start_main+0xf3)[0xf7511743]
[portia:25305] [10] /usr/lib/mpi/gcc/openmpi2/tests/IMB/IMB-MPI1(+0x1971)[0x56631971]
[portia:25305] *** End of error message ***
The same run with the sm BTL instead of vader works fine:
mpirun -np 2 --mca btl sm,self /usr/lib/mpi/gcc/openmpi2/tests/IMB/IMB-MPI1
I tried to debug the SEGV with gdb, but without success so far.