What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)
ompi: 3.1.4
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Manual installation.
Please describe the system on which you are running
- Operating system/version: ScientificLinux 7.3
- Computer hardware: EPYC 7551 (currently just testing a single node)
- Network type: not important (single node)
Details of the problem
I would like to fully control MPI ranking and binding from the command line, optionally in combination with --host assignments.
In particular I want to run the OSU micro-benchmarks and do some measurements, but I see the same problem with simple codes. In the following I will use this small test program (for brevity):
program test
  use mpi
  implicit none
  character(len=MPI_MAX_PROCESSOR_NAME) :: name
  integer :: i, n, namel
  call MPI_Init(i)
  call MPI_Comm_Rank(MPI_COMM_WORLD, n, i)
  call MPI_Get_Processor_Name(name, namel, i)
  ! print rank and host name so the actual placement can be inspected
  print *, n, name(1:namel)
  call MPI_Finalize(i)
end program test
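The program is compiled into the ./run executable used in the script below, roughly as follows (a sketch; the mpifort wrapper and the source file name test.f90 are assumptions):
mpifort -o run test.f90
mpirun -np 2 ./run   # quick local sanity check of the test program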
Cluster script:
I am requesting 64 cores on a 2-socket EPYC7551 machine (totalling 64 cores).
#!/bin/bash
#BSUB -n 64
#BSUB -R "select[model == EPYC7551]"
#BSUB -R "rusage[mem=100MB]"
#BSUB -q epyc
#BSUB -W 1:00
# Read array of affinity settings
readarray affinity < affinity.$LSB_JOBID
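# Each line of affinity.$LSB_JOBID is assumed to hold a host name followed
# by a cpu id (the same format as an LSB_AFFINITY_HOSTFILE entry), e.g.
# (hypothetical):
#   n-62-27-29 8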
# See #6631 which I created some time ago (and is fixed in master)
unset LSB_AFFINITY_HOSTFILE
size=${#affinity[@]}
for i in $(seq 0 $((size-1)))
do
  hosti=$(echo ${affinity[$i]} | awk '{print $1}')
  cpuseti=$(echo ${affinity[$i]} | awk '{print $2}')
  for j in $(seq $((i+1)) $((size-1)))
  do
    hostj=$(echo ${affinity[$j]} | awk '{print $1}')
    cpusetj=$(echo ${affinity[$j]} | awk '{print $2}')

    # 1. Direct CPU-set
    mpirun --report-bindings \
      -np 2 --cpu-set $cpuseti,$cpusetj ./run \
      > direct.$hosti.$cpuseti-$hostj.$cpusetj

    # 2. Explicit sub (report-bindings first)
    mpirun --report-bindings \
      -np 1 --host $hosti --cpu-set $cpuseti ./run \
      : \
      -np 1 --host $hostj --cpu-set $cpusetj ./run \
      > sub-0.$hosti.$cpuseti-$hostj.$cpusetj

    # 3. Explicit sub (report-bindings 1)
    mpirun --report-bindings \
      -np 1 --host $hosti --cpu-set $cpuseti --report-bindings ./run \
      : \
      -np 1 --host $hostj --cpu-set $cpusetj ./run \
      > sub-1.$hosti.$cpuseti-$hostj.$cpusetj

    # 4. Explicit sub (report-bindings 2)
    mpirun --report-bindings \
      -np 1 --host $hosti --cpu-set $cpuseti ./run \
      : \
      -np 1 --host $hostj --cpu-set $cpusetj --report-bindings ./run \
      > sub-2.$hosti.$cpuseti-$hostj.$cpusetj

    # 5. Explicit sub (report-bindings 1 and 2)
    mpirun --report-bindings \
      -np 1 --host $hosti --cpu-set $cpuseti --report-bindings ./run \
      : \
      -np 1 --host $hostj --cpu-set $cpusetj --report-bindings ./run \
      > sub-1-2.$hosti.$cpuseti-$hostj.$cpusetj

    # 6. Explicit affinity setting via env-var
    {
      echo ${affinity[$i]}
      echo ${affinity[$j]}
    } > test.affinity
    cat test.affinity
    export LSB_AFFINITY_HOSTFILE=$(pwd)/test.affinity
    mpirun -np 2 --report-bindings $OSU_HOME/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_latency \
      > affinity.$hosti.$cpuseti-$hostj.$cpusetj
    unset LSB_AFFINITY_HOSTFILE
  done
done
Explaining the script
Although I am allocating entire nodes and not using all of the cores, I would still expect Open MPI to obey my requested bindings.
1. On a single host, only the --cpu-set option should be needed; a comma-separated list of cores should be enough. This gives me:
[n-62-27-29:01122] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././././././././././././././././././././././././././.][./././././././././././././././././././././././././././././././.]
[n-62-27-29:01122] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././././././././././././././././././././././././././././.][./././././././././././././././././././././././././././././././.]
regardless of the --cpu-set <args> given, i.e. the two ranks are always bound to cores 0 and 1.
2-6. All yield exactly the same output:
[n-62-27-29:04404] MCW rank 0 is not bound (or bound to all available processors)
[n-62-27-29:04404] MCW rank 1 is not bound (or bound to all available processors)
I have also tried adding --bind-to core, with the same output.
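For reference, the --bind-to core attempt corresponds to case 1 above with the extra option added, roughly as follows (a sketch, not the verbatim command line):
mpirun --report-bindings --bind-to core \
  -np 2 --cpu-set $cpuseti,$cpusetj ./run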
Possibly related issues: