
Poor thread scaling in multi-group mode #2160

Open
@jtramm

Description

The multi-group mode for OpenMC currently does not scale very well when run using shared-memory threading (via OpenMP) past 10 threads or so. For example, with this 2D C5G7 input deck, when I run it on a dual-socket Intel Xeon Platinum 8180M node with 56 cores and 112 hardware threads, I get the following behavior:

| MPI Ranks | OpenMP Threads per Rank | Inactive [particles/sec] | Active [particles/sec] |
|---|---|---|---|
| 1 | 1 | 39,510 | 23,165 |
| 1 | 4 | 115,423 | 64,612 |
| 1 | 8 | 154,411 | 67,412 |
| 1 | 16 | 103,526 | 59,432 |
| 1 | 56 | 96,408 | 56,838 |
| 1 | 112 | 134,937 | 80,139 |
| 56 | 1 | 868,351 | 498,800 |
| 56 | 2 | 1,102,130 | 754,118 |
| 28 | 4 | 1,270,910 | 673,048 |
| 2 | 56 | 233,411 | 142,513 |

These results were generated with GCC 10.2.0 and OpenMPI 2.1.6. I also tried out some other compilers (namely Intel and LLVM) and found the results followed the same trend, so I don't think this can be chalked up to a poor compiler implementation, especially as continuous-energy MC scales well with OpenMP on this node.

One initial problem that comes to mind is that the tally space is fairly small for this problem (a 51 x 51 fission-rate mesh), which may lead to a lot of memory contention when scoring tallies. However, the poor performance seems to affect inactive batches (where no tallies are scored) just as much, so there must be a more fundamental problem.
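
For illustration, here is a minimal standalone sketch of the kind of contention I have in mind (this is not OpenMC source, just a made-up loop): many OpenMP threads scoring into a small shared tally array via atomic updates, so increments frequently collide on the same cache lines.

```cpp
// Standalone illustration only -- not OpenMC code. Many threads score into a
// small shared tally array with atomic updates, which frequently hit the
// same cache lines and serialize. Build with: g++ -O2 -fopenmp contention.cpp
#include <omp.h>
#include <cstdio>
#include <vector>

int main() {
  const int n_bins = 51 * 51;  // same order of size as the fission-rate mesh
  std::vector<double> tally(n_bins, 0.0);

  #pragma omp parallel for
  for (long i = 0; i < 100000000L; ++i) {
    int bin = static_cast<int>(i % n_bins);
    #pragma omp atomic
    tally[bin] += 1.0;  // contended atomic update on a tiny shared array
  }

  std::printf("tally[0] = %g\n", tally[0]);
  return 0;
}
```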

My guess is that there is a false-sharing issue specific to the multi-group MC mode. Hopefully it can be fixed easily once spotted. I may try to hunt this down at some point in the future, but I thought I would open the issue now in the interim.
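
For reference, the kind of layout problem I am guessing at looks roughly like the sketch below. This is hypothetical and not a diagnosis of any specific OpenMC data structure: per-thread accumulators packed next to each other end up on the same 64-byte cache line, so every update invalidates the other threads' copies; padding each slot to a full cache line removes that traffic.

```cpp
// Hypothetical layout sketch, not actual OpenMC code: adjacent per-thread
// doubles share a cache line (false sharing); padding each slot to 64 bytes
// gives every thread its own line.
// Build with: g++ -std=c++17 -O2 -fopenmp false_sharing.cpp
#include <omp.h>
#include <cstdio>
#include <vector>

struct PaddedCounter {
  alignas(64) double value = 0.0;  // one counter per 64-byte cache line
};

int main() {
  const int n_threads = omp_get_max_threads();
  std::vector<double> packed(n_threads, 0.0);    // falsely shared layout
  std::vector<PaddedCounter> padded(n_threads);  // padded layout

  #pragma omp parallel
  {
    const int t = omp_get_thread_num();
    for (long i = 0; i < 50000000L; ++i) {
      packed[t] += 1.0;        // neighboring threads invalidate this line
      padded[t].value += 1.0;  // no sharing: each slot owns its own line
    }
  }

  std::printf("%g %g\n", packed[0], padded[0].value);
  return 0;
}
```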
