cgroup awareness #1155

Closed
@kcgthb

Hi!

I understand that OpenBLAS tries to automatically detect the number of CPU cores on a machine at runtime, to determine how many threads to start when none of OPENBLAS_NUM_THREADS, GOTO_NUM_THREADS, or OMP_NUM_THREADS is set.

It works fine in most cases, but when the process runs in a cgroup context, for instance one where the cpuset subsystem is in use, it may result in less-than-optimal behavior.

For instance, on a 16-core machine, if a process runs inside a cgroup to which 4 CPUs have been allocated via cpuset, OpenBLAS will start 16 threads, which are confined to just 4 CPU cores and compete with each other. In the end, performance is about a quarter of what it would have been with just 4 threads.

So I'm wondering whether any thought has been given to this already, and how OpenBLAS could detect that it's running in a constrained context, so that it sizes its thread pool to the resources it can actually use.
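
To illustrate the kind of detection I have in mind, here is a minimal sketch in Python (not OpenBLAS code). It assumes Linux, where the CPU affinity mask of a process reflects the cpuset it runs in. `usable_cpu_count` and `default_thread_count` are hypothetical helper names, and the order in which the environment variables are checked is an assumption on my part. It only covers cpuset restrictions, not CPU bandwidth quotas.

```python
import os


def usable_cpu_count():
    """Return the number of CPUs this process is allowed to run on."""
    # os.sched_getaffinity exists on Linux only; its result honors
    # cpuset restrictions, unlike os.cpu_count().
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Fallback for other platforms: total CPU count (ignores cpusets).
    return os.cpu_count() or 1


def default_thread_count(env=os.environ):
    """Hypothetical helper: explicit settings win, else use the affinity mask."""
    # Assumed precedence; the real library may check these differently.
    for name in ("OPENBLAS_NUM_THREADS", "GOTO_NUM_THREADS", "OMP_NUM_THREADS"):
        value = env.get(name)
        if value and value.isdigit() and int(value) > 0:
            return int(value)
    return usable_cpu_count()


if __name__ == "__main__":
    print("CPUs on the machine:", os.cpu_count())
    print("CPUs usable by this process:", usable_cpu_count())
    print("threads to start:", default_thread_count())
```

In the 16-core example above, I would expect the first number to be 16 and the second to be 4 when run inside the cgroup. Until something like this exists in the library, exporting OPENBLAS_NUM_THREADS with the usable count before starting the program should work around the problem.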

Thanks!
