
enhance sched_getaffinity function to avoid early crash when counting available cores on systems with more than 1024 cores #3701


Merged


Flamefire
Contributor

@Flamefire Flamefire commented May 26, 2021

Dynamically determine the size of the cpu_set_t struct, doubling it on each try until the sched_getaffinity call succeeds.

This is basically what the os module in Python 3.3+ does, see e.g. https://github.com/akheron/cpython/blob/f91d2f2e2e992c3006a5023eb4ba3cf0d082fde8/Modules/posixmodule.c#L5706

Without this fix, the following error is shown even for eb --help:

ERROR: sched_getaffinity failed for pid 675186 ec -1
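
For readers who want to see the doubling approach in isolation, here is a minimal sketch in Python using ctypes. It is not the code from this PR: the names count_available_cores, _libc_getaffinity, NCPUS_START and NCPUS_MAX are hypothetical, and the starting size of 1024 bits and the upper cap are assumptions chosen for illustration. It relies on the glibc wrapper for sched_getaffinity returning -1 with errno set to EINVAL when the supplied mask is too small for the kernel's CPU mask.

```python
import ctypes
import ctypes.util
import errno
import os

NCPUS_START = 1024    # assumed first guess for the mask size, in bits
NCPUS_MAX = 1 << 20   # assumed safety cap so the loop always terminates


def _libc_getaffinity(pid, size_bytes, buf):
    """Call libc's sched_getaffinity and return (return_code, errno)."""
    libc = ctypes.CDLL(ctypes.util.find_library('c'), use_errno=True)
    rc = libc.sched_getaffinity(pid, ctypes.c_size_t(size_bytes), buf)
    return rc, ctypes.get_errno()


def count_available_cores(pid=0, getaffinity=_libc_getaffinity):
    """Count the CPUs in the affinity mask of pid (0 means this process).

    The mask buffer is doubled after every EINVAL failure, so systems
    with more than NCPUS_START cores are handled as well.
    """
    ncpus = NCPUS_START
    while ncpus <= NCPUS_MAX:
        # Allocate a zero-initialised mask of ncpus bits as 64-bit words.
        buf = (ctypes.c_uint64 * (ncpus // 64))()
        rc, err = getaffinity(pid, ctypes.sizeof(buf), buf)
        if rc == 0:
            # Success: every set bit is one CPU the process may run on.
            return sum(bin(word).count('1') for word in buf)
        if err != errno.EINVAL:
            # Any other error (e.g. ESRCH) is not a size problem.
            raise OSError(err, os.strerror(err))
        ncpus *= 2  # mask too small for this kernel: retry with twice the size
    raise OSError(errno.EINVAL, "CPU mask larger than %d bits" % NCPUS_MAX)
```

On a Linux system, calling count_available_cores() with no arguments queries the current process. The getaffinity parameter exists only so the retry loop can be exercised without a machine that actually has more than 1024 cores.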

@boegel
Member

boegel commented May 26, 2021

@Flamefire Can you mention the error/crash you run into without this fix in the PR description?

@boegel boegel added this to the 4.4.0 milestone May 26, 2021
@Flamefire
Contributor Author

Added

@akesandgren
Contributor

Apart from my comment above, this code is sane and correct.

@boegel boegel changed the title Support systems with more than 1024 cores enhance sched_getaffinity function to avoid early crash when counting available cores on systems with more than 1024 cores May 26, 2021
Member

@boegel boegel left a comment


lgtm

Contributor

@akesandgren akesandgren left a comment


LGTM

@akesandgren
Contributor

Going in, thanks @Flamefire!

@akesandgren akesandgren merged commit 86c6764 into easybuilders:develop May 27, 2021
@Flamefire Flamefire deleted the increase_max_cpu_count_support branch May 27, 2021 06:43