rcache/vma: do not release vma structures in vma_tree_delete #2719

Merged
merged 1 commit into from
Jan 13, 2017


hjelmn
Member

@hjelmn hjelmn commented Jan 12, 2017

This commit fixes a deadlock that can occur when libc holds a lock
while calling munmap. In that case we could end up calling free()
from vma_tree_delete, which would in turn try to obtain that lock
in libc. To avoid the issue, put any deleted vmas on a new list in the
vma module and release them on the next call to vma_tree_insert. This
should be safe because vma_tree_insert is not called from the memory hooks.

Backported from 79cabc9

Fixes #1654

Signed-off-by: Nathan Hjelm [email protected]
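
The deferred-release idea described above can be sketched in isolation. This is a minimal, single-threaded illustration and not the Open MPI code: `vma_item_t`, `vma_module_t`, `vma_retire`, `vma_gc_drain`, and `vma_insert` are hypothetical names invented for the example, and a real implementation would also need whatever locking the module already uses around its lists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a VMA record; not the Open MPI type. */
typedef struct vma_item {
    struct vma_item *next;
    unsigned long start;
    unsigned long end;
} vma_item_t;

/* Hypothetical module state: a list of records waiting to be freed. */
typedef struct {
    vma_item_t *gc_head;
    size_t gc_count;
} vma_module_t;

/* Delete path (may run inside a munmap hook): never calls free(),
 * because libc may already hold its allocator lock. It only relinks
 * the record onto the garbage list. */
static void vma_retire(vma_module_t *m, vma_item_t *item)
{
    item->next = m->gc_head;
    m->gc_head = item;
    m->gc_count++;
}

/* Frees everything on the garbage list and leaves the list empty and
 * reusable. Returns the number of records released. */
static size_t vma_gc_drain(vma_module_t *m)
{
    size_t released = 0;
    vma_item_t *item;

    while (NULL != (item = m->gc_head)) {
        m->gc_head = item->next;
        free(item);
        released++;
    }
    assert(released == m->gc_count);
    m->gc_count = 0;
    return released;
}

/* Insert path (assumed NOT to run inside a memory hook): a safe place
 * to call free(), so pending records are drained here first. */
static vma_item_t *vma_insert(vma_module_t *m,
                              unsigned long start, unsigned long end)
{
    vma_item_t *item;

    vma_gc_drain(m);

    item = malloc(sizeof(*item));
    if (NULL == item) {
        return NULL;
    }
    item->next = NULL;
    item->start = start;
    item->end = end;
    return item;
}
```

The trade-off in this sketch is that retired records stay allocated until the next insert, which is the price of keeping free() out of the hook path.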


while (NULL != (item = opal_list_remove_first (&rcache->vma_gc_list))) {
    OBJ_RELEASE(item);
}
Member

Any reason not to use OPAL_LIST_DESTRUCT here?

Or is it just to-MAY-to vs. to-MAH-to?

Member Author

We would then need to OBJ_CONSTRUCT the list again. So it's MAY to MAH.

@jsquyres
Member

@Di0gen @jladd-mlnx The reliability of the Mellanox jenkins has been quite low recently -- a lot of triggered builds that never start (like the current status of this PR):

Can something be fixed? Thanks!

@mike-dubman
Member

This seems to be a bug in Jenkins; we are checking offline.

22:44:12 [Set GitHub commit status (universal)] ERROR on repos [GHRepository@2e2bf78e[description=Open MPI main development repository,homepage=<null>,name=ompi,license=<null>,fork=false,size=104857,milestones={},language=C,commits={},source=<null>,parent=<null>,url=https://api.github.com/repos/open-mpi/ompi,id=24107001]] (sha:e4d8db0) with context:gh-ompi-master-pr

@jsquyres
Member

@hppritcha Good to go.

@hppritcha hppritcha merged commit 40482a6 into open-mpi:v2.0.x Jan 13, 2017