This repository was archived by the owner on Sep 30, 2022. It is now read-only.

README: update list of frameworks #1096

Merged


jsquyres
Member

Signed-off-by: Jeff Squyres [email protected]

@hppritcha Please review
@hjelmn I took the liberty of adding "patcher" in the list, even though #1079 is not merged yet.

@jsquyres jsquyres added this to the v2.0.0 milestone Apr 25, 2016
odls - OpenRTE daemon local launch subsystem
oob - Out of band messaging
plm - Process lifecycle management
ras - Resource allocation system
rmaps - Resource mapping system
rml - RTE message layer
routed - Routing table for the RML
rtc - Run-time control framework
schitzo - OpenRTE personality framework
Member

This should be "schizo", with no "t".

Member Author

Good catch; thanks -- will fix.

@mellanox-github

Test PASSed.
See http://bgate.mellanox.com/jenkins/job/gh-ompi-release-pr/1548/ for details.

odls - OpenRTE daemon local launch subsystem
oob - Out of band messaging
plm - Process lifecycle management
ras - Resource allocation system
rmaps - Resource mapping system
rml - RTE message layer
routed - Routing table for the RML
rtc - Run-time control framework
schitzo - OpenRTE personality framework
sensor - Software and hardware health monitoring
snapc - Snapshot coordination
Member

snapc was removed in v2.x

@jsquyres jsquyres force-pushed the pr/v2.0.0/README-frameworks-list-update branch 2 times, most recently from ad8d3d9 to d27a073 Compare April 25, 2016 17:15
@mellanox-github

Test PASSed.
See http://bgate.mellanox.com/jenkins/job/gh-ompi-release-pr/1551/ for details.

snapc - Snapshot coordination
sstore - Distributed scalable storage
rtc - Run-time control framework
schitzo - OpenRTE personality framework
Member

schizo

Member Author

Arrgh. Fixed.

@jsquyres jsquyres force-pushed the pr/v2.0.0/README-frameworks-list-update branch from d27a073 to 01a5064 Compare April 25, 2016 21:10
@mellanox-github

Test FAILed.
See http://bgate.mellanox.com/jenkins/job/gh-ompi-release-pr/1554/ for details.

@jsquyres
Member Author

@Di0gen Please investigate this Jenkins failure -- this PR is a README change; there's no reason it should cause SIGABRT in the CI test. Thanks.

@mike-dubman
Member

mike-dubman commented Apr 26, 2016

  • The ORTE oob framework and its ud component are failing during the multi-threaded test case.
  • The log also shows: opal_mutex_unlock: Operation not permitted
  • @Di0gen can help if it is Jenkins CI infrastructure related, which is not the case here.
00:48:23 + taskset -c 16,17 timeout -s SIGSEGV 10m /var/lib/jenkins/jobs/gh-ompi-release-pr/workspace-2/ompi_install1/bin/mpirun -np 2 -bind-to core -mca btl_openib_if_include mlx4_0:1 -x MXM_RDMA_PORTS=mlx4_0:1 -x UCX_NET_DEVICES=mlx4_0:1 -x UCX_TLS=rc,cm -mca pml yalla /var/lib/jenkins/jobs/gh-ompi-release-pr/workspace-2/ompi_install1/thread_tests/thread-tests-1.1/latency_th 8
00:48:23 opal_mutex_unlock: Operation not permitted
00:48:23 [jenkins01:17237] *** Process received signal ***
00:48:23 [jenkins01:17237] Signal: Aborted (6)
00:48:23 [jenkins01:17237] Signal code:  (-6)
00:48:23 [jenkins01:17237] [ 0] /lib64/libpthread.so.0[0x3d6980f710]
00:48:23 [jenkins01:17237] [ 1] /lib64/libc.so.6(gsignal+0x35)[0x3d69032925]
00:48:23 [jenkins01:17237] [ 2] /lib64/libc.so.6(abort+0x175)[0x3d69034105]
00:48:23 [jenkins01:17237] [ 3] /var/lib/jenkins/jobs/gh-ompi-release-pr/workspace-2/ompi_install1/lib/openmpi/mca_oob_ud.so(+0x3efb)[0x7ffff4542efb]
00:48:23 [jenkins01:17237] [ 4] /var/lib/jenkins/jobs/gh-ompi-release-pr/workspace-2/ompi_install1/lib/openmpi/mca_oob_ud.so(+0x5f5b)[0x7ffff4544f5b]
00:48:23 [jenkins01:17237] [ 5] /var/lib/jenkins/jobs/gh-ompi-release-pr/workspace-2/ompi_install1/lib/libopen-rte.so.20(+0x78e54)[0x7ffff79dae54]
00:48:23 [jenkins01:17237] [ 6] /var/lib/jenkins/jobs/gh-ompi-release-pr/workspace-2/ompi_install1/lib/libopen-rte.so.20(orte_oob_base_set_addr+0x300)[0x7ffff79dab0d]
00:48:23 [jenkins01:17237] [ 7] /var/lib/jenkins/jobs/gh-ompi-release-pr/workspace-2/ompi_install1/lib/libopen-pal.so.20(opal_libevent2022_event_base_loop+0x53c)[0x7ffff76b6f2c]
00:48:23 [jenkins01:17237] [ 8] /var/lib/jenkins/jobs/gh-ompi-release-pr/workspace-2/ompi_install1/lib/libopen-pal.so.20(+0x398c7)[0x7ffff765a8c7]
00:48:23 [jenkins01:17237] [ 9] /lib64/libpthread.so.0[0x3d698079d1]
00:48:23 [jenkins01:17237] [10] /lib64/libc.so.6(clone+0x6d)[0x3d690e8b6d]
00:48:23 [jenkins01:17237] *** End of error message ***
00:48:23 --------------------------------------------------------------------------
00:48:23 mpirun noticed that process rank 0 with PID 0 on node jenkins01 exited on signal 6 (Aborted).

@rhc54

rhc54 commented Apr 26, 2016

FWIW: that's a Mellanox code area - can someone from Mellanox please address it?

@jladd-mlnx

@jsquyres jsquyres merged commit 9304284 into open-mpi:v2.x Apr 26, 2016
@jsquyres jsquyres deleted the pr/v2.0.0/README-frameworks-list-update branch April 26, 2016 14:47
@jsquyres
Member Author

@miked-mellanox @jladd-mlnx @Di0gen Following up on the mysterious Mellanox Jenkins error (which looks like it might be a larger race condition) here: open-mpi/ompi#1586

@jladd-mlnx
Member

Thanks, Jeff! Looks like the bug was squashed. Do we know in which commit this bug was introduced?

@jsquyres
Member Author

7 participants