README: update list of frameworks #1096
odls - OpenRTE daemon local launch subsystem
oob - Out of band messaging
plm - Process lifecycle management
ras - Resource allocation system
rmaps - Resource mapping system
rml - RTE message layer
routed - Routing table for the RML
rtc - Run-time control framework
schitzo - OpenRTE personality framework
This should be "schizo", with no "t".
Good catch; thanks -- will fix.
Test PASSed.
odls - OpenRTE daemon local launch subsystem
oob - Out of band messaging
plm - Process lifecycle management
ras - Resource allocation system
rmaps - Resource mapping system
rml - RTE message layer
routed - Routing table for the RML
rtc - Run-time control framework
schitzo - OpenRTE personality framework
sensor - Software and hardware health monitoring
snapc - Snapshot coordination
snapc was removed in v2.x
Force-pushed from ad8d3d9 to d27a073.
Test PASSed.
snapc - Snapshot coordination
sstore - Distributed scalable storage
rtc - Run-time control framework
schitzo - OpenRTE personality framework
schizo
Arrgh. Fixed.
Signed-off-by: Jeff Squyres <[email protected]>
Force-pushed from d27a073 to 01a5064.
Test FAILed.
@Di0gen Please investigate this Jenkins failure. This PR is a README change; there's no reason it should cause a SIGABRT in the CI test. Thanks.
00:48:23 + taskset -c 16,17 timeout -s SIGSEGV 10m /var/lib/jenkins/jobs/gh-ompi-release-pr/workspace-2/ompi_install1/bin/mpirun -np 2 -bind-to core -mca btl_openib_if_include mlx4_0:1 -x MXM_RDMA_PORTS=mlx4_0:1 -x UCX_NET_DEVICES=mlx4_0:1 -x UCX_TLS=rc,cm -mca pml yalla /var/lib/jenkins/jobs/gh-ompi-release-pr/workspace-2/ompi_install1/thread_tests/thread-tests-1.1/latency_th 8
00:48:23 opal_mutex_unlock: Operation not permitted
00:48:23 [jenkins01:17237] *** Process received signal ***
00:48:23 [jenkins01:17237] Signal: Aborted (6)
00:48:23 [jenkins01:17237] Signal code: (-6)
00:48:23 [jenkins01:17237] [ 0] /lib64/libpthread.so.0[0x3d6980f710]
00:48:23 [jenkins01:17237] [ 1] /lib64/libc.so.6(gsignal+0x35)[0x3d69032925]
00:48:23 [jenkins01:17237] [ 2] /lib64/libc.so.6(abort+0x175)[0x3d69034105]
00:48:23 [jenkins01:17237] [ 3] /var/lib/jenkins/jobs/gh-ompi-release-pr/workspace-2/ompi_install1/lib/openmpi/mca_oob_ud.so(+0x3efb)[0x7ffff4542efb]
00:48:23 [jenkins01:17237] [ 4] /var/lib/jenkins/jobs/gh-ompi-release-pr/workspace-2/ompi_install1/lib/openmpi/mca_oob_ud.so(+0x5f5b)[0x7ffff4544f5b]
00:48:23 [jenkins01:17237] [ 5] /var/lib/jenkins/jobs/gh-ompi-release-pr/workspace-2/ompi_install1/lib/libopen-rte.so.20(+0x78e54)[0x7ffff79dae54]
00:48:23 [jenkins01:17237] [ 6] /var/lib/jenkins/jobs/gh-ompi-release-pr/workspace-2/ompi_install1/lib/libopen-rte.so.20(orte_oob_base_set_addr+0x300)[0x7ffff79dab0d]
00:48:23 [jenkins01:17237] [ 7] /var/lib/jenkins/jobs/gh-ompi-release-pr/workspace-2/ompi_install1/lib/libopen-pal.so.20(opal_libevent2022_event_base_loop+0x53c)[0x7ffff76b6f2c]
00:48:23 [jenkins01:17237] [ 8] /var/lib/jenkins/jobs/gh-ompi-release-pr/workspace-2/ompi_install1/lib/libopen-pal.so.20(+0x398c7)[0x7ffff765a8c7]
00:48:23 [jenkins01:17237] [ 9] /lib64/libpthread.so.0[0x3d698079d1]
00:48:23 [jenkins01:17237] [10] /lib64/libc.so.6(clone+0x6d)[0x3d690e8b6d]
00:48:23 [jenkins01:17237] *** End of error message ***
00:48:23 --------------------------------------------------------------------------
00:48:23 mpirun noticed that process rank 0 with PID 0 on node jenkins01 exited on signal 6 (Aborted).
FWIW: that's a Mellanox code area - can someone from Mellanox please address it?
@miked-mellanox @jladd-mlnx @Di0gen Following up on the mysterious Mellanox Jenkins error (which looks like it might be a larger race condition) here: open-mpi/ompi#1586
Thanks, Jeff! Looks like the bug was squashed. Do we know in which commit
On Tue, Apr 26, 2016 at 12:51 PM, Jeff Squyres [email protected]
Signed-off-by: Jeff Squyres [email protected]
@hppritcha Please review
@hjelmn I took the liberty of adding "patcher" in the list, even though #1079 is not merged yet.