@sunkuamzn
Collaborator

No description provided.

shijin-aws and others added 8 commits March 28, 2023 21:03
Implement all the functions in efa_rdm_srx_owner_ops. Also refactor the code in rxr_pkt_proc_msgrtm()
and rxr_pkt_proc_tagrtm() so they can be reused by efa_rdm_srx_owner_ops.

Signed-off-by: Shi Jin <[email protected]>
Signed-off-by: Sai Sunku <[email protected]>
Signed-off-by: Shi Jin <[email protected]>
This patch fixes two issues in the efa provider that cause
trouble when sharing rx with a peer provider.

1. rxr_msg_recv (and trecv, recvv, trecvv) should pass rxr_rx_flags(ep)
as the flags to rxr_msg_generic_recv, which means FI_COMPLETION is
enabled in the flags as long as prov_info->rx_attr.flags supports
FI_COMPLETION. rxr_msg_recvmsg (and trecvmsg) should instead pass the
application flags ORed with util_ep.rx_msg_flags, which will NOT contain
FI_COMPLETION when the application binds the rx cq with
FI_SELECTIVE_COMPLETION and does not set FI_COMPLETION in the flags of
fi_recvmsg.

2. When calling ofi_need_completion in rxr_rx_entry_report_completion,
rxr_rx_flags(ep) should not be passed in as cq_flags. Instead, cq_flags
should be either 0 or FI_SELECTIVE_COMPLETION, which can be derived from
util_ep.rx_msg_flags.

Signed-off-by: Shi Jin <[email protected]>
Signed-off-by: Sai Sunku <[email protected]>
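
The two fixes in the commit message above can be illustrated with a minimal sketch. Everything prefixed with `demo_` or `DEMO_` is hypothetical: the bit values are placeholders rather than the real libfabric definitions, `demo_need_completion()` is a simplified stand-in for `ofi_need_completion`, and the derivation in `demo_cq_flags()` assumes that `rx_msg_flags` carries FI_COMPLETION only when the rx cq was bound without FI_SELECTIVE_COMPLETION.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for illustration only; the real definitions
 * live in the libfabric headers. */
#define DEMO_FI_COMPLETION           (1ULL << 0)
#define DEMO_FI_SELECTIVE_COMPLETION (1ULL << 1)

/* Simplified stand-in for ofi_need_completion(): a completion is written
 * unless the cq is bound selectively and the operation did not ask for one. */
static bool demo_need_completion(uint64_t cq_flags, uint64_t op_flags)
{
	return !(cq_flags & DEMO_FI_SELECTIVE_COMPLETION) ||
	       (op_flags & DEMO_FI_COMPLETION) != 0;
}

/* Issue 1, recv/trecv/recvv/trecvv path: the operation flags are the
 * endpoint's rx flags (what rxr_rx_flags(ep) would return). */
static uint64_t demo_recv_op_flags(uint64_t ep_rx_flags)
{
	return ep_rx_flags;
}

/* Issue 1, recvmsg/trecvmsg path: the application's flags ORed with
 * rx_msg_flags. */
static uint64_t demo_recvmsg_op_flags(uint64_t app_flags, uint64_t rx_msg_flags)
{
	return app_flags | rx_msg_flags;
}

/* Issue 2: cq_flags is either 0 or the selective-completion bit.
 * Assumption: rx_msg_flags has the completion bit set only for a
 * non-selective bind. */
static uint64_t demo_cq_flags(uint64_t rx_msg_flags)
{
	return (rx_msg_flags & DEMO_FI_COMPLETION) ?
	       0 : DEMO_FI_SELECTIVE_COMPLETION;
}
```

Under these assumptions, a recvmsg call with no FI_COMPLETION on a selectively bound cq yields no completion, while a plain recv whose endpoint rx flags include FI_COMPLETION still does.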
shijin-aws pushed a commit that referenced this pull request Sep 12, 2023
If a posted receive matches with a saved receive, we may need to
increment the rx counter.  Set the rx counter increment callback
to match that of the posted receive.  This fixes an assert in
xnet_cntr_inc() accessing a NULL cntr_inc function pointer.

Program received signal SIGABRT, Aborted.
0x0000155552d4d37f in raise () from /lib64/libc.so.6
#0  0x0000155552d4d37f in raise () from /lib64/libc.so.6
#1  0x0000155552d37db5 in abort () from /lib64/libc.so.6
#2  0x0000155552d37c89 in __assert_fail_base.cold.0 () from /lib64/libc.so.6
#3  0x0000155552d45a76 in __assert_fail () from /lib64/libc.so.6
#4  0x00001555522967f9 in xnet_cntr_inc (ep=0x6e4c70, xfer_entry=0x6f7a30) at prov/tcp/src/xnet_cq.c:347
#5  0x0000155552296836 in xnet_report_cntr_success (ep=0x6e4c70, cq=0x6ca930, xfer_entry=0x6f7a30) at prov/tcp/src/xnet_cq.c:354
#6  0x000015555229970d in xnet_complete_saved (saved_entry=0x6f7a30) at prov/tcp/src/xnet_progress.c:153
#7  0x0000155552299961 in xnet_recv_saved (saved_entry=0x6f7a30, rx_entry=0x6f7840) at prov/tcp/src/xnet_progress.c:188
#8  0x00001555522946f8 in xnet_srx_tag (srx=0x6dd1c0, recv_entry=0x6f7840) at prov/tcp/src/xnet_srx.c:445
#9  0x0000155552294bb1 in xnet_srx_trecv (ep_fid=0x6dd1c0, buf=0x6990c4, len=4, desc=0x0, src_addr=0, tag=21474836494, ignore=3458764513820540928, context=0x7ffffffeb180) at prov/tcp/src/xnet_srx.c:558
#10 0x000015555228f60e in fi_trecv (ep=0x6dd1c0, buf=0x6990c4, len=4, desc=0x0, src_addr=0, tag=21474836494, ignore=3458764513820540928, context=0x7ffffffeb180) at ./include/rdma/fi_tagged.h:91
#11 0x00001555522900a7 in xnet_rdm_trecv (ep_fid=0x6d9fe0, buf=0x6990c4, len=4, desc=0x0, src_addr=0, tag=21474836494, ignore=3458764513820540928, context=0x7ffffffeb180) at prov/tcp/src/xnet_rdm.c:212

Signed-off-by: Sean Hefty <[email protected]>
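
The failure mode described in the commit above (a saved receive completing through a NULL counter callback) can be sketched in isolation. The types and functions below are hypothetical and are not the xnet provider's actual structures; they only show why copying the callback from the posted receive to the saved entry avoids the assert.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counter and transfer-entry types for illustration. */
struct demo_cntr {
	unsigned long value;
};

typedef void (*demo_cntr_inc_fn)(struct demo_cntr *cntr);

struct demo_xfer_entry {
	struct demo_cntr *cntr;
	demo_cntr_inc_fn cntr_inc; /* stays NULL for a saved, unmatched receive */
};

static void demo_cntr_add_one(struct demo_cntr *cntr)
{
	cntr->value++;
}

/* Mirrors the shape of the failing path: the callback must be set
 * before the counter is reported. */
static void demo_report_cntr(struct demo_xfer_entry *entry)
{
	assert(entry->cntr_inc != NULL);
	if (entry->cntr)
		entry->cntr_inc(entry->cntr);
}

/* When a posted receive matches a saved one, carry over the counter and
 * its increment callback before completing the saved entry. */
static void demo_recv_saved(struct demo_xfer_entry *saved,
			    const struct demo_xfer_entry *posted)
{
	saved->cntr = posted->cntr;
	saved->cntr_inc = posted->cntr_inc;
	demo_report_cntr(saved);
}
```

Without the two assignments in `demo_recv_saved()`, the assert in `demo_report_cntr()` would fire for any saved entry that was created before a receive was posted.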
@github-actions

This pull request is stale because it has been open 360 days with no activity. Remove stale label or comment, otherwise it will be closed in 7 days.

@github-actions github-actions bot added the stale label Mar 25, 2024
@github-actions github-actions bot closed this Apr 1, 2024