Skip to content

coro::task_container gc fix not completing coroutines#288

Merged
jbaldwin merged 2 commits intomainfrom
issue-287/task_container_gc_fix
Jan 30, 2025
Merged

coro::task_container gc fix not completing coroutines#288
jbaldwin merged 2 commits intomainfrom
issue-287/task_container_gc_fix

Conversation

@jbaldwin
Copy link
Owner

@jbaldwin jbaldwin commented Jan 29, 2025

The coro::task_container::gc_internal function was deleting coroutines when marked as .done(), however there is some mechanism that rarely would cause the user_task coroutine to not actually execute. I'm still not sure exactly why this is the case but:

  1. Simply disabling gc_internal() would stop the problem
  2. Running gc_internal() and moving the coro::task to a 'dead' list
    still caused the issue.

With these in mind I spent time re-reading the specification on the final_suspend and determined that coro::task_container should be a thing atomic counter to track the submitted coroutines and have them self delete. The self deletion is now done via a
coro::detail::task_self_destroying coroutine type that takes advantage of the promise's final_suspend() not suspending. The spec states that if this doesn't suspend then the coroutine will call destroy() on itself.

Closes #287

@jbaldwin jbaldwin self-assigned this Jan 29, 2025
@jbaldwin jbaldwin force-pushed the issue-287/task_container_gc_fix branch 3 times, most recently from cbce67c to 2e8e40f Compare January 29, 2025 19:30
@jbaldwin jbaldwin mentioned this pull request Jan 29, 2025
@jbaldwin
Copy link
Owner Author

We can safely ignore the false positive on codacy, the variable very much is used.

@jbaldwin jbaldwin force-pushed the issue-287/task_container_gc_fix branch from 2e8e40f to 49829ea Compare January 29, 2025 19:54
The coro::task_container::gc_internal function was deleting coroutines
when marked as .done(), however there is some mechanism that rarely
would cause the user_task coroutine to not actually execute. I'm still
not sure exactly why this is the case but:

1) Simply disabling gc_internal() would stop the problem
2) Running gc_internal() and moving the coro::task to a 'dead' list
   still caused the issue.

With these in mind I spent time re-reading the specification on the
final_suspend and determined that coro::task_container should be a thing
atomic counter to track the submitted coroutines and have them self
delete. The self deletion is now done via a
coro::detail::task_self_destroying coroutine type that takes advantage
of the promise's final_suspend() not suspending. The spec states that if
this doesn't suspend then the coroutine will call destroy() on itself.

Closes #287
@jbaldwin jbaldwin force-pushed the issue-287/task_container_gc_fix branch from 49829ea to 0ba7020 Compare January 29, 2025 19:56
@jbaldwin jbaldwin added the bug Something isn't working label Jan 29, 2025
Copy link

@breakds breakds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this approach the logic is actually a lot simpler. Thanks!

@jbaldwin
Copy link
Owner Author

I renamed task_self_destroying to task_self_deleting since its more inline with the delete keyword.

@jbaldwin jbaldwin merged commit 49333e7 into main Jan 30, 2025
37 checks passed
@jbaldwin jbaldwin deleted the issue-287/task_container_gc_fix branch January 30, 2025 15:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

schedule stress test

2 participants