| 1 | +# Improve interoperability with task groups |
| 2 | + |
| 3 | +## Motivation |
| 4 | + |
| 5 | +As described in the [overarching proposal](readme.md#motivation), combined use of a `task_arena` |
| 6 | +and a `task_group` is helpful when there is a need to execute some tasks asynchronously, |
| 7 | +but is non-trivial to do properly. Here we propose specific APIs to make it easier. |
| 8 | + |
| 9 | +## Proposed API |
| 10 | + |
| 11 | +We suggest new overloads for `enqueue` which additionally take `task_group` as an argument, |
| 12 | +and the new `task_arena::wait_for` method that also takes `task_group`. |
| 13 | + |
| 14 | +```cpp |
| 15 | +// Defined in header <oneapi/tbb/task_arena.h> |
| 16 | + |
| 17 | +namespace oneapi::tbb { |
| 18 | + class task_arena { |
| 19 | + public: |
| 20 | + ... // public types and members of class task_arena |
| 21 | + |
| 22 | + // Proposed new methods |
| 23 | + template<typename F> void enqueue(F&& f, task_group& tg); |
| 24 | + task_group_status wait_for(task_group& tg); |
| 25 | + }; |
| 26 | + |
| 27 | + namespace this_task_arena { |
| 28 | + template<typename F> void enqueue(F&& f, task_group& tg); |
| 29 | + } |
| 30 | +} // namespace oneapi::tbb |
| 31 | +``` |
| 32 | + |
| 33 | +## Design discussion |
| 34 | + |
| 35 | +### Enqueue a function as a part of a task group |
| 36 | + |
| 37 | +There are two existing methods to submit a task for asynchronous execution in a task arena: |
| 38 | +```cpp |
| 39 | +template<typename F> void task_arena::enqueue(F&& f); // (1) |
| 40 | +void task_arena::enqueue(task_handle&& h); // (2) |
| 41 | +``` |
| 42 | +The `this_task_arena` namespace also has two functions with the same signatures. |
| 43 | + |
| 44 | +The proposed new overload for `enqueue` is similar to (1) but also takes `task_group` as the second argument: |
| 45 | +```cpp |
| 46 | +template<typename F> void task_arena::enqueue(F&& f, task_group& tg); |
| 47 | +``` |
| 48 | +Semantically, it is equivalent to calling (2) with a task handle created by `tg.defer(std::forward<F>(f))`. |
| 49 | +Implementation-wise, it is just a header-based wrapper over (2); a more elaborate implementation |
| 50 | +does not appear necessary. An analogous function should also be added to the `this_task_arena` namespace. |
| 51 | + |
| 52 | +### Wait for completion of a task_group |
| 53 | + |
| 54 | +The proposed new method of `task_arena` takes a `task_group` argument and does not return until |
| 55 | +all tasks in that task group are complete or cancelled: |
| 56 | +```cpp |
| 57 | +task_group_status task_arena::wait_for(task_group& tg); |
| 58 | +``` |
| 59 | +Note that the scope of waiting includes not only the tasks submitted via the methods of `task_arena` |
| 60 | +but all tasks in the task group, independent of the way they were created and added as well as of |
| 61 | +task arenas they were submitted to. |
| 62 | + |
| 63 | +The returned value indicates the [completion status]( |
| 64 | +https://oneapi-spec.uxlfoundation.org/specifications/oneapi/v1.4-rev-1/elements/onetbb/source/task_scheduler/task_group/task_group_status_enum) |
| 65 | +of the task group. |
| 66 | + |
| 67 | +The method is semantically equivalent to `execute([&tg]{ return tg.wait(); })`, and can be implemented |
| 68 | +that way. However, a better implementation for the current code base should instead use the `wait_delegate` |
| 69 | +class (see `oneapi/tbb/task_group.h`) and directly call the `execute` library entry point with this delegate. |
| 70 | + |
| 71 | +There is no need to have a similar function in the `this_task_arena` namespace, as it would be |
| 72 | +no different from calling `tg.wait()`. |
| 73 | + |
| 74 | +### Should `execute` be extended as well? |
| 75 | + |
| 76 | +Another method, `task_arena::execute`, appears similar to `enqueue` in the sense that it also takes a callable |
| 77 | +and executes it in the arena. Should it also interoperate with a task group, and in which way? |
| 78 | + |
| 79 | +The purpose of `execute` is to make sure that the provided callable is executed in a certain task arena, |
| 80 | +so that any work created by the callable is shared within the arena. To achieve that, the calling thread |
| 81 | +attempts to join the arena. If it succeeds, it executes the callable and returns. Otherwise, which means |
| 82 | +that the arena is already full of other threads, the callable is wrapped into a task and delegated to those threads, |
| 83 | +and the calling thread blocks until the task is complete. |
| 84 | + |
| 85 | +A reasonable interoperability semantics could be that the callable, while executed in the given arena, |
| 86 | +also counts as a task in the given group. It would be roughly equivalent to the following code: |
| 87 | +```cpp |
| 88 | +// auto res = ta.execute(f, tg) could mean: |
| 89 | +{ |
| 90 | + auto th = tg.defer([]{}); // an empty "proxy" task for counting |
| 91 | + ta.execute(f); |
| 92 | +} // th is destroyed when the thread leaves the scope |
| 93 | +``` |
| 94 | + |
| 95 | +Note that `ta.execute([&]{ tg.run(f); })` is not suitable because it submits `f` into the arena |
| 96 | +but does not ensure its completion, and `ta.execute([&]{ tg.run_and_wait(f); })` does not work either |
| 97 | +because it waits for all tasks in the group, not only for `f`. |
| 98 | + |
| 99 | +Overall, it is not obvious whether adding a task group parameter to `execute` is a useful extension. |
| 100 | + |
| 101 | +### Thoughts on work isolation |
| 102 | + |
| 103 | +It makes sense to also consider work isolation for this API. While waiting for task group completion, |
| 104 | +the thread can take unrelated tasks for execution, and that can potentially result in a delayed return |
| 105 | +and in latency increase. To prevent that, tasks in the group should carry a unique tag that is |
| 106 | +also specified for the waiting call. The `isolated_task_group` preview class provides the desired |
| 107 | +functionality, but not the regular `task_group`. |
| 108 | + |
| 109 | +Note that extending `this_task_arena::isolate` with a task group argument would not help. `isolate` |
| 110 | +uses a unique isolation scope for a given callable; its purpose is to isolate the work, which the callable |
| 111 | +produces and then waits for, from every other task, and specifically from stealing "outermost" tasks which |
| 112 | +interfere with the callable. |
| 113 | + |
| 114 | +We can consider the following options for providing isolation in `task_arena::wait_for(task_group&)`: |
| 115 | +- keep the `isolated_task_group` class and support it in the proposed `task_arena` extensions; |
| 116 | +- somehow extend the `task_group` class to optionally support work isolation (might require incompatible changes); |
| 117 | +- add an isolation tag (automatically or on demand) only when a `task_group` is used with `task_arena`. |
| 118 | + |
| 119 | +## Open Questions |
| 120 | + |
| 121 | +- Is there any value in implementing this proposal first as experimental/preview API? |
| 122 | +- Should a new overload for `execute` be added, that takes a task group argument? |
| 123 | +- Whether and how work isolation should be supported needs to be decided. |