A leaked Executor does not need to track active tasks #111

james7132 · 2024-04-09T20:20:40Z

A 'static Executor that has been leaked never has its Drop impl called. This may not be uncommon for use cases where an executor is initialized at startup and never dropped until program termination. In such a use case, the active field in the executor's state isn't really all that useful, as it's only used in the Drop impl and Executor::is_empty.

I understand the intent is to keep the API footprint of the crate fairly low, but would it be feasible to add something like this to the public API of the crate?

#[repr(transparent)]
pub struct LeakedExecutor(State);

impl Executor<'static> {
  fn leak(self) -> &'static LeakedExecutor;
}

impl LeakedExecutor {
  fn spawn<'a: 'static, T: Send + 'a, F: Future<Output = T> +  'a>(&self, fut: F) -> Task<T>;
  async fn tick(&self);
  async fn run(&self);
}

LeakedExecutor should be able to omit all changes to active, and shouldn't require a lock when spawning or finishing tasks. Leaking is also not const, so the atomic check for the whether the state's initialized is also omitted from all calls to the type.

The text was updated successfully, but these errors were encountered:

notgull · 2024-04-10T00:48:20Z

I like the idea as an optimization, perhaps one that's only enabled with a specific feature flag. Although I'm a little skeptical of the API. It feels like it could be this instead and not lock us into repr(transparent):

struct LeakedExecutor(&'static State);

@smol-rs/admins Any other thoughts?

james7132 · 2024-04-10T05:40:59Z

Hacked together a quick implementation (just omits all of the active task tracking from the implementation), and there are notable improvements in both single threaded and multithreaded performance, with some cases being upwards of 9x faster (multi_thread/*::spawn_many_local) due to the elimination of the lock-convoy triggered by a large number of short-lived tasks.

single_thread/executor::spawn_one
                        time:   [1.6157 µs 1.6238 µs 1.6362 µs]
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe
single_thread/executor::spawn_batch
                        time:   [28.169 µs 29.650 µs 32.196 µs]
Found 19 outliers among 100 measurements (19.00%)
  10 (10.00%) low severe
  3 (3.00%) low mild
  3 (3.00%) high mild
  3 (3.00%) high severe
single_thread/executor::spawn_many_local
                        time:   [6.1952 ms 6.2230 ms 6.2578 ms]
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe
single_thread/executor::spawn_recursively
                        time:   [50.202 ms 50.479 ms 50.774 ms]
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe
single_thread/executor::yield_now
                        time:   [5.8795 ms 5.8883 ms 5.8977 ms]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

multi_thread/executor::spawn_one
                        time:   [1.2565 µs 1.2979 µs 1.3470 µs]
Found 8 outliers among 100 measurements (8.00%)
  7 (7.00%) high mild
  1 (1.00%) high severe
multi_thread/executor::spawn_batch
                        time:   [38.009 µs 43.693 µs 52.882 µs]
Found 22 outliers among 100 measurements (22.00%)
  21 (21.00%) high mild
  1 (1.00%) high severe
Benchmarking multi_thread/executor::spawn_many_local: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 386.6s, or reduce sample count to 10.
multi_thread/executor::spawn_many_local
                        time:   [27.492 ms 27.652 ms 27.814 ms]
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
Benchmarking multi_thread/executor::spawn_recursively: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 16.6s, or reduce sample count to 30.
multi_thread/executor::spawn_recursively
                        time:   [165.82 ms 166.04 ms 166.26 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
multi_thread/executor::yield_now
                        time:   [22.469 ms 22.649 ms 22.798 ms]
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) low severe
  3 (3.00%) low mild

single_thread/leaked_executor::spawn_one
                        time:   [1.4717 µs 1.4778 µs 1.4832 µs]
Found 9 outliers among 100 measurements (9.00%)
  3 (3.00%) low severe
  2 (2.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe
single_thread/leaked_executor::spawn_many_local
                        time:   [4.2622 ms 4.3065 ms 4.3489 ms]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) low mild
single_thread/leaked_executor::spawn_recursively
                        time:   [26.566 ms 26.899 ms 27.228 ms]
single_thread/leaked_executor::yield_now
                        time:   [5.7200 ms 5.7270 ms 5.7342 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

multi_thread/leaked_executor::spawn_one
                        time:   [1.3755 µs 1.4321 µs 1.4892 µs]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
multi_thread/leaked_executor::spawn_many_local
                        time:   [4.1838 ms 4.2394 ms 4.2989 ms]
Found 7 outliers among 100 measurements (7.00%)
  7 (7.00%) high mild
multi_thread/leaked_executor::spawn_recursively
                        time:   [43.074 ms 43.159 ms 43.241 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) low mild
multi_thread/leaked_executor::yield_now
                        time:   [23.210 ms 23.257 ms 23.302 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) low mild

Resolves #111. Creates a `StaticExecutor` type under a feature flag and allows constructing it from an `Executor` via `Executor::leak`. Unlike the executor it came from, it's a wrapper around a `State` and omits all changes to `active`. Note, unlike the API proposed in #111, this PR also includes a unsafe `StaticExecutor::spawn_scoped` for spawning non-'static tasks, where the caller is responsible for ensuring that the task doesn't outlive the borrowed state. This would be required for Bevy to migrate to this type, where we're currently using lifetime transmutation on `Executor` to enable `Thread::scope`-like APIs for working with borrowed state. `StaticExecutor` does not have an external lifetime parameter so this approach is infeasible without such an API. The performance gains while using the type are substantial: ``` single_thread/executor::spawn_one time: [1.6157 µs 1.6238 µs 1.6362 µs] Found 6 outliers among 100 measurements (6.00%) 3 (3.00%) high mild 3 (3.00%) high severe single_thread/executor::spawn_batch time: [28.169 µs 29.650 µs 32.196 µs] Found 19 outliers among 100 measurements (19.00%) 10 (10.00%) low severe 3 (3.00%) low mild 3 (3.00%) high mild 3 (3.00%) high severe single_thread/executor::spawn_many_local time: [6.1952 ms 6.2230 ms 6.2578 ms] Found 4 outliers among 100 measurements (4.00%) 1 (1.00%) high mild 3 (3.00%) high severe single_thread/executor::spawn_recursively time: [50.202 ms 50.479 ms 50.774 ms] Found 6 outliers among 100 measurements (6.00%) 5 (5.00%) high mild 1 (1.00%) high severe single_thread/executor::yield_now time: [5.8795 ms 5.8883 ms 5.8977 ms] Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high mild multi_thread/executor::spawn_one time: [1.2565 µs 1.2979 µs 1.3470 µs] Found 8 outliers among 100 measurements (8.00%) 7 (7.00%) high mild 1 (1.00%) high severe multi_thread/executor::spawn_batch time: [38.009 µs 43.693 µs 52.882 µs] Found 22 outliers among 100 measurements (22.00%) 21 (21.00%) high mild 1 (1.00%) high severe Benchmarking multi_thread/executor::spawn_many_local: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 386.6s, or reduce sample count to 10. multi_thread/executor::spawn_many_local time: [27.492 ms 27.652 ms 27.814 ms] Found 4 outliers among 100 measurements (4.00%) 1 (1.00%) low mild 3 (3.00%) high mild Benchmarking multi_thread/executor::spawn_recursively: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 16.6s, or reduce sample count to 30. multi_thread/executor::spawn_recursively time: [165.82 ms 166.04 ms 166.26 ms] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild multi_thread/executor::yield_now time: [22.469 ms 22.649 ms 22.798 ms] Found 8 outliers among 100 measurements (8.00%) 5 (5.00%) low severe 3 (3.00%) low mild single_thread/leaked_executor::spawn_one time: [1.4717 µs 1.4778 µs 1.4832 µs] Found 9 outliers among 100 measurements (9.00%) 3 (3.00%) low severe 2 (2.00%) low mild 3 (3.00%) high mild 1 (1.00%) high severe single_thread/leaked_executor::spawn_many_local time: [4.2622 ms 4.3065 ms 4.3489 ms] Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) low mild single_thread/leaked_executor::spawn_recursively time: [26.566 ms 26.899 ms 27.228 ms] single_thread/leaked_executor::yield_now time: [5.7200 ms 5.7270 ms 5.7342 ms] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild multi_thread/leaked_executor::spawn_one time: [1.3755 µs 1.4321 µs 1.4892 µs] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild multi_thread/leaked_executor::spawn_many_local time: [4.1838 ms 4.2394 ms 4.2989 ms] Found 7 outliers among 100 measurements (7.00%) 7 (7.00%) high mild multi_thread/leaked_executor::spawn_recursively time: [43.074 ms 43.159 ms 43.241 ms] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) low mild multi_thread/leaked_executor::yield_now time: [23.210 ms 23.257 ms 23.302 ms] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) low mild ```

james7132 mentioned this issue Apr 11, 2024

Static Executors #112

Merged

notgull closed this as completed in #112 May 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

A leaked Executor does not need to track active tasks #111

A leaked Executor does not need to track active tasks #111

james7132 commented Apr 9, 2024 •

edited

Loading

notgull commented Apr 10, 2024

james7132 commented Apr 10, 2024 •

edited

Loading

A leaked Executor does not need to track active tasks #111

A leaked Executor does not need to track active tasks #111

Comments

james7132 commented Apr 9, 2024 • edited Loading

notgull commented Apr 10, 2024

james7132 commented Apr 10, 2024 • edited Loading

james7132 commented Apr 9, 2024 •

edited

Loading

james7132 commented Apr 10, 2024 •

edited

Loading