Refactor scheduler and implement spinner thread for Partr. #56475

Open
wants to merge 4 commits into
base: master

@gbaraldi (Member) commented Nov 6, 2024

Also adds an option for child-first scheduling.
I'm splitting this from the work-stealing PR to make review easier. This part should be much easier to merge.

The spinner design is roughly based on Go's and mostly reuses the seq-cst barriers we currently have for sleeping. Unlike n_threads_running, though, the new counters rely on the thread being woken up, so they will underreport.

The new design doesn't care how many threads are spinning; it just checks how many idle threads are available (we may want to combine this with threads running, but they have slightly different ideas).
The performance gain is slight but does exist.
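
To make the idle-thread check concrete, here is a minimal sketch of the idea in Julia. It is not the code from this PR: `n_idle`, `mark_idle!`, `mark_awake!`, and `should_wake` are hypothetical names, and the real logic lives in the runtime scheduler. It assumes a single shared counter of parked threads that each worker updates itself, which is why the count can lag behind reality until a woken thread actually runs.

```julia
# Hypothetical sketch; none of these names come from the PR.
using Base.Threads: Atomic, atomic_add!, atomic_sub!

# Number of worker threads that are currently parked (idle).
const n_idle = Atomic{Int}(0)

# A worker calls this just before it goes to sleep.
mark_idle!() = atomic_add!(n_idle, 1)

# A worker calls this itself once it has been woken up, so the counter
# only changes after the wakeup has actually taken effect.
mark_awake!() = atomic_sub!(n_idle, 1)

# After enqueueing a task, wake another thread only if one is idle.
# How many threads are spinning is deliberately not consulted.
should_wake() = n_idle[] > 0
```

With this shape, the enqueue path only needs one atomic load to decide whether a wakeup is worth issuing.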

@oscardssmith oscardssmith added the multithreading Base.Threads and related functionality label Jan 10, 2025
@gbaraldi gbaraldi force-pushed the gb/sched-refact branch 2 times, most recently from 0d72173 to 9ecf779 Compare January 17, 2025 21:01
@gbaraldi (Member, Author) commented

Using

function fib(n::Int)
    n <= 1 && return n
    # Spawn one branch as a task; compute the other on the current task.
    t = Threads.@spawn fib(n - 2)
    return fib(n - 1) + fetch(t)::Int
end

as a benchmark, this shows a measurable improvement: the median drops from 2.958 ms on nightly to 2.456 ms with this PR, roughly 17% faster.

nightly

julia> @benchmark fib(20)
BenchmarkTools.Trial: 1571 samples with 1 evaluation.
 Range (min … max):  2.895 ms …   7.492 ms  ┊ GC (min … max): 0.00% …  0.00%
 Time  (median):     2.958 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.181 ms ± 544.458 μs  ┊ GC (mean ± σ):  6.30% ± 10.37%

  ▆█▃            ▁▁▁                                           
  ███▇▄▄▁▁▁▁▁▃▅███████▇▆▅▅▅▅▅▇▅▆▅▅▅▇▆▆▄▅▆▅▃▅▆▃▆▅▄▅▅▄▅▃▄▃▃▁▄▁▄ █
  2.9 ms       Histogram: log(frequency) by time      5.45 ms <

 Memory estimate: 3.71 MiB, allocs estimate: 67768

PR

julia> @benchmark fib(20)
BenchmarkTools.Trial: 1882 samples with 1 evaluation.
 Range (min … max):  2.403 ms …   5.918 ms  ┊ GC (min … max): 0.00% … 57.77%
 Time  (median):     2.456 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.655 ms ± 501.850 μs  ┊ GC (mean ± σ):  6.95% ± 11.26%

  ▆█▃            ▁▂▁                                           
  ███▇▃▁▁▁▁▁▁▁▅▆█████▇▇▇▄▅▅▃▅▆▆▅▃▅▆▅▇▅▅▅▆▆▅▅▅▅▅▅▆▄▅▅▅▃▆▄▅▃▅▅▅ █
  2.4 ms       Histogram: log(frequency) by time      4.75 ms <

 Memory estimate: 3.51 MiB, allocs estimate: 54732.

While this benchmark isn't comprehensive, it measures little other than scheduler latency. Given that this PR doesn't change any scheduling decisions and only changes the wakeup logic, it seems like a pretty nice improvement.

@gbaraldi gbaraldi requested a review from vtjnash January 20, 2025 19:57
@oscardssmith oscardssmith added the performance Must go faster label Jan 20, 2025
Labels
multithreading Base.Threads and related functionality performance Must go faster

6 participants