Skip to content

Add a barrier module in tests to replace the use of Semaphore. #88

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 1, 2023

Conversation

lyrm
Copy link
Collaborator

@lyrm lyrm commented Jul 27, 2023

This PR provides a very basic barrier implementation to replace the use of semaphores in qcheck tests. Semaphores (and now barriers) are used to improve chances that domains run in parallel even if the work they are doing is very small and fast.

The barrier module makes this task easier as there is no need to use different modules for 2 domains or more, like previously with Semaphore.Binary and Semaphore.Counting or some weird use or Semaphore.Counting (that were just meant to emulate a barrier).

let sema = Semaphore.Binary.create () in
let domain1 =
     Domain.spawn (fun () ->
            Semaphore.Binary.release sema;
            work ())  in
let popped2 =
      while not (Semaphore.Binary.try_acquire sema) do
            Domain.cpu_relax ()
      done;
      work ()) in
  ....

becomes :

let barrier = Barrier.create 2 in
let domain1 =
     Domain.spawn (fun () ->
            Barrier.await barrier;
            work ())  in
let popped2 =
     Barrier.await barrier;
     work () in
  ....

@polytypic
Copy link
Contributor

Looks good.

It is also possible to use a different API design:

type t = { counter : int Atomic.t; total : int }

let make total = { counter = Atomic.make 0; total }

let await { counter; total } =
  if Atomic.get counter = total then
    Atomic.compare_and_set counter total 0 |> ignore;
  Atomic.incr counter;
  while Atomic.get counter < total do
    Domain.cpu_relax ()
  done

The above is also an auto-reset barrier, but it requires the user to make sure that no thread will try to re-enter the barrier before every thread has exited the barrier. A simple way to do that is to use two such barriers:

let entry_barrier = Barrier.make n in
let exit_barrier = Barrier.make n in
for ... do
  Barrier.await entry_barrier;
  work ();
  Barrier.await exit_barrier;
done

This is perhaps a bit more tricky to use correctly, so having the barrier do both entry and exit sync is probably less error prone.

@polytypic
Copy link
Contributor

polytypic commented Jul 31, 2023

One minor suggestion I have is to change the order of things in the barrier implementation. Consider the end of the await implementation:

  (* Wait for enough waiters to arrive *)
  while Atomic.get waiters < size do
    Domain.cpu_relax ()
  done;
  (* Have passed. Increased [passed]. If last one to pass, reset the
     barrier. *)
  if Atomic.fetch_and_add passed 1 = size - 1 then (
    Atomic.set passed 0;
    Atomic.set waiters 0)

What happens above is that after all waiters have passed, the last one to pass resets the barrier. This is minor thing, but this means that the last waiter to exit the barrier does a little bit more work. The minor issue with that is then that the last waiter potentially takes a little bit more time before starting to execute code (which we want to synchronize) after the barrier. Usually that should not really be a problem, but it can be easily avoided by moving code around a bit:

let create n = { waiters = Atomic.make n; size = n; passed = Atomic.make 0 }

let await { waiters; size; passed } =
  (* Have passed. Increased [passed]. If last one to pass, reset the
     barrier. *)
  if Atomic.fetch_and_add passed 1 = size - 1 then (
    Atomic.set passed 0;
    Atomic.set waiters 0);
  (* Wait for the barrier to release the previous group *)
  while Atomic.get waiters = size do
    Domain.cpu_relax ()
  done;
  (* Add itself in the waiters group *)
  Atomic.incr waiters;
  (* Wait for enough waiters to arrive *)
  while Atomic.get waiters < size do
    Domain.cpu_relax ()
  done

Now the barrier starts at the state where every thread is just about to "pass" (note the change in create). The extra work is done at the beginning, before the main wait, and then there is no extra work at the end before exiting to the code which we wanted to synchronize.

Copy link
Contributor

@polytypic polytypic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! 🚀

Feel free to merge whenever you feel this is ready — with or without the minor suggestion I made.

@lyrm
Copy link
Collaborator Author

lyrm commented Jul 31, 2023

Thanks for the review.

I will keep in mind the idea of a pair of entry/exit barriers. However, the point of this module being to make it as easy as possible to run tests that actually runs in parallel, I prefer to keep it as simple as possible :)

I pushed the changes you suggested for putting the extra work at the beginning of the await function, instead of at the end, as it is a pretty good point !

@lyrm lyrm merged commit 83253a5 into ocaml-multicore:main Aug 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants