scope scheduling RFC #1

nikomatsakis · 2018-11-01T09:52:47Z

Offer more variations on the scope construct that give more control
over the relative scheduling of spawned tasks. The choices would be:

Default: it is not important which tasks get scheduled when. Let
Rayon pick based on which can be most efficiently implemented. For
now, this is "per-thread LIFO", but it could change in the future.
Per-thread LIFO: The task that the current thread spawned most
recently will be the first to be executed. Thieves will steal the
thread that was spawned first (and thus would be the last
one that the current thread would execute). Tasks spawned by stolen
tasks will be processed first by the thief, again in a LIFO order,
but maybe stolen by other threads in turn. This is the current
default and can be implemented with the highest "micro efficiency".
Per-thread FIFO: The task that the current thread spawned first
is executed first. Thieves will also steal tasks in the same
order. Tasks spawned by stolen tasks will be processed first by the
thief. This is "roughly" the behavior today for thread-pools that
enabled the breadth_first scheduling option.
Global FIFO: Tasks always execute in the order in which they
were spawned, regardless of whether they were stolen or not.

nikomatsakis · 2018-11-01T09:52:51Z

cc @cuviper

cuviper · 2018-11-01T16:37:00Z

I shared an alternate branch in rayon-rs/rayon#601 (comment):
https://github.com/cuviper/rayon/tree/thread-fifo-spawn

This has all three variants implemented, but the public API is hard-coded to the per-thread FIFO at the moment. I did that so we could get stylo performance numbers, and it sound like it meets their needs!

ghost · 2018-11-08T16:54:08Z

My thinking is we don't need global FIFO ordering because it's prohibitively slow due to contention. It doesn't offer much over per-thread FIFO and sort of feels antithetical to how Rayon works and splits tasks into distinct chunks so that they can be processed with as little data sharing as possible.

Per-thread LIFO, while a totally sensible strategy, can already be simulated using rayon::join. For example, the following LIFO-ordered processing:

let v = ...;
rayon::scope(|scope| {
    for i in &v {
        scope.spawn(move |_| println!("{}", i));
    }
});

Can be rewritten as:

let v = ...;
fn go(mut it: impl Iterator<Item = ...>) {
    if let Some(i) = it.next() {
        rayon::join(move || go(it), move || println!("{}", i));
    }
}
rayon::join(||, || go(v.iter()));

Or, even better... :)

let v = ...;
v.par_iter().for_each(|i| println!("{}", i));

So if we supported per-thread LIFO ordered scope queues, it'd just be syntax sugar over rayon::join with recursion. It seems convenient, but I'm not convinced it's worth supporting at all.

To me, the only reasonable ordering strategy is per-thread FIFO. Others are not wrong per se, but they seem dubious.

I'd be okay with merging the per-thread FIFO strategy implemented by @cuviper and not promising too much in the documentation, i.e. the strategy is best-effort. rayon::join is a best-effort LIFO strategy in the same sense - we don't guarantee anything beyond that.

nikomatsakis · 2018-11-14T18:32:58Z

I just pushed a big update — I think the RFC is now "first draft complete".

In writing it, I came up with a few open questions:

Is the "global FIFO" mode worthwhile? I feel pretty good about both of the "per-thread" modes. The global FIFO mode however I am not so sure about. In general, it's easy to emulate global modes in "user space" (the TSP code, for example, does this, in order to use a priority queue). Seems like an easy thing to cut.

Should we adopt a builder? Particularly if we cut the "global FIFO" mode, then this RFC could be as simple as adding rayon::fifo_scope -- the builder feels kind of like overkill.

Should we reflect the "kind" of the scope in its type? To that end, we could add a rayon::FifoScope type, which gives us the freedom to use traits to statically resolve the "if fifo" check (even if, in our current implementation, a FifoScope simply wraps a Scope).

nikomatsakis · 2018-11-14T18:36:44Z

To answer @stjepang:

My thinking is we don't need global FIFO ordering because it's prohibitively slow due to contention. It doesn't offer much over per-thread FIFO and sort of feels antithetical to how Rayon works and splits tasks into distinct chunks so that they can be processed with as little data sharing as possible.

I think I agree. It seems very easy to emulate this in user space.

Per-thread LIFO, while a totally sensible strategy, can already be simulated using rayon::join.

Yes, but it requires non-constant stack space this way. I think I'd prefer prefer to keep per-thread LIFO for the reasons I enumerated in the RFC:

It preserves existing behavior.
It is the "minimal overhead" variant, and it therefore enables global scheduling to run with minimal overhead (whether FIFO or priority queue).
I think for "don't care" cases, LIFO is a better default. I'd rather that you do the tasks that were most directly related to the things you were last doing, absent any other preference. If we only offer that through join, you have to pay in stack space -- this can be "as cheap as" O(log(N)), and that's fine, but sometimes you can't easily "split in half", in which case it's O(N).

nikomatsakis · 2018-11-14T18:38:06Z

cc @bholley -- just thought you'd like to be aware of this RFC

nikomatsakis · 2018-11-14T19:02:06Z

To make it clearer what I am contemplating:

I am mildly leaning towards a "pared down" RFC at this point, which basically just adds two things to rayon-core (and mirrored in rayon):

rayon::fifo_scope, which acts like rayon::scope today but with per-thread FIFO scheduling
rayon::FifoScope, which corresponds to the existing rayon::Scope type

I see I did not make this explicit in my previous comments.

cuviper · 2018-11-14T22:22:33Z

accepted/rfc0000-scope-scheduling.md

+
+then we should execute:
+
+- first, the tasks from the join (in reverse order, so B and then A)


join does not reverse order -- B will be pushed, A will execute, then B will pop and execute (if it wasn't already stolen).

oh, I misremembered. =) Will fix.

Still not reversed...

cuviper · 2018-11-14T22:33:15Z

accepted/rfc0000-scope-scheduling.md

+contain a generic). Given the performance measurements, it is unlikely
+that the more complex types are worthwhile. This change could also be
+made at some later time, conceivably, though it would require more
+additions to the `ScopeBuilder` API.


How could it be changed later? If we expose per-thread FIFO using the current Scope type now, it seems to me that we'll be locked into that.

I think I meant "if we use a distinct type, we can still use a branch in the implementation" -- I'll tweak the wording

cuviper · 2018-11-14T22:58:54Z

I think we should also be clear about what happens to the breadth_first flag. We want to deprecate it, but will we make any effort to directly preserve its behavior?

If we don't preserve behavior, we should be very clear in the deprecation notice that it's a no-op.
- Users like Stylo will need to explicitly change their code with the new rayon-core -- regardless of whether rayon itself is updated.
If we do want to preserve behavior, then we need to talk about how.
- I don't mean to preserve everything, where a local thread always steals from the front of its own deque. This behavior is harmful, as noted in the RFC.
- We can potentially preserve just the scope-spawn order that Stylo wants. It wouldn't be hard with a unified Scope -- we would just forward scope to fifo_scope in that case. But if there's a separate FifoScope, I don't know how to make breadth_first+scope use that transparently.

nikomatsakis · 2018-11-17T11:56:34Z

I think we should also be clear about what happens to the breadth_first flag. We want to deprecate it, but will we make any effort to directly preserve its behavior? ... We can potentially preserve just the scope-spawn order that Stylo wants. It wouldn't be hard with a unified Scope -- we would just forward scope to fifo_scope in that case.

Very interesting points. It's appealing to say that we make a "best effort" to preserve the flag, though I'm not sure whether there are any other clients beyond Stylo. Even though we went to some lengths to caveat that flag, altering the behavior could definitely affect people in practice (theoretically, I am skeptical there are many users).

Still, it might be a nice design to offer some more explicit choices:

scope -- "implementation defined" ordering (currently picks based on the (deprecated) flag), returns Scope, which dynamically selects
lifo_scope -- explicit, returns LifoScope, statically known
fifo_scope -- explicit, returns FifoScope, statically known

(In particular I have no idea if, in the future, it might be the case that some other default choice "seems better".)

For now, I would expect LifoScope and FifoScope to simply wrap an internal Scope -- they might offer a method like into_scope to permit "casting" to the more general type (for now, this would just return the hidden inner type, but in the future it could wrap in an enum).

nikomatsakis · 2018-11-17T11:57:06Z

I'd say that preserving the breadth-first flag is not sufficient justification for the complexity above, but one could argue it's a genuinely nicer design.

cuviper · 2018-11-28T20:09:57Z

From our discussion today, we're leaning toward leaving breadth_first entirely as-is, though deprecated, rather than trying to engineer compatible effects on just scope/spawn. This means it would still affect the choice to pop or steal from the local deque, with all the problems of rayon-rs/rayon#590. However, after some grace period we would plan to no-op breadth_first.

In terms of this RFC, this means we'd implement:

scope and Scope with thread-LIFO behavior, the status quo
- deprecated breadth_first will make threads steal from their own deque, crudely FIFO
- breadth_first will become a no-op later, so the deque is always LIFO
fifo_scope and FifoScope with the new thread-FIFO behavior
- FIFO here is only with respect to the given scope, not the entire deque
- breadth_first would have no semantic effect here, just making the indirection useless

And global_fifo can be left as a future extension, should we find a compelling use case.

cuviper · 2018-11-28T20:21:38Z

Another part of rayon-rs/rayon#601 was making unscoped spawn always use the pool's global injector queue, so it would always be globally FIFO. Should we address that in this RFC too? Perhaps:

spawn left as the status quo, using inject_or_push to either hit the pool's injector (FIFO) or be added to the local deque (LIFO, unless affected by breadth_first).
fifo_spawn to use an indirection queue in the local WorkerThread, similar to what FifoScope will do. Calling from outside the pool still goes to the global injector.

nikomatsakis · 2018-11-29T12:02:14Z

Interesting. I do prefer that global {fifo_,}spawn -- when invoked from a worker thread -- performs "as if" there were a (implicit) global scope surrounding the worker thread, which I think fits what you said @cuviper, correct? If so, I'll add that too.

cuviper · 2018-11-29T16:59:23Z

Yeah, thinking of spawn in a global scope is a good way to associate it with this RFC. :)

If we call the new global function fifo_spawn, what about the method on FifoScope? Is it just spawn, or a somewhat-redundant fifo_spawn to match the global?

nikomatsakis · 2018-12-07T11:13:04Z

@cuviper updated

ghost · 2018-12-07T12:38:05Z

Perhaps it'd be more idiomatic to name the function scope_fifo rather than fifo_scope? 🚲 We've similarly chosen sort_unstable over unstable_sort because that way functions are more discoverable when sorted in rustdoc or when the IDE autocompletes .sort. The same applies to fifo_spawn -> spawn_fifo.

Do you think we could omit the function fifo_spawn? It's not like anyone needs it right now and we can easily add it later if there's a need. Plus, the semantics of such a function in combination with the regular spawn are non-obvious.

I'd be happy with introducing just the following minimal API:

pub fn scope_fifo<'scope, OP, R>(op: OP) -> R where
    OP: for<'s> FnOnce(&'s ScopeFifo<'scope>) -> R + 'scope + Send,
    R: Send;

pub struct ScopeFifo<'scope>;

impl<'scope> ScopeFifo<'scope> {
    pub fn spawn<BODY>(&self, body: BODY)
    where
        BODY: FnOnce(&ScopeFifo<'scope>) + Send + 'scope;
}

nikomatsakis · 2018-12-07T13:32:36Z

👍 for scope_fifo, done.

I don't have a strong opinion about whether spawn_fifo should be retained -- but I was wondering how much to say about the semantics of pushing into an outer scope etc. I'm not sure if there is any other way to do it, though, apart from spawn_fifo, because scope types are not Send or Sync...

ghost · 2018-12-07T13:40:55Z

because scope types are not Send or Sync

Not sure if I'm missing something here, but Scope<'_> is Send + Sync: https://docs.rs/rayon/1.0.3/rayon/struct.Scope.html#synthetic-implementations

nikomatsakis · 2018-12-07T13:43:38Z

OK =) I was misremembering (I think initially they were not, but I have a vague memory that we changed it).

In that case, I don't think that any new challenge in particular is introduced by spawn_fifo that doesn't already exist.

That is, you can do something like this:

scope_fifo(|scope1| {
    scope_fifo(|scope2| {
        scope1.spawn(foo);
        scope2.spawn(bar);
        scope1.spawn(baz);
    });
});

Presumably this will execute in an interleaved fashion.

ghost · 2018-12-07T14:56:40Z

So what's the right metaphor for global spawn() and spawn_fifo() functions?

Is it this (a):

scope_fifo(|scope1| {
    scope(|scope2| {
        // `spawn()` is `scope2.spawn()`
        // `spawn_fifo()` is scope1.spawn()`
        main();
    });
});

Or this (b):

scope(|scope1| {
    scope_fifo(|scope2| {
        // `spawn()` is `scope1.spawn()`
        // `spawn_fifo()` is scope2.spawn()`
        main();
    });
});

Or something else (c)?

Presumably this will execute in an interleaved fashion.

Are you saying that (a) and (b) behave identically? That might be true in the proposed implementation, but I think it'd make sense for inner scopes to prioritize their own tasks over others on at a best-effort basis (but no guarantees!).

cuviper · 2018-12-07T21:03:16Z

Scope was made Send + Sync in rayon-rs/rayon#443.

That is, you can do something like this:
scope_fifo(|scope1| {
    scope_fifo(|scope2| {
        scope1.spawn(foo);
        scope2.spawn(bar);
        scope1.spawn(baz);
    });
});
Presumably this will execute in an interleaved fashion.

Interesting example! While I think we would generally want to say that scopes are prioritized inside-out, we don't currently have a way to delay those inner scope1.spawn calls. So yes, they'll execute interleaved.

In the absence of any stealing, your example will:

enter scope1
- enter scope2
  - spawn s1 foo, s2 bar, s1 baz
  - pop s1 -> run s1 foo
  - pop s2 -> run s2 bar
  - leave scope2
- pop s1 -> run s1 baz
- leave scope1.

But for a slightly weirder case, suppose we add an earlier spawn:

enter scope1
- spawn s1 pre
- enter scope2
  - spawn s1 foo, s2 bar, s1 baz
  - pop s1 -> run s1 pre
  - pop s2 -> run s2 bar
  - leave scope2
- pop s1 -> run s1 foo
- pop s1 -> run s1 baz
- leave scope1.

We essentially have to run as many scope1 spawns as were pushed after the first scope2 spawn, and then scope2 can return -- this was just one interleaver in both of these cases. Ideally we would to prioritize scope2 fully ahead of scope1, but I'm not sure how to make this happen. In practice, I think it will be OK though.

So what's the right metaphor for global spawn() and spawn_fifo() functions?

In this case, we should think of them as independent scopes, not like nesting, although the interleaving effect is probably about the same. A spawn_fifo() call will only be ordered relative to other spawn_fifo() calls from the same thread.

cuviper

No objections, just some typos and clarifications.

accepted/rfc0000-scope-scheduling.md

cuviper · 2018-12-08T00:08:07Z

accepted/rfc0000-scope-scheduling.md

+
+then we should execute:
+
+- first, the tasks from the join (in reverse order, so B and then A)


Still not reversed...

accepted/rfc0000-scope-scheduling.md

nikomatsakis · 2018-12-10T12:55:20Z

@cuviper thanks! Fixed.

ghost · 2018-12-10T13:30:52Z

Thank you for your replies! I'm happy with the current state of this RFC. 👍

cuviper · 2018-12-11T02:34:47Z

What's our RFC endgame? Should I go ahead and implement a new PR for this?

nikomatsakis · 2018-12-11T13:31:36Z

@cuviper yes, I think so -- I guess I will merge

nikomatsakis · 2018-12-12T18:46:28Z

In discussion on Gitter we decided to rename from ScopeFifo::spawn to ScopeFifo::spawn_fifo, for the following reasons:

It is consistent with the global spawn_fifo method
It is more "bullet-proof", makes it harder to mistakenly use LIFO when you wanted FIFO
Leaves room for a single scope that supports both LIFO and FIFO spawns, should we want that

nikomatsakis · 2018-12-12T18:58:35Z

We decided to merge =)

615: Implement RFC #1: FIFO spawns r=nikomatsakis a=cuviper This implements rayon-rs/rfcs#1, adding `spawn_fifo`, `scope_fifo`, and `ScopeFifo`, and deprecating the `breadth_first` flag. Fixes #590. Closes #601. Co-authored-by: Josh Stone <[email protected]>

start working on scope scheduling RFC

24b3e13

cuviper mentioned this pull request Nov 1, 2018

Deprecate breadth_first() and always spawn FIFO rayon-rs/rayon#601

Closed

update with lots of new text

31c81e7

nikomatsakis changed the title ~~[WIP] scope scheduling RFC~~ scope scheduling RFC Nov 14, 2018

cuviper reviewed Nov 14, 2018

View reviewed changes

update to just offer fifo_scope and fifo_spawn

fe3bd48

rename to scope_fifo and not fifo_scope

271d347

cuviper reviewed Dec 8, 2018

View reviewed changes

fix cuviper's comments

912cdbc

cuviper approved these changes Dec 11, 2018

View reviewed changes

cuviper mentioned this pull request Dec 12, 2018

Implement RFC #1: FIFO spawns rayon-rs/rayon#615

Merged

specify method name of ScopeFifo::spawn_fifo

2b3045b

nikomatsakis merged commit e3534a7 into rayon-rs:master Dec 12, 2018


		then we should execute:

		- first, the tasks from the join (in reverse order, so B and then A)

scope scheduling RFC #1

scope scheduling RFC #1

Uh oh!

Conversation

nikomatsakis commented Nov 1, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nikomatsakis commented Nov 1, 2018

Uh oh!

cuviper commented Nov 1, 2018

Uh oh!

ghost commented Nov 8, 2018

Uh oh!

nikomatsakis commented Nov 14, 2018

Uh oh!

nikomatsakis commented Nov 14, 2018

Uh oh!

nikomatsakis commented Nov 14, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nikomatsakis commented Nov 14, 2018

Uh oh!

cuviper Nov 14, 2018

Choose a reason for hiding this comment

Uh oh!

nikomatsakis Nov 17, 2018

Choose a reason for hiding this comment

Uh oh!

cuviper Dec 8, 2018

Choose a reason for hiding this comment

Uh oh!

cuviper Nov 14, 2018

Choose a reason for hiding this comment

Uh oh!

nikomatsakis Nov 17, 2018

Choose a reason for hiding this comment

Uh oh!

cuviper commented Nov 14, 2018

Uh oh!

nikomatsakis commented Nov 17, 2018

Uh oh!

nikomatsakis commented Nov 17, 2018

Uh oh!

cuviper commented Nov 28, 2018

Uh oh!

cuviper commented Nov 28, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nikomatsakis commented Nov 29, 2018

Uh oh!

cuviper commented Nov 29, 2018

Uh oh!

nikomatsakis commented Dec 7, 2018

Uh oh!

ghost commented Dec 7, 2018

Uh oh!

nikomatsakis commented Dec 7, 2018

Uh oh!

ghost commented Dec 7, 2018

Uh oh!

nikomatsakis commented Dec 7, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ghost commented Dec 7, 2018 • edited by ghost Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cuviper commented Dec 7, 2018

Uh oh!

cuviper left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cuviper Dec 8, 2018

Choose a reason for hiding this comment

Uh oh!

Uh oh!

nikomatsakis commented Nov 1, 2018 •

edited

Loading

nikomatsakis commented Nov 14, 2018 •

edited

Loading

cuviper commented Nov 28, 2018 •

edited

Loading

nikomatsakis commented Dec 7, 2018 •

edited

Loading

ghost commented Dec 7, 2018 •

edited by ghost

Loading