Commit 7f8d474

add docs on task-specific buffering using threads
Co-Authored-By: Mason Protter <[email protected]>
1 parent f6f3553 commit 7f8d474

2 files changed: 70 additions and 6 deletions

base/threadingconstructs.jl

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -246,8 +246,8 @@ For example, the above conditions imply that:
246246
- Communicating between iterations using blocking primitives like `Channel`s is incorrect.
247247
- Write only to locations not shared across iterations (unless a lock or atomic operation is
248248
used).
249-
- The value of [`threadid()`](@ref Threads.threadid) may change even within a single
250-
iteration. See [`Task Migration`](@ref man-task-migration)
249+
- Unless the `:static` schedule is used, the value of [`threadid()`](@ref Threads.threadid)
250+
may change even within a single iteration. See [`Task Migration`](@ref man-task-migration).
251251
252252
## Schedulers
253253

doc/src/manual/multi-threading.md

Lines changed: 68 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -239,6 +239,68 @@ julia> a
239239

240240
Note that [`Threads.@threads`](@ref) does not have an optional reduction parameter like [`@distributed`](@ref).
241241

242+
### Using `@threads` without data races
243+
244+
Take the example of a naive sum:
245+
246+
```julia-repl
247+
julia> function sum_single(a)
248+
           s = 0
249+
           for i in a
250+
               s += i
251+
           end
252+
           s
253+
       end
254+
sum_single (generic function with 1 method)
255+
256+
julia> sum_single(1:1_000_000)
257+
500000500000
258+
```
259+
260+
Simply adding `@threads` introduces a data race, with multiple threads reading and writing `s` at the same time.
261+
```julia-repl
262+
julia> function sum_multi_bad(a)
263+
           s = 0
264+
           Threads.@threads for i in a
265+
               s += i
266+
           end
267+
           s
268+
       end
269+
sum_multi_bad (generic function with 1 method)
270+
271+
julia> sum_multi_bad(1:1_000_000)
272+
70140554652
273+
```
274+
275+
Note that the result is not `500000500000` as it should be, and will most likely change on each evaluation.
276+
277+
To fix this, use buffers that are specific to each task, so that the sum is split into chunks that are race-free.
278+
Here `sum_single` is reused, with its own internal buffer `s`, and `a` is split into about `nthreads()`
279+
chunks, each summed in parallel by its own `@spawn`-ed task.
280+
281+
```julia-repl
282+
julia> function sum_multi_good(a)
283+
           chunks = Iterators.partition(a, cld(length(a), Threads.nthreads()))
284+
           tasks = map(chunks) do chunk
285+
               Threads.@spawn sum_single(chunk)
286+
           end
287+
           chunk_sums = fetch.(tasks)
288+
           return sum_single(chunk_sums)
289+
       end
290+
sum_multi_good (generic function with 1 method)
291+
292+
julia> sum_multi_good(1:1_000_000)
293+
500000500000
294+
```
295+
!!! note
296+
    Buffers should not be managed based on `threadid()`, e.g. `buffers = zeros(Threads.nthreads())`, because concurrent tasks
297+
    can yield, meaning that multiple concurrent tasks may use the same buffer on a given thread, introducing a risk of data races.
298+
    Further, when more than one thread is available, tasks may change threads at yield points, which is known as
299+
    [task migration](@ref man-task-migration).
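
To illustrate the task-local buffering that the note recommends, here is a minimal sketch that is not part of this commit. The function name `histogram_tasklocal` and its bin-counting workload are hypothetical; the point is only that each `@spawn`-ed task allocates its own `counts` buffer, so no buffer is indexed by `threadid()` or shared between tasks.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical helper: count how often each value in 1:nbins occurs in `data`.
# Each task allocates and fills its own buffer, so no buffer is shared.
function histogram_tasklocal(data, nbins)
    chunk_len = max(1, cld(length(data), nthreads()))
    tasks = map(Iterators.partition(data, chunk_len)) do chunk
        @spawn begin
            counts = zeros(Int, nbins)   # buffer owned by this task only
            for v in chunk
                counts[v] += 1           # assumes 1 <= v <= nbins
            end
            counts
        end
    end
    # Combine the per-task buffers once every task has finished.
    return reduce(+, fetch.(tasks); init = zeros(Int, nbins))
end
```

For example, `histogram_tasklocal([1, 2, 2, 3, 3, 3], 3)` returns `[1, 2, 3]` regardless of how many threads are available.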
300+
301+
Another option is the use of atomic operations on variables shared across tasks/threads, which may be more performant
302+
depending on the characteristics of the operations.
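
The atomic alternative mentioned above could look like the following sketch, which is an illustration rather than part of this commit. The name `sum_multi_atomic` is hypothetical, and the code assumes the elements of `a` are `Int`s so that they match the `Atomic{Int}` accumulator.

```julia
using Base.Threads: @threads, Atomic, atomic_add!

# Hypothetical variant of the sum example that uses one shared atomic accumulator.
# Assumes eltype(a) == Int so that each element matches the Atomic{Int} type.
function sum_multi_atomic(a)
    s = Atomic{Int}(0)
    @threads for i in a
        atomic_add!(s, i)   # atomic read-modify-write, so no update is lost
    end
    return s[]              # read the final value out of the atomic
end
```

Every iteration contends for the same atomic variable here, so for a loop body this cheap the chunked version is likely to be faster. Atomics tend to pay off when updates are infrequent relative to the work done per iteration.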
303+
242304
## Atomic Operations
243305

244306
Julia supports accessing and modifying values *atomically*, that is, in a thread-safe way to avoid
@@ -390,11 +452,13 @@ threads in Julia:
390452

391453
## [Task Migration](@id man-task-migration)
392454

393-
After a task starts running on a certain thread (e.g. via [`@spawn`](@ref Threads.@spawn) or
394-
[`@threads`](@ref Threads.@threads)), it may move to a different thread if the task yields.
455+
After a task starts running on a certain thread, it may move to a different thread if the task yields.
456+
457+
Such tasks may have been started with [`@spawn`](@ref Threads.@spawn) or [`@threads`](@ref Threads.@threads),
458+
although the `:static` schedule option for `@threads` keeps each task on its thread, so its `threadid()` does not change.
395459

396-
This means that [`threadid()`](@ref Threads.threadid) should not be treated as constant within a task, and therefore
397-
should not be used to index into a vector of buffers or stateful objects.
460+
This means that in most cases [`threadid()`](@ref Threads.threadid) should not be treated as constant within a task,
461+
and therefore should not be used to index into a vector of buffers or stateful objects.
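
A small sketch, not part of this commit, of what the `:static` exception means in practice. The hypothetical function `threadid_stable_static` records whether `threadid()` is the same before and after an explicit `yield()` in each iteration. Under `@threads :static` it is expected to be, because each task stays on the thread it started on.

```julia
using Base.Threads: @threads, threadid

# Hypothetical check: does threadid() stay the same across a yield point
# in every iteration of a `:static` threaded loop?
function threadid_stable_static(n)
    stable = fill(false, n)
    @threads :static for i in 1:n
        before = threadid()
        yield()                              # a point where migration could otherwise occur
        stable[i] = (before == threadid())   # each iteration writes its own slot
    end
    return all(stable)
end
```

Without `:static`, the same check may or may not pass on a given run, which is why `threadid()` should not be used to index buffers in general.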
398462

399463
!!! compat "Julia 1.7"
400464
Task migration was introduced in Julia 1.7. Before this tasks always remained on the same thread that they were
