Fix hanging when starting over 512 subgraphs #2354
Merged
This issue was introduced when upgrading to tokio 1.0: the `#[tokio::main]` syntax changed, and the number of blocking threads was inadvertently lowered from 2000 to the default of 512. I was able to reproduce the issue locally. This fixes the root cause, which is the incorrect use of the blocking thread pool, at least in the critical locations. We still use it incorrectly in other places; #905 tracks getting this right everywhere.

There are at least two ways we were using the blocking thread pool incorrectly:

- Nested calls to `spawn_blocking`. This leads to a "double dip" deadlock: the nested calls wait for the callers to release a blocking thread, but the callers are of course waiting for the nested calls to complete.
- Using `spawn_blocking` for the task that calls `run_subgraph`. Tasks that run indefinitely should not be put on the blocking thread pool, because they could exhaust the available threads. Ideally we'd make it so that `run_subgraph` doesn't make blocking calls, but the quick fix is to use an OS thread. I added `graph::spawn_thread` to make this convenient.
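
For readers unfamiliar with the pattern, here is a minimal sketch of what a helper like `graph::spawn_thread` might look like. The actual signature is not shown in this description, so the `spawn_thread` below is a hypothetical stand-in built only on `std::thread::Builder`, and `run_on_dedicated_thread` is an illustrative placeholder for a long-lived task such as the one that calls `run_subgraph`.

```rust
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for `graph::spawn_thread` (assumed shape, not the
/// real helper): runs `f` on a dedicated, named OS thread instead of a
/// thread borrowed from the runtime's bounded blocking pool.
pub fn spawn_thread<T, F>(name: &str, f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new()
        .name(name.to_string())
        .spawn(f)
        .expect("failed to spawn OS thread")
}

/// Illustrative usage: a placeholder for a long-running per-subgraph task.
/// Here the thread simply reports its own name and exits.
pub fn run_on_dedicated_thread(id: u32) -> String {
    let handle = spawn_thread(&format!("subgraph-{}", id), move || {
        thread::current().name().unwrap_or("unnamed").to_string()
    });
    handle.join().expect("thread panicked")
}
```

Because each such task owns its thread, it cannot starve short-lived work queued on the blocking pool, at the cost of one OS thread per long-running task.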