-
Notifications
You must be signed in to change notification settings - Fork 6k
Ensure threads are merged when tearing down the Rasterizer #19919
Ensure threads are merged when tearing down the Rasterizer #19919
Conversation
It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat. Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing. |
@iskakaushik @chinmaygarde I haven't documented the new APIs in this PR nor have I added tests yet. I wanted to run this solution by you guys first. Let me know if you see obvious drawbacks in this solution. |
direction LGTM. How are you planning to test it? |
I think we can test this using ShellTests. (https://github.com/flutter/engine/blob/master/shell/common/shell_test.h) |
task_runners_.GetPlatformTaskRunner()) { | ||
return true; | ||
} | ||
if (surface_ == nullptr || raster_thread_merger_.get() == nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in what case surface_ = nullptr
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the rasterizer has already been torn down, the surface_
would be null. And in that case we shouldn't try to merge threads or wait the threads to be merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@blasten I updated the PR with your comments.
task_runners_.GetPlatformTaskRunner()) { | ||
return true; | ||
} | ||
if (surface_ == nullptr || raster_thread_merger_.get() == nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the rasterizer has already been torn down, the surface_
would be null. And in that case we shouldn't try to merge threads or wait the threads to be merged.
|
shell/common/rasterizer.h
Outdated
/// @attention If raster and platform task runners are not the same or not | ||
/// merged. This method will try to merge the task runners and | ||
/// might block the current thread and wait until the 2 task | ||
/// runners are merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: run-on sentence
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM + just minor nits. Excited to give this a try!
@@ -26,6 +32,15 @@ void RasterThreadMerger::MergeWithLease(size_t lease_term) { | |||
is_merged_ = task_queues_->Merge(platform_queue_id_, gpu_queue_id_); | |||
lease_term_ = lease_term; | |||
} | |||
merged_condition_.notify_one(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
grab lock.
fml/raster_thread_merger.h
Outdated
@@ -55,7 +67,9 @@ class RasterThreadMerger | |||
fml::TaskQueueId gpu_queue_id_; | |||
fml::RefPtr<fml::MessageLoopTaskQueues> task_queues_; | |||
std::atomic_int lease_term_; | |||
bool is_merged_; | |||
std::atomic_bool is_merged_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this be replaced by lease_term_ > 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@iskakaushik @blasten Updated with suggests and also added some unit tests for raster_thread_merger
fml/raster_thread_merger.h
Outdated
@@ -55,7 +67,9 @@ class RasterThreadMerger | |||
fml::TaskQueueId gpu_queue_id_; | |||
fml::RefPtr<fml::MessageLoopTaskQueues> task_queues_; | |||
std::atomic_int lease_term_; | |||
bool is_merged_; | |||
std::atomic_bool is_merged_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
shell/common/rasterizer.h
Outdated
/// @attention If raster and platform task runners are not the same or not | ||
/// merged. This method will try to merge the task runners and | ||
/// might block the current thread and wait until the 2 task | ||
/// runners are merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Some tests are failing because we are running raster_thread_merger on platforms that don't support it. See flutter/flutter#38844 |
@@ -246,7 +246,7 @@ bool MessageLoopTaskQueues::Unmerge(TaskQueueId owner) { | |||
bool MessageLoopTaskQueues::Owns(TaskQueueId owner, | |||
TaskQueueId subsumed) const { | |||
std::lock_guard guard(queue_mutex_); | |||
return subsumed == queue_entries_.at(owner)->owner_of || owner == subsumed; | |||
return subsumed == queue_entries_.at(owner)->owner_of; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Would you mind adding a TODO, and filing an issue about reverting this logic once Android is fixed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We actually might not need to revert this as I think this make sense. waiting for @iskakaushik to confirm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, this is good.
fml::AutoResetWaitableEvent term_platform; | ||
fml::AutoResetWaitableEvent latch_wait_until_merged; | ||
std::thread thread_platform([&]() { | ||
TEST(RasterThreadMerger, HandleTaskQueuesAreTheSame) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: missing checks for ASSERT_TRUE(raster_thread_merger_->TaskQueuesAreSame())
, and ASSERT_FALSE(raster_thread_merger_->TaskQueuesAreSame())
in a separate TEST.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should make raster_thread_merger_->TaskQueuesAreSame()
private. The user of the raster_thread_merger shouldn't need to know if the merging is static or dynamic. So the test only needed to test if the threads are merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM + some minor nits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
if (engine) { | ||
engine->OnOutputSurfaceDestroyed(); | ||
} | ||
// Step 1: Next, tell the raster thread that its rasterizer should suspend | ||
// access to the underlying surface. | ||
if (should_post_raster_task) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cyanglaz I was thinking about this. Sorry for all these questions:
- Is it possible that
should_post_raster_task
istrue
at this point even though it should befalse
? - Why isn't this checking a global shared state (such as
RasterThreadMerger::IsMerged()
) instead of a local state? - Could this be the reason why we need to merge the threads to fix the race condition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should_post_raster_task
was only exists for static thread merging. This value is alwaystrue
for dynamic thread merging.- I'm not sure why it was a local var, but global state would also work.
- This is the reason why we needed to merge threads here. Because thread merging happens on raster thread and this is running on platform thread, we cannot know if threads are merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the platform thread is blocked at this point waiting on the latch, and the raster queue is running on the platform thread, can we release the latch and indicate that the raster should be teared down right after the latch is awaited?
Description
This is to solve the deadlock in Shell::OnPlatformViewDestroy when dynamic thread merging is enabled.
In this solution, before we tearDown the rasterizer, we make sure the threads are merged.
We also make sure the threads are un-merged after tearing down the rasterizer, so we can re-create the surface on the raster thread.
go/flutter-thread-merging-rasterizer-teardown
TODO:
1. remove logging code.
2. add tests.
Related Issues
Fixes flutter/flutter#57067, flutter/flutter#23975
Tests
Checklist
Before you create this PR confirm that it meets all requirements listed below by checking the relevant checkboxes (
[x]
). This will ensure a smooth and quick review process.Breaking Change
Did any tests fail when you ran them? Please read handling breaking changes.