This repository was archived by the owner on Feb 25, 2025. It is now read-only.

boost shell startup #8489

Closed
wants to merge 2 commits into from

Conversation

SupSaiYaJin
Contributor

@SupSaiYaJin SupSaiYaJin commented Apr 8, 2019

The main idea is to release the latch between the platform thread and the UI thread. These threads are always very busy during startup, especially in a large, complex app.
Looking forward to discussing this with you all.

@SupSaiYaJin
Contributor Author

SupSaiYaJin commented Apr 10, 2019

  1. Create the DartVM on the UI thread instead of the platform thread.
  2. Don't wait for engine creation when creating the shell; create a UI latch for it instead.
  3. Check the UI latch whenever the engine needs to be acquired.
  4. Move the ServiceProtocol setup to the UI thread.
  5. When the platform view is created, run the GPU task before the UI task, and don't wait for the UI task.

After this change, our app's time to first frame is almost twice as fast as before.
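The shape of the change can be sketched as follows. This is a minimal illustration, not the engine's actual code: `Latch` here is a hand-rolled stand-in for an fml-style latch, and `Engine` is a placeholder type. The point is that the platform thread no longer blocks while the UI thread does expensive setup; it only waits at the last moment, when the engine is actually needed.

```cpp
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Minimal one-shot latch, written out so the sketch is self-contained.
class Latch {
 public:
  void Signal() {
    std::lock_guard<std::mutex> lock(mutex_);
    signaled_ = true;
    cv_.notify_all();
  }
  void Wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return signaled_; });
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool signaled_ = false;
};

struct Engine {};  // hypothetical stand-in for the real Engine

// The platform thread kicks off engine creation on the UI thread and
// continues with its own setup; the latch is only waited on when the
// engine is first required.
bool StartupSketch() {
  Latch engine_ready;
  std::unique_ptr<Engine> engine;

  std::thread ui_thread([&] {
    engine = std::make_unique<Engine>();  // expensive setup on the UI thread
    engine_ready.Signal();
  });

  // ... platform-thread setup proceeds here instead of blocking ...

  engine_ready.Wait();  // deferred synchronization point
  bool engine_available = engine != nullptr;
  ui_thread.join();
  return engine_available;
}
```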

@dnfield
Contributor

dnfield commented Apr 11, 2019

Well, for what it's worth, this does seem to speed things up in some iOS add2app stuff I have.

That said, it looks like some of the comments about the order of things might need to be updated a bit. Chinmay or Jason will also be able to check the correctness of this change better than I can.

@cbracken
Member

/cc @chinmaygarde for further review

@cbracken
Member

@chinmaygarde have you had a chance to have a look at this? If so, any feedback?

Contributor

@dnfield dnfield left a comment

I talked offline with @chinmaygarde about this. It's not thread safe internally to the engine.

This is a very interesting idea though. If we could somehow make all of the Engine class unique_ptrs waitable/thread-safe, we could probably experiment with doing this. I'd love to see this patch updated to do that, but if not, Chinmay and I discussed some ideas to make it happen, and one of us or someone else will get to it (a lot of people want this problem improved).

@dnfield
Contributor

dnfield commented May 14, 2019

To elaborate a little bit more on this we'd want to make sure access to https://github.com/flutter/engine/blob/master/shell/common/shell.h#L91-L94 is thread safe. Right now that's being taken care of by the thread jumps and waits - if we remove that, we have to make sure it's safe not just for embedder consumers but also inside the shell itself.

@cbracken
Member

cbracken commented Jun 1, 2019

@SupSaiYaJin do you plan to followup with changes to address the thread-safety issue?

@SupSaiYaJin
Contributor Author

To elaborate a little bit more on this we'd want to make sure access to https://github.com/flutter/engine/blob/master/shell/common/shell.h#L91-L94 is thread safe. Right now that's being taken care of by the thread jumps and waits - if we remove that, we have to make sure it's safe not just for embedder consumers but also inside the shell itself.

Can you provide some specific ideas? I will try to implement them.

@dnfield
Contributor

dnfield commented Jun 5, 2019

We'd have to make all four pointers the shell holds guarded like this, perhaps as futures, so that internal access within the class is safe and only happens when the necessary parts are set up.
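A minimal sketch of this guarded-pointer idea, using std::promise/std::shared_future and a stand-in Engine type (none of these names are the engine's actual API). The shell holds a shared_future instead of a bare unique_ptr, so any caller, on any thread, blocks until setup has finished rather than racing against it or observing a null pointer:

```cpp
#include <future>
#include <memory>
#include <thread>

struct Engine {};  // hypothetical stand-in for the real Engine

class GuardedShell {
 public:
  GuardedShell() : engine_future_(engine_promise_.get_future().share()) {}

  // Called exactly once, from whichever thread finishes creating the engine.
  void SetEngine(std::unique_ptr<Engine> engine) {
    engine_promise_.set_value(std::move(engine));
  }

  // Safe to call from any thread; blocks until the engine exists.
  Engine* GetEngine() { return engine_future_.get().get(); }

 private:
  std::promise<std::unique_ptr<Engine>> engine_promise_;
  std::shared_future<std::unique_ptr<Engine>> engine_future_;
};

bool GuardedAccessSketch() {
  GuardedShell shell;
  std::thread ui([&shell] { shell.SetEngine(std::make_unique<Engine>()); });
  Engine* engine = shell.GetEngine();  // blocks until the UI thread is done
  ui.join();
  return engine != nullptr;
}
```

The trade-off gaaclarke raises later in the thread applies here: every access pays for the future's synchronization even after setup is long finished.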

@SupSaiYaJin
Contributor Author

@dnfield Do you mean using std::future?
I think only Engine needs to be guarded; the others will be set up early.

@dnfield
Contributor

dnfield commented Jun 12, 2019

Yes on std::future

They really all need it, for internal consistency and safety. Embedders won't access the others, but within the engine there's nothing stopping that, and we may already have clients that do.

@SupSaiYaJin
Contributor Author

@dnfield I have made a large change using std::shared_future; could you please check it?
I don't know why the checks are failing.

@dnfield
Contributor

dnfield commented Jun 14, 2019

The test that's failing is one that's checking thread safety.

  Shell::CreateCallback<PlatformView> on_create_platform_view,
  Shell::CreateCallback<Rasterizer> on_create_rasterizer) {
    if (!task_runners.IsValid()) {
      FML_LOG(ERROR) << "Task runners to run the shell were invalid.";
      return nullptr;
    }

-   auto shell =
-       std::unique_ptr<Shell>(new Shell(std::move(vm), task_runners, settings));
+   auto shell = std::unique_ptr<Shell>(new Shell(task_runners, settings));
Contributor

I think we need to tell the shell about how to get a VM at this point. The platform view may need it for some embedding implementations.

Contributor

Also nit: this can just be auto shell = std::make_unique<Shell>(task_runners, settings);

  }

- DartVM* Shell::GetDartVM() {
-   return &vm_;
+ fml::WeakPtr<Shell> Shell::GetShell() const {
Contributor

WeakPtrs are only safe for access on the thread they were created on. It looks like there are violations of this in several places in this code.

It is also concerning that this is getting used from the destructor of the shell in post-task calls; as in, it seems like there's some attempt to reference the memory of an object whose destructor has potentially completed.

Contributor Author

@SupSaiYaJin SupSaiYaJin commented Jun 17, 2019

What do you think of having AndroidShellHolder and EmbedderEngine hold the Shell with std::shared_ptr instead of std::unique_ptr, and letting Shell extend std::enable_shared_from_this? Then we could use std::weak_ptr::lock() in each closure to access the Shell.
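A minimal sketch of this proposal, with illustrative names (SharedShell, PostWork, DoWork are not engine API). Each posted closure captures a weak_ptr and locks it before use; a closure that outlives the shell then sees null instead of freed memory:

```cpp
#include <memory>
#include <thread>

class SharedShell : public std::enable_shared_from_this<SharedShell> {
 public:
  std::thread PostWork() {
    std::weak_ptr<SharedShell> weak = weak_from_this();
    return std::thread([weak] {
      if (auto shell = weak.lock()) {  // null if the shell was destroyed
        shell->DoWork();
      }
    });
  }
  void DoWork() { work_done_ = true; }
  bool work_done() const { return work_done_; }

 private:
  bool work_done_ = false;
};

bool WeakCaptureSketch() {
  auto shell = std::make_shared<SharedShell>();
  std::thread worker = shell->PostWork();
  worker.join();
  if (!shell->work_done()) return false;

  // Once the last shared_ptr is released, a later lock() simply yields null.
  std::weak_ptr<SharedShell> weak = shell;
  shell.reset();
  return weak.lock() == nullptr;
}
```

Note this protects against use-after-free but not against using the shell on the wrong thread; fml::WeakPtr's thread-affinity check exists precisely to catch that second class of bug.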

Contributor

I think for that scale of change it'd be really nice if we had a design document that outlined the rationale for doing this, its impact, and which alternative approaches were rejected. It would help give the engine team a chance to evaluate what's becoming a fairly large change, and allow for more comments/refinement than we can likely achieve in a GitHub pull request format.

- fml::MakeCopyable([engine = std::move(engine_), &ui_latch]() mutable {
-   engine.reset();
+ fml::MakeCopyable([shell = GetShell(), &ui_latch]() mutable {
+   shell->vm_->GetServiceProtocol()->RemoveHandler(shell.get());
Contributor

This seems very dangerous. We're almost definitely going to see situations where the shell pointer is being used on the wrong thread and after the destruction of the object.

Contributor Author

Could we use [this] directly?

Contributor

I'm not really sure how that helps anything. We'd still be using a pointer to the shell at a point where its destructor has finished.

@dnfield
Contributor

dnfield commented Jun 15, 2019

I haven't thoroughly reviewed the whole patch, but the last couple comments I left seem like pretty serious design concerns to me. What do you think?

@cbracken
Member

@SupSaiYaJin have you had a chance to take a look at @dnfield 's comments?

@xster
Member

xster commented Jun 29, 2019

From offline conversations, I believe this PR is more meant as an illustration of inefficiencies that can exist during the engine startup. Since this is so central to everything downstream, it could be sensible for our engine team to pick up (the way we did with flutter/flutter#25075).

@SupSaiYaJin can you open an issue with a more quantitative description of this problem: how it occurs and, from testing or live data, how much this patch affects loading time?

@eseidelGoogle
Contributor

@gaaclarke has been looking at startup performance (however is unfortunately out of the office at the moment).

@dnfield
Contributor

dnfield commented Jul 26, 2019

I've poked at this a bit more. It seems like it would be really nice to move the VM startup somewhere that doesn't block the platform thread. To do that, we'd need to kill the shell constructor that takes a VM, which is currently depended on by flutter_runner but doesn't need to be.

We'd also have to kill off the shell vending a reference to the VM, or just make it pass through to the engine and block if the engine is still creating the VM.

That seems like it should be safe to me, but I'm a little fuzzy on how that might impact the rest of the ivars in the shell that might want the VM to be around by initialization. We'd have to figure out a good way to test that they can tolerate the VM starting later than it currently does from their perspective.

@gaaclarke
Member

gaaclarke commented Jul 26, 2019

@eseidel You've got me for one more day!

Thanks @SupSaiYaJin. I read through the diff, here is my summary of what you did:

You unlatched synchronization points inside the setup of Shells to allow certain initialization tasks to happen in parallel. You then replaced all getters for the initialized instance variables with futures, which delay synchronization to the last possible moment.

The problem I have with it is that it is pretty invasive. I'd have to see some hard evidence that the gains merit the risks. Also, accessing the instance variables incurs a cost in perpetuity.

How about this as an alternative: #10182 ? I think this will get the majority of the performance boost you achieved without having to rip up so much. We just shuffle around our synchronization points so they are closer to where they need to be instead of forcing things to happen serially.

I'll have to think this through a bit more to make sure it's safe. I'd like @chinmaygarde's take too.

edit: I mentioned in that PR that the code would look cleaner if it used futures; maybe that's something you'd like to refactor, @SupSaiYaJin, while I'm on vacation? Provided we are comfortable with this change.

@liyuqian
Contributor

It would be nice to attach the shell_benchmarks results so we can compare different approaches. Unfortunately, we don't yet run those benchmarks on our bots, so we don't collect the measurements automatically (flutter/flutter#34746). Hence, having the benchmark data in the PR description would be useful. (CC @gaaclarke for reference after vacation 😄)

@cbracken
Member

Thanks for your contribution. Since we haven't heard back for a while on this, and it's currently got merge conflicts, I'm going to close this PR for now. Please don't hesitate to comment on the PR if you object to the approach @gaaclarke has proposed with #10182; we will reopen it right away!

@cbracken cbracken closed this Sep 16, 2019