This repository was archived by the owner on Feb 25, 2025. It is now read-only.

boost shell startup #8489

Closed
wants to merge 2 commits into from

Conversation

SupSaiYaJin
Contributor

@SupSaiYaJin SupSaiYaJin commented Apr 8, 2019

The main idea is to release the latch between the platform thread and the UI thread. These threads are always very busy during startup, especially in a large, complex app.
Looking forward to discussing this with you all.

@SupSaiYaJin
Contributor Author

SupSaiYaJin commented Apr 10, 2019

  1. Create the DartVM on the UI thread instead of the platform thread.
  2. Don't wait for engine creation when creating the shell; create a UI latch for it instead.
  3. Check the UI latch whenever the engine needs to be acquired.
  4. Move the ServiceProtocol setup to the UI thread.
  5. When the platform view is created, run the GPU task before the UI task, and don't wait for the UI task.

After this change, our app's time to first frame is almost twice as fast as before.
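The shape of the change can be sketched as follows. This is a minimal illustration, not the engine's actual code: `Latch` here is a hand-rolled stand-in for an fml-style latch, and `Engine` is a placeholder type. The point is that the platform thread no longer blocks while the UI thread does expensive setup; it only waits at the last moment, when the engine is actually needed.

```cpp
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Minimal one-shot latch, written out so the sketch is self-contained.
class Latch {
 public:
  void Signal() {
    std::lock_guard<std::mutex> lock(mutex_);
    signaled_ = true;
    cv_.notify_all();
  }
  void Wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return signaled_; });
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool signaled_ = false;
};

struct Engine {};  // hypothetical stand-in for the real Engine

// The platform thread kicks off engine creation on the UI thread and
// continues with its own setup; the latch is only waited on when the
// engine is first required.
bool StartupSketch() {
  Latch engine_ready;
  std::unique_ptr<Engine> engine;

  std::thread ui_thread([&] {
    engine = std::make_unique<Engine>();  // expensive setup on the UI thread
    engine_ready.Signal();
  });

  // ... platform-thread setup proceeds here instead of blocking ...

  engine_ready.Wait();  // deferred synchronization point
  bool engine_available = engine != nullptr;
  ui_thread.join();
  return engine_available;
}
```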

@dnfield
Contributor

dnfield commented Apr 11, 2019

Well, for what it's worth, this does seem to speed things up in some iOS add2app stuff I have.

That said, it looks like some of the comments about the order of things might need to be updated a bit. Chinmay or Jason will also be able to check the correctness of this change better than I can.

@cbracken
Member

/cc @chinmaygarde for further review

@cbracken
Member

@chinmaygarde have you had a chance to have a look at this? If so, any feedback?

Contributor

@dnfield dnfield left a comment

I talked offline with @chinmaygarde about this. It's not thread safe internally to the engine.

This is a very interesting idea though. If we could somehow make all of the Engine class unique_ptrs waitable/thread-safe, we could probably experiment with doing this. I'd love to see this patch updated to do that, but if not, Chinmay and I discussed some ideas to make it happen, and one of us or someone else will get to it (a lot of people want this problem improved).

@dnfield
Contributor

dnfield commented May 14, 2019

To elaborate a little bit more on this we'd want to make sure access to https://github.com/flutter/engine/blob/master/shell/common/shell.h#L91-L94 is thread safe. Right now that's being taken care of by the thread jumps and waits - if we remove that, we have to make sure it's safe not just for embedder consumers but also inside the shell itself.

@cbracken
Member

cbracken commented Jun 1, 2019

@SupSaiYaJin do you plan to followup with changes to address the thread-safety issue?

@SupSaiYaJin
Contributor Author

To elaborate a little bit more on this we'd want to make sure access to https://github.com/flutter/engine/blob/master/shell/common/shell.h#L91-L94 is thread safe. Right now that's being taken care of by the thread jumps and waits - if we remove that, we have to make sure it's safe not just for embedder consumers but also inside the shell itself.

Can you provide some specific ideas? I will try to implement them.

@dnfield
Contributor

dnfield commented Jun 5, 2019

We'd have to make all four pointers the shell holds guarded like this, perhaps as futures, so that internal access within the class is safe and only happens when the necessary parts are set up.
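A minimal sketch of this guarded-pointer idea, using std::promise/std::shared_future and a stand-in Engine type (none of these names are the engine's actual API). The shell holds a shared_future instead of a bare unique_ptr, so any caller, on any thread, blocks until setup has finished rather than racing against it or observing a null pointer:

```cpp
#include <future>
#include <memory>
#include <thread>

struct Engine {};  // hypothetical stand-in for the real Engine

class GuardedShell {
 public:
  GuardedShell() : engine_future_(engine_promise_.get_future().share()) {}

  // Called exactly once, from whichever thread finishes creating the engine.
  void SetEngine(std::unique_ptr<Engine> engine) {
    engine_promise_.set_value(std::move(engine));
  }

  // Safe to call from any thread; blocks until the engine exists.
  Engine* GetEngine() { return engine_future_.get().get(); }

 private:
  std::promise<std::unique_ptr<Engine>> engine_promise_;
  std::shared_future<std::unique_ptr<Engine>> engine_future_;
};

bool GuardedAccessSketch() {
  GuardedShell shell;
  std::thread ui([&shell] { shell.SetEngine(std::make_unique<Engine>()); });
  Engine* engine = shell.GetEngine();  // blocks until the UI thread is done
  ui.join();
  return engine != nullptr;
}
```

The trade-off gaaclarke raises later in the thread applies here: every access pays for the future's synchronization even after setup is long finished.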

@SupSaiYaJin
Contributor Author

@dnfield Do you mean using std::future?
I think only Engine needs to be guarded; the others will be set up early.

@dnfield
Contributor

dnfield commented Jun 12, 2019

Yes on std::future

They really all need it, for internal consistency and safety. Embedders won't access the others, but within the engine there's nothing stopping that, and we may already have clients that do.

@SupSaiYaJin
Contributor Author

@dnfield I have made a large change using std::shared_future; could you please check it?
I don't know why the checks are failing.

@dnfield
Contributor

dnfield commented Jun 14, 2019

The test that's failing is one that's checking thread safety.

  Shell::CreateCallback<PlatformView> on_create_platform_view,
  Shell::CreateCallback<Rasterizer> on_create_rasterizer) {
    if (!task_runners.IsValid()) {
      FML_LOG(ERROR) << "Task runners to run the shell were invalid.";
      return nullptr;
    }

-   auto shell =
-       std::unique_ptr<Shell>(new Shell(std::move(vm), task_runners, settings));
+   auto shell = std::unique_ptr<Shell>(new Shell(task_runners, settings));
Contributor

I think we need to tell the shell about how to get a VM at this point. The platform view may need it for some embedding implementations.

Contributor

Also nit: this can just be auto shell = std::make_unique<Shell>(task_runners, settings);

  }

- DartVM* Shell::GetDartVM() {
-   return &vm_;
+ fml::WeakPtr<Shell> Shell::GetShell() const {
Contributor

WeakPtrs are only safe for access on the thread they were created on. It looks like there are violations of this in several places in this code.

It is also concerning that this is getting used from the destructor of the shell in post-task calls; as in, it seems like there's some attempt to reference the memory of an object whose destructor has potentially completed.

Contributor Author

@SupSaiYaJin SupSaiYaJin commented Jun 17, 2019

What do you think of having AndroidShellHolder and EmbedderEngine hold the Shell with std::shared_ptr instead of std::unique_ptr, and letting Shell extend std::enable_shared_from_this? Then we could use std::weak_ptr::lock() in each closure to access the Shell.
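A minimal sketch of this proposal, with illustrative names (SharedShell, PostWork, DoWork are not engine API). Each posted closure captures a weak_ptr and locks it before use; a closure that outlives the shell then sees null instead of freed memory:

```cpp
#include <memory>
#include <thread>

class SharedShell : public std::enable_shared_from_this<SharedShell> {
 public:
  std::thread PostWork() {
    std::weak_ptr<SharedShell> weak = weak_from_this();
    return std::thread([weak] {
      if (auto shell = weak.lock()) {  // null if the shell was destroyed
        shell->DoWork();
      }
    });
  }
  void DoWork() { work_done_ = true; }
  bool work_done() const { return work_done_; }

 private:
  bool work_done_ = false;
};

bool WeakCaptureSketch() {
  auto shell = std::make_shared<SharedShell>();
  std::thread worker = shell->PostWork();
  worker.join();
  if (!shell->work_done()) return false;

  // Once the last shared_ptr is released, a later lock() simply yields null.
  std::weak_ptr<SharedShell> weak = shell;
  shell.reset();
  return weak.lock() == nullptr;
}
```

Note this protects against use-after-free but not against using the shell on the wrong thread; fml::WeakPtr's thread-affinity check exists precisely to catch that second class of bug.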

Contributor

I think for that scale of change it'd be really nice if we had a design document that outlined the rationale for doing this, its impact, and which alternative approaches were rejected. It would help give the engine team a chance to evaluate what's becoming a fairly large change, and allow for more comments/refinement than we can likely achieve in a GitHub pull request format.

- fml::MakeCopyable([engine = std::move(engine_), &ui_latch]() mutable {
-   engine.reset();
+ fml::MakeCopyable([shell = GetShell(), &ui_latch]() mutable {
+   shell->vm_->GetServiceProtocol()->RemoveHandler(shell.get());
Contributor

This seems very dangerous. We're almost definitely going to see situations where the shell pointer is being used on the wrong thread and after the destruction of the object.

Contributor Author

Could we use [this] directly?

Contributor

I'm not really sure how that helps anything. We'd still be using a pointer to the shell at a point where its destructor has finished.

@dnfield
Contributor

dnfield commented Jun 15, 2019

I haven't thoroughly reviewed the whole patch, but the last couple comments I left seem like pretty serious design concerns to me. What do you think?

@cbracken
Member

@SupSaiYaJin have you had a chance to take a look at @dnfield 's comments?

@xster
Member

xster commented Jun 29, 2019

From offline conversations, I believe this PR is more meant as an illustration of inefficiencies that can exist during the engine startup. Since this is so central to everything downstream, it could be sensible for our engine team to pick up (the way we did with flutter/flutter#25075).

@SupSaiYaJin can you open an issue with a more quantitative description of this problem: how it occurs and, from testing or live data, how much this patch affects loading time?

@eseidelGoogle
Contributor

@gaaclarke has been looking at startup performance (however is unfortunately out of the office at the moment).

@dnfield
Contributor

dnfield commented Jul 26, 2019

I've poked at this a bit more. It seems like it would be really nice to move the VM startup somewhere that doesn't block the platform thread. To do that, we'd need to kill the shell constructor that takes a VM, which is currently depended on by flutter_runner but doesn't need to be.

We'd also have to kill off the shell vending a reference to the VM, or just make it pass through to the engine and block if the engine is still creating the VM.

That seems like it should be safe to me, but I'm a little fuzzy on how that might impact the rest of the ivars in the shell that might want the VM to be around by initialization. We'd have to figure out a good way to test that they can tolerate the VM starting later than it currently does from their perspective.

@gaaclarke
Member

gaaclarke commented Jul 26, 2019

@eseidel You've got me for one more day!

Thanks @SupSaiYaJin. I read through the diff, here is my summary of what you did:

You unlatched synchronization points inside the setup of Shells to allow certain initialization tasks to happen in parallel. You then replaced all getters for the initialized instance variables with futures, which delay synchronization to the last possible moment.

The problem I have with it is that it is pretty invasive. I'd have to see some hard evidence that the gains merit the risks. Also, accessing the instance variables incurs a cost in perpetuity.

How about this as an alternative: #10182 ? I think this will get the majority of the performance boost you achieved without having to rip up so much. We just shuffle around our synchronization points so they are closer to where they need to be instead of forcing things to happen serially.

I'll have to think this through a bit more to make sure it's safe. I'd like @chinmaygarde's take too.

edit: I mentioned in that PR that the code would look cleaner if it used futures; maybe that's something you'd like to refactor, @SupSaiYaJin, while I'm on vacation? Provided we are comfortable with this change.

@liyuqian
Contributor

It would be nice to attach the shell_benchmarks results so we can compare different approaches. Unfortunately, we don't yet run those benchmarks on our bots, so we don't collect the measurements automatically (flutter/flutter#34746). Hence, having the benchmark data in the PR description would be useful. (CC @gaaclarke for reference after vacation 😄)

@cbracken
Member

Thanks for your contribution. Since we haven't heard back for a while on this, and it's currently got merge conflicts, I'm going to close this PR for now. Please don't hesitate to comment on the PR if you object to the approach @gaaclarke has proposed with #10182; we will reopen it right away!

@cbracken cbracken closed this Sep 16, 2019