Skip to content

Promotion not always getting into a local function #50573

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jensjoha opened this issue Nov 29, 2022 · 3 comments
Closed

Promotion not always getting into a local function #50573

jensjoha opened this issue Nov 29, 2022 · 3 comments

Comments

@jensjoha
Copy link
Contributor

This doesn't work:

main() {
  foo();
}

void foo([int? x]) {
  x ??= 42;
  print(x + 1); // <- not an error; x is non-null because of the above line.
  void y() {
    print(x + 1); // <-- error because I don't know.
  }
  y();
}

But this does:

main() {
  foo();
}

void foo([int? x]) {
  if (x != null) {
    print(x + 1);
    void y() {
      print(x + 1);
    }
    y();
  }
}

But this doesn't:

main() {
  foo();
}

void foo([int? x]) {
  x ??= 42;
  if (x != null) {
    print(x + 1);
    void y() {
      print(x + 1); // <- error
    }
    y();
  }
}

and on the last example if I analyze it with the analyzer I get this:

Analyzing t.dart...

  error - t.dart:10:15 - The operator '+' can't be unconditionally invoked because the receiver can be 'null'. Try adding a null check to the target ('!'). -
          unchecked_use_of_nullable_value
   info - t.dart:7:9 - The operand can't be null, so the condition is always true. Remove the condition. - unnecessary_null_comparison

2 issues found.

where I'm told that on line 7 it can't be null so the 'if'-check is unnecessary (which is true), but on line 10 I'm told that I can't use it because it can be null (it can't).

I was told by @johnniwinther (about the first example) that

Flow analysis conservatively assumes that a local function can be called at any point in time. This means that it doesn't know if x is read before or after the x ??= 42 assignment. For the code at hand, an improvement would be to preserve promotions the occur prior to local function declaration when the promoted local is not assigned to after the declaration.

Can we do better?
/cc @lrhn @stereotype441

@lrhn
Copy link
Member

lrhn commented Nov 29, 2022

I hope we can. I'll defer to @stereotype441 about whether we actually can.

It should be "obvious" that a function cannot be called before its declaration. It can be called at any later time. (We do not try to determine whether the function can escape. If it does, "any later time" must mean until the end of the program, not just until the end of the declaring function. And since we don't know, we have to assume the worst case.)

The bigger problem is guaranteeing that every code path from that declaration to the end of the program will never assign a value to variables that the function refers to, which do not have the type they were promoted to.
That's a non-local property of the code after the declaration, which means that the type of the code in the body of the function cannot be type-analyzed until the rest of the scope of the variables it closes over has. And probably vice-versa, in case you do call the function immediately.
And for each of those variables, we have to decide whether they are ever, possibly, assigned to in a way which demotes. If they are, the function cannot assume those variables have any type more precise than the lowest it's ever demoted to.

But what defines demotion?
Arguably, the variable x is demoted back to int? after then if statement, where the false branch joins the true branch.
In this example, there is no code there. If there had been any code, we'd have had to figure out whether that code can possibly call y. We'd have to assume it can - again, no escape analysis, no side-effect analysis, any code must be assumed to be able to trigger arbitrary user code.

Does that mean that y might get called in code where x has been demoted? No, because it's only really demoted along the false branch, and y only exists along the true branch.
We should probably check whether there is any assignment to x on any path downstream from the declaration of y, and if any of those declarations demote x. But not actually demote. They might promote the already-demoted-but-not-reassigned variable, or do nothing at all. What matters is whether the assignment preserves the promotion of the captured variables.

(If any closure contains an assignment to x, things get even harder, because then we also don't know when assignments happen. We might know which types are assigned to the variable, but again, we need to do type inference in some order, so we don't always have all the information.)

Since the second example works, my guess is that we are already doing something, but it takes very little to throw us off course. Possibly the problem in the third example is that the if statement is not the one doing the promotion, because you promoted the variable just before, and our algorithm somehow misses that that's better information. Same reasoning might apply to the first example - what we do only works if we promote in a branch.

@stereotype441
Copy link
Member

@jensjoha

Can we do better? /cc @lrhn @stereotype441

Yeah. This is a common complaint and a lot of people would like it to be improved. The canonical tracking bug for this request is dart-lang/language#1536. I'm absolutely open to improving this; it's just a matter of prioritizing it against other work, and this year when I looked into possible flow analysis improvements to take on, I decided that field promotion would give a bigger benefit. I'll add a note to that issue to point to this one, so that when we look at prioritizing flow analysis work for next year, we'll be reminded that this came up again.

FWIW, it's not a trivial fix to do, because type analysis happens in a two fixed passes: one pass where we resolve each bare identifier to its declaration, and a second pass where we assign a type to everything. (There are some assumptions pretty deeply baked in to the analyzer and front end that would make it difficult to expand this to more than two passes). In the front end the first pass happens during parsing, so we have some additional limitations on what data we can collect during that pass. At the moment, the data we collect is approximately: for every block, what variables does it assign to and/or write capture? Returning to your first example:

void foo([int? x]) {
  x ??= 42;
  print(x + 1); // <- not an error; x is non-null because of the above line.
  void y() {
    print(x + 1); // <-- error because I don't know.
  }
  y();
}

During the first pass, the only information we collect about x is that there is an assignment to it somewhere inside the outermost block. Then, during the second pass, when we reach the declaration of void y(), all we know is that there's a write to x somewhere, but we don't know that that write has definitely occurred in the past. So we don't know whether it's safe to preserve the promotion or not.

The way I would approach improving this is to change the first pass so that instead of keeping track of information on a block-by-block basis, it builds up a simplified graph of the control flow in the function, and then I'd add a step in between the two passes that analyzes that simplified graph to determine which variables need to be demoted at the beginning of each block. So for your example above, the first pass would essentially model the code as:

declare x
write to x
create closure y

And it would see that at the time y is created, all assignments to x are in the past (and there are no loops allowing those assignments to be reachable again), so there is no need to demote x.

@lrhn

Since the second example works, my guess is that we are already doing something, but it takes very little to throw us off course.

Correct. What's happening in the second example is that we see that there are no assignments to x at all, so we know that there is no danger of its value changing between the point when the closure y is created and when it's called.

Possibly the problem in the third example is that the if statement is not the one doing the promotion, because you promoted the variable just before, and our algorithm somehow misses that that's better information. Same reasoning might apply to the first example - what we do only works if we promote in a branch.

Actually it's not about what does the promotion, it's simply that in both the first and third examples, there is an assignment to x, and we don't do sufficient temporal reasoning to see that those assignments can only precede the creation of the closure y.

@stereotype441
Copy link
Member

Closing this issue in favor of the canonical issue dart-lang/language#1536. Let's have any further discussion there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants