Skip to content

Flow analysis can be used to escalate mixed mode unsoundness #1143

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
stereotype441 opened this issue Aug 10, 2020 · 36 comments
Closed

Flow analysis can be used to escalate mixed mode unsoundness #1143

stereotype441 opened this issue Aug 10, 2020 · 36 comments
Assignees

Comments

@stereotype441
Copy link
Member

For a very long time, I thought that the weakening of soundness in mixed mode was limited to nullability. That is, I thought that in mixed mode:

  • within an opted out library, the result of evaluating an expression having static type T would always be a value whose runtime type is a subtype of T (because T contains the *s necessary to permit nulls in the appropriate places)
  • within an opted in library, the result of evaluating an expression having static type T would always be a value whose runtime type is a subtype of T', where T' is formed from T by adding/removing *s to T in appropriate positions.

But that turns out not to be the case. The following program breaks soundness by allowing a String to be assigned to a variable of type int. Both the analyzer and CFE accept it without issuing any errors:

test.dart (opted out):

// @dart=2.6

import 'opted_in.dart';

T violateSoundness<T>(dynamic x) => accomplice<T>(x, null);

main() {
  String s = 'soudness, my old nemesis, we meet again!';
  int i = violateSoundness<int>(s);
  print(i.abs());
}

opted_in.dart:

// @dart=2.10

T accomplice<T>(Object? x, int i) {
  if (x is! T) {
    if (i is int) throw 0;
  }
  return x;
}

What's happening is that flow analysis considers the implicit "else" branch of if (i is int) to promote the type of i to Never, hence it considers it unreachable. So it considers the flow state after if (i is int) throw 0; to be unreachable, and hence it considers if (x is! T) { if (i is int) throw 0; } to promote the type of x to T. So it believes that the return x; at the end of accomplice is sound.

@stereotype441
Copy link
Member Author

stereotype441 commented Aug 10, 2020

I believe the crux of the problem is that the inherent unsoundness of mixed mode is sufficient to cause flow analysis to incorrectly conclude that a flow control path is unreachable, and since it uses unreachability to discard control flow paths in joins, that means it can be made to conclude that any type promotion whatsoever survives beyond a join point.

I believe we could fix this by limiting the circumstances in which flow analysis concludes that code is unreachable. Flow analysis only concludes that code is unreachable when there is a clear transfer of control flow (e.g. an explicit return) or an expression has static type Never. We could change this so that it only concludes that code is unreachable when there is a clear transfer of control flow, or an expression has a type of Never and takes a form in which we are confident that null couldn't possibly leak in from an opted-out library (e.g. an explicit throw or a call to a static method declared in an opted-in library). We would also have to decide not to fix dart-lang/sdk#41985 (otherwise we would create similar loopholes with if (i != null)).

Side note: as an alternative, it is tempting to try to simply reduce the circumstances in which an expression can have a type of Never to just those circumstances where we are confident that the control flow is unreachable even in mixed mode. But I've been trying to think through this and I think there are too many holes in the mixed-mode type system to be able to do it. Beyond the possibility of the "factor" algorithm returning Never (causing a variable's type to be promoted to Never when in fact it might have a value of null), we have these circumstances, and possibly others I hvaen't thought of:

  • An opted-in library might declare a variable of type Never Function(), but then an opted-out library might store a function having type Never* Function() in that variable.
  • An opted-in library might declare an interface with a method having type Never Function(), but then an opted-out library might implement that interface with a method having type Never* Function().
  • An opted-in library might declare a container class C<T> with a method having type T Function(), and instantiate it with a type argument of Never, but then pass it to an opted-out library that treats it as C<Never*> and stores a value of null in it; then control returns to the opted-in library, which calls the method and receives the null result.

@stereotype441
Copy link
Member Author

@leafpetersen pointed out to me over IM that another possible solution would be to enforce soundness by having the compiler insert an implicit throw any time flow analysis determines that a code path is unreachable. Note that we would only have to do this in weak mode, and we wouldn't have to do it when the unreachability is due to explicit control flow, so hopefully the impact on code size could be kept reasonably low.

This would probably be a nicer experience for users than what I proposed in my comment above, because instead of saying to users "we're treating your obviously unreachable code path as 'technically reachable due to type system mumbo jumbo'", we'll just be respecting user intent by giving them a runtime check of a null safety condition that they're already asking for in their declared types. They probably won't even notice or care that the exception came from an esoteric corner of flow analysis.

@eernstg
Copy link
Member

eernstg commented Aug 11, 2020

One main element of the soundness violation here is of course flow analysis. I agree that throwing on every occasion where an unreachable path is actually reached will safely remove any soundness issues caused by reaching it. But this may have a substantial cost in terms of the size of the generated code. Do we have an easy and safe way to eliminate most unreachable paths from this extra step, or would it apply to all unreachable paths?

Another thing is that the story currently says "dynamic type tests and type checks will behave as in legacy code", and that's no longer true because we have to add "... , and the very act of reaching certain locations will now throw (because they would be unreachable with sound null safety)". I'm not sure that's a very reassuring story for developers to work with.

But another element in the soundness violation is the fact that a declaration int i does not ensure that i is int evaluates to true in code with null safety in a mixed program (i is int can of course be false with int i in legacy code). And that is only possible because the value of i is null.

We're reserving the right to insert assertions doing dynamic null checks on parameters, and we're of course reserving the right to have a dynamic error for i.isEven in code with null safety when i is null.

This means that it is not inconsistent to throw because int i is null when we reach the type test.

So we could compile i is int to (i as int) is int. That is: In an is expression whose true or false continuation promotes a variable to Never, we perform the dynamic check which is justified by the static type of the variable at this point, thus ensuring that the computed reachability is sound.

I believe this allows for a simpler story: "An is expression may have a dynamic error because the left operand evaluates to null even though the expression has a strictly non-nullable type". It also seems easier to ensure that this approach doesn't cause program size to go up too much.

Don't you think this will catch all sources of soundness violations that have been raised here?

@stereotype441
Copy link
Member Author

@eernstg

One main element of the soundness violation here is of course flow analysis. I agree that throwing on every occasion where an unreachable path is actually reached will safely remove any soundness issues caused by reaching it. But this may have a substantial cost in terms of the size of the generated code. Do we have an easy and safe way to eliminate most unreachable paths from this extra step, or would it apply to all unreachable paths?

I'm not too worried about the code size bloat issue. We would not have to apply it to all unreachable paths, only to the ones where flow analysis is imposing its own type-based notion of unreachability on top of the explicit control flow that's already present. AFAICT that only happens in four circumstances:

  1. When a variable's type is promoted to Never by an is check. For practical real-world code, this should only happen if the user is deliberately trying to check for mixed mode unsoundness (e.g. by doing i is int on a variable whose static type is non-nullable int). I think this is going to be really weird and esoteric.
  2. If we continue with the fix to Flow analysis should treat 's != null' as true if s is non-nullable sdk#41985, when a variable's type is promoted to Never by an == null check. Again, this should only happen in the weird and esoteric case where the user is deliberately trying to check for mixed mode unsoundness (e.g. by doing i != null on a variable whose static type is non-nullable int).
  3. When there is an expression having a type of Never that isn't simply a throw expression. For practical real-world code, this should only happen if you're calling a method whose sole purpose is to throw an exception or exit the program (e.g. the fail method in package:test). I think this is rare enough outside of tests that we don't need to worry about it imposing a big code size cost.
  4. When a switch statement covers all possible values of an enum except null. I suspect this is going to be the case that pops up most frequently, but it's so common to see users do an explicit throw statement in this circumstance that I can't imagine adding it automatically would be much worse in terms of code size.

...

So we could compile i is int to (i as int) is int. That is: In an is expression whose true or false continuation promotes a variable to Never, we perform the dynamic check which is justified by the static type of the variable at this point, thus ensuring that the computed reachability is sound.

I believe this allows for a simpler story: "An is expression may have a dynamic error because the left operand evaluates to null even though the expression has a strictly non-nullable type". It also seems easier to ensure that this approach doesn't cause program size to go up too much.

Don't you think this will catch all sources of soundness violations that have been raised here?

I think this is equivalent to what I said in cases 1 and 2 above, and I agree that it's unlikely to cause program size to go up too much. It doesn't catch everything, though. We still would need to cover cases 3 and 4 to close the soundness hole.

@ds84182
Copy link

ds84182 commented Aug 11, 2020

Unless I'm mistaken, in mixed soundness mode shouldn't opted-in libraries do a runtime check in their entrypoints (like the covariant modifier)?

Essentially, accomplice should behave more like T accomplice<T>(Object? x, covariant int i as int?) so to opted-out code its T Function<T>(Object?, int?) and to opted-in code its T Function<T>(Object?, int)

The unsoundness certainly begins at the function entrypoint, because i is not an int.

@munificent
Copy link
Member

munificent commented Aug 11, 2020

4. When a switch statement covers all possible values of an enum except null. I suspect this is going to be the case that pops up most frequently, but it's so common to see users do an explicit throw statement in this circumstance that I can't imagine adding it automatically would be much worse in terms of code size.

This sparked an idle curiosity around which control flow constructs are most common. I happened to already have a scratchpad up to let me answer that easily so, going by the Dart SDK, Flutter repo, and 1,000 more recent pub packages, it's:

--- Statements (393495 total) ---
 296365 ( 75.316%): if        ****************************************************************************
  57061 ( 14.501%): if else   ***************
  19320 (  4.910%): for in    *****
  10032 (  2.549%): for ;;    ***
   6054 (  1.539%): switch    **
   4155 (  1.056%): while     **
    508 (  0.129%): do while  *
Files: 41385
Lines: 13036917

cc @pq and @bwilkerson for code-scraping example reasons. :)

@leafpetersen
Copy link
Member

Unless I'm mistaken, in mixed soundness mode shouldn't opted-in libraries do a runtime check in their entrypoints (like the covariant modifier)?

Essentially, accomplice should behave more like T accomplice<T>(Object? x, covariant int i as int?) so to opted-out code its T Function<T>(Object?, int?) and to opted-in code its T Function<T>(Object?, int)

The unsoundness certainly begins at the function entrypoint, because i is not an int.

In this simple example, this could be caught (and we do provide an assertion mode which will stop this particular example). In general though, catching all unsoundness is both much harder, and in contradiction to our goal of making migration non-breaking to non-migrated code, and so we have chosen to make mixed mode execution unsound with respect to nullability. It is a problem though that flow analysis lets this nullability unsoundness be leveraged up to a full unsoundness, and so we will need to resolve this. My inclination at the moment is to add the appropriate unreachability assertions - I think this overall will be a positive for users anyway.

@eernstg
Copy link
Member

eernstg commented Aug 12, 2020

@stereotype441 wrote:

We still would need to cover cases 3 and 4 to close the soundness hole.

Right, so throwing at unreachable locations is the most promising approach, and apparently not too costly.

@stereotype441
Copy link
Member Author

stereotype441 commented Aug 12, 2020

Great! Elaborating a bit more on where throws would have to be inserted, here are some examples that illustrate what I'm thinking of. I think this covers all the soundness holes. Hopefully this can be a seed for someone (probably @munificent) to write up test cases.

promotionToNever(int i) {
  if (i is int) return; // Should throw if `i` is null
}
unnecessaryNullCheck(int f()) {
  if (f() != null) return; // Should throw if `f` returns null
}
unnecessaryIfNull(int f(), int g()) {
  return f() ?? g(); // Should throw if `f` returns null (rather than calling `g`)
}
unnecessaryIfNullAssign(List<int> x, int f()) {
  x[0] ??= f(); // Should throw if `x[0]` returns null (rather than calling `f`)
}
unnecessaryNullAwareAccess(int f()) {
  f()?.gcd(throw ...); // Should throw if `f` returns null
}
callReturningNever(Never f()) {
  f(); // Should throw if `f` completes normally
}
switchOnBool(bool b) {
  switch (b) {
    case true: return;
    case false: return;
  } // Should throw if the implicit `default` branch is taken
}
enum E { e1, e2 }
switchOnEnum(E e) {
  switch (e) {
    case E.e1: return;
    case E.e2: return;
  } // Should throw if the implicit `default` branch is taken
}

Notes:

  • Test cases for promotionToNever should be sure to also cover the form i is! int.
  • Test cases for unnecessaryNullCheck should be sure to also cover the forms null != f(), f() == null, and null == f().
  • Technically it's not necessary for is and == null checks to throw if they don't get used for branching (e.g. bool b = i is int;), but probably we should do so anyway for consistency (and I think it's easier to implement that way anyhow).
  • Test cases for unnecessaryIfNullAssign should be sure to cover various different forms that can appear on the LHS of ??=, especially those involving null shorting.
  • Test cases for callReturningNever should be sure to cover instance method calls, static method calls, top level function calls, local function calls, operator invocations, and various kinds of getter invocations.
  • Test cases for callReturningNever should cover cases where the call isn't a statement, e.g. List<Never> x = [f()];

Regarding how to implement this (CC @johnniwinther), I'm thinking that we make the following modifications to the FlowAnalysis API:

  • For promotionToNever, isExpression should return a boolean indicating whether the implementation should throw if the "is" check is not satisfied.
  • For unnecessaryNullCheck, equalityOp_end should return a boolean indicating whether the implementation should throw if the operands are equal.
  • For unnecessaryIfNull and unnecessaryIfNullAssign, ifNullExpression_rightBegin should be given the type of the LHS as an additional argument, and it should return a boolean indicating whether the implementation should throw if the LHS evaluated to null. (Note: we could just let the CFE handle this all by itself, but I want to improve flow analysis so that it takes into account the type of the LHS, so this is a step in that direction).
  • For unnecessaryNullAwareAccess, nullAwareAccess_rightBegin should be given the type of the target as an additional argument, and it should return a boolean indicating whether the implementation should throw if the target evaluated to null. (Note: we could just let the CFE handle this all by itself, but I want to improve flow analysis so that it takes into account the type of the target, so this is a step in that direction).
  • For callReturningNever, this should be handled entirely in the CFE, in the logic where it decides whether to call handleExit based on an expression type.
  • For switchOnBool and switchOnEnum, these should be handled entirely in the CFE, in the logic where it decides whether to pass isExhaustive=true to switchStatement_end.

And then it would be up to the CFE and back end teams to decide whether to implement the behaviors in the back end or by transforming the kernel. WDYT?

@stereotype441
Copy link
Member Author

Side note: currently flow analysis ignores is checks where the LHS is anything other than a variable. So if i is a local variable with static type int, the "false" branch of i is int is considered unreachable; but if f is a function returning int, the "false" branch of f() is int is considered reachable. Maybe for consistency they both should be considered unreachable. If so, then the promotionToNever case above would have to be generalized to arbitrary expressions, e.g.:

promotionToNever(int f()) {
  if (f() is int) return; // Should throw if `f` returns null
}

@johnniwinther
Copy link
Member

The plan in #1143 (comment) sounds good.

@stereotype441 stereotype441 self-assigned this Aug 25, 2020
@stereotype441
Copy link
Member Author

@munificent's test cases have landed; I will work on the flow analysis logic described in #1143 (comment) today.

@stereotype441
Copy link
Member Author

For posterity, here's a sharp edge of the unnecessaryNullAwareAccess case that we'll need to be careful about:

T accomplice<T>(Object? x, int i) {
  i?.hashCode.remainder(x is! T ? throw 0 : 1);
  return x;
}

Since i has a non-nullable type, flow analysis will conclude that the ?. cannot null short. Therefore it will assume that x is! T ? throw 0 : 1 is always part of the control flow path, so it will promote the type of x to T, allowing the return statement without a cast. But at runtime, when a null value is passed for i, the ?. will null short, so execution will proceed directly to the return statement and we'll once again have a full soundness violation.

This case is important because it illustrates that when flow analysis believes a ?. cannot null short, we must insert an implicit null check even if the method being called (hashCode in this case) exists on Object. Similar situations could arise if there's an extension on int?.

I'll add some test cases to make sure this is covered.

dart-bot pushed a commit to dart-lang/sdk that referenced this issue Sep 2, 2020
This is the flow analysis portion of the fix to
dart-lang/language#1143.  Follow-up changes
will be needed in the CFE and/or backends to ensure that exceptions
are thrown under appropriate circumstances.

This CL also makes some improvements to flow analysis's reachability
analysis so that it accounts for nullability of the target when
analyzing the reachability of `??=` and `?.`.  Hopefully these
improvements should make the fix to
dart-lang/language#1143 clearer and more
consistent.

Change-Id: I5fa5c070f13fd57ac4c2fb87f2d67588861594b0
Bug: dart-lang/language#1143
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/160440
Reviewed-by: Konstantin Shcheglov <[email protected]>
Reviewed-by: Johnni Winther <[email protected]>
Commit-Queue: Paul Berry <[email protected]>
@stereotype441
Copy link
Member Author

Ok, with dart-lang/sdk@d833f2f landed, the next step is to implement the CFE logic to cause exceptions to be thrown at the appropriate times. @johnniwinther

@stereotype441
Copy link
Member Author

dart-lang/sdk@d833f2f caused a build failure so I had to revert it. Will try to re-land today.

dart-bot pushed a commit to dart-lang/sdk that referenced this issue Sep 2, 2020
This reverts commit d833f2f.

Reason for revert: Broke build, e.g. https://ci.chromium.org/p/dart/builders/ci/dart-sdk-mac/12688

Original change's description:
> Flow analysis changes to fix mixed-mode unsoundness loophole.
> 
> This is the flow analysis portion of the fix to
> dart-lang/language#1143.  Follow-up changes
> will be needed in the CFE and/or backends to ensure that exceptions
> are thrown under appropriate circumstances.
> 
> This CL also makes some improvements to flow analysis's reachability
> analysis so that it accounts for nullability of the target when
> analyzing the reachability of `??=` and `?.`.  Hopefully these
> improvements should make the fix to
> dart-lang/language#1143 clearer and more
> consistent.
> 
> Change-Id: I5fa5c070f13fd57ac4c2fb87f2d67588861594b0
> Bug: dart-lang/language#1143
> Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/160440
> Reviewed-by: Konstantin Shcheglov <[email protected]>
> Reviewed-by: Johnni Winther <[email protected]>
> Commit-Queue: Paul Berry <[email protected]>

[email protected],[email protected],[email protected]

Change-Id: If1215b19975e0958d612dd69767088095d853879
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: dart-lang/language#1143
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/161580
Reviewed-by: Paul Berry <[email protected]>
dart-bot pushed a commit to dart-lang/sdk that referenced this issue Sep 3, 2020
This is a reland of d833f2f

Original change's description:
> Flow analysis changes to fix mixed-mode unsoundness loophole.
> 
> This is the flow analysis portion of the fix to
> dart-lang/language#1143.  Follow-up changes
> will be needed in the CFE and/or backends to ensure that exceptions
> are thrown under appropriate circumstances.
> 
> This CL also makes some improvements to flow analysis's reachability
> analysis so that it accounts for nullability of the target when
> analyzing the reachability of `??=` and `?.`.  Hopefully these
> improvements should make the fix to
> dart-lang/language#1143 clearer and more
> consistent.
> 
> Change-Id: I5fa5c070f13fd57ac4c2fb87f2d67588861594b0
> Bug: dart-lang/language#1143
> Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/160440
> Reviewed-by: Konstantin Shcheglov <[email protected]>
> Reviewed-by: Johnni Winther <[email protected]>
> Commit-Queue: Paul Berry <[email protected]>

Bug: dart-lang/language#1143
Change-Id: If9c45649c1e9f4b19f7c282e7a1c4c956a7bc17f
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/161622
Commit-Queue: Paul Berry <[email protected]>
Reviewed-by: Konstantin Shcheglov <[email protected]>
@stereotype441
Copy link
Member Author

dart-lang/sdk@d833f2f was re-landed as dart-lang/sdk@a4b2047 so I'm passing this back to Johnni.

@johnniwinther
Copy link
Member

Where is the rule that tells that the switch statement in switchOnBool is exhaustive? We handle enums but I haven't seen a rule about bool.

@stereotype441
Copy link
Member Author

@johnniwinther, it looks like this is underspecified. The flow analysis spec says:

  • switch statement: ...
    • If the cases are exhaustive, then let after(N) = break(N) otherwise let
      after(N) = join(after(E), break(N)).

But I can't find any definition of what we mean by "exhaustive". There's a little bit of text in the NNBD feature specification document describing when a switch statement should generate a warning, but it's not very precisely specified either:

  • If T is an enum type, it is a warning if the switch does not handle all
    enum cases, either explicitly or via a default.
  • If T is Q? where Q is an enum type, it is a warning if the switch does
    not handle all enum cases and null, either explicitly or via a default.

So I guess we have to make a decision and then fix the spec to make it more precise. In an ideal world I would want to define an "exhaustive" switch statement as a switch statement meeting any one of the following requirements:

  1. The switch statement has a default clause
  2. The type of the switch expression is Null and the switch statement has a case E clause where the constant value of E is null
  3. The type of the switch expression is T, where T is an enum type, and for each possible value of the enum T, the switch statement has a case E clause where the constant value of E equals that enum value
  4. The type of the switch expression is T?, where T is an enum type, and the switch statement has a case E clause where the constant value of E is null, and for each possible value of the enum T, the switch statement has a case E clause where the constant value of E equals that enum value
  5. The type of the switch expression is bool, and for both the values true and false, the switch statement has a case E clause where the constant value of E equals that value
  6. The type of the switch expression is bool?, and for all three of the values true, false, and null, the switch statement has a case E clause where the constant value of E equals that value

But that's a pretty far cry from what we currently implement, and we're getting short on time, so I think it's reasonable to do something more limited (as long as it's sound). Here's what's currently implemented AFAICT:

  • The analyzer looks like it handles cases 1, 3, and 4, and it recognizes null and enum values by their syntactic forms rather than by constant evaluation (so for instance case (MyEnum.value): is not recognized as handling the enum value MyEnum.value).
  • The front end looks like it handles cases 1 and 3, and it also recognizes enum values by their syntactic forms, but it does ignore parentheses, so it recognizes case (MyEnum.value): but not case (true ? MyEnum.value : MyEnum.otherValue):. The front end has a soundness bug when switching on nullable enum types, which I've filed as a separate bug: CFE fails to account for null when checking whether a switch statement as exhaustive sdk#43348.

@johnniwinther, @scheglov, @leafpetersen: opinions on how we should define exhaustiveness?

@lrhn
Copy link
Member

lrhn commented Sep 7, 2020

There are two reasons for caring about exhaustiveness:

  1. The traditional one: Warn about non-exhaustive enum switches.

  2. Control-flow analysis - if the switch is not exhaustive, it is equivalent to one with a reachable default:break; clause. A switch which returns on every branch makes the code afterwards unreachable only if it's exhaustive. We don't care about exhaustiveness in itself, we only care about whether the implicit default case is reachable and must be taken into account for flow analysis, or whether we can say for certain that it's not reachable.

The first one is what the spec handles. It's a user convenience, and IMO shouldn't be part of the language spec to begin with (but I'm still in the "no warnings!" camp).

The second one is only relevant to our new flow analysis. It cares about semantic safety (can it deduce that there is no path through the switch which doesn't satisfy X), but we use it for promotion, so it needs to be reduced to something users can reason about. (So we can't let our amazing full-program analysis deduce that the only values reaching the switch on Object x is 2, "hello" and false, and consider it exhaustive if it has those three cases.

I don't particularly care about switching over bool? (and not at all about switching over bool, that's what if is for).
If we don't support that, I'm fine.
A switch over the Null type is just daft, so I'm also fine with not supporting that.
So, not supporting 2, 5 and 6 is fine by me.

Case 1 is not a separate case. All switches can be treated as having a default case, even if it's just default:break; as a default ... default ... case. What matters is whether the default case is reachable, which depends on the type and the remaining cases.
If it is reachable, then it's just one more path on the flow analysis, and if not, we can probably warn if it is written explicitly.

That makes enum values and nullable enum values (3 and 4) the only relevant cases.
I do want to support those. (I also want to support them properly, by evaluating the constant expressions and ensuring that the results contain all the enum values, but I'm OK with fixing that at a later time, as a bugfix).

So, in short:

An enum switch is a switch where the switch expression has type E or E?, and E is an enum-type,
called the enum type of the enum switch.
An enum switch is exhaustive if for each value v of its enum type,
the switch has a case with an expression evaluating to v,
and, if the type is E?, there is also a case with an expression evaluating to null.

We provide a warning if:

  • An enum switch is not exhaustive and does not provide a default case.
  • An enum is exhaustive, and it provides an explicit default case.

For flow control, an exhaustive enum considers any default case as unreachable.

I'm OK with adding switches over bool? (and bool for consistency), but I don't think it's a priority at all.

@johnniwinther
Copy link
Member

Do note that determining exhaustiveness based on evaluated case constants (not just syntactical references to enum values, probably not allowing parentheses) is not feasible, since evaluation isn't possible at the point in time where exhaustiveness is needed for the flow analysis (or it'll at least require a (complete) rewrite of the different phases of CFE).

So I'd stick with the simple syntactic enum references and handle cases 1, 3, and 4. The other cases are uncommon and unnecessary so we don't need to help the user in those cases, we just need to be sound.

@lrhn
Copy link
Member

lrhn commented Sep 8, 2020

We probably will need to keep doing syntax based enum recognition for now, because we don't have time to change that.
(It does worry me that we got there to begin with, because syntactically recognizing constant values is not something we should ever be doing, IMO it should always be value based).

So, I would like that to be fixed in the future, if possible.
It prevents the use-case of moving enum values out of the declaration:

enum Foo { bar, baz }
const bar = Foo.bar;
const baz = Foo.baz;
...
int indexOf(Foo foo) {
   switch (foo) {
     case bar: return 1;
     case baz: return 2;
   }
}

Would it make any kind of sense to do constant evaluation during the analysis phase as well? If a constant expression is detectable as such, then analysis of it could perhaps be able to collect its value along with its static type.
Or are there some sharp corners, perhaps around .fromEnvironment, which makes that impossible?

@johnniwinther
Copy link
Member

johnniwinther commented Sep 8, 2020

Constant evaluation is riddled with nasty corner cases. There seems to always be another case that you hadn't thought of.

And yes, .fromEnvironment is one of the problematic cases; analyzer does have an environment and dart2js doesn't evaluate these until closed world computation which is long time after flow analysis.

@johnniwinther
Copy link
Member

The implementation strategy has some unforeseen consequences. See for instance failures on co19_2/LibTest/core/DateTime/parse_A03_t01 in https://dart-review.googlesource.com/c/sdk/+/162004 )

If we have defensive code like

// opt-in
method(String input) {
  if (input == null) {
    throw new ArgumentError.notNull('input');
  }
}

then we generate

// opt-in
method(String input) {
   if (input == null && throw new UnsoundError()) {
    throw new ArgumentError.notNull('input');
  }
}

making all tests/code that rely on the exact error fail.

@eernstg
Copy link
Member

eernstg commented Sep 9, 2020

Is it out of the question to say that the code should be rewritten as follows?:

// opt-in
method(String input) {
  ArgumentError.checkNotNull(input, 'input');
}

@johnniwinther
Copy link
Member

I like the ArgumentError.checkNotNull approach. I'll change the sdk cases to use this and see how/if fixes the all cases. Also, it would be part of the migration story to use this instead of custom ("dead") code paths.

@lrhn
Copy link
Member

lrhn commented Sep 9, 2020

Internally in the platform libraries, if you want to check that a non nullable argument isn't actually (unsoundly) null, please use checkNotNullable from dart:_internal.
That makes it very easy to remove it again when we stop allowing unsound code.

@leafpetersen
Copy link
Member

This all seems fine with me. Cases 1, 3, 4 are what I intended by exhaustive. I'm fine extending this - the most useful extension would be bool? I think - but I'm fine not extending it. I'll add a TODO to make whatever we decide here clear in the spec.

@scheglov
Copy link
Contributor

As far as I remember we accepted support for syntactic only enum constants when initially considered exhaustiveness. Exactly because computing constant values before resolution is done is problematic.

@lrhn
Copy link
Member

lrhn commented Sep 17, 2020

I was talking to @johnniwinther and that gave rise to an idea.

First, let me make sure I understand the problem correctly:

  • We can get get into situations where a promoted variable does not hold a value of the promoted type.
  • This can only happen due to expected unreachable control flow actually being reachable.
  • That can happen for one of two reasons:
    1. An expression of type Never which evaluates to null
    2. A branch based on the type of a value assuming that one branch is unreachable. That always starts with a value being null even though the type is unnullable (but can obviously be escalated through this very unsoundness issue).

That is:

Object? x = "a";
if (x is! int) {
  neverFunction();
}
// We assume x is int, but if neverFunction returns `null`, that's not true.
1 + x;  // Crash and burn.

int z = something();
int? y = 0; // Assignment promoted to int;
if (z == null) {
  // Assumed unreachable due to type of x
  y = null; // Demotes y to int?
}
// Demotion of y doesn't reach here.
y + z;

Are there any other types of code which causes problems?

We want to ensure that we don’t reach unsound code. We do that by throwing at some point before actually reading x or y at the unsafely promoted type.

Since unsoundness is a fact, users might start writing code to test for it. We don’t want to get too much in their way about that. Giving warnings is fine, but we don’t want to make their defensive unsoundness mitigation code into an error unnecessarily.

That means that I don’t want to just make insert a throw UnsoundError() at the beginning of the if (z == null) then-branch. It could be followed by z = 0; instead of y = null;, and that would be reasonable unsound-aware code.

The idea is then that we can handle potentially non-throwing Never expressions by making them always throw. Effectively, if an expression e has static type Never and is potentially unsound (it’s not a literal throw expression, or something else that we can recognize as definitely throwing), we can convert it to throw UnsoundNullError.never(e); in null safe code. (That constructor just ignores the argument and puts in a message of "Expression of type Never did not throw".)
I believe we might already do that.

That will make any assumption about Never returning functions in null safe code sound, and it’s unlikely to make anyone complain. The Never type is new, so it’s unlikely that we have old code which is declared as returning Never and which actually meaningfully returns null.

For the type based branching (including if, ?/:, ?? or ??=), we can choose to not assume that any branch is unreachable. If the user writes a branch where the condition is not constant, then we assume that both branches are initially reachable. We can warn or hint that the code looks unreachable, but our analysis will not assume that it is so. The example above will treat y = null as reachable from the x == null test, even though it doesn’t seem so.

If the presumed-unreachable branch ends up throwing (or control-flow escaping in some other way), then that’s fine, we won’t merge it back in. If not, the author has to accept that the language assumes that the code they wrote is meaningful. Don’t write actually unreachable code, and you won’t have any problems with that.

If you don’t write the presumed unreachable branch, meaning that it’s the else branch which is presumed unreachable:

int x = something();
if (x is int) {
  yadda;    
}

then it still works to assume that the implicit empty else branch is reachable. It does mean that:

int foo(int x) {
  if (x is int) return x;
}

will complain about not returning an int at the end. That’s probably fine, don’t write it that way, then.

I believe that our current approach is based on erring later, when the unsafely promoted variable is used (but I'm not actually sure exactly where we do what).

There is one presumed unreachable case where we might want to throw: The exhaustive enum switch with no explicit default. I think it is reasonable to make the default throw, because otherwise you have to write it yourself:

enum Foo {foo, bar};
int fooId(Foo v) {
  int id;
  switch (v) {
    case Foo.foo: id = 2; break;
    case Foo.bar: id = 4; break;
  }
  return id + 1;           
}

This code looks reasonable, and if we make the implicit default branch reachable, then we can’t assume id is assigned afterwards. I think we should insert an implicit default: throw UnsoundNullError(); here to preserve the promotions.
That's also what we currently do.

Do you think these three rules:

  1. Make expressions of type Never guaranteed to throw.

  2. Assume all branches that an author wrote to be reachable (unless the condition is a constant bool).

  3. Make presumed exhaustive non-nullable enum switches have a default which throws.

can solve all the problems we are seeing here?

As changes go, they seem localized and predictable. Number 1 and 3 are early throws on presumably unreachable code. They are easily explainable. For number 1, if code statically analyzes as if it definitely escapes, then it does escape. For 3, if a switch statically analyzes as being exhaustive, then either it is exhaustive or it escapes.

For number 2, it’s not unreasonable to let dead code have effect on analysis. It’s there, the author wrote it, and they can remove it again if they don’t want it to have an effect, but it will allow them to defensively handle unsound nulls explicitly without getting tripped up by our soundness guards. You can understand the code by reading it, without having to mentally do reachability analysis to know that a branch doesn’t matter. (I actually think that’s better.)

For the examples above, that would yield:

promotionToNever(int i) {
  if (i is int) return; // Both branches reachable, but no problem.
}
unnecessaryNullCheck(int f()) {
  if (f() != null) return; // Both branches reachable, but no problem.
}
unnecessaryIfNull(int f(), int g()) {
  return f() ?? g(); // Should call g if f returns null, like the author asked for.
}
unnecessaryIfNullAssign(List<int> x, int f()) {
  x[0] ??= f(); // Should call f if x[0] is null, like the author asked for.
}
unnecessaryNullAwareAccess(int f()) {
  f()?.gcd(throw ...); // Should not throw if `f` returns null, as the author asked for.
}
callReturningNever(Never f()) {
  f(); // Should throw if `f` completes normally
}
switchOnBool(bool b) {
  switch (b) {
    case true: return;
    case false: return;
  } // Should throw if the implicit `default` branch is taken 
    // (if we treat bool as an enum and recognizes exhaustiveness).
}
enum E { e1, e2 }
switchOnEnum(E e) {
  switch (e) {
    case E.e1: return;
    case E.e2: return;
  } // Should throw if the implicit `default` branch is taken
}

@stereotype441
Copy link
Member Author

I was talking to @johnniwinther and that gave rise to an idea.

First, let me make sure I understand the problem correctly:

  • We can get get into situations where a promoted variable does not hold a value of the promoted type.

  • This can only happen due to expected unreachable control flow actually being reachable.

  • That can happen for one of two reasons:

    1. An expression of type Never which evaluates to null
    2. A branch based on the type of a value assuming that one branch is unreachable. That always starts with a value being null even though the type is unnullable (but can obviously be escalated through this very unsoundness issue).

That is:

Object? x = "a";
if (x is! int) {
  neverFunction();
}
// We assume x is int, but if neverFunction returns `null`, that's not true.
1 + x;  // Crash and burn.

int z = something();
int? y = 0; // Assignment promoted to int;
if (z == null) {
  // Assumed unreachable due to type of x
  y = null; // Demotes y to int?
}
// Demotion of y doesn't reach here.
y + z;

Are there any other types of code which causes problems?

There's one other scenario: a switch statement on a non-nullable enum type being assumed to be exhaustive, due to the fact that it has case clauses for every possible enum value. If the switch expression is null even though the type is unnullable, then the code following the switch statement will be reached even though flow analysis thinks it's unreachable.

I guess you mention it later in your comment, I was just surprised that you didn't list it here because it can be used to escalate unsoundness just litke the other two cases.

We want to ensure that we don’t reach unsound code. We do that by throwing at some point before actually reading x or y at the unsafely promoted type.

Since unsoundness is a fact, users might start writing code to test for it. We don’t want to get too much in their way about that. Giving warnings is fine, but we don’t want to make their defensive unsoundness mitigation code into an error unnecessarily.

That means that I don’t want to just make insert a throw UnsoundError() at the beginning of the if (z == null) then-branch. It could be followed by z = 0; instead of y = null;, and that would be reasonable unsound-aware code.

The idea is then that we can handle potentially non-throwing Never expressions by making them always throw. Effectively, if an expression e has static type Never and is potentially unsound (it’s not a literal throw expression, or something else that we can recognize as definitely throwing), we can convert it to throw UnsoundNullError.never(e); in null safe code. (That constructor just ignores the argument and puts in a message of "Expression of type Never did not throw".)
I believe we might already do that.

Yes, or rather, we will if/when we land https://dart-review.googlesource.com/c/sdk/+/162004.

That will make any assumption about Never returning functions in null safe code sound, and it’s unlikely to make anyone complain. The Never type is new, so it’s unlikely that we have old code which is declared as returning Never and which actually meaningfully returns null.

For the type based branching (including if, ?/:, ?? or ??=), we can choose to not assume that any branch is unreachable. If the user writes a branch where the condition is not constant, then we assume that both branches are initially reachable. We can warn or hint that the code looks unreachable, but our analysis will not assume that it is so. The example above will treat y = null as reachable from the x == null test, even though it doesn’t seem so.

If the presumed-unreachable branch ends up throwing (or control-flow escaping in some other way), then that’s fine, we won’t merge it back in. If not, the author has to accept that the language assumes that the code they wrote is meaningful. Don’t write actually unreachable code, and you won’t have any problems with that.

If you don’t write the presumed unreachable branch, meaning that it’s the else branch which is presumed unreachable:

int x = something();
if (x is int) {
  yadda;    
}

then it still works to assume that the implicit empty else branch is reachable. It does mean that:

int foo(int x) {
  if (x is int) return x;
}

will complain about not returning an int at the end. That’s probably fine, don’t write it that way, then.

This is a really interesting idea. It would have two really nice properties that our current solution doesn't have: 1. it allows power users like the Flutter devs to write defensive code without resorting to tricks like casting to dynamic, and 2. when we finally get to Dart 3.0 and remove mixed mode, we can revert flow analysis back to its current behavior (because then the defensive code really will be unreachable), and that will help people find their defensive code and remove it. And it wouldn't give up on the good things about our current solution (namely, that it fixes the unsoundness escalation bug, and it doesn't get in the way of us warning users about potentially suspect code).

I believe that our current approach is based on erring later, when the unsafely promoted variable is used (but I'm not actually sure exactly where we do what).

I don't think so. If by "our current approach" you mean what we're doing in https://dart-review.googlesource.com/c/sdk/+/162004, then no, we're actually throwing an exception as soon as the type check produces a result that's inconsistent with strong mode, essentially treating this:

int x = f();
if (x == null) { ... }

as though it were written this way:

int x = f();
if (x == null && throw ...) { ... }

...completely breaking people's attempts to write defensive code. Which is why I'm so intrigued by your idea.

There is one presumed unreachable case where we might want to throw: The exhaustive enum switch with no explicit default. I think it is reasonable to make the default throw, because otherwise you have to write it yourself:

enum Foo {foo, bar};
int fooId(Foo v) {
  int id;
  switch (v) {
    case Foo.foo: id = 2; break;
    case Foo.bar: id = 4; break;
  }
  return id + 1;           
}

This code looks reasonable, and if we make the implicit default branch reachable, then we can’t assume id is assigned afterwards. I think we should insert an implicit default: throw UnsoundNullError(); here to preserve the promotions.
That's also what we currently do.

Yes, if/when we land https://dart-review.googlesource.com/c/sdk/+/162004.

Do you think these three rules:

  1. Make expressions of type Never guaranteed to throw.
  2. Assume all branches that an author wrote to be reachable (unless the condition is a constant bool).
  3. Make presumed exhaustive non-nullable enum switches have a default which throws.

can solve all the problems we are seeing here?

Yeah, I think it could.

As changes go, they seem localized and predictable. Number 1 and 3 are early throws on presumably unreachable code. They are easily explainable. For number 1, if code statically analyzes as if it definitely escapes, then it does escape. For 3, if a switch statically analyzes as being exhaustive, then either it is exhaustive or it escapes.

For number 2, it’s not unreasonable to let dead code have effect on analysis. It’s there, the author wrote it, and they can remove it again if they don’t want it to have an effect, but it will allow them to defensively handle unsound nulls explicitly without getting tripped up by our soundness guards. You can understand the code by reading it, without having to mentally do reachability analysis to know that a branch doesn’t matter. (I actually think that’s better.)

For the examples above, that would yield:

promotionToNever(int i) {
  if (i is int) return; // Both branches reachable, but no problem.
}
unnecessaryNullCheck(int f()) {
  if (f() != null) return; // Both branches reachable, but no problem.
}
unnecessaryIfNull(int f(), int g()) {
  return f() ?? g(); // Should call g if f returns null, like the author asked for.
}
unnecessaryIfNullAssign(List<int> x, int f()) {
  x[0] ??= f(); // Should call f if x[0] is null, like the author asked for.
}
unnecessaryNullAwareAccess(int f()) {
  f()?.gcd(throw ...); // Should not throw if `f` returns null, as the author asked for.
}
callReturningNever(Never f()) {
  f(); // Should throw if `f` completes normally
}
switchOnBool(bool b) {
  switch (b) {
    case true: return;
    case false: return;
  } // Should throw if the implicit `default` branch is taken 
    // (if we treat bool as an enum and recognizes exhaustiveness).
}
enum E { e1, e2 }
switchOnEnum(E e) {
  switch (e) {
    case E.e1: return;
    case E.e2: return;
  } // Should throw if the implicit `default` branch is taken
}

@johnniwinther
Copy link
Member

I think we should work to see if we can make this new approach work.

@leafpetersen
Copy link
Member

I discussed this with @stereotype441 and I agree, this looks really promising. As best I can tell, this gives up on some unreachability knowledge that we really don't care about in practice, in exchange for making it much easier for users to understand what's going on. The main risk I can see is that there is some unexpected interaction with other parts of flow analysis that causes this to break patterns that we care about, but I can't see anything right now, and it sounds like we're close enough to a prototype to be able to measure this.

One question I have is what the type of int x should be promoted to in the "unreachable" branch of if (x == null). I think @stereotype441 is proposing to essentially say that it keeps its type: that is, x has type int in the unreachable true branch. An alternative, I think is to treat it as having type null (since that is in fact the accurate type)? On the other hand, it's not clear that if/when we drop this behavior when mixed mode goes away that moving from int to Never would be more breaking than Null to Never, and I'm not sure I see much benefit to typing the variable as Null (other than perhaps making it clearer to the user what the case is that got them there?).

@lrhn
Copy link
Member

lrhn commented Sep 18, 2020

Yes, I talked with Johnni again after posting this, and the question of the type of x came up there too. I agree to let it keep its type.

A variable with type int can validly have value null in unsound mode, so there is no (new) issue with that. Checking int x = ...; if (x == null) ... doesn't leave room to promote. It's equivalent to if (x is Null) which is not checking a subtype of int, so there should be no promotion in that case.
We could promote it to Never, but that would mean any reads happening before demotion will throw (because of our Never-soundness-guarantee). Probably not a problem in practice, why read something you just checked was null, you can just write null instead.

@stereotype441
Copy link
Member Author

Ok, I prototyped @lrhn's idea yesterday (actually, just the new behavior for is and == that Lasse proposed) and ran it through the internal testing suite overnight. It looks like it doesn't break any already-migrated code, so I think this is going to work. Prototype is here if you're curious. It's not ready for code review yet, because I haven't updated any test expectations, but I'll work on cleaning it up today. I just talked to @johnniwinther and he's working on a CL to do the remaining part of Lasse's proposal. It sounds like it is going well too.

I'm going to try to get a combination of my cleaned-up CL and Johnni's running through the internal testing suite today so that hopefully we can land them with confidence on Monday. I'll update this issue when I have more news to share.

@stereotype441
Copy link
Member Author

Yes, I talked with Johnni again after posting this, and the question of the type of x came up there too. I agree to let it keep its type.

A variable with type int can validly have value null in unsound mode, so there is no (new) issue with that. Checking int x = ...; if (x == null) ... doesn't leave room to promote. It's equivalent to if (x is Null) which is not checking a subtype of int, so there should be no promotion in that case.
We could promote it to Never, but that would mean any reads happening before demotion will throw (because of our Never-soundness-guarantee). Probably not a problem in practice, why read something you just checked was null, you can just write null instead.

Agreed. I independently came to the same conclusion when making my prototype yesterday, so it sounds like we're all on the same page.

dart-bot pushed a commit to dart-lang/sdk that referenced this issue Sep 25, 2020
…e semantics

Bug: dart-lang/language#1143
Change-Id: Id177f8b19c15ef246f21ddd840410cddfee2e5cc
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/163600
Commit-Queue: Johnni Winther <[email protected]>
Reviewed-by: Konstantin Shcheglov <[email protected]>
@johnniwinther
Copy link
Member

Implemented by https://dart-review.googlesource.com/c/sdk/+/163380 and https://dart-review.googlesource.com/c/sdk/+/163600

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants