-
Notifications
You must be signed in to change notification settings - Fork 213
Flow analysis can be used to escalate mixed mode unsoundness #1143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I believe the crux of the problem is that the inherent unsoundness of mixed mode is sufficient to cause flow analysis to incorrectly conclude that a flow control path is unreachable, and since it uses unreachability to discard control flow paths in joins, that means it can be made to conclude that any type promotion whatsoever survives beyond a join point. I believe we could fix this by limiting the circumstances in which flow analysis concludes that code is unreachable. Flow analysis only concludes that code is unreachable when there is a clear transfer of control flow (e.g. an explicit Side note: as an alternative, it is tempting to try to simply reduce the circumstances in which an expression can have a type of
|
@leafpetersen pointed out to me over IM that another possible solution would be to enforce soundness by having the compiler insert an implicit This would probably be a nicer experience for users than what I proposed in my comment above, because instead of saying to users "we're treating your obviously unreachable code path as 'technically reachable due to type system mumbo jumbo'", we'll just be respecting user intent by giving them a runtime check of a null safety condition that they're already asking for in their declared types. They probably won't even notice or care that the exception came from an esoteric corner of flow analysis. |
One main element of the soundness violation here is of course flow analysis. I agree that throwing on every occasion where an unreachable path is actually reached will safely remove any soundness issues caused by reaching it. But this may have a substantial cost in terms of the size of the generated code. Do we have an easy and safe way to eliminate most unreachable paths from this extra step, or would it apply to all unreachable paths? Another thing is that the story currently says "dynamic type tests and type checks will behave as in legacy code", and that's no longer true because we have to add "... , and the very act of reaching certain locations will now throw (because they would be unreachable with sound null safety)". I'm not sure that's a very reassuring story for developers to work with. But another element in the soundness violation is the fact that a declaration We're reserving the right to insert assertions doing dynamic null checks on parameters, and we're of course reserving the right to have a dynamic error for This means that it is not inconsistent to throw because So we could compile I believe this allows for a simpler story: "An Don't you think this will catch all sources of soundness violations that have been raised here? |
I'm not too worried about the code size bloat issue. We would not have to apply it to all unreachable paths, only to the ones where flow analysis is imposing its own type-based notion of unreachability on top of the explicit control flow that's already present. AFAICT that only happens in four circumstances:
I think this is equivalent to what I said in cases 1 and 2 above, and I agree that it's unlikely to cause program size to go up too much. It doesn't catch everything, though. We still would need to cover cases 3 and 4 to close the soundness hole. |
Unless I'm mistaken, in mixed soundness mode shouldn't opted-in libraries do a runtime check in their entrypoints (like the Essentially, The unsoundness certainly begins at the function entrypoint, because |
This sparked an idle curiosity around which control flow constructs are most common. I happened to already have a scratchpad up to let me answer that easily so, going by the Dart SDK, Flutter repo, and 1,000 more recent pub packages, it's:
cc @pq and @bwilkerson for code-scraping example reasons. :) |
In this simple example, this could be caught (and we do provide an assertion mode which will stop this particular example). In general though, catching all unsoundness is both much harder, and in contradiction to our goal of making migration non-breaking to non-migrated code, and so we have chosen to make mixed mode execution unsound with respect to nullability. It is a problem though that flow analysis lets this nullability unsoundness be leveraged up to a full unsoundness, and so we will need to resolve this. My inclination at the moment is to add the appropriate unreachability assertions - I think this overall will be a positive for users anyway. |
@stereotype441 wrote:
Right, so throwing at unreachable locations is the most promising approach, and apparently not too costly. |
Great! Elaborating a bit more on where throws would have to be inserted, here are some examples that illustrate what I'm thinking of. I think this covers all the soundness holes. Hopefully this can be a seed for someone (probably @munificent) to write up test cases. promotionToNever(int i) {
if (i is int) return; // Should throw if `i` is null
}
unnecessaryNullCheck(int f()) {
if (f() != null) return; // Should throw if `f` returns null
}
unnecessaryIfNull(int f(), int g()) {
return f() ?? g(); // Should throw if `f` returns null (rather than calling `g`)
}
unnecessaryIfNullAssign(List<int> x, int f()) {
x[0] ??= f(); // Should throw if `x[0]` returns null (rather than calling `f`)
}
unnecessaryNullAwareAccess(int f()) {
f()?.gcd(throw ...); // Should throw if `f` returns null
}
callReturningNever(Never f()) {
f(); // Should throw if `f` completes normally
}
switchOnBool(bool b) {
switch (b) {
case true: return;
case false: return;
} // Should throw if the implicit `default` branch is taken
}
enum E { e1, e2 }
switchOnEnum(E e) {
switch (e) {
case E.e1: return;
case E.e2: return;
} // Should throw if the implicit `default` branch is taken
} Notes:
Regarding how to implement this (CC @johnniwinther), I'm thinking that we make the following modifications to the
And then it would be up to the CFE and back end teams to decide whether to implement the behaviors in the back end or by transforming the kernel. WDYT? |
Side note: currently flow analysis ignores promotionToNever(int f()) {
if (f() is int) return; // Should throw if `f` returns null
} |
The plan in #1143 (comment) sounds good. |
@munificent's test cases have landed; I will work on the flow analysis logic described in #1143 (comment) today. |
For posterity, here's a sharp edge of the T accomplice<T>(Object? x, int i) {
i?.hashCode.remainder(x is! T ? throw 0 : 1);
return x;
} Since This case is important because it illustrates that when flow analysis believes a I'll add some test cases to make sure this is covered. |
This is the flow analysis portion of the fix to dart-lang/language#1143. Follow-up changes will be needed in the CFE and/or backends to ensure that exceptions are thrown under appropriate circumstances. This CL also makes some improvements to flow analysis's reachability analysis so that it accounts for nullability of the target when analyzing the reachability of `??=` and `?.`. Hopefully these improvements should make the fix to dart-lang/language#1143 clearer and more consistent. Change-Id: I5fa5c070f13fd57ac4c2fb87f2d67588861594b0 Bug: dart-lang/language#1143 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/160440 Reviewed-by: Konstantin Shcheglov <[email protected]> Reviewed-by: Johnni Winther <[email protected]> Commit-Queue: Paul Berry <[email protected]>
Ok, with dart-lang/sdk@d833f2f landed, the next step is to implement the CFE logic to cause exceptions to be thrown at the appropriate times. @johnniwinther |
dart-lang/sdk@d833f2f caused a build failure so I had to revert it. Will try to re-land today. |
This reverts commit d833f2f. Reason for revert: Broke build, e.g. https://ci.chromium.org/p/dart/builders/ci/dart-sdk-mac/12688 Original change's description: > Flow analysis changes to fix mixed-mode unsoundness loophole. > > This is the flow analysis portion of the fix to > dart-lang/language#1143. Follow-up changes > will be needed in the CFE and/or backends to ensure that exceptions > are thrown under appropriate circumstances. > > This CL also makes some improvements to flow analysis's reachability > analysis so that it accounts for nullability of the target when > analyzing the reachability of `??=` and `?.`. Hopefully these > improvements should make the fix to > dart-lang/language#1143 clearer and more > consistent. > > Change-Id: I5fa5c070f13fd57ac4c2fb87f2d67588861594b0 > Bug: dart-lang/language#1143 > Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/160440 > Reviewed-by: Konstantin Shcheglov <[email protected]> > Reviewed-by: Johnni Winther <[email protected]> > Commit-Queue: Paul Berry <[email protected]> [email protected],[email protected],[email protected] Change-Id: If1215b19975e0958d612dd69767088095d853879 No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: dart-lang/language#1143 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/161580 Reviewed-by: Paul Berry <[email protected]>
This is a reland of d833f2f Original change's description: > Flow analysis changes to fix mixed-mode unsoundness loophole. > > This is the flow analysis portion of the fix to > dart-lang/language#1143. Follow-up changes > will be needed in the CFE and/or backends to ensure that exceptions > are thrown under appropriate circumstances. > > This CL also makes some improvements to flow analysis's reachability > analysis so that it accounts for nullability of the target when > analyzing the reachability of `??=` and `?.`. Hopefully these > improvements should make the fix to > dart-lang/language#1143 clearer and more > consistent. > > Change-Id: I5fa5c070f13fd57ac4c2fb87f2d67588861594b0 > Bug: dart-lang/language#1143 > Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/160440 > Reviewed-by: Konstantin Shcheglov <[email protected]> > Reviewed-by: Johnni Winther <[email protected]> > Commit-Queue: Paul Berry <[email protected]> Bug: dart-lang/language#1143 Change-Id: If9c45649c1e9f4b19f7c282e7a1c4c956a7bc17f Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/161622 Commit-Queue: Paul Berry <[email protected]> Reviewed-by: Konstantin Shcheglov <[email protected]>
dart-lang/sdk@d833f2f was re-landed as dart-lang/sdk@a4b2047 so I'm passing this back to Johnni. |
Where is the rule that tells that the switch statement in |
@johnniwinther, it looks like this is underspecified. The flow analysis spec says:
But I can't find any definition of what we mean by "exhaustive". There's a little bit of text in the NNBD feature specification document describing when a switch statement should generate a warning, but it's not very precisely specified either:
So I guess we have to make a decision and then fix the spec to make it more precise. In an ideal world I would want to define an "exhaustive" switch statement as a switch statement meeting any one of the following requirements:
But that's a pretty far cry from what we currently implement, and we're getting short on time, so I think it's reasonable to do something more limited (as long as it's sound). Here's what's currently implemented AFAICT:
@johnniwinther, @scheglov, @leafpetersen: opinions on how we should define exhaustiveness? |
There are two reasons for caring about exhaustiveness:
The first one is what the spec handles. It's a user convenience, and IMO shouldn't be part of the language spec to begin with (but I'm still in the "no warnings!" camp). The second one is only relevant to our new flow analysis. It cares about semantic safety (can it deduce that there is no path through the switch which doesn't satisfy X), but we use it for promotion, so it needs to be reduced to something users can reason about. (So we can't let our amazing full-program analysis deduce that the only values reaching the switch on I don't particularly care about switching over Case 1 is not a separate case. All switches can be treated as having a That makes enum values and nullable enum values (3 and 4) the only relevant cases. So, in short:
I'm OK with adding switches over |
Do note that determining exhaustiveness based on evaluated case constants (not just syntactical references to enum values, probably not allowing parentheses) is not feasible, since evaluation isn't possible at the point in time where exhaustiveness is needed for the flow analysis (or it'll at least require a (complete) rewrite of the different phases of CFE). So I'd stick with the simple syntactic enum references and handle cases 1, 3, and 4. The other cases are uncommon and unnecessary so we don't need to help the user in those cases, we just need to be sound. |
We probably will need to keep doing syntax based enum recognition for now, because we don't have time to change that. So, I would like that to be fixed in the future, if possible. enum Foo { bar, baz }
const bar = Foo.bar;
const baz = Foo.baz;
...
int indexOf(Foo foo) {
switch (foo) {
case bar: return 1;
case baz: return 2;
}
} Would it make any kind of sense to do constant evaluation during the analysis phase as well? If a constant expression is detectable as such, then analysis of it could perhaps be able to collect its value along with its static type. |
Constant evaluation is riddled with nasty corner cases. There seems to always be another case that you hadn't thought of. And yes, |
The implementation strategy has some unforeseen consequences. See for instance failures on If we have defensive code like // opt-in
method(String input) {
if (input == null) {
throw new ArgumentError.notNull('input');
}
} then we generate // opt-in
method(String input) {
if (input == null && throw new UnsoundError()) {
throw new ArgumentError.notNull('input');
}
} making all tests/code that rely on the exact error fail. |
Is it out of the question to say that the code should be rewritten as follows?: // opt-in
method(String input) {
ArgumentError.checkNotNull(input, 'input');
} |
I like the |
Internally in the platform libraries, if you want to check that a non nullable argument isn't actually (unsoundly) |
This all seems fine with me. Cases 1, 3, 4 are what I intended by exhaustive. I'm fine extending this - the most useful extension would be |
As far as I remember we accepted support for syntactic only enum constants when initially considered exhaustiveness. Exactly because computing constant values before resolution is done is problematic. |
I was talking to @johnniwinther and that gave rise to an idea. First, let me make sure I understand the problem correctly:
That is: Object? x = "a";
if (x is! int) {
neverFunction();
}
// We assume x is int, but if neverFunction returns `null`, that's not true.
1 + x; // Crash and burn.
int z = something();
int? y = 0; // Assignment promoted to int;
if (z == null) {
// Assumed unreachable due to type of x
y = null; // Demotes y to int?
}
// Demotion of y doesn't reach here.
y + z; Are there any other types of code which causes problems? We want to ensure that we don’t reach unsound code. We do that by throwing at some point before actually reading Since unsoundness is a fact, users might start writing code to test for it. We don’t want to get too much in their way about that. Giving warnings is fine, but we don’t want to make their defensive unsoundness mitigation code into an error unnecessarily. That means that I don’t want to just make insert a The idea is then that we can handle potentially non-throwing That will make any assumption about For the type based branching (including If the presumed-unreachable branch ends up throwing (or control-flow escaping in some other way), then that’s fine, we won’t merge it back in. If not, the author has to accept that the language assumes that the code they wrote is meaningful. Don’t write actually unreachable code, and you won’t have any problems with that. If you don’t write the presumed unreachable branch, meaning that it’s the else branch which is presumed unreachable: int x = something();
if (x is int) {
yadda;
} then it still works to assume that the implicit empty else branch is reachable. It does mean that: int foo(int x) {
if (x is int) return x;
} will complain about not returning an I believe that our current approach is based on erring later, when the unsafely promoted variable is used (but I'm not actually sure exactly where we do what). There is one presumed unreachable case where we might want to throw: The exhaustive enum switch with no explicit default. I think it is reasonable to make the default throw, because otherwise you have to write it yourself: enum Foo {foo, bar};
int fooId(Foo v) {
int id;
switch (v) {
case Foo.foo: id = 2; break;
case Foo.bar: id = 4; break;
}
return id + 1;
} This code looks reasonable, and if we make the implicit Do you think these three rules:
can solve all the problems we are seeing here? As changes go, they seem localized and predictable. Number 1 and 3 are early throws on presumably unreachable code. They are easily explainable. For number 1, if code statically analyzes as if it definitely escapes, then it does escape. For 3, if a switch statically analyzes as being exhaustive, then either it is exhaustive or it escapes. For number 2, it’s not unreasonable to let dead code have effect on analysis. It’s there, the author wrote it, and they can remove it again if they don’t want it to have an effect, but it will allow them to defensively handle unsound For the examples above, that would yield: promotionToNever(int i) {
if (i is int) return; // Both branches reachable, but no problem.
}
unnecessaryNullCheck(int f()) {
if (f() != null) return; // Both branches reachable, but no problem.
}
unnecessaryIfNull(int f(), int g()) {
return f() ?? g(); // Should call g if f returns null, like the author asked for.
}
unnecessaryIfNullAssign(List<int> x, int f()) {
x[0] ??= f(); // Should call f if x[0] is null, like the author asked for.
}
unnecessaryNullAwareAccess(int f()) {
f()?.gcd(throw ...); // Should not throw if `f` returns null, as the author asked for.
}
callReturningNever(Never f()) {
f(); // Should throw if `f` completes normally
}
switchOnBool(bool b) {
switch (b) {
case true: return;
case false: return;
} // Should throw if the implicit `default` branch is taken
// (if we treat bool as an enum and recognizes exhaustiveness).
}
enum E { e1, e2 }
switchOnEnum(E e) {
switch (e) {
case E.e1: return;
case E.e2: return;
} // Should throw if the implicit `default` branch is taken
} |
There's one other scenario: a switch statement on a non-nullable enum type being assumed to be exhaustive, due to the fact that it has case clauses for every possible enum value. If the switch expression is I guess you mention it later in your comment, I was just surprised that you didn't list it here because it can be used to escalate unsoundness just litke the other two cases.
Yes, or rather, we will if/when we land https://dart-review.googlesource.com/c/sdk/+/162004.
This is a really interesting idea. It would have two really nice properties that our current solution doesn't have: 1. it allows power users like the Flutter devs to write defensive code without resorting to tricks like casting to
I don't think so. If by "our current approach" you mean what we're doing in https://dart-review.googlesource.com/c/sdk/+/162004, then no, we're actually throwing an exception as soon as the type check produces a result that's inconsistent with strong mode, essentially treating this: int x = f();
if (x == null) { ... } as though it were written this way: int x = f();
if (x == null && throw ...) { ... } ...completely breaking people's attempts to write defensive code. Which is why I'm so intrigued by your idea.
Yes, if/when we land https://dart-review.googlesource.com/c/sdk/+/162004.
Yeah, I think it could.
|
I think we should work to see if we can make this new approach work. |
I discussed this with @stereotype441 and I agree, this looks really promising. As best I can tell, this gives up on some unreachability knowledge that we really don't care about in practice, in exchange for making it much easier for users to understand what's going on. The main risk I can see is that there is some unexpected interaction with other parts of flow analysis that causes this to break patterns that we care about, but I can't see anything right now, and it sounds like we're close enough to a prototype to be able to measure this. One question I have is what the type of |
Yes, I talked with Johnni again after posting this, and the question of the type of A variable with type |
Ok, I prototyped @lrhn's idea yesterday (actually, just the new behavior for I'm going to try to get a combination of my cleaned-up CL and Johnni's running through the internal testing suite today so that hopefully we can land them with confidence on Monday. I'll update this issue when I have more news to share. |
Agreed. I independently came to the same conclusion when making my prototype yesterday, so it sounds like we're all on the same page. |
…e semantics Bug: dart-lang/language#1143 Change-Id: Id177f8b19c15ef246f21ddd840410cddfee2e5cc Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/163600 Commit-Queue: Johnni Winther <[email protected]> Reviewed-by: Konstantin Shcheglov <[email protected]>
For a very long time, I thought that the weakening of soundness in mixed mode was limited to nullability. That is, I thought that in mixed mode:
*
s necessary to permit nulls in the appropriate places)*
s to T in appropriate positions.But that turns out not to be the case. The following program breaks soundness by allowing a String to be assigned to a variable of type
int
. Both the analyzer and CFE accept it without issuing any errors:test.dart (opted out):
opted_in.dart:
What's happening is that flow analysis considers the implicit "else" branch of
if (i is int)
to promote the type ofi
toNever
, hence it considers it unreachable. So it considers the flow state afterif (i is int) throw 0;
to be unreachable, and hence it considersif (x is! T) { if (i is int) throw 0; }
to promote the type ofx
toT
. So it believes that thereturn x;
at the end ofaccomplice
is sound.The text was updated successfully, but these errors were encountered: