Skip to content

[incrParse] Fix bug mapping a node's location back to its location in the cached syntax tree #19782

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Oct 12, 2018

Conversation

nathawes
Copy link
Contributor

@nathawes nathawes commented Oct 8, 2018

Also fix Edit::intersectsOrTouchesRange check only returning true when the ranges overlapped, rather than when they overlapped or touched. This was causing an edit that extended an existing identifier, for example, to reuse the existing node, and parse the remaining characters as a separate node in the tree.

Resolves rdar://problem/45108439

… the cached syntax tree

Also fix Edit::intersectsOrTouchesRange check only returning true when the
ranges overlapped, rather than when they overlapped or 'touched'.

Resolves rdar://problem/45108439
@nathawes nathawes requested review from ahoppen and nkcsgexi October 8, 2018 23:28
@nathawes
Copy link
Contributor Author

nathawes commented Oct 8, 2018

@swift-ci please smoke test

Copy link
Contributor

@nkcsgexi nkcsgexi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Does the stress tester test other mutation patterns than extending a token?

@ahoppen
Copy link
Member

ahoppen commented Oct 9, 2018

I'd like to take a closer look at this since I know that I had to switch around some < and <= in my initial implementation as well. I think I'll get to it tomorrow morning CEST.

@nathawes
Copy link
Contributor Author

nathawes commented Oct 9, 2018

That would be great, thanks @ahoppen!

Here's the commit where the new -> old location computation's condition was changed from < to <= if that helps: b26dd11

Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple comments inline that fix some more underlying bugs that have just surfaced.

@@ -38,7 +38,7 @@ struct SourceEdit {
/// Check if the characters replaced by this edit fall into the given range
/// or are directly adjacent to it
bool intersectsOrTouchesRange(size_t RangeStart, size_t RangeEnd) {
return !(End <= RangeStart || Start >= RangeEnd);
return End >= RangeStart && Start <= RangeEnd;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

@@ -85,7 +85,7 @@ llvm::Optional<Syntax> SyntaxParsingCache::lookUp(size_t NewPosition,
size_t OldPosition = NewPosition;
for (auto I = Edits.rbegin(), E = Edits.rend(); I != E; ++I) {
auto Edit = *I;
if (Edit.End <= OldPosition) {
if (Edit.End < OldPosition) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some more consideration this should be

if (Edit.End + Edit.ReplacementLength - Edit.originalLength() <= OldPosition) {

Explanation

With <= we don't catch the following case:

Original text: foobar
Change foo to fo resulting in fobar

The edit range is 1-4 with original length 3 and replacement length 2.
If I now try to look up bar, we start with OldPosition = NewPosition = 3. Now we need to see if that position needs to be adjusted because of the edit (which it obviously needs to be), but we get Edit.End (4) <= OldPosition (3) does not hold.

The issue is that we are comparing OldPosition which is still in the position space of the post-edit file and Edit.End which is in the position space of the pre-edit file, hence we need to translate Edit.End to the post-position space as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you're right! Ok, so this should make it correct for a single edit, but if there are multiple I think we'll need to account for the original-vs-replacement length differences of any edits earlier in the file too. I'll give that a go and update shortly.

// RUN: %incr-transfer-tree --expected-incremental-syntax-tree %S/Outputs/extend-identifier-at-eof.json %s

func foo() {}
_ = x<<<|||x>>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add comment to this file explaining that no trailing newline should be added to the end of it.

_ = <<REPLACE<6|||7>>></reparse REPLACE>
_ = <<REPLACE_BY_LONGER<6|||"Hello World">>>
_ = <<REPLACE<6|||7>>>
_ = </reparse REPLACE><<REPLACE_BY_LONGER<6|||"Hello World">>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting the </reparse REPLACE> before the 6 seemed really odd to me and it turns out there is another off-by-one error here, where we don't report the last unexpectedly reparsed character.

Investigating further, I noticed that we need to remove the - 1 in swift-syntax-test.cpp:252 plus its corresponding comment. I don't know if that's related to some of your changes on SourceLoc or has been a bug ever since, but it fixed the issue. After that the end repairs marker also needs to go at the end of the line.

@ahoppen
Copy link
Member

ahoppen commented Oct 10, 2018 via email

@nathawes
Copy link
Contributor Author

nathawes commented Oct 10, 2018

Ok, I might be misunderstanding how multiple edits are meant to be understood. If I we have two edits in source order is the second edit positioned:

  1. in terms of the original file, or
  2. in terms of the original file after the first edit was applied?

I was thinking it was (1), where I think doing it in reverse wouldn't help, but if it's (2) you're right.

@ahoppen
Copy link
Member

ahoppen commented Oct 11, 2018

All changes are in terms of the original file. But because we are now speaking in terms of the post-edit position space, we need to traverse the edits in normal order again (not reverse order) to get the correct transformation. I extracted the logic, wrote a couple test cases for it and pushed it into this PR. While I was at it, I also addressed my other review comments.

Sorry for pushing into your branch, but it felt like this is cleaner than creating a new PR on which you would then need to rebase.

@ahoppen
Copy link
Member

ahoppen commented Oct 11, 2018

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 5c568af

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 5c568af

@nathawes
Copy link
Contributor Author

No problem about pushing into the PR and thanks for adding the unit tests! The transformation looks good to me now, too. I came up with a different solution yesterday:

  size_t AccumulatedShift = 0;
  for (auto I = Edits.begin(), E = Edits.end(); I != E; ++I) {
    auto Edit = *I;
    size_t Shift = Edit.ReplacementLength - Edit.originalLength();
    if (Edit.End + AccumulatedShift + Shift <= NewPosition) {
      AccumulatedShift += Shift;
    } else {
      break;
    }
   }
  return /*OldPosition =*/ NewPosition - AccumulatedShift;

But they're equivalent, anyway (this one just applies the previous edits and current edit on the left hand side of the <= comparison, while yours applies the current edit on the left hand side, and un-applies the previous edits on the right hand side).

The test failure is just the expected syntax tree json needing to be updated for the 'don't add a new line' comment Alex added in one of the tests. I'll update it shortly and merge if all goes well.

Nathan Hawes added 3 commits October 11, 2018 14:17
…ranges on the same line

It wasn't accounting for the prefix length before a reparse tag previously when
keeping track of pre_column_offset and post_column_offset.
…an SourceRange

The only client of the 'toSourceRange' method immediately constructs a
CharSourceRange from it, and ByteBasedSourceRange's EndLoc isn't the start of
the last token in the range (as SourceRange expects).
@nathawes
Copy link
Contributor Author

@swift-ci please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 8baba29

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 8baba29

@nathawes
Copy link
Contributor Author

@swift-ci please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 9003d83

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 9003d83

ahoppen added a commit to ahoppen/swift-syntax that referenced this pull request Oct 12, 2018
After swiftlang/swift#19782 is merged, statements adjacent to an edit will
no longer be reused. Hence the REPLACE_BY_LONGER line is expected to be
reparsed.
@ahoppen
Copy link
Member

ahoppen commented Oct 12, 2018

We need to adjust a test case in Swift Syntax to expect the extended expected reparse range.

Please test with following pull request:
swiftlang/swift-syntax#18

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 9003d83

@ahoppen
Copy link
Member

ahoppen commented Oct 12, 2018

Linux test failure seems unrelated.

@nathawes
Copy link
Contributor Author

Please test with following pull request:
swiftlang/swift-syntax#18

@swift-ci Please smoke test Linux

@nathawes nathawes merged commit b1c8013 into swiftlang:master Oct 12, 2018
@nathawes nathawes deleted the incremental-parsing-bug branch October 12, 2018 22:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants