-
-
Notifications
You must be signed in to change notification settings - Fork 84
Narrow search range for tree-sitter queries #1769
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
In the meantime, If the file is particularly large, it might be better to disable treesitter scopes entirely, similarly to how VS code disable syntax highlighting on massive files. |
Fwiw we're already parsing the file; this issue is just about running the tree query, which is likely many orders of magnitude faster |
#1889 removes some of the code that we will need to resume restricting the search tree. |
@AndreasArvidsson while you're looking into performance stuff, fwiw it appears tree-sitter/tree-sitter#2484 is fixed so we should be able to start restricting tree-sitter query range again. I'm not sure it's urgent but figured I'd flag since you're very familiar with performance right now |
Doesn't look like it's fixed actually. They just closed it. I rolled back the changes and tried to run our test and they are indeed failing. |
Strange. Possible we're on an old web-tree-sitter in vscode parse tree. I'll bump at some point and we can try again |
Please do |
Done; hopefully it works now 🤞 |
No go. Still 4 test that are failing. I will have a look at the position logic to make sure we haven't done anything incorrect. |
Oh I think it might be our logic in Edit: Looks like the each capture needs to be overlapping the startPosition-endPosition range. Take this example: We are here matching two different captures. We have the field definition where we start our content range and we have the semicolon where we end our content range.
That means that we are not overlapping with the field capture and the pattern is not matched |
This patten does not work (
(field_definition
value: (arrow_function)
) @namedFunction.start
.
";"? @namedFunction.end
) But this does work: (_
(field_definition
value: (arrow_function)
) @namedFunction.start
.
";"? @namedFunction.end
) The only difference is the inclusion of the wildcard at the top level. Without that we don't have a common ancestor for all captures. |
…ies (#2714) Today we generate all captures in the entire document whenever we do a Tree sitter query. This is of course quite expensive for large documents and it would be better if we could narrow the search range. That is what's implemented in this pull request. I also found some incorrect scm patterns that broke with this implementation that I have fixed. Fixes #1769 ## Checklist - [/] I have added [tests](https://www.cursorless.org/docs/contributing/test-case-recorder/) - [/] I have updated the [docs](https://github.com/cursorless-dev/cursorless/tree/main/docs) and [cheatsheet](https://github.com/cursorless-dev/cursorless/tree/main/cursorless-talon/src/cheatsheet) - [/] I have not broken the cheatsheet --------- Co-authored-by: Phil Cohen <[email protected]>
In #1800, we introduced a workaround for tree-sitter/tree-sitter#2484, which means that we no longer restrict the search range for our tree-sitter queries. That workaround has no impact on correctness, but could have performance implications for large files, as we get all matches and then filter them after the fact, rather than restricting to the range we care about. Once tree-sitter/tree-sitter#2484 is fixed, we'd like to reactivate our code that restricts the search range, by rolling back #1800
The text was updated successfully, but these errors were encountered: