Change scopeType matchers to rely on tree-sitter style scheme queries #616
Comments
Using things like |
My proposal featured a simplified query language for delimiters, which we search upwards from the current token. It would be good to somehow merge these proposals, to have a concept like “all results for query Q2 within the node returned by Q1 containing the mark”. |
I propose we do a slightly different thing, using the delimiter mechanism described above. If we run Q2 within the delimiting scope, then “every” simply means “all matches” and “ordinal” means the ordinal’th match. The default would be to again return the node which contains the mark. |
Sorry not sure I follow. Can you elaborate on the problem? |
Can you elaborate on what you mean by this? What kind of delimiters? |
For, e.g., function parameters in a definition in Haskell, I search upwards to the first function definition node, which will delimit the second query, a downwards search for parameters rooted at that definition. I think this achieves something similar to @Will-Sommers suggestion of using an Theirs would search the entire document for all function definitions with parameters, then grab the one which contains the mark, and then (probably could) return a list of captures corresponding to the parameters, either taking every, the n’th, or by default the one containing the mark. Would even, I just realised, open up the suggestion of saying things like “take next arg” to take the parameter one after the one which contains the cursor, and other such relative commands. |
Ok I'm confused: are you suggesting that we just do what's described in the issue, or are you proposing that we change something? |
@Will-Sommers maybe tomorrow let's go through the exercise of writing a language definition? I think we might be able to make some headway in an hour and it should help to make things more concrete |
I propose we use @Will-Sommers approach of marking — what I call — delimiters using a named capture, then select all matches where the delimiter is the smallest possible region containing the mark, and use my approach to assign captures to every/ordinal/default. |
How is that different from what's described in the issue? |
Fwiw the reason that I was leaning away from rooted queries and towards top-level matches was so that we could support "take every key" both with nothing selected, which would expand to containing map, and with something selected, which would search within selection. If we use top-level matches, then we can handle both in the same way |
Ok updated issue description to clarify iteration stuff a bit |
Ok I captured our mob programming session in draft PR #620 |
What about something like “take every key in funk red made”, where we have “in” as a composing keyword to do the second search (every key) in the scope of the first (funk red made). That way searches could have a default delimiter set by, e.g, |
Also @wenkokke can we align on naming here? I'm using |
Basically the term refers to the maximal range within the document for which the given scope is the canonical instance of that scope type. So for the |
I’m using delimiter in the sense of (parse) trees or e.g. delimited continuations, i.e., a node which delimits a subtree, but iteratorScope is probably more generally clear? I’d prefer to use something like “punctuation” for what you call delimiters, but I’m fine either way. |
I’m using “delimiter” the way @Will-Sommers uses “iteratorScope”, I think? I don’t fully understand “searchScope”? |
Just a small note — looks like nvim has added some stuff here as well: |
Ooh nice find. Interesting to see how they've handled ranges |
Heyo, so it does look like user-supplied predicates are supported, in a sense. What ends up happening is that a From that point, the ball is back in our court (see above link). Going to play around with implementing |
Yep sounds good to me. I think ideally we'll just need to implement a few simple generic custom functions |
Heyo @pokey, just to catch you up on my thoughts. The PR I'm working on will not define All of this will happen within the returned matcher function and we'll implement it there, adding I'm working on some tests for the code itself now. I'm not sure about the philosophy related to unit tests (most everything looks like functional integration tests rather than unit tests).
|
Just chatted with @AndreasArvidsson and we came up with a nice name for
Then internally the object that describes what came back would have |
@pokey @AndreasArvidsson — this works for me but I'll likely bring it into another PR after the initial one was merged. Here's the diff of my work on it, the |
other possible terms for
the notion is that we want to indicate the range within which this scope is the canonical instance of the given scope type. So eg for |
Why not just My second preference would be domain. Dominion is reserved for |
Otherwise, what is currently preventing the first pull request for this issue from being merged? Do we have a to do list? |
The PR needs various fixes; I think @Will-Sommers has a todo list based on our most recent pairing session |
😄 |
- Partially addresses #616
- Partially addresses #436
- Depends on #1396

## Todo

- [x] **[DISCUSS]** What to do about fallback `iterationScope`? That's the only thing that is a regression here.
- [x] File issues for FIXMEs
- [x] File issue for defining iteration scopes. Can probably reuse most of the code from the regular scope handler other than creating the target
- [x] File issue to add unit tests for scope handlers
- [x] File issue to add some Python scope types where multiple can end at the same point (due to lack of closing brackets)
- [x] Add test that checks no scope types are duplicated between legacy and new definition, or file issue to add test
- [x] File PR for my 7783da6 (Add support for domain, leading, trailing, interior) #1427
- [x] Look through comments on this thread for anything worth filing / doing
- [x] Open as new PR?
- [x] Remove extraneous test cases
- [x] Double check #629 (comment); a lot of those tests we already have for the generic modifier code
- [x] Make sure changes to parse-tree-extension are shipped
- [x] Close #785 if we fix that
- [x] Comment on #484 saying the process has started and providing link to example
- [x] Close #797 if we fix that

---------

Co-authored-by: Pokey Rule <[email protected]>
Today we rely on tree-sitter queries, and most languages have been migrated.
Background
Currently Cursorless uses a custom pattern definition DSL alongside a set of helper functions in nodeMatcher.ts to match various scope types, such as `item` within a list or `argue` within a function definition or function invocation.

The tree-sitter project also provides a DSL, written in Scheme, which allows a user to query for patterns within syntax trees. Here's a link to the docs and an example usage in JS via the web-tree-sitter project. Each query then is allowed to assign a `name` to a node, such as `@comment` or `@punctuation.bracket`.

The name can then be read or asserted against.
The thought is that moving towards this approach will be more expressive out of the box. Additionally, many other projects including Neovim and Helix rely on queries for syntax highlighting as well as indentation which might help to make the incremental work for adding a new language a little bit simpler, since there are already partial or full definitions to work from. In particular, Helix already uses these queries for their textobjects, which is a simplified version of Cursorless scope types. Here's an example of a set of textobject definitions; they exist for several other languages as well
The Work
- `queries` directory with a subdirectory for each language, eg `queries/python`, etc
- `ScopeTypes`, placing the file in `queries/<language>/scopeTypes.scm`
- `SyntaxNode.Tree.Language`) and so are top down rather than bottom up as cursorless node matchers currently work.
- `@<scopeType>.searchScope`) we first find a match, and then search within that range

The definitions
- `@<scopeType>`, so eg `@namedFunction`
- `@<scopeType>.removalRange` indicates a different range that should be used for removal
- `@<scopeType>.domain` indicates that we should first expand to the smallest containing match for this tag and then search for a rooted instance of `@<scopeType>` within this region. The canonical example for this one is enabling `take value` from within the key in a map: we'd set `@collectionItem.domain` to be the containing pair
- `@<scopeType>.iterationScope` indicates that when user says `"every <scopeType>"`, we should first expand to the smallest instance of this tag, and then yield all top-level instances of `@<scopeType>` within this range. Here, top-level means not contained by any other match within the search range. Also, note that when finding the instances in the range, we should use `@<scopeType>.domain` if it exists. See below for an explanation
- `@<scopeType>.interior` is used by `excludeInterior` and `interiorOnly` stages (see update inside / outside #254)

Migration notes
This will require a replacement of each of the language matcher files with a `scopeTypes.scm` definition. For this reason, we will want to support both paths while the migration occurs. We can keep doing continuous delivery during migration because every language other than C# is well tested.

Questions

- `scopeType` to `textObject`? That is the term used in both nvim tree-sitter, helix, and by redstart voice
- `@<scopeType>.iterationScope`?
- `@<scopeType>.parent`?
- `,` stuff for removal ranges a lot. I wonder if we want to add Toml configuration for languages where we can indicate scopes that should be handled as comma-separated lists. Along this direction, it's worth thinking about the connection to Support generic comma-separated lists #357
- `@<scopeType>.domain`? We do that today by just iterating the parent. Might be useful to keep this one as a fallback 🤷‍♂️

Challenging cases
Why we need to use `@<scopeType>.domain` when searching within `@<scopeType>.iterationScope`

Consider the following case:

If the user says `"take every key fine"`, we want to just return `foo`, excluding the nested key `bar`. In this case `key.iterationScope` is `object` and `key.domain` is `pair`. If we just looked for instances of `key` within the `object`, we'd get the nested key as well. However, if we search for top-level `pair` objects we won't, as desired.

Why `@<scopeType>` must be rooted within `@<scopeType>.domain`

We can actually use the same code example as above:

If the user says "take key" with the cursor at the indicated position (after second opening bracket), we want to select `foo`. We first expand to the containing `pair`, as that is the definition of `key.domain`. Then we need to find the `key`. If we just look for top-level `key`s (ie not contained by other `key`s), we'll end up with both `foo` and `bar`. If we require that the `key` be rooted within the `pair`, that won't happen.

Fwiw, we could possibly instead exclude any `@<scopeType>` matches which are contained within a lower `@<scopeType>.domain`
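
To make the two challenging cases concrete, here is a minimal TypeScript sketch. It assumes each query match carries both its `@key` range and its `@key.domain` range, so that "rooted" simply means "belongs to the same match". The types, function names, hand-written offsets, and the sample document `{"foo": { "bar": 1 }}` are all hypothetical stand-ins for illustration; none of this is Cursorless's actual API or the original code example.

```typescript
// Hypothetical shapes for illustration only; not Cursorless's real types.
interface Range {
  start: number; // inclusive character offset
  end: number; // exclusive character offset
}

interface ScopeMatch {
  label: string;
  content: Range; // range captured as @key
  domain: Range; // range captured as @key.domain (the containing pair)
}

const size = (r: Range): number => r.end - r.start;
const containsRange = (outer: Range, inner: Range): boolean =>
  outer.start <= inner.start && inner.end <= outer.end;
const strictlyContains = (outer: Range, inner: Range): boolean =>
  containsRange(outer, inner) && size(outer) > size(inner);
const containsPosition = (r: Range, pos: number): boolean =>
  r.start <= pos && pos <= r.end;

// Returns the item with the smallest range, or undefined for an empty list.
function smallest<T>(items: readonly T[], rangeOf: (item: T) => Range): T | undefined {
  return [...items].sort((a, b) => size(rangeOf(a)) - size(rangeOf(b)))[0];
}

// "take key": expand to the smallest domain containing the cursor and return
// the match rooted in that domain (the match that owns it).
function containingScope(matches: readonly ScopeMatch[], pos: number): ScopeMatch | undefined {
  return smallest(
    matches.filter((m) => containsPosition(m.domain, pos)),
    (m) => m.domain,
  );
}

// "take every key": expand to the smallest iteration scope containing the
// cursor, then keep matches whose domain is top-level within that scope.
function everyScope(
  matches: readonly ScopeMatch[],
  iterationScopes: readonly Range[],
  pos: number,
): ScopeMatch[] {
  const scope = smallest(
    iterationScopes.filter((r) => containsPosition(r, pos)),
    (r) => r,
  );
  if (scope === undefined) {
    return [];
  }
  const candidates = matches.filter((m) => containsRange(scope, m.domain));
  return candidates.filter(
    (m) => !candidates.some((other) => strictlyContains(other.domain, m.domain)),
  );
}

// Naive variant that ignores domains: keys not contained in another key.
function naiveTopLevelKeys(matches: readonly ScopeMatch[], within: Range): ScopeMatch[] {
  const candidates = matches.filter((m) => containsRange(within, m.content));
  return candidates.filter(
    (m) => !candidates.some((other) => strictlyContains(other.content, m.content)),
  );
}

// Hand-written matches for the assumed document: {"foo": { "bar": 1 }}
const outerObject: Range = { start: 0, end: 21 };
const innerObject: Range = { start: 8, end: 20 };
const iterationScopes: Range[] = [outerObject, innerObject];
const matches: ScopeMatch[] = [
  { label: "foo", content: { start: 1, end: 6 }, domain: { start: 1, end: 20 } },
  { label: "bar", content: { start: 10, end: 15 }, domain: { start: 10, end: 18 } },
];

// Cursor inside "foo": domain-aware iteration skips the nested key.
console.log(everyScope(matches, iterationScopes, 3).map((m) => m.label));
// The naive search also returns the nested key, since keys never nest.
console.log(naiveTopLevelKeys(matches, outerObject).map((m) => m.label));
// Cursor just after the second opening bracket resolves to the outer pair.
console.log(containingScope(matches, 9)?.label);
```

The design choice worth noting is that top-level filtering runs on the domain ranges rather than the key ranges: keys never contain one another, so only the pairs carry the nesting information needed to drop `bar`.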
Resources
Addenda:
- `name` of a node and using this DSL will likely be the approach used to support multi-language documents.

Attributions
👋 Big H/T 🎩 to @wenkokke for the original idea