-
Notifications
You must be signed in to change notification settings - Fork 133
Conversation
|
||
namespace Microsoft.PythonTools.Analysis.Indexing { | ||
internal interface ISymbolIndex { | ||
Task<IEnumerable<HierarchicalSymbol>> HierarchicalDocumentSymbolsAsync(Uri uri, CancellationToken token = default); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Task<IEnumerable<T>>
has weird semantic meaning, something like Lazy<Lazy<T>>
. It is better to have Task<IReadOnlyList<T>>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I know. I currently can't do a task which gives back a read only list because I'm using (perhaps abusing) yield return
to lazily traverse the index, then a Take
at the use of the enumerable to limit how far it goes based on the limits provided by the LS. IAsyncEnumerable
hasn't left the proposal phase yet, so I'll have to come up with something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you actually use Task
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely I'll just end up passing limits down to the index when doing lookups, it just feels a lot less clean and inflexible. In any case, everything here in the first commit is just prototype code that I wanted to push somewhere so I wasn't sitting on it anymore. 😃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed your question, sorry.
I don't actually use Task, but Mikhail had suggested sticking async on things (and it made sense, especially given how async the new analyzer is turning out). The future would be to have another implementation of this interface that instead goes to something like a sqlite database on disk or similar, so I wasn't sure if it was a good idea to make this completely synchronous.
The other thing that I was considering was cancellation; If a user is searching for a symbol, I believe the editor will cancel the request and send a new one when the input changes, so I think I'd need to be passing the cancellation token down so I can allow it to throw when it's cancelled to stop the query. Unfortunately, that has the unfortunate effect of my IEnumerable
being tied to that token, so if a user of this interface were to give the query result to something else, that something else may end up being cancelled unexpectedly where it doesn't make sense.
Perhaps I drop the async on this and make it fully synchronous instead, and make a new interface if we need something that's asynchronous (and have the service provide both?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think indexer should be working similar to what async parser and analysis do. I.e.
- Get notification on text changes (probably push rather than listening on events to be less dependent on the source),
- Cancel current indexing on the changed document
- Start internal indexing task asynchronously.
- Fire event 'indexing complete' on every document and on overall end of indexing.
It can probably run indexing on several files in parallel. Indexing results on each document should be an immutable snapshot. Snapshot is getting replaced when indexing on the file is done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I probably don't understand what this method does. Is it a method that awaits for the most recent snapshot or it runs indexing over single document?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is just the query over the index's state. It does no indexing itself. This is somewhat similar to GetModuleVariables
which the indexer will partially replace.
src/LanguageServer/Impl/Implementation/Server.WorkspaceSymbols.cs
Outdated
Show resolved
Hide resolved
Design question: is it possible that indexing of one file can trigger indexing of another file? |
src/LanguageServer/Impl/Implementation/Server.WorkspaceSymbols.cs
Outdated
Show resolved
Hide resolved
No, I don't think so. Symbols are not analysis dependent, at least not in the way I'm trying to implement it. It's a pure function from file content to symbols. |
@AlexanderSher - no, it shouldn't as indexing only consumes changes. |
Then I don't understand how indexing can be async, unless we plan to run on different parts of the document in parallel. |
The end game here is that the indexer does its own parsing instead of relying on some event from the language server to give it the trees. When the new analyzer comes online, there will no longer be a parse event for every file in the project, so the indexer will need to do its own thing in lieu of being given everything. Technically speaking, that doesn't need to be async either, since the walker is fast and the process would generally be "read file, parse, update index", which is just a synchronous action. There are two sides; the API for querying (ISymbolIndex), and then the other side which provides ASTs, the parsing, etc, which will need to be changed. |
Well, "read file" is an I/O operation, which should be async.
Meaning we will be doing the same parsing work twice? |
It should be async as in indexing in its own task/thread upon receiving document change. When done, set result on the task. Like in https://github.com/MikhailArkhipov/python-language-server/blob/analysis10/src/Analysis/Ast/Impl/Modules/PythonModule.cs#L171 and https://github.com/MikhailArkhipov/python-language-server/blob/analysis10/src/Analysis/Ast/Impl/Modules/PythonModule.cs#L244 |
@AlexanderSher "Meaning we will be doing the same parsing work twice?" Possibly, but not always. Indexer processes all files in the workspace. New analyzer is only loading open file and dependencies. So there can be |
Ok, so
And so on. That's why I thought that we need some kind of a service that takes care of all parsing requests and provides awaiters for the code that needs to wait for the most recent parse or index. |
Next update cancels prev parse. May be helpful to sprinkle parser with cancellation. But in RL parsing takes like 10-20ms tops so cancellation mostly means 'ignore the result'. Throttle is done in standard way in editors, i.e. parse on idle + ~200 ms (not yet implemented). Changes 99% come in a single active file, unless someone stats flipping across files and changing them. So # of parallel parsers depends more on # of imports that can be going in parallel. I haven't seen more ~2-3 so far. |
Dropping the WIP, since I think it's looking alright. Let me know what you think. All of the old IMember code is still there in the LS symbols file; I didn't delete any of it even though it's no longer referenced. |
|
||
public SymbolIndex() { } | ||
|
||
public IEnumerable<HierarchicalSymbol> HierarchicalDocumentSymbols(Uri uri) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you can easily return IReadOnlyList<HierarchicalSymbol>
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true, though my intention was that a future PR allows this data to be persisted in a local database, where not needing to collect the entire result beforehand. Though, I guess it's a bit pointless when the HierarchicalSymbol
itself contains an IList<HierarchicalSymbol>
.
…thand names, use queue instead of linked list
We figured out that I forgot to specialize for-loops and generators, so I'm doing that now. |
Any last opinions on this? |
Ship it. |
For #433.
Closes #483.
Closes #484.
Closes #485.
General overview:
__init__
in aclass
will be marked as a constructor.Property
.__.*__
are given the kindOperator
.Method
.UPPERCASE_UNDERSCORE_123
at the top level or directly inside a class are given the kindConstant
.Things to do in future PRs:
Uri
toTaskCompletionSource<HierarchicalSymbol>
).