Conversation
Having all tests use one and the same instance of the server will create a dependency between tests. That's not a bad thing by itself (users run chunks of code), but since test running order isn't guaranteed, it would be a huge problem to track down those issues. Why does server initialization take so much time?
I'm not generating these tests so that we can generate them once and keep using them. They're more for sanity checks and comparison, so that someone can generate them, run the tests, and inspect them for possible bugs to report (see everything that links to #334). Honestly, I think it's valuable for these sorts of tests to not be deterministic, because it shows that we have problems with ordering affecting tests. Creating a server per test takes about a second. If I look at a test's log after running a single test (so one user), I see:
You can see that there's more than a second until the document actually gets opened. As for "why" it's slow, I have no idea other than to say that starting the server is generally slow. All of our tests are slow like this. I don't have a way to profile tests because we're using netcore.
A single test can be slow because it creates a cache for modules, but the remaining tests should use that cache and run faster. What do you see in the logs when several tests are executed?
Each time the server starts, it a) loads builtin types and b) inspects the cache and libraries for scraping.
Ok, but if we have that data in the cache, shouldn't we just load it instead of scraping those libraries again?
Yes, but all of this takes time. builtins is a big module compared to the small amount of code in the test...
I modified things to have every test spawn its own server, and each test takes anywhere between ~500 ms and 1 second, as compared to the first test taking 1 second and all subsequent tests taking a handful of milliseconds. All of the logs look the same as what I copied above, just with varying times until the document is opened.
This isn't right. Loading from the cache should be much faster. It looks like we either ignore the cache in tests or can't find it.
No, I mean loading builtins, as in...
I've found an interesting thing: https://github.com/Microsoft/python-language-server/blob/af7773157b4870fbd05babb144c6c2cfa3ce5bbb/src/Analysis/Engine/Impl/Interpreter/Ast/AstPythonInterpreterFactory.cs#L65:
For one specific test group (with one server per test), doing that lowers the test time from 1:23 to 0:55. In general, instead of a test taking anywhere from 500 ms to a second, it's more like 250 ms to a second. Of course, this is nowhere near as fast as just making the one server and reusing it, because everything happens exactly once and completion/hover is a fast call afterward.
Ok, what takes most of those 250 ms is...
What's the right path forward on this? I can see that with the new import system, the code @AlexanderSher mentioned was moved and is no longer negated, so I should be able to have cached data. However, making one server per test is still as slow as it was the first time around. I still think it's okay in this specific instance to share a server between tests. I'm not expecting to ever pass every test in this suite anyway, and a person is going to have to manually look at the results to generate minimal examples regardless, so I'd rather have them wait double-digit seconds instead of double-digit minutes.
I think it is OK for now. Let's file a work item to review this and address it if there are indeed issues.
* add test generation script
* remove unused imports
* use SemaphoreSlim and async func instead of lock and Task.Run
Closes #334.
The generated code uses a static class to share a server between tests (as every test is read-only). The >1200 tests run in just 16 seconds, as compared to spinning up a server per-test which makes the suite take over 30 minutes.
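The actual generated tests are C#, where the sharing is a static class guarded by a SemaphoreSlim; purely as an illustration of the same lazy, async-safe shared-instance idea, here is a minimal sketch in Python with asyncio. Everything in it (including `create_server()`) is a hypothetical placeholder, not code from this PR.

```python
import asyncio

# Sketch of the shared-server pattern used by the generated tests. The real
# tests are C# (a static class plus SemaphoreSlim); this Python/asyncio
# version is only an illustration, and create_server() is a placeholder.
_server = None
_init_lock = asyncio.Lock()  # rough analogue of SemaphoreSlim(1, 1)

async def create_server():
    """Stand-in for the expensive startup: loading builtins, scraping libraries, etc."""
    await asyncio.sleep(1.0)
    return object()

async def get_shared_server():
    """Initialize the server once; every (read-only) test reuses it."""
    global _server
    if _server is None:
        async with _init_lock:        # serialize first-time initialization
            if _server is None:       # re-check after acquiring the lock
                _server = await create_server()
    return _server
```

Because every test only reads from the server, sharing is safe; the lock exists solely so the expensive startup runs exactly once even when tests start concurrently.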
In order to simulate editor behavior, the test cases will filter their completion results to limit them to entries which contain the preceding token. For example, completing after `hello.wor` will only use entries whose `insertText` contains `wor`, just like VS Code.

Some tests may appear to be flaky at first glance. However, it's not due to the inter-test sharing of a server, but due to what seem to be actual race conditions in our analysis that we should investigate or otherwise solve via the rewrite.
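As a rough sketch of the completion filtering described above (not the generated code itself), assuming completion items are plain dicts with an `insertText` key:

```python
def filter_completions(items, preceding_token):
    """Keep only completion items whose insertText contains the text the
    user has already typed, mimicking how VS Code narrows its list.

    `items` is assumed to be a list of dicts with an "insertText" key;
    that shape is illustrative, not the project's actual result type.
    """
    token = preceding_token.lower()
    return [item for item in items
            if token in item.get("insertText", "").lower()]

# Completing after "hello.wor" keeps only entries containing "wor":
sample = [{"insertText": "world"}, {"insertText": "password"}, {"insertText": "hello"}]
assert [c["insertText"] for c in filter_completions(sample, "wor")] == ["world", "password"]
```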
The script will infer the input/output location based on its own location (`src`), so as long as the TestData is populated, it'll do the rest.

Formatted using black.
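As a loose illustration of that path inference, here is a small Python sketch; the directory names (`src/`, `TestData/`, `GeneratedTests/`) are assumptions for the example, not necessarily the repository's exact layout.

```python
from pathlib import Path

# Sketch of inferring input/output locations from the script's own path.
# Directory names here are illustrative assumptions, not the repo's layout.
SCRIPT_DIR = Path(__file__).resolve().parent      # e.g. <repo>/src
REPO_ROOT = SCRIPT_DIR.parent

TEST_DATA_DIR = REPO_ROOT / "TestData"            # input: populated test data
OUTPUT_DIR = REPO_ROOT / "GeneratedTests"         # output: generated test files

def main() -> None:
    OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
    for case in sorted(TEST_DATA_DIR.glob("*")):
        # One generated test per input; the generation itself is elided here.
        print(f"would generate a test for {case.name}")

if __name__ == "__main__":
    main()
```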