
Speed up "pythoneval" tests #1668


Closed
gnprice opened this issue Jun 7, 2016 · 6 comments

@gnprice
Collaborator

gnprice commented Jun 7, 2016

The single slowest task in our test suite is testpythoneval.py aka run eval-test -- in fact, on my desktop with 8-way parallelism, the whole suite takes 1m28s, and with that one task omitted it takes 51s. That's because for much of the run it's the lone straggler still going, even though since commit 56bb4ba we start it at the very beginning. (There are also some quick wins that would make the suite much faster than 51s, once we make this task much faster and/or break it into smaller pieces.)

Looking inside the test code, it looks like for each of a number of test cases it does two things:

  1. Run the type-checker on the test program
  2. Run the test program

and then compares the combined output to an expected output.

Is step 2 still useful? It made a lot of sense when mypy was its own implementation of a Python-like language, but now it seems to serve mainly as a regression check on CPython. In principle it could be useful for checking that our tests run the way we think they do, but that's basically a test of the test cases themselves. I'm not sure how often that's useful; most of the test cases are very straightforward. So I suggest we cut it out.
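For concreteness, each case amounts to roughly the following (a sketch only; the paths and driver shape here are illustrative, not the actual test code):

```python
import subprocess
import sys

def run_case(program_path: str, expected_output: str) -> None:
    # Step 1: type-check the test program (in a subprocess, as the
    # current driver does).
    check = subprocess.run(
        [sys.executable, '-m', 'mypy', program_path],
        capture_output=True, text=True,
    )
    # Step 2: actually execute the test program under CPython.
    run = subprocess.run(
        [sys.executable, program_path],
        capture_output=True, text=True,
    )
    # The checker output and the program output are compared together
    # against the case's expected output.
    actual = check.stdout + run.stdout
    assert actual == expected_output, f'mismatch:\n{actual}'
```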

@gvanrossum
Member

I don't think step 2 is useful. We should however keep step 1, since these are the only tests that don't use the fake builtins from mypy/test/data/fixtures/. Unfortunately this means somewhat painful surgery on the test case data files, since the checker output and the CPython output are just combined in the [out] sections. Probably you can run myunit with the -u or -i flag to make it generate new versions of the test case data files.
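For illustration, a pythoneval-style case mixes both kinds of output in one [out] section, roughly like this (the case, messages, and ordering are made up for the example):

```
[case testCombinedOutput]
print('hello')
x = 1 + ''  # caught by mypy
[out]
_program.py:2: error: Unsupported operand types for + ("int" and "str")
hello
```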

@JukkaL
Collaborator

JukkaL commented Jun 7, 2016

Step 2 isn't essential any more. (I'd still consider it at least slightly useful as long as PEP 484 is provisional and the typing module is still evolving. However, it might be better to just move anything that seems valuable to the typing module test suite.)

The biggest bottleneck is processing the stubs for builtins and typing for every test case. If we used incremental type checking to cache these stubs, the test cases would run much faster. We could have a setup method that populates the cache. (This has been discussed elsewhere.)
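A sketch of the idea, assuming the cache is warmed through mypy's command-line --incremental and --cache-dir flags (the actual wiring into the test harness would differ):

```python
import subprocess
import sys
import tempfile
from pathlib import Path

CACHE_DIR = Path(tempfile.mkdtemp(prefix='mypy-test-cache-'))

def warm_cache() -> None:
    # One-time setup: type-check a trivial program that imports typing,
    # so the stubs for builtins and typing land in the incremental cache.
    seed = CACHE_DIR / 'seed.py'
    seed.write_text('import typing\n')
    subprocess.run(
        [sys.executable, '-m', 'mypy', '--incremental',
         '--cache-dir', str(CACHE_DIR), str(seed)],
        check=True,
    )

# Each test case then reuses the warm cache instead of re-processing
# the stubs from scratch:
#   mypy --incremental --cache-dir <CACHE_DIR> _program.py
```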

Beyond that and removing step 2, we have at least these options for speeding things up:

  • Don't run type checking in a subprocess but just call the build function directly (see the sketch after this list).
  • Split slowest tasks into multiple subtasks that are scheduled independently. Once the "evaluation" tests are faster, the type checker tests might even be the next long pole, so we may want to split them.
  • Cache some of the fake stubs used in other tests.
  • Try to reuse subprocesses to avoid some startup overhead. In particular, if we split tasks then running only a small subset of tests may become slower, since we pay the startup overhead multiple times.
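For the first option, here's a minimal sketch using mypy's in-process entry point (mypy.api.run in current mypy) as a rough stand-in for calling the build function directly:

```python
from mypy import api

def check_in_process(program_path: str) -> str:
    # Runs mypy in the current process: equivalent to invoking
    # `mypy program_path` in a subprocess, minus the interpreter
    # startup and module-import cost of each fresh subprocess.
    stdout, stderr, exit_status = api.run([program_path])
    return stdout + stderr
```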

@ddfisher
Collaborator

ddfisher commented Jun 7, 2016

If we're going to be investing significantly in our test suite, I think we should consider moving to a standard test runner like py.test. This would make it easier for new contributors to start on the project, and should open the door to things like getting test coverage, which I think would be pretty useful (because I suspect we have some important uncovered sections).
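As a rough sketch of what that could look like, here's a pythoneval-style case written as an ordinary parametrized py.test test (the inline case list and test name are hypothetical; in practice the cases would be parsed from the existing .test data files):

```python
import pytest
from mypy import api

# Hypothetical inline cases: (case name, program text) pairs.
CASES = [
    ('hello', "print('hello')\n"),
]

@pytest.mark.parametrize('name,program', CASES)
def test_pythoneval(tmp_path, name, program):
    prog = tmp_path / '_program.py'
    prog.write_text(program)
    stdout, stderr, status = api.run([str(prog)])
    assert status == 0, f'mypy reported errors:\n{stdout}{stderr}'
```

Coverage would then come essentially for free via the pytest-cov plugin (pytest --cov=mypy).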

@gnprice
Collaborator Author

gnprice commented Jun 7, 2016

Yeah, I agree that something like py.test would be a good choice and I wouldn't want to spend a lot of effort reinventing the kind of thing it would do for us.

Clearing out unnecessary test cases and steps is effort that I think will carry over well across such a change -- whatever tests we have will get translated across systems, and clearing them out shouldn't be much harder in the existing setup than it would be with py.test. It might even make a migration somewhat easier, if there's less to translate.

Apart from that, the other kind of investment in our test system that I think makes sense now is where there's a big win for a small amount of work, like the reordering and other fixes I made in March. I don't think there's an opportunity like that right now, though there might be after cutting out step 2 of these tests.

I'll cut more specific tasks for removing step 2 and so on.

@ddfisher
Collaborator

ddfisher commented Jun 7, 2016

Agreed.

@ilevkivskyi
Member

Let's first try #5083; if it doesn't give enough of a speed-up, we can reconsider the other options discussed here.
