
Speed up "pythoneval" tests #1668


Closed
gnprice opened this issue Jun 7, 2016 · 6 comments

@gnprice
Collaborator

gnprice commented Jun 7, 2016

The single slowest task in our test suite is testpythoneval.py aka run eval-test -- in fact, on my desktop with 8-way parallelism, the whole suite takes 1m28s, and with that one task omitted it takes 51s. That's because for much of the run it's the lone straggler still going, even though since commit 56bb4ba we start it at the very beginning. (There are also some quick wins that would make the suite much faster than 51s, once we make this task much faster and/or break it into smaller pieces.)

Looking inside the test code, it looks like for each of a number of test cases it does two things:

  1. Run the type-checker on the test program
  2. Run the test program

and then compares the combined output to an expected output.

Is step 2 still useful? It made a lot of sense when mypy was its own implementation of a Python-like language, but now it seems to serve mainly as a regression check on CPython. In principle it could be useful for checking that our tests run the way we think they do, but that's basically a test of the test cases themselves. I'm not sure how often that's useful; most of the test cases are very straightforward. So I suggest we cut it out.
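For concreteness, each case amounts to roughly the following (a sketch only; the paths and driver shape here are illustrative, not the actual test code):

```python
import subprocess
import sys

def run_case(program_path: str, expected_output: str) -> None:
    # Step 1: type-check the test program (in a subprocess, as the
    # current driver does).
    check = subprocess.run(
        [sys.executable, '-m', 'mypy', program_path],
        capture_output=True, text=True,
    )
    # Step 2: actually execute the test program under CPython.
    run = subprocess.run(
        [sys.executable, program_path],
        capture_output=True, text=True,
    )
    # The checker output and the program output are compared together
    # against the case's expected output.
    actual = check.stdout + run.stdout
    assert actual == expected_output, f'mismatch:\n{actual}'
```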

@gvanrossum
Member

I don't think step 2 is useful. We should however keep step 1, since these are the only tests that don't use the fake builtins from mypy/test/data/fixtures/. Unfortunately this means somewhat painful surgery on the test case data files, since the checker output and the CPython output are just combined in the [out] sections. Probably you can run myunit with the -u or -i flag to make it generate new versions of the test case data files.
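For illustration, a pythoneval-style case mixes both kinds of output in one [out] section, roughly like this (the case, messages, and ordering are made up for the example):

```
[case testCombinedOutput]
print('hello')
x = 1 + ''  # caught by mypy
[out]
_program.py:2: error: Unsupported operand types for + ("int" and "str")
hello
```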

@JukkaL
Collaborator

JukkaL commented Jun 7, 2016

Step 2 isn't essential any more. (I'd still consider it at least slightly useful as long as PEP 484 is provisional and the typing module is still evolving. However, it might be better to just move anything that seems valuable to the typing module test suite.)

The biggest bottleneck is processing the stubs for builtins and typing for every test case. If we used incremental type checking to cache these stubs, the test cases would run much faster. We could have a setup method that populates the cache. (This has been discussed elsewhere.)
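A sketch of the idea, assuming the cache is warmed through mypy's command-line --incremental and --cache-dir flags (the actual wiring into the test harness would differ):

```python
import subprocess
import sys
import tempfile
from pathlib import Path

CACHE_DIR = Path(tempfile.mkdtemp(prefix='mypy-test-cache-'))

def warm_cache() -> None:
    # One-time setup: type-check a trivial program that imports typing,
    # so the stubs for builtins and typing land in the incremental cache.
    seed = CACHE_DIR / 'seed.py'
    seed.write_text('import typing\n')
    subprocess.run(
        [sys.executable, '-m', 'mypy', '--incremental',
         '--cache-dir', str(CACHE_DIR), str(seed)],
        check=True,
    )

# Each test case then reuses the warm cache instead of re-processing
# the stubs from scratch:
#   mypy --incremental --cache-dir <CACHE_DIR> _program.py
```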

Beyond that and removing step 2, we have at least these options for speeding things up:

  • Don't run type checking in a subprocess but just call the build function directly (see the sketch after this list).
  • Split slowest tasks into multiple subtasks that are scheduled independently. Once the "evaluation" tests are faster, the type checker tests might even be the next long pole, so we may want to split them.
  • Cache some of the fake stubs used in other tests.
  • Try to reuse subprocesses to avoid some startup overhead. In particular, if we split tasks then running only a small subset of tests may become slower, since we pay the startup overhead multiple times.
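For the first option, here's a minimal sketch using mypy's in-process entry point (mypy.api.run in current mypy) as a rough stand-in for calling the build function directly:

```python
from mypy import api

def check_in_process(program_path: str) -> str:
    # Runs mypy in the current process: equivalent to invoking
    # `mypy program_path` in a subprocess, minus the interpreter
    # startup and module-import cost of each fresh subprocess.
    stdout, stderr, exit_status = api.run([program_path])
    return stdout + stderr
```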

@ddfisher
Collaborator

ddfisher commented Jun 7, 2016

If we're going to be investing significantly in our test suite, I think we should consider moving to a standard test runner like py.test. This would make it easier for new contributors to start on the project, and should open the door to things like getting test coverage, which I think would be pretty useful (because I suspect we have some important uncovered sections).
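As a rough sketch of what that could look like, here's a pythoneval-style case written as an ordinary parametrized py.test test (the inline case list and test name are hypothetical; in practice the cases would be parsed from the existing .test data files):

```python
import pytest
from mypy import api

# Hypothetical inline cases: (case name, program text) pairs.
CASES = [
    ('hello', "print('hello')\n"),
]

@pytest.mark.parametrize('name,program', CASES)
def test_pythoneval(tmp_path, name, program):
    prog = tmp_path / '_program.py'
    prog.write_text(program)
    stdout, stderr, status = api.run([str(prog)])
    assert status == 0, f'mypy reported errors:\n{stdout}{stderr}'
```

Coverage would then come essentially for free via the pytest-cov plugin (pytest --cov=mypy).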

@gnprice
Collaborator Author

gnprice commented Jun 7, 2016

Yeah, I agree that something like py.test would be a good choice and I wouldn't want to spend a lot of effort reinventing the kind of thing it would do for us.

Clearing out unnecessary test cases and steps is effort that I think will carry over well across such a change -- whatever tests we have will get translated across systems, and clearing them out shouldn't be much harder in the existing setup than it would be with py.test. It might even make a migration somewhat easier, if there's less to translate.

Apart from that, the other kind of investment in our test system that I think makes sense now is where there's a big win for a small amount of work, like the reordering and other fixes I made in March. I don't think there's an opportunity like that right now, though there might be after cutting out step 2 of these tests.

I'll cut more specific tasks for removing step 2 and so on.

@ddfisher
Collaborator

ddfisher commented Jun 7, 2016

Agreed.

@ilevkivskyi
Member

Let's first try #5083; if it doesn't give enough of a speed-up, we can reconsider the other options discussed here.
