Travis CI tests failing for 3.4 on master #3543

Closed
gvanrossum opened this issue Jun 14, 2017 · 12 comments · Fixed by #4047

Comments

@gvanrossum
Member

See e.g.

These are commits from PRs that passed all tests. The failure is always only on the Python 3.4 build, in different eval-test-* subtests. The "Actual" test output is always empty, which makes me wonder if the processes just die for some Travis-CI-specific reason. I see nothing at https://www.traviscistatus.com/.

@JukkaL
Collaborator

JukkaL commented Jun 14, 2017

I also saw a 3.6 failure: https://travis-ci.org/python/mypy/builds/242766320?utm_source=github_status&utm_medium=notification

Previously I had similar issues apparently caused by Travis CI killing processes when we had too many of them running in parallel. Restricting the maximum level of parallelism in Travis CI could help.

@pkch
Contributor

pkch commented Jun 26, 2017

I also noticed that the parallelization is usually at 32 workers, but occasionally switches to 2 workers without any obvious reason (for just one or two builds). Perhaps the Travis VM reports a different number of cores to our test runner?

I believe we don't have access to sudo; otherwise, we could run

sudo free -m -t
sudo dmesg

to get more diagnostics.
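
For what it's worth, some memory diagnostics don't need root at all; a rough sketch (just an illustration, not part of our tooling) that reads /proc/meminfo on Linux:

    # Print a few memory figures from /proc/meminfo; no sudo needed on Linux.
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith(("MemTotal", "MemAvailable", "SwapTotal")):
                print(line.strip())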

@JukkaL
Collaborator

JukkaL commented Jun 26, 2017

We could try restricting the maximum level of parallelism in Travis CI to, say, 16.
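
If we go that route, the runner-side change could be as small as clamping the detected core count; a minimal sketch (the names are illustrative, not the runner's actual code):

    import os

    MAX_WORKERS = 16  # cap suggested above; lower it further if flakes persist

    # Use whatever core count is reported, but never exceed the cap,
    # and fall back to 2 if the count is unavailable.
    workers = min(os.cpu_count() or 2, MAX_WORKERS)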

@emmatyping
Member

emmatyping commented Jul 2, 2017

According to the Travis docs, we get 2 CPU cores per container (see here). I'm not sure we should go over that, at least not by too much.

EDIT: I just tested, and using 2 cores leads to about a 40% increase in time spent per run. I'm pretty sure we don't want that.

@pkch
Contributor

pkch commented Jul 2, 2017

@ethanhs Yeah, I noticed the same. Even a reduction from 32 to 16 resulted in a slight increase in runtime. Why would that happen, given that we only have 2 cores?

I guess our tests have a decent amount of I/O wait (presumably disk?), and we unintentionally use our (very expensive, process-based) workers to deal with blocking I/O.

Obviously, the ideal solution would be to just use 2 processes instead of 32, but within them create either threads or (better) an asyncio loop to deal with I/O wait (roughly along the lines of the sketch at the end of this comment). But that would require:

  • a verification that my guess is correct
  • a material rewrite of our test runner (which we plan to phase out)
  • giving up on pytest (which, through xdist, supports multiprocessing but has no plugins that support threads or async)

So that ideal solution is no good.

Practically, I think we can just keep the number of workers low enough that the memory problems don't happen, and high enough that blocking on I/O is not a big performance hit.
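
For the record, the thread-based variant would look roughly like this; purely illustrative, with hypothetical run_one and test_cases standing in for whatever the runner actually uses:

    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical sketch: run I/O-heavy test cases on threads inside a single
    # process, instead of paying for one process per test worker.
    def run_all(test_cases, run_one, max_threads=8):
        with ThreadPoolExecutor(max_workers=max_threads) as pool:
            return list(pool.map(run_one, test_cases))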

@emmatyping
Member

Also, I just ran nproc, the coreutils tool that reports the number of available processing units (e.g. for my 4-core, 8-thread i7 it says 8). Travis is saying 32.

I think investigating whether your suspicion is correct would be very useful. One interesting thing I've noticed is that all of the failures are on the longest-running container. If you think it is switching to 2 workers randomly, perhaps we can use nproc to help debug and see if that changes on failing builds?
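
Something as simple as logging the counts at the start of every build would let us compare failing and passing runs; a possible sketch (not wired into anything yet):

    import os
    import subprocess

    # Log what coreutils and Python each report on this worker so we can
    # compare the numbers on failing vs. passing builds.
    print("nproc:", subprocess.check_output(["nproc"]).decode().strip())
    print("os.cpu_count():", os.cpu_count())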

@JukkaL
Collaborator

JukkaL commented Jul 3, 2017

Let's just decrease the maximum parallelism from 32 to 16 and see if that fixes the problem instead of doing anything more involved. It isn't very valuable to understand the root cause if it's specific to Travis CI and we can find a simple workaround. Slower tests are generally preferable to unreliable tests, in my opinion.

gvanrossum pushed a commit that referenced this issue Jul 5, 2017
This is an attempt to fix the spurious errors that have happened in Travis (see #3543). Hopefully we won't need to reduce this further to avoid errors.
@matthiaskramm
Contributor

It doesn't seem like the decrease helped. Tests are still failing in typeshed CI, pretty consistently now.

@emmatyping
Member

I believe the issue there is that mypy is run with the default number of concurrent processes, which on a Travis worker is 32. If we lower that to 12, it should improve things (but probably won't solve them).

@JelleZijlstra
Member

Thanks for finding that! Would you mind submitting a PR to typeshed to fix it?

JelleZijlstra pushed a commit to python/typeshed that referenced this issue Sep 20, 2017
Mypy has issues with running its test suite with many processes
concurrently. This should reduce Travis test failures, if not resolve
them completely. See issue python/mypy#3543
@gvanrossum
Member Author

Are we still seeing flakes on Travis-CI? Jukka and I discussed this offline, and the best approach we can think of is to increase the test timeout from 30s to 5min. If it then goes away, we can assume that was the issue. The timeouts rarely, if ever, caught anything real (most tests don't even run with a timeout).
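
For reference, the kind of change being proposed is just bumping a per-test timeout; a hedged sketch of how a subprocess-based test step might apply it (the names here are illustrative, not the runner's actual code):

    import subprocess

    TEST_TIMEOUT = 300  # seconds; 5 minutes instead of the previous 30s

    # Run one test command and let it take up to five minutes before we
    # consider it hung; raises subprocess.TimeoutExpired on timeout.
    def run_test(cmd):
        return subprocess.run(cmd, timeout=TEST_TIMEOUT)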

@emmatyping
Member

Are we still seeing flakes on Travis-CI?

Yes, this was hit in #4041, I believe. I'll make a PR to up the timeout to 5 minutes.
