CI: run Python tests in random order/parallel [WIP] #9266

Closed
dbaston wants to merge 25 commits into OSGeo:master from dbaston:pytest-independent

dbaston (Member) commented Feb 20, 2024

What does this PR do?

Updates the Ubuntu 20.04 CI configuration so that most Python tests run in random order and, in some cases, in parallel.

I am still working through a few remaining failures, e.g. https://github.com/dbaston/gdal/actions/runs/7977513063/job/21780552676#step:16:3591 and https://github.com/dbaston/gdal/actions/runs/7978242428/job/21782935240#step:16:4345
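To illustrate the kind of hidden coupling that random ordering is meant to expose, here is a minimal sketch. It is not taken from the GDAL test suite: the shared `_cache` dictionary and the two check functions are hypothetical stand-ins for tests that touch the same state.

```python
# Hypothetical shared state, standing in for a file or a global config option.
_cache = {}


def writes_stats():
    # Stand-in for a test that leaves state behind.
    _cache["stats"] = 42
    assert _cache["stats"] == 42


def expects_no_stats():
    # Stand-in for a test that only passes if it runs before writes_stats.
    assert "stats" not in _cache


def run_in_order(checks):
    """Run the checks in the given order from a clean state; return failing names."""
    _cache.clear()
    failures = []
    for check in checks:
        try:
            check()
        except AssertionError:
            failures.append(check.__name__)
    return failures


checks = [expects_no_stats, writes_stats]
in_order = run_in_order(checks)                        # definition order
reversed_order = run_in_order(list(reversed(checks)))  # one possible shuffle
print(in_order, reversed_order)
```

The suite passes in its usual order and fails only when the order changes, which is why such dependencies can go unnoticed until the order is randomized.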

What are related issues/pull requests?

#4407

Tasklist

  • Review
  • Adjust for comments
  • All CI builds and checks have passed

coveralls (Collaborator) commented Feb 20, 2024

Coverage Status

coverage: 68.949% (remained the same) when pulling 136d934 on dbaston:pytest-independent into cf119e5 on OSGeo:master.

dbaston (Member, Author) commented Feb 20, 2024

I can reproduce the ogr_tiledb.py crash locally (using the Ubuntu 20.04 container). It appears to crash when loading pyarrow:

Thread 1 "python3" received signal SIGSEGV, Segmentation fault.
0x00007fefb5218f6b in __pyx_convert_PyBytes_string_to_py_std__in_string(std::string const&) ()
   from /usr/local/lib/python3.8/dist-packages/pyarrow/lib.cpython-38-x86_64-linux-gnu.so
(gdb) py-bt
Traceback (most recent call first):
  <built-in method exec_dynamic of module object at remote 0x7fefd25b80e0>
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 1174, in exec_module
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "/usr/local/lib/python3.8/dist-packages/pyarrow/__init__.py", line 321, in <module>
    }
  <built-in method exec of module object at remote 0x7fefd25b00e0>
  File "/usr/local/lib/python3.8/dist-packages/_pytest/assertion/rewrite.py", line 178, in exec_module
    exec(co, module.__dict__)
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  <built-in method __import__ of module object at remote 0x7fefd25b00e0>
  File "/usr/local/lib/python3.8/dist-packages/_pytest/outcomes.py", line 285, in importorskip
    __import__(modname)
  File "/home/dan/dev/gdal/build-ubuntu_20.04/autotest/ogr/ogr_tiledb.py", line 1136, in test_ogr_tiledb_arrow_stream_pyarrow

dbaston force-pushed the pytest-independent branch 6 times, most recently from d82c21f to f9c7b0f, on April 4, 2024 20:25
dbaston force-pushed the pytest-independent branch from f9c7b0f to 80b76e1 on April 5, 2024 12:02
dbaston (Member, Author) commented Apr 18, 2024

I'm going to propose closing this in favor of a separate PR that includes this batch of test changes but does not enable random ordering or parallel tests for CI.

With random order testing, I see occasional segfaults when loading the pyarrow module:
https://github.com/OSGeo/gdal/actions/runs/8575988473/job/23520819962#step:16:3682
https://github.com/OSGeo/gdal/actions/runs/8575988473/job/23506108879#step:16:3650

This seems like an environment issue, and fixing it is beyond the spirit of #4407.

With regard to parallel testing, I see very rare failures such as this one, which appears to involve two tests operating on the same files in the data/ directory, one of which creates statistics that the other does not expect.

https://github.com/OSGeo/gdal/actions/runs/8575988473/job/23541380513#step:16:4086

I've verified that tests are cleaning up after themselves, i.e. any changes to the data/ directory are limited to the test scope. So fixing this would involve always copying files from data/ to a temporary directory if they may be modified by a test. It's not a bad goal, but it would require a lot of changes and (IMO) can be left for another day.
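A minimal sketch of what "copy before modifying" could look like, using only the standard library. The helper name `private_copy`, the file name `example.tif`, and the `.aux.xml` sidecar used to stand in for written statistics are all illustrative assumptions, not code from this PR.

```python
import shutil
import tempfile
from pathlib import Path


def private_copy(src, scratch_dir):
    """Copy a shared data file into scratch_dir and return the path of the copy."""
    scratch_dir = Path(scratch_dir)
    scratch_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, scratch_dir / Path(src).name))


# Throwaway directories stand in for data/ and for a per-test temporary directory.
with tempfile.TemporaryDirectory() as root:
    data_dir = Path(root) / "data"
    data_dir.mkdir()
    shared = data_dir / "example.tif"  # hypothetical shared input
    shared.write_bytes(b"original")

    work = private_copy(shared, Path(root) / "test_a")
    work.write_bytes(b"modified by test")
    # Anything written next to the copy (e.g. a statistics sidecar) stays private.
    work.with_name(work.name + ".aux.xml").write_text("<stats/>")

    shared_after = shared.read_bytes()
    data_listing = sorted(p.name for p in data_dir.iterdir())

print(shared_after, data_listing)
```

In a pytest test, the scratch directory would typically be the built-in `tmp_path` fixture, so each test (and each parallel worker) gets its own copy and data/ is never written to.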

rouault (Member) commented Apr 18, 2024

> I'm going to propose closing this in favor of a separate PR that includes this batch of test changes but does not enable random ordering or parallel tests for CI.

+1

rcoup (Member) commented Apr 18, 2024

> fixing this would involve always copying files from data/ to a temporary directory if they may be modified by a test. It's not a bad goal, but it would require a lot of changes and (IMO) can be left for another day.

Maybe Python will eventually do copy-on-write copies on macOS and Linux, which would make this fast and painless (in the meantime, cp on Linux and cp -c on macOS already use it).
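As a rough sketch of the "in the meantime" option, a test helper could shell out to cp and fall back to a regular copy. The helper name `cow_copy` is hypothetical; `--reflink=auto` assumes GNU coreutils, and whether a clone is actually made depends on the filesystem.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def cow_copy(src, dst):
    """Copy src to dst, asking cp for a copy-on-write clone where available."""
    if sys.platform == "darwin":
        cmd = ["cp", "-c", str(src), str(dst)]  # macOS: clone the file
    else:
        cmd = ["cp", "--reflink=auto", str(src), str(dst)]  # GNU cp (assumed)
    try:
        subprocess.run(cmd, check=True, capture_output=True)
    except (OSError, subprocess.CalledProcessError):
        # No cp, or cp does not accept the flag: do an ordinary copy instead.
        shutil.copy2(src, dst)
    return Path(dst)


with tempfile.TemporaryDirectory() as root:
    src = Path(root) / "shared.bin"
    src.write_bytes(b"original")
    dst = cow_copy(src, Path(root) / "private.bin")
    dst.write_bytes(b"changed")  # writing to the copy must not touch the source
    src_after = src.read_bytes()
    dst_after = dst.read_bytes()

print(src_after, dst_after)
```

Either way the caller gets an independent file; copy-on-write only changes how cheap the copy is.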

dbaston (Member, Author) commented Apr 18, 2024

Closing this in favor of #9698

dbaston closed this Apr 18, 2024
4 participants