Skip to content

Rename databricks util module to make it pytest-collectable #30311

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

potiuk
Copy link
Member

@potiuk potiuk commented Mar 26, 2023


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@potiuk
Copy link
Member Author

potiuk commented Mar 26, 2023

See #30306

Copy link
Member

@hussein-awala hussein-awala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@potiuk potiuk changed the title Rename databrics util module to make it pytest-collectable Rename databricks util module to make it pytest-collectable Mar 26, 2023
@potiuk
Copy link
Member Author

potiuk commented Mar 26, 2023

Closing - there is a better fix (and no need to have test_* prefix) in #30315

@potiuk potiuk closed this Mar 26, 2023
potiuk added a commit to potiuk/airflow that referenced this pull request Mar 30, 2023
Currently we had a requirement to use `test_` prefix for modules for all
our tests, however this means that we could have missed some of the
tests from pytest collection when they were placed in a module without
the `test_` prefix. This happened already: see apache#30311 and apache#30306

The previous attempts to remove it failed, because cassandra tests
seemed to be incompatible with pytest collection when we allowed all
Python files, also there were a few names that caused the pytest
collection to fail with selecting all Python files for collection.

The move is accompanied by converting pytest configuration to
pyproject.toml.

After long investigation, it turned out that the cause of cassandra
failures was pytest assert rewrite with collided with Cython-related
type_codes.py definition of types.

See datastax/python-driver#1142 for PR
that is aimed to fix it on cassandra side, in the meantime we
are manually patching the type_codes.py file from Cassandra by adding
PYTEST_DONT_REWRITE to its docstring when building the CI image for
testing.

Another error was using run_module method which also suffers from
assert rewriting incompatibilities but this could be fixed by using
run_path instead. pytest-dev/pytest#9007

Yet another assert-rewrite problem was that ``test_airflow_context``
also failed with ``--assert=rewrite`` turned on when python_files
were not filtered. No solution was found on how we can disable
the rewrite (apparently it happens for yaml.py file but adding
PYTEST_DONT_REWRITE to yaml.py and related files did not help,
so we had to extract the test to separate test type)

Also test in docker_tests/kubernetes_tests should be run with working
directory being in their folders in order to apply pyproject.toml from
their directores (without specifying asyncio mode).

The following changes are applied:

* conversion of pytest configuration from pytest.ini to
  pyproject.toml (for main project, docker_tests and breeze)

* addedd pyproject.toml to breeze, docker_tests, kubernetes_tests
  to reflect they need different configuration for pytest (lack
  of pytest-asyncio for example)

* adding automated patching of Cassandra type_codes.py with
  PYTEST_DONT_REWRITE docstring marker

* add pyproject.toml to docker context in order to include it in
  sources of airflow in the image

* changed working directory for k8s suite of breeze commands to be
  kubernetes_tests

* remove pytest.ini from docker compose mounting and automated removal
  in entrypoin in case old image is used

* rename few breeze function to not start with "test_" so that they
  are not collected as test methods

* remove few test dags from collecton by pytest by name

* replace run_module with run_path usage in EKS test and test_www.

* CI workflow is updated to use docker_tests as working directory
  for those tests

* perf dags were moved out of the tests/test_utils dir to dev
  directory.

* PlainAsserts test type was added and the ``test_airflow_context``
  test is run in this separate test type when running parallel
  tests by default.
potiuk added a commit that referenced this pull request Mar 31, 2023
Currently we had a requirement to use `test_` prefix for modules for all
our tests, however this means that we could have missed some of the
tests from pytest collection when they were placed in a module without
the `test_` prefix. This happened already: see #30311 and #30306

The previous attempts to remove it failed, because cassandra tests
seemed to be incompatible with pytest collection when we allowed all
Python files, also there were a few names that caused the pytest
collection to fail with selecting all Python files for collection.

The move is accompanied by converting pytest configuration to
pyproject.toml.

After long investigation, it turned out that the cause of cassandra
failures was pytest assert rewrite with collided with Cython-related
type_codes.py definition of types.

See datastax/python-driver#1142 for PR
that is aimed to fix it on cassandra side, in the meantime we
are manually patching the type_codes.py file from Cassandra by adding
PYTEST_DONT_REWRITE to its docstring when building the CI image for
testing.

Another error was using run_module method which also suffers from
assert rewriting incompatibilities but this could be fixed by using
run_path instead. pytest-dev/pytest#9007

Yet another assert-rewrite problem was that ``test_airflow_context``
also failed with ``--assert=rewrite`` turned on when python_files
were not filtered. No solution was found on how we can disable
the rewrite (apparently it happens for yaml.py file but adding
PYTEST_DONT_REWRITE to yaml.py and related files did not help,
so we had to extract the test to separate test type)

Also test in docker_tests/kubernetes_tests should be run with working
directory being in their folders in order to apply pyproject.toml from
their directores (without specifying asyncio mode).

The following changes are applied:

* conversion of pytest configuration from pytest.ini to
  pyproject.toml (for main project, docker_tests and breeze)

* addedd pyproject.toml to breeze, docker_tests, kubernetes_tests
  to reflect they need different configuration for pytest (lack
  of pytest-asyncio for example)

* adding automated patching of Cassandra type_codes.py with
  PYTEST_DONT_REWRITE docstring marker

* add pyproject.toml to docker context in order to include it in
  sources of airflow in the image

* changed working directory for k8s suite of breeze commands to be
  kubernetes_tests

* remove pytest.ini from docker compose mounting and automated removal
  in entrypoin in case old image is used

* rename few breeze function to not start with "test_" so that they
  are not collected as test methods

* remove few test dags from collecton by pytest by name

* replace run_module with run_path usage in EKS test and test_www.

* CI workflow is updated to use docker_tests as working directory
  for those tests

* perf dags were moved out of the tests/test_utils dir to dev
  directory.

* PlainAsserts test type was added and the ``test_airflow_context``
  test is run in this separate test type when running parallel
  tests by default.
@potiuk potiuk deleted the make-databrics-utils-pytest-collectable branch April 4, 2023 18:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants