Pytest assert rewrite fails when rewriting some parts of cassandra cython code #10844

Open
4 tasks done
potiuk opened this issue Mar 28, 2023 · 10 comments
Labels
topic: rewrite (related to the assertion rewrite mechanism)
type: bug (problem that needs to be addressed)

Comments

potiuk commented Mar 28, 2023

It seems that assert rewriting has a subtle bug that causes a crash during collection and execution when the code being imported pulls in cassandra.

Context: When we added python_files = "*.py" to Apache Airflow so that we would not accidentally skip some of our tests (apache/airflow#30315), the PRs started to fail with a mysterious error:

from cassandra.cluster import Cluster
cassandra/cluster.py:48: in init cassandra.cluster
    ???
cassandra/connection.py:40: in init cassandra.connection
    ???
cassandra/protocol.py:698: in genexpr
    ???
cassandra/protocol.py:698: in genexpr
    ???
E   KeyError: '@py_builtins'

After some (difficult and wild) investigation, it turned out that this happens because pytest's assert rewriting fails when it rewrites the https://github.com/datastax/python-driver/blob/master/cassandra/type_codes.py file (which apparently is related to the Cython integration: https://github.com/datastax/python-driver/blob/master/cassandra/type_codes.pxd).

This is the most likely cause, because either adding --assert=plain or patching the type_codes.py file by adding PYTEST_DONT_REWRITE to its docstring solves the problem.
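The docstring workaround can be illustrated with a small check. This is a minimal sketch and not pytest's own implementation: `has_dont_rewrite_marker` is a hypothetical helper, the `X = 1` line is a placeholder, and the only thing assumed is what is described above, namely that the opt-out is the literal text PYTEST_DONT_REWRITE placed in the module docstring.

```python
import ast

MARKER = "PYTEST_DONT_REWRITE"


def has_dont_rewrite_marker(source: str) -> bool:
    """Return True if the module docstring of `source` contains the marker."""
    docstring = ast.get_docstring(ast.parse(source))
    return docstring is not None and MARKER in docstring


# Two hypothetical module sources: one patched with the marker, one not.
patched = '"""\nPYTEST_DONT_REWRITE\nModule with constants for Cassandra type codes.\n"""\nX = 1\n'
unpatched = '"""\nModule with constants for Cassandra type codes.\n"""\nX = 1\n'
```

Running the helper over an installed file (for example `has_dont_rewrite_marker(path.read_text())`) is a quick way to see whether a given copy of a module carries the workaround.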

I've opened a PR to cassandra to include PYTEST_DONT_REWRITE (datastax/python-driver#1142), and in Apache Airflow we have a PR to automatically patch the cassandra driver with it (apache/airflow#30315), but those are merely workarounds for the problem.
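Automatic patching of this kind could look roughly like the following. This is a minimal sketch and not the code from apache/airflow#30315: `add_marker` and `patch_file` are hypothetical helpers, and the sketch assumes the target file either begins directly with its module docstring or has no leading docstring at all.

```python
from pathlib import Path

MARKER = "PYTEST_DONT_REWRITE"


def add_marker(source: str) -> str:
    """Insert the marker as the first docstring line; no-op if already present."""
    if MARKER in source:
        return source
    for quote in ('"""', "'''"):
        if source.startswith(quote):
            rest = source[len(quote):]
            # Keep the marker on its own line even for one-line docstrings.
            separator = "" if rest.startswith("\n") else "\n"
            return quote + "\n" + MARKER + separator + rest
    # No leading docstring: prepend one that only carries the marker.
    return f'"""{MARKER}"""\n' + source


def patch_file(path: Path) -> bool:
    """Patch `path` in place; return True if the file was changed."""
    original = path.read_text(encoding="utf-8")
    patched = add_marker(original)
    if patched != original:
        path.write_text(patched, encoding="utf-8")
    return patched != original
```

The function is idempotent, so it can run on every image build without stacking markers.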

Reproduction:

An easy way to reproduce it:

  1. Pull the Airflow CI image that contains the workaround and all the Airflow dependencies (it contains the patched type_codes.py):
docker pull ghcr.io/apache/airflow/main/ci/python3.10:8580edf1cb0e67efdf45e6686d2f0239bc8f1ebb
  2. Enter the image (you will be dropped into a shell with everything ready to run the tests):
docker run -it ghcr.io/apache/airflow/main/ci/python3.10:8580edf1cb0e67efdf45e6686d2f0239bc8f1ebb
  3. Collect the cassandra tests:
pytest --collect-only tests/providers/apache/cassandra/sensors/

This results in:

8 tests collected in 0.28s

  4. Remove PYTEST_DONT_REWRITE from the patched type_codes.py:
vi /usr/local/lib/python3.10/site-packages/cassandra/type_codes.py

The module's docstring currently contains PYTEST_DONT_REWRITE. Remove that line and save the file.

"""
PYTEST_DONT_REWRITE
Module with constants for Cassandra type codes.
...

  5. Collect the cassandra tests again:
pytest --collect-only tests/providers/apache/cassandra/sensors/

This results in a series of errors, one for each test being collected:

_______________________________________________________________ ERROR collecting tests/providers/apache/cassandra/sensors/test_record.py _______________________________________________________________
tests/providers/apache/cassandra/sensors/test_record.py:22: in <module>
    from airflow.providers.apache.cassandra.sensors.record import CassandraRecordSensor
<frozen importlib._bootstrap>:1027: in _find_and_load
    ???
<frozen importlib._bootstrap>:1006: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:688: in _load_unlocked
    ???
/usr/local/lib/python3.10/site-packages/_pytest/assertion/rewrite.py:168: in exec_module
    exec(co, module.__dict__)
airflow/providers/apache/cassandra/sensors/record.py:26: in <module>
    from airflow.providers.apache.cassandra.hooks.cassandra import CassandraHook
<frozen importlib._bootstrap>:1027: in _find_and_load
    ???
<frozen importlib._bootstrap>:1006: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:688: in _load_unlocked
    ???
/usr/local/lib/python3.10/site-packages/_pytest/assertion/rewrite.py:168: in exec_module
    exec(co, module.__dict__)
airflow/providers/apache/cassandra/hooks/cassandra.py:24: in <module>
    from cassandra.cluster import Cluster, Session
cassandra/cluster.py:48: in init cassandra.cluster
    ???
cassandra/connection.py:40: in init cassandra.connection
    ???
cassandra/protocol.py:698: in genexpr
    ???
cassandra/protocol.py:698: in genexpr
    ???
E   KeyError: '@py_builtins'
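
The `KeyError: '@py_builtins'` in this traceback is consistent with one possible mechanism, offered here as an assumption and not as a confirmed diagnosis: pytest's assertion rewriter adds helper imports named `@py_builtins` and `@pytest_ar` to the namespace of each module it rewrites, so code that iterates over a rewritten module's `__dict__` and looks every public name up in another namespace would fail on them. The sketch below simulates that with plain dictionaries; `lookup_public_names` and all dictionary contents are hypothetical and not taken from cassandra-driver.

```python
def lookup_public_names(module_dict: dict, namespace: dict) -> dict:
    """Map each public value in `module_dict` to the same-named entry in `namespace`."""
    return {
        value: namespace[name]
        for name, value in module_dict.items()
        if not name.startswith("_")
    }


# Hypothetical stand-ins: a constants module and objects named after its constants.
constants = {"__name__": "constants_module", "TypeA": 1, "TypeB": 2}
classes = {"TypeA": "<class TypeA>", "TypeB": "<class TypeB>"}

# Without rewriting, every public name has a counterpart and the lookup succeeds.
ok_result = lookup_public_names(constants, classes)

# A rewritten module would additionally carry the rewriter's helper name.
# It does not start with "_", so the filter above lets it through.
rewritten = dict(constants)
rewritten["@py_builtins"] = object()

missing_key = None
try:
    lookup_public_names(rewritten, classes)
except KeyError as exc:
    missing_key = exc.args[0]
```

If this assumption holds, it would also explain why both workarounds help: with --assert=plain or PYTEST_DONT_REWRITE the module is not rewritten, so the extra names never appear in its namespace.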

Mandatory information:

Versions

  • Pytest: 7.2.2
  • OS: docker container based on debian buster (official Python 3.10 image - same for other python versions)

Linux 209653871bc9 5.15.0-67-generic #74-Ubuntu SMP Wed Feb 22 14:14:39 UTC 2023 x86_64 GNU/Linux

The output of pip list:

Package                                Version     Editable project location
-------------------------------------- ----------- -------------------------
adal                                   1.2.7
aiobotocore                            2.5.0
aiofiles                               22.1.0
aiohttp                                3.8.4
aioitertools                           0.11.0
aioresponses                           0.7.4
aiosignal                              1.3.1
alabaster                              0.7.13
alembic                                1.10.2
aliyun-python-sdk-core                 2.13.36
aliyun-python-sdk-kms                  2.16.0
amqp                                   5.1.1
analytics-python                       1.4.post1
ansiwrap                               0.8.4
anyio                                  3.6.2
apache-airflow                         2.6.0.dev0  /opt/airflow
apache-beam                            2.46.0
apispec                                5.2.2
appdirs                                1.4.4
argcomplete                            3.0.5
arrow                                  1.2.3
asana                                  3.2.0
asgiref                                3.6.0
asn1crypto                             1.5.1
astroid                                2.15.1
asttokens                              2.2.1
async-timeout                          4.0.2
asynctest                              0.13.0
atlasclient                            1.0.0
atlassian-python-api                   3.35.0
attrs                                  22.2.0
Authlib                                1.2.0
aws-sam-translator                     1.63.0
aws-xray-sdk                           2.11.0
azure-batch                            13.0.0
azure-common                           1.1.28
azure-core                             1.26.3
azure-cosmos                           4.3.1
azure-datalake-store                   0.0.52
azure-identity                         1.12.0
azure-keyvault-secrets                 4.7.0
azure-kusto-data                       0.0.45
azure-mgmt-containerinstance           1.5.0
azure-mgmt-core                        1.3.2
azure-mgmt-datafactory                 1.1.0
azure-mgmt-datalake-nspkg              3.0.1
azure-mgmt-datalake-store              0.5.0
azure-mgmt-nspkg                       3.0.2
azure-mgmt-resource                    23.0.0
azure-nspkg                            3.0.2
azure-servicebus                       7.8.3
azure-storage-blob                     12.15.0
azure-storage-common                   2.1.0
azure-storage-file                     2.1.0
azure-storage-file-datalake            12.10.1
azure-synapse-spark                    0.7.0
Babel                                  2.12.1
backcall                               0.2.0
backoff                                1.10.0
bcrypt                                 4.0.1
beautifulsoup4                         4.12.0
billiard                               3.6.4.0
bitarray                               2.7.3
black                                  23.1a1
bleach                                 6.0.0
blinker                                1.5
boto                                   2.49.0
boto3                                  1.26.76
botocore                               1.29.76
bowler                                 0.9.0
cachelib                               0.9.0
cachetools                             5.3.0
cassandra-driver                       3.25.0
cattrs                                 22.2.0
celery                                 5.2.7
certifi                                2022.12.7
cffi                                   1.15.1
cfgv                                   3.3.1
cfn-lint                               0.76.1
cgroupspy                              0.2.2
chardet                                4.0.0
charset-normalizer                     2.1.1
checksumdir                            1.2.0
ciso8601                               2.3.0
click                                  8.1.3
click-default-group                    1.2.2
click-didyoumean                       0.3.0
click-plugins                          1.1.1
click-repl                             0.2.0
clickclick                             20.10.2
cloudant                               2.15.0
cloudpickle                            2.2.1
colorama                               0.4.6
colorlog                               4.8.0
ConfigUpdater                          3.1.1
connexion                              2.14.2
coverage                               7.2.2
crcmod                                 1.7
cron-descriptor                        1.2.35
croniter                               1.3.8
cryptography                           39.0.2
curlify                                2.2.1
dask                                   2023.3.2
databricks-sql-connector               2.4.1
datadog                                0.45.0
db-dtypes                              1.0.5
decorator                              5.1.1
defusedxml                             0.7.1
Deprecated                             1.2.13
dill                                   0.3.1.1
distlib                                0.3.6
distributed                            2023.3.2
dnspython                              2.3.0
docker                                 6.0.1
docopt                                 0.6.2
docutils                               0.16
ecdsa                                  0.18.0
elasticsearch                          7.13.4
elasticsearch-dbapi                    0.2.10
elasticsearch-dsl                      7.4.1
email-validator                        1.3.1
entrypoints                            0.4
eralchemy2                             1.3.7
et-xmlfile                             1.1.0
eventlet                               0.33.3
exceptiongroup                         1.1.1
execnet                                1.9.0
executing                              1.2.0
facebook-business                      16.0.1
fastavro                               1.7.3
fasteners                              0.18
fastjsonschema                         2.16.3
filelock                               3.10.7
fissix                                 21.11.13
Flask                                  2.2.3
Flask-AppBuilder                       4.3.0
Flask-Babel                            2.0.0
Flask-Bcrypt                           1.0.1
Flask-Caching                          2.0.2
Flask-JWT-Extended                     4.4.4
Flask-Limiter                          3.3.0
Flask-Login                            0.6.2
Flask-Session                          0.4.0
Flask-SQLAlchemy                       2.5.1
Flask-WTF                              1.1.1
flower                                 1.2.0
frozenlist                             1.3.3
fsspec                                 2023.3.0
future                                 0.18.3
gcloud-aio-auth                        4.2.0
gcloud-aio-bigquery                    6.3.0
gcloud-aio-storage                     8.1.0
gcsfs                                  2023.3.0
geomet                                 0.2.1.post1
gevent                                 22.10.2
gitdb                                  4.0.10
GitPython                              3.1.31
google-ads                             18.0.0
google-api-core                        2.8.2
google-api-python-client               1.12.11
google-auth                            2.16.3
google-auth-httplib2                   0.1.0
google-auth-oauthlib                   0.8.0
google-cloud-aiplatform                1.16.1
google-cloud-appengine-logging         1.1.3
google-cloud-audit-log                 0.2.4
google-cloud-automl                    2.8.0
google-cloud-bigquery                  2.34.4
google-cloud-bigquery-datatransfer     3.7.0
google-cloud-bigquery-storage          2.14.1
google-cloud-bigtable                  2.11.1
google-cloud-build                     3.9.0
google-cloud-compute                   0.7.0
google-cloud-container                 2.11.1
google-cloud-core                      2.3.2
google-cloud-datacatalog               3.9.0
google-cloud-dataflow-client           0.5.4
google-cloud-dataform                  0.2.0
google-cloud-dataplex                  1.1.0
google-cloud-dataproc                  5.0.0
google-cloud-dataproc-metastore        1.6.0
google-cloud-dlp                       3.8.0
google-cloud-kms                       2.12.0
google-cloud-language                  1.3.2
google-cloud-logging                   3.2.1
google-cloud-memcache                  1.4.1
google-cloud-monitoring                2.11.0
google-cloud-orchestration-airflow     1.4.1
google-cloud-os-login                  2.7.1
google-cloud-pubsub                    2.13.5
google-cloud-redis                     2.9.0
google-cloud-resource-manager          1.6.0
google-cloud-secret-manager            1.0.2
google-cloud-spanner                   1.19.3
google-cloud-speech                    1.3.4
google-cloud-storage                   2.7.0
google-cloud-tasks                     2.10.1
google-cloud-texttospeech              1.0.3
google-cloud-translate                 1.7.2
google-cloud-videointelligence         1.16.3
google-cloud-vision                    1.0.2
google-cloud-workflows                 1.7.1
google-crc32c                          1.5.0
google-resumable-media                 2.4.1
googleapis-common-protos               1.56.4
graphql-core                           3.2.3
graphviz                               0.20.1
greenlet                               2.0.2
grpc-google-iam-v1                     0.12.4
grpcio                                 1.53.0
grpcio-gcp                             0.2.2
grpcio-status                          1.48.2
gssapi                                 1.8.2
gunicorn                               20.1.0
h11                                    0.14.0
hdfs                                   2.7.0
HeapDict                               1.0.1
hmsclient                              0.1.1
httpcore                               0.16.3
httplib2                               0.21.0
httpx                                  0.23.3
humanize                               4.6.0
hvac                                   1.1.0
identify                               2.5.22
idna                                   3.4
ijson                                  3.2.0.post0
imagesize                              1.4.1
importlib-metadata                     6.1.0
importlib-resources                    5.12.0
impyla                                 0.18.0
incremental                            22.10.0
inflection                             0.5.1
influxdb-client                        1.36.1
iniconfig                              2.0.0
ipdb                                   0.13.13
ipython                                8.11.0
isodate                                0.6.1
itsdangerous                           2.1.2
jaraco.classes                         3.2.3
JayDeBeApi                             1.2.3
jedi                                   0.18.2
jeepney                                0.8.0
Jinja2                                 3.1.2
jira                                   3.5.0
jmespath                               0.10.0
JPype1                                 1.4.1
jschema-to-python                      1.2.3
json-merge-patch                       0.2
jsondiff                               2.0.0
jsonpatch                              1.32
jsonpath-ng                            1.5.3
jsonpickle                             3.0.1
jsonpointer                            2.3
jsonschema                             4.17.3
jsonschema-spec                        0.1.4
junit-xml                              1.9
jupyter_client                         8.1.0
jupyter_core                           5.3.0
keyring                                23.13.1
kombu                                  5.2.4
krb5                                   0.5.0
kubernetes                             23.6.0
kubernetes-asyncio                     24.2.2
kylinpy                                2.8.4
lazy-object-proxy                      1.9.0
ldap3                                  2.9.1
limits                                 3.3.1
linkify-it-py                          2.0.0
locket                                 1.0.0
lockfile                               0.12.2
looker-sdk                             23.2.0
lxml                                   4.9.2
lz4                                    4.3.2
Mako                                   1.2.4
Markdown                               3.4.3
markdown-it-py                         2.2.0
MarkupSafe                             2.1.2
marshmallow                            3.19.0
marshmallow-enum                       1.5.1
marshmallow-oneofschema                3.0.1
marshmallow-sqlalchemy                 0.26.1
matplotlib-inline                      0.1.6
mdit-py-plugins                        0.3.5
mdurl                                  0.1.2
mongomock                              4.1.2
monotonic                              1.6
more-itertools                         9.1.0
moreorless                             0.4.0
moto                                   4.1.6
mpmath                                 1.3.0
msal                                   1.21.0
msal-extensions                        1.0.0
msgpack                                1.0.5
msrest                                 0.7.1
msrestazure                            0.6.4
multi-key-dict                         2.0.3
multidict                              6.0.4
mypy                                   1.0.0
mypy-boto3-appflow                     1.26.78
mypy-boto3-rds                         1.26.99
mypy-boto3-redshift-data               1.26.88
mypy-extensions                        1.0.0
mysql-connector-python                 8.0.32
mysqlclient                            2.1.1
nbclient                               0.7.2
nbformat                               5.8.0
neo4j                                  5.6.0
networkx                               3.0
nodeenv                                1.7.0
numpy                                  1.24.2
oauthlib                               3.2.2
objsize                                0.6.1
openapi-schema-validator               0.4.4
openapi-spec-validator                 0.5.6
openpyxl                               3.1.2
opentelemetry-api                      1.15.0
opentelemetry-exporter-otlp            1.15.0
opentelemetry-exporter-otlp-proto-grpc 1.15.0
opentelemetry-exporter-otlp-proto-http 1.15.0
opentelemetry-exporter-prometheus      1.12.0rc1
opentelemetry-proto                    1.15.0
opentelemetry-sdk                      1.15.0
opentelemetry-semantic-conventions     0.36b0
opsgenie-sdk                           2.1.5
oracledb                               1.2.2
ordered-set                            4.1.0
orjson                                 3.8.8
oscrypto                               1.3.0
oss2                                   2.17.0
packaging                              21.3
pandas                                 1.5.3
pandas-gbq                             0.17.9
papermill                              2.4.0
paramiko                               3.1.0
parso                                  0.8.3
partd                                  1.3.0
pathable                               0.4.3
pathspec                               0.9.0
pbr                                    5.11.1
pdpyras                                4.5.2
pendulum                               2.1.2
pexpect                                4.8.0
pickleshare                            0.7.5
pinotdb                                0.4.14
pip                                    23.0.1
pipdeptree                             2.7.0
pipx                                   1.2.0
pkginfo                                1.9.6
platformdirs                           3.2.0
pluggy                                 1.0.0
ply                                    3.11
plyvel                                 1.5.0
portalocker                            2.7.0
pre-commit                             3.2.1
presto-python-client                   0.8.3
prison                                 0.2.1
prometheus-client                      0.16.0
prompt-toolkit                         3.0.38
proto-plus                             1.19.6
protobuf                               3.20.0
psutil                                 5.9.4
psycopg2-binary                        2.9.5
ptyprocess                             0.7.0
pure-eval                              0.2.2
pure-sasl                              0.6.2
py-partiql-parser                      0.1.0
py4j                                   0.10.9.5
pyarrow                                9.0.0
pyasn1                                 0.4.8
pyasn1-modules                         0.2.8
pycountry                              22.3.5
pycparser                              2.21
pycryptodome                           3.17
pycryptodomex                          3.17
pydantic                               1.10.7
pydata-google-auth                     1.7.0
pydot                                  1.4.2
pydruid                                0.6.5
pyenchant                              3.2.2
pyexasol                               0.25.2
PyGithub                               1.58.1
Pygments                               2.14.0
pygraphviz                             1.10
pyhcl                                  0.4.4
PyHive                                 0.6.5
PyJWT                                  2.6.0
pykerberos                             1.2.4
pymongo                                3.13.0
pymssql                                2.2.7
PyNaCl                                 1.5.0
pyodbc                                 4.0.35
pyOpenSSL                              23.1.1
pyparsing                              3.0.9
pypsrp                                 0.8.1
pyrsistent                             0.19.3
pyspark                                3.3.2
pyspnego                               0.8.0
pytest                                 7.2.2
pytest-asyncio                         0.21.0
pytest-capture-warnings                0.0.4
pytest-cov                             4.0.0
pytest-httpx                           0.21.3
pytest-instafail                       0.4.2
pytest-rerunfailures                   11.1.2
pytest-timeouts                        1.2.1
pytest-xdist                           3.2.1
python-arango                          7.5.7
python-daemon                          3.0.1
python-dateutil                        2.8.2
python-dotenv                          1.0.0
python-http-client                     3.3.7
python-jenkins                         1.7.0
python-jose                            3.3.0
python-ldap                            3.4.3
python-nvd3                            0.15.0
python-slugify                         8.0.1
python-telegram-bot                    20.2
pytz                                   2023.2
pytz-deprecation-shim                  0.1.0.post0
pytzdata                               2020.1
pywinrm                                0.4.3
PyYAML                                 6.0
pyzmq                                  25.0.2
qds-sdk                                1.16.1
reactivex                              4.0.4
readme-renderer                        37.3
redis                                  3.5.3
redshift-connector                     2.0.910
regex                                  2023.3.23
requests                               2.28.2
requests-file                          1.5.1
requests-kerberos                      0.14.0
requests-mock                          1.10.0
requests-ntlm                          1.2.0
requests-oauthlib                      1.3.1
requests-toolbelt                      0.10.1
responses                              0.23.1
rfc3339-validator                      0.1.4
rfc3986                                1.5.0
rich                                   13.3.3
rich_argparse                          1.1.0
rich-click                             1.6.1
rsa                                    4.9
ruff                                   0.0.259
s3transfer                             0.6.0
sarif-om                               1.0.4
sasl                                   0.3.1
scramp                                 1.4.4
scrapbook                              0.5.0
SecretStorage                          3.3.3
semver                                 2.13.0
sendgrid                               6.10.0
sentinels                              1.0.0
sentry-sdk                             1.17.0
setproctitle                           1.3.2
setuptools                             66.1.1
simple-salesforce                      1.12.3
six                                    1.16.0
slack-sdk                              3.20.2
smbprotocol                            1.10.1
smmap                                  5.0.0
sniffio                                1.3.0
snowballstemmer                        2.2.0
snowflake-connector-python             3.0.2
snowflake-sqlalchemy                   1.4.7
sortedcontainers                       2.4.0
soupsieve                              2.4
Sphinx                                 5.3.0
sphinx-airflow-theme                   0.0.11
sphinx-argparse                        0.4.0
sphinx-autoapi                         2.0.1
sphinx-copybutton                      0.5.1
sphinx-jinja                           2.0.2
sphinx-rtd-theme                       1.2.0
sphinxcontrib-applehelp                1.0.4
sphinxcontrib-devhelp                  1.0.2
sphinxcontrib-htmlhelp                 2.0.1
sphinxcontrib-httpdomain               1.8.1
sphinxcontrib-jquery                   4.1
sphinxcontrib-jsmath                   1.0.1
sphinxcontrib-qthelp                   1.0.3
sphinxcontrib-redoc                    1.6.0
sphinxcontrib-serializinghtml          1.1.5
sphinxcontrib-spelling                 8.0.0
spython                                0.3.0
SQLAlchemy                             1.4.47
sqlalchemy-bigquery                    1.6.1
sqlalchemy-drill                       1.1.2
SQLAlchemy-JSONField                   1.0.1.post0
sqlalchemy-redshift                    0.8.12
SQLAlchemy-Utils                       0.40.0
sqlparse                               0.4.3
sshpubkeys                             3.3.1
sshtunnel                              0.4.0
stack-data                             0.6.2
starkbank-ecdsa                        2.2.0
statsd                                 4.0.1
sympy                                  1.11.1
tableauserverclient                    0.24
tabulate                               0.9.0
tblib                                  1.7.0
tenacity                               8.2.2
termcolor                              2.2.0
text-unidecode                         1.3
textwrap3                              0.9.2
thrift                                 0.16.0
thrift-sasl                            0.4.3
time-machine                           2.9.0
tomli                                  2.0.1
toolz                                  0.12.0
tornado                                6.2
towncrier                              22.12.0
tqdm                                   4.65.0
traitlets                              5.9.0
trino                                  0.322.0
twine                                  4.0.2
types-boto                             2.49.18.7
types-certifi                          2021.10.8.3
types-croniter                         1.3.2.7
types-Deprecated                       1.2.9.2
types-docutils                         0.19.1.7
types-Markdown                         3.4.2.6
types-paramiko                         3.0.0.5
types-protobuf                         4.22.0.0
types-PyMySQL                          1.0.19.6
types-pyOpenSSL                        23.1.0.1
types-python-dateutil                  2.8.19.11
types-python-slugify                   8.0.0.2
types-pytz                             2023.2.0.1
types-PyYAML                           6.0.12.9
types-redis                            4.5.3.0
types-requests                         2.28.11.16
types-setuptools                       67.6.0.5
types-tabulate                         0.9.0.1
types-termcolor                        1.1.6.2
types-toml                             0.10.8.5
types-urllib3                          1.26.25.9
typing_extensions                      4.5.0
tzdata                                 2023.2
tzlocal                                4.3
uamqp                                  1.6.4
uc-micro-py                            1.0.1
unicodecsv                             0.14.1
Unidecode                              1.3.6
uritemplate                            3.0.1
urllib3                                1.26.15
userpath                               1.8.0
vertica-python                         1.3.1
vine                                   5.0.0
virtualenv                             20.21.0
volatile                               2.1.0
watchtower                             2.0.1
wcwidth                                0.2.6
webencodings                           0.5.1
websocket-client                       1.5.1
Werkzeug                               2.2.3
wheel                                  0.40.0
wrapt                                  1.15.0
WTForms                                3.0.1
xmltodict                              0.13.0
yamllint                               1.30.0
yandexcloud                            0.206.0
yarl                                   1.8.2
zeep                                   4.2.1
zenpy                                  2.0.25
zict                                   2.2.0
zipp                                   3.15.0
zope.event                             4.6
zope.interface                         6.0
zstandard                              0.20.0
  • a detailed description of the bug or problem you are having
  • output of pip list from the virtual environment you are using
  • pytest and operating system versions
  • minimal example if possible
Zac-HD added the labels type: bug (problem that needs to be addressed) and topic: rewrite (related to the assertion rewrite mechanism) on Apr 8, 2023
Zac-HD (Member) commented Apr 8, 2023

The first step for anyone who wants to work on this will be to get a minimal reproducing example, which we can investigate more easily and later e.g. add to our own test suite.

potiuk (Author) commented Apr 9, 2023

Sure. That one is easy (it requires pyenv, pyenv-virtualenv, and installing cassandra-driver). Tested on macOS on an M1:

Here are the minimal reproduction steps:

pyenv virtualenv 3.9 cassandra-pytest
pyenv activate cassandra-pytest

pip install pytest cassandra-driver

mkdir test

cat >pytest.ini <<EOF
[pytest]
python_files = *.py
EOF

cat >test/test_example.py <<EOF
from cassandra.cluster import Cluster
EOF

pytest test

Produces:

Screenshot 2023-04-09 at 03 04 35

When followed by:

pytest test --assert=plain

You get (correctly):

Screenshot 2023-04-09 at 03 05 17

@potiuk
Author

potiuk commented Apr 9, 2023

Also just to capture the installed versions:

Collecting pytest
  Using cached pytest-7.3.0-py3-none-any.whl (320 kB)
Collecting cassandra-driver
  Downloading cassandra-driver-3.26.0.tar.gz (287 kB)
     |████████████████████████████████| 287 kB 1.1 MB/s
Collecting pluggy<2.0,>=0.12
  Using cached pluggy-1.0.0-py2.py3-none-any.whl (13 kB)
Collecting exceptiongroup>=1.0.0rc8
  Using cached exceptiongroup-1.1.1-py3-none-any.whl (14 kB)
Collecting packaging
  Using cached packaging-23.0-py3-none-any.whl (42 kB)
Collecting iniconfig
  Using cached iniconfig-2.0.0-py3-none-any.whl (5.9 kB)
Collecting tomli>=1.0.0
  Using cached tomli-2.0.1-py3-none-any.whl (12 kB)
Collecting six>=1.9
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting geomet<0.3,>=0.1
  Using cached geomet-0.2.1.post1-py3-none-any.whl (18 kB)
Collecting click
  Using cached click-8.1.3-py3-none-any.whl (96 kB)
Using legacy 'setup.py install' for cassandra-driver, since package 'wheel' is not installed.
Installing collected packages: six, click, tomli, pluggy, packaging, iniconfig, geomet, exceptiongroup, pytest, cassandra-driver
    Running setup.py install for cassandra-driver ... done
Successfully installed cassandra-driver-3.26.0 click-8.1.3 exceptiongroup-1.1.1 geomet-0.2.1.post1 iniconfig-2.0.0 packaging-23.0 pluggy-1.0.0 pytest-7.3.0 six-1.16.0 tomli-2.0.1

@egegunes

@potiuk did you manage to find the problem? I'm hitting the same issue:

$ pip freeze
amqp==5.1.1
async-timeout==4.0.2
billiard==4.1.0
cassandra-driver==3.28.0
celery==5.3.1
click==8.1.5
click-didyoumean==0.3.0
click-plugins==1.1.1
click-repl==0.3.0
exceptiongroup==1.1.2
geomet==0.2.1.post1
iniconfig==2.0.0
kombu==5.3.1
mock==5.1.0
packaging==23.1
pkg_resources==0.0.0
pluggy==1.2.0
prompt-toolkit==3.0.39
pytest==7.4.0
python-dateutil==2.8.2
redis==4.6.0
six==1.16.0
stream-framework==1.4.0
tomli==2.0.1
typing_extensions==4.7.1
tzdata==2023.3
vine==5.0.0
wcwidth==0.2.6
=========================================================== short test summary info ===========================================================
ERROR stream_framework/tests/serializers.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/serializers.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/feeds/cassandra.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/feeds/cassandra.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/feeds/aggregated_feed/cassandra.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/feeds/aggregated_feed/cassandra.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/managers/cassandra.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/managers/cassandra.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/storage/cassandra.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/storage/cassandra.py - KeyError: '@py_builtins'
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 10 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
======================================================= 3 warnings, 10 errors in 1.42s ========================================================

@bluetech
Member

Link to problematic code: https://github.com/datastax/python-driver/blob/8c41066330eb04c34eff57153ab2eda810844d5f/cassandra/protocol.py#L699C5-L699C126

    # Names match type name in module scope. Most are imported from cassandra.cqltypes (except CUSTOM_TYPE)
    type_codes = _cqltypes_by_code = dict((v, globals()[k]) for k, v in type_codes.__dict__.items() if not k.startswith('_'))

Didn't verify, but I'm guessing that, in combination with python_files = *.py, pytest rewrites the type_codes module, which adds a @py_builtins global to it. That name is not defined in the protocol.py module, so the globals()[k] lookup fails.

If you can convince datastax/python-driver to add a pytest-specific workaround, I think adding the string PYTEST_DONT_REWRITE to the type_codes.py module docstring will fix the problem. Or just ask them to avoid this magic entirely.

Otherwise, pytest is probably not going to change how its assertion rewriting works. My recommendation is not to use python_files = *.py , then I think the problematic module won't get rewritten.
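
The guess above can be illustrated without pytest or cassandra-driver installed. The following is a minimal sketch, not the driver's actual code: the stand-in `type_codes` module, the two type names, their numeric values, and the `build_lookup` helper are all assumptions made for illustration. The rewrite is simulated by adding the `@py_builtins` key by hand, which is the name reported in the `KeyError` above.

```python
import builtins
import types

# Hypothetical stand-in for cassandra/type_codes.py: a module of integer constants.
type_codes = types.ModuleType("type_codes")
type_codes.AsciiType = 0x0001  # illustrative value
type_codes.LongType = 0x0002   # illustrative value


# Hypothetical stand-ins for the classes that protocol.py has in its own globals().
class AsciiType:
    pass


class LongType:
    pass


protocol_namespace = {"AsciiType": AsciiType, "LongType": LongType}


def build_lookup(codes_module, namespace):
    """Mirror the quoted expression, with an explicit namespace instead of globals()."""
    return dict(
        (v, namespace[k])
        for k, v in codes_module.__dict__.items()
        if not k.startswith("_")
    )


# Without the extra global, every public name resolves in the namespace.
before = build_lookup(type_codes, protocol_namespace)

# Simulate what assertion rewriting is believed to do: add a global whose
# name starts with "@", so the startswith("_") filter does not skip it.
type_codes.__dict__["@py_builtins"] = builtins

try:
    build_lookup(type_codes, protocol_namespace)
    failure = None
except KeyError as exc:
    failure = exc.args[0]

print(before)
print(failure)
```

Dunder attributes such as `__name__` are filtered out by the leading underscore check, which is why only the `@`-prefixed name causes trouble in this sketch.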

@potiuk
Author

potiuk commented Jul 15, 2023

Yeah. I've basically given up, and I manually add PYTEST_DONT_REWRITE after installing cassandra in our CI image.

Yes, I tried that already in datastax/python-driver#1142, where I tried to convince the datastax maintainers to add the rewrite comment (which from my point of view would be generally harmless and zero maintenance), but they decided otherwise. I guess pytest is not going to do anything either, so it seems like no one wants to own the problem.

Too bad. I am certainly not going to die on this hill (especially since it seems to be no one's hill). We have a working workaround in our test environment, so I am good.

I agree that using *.py is not the default. I still think it's useful in big projects like Airflow: some of our tests were not being collected because they were accidentally put in a file without the test_ prefix, so we prefer to keep *.py because there is no other easy way to find out that some tests are missing from collection (and from the run). That is quite a nasty surprise, though it is a niche case (especially since pytest will happily run those tests when the file is specified directly).

But this is not a big issue in general, and if neither pytest nor Cassandra is concerned, who am I to say otherwise. It's their choice. Some of their users will suffer and lose time unnecessarily debugging the problem, but they have every right to make that choice. It's documented in my issues and PRs, and people can find the workarounds if they look for them; documenting how users dealt with a problem is a legitimate way of addressing it.

However, I might have other suggestions for the pytest team:

  1. Maybe pytest assertion rewriting should avoid rewriting libraries imported from installed packages. I'm not sure why cassandra modules are being rewritten (is it needed?) when they are not "test" code but a 3rd-party library. Sorry, I do not know the details of how rewrites are done; from an outsider's perspective it should not be needed, but maybe I am missing important cases.

  2. Maybe pytest should warn "Hey, I am running this test now, but this file will be excluded when automatically collecting tests because it does not follow the {python_files} pattern" if you run pytest some_file.py and the file would be excluded by python_files. I think having that would make Airflow give up "*.py", because the mistake would usually be caught much earlier in the process. BTW, just writing that gave me an idea: maybe I could add a pytest fixture to do it even if there is no support for it in pytest. But it would be nice to have it in pytest as a built-in warning.

@bluetech
Member

@potiuk Their problematic code is

dict((v, globals()[k]) for k, v in type_codes.__dict__.items() if not k.startswith('_'))

They might be more receptive to changes in this line if they are not obviously pytest specific. For example

dict((v, globals()[k]) for k, v in type_codes.__dict__.items() if k.endswith('Type'))

(should also allow them to remove CUSTOM_TYPE = object() line from the protocol.py file)

or

dict((v, globals()[k]) for k, v in type_codes.__dict__.items() if not k.startswith(('_', '@')))

(kinda pytest specific but hopefully OK).

maybe pytest assertion should avoid rewriting libraries imported from installed packages - not sure why cassandra modules are being rewritten (is it needed?) when it is not a "test" code but a 3rd-party library

Hmm, I'm not sure offhand why 3rd-party modules are rewritten. There's probably something relying on it, but it might be worth a try to avoid it.
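
As a rough check of the two proposed filters, here is a small sketch run against a simulated, already-rewritten module. The stand-in module, the type names and values, and the assumption that `type_codes` contains a `CUSTOM_TYPE` constant are illustrative only and were not verified against the driver.

```python
import builtins
import types

# Hypothetical stand-in for a type_codes module that pytest has already rewritten.
type_codes = types.ModuleType("type_codes")
type_codes.CUSTOM_TYPE = 0x0000  # assumed constant, illustrative value
type_codes.AsciiType = 0x0001    # illustrative value
type_codes.LongType = 0x0002     # illustrative value
type_codes.__dict__["@py_builtins"] = builtins  # simulated rewrite artifact


class AsciiType:
    pass


class LongType:
    pass


# Stand-in for protocol.py's globals(), including its CUSTOM_TYPE placeholder.
namespace = {
    "CUSTOM_TYPE": object(),
    "AsciiType": AsciiType,
    "LongType": LongType,
}

# Variant 1: keep only names ending in "Type". This skips both "@py_builtins"
# and CUSTOM_TYPE, so the placeholder in the namespace would no longer be needed.
by_suffix = dict(
    (v, namespace[k])
    for k, v in type_codes.__dict__.items()
    if k.endswith("Type")
)

# Variant 2: additionally skip names starting with "@". str.startswith accepts
# a tuple of prefixes. CUSTOM_TYPE is still looked up in the namespace here.
by_prefix = dict(
    (v, namespace[k])
    for k, v in type_codes.__dict__.items()
    if not k.startswith(("_", "@"))
)

print(sorted(by_suffix))
print(sorted(by_prefix))
```

In this sketch neither variant raises `KeyError`. They differ only in whether the `CUSTOM_TYPE` entry ends up in the resulting mapping.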

@potiuk
Author

potiuk commented Jul 15, 2023

Sure, I can be the messenger who passes those messages back and forth, but I have literally zero context. (I understand all the words you have written, but I have no idea about the internals of either pytest or cassandra, so I cannot assess why and how those changes would avoid the problem.)

So I will gently copy your message there and tag you, @bluetech, in case they have more questions. Maybe that will convince them.

@potiuk
Author

potiuk commented Jul 15, 2023

Added comment there: datastax/python-driver#1142 (comment)

potiuk added a commit to potiuk/airflow that referenced this issue Jul 15, 2023
The cassandra hack added in apache#30315 does not seem to have a chance to go
away. Neither Pytest pytest-dev/pytest#10844
nor Datastax datastax/python-driver#1142
want to own the problem for now (though there is a proposal from
pytest contributors on how Datastax could refactor their code to
avoid the problem).

However, during the discussion an idea popped into my head on how
we could come back to the test_* pattern with a far lower probability of
missing some tests that are added to wrong files. It seems that we
can add a fixture that will outright fail tests if they are
placed in files not following the test_* pattern. While it would
not help in case a test was wrongly named in the first place,
it would definitely help to not add more tests in wrongly named
files, because it will be literally impossible to run the tests
added in a wrong file, even if you manually do `pytest somefile.py`
and avoid running collection.

I also did a quick check to find cases where the test_*
file name convention was already violated, and I found (and renamed)
two. It seems quite likely that a similar mistake
could be made in the future, but with the fixture I added it
should be far less likely that someone adds tests in a wrongly named
file.
@potiuk
Author

potiuk commented Jul 15, 2023

BTW, I just created a PR with my fixture idea in Airflow, apache/airflow#32626, which also allows us to remove the cassandra hack entirely.

And sure enough, I found a few cases where some of our contributors DID NOT follow the test_*.py convention and we missed it. Without such protection in place, it would surely happen more often in the future if we just followed the advice of using test_*.py.

A fixture that fails tests from badly named files is very good protection, because no contributor will be able to run their tests locally if they are placed in a wrongly named file. Without such protection, I think test_*.py is a bit of a trap for both reviewers and contributors: developers will run their tests locally by hand and submit them, while reviewers might easily miss the fact that the file is wrongly named. CI will then pass regardless of whether the tests work, because the tests will not be collected.

So I think pytest maintainers might want to consider whether they want to do something about it (like the warning I mentioned). I think just advising people to follow the recommended test_*.py pattern, without a mechanism to prevent such a silly (but easily overlooked) mistake, is basically an invitation to fall into that trap eventually. I wonder how many projects out there that use pytest and the recommended test_*.py pattern have tests that are not actually run in CI because of that.
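
The core of such a check is small. Below is a minimal sketch of the file-name test only, not the fixture from the Airflow PR: the helper names are hypothetical, and matching on the base name with `fnmatchcase` is an approximation of how pytest applies `python_files`. The default patterns shown are pytest's documented defaults.

```python
from fnmatch import fnmatchcase
from pathlib import Path

# pytest's documented default for the python_files option.
DEFAULT_PYTHON_FILES = ("test_*.py", "*_test.py")


def matches_python_files(path, patterns=DEFAULT_PYTHON_FILES):
    """Return True if the file's base name matches any collection pattern.

    Approximation: only the base name is compared, case-sensitively.
    """
    name = Path(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)


def check_test_file_name(path, patterns=DEFAULT_PYTHON_FILES):
    """Raise AssertionError for a test file that automatic collection would skip."""
    if not matches_python_files(path, patterns):
        raise AssertionError(
            f"{path} does not match python_files patterns {patterns}; "
            "tests in it would not be collected automatically"
        )


print(matches_python_files("tests/test_example.py"))
print(matches_python_files("tests/serializers.py"))
```

One way to wire this up, assuming pytest 7 or newer, would be an autouse fixture in conftest.py that calls `check_test_file_name(request.path)`, so that running `pytest somefile.py` directly fails fast instead of silently passing.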

potiuk added a commit to apache/airflow that referenced this issue Jul 15, 2023
…32626)

ahidalgob pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue May 15, 2024
…32626)


GitOrigin-RevId: c6594480e2722513fd082a6c65e30e2504698ba2
kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Sep 19, 2024
…32626)

kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Nov 8, 2024
…32626)

kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue May 4, 2025
…32626)
