Pytest assert rewrite fails when rewriting some parts of cassandra cython code #10844

potiuk · 2023-03-28T08:35:41Z

It seems that assert rewriting has a subtle bug that causes a crash when collecting and executing code that imports cassandra.

Context: When we added python_files = "*.py" to Apache Airflow in order not to accidentally skip some of our tests (apache/airflow#30315), the PRs started to fail with a mysterious error:

from cassandra.cluster import Cluster
cassandra/cluster.py:48: in init cassandra.cluster
    ???
cassandra/connection.py:40: in init cassandra.connection
    ???
cassandra/protocol.py:698: in genexpr
    ???
cassandra/protocol.py:698: in genexpr
    ???
E   KeyError: '@py_builtins'

After some (difficult and wild) investigation, it turned out that Pytest assert rewriting fails when trying to rewrite the https://github.com/datastax/python-driver/blob/master/cassandra/type_codes.py file (which apparently comes from the Cython integration: https://github.com/datastax/python-driver/blob/master/cassandra/type_codes.pxd).

This is the most likely cause, because either adding --assert=plain or patching the type_codes.py file by adding PYTEST_DONT_REWRITE to its docstring solves the problem.
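To illustrate the docstring workaround: the marker only has to appear somewhere in the module docstring. The sketch below approximates that check with the standard `ast` module. `has_dont_rewrite_marker` is a hypothetical helper written for this example, not pytest API, and pytest's own check may differ in detail.

```python
import ast


def has_dont_rewrite_marker(source: str) -> bool:
    """Return True if the module docstring contains PYTEST_DONT_REWRITE.

    Hypothetical helper: it only approximates how the marker is detected.
    """
    doc = ast.get_docstring(ast.parse(source))
    return doc is not None and "PYTEST_DONT_REWRITE" in doc


# Stand-ins for a patched and an unpatched constants module.
patched = '"""\nPYTEST_DONT_REWRITE\nModule with constants.\n"""\nCUSTOM_TYPE = 0\n'
unpatched = '"""\nModule with constants.\n"""\nCUSTOM_TYPE = 0\n'

print(has_dont_rewrite_marker(patched))
print(has_dont_rewrite_marker(unpatched))
```

A module without any docstring is treated as unmarked, so it would still be rewritten.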

I've opened a PR to cassandra to include PYTEST_DONT_REWRITE (datastax/python-driver#1142), and in Apache Airflow we have a PR to automatically patch the cassandra driver with it (apache/airflow#30315), but those are merely workarounds for the problem.
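As a rough idea of what automatic patching could look like, here is a minimal sketch. It is not the script used in the Airflow PR. `add_marker` is a hypothetical function, and it assumes the first triple-quoted string in the file is the module docstring.

```python
MARKER = "PYTEST_DONT_REWRITE"
QUOTE = '"""'


def add_marker(source: str) -> str:
    """Return source with MARKER placed on the first docstring line.

    Assumption: the first triple-quote in the file opens the module
    docstring. The function is a no-op if MARKER is already present.
    """
    if MARKER in source:
        return source
    start = source.find(QUOTE)
    if start == -1:
        # No triple-quoted docstring at all: prepend a new one.
        return f"{QUOTE}{MARKER}{QUOTE}\n{source}"
    insert_at = start + len(QUOTE)
    return source[:insert_at] + "\n" + MARKER + source[insert_at:]


original = '"""\nModule with constants.\n"""\nCUSTOM_TYPE = 0\n'
print(add_marker(original))
```

Applying it to a real installation would mean reading the installed type_codes.py, passing its text through `add_marker`, and writing the result back. Because the function is idempotent, running it repeatedly is safe.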

Reproduction:

An easy way to reproduce it:

Pull the CI image of Airflow that contains the workaround and all the Airflow dependencies (it contains the patched type_codes.py):

docker pull ghcr.io/apache/airflow/main/ci/python3.10:8580edf1cb0e67efdf45e6686d2f0239bc8f1ebb

Enter the image (you will be dropped into a shell with everything ready to run the tests):

docker run -it ghcr.io/apache/airflow/main/ci/python3.10:8580edf1cb0e67efdf45e6686d2f0239bc8f1ebb

Collect the cassandra tests:

pytest --collect-only tests/providers/apache/cassandra/sensors/

This results in:

8 tests collected in 0.28s

Remove PYTEST_DONT_REWRITE from the patched type_codes.py:

vi /usr/local/lib/python3.10/site-packages/cassandra/type_codes.py

This module currently contains PYTEST_DONT_REWRITE. Remove it and save the file.

"""
PYTEST_DONT_REWRITE
Module with constants for Cassandra type codes.
...

Collect the cassandra tests again:

pytest --collect-only tests/providers/apache/cassandra/sensors/

This results in a series of errors, one for each test being collected:

_______________________________________________________________ ERROR collecting tests/providers/apache/cassandra/sensors/test_record.py _______________________________________________________________
tests/providers/apache/cassandra/sensors/test_record.py:22: in <module>
    from airflow.providers.apache.cassandra.sensors.record import CassandraRecordSensor
<frozen importlib._bootstrap>:1027: in _find_and_load
    ???
<frozen importlib._bootstrap>:1006: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:688: in _load_unlocked
    ???
/usr/local/lib/python3.10/site-packages/_pytest/assertion/rewrite.py:168: in exec_module
    exec(co, module.__dict__)
airflow/providers/apache/cassandra/sensors/record.py:26: in <module>
    from airflow.providers.apache.cassandra.hooks.cassandra import CassandraHook
<frozen importlib._bootstrap>:1027: in _find_and_load
    ???
<frozen importlib._bootstrap>:1006: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:688: in _load_unlocked
    ???
/usr/local/lib/python3.10/site-packages/_pytest/assertion/rewrite.py:168: in exec_module
    exec(co, module.__dict__)
airflow/providers/apache/cassandra/hooks/cassandra.py:24: in <module>
    from cassandra.cluster import Cluster, Session
cassandra/cluster.py:48: in init cassandra.cluster
    ???
cassandra/connection.py:40: in init cassandra.connection
    ???
cassandra/protocol.py:698: in genexpr
    ???
cassandra/protocol.py:698: in genexpr
    ???
E   KeyError: '@py_builtins'
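One possible explanation for a KeyError raised inside a generator expression, which this report does not confirm: assert rewriting imports helper aliases named `@py_builtins` and `@pytest_ar` into each rewritten module, and code that enumerates a module's `__dict__` and skips only underscore-prefixed names would pick them up. The sketch below simulates that with plain dictionaries. `build_lookup` and both namespaces are hypothetical stand-ins, not the cassandra driver's code.

```python
import builtins

# Stand-in for the namespace of a rewritten constants module: one real
# constant plus an alias of the kind assert rewriting adds.
constants_namespace = {
    "__name__": "constants_stand_in",
    "AsciiType": 0x0001,
    "@py_builtins": builtins,
}

# Stand-in for the globals of the module that consumes the constants.
consumer_globals = {"AsciiType": object()}


def build_lookup(namespace, consumer):
    """Map each public constant value to the consumer object of the same name."""
    return dict(
        (value, consumer[name])
        for name, value in namespace.items()
        if not name.startswith("_")  # "@py_builtins" passes this filter
    )


try:
    build_lookup(constants_namespace, consumer_globals)
except KeyError as exc:
    print(f"KeyError: {exc}")
```

If this is what happens, both workarounds fit: with --assert=plain or PYTEST_DONT_REWRITE the alias is never added to the module namespace, so the lookup only sees the real constants.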

Mandatory information:

Versions

Pytest: 7.2.2
OS: Docker container based on Debian Buster (official Python 3.10 image; same for other Python versions)

Linux 209653871bc9 5.15.0-67-generic #74-Ubuntu SMP Wed Feb 22 14:14:39 UTC 2023 x86_64 GNU/Linux

The output of pip list:

Package                                Version     Editable project location
-------------------------------------- ----------- -------------------------
adal                                   1.2.7
aiobotocore                            2.5.0
aiofiles                               22.1.0
aiohttp                                3.8.4
aioitertools                           0.11.0
aioresponses                           0.7.4
aiosignal                              1.3.1
alabaster                              0.7.13
alembic                                1.10.2
aliyun-python-sdk-core                 2.13.36
aliyun-python-sdk-kms                  2.16.0
amqp                                   5.1.1
analytics-python                       1.4.post1
ansiwrap                               0.8.4
anyio                                  3.6.2
apache-airflow                         2.6.0.dev0  /opt/airflow
apache-beam                            2.46.0
apispec                                5.2.2
appdirs                                1.4.4
argcomplete                            3.0.5
arrow                                  1.2.3
asana                                  3.2.0
asgiref                                3.6.0
asn1crypto                             1.5.1
astroid                                2.15.1
asttokens                              2.2.1
async-timeout                          4.0.2
asynctest                              0.13.0
atlasclient                            1.0.0
atlassian-python-api                   3.35.0
attrs                                  22.2.0
Authlib                                1.2.0
aws-sam-translator                     1.63.0
aws-xray-sdk                           2.11.0
azure-batch                            13.0.0
azure-common                           1.1.28
azure-core                             1.26.3
azure-cosmos                           4.3.1
azure-datalake-store                   0.0.52
azure-identity                         1.12.0
azure-keyvault-secrets                 4.7.0
azure-kusto-data                       0.0.45
azure-mgmt-containerinstance           1.5.0
azure-mgmt-core                        1.3.2
azure-mgmt-datafactory                 1.1.0
azure-mgmt-datalake-nspkg              3.0.1
azure-mgmt-datalake-store              0.5.0
azure-mgmt-nspkg                       3.0.2
azure-mgmt-resource                    23.0.0
azure-nspkg                            3.0.2
azure-servicebus                       7.8.3
azure-storage-blob                     12.15.0
azure-storage-common                   2.1.0
azure-storage-file                     2.1.0
azure-storage-file-datalake            12.10.1
azure-synapse-spark                    0.7.0
Babel                                  2.12.1
backcall                               0.2.0
backoff                                1.10.0
bcrypt                                 4.0.1
beautifulsoup4                         4.12.0
billiard                               3.6.4.0
bitarray                               2.7.3
black                                  23.1a1
bleach                                 6.0.0
blinker                                1.5
boto                                   2.49.0
boto3                                  1.26.76
botocore                               1.29.76
bowler                                 0.9.0
cachelib                               0.9.0
cachetools                             5.3.0
cassandra-driver                       3.25.0
cattrs                                 22.2.0
celery                                 5.2.7
certifi                                2022.12.7
cffi                                   1.15.1
cfgv                                   3.3.1
cfn-lint                               0.76.1
cgroupspy                              0.2.2
chardet                                4.0.0
charset-normalizer                     2.1.1
checksumdir                            1.2.0
ciso8601                               2.3.0
click                                  8.1.3
click-default-group                    1.2.2
click-didyoumean                       0.3.0
click-plugins                          1.1.1
click-repl                             0.2.0
clickclick                             20.10.2
cloudant                               2.15.0
cloudpickle                            2.2.1
colorama                               0.4.6
colorlog                               4.8.0
ConfigUpdater                          3.1.1
connexion                              2.14.2
coverage                               7.2.2
crcmod                                 1.7
cron-descriptor                        1.2.35
croniter                               1.3.8
cryptography                           39.0.2
curlify                                2.2.1
dask                                   2023.3.2
databricks-sql-connector               2.4.1
datadog                                0.45.0
db-dtypes                              1.0.5
decorator                              5.1.1
defusedxml                             0.7.1
Deprecated                             1.2.13
dill                                   0.3.1.1
distlib                                0.3.6
distributed                            2023.3.2
dnspython                              2.3.0
docker                                 6.0.1
docopt                                 0.6.2
docutils                               0.16
ecdsa                                  0.18.0
elasticsearch                          7.13.4
elasticsearch-dbapi                    0.2.10
elasticsearch-dsl                      7.4.1
email-validator                        1.3.1
entrypoints                            0.4
eralchemy2                             1.3.7
et-xmlfile                             1.1.0
eventlet                               0.33.3
exceptiongroup                         1.1.1
execnet                                1.9.0
executing                              1.2.0
facebook-business                      16.0.1
fastavro                               1.7.3
fasteners                              0.18
fastjsonschema                         2.16.3
filelock                               3.10.7
fissix                                 21.11.13
Flask                                  2.2.3
Flask-AppBuilder                       4.3.0
Flask-Babel                            2.0.0
Flask-Bcrypt                           1.0.1
Flask-Caching                          2.0.2
Flask-JWT-Extended                     4.4.4
Flask-Limiter                          3.3.0
Flask-Login                            0.6.2
Flask-Session                          0.4.0
Flask-SQLAlchemy                       2.5.1
Flask-WTF                              1.1.1
flower                                 1.2.0
frozenlist                             1.3.3
fsspec                                 2023.3.0
future                                 0.18.3
gcloud-aio-auth                        4.2.0
gcloud-aio-bigquery                    6.3.0
gcloud-aio-storage                     8.1.0
gcsfs                                  2023.3.0
geomet                                 0.2.1.post1
gevent                                 22.10.2
gitdb                                  4.0.10
GitPython                              3.1.31
google-ads                             18.0.0
google-api-core                        2.8.2
google-api-python-client               1.12.11
google-auth                            2.16.3
google-auth-httplib2                   0.1.0
google-auth-oauthlib                   0.8.0
google-cloud-aiplatform                1.16.1
google-cloud-appengine-logging         1.1.3
google-cloud-audit-log                 0.2.4
google-cloud-automl                    2.8.0
google-cloud-bigquery                  2.34.4
google-cloud-bigquery-datatransfer     3.7.0
google-cloud-bigquery-storage          2.14.1
google-cloud-bigtable                  2.11.1
google-cloud-build                     3.9.0
google-cloud-compute                   0.7.0
google-cloud-container                 2.11.1
google-cloud-core                      2.3.2
google-cloud-datacatalog               3.9.0
google-cloud-dataflow-client           0.5.4
google-cloud-dataform                  0.2.0
google-cloud-dataplex                  1.1.0
google-cloud-dataproc                  5.0.0
google-cloud-dataproc-metastore        1.6.0
google-cloud-dlp                       3.8.0
google-cloud-kms                       2.12.0
google-cloud-language                  1.3.2
google-cloud-logging                   3.2.1
google-cloud-memcache                  1.4.1
google-cloud-monitoring                2.11.0
google-cloud-orchestration-airflow     1.4.1
google-cloud-os-login                  2.7.1
google-cloud-pubsub                    2.13.5
google-cloud-redis                     2.9.0
google-cloud-resource-manager          1.6.0
google-cloud-secret-manager            1.0.2
google-cloud-spanner                   1.19.3
google-cloud-speech                    1.3.4
google-cloud-storage                   2.7.0
google-cloud-tasks                     2.10.1
google-cloud-texttospeech              1.0.3
google-cloud-translate                 1.7.2
google-cloud-videointelligence         1.16.3
google-cloud-vision                    1.0.2
google-cloud-workflows                 1.7.1
google-crc32c                          1.5.0
google-resumable-media                 2.4.1
googleapis-common-protos               1.56.4
graphql-core                           3.2.3
graphviz                               0.20.1
greenlet                               2.0.2
grpc-google-iam-v1                     0.12.4
grpcio                                 1.53.0
grpcio-gcp                             0.2.2
grpcio-status                          1.48.2
gssapi                                 1.8.2
gunicorn                               20.1.0
h11                                    0.14.0
hdfs                                   2.7.0
HeapDict                               1.0.1
hmsclient                              0.1.1
httpcore                               0.16.3
httplib2                               0.21.0
httpx                                  0.23.3
humanize                               4.6.0
hvac                                   1.1.0
identify                               2.5.22
idna                                   3.4
ijson                                  3.2.0.post0
imagesize                              1.4.1
importlib-metadata                     6.1.0
importlib-resources                    5.12.0
impyla                                 0.18.0
incremental                            22.10.0
inflection                             0.5.1
influxdb-client                        1.36.1
iniconfig                              2.0.0
ipdb                                   0.13.13
ipython                                8.11.0
isodate                                0.6.1
itsdangerous                           2.1.2
jaraco.classes                         3.2.3
JayDeBeApi                             1.2.3
jedi                                   0.18.2
jeepney                                0.8.0
Jinja2                                 3.1.2
jira                                   3.5.0
jmespath                               0.10.0
JPype1                                 1.4.1
jschema-to-python                      1.2.3
json-merge-patch                       0.2
jsondiff                               2.0.0
jsonpatch                              1.32
jsonpath-ng                            1.5.3
jsonpickle                             3.0.1
jsonpointer                            2.3
jsonschema                             4.17.3
jsonschema-spec                        0.1.4
junit-xml                              1.9
jupyter_client                         8.1.0
jupyter_core                           5.3.0
keyring                                23.13.1
kombu                                  5.2.4
krb5                                   0.5.0
kubernetes                             23.6.0
kubernetes-asyncio                     24.2.2
kylinpy                                2.8.4
lazy-object-proxy                      1.9.0
ldap3                                  2.9.1
limits                                 3.3.1
linkify-it-py                          2.0.0
locket                                 1.0.0
lockfile                               0.12.2
looker-sdk                             23.2.0
lxml                                   4.9.2
lz4                                    4.3.2
Mako                                   1.2.4
Markdown                               3.4.3
markdown-it-py                         2.2.0
MarkupSafe                             2.1.2
marshmallow                            3.19.0
marshmallow-enum                       1.5.1
marshmallow-oneofschema                3.0.1
marshmallow-sqlalchemy                 0.26.1
matplotlib-inline                      0.1.6
mdit-py-plugins                        0.3.5
mdurl                                  0.1.2
mongomock                              4.1.2
monotonic                              1.6
more-itertools                         9.1.0
moreorless                             0.4.0
moto                                   4.1.6
mpmath                                 1.3.0
msal                                   1.21.0
msal-extensions                        1.0.0
msgpack                                1.0.5
msrest                                 0.7.1
msrestazure                            0.6.4
multi-key-dict                         2.0.3
multidict                              6.0.4
mypy                                   1.0.0
mypy-boto3-appflow                     1.26.78
mypy-boto3-rds                         1.26.99
mypy-boto3-redshift-data               1.26.88
mypy-extensions                        1.0.0
mysql-connector-python                 8.0.32
mysqlclient                            2.1.1
nbclient                               0.7.2
nbformat                               5.8.0
neo4j                                  5.6.0
networkx                               3.0
nodeenv                                1.7.0
numpy                                  1.24.2
oauthlib                               3.2.2
objsize                                0.6.1
openapi-schema-validator               0.4.4
openapi-spec-validator                 0.5.6
openpyxl                               3.1.2
opentelemetry-api                      1.15.0
opentelemetry-exporter-otlp            1.15.0
opentelemetry-exporter-otlp-proto-grpc 1.15.0
opentelemetry-exporter-otlp-proto-http 1.15.0
opentelemetry-exporter-prometheus      1.12.0rc1
opentelemetry-proto                    1.15.0
opentelemetry-sdk                      1.15.0
opentelemetry-semantic-conventions     0.36b0
opsgenie-sdk                           2.1.5
oracledb                               1.2.2
ordered-set                            4.1.0
orjson                                 3.8.8
oscrypto                               1.3.0
oss2                                   2.17.0
packaging                              21.3
pandas                                 1.5.3
pandas-gbq                             0.17.9
papermill                              2.4.0
paramiko                               3.1.0
parso                                  0.8.3
partd                                  1.3.0
pathable                               0.4.3
pathspec                               0.9.0
pbr                                    5.11.1
pdpyras                                4.5.2
pendulum                               2.1.2
pexpect                                4.8.0
pickleshare                            0.7.5
pinotdb                                0.4.14
pip                                    23.0.1
pipdeptree                             2.7.0
pipx                                   1.2.0
pkginfo                                1.9.6
platformdirs                           3.2.0
pluggy                                 1.0.0
ply                                    3.11
plyvel                                 1.5.0
portalocker                            2.7.0
pre-commit                             3.2.1
presto-python-client                   0.8.3
prison                                 0.2.1
prometheus-client                      0.16.0
prompt-toolkit                         3.0.38
proto-plus                             1.19.6
protobuf                               3.20.0
psutil                                 5.9.4
psycopg2-binary                        2.9.5
ptyprocess                             0.7.0
pure-eval                              0.2.2
pure-sasl                              0.6.2
py-partiql-parser                      0.1.0
py4j                                   0.10.9.5
pyarrow                                9.0.0
pyasn1                                 0.4.8
pyasn1-modules                         0.2.8
pycountry                              22.3.5
pycparser                              2.21
pycryptodome                           3.17
pycryptodomex                          3.17
pydantic                               1.10.7
pydata-google-auth                     1.7.0
pydot                                  1.4.2
pydruid                                0.6.5
pyenchant                              3.2.2
pyexasol                               0.25.2
PyGithub                               1.58.1
Pygments                               2.14.0
pygraphviz                             1.10
pyhcl                                  0.4.4
PyHive                                 0.6.5
PyJWT                                  2.6.0
pykerberos                             1.2.4
pymongo                                3.13.0
pymssql                                2.2.7
PyNaCl                                 1.5.0
pyodbc                                 4.0.35
pyOpenSSL                              23.1.1
pyparsing                              3.0.9
pypsrp                                 0.8.1
pyrsistent                             0.19.3
pyspark                                3.3.2
pyspnego                               0.8.0
pytest                                 7.2.2
pytest-asyncio                         0.21.0
pytest-capture-warnings                0.0.4
pytest-cov                             4.0.0
pytest-httpx                           0.21.3
pytest-instafail                       0.4.2
pytest-rerunfailures                   11.1.2
pytest-timeouts                        1.2.1
pytest-xdist                           3.2.1
python-arango                          7.5.7
python-daemon                          3.0.1
python-dateutil                        2.8.2
python-dotenv                          1.0.0
python-http-client                     3.3.7
python-jenkins                         1.7.0
python-jose                            3.3.0
python-ldap                            3.4.3
python-nvd3                            0.15.0
python-slugify                         8.0.1
python-telegram-bot                    20.2
pytz                                   2023.2
pytz-deprecation-shim                  0.1.0.post0
pytzdata                               2020.1
pywinrm                                0.4.3
PyYAML                                 6.0
pyzmq                                  25.0.2
qds-sdk                                1.16.1
reactivex                              4.0.4
readme-renderer                        37.3
redis                                  3.5.3
redshift-connector                     2.0.910
regex                                  2023.3.23
requests                               2.28.2
requests-file                          1.5.1
requests-kerberos                      0.14.0
requests-mock                          1.10.0
requests-ntlm                          1.2.0
requests-oauthlib                      1.3.1
requests-toolbelt                      0.10.1
responses                              0.23.1
rfc3339-validator                      0.1.4
rfc3986                                1.5.0
rich                                   13.3.3
rich_argparse                          1.1.0
rich-click                             1.6.1
rsa                                    4.9
ruff                                   0.0.259
s3transfer                             0.6.0
sarif-om                               1.0.4
sasl                                   0.3.1
scramp                                 1.4.4
scrapbook                              0.5.0
SecretStorage                          3.3.3
semver                                 2.13.0
sendgrid                               6.10.0
sentinels                              1.0.0
sentry-sdk                             1.17.0
setproctitle                           1.3.2
setuptools                             66.1.1
simple-salesforce                      1.12.3
six                                    1.16.0
slack-sdk                              3.20.2
smbprotocol                            1.10.1
smmap                                  5.0.0
sniffio                                1.3.0
snowballstemmer                        2.2.0
snowflake-connector-python             3.0.2
snowflake-sqlalchemy                   1.4.7
sortedcontainers                       2.4.0
soupsieve                              2.4
Sphinx                                 5.3.0
sphinx-airflow-theme                   0.0.11
sphinx-argparse                        0.4.0
sphinx-autoapi                         2.0.1
sphinx-copybutton                      0.5.1
sphinx-jinja                           2.0.2
sphinx-rtd-theme                       1.2.0
sphinxcontrib-applehelp                1.0.4
sphinxcontrib-devhelp                  1.0.2
sphinxcontrib-htmlhelp                 2.0.1
sphinxcontrib-httpdomain               1.8.1
sphinxcontrib-jquery                   4.1
sphinxcontrib-jsmath                   1.0.1
sphinxcontrib-qthelp                   1.0.3
sphinxcontrib-redoc                    1.6.0
sphinxcontrib-serializinghtml          1.1.5
sphinxcontrib-spelling                 8.0.0
spython                                0.3.0
SQLAlchemy                             1.4.47
sqlalchemy-bigquery                    1.6.1
sqlalchemy-drill                       1.1.2
SQLAlchemy-JSONField                   1.0.1.post0
sqlalchemy-redshift                    0.8.12
SQLAlchemy-Utils                       0.40.0
sqlparse                               0.4.3
sshpubkeys                             3.3.1
sshtunnel                              0.4.0
stack-data                             0.6.2
starkbank-ecdsa                        2.2.0
statsd                                 4.0.1
sympy                                  1.11.1
tableauserverclient                    0.24
tabulate                               0.9.0
tblib                                  1.7.0
tenacity                               8.2.2
termcolor                              2.2.0
text-unidecode                         1.3
textwrap3                              0.9.2
thrift                                 0.16.0
thrift-sasl                            0.4.3
time-machine                           2.9.0
tomli                                  2.0.1
toolz                                  0.12.0
tornado                                6.2
towncrier                              22.12.0
tqdm                                   4.65.0
traitlets                              5.9.0
trino                                  0.322.0
twine                                  4.0.2
types-boto                             2.49.18.7
types-certifi                          2021.10.8.3
types-croniter                         1.3.2.7
types-Deprecated                       1.2.9.2
types-docutils                         0.19.1.7
types-Markdown                         3.4.2.6
types-paramiko                         3.0.0.5
types-protobuf                         4.22.0.0
types-PyMySQL                          1.0.19.6
types-pyOpenSSL                        23.1.0.1
types-python-dateutil                  2.8.19.11
types-python-slugify                   8.0.0.2
types-pytz                             2023.2.0.1
types-PyYAML                           6.0.12.9
types-redis                            4.5.3.0
types-requests                         2.28.11.16
types-setuptools                       67.6.0.5
types-tabulate                         0.9.0.1
types-termcolor                        1.1.6.2
types-toml                             0.10.8.5
types-urllib3                          1.26.25.9
typing_extensions                      4.5.0
tzdata                                 2023.2
tzlocal                                4.3
uamqp                                  1.6.4
uc-micro-py                            1.0.1
unicodecsv                             0.14.1
Unidecode                              1.3.6
uritemplate                            3.0.1
urllib3                                1.26.15
userpath                               1.8.0
vertica-python                         1.3.1
vine                                   5.0.0
virtualenv                             20.21.0
volatile                               2.1.0
watchtower                             2.0.1
wcwidth                                0.2.6
webencodings                           0.5.1
websocket-client                       1.5.1
Werkzeug                               2.2.3
wheel                                  0.40.0
wrapt                                  1.15.0
WTForms                                3.0.1
xmltodict                              0.13.0
yamllint                               1.30.0
yandexcloud                            0.206.0
yarl                                   1.8.2
zeep                                   4.2.1
zenpy                                  2.0.25
zict                                   2.2.0
zipp                                   3.15.0
zope.event                             4.6
zope.interface                         6.0
zstandard                              0.20.0



Zac-HD · 2023-04-08T23:13:31Z

The first step for anyone who wants to work on this will be to get a minimal reproducing example, which we can investigate more easily and later e.g. add to our own test suite.

potiuk · 2023-04-09T01:06:09Z

Sure. That one is easy (it requires pyenv, pyenv-virtualenv, and installing cassandra-driver). Tested on macOS on an M1:

Here are the minimal reproduction steps:

pyenv virtualenv 3.9 cassandra-pytest
pyenv activate cassandra-pytest

pip install pytest cassandra-driver

mkdir test

cat >pytest.ini <<EOF
[pytest]
python_files = *.py
EOF

cat >test/test_example.py <<EOF
from cassandra.cluster import Cluster
EOF

pytest test

Produces:

When followed by:

pytest test --assert=plain

You get (correctly):

potiuk · 2023-04-09T01:09:07Z

Also just to capture the installed versions:

Collecting pytest
  Using cached pytest-7.3.0-py3-none-any.whl (320 kB)
Collecting cassandra-driver
  Downloading cassandra-driver-3.26.0.tar.gz (287 kB)
     |████████████████████████████████| 287 kB 1.1 MB/s
Collecting pluggy<2.0,>=0.12
  Using cached pluggy-1.0.0-py2.py3-none-any.whl (13 kB)
Collecting exceptiongroup>=1.0.0rc8
  Using cached exceptiongroup-1.1.1-py3-none-any.whl (14 kB)
Collecting packaging
  Using cached packaging-23.0-py3-none-any.whl (42 kB)
Collecting iniconfig
  Using cached iniconfig-2.0.0-py3-none-any.whl (5.9 kB)
Collecting tomli>=1.0.0
  Using cached tomli-2.0.1-py3-none-any.whl (12 kB)
Collecting six>=1.9
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting geomet<0.3,>=0.1
  Using cached geomet-0.2.1.post1-py3-none-any.whl (18 kB)
Collecting click
  Using cached click-8.1.3-py3-none-any.whl (96 kB)
Using legacy 'setup.py install' for cassandra-driver, since package 'wheel' is not installed.
Installing collected packages: six, click, tomli, pluggy, packaging, iniconfig, geomet, exceptiongroup, pytest, cassandra-driver
    Running setup.py install for cassandra-driver ... done
Successfully installed cassandra-driver-3.26.0 click-8.1.3 exceptiongroup-1.1.1 geomet-0.2.1.post1 iniconfig-2.0.0 packaging-23.0 pluggy-1.0.0 pytest-7.3.0 six-1.16.0 tomli-2.0.1

egegunes · 2023-07-15T12:41:10Z

@potiuk did you manage to find the problem? I'm hitting the same issue:

$ pip freeze
amqp==5.1.1
async-timeout==4.0.2
billiard==4.1.0
cassandra-driver==3.28.0
celery==5.3.1
click==8.1.5
click-didyoumean==0.3.0
click-plugins==1.1.1
click-repl==0.3.0
exceptiongroup==1.1.2
geomet==0.2.1.post1
iniconfig==2.0.0
kombu==5.3.1
mock==5.1.0
packaging==23.1
pkg_resources==0.0.0
pluggy==1.2.0
prompt-toolkit==3.0.39
pytest==7.4.0
python-dateutil==2.8.2
redis==4.6.0
six==1.16.0
stream-framework==1.4.0
tomli==2.0.1
typing_extensions==4.7.1
tzdata==2023.3
vine==5.0.0
wcwidth==0.2.6

=========================================================== short test summary info ===========================================================
ERROR stream_framework/tests/serializers.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/serializers.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/feeds/cassandra.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/feeds/cassandra.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/feeds/aggregated_feed/cassandra.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/feeds/aggregated_feed/cassandra.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/managers/cassandra.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/managers/cassandra.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/storage/cassandra.py - KeyError: '@py_builtins'
ERROR stream_framework/tests/storage/cassandra.py - KeyError: '@py_builtins'
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 10 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
======================================================= 3 warnings, 10 errors in 1.42s ========================================================

bluetech · 2023-07-15T16:00:39Z

Link to problematic code: https://github.com/datastax/python-driver/blob/8c41066330eb04c34eff57153ab2eda810844d5f/cassandra/protocol.py#L699C5-L699C126

    # Names match type name in module scope. Most are imported from cassandra.cqltypes (except CUSTOM_TYPE)
    type_codes = _cqltypes_by_code = dict((v, globals()[k]) for k, v in type_codes.__dict__.items() if not k.startswith('_'))

I didn't verify it, but my guess is that, in combination with python_files = *.py, pytest rewrites the type_codes module, which adds a @py_builtins global to it. That name is not defined in the protocol.py module, so the globals()[k] lookup fails with the KeyError.
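The guess above can be reproduced without pytest or cassandra-driver. This is a minimal sketch, not the driver's real code: the `type_codes` module, the two classes and the `build_lookup` helper are stand-ins, and it assumes that the rewrite leaves a global named `@py_builtins` in the rewritten module.

```python
import builtins
import types

# Stand-in for cassandra/type_codes.py: a module of name -> code constants.
type_codes = types.ModuleType("type_codes")
type_codes.AsciiType = 0x0001
type_codes.LongType = 0x0002


# Stand-ins for the classes that protocol.py has in its own globals().
class AsciiType:
    pass


class LongType:
    pass


protocol_globals = {"AsciiType": AsciiType, "LongType": LongType}


def build_lookup(namespace, module):
    """Same shape as the expression in protocol.py: maps code -> class."""
    return dict(
        (v, namespace[k])
        for k, v in module.__dict__.items()
        if not k.startswith("_")
    )


# Before any rewrite, every public name in type_codes resolves.
lookup_before = build_lookup(protocol_globals, type_codes)

# Assumption: the rewrite injects this extra global into the module.
type_codes.__dict__["@py_builtins"] = builtins

try:
    build_lookup(protocol_globals, type_codes)
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]

print(lookup_before)
print(missing_key)
```

The extra name does not start with an underscore, so it passes the filter and is then looked up in a namespace that never defined it.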

If you can convince datastax/python-driver to add a pytest-specific workaround, I think adding the string PYTEST_DONT_REWRITE to the type_codes.py module docstring will fix the problem. Or just ask them to avoid this magic entirely.
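For anyone patching an installed copy, pytest's documentation says the marker has to appear in the module docstring. A small sketch for checking a patched file could look like this; `opts_out_of_rewrite` and the two sample sources are hypothetical and are not part of pytest or the driver.

```python
import ast

MARKER = "PYTEST_DONT_REWRITE"


def opts_out_of_rewrite(source):
    """Return True if the module docstring in `source` contains the marker."""
    docstring = ast.get_docstring(ast.parse(source), clean=False)
    return docstring is not None and MARKER in docstring


# Marker inside the module docstring: counts as an opt-out.
patched = '"""Type codes. PYTEST_DONT_REWRITE"""\nAsciiType = 0x0001\n'

# Marker only in a comment: there is no docstring, so it does not count here.
unpatched = "# PYTEST_DONT_REWRITE\nAsciiType = 0x0001\n"

print(opts_out_of_rewrite(patched))
print(opts_out_of_rewrite(unpatched))
```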

Otherwise, pytest is probably not going to change how its assertion rewriting works. My recommendation is not to use python_files = *.py; then I think the problematic module won't get rewritten.

potiuk · 2023-07-15T16:45:12Z

Yeah. I've basically given up and manually add the PYTEST_DONT_REWRITE after installing cassandra in our CI image:

Yes, I tried that already in datastax/python-driver#1142, where I tried to convince the datastax maintainers to add the rewrite comment (which from my point of view would be harmless and zero maintenance), but they decided otherwise. I guess pytest is not going to do anything either, so it seems no one wants to own the problem.

Too bad. I am certainly not going to die on this hill (especially since it seems to be no one's hill). We have a working workaround in our test env, so I am good.

I think using *.py is indeed not the default, though I think it's useful in big projects like Airflow. Some of our tests were not being collected because they were accidentally put in a file without the test_ prefix, so we prefer to keep *.py because there is no other easy way to find out that some tests are missing from collection (and from the run). That is quite a nasty surprise, but it's a niche case indeed (especially since pytest will happily run those tests when the file is specified directly).

But this is not a big issue in general, and if neither Pytest nor Cassandra is concerned, who am I to say otherwise. It's their choice. Some of their users will suffer and lose time unnecessarily debugging the problem, but they are fully within their rights to make that choice. It's nicely documented in my issues and PRs, and people can find workarounds if they look for them. That is one of the very legitimate ways of addressing a problem (documenting how users dealt with it).

However, I might have other suggestions for the pytest team:

maybe pytest assertion rewriting should avoid rewriting libraries imported from installed packages - not sure why cassandra modules are being rewritten (is it needed?) when it is not "test" code but a 3rd-party library (sorry, I do not know the details of how rewrites are done, but from an outsider's view it should not be needed - maybe I am missing important cases there)
maybe pytest should warn "Hey, I am running this test now, but this file will be excluded when automatically collecting tests because it does not follow the {python_files} pattern" if you run pytest some_file.py and the file would be excluded by python_files - I think having that would make Airflow give up "*.py", because the mistake would usually be caught much earlier in the process. BTW, just writing that gave me an idea: maybe I could add a pytest fixture to do it even if there is no support for it in pytest. But it would be nice to have it in pytest as a built-in warning.

bluetech · 2023-07-15T17:03:51Z

@potiuk Their problematic code is

dict((v, globals()[k]) for k, v in type_codes.__dict__.items() if not k.startswith('_'))

They might be more receptive to changes in this line if they are not obviously pytest specific. For example

dict((v, globals()[k]) for k, v in type_codes.__dict__.items() if k.endswith('Type'))

(should also allow them to remove CUSTOM_TYPE = object() line from the protocol.py file)

or

dict((v, globals()[k]) for k, v in type_codes.__dict__.items() if not k.startswith(('_', '@')))

(kinda pytest specific but hopefully OK).
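A quick way to see that both proposed filters tolerate the extra global is to run them against a stand-in module. This is only a sketch: the module contents, the codes and the two helper names are illustrative, and the `@py_builtins` entry is simulated rather than produced by pytest.

```python
import builtins
import types

# Stand-in for a rewritten type_codes module.
type_codes = types.ModuleType("type_codes")
type_codes.CUSTOM_TYPE = 0x0000
type_codes.AsciiType = 0x0001
type_codes.LongType = 0x0002
setattr(type_codes, "@py_builtins", builtins)  # simulated rewrite leftover


class AsciiType:
    pass


class LongType:
    pass


CUSTOM_TYPE = object()

namespace = {
    "AsciiType": AsciiType,
    "LongType": LongType,
    "CUSTOM_TYPE": CUSTOM_TYPE,
}


def lookup_by_suffix(namespace, module):
    """First proposal: keep only names ending in 'Type'."""
    return dict(
        (v, namespace[k])
        for k, v in module.__dict__.items()
        if k.endswith("Type")
    )


def lookup_by_prefix(namespace, module):
    """Second proposal: also skip names starting with '@'."""
    return dict(
        (v, namespace[k])
        for k, v in module.__dict__.items()
        if not k.startswith(("_", "@"))
    )


by_suffix = lookup_by_suffix(namespace, type_codes)
by_prefix = lookup_by_prefix(namespace, type_codes)
print(sorted(by_suffix))
print(sorted(by_prefix))
```

The suffix filter is case-sensitive, so it also drops CUSTOM_TYPE, which is why the placeholder for it would no longer be needed in that variant.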

maybe pytest assertion should avoid rewriting libraries imported from installed packages - not sure why cassandra modules are being rewritten (is it needed?) when it is not a "test" code but a 3rd-party library

Hmm, I'm not sure offhand why 3rd-party modules get rewritten. There's probably something relying on it, but it might be worth a try to avoid it.

potiuk · 2023-07-15T17:08:46Z

Sure, I can be the messenger who passes those messages back and forth, but I have literally zero context (I understand all the words you have written, but I have no idea about the internals of either pytest or cassandra, so I cannot assess why and how those changes would avoid the problem).

So I will gently copy your message there and tag you, @bluetech, in case they have more questions. Maybe that will convince them.

potiuk · 2023-07-15T17:14:14Z

Added comment there: datastax/python-driver#1142 (comment)

The cassandra hack added in apache#30315 does not seem to have a chance to go away. Neither Pytest pytest-dev/pytest#10844 nor Datastax datastax/python-driver#1142 want to own the problem for now (though there is a proposal from pytest contributors on how Datastax could refactor their code to avoid the problem).

However, during the discussion an idea popped into my head on how we could come back to the test_* pattern with far less probability of missing tests that are added to wrong files. It seems that we can add a fixture that will outright fail tests if they are placed in files not following the test_* pattern. While it would not help if a test file was wrongly named in the first place, it would definitely help to avoid adding more tests to wrongly named files, because it will be literally impossible to run tests added in a wrong file, even if you manually do `pytest somefile.py` and avoid running collection.

I also did a quick check for cases where the test_* file name convention was already violated, and I found (and renamed) two. It seems quite likely that a similar mistake could be made in the future, but with the fixture I added it should be far less likely that someone adds tests in a wrongly named file.

potiuk · 2023-07-15T17:59:30Z

BTW. I just created a PR with my fixture idea in Airflow apache/airflow#32626 - which also allows us to remove the cassandra hack entirely.

And sure enough, I found a few cases where some of our contributors DID NOT follow the test_*.py convention and we missed it. Without such protection in place, it would surely happen in more cases in the future if we just followed the advice of using test_*.py.

The fixture that fails tests from badly named files is a very good protection, because contributors will not be able to run their tests locally if they are placed in a wrongly named file. Without such protection, I think test_*.py is a bit of a trap for both reviewers and contributors: developers will run their tests locally by hand and submit them, while reviewers might easily miss the fact that the file is wrongly named. CI will then pass regardless of whether the tests work, because the tests will not be collected.

So I think pytest maintainers might want to consider whether they want to do something about it (like the warning I mentioned). I think just advising people to follow the recommended test_*.py pattern without a mechanism to prevent such a silly (but easily overlooked) mistake is basically inviting them to fall into that trap eventually. I wonder how many projects out there using pytest and the recommended test_*.py pattern have tests that are not actually run in CI because of that.
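The protection described above could be sketched roughly as follows. This is not the code from apache/airflow#32626: it uses a `pytest_runtest_setup` hook in conftest.py instead of a fixture, it assumes pytest 7 or newer (for `item.path`), and the `matches_python_files` helper and the stand-in objects at the bottom are hypothetical.

```python
import fnmatch
from pathlib import Path
from types import SimpleNamespace


def matches_python_files(filename, patterns):
    """Return True if the file name matches any configured glob pattern."""
    return any(fnmatch.fnmatch(filename, pattern) for pattern in patterns)


def pytest_runtest_setup(item):
    """Hook run before each test: refuse tests that live in badly named files."""
    patterns = item.config.getini("python_files")
    if not matches_python_files(item.path.name, patterns):
        raise RuntimeError(
            f"{item.path.name} does not match python_files={patterns}; "
            "its tests would be skipped by automatic collection"
        )


# Quick check with stand-in objects, so no pytest run is needed.
fake_config = SimpleNamespace(getini=lambda name: ["test_*.py", "*_test.py"])
well_named = SimpleNamespace(path=Path("tests/test_sensor.py"), config=fake_config)
badly_named = SimpleNamespace(path=Path("tests/sensor_checks.py"), config=fake_config)

pytest_runtest_setup(well_named)  # a matching file name passes silently
try:
    pytest_runtest_setup(badly_named)
    rejected = False
except RuntimeError:
    rejected = True
print(rejected)
```

Because the check runs at test setup, it also triggers when a file is passed to pytest explicitly, which is the case where the naming mistake would otherwise go unnoticed.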

This was referenced Mar 28, 2023

Remove requirement for test_ prefix for pytest test modules apache/airflow#30315

Merged

Add PYTEST_DONT_REWRITE to type_codes module datastax/python-driver#1142

Closed

Zac-HD added type: bug problem that needs to be addressed topic: rewrite related to the assertion rewrite mechanism labels Apr 8, 2023

potiuk mentioned this issue Apr 9, 2023

Pytest assert rewrite causes serialzation problem with dill #10845

Closed

potiuk mentioned this issue Jul 15, 2023

Replace cassandra hack with fixture to fail badly named test files apache/airflow#32626

Merged
