Contributor

@Bento007 Bento007 commented Sep 16, 2025

Reason for Change

Fixes a bug when ingesting ATAC datasets: validation fails with an AttributeError raised from the ibis DuckDB backend.

{ "levelname": "ERROR", "asctime": "2025-09-16T18:04:14.016Z", "name": "processing", "message": "validation failed", "lineno": 185, "pathname": "/backend/layers/processing/process_validate_atac.py", "exc_info": "Traceback (most recent call last):\n File \"/backend/layers/processing/process_validate_atac.py\", line 180, in process\n errors, fragment_index_file, fragment_file = self.schema_validator.validate_atac(\n File \"/backend/layers/thirdparty/schema_validator_provider.py\", line 102, in validate_atac\n atac_seq.process_fragment(fragment_file, anndata_file, True, output_file=output_file),\n File \"/opt/venv/lib/python3.10/site-packages/cellxgene_schema/atac_seq.py\", line 226, in process_fragment\n errors = validate_anndata_with_fragment(parquet_file, anndata_file, organism_ontology_term_id)\n File \"/opt/venv/lib/python3.10/site-packages/cellxgene_schema/atac_seq.py\", line 284, in validate_anndata_with_fragment\n validate_fragment_start_coordinate_greater_than_0(parquet_file),\n File \"/opt/venv/lib/python3.10/site-packages/cellxgene_schema/atac_seq.py\", line 135, in wrapper\n return func(*args, **kwargs)\n File \"/opt/venv/lib/python3.10/site-packages/cellxgene_schema/atac_seq.py\", line 313, in validate_fragment_start_coordinate_greater_than_0\n if not (df[\"start coordinate\"] > 0).all().execute():\n File \"/opt/venv/lib/python3.10/site-packages/ibis/expr/types/core.py\", line 424, in execute\n return self._find_backend(use_default=True).execute(\n File \"/opt/venv/lib/python3.10/site-packages/ibis/backends/duckdb/__init__.py\", line 1415, in execute\n for name, col in zip(table.column_names, table.columns)\nAttributeError: 'pyarrow.lib.RecordBatchReader' object has no attribute 'column_names'" }

Changes

  • pin duckdb to the last working version and add a comment
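For context on the traceback above: the ibis DuckDB backend reads `column_names` and `columns` from the query result, but it received a `pyarrow.lib.RecordBatchReader`, which has neither. This PR handles that by pinning duckdb. The sketch below is not part of the change; it only illustrates the mismatch with a duck-typed helper. `columns_by_name`, `FakeTable`, and `FakeReader` are hypothetical names, and the stand-in classes assume a reader exposes `read_all()` the way `pyarrow.RecordBatchReader` does.

```python
def columns_by_name(result):
    """Return {column name: column} for a table-like or reader-like result.

    Assumption: `result` is either table-like (exposes `column_names` and
    `columns`, as pyarrow.Table does) or reader-like (exposes `read_all()`,
    as pyarrow.RecordBatchReader does, returning a table).
    """
    if not hasattr(result, "column_names"):
        # A reader streams record batches and has no column_names,
        # so materialize it into a table first.
        result = result.read_all()
    return dict(zip(result.column_names, result.columns))


# Hypothetical stand-ins so the sketch runs without pyarrow installed.
class FakeTable:
    column_names = ["start coordinate"]
    columns = [[1, 5, 9]]


class FakeReader:
    def read_all(self):
        return FakeTable()


# Both shapes yield the same mapping once the reader is materialized.
table_columns = columns_by_name(FakeTable())
reader_columns = columns_by_name(FakeReader())
```

Calling `zip(result.column_names, result.columns)` directly on the reader-like object, as the failing line in the traceback does, raises the same kind of `AttributeError`.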

Testing

  • unit tests will pass again

Notes for Reviewer

create a patch release of cellxgene_schema

@Bento007 Bento007 changed the title from "fix: pin Cython version in requirements.txt" to "fix: pin duckdb version in requirements.txt" Sep 16, 2025
Contributor

@nayib-jose-gloria nayib-jose-gloria left a comment

Update the PR title to say duckdb; otherwise approved.

@Bento007 Bento007 enabled auto-merge (squash) September 16, 2025 21:10
@Bento007 Bento007 merged commit b2d8ad2 into main Sep 16, 2025
10 of 11 checks passed
@Bento007 Bento007 deleted the tsmith/fix-dependencies branch September 16, 2025 21:12
Bento007 added a commit that referenced this pull request Sep 16, 2025
5 participants