
chore: retriever test reorganization + adding new tests (smoke e2e) (STEP 1.5)#1888

Merged
hajdul88 merged 43 commits into dev from
feature/cog-3532-empower-test_search-db-retrievers-tests-reorg-2
Dec 16, 2025

Conversation

@hajdul88
Collaborator

@hajdul88 hajdul88 commented Dec 11, 2025

This PR restructures the end-to-end tests for the multi-database search layer to improve maintainability, consistency, and coverage across supported Python versions and database settings.

Key Changes

- Migrates the existing E2E tests to pytest for a more standard and extensible testing framework.
- Introduces pytest fixtures to centralize and reuse test setup logic.
- Implements proper event loop management to support multiple asynchronous pytest tests reliably.
- Improves SQLAlchemy handling in tests, ensuring clean setup and teardown of database state.
- Extends multi-database E2E test coverage across all supported Python versions.
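
To illustrate why event loop management matters for multiple async tests, here is a minimal sketch using only the standard library. `FakeEngine` is a hypothetical stand-in for a loop-bound resource such as an async database engine; it is not taken from the PR. The sketch contrasts running several "tests" on one shared loop with running each on a fresh loop.

```python
import asyncio


class FakeEngine:
    """Hypothetical stand-in for a loop-bound resource (e.g. an async DB engine)."""

    def __init__(self) -> None:
        # Remember the loop this object was created on.
        self.loop = asyncio.get_running_loop()
        self.queries = 0

    async def query(self) -> int:
        # Mimic the "attached to a different loop" errors that
        # loop-bound asyncio objects can raise.
        if asyncio.get_running_loop() is not self.loop:
            raise RuntimeError("engine used from a different event loop")
        self.queries += 1
        return self.queries


async def make_engine() -> FakeEngine:
    return FakeEngine()


def run_on_shared_loop(n_tests: int) -> list:
    """Create the engine and run every 'test' on one long-lived loop."""
    loop = asyncio.new_event_loop()
    try:
        engine = loop.run_until_complete(make_engine())
        return [loop.run_until_complete(engine.query()) for _ in range(n_tests)]
    finally:
        loop.close()


def run_on_fresh_loops(n_tests: int) -> list:
    """Create the engine once, then give each 'test' its own new loop."""
    engine = asyncio.run(make_engine())
    results = []
    for _ in range(n_tests):
        try:
            results.append(asyncio.run(engine.query()))
        except RuntimeError as error:
            results.append(str(error))
    return results
```

In this sketch the shared-loop variant succeeds on every call, while the fresh-loop variant fails each time because the engine outlives the loop it was created on. A session-scoped loop is one way to avoid that mismatch when fixtures are also session-scoped.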

Benefits

- Cleaner and more modular test structure.
- Reduced duplication and clearer test intent through fixtures.
- More reliable async test execution.
- Better alignment with our supported Python version matrix.

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Code refactoring
  • Performance improvement
  • Other (please specify):

Screenshots/Videos (if applicable)

Pre-submission Checklist

  • I have tested my changes thoroughly before submitting this PR
  • This PR contains minimal changes necessary to address the issue/feature
  • My code follows the project's coding standards and style guidelines
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if applicable)
  • All new and existing tests pass
  • I have searched existing PRs to ensure this change hasn't been submitted already
  • I have linked any relevant issues in the description
  • My commits have clear and descriptive messages

DCO Affirmation

I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.

Summary by CodeRabbit

  • Tests

    • Expanded end-to-end test suite for the search database with comprehensive setup/teardown, new session-scoped fixtures, and multiple tests validating graph/vector consistency, retriever contexts, triplet metadata, search result shapes, side effects, and feedback-weight behavior.
  • Chores

    • CI updated to run matrixed test jobs across multiple Python versions and standardize test execution for more consistent, parallelized runs.


@pull-checklist

Please make sure all the checkboxes are checked:

  • I have tested these changes locally.
  • I have reviewed the code changes.
  • I have added end-to-end and unit tests (if applicable).
  • I have updated the documentation and README.md file (if necessary).
  • I have removed unnecessary code and debug statements.
  • PR title is clear and follows the convention.
  • I have tagged reviewers or team members for feedback.

@coderabbitai
Contributor

coderabbitai bot commented Dec 11, 2025

Walkthrough

Adds a Python-version matrix to the search DB test workflow and converts the test steps across jobs to uv run pytest invocations; overhauls the end-to-end search DB test file with session-scoped async fixtures, environment setup helpers, and multiple focused tests for graph/vector consistency, retrievers, triplets, search results, and feedback-weight behavior.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Workflow Configuration: .github/workflows/search_db_tests.yml | Adds a JSON-array python-versions input and a matrix strategy sourcing python-version via fromJSON(inputs.python-versions); updates job names to include matrix.python-version; converts test steps to run uv run pytest ... -v --log-level=INFO; adds fetch-depth: 0 on some checkouts; applies matrix to multiple test jobs (Neo4j, Kuzu, Postgres variants). |
| Test Suite Restructuring: cognee/tests/test_search_db.py | Adds session-scoped event_loop fixture and two async setup fixtures (setup_test_environment, setup_test_environment_for_feedback), plus e2e_state and feedback_state fixtures. Splits a monolithic test into multiple tests that assert graph-vector consistency, non-empty retriever contexts, triplets containing vector_distance, shapes/content of search results/wrappers, graph side effects/node fields, and feedback-weight calculation; introduces extensive setup/teardown and dataset seeding/cognification logic. |
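
As a rough Python analogue of the fromJSON(inputs.python-versions) matrix wiring described in the workflow row, the sketch below parses a JSON-array input and expands it into one job name per version. The function name `expand_matrix` and the job-name format are assumptions for illustration; the actual naming in the workflow file is not shown here.

```python
import json


def expand_matrix(python_versions_input: str, job_names: list[str]) -> list[str]:
    """Expand a JSON-array input into one job label per (job, version) pair."""
    versions = json.loads(python_versions_input)
    # Versions must be strings: an unquoted 3.10 parses as the number 3.1.
    if not isinstance(versions, list) or not all(isinstance(v, str) for v in versions):
        raise ValueError("python-versions must be a JSON array of strings")
    # Hypothetical label format that embeds the version in the job name.
    return [f"{job} (Python {version})" for job in job_names for version in versions]
```

Calling `expand_matrix('["3.10", "3.11"]', ["Neo4j", "Kuzu"])` yields four labels, one per job and version. The string check guards against a common pitfall with version lists: an unquoted `3.10` would silently become `3.1`.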

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Inspect matrix/job naming and correct wiring of fromJSON(inputs.python-versions) and matrix.python-version usage.
  • Review new session-scoped async fixtures for cross-test state leakage and proper teardown (engine disposal, cache clearing, pruning).
  • Validate correctness and determinism of dataset seeding, cognification, embedding creation, and feedback-weight assertions.
  • Confirm pytest invocation and flags are consistent across workflow jobs.
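
The teardown concerns in the list above (engine disposal, cache clearing) can be sketched with the standard library alone. Everything here is hypothetical: `FakeEngine`, `get_engine`, and `isolated_environment` are illustrative names, not the helpers used in test_search_db.py. The point is the shape: teardown sits in a `finally` block so it runs even when the body fails, and the cached factory is cleared so the next setup does not reuse a disposed engine.

```python
from contextlib import contextmanager
from functools import lru_cache


class FakeEngine:
    """Hypothetical engine that only records whether it was disposed."""

    def __init__(self) -> None:
        self.disposed = False

    def dispose(self) -> None:
        self.disposed = True


@lru_cache
def get_engine() -> FakeEngine:
    # Cached factory: repeated calls return the same engine until the cache is cleared.
    return FakeEngine()


@contextmanager
def isolated_environment():
    """Yield an engine, then dispose it and clear the cache, even on failure."""
    engine = get_engine()
    try:
        yield engine
    finally:
        engine.dispose()
        get_engine.cache_clear()
```

A pytest fixture written with `yield` follows the same setup, yield, teardown structure, so state leakage can be checked by confirming that a second setup receives a fresh, non-disposed engine.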

Suggested labels

run-checks

Suggested reviewers

  • Vasilije1990

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly identifies the main changes: test reorganization, new tests, and smoke e2e testing, directly corresponding to the restructured E2E tests and expanded test coverage in the PR. |
| Description check | ✅ Passed | The description covers key changes (pytest migration, fixtures, event loop management, SQLAlchemy improvements, multi-version coverage), benefits, type of change (new feature and refactoring), and completes all pre-submission checklist items with DCO affirmation. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 80d9189 and 03d59ac.

📒 Files selected for processing (1)
  • cognee/tests/test_search_db.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use 4-space indentation in Python code
Use snake_case for Python module and function names
Use PascalCase for Python class names
Use ruff format before committing Python code
Use ruff check for import hygiene and style enforcement with line-length 100 configured in pyproject.toml
Prefer explicit, structured error handling in Python code

Files:

  • cognee/tests/test_search_db.py

⚙️ CodeRabbit configuration file

**/*.py: When reviewing Python code for this project:

  1. Prioritize portability over clarity, especially when dealing with cross-Python compatibility. However, with the priority in mind, do still consider improvements to clarity when relevant.
  2. As a general guideline, consider the code style advocated in the PEP 8 standard (excluding the use of spaces for indentation) and evaluate suggested changes for code style compliance.
  3. As a style convention, consider the code style advocated in CEP-8 and evaluate suggested changes for code style compliance.
  4. As a general guideline, try to provide any relevant, official, and supporting documentation links to any tool's suggestions in review comments. This guideline is important for posterity.
  5. As a general rule, undocumented function definitions and class definitions in the project's Python code are assumed incomplete. Please consider suggesting a short summary of the code for any of these incomplete definitions as docstrings when reviewing.

Files:

  • cognee/tests/test_search_db.py
cognee/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Use shared logging utilities from cognee.shared.logging_utils in Python code

Files:

  • cognee/tests/test_search_db.py
cognee/tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

cognee/tests/**/*.py: Place Python tests under cognee/tests/ organized by type (unit, integration, cli_tests)
Name Python test files test_*.py and use pytest.mark.asyncio for async tests

Files:

  • cognee/tests/test_search_db.py
cognee/tests/*

⚙️ CodeRabbit configuration file

cognee/tests/*: When reviewing test code:

  1. Prioritize portability over clarity, especially when dealing with cross-Python compatibility. However, with the priority in mind, do still consider improvements to clarity when relevant.
  2. As a general guideline, consider the code style advocated in the PEP 8 standard (excluding the use of spaces for indentation) and evaluate suggested changes for code style compliance.
  3. As a style convention, consider the code style advocated in CEP-8 and evaluate suggested changes for code style compliance, pointing out any violations discovered.
  4. As a general guideline, try to provide any relevant, official, and supporting documentation links to any tool's suggestions in review comments. This guideline is important for posterity.
  5. As a project rule, Python source files with names prefixed by the string "test_" and located in the project's "tests" directory are the project's unit-testing code. It is safe, albeit a heuristic, to assume these are considered part of the project's minimal acceptance testing unless a justifying exception to this assumption is documented.
  6. As a project rule, any files without extensions and with names prefixed by either the string "check_" or the string "test_", and located in the project's "tests" directory, are the project's non-unit test code. "Non-unit test" in this context refers to any type of testing other than unit testing, such as (but not limited to) functional testing, style linting, regression testing, etc. It can also be assumed that non-unit testing code is usually written as Bash shell scripts.

Files:

  • cognee/tests/test_search_db.py
🧠 Learnings (2)
📚 Learning: 2025-11-24T16:45:09.996Z
Learnt from: CR
Repo: topoteretes/cognee PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T16:45:09.996Z
Learning: Applies to cognee/tests/**/*.py : Name Python test files test_*.py and use pytest.mark.asyncio for async tests

Applied to files:

  • cognee/tests/test_search_db.py
📚 Learning: 2024-12-04T18:37:55.092Z
Learnt from: hajdul88
Repo: topoteretes/cognee PR: 251
File: cognee/tests/infrastructure/databases/test_index_graph_edges.py:0-0
Timestamp: 2024-12-04T18:37:55.092Z
Learning: In the `index_graph_edges` function, both graph engine and vector engine initialization failures are handled within the same try-except block, so a single test covers both cases.

Applied to files:

  • cognee/tests/test_search_db.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
  • GitHub Check: End-to-End Tests / Test Pipeline Caching
  • GitHub Check: End-to-End Tests / Test Entity Extraction
  • GitHub Check: End-to-End Tests / Test dataset database deletion in Cognee
  • GitHub Check: End-to-End Tests / Test multi tenancy with different situations in Cognee
  • GitHub Check: End-to-End Tests / Test graph edge ingestion
  • GitHub Check: End-to-End Tests / Conversation sessions test (Redis)
  • GitHub Check: End-to-End Tests / Test Feedback Enrichment
  • GitHub Check: End-to-End Tests / Test permissions with different situations in Cognee
  • GitHub Check: End-to-End Tests / Test Cognify - Edge Centered Payload
  • GitHub Check: End-to-End Tests / Concurrent Subprocess access test
  • GitHub Check: End-to-End Tests / Test using different async databases in parallel in Cognee
  • GitHub Check: End-to-End Tests / Server Start Test
  • GitHub Check: End-to-End Tests / Run Telemetry Pipeline Test
  • GitHub Check: End-to-End Tests / Deduplication Test
  • GitHub Check: End-to-End Tests / Conversation sessions test (FS)
  • GitHub Check: End-to-End Tests / S3 Bucket Test
  • GitHub Check: End-to-End Tests / Test dataset database handlers in Cognee
  • GitHub Check: Basic Tests / Run Simple Examples
  • GitHub Check: Basic Tests / Run Unit Tests
  • GitHub Check: Basic Tests / Run Integration Tests
  • GitHub Check: Basic Tests / Run Simple Examples BAML
  • GitHub Check: CLI Tests / CLI Functionality Tests
  • GitHub Check: CLI Tests / CLI Integration Tests
🔇 Additional comments (1)
cognee/tests/test_search_db.py (1)

138-293: Excellent fixture design for expensive E2E operations.

The session-scoped e2e_state and feedback_state fixtures follow best practices by computing expensive LLM calls and setup once, then allowing multiple lightweight assertion-only tests to run against the shared state. This significantly improves test performance and maintainability.



@hajdul88 hajdul88 self-assigned this Dec 11, 2025
@hajdul88 hajdul88 changed the title chore: extends e2e multidb search tests chore: extends e2e multidb search tests (WIP) Dec 11, 2025
@hajdul88 hajdul88 marked this pull request as draft December 11, 2025 15:38
@hajdul88 hajdul88 changed the title chore: extends e2e multidb search tests (WIP) chore: extends e2e multidb search tests (smoke e2e multidb) Dec 12, 2025
This reverts commit 7a82bd7.
@hajdul88 hajdul88 changed the title chore: extends e2e multidb search tests (smoke e2e multidb) chore: extends e2e multidb search tests (smoke e2e multidb) (STEP 3) Dec 12, 2025
@hajdul88 hajdul88 marked this pull request as ready for review December 12, 2025 13:47
@hajdul88 hajdul88 requested review from lxobr and pazone December 12, 2025 13:50
@hajdul88 hajdul88 changed the title chore: extends e2e multidb search tests (smoke e2e multidb) (STEP 3) chore: retriever test reorganization + adding new tests (smoke e2e) (STEP 3) Dec 12, 2025
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7a82bd7 and 80d9189.

📒 Files selected for processing (2)
  • .github/workflows/search_db_tests.yml (9 hunks)
  • cognee/tests/test_search_db.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
.github/**

⚙️ CodeRabbit configuration file

.github/**: * When the project is hosted on GitHub: All GitHub-specific configurations, templates, and tools should be found in the '.github' directory tree.

  • 'actionlint' erroneously generates false positives when dealing with GitHub's ${{ ... }} syntax in conditionals.
  • 'actionlint' erroneously generates incorrect solutions when suggesting the removal of valid ${{ ... }} syntax.

Files:

  • .github/workflows/search_db_tests.yml
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use 4-space indentation in Python code
Use snake_case for Python module and function names
Use PascalCase for Python class names
Use ruff format before committing Python code
Use ruff check for import hygiene and style enforcement with line-length 100 configured in pyproject.toml
Prefer explicit, structured error handling in Python code

Files:

  • cognee/tests/test_search_db.py

⚙️ CodeRabbit configuration file

**/*.py: When reviewing Python code for this project:

  1. Prioritize portability over clarity, especially when dealing with cross-Python compatibility. However, with the priority in mind, do still consider improvements to clarity when relevant.
  2. As a general guideline, consider the code style advocated in the PEP 8 standard (excluding the use of spaces for indentation) and evaluate suggested changes for code style compliance.
  3. As a style convention, consider the code style advocated in CEP-8 and evaluate suggested changes for code style compliance.
  4. As a general guideline, try to provide any relevant, official, and supporting documentation links to any tool's suggestions in review comments. This guideline is important for posterity.
  5. As a general rule, undocumented function definitions and class definitions in the project's Python code are assumed incomplete. Please consider suggesting a short summary of the code for any of these incomplete definitions as docstrings when reviewing.

Files:

  • cognee/tests/test_search_db.py
cognee/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Use shared logging utilities from cognee.shared.logging_utils in Python code

Files:

  • cognee/tests/test_search_db.py
cognee/tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

cognee/tests/**/*.py: Place Python tests under cognee/tests/ organized by type (unit, integration, cli_tests)
Name Python test files test_*.py and use pytest.mark.asyncio for async tests

Files:

  • cognee/tests/test_search_db.py
cognee/tests/*

⚙️ CodeRabbit configuration file

cognee/tests/*: When reviewing test code:

  1. Prioritize portability over clarity, especially when dealing with cross-Python compatibility. However, with the priority in mind, do still consider improvements to clarity when relevant.
  2. As a general guideline, consider the code style advocated in the PEP 8 standard (excluding the use of spaces for indentation) and evaluate suggested changes for code style compliance.
  3. As a style convention, consider the code style advocated in CEP-8 and evaluate suggested changes for code style compliance, pointing out any violations discovered.
  4. As a general guideline, try to provide any relevant, official, and supporting documentation links to any tool's suggestions in review comments. This guideline is important for posterity.
  5. As a project rule, Python source files with names prefixed by the string "test_" and located in the project's "tests" directory are the project's unit-testing code. It is safe, albeit a heuristic, to assume these are considered part of the project's minimal acceptance testing unless a justifying exception to this assumption is documented.
  6. As a project rule, any files without extensions and with names prefixed by either the string "check_" or the string "test_", and located in the project's "tests" directory, are the project's non-unit test code. "Non-unit test" in this context refers to any type of testing other than unit testing, such as (but not limited to) functional testing, style linting, regression testing, etc. It can also be assumed that non-unit testing code is usually written as Bash shell scripts.

Files:

  • cognee/tests/test_search_db.py
🧠 Learnings (4)
📚 Learning: 2024-11-18T12:54:36.758Z
Learnt from: 0xideas
Repo: topoteretes/cognee PR: 233
File: .github/workflows/test_cognee_llama_index_notebook.yml:0-0
Timestamp: 2024-11-18T12:54:36.758Z
Learning: In the `.github/workflows/test_cognee_llama_index_notebook.yml` workflow, it's acceptable to remove the `--all-extras` flag from `poetry install` to reduce costs by not installing unnecessary dependencies.

Applied to files:

  • .github/workflows/search_db_tests.yml
📚 Learning: 2025-11-24T16:45:09.996Z
Learnt from: CR
Repo: topoteretes/cognee PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T16:45:09.996Z
Learning: Applies to cognee/tests/**/*.py : Place Python tests under cognee/tests/ organized by type (unit, integration, cli_tests)

Applied to files:

  • .github/workflows/search_db_tests.yml
📚 Learning: 2025-11-24T16:45:09.996Z
Learnt from: CR
Repo: topoteretes/cognee PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T16:45:09.996Z
Learning: Applies to cognee/tests/**/*.py : Name Python test files test_*.py and use pytest.mark.asyncio for async tests

Applied to files:

  • cognee/tests/test_search_db.py
📚 Learning: 2024-12-04T18:37:55.092Z
Learnt from: hajdul88
Repo: topoteretes/cognee PR: 251
File: cognee/tests/infrastructure/databases/test_index_graph_edges.py:0-0
Timestamp: 2024-12-04T18:37:55.092Z
Learning: In the `index_graph_edges` function, both graph engine and vector engine initialization failures are handled within the same try-except block, so a single test covers both cases.

Applied to files:

  • cognee/tests/test_search_db.py
🔇 Additional comments (2)
cognee/tests/test_search_db.py (1)

241-260: Session-scoped E2E fixture design is solid (compute once, assert many).
Good tradeoff for expensive LLM-backed setup; keeping tests “assert-only” reduces duplicated calls.

Also applies to: 268-290

.github/workflows/search_db_tests.yml (1)

57-58: Pytest switch looks good; ensure cognee_setup always installs pytest (+ plugins) consistently across matrices.
If cognee_setup conditionally installs dev deps, matrix jobs may fail only on some variants.


Also applies to: 103-104, 162-163, 228-228
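
One way to surface the "pytest missing on some matrix variants" risk early is a small preflight check that runs before the tests. This is a sketch under assumptions: `missing_modules` is a hypothetical helper, and which modules a given job actually requires is not stated in the review.

```python
import importlib.util


def missing_modules(required: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in required if importlib.util.find_spec(name) is None]
```

A job could call `missing_modules(["pytest", "pytest_asyncio"])` and fail fast with a clear message if the result is non-empty, rather than failing later with an import error inside one matrix variant. The module list in that call is an assumption about what the suite needs.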

@hajdul88 hajdul88 changed the title chore: retriever test reorganization + adding new tests (smoke e2e) (STEP 3) chore: retriever test reorganization + adding new tests (smoke e2e) (STEP 1.5) Dec 12, 2025
@hajdul88 hajdul88 changed the title chore: retriever test reorganization + adding new tests (smoke e2e) (STEP 1.5) chore: retriever test reorganization + adding new tests (smoke e2e) (STEP 1) Dec 12, 2025
@hajdul88 hajdul88 changed the title chore: retriever test reorganization + adding new tests (smoke e2e) (STEP 1) chore: retriever test reorganization + adding new tests (smoke e2e) (STEP 1.5) Dec 12, 2025
@hajdul88 hajdul88 merged commit b4aaa7f into dev Dec 16, 2025
243 of 246 checks passed
@hajdul88 hajdul88 deleted the feature/cog-3532-empower-test_search-db-retrievers-tests-reorg-2 branch December 16, 2025 10:59