feat: add neighborhood-based graph traversal for retrievers by Vasilije1990 · Pull Request #2328 · topoteretes/cognee

Vasilije1990 · 2026-03-08T16:30:08Z

Summary

- Add get_neighborhood(node_ids, depth, edge_types) to GraphDBInterface and implement it in the Kuzu, Neo4j, and Neptune adapters using variable-length Cypher path patterns ([*1..N])
- Add project_neighborhood_from_db() to CogneeGraph with an extracted _process_nodes_and_edges() helper to eliminate duplication
- Add neighborhood_depth hyperparameter to brute_force_triplet_search, GraphCompletionRetriever, and GraphCompletionContextExtensionRetriever
- Wire neighborhood_depth end-to-end through the search API (search() → authorized_search() → search_in_datasets_context() → retriever factory)

When neighborhood_depth is set, the retriever extracts a k-hop subgraph around the top vector-search seed nodes instead of projecting the full graph. This gives more focused, structurally relevant context for graph-based completions.

Usage:

await cognee.search(
    "What is X?",
    query_type=SearchType.GRAPH_COMPLETION,
    neighborhood_depth=2,  # 2-hop neighborhood around seed nodes
)
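
To make the "k-hop subgraph around seed nodes" idea concrete, here is a minimal in-memory sketch. It is not the adapter code from this PR, which relies on Cypher path patterns; the function name `k_hop_neighborhood` and the edge-tuple shape are assumptions made purely for illustration.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical edge shape: (source_id, target_id, relationship_name)
Edge = Tuple[str, str, str]


def k_hop_neighborhood(
    edges: Iterable[Edge], seed_ids: Iterable[str], depth: int
) -> Tuple[Set[str], List[Edge]]:
    """Return node ids within `depth` undirected hops of any seed, plus the induced edges."""
    if depth < 1:
        raise ValueError("depth must be >= 1")

    edges = list(edges)
    # Undirected adjacency, mirroring the undirected -[*1..N]- pattern described above.
    adjacency: Dict[str, Set[str]] = {}
    for source, target, _ in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)

    # Multi-source breadth-first search that stops expanding at `depth` hops.
    visited: Set[str] = set(seed_ids)
    frontier = deque((seed, 0) for seed in visited)
    while frontier:
        node, hops = frontier.popleft()
        if hops == depth:
            continue
        for neighbor in adjacency.get(node, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, hops + 1))

    # Keep only edges whose endpoints both fall inside the neighborhood.
    induced = [e for e in edges if e[0] in visited and e[1] in visited]
    return visited, induced


chain = [("a", "b", "knows"), ("b", "c", "knows"), ("c", "d", "knows")]
nodes_1, edges_1 = k_hop_neighborhood(chain, ["a"], depth=1)
nodes_2, edges_2 = k_hop_neighborhood(chain, ["a"], depth=2)
```

On the four-node chain, depth 1 from `a` reaches only `b`, and depth 2 also reaches `c`; `d` stays outside both neighborhoods.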

Test plan

- Verify existing search behavior is unchanged when neighborhood_depth is not set (default None)
- Test with neighborhood_depth=1 and neighborhood_depth=2 on a populated knowledge graph
- Verify Kuzu adapter get_neighborhood() returns correct nodes/edges format
- Verify Neo4j adapter get_neighborhood() returns correct nodes/edges format

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Neighborhood-based search: retrieve subgraphs around specified nodes with configurable depth control
- Extended search API with neighborhood_depth parameter to enhance contextual retrieval capabilities
- Optional edge-type filtering for precise neighborhood queries across all supported databases
Refactor
- Consolidated internal graph processing logic for improved code maintainability

Add configurable k-hop neighborhood extraction to graph retrievers. When neighborhood_depth is set, the retriever extracts a subgraph around vector-search seed nodes instead of projecting the full graph.

Changes:
- Add get_neighborhood() abstract method to GraphDBInterface
- Implement get_neighborhood() in Kuzu, Neo4j, and Neptune adapters
- Add project_neighborhood_from_db() to CogneeGraph with shared _process_nodes_and_edges() helper to avoid code duplication
- Wire neighborhood_depth parameter through brute_force_triplet_search, GraphCompletionRetriever, GraphCompletionContextExtensionRetriever, search factory, and search API layers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: vasilije <vas.markovic@gmail.com>

coderabbitai · 2026-03-08T16:30:35Z

Walkthrough

This PR adds neighborhood-based graph querying across the stack. It introduces get_neighborhood to the graph database interface with implementations for Kuzu, Neo4j, and Neptune adapters to retrieve k-hop subgraphs. The search API is extended with an optional neighborhood_depth parameter that threads through the retrieval pipeline to enable localized graph context queries. Core graph projection logic in CogneeGraph is refactored for reusability.
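
As a rough illustration of how a k-hop query fragment with optional edge-type filtering might be assembled, the sketch below builds the variable-length relationship pattern. The helper name `build_relationship_pattern` and the identifier check are assumptions, not part of the PR; the check is included because edge types are interpolated into the query text rather than passed as parameters.

```python
import re
from typing import Optional, Sequence

# Assumption: edge type names are plain identifiers (letters, digits, underscore).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_relationship_pattern(
    depth: int, edge_types: Optional[Sequence[str]] = None
) -> str:
    """Build an undirected variable-length pattern such as -[:A|B*1..2]-."""
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(depth, bool) or not isinstance(depth, int) or depth < 1:
        raise ValueError("depth must be a positive integer")

    if not edge_types:
        return f"-[*1..{depth}]-"

    for edge_type in edge_types:
        # Reject anything that could alter the query structure.
        if not _IDENTIFIER.fullmatch(edge_type):
            raise ValueError(f"unsupported edge type: {edge_type!r}")

    allowed = "|".join(edge_types)
    return f"-[:{allowed}*1..{depth}]-"


unfiltered = build_relationship_pattern(2)
filtered = build_relationship_pattern(1, ["is_a", "contains"])
```

Node IDs can still travel as query parameters; only the depth and the relationship types need to be embedded in the pattern text, which is why they are validated before formatting.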

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| API Layer `cognee/api/v1/search/search.py` | Added optional `neighborhood_depth` parameter to search endpoint signature; parameter propagated through internal search_function call. |
| Database Interface & Adapters `cognee/infrastructure/databases/graph/graph_db_interface.py`, `cognee/infrastructure/databases/graph/kuzu/adapter.py`, `cognee/infrastructure/databases/graph/neo4j_driver/adapter.py`, `cognee/infrastructure/databases/graph/neptune_driver/adapter.py` | Added abstract `get_neighborhood` method to GraphDBInterface; implemented k-hop neighborhood retrieval in three adapters using database-specific queries, including node/edge property expansion, optional edge-type filtering, and consistent return format. |
| Graph Projection Refactor `cognee/modules/graph/cognee_graph/CogneeGraph.py` | Extracted node/edge processing logic into private `_process_nodes_and_edges` helper method; added new `project_neighborhood_from_db` public method for subgraph projection; updated `project_graph_from_db` signature with node/edge projection parameters and error handling. |
| Retrieval Layer `cognee/modules/retrieval/graph_completion_retriever.py`, `cognee/modules/retrieval/graph_completion_context_extension_retriever.py`, `cognee/modules/retrieval/utils/brute_force_triplet_search.py` | Added `neighborhood_depth` and `neighborhood_seed_top_k` parameters to retriever initialization; updated `brute_force_triplet_search` and helper methods to conditionally use neighborhood projection when depth is set; parameters threaded through triplet search pipeline. |
| Search Flow `cognee/modules/search/methods/search.py`, `cognee/modules/search/methods/get_search_type_retriever_instance.py` | Extended search signatures to accept and propagate `neighborhood_depth` through authorized_search, dataset context search, and retriever instantiation; neighborhood_depth extracted from kwargs and passed to graph retriever initialization. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~35 minutes

Possibly related PRs

PR #2217: Modifies Kuzu adapter neighbor-retrieval logic; overlaps with get_neighborhood implementation in this PR.
PR #1926: Updates CogneeGraph projection and triplet_distance_penalty handling; shares refactoring of projection methods with this PR.
PR #1991: Modifies brute_force_triplet_search and helper call paths; overlaps with neighborhood parameter threading in this PR.

Suggested labels

run-checks, core-team

Suggested reviewers

lxobr
hajdul88
dexters1

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The PR description is largely complete with a clear summary, usage example, and test plan. However, the provided description deviates from the template structure; it lacks explicit sections for Acceptance Criteria, Type of Change checkbox selection, Screenshots, Pre-submission Checklist completion, and DCO Affirmation. | Fill in all template sections including: Type of Change (mark 'New feature'), Acceptance Criteria, Screenshots of tests passing, Pre-submission Checklist items, and DCO Affirmation confirmation. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely describes the main feature addition: neighborhood-based graph traversal for retrievers, which aligns with the core changes across multiple files. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 90.32%, which is sufficient. The required threshold is 80.00%. |

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (1)

cognee/modules/graph/cognee_graph/CogneeGraph.py (1)

257-259: Preserve the traceback in these projection logs.

Both except Exception blocks currently reduce the failure to str(e), which drops the stack trace right where adapter/query diagnostics matter most.

🪵 Suggested fix

-        except Exception as e:
-            logger.error(f"Error during graph projection: {str(e)}")
+        except Exception:
+            logger.error("Error during graph projection", exc_info=True)
             raise

-        except Exception as e:
-            logger.error(f"Error during neighborhood projection: {str(e)}")
+        except Exception:
+            logger.error("Error during neighborhood projection", exc_info=True)
             raise

As per coding guidelines, "Prefer explicit, structured error handling in Python code".

Also applies to: 304-306
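
The effect of `exc_info=True` can be checked in isolation with the standard `logging` module. This is a small sketch; the logger name and the failing function are made up for the demonstration and do not come from CogneeGraph.py.

```python
import io
import logging

# Capture log output in memory so it can be inspected afterwards.
stream = io.StringIO()
logger = logging.getLogger("projection_demo")
logger.setLevel(logging.ERROR)
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))


def failing_projection() -> None:
    # Stand-in for an adapter call that raises.
    raise RuntimeError("adapter query failed")


def project_with_traceback() -> None:
    try:
        failing_projection()
    except Exception:
        # exc_info=True attaches the active exception's traceback to the record.
        logger.error("Error during graph projection", exc_info=True)
        raise


try:
    project_with_traceback()
except RuntimeError:
    pass

captured = stream.getvalue()
```

`logger.exception("...")` is equivalent to `logger.error("...", exc_info=True)` when called from inside an exception handler, so either form keeps the stack trace.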

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cognee/modules/graph/cognee_graph/CogneeGraph.py` around lines 257 - 259, The
except blocks in CogneeGraph.py are logging only str(e), which omits the
traceback; replace those logger.error(...) calls inside the graph projection
error handlers with logger.exception("Error during graph projection") or
logger.error("Error during graph projection", exc_info=True) so the stack trace
is preserved in logs, and apply the same change to the other similar except
block (around lines 304-306) that currently logs str(e); keep the existing bare
"raise" to re-raise the original exception.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cognee/api/v1/search/search.py`:
- Around line 44-45: The new public parameter neighborhood_depth is forwarded
unchanged to the adapter get_neighborhood(), allowing 0, negative or non-int
values to create invalid path patterns; validate neighborhood_depth early in the
containing function (the public search handler that returns List[SearchResult])
by checking it is an integer > 0 (and within any configured max if applicable),
and if not raise/return a clear API error (e.g., BadRequest/ValueError) before
calling get_neighborhood(); update callers that pass neighborhood_depth through
(the code around where neighborhood_depth is forwarded) to rely on this
validated value.

In `@cognee/infrastructure/databases/graph/neptune_driver/adapter.py`:
- Around line 690-725: get_neighborhood() mixes external node IDs (~id) and
internal Neptune ids (id(n)), causing mismatches; ensure the same ID domain is
used throughout by returning and filtering on the external id property (`~id`).
Update the path_query to RETURN neighbor.`~id` (collect into neighbor_ids),
build all_ids as union of node_ids and those neighbor `~id`s, change nodes_query
to WHERE n.`~id` IN $ids and RETURN n.`~id` AS node_id, and change edges_query
to WHERE source.`~id` IN $ids AND target.`~id` IN $ids and RETURN source.`~id`
AS source_id, target.`~id` AS target_id (keep function name get_neighborhood and
variables path_query, nodes_query, edges_query, all_ids, neighbor_ids, node_ids
to locate changes).

In `@cognee/modules/graph/cognee_graph/CogneeGraph.py`:
- Around line 280-291: project_neighborhood_from_db currently forwards invalid
inputs (depth <= 0 or empty seed_node_ids) to the adapter and treats any empty
edges_data as an error even when nodes_data contains only the requested seeds;
validate inputs early and relax the empty-edge check: in
project_neighborhood_from_db, before calling adapter.get_neighborhood validate
and raise a clear input error if depth < 1 or seed_node_ids is empty (use
InvalidDimensionsError or a new InvalidInputError), then call
adapter.get_neighborhood; after the call, only raise EntityNotFoundError if
nodes_data is empty (no nodes returned); allow edges_data to be empty when
nodes_data contains the requested seed_node_ids (i.e., accept seed-only
neighborhoods) and only treat missing edges as an error when your logic expects
at least one edge type to be present.

In `@cognee/modules/retrieval/utils/brute_force_triplet_search.py`:
- Around line 55-56: The neighborhood_depth flag is being ignored when
relevant_ids_to_filter is falsy because the code calls project_graph_from_db()
inside the neighborhood branch; change the logic in brute_force_triplet_search
(around the neighborhood_depth check) so that if neighborhood_depth is set and
relevant_ids_to_filter is empty you either (A) fail fast by raising a ValueError
indicating seed IDs are required for neighborhood mode, or (B) compute/derive
seed IDs before entering neighborhood mode (e.g., call the existing
seed-derivation helper or add a new get_seed_ids function) and then proceed to
call project_graph_from_db() only with those seed IDs; ensure references to
neighborhood_seed_top_k and relevant_ids_to_filter are used to derive seeds if
you choose option B and do not fall back to full-graph projection silently.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b77f4569-41b9-4b38-a164-9607d8b0a297

📥 Commits

Reviewing files that changed from the base of the PR and between f7ba5db and 6cfe0e8.

📒 Files selected for processing (11)

cognee/api/v1/search/search.py
cognee/infrastructure/databases/graph/graph_db_interface.py
cognee/infrastructure/databases/graph/kuzu/adapter.py
cognee/infrastructure/databases/graph/neo4j_driver/adapter.py
cognee/infrastructure/databases/graph/neptune_driver/adapter.py
cognee/modules/graph/cognee_graph/CogneeGraph.py
cognee/modules/retrieval/graph_completion_context_extension_retriever.py
cognee/modules/retrieval/graph_completion_retriever.py
cognee/modules/retrieval/utils/brute_force_triplet_search.py
cognee/modules/search/methods/get_search_type_retriever_instance.py
cognee/modules/search/methods/search.py

coderabbitai · 2026-03-08T16:38:05Z

cognee/api/v1/search/search.py

+    neighborhood_depth: Optional[int] = None,
 ) -> List[SearchResult]:


⚠️ Potential issue | 🟠 Major

Validate neighborhood_depth before forwarding it.

Line 233 passes the new public parameter through unchanged. 0, negative values, or non-ints will currently reach the adapter get_neighborhood() queries and build invalid [*1..N] path patterns instead of returning a clear API error.

🛡️ Suggested guard

 async def search(
     query_text: str,
@@
     retriever_specific_config: Optional[dict] = None,
     neighborhood_depth: Optional[int] = None,
 ) -> List[SearchResult]:
+    if neighborhood_depth is not None and (
+        not isinstance(neighborhood_depth, int) or neighborhood_depth < 1
+    ):
+        raise CogneeValidationError(
+            message="neighborhood_depth must be a positive integer.",
+            name="InvalidNeighborhoodDepth",
+        )
+
     """
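
A standalone version of such a guard could look like the sketch below. The exception class, the function name, and the `max_depth` cap of 5 are all assumptions for illustration; the project's own error type would replace `InvalidNeighborhoodDepth`. Note that `bool` is a subclass of `int` in Python, so a plain `isinstance(value, int)` check would let `True` through.

```python
from typing import Optional


class InvalidNeighborhoodDepth(ValueError):
    """Hypothetical stand-in for the project's validation error type."""


def validate_neighborhood_depth(
    value: Optional[int], max_depth: int = 5
) -> Optional[int]:
    """Return the depth unchanged if valid, None if unset, otherwise raise."""
    if value is None:
        # Unset means "keep the existing full-graph behaviour".
        return None
    # Exclude bool explicitly because isinstance(True, int) is True.
    if isinstance(value, bool) or not isinstance(value, int):
        raise InvalidNeighborhoodDepth("neighborhood_depth must be an integer.")
    if value < 1 or value > max_depth:
        raise InvalidNeighborhoodDepth(
            f"neighborhood_depth must be between 1 and {max_depth}."
        )
    return value
```

Validating once at the public entry point means the downstream layers can rely on the value without repeating the check.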

Also applies to: 217-233

coderabbitai · 2026-03-08T16:38:05Z

cognee/infrastructure/databases/graph/neptune_driver/adapter.py

+            if edge_types:
+                allowed = "|".join(edge_types)
+                path_query = f"""
+                MATCH (seed:{self._GRAPH_NODE_LABEL})-[:{allowed}*1..{depth}]-(neighbor:{self._GRAPH_NODE_LABEL})
+                WHERE seed.`~id` IN $node_ids
+                RETURN DISTINCT id(neighbor) AS nid
+                """
+            else:
+                path_query = f"""
+                MATCH (seed:{self._GRAPH_NODE_LABEL})-[*1..{depth}]-(neighbor:{self._GRAPH_NODE_LABEL})
+                WHERE seed.`~id` IN $node_ids
+                RETURN DISTINCT id(neighbor) AS nid
+                """
+
+            result = await self.query(path_query, {"node_ids": node_ids})
+            neighbor_ids = [record["nid"] for record in result if record.get("nid")]
+
+            all_ids = list(set(node_ids) | set(neighbor_ids))
+
+            # Step 2: Fetch all nodes
+            nodes_query = f"""
+            MATCH (n:{self._GRAPH_NODE_LABEL})
+            WHERE id(n) IN $ids
+            RETURN id(n) AS node_id, properties(n) AS properties
+            """
+            nodes_result = await self.query(nodes_query, {"ids": all_ids})
+            nodes = [(r["node_id"], r["properties"]) for r in nodes_result]
+
+            # Step 3: Fetch all edges between collected nodes
+            edges_query = f"""
+            MATCH (source:{self._GRAPH_NODE_LABEL})-[r]->(target:{self._GRAPH_NODE_LABEL})
+            WHERE id(source) IN $ids AND id(target) IN $ids
+            RETURN id(source) AS source_id, id(target) AS target_id,
+                   type(r) AS relationship_name, properties(r) AS properties
+            """
+            edges_result = await self.query(edges_query, {"ids": all_ids})


⚠️ Potential issue | 🔴 Critical

Keep get_neighborhood() on a single ID domain.

Line 694 matches seed nodes by ~id, but Lines 712 and 721 switch to id(n) / id(source). That makes all_ids a mix of external IDs and Neptune internal IDs, so seed nodes and their incident edges can disappear from the returned neighborhood.

🔧 One consistent way to fix it

             if edge_types:
                 allowed = "|".join(edge_types)
                 path_query = f"""
                 MATCH (seed:{self._GRAPH_NODE_LABEL})-[:{allowed}*1..{depth}]-(neighbor:{self._GRAPH_NODE_LABEL})
                 WHERE seed.`~id` IN $node_ids
-                RETURN DISTINCT id(neighbor) AS nid
+                RETURN DISTINCT neighbor.`~id` AS nid
                 """
             else:
                 path_query = f"""
                 MATCH (seed:{self._GRAPH_NODE_LABEL})-[*1..{depth}]-(neighbor:{self._GRAPH_NODE_LABEL})
                 WHERE seed.`~id` IN $node_ids
-                RETURN DISTINCT id(neighbor) AS nid
+                RETURN DISTINCT neighbor.`~id` AS nid
                 """
@@
             nodes_query = f"""
             MATCH (n:{self._GRAPH_NODE_LABEL})
-            WHERE id(n) IN $ids
-            RETURN id(n) AS node_id, properties(n) AS properties
+            WHERE n.`~id` IN $ids
+            RETURN n.`~id` AS node_id, properties(n) AS properties
             """
@@
             edges_query = f"""
             MATCH (source:{self._GRAPH_NODE_LABEL})-[r]->(target:{self._GRAPH_NODE_LABEL})
-            WHERE id(source) IN $ids AND id(target) IN $ids
-            RETURN id(source) AS source_id, id(target) AS target_id,
+            WHERE source.`~id` IN $ids AND target.`~id` IN $ids
+            RETURN source.`~id` AS source_id, target.`~id` AS target_id,
                    type(r) AS relationship_name, properties(r) AS properties
             """
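
The failure mode described in this comment can be reproduced with a toy in-memory store. The sketch assumes, as the comment does, that the two identifiers live in different domains; whether they actually differ in a given Neptune deployment is not something this example establishes. All names here (`StoredNode`, `fetch_nodes`, the sample IDs) are hypothetical.

```python
from typing import Dict, List, NamedTuple, Set


class StoredNode(NamedTuple):
    external_id: str  # the id callers pass in as a seed
    internal_id: str  # a separate engine-assigned id (toy assumption)


NODES = [
    StoredNode("doc-1", "n:0"),
    StoredNode("doc-2", "n:1"),
    StoredNode("doc-3", "n:2"),
]
# Adjacency keyed by external id: doc-1 is connected to doc-2 only.
NEIGHBORS: Dict[str, List[str]] = {"doc-1": ["doc-2"]}


def fetch_nodes(seed_ids: List[str], *, consistent: bool) -> Set[str]:
    """Return external ids of the nodes selected for the neighborhood."""
    by_external = {n.external_id: n for n in NODES}
    neighbors = [
        by_external[nid] for seed in seed_ids for nid in NEIGHBORS.get(seed, [])
    ]

    if consistent:
        # Collect and filter on the same (external) id domain.
        wanted = set(seed_ids) | {n.external_id for n in neighbors}
        return {n.external_id for n in NODES if n.external_id in wanted}

    # Mixed domains: seeds are external ids, neighbors are internal ids,
    # and the final filter compares against internal ids only.
    wanted = set(seed_ids) | {n.internal_id for n in neighbors}
    return {n.external_id for n in NODES if n.internal_id in wanted}


mixed = fetch_nodes(["doc-1"], consistent=False)
fixed = fetch_nodes(["doc-1"], consistent=True)
```

In the mixed case the seed `doc-1` never matches the internal-id filter and silently drops out, which also removes every edge incident to it once the edge query applies the same filter.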

coderabbitai · 2026-03-08T16:38:05Z

cognee/modules/graph/cognee_graph/CogneeGraph.py

+        if node_dimension < 1 or edge_dimension < 1:
+            raise InvalidDimensionsError()
+        try:
+            logger.info(f"Retrieving {depth}-hop neighborhood for {len(seed_node_ids)} seed nodes.")
+            nodes_data, edges_data = await adapter.get_neighborhood(
+                node_ids=seed_node_ids,
+                depth=depth,
+                edge_types=edge_types,
+            )
+
+            if not nodes_data or not edges_data:
+                raise EntityNotFoundError(message="Empty neighborhood projected from the database.")


⚠️ Potential issue | 🟠 Major

Validate neighborhood inputs and allow seed-only results.

project_neighborhood_from_db() currently forwards depth <= 0 and empty seed_node_ids straight to the adapter, and Line 290 also raises when the neighborhood contains seed nodes but no edges. That makes malformed requests and sparse-but-valid neighborhoods fail deep in the backend instead of producing a clear boundary behavior.

💡 One way to harden this path

         if node_dimension < 1 or edge_dimension < 1:
             raise InvalidDimensionsError()
+        if depth < 1:
+            raise ValueError("depth must be >= 1")
+        if not seed_node_ids:
+            raise ValueError("seed_node_ids must not be empty")
         try:
             logger.info(f"Retrieving {depth}-hop neighborhood for {len(seed_node_ids)} seed nodes.")
             nodes_data, edges_data = await adapter.get_neighborhood(
                 node_ids=seed_node_ids,
                 depth=depth,
                 edge_types=edge_types,
             )

-            if not nodes_data or not edges_data:
+            if not nodes_data:
                 raise EntityNotFoundError(message="Empty neighborhood projected from the database.")
+            edges_data = edges_data or []
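
The two behaviours asked for here (reject malformed inputs early, accept a seed-only result) can be separated into small pure functions that are easy to test without a database. This is a sketch under assumptions: `EmptyNeighborhoodError` stands in for the `EntityNotFoundError` used in the PR, and the function names and tuple shapes are illustrative only.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical result shapes for illustration.
Node = Tuple[str, dict]
Edge = Tuple[str, str, str, dict]


class EmptyNeighborhoodError(LookupError):
    """Hypothetical stand-in for the EntityNotFoundError used in the PR."""


def check_neighborhood_inputs(seed_node_ids: Sequence[str], depth: int) -> None:
    """Fail before any database call when the request itself is malformed."""
    if depth < 1:
        raise ValueError("depth must be >= 1")
    if not seed_node_ids:
        raise ValueError("seed_node_ids must not be empty")


def normalize_neighborhood(
    nodes_data: Optional[Sequence[Node]], edges_data: Optional[Sequence[Edge]]
) -> Tuple[List[Node], List[Edge]]:
    """Treat 'no nodes' as an error but allow nodes without edges."""
    if not nodes_data:
        raise EmptyNeighborhoodError("Empty neighborhood projected from the database.")
    # A sparse neighborhood (seeds only) is valid; normalize None to an empty list.
    return list(nodes_data), list(edges_data or [])


seed_only_nodes, seed_only_edges = normalize_neighborhood([("a", {})], None)
```

Splitting the checks this way keeps input errors (`ValueError`) distinct from lookup outcomes (`EmptyNeighborhoodError`), so callers can tell a bad request from an empty result.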

coderabbitai · 2026-03-08T16:38:05Z

cognee/modules/retrieval/utils/brute_force_triplet_search.py

+    neighborhood_depth: Optional[int] = None,
+    neighborhood_seed_top_k: Optional[int] = 10,


⚠️ Potential issue | 🟠 Major

Don't silently fall back to full-graph projection in neighborhood mode.

With neighborhood_depth set, Lines 68-88 still call project_graph_from_db() whenever relevant_ids_to_filter is empty/falsy. That makes the new flag a silent no-op and can turn a bounded neighborhood request back into a full-graph projection. Please either fail fast here or derive seed IDs before entering neighborhood mode.

Also applies to: 68-88
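
The fail-fast option can be expressed as a small decision function, shown here as a hedged sketch. `choose_projection` and its string return values are hypothetical; the real code in brute_force_triplet_search would call the projection methods directly rather than return a label.

```python
from typing import Optional, Sequence


def choose_projection(
    neighborhood_depth: Optional[int], seed_ids: Optional[Sequence[str]]
) -> str:
    """Decide which projection to run, refusing to widen a bounded request."""
    if neighborhood_depth is None:
        # Default behaviour when the flag is not set.
        return "full_graph"
    if not seed_ids:
        # The caller asked for a bounded neighborhood; do not silently
        # fall back to projecting the whole graph.
        raise ValueError(
            "neighborhood_depth is set but no seed node IDs are available"
        )
    return "neighborhood"


default_mode = choose_projection(None, None)
bounded_mode = choose_projection(2, ["node-1", "node-2"])
```

Raising here makes the missing-seed case visible to the caller, whereas deriving seed IDs first (the other option mentioned above) would keep the request working at the cost of an extra lookup.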

Vasilije1990 wants to merge 1 commit into dev from feature/neighborhood-graph-traversal
