
Feat/e2e datasets metadata #49

Open
paul-tharun wants to merge 15 commits into main from feat/e2e-datasets-metadata

Conversation

@paul-tharun
Contributor

@paul-tharun paul-tharun commented Jan 22, 2026

Summary by CodeRabbit

  • New Features

    • Natural-language → SQL generation endpoint; DuckDB documentation lookup; visualization end-to-end tooling and runner; multi-organization header support.
  • Improvements

    • Hybrid semantic+sparse search and size-aware OLAP sampling for better results; smarter SQL planning with clearer non‑SQL responses; CSV conversion utility; enhanced health and sandbox handling.
  • Tests / Docs

    • Extensive new/updated unit and e2e tests and expanded README.
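The CSV conversion utility mentioned above could look roughly like the sketch below; the actual helper in csv_utils.py may differ in signature and edge-case handling:

```python
import csv
import io


def convert_rows_to_csv(rows: list[dict]) -> str:
    """Hedged sketch of a rows-to-CSV helper; the real utility may differ."""
    if not rows:
        return ""
    buf = io.StringIO()
    # Use the first row's keys as the header; DictWriter quotes values
    # containing commas or newlines automatically.
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```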

Major changes:
- Add fetch-sql endpoint for NL to SQL generation
- Add unified GopieClient for API requests
- Add new SQL agent workflow with sql_planner_graph and nl_to_sql_graph
- Refactor single/multi dataset graphs with dedicated sql_agent nodes
- Add DuckDB docs search functionality
- Update prompts and validation logic
- Add new tools: respond_to_user, return_result_paths
- Update dependencies: langgraph>=0.6.0, dspy==3.1.0, beautifulsoup4
- Various bug fixes and improvements
- Updated `_build_ilike_query` and `_build_levenshtein_query` to ensure proper casting of columns to VARCHAR for SQL queries.
- Changed `data` type in `VisualizationResult` from `list[bytes]` to `list[str]` for better compatibility.
- Refactored `update_sandbox_timeout` to return a tuple indicating whether a new sandbox was created.
- Adjusted `pre_model_hook` to utilize the updated sandbox handling logic.
- Enhanced prompt guidelines to exclude non-string columns from fuzzy value generation.
- Updated test cases to validate new query handling and data type changes.
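The VARCHAR-casting change to `_build_ilike_query` can be sketched as follows; this is a hypothetical simplification, and the real helper in column_value_matching.py takes more parameters:

```python
def build_ilike_query(table: str, column: str, value: str) -> str:
    """Sketch of a case-insensitive fuzzy-match query with VARCHAR casting."""
    # Escape quotes and LIKE wildcards so the user-supplied value cannot
    # break out of the string literal or match unintended rows.
    escaped = value.replace("'", "''").replace("%", r"\%").replace("_", r"\_")
    # CAST(... AS VARCHAR) lets LIKE work on numeric/date columns as well.
    return (
        f"SELECT DISTINCT {column} FROM {table} "
        f"WHERE LOWER(CAST({column} AS VARCHAR)) "
        f"LIKE '%' || LOWER('{escaped}') || '%' ESCAPE '\\'"
    )
```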
@claude

claude Bot commented Jan 22, 2026

Claude finished @paul-tharun's task.


PR Review: Feat/e2e datasets metadata

I've conducted a comprehensive review of this large PR with 100 files changed. Here are my findings:

Review Checklist

  • Read repository setup and guidelines (CLAUDE.md, configs)
  • Review architectural changes (new graphs, workflows)
  • Analyze API security and validation
  • Check error handling and edge cases
  • Evaluate performance implications
  • Assess test coverage
  • Provide comprehensive feedback

🏗️ Architecture & Design

Positive Changes

  • Well-structured agent architecture: New nl_to_sql_graph and sql_planner_graph follow LangGraph patterns correctly
  • Separation of concerns: Clear separation between SQL planning, execution, and validation workflows
  • Type safety: Good use of Pydantic models for API validation (FetchSqlRequest, FetchSqlResponse)
  • Unified client pattern: New GopieClient centralizes API communication with proper header management

⚠️ Architectural Concerns

  • Commented code in workflows: chat-server/app/workflow/graph/sql_planner_graph/graph.py:19-30 has extensive commented conditional edges that should either be implemented or removed
  • Complex state management: The addition of multiple new graph types increases complexity without clear documentation of when each should be used

🔒 Security Analysis

Good Security Practices

  • Input validation: Proper Pydantic validation in new API endpoints (fetch_sql.py:21-22)
  • SQL injection prevention: Using parameterized queries and proper escaping (column_value_matching.py:40)
  • Header handling: Proper organization ID header management in GopieClient

🚨 Security Issues

  1. Insufficient input sanitization in column_value_matching.py:40:

    WHERE LOWER(CAST({column_name} AS VARCHAR)) LIKE '%' || LOWER('{escaped_value}') || '%'

    The escaped_value variable name suggests sanitization, but no actual escaping is shown in this snippet.

  2. Generic exception handling in fetch_sql.py:36-42:

    except Exception as e:
        logger.error(f"Error in fetch_sql: {e}")
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail="Failed to generate SQL queries, please try again.",
        ) from e

    This could mask important security errors and returns an overly generic error message to the client.


🚀 Performance Analysis

Performance Improvements

  • Smart sampling: New table size estimation and sampling logic in table_utils.py prevents performance issues on large datasets
  • Connection pooling: Using singleton pattern for HTTP client connections (SingletonAiohttp)
  • Async operations: Proper use of async/await throughout new components

⚠️ Performance Concerns

  • Multiple vector searches: New DuckDB documentation search (duckdb_docs/search.py) adds another vector search operation that could impact latency
  • Memory usage: Complex state objects with nested query results could consume significant memory in high-concurrency scenarios

🧪 Testing & Quality

Testing Infrastructure

  • Good test structure: Maintained separation between unit and e2e tests
  • DSPy integration: Advanced evaluation system with optimized evaluators

Testing Gaps

  1. No tests for new API endpoint: fetch_sql.py lacks dedicated test coverage
  2. Missing edge case tests: New graph workflows need tests for failure scenarios
  3. Integration tests: No tests covering the interaction between new nl_to_sql_graph and existing workflows

🐛 Bug & Error Handling Issues

  1. State validation inconsistency in sql_agent.py:27-28:

    if not query_result or not single_dataset_result:
        raise ValueError("query_result is not properly initialized")

    This doesn't provide context about what went wrong or how to fix it.
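    A more actionable variant of that check might look like this (hypothetical; the node and state names are taken from the surrounding review):

    ```python
    def validate_sql_agent_state(query_result, single_dataset_result) -> None:
        """Raise errors that say which field is missing and what to check."""
        if not query_result:
            raise ValueError(
                "query_result is not initialized; ensure the prepare_data "
                "node ran before sql_agent in the graph."
            )
        if not single_dataset_result:
            raise ValueError(
                "single_dataset_result is missing; check that the "
                "single-dataset graph wired its state into sql_agent."
            )
    ```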

  2. Resource cleanup: Missing cleanup logic for failed operations in new graph workflows

  3. Retry logic: Inconsistent retry mechanisms across different agents - some have configurable retry counts, others don't


📋 Code Quality

Good Practices

  • Type hints: Comprehensive type annotations throughout new code
  • Documentation: Good docstrings in new functions like get_duckdb_docs()
  • Code organization: Logical file structure following established patterns

⚠️ Areas for Improvement

  • Hardcoded values: Magic numbers like LIMIT 200000 in SQL queries should be configurable
  • Long functions: Some functions exceed 50 lines and could benefit from decomposition
  • Duplicate logic: Similar error handling patterns repeated across multiple files

🔧 Specific Recommendations

  1. Security:

    # In column_value_matching.py - add proper escaping
    def escape_sql_value(value: str) -> str:
        return value.replace("'", "''").replace("%", "\\%").replace("_", "\\_")
  2. Error handling:

    # In fetch_sql.py - more specific exception handling
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    except ConnectionError as e:
        raise HTTPException(status_code=503, detail="Service unavailable")
  3. Configuration:

    # Add to config.py
    DEFAULT_QUERY_LIMIT: int = 200000
    MAX_SAMPLING_PERCENTAGE: float = 5.0
  4. Testing:

    • Add unit tests for fetch_sql_router
    • Add integration tests for new graph workflows
    • Add performance tests for large dataset scenarios

📊 Overall Assessment

Category | Rating | Comments
--- | --- | ---
Architecture | ⭐⭐⭐⭐⚪ | Well-designed but complex
Security | ⭐⭐⭐⚪⚪ | Good practices with some gaps
Performance | ⭐⭐⭐⭐⚪ | Smart optimizations implemented
Testing | ⭐⭐⚪⚪⚪ | Infrastructure good, coverage lacking
Code Quality | ⭐⭐⭐⭐⚪ | Clean code with room for improvement

Overall: This is a substantial improvement to the system with well-thought-out architectural changes. The main concerns are security gaps, insufficient testing of new features, and some complexity that could be better documented. I recommend addressing the security issues before merging.


@coderabbitai
Contributor

coderabbitai Bot commented Jan 22, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds org-scoped request handling and a GopieClient; implements NL→SQL and SQL planner graphs with semantic search and SQL agents; introduces OLAP query builders (DuckDB/ClickHouse), sampling helpers, hybrid dense+sparse Qdrant support (SPLADE), refactors single/multi-dataset workflows, and expands tests, viz tooling, and indexing/migration scripts.

Changes

Cohort / File(s) Summary
API & Org Propagation
chat-server/app/api/v1/routers/dataset_upload.py, chat-server/app/api/v1/routers/query.py, chat-server/app/api/v1/routers/fetch_sql.py, chat-server/app/main.py
Add x_organization_id Header param and propagate org_id through endpoints and service calls; add new fetch-sql router.
Gopie client & services
chat-server/app/services/gopie/client.py, .../dataset_info.py, .../generate_schema.py, .../sql_executor.py
Introduce GopieClient, replace direct HTTP sessions, add org-scoped dataset/project fetch, sampling-aware summary generation, and org-scoped SQL execution.
Qdrant & Hybrid Vector Store
chat-server/app/services/qdrant/qdrant_setup.py, .../vector_store.py, .../schema_search.py, .../get_schema.py, .../schema_vectorization.py
Per-collection client/config, async/sync client singletons, collection existence handling, SPLADE sparse support, hybrid dense+sparse upserts and hybrid search, and org_id filtering for schema queries.
OLAP query builders & table utils
chat-server/app/utils/olap/*, chat-server/app/utils/graph_utils/table_utils.py, chat-server/app/utils/graph_utils/column_value_matching.py
Add OlapQueryBuilder interface and DuckDB/ClickHouse implementations; table size estimation, sampling decision helpers, and size-aware fuzzy SQL generation with org_id propagation.
NL→SQL & SQL Planner Graphs
chat-server/app/workflow/graph/nl_to_sql_graph/*, chat-server/app/workflow/graph/sql_planner_graph/*, related node/*, types.py
Add NL→SQL state graph (supervisor, semantic_search, sql_agent) and SQL planner graph (generate_sql, match_columns, search_docs) with typed states and nodes.
Single / Multi-dataset workflow refactor
chat-server/app/workflow/graph/single_dataset_graph/*, chat-server/app/workflow/graph/multi_dataset_graph/*
Split single-dataset flow into prepare_data, sql_agent, execute_sql with routing; replace plan_query with sql_agent in multi-dataset graph; rename/reshape state keys (previous_sql_queries/prev_sql_queries).
Prompts & selector updates
chat-server/app/workflow/prompts/*, chat-server/app/workflow/prompts/generate_sql_prompt.py
Rename/add many prompt APIs (generate_sql, extract_column_assumptions, etc.), add DB-aware prompt pieces, update prompt_map and input formatters.
Model & embedding providers
chat-server/app/utils/providers/embedding_providers/vllm.py, .../__init__.py, chat-server/app/models/provider.py, chat-server/app/utils/model_registry/model_provider.py
Add VLLM embedding provider and wiring; expose VLLM provider and adjust LLM retry behavior in provider selection.
Qdrant tooling & scripts
tests/scripts/reset_and_reindex_collection.py, chat-server/scripts/index_duckdb_docs.py, chat-server/scripts/scrape_duckdb_docs.py
New scripts to scrape/index DuckDB docs and migrate/reindex Qdrant collections to hybrid dense+sparse vectors.
Visualization & sandbox
chat-server/app/workflow/graph/visualize_data_graph/*, tests/e2e/viz_utils/*
Add VizIOManager, PerExampleWorkflow, viz test runner; update sandbox timeout helper to create sandbox when None and return (sandbox, created_new); pre_process_data and visualization nodes now pass org_id to SQL executor.
Chat history, formatting & models
chat-server/app/utils/chat_history/processor.py, chat-server/app/workflow/prompts/formatters/format_query_result.py, chat-server/app/models/query.py, chat-server/app/models/schema.py
Refactor chat history to inline SQL IDs; rename fields (no_sql_response→non_sql_response, sql_results→sql_queries), add org_id to DatasetSchema, update model serialization.
Tools & RunnableConfig plumbing
chat-server/app/tool_utils/*, chat-server/app/tool_utils/tools/*
Tool enum updates (RESPOND_TO_USER, RETURN_RESULT_PATHS), pass RunnableConfig into tools to extract org_id, and refactor tool-node metadata application.
CSV & utilities
chat-server/app/utils/csv_utils.py, chat-server/app/utils/graph_utils/result_validation.py
Add convert_rows_to_csv utility and adjust result validation/truncation messaging and JSON serialization defaults.
Tests & test infra
chat-server/tests/**/*
Large test additions and updates: unit tests for OLAP builders, schema search, match_columns, prompts; e2e viz tooling, dataset managers, test config updates, and changed .gitignore for datasets.
Config, deps, docs & infra
chat-server/app/core/config.py, chat-server/pyproject.toml, chat-server/README.md, .gitignore, docker-compose*.yaml, tests/test_config.py
New settings (duckdb collection, sparse model, sampling thresholds, OLAP_DB_TYPE), dependency updates/additions, README and test config updates, narrowed .gitignore entry for datasets.
Minor renames & logging
assorted files (app/core/log.py, event utils, constants)
Logger exception level tweak, removal of some streaming helpers, and assorted renames/refactors across workflow nodes and prompts.
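The hybrid dense+sparse retrieval described above is usually paired with a rank-fusion step; Reciprocal Rank Fusion is one common approach (a sketch only; the PR may rely on Qdrant's built-in hybrid query API instead):

```python
def rrf_fuse(dense_ranked: list[str], sparse_ranked: list[str], k: int = 60) -> list[str]:
    """Fuse two ranked result lists with Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for ranked in (dense_ranked, sparse_ranked):
        for rank, doc_id in enumerate(ranked):
            # Documents ranked highly in either list accumulate score;
            # appearing in both lists boosts them further.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```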

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant API as Fetch SQL API
    participant NLToSQL as NL-to-SQL Graph
    participant Supervisor
    participant Semantic as SemanticSearch
    participant Qdrant
    participant SQLAgent
    participant Planner as SQL Planner Agent

    User->>API: POST /fetch-sql (user_query, dataset_ids, project_ids)
    API->>NLToSQL: invoke(payload)
    NLToSQL->>Supervisor: decide route
    alt single dataset
        Supervisor->>SQLAgent: goto sql_agent
        SQLAgent->>Planner: invoke sql_planning_agent
        Planner->>SQLAgent: sql_queries
    else multi-dataset
        Supervisor->>Semantic: goto semantic_search
        Semantic->>Qdrant: search_schemas(user_query,...)
        Qdrant->>Semantic: schema results
        Semantic->>NLToSQL: semantic results
        NLToSQL->>SQLAgent: provide semantic results
        SQLAgent->>Planner: invoke sql_planning_agent
        Planner->>SQLAgent: sql_queries
    end
    SQLAgent->>API: return sql_queries + optional message
    API->>User: FetchSqlResponse
sequenceDiagram
    participant State
    participant Prepare as PrepareData
    participant SQLAgent
    participant Execute as ExecuteSQL
    participant Validate

    State->>Prepare: dataset_info, user_query
    Prepare->>Prepare: fetch schema, sample data, column assumptions
    Prepare->>SQLAgent: dataset_info
    SQLAgent->>SQLAgent: call sql_planning_agent -> generate sql_queries or non_sql_response
    alt sql_queries present
        SQLAgent->>Execute: execute_sql (sql_queries)
        Execute->>Validate: results
    else non_sql_response
        SQLAgent->>Validate: non_sql_response
    end
    Validate->>State: updated query_result + recommendation

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • Feat/role authorization #39: Changes to middleware and repository/service signatures around org/role-scoped dataset/project access—overlaps with removed role parameter and org scoping.
  • factly/gopie#88: Modifies GOPIE integration and dataset upload code—overlaps with introduction of GopieClient and org-scoped dataset/project fetches.
  • factly/gopie#194: Modifies Qdrant schema retrieval and org-scoped filtering—overlaps with get_schema/get_project_schemas changes.

Suggested reviewers

  • paul-tharun

Poem

🐰 I hopped through graphs and vectors new,

I fetched schemas, sampled, and knew,
Dense and sparse now dance as one,
Org-scoped queries get their run,
Tiny rabbit, big refactor — done! 🥕


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 20

Note

Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (15)
chat-server/tests/performance_tools/performance_tracker.py (1)

40-82: Unused test_cases parameter detected.

The test_cases parameter is accepted by save_run but is never used in the method body. The run_data dictionary includes results_file but omits test_cases.

If test cases should be persisted with the run data, add them to run_data. Otherwise, consider removing the unused parameter.

🔧 Option A: Include test_cases in run_data
         run_data = {
             "run_id": run_id,
             "timestamp": timestamp,
             "model_config": model_config,
             "notes": notes,
             "summary": {
                 "total_tests": total,
                 "scores": scores,
                 "avg_score": round(score_stats.get("average", 0), 2),
                 "median_score": round(score_stats.get("median", 0), 2),
                 "min_score": round(score_stats.get("min", 0), 2),
                 "max_score": round(score_stats.get("max", 0), 2),
                 "errors": errors,
                 "avg_request_time": round(timing_stats.get("avg_request_time", 0), 2),
                 "median_request_time": round(timing_stats.get("median_request_time", 0), 2),
                 "min_request_time": round(timing_stats.get("min_request_time", 0), 2),
                 "max_request_time": round(timing_stats.get("max_request_time", 0), 2),
                 "total_time": round(timing_stats.get("total_time", 0), 2),
             },
             "results_file": results_file,
+            "test_cases": test_cases,
         }
🔧 Option B: Remove the unused parameter
     def save_run(
         self,
         model_config: dict[str, str],
         summary: dict[str, Any],
         notes: str = "",
-        test_cases: list[dict[str, Any]] | None = None,
         results_file: str = "",
     ) -> str:
chat-server/app/workflow/agent/node/multi_dataset.py (1)

80-87: Fix incorrect state field reference: use previous_sql_queries not relevant_sql_queries.

Line 86 reads from state.get("relevant_sql_queries", []), but AgentState only defines previous_sql_queries (line 26 of agent/types.py). This causes SQL history to always be empty, breaking multi-dataset reasoning. Change to state.get("previous_sql_queries", []).

The same bug exists in single_dataset.py:74.

chat-server/tests/scripts/replicate_prod_to_local.py (1)

576-596: Update the example JSON file path to use an actual file or wildcard pattern.

The example path tests/chat_server_tests/output/golden_dataset.json does not exist. Actual golden dataset files generated by the test tools use timestamp suffixes (e.g., golden_dataset_20250806_010759.json). Either update the example to reference an actual generated file, use a wildcard pattern like golden_dataset_*.json, or clarify that this is a placeholder path. The output/ directory structure also doesn't currently exist in the repository.

chat-server/app/workflow/agent/node/single_dataset.py (1)

70-75: Fix state key mismatch: use previous_sql_queries instead of relevant_sql_queries

Line 74 attempts to retrieve from state.get("relevant_sql_queries", []), but the AgentState defines the key as previous_sql_queries (set by context_processor). The relevant_sql_queries key exists only in the visualize_data_graph InputState. Using the wrong key means the agent context is lost unless the default empty list masks the issue.

-        "prev_sql_queries": state.get("relevant_sql_queries", []),
+        "prev_sql_queries": state.get("previous_sql_queries", []),

Note: The same issue exists in chat-server/app/workflow/agent/node/multi_dataset.py:86.

chat-server/pyproject.toml (1)

10-30: Move pre-commit to the dev dependency group.

Pre-commit is a development tool and should be in the [dependency-groups] dev section (which currently contains only test tools), not in the main dependencies. The langgraph 1.0.6 APIs (ToolNode, Command, StateGraph, add_messages) are actively used throughout the codebase and appear compatible with the version bump.

Diff to move pre-commit to dev
dependencies = [
  "aioboto3>=14.3.0",
  "beautifulsoup4>=4.14.2",
  "altair>=5.5.0",
  "dspy==3.1.0",
  "e2b-code-interpreter>=1.5.1",
  "fastapi[standard]>=0.115.12",
  "langchain>=0.3.21",
  "langchain-community>=0.3.20",
  "langchain-openai>=0.3.9",
  "langchain-qdrant>=0.2.0",
  "langgraph>=1.0.6",
  "langgraph-cli[inmem]>=0.3.3",
  "langsmith>=0.3.42",
  "logging>=0.4.9.6",
  "model2vec>=0.7.0",
  "pillow>=12.0.0",
  "portkey-ai>=1.11.1",
- "pre-commit>=4.2.0",
  "pydantic>=2.10.6",
  "qdrant-client>=1.13.3",
  "vega-datasets>=0.9.0",
  "vl-convert-python>=1.8.0",
]

[dependency-groups]
dev = [
  "pytest>=8.4.1",
  "pytest-asyncio>=0.24.0",
  "pytest-cache>=1.0",
  "pytest-sugar>=1.0.0",
  "pytest-testmon>=2.1.3",
  "pytest-xdist>=3.8.0",
+ "pre-commit>=4.2.0",
]
chat-server/app/tool_utils/tool_node.py (1)

26-64: Ensure tool metadata is always applied regardless of config type.

The current code returns config unchanged if it's not a dict (line 64), meaning tool_text, tool_category, and should_display_tool are dropped when config is None or non-dict. This can occur in langgraph's ToolNode execution flow. Fix by defaulting to {} when config is not a dict and always merging tool_config.

🛠️ Proposed fix
-        if isinstance(config, dict):
-            merged_config: RunnableConfig = config.copy()
+        base_config: RunnableConfig = config if isinstance(config, dict) else {}
+        merged_config: RunnableConfig = base_config.copy()
             if "tags" in tool_config:
                 existing_tags = merged_config.get("tags", [])
                 if not isinstance(existing_tags, list):
                     existing_tags = [existing_tags] if existing_tags else []
                 merged_config["tags"] = existing_tags + tool_config["tags"]

             if "metadata" in tool_config:
                 existing_metadata = merged_config.get("metadata", {})
                 if not isinstance(existing_metadata, dict):
                     existing_metadata = {}
                 merged_config["metadata"] = {**existing_metadata, **tool_config["metadata"]}

-            return merged_config
-
-        return config
+        return merged_config
chat-server/app/tool_utils/tools/get_table_schema.py (2)

69-73: Filter uses OR (should) instead of AND (must) logic.

When both dataset_ids and org_id filters are present, using should combines them with OR semantics—returning records that match any dataset ID or the org ID. This likely isn't the intended behavior; you probably want records that match the specified dataset IDs and belong to the organization.

🐛 Proposed fix to use AND logic
         if filter_conditions:
             search_result = await client.scroll(
                 collection_name=settings.QDRANT_COLLECTION,
-                scroll_filter=Filter(should=filter_conditions),
+                scroll_filter=Filter(must=filter_conditions),
             )

80-86: Exception type mismatch: catching JSONDecodeError but code uses Pydantic.

DatasetSchema(**metadata) performs Pydantic validation, which raises ValidationError, not json.JSONDecodeError. The current handler won't catch actual validation errors.

🐛 Proposed fix
+from pydantic import ValidationError
...
                     try:
                         metadata = payload.get("metadata", {})
                         dataset_schema = DatasetSchema(**metadata)
                         schemas.append(dataset_schema.format_for_prompt())
-                    except json.JSONDecodeError as e:
+                    except (json.JSONDecodeError, ValidationError) as e:
                         logger.exception(f"Error parsing schema JSON: {e}")
                         continue
chat-server/tests/unit/test_schema_search.py (1)

153-154: Inconsistent mock signature will cause test failure.

This fake_get_vector_store only accepts embeddings, while all other test cases in this file were updated to accept both embeddings=None and collection_name=None. If the real get_vector_store is called with collection_name, this mock will raise a TypeError.

🐛 Proposed fix
-    def fake_get_vector_store(embeddings):
+    def fake_get_vector_store(embeddings=None, collection_name=None):
         return object()
chat-server/app/services/qdrant/schema_search.py (1)

18-61: Fix tenant isolation: move org_id to must filter.

Using org_id in should allows results to match by project/dataset ID alone, ignoring the organization context. This creates a cross-tenant data leak risk. With OR semantics, a schema from a different organization with the same project ID could be returned. Additionally, org_id is often not provided to these functions (callers in semantic_search.py, context_processor.py, and identify_datasets.py omit it), resulting in no tenant filtering in those cases.

Separate org_id into a must condition to enforce tenant isolation:

Proposed fix
-        if org_id:
-            filter_conditions.append(
-                models.FieldCondition(
-                    key="metadata.org_id",
-                    match=models.MatchValue(value=org_id),
-                )
-            )
+        must_conditions = []
+        if org_id:
+            must_conditions.append(
+                models.FieldCondition(
+                    key="metadata.org_id",
+                    match=models.MatchValue(value=org_id),
+                )
+            )
 
-        if filter_conditions:
-            query_filter = models.Filter(should=filter_conditions)
+        if filter_conditions or must_conditions:
+            query_filter = models.Filter(
+                should=filter_conditions or None,
+                must=must_conditions or None,
+            )

Apply the same fix to get_schema_from_qdrant, get_schema_by_dataset_ids, and get_project_schemas in get_schema.py.
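The must/should distinction can be illustrated with a tiny evaluator of Qdrant-style filter semantics (a simplification; real Qdrant filters support nesting and more condition types):

```python
def matches(point: dict, must: list[tuple], should: list[tuple]) -> bool:
    """Qdrant-style semantics: all `must` conditions must hold, and when
    `should` is non-empty, at least one `should` condition must hold."""
    must_ok = all(point.get(key) == value for key, value in must)
    should_ok = not should or any(point.get(key) == value for key, value in should)
    return must_ok and should_ok


# A point from another org still matches a should-only filter (the leak):
other_org = {"org_id": "org-b", "project_id": "p1"}
leaky = matches(other_org, must=[], should=[("org_id", "org-a"), ("project_id", "p1")])
# Moving org_id into `must` enforces tenant isolation:
safe = matches(other_org, must=[("org_id", "org-a")], should=[("project_id", "p1")])
```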

chat-server/tests/unit/test_column_value_matching.py (1)

71-79: Duplicate dictionary key "error" - only the second definition is used.

The dummy logger has "error" defined twice (lines 75 and 77). Python dictionaries only keep the last value for duplicate keys, so this is redundant. While it doesn't affect test behavior (both are no-op lambdas), it should be cleaned up.

🧹 Proposed fix
     dummy_logger = type(
         "L",
         (),
         {
             "error": lambda *args, **kwargs: None,
             "debug": lambda *args, **kwargs: None,
-            "error": lambda *args, **kwargs: None,
         },
     )()
chat-server/app/services/qdrant/get_schema.py (1)

100-112: Filter logic may not behave as intended for multi-dataset + org_id queries.

Using Filter(should=filter_conditions) applies OR logic across all conditions. When org_id is provided alongside multiple dataset_ids, this will match records where dataset_id matches ANY of the IDs OR org_id matches—potentially returning datasets from other organizations.

If the intent is to filter datasets within a specific organization, consider using nested filters:

🐛 Proposed fix for AND semantics between dataset_ids (OR) and org_id
+        dataset_conditions = [
+            FieldCondition(
+                key="metadata.dataset_id",
+                match=MatchValue(value=dataset_id),
+            )
+            for dataset_id in dataset_ids
+        ]
+
+        filter_obj = Filter(should=dataset_conditions)
+
         if org_id:
-            filter_conditions.append(
-                FieldCondition(
-                    key="metadata.org_id",
-                    match=MatchValue(value=org_id),
-                )
+            filter_obj = Filter(
+                must=[
+                    Filter(should=dataset_conditions),
+                    FieldCondition(
+                        key="metadata.org_id",
+                        match=MatchValue(value=org_id),
+                    ),
+                ]
             )

         search_result = await client.scroll(
             collection_name=settings.QDRANT_COLLECTION,
-            scroll_filter=Filter(should=filter_conditions),
+            scroll_filter=filter_obj,
             limit=len(dataset_ids),
         )
chat-server/app/tool_utils/tools/list_datasets.py (3)

79-85: Mutable default arguments and type hint issue.

Line 82-83: Using mutable default arguments (list[str] = []) is a Python antipattern that can lead to unexpected behavior if the list is mutated. Line 84: The config parameter should use Optional[RunnableConfig] type hint for consistency.

🐛 Proposed fix
 @tool
 async def get_all_datasets(
     status_message: str = "",
-    project_ids: list[str] = [],
-    dataset_ids: list[str] = [],
-    config: RunnableConfig = None,
+    project_ids: list[str] | None = None,
+    dataset_ids: list[str] | None = None,
+    config: RunnableConfig | None = None,
 ) -> str:

Then update the usage inside the function:

-    if not project_ids and not dataset_ids:
+    if not project_ids and not dataset_ids:
         ...
 
-    if project_ids:
+    if project_ids:
         dataset_names = await get_dataset_names_from_project_ids(project_ids, org_id=org_id)

38-52: Missing exception handling for network errors.

Unlike get_dataset_names_from_project_ids which has a try-except block (lines 27-28), this function lacks exception handling for network failures. This inconsistency could cause unhandled exceptions to propagate.

🐛 Proposed fix
 async def get_project_ids_for_datasets_ids(
     dataset_ids: list[str],
     org_id: Optional[str] = None,
 ) -> dict[str, str]:
     dataset_id_project_map = {}
     client = GopieClient(org_id=org_id)
     for dataset_id in dataset_ids:
-        path = f"/v1/api/datasets/{dataset_id}/project"
-        async with await client.get(path, ssl=False) as response:
-            if response.status == 200:
-                dataset_data = await response.json()
-                dataset_id_project_map[dataset_id] = dataset_data.get("project_id", "")
-            else:
-                logger.warning(f"Dataset {dataset_id} not found")
+        try:
+            path = f"/v1/api/datasets/{dataset_id}/project"
+            async with await client.get(path, ssl=False) as response:
+                if response.status == 200:
+                    dataset_data = await response.json()
+                    dataset_id_project_map[dataset_id] = dataset_data.get("project_id", "")
+                else:
+                    logger.warning(f"Dataset {dataset_id} not found")
+        except Exception as e:
+            logger.exception(f"Error fetching project for dataset {dataset_id}: {e}")
     return dataset_id_project_map

55-76: Missing exception handling (same as above).

This function also lacks the try-except block present in get_dataset_names_from_project_ids. Consider adding consistent error handling.

🤖 Fix all issues with AI agents
In `@chat-server/app/services/gopie/client.py`:
- Around line 51-97: The get/post methods (get and post in client.py) currently
lack timeouts and can hang; add a GOPIE_API_TIMEOUT integer setting (following
E2B_TIMEOUT pattern) to config/settings and use aiohttp.ClientTimeout with that
value when making outbound calls: either configure the SingletonAiohttp session
(_session) with ClientTimeout(total=GOPIE_API_TIMEOUT) on creation or pass
timeout=aiohttp.ClientTimeout(total=GOPIE_API_TIMEOUT) to self._session.get(...)
and self._session.post(...); update the client to read the GOPIE_API_TIMEOUT
setting and apply it so get/post never send requests without an explicit
timeout.
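The timeout wiring described above can be sketched roughly as below; `get_api_timeout`, the `GOPIE_API_TIMEOUT` environment lookup, and the 30-second default are assumptions for illustration, and the `aiohttp.ClientTimeout` wiring is shown only in comments since it depends on how the singleton session is created:

```python
import os

def get_api_timeout(default: int = 30) -> int:
    """Read GOPIE_API_TIMEOUT (seconds) from the environment, falling back to a default."""
    raw = os.environ.get("GOPIE_API_TIMEOUT", "")
    try:
        return int(raw) if raw else default
    except ValueError:
        # Malformed value: fall back rather than crash at startup
        return default

# The client would then apply it to every outbound call, e.g. (hypothetical wiring):
#   timeout = aiohttp.ClientTimeout(total=get_api_timeout())
#   session = aiohttp.ClientSession(timeout=timeout)
```

Configuring the timeout on the session (rather than per request) keeps `get`/`post` call sites unchanged.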

In `@chat-server/app/services/gopie/dataset_info.py`:
- Around line 11-31: The async context manager usage in get_dataset_info (and
similarly in get_project_info) incorrectly uses "async with await
client.get(path)"; remove the explicit await and use "async with
client.get(path)" so the coroutine returned by client.get is awaited by the
context manager itself; update both functions to change the context manager
lines to use client.get(path) without await and keep the inner await
response.json() as-is to preserve reading the body.

In `@chat-server/app/services/qdrant/get_schema.py`:
- Around line 47-55: The code assumes search_result is always set but if
filter_conditions is empty the client.scroll branch won't run and search_result
is unbound; update the function containing this logic (the block using
filter_conditions, search_result, client.scroll and settings.QDRANT_COLLECTION)
to either (a) early-return None when filter_conditions is empty/invalid (e.g.,
when dataset_id is falsy and org_id is None) or (b) initialize search_result to
a safe default before the if (e.g., None) and add a guard before accessing
search_result[0]; pick one approach and apply it consistently so that the
subsequent check "if not search_result[0] or not search_result[0][0]" cannot
raise UnboundLocalError.

In `@chat-server/app/tool_utils/tools/plan_sql_query.py`:
- Around line 39-41: The plan_sql_query function calls
get_prompt_llm_chain("generate_sql", config) but omits schema enforcement;
update that call in plan_sql_query to pass schema=PlanQueryOutput so the LLM
response is validated into the PlanQueryOutput dataclass, and update the
plan_sql_query docstring to describe the actual return contract: a
PlanQueryOutput containing sql_queries (list of SqlQueryOutput objects which
include nested tables_used), non_sql_response, user_friendly_response, and
limitations (instead of the old top-level reasoning/expected_result/tables_used
fields).

In `@chat-server/app/utils/graph_utils/column_value_matching.py`:
- Around line 21-65: The _build_ilike_query function (and related query
builders) currently interpolate table_name and column_name directly into SQL,
creating SQL injection risk; update match_column_values() to validate both table
and column against a whitelist of actual schema identifiers (e.g., query the
database schema or use a precomputed set) and only pass validated names into
_build_ilike_query, and likewise change get_table_estimated_size() to require
validated table_name before interpolation; ensure validation rejects or
normalizes any identifier that is not an exact match to the known table/column
names and fail fast rather than interpolating untrusted values.
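The fail-fast whitelist check described above could be sketched like this; `validate_identifier` and the regex are hypothetical names, not part of the codebase, and the `allowed` set would come from the actual schema metadata:

```python
import re

# Only plain SQL identifiers: letter/underscore start, then letters/digits/underscores
_IDENT_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def validate_identifier(name: str, allowed: set) -> str:
    """Reject any identifier that is not an exact match to a known schema name."""
    if not _IDENT_RE.match(name) or name not in allowed:
        raise ValueError(f"Unknown or invalid SQL identifier: {name!r}")
    return name
```

Callers such as `match_column_values()` would validate both table and column names before passing them to `_build_ilike_query`, so untrusted values never reach string interpolation.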

In `@chat-server/app/utils/graph_utils/table_utils.py`:
- Around line 45-46: calculate_sampling_percentage currently divides by
estimated_size and will raise ZeroDivisionError when estimated_size is 0; update
the function (calculate_sampling_percentage) to guard against estimated_size <=
0 by returning a safe formatted value instead of dividing (e.g., return
"100.000000" to indicate full sampling) and ensure the computed sample_pct is
clamped between 0 and 100 when using settings.TARGET_ROWS / estimated_size to
avoid extremes; locate the logic that uses settings.TARGET_ROWS and sample_pct
and add the conditional check and clamping there.
- Around line 18-22: The code directly interpolates table_name into size_query
creating an SQL injection risk and also can divide by zero in
calculate_sampling_percentage; to fix, validate/sanitize table_name before
building size_query (e.g., allow only /^[A-Za-z0-9_]+$/ or check against a
whitelist) and reject or raise a clear error for invalid names instead of
interpolating raw input (this change touches the size_query construction and
callers that pass table_name and can optionally use/extend execute_sql to accept
safe parameters later); also add a zero-check in calculate_sampling_percentage
to handle estimated_size == 0 (return 0 sampling percentage or raise a
controlled error) before performing any division.
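A minimal sketch of the guarded and clamped calculation; the 50,000-row target default and the six-decimal string format are assumptions standing in for `settings.TARGET_ROWS` and the module's actual formatting:

```python
def calculate_sampling_percentage(estimated_size: int, target_rows: int = 50_000) -> str:
    """Return a sampling percentage string, guarding against zero/negative sizes."""
    if estimated_size <= 0:
        # Nothing reliable to divide by: fall back to full sampling
        return "100.000000"
    pct = (target_rows / estimated_size) * 100.0
    pct = min(max(pct, 0.0), 100.0)  # clamp to a valid SAMPLE percentage
    return f"{pct:.6f}"
```

With the clamp in place, tiny tables no longer produce percentages above 100 and empty tables no longer raise `ZeroDivisionError`.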

In `@chat-server/app/workflow/graph/multi_dataset_graph/node/analyze_dataset.py`:
- Around line 47-49: The indexing of subqueries using subquery_index is unsafe:
before assigning current_query = subqueries[state.get("subquery_index", 0)],
validate that subqueries is a non-empty list and that
state.get("subquery_index") is an int within range; if not, fallback to a safe
default (e.g., None or the first element) and handle that downstream. Update the
code around subqueries, state.get("subquery_index"), and current_query to
perform a type-and-range check (or coerce the index) and choose a safe fallback
to avoid IndexError/TypeError.

In `@chat-server/app/workflow/graph/multi_dataset_graph/node/sql_agent.py`:
- Line 31: The code reads query_result.subqueries[query_index].retry_count
without checking that subqueries exists and query_index is in range; add a
defensive check before access in the function/method where this line appears
(use query_result, subqueries, query_index and retry_count as references):
verify that query_result.subqueries is truthy and that 0 <= query_index <
len(query_result.subqueries) and only then read .retry_count, otherwise assign a
safe default (e.g., 0) or handle the out-of-bounds case (raise a clear error or
skip processing) so an IndexError cannot occur.

In `@chat-server/app/workflow/graph/nl_to_sql_graph/node/semantic_search.py`:
- Around line 8-20: The semantic_search function calls search_schemas without
scoping by organization; extract org_id from the RunnableConfig metadata (e.g.,
config.metadata.get("org_id") or the equivalent pattern used in execute_query.py
/ analyze_dataset.py) and pass it into the search_schemas call as the org_id
parameter so search_schemas(user_query=..., embeddings=..., dataset_ids=...,
project_ids=..., org_id=org_id). Ensure you use the same metadata access pattern
as get_model_provider and other workflow nodes to avoid KeyError when metadata
is absent.
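The metadata access pattern could be sketched as below; `get_org_id` is a hypothetical helper, and the config shape mirrors LangChain's `RunnableConfig` dict where `metadata` may be missing or `None`:

```python
from typing import Optional

def get_org_id(config: Optional[dict]) -> Optional[str]:
    """Extract org_id from a RunnableConfig-style dict without raising when metadata is absent."""
    if not config:
        return None
    # `or {}` guards against metadata being present but None
    return (config.get("metadata") or {}).get("org_id")
```

`semantic_search` would then pass `org_id=get_org_id(config)` into `search_schemas`, keeping tenant scoping consistent with the other workflow nodes.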

In `@chat-server/app/workflow/graph/nl_to_sql_graph/node/sql_agent.py`:
- Around line 35-38: The schema lookup is missing the org_id, so change the
get_schema_from_qdrant call to pass the tenant id (e.g.,
get_schema_from_qdrant(dataset_id=dataset_id, org_id=org_id)); ensure org_id is
obtained from the current context/inputs (where this node receives tenant info)
or added to the function signature if absent, and propagate it alongside
dataset_ids to avoid cross-tenant schema leakage while keeping the same return
handling when schema is None.

In `@chat-server/app/workflow/graph/single_dataset_graph/graph.py`:
- Around line 40-42: The graph currently always routes from prepare_data to
sql_agent even when prepare_data returns no dataset_info; update the graph
and/or agents to handle that case: add a conditional edge from "prepare_data" to
an error handler node (e.g. "prepare_error" or "validate_result") and only add
the edge prepare_data -> "sql_agent" when dataset_info is present, or modify
sql_agent to validate dataset_info (the variable name dataset_info) before
calling sql_planning_agent and route to the error handler if missing; touch the
graph_builder edges around START, "prepare_data", "sql_agent", "execute_sql",
and "validate_result" and the sql_agent/sql_planning_agent dataset_info check to
ensure safe conditional routing.
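The conditional routing could look roughly like this; `route_after_prepare` and the `"validate_result"` fallback target are illustrative names, and the graph wiring is shown only as a comment since it depends on the builder's actual node names:

```python
def route_after_prepare(state: dict) -> str:
    """Route to sql_agent only when prepare_data produced dataset_info; else surface an error path."""
    return "sql_agent" if state.get("dataset_info") else "validate_result"

# Hypothetical wiring in graph.py:
#   graph_builder.add_conditional_edges("prepare_data", route_after_prepare)
```

This keeps `sql_agent` free of defensive checks: by the time it runs, `dataset_info` is guaranteed present.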

In `@chat-server/app/workflow/graph/single_dataset_graph/node/execute_sql.py`:
- Around line 47-52: The loop over
query_result.single_dataset_query_result.sql_queries calls
execute_sql_with_full_and_truncated without passing the tenant/org id, causing
org_id to be None; update the call in execute_sql.py to pass the current org id
(e.g., org_id variable or attribute available in the surrounding scope) into
execute_sql_with_full_and_truncated(query=sql_info.sql_query, org_id=org_id) and
ensure sql_info.sql_query_result is assigned the truncated_result_data as before
so tenant isolation is preserved.
- Around line 21-35: The guard incorrectly checks isinstance(last_message,
ErrorMessage) while last_message is a list; change how messages are read so you
inspect the actual last message element: fetch messages = state.get("messages",
[]) and set last_message = messages[-1] if messages else None, then update the
conditional to check if non_sql_response is present or isinstance(last_message,
ErrorMessage) and raise/return accordingly (use the existing ErrorMessage class
and the query_result variable names shown).
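A small sketch of the corrected message access (`get_last_message` is a hypothetical helper, not an existing function):

```python
def get_last_message(state: dict):
    """Return the last element of state["messages"], or None when the list is empty or absent."""
    messages = state.get("messages") or []
    return messages[-1] if messages else None
```

The `isinstance(last_message, ErrorMessage)` check then inspects an actual message object instead of the whole list, which can never be an `ErrorMessage`.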

In `@chat-server/app/workflow/graph/single_dataset_graph/node/prepare_data.py`:
- Around line 106-109: The code builds sample_data_query via f-string with
dataset_name, which allows SQL injection; instead, stop interpolating raw
dataset_name into SQL in prepare_data.py and use a safe approach when calling
execute_sql_with_limit: either pass a parameterized query where possible (use
placeholders for values) and supply parameters to execute_sql_with_limit, or
(for SQL identifiers) validate/sanitize dataset_name against a strict pattern
(e.g., /^[A-Za-z0-9_]+$/) or use the DB driver's identifier-quoting mechanism
before injecting it; update the sample_data_query construction and the call to
execute_sql_with_limit (referencing sample_data_query, dataset_name and
execute_sql_with_limit) so only a validated/quoted identifier is used and all
dynamic values are passed as parameters, then keep using convert_rows_to_csv on
the returned rows.

In `@chat-server/app/workflow/graph/single_dataset_graph/node/routing.py`:
- Around line 5-10: The code sets query_result = state.get("query_result", [])
which can be a list and causes attribute access errors; change the default to
None (query_result = state.get("query_result")) and ensure you only access
query_result.single_dataset_query_result when query_result is truthy, e.g. set
single_dataset_res = query_result.single_dataset_query_result if query_result
else None, then compute sql_queries from single_dataset_res only when it and its
sql_queries exist (keeping the existing list comprehension for [sq.sql_query for
sq in single_dataset_res.sql_queries if sq.sql_query]).

In `@chat-server/app/workflow/graph/single_dataset_graph/node/sql_agent.py`:
- Around line 23-28: The code accesses query_result.single_dataset_query_result
before verifying query_result is present, which can raise AttributeError; update
the logic in sql_agent.py so you first retrieve query_result =
state.get("query_result") and immediately check if not query_result (or
query_result is None) and raise the ValueError there, and only after that access
query_result.single_dataset_query_result to assign single_dataset_result; ensure
the check covers both query_result and single_dataset_result as intended.

In `@chat-server/tests/e2e/viz_utils/per_example_workflow.py`:
- Around line 401-543: The test scoring logic in process_example assumes a 0–10
scale (score >= 8) but the evaluator may return 0–1; detect and normalize the
score before comparing: after retrieving evaluation =
test_result.get("evaluation", {}) and score = evaluation.get("score", 0), coerce
score to a float, and if score <= 1 then scale it (e.g., score_scaled =
float(score) * 10) else keep as-is (score_scaled = float(score)); then use
passed = score_scaled >= 8 and use score_scaled in any printed messages and
result fields so the pass/fail decision matches the intended 0–10 threshold.
Ensure this change is applied inside process_example where
test_result/evaluation/score are handled.
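The normalization could be sketched as below; note the assumption that any score <= 1 is on the 0-1 scale, which misreads a literal score of exactly 1 on the 0-10 scale (an ambiguity worth documenting wherever this lands):

```python
def normalize_score(score) -> float:
    """Coerce a score to float and scale 0-1 values up to the 0-10 range."""
    s = float(score)
    return s * 10.0 if s <= 1.0 else s

def is_passing(score, threshold: float = 8.0) -> bool:
    """Apply the 0-10 pass threshold to a score on either scale."""
    return normalize_score(score) >= threshold
```

Using `normalize_score` in the printed messages and result fields keeps the reported number consistent with the pass/fail decision.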

In `@chat-server/tests/e2e/viz_utils/viz_test_case_runner.py`:
- Around line 39-46: The VizEvaluationSchema currently constrains score to 0–1
but the runner uses a 0–10 pass threshold (score >= 8); update the schema field
in VizEvaluationSchema (the score Field) to ge=0.0 and le=10.0 and change its
description to "Similarity score between 0 and 10 (higher is better)"; then
check other places that parse/validate or prompt for the score (any code that
consumes VizEvaluationSchema or performs the score >= 8 check) and ensure
prompts, validation, and test expectations consistently use the 0–10 scale.

In `@chat-server/tests/e2e/viz_utils/viz_utils.py`:
- Around line 192-201: The dict_hash function is using hashlib.sha256
incorrectly by slicing the hash object; instead ensure `serialized` is bytes
(encode when needed), pass that to `hashlib.sha256(...)`, then call
`.hexdigest()` (or `.digest()` if you want raw bytes) and slice the resulting
bytes/hex string as intended before returning; update the logic in dict_hash to
encode `serialized` when it's a str, compute the hash via
`hashlib.sha256(serialized_bytes)`, then return the proper sliced `.hexdigest()`
(or slice `.digest()` and then hexlify) so `m.hexdigest()` is called on a real
digest rather than on the hash object.
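A corrected `dict_hash` might look like this; the JSON canonicalization (sorted keys, compact separators) and the 16-character slice are assumptions about the intended behavior:

```python
import hashlib
import json

def dict_hash(d: dict, length: int = 16) -> str:
    """Stable short hash of a dict: serialize deterministically, encode, hash, slice the hex digest."""
    serialized = json.dumps(d, sort_keys=True, separators=(",", ":"))
    # hexdigest() is called on a real digest; slicing happens on the resulting string
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()[:length]
```

Sorting keys makes the hash independent of dict insertion order, which matters for cache keys.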
🟡 Minor comments (10)
chat-server/app/workflow/graph/graph_mermaid/sql_agent_graph.mmd-8-12 (1)

8-12: Connect or remove the orphan search_duckdb_docs node.

Right now it’s defined but not part of the flow, so the graph renders a floating node and misrepresents the pipeline. If it’s intended, wire it in; otherwise remove it.

✅ Example fix (wire into flow)
 __start__ --> generate_sql;
-generate_sql --> __end__;
+generate_sql --> search_duckdb_docs;
+search_duckdb_docs --> __end__;
chat-server/tests/dspy/evaluator.py-82-92 (1)

82-92: Broad exception catch masks specific failure modes that could aid debugging.

The exception handler at line 82 catches all exceptions indiscriminately, making it hard to distinguish parsing failures (recoverable) from unexpected errors. Comparable code in optimize_evaluator.py (line 28) catches specific exception types like ValueError and AttributeError. Consider narrowing the catch to the exceptions the dspy.ChainOfThought evaluation is expected to raise (e.g., ValueError, TypeError, AttributeError) so unexpected errors propagate and stay visible.

Additionally, returning evaluation_score: 0.0 silently on any failure could cause misleading test results—tests expecting high scores will correctly fail, but those expecting low scores might incorrectly pass.

chat-server/tests/dspy/optimize_evaluator.py-40-42 (1)

40-42: --no-reasoning flag and use_reasoning parameter are unused

The CLI flag and function parameter are collected and passed through but never used to control behavior: EvaluatorModule hardcodes dspy.ChainOfThought with no provision to switch to dspy.Predict. Either remove the unused parameters or implement the feature by adding a constructor parameter to EvaluatorModule and conditionally instantiating the appropriate module type.

chat-server/tests/test_config.py-33-33 (1)

33-33: Update the module docstring default to match the new URL.

The header comment still mentions port 8001.

🔧 Suggested docstring update
-    CHAT_SERVER_URL        - Chat server endpoint (default: http://localhost:8001/api/v1/chat/completions)
+    CHAT_SERVER_URL        - Chat server endpoint (default: http://localhost:8003/api/v1/chat/completions)
chat-server/tests/test_config.py-22-26 (1)

22-26: Add python-dotenv as an explicit dependency in pyproject.toml.

The import will work because python-dotenv is available as a transitive dependency in the lockfile, but it's not explicitly declared. Add it to the main dependencies list to ensure it remains available if upstream dependencies change.

chat-server/tests/e2e/utils/dataset_test_cases.py-11-26 (1)

11-26: Replace non-breaking hyphens and en dashes with regular ASCII hyphens in expected_result strings.

The file contains multiple instances of Unicode non-breaking hyphens (U+2011) and en dashes (U+2013) in expected_result strings (lines 55, 205, 235, 250, 265, 434, 494, 584, 590, 629, 644, 659, 689). These characters differ from regular hyphens (U+002D) and will cause string comparison failures if test assertions compare expected results programmatically.

Examples: division‑by‑zero, Non‑SQL, group‑by, side‑by‑side, X‑axis, Y‑axis.

Replace all instances with standard ASCII hyphens (the - character).
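A one-liner normalization pass (`normalize_dashes` is a hypothetical helper) that could be applied to the fixture strings or run once over the file:

```python
def normalize_dashes(text: str) -> str:
    """Replace non-breaking hyphens (U+2011) and en dashes (U+2013) with ASCII '-'."""
    return text.replace("\u2011", "-").replace("\u2013", "-")
```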

chat-server/app/workflow/graph/multi_dataset_graph/node/sql_agent.py-86-86 (1)

86-86: Typo: "Succesfully" should be "Successfully".

Proposed fix
-                    content="Succesfully Completed query planning step in multidataset workflow."
+                    content="Successfully completed query planning step in multi-dataset workflow."
chat-server/app/workflow/graph/multi_dataset_graph/node/sql_agent.py-21-21 (1)

21-21: Potential TypeError if subqueries is None.

If state.get("subqueries") returns None, the indexing [query_index] will raise a TypeError. The fallback "No input" only applies if the entire conditional is falsy.

Proposed fix
-    user_query = state.get("subqueries")[query_index] if state.get("subqueries") else "No input"
+    subqueries = state.get("subqueries")
+    user_query = subqueries[query_index] if subqueries else "No input"
chat-server/app/services/gopie/generate_schema.py-15-49 (1)

15-49: Remove unnecessary else clause and add defensive quoting for dataset_name.

The else block on line 37 is unnecessary since the preceding if statement returns early. The code can be simplified by removing the else and unindenting the subsequent statements.

Additionally, while dataset_name comes from the trusted Gopie API response, consider quoting it defensively to align with test patterns elsewhere in the codebase (e.g., f'SELECT * FROM "{dataset_name}"'). This provides defense-in-depth against any unexpected values.

Finally, add error handling around the execute_sql call on line 72 to provide comprehensive error messages and logging per coding guidelines.

♻️ Remove unnecessary else clause and add quoting
     if not should_use_sampling(estimated_size):
         logger.debug(
             f"[{dataset_name}] Small dataset detected ({estimated_size} rows). "
             "Using standard nested query logic."
         )
         return f"""
         SELECT DISTINCT * FROM (
-            SELECT * FROM {dataset_name} LIMIT 200000
+            SELECT * FROM "{dataset_name}" LIMIT 200000
         )
         LIMIT {limit}
         """
-    else:
-        pct_str = calculate_sampling_percentage(estimated_size)
+    pct_str = calculate_sampling_percentage(estimated_size)

-        logger.debug(
-            f"[{dataset_name}] Large dataset detected ({estimated_size} rows). "
-            f"Sampling {pct_str}% (system) to retrieve approx {settings.TARGET_ROWS} rows."
-        )
+    logger.debug(
+        f"[{dataset_name}] Large dataset detected ({estimated_size} rows). "
+        f"Sampling {pct_str}% (system) to retrieve approx {settings.TARGET_ROWS} rows."
+    )

-        return f"""
-        SELECT DISTINCT * FROM {dataset_name}
-        USING SAMPLE {pct_str}% (system)
-        LIMIT {limit}
-        """
+    return f"""
+    SELECT DISTINCT * FROM "{dataset_name}"
+    USING SAMPLE {pct_str}% (system)
+    LIMIT {limit}
+    """
chat-server/tests/e2e/viz_utils/per_example_workflow.py-181-238 (1)

181-238: Quote dataset names in sample SQL to avoid syntax errors/injection.

dataset_name is interpolated directly into SQL without quoting. If the name contains spaces/reserved words or quotes, this will fail (and can be abused if names aren’t sanitized).

✅ Safer quoting
-            body = {"query": f"SELECT * FROM {dataset_name} LIMIT 5"}
+            safe_name = dataset_name.replace('"', '""')
+            body = {"query": f'SELECT * FROM "{safe_name}" LIMIT 5'}
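The quoting pattern from the diff can be factored into a small helper (`quote_ident` is a hypothetical name) that follows standard SQL identifier escaping, doubling any embedded double quotes:

```python
def quote_ident(name: str) -> str:
    """Double-quote a SQL identifier, escaping embedded double quotes by doubling them."""
    return '"' + name.replace('"', '""') + '"'

# body = {"query": f"SELECT * FROM {quote_ident(dataset_name)} LIMIT 5"}
```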

Comment thread chat-server/app/services/gopie/dataset_info.py
Comment on lines +11 to +31
 async def get_dataset_info(
     dataset_id: str,
     project_id: str,
     org_id: Optional[str] = None,
 ) -> DatasetDetails:
     """Get dataset information from Gopie API.

     Args:
         dataset_id: The dataset ID
         project_id: The project ID
         org_id: Optional organization ID for multi-tenant support

     Returns:
         DatasetDetails object
     """
-    url = f"{settings.GOPIE_API_ENDPOINT}/v1/api/projects/{project_id}/datasets/{dataset_id}"
-    headers = {"accept": "application/json"}
-    try:
-        async with http_session.get(url, headers=headers) as response:
-            data = await response.json()
-            return DatasetDetails(**data)
-    except Exception as e:
-        logger.exception(f"Error getting dataset info: {e!s}")
-        raise e
+    client = GopieClient(org_id=org_id)
+    path = f"/v1/api/projects/{project_id}/datasets/{dataset_id}"
+
+    async with await client.get(path) as response:
+        data = await response.json()
+        return DatasetDetails(**data)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

Repository: factly/gopie

🏁 Scripts executed:

#!/bin/bash
# Find and inspect GopieClient.get method
find chat-server -name "*.py" -type f | xargs rg -l "class GopieClient" | head -5

# Read the GopieClient implementation
cat -n chat-server/app/services/gopie/client.py

# Also check lines 34-52 in dataset_info.py to see other affected functions
sed -n '30,55p' chat-server/app/services/gopie/dataset_info.py


Remove explicit await from async context manager pattern.

async with await client.get(path) is non-standard for aiohttp. The async context manager protocol automatically awaits the coroutine; use async with client.get(path) instead. This applies to both get_dataset_info and get_project_info functions.

🔧 Fix
-    async with await client.get(path) as response:
+    async with client.get(path) as response:
         data = await response.json()
🤖 Prompt for AI Agents
In `@chat-server/app/services/gopie/dataset_info.py` around lines 11 - 31, The
async context manager usage in get_dataset_info (and similarly in
get_project_info) incorrectly uses "async with await client.get(path)"; remove
the explicit await and use "async with client.get(path)" so the coroutine
returned by client.get is awaited by the context manager itself; update both
functions to change the context manager lines to use client.get(path) without
await and keep the inner await response.json() as-is to preserve reading the
body.

Comment on lines 47 to 55
         if filter_conditions:
             search_result = await client.scroll(
                 collection_name=settings.QDRANT_COLLECTION,
                 scroll_filter=Filter(should=filter_conditions),
                 limit=1,
             )

-        if not search_result[0][0]:
+        if not search_result[0] or not search_result[0][0]:
             return None

⚠️ Potential issue | 🟠 Major

Potential UnboundLocalError when filter_conditions is empty.

If dataset_id is falsy and org_id is None, filter_conditions will be empty, the if filter_conditions: block won't execute, and search_result will be unbound when accessed on line 54.

🐛 Proposed fix
         if filter_conditions:
             search_result = await client.scroll(
                 collection_name=settings.QDRANT_COLLECTION,
                 scroll_filter=Filter(should=filter_conditions),
                 limit=1,
             )
+        else:
+            return None

         if not search_result[0] or not search_result[0][0]:
             return None

Alternatively, add early validation:

+    if not dataset_id:
+        logger.warning("get_schema_from_qdrant called without dataset_id")
+        return None
+
     try:
         client = await QdrantSetup.get_async_client(settings.QDRANT_COLLECTION)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
         if filter_conditions:
             search_result = await client.scroll(
                 collection_name=settings.QDRANT_COLLECTION,
                 scroll_filter=Filter(should=filter_conditions),
                 limit=1,
             )
+        else:
+            return None

         if not search_result[0] or not search_result[0][0]:
             return None
🤖 Prompt for AI Agents
In `@chat-server/app/services/qdrant/get_schema.py` around lines 47 - 55, The code
assumes search_result is always set but if filter_conditions is empty the
client.scroll branch won't run and search_result is unbound; update the function
containing this logic (the block using filter_conditions, search_result,
client.scroll and settings.QDRANT_COLLECTION) to either (a) early-return None
when filter_conditions is empty/invalid (e.g., when dataset_id is falsy and
org_id is None) or (b) initialize search_result to a safe default before the if
(e.g., None) and add a guard before accessing search_result[0]; pick one
approach and apply it consistently so that the subsequent check "if not
search_result[0] or not search_result[0][0]" cannot raise UnboundLocalError.

Comment on lines 39 to 41
     try:
-        chain = get_prompt_llm_chain("plan_sql_query_tool", config)
+        chain = get_prompt_llm_chain("generate_sql", config)
         response = await chain.ainvoke({"user_query": user_query, "dataset_info": dataset_info})

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

Repository: factly/gopie

🏁 Scripts executed:

#!/bin/bash
# Find generate_sql prompt definition
find chat-server -type f -name "*.py" -o -name "*.json" -o -name "*.yaml" -o -name "*.yml" | head -50 | xargs grep -l "generate_sql" 2>/dev/null

# Search for generate_sql in prompts directory
fd "prompts" chat-server --type d

# Search for get_prompt_llm_chain implementation
rg -n "def get_prompt_llm_chain" chat-server --type py -A 10

# Inspect plan_sql_query to see full implementation
cat -n chat-server/app/tool_utils/tools/plan_sql_query.py

# Get full get_prompt_llm_chain implementation
sed -n '86,150p' chat-server/app/utils/langsmith/prompt_manager.py

# Search for generate_sql prompt or prompt mapping
rg -n "generate_sql" chat-server --type py -B 2 -A 5

# Search for PromptSelector or NodeName enum/mapping
rg -n "class NodeName|generate_sql" chat-server/app --type py

# Check format_generate_sql_input full signature and expected parameter types
sed -n '111,140p' chat-server/app/workflow/prompts/generate_sql_prompt.py

# Check PlanQueryOutput schema definition
rg -n "class PlanQueryOutput" chat-server --type py -A 15

# Get full PlanQueryOutput schema and SqlQueryOutput
sed -n '1,50p' chat-server/app/workflow/graph/sql_planner_graph/generate_sql.py


Add missing schema=PlanQueryOutput parameter and update tool docstring to match actual contract.

plan_sql_query calls get_prompt_llm_chain("generate_sql", config) without schema=PlanQueryOutput, causing unstructured LLM output instead of validated PlanQueryOutput. This contradicts the workflow implementation (generate_sql.py:80) which enforces the schema. Additionally, the tool's docstring documents incorrect return fields:

  • Promises: reasoning, expected_result, tables_used (top-level)
  • Actual PlanQueryOutput: sql_queries (list of SqlQueryOutput objects with nested tables_used), non_sql_response, user_friendly_response, limitations

Pass schema=PlanQueryOutput to enforce structured output and correct the docstring to match the actual schema.

🤖 Prompt for AI Agents
In `@chat-server/app/tool_utils/tools/plan_sql_query.py` around lines 39 - 41, The
plan_sql_query function calls get_prompt_llm_chain("generate_sql", config) but
omits schema enforcement; update that call in plan_sql_query to pass
schema=PlanQueryOutput so the LLM response is validated into the PlanQueryOutput
dataclass, and update the plan_sql_query docstring to describe the actual return
contract: a PlanQueryOutput containing sql_queries (list of SqlQueryOutput
objects which include nested tables_used), non_sql_response,
user_friendly_response, and limitations (instead of the old top-level
reasoning/expected_result/tables_used fields).

Comment on lines +21 to +65
def _build_ilike_query(
    table_name: str,
    column_name: str,
    escaped_value: str,
    estimated_size: int,
    limit: int = 5,
) -> str:
    """Build optimized ILIKE query based on table size.

    Args:
        table_name: Name of the table
        column_name: Name of the column to search
        escaped_value: Escaped search value
        estimated_size: Estimated row count from metadata
        limit: Number of rows to return

    Returns:
        SQL query string
    """
    where_clause = f"WHERE LOWER(CAST({column_name} AS VARCHAR)) LIKE '%' || LOWER('{escaped_value}') || '%'"

    if not should_use_sampling(estimated_size):
        return f"""
        SELECT DISTINCT {column_name}
        FROM (SELECT * FROM {table_name} LIMIT 200000)
        {where_clause}
        LIMIT {limit}
        """
    else:
        pct_str = calculate_sampling_percentage(estimated_size)

        logger.debug(
            f"[{table_name}] Large dataset detected ({estimated_size} rows) for ILIKE search. "
            f"Sampling {pct_str}% (system)."
        )

        return f"""
        SELECT DISTINCT {column_name}
        FROM (
            SELECT * FROM {table_name}
            USING SAMPLE {pct_str}% (system)
        )
        {where_clause}
        LIMIT {limit}
        """

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

Repository: factly/gopie

🏁 Scripts executed:

#!/bin/bash
# First, look at the full file to understand context
cat -n chat-server/app/utils/graph_utils/column_value_matching.py | head -100

# Find who calls _build_ilike_query and related functions
rg -n "_build_ilike_query|_build_levenshtein_query|verify_fuzzy_values|find_similar_values" --type=py chat-server/app/ -A 3 -B 3

# Check if there's any schema validation or whitelisting
rg -n "table_name|column_name" --type=py chat-server/app/utils/graph_utils/ -B 2 -A 2 | head -150

# Find callers of find_similar_column_values
rg -n "find_similar_column_values" --type=py chat-server/app/ -B 2 -A 5

# Check for schema validation or whitelists before these values are used
rg -n "schema|whitelist|validate.*column|validate.*table" --type=py chat-server/app/utils/graph_utils/ -i

# Look for how database schema metadata is accessed
rg -n "get_table|get_column|table_metadata|column_metadata|schema" --type=py chat-server/app/utils/graph_utils/ -i | head -50

# Read the rest of column_value_matching.py and its public entry points
cat -n chat-server/app/utils/graph_utils/column_value_matching.py | tail -200
cat -n chat-server/app/utils/graph_utils/column_value_matching.py | head -160

# Search for imports of these functions from other modules
rg -n "from.*column_value_matching|import.*find_similar\|import.*verify_fuzzy" --type=py chat-server/app/ -A 3

# Check get_table_estimated_size to see if it validates table names
cat -n chat-server/app/utils/graph_utils/table_utils.py | head -50

# Find who calls analyze_dataset
rg -n "analyze_dataset" --type=py chat-server/app/ -B 2 -A 5 | head -100

# Check for schema discovery before these values are used
rg -n "get_columns|describe|schema.*table" --type=py chat-server/app/ -i | head -40

# Check the execute_sql implementation
cat -n chat-server/app/services/gopie/sql_executor.py | head -100


🌐 Web query:

DuckDB parameterized identifiers support table column names

💡 Result:

Short answer: No — DuckDB prepared parameters can only substitute values, not arbitrary identifiers (like column names). You can parameterize table names using query_table/query($1) or build SQL on the client side, but you cannot bind a parameter to an identifier (column name) inside a prepared statement. [1][2][3]

References:

  • Prepared statements / parameters (values only). [1]
  • query_table / query functions (allow passing a table name string). [2]
  • Identifiers / quoting rules (identifiers must be literal/quoted; not parameter placeholders). [3]

SQL injection risk: table_name and column_name are interpolated directly into queries without parameterization or validation.

While escaped_value is escaped for single quotes, the table_name and column_name parameters are interpolated directly into the SQL string via f-strings. Since these values are extracted from user-provided column_assumptions with only minimal falsy checks, an attacker could inject arbitrary SQL by crafting malicious table or column names.

DuckDB prepared statements do not support parameterized identifiers (only values), making whitelist validation against known schema elements the appropriate mitigation. Apply this validation in match_column_values() before passing these values to the query builders, or ensure the upstream data source guarantees only valid schema identifiers.

Additionally, get_table_estimated_size() in table_utils.py (line 21) has the same vulnerability when interpolating table_name.

🧰 Tools
🪛 Ruff (0.14.13)

43-48: Possible SQL injection vector through string-based query construction

(S608)


57-65: Possible SQL injection vector through string-based query construction

(S608)

🤖 Prompt for AI Agents
In `@chat-server/app/utils/graph_utils/column_value_matching.py` around lines 21 -
65, The _build_ilike_query function (and related query builders) currently
interpolate table_name and column_name directly into SQL, creating SQL injection
risk; update match_column_values() to validate both table and column against a
whitelist of actual schema identifiers (e.g., query the database schema or use a
precomputed set) and only pass validated names into _build_ilike_query, and
likewise change get_table_estimated_size() to require validated table_name
before interpolation; ensure validation rejects or normalizes any identifier
that is not an exact match to the known table/column names and fail fast rather
than interpolating untrusted values.
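A minimal sketch of the whitelist validation the comment recommends (names like `validate_identifier` and `build_ilike_query` are illustrative, not taken from the codebase):

```python
def validate_identifier(name: str, allowed: frozenset[str]) -> str:
    """Fail fast on any table/column name that is not an exact match
    to a known schema identifier, instead of interpolating it."""
    if name not in allowed:
        raise ValueError(f"Unknown identifier: {name!r}")
    return name

def build_ilike_query(table: str, column: str,
                      tables: frozenset[str], columns: frozenset[str]) -> str:
    table = validate_identifier(table, tables)
    column = validate_identifier(column, columns)
    # Only values still need escaping; identifiers are now whitelist-checked.
    return f'SELECT DISTINCT "{column}" FROM "{table}"'
```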

Comment on lines +5 to +10
query_result = state.get("query_result", [])
single_dataset_res = query_result.single_dataset_query_result

sql_queries = (
[sq.sql_query for sq in single_dataset_res.sql_queries if sq.sql_query]
if single_dataset_res and single_dataset_res.sql_queries

⚠️ Potential issue | 🟠 Major

Guard against missing query_result before attribute access.

state.get("query_result", []) can return [], and [] has no single_dataset_query_result, which will raise at runtime. Default to None and short-circuit when missing.

✅ Suggested fix
-def route_after_sql_generation(state: State) -> str:
-    query_result = state.get("query_result", [])
-    single_dataset_res = query_result.single_dataset_query_result
+def route_after_sql_generation(state: State) -> str:
+    query_result = state.get("query_result")
+    if not query_result or not getattr(query_result, "single_dataset_query_result", None):
+        return "no_sql_queries"
+    single_dataset_res = query_result.single_dataset_query_result
🤖 Prompt for AI Agents
In `@chat-server/app/workflow/graph/single_dataset_graph/node/routing.py` around
lines 5 - 10, The code sets query_result = state.get("query_result", []) which
can be a list and causes attribute access errors; change the default to None
(query_result = state.get("query_result")) and ensure you only access
query_result.single_dataset_query_result when query_result is truthy, e.g. set
single_dataset_res = query_result.single_dataset_query_result if query_result
else None, then compute sql_queries from single_dataset_res only when it and its
sql_queries exist (keeping the existing list comprehension for [sq.sql_query for
sq in single_dataset_res.sql_queries if sq.sql_query]).

Comment on lines +23 to +28
query_result = state.get("query_result")

single_dataset_result = query_result.single_dataset_query_result

if not query_result or not single_dataset_result:
raise ValueError("query_result is not properly initialized")

⚠️ Potential issue | 🟠 Major

Potential AttributeError: accessing query_result before null check.

Line 25 accesses query_result.single_dataset_query_result before line 27 validates that query_result is not None. If state.get("query_result") returns None, this will raise AttributeError instead of the intended ValueError.

🐛 Proposed fix - reorder validation
     query_result = state.get("query_result")
 
-    single_dataset_result = query_result.single_dataset_query_result
-
     if not query_result or not single_dataset_result:
+    if not query_result:
+        raise ValueError("query_result is not properly initialized")
+
+    single_dataset_result = query_result.single_dataset_query_result
+
+    if not single_dataset_result:
         raise ValueError("query_result is not properly initialized")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    query_result = state.get("query_result")
-
-    single_dataset_result = query_result.single_dataset_query_result
-
-    if not query_result or not single_dataset_result:
-        raise ValueError("query_result is not properly initialized")
+    query_result = state.get("query_result")
+
+    if not query_result:
+        raise ValueError("query_result is not properly initialized")
+
+    single_dataset_result = query_result.single_dataset_query_result
+
+    if not single_dataset_result:
+        raise ValueError("query_result is not properly initialized")
🧰 Tools
🪛 Ruff (0.14.13)

28-28: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents
In `@chat-server/app/workflow/graph/single_dataset_graph/node/sql_agent.py` around
lines 23 - 28, The code accesses query_result.single_dataset_query_result before
verifying query_result is present, which can raise AttributeError; update the
logic in sql_agent.py so you first retrieve query_result =
state.get("query_result") and immediately check if not query_result (or
query_result is None) and raise the ValueError there, and only after that access
query_result.single_dataset_query_result to assign single_dataset_result; ensure
the check covers both query_result and single_dataset_result as intended.

Comment on lines +401 to +543
async def process_example(
self, example_name: str, example_code: str, example_index: int, total_examples: int
) -> ExampleResult:
"""
Complete workflow for ONE example.

Args:
example_name: Name of the example
example_code: Python code from example
example_index: Current example number (1-indexed)
total_examples: Total number of examples

Returns:
ExampleResult with status and details
"""
print(f"\n{'='*70}")
print(f"[{example_index}/{total_examples}] Processing: {example_name}")
print(f"{'='*70}")

result = ExampleResult(example_name=example_name, status="error")

# Step 1: Execute code and generate image
print(" → Step 1: Generating chart image...")
image_path, error = self._execute_python_code(example_code)

if error or not image_path:
result.status = "skip"
result.error_message = f"Failed to generate image: {error}"
print(f" ⚠️ Skipped - {result.error_message}")
return result

print(f" ✓ Image saved: {Path(image_path).name}")
result.image_path = image_path

# Step 2: Extract dataset names
print(" → Step 2: Extracting dataset names...")
dataset_names = self._extract_dataset_names(example_code)

if not dataset_names:
result.status = "skip"
result.error_message = "No datasets found in code"
print(f" ⚠️ Skipped - {result.error_message}")
return result

print(f" ✓ Found datasets: {', '.join(dataset_names)}")
result.datasets = dataset_names

# Step 3: Find datasets in Gopie (assumes already uploaded)
print(" → Step 3: Finding datasets in Gopie...")
dataset_ids = []
for ds_name in dataset_names:
ds_id = await self._find_dataset_in_gopie(ds_name)
if ds_id:
dataset_ids.append(ds_id)
print(f" ✓ Found '{ds_name}' (ID: {ds_id[:8]}...)")
else:
print(f" ⚠️ Dataset '{ds_name}' not found in Gopie")

if not dataset_ids:
result.status = "skip"
result.error_message = "No matching datasets found in Gopie"
print(f" ⚠️ Skipped - {result.error_message}")
return result

# Step 4: Fetch schemas
print(f" → Step 4: Fetching {len(dataset_ids)} schema(s)...")
fetch_tasks = [
self._fetch_schema_for_dataset(dataset_id=did, project_id=self.project_id)
for did in dataset_ids
]
schemas_results = await asyncio.gather(*fetch_tasks, return_exceptions=True)
schemas = [s for s in schemas_results if isinstance(s, dict)]

if not schemas:
result.status = "error"
result.error_message = "Failed to fetch schemas"
print(f" ❌ {result.error_message}")
return result

print(f" ✓ Fetched {len(schemas)} schema(s)")

# Step 5: Generate test case with LLM
print(" → Step 5: Generating test case with LLM...")
try:
prompt = self._create_prompt_for_image(image_path=image_path, datasets_schema=schemas)
test_case = await self._call_llm(prompt)

unique_project_ids = {s.get("project_id", "") for s in schemas if s.get("project_id")}
unique_dataset_ids = [s.get("dataset_id", "") for s in schemas if s.get("dataset_id")]
unique_dataset_ids = [d for d in unique_dataset_ids if d]

test_case.project_id = next(iter(unique_project_ids)) if unique_project_ids else ""
test_case.dataset_id = unique_dataset_ids[0] if len(unique_dataset_ids) == 1 else ""

dataset_names_for_sql = [
s.get("dataset_name", "") for s in schemas if s.get("dataset_name")
]
test_case.sql_queries = [f'SELECT * FROM "{dn}"' for dn in dataset_names_for_sql]
test_case.image_path = image_path

result.test_case = test_case
print(f" ✓ Test case generated: {test_case.query[:60]}...")

except Exception as e:
result.status = "error"
result.error_message = f"LLM generation failed: {e}"
print(f" ❌ {result.error_message}")
return result

# Step 6: Run test case immediately
print(" → Step 6: Running test case...")
try:
case_dict = {
"project_id": test_case.project_id,
"dataset_id": test_case.dataset_id,
"query": test_case.query,
"sql_queries": test_case.sql_queries,
"image_path": test_case.image_path,
}

test_result = await run_viz_test_case(case_dict)
result.test_result = test_result

if test_result.get("success"):
evaluation = test_result.get("evaluation", {})
score = evaluation.get("score", 0)
passed = score >= 8
status_emoji = "✅" if passed else "❌"
result.status = "success" if passed else "error"
print(
f" {status_emoji} Test {'PASSED' if passed else 'FAILED'} (score: {score}/10)"
)
else:
result.status = "error"
result.error_message = test_result.get("error", "Unknown error")
print(f" ❌ Test error: {result.error_message}")

except Exception as e:
result.status = "error"
result.error_message = f"Test execution failed: {e}"
print(f" ❌ {result.error_message}")

return result

⚠️ Potential issue | 🟠 Major

Score threshold assumes a 0–10 scale but evaluator appears 0–1.

If the evaluator returns 0–1 (as in the current prompt/schema), score >= 8 will always fail. Align the scale or convert before comparison.

🔧 Option if keeping 0–1 scale
-                score = evaluation.get("score", 0)
-                passed = score >= 8
+                score = evaluation.get("score", 0)
+                passed = score >= 0.8
                 status_emoji = "✅" if passed else "❌"
                 result.status = "success" if passed else "error"
                 print(
-                    f"    {status_emoji} Test {'PASSED' if passed else 'FAILED'} (score: {score}/10)"
+                    f"    {status_emoji} Test {'PASSED' if passed else 'FAILED'} (score: {score:.2f}/1.0)"
                 )
🧰 Tools
🪛 Ruff (0.14.13)

498-498: Possible SQL injection vector through string-based query construction

(S608)


504-504: Do not catch blind exception: Exception

(BLE001)


538-538: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
In `@chat-server/tests/e2e/viz_utils/per_example_workflow.py` around lines 401 -
543, The test scoring logic in process_example assumes a 0–10 scale (score >= 8)
but the evaluator may return 0–1; detect and normalize the score before
comparing: after retrieving evaluation = test_result.get("evaluation", {}) and
score = evaluation.get("score", 0), coerce score to a float, and if score <= 1
then scale it (e.g., score_scaled = float(score) * 10) else keep as-is
(score_scaled = float(score)); then use passed = score_scaled >= 8 and use
score_scaled in any printed messages and result fields so the pass/fail decision
matches the intended 0–10 threshold. Ensure this change is applied inside
process_example where test_result/evaluation/score are handled.
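The normalization described in the prompt could look like this sketch (the threshold and field names follow the snippet above; `normalize_score` is a hypothetical helper name):

```python
def normalize_score(raw: float) -> float:
    """Map an evaluator score onto the 0-10 scale used by the pass check;
    values at or below 1 are assumed to be on a 0-1 scale."""
    score = float(raw)
    return score * 10 if score <= 1 else score

def passed(evaluation: dict) -> bool:
    # Matches the existing `score >= 8` threshold in process_example.
    return normalize_score(evaluation.get("score", 0)) >= 8
```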

Comment on lines +39 to +46
class VizEvaluationSchema(BaseModel):
passed: bool = Field(description="Whether the generated chart matches the reference")
score: float = Field(
ge=0.0,
le=1.0,
description="Similarity score between 0 and 1 (higher is better)",
)
reasoning: str = Field(description="Short explanation of the judgment")

⚠️ Potential issue | 🟠 Major

Align evaluation score scale with the pass/fail threshold.

The schema/prompt specify a 0–1 score, but the runner treats it as 0–10 (score >= 8). This will mark nearly all cases as failed or fail schema validation. Consider standardizing on 0–10 to match existing threshold.

🔧 Align schema & prompt to 0–10
 class VizEvaluationSchema(BaseModel):
     passed: bool = Field(description="Whether the generated chart matches the reference")
     score: float = Field(
         ge=0.0,
-        le=1.0,
-        description="Similarity score between 0 and 1 (higher is better)",
+        le=10.0,
+        description="Similarity score between 0 and 10 (higher is better)",
     )
     reasoning: str = Field(description="Short explanation of the judgment")
-                    "passed (boolean), score (float 0-1), reasoning (short string)."
+                    "passed (boolean), score (float 0-10), reasoning (short string)."

Also applies to: 69-83, 314-321

🤖 Prompt for AI Agents
In `@chat-server/tests/e2e/viz_utils/viz_test_case_runner.py` around lines 39 -
46, The VizEvaluationSchema currently constrains score to 0–1 but the runner
uses a 0–10 pass threshold (score >= 8); update the schema field in
VizEvaluationSchema (the score Field) to ge=0.0 and le=10.0 and change its
description to "Similarity score between 0 and 10 (higher is better)"; then
check other places that parse/validate or prompt for the score (any code that
consumes VizEvaluationSchema or performs the score >= 8 check) and ensure
prompts, validation, and test expectations consistently use the 0–10 scale.

Comment on lines +192 to +201
def dict_hash(dct: dict[Any, Any]) -> Any:
"""Return a hash of the contents of a dictionary."""
serialized = json.dumps(dct, sort_keys=True)

try:
m = hashlib.sha256(serialized)[:32] # pyright: ignore[reportArgumentType,reportIndexIssue]
except TypeError:
m = hashlib.sha256(serialized.encode())[:32] # pyright: ignore[reportIndexIssue]

return m.hexdigest()

⚠️ Potential issue | 🔴 Critical

Bug: hashlib.sha256() returns a hash object, not bytes—slicing it won't work.

The code attempts to slice hashlib.sha256(serialized)[:32], but hashlib.sha256() returns a hash object, not bytes. This will raise a TypeError (which the code then catches and tries the encoded version, which also won't work correctly). The # pyright: ignore comments are masking the issue.

🐛 Proposed fix
 def dict_hash(dct: dict[Any, Any]) -> Any:
     """Return a hash of the contents of a dictionary."""
     serialized = json.dumps(dct, sort_keys=True)
-
-    try:
-        m = hashlib.sha256(serialized)[:32]  # pyright: ignore[reportArgumentType,reportIndexIssue]
-    except TypeError:
-        m = hashlib.sha256(serialized.encode())[:32]  # pyright: ignore[reportIndexIssue]
-
-    return m.hexdigest()
+    return hashlib.sha256(serialized.encode()).hexdigest()[:32]
🤖 Prompt for AI Agents
In `@chat-server/tests/e2e/viz_utils/viz_utils.py` around lines 192 - 201, The
dict_hash function is using hashlib.sha256 incorrectly by slicing the hash
object; instead ensure `serialized` is bytes (encode when needed), pass that to
`hashlib.sha256(...)`, then call `.hexdigest()` (or `.digest()` if you want raw
bytes) and slice the resulting bytes/hex string as intended before returning;
update the logic in dict_hash to encode `serialized` when it's a str, compute
the hash via `hashlib.sha256(serialized_bytes)`, then return the proper sliced
`.hexdigest()` (or slice `.digest()` and then hexlify) so `m.hexdigest()` is
called on a real digest rather than on the hash object.
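Applied, the corrected helper from the diff above is deterministic and order-insensitive; a quick sanity sketch:

```python
import hashlib
import json
from typing import Any

def dict_hash(dct: dict[Any, Any]) -> str:
    """Return a stable 32-character hash of a dictionary's contents."""
    serialized = json.dumps(dct, sort_keys=True)
    return hashlib.sha256(serialized.encode()).hexdigest()[:32]
```

Because keys are sorted before serialization, insertion order does not change the hash.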

Tharun Paul added 2 commits January 24, 2026 05:54
```
fix(workflow): standardize SQL query state key naming

- Changed state key from "relevant_sql_queries" to "previous_sql_queries" in multi_dataset and single_dataset agent nodes
- Changed state key from "previous_sql_queries" to "prev_sql_queries" in analyze_query node to match input parameter naming
```
@claude

claude Bot commented Jan 24, 2026

Claude finished @paul-tharun's task —— View job


PR Review: Feat/e2e datasets metadata

I've conducted a comprehensive review of this large PR with 100 files changed. Here are my findings:

Review Checklist

  • Read repository setup and guidelines (CLAUDE.md, configs)
  • Review architectural changes (new graphs, workflows)
  • Analyze API security and validation
  • Check error handling and edge cases
  • Evaluate performance implications
  • Assess test coverage
  • Provide comprehensive feedback

🏗️ Architecture & Design

Positive Changes

  • Well-structured agent architecture: New nl_to_sql_graph and sql_planner_graph follow LangGraph patterns correctly
  • Separation of concerns: Clear separation between SQL planning, execution, and validation workflows
  • Type safety: Good use of Pydantic models for API validation (FetchSqlRequest, FetchSqlResponse)
  • Unified client pattern: New GopieClient centralizes API communication with proper header management

⚠️ Architectural Concerns

  • Commented-out code in workflows: chat-server/app/workflow/graph/sql_planner_graph/graph.py:19-30 contains extensive commented-out conditional edges that should either be implemented or removed
  • Complex state management: The addition of multiple new graph types increases complexity without clear documentation of when each should be used

🔒 Security Analysis

Good Security Practices

  • Input validation: Proper Pydantic validation in new API endpoints (fetch_sql.py:21-22)
  • SQL injection prevention: Using parameterized queries and proper escaping (column_value_matching.py:40)
  • Header handling: Proper organization ID header management in GopieClient

🚨 Security Issues

  1. Insufficient input sanitization in column_value_matching.py:40:

    WHERE LOWER(CAST({column_name} AS VARCHAR)) LIKE '%' || LOWER('{escaped_value}') || '%'

    The escaped_value variable name suggests sanitization, but no actual escaping is shown in this snippet.

  2. Generic exception handling in fetch_sql.py:36-42:

    except Exception as e:
        logger.error(f"Error in fetch_sql: {e}")
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail="Failed to generate SQL queries, please try again.",
        ) from e

    This could mask important security errors and provide too generic error messages.


🚀 Performance Analysis

Performance Improvements

  • Smart sampling: New table size estimation and sampling logic in table_utils.py prevents performance issues on large datasets
  • Connection pooling: Using singleton pattern for HTTP client connections (SingletonAiohttp)
  • Async operations: Proper use of async/await throughout new components

⚠️ Performance Concerns

  • Multiple vector searches: New DuckDB documentation search (duckdb_docs/search.py) adds another vector search operation that could impact latency
  • Memory usage: Complex state objects with nested query results could consume significant memory in high-concurrency scenarios

🧪 Testing & Quality

Testing Infrastructure

  • Good test structure: Maintained separation between unit and e2e tests
  • DSPy integration: Advanced evaluation system with optimized evaluators

Testing Gaps

  1. No tests for new API endpoint: fetch_sql.py lacks dedicated test coverage
  2. Missing edge case tests: New graph workflows need tests for failure scenarios
  3. Integration tests: No tests covering the interaction between new nl_to_sql_graph and existing workflows

🐛 Bug & Error Handling Issues

  1. State validation inconsistency in sql_agent.py:27-28:

    if not query_result or not single_dataset_result:
        raise ValueError("query_result is not properly initialized")

    This doesn't provide context about what went wrong or how to fix it.

  2. Resource cleanup: Missing cleanup logic for failed operations in new graph workflows

  3. Retry logic: Inconsistent retry mechanisms across different agents - some have configurable retry counts, others don't


📋 Code Quality

Good Practices

  • Type hints: Comprehensive type annotations throughout new code
  • Documentation: Good docstrings in new functions like get_duckdb_docs()
  • Code organization: Logical file structure following established patterns

⚠️ Areas for Improvement

  • Hardcoded values: Magic numbers like LIMIT 200000 in SQL queries should be configurable
  • Long functions: Some functions exceed 50 lines and could benefit from decomposition
  • Duplicate logic: Similar error handling patterns repeated across multiple files

🔧 Specific Recommendations

  1. Security:

    # In column_value_matching.py - add proper escaping
    def escape_sql_value(value: str) -> str:
        return value.replace("'", "''").replace("%", "\\%").replace("_", "\\_")
  2. Error handling:

    # In fetch_sql.py - more specific exception handling
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    except ConnectionError as e:
        raise HTTPException(status_code=503, detail="Service unavailable")
  3. Configuration:

    # Add to config.py
    DEFAULT_QUERY_LIMIT: int = 200000
    MAX_SAMPLING_PERCENTAGE: float = 5.0
  4. Testing:

    • Add unit tests for fetch_sql_router
    • Add integration tests for new graph workflows
    • Add performance tests for large dataset scenarios
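Note that the `escape_sql_value` helper from recommendation 1 only neutralizes LIKE wildcards if the query also declares the escape character; a hedged sketch (the clause shape is illustrative, not the project's actual query):

```python
def escape_sql_value(value: str) -> str:
    """Escape single quotes plus LIKE wildcards, as suggested above."""
    return value.replace("'", "''").replace("%", "\\%").replace("_", "\\_")

def build_contains_clause(column: str, value: str) -> str:
    # Without ESCAPE '\', the backslashes would be matched literally
    # instead of disabling % and _ as wildcards.
    return f"{column} ILIKE '%' || '{escape_sql_value(value)}' || '%' ESCAPE '\\'"
```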

📊 Overall Assessment

Category Rating Comments
Architecture ⭐⭐⭐⭐⚪ Well-designed but complex
Security ⭐⭐⭐⚪⚪ Good practices with some gaps
Performance ⭐⭐⭐⭐⚪ Smart optimizations implemented
Testing ⭐⭐⚪⚪⚪ Infrastructure good, coverage lacking
Code Quality ⭐⭐⭐⭐⚪ Clean code with room for improvement

Overall: This is a substantial improvement to the system with well-thought-out architectural changes. The main concerns are security gaps, insufficient testing of new features, and some complexity that could be better documented. I recommend addressing the security issues before merging.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@chat-server/app/utils/providers/embedding_providers/vllm.py`:
- Around line 26-31: The VLLMEmbeddingProvider __init__ currently accepts but
never stores metadata and calls no super; either remove the unused metadata
parameter from VLLMEmbeddingProvider or persist it similarly to
LiteLLMEmbeddingProvider (e.g., store self.metadata and extract org-scoped keys
like user/trace_id as PortkeyEmbeddingProvider does) so downstream methods can
use it; additionally implement a fallback strategy by updating either
VLLMEmbeddingProvider (add a guarded get_embedding call that catches failures
and delegates to a configured fallback provider) or enhance
get_embedding_provider() to return a wrapper that attempts the primary provider
(VLLMEmbeddingProvider) and on exception switches to a secondary provider,
ensuring clear identifiers (VLLMEmbeddingProvider, LiteLLMEmbeddingProvider,
PortkeyEmbeddingProvider, get_embedding_provider) are used to locate and wire
the fallback logic.
🧹 Nitpick comments (5)
chat-server/app/workflow/graph/multi_dataset_graph/node/analyze_query.py (2)

71-93: Consider generating a unique ID for the fallback tool_call.

The hardcoded "id": "fallback_respond_to_user" may cause issues if any downstream code validates or indexes by tool_call_id expecting unique values. LangChain typically uses UUIDs.

♻️ Suggested improvement
+import uuid
+
 ...
             fallback = AIMessage(
                 content=getattr(response, "content", "") or "",
                 tool_calls=[
                     {
                         "name": "respond_to_user",
                         "args": {
                             "query_type": "data_query",
                             "confidence_score": 5,
                             "reasoning": "Model did not return a tool call; defaulting to full workflow.",
                             "clarification_needed": "",
                             "status_message": "Analyzing your question...",
                             "response_data": None,
                         },
-                        "id": "fallback_respond_to_user",
+                        "id": f"fallback_{uuid.uuid4().hex[:12]}",
                     }
                 ],
             )

179-185: Consider adding defensive checks for robustness.

Direct access to response.tool_calls[0]["args"] could raise IndexError or KeyError if the response structure is unexpected. While the outer try-except catches this, a defensive check would provide clearer error handling.

♻️ Suggested defensive check
 async def _handle_analysis_response(
     response: Any,
     query_result: QueryResult,
     tool_call_count: int,
     config: RunnableConfig,
 ) -> dict:
-    tool_args = response.tool_calls[0]["args"]
+    tool_calls = getattr(response, "tool_calls", None) or []
+    if not tool_calls or "args" not in tool_calls[0]:
+        logger.warning("_handle_analysis_response: missing tool_calls or args; using defaults")
+        tool_args = {}
+    else:
+        tool_args = tool_calls[0]["args"]

     query_type = tool_args.get("query_type", "conversational")
     confidence_score = tool_args.get("confidence_score", 5)
chat-server/app/core/config.py (1)

113-115: Consider surfacing these sampling knobs in env/docs.
Since TARGET_ROWS and SAMPLING_THRESHOLD are new tunables, adding them to .env.example or config docs would make them more discoverable for operators.
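A sketch of how these knobs might be surfaced via environment variables (the variable names mirror the config fields; the defaults are assumptions, not the project's actual values):

```python
def load_sampling_config(env: dict[str, str]) -> tuple[int, int]:
    """Read TARGET_ROWS and SAMPLING_THRESHOLD with assumed defaults."""
    target_rows = int(env.get("TARGET_ROWS", "200000"))
    sampling_threshold = int(env.get("SAMPLING_THRESHOLD", "1000000"))
    return target_rows, sampling_threshold
```

Passing `os.environ` at startup keeps the tunables overridable from .env without code changes.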

chat-server/app/utils/providers/embedding_providers/vllm.py (1)

12-23: Avoid clobbering or duplicating model_kwargs in embed_query.

If a caller supplies model_kwargs, this currently risks a TypeError (duplicate keyword) or silently discarding caller options. Prefer merging.

♻️ Suggested merge pattern
 class VLLMEmbeddingFunctionProvider(OpenAIEmbeddings):
     def embed_query(self, text: str, **kwargs: Any) -> list[float]:
         text = f"Instruct: {TASK_DESCRIPTION}\nQuery: {text}"
+        model_kwargs = {"extra_body": {"truncate_prompt_tokens": -1}}
+        model_kwargs |= kwargs.pop("model_kwargs", {})
         return super().embed_query(
-            text, model_kwargs={"extra_body": {"truncate_prompt_tokens": -1}}, **kwargs
+            text, model_kwargs=model_kwargs, **kwargs
         )

     async def aembed_query(self, text: str, **kwargs: Any) -> list[float]:
         text = f"Instruct: {TASK_DESCRIPTION}\nQuery: {text}"
+        model_kwargs = {"extra_body": {"truncate_prompt_tokens": -1}}
+        model_kwargs |= kwargs.pop("model_kwargs", {})
         return await super().aembed_query(
-            text, model_kwargs={"extra_body": {"truncate_prompt_tokens": -1}}, **kwargs
+            text, model_kwargs=model_kwargs, **kwargs
         )
chat-server/app/utils/model_registry/model_provider.py (1)

175-203: Guard retries when tools are bound to avoid duplicate side effects in downstream chains.

bind_tools() binds tool schemas only and doesn't execute tools directly. However, when this LLM is used downstream in a chain that executes tools, retries at higher levels can cause tool side effects to repeat. As a defensive measure, skip with_retry() when tools are bound, since side-effectful operations (database writes, external API calls) should be idempotent or prevented from retrying altogether.

♻️ Suggested guard to avoid retrying tool calls
-    if schema:
-        structured_llm = llm.with_structured_output(schema=schema, method="json_schema")  # type: ignore
-        return structured_llm.with_retry()  # type: ignore
+    if schema:
+        structured_llm = llm.with_structured_output(schema=schema, method="json_schema")  # type: ignore
+        return structured_llm.with_retry()  # type: ignore
     elif json_mode:
         llm = llm.bind(response_format={"type": "json_object"})
     if force_tool_calls:
         llm = llm.bind(tool_choice="required")
-    return llm.with_retry()  # type: ignore
+    if tool_names or force_tool_calls:
+        return llm  # avoid retrying side-effectful tool calls
+    return llm.with_retry()  # type: ignore

Comment on lines +26 to +31
class VLLMEmbeddingProvider(BaseEmbeddingProvider):
def __init__(
self,
metadata: dict[str, str],
):
pass

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find and examine BaseEmbeddingProvider and other embedding providers
echo "=== Finding base.py ==="
fd -t f 'base.py' chat-server/app/utils/providers/embedding_providers

echo -e "\n=== Listing embedding providers directory ==="
fd -t f '\.py$' chat-server/app/utils/providers/embedding_providers

echo -e "\n=== Checking BaseEmbeddingProvider definition ==="
rg -n "class BaseEmbeddingProvider" chat-server/app/utils/providers/embedding_providers -A 20

Repository: factly/gopie

Length of output: 1120


🏁 Script executed:

#!/bin/bash
# Get full BaseEmbeddingProvider implementation
echo "=== Full BaseEmbeddingProvider class ==="
cat -n chat-server/app/utils/providers/embedding_providers/base.py

echo -e "\n=== VLLMEmbeddingProvider __init__ ==="
rg -n "class VLLMEmbeddingProvider" chat-server/app/utils/providers/embedding_providers/vllm.py -A 10

echo -e "\n=== OpenAI provider __init__ ==="
rg -n "class.*EmbeddingProvider" chat-server/app/utils/providers/embedding_providers/openai.py -A 10

echo -e "\n=== LiteLLM provider __init__ ==="
rg -n "class.*EmbeddingProvider" chat-server/app/utils/providers/embedding_providers/litellm.py -A 10

echo -e "\n=== Local provider __init__ ==="
rg -n "class.*EmbeddingProvider" chat-server/app/utils/providers/embedding_providers/local.py -A 10

echo -e "\n=== Custom provider __init__ ==="
rg -n "class.*EmbeddingProvider" chat-server/app/utils/providers/embedding_providers/custom.py -A 10

echo -e "\n=== Portkey provider __init__ ==="
rg -n "class.*EmbeddingProvider" chat-server/app/utils/providers/embedding_providers/portkey.py -A 10

Repository: factly/gopie

Length of output: 3474


🏁 Script executed:

#!/bin/bash
# Check embedding provider usage and instantiation
echo "=== Check __init__.py for embedding providers ==="
cat -n chat-server/app/utils/providers/embedding_providers/__init__.py

echo -e "\n=== Search for VLLMEmbeddingProvider usage ==="
rg -r "VLLMEmbeddingProvider" chat-server -g '*.py' -B 2 -A 2

echo -e "\n=== Check provider manager or factory for fallback logic ==="
fd -t f '*.py' chat-server/app/utils/providers | head -20 | xargs -I {} bash -c 'echo "=== {} ===" && head -5 {}'

echo -e "\n=== Look for fallback or routing patterns ==="
rg -i "fallback|fallback_provider" chat-server -g '*.py' -A 2

Repository: factly/gopie

Length of output: 14425


🏁 Script executed:

#!/bin/bash
# Check how embedding providers are instantiated/used
echo "=== Search for embedding provider instantiation ==="
rg "EmbeddingProvider\(" chat-server/app -g '*.py' -B 2 -A 2

echo -e "\n=== Check if there's a provider manager/factory ==="
fd -t f '*.py' chat-server/app/utils/providers | grep -E "(manager|factory|init)" | xargs -I {} bash -c 'echo "=== {} ===" && head -20 {}'

echo -e "\n=== Check __init__.py in providers parent ==="
cat -n chat-server/app/utils/providers/__init__.py 2>/dev/null || echo "File not found"

echo -e "\n=== Check for EmbeddingProvider in main app init ==="
rg "EmbeddingProvider" chat-server/app -g '*.py' | head -20

Repository: factly/gopie

Length of output: 8631


🏁 Script executed:

#!/bin/bash
# Get complete model_provider.py context
echo "=== Full model_provider.py or relevant section ==="
wc -l chat-server/app/utils/model_registry/model_provider.py
head -80 chat-server/app/utils/model_registry/model_provider.py

echo -e "\n=== Check for fallback in get_embedding_provider ==="
rg -n "get_embedding_provider" chat-server/app/utils/model_registry/model_provider.py -A 20

echo -e "\n=== Check if metadata is used in VLLM get_embeddings_model ==="
cat -n chat-server/app/utils/providers/embedding_providers/vllm.py

Repository: factly/gopie

Length of output: 6287


Remove unused metadata parameter and implement fallback mechanism.

BaseEmbeddingProvider has no __init__ method, so super().__init__() is not applicable. However, VLLMEmbeddingProvider accepts metadata but never stores or uses it, unlike PortkeyEmbeddingProvider (which extracts user/trace_id for org-scoped routing) and LiteLLMEmbeddingProvider (which stores metadata for auth). Either remove the unused parameter for consistency, or document why it's needed.

Additionally, per coding guidelines, implement fallback mechanisms at the provider level or in get_embedding_provider() factory—currently there is no fallback strategy if the primary provider fails.
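A provider-agnostic sketch of that fallback idea (the `FallbackEmbeddings` class and its wiring are illustrative, not the project's API):

```python
# Hypothetical fallback wrapper: try the primary embedding provider and
# delegate to a secondary one on any failure. In the real codebase this
# would wrap VLLMEmbeddingProvider etc. inside get_embedding_provider().
class FallbackEmbeddings:
    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def embed_query(self, text: str) -> list[float]:
        try:
            return self.primary.embed_query(text)
        except Exception:
            # Primary failed (network error, quota, bad config): fall back.
            return self.secondary.embed_query(text)
```

The same pattern applies to the async variants; logging the primary's failure before falling back would aid debugging.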

🤖 Prompt for AI Agents
In `@chat-server/app/utils/providers/embedding_providers/vllm.py` around lines 26
- 31, The VLLMEmbeddingProvider __init__ currently accepts but never stores
metadata and calls no super; either remove the unused metadata parameter from
VLLMEmbeddingProvider or persist it similarly to LiteLLMEmbeddingProvider (e.g.,
store self.metadata and extract org-scoped keys like user/trace_id as
PortkeyEmbeddingProvider does) so downstream methods can use it; additionally
implement a fallback strategy by updating either VLLMEmbeddingProvider (add a
guarded get_embedding call that catches failures and delegates to a configured
fallback provider) or enhance get_embedding_provider() to return a wrapper that
attempts the primary provider (VLLMEmbeddingProvider) and on exception switches
to a secondary provider, ensuring clear identifiers (VLLMEmbeddingProvider,
LiteLLMEmbeddingProvider, PortkeyEmbeddingProvider, get_embedding_provider) are
used to locate and wire the fallback logic.

@claude

claude Bot commented Feb 2, 2026

Claude finished @paul-tharun's task —— View job


🔍 PR Review: Feat/e2e datasets metadata

Review Progress

  • Read repository setup and guidelines
  • Review architectural changes
  • Analyze security implementations
  • Check error handling patterns
  • Evaluate performance implications
  • Assess test coverage
  • Provide comprehensive feedback

🏗️ Architecture Review

This PR introduces significant architectural improvements with a well-structured approach to NL-to-SQL processing and multi-tenant organization support.

Positive Changes

1. Clean NL-to-SQL Graph Architecture

  • Supervisor routing: Clean logic in nl_to_sql_graph/node/supervisor.py routing single vs multi-dataset queries
  • Separation of concerns: Clear split between semantic search, SQL agent, and planning
  • LangGraph patterns: Proper use of StateGraph, Command, and typed state management

2. Multi-Tenant Organization Support

  • Proper isolation: Organization ID properly threaded through GopieClient and Qdrant filters
  • Security boundaries: Clear separation using must conditions for org_id in schema search
  • Header management: Consistent X-Organization-id header handling

3. SQL Injection Prevention

  • Query builders: Safe abstraction through utils/olap/ query builders
  • Parameterized approach: SQL generation through builders rather than string interpolation
  • Validation patterns: Good use of escaped values in column matching

🚨 Critical Security Issues

1. Missing Organization ID in Semantic Search

Location: chat-server/app/workflow/graph/nl_to_sql_graph/node/semantic_search.py:20

The semantic search function calls search_schemas without passing org_id, allowing cross-tenant data leakage:

# VULNERABLE: No org_id passed
results = await search_schemas(
    user_query=user_query,
    embeddings=embeddings_model,
    dataset_ids=dataset_ids,
    project_ids=project_ids,
    # Missing: org_id=org_id
)

Fix Required: Extract org_id from config and pass it to search_schemas.

2. Division by Zero Risk

Location: chat-server/app/utils/graph_utils/table_utils.py:43

# VULNERABLE: Will crash if estimated_size is 0
sample_pct = (settings.TARGET_ROWS / estimated_size) * 100

Fix Required: Add zero-check before division.
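A guarded version of that calculation might look like the sketch below; the clamp to 100% and the full-sample fallback for empty tables are assumed policy choices, not the project's actual behavior:

```python
# Sketch of the guarded sampling calculation. TARGET_ROWS mirrors the
# setting named in the review; the 100% fallback/clamp is an assumption.
def calculate_sampling_percentage(target_rows: int, estimated_size: int) -> float:
    """Percentage of rows to sample, safe for empty or unknown table sizes."""
    if estimated_size <= 0:
        return 100.0  # nothing to scale against: sample the whole table
    return min((target_rows / estimated_size) * 100.0, 100.0)
```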

3. Direct SQL Interpolation

Location: chat-server/app/workflow/graph/single_dataset_graph/node/prepare_data.py:121

# POTENTIAL RISK: Direct interpolation without validation
sample_data_query = f"SELECT * FROM {dataset_name} LIMIT 50"

While dataset_name comes from trusted Qdrant storage, consider defensive identifier quoting.
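One defensive option is to whitelist and quote the identifier before interpolation; the regex below is an illustrative whitelist, not the project's validator:

```python
import re

# Defensive identifier quoting sketch using DuckDB-style double quotes.
# The whitelist regex is an assumption about what dataset names look like.
def quote_identifier(name: str) -> str:
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return '"' + name + '"'

def build_sample_query(dataset_name: str, limit: int = 50) -> str:
    # int() guards the limit; quote_identifier guards the table name.
    return f"SELECT * FROM {quote_identifier(dataset_name)} LIMIT {int(limit)}"
```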


⚠️ Error Handling Issues

1. Inconsistent Error Guards

  • execute_sql.py:34: Checks isinstance(last_message, ErrorMessage) but last_message is a list
  • sql_agent.py:25: Accesses query_result.single_dataset_query_result before null check

2. Missing Resource Cleanup

  • No explicit cleanup logic for failed operations in new graph workflows
  • Broad exception handling masks specific failure modes

3. State Validation Issues

  • prepare_data: Returns without dataset_info but graph always routes to sql_agent
  • routing.py: Uses state.get("query_result", []) which can be a list, causing an AttributeError

🚀 Performance Considerations

Good Optimizations

  • Smart sampling: Table size estimation prevents performance issues on large datasets
  • Vector search optimization: Hybrid search with proper filtering
  • Connection pooling: Singleton pattern for HTTP client connections
  • Async operations: Proper async/await throughout

⚠️ Potential Concerns

  • Multiple vector searches: DuckDB docs search adds latency to SQL planning
  • Memory usage: Complex nested state objects in high-concurrency scenarios
  • Missing timeouts: GopieClient lacks explicit timeout configuration
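
On the timeout point, a stdlib-only sketch of a per-request deadline follows; the wrapper name and default are assumptions (inside GopieClient itself, passing aiohttp's ClientTimeout to the session would be the idiomatic fix):

```python
import asyncio

# Illustrative request-deadline wrapper; DEFAULT_TIMEOUT_S and
# get_with_timeout are sketch names, not GopieClient's actual API.
DEFAULT_TIMEOUT_S = 30.0

async def get_with_timeout(coro, timeout: float = DEFAULT_TIMEOUT_S):
    """Await coro, but fail fast instead of hanging on a slow upstream."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout)
    except asyncio.TimeoutError as exc:
        raise TimeoutError(f"upstream request exceeded {timeout}s") from exc
```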

🧪 Test Coverage Assessment

Good Coverage

  • Existing unit tests cover GopieClient org_id handling
  • DSPy integration provides advanced evaluation
  • E2E visualization test infrastructure

Missing Tests

  • No tests for new /fetch-sql endpoint
  • No integration tests for NL-to-SQL graph workflow
  • Missing edge case tests for new graph failure scenarios
  • No tests for org_id security boundaries

📋 Code Quality Issues

1. Error Message Inconsistencies

# Vague error without context
raise ValueError("query_result is not properly initialized")

2. Magic Numbers

# Should be configurable
SELECT * FROM {table} LIMIT 200000
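
A minimal settings-driven version could look like this (MAX_SCAN_ROWS is an assumed field name, not the project's actual configuration key):

```python
from dataclasses import dataclass

# Sketch: lift the hard-coded LIMIT into a settings object so it can be
# tuned per environment instead of living as a magic number in the query.
@dataclass(frozen=True)
class Settings:
    MAX_SCAN_ROWS: int = 200_000

settings = Settings()

def build_scan_query(table: str) -> str:
    return f"SELECT * FROM {table} LIMIT {settings.MAX_SCAN_ROWS}"
```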

3. Missing Function Documentation

Several new functions lack comprehensive docstrings explaining return contracts.


🔧 Specific Recommendations

High Priority Fixes

  1. Add org_id to semantic search to prevent data leakage
  2. Add zero-check in calculate_sampling_percentage
  3. Fix error guard in execute_sql.py (list vs ErrorMessage)
  4. Add validation in sql_agent.py before accessing query_result

Medium Priority

  1. Add timeouts to GopieClient requests
  2. Improve error messages with specific context
  3. Add integration tests for new workflows
  4. Configure magic numbers via settings

Code Quality

  1. Add comprehensive docstrings to new functions
  2. Implement consistent retry logic across agents
  3. Add resource cleanup for failed operations

📊 Overall Assessment

| Category | Rating | Comments |
| --- | --- | --- |
| Architecture | ⭐⭐⭐⭐⭐ | Excellent modular design |
| Security | ⭐⭐⭐⚪⚪ | Good foundations, critical gaps |
| Performance | ⭐⭐⭐⭐⚪ | Smart optimizations implemented |
| Testing | ⭐⭐⚪⚪⚪ | Infrastructure good, coverage lacking |
| Code Quality | ⭐⭐⭐⭐⚪ | Clean code, room for improvement |

Recommendation: This is a solid architectural improvement, but the security issue in semantic search must be fixed before merging to prevent cross-tenant data access.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 18

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (11)
chat-server/tests/e2e/utils/generate_app_cases.py (1)

290-309: ⚠️ Potential issue | 🔴 Critical

Complete the multi-dataset routing description and add "ai" → "assistant" role normalization.

Lines 290–309: The routing overview for identify_datasets omits the retry_semantic_search path. Update line 295 to:

  • identify_datasets → routes: plan_query | route_response (no_datasets_found) | retry_semantic_search

Lines 401–420: The _normalize_message function converts "human" to "user" but does not convert "ai" to "assistant". Since LangChain's AIMessage.type is "ai" but provider APIs (OpenAI, etc.) require role: "assistant", add this mapping after line 405:

            if role == "ai":
                role = "assistant"

These omissions will cause incomplete test-path coverage and generated test cases with invalid message roles that fail API compatibility checks.
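
The role normalization the review asks for can be sketched as a small mapping (the pass-through for unknown roles is an assumption about intended behavior):

```python
# LangChain message types "human"/"ai" mapped to the "user"/"assistant"
# roles that provider APIs (OpenAI, etc.) expect.
ROLE_MAP = {"human": "user", "ai": "assistant"}

def normalize_role(role: str) -> str:
    # Unknown roles (e.g. "system", "tool") pass through unchanged.
    return ROLE_MAP.get(role, role)
```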

chat-server/tests/unit/test_column_value_matching.py (2)

36-61: ⚠️ Potential issue | 🟠 Major

Bug: captured dict is never populated, assertion will fail.

The test declares captured = {} on line 39 but never assigns captured["query"] in the mock function. The assertion on line 61 will raise a KeyError.

🐛 Proposed fix
 @pytest.mark.asyncio
 async def test_find_similar_values_falls_back_to_levenshtein(monkeypatch):
     """Test fallback to Levenshtein when ILIKE returns empty results."""
     calls = {"count": 0}
     captured = {}
 
     async def fake_execute_sql(query: str, org_id=None):
         calls["count"] += 1
+        captured["query"] = query
         # Check for both DuckDB (levenshtein) and ClickHouse (levenshteinDistance) syntax
         if "levenshtein" in query.lower():
             return [
                 {"name": "Alison"},
                 {"name": "Alicia"},
             ]
         return []

73-83: ⚠️ Potential issue | 🟡 Minor

Bug: Duplicate dictionary key "error" in dummy logger.

The dictionary literal has "error" defined twice. The second definition silently overwrites the first. This appears to be a copy-paste error.

🐛 Proposed fix
     dummy_logger = type(
         "L",
         (),
         {
             "error": lambda *args, **kwargs: None,
             "debug": lambda *args, **kwargs: None,
-            "error": lambda *args, **kwargs: None,
         },
     )()
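
The silent overwrite is core Python semantics and easy to verify:

```python
# In a dict literal, a duplicated key keeps only the last value;
# the earlier "error" entry is discarded without any warning.
handlers = {"error": "first", "debug": "dbg", "error": "second"}
print(handlers)  # the first "error" value is gone
```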
chat-server/tests/unit/test_vector_store.py (1)

48-77: ⚠️ Potential issue | 🟠 Major

Test mocks get_vector_store but implementation uses get_async_client.

The test mocks QdrantSetup.get_vector_store, but examining the actual add_document_to_vector_store implementation (in vector_store.py lines 73-90), the function calls QdrantSetup.get_async_client and performs a direct upsert, not get_vector_store.aadd_documents. This test will pass but doesn't exercise the actual code path.

🔧 Proposed fix - update test to match implementation
     @pytest.mark.asyncio
     async def test_add_document_to_vector_store(self, mock_document):
-        mock_vector_store = AsyncMock()
+        mock_async_client = AsyncMock()

         with (
             patch("app.services.qdrant.vector_store.QdrantSetup") as mock_qdrant_setup_class,
             patch("app.services.qdrant.vector_store.get_model_provider") as mock_get_model_provider,
+            patch("app.services.qdrant.vector_store.generate_sparse_vector") as mock_sparse,
         ):
             mock_model_provider = Mock()
             mock_embeddings = Mock()
+            mock_embeddings.embed_query.return_value = [0.1, 0.2, 0.3]
             mock_model_provider.get_embeddings_model.return_value = mock_embeddings
             mock_get_model_provider.return_value = mock_model_provider

-            mock_qdrant_setup_class.get_vector_store.return_value = mock_vector_store
+            mock_qdrant_setup_class.get_async_client = AsyncMock(return_value=mock_async_client)
             mock_qdrant_setup_class.get_document_id.return_value = "doc_id_123"
+            mock_sparse.return_value = Mock(indices=[1, 2], values=[0.5, 0.5])

             await add_document_to_vector_store(mock_document)

-            mock_qdrant_setup_class.get_vector_store.assert_called_once_with(
-                embeddings=mock_embeddings, collection_name="dataset_collection"
-            )
             mock_qdrant_setup_class.get_document_id.assert_called_once_with("proj1", "ds1")
-            mock_vector_store.aadd_documents.assert_called_once_with(
-                documents=[mock_document], ids=["doc_id_123"]
-            )
+            mock_async_client.upsert.assert_called_once()
chat-server/tests/unit/test_prompts.py (1)

107-168: ⚠️ Potential issue | 🟠 Major

Add @pytest.mark.unit to mark these tests for unit-only test runs.

The test file is missing the module-level pytest marker required for unit tests in chat-server/tests/unit/. This causes these tests to be skipped when running pytest -m unit.

✅ Suggested fix
 from unittest.mock import Mock, patch

 import pytest

+pytestmark = pytest.mark.unit
+
 from app.utils.langsmith.prompt_manager import PromptManager, get_prompt
 from app.workflow.prompts.prompt_selector import PromptSelector
chat-server/app/api/v1/routers/dataset_upload.py (1)

21-50: ⚠️ Potential issue | 🟠 Major

Guard against missing x-organization-id header before Qdrant storage.

store_schema_in_qdrant and create_dataset_schema both require org_id: str, but the header is optional (str | None). Passing None risks type errors and tenant mis-scoping. Make the header required, or validate and reject requests without it.

Suggested fix (header required)
 async def upload_schema(
     payload: UploadSchemaRequest,
     x_organization_id: Annotated[str | None, Header()] = None,
 ):
     """
     Processes and index dataset schema.
@@
     """
     try:
+        if x_organization_id is None:
+            raise HTTPException(
+                status_code=status.HTTP_400_BAD_REQUEST,
+                detail="x-organization-id header is required",
+            )
         project_id = payload.project_id
         dataset_id = payload.dataset_id
         dataset_details, project_details = await asyncio.gather(
chat-server/app/services/gopie/generate_schema.py (1)

70-76: ⚠️ Potential issue | 🟡 Minor

Missing error handling for the GET request.

Unlike execute_sql which checks response.status != HTTPStatus.OK, the GET request here doesn't validate the response status before parsing JSON. A non-200 response could result in unexpected behavior or unclear error messages.

🛡️ Proposed fix to add error handling
+    from http import HTTPStatus
+
     async with await client.get(path) as response:
+        if response.status != HTTPStatus.OK:
+            error_data = await response.json()
+            raise Exception(
+                f"Failed to fetch summary for {dataset_name}: "
+                f"{error_data.get('error', 'Unknown error')}"
+            )
         data = await response.json()
chat-server/app/tool_utils/tools/list_datasets.py (2)

38-52: ⚠️ Potential issue | 🟡 Minor

Add error handling for consistency with other functions.

This function lacks try/except handling unlike get_dataset_names_from_project_ids (lines 17-28) and get_dataset_names_for_dataset_ids (lines 62-70). An unhandled exception here could propagate and cause issues in the calling code.

🛡️ Proposed fix to add error handling
 async def get_project_ids_for_datasets_ids(
     dataset_ids: list[str],
     org_id: Optional[str] = None,
 ) -> dict[str, str]:
     dataset_id_project_map = {}
     client = GopieClient(org_id=org_id)
     for dataset_id in dataset_ids:
-        path = f"/v1/api/datasets/{dataset_id}/project"
-        async with await client.get(path, ssl=False) as response:
-            if response.status == 200:
-                dataset_data = await response.json()
-                dataset_id_project_map[dataset_id] = dataset_data.get("project_id", "")
-            else:
-                logger.warning(f"Dataset {dataset_id} not found")
+        try:
+            path = f"/v1/api/datasets/{dataset_id}/project"
+            async with await client.get(path, ssl=False) as response:
+                if response.status == 200:
+                    dataset_data = await response.json()
+                    dataset_id_project_map[dataset_id] = dataset_data.get("project_id", "")
+                else:
+                    logger.warning(f"Dataset {dataset_id} not found")
+        except Exception as e:
+            logger.exception(f"Error fetching project for dataset {dataset_id}: {e}")
     return dataset_id_project_map

55-76: ⚠️ Potential issue | 🟡 Minor

Potential data loss: only storing one dataset per project.

In get_dataset_names_for_dataset_ids, when multiple datasets belong to the same project, only the last one is retained because project_dataset_map[project_id] is overwritten on each iteration (line 68).

🐛 Proposed fix to accumulate datasets per project
 async def get_dataset_names_for_dataset_ids(
     dataset_project_ids_map: dict[str, str],
     org_id: Optional[str] = None,
 ) -> str:
-    project_dataset_map = {}
+    project_dataset_map: dict[str, list[str]] = {}
     client = GopieClient(org_id=org_id)
 
     for dataset_id, project_id in dataset_project_ids_map.items():
         path = f"/v1/api/projects/{project_id}/datasets/{dataset_id}"
 
         async with await client.get(path, ssl=False) as response:
             if response.status == 200:
                 dataset_data = await response.json()
-                project_dataset_map[project_id] = dataset_data.get("alias", "")
+                if project_id not in project_dataset_map:
+                    project_dataset_map[project_id] = []
+                project_dataset_map[project_id].append(dataset_data.get("alias", ""))
             else:
                 logger.warning(f"Dataset {dataset_id} not found")
 
     list_of_datasets_str = ""
-    for project_id, dataset_name in project_dataset_map.items():
+    for project_id, dataset_names in project_dataset_map.items():
         list_of_datasets_str += f"Project {project_id}:\n"
-        list_of_datasets_str += f"  - {dataset_name}\n"
+        for dataset_name in dataset_names:
+            list_of_datasets_str += f"  - {dataset_name}\n"
     return list_of_datasets_str
chat-server/app/workflow/graph/multi_dataset_graph/node/identify_datasets.py (2)

88-111: ⚠️ Potential issue | 🟡 Minor

Incorrect data type passed to set_node_message.

At line 91-94, a set is being passed as the message argument instead of a str. The curly braces {...} create a set, not a dict. This should be a string.

🐛 Proposed fix
             query_result.set_node_message(
                 "identify_datasets",
-                {
-                    "No relevant datasets found by doing semantic search. This subquery is "
-                    "not relevant to any datasets. Treating as conversational query."
-                },
+                "No relevant datasets found by doing semantic search. This subquery is "
+                "not relevant to any datasets. Treating as conversational query.",
             )

197-218: ⚠️ Potential issue | 🟡 Minor

Docstring mentions "analyze_dataset" but returns "plan_query".

The docstring at line 202 still references "analyze_dataset" as a possible return value, but the function now returns "plan_query" at line 218.

📝 Proposed fix to update docstring
     """
     Route to the appropriate next node based on dataset identification results.
 
     Returns:
-        str: One of "no_datasets_found", "retry_semantic_search", or "analyze_dataset"
+        str: One of "no_datasets_found", "retry_semantic_search", or "plan_query"
     """
🤖 Fix all issues with AI agents
In `@chat-server/app/api/v1/routers/dataset_upload.py`:
- Around line 2-4: Remove the duplicate Header import (only import Header once)
and add validation for x_organization_id before calling store_schema_in_qdrant:
ensure x_organization_id (the header bound variable) is not None/empty and if
missing raise an HTTPException with status code 400; then pass the validated
org_id (str) into store_schema_in_qdrant to satisfy its required str parameter
(refer to Header import, x_organization_id, and store_schema_in_qdrant).

In `@chat-server/app/core/log.py`:
- Around line 14-17: The custom logger method exception in
chat-server/app/core/log.py currently calls self._logger.critical(...,
stack_info=True) which records the call stack but loses the actual exception
traceback; update the exception method to pass exc_info=True (and keep
stack_info=True if desired) to self._logger so the logged output includes the
exception traceback; modify the exception function (named exception) to call
self._logger with exc_info=True (and stack_info=True if you want both) and
preserve the existing behavior of raising when self.dev_mode is true.

In `@chat-server/app/models/data.py`:
- Around line 28-37: DatasetDetails.columns is now Optional and can be None;
update all consumers to guard against None before iterating or accessing it —
e.g., in regenerate_fuzzy_values_prompt.py where matching_schema.columns is used
in list/generator comprehensions and in dataset_info.py where schema.columns is
looped, replace direct iterations with a null-safe form (check
matching_schema.columns is not None or use a fallback like
(matching_schema.columns or [])). Also add null checks before using row_count in
calculations and before using file_path for file operations so you don't assume
they are present (guard expressions around row_count and file_path where used).
Ensure you update usages referencing DatasetDetails, matching_schema.columns,
schema.columns, row_count, and file_path accordingly.

In `@chat-server/app/services/gopie/dataset_info.py`:
- Around line 34-43: Replace the re-raising pattern in the except blocks that
currently use "raise e" with a bare "raise" so the original traceback is
preserved; specifically update the except Exception as e blocks in
chat-server/app/services/gopie/dataset_info.py (the block that logs via
logger.exception("Error getting dataset info...") and the other similar block
later) to log with logger.exception(...) and then perform a bare raise rather
than "raise e".

In `@chat-server/app/services/qdrant/schema_search.py`:
- Around line 86-88: The async function search_schemas is calling the blocking
embeddings.embed_query which will block the event loop; change the call to the
async variant await embeddings.aembed_query(user_query) and assign its result
back to dense_vector, leaving generate_sparse_vector(user_query) as-is; ensure
the embeddings object supports aembed_query (as used elsewhere, e.g., health.py)
and update any nearby error handling or type expectations for dense_vector if
needed.

In `@chat-server/app/utils/olap/clickhouse.py`:
- Around line 13-19: The SQL builder currently injects table_name and
column_name directly (e.g., in get_estimated_size_query and the other builder
methods referenced), which permits SQL injection; update all methods that
interpolate identifiers to validate them against a strict identifier whitelist
or a conservative regex (e.g., only ASCII letters, numbers, and underscores,
starting with a letter) before using them, and then format identifiers using a
safe quoting strategy (e.g., wrap with ClickHouse backticks after validation and
escape any backticks found), ensuring you apply this change to
get_estimated_size_query and every builder method that inserts table_name or
column_name (the ones in the review ranges) so all identifier interpolation uses
the validated-and-quoted identifier rather than raw f-string insertion.

In `@chat-server/app/utils/olap/duckdb.py`:
- Around line 65-91: The two methods build_large_table_ilike_query and
build_large_table_levenshtein_query currently interpolate user inputs directly
into SQL and need the same sanitization as other query builders: stop inserting
raw value, table_name, and column_name into f-strings; instead use the existing
sanitization routine used elsewhere (e.g., the same escaping/quoting used by
build_sample_query or the project's SQL sanitizer) for table and column
identifiers and use parameterized placeholders or safely-escaped literal values
for the search term (escape single quotes and wrap properly) so that value,
table_name, and column_name cannot inject SQL; update both functions to call the
sanitizer for identifiers and to bind or escape the value consistently with
other query builders.
- Around line 38-63: The Levenshtein and ILIKE query builders
(build_levenshtein_query and build_ilike_query) directly interpolate the value
and identifiers into SQL, creating an injection risk; fix by
validating/whitelisting table_name and column_name (e.g., allow only
[A-Za-z_][A-Za-z0-9_]*), and sanitize or escape the value string (e.g., ensure
it's a str and escape single quotes) or better yet switch to DuckDB parameter
binding for the value; update these two methods to call the new
_validate_identifier and _sanitize_sql_string helpers (or use parameter
placeholders) before constructing the final SQL string.
- Around line 13-19: The get_estimated_size_query and other query-builder
methods currently interpolate table_name/column_name directly; to fix, add
parameterized query support to the execution layer (extend execute_sql to accept
a query plus params and use parameter binding) and update these methods (e.g.,
get_estimated_size_query) to return a parameterized SQL string with placeholders
and a corresponding params list instead of raw string interpolation; as an
immediate mitigation also validate identifier inputs (table_name, column_name)
in the class using a strict regex allowing only [A-Za-z0-9_] and raise on
invalid values before building the query so only safe identifiers are used until
execute_sql parameterization is implemented.

In
`@chat-server/app/workflow/graph/multi_dataset_graph/node/generate_subqueries.py`:
- Around line 18-67: _format_context_info currently only uses project_ids[0];
update it to iterate over all provided project_ids and append each project's
context and datasets to context_info: loop for pid in project_ids, call
get_project_info(pid, org_id=org_id) and get_project_schemas(pid, limit=50,
org_id=org_id), prepend each block with a project header (e.g., "PROJECT
CONTEXT: <pid> - <name>"), and for datasets slice the returned list to respect
max_datasets (or distribute max_datasets across projects if desired) before
formatting names/descriptions as the existing code does; keep the try/except
around per-project calls to continue on errors. Ensure you reference the same
function name (_format_context_info), and helper calls (get_project_info,
get_project_schemas) when making edits.

In `@chat-server/app/workflow/graph/multi_dataset_graph/node/sql_agent.py`:
- Around line 82-84: The success message string in the IntermediateStep call
contains a typo ("Succesfully"); update the content passed to IntermediateStep
in sql_agent.py (the IntermediateStep(...) invocation) to read "Successfully
Completed query planning step in multidataset workflow." so the spelling is
corrected while preserving the rest of the message and punctuation.

In `@chat-server/app/workflow/graph/single_dataset_graph/node/prepare_data.py`:
- Around line 85-112: The code extracts org_id but doesn't pass it into the
schema lookup, which breaks tenant isolation; update the call in prepare_data.py
where get_schema_from_qdrant is invoked (currently
get_schema_from_qdrant(dataset_id=dataset_id)) to include the org_id parameter
(e.g., get_schema_from_qdrant(dataset_id=dataset_id, org_id=org_id)), and if
necessary update get_schema_from_qdrant's signature/implementation to accept and
use org_id for tenant-scoped schema resolution; keep dataset_id usage and
existing error handling intact.

In
`@chat-server/app/workflow/prompts/multi_dataset_prompts/analyze_query_prompt.py`:
- Around line 177-186: The human prompt path uses the raw previous_sql_queries
list, which differs from the formatted representation produced by
format_analyze_query_input elsewhere; normalize previous_sql_queries with that
same helper (preserving the "None" fallback) before formatting
human_template_str, so human_template_str.format receives the consistent
"[id:1] SELECT ..." style strings other code paths expect, and pass the
formatted result to human_template_str.format instead of the raw list.

In `@chat-server/scripts/scrape_duckdb_docs.py`:
- Around line 5-6: The project imports httpx in scrape_duckdb_docs.py but
pyproject.toml lacks that dependency; update the pyproject.toml dependencies
section to include "httpx" (alongside the existing "beautifulsoup4") so CI and
production environments can install it—ensure the exact package name "httpx" is
added under the same dependencies list used for other runtime packages.
- Around line 16-23: The _fetch_page function currently catches all Exceptions
and uses print; change it to catch httpx.RequestError for transport-level
failures around await client.get(url, ...) and httpx.HTTPStatusError for non-2xx
responses raised by response.raise_for_status(), log errors via a module logger
(e.g., logger = logging.getLogger(__name__)) instead of print and include the
URL and exception details in the log (use logger.error or logger.exception), and
still return response.text on success or None on handled errors; keep the
function signature and return semantics the same.

In `@chat-server/tests/e2e/utils/generate_app_cases.py`:
- Around line 401-420: The _normalize_message function currently escapes content
but not role, which can allow special chars in role to break generated Python
code; update _normalize_message to escape role the same way as content (replace
backslash with \\\\, double-quote with \\", newline with \\n, and remove \r) and
return the escaped role, and also normalize role values by mapping
"human"→"user" and "ai"→"assistant" (in addition to the existing "type"
fallback) so generated templates that inject role values use safe, normalized
strings.
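The role handling alone can be sketched like this (normalize_role is a hypothetical stand-in for the role branch of _normalize_message):

```python
def normalize_role(role: str) -> str:
    # Map LangChain-style roles to the template's expected names first,
    # then escape so the value is safe inside generated Python source.
    role = {"human": "user", "ai": "assistant"}.get(role, role)
    return (
        role.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "")
    )
```

Escaping backslashes first matters; otherwise the backslashes introduced by the quote and newline replacements would be escaped again.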

In `@chat-server/tests/unit/test_generate_sql_prompt_formatting.py`:
- Around line 31-33: Add the missing module-level pytest marker by declaring
pytestmark = pytest.mark.unit near the top of the test module (after imports and
any module docstring) so this file follows the same convention as other unit
tests; modify the TestGenerateSqlPromptFormatting test module to include that
pytestmark declaration (referencing the module that contains class
TestGenerateSqlPromptFormatting).

In `@chat-server/tests/unit/test_schema_search.py`:
- Around line 48-51: The test accesses query call kwargs with
call_args.kwargs["filter"] which can raise KeyError because search_schemas
places filter inside each Prefetch rather than as a top-level kwarg; update the
assertion to use call_args.kwargs.get("filter") (matching other tests) and keep
the existing assertion on prefetch length (mock_client.query_points,
search_schemas, Prefetch are the symbols to update/inspect).
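A standalone illustration of why .get avoids the KeyError when a kwarg is absent from the recorded call (plain unittest.mock, no project code):

```python
from unittest.mock import Mock

client = Mock()
# Simulate a call shaped like search_schemas' query_points invocation, where
# the filter lives inside each prefetch entry rather than as a top-level kwarg.
client.query_points(prefetch=[{"filter": "f1"}, {"filter": "f2"}], limit=10)

call_args = client.query_points.call_args
top_level_filter = call_args.kwargs.get("filter")  # None, no KeyError raised
prefetch = call_args.kwargs["prefetch"]
```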
🧹 Nitpick comments (23)
chat-server/tests/e2e/utils/test_utils.py (1)

202-219: Consider using aiohttp for async HTTP requests.

The function is declared as async but uses synchronous requests.post, which blocks the event loop. Since the codebase already uses aiohttp (as seen in gopie/client.py), consider migrating to aiohttp for consistency and true non-blocking behavior.

This is a pre-existing pattern, so deferring is acceptable.

♻️ Suggested approach using aiohttp
import aiohttp

async def send_chat_request(test_case: Dict[str, Any], url: str) -> Dict[str, Any]:
    query_copy = test_case.copy()
    query_copy.pop("expected_result", None)

    try:
        headers = {
            "Content-Type": "application/json",
            "Accept": "text/event-stream",
            "X-organization-id": TestConfig.GOPIE_ORG_ID,
            "X-user-id": TestConfig.GOPIE_USER_ID,
        }
        async with aiohttp.ClientSession() as session:
            async with session.post(
                url,
                json={**query_copy, "chat_id": "", "trace_id": ""},
                headers=headers,
                timeout=aiohttp.ClientTimeout(total=REQUEST_TIMEOUT),
            ) as response:
                response.raise_for_status()
                # Process streaming response...

As per coding guidelines: "Use async/await for non-blocking I/O operations throughout the application".

chat-server/tests/unit/test_schema_search.py (1)

14-39: Consider extracting common mocks into fixtures.

The mock setup for mock_embeddings, mock_client, and the context manager patches are duplicated across all tests. Extracting these into pytest fixtures would reduce boilerplate and improve maintainability.

Example fixture approach
`@pytest.fixture`
def mock_embeddings():
    mock = Mock()
    mock.embed_query.return_value = [0.1, 0.2, 0.3]
    return mock


`@pytest.fixture`
def mock_qdrant_client():
    return AsyncMock()


`@pytest.fixture`
def patch_qdrant_and_sparse(mock_qdrant_client):
    with (
        patch(
            "app.services.qdrant.schema_search.QdrantSetup.get_async_client",
            return_value=mock_qdrant_client,
        ),
        patch("app.services.qdrant.schema_search.generate_sparse_vector") as mock_sparse,
    ):
        mock_sparse.return_value = models.SparseVector(indices=[1, 2], values=[0.5, 0.5])
        yield mock_qdrant_client

Also applies to: 57-70, 87-100, 117-130, 147-160

chat-server/app/workflow/prompts/multi_dataset_prompts/analyze_query_prompt.py (1)

188-191: Inconsistent use of langsmith_compatible formatter.

The template path (line 172) applies langsmith_compatible() to system_content, but the direct message path (line 189) uses the raw system_content. This could lead to different prompt behavior depending on which path is used.

Proposed fix for consistency
     return [
-        SystemMessage(content=system_content),
+        SystemMessage(content=langsmith_compatible(system_content)),
         HumanMessage(content=human_content),
     ]
chat-server/scripts/scrape_duckdb_docs.py (1)

112-138: Replace progress print statements with structured logs.

Use the project’s JSON logger for progress and completion messages to keep observability consistent.
As per coding guidelines: Use JSON structured logging with context for observability.

🔧 Suggested update
-        print(f"Scraping [{len(self.visited_urls)}/{self.max_pages}]: {url}")
+        logger.info(
+            "duckdb_docs_scrape_progress",
+            extra={"current": len(self.visited_urls), "max_pages": self.max_pages, "url": url},
+        )
@@
-        print(f"Scraped {len(self.scraped_content)} pages")
+        logger.info("duckdb_docs_scrape_complete", extra={"pages": len(self.scraped_content)})
@@
-        print(f"Saved {len(self.scraped_content)} pages to {output_file}")
+        logger.info(
+            "duckdb_docs_saved",
+            extra={"pages": len(self.scraped_content), "path": str(output_file)},
+        )
chat-server/scripts/index_duckdb_docs.py (1)

19-72: Use the structured logger instead of print for observability.
Replacing print statements with the app logger keeps output consistent and JSON-structured across the chat-server.

As per coding guidelines: Use JSON structured logging with context for observability.

chat-server/app/services/qdrant/vector_store.py (1)

47-49: Prefer next(iter(...)) over single-element slice for efficiency.

Using list(...)[0] creates an unnecessary intermediate list. The generator-based approach is more idiomatic and efficient.

♻️ Proposed fix
 def generate_sparse_vector(text: str) -> models.SparseVector:
     sparse_model = SparseModelManager.get_model()
-    result = list(sparse_model.embed([text]))[0]
+    result = next(iter(sparse_model.embed([text])))
     return models.SparseVector(indices=result.indices.tolist(), values=result.values.tolist())
chat-server/tests/unit/test_column_value_matching.py (1)

85-90: Consider using underscore prefix for unused unpacked variables.

The match_source and error_message variables are unpacked but never used. Prefixing with underscore communicates intent and silences linter warnings.

♻️ Proposed fix
-    similar_values, match_source, error_message = await find_similar_values(
+    similar_values, _match_source, _error_message = await find_similar_values(
         "foo", "name", "t", estimated_size=1000
     )
chat-server/app/workflow/graph/single_dataset_graph/node/execute_sql.py (1)

52-57: Redundant org_id extraction inside loop.

org_id is already extracted at lines 22-24 before entering the loop. The re-extraction on line 54 is redundant but harmless.

♻️ Proposed fix - remove redundant extraction
         for sql_info in query_result.single_dataset_query_result.sql_queries:
             try:
-                org_id = config.get("metadata", {}).get("org_id")
                 result_data, truncated_result_data = await execute_sql_with_full_and_truncated(
                     query=sql_info.sql_query, org_id=org_id
                 )
chat-server/app/workflow/prompts/multi_dataset_prompts/identify_datasets_prompt.py (1)

126-131: Avoid mutable default arguments.

Using mutable default arguments ([]) can lead to unexpected behavior if the function modifies them. While this function doesn't mutate the lists, using None with a default is the safer pattern.

♻️ Proposed fix
 def format_identify_datasets_input(
     user_query: str,
-    relevant_dataset_schemas: list[DatasetSchema] = [],
-    semantic_searched_datasets: list[DatasetSchema] = [],
+    relevant_dataset_schemas: list[DatasetSchema] | None = None,
+    semantic_searched_datasets: list[DatasetSchema] | None = None,
     validation_result: str | None = None,
 ) -> dict:
+    if relevant_dataset_schemas is None:
+        relevant_dataset_schemas = []
+    if semantic_searched_datasets is None:
+        semantic_searched_datasets = []
     input_str = f"USER QUERY: {user_query}"
chat-server/app/workflow/prompts/multi_dataset_prompts/generate_subqueries_prompt.py (1)

8-79: Document template placeholders and expected kwargs.
The prompt now relies on {context_info} and {user_input}, but the function doesn’t describe these variables. Add a short docstring or inline note listing required kwargs and template parameters.

📌 Suggested docstring
 def generate_subqueries_prompt(**kwargs) -> list | ChatPromptTemplate:
+    """
+    Build the subquery-decomposition prompt.
+
+    Template variables:
+      - context_info: Optional project/dataset context text.
+      - user_input: Raw user query.
+    """
     prompt_template = kwargs.get("prompt_template", False)

As per coding guidelines: Document prompt templates and parameters in prompt files.

chat-server/tests/scripts/reset_and_reindex_collection.py (1)

20-25: Remove unused # noqa: E402 directives.
Ruff reports these as unused, so they can be dropped to keep lint clean.

🧹 Suggested cleanup
-from qdrant_client import QdrantClient, models  # noqa: E402
+from qdrant_client import QdrantClient, models
@@
-from app.core.config import settings  # noqa: E402
-from app.core.log import custom_logger as logger  # noqa: E402
-from app.services.qdrant.qdrant_setup import QdrantSetup  # noqa: E402
-from app.services.qdrant.vector_store import (  # noqa: E402
+from app.core.config import settings
+from app.core.log import custom_logger as logger
+from app.services.qdrant.qdrant_setup import QdrantSetup
+from app.services.qdrant.vector_store import (
     generate_sparse_vector,
 )
chat-server/app/utils/olap/__init__.py (1)

8-15: Sort __all__ to satisfy RUF022.

Keeps export lists deterministic and aligns with Ruff’s expectation.

🧹 Suggested change
 __all__ = [
-    "OlapQueryBuilder",
-    "DuckDBQueryBuilder",
     "ClickHouseQueryBuilder",
+    "DuckDBQueryBuilder",
+    "OlapQueryBuilder",
     "get_query_builder",
-    "is_duckdb_family",
     "is_clickhouse_family",
+    "is_duckdb_family",
 ]
chat-server/app/utils/olap/factory.py (1)

3-8: Use structured custom_logger instead of stdlib logging.

Keeps logging consistent with the rest of the chat server’s structured logging approach.

🧾 Suggested change
-import logging
-
-from app.core.config import settings
+from app.core.config import settings
+from app.core.log import custom_logger as logger
 from app.utils.olap.base import OlapQueryBuilder
-
-logger = logging.getLogger(__name__)

As per coding guidelines: Use JSON structured logging with context for observability.

chat-server/tests/unit/test_match_columns.py (1)

189-258: Consider adding assertion for _regenerate_fuzzy_values call arguments.

The retry test verifies the outcome but doesn't assert that _regenerate_fuzzy_values was called with expected arguments. This could help catch regressions if the function signature changes.

💡 Proposed enhancement
         with (
             patch(
                 "app.workflow.graph.sql_planner_graph.match_columns.match_column_values",
                 new_callable=AsyncMock,
                 return_value=mock_column_mappings,
             ),
             patch(
                 "app.workflow.graph.sql_planner_graph.match_columns.validate_match_relevance",
                 new_callable=AsyncMock,
                 return_value=mock_column_mappings,
             ),
             patch(
                 "app.workflow.graph.sql_planner_graph.match_columns._regenerate_fuzzy_values",
                 new_callable=AsyncMock,
                 return_value=mock_regenerate_response,
-            ),
+            ) as mock_regenerate,
         ):
             state = {
                 "user_query": "show finance department",
                 "messages": [],
                 "multi_datasets_info": sample_multi_datasets_info,
                 "match_columns_retry_count": 0,
             }

             result = await match_columns(state, mock_config)

             assert result["match_columns_retry_count"] == 1
             assert "multi_datasets_info" in result
             assert len(result["messages"]) == 1
             assert "Retrying" in result["messages"][0].content
+            mock_regenerate.assert_called_once()
chat-server/app/workflow/graph/single_dataset_graph/node/sql_agent.py (2)

57-68: Consider using f-string conversion flag for cleaner formatting.

The static analysis suggests using an explicit conversion flag. While minor, using {e!s} is more explicit than {str(e)}.

♻️ Proposed fix
     except Exception as e:
-        error_message = f"Error in SQL planner agent: {str(e)}"
+        error_message = f"Error in SQL planner agent: {e!s}"
         single_dataset_result.error = error_message

63-68: Consider passing config parameter to adispatch_custom_event for proper config propagation in LangGraph workflows.

While config is optional in the adispatch_custom_event signature, passing it explicitly when available (as it is in this function) ensures proper context propagation in LangGraph-based workflows. The config: RunnableConfig parameter is already available in scope on line 13, so update the call at lines 63-68 to:

await adispatch_custom_event(
    "gopie-agent",
    {
        "content": "Error planning SQL queries",
    },
    config=config,
)

Apply the same change to other adispatch_custom_event calls in this file and similar files in the workflow directory.

chat-server/app/utils/olap/base.py (1)

26-29: Consider documenting the expected format of the pct parameter.

The pct parameter is typed as str across multiple methods (build_sample_query, build_large_table_ilike_query, build_large_table_levenshtein_query). Based on the usage in generate_schema.py, it's a formatted percentage string (e.g., "0.001234"). Adding this to the docstring would help implementers.

📝 Proposed docstring enhancement
     `@abstractmethod`
     def build_sample_query(self, table_name: str, pct: str, limit: int) -> str:
-        """Return query with sampling for large tables."""
+        """Return query with sampling for large tables.
+
+        Args:
+            table_name: Name of the table to sample
+            pct: Sampling percentage as a string (e.g., "0.001234")
+            limit: Maximum number of rows to return
+        """
         pass

Also applies to: 48-58

chat-server/app/workflow/graph/single_dataset_graph/types.py (1)

36-36: Consider using more specific types for list fields.

The prev_sql_queries, sql_queries, and explanations fields are typed as list | None or list[str] | None. For better type safety and IDE support, consider using explicit element types.

💡 Proposed type refinement
 class State(TypedDict):
     messages: Annotated[list[BaseMessage], add_messages]
     dataset_id: str | None
     user_query: str | None
     query_result: QueryResult
     dataset_info: SingleDatasetInfo
-    prev_sql_queries: list | None
+    prev_sql_queries: list[str] | None
     validation_result: str | None
     recommendation: str
     sql_queries: list[str] | None
     explanations: list[str] | None

Also applies to: 39-40

chat-server/app/tool_utils/tools/list_datasets.py (1)

79-85: Avoid mutable default arguments.

Using mutable default arguments (list[str] = []) is a known Python anti-pattern that can cause unexpected behavior. Additionally, config: RunnableConfig = None should use Optional for type correctness.

♻️ Proposed fix
 `@tool`
 async def get_all_datasets(
     status_message: str = "",
-    project_ids: list[str] = [],
-    dataset_ids: list[str] = [],
-    config: RunnableConfig = None,
+    project_ids: Optional[list[str]] = None,
+    dataset_ids: Optional[list[str]] = None,
+    config: Optional[RunnableConfig] = None,
 ) -> str:
     """
     ...
     """
-    if not project_ids and not dataset_ids:
+    if not project_ids:
+        project_ids = []
+    if not dataset_ids:
+        dataset_ids = []
+
+    if not project_ids and not dataset_ids:
chat-server/app/workflow/graph/sql_planner_graph/generate_sql.py (1)

86-130: Condition check at line 90 could be simplified.

The check len(sql_queries) == 0 is more idiomatically written as not sql_queries in Python.

♻️ Minor style improvement
-        if non_sql_response and len(sql_queries) == 0:
+        if non_sql_response and not sql_queries:
chat-server/app/workflow/prompts/generate_sql_prompt.py (2)

20-134: Add prompt/parameter documentation for this prompt module.

There’s no docstring describing the template’s purpose, required kwargs, and expected input sections. Add brief module/function docs so the prompt contract is explicit and maintainable.

📝 Suggested docstrings
+"""Prompt templates for SQL generation and input formatting."""
+
 def generate_sql_prompt(**kwargs) -> list[BaseMessage] | ChatPromptTemplate:
+    """Build the SQL-generation prompt.
+
+    Args:
+        prompt_template: If True, return ChatPromptTemplate.
+        input: Preformatted input string from format_generate_sql_input().
+    """
     prompt_template = kwargs.get("prompt_template", False)
     input_content = kwargs.get("input", "")
@@
 def format_generate_sql_input(
@@
-) -> dict:
+) -> dict:
+    """Format user query + dataset context into a single input string."""
As per coding guidelines: Document prompt templates and parameters in prompt files.

124-134: Avoid unused kwargs in format_generate_sql_input.

**kwargs is unused (ruff ARG001). Either remove it or rename to **_kwargs to explicitly mark it as unused.

♻️ Minimal lint fix
 def format_generate_sql_input(
@@
-    **kwargs,
+    **_kwargs,
 ) -> dict:
chat-server/app/workflow/graph/single_dataset_graph/node/prepare_data.py (1)

30-74: Reuse the shared CSV conversion helper to avoid drift.

convert_rows_to_csv already exists in app.utils.csv_utils. Duplicating logic here risks divergence over time; prefer importing the shared helper.

♻️ Example delegation
 def convert_rows_to_csv(rows: list[dict]) -> str:
-    """
-    Convert a list of dictionaries into a CSV-formatted string with special handling for certain cell values.
-    ...
-    """
-    if not rows:
-        return ""
-
-    output = io.StringIO()
-    writer = csv.writer(output)
-    ...
-    return output.getvalue()
+    """Delegate to the shared CSV formatter for consistent behavior."""
+    from app.utils.csv_utils import convert_rows_to_csv as _convert_rows_to_csv
+    return _convert_rows_to_csv(rows)

Comment thread chat-server/app/api/v1/routers/dataset_upload.py Outdated
Comment on lines 14 to 17
 def exception(self, msg, *args, **kwargs):
-    self._logger.exception(msg, *args, stack_info=True, **kwargs)
+    self._logger.critical(msg, *args, stack_info=True, **kwargs)
     if self.dev_mode:
         raise

⚠️ Potential issue | 🟠 Major

Loss of exception traceback information.

The change from logger.exception() to logger.critical() with stack_info=True loses the actual exception traceback:

  • stack_info=True logs the call stack (where the log statement was invoked)
  • exc_info=True (used internally by logger.exception()) logs the exception traceback (the actual error and its origin)

This means when an exception occurs, you'll see where logger.exception() was called, but not the actual exception details and traceback that caused the error.

🐛 Proposed fix to preserve exception traceback
 def exception(self, msg, *args, **kwargs):
-    self._logger.critical(msg, *args, stack_info=True, **kwargs)
+    self._logger.critical(msg, *args, exc_info=True, **kwargs)
     if self.dev_mode:
         raise

If you need both the call stack and exception info, use exc_info=True, stack_info=True.

🤖 Prompt for AI Agents
In `@chat-server/app/core/log.py` around lines 14-17, the custom logger method
exception currently calls self._logger.critical(..., stack_info=True), which
records the call stack but loses the actual exception traceback; update the
exception method to pass exc_info=True (keeping stack_info=True if desired) so
the logged output includes the exception traceback, and preserve the existing
behavior of raising when self.dev_mode is true.
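The exc_info distinction is easy to confirm with plain stdlib logging (no project code involved):

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
log = logging.getLogger("exc_demo")
log.addHandler(handler)
log.setLevel(logging.DEBUG)

try:
    1 / 0
except ZeroDivisionError:
    # exc_info=True attaches the exception traceback to the log record;
    # stack_info=True alone would only record the call stack at this line.
    log.critical("failed", exc_info=True)

output = stream.getvalue()
```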

Comment on lines +28 to +37
row_count: Optional[int] = None
columns: Optional[list[ColumnDetails]] = None
size: Optional[int] = None
file_path: Optional[str] = None
created_at: Optional[str] = None
updated_at: Optional[str] = None
created_by: Optional[str] = None
updated_by: Optional[str] = None
custom_prompt: Optional[str] = None
sql_query: Optional[str] = None

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's find where DatasetDetails is defined and used
rg -n "class DatasetDetails|DatasetDetails" --type py chat-server/ | head -20

Repository: factly/gopie

Length of output: 769


🏁 Script executed:

# Find all accesses to the optional fields mentioned in the review
rg -n "\.columns\s*[^=]|\.row_count\s*[^=]|\.size\s*[^=]|\.file_path\s*[^=]" --type py chat-server/app/ -A2 -B2 | head -50

Repository: factly/gopie

Length of output: 5588


🏁 Script executed:

# Look for iteration or direct access patterns on these fields without null checks
rg -n "for .* in .*\.columns|\.columns\[|\w+\.row_count \+|\w+\.row_count -" --type py chat-server/app/

Repository: factly/gopie

Length of output: 1470


🏁 Script executed:

# Check the context around the unsafe column iteration in regenerate_fuzzy_values_prompt.py
sed -n '100,120p' chat-server/app/workflow/prompts/multi_dataset_prompts/regenerate_fuzzy_values_prompt.py

Repository: factly/gopie

Length of output: 1060


🏁 Script executed:

# Check the context in dataset_info.py around line 135
sed -n '130,140p' chat-server/app/services/gopie/dataset_info.py

Repository: factly/gopie

Length of output: 583


🏁 Script executed:

# Let's look at the full DatasetDetails model definition to confirm the field types
sed -n '20,50p' chat-server/app/models/data.py

Repository: factly/gopie

Length of output: 890


Fix unsafe access to optional columns field in DatasetDetails consumers.

The columns: Optional[list[ColumnDetails]] field can now be None, but multiple locations iterate over it without null checks:

  • chat-server/app/workflow/prompts/multi_dataset_prompts/regenerate_fuzzy_values_prompt.py:109, 113 - List comprehension and generator expression iterate over matching_schema.columns without checking for None
  • chat-server/app/services/gopie/dataset_info.py:135 - Direct for-loop over schema.columns without null guard

Add null checks before any iteration or access to columns. Similarly, verify that row_count usage in calculations and file_path access also handle None values.

🤖 Prompt for AI Agents
In `@chat-server/app/models/data.py` around lines 28 - 37, DatasetDetails.columns
is now Optional and can be None; update all consumers to guard against None
before iterating or accessing it — e.g., in regenerate_fuzzy_values_prompt.py
where matching_schema.columns is used in list/generator comprehensions and in
dataset_info.py where schema.columns is looped, replace direct iterations with a
null-safe form (check matching_schema.columns is not None or use a fallback like
(matching_schema.columns or [])). Also add null checks before using row_count in
calculations and before using file_path for file operations so you don't assume
they are present (guard expressions around row_count and file_path where used).
Ensure you update usages referencing DatasetDetails, matching_schema.columns,
schema.columns, row_count, and file_path accordingly.
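The null-safe pattern in isolation (dataclass stand-ins for the project's Pydantic models):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnDetails:  # simplified stand-in for the project's model
    column_name: str


@dataclass
class DatasetDetails:
    columns: Optional[list[ColumnDetails]] = None


def column_names(schema: DatasetDetails) -> list[str]:
    # `or []` makes the iteration safe when columns is None
    return [col.column_name for col in (schema.columns or [])]
```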

Comment on lines 34 to 43
 try:
-    async with http_session.get(url, headers=headers) as response:
-        data = await response.json()
-        return DatasetDetails(**data)
+    async with await client.get(path) as response:
+        response_data = await response.json()
+        data = response_data["data"] if is_view else response_data
+        if is_view and data.get("sql_query"):
+            data["description"] = f"{data['description']}\n\nSQL Query:\n{data['sql_query']}"
+        return DatasetDetails(**data)
 except Exception as e:
     logger.exception(f"Error getting dataset info: {e!s}")
     raise e

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

Python raise vs raise e exception traceback difference

💡 Result:

Inside an except … as e: block:

  • raise (bare raise) re-raises the currently handled (“active”) exception without changing its traceback (so you keep the original stack frames showing where it first failed). This is the intended re-raise form. [1]

  • raise e raises the exception object e via the “raise an exception object” form of raise (i.e., it’s treated like raising an explicit exception expression). Practically, this changes what traceback you see by making the re-throw point (raise e) the reported raise location (often obscuring the original failure site compared to bare raise). [1][2]

If you’re converting/wrapping exceptions:

  • Prefer raise NewError(...) from e to keep a clear chain (__cause__) from the original to the new exception. [1][3]
  • Use raise NewError(...) from None to suppress the original context in the default traceback output. [3]

Sources:
[1] Python Language Reference – raise statement (re-raise vs raising an explicit expression) https://docs.python.org/3/reference/simple_stmts.html#the-raise-statement
[2] Discussion/examples of traceback differences between raise and raise e https://stackoverflow.com/questions/36153805/difference-between-raise-and-raise-e
[3] PEP 409 / exception context suppression (raise ... from None) https://peps.python.org/pep-0409/
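Both wrapping forms cited above are easy to verify in plain Python: raise ... from e records the chain on __cause__, while from None sets __suppress_context__:

```python
def wrap_with_cause():
    try:
        raise ValueError("original")
    except ValueError as e:
        raise RuntimeError("wrapped") from e


def wrap_without_context():
    try:
        raise ValueError("original")
    except ValueError:
        raise RuntimeError("wrapped") from None


try:
    wrap_with_cause()
except RuntimeError as exc:
    chained_cause = exc.__cause__  # the original ValueError

try:
    wrap_without_context()
except RuntimeError as exc:
    suppressed = exc.__suppress_context__  # True: context hidden in tracebacks
```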


🏁 Script executed:

# Check if the file exists and examine the actual code at lines 34-43 and 62-68
if [ -f "chat-server/app/services/gopie/dataset_info.py" ]; then
  echo "=== Lines 34-43 ==="
  sed -n '34,43p' chat-server/app/services/gopie/dataset_info.py
  echo ""
  echo "=== Lines 62-68 ==="
  sed -n '62,68p' chat-server/app/services/gopie/dataset_info.py
else
  echo "File not found at expected path"
  # Try to find it
  find . -name "dataset_info.py" -type f 2>/dev/null
fi

Repository: factly/gopie

Length of output: 810


Use bare raise to preserve the original exception traceback.

Inside the except blocks at lines 34–43 and 62–68, raise e changes the reported raise location and obscures the original traceback. Since logger.exception() already captures the full exception details, use bare raise instead to maintain the complete call stack for debugging and error diagnostics.

🔧 Suggested fix
     except Exception as e:
         logger.exception(f"Error getting dataset info: {e!s}")
-        raise e
+        raise

Also applies to: 62-68

🧰 Tools
🪛 Ruff (0.14.14)

[warning] 43-43: Use raise without specifying exception name

Remove exception name

(TRY201)

🤖 Prompt for AI Agents
In `@chat-server/app/services/gopie/dataset_info.py` around lines 34-43,
replace the re-raising pattern "raise e" with a bare "raise" so the original
traceback is preserved; update both except Exception as e blocks (the one that
logs via logger.exception("Error getting dataset info...") and the similar
block later in the file) to log with logger.exception(...) and then re-raise
with a bare raise.

Comment on lines +86 to 88
dense_vector = embeddings.embed_query(user_query)
sparse_vector = generate_sparse_vector(user_query)


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -name "schema_search.py" -type f 2>/dev/null | head -20

Repository: factly/gopie

Length of output: 106


🏁 Script executed:

# Check for requirements files to determine langchain version
find . -name "requirements*.txt" -o -name "pyproject.toml" -o -name "setup.py" | head -20

Repository: factly/gopie

Length of output: 153


🏁 Script executed:

# Also check the actual file content around lines 86-88
fd schema_search.py --type f | head -1 | xargs cat -n 2>/dev/null | sed -n '75,100p'

Repository: factly/gopie

Length of output: 1155


🏁 Script executed:

# Get the function signature and broader context
cat -n ./chat-server/app/services/qdrant/schema_search.py | sed -n '60,90p'

Repository: factly/gopie

Length of output: 1311


🏁 Script executed:

# Check the pyproject.toml for langchain/langchain-openai version
cat ./chat-server/pyproject.toml | head -100

Repository: factly/gopie

Length of output: 1748


🏁 Script executed:

# Search for aembed_query usage in the codebase to see if it's already used elsewhere
rg -i "aembed_query" --type py

Repository: factly/gopie

Length of output: 650


🌐 Web query:

langchain-openai OpenAIEmbeddings aembed_query async method version 0.3.9

💡 Result:

In langchain-openai==0.3.9, OpenAIEmbeddings includes an async aembed_query method.

  • Signature: async aembed_query(text: str, **kwargs) -> list[float] (returns a single embedding vector for one input string) [1].
  • Typical use:
    from langchain_openai import OpenAIEmbeddings
    
    embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
    vec = await embeddings.aembed_query("The meaning of life is 42")
    [1]

Implementation-wise, aembed_query is a thin wrapper that calls aembed_documents([text]) and returns the first result [2].

Sources: LangChain OpenAIEmbeddings reference docs [1], LangChain source module showing aembed_query implementation [2].


🏁 Script executed:

# Check if there are other async embedding calls in the codebase as reference
rg "await.*embed" --type py -A 2 -B 2

Repository: factly/gopie

Length of output: 3850


Use the async aembed_query method to avoid blocking the event loop. The search_schemas function is async (line 68), but embed_query is a synchronous call that will block at line 86. Use await embeddings.aembed_query(user_query) instead. The method is available in langchain-openai ≥ 0.3.9 and is already used elsewhere in the codebase (e.g., health.py).

Suggested fix
-        dense_vector = embeddings.embed_query(user_query)
+        dense_vector = await embeddings.aembed_query(user_query)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
dense_vector = embeddings.embed_query(user_query)
sparse_vector = generate_sparse_vector(user_query)
dense_vector = await embeddings.aembed_query(user_query)
sparse_vector = generate_sparse_vector(user_query)
🤖 Prompt for AI Agents
In `@chat-server/app/services/qdrant/schema_search.py` around lines 86 - 88, The
async function search_schemas is calling the blocking embeddings.embed_query
which will block the event loop; change the call to the async variant await
embeddings.aembed_query(user_query) and assign its result back to dense_vector,
leaving generate_sparse_vector(user_query) as-is; ensure the embeddings object
supports aembed_query (as used elsewhere, e.g., health.py) and update any nearby
error handling or type expectations for dense_vector if needed.
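
As a general pattern, when a client exposes no native async method, a blocking call can be pushed off the event loop with `asyncio.to_thread`; here `aembed_query` is the right fix, but the sketch below (all names hypothetical, with a stand-in for the embedding call) shows the fallback:

```python
import asyncio
import time

def blocking_embed(text: str) -> list[float]:
    # Stand-in for a synchronous embedding call (hypothetical).
    time.sleep(0.05)  # simulates network latency that would block the loop
    return [float(len(text))]

async def main() -> list[float]:
    # asyncio.to_thread runs the blocking call in a worker thread, keeping
    # the event loop free; a native async method such as aembed_query is
    # preferable whenever the client provides one.
    return await asyncio.to_thread(blocking_embed, "hello")

result = asyncio.run(main())
print(result)  # -> [5.0]
```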

Comment on lines +5 to +6
import httpx
from bs4 import BeautifulSoup

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify httpx is declared in dependency manifests.
set -euo pipefail

deps=$(fd -a -t f 'pyproject.toml|poetry.lock|Pipfile|setup.cfg|requirements.*|constraints.*' || true)
if [ -z "$deps" ]; then
  echo "No dependency manifests found."
  exit 0
fi

echo "=== Found dependency manifest files ==="
echo "$deps"
echo
echo "=== Checking for httpx declaration ==="
rg -n 'httpx' $deps || echo "httpx not found in dependency manifests"

Repository: factly/gopie

Length of output: 324


🏁 Script executed:

#!/bin/bash
# Check for BeautifulSoup/beautifulsoup4 in dependency manifests
rg -n 'beautifulsoup|bs4' /home/jailuser/git/chat-server/pyproject.toml /home/jailuser/git/pulumi/requirements.txt /home/jailuser/git/scripts/starter-project/requirements.txt 2>/dev/null || echo "beautifulsoup4/bs4 not found in dependency manifests"

Repository: factly/gopie

Length of output: 132


Add httpx to chat-server/pyproject.toml dependencies.

The script imports httpx at line 5 but it is not declared in the project's dependency manifest. This will cause an ImportError in CI and production environments. Add httpx to the dependencies section of pyproject.toml alongside the existing beautifulsoup4 declaration.

🤖 Prompt for AI Agents
In `@chat-server/scripts/scrape_duckdb_docs.py` around lines 5 - 6, The project
imports httpx in scrape_duckdb_docs.py but pyproject.toml lacks that dependency;
update the pyproject.toml dependencies section to include "httpx" (alongside the
existing "beautifulsoup4") so CI and production environments can install
it—ensure the exact package name "httpx" is added under the same dependencies
list used for other runtime packages.

Comment on lines +16 to +23
async def _fetch_page(self, client: httpx.AsyncClient, url: str) -> str | None:
try:
response = await client.get(url, timeout=30.0)
response.raise_for_status()
return response.text
except Exception as e:
print(f"Failed to fetch {url}: {e}")
return None

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, check if the file exists and examine the relevant section
find . -name "scrape_duckdb_docs.py" -type f

Repository: factly/gopie

Length of output: 99


🏁 Script executed:

# Look for logging configuration and patterns in the chat-server directory
find chat-server -type f -name "*.py" | head -20

Repository: factly/gopie

Length of output: 792


🏁 Script executed:

# Search for logging imports and usage patterns in the codebase
rg "import logging" chat-server/ -t py -c

Repository: factly/gopie

Length of output: 125


🏁 Script executed:

# Search for JSON structured logging patterns
rg "json.*log|structlog|pydantic.*log" chat-server/ -t py -A 2

Repository: factly/gopie

Length of output: 38


🏁 Script executed:

# Examine the logging configuration
cat -n chat-server/app/core/log.py

Repository: factly/gopie

Length of output: 1941


🏁 Script executed:

# Examine the actual file content
cat -n chat-server/scripts/scrape_duckdb_docs.py

Repository: factly/gopie

Length of output: 6154


🏁 Script executed:

# Look for other examples of error handling in the codebase
rg "httpx\." chat-server/ -t py -B 2 -A 5 | head -60

Repository: factly/gopie

Length of output: 2021


🏁 Script executed:

# Check for other error handling patterns
rg "except.*Error|except.*Exception" chat-server/ -t py -B 2 -A 3 | head -80

Repository: factly/gopie

Length of output: 6095


🏁 Script executed:

# Check if scripts are expected to follow the same logging standards
rg "from app.core.log|CustomLogger" chat-server/scripts/ -t py

Repository: factly/gopie

Length of output: 38


🏁 Script executed:

# Look for other scripts to see their logging patterns
ls -la chat-server/scripts/

Repository: factly/gopie

Length of output: 308


🏁 Script executed:

# Verify if there's JSON logging anywhere in the codebase
rg "json.*format|JSONFormatter|structlog" chat-server/ -t py

Repository: factly/gopie

Length of output: 38


🏁 Script executed:

# Verify httpx exception types exist
python3 << 'EOF'
import httpx
print("HTTPStatusError:", hasattr(httpx, 'HTTPStatusError'))
print("RequestError:", hasattr(httpx, 'RequestError'))
print("HTTPError:", hasattr(httpx, 'HTTPError'))
EOF

Repository: factly/gopie

Length of output: 112


Avoid broad Exception + print; use proper error handling with specific httpx exceptions and logging.

Catching all exceptions hides real bugs and masks errors. Handle httpx.HTTPStatusError (status codes) and httpx.RequestError (transport errors) explicitly, and emit logs instead of print statements. This aligns with the error handling and logging requirements in coding guidelines.

🔧 Proposed fix
 import asyncio
 import json
+import logging
 from pathlib import Path

 import httpx
 from bs4 import BeautifulSoup

+logger = logging.getLogger(__name__)
+
 class DuckDBDocsScraper:
     async def _fetch_page(self, client: httpx.AsyncClient, url: str) -> str | None:
         try:
             response = await client.get(url, timeout=30.0)
             response.raise_for_status()
             return response.text
-        except Exception as e:
-            print(f"Failed to fetch {url}: {e}")
+        except httpx.HTTPStatusError as e:
+            logger.warning(
+                f"Failed to fetch {url}: HTTP {e.response.status_code}",
+                extra={"url": url, "status_code": e.response.status_code},
+            )
+            return None
+        except httpx.RequestError as e:
+            logger.error(
+                f"Failed to fetch {url}: {e}",
+                extra={"url": url, "error": str(e)},
+            )
             return None
🧰 Tools
🪛 Ruff (0.14.14)

[warning] 20-20: Consider moving this statement to an else block

(TRY300)


[warning] 21-21: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
In `@chat-server/scripts/scrape_duckdb_docs.py` around lines 16 - 23, The
_fetch_page function currently catches all Exceptions and uses print; change it
to catch httpx.RequestError for transport-level failures around await
client.get(url, ...) and httpx.HTTPStatusError for non-2xx responses raised by
response.raise_for_status(), log errors via a module logger (e.g., logger =
logging.getLogger(__name__)) instead of print and include the URL and exception
details in the log (use logger.error or logger.exception), and still return
response.text on success or None on handled errors; keep the function signature
and return semantics the same.

Comment on lines +401 to +420
def _normalize_message(self, message: Any) -> tuple[str, str]:
"""Extract role and content from a message dict; handle missing/alternate keys."""
if isinstance(message, dict):
role = message.get("role") or message.get("type") or "user"
if role == "human":
role = "user"
content = (
message.get("content")
or message.get("text")
or message.get("message")
or ""
)
content = str(content) if content is not None else ""
else:
role = "user"
content = str(message) if message is not None else ""
content_escaped = (content or "").replace("\\", "\\\\").replace(
'"', '\\"'
).replace("\n", "\\n").replace("\r", "")
return role, content_escaped

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

LangChain message role type values ai human system tool OpenAI mapping

💡 Result:

LangChain message role/type values (core)

LangChain’s main message classes (Python/JS) correspond to these conceptual roles: system, user, assistant, tool; plus a legacy function role for OpenAI’s older function-calling API. [1], [2], [3]

  • SystemMessage → role system [2], [3]
  • HumanMessage → role user [2], [3]
  • AIMessage (and streaming AIMessageChunk) → role assistant [2]
  • ToolMessage → role tool [2], [3]
  • FunctionMessage (legacy) → role function (legacy) [2], [3]

OpenAI role mapping (what LangChain converts to)

When LangChain converts messages to OpenAI Chat Completions message params, it maps to OpenAI-style roles: user, assistant, system, tool, function, developer. [4], [5]

Typical mapping:

  • HumanMessage → {"role":"user", ...} [2], [3]
  • AIMessage → {"role":"assistant", ...} (and tool calls are carried on the assistant message) [6]
  • SystemMessage → {"role":"system", ...} [6]
  • ToolMessage → {"role":"tool", ...} (includes tool_call_id; may include name) [3], [6]
  • FunctionMessage (legacy) → {"role":"function", ...} [2], [3]

Note on developer: LangChain’s OpenAI adapter mentions model-specific mappings such as converting system → developer for certain “reasoning models,” and generally supports the developer role in its conversion layer. [4], [5]

Built-in converters

  • Python: convert_to_openai_messages(...) shows the concrete OpenAI dicts produced (including assistant tool_calls and tool messages). [6]
  • JS (@langchain/openai): convertMessagesToCompletionsMessageParams / convertStandardContentMessageToCompletionsMessage document the role mapping behavior (including developer). [4], [5]

Sources: LangChain concepts/docs and reference pages for message roles/types and OpenAI conversion utilities. [1]-[6]


🏁 Script executed:

# First, let's examine the actual file to understand the context
cat -n chat-server/tests/e2e/utils/generate_app_cases.py | sed -n '380,430p'

Repository: factly/gopie

Length of output: 2662


🏁 Script executed:

# Search for how messages are used in this file and if role normalization matters
rg -n "role" chat-server/tests/e2e/utils/generate_app_cases.py | head -20

Repository: factly/gopie

Length of output: 838


🏁 Script executed:

# Check if there are any tests or usages that show what roles are expected
rg -n "_normalize_message" chat-server/tests/e2e/utils/generate_app_cases.py -A 5 -B 5

Repository: factly/gopie

Length of output: 1575


🏁 Script executed:

# Search for OpenAI API schema or message format requirements in the codebase
rg -n "OpenAI\|openai" chat-server/tests/e2e/utils/generate_app_cases.py

Repository: factly/gopie

Length of output: 38


🏁 Script executed:

# Search for where messages are created or come from
rg -n "messages\s*=" chat-server/tests/e2e/utils/generate_app_cases.py | head -20

Repository: factly/gopie

Length of output: 38


🏁 Script executed:

# Look for any place where "ai" role is created
rg -n '"ai"|\'ai\'' chat-server/tests/e2e/utils/generate_app_cases.py

Repository: factly/gopie

Length of output: 124


🏁 Script executed:

# Check the TestCasesList and test case structure
rg -n "class.*TestCase|messages.*:" chat-server/tests/e2e/utils/generate_app_cases.py -B 2 -A 5

Repository: factly/gopie

Length of output: 3974


🏁 Script executed:

# Search for LangChain imports and usage
rg -n "langchain|LangChain" chat-server/tests/e2e/utils/generate_app_cases.py -i

Repository: factly/gopie

Length of output: 109


Add role escaping to prevent code generation injection.

The current code escapes message content but leaves role values unescaped. Since role is inserted directly into generated Python code templates (lines 448 and 481), any special characters (quotes, backslashes) in a role value could break the output. Escape role the same way content is escaped:

         if isinstance(message, dict):
             role = message.get("role") or message.get("type") or "user"
             if role == "human":
                 role = "user"
             content = (
                 message.get("content")
                 or message.get("text")
                 or message.get("message")
                 or ""
             )
             content = str(content) if content is not None else ""
         else:
             role = "user"
             content = str(message) if message is not None else ""
+        role_escaped = str(role).replace("\\", "\\\\").replace('"', '\\"')
         content_escaped = (content or "").replace("\\", "\\\\").replace(
             '"', '\\"'
         ).replace("\n", "\\n").replace("\r", "")
-        return role, content_escaped
+        return role_escaped, content_escaped

Also consider normalizing "ai" → "assistant" if test case messages come from LangChain sources (consistent with the existing "human" → "user" normalization), though this may be defensive if the current data only produces OpenAI-standard roles.

🤖 Prompt for AI Agents
In `@chat-server/tests/e2e/utils/generate_app_cases.py` around lines 401 - 420,
The _normalize_message function currently escapes content but not role, which
can allow special chars in role to break generated Python code; update
_normalize_message to escape role the same way as content (replace backslash
with \\\\, double-quote with \\", newline with \\n, and remove \r) and return
the escaped role, and also normalize role values by mapping "human"→"user" and
"ai"→"assistant" (in addition to the existing "type" fallback) so generated
templates that inject role values use safe, normalized strings.
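
The escaping chain discussed above can be exercised directly; a short sketch (function name hypothetical) mirroring the replace sequence and checking that the escaped value round-trips through a generated double-quoted literal:

```python
def escape_for_template(value: str) -> str:
    # Mirror of the replace chain discussed above: backslashes first,
    # then double quotes, newlines, and carriage returns.
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "")
    )

raw = 'say "hi"\nline2\\end'
escaped = escape_for_template(raw)
# Embedding the escaped value in a generated double-quoted literal
# parses back to the original string (minus any \r characters).
generated = f'"{escaped}"'
assert eval(generated) == raw
print(escaped)
```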

Comment on lines +31 to +33
class TestGenerateSqlPromptFormatting:
@pytest.fixture
def mock_dataset_schema(self):

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find chat-server -name "pytest.ini" -o -name "pyproject.toml" -o -name "setup.cfg" | head -10

Repository: factly/gopie

Length of output: 82


🏁 Script executed:

ls -la chat-server/tests/unit/ 2>/dev/null | head -20

Repository: factly/gopie

Length of output: 1337


🏁 Script executed:

head -50 chat-server/tests/unit/test_generate_sql_prompt_formatting.py 2>/dev/null

Repository: factly/gopie

Length of output: 1594


🏁 Script executed:

grep -r "pytest.mark.unit" chat-server/tests/unit/ 2>/dev/null | head -5

Repository: factly/gopie

Length of output: 443


🏁 Script executed:

cat chat-server/pyproject.toml

Repository: factly/gopie

Length of output: 1748


🏁 Script executed:

cat -n chat-server/tests/unit/test_generate_sql_prompt_formatting.py

Repository: factly/gopie

Length of output: 7576


Add the unit test marker to align with project conventions.

Unit tests under chat-server/tests/unit require the pytest.mark.unit marker for reliable filtering. This file is currently missing it, while other tests in the same directory follow this requirement.

Add pytestmark = pytest.mark.unit at the module level (after the imports and docstring, as established in other test files in this directory):

🔧 Suggested change
@@
"""
Unit tests for generate_sql_prompt formatting logic, specifically verifying column value inclusion.
"""
from typing import TypedDict
from unittest.mock import MagicMock

import pytest

from app.models.data import ColumnValueMatching
from app.models.schema import DatasetSchema
from app.workflow.graph.sql_planner_graph.types import DatasetsInfo
from app.workflow.prompts.generate_sql_prompt import format_generate_sql_input

+pytestmark = pytest.mark.unit
🤖 Prompt for AI Agents
In `@chat-server/tests/unit/test_generate_sql_prompt_formatting.py` around lines
31 - 33, Add the missing module-level pytest marker by declaring pytestmark =
pytest.mark.unit near the top of the test module (after imports and any module
docstring) so this file follows the same convention as other unit tests; modify
the TestGenerateSqlPromptFormatting test module to include that pytestmark
declaration (referencing the module that contains class
TestGenerateSqlPromptFormatting).

Comment on lines +48 to +51
# Verify query_points was called with correct structure
call_args = mock_client.query_points.call_args
assert call_args.kwargs["filter"] is None
assert len(call_args.kwargs["prefetch"]) == 2 # Dense and sparse prefetch

⚠️ Potential issue | 🟡 Minor

Potential KeyError when accessing kwargs["filter"].

Based on the search_schemas implementation in the relevant code snippet, the filter parameter is passed inside each Prefetch object, not at the top level of query_points. If filter is not explicitly passed as a keyword argument, accessing call_args.kwargs["filter"] will raise a KeyError instead of returning None.

Other tests in this file (lines 77, 107, 137, 165) use the safer pattern: call_args.kwargs.get("filter").

Proposed fix
         # Verify query_points was called with correct structure
         call_args = mock_client.query_points.call_args
-        assert call_args.kwargs["filter"] is None
+        assert call_args.kwargs.get("filter") is None
         assert len(call_args.kwargs["prefetch"]) == 2  # Dense and sparse prefetch
🤖 Prompt for AI Agents
In `@chat-server/tests/unit/test_schema_search.py` around lines 48 - 51, The test
accesses query call kwargs with call_args.kwargs["filter"] which can raise
KeyError because search_schemas places filter inside each Prefetch rather than
as a top-level kwarg; update the assertion to use call_args.kwargs.get("filter")
(matching other tests) and keep the existing assertion on prefetch length
(mock_client.query_points, search_schemas, Prefetch are the symbols to
update/inspect).
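
The difference between the two access patterns can be demonstrated with a minimal mock (collection name and prefetch values are illustrative only):

```python
from unittest.mock import MagicMock

client = MagicMock()
# Simulate the call under test: no top-level `filter` kwarg, only
# `prefetch` (as search_schemas is described to do above).
client.query_points(collection_name="schemas", prefetch=["dense", "sparse"])

call_args = client.query_points.call_args
# .get() returns None for an absent kwarg instead of raising.
assert call_args.kwargs.get("filter") is None
assert len(call_args.kwargs["prefetch"]) == 2

# Direct indexing on a missing key raises KeyError:
try:
    call_args.kwargs["filter"]
except KeyError:
    print("KeyError raised")
```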

…onous source-to-destination migration with hybrid vector support.
@claude

claude Bot commented Feb 4, 2026

Claude finished @paul-tharun's task —— View job


🔍 PR Review: Feat/e2e datasets metadata

Review Progress

  • Read repository setup and guidelines
  • Review architectural changes
  • Analyze security implementations
  • Check error handling patterns
  • Evaluate performance implications
  • Assess test coverage
  • Provide comprehensive feedback

🏗️ Architecture Review

This PR introduces significant architectural improvements with a well-structured approach to NL-to-SQL processing and multi-tenant organization support.

Positive Changes

1. Clean NL-to-SQL Graph Architecture

  • Supervisor routing: Clean logic in nl_to_sql_graph/node/supervisor.py routing single vs multi-dataset queries
  • Separation of concerns: Clear split between semantic search, SQL agent, and planning
  • LangGraph patterns: Proper use of StateGraph, Command, and typed state management

2. Multi-Tenant Organization Support

  • Proper isolation: Organization ID properly threaded through GopieClient and Qdrant filters
  • Security boundaries: Clear separation using must conditions for org_id in schema search
  • Header management: Consistent X-Organization-id header handling

3. SQL Injection Prevention

  • Query builders: Safe abstraction through utils/olap/ query builders
  • Parameterized approach: SQL generation through builders rather than string interpolation
  • Validation patterns: Good use of escaped values in column matching

🚨 Critical Security Issues

1. Missing Organization ID in Semantic Search

Location: chat-server/app/workflow/graph/nl_to_sql_graph/node/semantic_search.py:20

The semantic search function calls search_schemas without passing org_id, allowing cross-tenant data leakage:

# VULNERABLE: No org_id passed
results = await search_schemas(
    user_query=user_query,
    embeddings=embeddings_model,
    dataset_ids=dataset_ids,
    project_ids=project_ids,
    # Missing: org_id=org_id
)

Fix Required: Extract org_id from config and pass it to search_schemas.

2. Division by Zero Risk

Location: chat-server/app/utils/graph_utils/table_utils.py:43

# VULNERABLE: Will crash if estimated_size is 0
sample_pct = (settings.TARGET_ROWS / estimated_size) * 100

Fix Required: Add zero-check before division.
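
A guarded version of the calculation can be sketched as follows (the TARGET_ROWS value and function name are assumed for illustration, not taken from the codebase):

```python
TARGET_ROWS = 10_000  # assumed setting

def calculate_sampling_percentage(estimated_size: int) -> float:
    # Guard against empty or unknown table sizes: fall back to 100%
    # sampling instead of dividing by zero, and cap the result at 100.
    if estimated_size <= 0:
        return 100.0
    return min((TARGET_ROWS / estimated_size) * 100, 100.0)

print(calculate_sampling_percentage(0))          # -> 100.0 (empty table)
print(calculate_sampling_percentage(1_000_000))  # -> 1.0
```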

3. Direct SQL Interpolation

Location: chat-server/app/workflow/graph/single_dataset_graph/node/prepare_data.py:121

# POTENTIAL RISK: Direct interpolation without validation
sample_data_query = f"SELECT * FROM {dataset_name} LIMIT 50"

While dataset_name comes from trusted Qdrant storage, consider defensive identifier quoting.
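
Defensive identifier quoting can use standard SQL double-quote escaping, which DuckDB accepts; a minimal sketch (helper name hypothetical):

```python
def quote_identifier(name: str) -> str:
    # Standard SQL identifier quoting (also valid in DuckDB): wrap the
    # name in double quotes and double any embedded quote characters.
    return '"' + name.replace('"', '""') + '"'

dataset_name = "my_table"  # illustrative value from trusted storage
query = f"SELECT * FROM {quote_identifier(dataset_name)} LIMIT 50"
print(query)  # -> SELECT * FROM "my_table" LIMIT 50
```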


⚠️ Error Handling Issues

1. Inconsistent Error Guards

  • execute_sql.py:34: Checks isinstance(last_message, ErrorMessage) but last_message is a list
  • sql_agent.py:25: Accesses query_result.single_dataset_query_result before null check

2. Missing Resource Cleanup

  • No explicit cleanup logic for failed operations in new graph workflows
  • Broad exception handling masks specific failure modes

3. State Validation Issues

  • prepare_data: Returns without dataset_info, yet the graph always routes to sql_agent
  • routing.py: Uses state.get("query_result", []), which can be a list, causing an AttributeError

🚀 Performance Considerations

Good Optimizations

  • Smart sampling: Table size estimation prevents performance issues on large datasets
  • Vector search optimization: Hybrid search with proper filtering
  • Connection pooling: Singleton pattern for HTTP client connections
  • Async operations: Proper async/await throughout
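
The singleton-plus-timeout idea noted here can be sketched generically; the client class below is a stand-in for an HTTP client such as httpx.AsyncClient, not the real GopieClient API:

```python
from functools import lru_cache

class FakeAsyncClient:
    # Stand-in for an HTTP client (assumption: the real client would be
    # httpx.AsyncClient); only the timeout configuration matters here.
    def __init__(self, timeout: float):
        self.timeout = timeout

@lru_cache(maxsize=1)
def get_client() -> FakeAsyncClient:
    # lru_cache yields a process-wide singleton, so connections are pooled
    # and the timeout is configured once rather than per request.
    return FakeAsyncClient(timeout=30.0)

assert get_client() is get_client()  # same instance on every call
print(get_client().timeout)  # -> 30.0
```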

⚠️ Potential Concerns

  • Multiple vector searches: DuckDB docs search adds latency to SQL planning
  • Memory usage: Complex nested state objects in high-concurrency scenarios
  • Missing timeouts: GopieClient lacks explicit timeout configuration

🧪 Test Coverage Assessment

Good Coverage

  • Existing unit tests cover GopieClient org_id handling
  • DSPy integration provides advanced evaluation
  • E2E visualization test infrastructure

Missing Tests

  • No tests for new /fetch-sql endpoint
  • No integration tests for NL-to-SQL graph workflow
  • Missing edge case tests for new graph failure scenarios
  • No tests for org_id security boundaries

📋 Code Quality Issues

1. Error Message Inconsistencies

# Vague error without context
raise ValueError("query_result is not properly initialized")

2. Magic Numbers

# Should be configurable
SELECT * FROM {table} LIMIT 200000

3. Missing Function Documentation

Several new functions lack comprehensive docstrings explaining return contracts.


🔧 Specific Recommendations

High Priority Fixes

  1. Add org_id to semantic search to prevent data leakage
  2. Add zero-check in calculate_sampling_percentage
  3. Fix error guard in execute_sql.py (list vs ErrorMessage)
  4. Add validation in sql_agent.py before accessing query_result

Medium Priority

  1. Add timeouts to GopieClient requests
  2. Improve error messages with specific context
  3. Add integration tests for new workflows
  4. Configure magic numbers via settings

Code Quality

  1. Add comprehensive docstrings to new functions
  2. Implement consistent retry logic across agents
  3. Add resource cleanup for failed operations

📊 Overall Assessment

Category     | Rating | Comments
Architecture | ⭐⭐⭐⭐⭐ | Excellent modular design
Security     | ⭐⭐⭐⚪⚪ | Good foundations, critical gaps
Performance  | ⭐⭐⭐⭐⚪ | Smart optimizations implemented
Testing      | ⭐⭐⚪⚪⚪ | Infrastructure good, coverage lacking
Code Quality | ⭐⭐⭐⭐⚪ | Clean code, room for improvement

Recommendation: This is a solid architectural improvement, but the security issue in semantic search must be fixed before merging to prevent cross-tenant data access.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@chat-server/tests/scripts/reset_and_reindex_collection.py`:
- Around line 1-33: The imports at the top of reset_and_reindex_collection.py
include unnecessary "# noqa: E402" markers (on AsyncQdrantClient, models,
settings, logger, QdrantSetup, generate_sparse_vector); remove these trailing "#
noqa: E402" annotations from those import lines so Ruff can report real
import-order issues in future, leaving the import order and the sys.path
insertion unchanged and without adding other noqa comments.
- Around line 78-86: In run_generate_sparse_vector, replace the current re-raise
"raise e" with a bare "raise" to preserve the original traceback when an
exception is caught; keep the logger.error call as-is so the error is logged but
ensure you re-raise with bare raise in the except block of
generate_sparse_vector wrapper to retain original exception context.

Comment thread chat-server/tests/scripts/reset_and_reindex_collection.py
Comment thread chat-server/tests/scripts/reset_and_reindex_collection.py
@claude

claude Bot commented Feb 5, 2026

Claude finished @paul-tharun's task —— View job


PR Review: Feat/e2e datasets metadata

I've conducted a comprehensive review of this large PR with 100 files changed. Here are my findings:

Review Checklist

  • Read repository setup and guidelines (CLAUDE.md, configs)
  • Review architectural changes (new graphs, workflows)
  • Analyze API security and validation
  • Check error handling and edge cases
  • Evaluate performance implications
  • Assess test coverage
  • Provide comprehensive feedback



…ation and synchronously initialize the sparse vector model.
@claude

claude Bot commented Feb 5, 2026

Claude finished @paul-tharun's task —— View job


🔍 PR Review: Feat/e2e datasets metadata

I've conducted a comprehensive review of this large PR with 100 files changed. Here are my findings:

Review Progress

  • Read repository setup and guidelines (CLAUDE.md, configs)
  • Review architectural changes (new graphs, workflows)
  • Analyze API security and validation
  • Check error handling and edge cases
  • Evaluate performance implications
  • Assess test coverage
  • Provide comprehensive feedback

🏗️ Architecture & Design

Positive Changes

  • Well-structured agent architecture: New nl_to_sql_graph and sql_planner_graph follow LangGraph patterns correctly
  • Separation of concerns: Clear separation between SQL planning, execution, and validation workflows
  • Type safety: Good use of Pydantic models for API validation (FetchSqlRequest, FetchSqlResponse)
  • Unified client pattern: New GopieClient centralizes API communication with proper header management

⚠️ Architectural Concerns

  • Commented code in workflows: chat-server/app/workflow/graph/sql_planner_graph/graph.py:19-30 has extensive commented conditional edges that should either be implemented or removed
  • Complex state management: The addition of multiple new graph types increases complexity without clear documentation of when each should be used

🔒 Security Analysis

Good Security Practices

  • Input validation: Proper Pydantic validation in new API endpoints (fetch_sql.py:21-22)
  • SQL injection prevention: Using parameterized queries and proper escaping (column_value_matching.py:40)
  • Header handling: Proper organization ID header management in GopieClient

🚨 Security Issues

  1. Insufficient input sanitization in column_value_matching.py:40:

    WHERE LOWER(CAST({column_name} AS VARCHAR)) LIKE '%' || LOWER('{escaped_value}') || '%'

    The escaped_value variable name suggests sanitization, but no actual escaping is shown in this snippet.

  2. Generic exception handling in fetch_sql.py:36-42:

    except Exception as e:
        logger.error(f"Error in fetch_sql: {e}")
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail="Failed to generate SQL queries, please try again.",
        ) from e

    This could mask important security errors and returns overly generic error messages to the client.


🚀 Performance Analysis

Performance Improvements

  • Smart sampling: New table size estimation and sampling logic in table_utils.py prevents performance issues on large datasets
  • Connection pooling: Using singleton pattern for HTTP client connections (SingletonAiohttp)
  • Async operations: Proper use of async/await throughout new components

⚠️ Performance Concerns

  • Multiple vector searches: New DuckDB documentation search (duckdb_docs/search.py) adds another vector search operation that could impact latency
  • Memory usage: Complex state objects with nested query results could consume significant memory in high-concurrency scenarios

🧪 Testing & Quality

Testing Infrastructure

  • Good test structure: Maintained separation between unit and e2e tests
  • DSPy integration: Advanced evaluation system with optimized evaluators

Testing Gaps

  1. No tests for new API endpoint: fetch_sql.py lacks dedicated test coverage
  2. Missing edge case tests: New graph workflows need tests for failure scenarios
  3. Integration tests: No tests covering the interaction between new nl_to_sql_graph and existing workflows

🐛 Bug & Error Handling Issues

  1. State validation inconsistency in sql_agent.py:27-28:

    if not query_result or not single_dataset_result:
        raise ValueError("query_result is not properly initialized")

    This doesn't provide context about what went wrong or how to fix it.

  2. Resource cleanup: Missing cleanup logic for failed operations in new graph workflows

  3. Retry logic: Inconsistent retry mechanisms across different agents - some have configurable retry counts, others don't
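On point 3, a generic backoff helper is one way to make retry behavior uniform across agents. This is a sketch under assumptions, not the repository's implementation:

```python
import asyncio


async def with_retries(make_call, attempts: int = 3, base_delay: float = 0.1):
    # Retry an async callable with exponential backoff; re-raise the last
    # exception once attempts are exhausted.
    last_exc: Exception | None = None
    for attempt in range(attempts):
        try:
            return await make_call()
        except Exception as exc:  # narrow the exception types in real code
            last_exc = exc
            await asyncio.sleep(base_delay * (2 ** attempt))
    raise last_exc
```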


📋 Code Quality

Good Practices

  • Type hints: Comprehensive type annotations throughout new code
  • Documentation: Good docstrings in new functions like get_duckdb_docs()
  • Code organization: Logical file structure following established patterns

⚠️ Areas for Improvement

  • Hardcoded values: Magic numbers like LIMIT 200000 in SQL queries should be configurable
  • Long functions: Some functions exceed 50 lines and could benefit from decomposition
  • Duplicate logic: Similar error handling patterns repeated across multiple files

🔧 Specific Recommendations

  1. Security:

    # In column_value_matching.py - add proper escaping
    def escape_sql_value(value: str) -> str:
        return value.replace("'", "''").replace("%", "\\%").replace("_", "\\_")
  2. Error handling:

    # In fetch_sql.py - more specific exception handling
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    except ConnectionError as e:
        raise HTTPException(status_code=503, detail="Service unavailable")
  3. Configuration:

    # Add to config.py
    DEFAULT_QUERY_LIMIT: int = 200000
    MAX_SAMPLING_PERCENTAGE: float = 5.0
  4. Testing:

    • Add unit tests for fetch_sql_router
    • Add integration tests for new graph workflows
    • Add performance tests for large dataset scenarios
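One caveat on the escaping sketch in recommendation 1: backslash-escaped % and _ only behave as literals when the LIKE predicate also declares an ESCAPE character. A fuller (still hypothetical) version:

```python
def escape_like_value(value: str) -> str:
    # Escape the escape character first, then quotes and LIKE wildcards.
    # The generated SQL must append an ESCAPE clause, e.g.:
    #   ... LIKE '%' || ? || '%' ESCAPE '\'
    return (
        value.replace("\\", "\\\\")
        .replace("'", "''")
        .replace("%", "\\%")
        .replace("_", "\\_")
    )
```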

📊 Overall Assessment

Category     | Rating | Comments
Architecture | ⭐⭐⭐⭐⚪ | Well-designed but complex
Security     | ⭐⭐⭐⚪⚪ | Good practices with some gaps
Performance  | ⭐⭐⭐⭐⚪ | Smart optimizations implemented
Testing      | ⭐⭐⚪⚪⚪ | Infrastructure good, coverage lacking
Code Quality | ⭐⭐⭐⭐⚪ | Clean code with room for improvement

Overall: This is a substantial improvement to the system with well-thought-out architectural changes. The main concerns are security gaps, insufficient testing of new features, and some complexity that could be better documented. I recommend addressing the security issues before merging.


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
web/src/lib/stores/auth-store.ts (1)

202-211: ⚠️ Potential issue | 🟡 Minor

Missing role: null in catch block of checkSession.

When session check throws an error, role is not reset to null, unlike the successful !response.ok branch at line 197. This could leave stale role data in state after a network error.

🛡️ Proposed fix
       } catch (error) {
         console.error("Session check error:", error);
         set({
           user: null,
           isAuthenticated: false,
           accessToken: null,
           organizationId: null,
+          role: null,
         });
         setGlobalAccessToken(null);
         setGlobalOrganizationId(null);
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web/src/lib/stores/auth-store.ts` around lines 202 - 211, In checkSession's
catch block the state reset omits role, leaving stale role values after errors;
update the set({...}) call inside the catch of checkSession to include role:
null (matching the !response.ok branch) so user/role/accessToken/organizationId
are all cleared on error, keeping state consistent with
setGlobalAccessToken(null) and setGlobalOrganizationId(null).
server/interfaces/http/routes/api/projects/datasets/update.go (1)

66-70: ⚠️ Potential issue | 🟡 Minor

Incorrect error message: "deleting" should be "updating".

The error response message at line 68 says "Error deleting dataset" but this is the update handler. This appears to be a copy-paste error.

🐛 Proposed fix
 		return ctx.Status(fiber.StatusInternalServerError).JSON(fiber.Map{
 			"error":   err.Error(),
-			"message": "Error deleting dataset",
+			"message": "Error updating dataset",
 			"code":    fiber.StatusInternalServerError,
 		})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/interfaces/http/routes/api/projects/datasets/update.go` around lines
66 - 70, The error response in the update handler incorrectly says "Error
deleting dataset"; update the message string to "Error updating dataset" in the
JSON returned by the update route (the block that calls
ctx.Status(fiber.StatusInternalServerError).JSON with "error": err.Error());
ensure the same handler/return (the update.go update route) uses "Error updating
dataset" so the message matches the update operation.
server/interfaces/http/routes/source/s3/create.go (2)

154-156: ⚠️ Potential issue | 🔴 Critical

Cleanup struct is missing userID and role initialization.

When the cleanup struct is initialized and later updated, userID and role are never set. However, cleanupResources uses these fields when calling h.datasetSvc.Delete(rc.datasetID, rc.orgID, rc.userID, rc.role) at line 56.

This means cleanup operations after dataset creation will pass empty/zero values for userID and role, which could cause authorization failures or unexpected behavior during error recovery.

🐛 Proposed fix to initialize userID
 	// Initialize cleanup resource object
 	cleanup := resourceCleanup{
 		tableName: res.TableName,
+		userID:    userID,
+		orgID:     orgID,
 	}

And later, only update the dataset-specific fields:

 	// Update cleanup object to include dataset info
 	cleanup.hasDataset = true
 	cleanup.datasetID = dataset.ID
-	cleanup.orgID = dataset.OrgID

Also applies to: 194-197

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/interfaces/http/routes/source/s3/create.go` around lines 154 - 156,
resourceCleanup is constructed without setting userID and role, causing
cleanupResources to call datasetSvc.Delete with empty auth fields; when creating
the cleanup variable in create.go (resourceCleanup{...}) set userID and role
from the current handler/request context (the same userID and role values used
elsewhere in this handler) so those fields are populated, and ensure subsequent
updates (the later assignments at the other initialization site around lines
194-197) only overwrite dataset-specific fields (datasetID, orgID, tableName)
and not userID/role so datasetSvc.Delete receives correct auth parameters.

37-37: ⚠️ Potential issue | 🔴 Critical

The role field is actively used and required; it should be assigned, not removed.

The role field is passed to h.datasetSvc.Delete() at line 56 and is essential for authorization checks (if role == models.Admin). However, both role and userID are never assigned to the cleanup struct during initialization, causing the Delete call to receive zero-values. Since userID and role are available from the context in the upload function, they should be populated when initializing the cleanup struct.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/interfaces/http/routes/source/s3/create.go` at line 37, The cleanup
struct is missing assignment of the required authorization fields; set
cleanup.userID and cleanup.role when you create the cleanup instance in the
upload handler so the later call to h.datasetSvc.Delete(...) receives the actual
context values instead of zero-values. Locate the upload function where the
cleanup struct is initialized (the struct with fields userID and role used by
h.datasetSvc.Delete) and populate those fields from the request context (the
same place you already read userID and role) so authorization checks (e.g., if
role == models.Admin) work correctly.
🧹 Nitpick comments (4)
web/src/app/projects/[projectId]/page.tsx (1)

19-23: Consider adding ProtectedPage wrapper for authentication.

Similar to other authenticated pages in this PR (e.g., HomePage), this page should be wrapped with ProtectedPage to ensure unauthenticated users are redirected appropriately.

As per coding guidelines: "Protect routes using the ProtectedRoute wrapper HOC for authentication-required pages".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web/src/app/projects/`[projectId]/page.tsx around lines 19 - 23, The
ProjectPage component is not wrapped with the ProtectedPage auth wrapper; update
the default export so ProjectPage is returned/wrapped by ProtectedPage (i.e.,
export default ProtectedPage(ProjectPage) or wrap the JSX returned by
ProjectPage with <ProtectedPage>), ensuring the params prop handling (params:
Promise<{ projectId: string }>) is preserved and any server/client boundaries
remain correct; locate the ProjectPage function and apply the same ProtectedPage
pattern used by HomePage to enforce authentication and redirection for
unauthenticated users.
web/src/components/dataset/column-descriptions-modal.tsx (1)

233-243: Consider updating help text when editing is disabled.

The message "Click edit to add one" is shown even when canEdit is false, which could confuse users who don't see an edit button.

💡 Suggested improvement
                     ) : (
                       <p className="text-muted-foreground italic">
-                        No description available. Click edit to add one.
+                        No description available.{canEdit && " Click edit to add one."}
                       </p>
                     )}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web/src/components/dataset/column-descriptions-modal.tsx` around lines 233 -
243, The "Click edit to add one." help text is shown even when editing is
disabled; update the JSX in column-descriptions-modal.tsx (the block using
descriptions[column.column_name] and the canEdit prop/variable) to conditionally
render the message: if descriptions[column.column_name] is missing and canEdit
is true show "Click edit to add one.", otherwise show a neutral message such as
"No description available." or "Editing disabled" when canEdit is false; adjust
the conditional around the <p> elements so the text reflects the canEdit state.
server/application/services/store.go (2)

61-68: Drop dead list parameters (or enforce them) for a clearer service contract.

ProjectService.List still accepts createdBy but does not use it, and this pattern now mirrors the role-removal transition. Tightening the signature here will reduce ambiguity for callers.

Also applies to: 107-107

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/application/services/store.go` around lines 61 - 68,
ProjectService.List currently accepts an unused createdBy parameter; either
remove it from the method signature and all callers or wire it through to the
repository call (update ProjectService.List signature and the call to
projectRepo.SearchProject to accept and forward createdBy, or update repository
if necessary). Locate the ProjectService.List method and the
projectRepo.SearchProject invocation to either delete the createdBy argument
from List and callers, or add createdBy handling in List and modify
SearchProject's signature and implementations to accept and filter by createdBy
so the parameter is not left unused.

28-29: Remove unused createdBy from Details service methods.

createdBy is accepted but ignored in both methods, which makes the API contract look stricter than the actual authorization path.

♻️ Proposed cleanup
-func (service *ProjectService) Details(id, orgID, createdBy string) (*models.Project, error) {
+func (service *ProjectService) Details(id, orgID string) (*models.Project, error) {
 	return service.projectRepo.Details(context.Background(), id, orgID)
 }

-func (service *DatasetService) Details(id, orgID, createdBy string) (*models.Dataset, error) {
+func (service *DatasetService) Details(id, orgID string) (*models.Dataset, error) {
 	return service.datasetRepo.Details(context.Background(), id, orgID)
 }

Also applies to: 93-94

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/application/services/store.go` around lines 28 - 29, The Details
method on ProjectService currently accepts an unused createdBy parameter; remove
createdBy from the ProjectService.Details signature and from any other service
Details variants, update all callers to stop passing createdBy, and adjust any
interface definitions and tests that reference ProjectService.Details; keep the
internal call to projectRepo.Details(context.Background(), id, orgID) unchanged.
Ensure you update method declarations (ProjectService.Details) and any
implementing interfaces so signatures match, run tests, and remove dead
parameters in related methods mentioned in the review (the other Details method
at lines 93-94).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/application/repositories/store.go`:
- Line 80: The ListByProjectAndOrg implementation is ignoring orgID and
createdBy when calling s.List; update the function (ListByProjectAndOrg) to pass
a filter that enforces tenant scoping and optional creator filtering instead of
only projectID and pagination—modify the call to s.List to include an
appropriate filter/map/selector that adds org_id = orgID and, if createdBy is
non-empty, created_by = createdBy (or whatever field names Dataset expects), so
the query returned by s.List is correctly restricted by org and creator.

In `@server/infrastructure/postgres/store/datasets/list_by_project_and_org.go`:
- Around line 9-11: The ListByProjectAndOrg implementation ignores orgID and
createdBy (it just calls s.List), so update it to apply org and creator
filtering: either (A) extend or call an underlying query method that accepts
orgID and createdBy and pass those values into the SQL/ORM filter, or (B) if
filtering is intentionally not applied, add a clear comment on
PgDatasetStore.ListByProjectAndOrg stating that org/user scoping is deferred and
why; specifically modify the function PgDatasetStore.ListByProjectAndOrg to use
orgID and createdBy in the dataset query (or call a new/internal method like
s.listWithFilters(ctx, projectID, orgID, createdBy, pagination)) to enforce
tenant/user isolation and return the paginated result.

In `@server/interfaces/http/routes/api/projects/details.go`:
- Around line 20-24: Replace the header-based org lookup with the
middleware-local org value: stop using ctx.Get(middleware.OrganizationIDHeader)
and instead read the org ID from the middleware-populated locals (e.g.,
ctx.Locals(<middleware's org-local-key>)) before calling
h.projectSvc.Details(projectID, orgID, userID); update the orgID variable
initialization so authorization uses the middleware-scoped org context (use the
exact local key constant provided by the middleware rather than the header
constant) and keep projectID and userID retrievals as-is.

In `@web/src/app/projects/`[projectId]/upload/page.tsx:
- Around line 9-36: The page lacks the authentication wrapper: wrap the
UploadDatasetPage with the app's ProtectedPage/ProtectedRoute so unauthenticated
users are redirected to login before any canEdit checks run; locate the
UploadDatasetPage export in web/src/app/projects/[projectId]/upload/page.tsx and
either return/render <ProtectedPage> (or the project's ProtectedRoute HOC)
around the component's JSX (ensuring DatasetUploadWizard remains inside) or
export the component via ProtectedPage(UploadDatasetPage), keep the existing
useAuthStore, useProject, and canEdit logic intact inside the wrapped component,
and ensure ProtectedPage handles redirect-to-login for unauthenticated users.

---

Outside diff comments:
In `@server/interfaces/http/routes/api/projects/datasets/update.go`:
- Around line 66-70: The error response in the update handler incorrectly says
"Error deleting dataset"; update the message string to "Error updating dataset"
in the JSON returned by the update route (the block that calls
ctx.Status(fiber.StatusInternalServerError).JSON with "error": err.Error());
ensure the same handler/return (the update.go update route) uses "Error updating
dataset" so the message matches the update operation.

In `@server/interfaces/http/routes/source/s3/create.go`:
- Around line 154-156: resourceCleanup is constructed without setting userID and
role, causing cleanupResources to call datasetSvc.Delete with empty auth fields;
when creating the cleanup variable in create.go (resourceCleanup{...}) set
userID and role from the current handler/request context (the same userID and
role values used elsewhere in this handler) so those fields are populated, and
ensure subsequent updates (the later assignments at the other initialization
site around lines 194-197) only overwrite dataset-specific fields (datasetID,
orgID, tableName) and not userID/role so datasetSvc.Delete receives correct auth
parameters.
- Line 37: The cleanup struct is missing assignment of the required
authorization fields; set cleanup.userID and cleanup.role when you create the
cleanup instance in the upload handler so the later call to
h.datasetSvc.Delete(...) receives the actual context values instead of
zero-values. Locate the upload function where the cleanup struct is initialized
(the struct with fields userID and role used by h.datasetSvc.Delete) and
populate those fields from the request context (the same place you already read
userID and role) so authorization checks (e.g., if role == models.Admin) work
correctly.

In `@web/src/lib/stores/auth-store.ts`:
- Around line 202-211: In checkSession's catch block the state reset omits role,
leaving stale role values after errors; update the set({...}) call inside the
catch of checkSession to include role: null (matching the !response.ok branch)
so user/role/accessToken/organizationId are all cleared on error, keeping state
consistent with setGlobalAccessToken(null) and setGlobalOrganizationId(null).

---

Nitpick comments:
In `@server/application/services/store.go`:
- Around line 61-68: ProjectService.List currently accepts an unused createdBy
parameter; either remove it from the method signature and all callers or wire it
through to the repository call (update ProjectService.List signature and the
call to projectRepo.SearchProject to accept and forward createdBy, or update
repository if necessary). Locate the ProjectService.List method and the
projectRepo.SearchProject invocation to either delete the createdBy argument
from List and callers, or add createdBy handling in List and modify
SearchProject's signature and implementations to accept and filter by createdBy
so the parameter is not left unused.
- Around line 28-29: The Details method on ProjectService currently accepts an
unused createdBy parameter; remove createdBy from the ProjectService.Details
signature and from any other service Details variants, update all callers to
stop passing createdBy, and adjust any interface definitions and tests that
reference ProjectService.Details; keep the internal call to
projectRepo.Details(context.Background(), id, orgID) unchanged. Ensure you
update method declarations (ProjectService.Details) and any implementing
interfaces so signatures match, run tests, and remove dead parameters in related
methods mentioned in the review (the other Details method at lines 93-94).

In `@web/src/app/projects/`[projectId]/page.tsx:
- Around line 19-23: The ProjectPage component is not wrapped with the
ProtectedPage auth wrapper; update the default export so ProjectPage is
returned/wrapped by ProtectedPage (i.e., export default
ProtectedPage(ProjectPage) or wrap the JSX returned by ProjectPage with
<ProtectedPage>), ensuring the params prop handling (params: Promise<{
projectId: string }>) is preserved and any server/client boundaries remain
correct; locate the ProjectPage function and apply the same ProtectedPage
pattern used by HomePage to enforce authentication and redirection for
unauthenticated users.

In `@web/src/components/dataset/column-descriptions-modal.tsx`:
- Around line 233-243: The "Click edit to add one." help text is shown even when
editing is disabled; update the JSX in column-descriptions-modal.tsx (the block
using descriptions[column.column_name] and the canEdit prop/variable) to
conditionally render the message: if descriptions[column.column_name] is missing
and canEdit is true show "Click edit to add one.", otherwise show a neutral
message such as "No description available." or "Editing disabled" when canEdit
is false; adjust the conditional around the <p> elements so the text reflects
the canEdit state.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 40d0dbb and bda820b.

📒 Files selected for processing (25)
  • server/application/repositories/store.go
  • server/application/services/store.go
  • server/infrastructure/postgres/store/datasets/list_by_project_and_org.go
  • server/infrastructure/postgres/store/datasets/list_by_project_and_role.go
  • server/interfaces/http/middleware/validate.go
  • server/interfaces/http/routes/api/datasets/details.go
  • server/interfaces/http/routes/api/datasets/get-column-descriptions.go
  • server/interfaces/http/routes/api/datasets/update-column-descriptions.go
  • server/interfaces/http/routes/api/projects/datasets/delete.go
  • server/interfaces/http/routes/api/projects/datasets/details.go
  • server/interfaces/http/routes/api/projects/datasets/update.go
  • server/interfaces/http/routes/api/projects/details.go
  • server/interfaces/http/routes/api/projects/list.go
  • server/interfaces/http/routes/source/database/create.go
  • server/interfaces/http/routes/source/database/refresh.go
  • server/interfaces/http/routes/source/s3/create.go
  • web/src/app/page.tsx
  • web/src/app/projects/[projectId]/datasets/[datasetId]/page.tsx
  • web/src/app/projects/[projectId]/page.tsx
  • web/src/app/projects/[projectId]/upload/page.tsx
  • web/src/components/dataset/column-descriptions-modal.tsx
  • web/src/components/dataset/dataset-header.tsx
  • web/src/components/project/inline-project-editor.tsx
  • web/src/components/project/project-card.tsx
  • web/src/lib/stores/auth-store.ts
💤 Files with no reviewable changes (1)
  • server/infrastructure/postgres/store/datasets/list_by_project_and_role.go

DeleteByOrgAndCreator(ctx context.Context, datasetID, orgID, createdBy string) error
UpdateByOrgAndCreator(ctx context.Context, datasetID, orgID, createdBy string, params *models.UpdateDatasetParams) (*models.Dataset, error)
ListByProjectAndRole(ctx context.Context, projectID, orgID, createdBy string, role models.Role, pagination models.Pagination) (*models.PaginationView[*models.Dataset], error)
ListByProjectAndOrg(ctx context.Context, projectID, orgID, createdBy string, pagination models.Pagination) (*models.PaginationView[*models.Dataset], error)

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== ListByProjectAndOrg definitions =="
rg -n -C6 '\bListByProjectAndOrg\s*\(' server

echo
echo "== Candidate implementation files =="
fd 'list_by_project_and_org\.go$' server

echo
echo "== Inspect org/creator usage in implementations =="
for f in $(fd 'list_by_project_and_org\.go$' server); do
  echo "---- $f ----"
  rg -n -C6 'ListByProjectAndOrg|orgID|createdBy|org_id|created_by|WHERE|project_id' "$f"
done

Repository: factly/gopie

Length of output: 4469


ListByProjectAndOrg ignores orgID and createdBy parameters—org scoping is not enforced.

The implementation at server/infrastructure/postgres/store/datasets/list_by_project_and_org.go accepts these parameters but forwards only projectID and pagination to s.List(), ignoring tenant and creator constraints. Update the implementation to apply orgID (and createdBy if intended) to query filters.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/application/repositories/store.go` at line 80, The ListByProjectAndOrg
implementation is ignoring orgID and createdBy when calling s.List; update the
function (ListByProjectAndOrg) to pass a filter that enforces tenant scoping and
optional creator filtering instead of only projectID and pagination—modify the
call to s.List to include an appropriate filter/map/selector that adds org_id =
orgID and, if createdBy is non-empty, created_by = createdBy (or whatever field
names Dataset expects), so the query returned by s.List is correctly restricted
by org and creator.
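The fix the prompt describes can be sketched as a small filter builder: project and org clauses are always applied, and the creator clause is added only when `createdBy` is non-empty. This is a minimal, self-contained illustration; the column names (`project_id`, `org_id`, `created_by`) and the positional-placeholder style are assumptions for the sketch, not the actual `PgDatasetStore` schema:

```go
package main

import (
	"fmt"
	"strings"
)

// buildDatasetFilter sketches how ListByProjectAndOrg could scope its query.
// Column names here (project_id, org_id, created_by) are assumptions; the
// real schema behind PgDatasetStore may differ.
func buildDatasetFilter(projectID, orgID, createdBy string) (string, []any) {
	clauses := []string{"project_id = $1", "org_id = $2"}
	args := []any{projectID, orgID}
	if createdBy != "" {
		// Creator filtering is optional: only constrain created_by when given.
		clauses = append(clauses, fmt.Sprintf("created_by = $%d", len(args)+1))
		args = append(args, createdBy)
	}
	return "WHERE " + strings.Join(clauses, " AND "), args
}

func main() {
	where, args := buildDatasetFilter("p1", "org1", "u1")
	fmt.Println(where) // WHERE project_id = $1 AND org_id = $2 AND created_by = $3
	fmt.Println(len(args))
}
```

When `createdBy` is empty the creator clause is simply omitted, matching the "if createdBy is non-empty" behavior the prompt asks for.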

Comment on lines +9 to +11
func (s *PgDatasetStore) ListByProjectAndOrg(ctx context.Context, projectID, orgID, createdBy string, pagination models.Pagination) (*models.PaginationView[*models.Dataset], error) {
	return s.List(ctx, projectID, pagination)
}

⚠️ Potential issue | 🟠 Major

Parameters orgID and createdBy are ignored in the implementation.

The method signature accepts orgID and createdBy but these values are never used—the call simply delegates to s.List(ctx, projectID, pagination). This creates a misleading API contract where callers may assume org-scoped or user-scoped filtering is applied when it is not.

If org/user filtering is intentionally deferred, consider adding a comment explaining this. Otherwise, incorporate these parameters into the query to enforce proper tenant isolation.

🔍 Verification: Check if List method supports org filtering
#!/bin/bash
# Description: Check the List method signature and implementation to see if it supports org/user filtering

# Find the List method implementation (signature plus body context).
# Note: an ast-grep pattern of the form `func ($_ *PgDatasetStore) List($$$) { $$$ }`
# would not match here because it omits the method's return values, so use rg:
rg -n -A 15 'func \(\w+ \*PgDatasetStore\) List\(' server
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/infrastructure/postgres/store/datasets/list_by_project_and_org.go`
around lines 9 - 11, The ListByProjectAndOrg implementation ignores orgID and
createdBy (it just calls s.List), so update it to apply org and creator
filtering: either (A) extend or call an underlying query method that accepts
orgID and createdBy and pass those values into the SQL/ORM filter, or (B) if
filtering is intentionally not applied, add a clear comment on
PgDatasetStore.ListByProjectAndOrg stating that org/user scoping is deferred and
why; specifically modify the function PgDatasetStore.ListByProjectAndOrg to use
orgID and createdBy in the dataset query (or call a new/internal method like
s.listWithFilters(ctx, projectID, orgID, createdBy, pagination)) to enforce
tenant/user isolation and return the paginated result.
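To make the contract violation concrete, here is a toy in-memory store in the same shape: `List` filters only by project, so delegating to it blindly leaks rows across orgs, while a correct `ListByProjectAndOrg` applies the org filter itself. All types below are illustrative stand-ins, not the real `PgDatasetStore`:

```go
package main

import "fmt"

// dataset and memStore are toy stand-ins for the real models and store.
type dataset struct {
	ID, ProjectID, OrgID string
}

type memStore struct{ rows []dataset }

// List filters by project only, like the underlying s.List the review flags.
func (s *memStore) List(projectID string) []dataset {
	var out []dataset
	for _, d := range s.rows {
		if d.ProjectID == projectID {
			out = append(out, d)
		}
	}
	return out
}

// ListByProjectAndOrg narrows the project-scoped result to one tenant,
// instead of returning s.List unchanged.
func (s *memStore) ListByProjectAndOrg(projectID, orgID string) []dataset {
	var out []dataset
	for _, d := range s.List(projectID) {
		if d.OrgID == orgID {
			out = append(out, d)
		}
	}
	return out
}

func main() {
	s := &memStore{rows: []dataset{
		{"d1", "p1", "org-a"},
		{"d2", "p1", "org-b"},
	}}
	fmt.Println(len(s.List("p1")))                         // 2: leaks across orgs
	fmt.Println(len(s.ListByProjectAndOrg("p1", "org-a"))) // 1: tenant-scoped
}
```

The same shape applies whether the filter lives in an internal helper (the `s.listWithFilters` the prompt suggests) or in the SQL itself; what matters is that `orgID` reaches the query.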

Comment on lines 20 to +24
projectID := ctx.Params("projectID")
orgID := ctx.Get(middleware.OrganizationIDHeader)
userID := ctx.Locals(middleware.UserCtxKey).(string)
role := ctx.Locals(middleware.RoleCtxKey).(models.Role)

project, err := h.projectSvc.Details(projectID, orgID, userID, role)
project, err := h.projectSvc.Details(projectID, orgID, userID)

⚠️ Potential issue | 🟠 Major

Use middleware-scoped org context instead of request headers for tenant enforcement.

Line 21 currently trusts a user-provided header for orgID. For authorization-sensitive reads, use middleware locals to avoid tenant spoofing.

🔒 Suggested fix
-	orgID := ctx.Get(middleware.OrganizationIDHeader)
+	orgID := ctx.Locals(middleware.OrganizationCtxKey).(string)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
projectID := ctx.Params("projectID")
orgID := ctx.Get(middleware.OrganizationIDHeader)
userID := ctx.Locals(middleware.UserCtxKey).(string)
role := ctx.Locals(middleware.RoleCtxKey).(models.Role)
project, err := h.projectSvc.Details(projectID, orgID, userID, role)
project, err := h.projectSvc.Details(projectID, orgID, userID)
projectID := ctx.Params("projectID")
orgID := ctx.Locals(middleware.OrganizationCtxKey).(string)
userID := ctx.Locals(middleware.UserCtxKey).(string)
project, err := h.projectSvc.Details(projectID, orgID, userID)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/interfaces/http/routes/api/projects/details.go` around lines 20 - 24,
Replace the header-based org lookup with the middleware-local org value: stop
using ctx.Get(middleware.OrganizationIDHeader) and instead read the org ID from
the middleware-populated locals (e.g., ctx.Locals(<middleware's org-local-key>))
before calling h.projectSvc.Details(projectID, orgID, userID); update the orgID
variable initialization so authorization uses the middleware-scoped org context
(use the exact local key constant provided by the middleware rather than the
header constant) and keep projectID and userID retrievals as-is.
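The header-vs-locals distinction can be shown with a stripped-down sketch: the middleware writes the org ID it has validated (from the session or token, not from a client-controlled header) into request-scoped locals, and the handler reads only that. The `ctxStore` map and key name below stand in for Fiber's `ctx.Locals` and `middleware.OrganizationCtxKey`; they are assumptions for illustration:

```go
package main

import "fmt"

// ctxStore mimics request-scoped locals (what ctx.Locals provides in Fiber).
type ctxStore map[string]any

const organizationCtxKey = "orgID" // stand-in for middleware.OrganizationCtxKey

// orgMiddleware stores the org ID it validated against the authenticated
// session; handlers never see the raw request header.
func orgMiddleware(locals ctxStore, validatedOrg string) {
	locals[organizationCtxKey] = validatedOrg
}

// detailsHandler reads the middleware-scoped value, so a spoofed
// X-Organization-ID header cannot change which tenant it queries.
func detailsHandler(locals ctxStore) string {
	orgID, _ := locals[organizationCtxKey].(string)
	return orgID
}

func main() {
	locals := ctxStore{}
	orgMiddleware(locals, "org-123")
	fmt.Println(detailsHandler(locals)) // org-123
}
```

In the real handler the only change is the source of `orgID`; `projectID` and `userID` retrieval stay as they are.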

Comment on lines 9 to 36
export default function UploadDatasetPage({
  params,
}: {
  params: Promise<{ projectId: string }>;
}) {
  const { projectId } = React.use(params);
  const { role, user } = useAuthStore();
  const router = useRouter();

  const { data: project, isLoading } = useProject({
    variables: { projectId },
  });

  React.useEffect(() => {
    if (!isLoading && project) {
      const canEdit = role === "admin" || project.created_by === user?.id;
      if (!canEdit) {
        router.replace(`/projects/${projectId}`);
      }
    }
  }, [isLoading, project, role, user, projectId, router]);

  if (isLoading || !project) return null;

  const canEdit = role === "admin" || project.created_by === user?.id;
  if (!canEdit) return null;

  return <DatasetUploadWizard projectId={projectId} />;

⚠️ Potential issue | 🟡 Minor

Missing ProtectedPage wrapper for authentication.

Per coding guidelines, authentication-required pages should use the ProtectedRoute or ProtectedPage wrapper. While the canEdit check handles authorization, unauthenticated users should be redirected to login rather than seeing a blank page.

🛡️ Proposed fix
+import { ProtectedPage } from "@/components/auth/protected-page";
+
 export default function UploadDatasetPage({
   params,
 }: {
   params: Promise<{ projectId: string }>;
 }) {
   // ... existing code ...
 
-  return <DatasetUploadWizard projectId={projectId} />;
+  return (
+    <ProtectedPage>
+      <DatasetUploadWizard projectId={projectId} />
+    </ProtectedPage>
+  );
 }

As per coding guidelines: "Protect routes using the ProtectedRoute wrapper HOC for authentication-required pages".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web/src/app/projects/[projectId]/upload/page.tsx` around lines 9 - 36, The
page lacks the authentication wrapper: wrap the UploadDatasetPage with the app's
ProtectedPage/ProtectedRoute so unauthenticated users are redirected to login
before any canEdit checks run; locate the UploadDatasetPage export in
web/src/app/projects/[projectId]/upload/page.tsx and either return/render
<ProtectedPage> (or the project's ProtectedRoute HOC) around the component's JSX
(ensuring DatasetUploadWizard remains inside) or export the component via
ProtectedPage(UploadDatasetPage), keep the existing useAuthStore, useProject,
and canEdit logic intact inside the wrapped component, and ensure ProtectedPage
handles redirect-to-login for unauthenticated users.

@surajmn1 surajmn1 force-pushed the feat/e2e-datasets-metadata branch 2 times, most recently from 7b6a888 to bda820b Compare March 20, 2026 08:03

3 participants