
feat(llm): enhance llm_processor to support fallback models and update LitellmLLMProvider to utilize fallback_model_list #1739


Merged
merged 11 commits into Canner:main from feature/llms-fallbacks
Jun 13, 2025

Conversation

@yichieh-lu (Collaborator) commented Jun 12, 2025

This PR enhances llm_processor to support a fallback mechanism, allowing it to automatically switch to alternative models in case the primary model fails.

Summary by CodeRabbit

  • New Features

    • Introduced structured fallback support for language models, enabling automatic switching to alternative models if the primary model is unavailable.
  • Improvements

    • Enhanced reliability of AI responses by integrating a fallback mechanism for model selection during completions.
    • Improved tracking accuracy by updating model identification in usage reporting to reflect actual model metadata.
    • Simplified pipeline interfaces by removing the propagation and storage of generator model names across multiple generation and retrieval components.

…e LitellmLLMProvider to utilize fallback_model_list
coderabbitai bot (Contributor) commented Jun 12, 2025

Walkthrough

The changes introduce a structured fallback mechanism for LLM model selection. The llm_processor function now builds a prioritized list of fallback models for each configuration. The LitellmLLMProvider class is updated to accept this fallback list and uses a Router to manage model selection and completion requests, enhancing robustness in model invocation. Additionally, the trace_cost decorator in utils.py is adjusted to extract the model name from the result metadata instead of from an external variable. Numerous pipeline generation functions and their corresponding component classes have been simplified by removing the generator_name parameter and its propagation, streamlining function signatures and internal state.
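
The summary above does not show the exact shape of fallback_model_list, so here is a minimal sketch of one plausible construction, assuming litellm's Router convention of model_name plus litellm_params deployment entries; the fallbacks and kwargs config keys are illustrative, not taken from the PR.

# Illustrative sketch only; config keys ("fallbacks", "kwargs") are assumptions,
# not the PR's actual schema.
def build_fallback_model_list(model: dict) -> list[dict]:
    """Primary model first, then its fallbacks, expressed as Router deployments."""
    candidates = [model["model"], *model.get("fallbacks", [])]
    return [
        {
            # Every deployment shares the same alias so the Router can retry the
            # request against the next candidate when one fails.
            "model_name": model["model"],
            "litellm_params": {"model": candidate, **model.get("kwargs", {})},
        }
        for candidate in candidates
    ]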

Changes

File(s) and change summary:

  • src/providers/__init__.py: Enhanced llm_processor to build and attach a fallback_model_list containing model names and parameters.
  • src/providers/llm/litellm.py: Modified the LitellmLLMProvider constructor to accept fallback_model_list, instantiate a Router, and route completions through it instead of direct calls; added a fallback testing flag.
  • src/utils.py: Updated the trace_cost decorator to extract the model name from the first element of the result's meta list instead of from an external variable.
  • src/pipelines/generation/*.py (many files): Removed the generator_name parameter from async generation functions and dropped the "generator_name" entries from the _components dictionaries of pipeline classes, simplifying interfaces and internal state.
  • src/pipelines/retrieval/db_schema_retrieval.py: Removed the generator_name parameter and the corresponding component entry from the filter_columns_in_tables function and the DbSchemaRetrieval class.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant llm_processor
    participant LitellmLLMProvider
    participant Router

    User->>llm_processor: Provide model configuration(s)
    llm_processor->>llm_processor: Build fallback_model_list for each model
    llm_processor->>LitellmLLMProvider: Pass model config with fallback_model_list
    User->>LitellmLLMProvider: Request completion
    LitellmLLMProvider->>Router: Call acompletion with fallback_model_list
    Router->>Router: Attempt completion with primary and fallback models
    Router-->>LitellmLLMProvider: Return completion result
    LitellmLLMProvider-->>User: Return completion or stream
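To make the diagram concrete, a rough sketch of how a provider could hand the fallback list to a litellm Router and route completions through it; this illustrates the pattern only and is not the PR's LitellmLLMProvider implementation.

from litellm import Router


class FallbackLLMProvider:
    """Illustrative provider; names and defaults are assumptions."""

    def __init__(self, model: str, fallback_model_list: list[dict]):
        self._model = model
        # The Router owns retry/fallback behaviour across the deployments it is given.
        self._router = Router(model_list=fallback_model_list, num_retries=1)

    async def acomplete(self, messages: list[dict], **generation_kwargs):
        # Router.acompletion resolves `model` to one of the configured deployments
        # and tries the remaining candidates in the group if the call fails.
        return await self._router.acompletion(
            model=self._model,
            messages=messages,
            **generation_kwargs,
        )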

Possibly related PRs

  • fix(wren-ai-service): langfuse cost display issue #1699: That PR added back the generator_name parameter and related components, along with the trace_cost decorator, to enable cost tracing and model-name propagation in pipeline generation functions. It is directly related to, but opposite in effect from, the current PR, which removes them.

Suggested labels

module/ai-service, ci/ai-service

Poem

In the garden of code where models abound,
A rabbit hops, fallback paths are found.
With routers to guide and choices to make,
If one model naps, another will wake.
Now completions flow, never to stall—
Thanks to the team, who thought of it all!
🐇✨




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (3)
wren-ai-service/src/providers/llm/litellm.py (1)

86-95: Remove commented code.

The old implementation is no longer needed.

Apply this diff to remove the commented code:

-            # completion: Union[ModelResponse] = await acompletion(
-            #     model=self._model,
-            #     api_key=self._api_key,
-            #     api_base=self._api_base,
-            #     api_version=self._api_version,
-            #     timeout=self._timeout,
-            #     messages=openai_formatted_messages,
-            #     stream=streaming_callback is not None,
-            #     **generation_kwargs,
-            # )
wren-ai-service/src/providers/__init__.py (2)

82-96: Consider handling api_key configuration for fallback models.

The fallback model parameters don't include api_key_name, which might be needed for authentication. Each fallback model might require its own API key configuration.

Would you like me to help implement proper api_key handling for fallback models?
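
One hedged way to address this would be to resolve credentials per entry when the fallback list is built; api_key_name is treated here as a hypothetical per-model config field, not a confirmed one.

import os


def fallback_entry(alias: str, fallback_cfg: dict) -> dict:
    """Build one Router deployment entry carrying its own credentials (illustrative)."""
    return {
        "model_name": alias,
        "litellm_params": {
            "model": fallback_cfg["model"],
            # `api_key_name` is an assumed config field naming the env var to read.
            "api_key": os.getenv(fallback_cfg.get("api_key_name", "LLM_API_KEY")),
            "api_base": fallback_cfg.get("api_base"),
        },
    }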


103-103: Remove commented code.

Apply this diff to remove the commented line:

-            # "fallbacks": model.get("fallbacks", []),
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 509af37 and 467cd0f.

📒 Files selected for processing (2)
  • wren-ai-service/src/providers/__init__.py (1 hunks)
  • wren-ai-service/src/providers/llm/litellm.py (4 hunks)
🧰 Additional context used
🪛 Ruff (0.11.9)
wren-ai-service/src/providers/llm/litellm.py

8-8: litellm.acompletion imported but unused

Remove unused import: litellm.acompletion

(F401)

🔇 Additional comments (1)
wren-ai-service/src/providers/llm/litellm.py (1)

96-101:

❓ Verification inconclusive

Verify Router handles all required parameters.

The Router call omits api_key, api_base, api_version, and timeout parameters that were present in the original implementation. Please verify that the Router properly handles these through the fallback_model_list configuration.

Run the following script to check how Router handles these parameters:


🏁 Script executed:

#!/bin/bash
# Description: Check Router class implementation for parameter handling

# Search for Router class definition and its methods
ast-grep --pattern 'class Router {
  $$$
}'

# Search for Router's acompletion method to understand parameter handling
rg -A 10 "def acompletion.*Router" 

# Search for how Router uses model_list configuration
rg -A 5 "model_list.*Router|Router.*model_list"

Length of output: 132


I wasn’t able to locate the Router implementation or see how it handles api_key, api_base, api_version, and timeout. Let’s deepen the search:


🏁 Script executed:

#!/bin/bash
# 1. Find any Router class definitions
rg -n "class .*Router" .

# 2. Find where Router is imported
rg -n "import .*Router" .

# 3. Locate where self._router is initialized in litellm.py
rg -n "self\._router" wren-ai-service/src/providers/llm/litellm.py

# 4. Search for fallback_model_list usage across the repo
rg -n "fallback_model_list" .

# 5. Search for any acompletion definitions
rg -n "def acompletion" -n .

Length of output: 4729


We still can’t find Router in our codebase—it’s coming from the external litellm package. Let’s inspect its implementation to confirm how it handles api_key, api_base, api_version, and timeout.


🏁 Script executed:

#!/bin/bash
# Inspect Router in the installed litellm package
python3 - << 'EOF'
import inspect, litellm
print("Router source:")
print(inspect.getsource(litellm.Router))
print("\nacompletion signature:")
print(inspect.signature(litellm.Router.acompletion))
EOF

Length of output: 131


Verify external Router parameter support

We weren’t able to locate or inspect the litellm.Router implementation in our sandbox, so it’s unclear whether it automatically picks up or falls back on the following parameters:

  • api_key
  • api_base
  • api_version
  • timeout

Please confirm that the external Router from the litellm package handles these internally (e.g., via its default config or the provided fallback_model_list) to avoid unintentional omissions.
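
For what it's worth, litellm's Router accepts per-deployment connection settings inside litellm_params, so the omitted parameters can travel with the fallback list instead of being passed on each acompletion call; the values below are placeholders.

from litellm import Router

# Each deployment's litellm_params may carry its own api_key, api_base,
# api_version and timeout, so they need not be repeated per request.
router = Router(
    model_list=[
        {
            "model_name": "default",
            "litellm_params": {
                "model": "azure/gpt-4o",                          # placeholder primary
                "api_key": "<primary-api-key>",                   # placeholder
                "api_base": "https://example.openai.azure.com",   # placeholder
                "api_version": "2024-02-01",
                "timeout": 120,
            },
        },
        {
            "model_name": "default",
            "litellm_params": {
                "model": "gpt-4o-mini",                           # placeholder fallback
                "api_key": "<fallback-api-key>",                  # placeholder
                "timeout": 120,
            },
        },
    ],
)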

yichieh-lu and others added 2 commits June 12, 2025 15:27
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
wren-ai-service/src/providers/llm/litellm.py (2)

8-10: Remove leftover unused ModelResponse import

ModelResponse is no longer referenced after switching to self._router.acompletion, so the import is dead code.

-from litellm.types.utils import ModelResponse
🧰 Tools
🪛 Ruff (0.11.9)

8-8: SyntaxError: Expected an identifier, but found a keyword 'from' that cannot be used here


8-8: SyntaxError: Simple statements must be separated by newlines or semicolons


8-8: SyntaxError: Simple statements must be separated by newlines or semicolons


9-9: SyntaxError: Expected an identifier, but found a keyword 'from' that cannot be used here


9-9: SyntaxError: Simple statements must be separated by newlines or semicolons


9-9: SyntaxError: Simple statements must be separated by newlines or semicolons

🪛 Pylint (3.3.7)

[error] 8-8: Parsing failed: 'invalid syntax (src.providers.llm.litellm, line 8)'

(E0001)


86-96: Delete the commented-out acompletion block

Keeping large commented sections clutters the file and can mislead readers. The git history already preserves this code.

-            # completion: Union[ModelResponse] = await acompletion(
-            #     model=self._model,
-            #     api_key=self._api_key,
-            #     api_base=self._api_base,
-            #     api_version=self._api_version,
-            #     timeout=self._timeout,
-            #     messages=openai_formatted_messages,
-            #     stream=streaming_callback is not None,
-            #     **generation_kwargs,
-            # )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 467cd0f and d143eae.

📒 Files selected for processing (1)
  • wren-ai-service/src/providers/llm/litellm.py (4 hunks)
🧰 Additional context used
🪛 Ruff (0.11.9)
wren-ai-service/src/providers/llm/litellm.py

8-8: SyntaxError: Expected an identifier, but found a keyword 'from' that cannot be used here


8-8: SyntaxError: Simple statements must be separated by newlines or semicolons


8-8: SyntaxError: Simple statements must be separated by newlines or semicolons


9-9: SyntaxError: Expected an identifier, but found a keyword 'from' that cannot be used here


9-9: SyntaxError: Simple statements must be separated by newlines or semicolons


9-9: SyntaxError: Simple statements must be separated by newlines or semicolons

🪛 Pylint (3.3.7)
wren-ai-service/src/providers/llm/litellm.py

[error] 8-8: Parsing failed: 'invalid syntax (src.providers.llm.litellm, line 8)'

(E0001)

yichieh-lu and others added 3 commits June 12, 2025 15:54
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

@cyyeh cyyeh left a comment


overall lgtm, I've left some comments


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
wren-ai-service/src/utils.py (2)

166-168: Unused variable generator_name – rename to _ or remove.

generator_name is captured but never used after the metadata switch to meta[0]["model"], so linters will flag this as an unused-variable warning.

-        result, generator_name = await func(*args, **kwargs)
+        # `generator_name` is intentionally ignored – rename to `_` to silence linters
+        result, _ = await func(*args, **kwargs)

169-174: Guard against missing or malformed meta items.

meta[0] is assumed to be a dict with "model"/"usage" keys.
A defensive approach avoids AttributeError if meta[0] is None or not a dict:

-                langfuse_context.update_current_observation(
-                    model=meta[0].get("model"),
-                    usage_details=meta[0].get("usage", {}),
-                )
+                first = meta[0] or {}
+                langfuse_context.update_current_observation(
+                    model=first.get("model"),
+                    usage_details=first.get("usage", {}),
+                )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 22bcafc and f82ed06.

📒 Files selected for processing (2)
  • wren-ai-service/src/providers/llm/litellm.py (3 hunks)
  • wren-ai-service/src/utils.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • wren-ai-service/src/providers/llm/litellm.py
🔇 Additional comments (1)
wren-ai-service/src/utils.py (1)

167-178: Potential API mismatch – wrapper now returns a single value.

The decorated function still returns (result, generator_name), but the wrapper only relays result.
If any call-sites relied on the two-tuple signature outside the decorator, this is a silent breaking change.

Consider either:

  1. Forwarding the original tuple:
-        return result
+        return result, _
  2. Or updating all call-sites and the original function to return just result.

Please verify usage across the codebase before merging.
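
Putting the two nitpicks and the tuple-relay observation together, a hedged sketch of what the adjusted decorator could look like; this is reconstructed from the fragments quoted in this review, not copied from the PR, and the location of meta on the result dict is an assumption.

import functools

from langfuse.decorators import langfuse_context


def trace_cost(func):
    """Report model and token usage to Langfuse from the result's metadata (sketch)."""

    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        out = await func(*args, **kwargs)
        # At this commit the wrapped functions may still return (result, generator_name);
        # the name is ignored now that the model comes from the result metadata.
        result = out[0] if isinstance(out, tuple) else out
        # `meta` is assumed to live on the result dict, mirroring the meta[0] access
        # quoted above; guard against missing or malformed entries.
        meta = result.get("meta") or []
        first = (meta[0] or {}) if meta else {}
        if first:
            langfuse_context.update_current_observation(
                model=first.get("model"),
                usage_details=first.get("usage", {}),
            )
        return result

    return wrapper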


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🔭 Outside diff range comments (3)
wren-ai-service/src/pipelines/generation/sql_generation.py (1)

108-115: ⚠️ Potential issue

Guard against missing "prompt" key

PromptBuilder.run() usually returns a dict with a "prompt" key, but if someone mutates the template or introduces a custom builder the key could be absent and we would pass None to the LLM – Litellm raises an opaque validation error in that case.

Add an early assertion for clarity:

@@
-    return await generator(prompt=prompt.get("prompt"))
+    prompt_str = prompt.get("prompt")
+    if prompt_str is None:
+        raise ValueError("PromptBuilder did not return a 'prompt' key")
+    return await generator(prompt=prompt_str)
wren-ai-service/src/pipelines/generation/sql_tables_extraction.py (1)

74-76: ⚠️ Potential issue

post_process will crash on malformed LLM output

orjson.loads(...)["tables"] assumes:

  1. extract_sql_tables["replies"][0] exists
  2. The string is valid JSON
  3. A "tables" key is present

Any deviation raises, returning a 500 to callers. Wrap in a try/except and fall back to [], mirroring the pattern used in intent_classification.py.

-    return orjson.loads(extract_sql_tables.get("replies")[0])["tables"]
+    try:
+        return orjson.loads(extract_sql_tables.get("replies")[0])["tables"]
+    except Exception:
+        logger.warning("sql_tables_extraction – unable to parse tables from LLM reply")
+        return []
wren-ai-service/src/pipelines/generation/sql_question.py (1)

70-72: 🛠️ Refactor suggestion

Guard against malformed LLM JSON

orjson.loads(...)["question"] will raise a KeyError on bad output. Consider a safe fallback (e.g., the original SQL) to avoid hard crashes.

🧹 Nitpick comments (11)
wren-ai-service/src/pipelines/generation/question_recommendation.py (1)

46-47: Consider tightening the type hints for generator

Any tells the type checker nothing. All the generate helpers in this refactor expect an async callable that accepts prompt=<str> and returns a dict. Encoding that contract makes IDE auto-completion and static analysis happier:

-from typing import Any
+from typing import Awaitable, Callable, Dict

-async def generate(prompt: dict, generator: Any) -> dict:
+GeneratorT = Callable[..., Awaitable[Dict]]
+
+async def generate(prompt: dict, generator: GeneratorT) -> dict:
wren-ai-service/src/pipelines/generation/sql_answer.py (1)

68-74: query_id should be optional in both signature and typing

run() forwards query_id which may be None, yet generate_answer declares the parameter as non-nullable str. Static checkers will complain and users of the provider might not expect None.

-from typing import Any, Optional
+from typing import Any, Optional

-async def generate_answer(
-    prompt: dict, generator: Any, query_id: str
-) -> dict:
+async def generate_answer(
+    prompt: dict,
+    generator: Any,
+    query_id: Optional[str] = None,
+) -> dict:

No runtime behaviour changes, but the annotation now matches reality.

wren-ai-service/src/pipelines/generation/data_assistance.py (1)

74-79: Align query_id typing with value passed in

run() uses query_id or "", so data_assistance() never receives None.
If an empty string really is the sentinel value, make that explicit with a default to avoid accidental None propagation and clarify intent:

-async def data_assistance(
-    prompt: dict, generator: Any, query_id: str
+async def data_assistance(
+    prompt: dict,
+    generator: Any,
+    query_id: str = "",
 ) -> dict:
wren-ai-service/src/pipelines/generation/relationship_recommendation.py (1)

59-60: Consistency: strengthen the generator’s type and fail fast on missing prompt

Same remark as for other pipelines – specify an async callable type and guard against a missing "prompt" key to surface configuration errors early:

-if prompt.get("prompt") is None:
-    raise KeyError('"prompt" key missing from prompt dict')
-return await generator(prompt=prompt.get("prompt"))
+prompt_text = prompt["prompt"]   # KeyError if absent – explicit is better
+return await generator(prompt=prompt_text)
wren-ai-service/src/pipelines/generation/sql_regeneration.py (1)

127-128: Minor: add early validation of prompt dictionary

All pipelines assume the prompt builder always returns a dict with "prompt" present. A defensive check prevents an opaque downstream failure:

-    return await generator(prompt=prompt.get("prompt"))
+    try:
+        prompt_text = prompt["prompt"]
+    except KeyError as exc:
+        raise ValueError("Prompt builder did not return the expected key 'prompt'") from exc
+    return await generator(prompt=prompt_text)
wren-ai-service/src/pipelines/generation/semantics_description.py (1)

69-72: Confirm that downstream nodes expect only the result dict

The return type changed from a tuple to a plain dict.
All downstream nodes (normalize, etc.) now access only generate["replies"], so they are safe.
However, any external consumer that imported and invoked generate directly will silently break.

Consider adding a short doc-string or a TypedDict / pydantic model here to make the new contract explicit and let mypy/pyright catch misuse earlier.
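
If the team wants to codify that contract, a small sketch of what a typed return could look like; the key names are assumptions based on the replies access mentioned above.

from typing import Any, TypedDict


class GenerationResult(TypedDict, total=False):
    """Assumed shape of the dict returned by generate (illustrative only)."""

    replies: list[str]          # raw LLM replies consumed by downstream nodes
    meta: list[dict[str, Any]]  # per-reply metadata such as model and usage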

wren-ai-service/src/pipelines/generation/chart_adjustment.py (1)

111-118: Add minimal error handling for generator failures

All new generate_* helpers bubble any LLM exception straight up to Hamilton, which logs a cryptic stack-trace.
Consider catching Exception here, logging the prompt & adjustment options, then re-raising, so ops can triage bad prompts faster.

wren-ai-service/src/pipelines/generation/sql_tables_extraction.py (1)

68-70: Add defensive error handling around generator invocation

await generator(...) can raise provider/network exceptions that will propagate up the pipeline and bubble out of the AsyncDriver, causing the whole request to fail without context. Consider catching and re-raising with additional context or converting to a domain-specific error to improve observability.

-    return await generator(prompt=prompt.get("prompt"))
+    try:
+        return await generator(prompt=prompt.get("prompt"))
+    except Exception as exc:
+        logger.exception("extract_sql_tables – generator failed")
+        raise
wren-ai-service/src/pipelines/generation/sql_question.py (1)

62-66: Mirror error-handling approach used elsewhere

Same observation as in sql_tables_extraction: wrap the generator call so failures surface with context.

-    return await generator(prompt=prompt.get("prompt"))
+    try:
+        return await generator(prompt=prompt.get("prompt"))
+    except Exception as exc:
+        logger.exception("generate_sql_question – generator failed")
+        raise
wren-ai-service/src/pipelines/generation/chart_generation.py (1)

84-86: Same resiliency/observability remarks as other generators

Wrap the call in try/except for clearer logs and to keep pipeline failures consistent.

wren-ai-service/src/pipelines/retrieval/db_schema_retrieval.py (1)

321-329: Drop the unnecessary else branch after an early return

The else is redundant and slightly indents the happy-path unnecessarily.

 async def filter_columns_in_tables(
     prompt: dict, table_columns_selection_generator: Any
 ) -> dict:
     if prompt:
         return await table_columns_selection_generator(
             prompt=prompt.get("prompt")
         )
-    else:
-        return {}
+    return {}

This keeps the function flatter and satisfies pylint R1705.

🧰 Tools
🪛 Pylint (3.3.7)

[refactor] 323-328: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d798a5 and cd5c739.

📒 Files selected for processing (21)
  • wren-ai-service/src/pipelines/generation/chart_adjustment.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/chart_generation.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/data_assistance.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/followup_sql_generation.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/followup_sql_generation_reasoning.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/intent_classification.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/misleading_assistance.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/question_recommendation.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/relationship_recommendation.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/semantics_description.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/sql_answer.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/sql_correction.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/sql_generation.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/sql_generation_reasoning.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/sql_question.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/sql_regeneration.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/sql_tables_extraction.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1 hunks)
  • wren-ai-service/src/pipelines/retrieval/db_schema_retrieval.py (1 hunks)
  • wren-ai-service/src/providers/llm/litellm.py (3 hunks)
  • wren-ai-service/src/utils.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • wren-ai-service/src/pipelines/generation/followup_sql_generation.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • wren-ai-service/src/utils.py
  • wren-ai-service/src/providers/llm/litellm.py
🧰 Additional context used
🧬 Code Graph Analysis (3)
wren-ai-service/src/pipelines/generation/sql_question.py (1)
wren-ai-service/src/pipelines/generation/sql_tables_extraction.py (1)
  • prompt (59-63)
wren-ai-service/src/pipelines/generation/question_recommendation.py (3)
wren-ai-service/src/pipelines/generation/relationship_recommendation.py (2)
  • generate (59-60)
  • prompt (49-54)
wren-ai-service/src/pipelines/generation/semantics_description.py (2)
  • generate (70-71)
  • prompt (55-65)
wren-ai-service/src/web/v1/routers/question_recommendation.py (1)
  • get (125-147)
wren-ai-service/src/pipelines/generation/relationship_recommendation.py (2)
wren-ai-service/src/pipelines/generation/question_recommendation.py (2)
  • generate (46-47)
  • prompt (21-41)
wren-ai-service/src/pipelines/generation/semantics_description.py (2)
  • generate (70-71)
  • prompt (55-65)
🪛 Pylint (3.3.7)
wren-ai-service/src/pipelines/retrieval/db_schema_retrieval.py

[refactor] 323-328: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)

🔇 Additional comments (5)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)

61-68: Double-check provider accepts query_id

llm_provider.get_generator() previously returned callables that accepted only prompt and **generation_kwargs.
You now pass an extra query_id. If the provider’s callable signature is still def __call__(prompt: str, **_), Python will swallow the kwarg but litellm’s router may not.

Please verify the provider’s __call__ signature or wrap the call with:

return await generator(prompt=prompt["prompt"], **({"query_id": query_id} if query_id else {}))
wren-ai-service/src/pipelines/generation/intent_classification.py (1)

284-286:

✅ Verification successful

Signature change LGTM – verify no stray “generator_name” usages remain

The function now matches the updated trace_cost contract. Run a quick grep to ensure no remnants of generator_name are still passed to classify_intent.


🏁 Script executed:

#!/bin/bash
# Search for still-existing calls with a generator_name argument
rg --line-number 'classify_intent\([^)]*generator_name' -g '*.py'

Length of output: 67


🏁 Script executed:

#!/bin/bash
# Search for any remaining occurrences of "generator_name" across Python files
rg --line-number 'generator_name' -g '*.py'

Length of output: 43


No stray “generator_name” references remain
A full search across all Python files returned zero occurrences of generator_name. The updated classify_intent signature aligns with the trace_cost contract.

wren-ai-service/src/pipelines/generation/sql_generation_reasoning.py (1)

78-82:

✅ Verification successful

Parameter order deviates from other pipelines – harmless but double-check decorator assumptions

Most generation functions take (prompt, generator) only. Here we have (prompt, generator, query_id).
If trace_cost introspects positional arguments to locate the generator, confirm it still picks the second param and ignores extras; otherwise cost tracing could break.


🏁 Script executed:

#!/bin/bash
# Inspect trace_cost implementation for positional assumptions
rg --line-number 'def trace_cost' -A3 -B0

Length of output: 309


🏁 Script executed:

#!/bin/bash
rg --heading -n "def trace_cost" -A30 wren-ai-service/src/utils.py

Length of output: 1228


No action needed: trace_cost safely handles extra positional arguments
The trace_cost decorator uses (*args, **kwargs) and merely forwards all parameters to the wrapped function without inspecting their positions. Adding query_id after generator won’t affect cost tracing.

wren-ai-service/src/pipelines/generation/followup_sql_generation_reasoning.py (1)

88-96: Verify the updated generator signature at call-site

generate_sql_reasoning() now forwards only prompt and query_id.
Confirm that every generator produced by llm_provider.get_generator() still accepts those exact keyword arguments—nothing more, nothing less.
A silent mismatch will surface only at runtime as a TypeError.

If any existing custom generator still expects generator_name, wrap it or update its signature accordingly.

wren-ai-service/src/pipelines/generation/misleading_assistance.py (1)

74-79: Double-check generator kw-arg compatibility after generator_name removal

As in the other pipelines, the call now passes just prompt and query_id.
Make sure the underlying generator factory was updated everywhere; otherwise this coroutine will raise at runtime.

No functional issue spotted here, only a consistency check.

Comment on lines 70 to 75
 @trace_cost
 async def generate_sql_correction(
-    prompt: dict, generator: Any, generator_name: str
+    prompt: dict, generator: Any
 ) -> dict:
-    return await generator(prompt=prompt.get("prompt")), generator_name
+    return await generator(prompt=prompt.get("prompt"))


🛠️ Refactor suggestion

Same nil-prompt edge case applies here

Replicate the defensive check suggested for sql_generation.py to avoid runtime errors if "prompt" is missing.

🤖 Prompt for AI Agents
In wren-ai-service/src/pipelines/generation/sql_correction.py around lines 70 to
75, add a defensive check to ensure the "prompt" key exists in the prompt
dictionary before calling generator. If "prompt" is missing or None, handle it
gracefully by returning an appropriate default or error response to prevent
runtime errors, similar to the fix applied in sql_generation.py.
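
A sketch of what that defensive check could look like, mirroring the diff quoted above; the empty-replies fallback shape and the import path are assumptions.

from typing import Any

from src.utils import trace_cost  # import path assumed


@trace_cost
async def generate_sql_correction(prompt: dict, generator: Any) -> dict:
    prompt_str = prompt.get("prompt")
    if not prompt_str:
        # Assumed fallback: skip the LLM call and return an empty reply set so
        # downstream post-processing receives a well-formed dict.
        return {"replies": []}
    return await generator(prompt=prompt_str)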


@cyyeh cyyeh left a comment


lgtm! Thanks for contribution

@cyyeh cyyeh merged commit b13e0aa into Canner:main Jun 13, 2025
4 of 7 checks passed
@yichieh-lu yichieh-lu deleted the feature/llms-fallbacks branch June 13, 2025 07:10