
chore(wren-ai-service): improve embedding performance and allow setting batch size for embedding #1814


Merged
merged 1 commit into main from chore/ai-service/improve-embedding
Jul 2, 2025

Conversation

@cyyeh cyyeh commented Jul 2, 2025

Summary by CodeRabbit

  • New Features

    • Improved embedding performance with faster, concurrent processing of document batches.
  • Refactor

    • Simplified embedding workflow and removed progress bar display during embedding operations.

@cyyeh cyyeh requested a review from yichieh-lu July 2, 2025 14:02
@cyyeh cyyeh added the module/ai-service and ci/ai-service labels Jul 2, 2025
coderabbitai bot commented Jul 2, 2025

Walkthrough

The AsyncDocumentEmbedder class in the embedder provider was refactored to take a configurable batch_size parameter in its constructor. The embedding logic now processes batches concurrently via asyncio.gather, the synchronous progress bar and its related imports were removed, and method signatures were streamlined to use the instance's batch size.
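
For orientation, here is a minimal, self-contained sketch of the pattern described above; the function names and the dummy per-batch call are illustrative stand-ins, not the actual code in litellm.py:

import asyncio
from typing import List


async def embed_one_batch(batch: List[str]) -> List[List[float]]:
    # Stand-in for the real per-batch embedding request; returns one
    # dummy vector per input text.
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [[0.0, 0.0, 0.0] for _ in batch]


async def embed_all(texts: List[str], batch_size: int = 32) -> List[List[float]]:
    # Split the inputs into fixed-size batches...
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]
    # ...and issue all batch requests concurrently instead of sequentially.
    results = await asyncio.gather(*(embed_one_batch(b) for b in batches))
    # gather preserves input order, so flattening keeps texts and
    # embeddings aligned.
    return [vec for batch_result in results for vec in batch_result]


print(len(asyncio.run(embed_all([f"doc {i}" for i in range(100)]))))  # 100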

Changes

File(s) Change Summary
wren-ai-service/src/providers/embedder/litellm.py Refactored AsyncDocumentEmbedder: added batch_size to constructor, switched to async batch processing with asyncio.gather, removed progress bar and related imports, updated method signatures.

Poem

A batch of thoughts, now sent with speed,
Async they fly, fulfilling the need.
Progress bars gone, but progress is clear,
Embeddings arrive, as fast as a deer!
With asyncio’s magic, the code’s now sublime—
Hopping through batches, one hop at a time! 🐇


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
wren-ai-service/src/providers/embedder/litellm.py (1)

110-113: Fix whitespace formatting issue.

Static analysis (Flake8 E203) flagged the whitespace before the colon in the slice expression. Note that a space around the colon in slices with compound bounds is exactly what the Black formatter produces, so projects that format with Black typically add E203 to Flake8's ignore list instead. If Flake8 is authoritative here, apply this diff to fix the formatting:

-        batches = [
-            texts_to_embed[i : i + batch_size]
-            for i in range(0, len(texts_to_embed), batch_size)
-        ]
+        batches = [
+            texts_to_embed[i:i + batch_size]
+            for i in range(0, len(texts_to_embed), batch_size)
+        ]
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 87119c8 and 6928747.

📒 Files selected for processing (1)
  • wren-ai-service/src/providers/embedder/litellm.py (4 hunks)
🧰 Additional context used
🪛 Flake8 (7.2.0)
wren-ai-service/src/providers/embedder/litellm.py

[error] 111-111: whitespace before ':'

(E203)

⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: pytest
  • GitHub Check: pytest
  • GitHub Check: Analyze (go)
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (5)
wren-ai-service/src/providers/embedder/litellm.py (5)

1-1: LGTM: Import addition supports concurrent processing.

The asyncio import is necessary for the asyncio.gather functionality introduced in the _embed_batch method.


84-84: LGTM: Configurable batch size improves flexibility.

The addition of the batch_size parameter with a sensible default of 32 allows users to tune performance based on their needs, and storing it as an instance attribute _batch_size follows good encapsulation practices.

Also applies to: 92-92
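
A sketch of the constructor shape this describes; only batch_size and _batch_size are confirmed by the review, while the surrounding parameters are inferred from the diff context shown further down, so treat them as assumptions:

from typing import Any, Optional


class AsyncDocumentEmbedder:
    def __init__(
        self,
        model: str,
        api_base_url: Optional[str] = None,  # assumed from the diff context below
        timeout: Optional[float] = None,     # assumed from the diff context below
        batch_size: int = 32,  # new: tunable, with a sensible default
        **kwargs: Any,
    ):
        self._model = model
        self._api_base_url = api_base_url
        self._timeout = timeout
        self._batch_size = batch_size  # stored as an instance attribute
        self._kwargs = kwargs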


100-117: Excellent performance improvement with concurrent processing.

The refactoring to use asyncio.gather for concurrent batch processing is a significant performance improvement over the previous sequential approach. The implementation correctly:

  • Creates batches of the specified size
  • Processes all batches concurrently
  • Maintains proper error handling through the existing backoff decorator
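
To illustrate the last point, a sketch of how a backoff-decorated per-batch coroutine composes with asyncio.gather; the decorator configuration and exception type are assumptions for demonstration, not necessarily what the file uses:

import asyncio
import random

import backoff  # third-party retry library: pip install backoff


@backoff.on_exception(backoff.expo, ConnectionError, max_tries=3)
async def embed_one_batch(batch):
    # Simulated flaky request: backoff retries it with exponential delays
    # on ConnectionError, up to three attempts.
    if random.random() < 0.3:
        raise ConnectionError("transient embedding API failure")
    return [[0.0] * 3 for _ in batch]


async def embed_all(batches):
    # gather re-raises the first unrecovered exception, so a batch that
    # exhausts its retries fails the whole run instead of silently
    # dropping embeddings.
    return await asyncio.gather(*(embed_one_batch(b) for b in batches))


print(len(asyncio.run(embed_all([["a"], ["b"], ["c"]]))))  # 3 when every batch succeeds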

119-137: LGTM: Proper metadata aggregation for concurrent responses.

The metadata aggregation logic correctly handles:

  • Extending embeddings from all batch responses
  • Accumulating usage statistics (prompt_tokens, total_tokens)
  • Preserving model information from the first response
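
A sketch of that aggregation over dict-shaped responses; the real code works with litellm response objects, and the field names here merely mirror the keys listed above:

from typing import Any, Dict, List, Tuple


def aggregate(
    responses: List[Dict[str, Any]],
) -> Tuple[List[List[float]], Dict[str, Any]]:
    all_embeddings: List[List[float]] = []
    meta: Dict[str, Any] = {"usage": {"prompt_tokens": 0, "total_tokens": 0}}
    for i, response in enumerate(responses):
        # Extend embeddings from every batch response, in order.
        all_embeddings.extend(item["embedding"] for item in response["data"])
        # Accumulate usage statistics across batches.
        meta["usage"]["prompt_tokens"] += response["usage"]["prompt_tokens"]
        meta["usage"]["total_tokens"] += response["usage"]["total_tokens"]
        # Preserve model information from the first response.
        if i == 0:
            meta["model"] = response["model"]
    return all_embeddings, meta


fake = [
    {"data": [{"embedding": [0.1]}], "usage": {"prompt_tokens": 5, "total_tokens": 5}, "model": "m"},
    {"data": [{"embedding": [0.2]}], "usage": {"prompt_tokens": 7, "total_tokens": 7}, "model": "m"},
]
embeddings, meta = aggregate(fake)
print(len(embeddings), meta["usage"]["total_tokens"])  # 2 12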

141-141: AsyncDocumentEmbedder.run signature change—no callers found using removed parameters

A full-repo search for AsyncDocumentEmbedder.run, batch_size, and progress_bar showed no existing calls passing the now-removed parameters. This breaking change has no impact on current callers.

Comment on lines 154 to 157

         embeddings, meta = await self._embed_batch(
             texts_to_embed=texts_to_embed,
-            batch_size=batch_size,
-            progress_bar=progress_bar,
+            batch_size=self._batch_size,
         )

🛠️ Refactor suggestion

Update method call to match refactored signature.

The call to _embed_batch should be updated to remove the batch_size parameter since it's no longer used by the method.

Apply this diff to match the refactored method signature:

-        embeddings, meta = await self._embed_batch(
-            texts_to_embed=texts_to_embed,
-            batch_size=self._batch_size,
-        )
+        embeddings, meta = await self._embed_batch(
+            texts_to_embed=texts_to_embed,
+        )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

        embeddings, meta = await self._embed_batch(
            texts_to_embed=texts_to_embed,
        )
🤖 Prompt for AI Agents
In wren-ai-service/src/providers/embedder/litellm.py around lines 154 to 157,
the call to the _embed_batch method incorrectly includes the batch_size
parameter, which has been removed in the refactored method signature. Remove the
batch_size argument from the method call so that only texts_to_embed is passed
to _embed_batch.

Comment on line 98

         self._api_base_url = api_base_url
         self._timeout = timeout
         self._kwargs = kwargs

     async def _embed_batch(
-        self, texts_to_embed: List[str], batch_size: int, progress_bar: bool = True
+        self, texts_to_embed: List[str], batch_size: int

🛠️ Refactor suggestion

Remove unused parameter from method signature.

The batch_size parameter in the _embed_batch method signature is no longer used since the method now uses self._batch_size. This creates inconsistency and potential confusion.

Apply this diff to clean up the method signature:

-    async def _embed_batch(
-        self, texts_to_embed: List[str], batch_size: int
-    ) -> Tuple[List[List[float]], Dict[str, Any]]:
+    async def _embed_batch(
+        self, texts_to_embed: List[str]
+    ) -> Tuple[List[List[float]], Dict[str, Any]]:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

    async def _embed_batch(
        self, texts_to_embed: List[str]
    ) -> Tuple[List[List[float]], Dict[str, Any]]:
🤖 Prompt for AI Agents
In wren-ai-service/src/providers/embedder/litellm.py at line 98, remove the
unused parameter batch_size from the _embed_batch method signature since the
method uses self._batch_size internally. This will clean up the method signature
and avoid confusion about the parameter's usage.

@cyyeh cyyeh merged commit aabe125 into main Jul 2, 2025
15 checks passed
@cyyeh cyyeh deleted the chore/ai-service/improve-embedding branch July 2, 2025 14:09
Labels
ci/ai-service, module/ai-service, wren-ai-service