feat(gemini): support multimodal inlineData in user messages #2435
theonlyhennygod merged 1 commit into main
PR intake checks found warnings (non-blocking)
Fast safe checks found advisory issues. CI lint/test/build gates still enforce merge quality.
Action items:
Detected Linear keys: none
Run logs: https://github.com/zeroclaw-labs/zeroclaw/actions/runs/22551770679
Detected blocking line issues (sample):
Detected advisory line issues (sample):
| Cohort / File(s) | Summary |
|---|---|
| **Gemini Provider Multimodal Support**<br>`src/providers/gemini.rs` | Converts `Part` from a struct to a serde untagged enum with `Text` and `InlineData` variants. Introduces the `InlineDataPart` struct for inline data representation. Adds `parse_inline_image_marker()` to parse data-URI-like image markers and `build_user_parts()` to split message content into text and inline data parts. Updates call sites (`chat`, `chat_with_system`, `chat_with_history`) to use `build_user_parts()`. Replaces all direct `Part` construction with enum variants. Adds unit tests covering text-only, single-image, multiple-image, image-only, and fallback scenarios. |
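To make the shape of the new `Part` enum concrete, here is a minimal std-only sketch of the JSON each variant would be expected to produce. It is an illustration under assumptions, not the code from the PR: the field names `mime_type` and `data`, the `escape` helper, and the hand-written `to_json` method are hypothetical stand-ins for what serde's `#[serde(untagged)]` and rename attributes would generate. The `inlineData`, `mimeType`, and `data` JSON keys follow the Gemini REST request format.

```rust
/// Hypothetical stand-in for the PR's `InlineDataPart` struct.
#[derive(Debug, Clone, PartialEq)]
struct InlineDataPart {
    mime_type: String, // assumed to serialize as the `mimeType` key
    data: String,      // base64 payload, passed through unchanged
}

/// Hypothetical stand-in for the PR's `Part` enum. With serde this would
/// presumably be marked `#[serde(untagged)]`, so no variant name is emitted.
#[derive(Debug, Clone, PartialEq)]
enum Part {
    Text { text: String },
    InlineData { inline_data: InlineDataPart },
}

/// Minimal JSON string escaping, sufficient for this illustration only.
fn escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            _ => out.push(c),
        }
    }
    out
}

impl Part {
    /// Hand-written equivalent of what an untagged enum would serialize to.
    fn to_json(&self) -> String {
        match self {
            Part::Text { text } => format!("{{\"text\":\"{}\"}}", escape(text)),
            Part::InlineData { inline_data } => format!(
                "{{\"inlineData\":{{\"mimeType\":\"{}\",\"data\":\"{}\"}}}}",
                escape(&inline_data.mime_type),
                escape(&inline_data.data)
            ),
        }
    }
}

fn main() {
    let parts = vec![
        Part::Text { text: "describe this".to_string() },
        Part::InlineData {
            inline_data: InlineDataPart {
                mime_type: "image/png".to_string(),
                data: "AAAA".to_string(),
            },
        },
    ];
    for part in &parts {
        println!("{}", part.to_json());
    }
}
```

Because the enum is untagged, a text-only message serializes exactly as it did when `Part` was a plain struct with a `text` field, which is what keeps the change backward compatible.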
Sequence Diagram
sequenceDiagram
participant Client
participant GeminiProvider
participant ImageParser as parse_inline_image_marker
participant PartBuilder as build_user_parts
participant GeminiAPI
Client->>GeminiProvider: chat(message with [IMAGE:...])
GeminiProvider->>PartBuilder: build_user_parts(content)
PartBuilder->>ImageParser: parse_inline_image_marker([IMAGE:...])
ImageParser-->>PartBuilder: InlineDataPart{mime, base64}
PartBuilder->>PartBuilder: Split content into text & images
PartBuilder-->>GeminiProvider: Vec<Part::Text | Part::InlineData>
GeminiProvider->>GeminiProvider: Serialize parts to JSON
GeminiProvider->>GeminiAPI: Send request with inlineData
GeminiAPI-->>GeminiProvider: Response
GeminiProvider-->>Client: Message response
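The parsing step in the diagram can be sketched as follows. This is a hedged illustration, not the PR's implementation: it assumes a marker has the form `[IMAGE:data:<mime>;base64,<payload>]`, and the return type and field names are hypothetical. The real `parse_inline_image_marker` may accept other forms or validate more strictly.

```rust
/// Hypothetical result type; the PR's `InlineDataPart` may differ.
#[derive(Debug, PartialEq)]
struct InlineDataPart {
    mime_type: String,
    data: String,
}

/// Parses a marker of the assumed form `[IMAGE:data:<mime>;base64,<payload>]`.
/// Returns `None` for anything else so the caller can fall back to plain text.
fn parse_inline_image_marker(marker: &str) -> Option<InlineDataPart> {
    // Strip the surrounding `[IMAGE:` ... `]` wrapper.
    let inner = marker.strip_prefix("[IMAGE:")?.strip_suffix(']')?;
    // Only data URIs are handled here; file paths and URLs are rejected.
    let rest = inner.strip_prefix("data:")?;
    // Split the `<mime>;base64` header from the payload at the first comma.
    let (header, payload) = rest.split_once(',')?;
    let mime = header.strip_suffix(";base64")?;
    if mime.is_empty() || payload.is_empty() {
        return None;
    }
    Some(InlineDataPart {
        mime_type: mime.to_string(),
        data: payload.to_string(),
    })
}

fn main() {
    let ok = parse_inline_image_marker("[IMAGE:data:image/png;base64,AAAA]");
    println!("{:?}", ok);
    let not_data_uri = parse_inline_image_marker("[IMAGE:/tmp/cat.png]");
    println!("{:?}", not_data_uri);
}
```

Returning `Option` rather than panicking lets the caller keep an unparseable marker as ordinary text, which matches the fallback scenario mentioned in the change summary.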
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20 minutes
Suggested labels
size: XS
🚥 Pre-merge checks | ✅ 3 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The description covers the core changes and validation steps, but omits most template sections (labels, metadata, security impact, compatibility, rollback plan, etc.) required by the repository template. | Complete the PR description template by filling in all required sections including Label Snapshot, Change Metadata, Security Impact, Validation Evidence, and Rollback Plan. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely describes the main change: adding multimodal inlineData support to the Gemini provider, which matches the primary objective of the pull request. |
| Linked Issues check | ✅ Passed | The PR fully addresses all coding requirements from issue #2376: adds Part enum variants (Text/InlineData), implements build_user_parts helper, supports multiple images, preserves text-only backward compatibility, and provides comprehensive test coverage. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to src/providers/gemini.rs and directly support the multimodal image input feature; no out-of-scope modifications to multimodal.rs, traits, dependencies, or configuration are present. |
81165b1 to 9066404
🧹 Nitpick comments (1)
src/providers/gemini.rs (1)
991-997: Defensive empty check appears unreachable.

This code path is only entered when `image_refs` is non-empty (line 970 returns early otherwise), and the loop at lines 981-989 always pushes at least one part per `image_ref`, so `parts` can never be empty at line 991. The check is harmless but could be removed for clarity.

🔧 Suggested simplification

             }
    -        if parts.is_empty() {
    -            vec![Part::Text {
    -                text: String::new(),
    -            }]
    -        } else {
    -            parts
    -        }
    +        parts
         }
     }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/providers/gemini.rs` around lines 991 - 997, the final defensive branch that returns `vec![Part::Text { text: String::new() }]` when `parts.is_empty()` is unreachable because the code only runs when `image_refs` is non-empty and the loop that fills `parts` (the loop iterating `image_refs` and pushing into `parts`) always pushes at least one `Part`; remove the `parts.is_empty()` check and the fallback branch and simply return `parts` directly (reference the `parts` variable and the `image_refs` loop that populates it to locate the code).
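The reviewer's reasoning can be checked against a small sketch of the control flow described in the comment. Everything here is an assumption for illustration: the `split_markers` and `parse_data_uri` helpers, the tuple-style `Part` variants, and the ordering of text before images are hypothetical, and the PR's actual `build_user_parts` may be structured differently. What the sketch preserves is the early return for an empty `image_refs` and a loop that pushes exactly one part per reference.

```rust
#[derive(Debug, PartialEq)]
enum Part {
    Text(String),
    InlineData { mime_type: String, data: String },
}

/// Hypothetical helper: returns the text outside markers and the marker bodies.
fn split_markers(content: &str) -> (String, Vec<String>) {
    let mut text = String::new();
    let mut refs = Vec::new();
    let mut rest = content;
    while let Some(start) = rest.find("[IMAGE:") {
        // An unterminated marker is left in the text as-is.
        let Some(end) = rest[start..].find(']') else { break };
        text.push_str(&rest[..start]);
        refs.push(rest[start + "[IMAGE:".len()..start + end].to_string());
        rest = &rest[start + end + 1..];
    }
    text.push_str(rest);
    (text.trim().to_string(), refs)
}

/// Hypothetical helper: splits `data:<mime>;base64,<payload>`.
fn parse_data_uri(r: &str) -> Option<(String, String)> {
    let (header, payload) = r.strip_prefix("data:")?.split_once(',')?;
    let mime = header.strip_suffix(";base64")?;
    Some((mime.to_string(), payload.to_string()))
}

fn build_user_parts(content: &str) -> Vec<Part> {
    let (text, image_refs) = split_markers(content);
    // Early return: text-only content never reaches the loop below.
    if image_refs.is_empty() {
        return vec![Part::Text(content.to_string())];
    }
    let mut parts = Vec::new();
    if !text.is_empty() {
        parts.push(Part::Text(text));
    }
    // Each iteration pushes exactly one part, so `parts` cannot be empty
    // after the loop when `image_refs` is non-empty.
    for r in &image_refs {
        match parse_data_uri(r) {
            Some((mime_type, data)) => parts.push(Part::InlineData { mime_type, data }),
            None => parts.push(Part::Text(format!("[IMAGE:{r}]"))),
        }
    }
    parts
}

fn main() {
    println!("{:?}", build_user_parts("hello"));
    println!("{:?}", build_user_parts("[IMAGE:data:image/png;base64,AAAA]"));
    println!("{:?}", build_user_parts("see [IMAGE:/tmp/x.png]"));
}
```

In this shape the function can return `parts` directly, since both the image-only input and the unparseable-marker input still yield at least one element.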
9066404 to b01462d
Summary
- Adds `Part` support for both text and `inlineData`
- Converts `[IMAGE:data:...]` markers into Gemini-native inline image parts for user messages
Validation
- `cargo test providers::gemini::tests -- --nocapture`
- `cargo test build_user_parts_ -- --nocapture`
Closes #2376
Summary by CodeRabbit
New Features