feat(gemini): support multimodal inlineData in user messages#2435

Merged
theonlyhennygod merged 1 commit into main from issue-2376-gemini-multimodal
Mar 1, 2026

Conversation

@theonlyhennygod
Collaborator

@theonlyhennygod theonlyhennygod commented Mar 1, 2026

Summary

  • add Gemini request Part support for both text and inlineData
  • parse [IMAGE:data:...] markers into Gemini-native inline image parts for user messages
  • keep text-only serialization behavior intact and add focused multimodal tests
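
The marker parsing described in the second bullet could look roughly like the sketch below. It assumes the marker body is a standard data URI of the form `data:<mime>;base64,<payload>`; the PR only shows `[IMAGE:data:...]`, so that exact layout, the `InlineImage` type, and its field names are illustrative assumptions rather than the merged code.

```rust
/// Hypothetical result type; the field names are illustrative, not taken from the PR.
#[derive(Debug, PartialEq)]
struct InlineImage {
    mime_type: String,
    /// Base64 payload, passed through without decoding.
    data: String,
}

/// Parse one marker of the assumed form `[IMAGE:data:<mime>;base64,<payload>]`.
/// Returns `None` for anything else so the caller can keep the marker as plain text.
fn parse_inline_image_marker(marker: &str) -> Option<InlineImage> {
    let inner = marker.strip_prefix("[IMAGE:")?.strip_suffix(']')?;
    let uri = inner.strip_prefix("data:")?;
    let (mime, payload) = uri.split_once(";base64,")?;
    if mime.is_empty() || payload.is_empty() {
        return None;
    }
    Some(InlineImage {
        mime_type: mime.to_string(),
        data: payload.to_string(),
    })
}
```

Under these assumptions, `[IMAGE:data:image/png;base64,AAAA]` yields an `image/png` part, while a marker such as `[IMAGE:/tmp/cat.png]` returns `None` and would be left in the text.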

Validation

  • cargo test providers::gemini::tests -- --nocapture
  • cargo test build_user_parts_ -- --nocapture

Closes #2376

Summary by CodeRabbit

New Features

  • Gemini provider now supports inline images in message content, enabling users to send images directly within messages to the AI model.

@github-actions

github-actions Bot commented Mar 1, 2026

PR intake checks found warnings (non-blocking)

Fast safe checks found advisory issues. CI lint/test/build gates still enforce merge quality.

  • Missing required PR template sections: ## Validation Evidence (required), ## Security Impact (required), ## Privacy and Data Hygiene (required), ## Rollback Plan (required)
  • Incomplete required PR template fields: summary problem, summary why it matters, summary what changed, validation commands, security risk/mitigation, privacy status, rollback plan
  • Missing Linear issue key reference (RMN-<id>, CDV-<id>, or COM-<id>) in PR title/body (recommended for traceability, non-blocking).

Action items:

  1. Complete required PR template sections/fields.
  2. (Recommended) Link this PR to one active Linear issue key (RMN-xxx/CDV-xxx/COM-xxx) for traceability.
  3. Remove tabs, trailing whitespace, and merge conflict markers from added lines.
  4. Re-run local checks before pushing:
    • ./scripts/ci/rust_quality_gate.sh
    • ./scripts/ci/rust_strict_delta_gate.sh
    • ./scripts/ci/docs_quality_gate.sh

Detected Linear keys: none

Run logs: https://github.com/zeroclaw-labs/zeroclaw/actions/runs/22551770679

Detected blocking line issues (sample):

  • none

Detected advisory line issues (sample):

  • none

@coderabbitai
Contributor

coderabbitai Bot commented Mar 1, 2026

Warning

Rate limit exceeded

@theonlyhennygod has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 6 minutes and 15 seconds before requesting another review.

📥 Commits

Reviewing files that changed from the base of the PR and between 9066404 and b01462d.

📒 Files selected for processing (1)
  • src/providers/gemini.rs

Note

.coderabbit.yaml has unrecognized properties

CodeRabbit is using all valid settings from your configuration. Unrecognized properties (listed below) have been ignored and may indicate typos or deprecated fields that can be removed.

⚠️ Parsing warnings (1)
Validation error: Unrecognized key(s) in object: 'tools', 'path_filters', 'review_instructions'
Walkthrough

The Gemini provider's Part struct is refactored from a flat structure into an enum supporting Text and InlineData variants. Helper methods parse_inline_image_marker and build_user_parts are added to extract image markers from message content and construct appropriate part types, enabling native multimodal image input support via Gemini's inlineData API.
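
To make the two variants concrete, here is a minimal std-only sketch of how a text part and an inline-data part might be rendered as Gemini `parts` entries. The walkthrough says the real code uses a serde untagged enum; this hand-written serializer is a stand-in so the example compiles without crates. The JSON keys `text`, `inlineData`, `mimeType`, and `data` follow Gemini's public REST format, while the Rust field names and the `escape` and `part_to_json` helpers are assumptions.

```rust
/// Illustrative mirror of the two part kinds named in the walkthrough.
#[derive(Debug, PartialEq)]
enum Part {
    Text { text: String },
    InlineData { mime_type: String, data: String },
}

/// Minimal JSON string escaping for this sketch (quotes, backslashes, control characters).
fn escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Render one part as a JSON object suitable for a `parts` array.
fn part_to_json(part: &Part) -> String {
    match part {
        Part::Text { text } => format!("{{\"text\":\"{}\"}}", escape(text)),
        Part::InlineData { mime_type, data } => format!(
            "{{\"inlineData\":{{\"mimeType\":\"{}\",\"data\":\"{}\"}}}}",
            escape(mime_type),
            escape(data)
        ),
    }
}
```

Because each variant maps to an object with a distinct key and no tag field, an untagged enum is a natural fit for this wire format.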

Changes

| Cohort / File(s) | Summary |
|---|---|
| Gemini Provider Multimodal Support: src/providers/gemini.rs | Converts Part from struct to serde untagged enum with Text and InlineData variants. Introduces InlineDataPart struct for inline data representation. Adds parse_inline_image_marker() to parse data URI-like image markers and build_user_parts() to split message content into text and inline data parts. Updates call sites (chat, chat_with_system, chat_with_history) to use build_user_parts(). Replaces all direct Part construction with enum variants. Adds comprehensive unit tests covering text-only, single/multiple images, image-only, and fallback scenarios. |
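
The splitting step attributed to build_user_parts() can be sketched as follows. This is an approximation under stated assumptions, not the merged implementation: the trimming of text between markers, the tuple-style `Part` variants, the `flush_text` helper, and the rule that an unparseable marker stays in the text verbatim are all guesses at behavior the PR page does not spell out.

```rust
#[derive(Debug, PartialEq)]
enum Part {
    Text(String),
    InlineData { mime_type: String, data: String },
}

const MARKER_START: &str = "[IMAGE:";

/// Move any pending, non-blank text into `parts` and reset the buffer.
fn flush_text(text: &mut String, parts: &mut Vec<Part>) {
    let trimmed = text.trim();
    if !trimmed.is_empty() {
        parts.push(Part::Text(trimmed.to_string()));
    }
    text.clear();
}

/// Split `content` into ordered text and inline-image parts.
fn build_user_parts(content: &str) -> Vec<Part> {
    // Text-only input is passed through untouched.
    if !content.contains(MARKER_START) {
        return vec![Part::Text(content.to_string())];
    }
    let mut parts = Vec::new();
    let mut text = String::new();
    let mut rest = content;

    while let Some(start) = rest.find(MARKER_START) {
        let after = &rest[start + MARKER_START.len()..];
        let end = match after.find(']') {
            Some(end) => end,
            None => break, // unterminated marker: treat the remainder as text
        };
        let parsed = after[..end]
            .strip_prefix("data:")
            .and_then(|uri| uri.split_once(";base64,"))
            .filter(|(mime, data)| !mime.is_empty() && !data.is_empty());

        text.push_str(&rest[..start]);
        match parsed {
            Some((mime, data)) => {
                flush_text(&mut text, &mut parts);
                parts.push(Part::InlineData {
                    mime_type: mime.to_string(),
                    data: data.to_string(),
                });
            }
            // Fallback: keep the unparseable marker verbatim in the text.
            None => text.push_str(&rest[start..start + MARKER_START.len() + end + 1]),
        }
        rest = &after[end + 1..];
    }
    text.push_str(rest);
    flush_text(&mut text, &mut parts);
    parts
}
```

For example, `look [IMAGE:data:image/png;base64,AAAA] here` would produce a text part, an inline-data part, and a second text part, in that order.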

Sequence Diagram

sequenceDiagram
    participant Client
    participant GeminiProvider
    participant ImageParser as parse_inline_image_marker
    participant PartBuilder as build_user_parts
    participant GeminiAPI

    Client->>GeminiProvider: chat(message with [IMAGE:...])
    GeminiProvider->>PartBuilder: build_user_parts(content)
    
    PartBuilder->>ImageParser: parse_inline_image_marker([IMAGE:...])
    ImageParser-->>PartBuilder: InlineDataPart{mime, base64}
    
    PartBuilder->>PartBuilder: Split content into text & images
    PartBuilder-->>GeminiProvider: Vec<Part::Text | Part::InlineData>
    
    GeminiProvider->>GeminiProvider: Serialize parts to JSON
    GeminiProvider->>GeminiAPI: Send request with inlineData
    GeminiAPI-->>GeminiProvider: Response
    GeminiProvider-->>Client: Message response

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

size: XS

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The description covers the core changes and validation steps, but omits most template sections (labels, metadata, security impact, compatibility, rollback plan, etc.) required by the repository template. | Complete the PR description template by filling in all required sections including Label Snapshot, Change Metadata, Security Impact, Validation Evidence, and Rollback Plan. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely describes the main change: adding multimodal inlineData support to the Gemini provider, which matches the primary objective of the pull request. |
| Linked Issues check | ✅ Passed | The PR fully addresses all coding requirements from issue #2376: adds Part enum variants (Text/InlineData), implements build_user_parts helper, supports multiple images, preserves text-only backward compatibility, and provides comprehensive test coverage. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to src/providers/gemini.rs and directly support the multimodal image input feature; no out-of-scope modifications to multimodal.rs, traits, dependencies, or configuration are present. |

@github-actions github-actions Bot added the provider Auto scope: src/providers/** changed. label Mar 1, 2026
@github-actions

github-actions Bot commented Mar 1, 2026

Thanks for contributing to ZeroClaw.

For faster review, please ensure:

  • PR template sections are fully completed
  • cargo fmt --all -- --check, cargo clippy --all-targets -- -D warnings, and cargo test are included
  • If automation/agents were used heavily, add brief workflow notes
  • Scope is focused (prefer one concern per PR)

See CONTRIBUTING.md and docs/pr-workflow.md for full collaboration rules.

@github-actions github-actions Bot added size: S Auto size: 81-250 non-doc changed lines. risk: medium Auto risk: src/** or dependency/config changes. distinguished contributor Contributor with 50+ merged PRs. provider: gemini Auto module: provider/gemini changed. and removed provider Auto scope: src/providers/** changed. labels Mar 1, 2026
@theonlyhennygod theonlyhennygod self-assigned this Mar 1, 2026
@theonlyhennygod theonlyhennygod force-pushed the issue-2376-gemini-multimodal branch from 81165b1 to 9066404 Compare March 1, 2026 20:11
@github-actions github-actions Bot added provider Auto scope: src/providers/** changed. and removed provider Auto scope: src/providers/** changed. labels Mar 1, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
src/providers/gemini.rs (1)

991-997: Defensive empty check appears unreachable.

Since this code path is only entered when image_refs is non-empty (line 970 returns early otherwise), and the loop at lines 981-989 always pushes at least one part per image_ref, parts can never be empty at line 991. The check is harmless but could be removed for clarity.

🔧 Suggested simplification
         }

-        if parts.is_empty() {
-            vec![Part::Text {
-                text: String::new(),
-            }]
-        } else {
-            parts
-        }
+        parts
     }
 }
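
The reachability argument behind this suggestion can be illustrated with a reduced, hypothetical stand-in for the reviewed function. The name `collect_parts` and the string-based parts are inventions for this sketch; only the control-flow shape (an early return when there are no image references, then a loop that pushes at least one part per reference) is taken from the review comment.

```rust
/// Hypothetical reduction of the reviewed control flow, not the code in gemini.rs.
fn collect_parts(text: &str, image_refs: &[&str]) -> Vec<String> {
    // Early return: the rest of the function only runs with at least one image ref.
    if image_refs.is_empty() {
        return vec![text.to_string()];
    }

    let mut parts = Vec::new();
    for image_ref in image_refs {
        // Every iteration pushes a part, whether or not the ref parses as an image.
        parts.push(format!("image:{}", image_ref));
    }

    // Invariant: the loop body ran at least once, so `parts` is non-empty here
    // and an `is_empty()` fallback after this point could never be taken.
    debug_assert!(!parts.is_empty());
    parts
}
```

If the real loop ever gained a path that skips a reference without pushing, the invariant would no longer hold and the defensive branch would become meaningful again.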
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/providers/gemini.rs` around lines 991 - 997, The final defensive branch
that returns vec![Part::Text { text: String::new() }] when parts.is_empty() is
unreachable because the code only runs when image_refs is non-empty and the loop
that fills parts (the loop iterating image_refs and pushing into parts) always
pushes at least one Part; remove the parts.is_empty() check and the fallback
branch and simply return parts directly (reference the parts variable and the
image_refs loop that populates it to locate the code).

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 12870df and 9066404.

📒 Files selected for processing (1)
  • src/providers/gemini.rs

@theonlyhennygod theonlyhennygod force-pushed the issue-2376-gemini-multimodal branch from 9066404 to b01462d Compare March 1, 2026 20:18
@github-actions github-actions Bot added provider Auto scope: src/providers/** changed. and removed provider Auto scope: src/providers/** changed. labels Mar 1, 2026
@theonlyhennygod theonlyhennygod merged commit 7350415 into main Mar 1, 2026
11 of 21 checks passed
@theonlyhennygod theonlyhennygod deleted the issue-2376-gemini-multimodal branch March 1, 2026 20:19
@gh-xj gh-xj mentioned this pull request Mar 2, 2026
5 tasks
Development

Successfully merging this pull request may close these issues.

[Feature]: Gemini provider missing image/multimodal input support

1 participant