vertex support multi-modal, function call and thinking #2926

rinfx · 2025-09-16T08:10:15Z

Ⅰ. Describe what this PR did

Adds multi-modal input, function call, and thinking support to the Vertex AI provider.

Ⅱ. Does this pull request fix one issue?

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

Ⅴ. Special notes for reviews

lingma-agents · 2025-09-16T08:10:39Z

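Vertex AI support for multi-modal input, function calls, and reasoning

Change overview

New features
- Adds function call support for Vertex AI, letting the model call external functions and return their results.
- Implements reasoning context handling: reasoning content is marked with <think> tags.
- Supports multi-modal input, including text and images (via URL or Base64 encoding).
- Adds support for the reasoningEffort parameter, which enables configuration of the model's reasoning budget.
Refactoring
- Refactors the message content construction logic to support multiple content types (text, images, function calls, and so on).
- Adjusts the role mapping logic to convert the assistant, tool, and system roles into formats that Vertex AI supports.
Bug fixes
- Fixes conditional logic errors in the handling of function calls and reasoning content so that responses are parsed and returned correctly.
Other
- Adds the image conversion function convertImageContent, which accepts image input as an HTTP URL or in Base64 format.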
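
The multi-modal item in the list above says images may arrive as a URL or as Base64. A minimal sketch of how the two forms might be told apart follows. The names `imageSource` and `parseImageSource` are hypothetical and do not come from vertex.go, and the real convertImageContent may work differently; the sketch also assumes Base64 images are passed as `data:<mime>;base64,<payload>` URLs with no extra parameters.

```go
package main

import (
	"fmt"
	"strings"
)

// imageSource is a hypothetical intermediate form, not a type from vertex.go.
type imageSource struct {
	IsInline bool   // true when the image arrived as a Base64 data URL
	MimeType string // MIME type declared in the data URL (inline only)
	Data     string // Base64 payload (inline) or the original URL (remote)
}

// parseImageSource distinguishes "data:<mime>;base64,<payload>" from http(s) URLs.
// Assumption: data URLs carry exactly one parameter, "base64".
func parseImageSource(raw string) (imageSource, error) {
	if strings.HasPrefix(raw, "data:") {
		header, payload, found := strings.Cut(strings.TrimPrefix(raw, "data:"), ",")
		if !found {
			return imageSource{}, fmt.Errorf("malformed data URL: missing comma")
		}
		mimeType, encoding, _ := strings.Cut(header, ";")
		if encoding != "base64" {
			return imageSource{}, fmt.Errorf("unsupported data URL encoding %q", encoding)
		}
		return imageSource{IsInline: true, MimeType: mimeType, Data: payload}, nil
	}
	if strings.HasPrefix(raw, "http://") || strings.HasPrefix(raw, "https://") {
		return imageSource{Data: raw}, nil
	}
	return imageSource{}, fmt.Errorf("unsupported image reference")
}

func main() {
	src, err := parseImageSource("data:image/png;base64,aGVsbG8=")
	fmt.Println(src.IsInline, src.MimeType, err)
}
```

Returning an error for anything that is neither a data URL nor an http(s) URL keeps unexpected schemes from being forwarded upstream unchecked.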

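Changed files

File path	Description of change
plugins/wasm-go/extensions/ai-proxy/provider/vertex.go	Adds support for Vertex AI function calls, reasoning context, and multi-modal input; refactors the message handling logic to meet the requirements of the Vertex AI API.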

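Sequence diagram

sequenceDiagram
    participant Client
    participant VertexProvider
    participant VertexAI
    Client->>VertexProvider: Send chat request (with tool calls or images)
    VertexProvider->>VertexProvider: Build request (convert roles, process multi-modal content)
    VertexProvider->>VertexAI: Call Vertex AI API
    VertexAI-->>VertexProvider: Return response (with function calls or reasoning content)
    VertexProvider->>Client: Return normalized response (with tool calls or reasoning markers)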

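💡 Tips

Ways to interact with lingma-agents

📜 Reply to a comment directly
Reply directly to this comment and lingma-agents will process your request automatically. For example:

Add detailed explanatory comments to the current code.
Describe the LRU refactoring plan you mentioned in detail and illustrate it with pseudocode.

📜 Mark a line of code
Create a comment at a specific location in a file and mention @lingma-agents. For example:

@lingma-agents Analyze the performance bottlenecks of this method and suggest optimizations.
@lingma-agents Generate optimized code for this method.

📜 Ask in a discussion
Mention @lingma-agents in any discussion to get help. For example:

@lingma-agents Summarize the discussion above and propose a solution.
@lingma-agents Generate optimized code based on the discussion.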

lingma-agents

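🔎 Code review report

🎯 Review overview

Severity	Count	Description
🔴 Blocker	0	Blocking issue that must be fixed immediately, for example a system crash, an unavailable key feature, or a severe security vulnerability.
🟠 Critical	1	Severe issue to fix with high priority, for example a core feature malfunction or a performance bottleneck that affects the user experience.
🟡 Major	2	Significant issue that should be fixed, for example a defect in a non-core feature or poor code maintainability.
🟢 Minor	0	Minor issue to optimize at your discretion, for example non-standard code formatting or missing comments.

Total: 3 issues

📋 Review details

💡 Code implementation suggestions

The following file-level suggestions focus on readability, maintainability, and potential problems.

🔹 plugins/wasm-go/extensions/ai-proxy/provider/vertex.go (3 💬)

🚀 Architecture and design suggestions

The following is an overall analysis of the code architecture and design, focusing on cross-file interactions, system consistency, and room for optimization.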

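🔍1. Incomplete error handling when serializing function call arguments

When handling function calls, the code ignores JSON serialization errors (for example at line 267 of vertex.go), which can lead to runtime errors or data loss. These errors should be handled properly to keep the system robust and reliable.

📌 Key code

plugins/wasm-go/extensions/ai-proxy/provider/vertex.go (267-267)

args, _ := json.Marshal(part.FunctionCall.Args)

⚠️ Potential risk

Ignoring serialization errors can cause crashes or inconsistent data and affect service stability.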
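
One way to surface the serialization error described above is to wrap `json.Marshal` in a small helper, sketched below. `marshalArgs` is a hypothetical name, and the argument type `map[string]interface{}` is an assumption about `part.FunctionCall.Args`; the real field type in vertex.go is not shown in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalArgs is a hypothetical helper. It serializes function call arguments
// and always returns a string that is a valid JSON object, so callers can log
// the error and still emit a well-formed tool call.
func marshalArgs(args map[string]interface{}) (string, error) {
	if args == nil {
		// json.Marshal would produce "null" for a nil map; use an empty object instead.
		return "{}", nil
	}
	data, err := json.Marshal(args)
	if err != nil {
		return "{}", fmt.Errorf("marshal function call args: %w", err)
	}
	return string(data), nil
}

func main() {
	s, err := marshalArgs(map[string]interface{}{"city": "Paris"})
	fmt.Println(s, err)
}
```

Returning both a fallback value and the error leaves the policy (log and continue, or fail the request) to the caller.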

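🔍2. Flawed logic for multi-part thinking content

When handling a multi-part response that contains thinking content (for example at line 278 of vertex.go), the current implementation assumes that the thinking content is followed by the answer text in the second part. This hard-coded indexing is error-prone and hard to maintain. Consider refactoring it into a more flexible content parsing mechanism.

📌 Key code

plugins/wasm-go/extensions/ai-proxy/provider/vertex.go (278-278)

choice.Message.Content = reasoningContextMarkerStart + part.Text + reasoningContextMarkerEnd + candidate.Content.Parts[1].Text

⚠️ Potential risk

If the response structure changes or the parts arrive in a different order, the content may be concatenated incorrectly and affect the user experience.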
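
A more flexible parsing approach could iterate over all parts instead of indexing `Parts[1]`. The sketch below rests on several assumptions: the `part` struct is a simplified stand-in for the real type in vertex.go, the `Thought` flag is modeled as a plain bool, and the marker values `<think>` and `</think>` are inferred from the change summary rather than read from the source.

```go
package main

import (
	"fmt"
	"strings"
)

// Assumed marker values; the actual constants in vertex.go may differ.
const (
	reasoningStart = "<think>"
	reasoningEnd   = "</think>"
)

// part is a simplified, hypothetical stand-in for a Vertex AI content part.
type part struct {
	Text    string
	Thought bool // true when the part carries reasoning rather than answer text
}

// joinParts collects reasoning and answer text separately, regardless of how
// many parts there are, then wraps the reasoning in the markers.
func joinParts(parts []part) string {
	var reasoning, answer strings.Builder
	for _, p := range parts {
		if p.Thought {
			reasoning.WriteString(p.Text)
		} else {
			answer.WriteString(p.Text)
		}
	}
	if reasoning.Len() == 0 {
		return answer.String()
	}
	return reasoningStart + reasoning.String() + reasoningEnd + answer.String()
}

func main() {
	fmt.Println(joinParts([]part{
		{Text: "plan the reply", Thought: true},
		{Text: "Hello!"},
	}))
}
```

Because the loop never indexes a fixed position, a response with a single part or with several answer parts is handled without a separate length check. The sketch always places the collected reasoning before the answer text, whatever order the parts arrive in.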

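🔍3. Deserialization errors for function call arguments are not handled properly

When building the Vertex request, the code deserializes the function call arguments (for example at line 428 of vertex.go) but only logs the error without handling it further, so later steps may use invalid data. The error handling should be strengthened to preserve data integrity.

📌 Key code

plugins/wasm-go/extensions/ai-proxy/provider/vertex.go (428-428)

if err := json.Unmarshal([]byte(message.ToolCalls[0].Function.Arguments), &args); err != nil

⚠️ Potential risk

A failed deserialization can cause the function call arguments to be lost, which affects the accuracy of the model's inference results.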
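
The deserialization side can be hardened in a similar way. In the sketch below, `parseToolArguments` is a hypothetical helper, and treating an empty string or JSON `null` as an empty argument object is an assumption about the desired behavior, not something the PR specifies.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parseToolArguments is a hypothetical helper. It decodes the JSON string
// carried in a tool call and never returns a nil map, so callers can use the
// result even after logging an error.
func parseToolArguments(raw string) (map[string]interface{}, error) {
	if strings.TrimSpace(raw) == "" {
		return map[string]interface{}{}, nil
	}
	var args map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &args); err != nil {
		// Discard anything partially decoded and return a clean empty map.
		return map[string]interface{}{}, fmt.Errorf("unmarshal function arguments: %w", err)
	}
	if args == nil { // raw was the JSON literal null
		args = map[string]interface{}{}
	}
	return args, nil
}

func main() {
	args, err := parseToolArguments(`{"city":"Paris"}`)
	fmt.Println(args["city"], err)
}
```

A caller would also need to confirm that `message.ToolCalls` is non-empty before reading index 0, which this helper does not cover.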

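🔍4. Security risk in the image content conversion logic

The convertImageContent function derives the MIME type directly from the file extension in the URL (for example at lines 789-794 of vertex.go), which is vulnerable to malicious input. Consider verifying the MIME type by inspecting the content to improve security.

📌 Key code

plugins/wasm-go/extensions/ai-proxy/provider/vertex.go (789-794)

arr := strings.Split(imageUrl, ".")
mimeType := "image/" + arr[len(arr)-1]

⚠️ Potential risk

A malicious user could upload a non-image file disguised as an image, creating a potential security vulnerability.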
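
Content-based detection, as suggested above, could look roughly like the following sketch for Base64 payloads. `detectImageMime` and the allowlist are hypothetical, and whether `net/http` can be imported in the Wasm plugin build is not confirmed in this thread. For remote URLs the bytes would have to be fetched first, which the sketch does not attempt.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/http"
)

// allowedImageTypes is an assumed allowlist; the set of image types that
// Vertex AI actually accepts should be checked against its documentation.
var allowedImageTypes = map[string]bool{
	"image/png":  true,
	"image/jpeg": true,
	"image/gif":  true,
	"image/webp": true,
}

// detectImageMime decodes a Base64 payload and sniffs its MIME type from the
// leading bytes instead of trusting a file extension.
func detectImageMime(b64 string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(b64)
	if err != nil {
		return "", fmt.Errorf("decode image payload: %w", err)
	}
	mimeType := http.DetectContentType(raw)
	if !allowedImageTypes[mimeType] {
		return "", fmt.Errorf("unsupported or non-image content type %q", mimeType)
	}
	return mimeType, nil
}

func main() {
	// "iVBORw0KGgo=" is the Base64 encoding of the 8-byte PNG signature.
	mimeType, err := detectImageMime("iVBORw0KGgo=")
	fmt.Println(mimeType, err)
}
```

`http.DetectContentType` inspects at most the first 512 bytes, so it confirms the file signature but does not validate the rest of the image.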

Review details

📒 File list (1 file)

📝 Changed: 1 file

📝 Changed files:

plugins/wasm-go/extensions/ai-proxy/provider/vertex.go

lingma-agents · 2025-09-16T08:13:14Z

plugins/wasm-go/extensions/ai-proxy/provider/vertex.go

-			choice.Message.Content = candidate.Content.Parts[0].Text
+			part := candidate.Content.Parts[0]
+			if part.FunctionCall != nil {
+				args, _ := json.Marshal(part.FunctionCall.Args)


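JSON serialization errors should not be ignored when handling function call arguments.

🟡 Major | 🐞 Bugs

📋 Issue details

The current code ignores the error that may occur when serializing part.FunctionCall.Args to a JSON string (the error is discarded with _). If the arguments cannot be serialized, the result may be incomplete or incorrect tool call content, which affects downstream processing.

💡 Solution

Handle the possible json.Marshal error explicitly: on failure, log the error and fall back to a default value (such as the empty object {}) to improve robustness.

- args, _ := json.Marshal(part.FunctionCall.Args)
+ args, err := json.Marshal(part.FunctionCall.Args)
+ if err != nil {
+ 	log.Errorf("failed to marshal function call args: %v", err)
+ 	args = []byte("{}")
+ }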

lingma-agents · 2025-09-16T08:13:14Z

plugins/wasm-go/extensions/ai-proxy/provider/vertex.go

+					},
+				}
+			} else if part.Thounght != nil && len(candidate.Content.Parts) > 1 {
+				choice.Message.Content = reasoningContextMarkerStart + part.Text + reasoningContextMarkerEnd + candidate.Content.Parts[1].Text


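The logic for handling multi-part thinking content is flawed.

🟠 Critical | 🐞 Bugs

📋 Issue details

When handling thinking content, the current code assumes that the first entry in candidate.Content.Parts is the thinking content and the second is the answer text. If there are too few parts or they arrive in a different order, this can cause an index out of range error or scrambled content.

💡 Solution

Add a check on the length of candidate.Content.Parts so that accessing Parts[1] cannot go out of range.

- choice.Message.Content = reasoningContextMarkerStart + part.Text + reasoningContextMarkerEnd + candidate.Content.Parts[1].Text
+ if len(candidate.Content.Parts) > 1 {
+ 	choice.Message.Content = reasoningContextMarkerStart + part.Text + reasoningContextMarkerEnd + candidate.Content.Parts[1].Text
+ } else {
+ 	choice.Message.Content = reasoningContextMarkerStart + part.Text + reasoningContextMarkerEnd
+ }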

lingma-agents · 2025-09-16T08:13:14Z

plugins/wasm-go/extensions/ai-proxy/provider/vertex.go

+			if err := json.Unmarshal([]byte(message.ToolCalls[0].Function.Arguments), &args); err != nil {
+				log.Errorf("unable to unmarshal function arguments: %v", err)
+			}


Deserialization errors for function call arguments are not handled properly.

🟡 Major | 🐞 Bugs

📋 Issue details

When deserializing the tool call arguments, a json.Unmarshal failure is only logged and no further action is taken. Later logic may then use an empty args value, which can cause problems.

💡 Solution

When deserialization fails, give args a default value (such as an empty map) so that later logic does not break.

- if err := json.Unmarshal([]byte(message.ToolCalls[0].Function.Arguments), &args); err != nil {
- 	log.Errorf("unable to unmarshal function arguments: %v", err)
- }
+ if err := json.Unmarshal([]byte(message.ToolCalls[0].Function.Arguments), &args); err != nil {
+ 	log.Errorf("unable to unmarshal function arguments: %v", err)
+ 	args = make(map[string]interface{})
+ }

codecov-commenter · 2025-09-16T08:59:28Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 44.96%. Comparing base (ef31e09) to head (a55b615).
⚠️ Report is 708 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2926      +/-   ##
==========================================
+ Coverage   35.91%   44.96%   +9.05%     
==========================================
  Files          69       82      +13     
  Lines       11576    13383    +1807     
==========================================
+ Hits         4157     6018    +1861     
+ Misses       7104     7017      -87     
- Partials      315      348      +33

see 80 files with indirect coverage changes

johnlanni

LGTM

vertex support multi-modal, function call and thinking

e62e695

rinfx requested review from johnlanni and wydream as code owners September 16, 2025 08:10

Merge branch 'main' into new-features-for-vertex

2194acb

lingma-agents bot reviewed Sep 16, 2025

adjust thinking budget

e9135fb

rinfx added 2 commits September 17, 2025 16:43

Merge branch 'main' into new-features-for-vertex

148152d

add data: [DONE] at the end of stream

a55b615

johnlanni approved these changes Sep 18, 2025

johnlanni merged commit d7bebf7 into alibaba:main Sep 18, 2025
15 checks passed

ink-hz pushed a commit to ink-hz/higress-ai-capability-auth that referenced this pull request Nov 5, 2025

vertex support multi-modal, function call and thinking (alibaba#2926)

62fbf65
