fix(message_processing): return resolved path for file:// URL audio blocks #4021
In `_process_single_file_block`, when a media block's source has
`{"type": "url", "url": "file://..."}`, the local path was correctly
extracted via `urllib.request.url2pathname()` but then immediately
overwritten by an unconditional call to `download_file_from_url(url, ...)`,
which is HTTP-only and returns `None` for `file://` URLs.
Net effect: `_process_single_file_block` returned `None` for
`file://` audio blocks, `_process_audio_block` was never reached, and
the agent received the raw audio block instead of transcribed text.
Symptom: Telegram voice messages with `audio_mode="auto"` and a
configured Whisper provider reached the LLM untranscribed, even though
the transcription provider was reachable and the workspace media
file existed locally.
This is orthogonal to PR agentscope-ai#1896 (which normalises `data` -> `source`
for blocks that lack a `source` dict). Once agentscope-ai#1896 hands a proper
`{"type": "url", "url": "file://..."}` source to this function, the
present bug still swallows it; both fixes are needed independently.
Fix: short-circuit the file:// branch with `return local_path`
immediately after `url2pathname()` succeeds, so the HTTP downloader
is never called for local-file URLs.
Co-Authored-By: Ren (Claude) <noreply@anthropic.com>
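
The control flow described in the commit message above can be sketched as follows. This is a minimal illustration, not the actual code in `message_processing.py`: `http_only_download`, `resolve_buggy`, and `resolve_fixed` are hypothetical names, and the downloader is a stand-in that only mimics the "returns `None` for non-HTTP URLs" behaviour described for `download_file_from_url()`.

```python
from typing import Optional
from urllib.parse import urlparse
from urllib.request import url2pathname


def http_only_download(url: str) -> Optional[str]:
    """Hypothetical stand-in for download_file_from_url(): HTTP(S) only."""
    if urlparse(url).scheme not in ("http", "https"):
        return None
    # Placeholder result; this sketch performs no network access.
    return "/tmp/downloaded-file"


def resolve_buggy(source: dict) -> Optional[str]:
    """Pre-fix flow: the local path is resolved, then overwritten."""
    url = source["url"]
    parsed = urlparse(url)
    local_path = None
    if parsed.scheme == "file":
        local_path = url2pathname(parsed.path)
    # The unconditional call clobbers the path resolved above.
    local_path = http_only_download(url)
    return local_path


def resolve_fixed(source: dict) -> Optional[str]:
    """Post-fix flow: return early for file:// URLs."""
    url = source["url"]
    parsed = urlparse(url)
    if parsed.scheme == "file":
        # Short-circuit so the HTTP downloader is never called.
        return url2pathname(parsed.path)
    return http_only_download(url)


if __name__ == "__main__":
    block_source = {"type": "url", "url": "file:///tmp/voice.oga"}
    print("buggy:", resolve_buggy(block_source))
    print("fixed:", resolve_fixed(block_source))
```

The early return keeps the HTTP branch untouched, which matches the stated goal of no behaviour change for HTTP URLs.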
@qbc2016 Please help review this PR.
Welcome to QwenPaw! 🎉 Thank you @karls0r for your first contribution! Your PR has been merged. 🚀
…locks (agentscope-ai#4021) (cherry picked from commit 534d8ba)


Summary
Fixes audio (and any media) blocks delivered with a `{"type": "url", "url": "file://..."}` source: the local path was extracted correctly via `url2pathname()` but then unconditionally overwritten by `download_file_from_url()`, which is HTTP-only and returns `None` for `file://` URLs. As a result, `_process_single_file_block` returned `None` for `file://` audio blocks, `_process_audio_block` was never invoked, and the agent saw the raw audio block instead of transcribed text.

Setup that reproduces it
- `main`
- `audio_mode = "auto"`
- Whisper provider (`/v1/audio/transcriptions` endpoint)

Send a voice message; the workspace media file lands in `~/.qwenpaw/workspaces/<agent>/media/<id>.oga`. The constructed audio block has `source = {"type": "url", "url": "file:///.../<id>.oga"}`. Without this fix, the LLM receives that audio block raw instead of the transcribed `[Voice message]: ...`.

Relation to #1896
Orthogonal. PR #1896 normalises `data` → `{"type": "url", "url": "file://..."}` for blocks that arrive without a `source` dict. After that normalisation lands, `_process_single_file_block` still drops the resulting `file://` URL on the floor; that's what this PR fixes. Both can land independently; the audio pipeline only works end-to-end with both.

Change
`src/qwenpaw/agents/utils/message_processing.py`: for the `parsed.scheme == "file"` branch, return the resolved local path immediately instead of falling through to `download_file_from_url()`. Six added lines, no removals, no behaviour change for HTTP URLs.

Tests
Happy to add unit coverage for `_process_single_file_block` (currently uncovered) if that helps the review. Let me know if you'd prefer the PR to grow.

Issue
Confirmed in #1516 (comment).

🤖 Authored with Claude Code (Ren, Anthropic Claude Opus 4.7); final review and submission by @karls0r.
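
For the unit coverage offered under Tests, one possible shape is a round-trip check: write a temporary media file, build the same kind of `file://` source dict described in the setup, and confirm the resolved path points back at the file. This is only a sketch under assumptions: `local_path_from_file_url` and `check_round_trip` are hypothetical helpers that mirror the `parsed.scheme == "file"` branch, not the project's `_process_single_file_block`.

```python
import tempfile
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse
from urllib.request import url2pathname


def local_path_from_file_url(url: str) -> Optional[str]:
    """Hypothetical helper: resolve a file:// URL to a local path."""
    parsed = urlparse(url)
    if parsed.scheme != "file":
        return None
    # url2pathname() also decodes percent-escapes such as %20.
    return url2pathname(parsed.path)


def check_round_trip() -> bool:
    """Create a temp media file and resolve it via its file:// URL."""
    with tempfile.TemporaryDirectory() as tmp:
        # A space in the name exercises percent-encoding in the URL.
        media = Path(tmp).resolve() / "voice message.oga"
        media.write_bytes(b"")
        source = {"type": "url", "url": media.as_uri()}
        resolved = local_path_from_file_url(source["url"])
        return (
            resolved is not None
            and Path(resolved) == media
            and Path(resolved).is_file()
        )


if __name__ == "__main__":
    print("round trip ok:", check_round_trip())
```

A real test would call `_process_single_file_block` directly; the helper here only stands in for the branch this PR changes.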