
feat(telegram): streaming response via sendMessageDraft for private chats #1433

Open
El-Chiang wants to merge 69 commits into HKUDS:main from El-Chiang:feat/telegram-streaming-draft

Conversation

@El-Chiang

Summary

Add real-time streaming response support for Telegram private chats using the sendMessageDraft API (Bot API 9.3+, opened to all bots in 9.5).

What it does

When a user messages the bot in a private Telegram chat, the bot now streams its response in real-time — the user sees text appearing character by character, similar to ChatGPT's typing effect.

How it works

  1. Detection: _should_stream_private_telegram() checks whether the incoming message comes from a Telegram private chat
  2. Streaming: during the first LLM turn, tokens are streamed via the on_stream_chunk callback
  3. Draft updates: sendMessageDraft is called with throttling (350ms minimum interval) to push partial text to the user
  4. Finalization: when generation completes, the draft is cleaned up and the final message is sent via a normal sendMessage
  5. Fallback: if sendMessageDraft fails, the bot silently falls back to normal non-streaming behavior
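The flow above can be sketched as a small throttled streamer. This is a minimal illustration only: `DraftStreamer` and its callback wiring are hypothetical names, not the PR's actual classes.

```python
import time

DRAFT_INTERVAL = 0.35  # 350ms minimum between sendMessageDraft pushes

class DraftStreamer:
    """Hypothetical sketch of the throttled draft-update flow."""

    def __init__(self, send_draft, send_message):
        self.send_draft = send_draft      # wraps Telegram sendMessageDraft
        self.send_message = send_message  # wraps the normal sendMessage
        self.buffer = ""
        self.last_push = float("-inf")    # guarantee the first push goes out
        self.failed = False

    def on_stream_chunk(self, chunk: str) -> None:
        """Accumulate tokens; push a draft at most every DRAFT_INTERVAL."""
        self.buffer += chunk
        now = time.monotonic()
        if self.failed or now - self.last_push < DRAFT_INTERVAL:
            return  # throttled, or drafts already known to fail
        try:
            self.send_draft(self.buffer)
            self.last_push = now
        except Exception:
            self.failed = True  # graceful degradation: stop drafting

    def finalize(self) -> None:
        """Generation done: send the full text via normal sendMessage."""
        self.send_message(self.buffer)
```

Note the failure flag: a single sendMessageDraft error permanently disables drafting for the message, so the user still gets the final reply through the ordinary path.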

Key design decisions

  • Private chats only: a Telegram API limitation; sendMessageDraft works only in private chats, not in groups
  • First turn only: streaming applies only to the first LLM generation turn; subsequent tool-call turns use normal sending
  • Throttled updates: a 350ms minimum interval between draft pushes avoids rate limiting
  • Graceful degradation: any draft API failure silently falls back to standard message sending

Files changed

| File | Changes |
| --- | --- |
| nanobot/agent/loop.py | Add streaming detection, stream chunk callback, stream_id management |
| nanobot/channels/telegram.py | Full sendMessageDraft implementation with draft state management |
| nanobot/providers/base.py | Add stream and on_stream_chunk params to the chat() interface |
| nanobot/providers/litellm_provider.py | Implement streaming in the LiteLLM provider |
| nanobot/providers/custom_provider.py | Implement streaming in the custom provider |
| nanobot/providers/openai_codex_provider.py | Signature update for compatibility |
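The provider-layer change amounts to two extra parameters on chat(). The toy `EchoProvider` below illustrates the contract only; it is not the PR's code, and the real interface in nanobot/providers/base.py is async and returns richer objects.

```python
from typing import Callable, Optional

class EchoProvider:
    """Toy provider illustrating the extended chat() contract: when
    stream=True, each token is pushed through on_stream_chunk before
    the full response is returned."""

    def chat(
        self,
        messages: list[dict],
        stream: bool = False,
        on_stream_chunk: Optional[Callable[[str], None]] = None,
    ) -> str:
        reply = "hello world"
        if stream and on_stream_chunk:
            for token in reply.split():
                on_stream_chunk(token)  # partial text to the channel layer
        return reply  # full text is still returned for session history
```

The key property is that callers who ignore the new parameters see unchanged behavior, which is why only a signature update was needed in openai_codex_provider.py.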

References

El-Chiang and others added 30 commits February 9, 2026 17:36
…hat support

- Remove global litellm.api_base setting that interfered with provider routing
  when env vars like ANTHROPIC_BASE_URL were present
- Pass api_key explicitly per-request to prevent env var interference
- Add group chat support for DingTalk channel (auto-detect conversationType,
  route to groupMessages/send API for group, oToMessages/batchSend for private)
- Pass metadata through OutboundMessage so DingTalk send() knows the chat type
- Add providers/ module documentation (PROVIDERS.md)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
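The group/private routing described in this commit might look like the sketch below. The endpoint paths follow DingTalk's robot API naming from the commit message, and the conversationType values are assumptions; treat the helper as illustrative.

```python
def dingtalk_send_endpoint(conversation_type: str) -> str:
    """Sketch of endpoint routing by chat type (values assumed):
    group chats go to groupMessages/send, one-to-one chats to
    oToMessages/batchSend."""
    if conversation_type == "2":  # assumed: "2" marks a group conversation
        return "/v1.0/robot/groupMessages/send"
    return "/v1.0/robot/oToMessages/batchSend"
```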
Add conversation_title extraction from DingTalk messages to improve
group chat context. Group messages now include a [群:title] prefix
and sender name so the agent knows who is speaking in which group.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add MCPManager for connecting external MCP servers and registering tools
- Add agent-browser, search-sessions, and send-ding skills
- Add subagent support with session management
- Update DingTalk channel with conversation title in messages
- Fix cron jobs.json to preserve Chinese characters (ensure_ascii=False)
- Add MCP server configuration to schema and CLI

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
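The ensure_ascii fix mentioned above is standard json behavior: by default json.dumps escapes non-ASCII characters, which makes Chinese job names unreadable on disk.

```python
import json

# Hypothetical cron job entry with a Chinese name
jobs = [{"name": "早安提醒", "cron": "0 8 * * *"}]

escaped = json.dumps(jobs)                       # Chinese becomes \uXXXX escapes
readable = json.dumps(jobs, ensure_ascii=False)  # Chinese preserved as-is
```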
External devices (e.g. M5Stack) can POST to /api/chat and block until
the agent replies, completing one conversation turn per HTTP round-trip.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
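A device-side client for this endpoint can be sketched with the standard library. The /api/chat path comes from the commit message; the {"message": ...} payload shape and the JSON response body are assumptions.

```python
import json
import urllib.request

def build_chat_request(base_url: str, text: str) -> urllib.request.Request:
    """Build one blocking chat request (payload shape assumed)."""
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps({"message": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def chat_once(base_url: str, text: str, timeout: float = 120.0) -> dict:
    """One conversation turn per HTTP round-trip: POST the user text,
    then block until the agent's reply arrives in the response body."""
    req = build_chat_request(base_url, text)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

The long timeout matters: the server holds the connection open for the whole agent turn, so the client must wait well past typical HTTP defaults.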
DingTalk channel can now upload and send images via OApi media endpoint.
Message tool gains a media parameter for attaching local image files and
passes through channel metadata for request-response correlation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Track finish_reason through iterations and provide informative messages
when the model returns empty content or hits the iteration limit.
Increase default max_tool_iterations from 20 to 50.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Log compact previews of model responses including finish_reason,
tool call names, and truncated content for easier debugging.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…oding

Add 30s timeout to MCP server connections to prevent a slow/unreachable
server from blocking the entire gateway startup. Also fix HTTP channel
JSON responses to output readable UTF-8 instead of escaped unicode, and
default HttpConfig host to 127.0.0.1 for security.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
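The 30s connection bound is a straightforward asyncio.wait_for wrapper. The helper name below is hypothetical; only the timeout value and its purpose come from the commit message.

```python
import asyncio

MCP_CONNECT_TIMEOUT = 30.0  # per-server bound, from the commit message

async def connect_with_timeout(connect_coro, server_name: str,
                               timeout: float = MCP_CONNECT_TIMEOUT):
    """Sketch (helper name hypothetical): bound each MCP server connection
    so one slow or unreachable server cannot block gateway startup."""
    try:
        return await asyncio.wait_for(connect_coro, timeout=timeout)
    except asyncio.TimeoutError:
        print(f"MCP server {server_name!r} timed out after {timeout}s; skipping")
        return None  # gateway continues without this server's tools
```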
… misc improvements

- Add thinking (enabled/adaptive) and effort (low/medium/high/max) params to AgentLoop and LiteLLM provider
- Add configurable memory_daily_subdir for daily notes path
- Rewrite CLAUDE.md in Chinese, streamline content
- Improve DingTalk group message detection by chat_id prefix fallback
- Update system prompt identity and simplify instructions
- Add journal skill filtering rules for sensitive content

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously, tool calls and results were only kept in memory during a
single message processing cycle and discarded when saving to session.
This caused the model to "forget" it had used tools when the user
asked follow-up questions in the next turn.

Now prepends a <tool_use_summary> block to the saved assistant message
with truncated args (100 chars) and results (200 chars), so the model
retains awareness of its tool usage across conversation turns.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Write daily rotating logs to ~/.nanobot/logs/ via loguru file sink,
enabled automatically in both agent and gateway commands. Expose log
path in system prompt so the assistant can search logs when needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…in assistant content

Previously tool use summaries were prepended to assistant content as
<tool_use_summary> XML tags, which caused the model to mimic the format
in its replies. Now persists them as a synthetic _tool_use_summary tool
call + tool result pair so the model retains tool awareness across turns
without polluting its output.

Also:
- Handle empty response after message tool as normal completion
- Add tool_use_log tracking to _process_system_message
- Update system prompt with restart log path and memory reminder
- Minor journal skill wording tweaks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add meme/sticker support to journal skill with usage guidelines
- Add INFO log for tool use summary when saving to session

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add _send_with_media() for sending photos via Telegram Bot API
- Support single photo with send_photo and album with send_media_group
- Add reply_to_message_id support for all message types
- Refactor send logic into _send_text() and _send_with_media()
- Add long message splitting (4096 char limit)
- Improve incoming message handling with reply context
- Add _contains_silent_marker() to detect [SILENT] in model output
- Suppress outbound message when model outputs [SILENT]
- Update system prompt with message dedup rules
- Prevents double-sending when message tool already sent the reply
- Add allow_bot_messages and allow_bot_from config options
- Skip webhook messages and self-bot messages
- Support allowlist-based bot message filtering
- Track bot's own user ID from READY event
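The long-message splitting mentioned above targets Telegram's hard 4096-character limit per message. The PR's exact chunking rules are not shown, so this sketch picks one reasonable strategy: prefer breaking at newlines, fall back to hard cuts.

```python
TELEGRAM_MAX_LEN = 4096  # Telegram's per-message character limit

def split_long_message(text: str, limit: int = TELEGRAM_MAX_LEN) -> list[str]:
    """Sketch of long-message splitting (chunking strategy assumed)."""
    parts = []
    while len(text) > limit:
        cut = text.rfind("\n", 0, limit)  # prefer a newline boundary
        if cut <= 0:
            cut = limit                   # no newline: hard cut at the limit
        parts.append(text[:cut])
        text = text[cut:].lstrip("\n")
    if text:
        parts.append(text)
    return parts
```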
El-Chiang and others added 24 commits February 20, 2026 11:02
Co-authored-by: Codex <noreply@openai.com>
Heartbeat and cron notification messages were written to isolated sessions
("heartbeat" / "cron:{job_id}"), so when users replied to a notification,
the agent had no context of what it had sent.

Changes:
- Add _route_session_key() and _cron_execution_session_key() helpers
- Heartbeat: use channel:chat_id as session key instead of fixed "heartbeat"
- Cron (deliver=True): use target user session instead of cron:{job_id}
- Cache heartbeat_target to prevent target drift between execute and notify
- Add 4 unit tests for routing logic

Closes: specs/2026-02-28_heartbeat-cron-history-injection.md
…hats

- Add sendMessageDraft support for real-time streaming in Telegram private chats
- Implement draft state management with throttling (350ms interval)
- Add graceful fallback when sendMessageDraft fails
- Stream only first LLM turn (tool calls use normal send)
- Provider layer: add stream + on_stream_chunk callback to chat()
- Finalize draft before sending final message to avoid duplicates

Refs: Telegram Bot API 9.3+ sendMessageDraft method
@wenjielei1990
Contributor

Nice feature. I'm also looking into the same streaming capability.

But given the large number of files changed (62 files) and the amount of Chinese text in this commit, I wonder if this will ever get merged, given the reviewer's limited bandwidth.
