docs(tts): add Doubao (Chinese seed-tts-2.0) command-provider example #18065

Closed
Hypnus-Yuan wants to merge 1 commit into NousResearch:main from Hypnus-Yuan:docs/tts-doubao-integration

Conversation

@Hypnus-Yuan
Contributor

Summary

Adds a worked example under the "Custom command providers" section of the Voice & TTS docs, showing how to use Volcengine/ByteDance's Doubao seed-tts-2.0 bidirectional-streaming TTS via the existing tts.providers.<name> command-type surface that was merged in #17843.

No code changes — this is a docs-only PR. It documents an integration path users can take today, using the newly-published doubao-tts PyPI package as the command backend.

Why

  • Doubao seed-tts-2.0 is one of the strongest Chinese TTS models commercially available (native speaker IDs, emotion control, streaming).
  • Chinese-speaking Hermes users already ask for a way to plug it in; until now the answer was "write your own CLI wrapper."
  • #17843's command-type provider makes this a one-liner — the doc just needed to show people how.

What it looks like

After the existing piper-custom example, a new subsection:

Example: Doubao (Chinese seed-tts-2.0)

pip install doubao-tts
export VOLCENGINE_APP_ID="your-app-id"
export VOLCENGINE_ACCESS_TOKEN="your-access-token"

tts:
  provider: doubao
  providers:
    doubao:
      type: command
      command: "doubao-tts say --text-file {input_path} --out {output_path}"
      output_format: mp3
      max_text_length: 1024
      timeout: 30
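
For reviewers unfamiliar with the command-type surface, here is a rough sketch of what a provider like this plausibly does with the config above: fill the `{input_path}` and `{output_path}` placeholders, run the command under the configured timeout, and check that an audio file appeared. This is not the Hermes implementation from #17843. The function names `render_command` and `synthesize` are hypothetical, and truncating at `max_text_length` is an assumption (the real provider may reject or chunk long text instead).

```python
import shlex
import subprocess
import tempfile
from pathlib import Path


def render_command(template: str, input_path: str, output_path: str) -> list[str]:
    """Split the template into argv first, then fill placeholders per token,
    so a path containing spaces stays a single argument."""
    return [
        token.replace("{input_path}", input_path).replace("{output_path}", output_path)
        for token in shlex.split(template)
    ]


def synthesize(text: str, template: str, output_format: str = "mp3",
               max_text_length: int = 1024, timeout: float = 30.0) -> Path:
    """Hypothetical runner: write the text to a temp file, call the CLI,
    and return the path of the audio file it produced."""
    workdir = Path(tempfile.mkdtemp(prefix="tts-"))
    input_path = workdir / "input.txt"
    output_path = workdir / f"output.{output_format}"

    # Assumption: over-long text is truncated before synthesis.
    input_path.write_text(text[:max_text_length], encoding="utf-8")

    argv = render_command(template, str(input_path), str(output_path))
    # No shell is involved. Raises CalledProcessError on a non-zero exit
    # status and TimeoutExpired if the command exceeds the timeout.
    subprocess.run(argv, check=True, timeout=timeout)

    if not output_path.exists():
        raise RuntimeError(f"command produced no file at {output_path}")
    return output_path
```

With the template from the example, `render_command` yields `doubao-tts say --text-file <input> --out <output>` as an argument list. Splitting before substitution avoids shell-quoting problems when the temp paths contain spaces.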

Testing

  • npm run build (Docusaurus) passes locally — build completes cleanly, no new broken anchors introduced by this diff.
  • ASCII-guard: scanned the added lines; no suspicious non-ASCII beyond the legitimate em-dashes already used throughout the doc.
  • The doubao-tts package itself is tested separately: 69 unit tests, 95% coverage, mypy strict; ships from https://github.com/Hypnus-Yuan/doubao-tts.
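
The ASCII-guard check above can be approximated with a few lines of Python. This is a minimal sketch, not the actual script used for this PR: the helper name `find_suspicious` and the allow-list containing only the em-dash (U+2014) are assumptions for illustration.

```python
# Characters that are non-ASCII but expected in the doc (assumed: em-dash only).
ALLOWED = {"\u2014"}


def find_suspicious(lines, allowed=ALLOWED):
    """Return (line, column, codepoint) for each non-ASCII character
    that is not on the allow-list. Line and column are 1-based."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127 and ch not in allowed:
                hits.append((lineno, col, f"U+{ord(ch):04X}"))
    return hits
```

Feeding it only the added lines of the diff (for example, the `+` lines from `git diff`) keeps the scan scoped to what this PR introduces.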

Relation to prior work

This supersedes the approach I was exploring in #17589 (closed). That PR was trying to pull TTS into a Python plugin ABC before #17843 landed; once #17843 shipped, the command-type provider became the correct layer and the plugin ABC was the wrong abstraction. This PR is the minimal doc-level follow-up that makes the #17843 surface actually discoverable for Chinese-language use cases.

Checklist

  • Docs-only change; no runtime code touched
  • npm run build passes locally
  • No new broken anchors
  • Example config verified end-to-end against the real Volcengine endpoint
  • External package (doubao-tts) is MIT-licensed, published to PyPI, public repo

@alt-glitch added the labels type/docs (Documentation improvements), P3 (Low: cosmetic, nice to have), and tool/tts (Text-to-speech and transcription) on Apr 30, 2026.
@Hypnus-Yuan force-pushed the docs/tts-doubao-integration branch from 9cc1f77 to a3c1692 on April 30, 2026 20:56.
@Hypnus-Yuan
Contributor Author

Update: migrated the example from doubao-tts to doubao-speech (the unified package that now covers both TTS and streaming ASR). Same command-provider shape, just the binary name changed.

The commit is amended + force-pushed on the same branch, so the diff vs main stays a +24 / -0 docs-only change.
