Release Version 1.12.5 - Fish Speech with Compilation Speed-Up · FranckyB/Voice-Clone-Studio

Fish Speech S2 Pro Integration and Features:

Added Fish Speech S2 Pro (a 4B-parameter TTS model) as a new engine in the UI, supporting 80+ languages and 15,000+ inline expression tags for fine-grained speech control. The model is downloaded automatically from Hugging Face. Inline [tag] syntax is supported for Fish Speech and is stripped automatically from the text when another engine is used.
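As an illustration of the tag stripping described above, here is a minimal sketch. It is not the project's actual code: the function name `strip_expression_tags`, the regular expression, and the whitespace clean-up are all assumptions, and it treats any bracketed span without nested brackets as a tag.

```python
import re

# Assumed tag shape: anything between square brackets with no nested brackets,
# e.g. [laugh] or [whisper]. The real engine may use a stricter definition.
_TAG_PATTERN = re.compile(r"\[[^\[\]]+\]")


def strip_expression_tags(text: str) -> str:
    """Remove [tag] markers and tidy the whitespace they leave behind.

    Hypothetical helper, for engines that do not understand inline tags.
    """
    # Replace each tag with a space so neighbouring words do not fuse together.
    without_tags = _TAG_PATTERN.sub(" ", text)
    # Collapse the runs of whitespace created by the replacement.
    collapsed = re.sub(r"\s+", " ", without_tags).strip()
    # Drop any space left dangling in front of punctuation.
    return re.sub(r"\s+([,.!?;:])", r"\1", collapsed)
```

With this sketch, text written for Fish Speech could be passed through `strip_expression_tags` before being handed to an engine that would otherwise read the tags aloud.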

The TTS model manager (tts_manager.py) now provides get_fish_speech() and generate_voice_clone_fish_speech() methods, handling model loading, reference audio encoding, kernel cache status reporting, and batch/paragraph generation with all Fish Speech parameters exposed.

Performance and Stability Enhancements:

Implemented Triton/Inductor GPU kernel compilation with persistent caching in models/.cache for Fish Speech, significantly accelerating repeat generations. Added Windows-specific patches for compilation compatibility and informative first-run cache warnings.
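One way to get a persistent kernel cache of this kind is to point the compiler caches at a project folder before PyTorch is imported. The sketch below is an assumption about how that could look, not the project's implementation: `configure_kernel_cache` and the `inductor`/`triton` subfolder names are hypothetical, while `TORCHINDUCTOR_CACHE_DIR` and `TRITON_CACHE_DIR` are the environment variables read by TorchInductor and Triton respectively.

```python
import os
from pathlib import Path


def configure_kernel_cache(cache_root: str = "models/.cache") -> bool:
    """Point Inductor and Triton at a persistent cache folder.

    Hypothetical helper. Returns True when the Inductor cache is still empty,
    which is when a first-run "compilation will be slow" warning makes sense.
    Call it before importing torch so the settings are picked up.
    """
    root = Path(cache_root).resolve()
    inductor_dir = root / "inductor"  # assumed subfolder name
    triton_dir = root / "triton"      # assumed subfolder name
    for directory in (inductor_dir, triton_dir):
        directory.mkdir(parents=True, exist_ok=True)

    os.environ["TORCHINDUCTOR_CACHE_DIR"] = str(inductor_dir)
    os.environ["TRITON_CACHE_DIR"] = str(triton_dir)

    # An empty folder means no compiled kernels have been saved yet.
    return not any(inductor_dir.iterdir())


if __name__ == "__main__":
    if configure_kernel_cache():
        print("Kernel cache is empty: the first generation will compile kernels.")
```

Because the cache lives on disk rather than in a temporary directory, kernels compiled during the first generation can be reused after the application restarts.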
