Version 1.12.5 - Fish Speech with Compilation Speed-Up #93
Replies: 2 comments
My 5060 Ti (16 GB VRAM) runs out of memory. Please add the ability to use a version of the Fish Audio model with lower VRAM requirements, such as Fish Audio NF4, or support --half to reduce VRAM usage. Thank you!
I finally had a chance to try it out today. I had some issues with the RTX 5090 installation, but I shared them here: I just noticed that, sadly, it's not yet possible to train or use LoRA for Fish Speech S2 Pro. Would you consider adding this soon? I have started learning how to prepare a dataset for it. That is a lot of manual work at the moment, because the [TAG] markers also have to be inserted into the .txt files, unlike how simple it was in Vibe Voice. Challenging indeed, but worth trying. 💪 I hope to see it soon; this model is so dynamic that it's insane! I'll be happy to share my experience and feedback.
Fish Speech S2 Pro Integration and Features:
Added Fish Speech S2 Pro (4B parameter TTS model) as a new engine in the UI, supporting 80+ languages and 15,000+ inline expression tags for fine-grained speech control. The model is downloaded automatically from Hugging Face. Inline [tag] syntax is supported by Fish Speech, and tags are stripped automatically when the same text is sent to other engines.
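To illustrate what stripping [tag] markers for other engines could look like, here is a minimal sketch. It is not the project's actual code: the function name `strip_inline_tags`, the regular expression, and the 40-character tag limit are all assumptions made for the example.

```python
import re

# Assumption: an inline tag is a short bracketed span such as [laugh] or
# [whisper], with no nested brackets and no line breaks inside it.
_TAG_PATTERN = re.compile(r"\[[^\[\]\n]{1,40}\]")


def strip_inline_tags(text: str) -> str:
    """Remove [tag] markers and tidy the whitespace they leave behind."""
    # Replace each tag with a space so neighbouring words do not fuse.
    without_tags = _TAG_PATTERN.sub(" ", text)
    # Collapse runs of spaces or tabs into a single space.
    collapsed = re.sub(r"[ \t]{2,}", " ", without_tags)
    # Drop a space that ended up directly before punctuation.
    collapsed = re.sub(r" +([,.!?;:])", r"\1", collapsed)
    return collapsed.strip()


if __name__ == "__main__":
    print(strip_inline_tags("Well [sigh], that was [laugh] unexpected!"))
```

A pattern this permissive would also remove ordinary bracketed text such as "[1]", so a real implementation might instead match against a known list of tags.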
The TTS model manager (tts_manager.py) now provides get_fish_speech() and generate_voice_clone_fish_speech() methods, handling model loading, reference audio encoding, kernel cache status reporting, and batch/paragraph generation with all Fish Speech parameters exposed.
Performance and Stability Enhancements:
Implemented Triton/Inductor GPU kernel compilation with persistent caching in models/.cache for Fish Speech, significantly accelerating repeat generations. Added Windows-specific patches for compilation compatibility and informative first-run cache warnings.
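For readers curious how a persistent kernel cache like this can be set up, the sketch below points the compiler caches at a folder under models/.cache. It relies on the `TORCHINDUCTOR_CACHE_DIR` and `TRITON_CACHE_DIR` environment variables, which PyTorch Inductor and Triton read; the helper names and the `inductor`/`triton` subfolder layout are assumptions for illustration, not necessarily what this release does.

```python
import os
from pathlib import Path


def configure_kernel_cache(cache_root="models/.cache"):
    """Point Inductor and Triton at persistent cache folders.

    Call this before the first torch.compile run in the process.
    The subfolder names are hypothetical.
    """
    root = Path(cache_root).resolve()
    dirs = {"inductor": root / "inductor", "triton": root / "triton"}
    for path in dirs.values():
        path.mkdir(parents=True, exist_ok=True)
    # Environment variables read by PyTorch Inductor and Triton.
    os.environ["TORCHINDUCTOR_CACHE_DIR"] = str(dirs["inductor"])
    os.environ["TRITON_CACHE_DIR"] = str(dirs["triton"])
    return dirs


def cache_is_warm(cache_root="models/.cache"):
    """Return True if the cache folder already contains at least one file."""
    root = Path(cache_root)
    return root.is_dir() and any(p.is_file() for p in root.rglob("*"))


if __name__ == "__main__":
    if not cache_is_warm():
        print("First run: kernels will be compiled, so expect a delay.")
    configure_kernel_cache()
```

Checking whether the cache is warm before configuring it is one simple way to decide when to show a first-run warning.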
This discussion was created from the release Version 1.12.5 - Fish Speech with Compilation Speed-Up.