Closed
6 changes: 6 additions & 0 deletions .env-example
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
HF_CACHE_HOST_PATH=C:\users\joshua.bailey\.cache\huggingface
HF_CACHE_CONTAINER_PATH=/home/user/app/.cache/huggingface
HF_TOKEN=hf_...

TORCH_CACHE_HOST_PATH=C:\users\joshua.bailey\.cache\torch
Comment on lines +1 to +5

Copilot AI Feb 8, 2026

.env-example hardcodes a specific Windows username/path ("C:\users\joshua.bailey\..."), which is not portable. Use a generic placeholder path (e.g., "C:\Users\<your-user>\.cache\torch") so the example applies to all users.

Suggested change
-HF_CACHE_HOST_PATH=C:\users\joshua.bailey\.cache\huggingface
-HF_CACHE_CONTAINER_PATH=/home/user/app/.cache/huggingface
-HF_TOKEN=hf_...
-TORCH_CACHE_HOST_PATH=C:\users\joshua.bailey\.cache\torch
+HF_CACHE_HOST_PATH=C:\Users\<your-user>\.cache\huggingface
+HF_CACHE_CONTAINER_PATH=/home/user/app/.cache/huggingface
+HF_TOKEN=hf_...
+TORCH_CACHE_HOST_PATH=C:\Users\<your-user>\.cache\torch

TORCH_CACHE_CONTAINER_PATH=/home/user/app/.cache/torch
Comment on lines +1 to +6

Copilot AI Feb 8, 2026

.env-example hardcodes a specific Windows username/path ("C:\users\joshua.bailey\..."), which is not portable. Use a generic placeholder path (e.g., "C:\Users\<your-user>\.cache\huggingface") so the example applies to all users.

Suggested change
-HF_CACHE_HOST_PATH=C:\users\joshua.bailey\.cache\huggingface
-HF_CACHE_CONTAINER_PATH=/home/user/app/.cache/huggingface
-HF_TOKEN=hf_...
-TORCH_CACHE_HOST_PATH=C:\users\joshua.bailey\.cache\torch
-TORCH_CACHE_CONTAINER_PATH=/home/user/app/.cache/torch
+HF_CACHE_HOST_PATH=C:\Users\<your-user>\.cache\huggingface
+HF_CACHE_CONTAINER_PATH=/home/<your-user>/app/.cache/huggingface
+HF_TOKEN=hf_...
+TORCH_CACHE_HOST_PATH=C:\Users\<your-user>\.cache\torch
+TORCH_CACHE_CONTAINER_PATH=/home/<your-user>/app/.cache/torch

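One way to avoid a hardcoded username in the example file is to have each developer generate the host-side entries locally. The following is only a sketch, not part of this PR: `default_env_lines` is a hypothetical helper, and it assumes the caches live under `.cache/huggingface` and `.cache/torch` in the user's home directory, as the paths in `.env-example` suggest.

```python
from pathlib import Path, PurePath


def default_env_lines(home: PurePath) -> list[str]:
    """Build .env entries for the host-side cache paths under a given home directory.

    Hypothetical helper for illustration; the variable names come from .env-example.
    """
    cache = home / ".cache"
    return [
        f"HF_CACHE_HOST_PATH={cache / 'huggingface'}",
        f"TORCH_CACHE_HOST_PATH={cache / 'torch'}",
    ]


if __name__ == "__main__":
    # Path.home() resolves to the current user's home directory on Windows, macOS, and Linux.
    print("\n".join(default_env_lines(Path.home())))
```

The printed lines can be pasted into a local `.env`, which this PR already adds to `.gitignore`.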
2 changes: 2 additions & 0 deletions .gitignore
@@ -50,6 +50,8 @@ models/*
config.json
Qwen/
.hf_cache/
.cache/
.env

# Jupyter
.ipynb_checkpoints/
26 changes: 22 additions & 4 deletions Dockerfile
@@ -48,6 +48,8 @@ RUN pip install --no-cache-dir \
torchvision==0.24.1+cu128
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV PATH="/home/user/.cargo/bin:${PATH}"
WORKDIR /home/user/app
COPY ./wheel /home/user/app/wheel
RUN pip install --no-cache-dir \
gradio==6.5.0 \
soundfile==0.13.1 \
@@ -71,11 +73,24 @@ RUN pip install --no-cache-dir \
onnxruntime \
onnxruntime-gpu==1.23.2 \
markdown==3.10.1 \
einops
einops \
"gradio_filelister @ file:wheel/gradio_filelister-0.4.0-py3-none-any.whl" \
lhotse \
safetensors \
tensorboard \
vocos \
pydub \
"transformers>=4.57.3,<5" \
cn2an \
inflect \
jieba \
pypinyin \
"setuptools<81" \
https://github.com/csukuangfj/piper-phonemize/releases/download/2025.06.23/piper_phonemize-1.3.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl \
git+https://github.com/ysharma3501/LinaCodec.git \
git+https://github.com/ysharma3501/LuxTTS.git

COPY ./wheel /home/user/app/wheel
COPY ./requirements.txt /home/user/app/requirements.txt
WORKDIR /home/user/app
RUN pip install --no-cache-dir -r /home/user/app/requirements.txt

# LuxTTS voice cloning engine
@@ -100,11 +115,14 @@ ENV PATH="/home/user/app/venv/bin:$PATH" \
LD_LIBRARY_PATH="/usr/local/cuda/lib64:${LD_LIBRARY_PATH}"

COPY --chown=1001:1001 --from=builder /home/user/app/venv /home/user/app/venv


FROM runtime AS app
COPY ./modules /home/user/app/modules
COPY ./wheel /home/user/app/wheel
COPY ./tests /home/user/app/tests
COPY ./docs /home/user/app/docs
COPY ./config.json /home/user/app/config.json
# COPY ./config.json /home/user/app/config.json
COPY ./voice_clone_studio.py /home/user/app/voice_clone_studio.py
WORKDIR /home/user/app
EXPOSE 7860
4 changes: 2 additions & 2 deletions README.md
@@ -1,6 +1,6 @@
# Voice Clone Studio

A modular Gradio-based web UI for voice cloning, voice design, and multi-speaker conversation, powered by [Qwen3-TTS](https://github.com/QwenLM/Qwen3-TTS), [VibeVoice](https://github.com/microsoft/VibeVoice) and [LuxTTS](https://github.com/ysharma3501/LuxTTS)
A modular Gradio-based web UI for voice cloning, voice design, and multi-speaker conversation, powered by [Qwen3-TTS](https://github.com/QwenLM/Qwen3-TTS), [VibeVoice](https://github.com/microsoft/VibeVoice), and [LuxTTS](https://github.com/ysharma3501/LuxTTS).
Supports both Whisper and VibeVoice ASR for automatic transcription.

![Voice Clone Studio](https://img.shields.io/badge/Voice%20Clone%20Studio-v1.0-blue) ![Qwen3-TTS](https://img.shields.io/badge/Qwen3--TTS-Powered-blue) ![LuxTTS](https://img.shields.io/badge/LuxTTS-TTS-orange) ![VibeVoice](https://img.shields.io/badge/VibeVoice-TTS-green) ![VibeVoice](https://img.shields.io/badge/VibeVoice-ASR-green)
@@ -14,7 +14,7 @@ Voice Clone Studio is fully modular. The main file dynamically loads self-contai
### Voice Clone
Clone voices from your own audio samples. Provide a short reference audio clip with its transcript, and generate new speech in that voice.

- **Multiple engines** - Qwen3-TTS (0.6B/1.7B) or VibeVoice (1.5B/Large/Large-4bit)
- **Multiple engines** - Qwen3-TTS (0.6B/1.7B), VibeVoice (1.5B/Large/Large-4bit), or LuxTTS (Default)
- **Voice prompt caching** - First generation processes the sample, subsequent ones are instant
- **Seed control** - Reproducible results with saved seeds
- **Emotion presets** - 40+ emotion presets with adjustable intensity
6 changes: 5 additions & 1 deletion docker-compose.yaml
@@ -2,21 +2,25 @@ services:
voice-clone-studio:
build:
context: .
target: runtime
target: app
ports:
- "7860:7860"
env_file: .env
environment:
- "GRADIO_SERVER_NAME=0.0.0.0"
- "HF_HOME=${HF_CACHE_CONTAINER_PATH:-/home/user/app/.cache/huggingface}"
- "TORCH_HOME=${TORCH_CACHE_CONTAINER_PATH:-/home/user/app/.cache/torch}"
- "NVIDIA_VISIBLE_DEVICES=all"
- "NVIDIA_DRIVER_CAPABILITIES=compute,utility"
volumes:
- "./config.json:/home/user/app/config.json"
- "./temp:/home/user/app/temp"
- "./datasets:/home/user/app/datasets"
- "./models:/home/user/app/models"
- "./samples:/home/user/app/samples"
- "./output:/home/user/app/output"
- "${HF_CACHE_HOST_PATH:-./.hf_cache}:${HF_CACHE_CONTAINER_PATH:-/home/user/app/.cache/huggingface}"
- "${TORCH_CACHE_HOST_PATH:-./.torch_cache}:${TORCH_CACHE_CONTAINER_PATH:-/home/user/app/.cache/torch}"
deploy:
resources:
reservations:
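The new `environment` and `volumes` entries rely on Compose's `${NAME:-default}` interpolation, which falls back to the default when the variable is unset or empty. The sketch below mimics only that one form in Python to show how the Hugging Face volume mapping resolves with and without a `.env` override. `interpolate` is a hypothetical helper, not Compose's implementation, and `/data/hf` is an assumed example path.

```python
import re

# Matches ${NAME:-default}; other Compose forms (${NAME}, ${NAME-default}, ${NAME:?err})
# are deliberately out of scope for this sketch.
_PATTERN = re.compile(r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*):-(?P<default>[^}]*)\}")


def interpolate(text: str, env: dict[str, str]) -> str:
    """Replace each ${NAME:-default} with env[NAME], or the default if NAME is unset or empty."""

    def _sub(match: re.Match) -> str:
        value = env.get(match.group("name"), "")
        return value if value else match.group("default")

    return _PATTERN.sub(_sub, text)


# Volume entry copied from docker-compose.yaml in this PR.
volume = "${HF_CACHE_HOST_PATH:-./.hf_cache}:${HF_CACHE_CONTAINER_PATH:-/home/user/app/.cache/huggingface}"

# No .env values: both sides fall back to their defaults.
print(interpolate(volume, {}))
# Host path overridden (assumed example value); container path keeps its default.
print(interpolate(volume, {"HF_CACHE_HOST_PATH": "/data/hf"}))
```

Because `HF_HOME` and `TORCH_HOME` use the same container-path variables and defaults as the volume targets, the in-container cache directories and the mounts stay aligned whether or not `.env` sets them.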
1 change: 1 addition & 0 deletions docs/updates.md
@@ -11,6 +11,7 @@
## February 7, 2026

#### Version 1.0.0 - Complete Modular Rewrite
- **LuxTTS Voice Clone Engine** - Added LuxTTS as a third voice cloning engine with 48 kHz output, caching, and UI controls
- **Full Modular Architecture** - Complete rewrite from a 6000+ line monolith into independent tool modules under `modules/core_components/tools/`
- **Tool System** - Each tab is now a self-contained tool with its own UI, events, and logic, loaded dynamically from a central registry
- **Enable/Disable Tools** - New "Visible Tools" section in Settings lets you toggle any tab on or off (persisted in config, takes effect on restart)
6 changes: 4 additions & 2 deletions modules/core_components/__init__.py
@@ -8,7 +8,7 @@
INPUT_MODAL_CSS,
INPUT_MODAL_HEAD,
INPUT_MODAL_HTML,
show_input_modal_js
show_input_modal_js,
)

from .emotion_manager import (
@@ -17,7 +17,7 @@
get_emotion_choices,
calculate_emotion_values,
handle_save_emotion,
handle_delete_emotion
handle_delete_emotion,
)

from .constants import (
@@ -35,6 +35,7 @@
DEFAULT_CONFIG,
QWEN_GENERATION_DEFAULTS,
VIBEVOICE_GENERATION_DEFAULTS,
LUXTTS_DEFAULTS,
)

__all__ = [
@@ -70,4 +71,5 @@
"DEFAULT_CONFIG",
"QWEN_GENERATION_DEFAULTS",
"VIBEVOICE_GENERATION_DEFAULTS",
"LUXTTS_DEFAULTS",
]
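Since `LUXTTS_DEFAULTS` has to be added both to the import list and to `__all__`, a small consistency check can catch a name that is exported but never imported. This is a sketch only: `missing_exports` is a hypothetical helper, and the `demo` module is a stand-in, because the real contents of `modules.core_components` are not shown in this diff.

```python
import types


def missing_exports(module: types.ModuleType) -> list[str]:
    """Return names listed in module.__all__ that the module does not actually define."""
    return [name for name in getattr(module, "__all__", []) if not hasattr(module, name)]


# Hypothetical stand-in module; the empty dict is a placeholder, not the real defaults.
demo = types.ModuleType("demo_core_components")
demo.LUXTTS_DEFAULTS = {}
demo.__all__ = ["LUXTTS_DEFAULTS", "NOT_IMPORTED"]

# Lists every exported name that would fail on `from module import *`.
print(missing_exports(demo))
```

Run against the real package, an empty result would mean every entry in `__all__` resolves.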