LuxTTS by joshuasundance-swca · Pull Request #37 · FranckyB/Voice-Clone-Studio

joshuasundance-swca · 2026-02-08T03:06:55Z

LuxTTS

A different take on LuxTTS implementation, for consideration and reference

Summary

Adds LuxTTS as a first-class engine alongside Qwen3 and VibeVoice, including prompt caching, 48 kHz output, and advanced tuning controls.
Extends tool UI, settings, and shared state to support engine toggles, LuxTTS defaults, and audio-only prompt encoding for LuxTTS.
Updates Docker/runtime dependencies and docs to support LuxTTS and caching behavior.

Key Changes

Core LuxTTS Integration

Adds LuxTTS prompt caching, audio-only prompt encoding, and 48 kHz generation in modules/core_components/ai_models/tts_manager.py.
Introduces LuxTTS defaults and engine metadata in modules/core_components/constants.py.
Ensures shared state includes engine toggles and LuxTTS defaults for app and standalone modes in voice_clone_studio.py and modules/core_components/tools/__init__.py.

Tool UI And Workflow Updates

Voice Clone now filters engines by enabled_engines, exposes LuxTTS advanced parameters (including device and CPU threads), persists LuxTTS preferences, and requires transcripts only for Qwen in modules/core_components/tools/voice_clone.py.
Conversation LuxTTS generation uses audio-only prompt encoding and no longer requires transcripts in modules/core_components/tools/conversation.py.
Prep Audio continues to surface cache status and cleanup paths, with LuxTTS cache visibility aligned to the current .pt naming in modules/core_components/tools/__init__.py.
Settings adds engine toggles, LuxTTS runtime (device/threads), and LuxTTS download options in modules/core_components/tools/settings.py.

Docs And Help Text

Help content and branding mention LuxTTS and engine behavior in modules/core_components/help_page.py and README.md.
Version history updated in docs/updates.md.

Docker/Runtime And Dependencies

Docker build and compose updated for LuxTTS dependencies and caching in Dockerfile and docker-compose.yaml.
Additional dependencies and cache/environment guidance in requirements.txt, .env-example, and .gitignore.

Notable Behavior Details

LuxTTS prompt cache is stored as <sample>_luxtts.pt with parameters embedded in the cache metadata; cache validity is checked against audio hash, rms, and ref_duration.
LuxTTS prompt encoding uses audio-only encode_prompt (no transcript required), while Qwen still requires transcripts for prompt caching.
LuxTTS audio returns at 48 kHz, while other engines return 24 kHz.
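
The cache validity check described above could be sketched roughly as follows. This is a minimal illustration, not the code in tts_manager.py: the metadata key names, the SHA-256 hash, and the helper names `audio_hash` and `is_cache_valid` are all assumptions made for the example.

```python
import hashlib
import math


def audio_hash(audio_bytes: bytes) -> str:
    """Hash the raw audio bytes so any edit to the sample changes the digest.

    SHA-256 is an assumption; the PR does not say which hash it uses.
    """
    return hashlib.sha256(audio_bytes).hexdigest()


def is_cache_valid(metadata: dict, audio_bytes: bytes, rms: float, ref_duration: float) -> bool:
    """Return True only if the cached prompt was built from the same audio
    and the same rms / ref_duration parameters (hypothetical metadata keys)."""
    return (
        metadata.get("audio_hash") == audio_hash(audio_bytes)
        # A missing key falls back to NaN, which never compares as close.
        and math.isclose(metadata.get("rms", float("nan")), rms)
        and math.isclose(metadata.get("ref_duration", float("nan")), ref_duration)
    )


# Illustrative values only; these are not the project's defaults.
sample = b"\x00\x01\x02\x03"
cached = {"audio_hash": audio_hash(sample), "rms": 0.01, "ref_duration": 5.0}
print(is_cache_valid(cached, sample, 0.01, 5.0))  # same audio, same parameters
print(is_cache_valid(cached, sample, 0.02, 5.0))  # rms changed, cache is stale
```

Comparing floats with `math.isclose` rather than `==` avoids spurious invalidation if a parameter round-trips through serialization.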

Behavior Change Note (Upstream Alignment)

LuxTTS no longer requires transcripts for prompt encoding (audio-only), which differs from prior transcript-enforced behavior and aligns with upstream LuxTTS usage.
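
For reference, the per-engine transcript gate might look something like the sketch below. The engine identifiers and the `validate_prompt_inputs` helper are hypothetical and only illustrate the rule "audio is always required, a transcript only for Qwen".

```python
from typing import Optional

# Hypothetical engine identifiers; the project's actual names may differ.
ENGINES_REQUIRING_TRANSCRIPT = {"qwen"}


def validate_prompt_inputs(engine: str, audio_path: str, transcript: Optional[str]) -> None:
    """Raise ValueError if the inputs are insufficient for the chosen engine."""
    if not audio_path:
        raise ValueError("A reference audio sample is required.")
    needs_transcript = engine in ENGINES_REQUIRING_TRANSCRIPT
    if needs_transcript and not (transcript and transcript.strip()):
        raise ValueError(f"{engine} requires a transcript for prompt caching.")


# LuxTTS-style audio-only prompt: no transcript, no error.
validate_prompt_inputs("luxtts", "samples/voice.wav", None)
```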

Issues/Concerns To Evaluate

Mixed sample rates (LuxTTS 48 kHz vs 24 kHz elsewhere) could affect downstream workflows that assume SAMPLE_RATE consistency.
LuxTTS install path differs between local and Docker paths; if LuxTTS is not installed locally, generation fails at runtime.
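
One way to make the sample-rate concern explicit is to carry the rate with each clip and refuse to merge clips whose rates differ. The sketch below is an assumption about how such a guard could look; `Clip` and `concat_clips` are hypothetical names and are not part of the PR.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Clip:
    """Audio samples paired with the rate they were generated at."""
    samples: List[float]
    sample_rate: int


def concat_clips(clips: List[Clip]) -> Clip:
    """Concatenate clips, failing loudly instead of silently mixing rates."""
    if not clips:
        raise ValueError("No clips to concatenate.")
    rates = {clip.sample_rate for clip in clips}
    if len(rates) != 1:
        raise ValueError(f"Mixed sample rates: {sorted(rates)}; resample first.")
    merged: List[float] = []
    for clip in clips:
        merged.extend(clip.samples)
    return Clip(samples=merged, sample_rate=clips[0].sample_rate)


lux = Clip([0.0, 0.1], 48_000)    # LuxTTS output rate per this PR
other = Clip([0.2, 0.3], 24_000)  # rate used by the other engines
try:
    concat_clips([lux, other])
except ValueError as err:
    print(err)
```

A real fix would resample to a common rate before merging; the guard only ensures the mismatch cannot go unnoticed.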

FranckyB · 2026-02-08T03:19:04Z

I'm perhaps missing it, but does this add anything?
It seems to be mostly my code, with formatting tweaks?

Copilot

Pull request overview

Adds LuxTTS as an additional TTS/voice-clone engine integrated into the existing modular Gradio tool architecture (alongside Qwen3 and VibeVoice), including prompt caching, new UI controls, settings toggles, and container/runtime dependency updates.

Changes:

Integrates LuxTTS into the TTS manager with audio-only prompt encoding and on-disk/in-memory prompt caching.
Updates Voice Clone / Conversation / Prep Audio / Settings tooling to expose LuxTTS engine options, parameters, cache visibility, and persisted preferences.
Updates docs and Docker/runtime configuration to support LuxTTS dependencies and cache directories.

Reviewed changes

Copilot reviewed 15 out of 18 changed files in this pull request and generated 29 comments.

Show a summary per file

| File | Description |
| --- | --- |
| voice_clone_studio.py | Passes LuxTTS defaults into shared state and minor formatting/cleanup. |
| requirements.txt | Adds LuxTTS-related Python dependencies (plus some commented guidance). |
| modules/core_components/ui_components/__init__.py | Adds LuxTTS advanced-parameter UI components (device/threads, etc.). |
| modules/core_components/tools/voice_clone.py | Adds LuxTTS engine selection, LuxTTS advanced controls, and transcript requirement only for Qwen. |
| modules/core_components/tools/conversation.py | Adds LuxTTS as a conversation engine path with LuxTTS param UI and sequential generation. |
| modules/core_components/tools/prep_audio.py | Extends cache cleanup logic to include LuxTTS cache files and in-memory cache clearing. |
| modules/core_components/tools/settings.py | Adds LuxTTS runtime settings and engine toggles; updates model download options. |
| modules/core_components/tools/__init__.py | Extends shared-state/default config, adds LuxTTS cache helpers, and engine/constants wiring. |
| modules/core_components/help_page.py | Updates help text to mention LuxTTS and caching behavior. |
| modules/core_components/constants.py | Adds LuxTTS engine metadata and defaults/constants. |
| modules/core_components/ai_models/tts_manager.py | Implements LuxTTS loading, prompt caching, and LuxTTS voice-clone generation method. |
| modules/core_components/__init__.py | Re-exports LuxTTS defaults. |
| docs/updates.md | Notes LuxTTS addition in version history. |
| docker-compose.yaml | Adds `.env` support, TORCH_HOME, cache mounts, and changes build target to `app`. |
| README.md | Updates branding/feature list to include LuxTTS. |
| Dockerfile | Adds LuxTTS dependencies and introduces `app` stage used by compose. |
| .gitignore | Ignores `.cache/` and `.env`. |
| .env-example | Adds example env vars for HF/Torch cache paths. |

Comments suppressed due to low confidence (1)

modules/core_components/constants.py:258

LUXTTS_GENERATION_DEFAULTS duplicates LUXTTS_DEFAULTS (defined just above) and also uses different key names (e.g., cpu_threads vs threads). Consolidate to a single LuxTTS defaults dict to avoid configuration drift and confusion over which one is authoritative.

# LuxTTS Generation Defaults
LUXTTS_GENERATION_DEFAULTS = {
    "num_steps": 4,
    "t_shift": 0.9,
    "speed": 1.0,

Copilot · 2026-02-08T03:20:47Z

+        lux_keys = [
+            k
+            for k in self._luxtts_prompt_cache.keys()
+            if k.startswith(f"{sample_name}_")


clear_prompt_cache_for_sample() won’t clear LuxTTS entries because _luxtts_prompt_cache is keyed by sample_name, but the code filters keys with startswith(f"{sample_name}_"). Update the predicate to match the actual key format (e.g., exact match) so LuxTTS in-memory prompts are cleared when samples are edited/deleted.

Suggested change

-            if k.startswith(f"{sample_name}_")
+            if k == sample_name
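
To illustrate the mismatch, here is a small self-contained sketch, assuming (as the comment states) that the in-memory cache is keyed by the bare sample name. The cache contents and helper names are made up for the example.

```python
def keys_matching_prefix(cache: dict, sample_name: str) -> list:
    """Predicate from the diff: matches keys that start with '<sample_name>_'."""
    return [k for k in cache if k.startswith(f"{sample_name}_")]


def keys_matching_exact(cache: dict, sample_name: str) -> list:
    """Suggested predicate: matches only the key equal to sample_name."""
    return [k for k in cache if k == sample_name]


# Hypothetical cache keyed by sample name; values stand in for encoded prompts.
cache = {"alice": "prompt-a", "alice_2": "prompt-b"}

print(keys_matching_prefix(cache, "alice"))  # ['alice_2']
print(keys_matching_exact(cache, "alice"))   # ['alice']
```

With name-only keys the prefix test misses the entry for "alice" itself and would also pick up an unrelated sample whose name happens to begin with "alice_".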

Copilot · 2026-02-08T03:20:47Z

+                                        "--- LuxTTS ---",
+                                        "LuxTTS",
                                        "--- VibeVoice ASR ---",
                                        "VibeVoice-ASR",
                                        "--- LuxTTS ---",


The model download dropdown includes LuxTTS twice (a LuxTTS section at lines 136–137 and then another LuxTTS section at line 140+). This creates duplicate UI entries; keep a single LuxTTS header + option.

Copilot · 2026-02-08T03:20:48Z

                                    "VibeVoice-Large": "FranckyB/VibeVoice-Large",
+                                    "LuxTTS": "YatharthS/LuxTTS",
                                    "VibeVoice-ASR": "microsoft/VibeVoice-ASR",
                                    "LuxTTS": "YatharthS/LuxTTS",
                                }


MODEL_ID_MAP repeats the "LuxTTS" key. In Python dicts the latter entry wins, so the earlier one is redundant and makes the mapping harder to maintain. Remove the duplicate key entry.
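
Because Python accepts repeated keys in a dict literal without any warning, merge artifacts like this are easy to miss. A small check along the following lines could flag them; `duplicate_dict_keys` is a hypothetical helper built on the standard `ast` module, not something in the repository.

```python
import ast


def duplicate_dict_keys(source: str) -> list:
    """Return constant keys that appear more than once in any dict literal."""
    dupes = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Dict):
            seen = set()
            for key in node.keys:
                # Keys are None for ** unpacking; non-constant keys are skipped.
                if isinstance(key, ast.Constant):
                    if key.value in seen:
                        dupes.append(key.value)
                    seen.add(key.value)
    return dupes


SNIPPET = '''
MODEL_ID_MAP = {
    "LuxTTS": "YatharthS/LuxTTS",
    "VibeVoice-ASR": "microsoft/VibeVoice-ASR",
    "LuxTTS": "YatharthS/LuxTTS",
}
'''

print(duplicate_dict_keys(SNIPPET))  # ['LuxTTS']
```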

Copilot · 2026-02-08T03:20:48Z

                tool_class = None
                for attr_name in dir(tool_module):
                    attr = getattr(tool_module, attr_name)
-                    if isinstance(attr, type) and issubclass(attr, Tool) and attr is not Tool:
+                    if (
+                        isinstance(attr, type)
+                        and issubclass(attr, Tool)  # F821 Undefined name `Tool`
+                        and attr is not Tool  # F821 Undefined name `Tool`
+                    ):


This fallback branch references Tool (and even includes an inline "F821 Undefined name" note), but Tool isn’t imported. Even if this path is currently unused, it’s dead/unsafe code; either import Tool or delete the fallback implementation.
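
If the fallback is kept rather than deleted, it needs the base class in scope. The sketch below shows the same discovery loop with a stand-in `Tool` class defined locally; in the project the real base class would be imported instead, and `find_tool_class` is a hypothetical name.

```python
import inspect
import types


class Tool:
    """Stand-in for the project's Tool base class (an assumption for this sketch)."""


def find_tool_class(tool_module):
    """Return the first Tool subclass defined on the module, or None."""
    for _, attr in inspect.getmembers(tool_module, inspect.isclass):
        if issubclass(attr, Tool) and attr is not Tool:
            return attr
    return None


# Build a throwaway module to show the lookup.
demo_module = types.ModuleType("demo_tool")


class DemoTool(Tool):
    pass


demo_module.Tool = Tool
demo_module.DemoTool = DemoTool
print(find_tool_class(demo_module).__name__)  # DemoTool
```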

Copilot · 2026-02-08T03:20:48Z

+HF_CACHE_HOST_PATH=C:\users\joshua.bailey\.cache\huggingface
+HF_CACHE_CONTAINER_PATH=/home/user/app/.cache/huggingface
+HF_TOKEN=hf_...
+
+TORCH_CACHE_HOST_PATH=C:\users\joshua.bailey\.cache\torch
+TORCH_CACHE_CONTAINER_PATH=/home/user/app/.cache/torch


.env-example hardcodes a specific Windows username/path ("C:\users\joshua.bailey\...") which is not portable. Use a generic placeholder path (e.g., "C:\Users\<your-user>\.cache\huggingface") so the example applies to all users.

Suggested change

-HF_CACHE_HOST_PATH=C:\users\joshua.bailey\.cache\huggingface
-HF_CACHE_CONTAINER_PATH=/home/user/app/.cache/huggingface
-HF_TOKEN=hf_...
-
-TORCH_CACHE_HOST_PATH=C:\users\joshua.bailey\.cache\torch
-TORCH_CACHE_CONTAINER_PATH=/home/user/app/.cache/torch
+HF_CACHE_HOST_PATH=C:\Users\<your-user>\.cache\huggingface
+HF_CACHE_CONTAINER_PATH=/home/<your-user>/app/.cache/huggingface
+HF_TOKEN=hf_...
+
+TORCH_CACHE_HOST_PATH=C:\Users\<your-user>\.cache\torch
+TORCH_CACHE_CONTAINER_PATH=/home/<your-user>/app/.cache/torch

Copilot · 2026-02-08T03:20:52Z

+        refresh_samples = shared_state["refresh_samples"]
+        confirm_trigger = shared_state["confirm_trigger"]
+        input_trigger = shared_state["input_trigger"]



Variable input_trigger is not used.

Suggested change

components["confirm_trigger"] = confirm_trigger

components["input_trigger"] = input_trigger

Copilot · 2026-02-08T03:20:52Z

+        get_sample_choices = shared_state["get_sample_choices"]
+        get_available_samples = shared_state["get_available_samples"]
+        load_sample_details = shared_state["load_sample_details"]
+        get_prompt_cache_path = shared_state["get_prompt_cache_path"]


Variable get_prompt_cache_path is not used.

Suggested change

get_prompt_cache_path = shared_state["get_prompt_cache_path"]

Copilot · 2026-02-08T03:20:53Z

+        load_sample_details = shared_state["load_sample_details"]
+        get_prompt_cache_path = shared_state["get_prompt_cache_path"]
+        get_or_create_voice_prompt = shared_state["get_or_create_voice_prompt"]
+        refresh_samples = shared_state["refresh_samples"]


Variable refresh_samples is not used.

Suggested change

refresh_samples = shared_state["refresh_samples"]

Copilot · 2026-02-08T03:20:53Z

+    "get_configured_dir",
+    "load_config",
+    "save_config",
+    "save_preference",


The name 'save_preference' is exported by __all__ but is not defined.
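
A mismatch like this only fails when someone runs `from package import *`, so it can survive for a while. A quick check along these lines could catch it in a test; `undefined_exports` is a hypothetical helper, and the module below is a throwaway built for the demonstration.

```python
import types


def undefined_exports(module) -> list:
    """Return names listed in the module's __all__ that the module does not define."""
    return [name for name in getattr(module, "__all__", []) if not hasattr(module, name)]


# Throwaway module mimicking the situation in the diff.
demo = types.ModuleType("demo_tools")
demo.load_config = lambda: {}
demo.save_config = lambda cfg: None
demo.__all__ = ["load_config", "save_config", "save_preference"]

print(undefined_exports(demo))  # ['save_preference']
```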

Copilot · 2026-02-08T03:20:53Z

                    processed_lines.append(line)
                else:
                    # Add default [1]: label
                    processed_lines.append(f"[1]: {line}")
-            return '\n'.join(processed_lines)
+            return "\n".join(processed_lines)

        def extract_style_instructions(text):
            """Extract style instructions from parentheses."""
            import re


This import of module re is redundant, as it was previously imported on line 20.

joshuasundance-swca · 2026-02-08T03:40:48Z

> I'm perhaps missing it, but does this add anything?
> It seems to be mostly my code, with formatting tweaks?

Sorry about the formatting changes, I think vscode was running black on modified files but I didn't do it on purpose.

The work in this PR is an agent trying to reconcile my original luxtts branch with your dev branch, and later, doing merge conflict resolution because you had implemented luxtts while the agent was working 😅

I was trying to catch my diverged luxtts branch back up to dev.

Since I wasn't sure these changes would still be useful, this [draft] pr was more of a reference. I think it adds [and hopefully improves] a few things but it looks like copilot caught some weird merge artifacts. I tested the changes only in docker, but it worked well.

FranckyB · 2026-02-08T03:56:23Z

I'll close this PR for now.
If there are things we should address we should limit it to that.
This changes too many things for no reason

joshuasundance-swca · 2026-02-08T03:58:17Z

> I'll close this PR for now.
> If there are things we should address we should limit it to that.
> This changes too many things for no reason

Agreed. 😀 Sorry for the confusion; the intent was more to share the code, not so much to merge it at this time. I will extract specific differences and bring them to your attention as appropriate.

joshuasundance-swca · 2026-02-08T04:08:55Z

Here is the short version of what is different about this PR versus the LuxTTS already in upstream-dev.

Upstream-dev already ships LuxTTS (manager + tools + 48 kHz + transcript-based prompt caching). This PR does not re-add that. The actual deltas are:

LuxTTS prompt encoding is now audio-only (no transcript gate). That means LuxTTS can use encode_prompt() directly, while transcripts are only required for Qwen. See modules/core_components/ai_models/tts_manager.py and modules/core_components/tools/voice_clone.py.
LuxTTS runtime settings are exposed and persisted (device selection + CPU threads). See modules/core_components/tools/settings.py and modules/core_components/ai_models/tts_manager.py.
LuxTTS prompt cache management is tightened (in-memory cache clearing, plus rms/ref_duration validation on load). See modules/core_components/ai_models/tts_manager.py.
Docker/container wiring includes LuxTTS dependencies + cache path guidance so LuxTTS works in container workflows without extra steps and with shorter build times. See Dockerfile and .env-example.

So: the PR is mainly about LuxTTS prompt handling (audio-only), runtime controls, cache safety, and container readiness.

The response above was written by the same agent that did the merge conflict resolution. Putting it here just for consideration and reference, but again, this draft PR was not meant to be merged as-is.

joshuasundance-swca added 9 commits February 7, 2026 15:26

LuxTTS Voice Clone + Docker Runtime Support (upstream/dev baseline)

6a06197

F821 Undefined name Tool

b182435

install deps in docker

e58b44d

.env for caches

5470b1c

HF_HOME

e4ede69

merge conflict resolution

728802e

merge conflict resolution

5e870bf

transcription-free encoding for luxtts

cdc1aa5

device selection & help

9f18f14

joshuasundance-swca mentioned this pull request Feb 8, 2026

Support for LuxTTS? #18

Closed

FranckyB requested a review from Copilot February 8, 2026 03:12

Copilot started reviewing on behalf of FranckyB February 8, 2026 03:13

Copilot AI reviewed Feb 8, 2026


FranckyB closed this Feb 8, 2026


