Windows ships with a handful of voices that sound robotic and outdated. Meanwhile, AI voices that sound almost human already exist as open source projects, but none of them work with your everyday apps.
VoiceLink changes that. It makes AI voices show up as regular Windows voices. Pick one from the dropdown in Thorium Reader, Microsoft Edge, Narrator, or any app that reads text aloud, and hear the difference instantly.
No technical knowledge needed. Install it, choose a voice, and enjoy listening.
All samples below are generated by the Kokoro model running entirely on a local machine. Click any sample to listen.
"The old bookshop on the corner had a peculiar charm about it. Dust motes danced in the sunlight that streamed through tall windows, and the smell of aged paper filled every room. It was the kind of place where you could lose an entire afternoon without even noticing."
| Voice | Style | Listen |
|---|---|---|
| Heart | Warm and expressive, great for audiobooks | ▶ Play |
| Bella | Clear and professional | ▶ Play |
| Nicole | Smooth and calm | ▶ Play |
| Adam | Natural and conversational | ▶ Play |
| Michael | Deep and authoritative | ▶ Play |
| Voice | Style | Listen |
|---|---|---|
| Emma | Classic British accent | ▶ Play |
| George | Refined British accent | ▶ Play |
VoiceLink ships with 11 voices in total (7 American, 4 British). See all voices below.
Dashboard: monitor the voice server and see system status at a glance
Voice Manager: rename voices, toggle them on or off, preview with one click
Setup Wizard: downloads everything automatically on first run
VoiceLink voices appear in any Windows app that supports text to speech. Here are a couple of examples:
Left: Thorium Reader · Right: Microsoft Edge Read Aloud
flowchart LR
A["Your App<br/>Thorium, Edge, Narrator..."]
B["VoiceLink<br/>The Bridge"]
C["Kokoro AI<br/>Running Locally"]
A -- "sends text" --> B
B -- "sends text" --> C
C -- "returns audio" --> B
B -- "streams audio" --> A
VoiceLink registers itself as a standard Windows voice. When any app asks it to speak, it quietly passes the text to an AI model running on your machine and streams back natural sounding audio. Your apps never know the difference. They just see another voice in their dropdown list.
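The flow described above can be sketched in a few lines of Python. This is only an illustration of the bridge idea, not VoiceLink's actual code: `bridge`, `Synthesizer`, and `fake_model` are hypothetical names, and the "model" here just returns silence. For simplicity the sketch synthesizes the whole clip first and then hands it back in chunks, which may differ from how the real server streams audio.

```python
from typing import Callable, Iterator

# Hypothetical stand-in for the local AI model: takes text, returns raw audio bytes.
Synthesizer = Callable[[str], bytes]


def bridge(text: str, synthesize: Synthesizer, chunk_size: int = 4096) -> Iterator[bytes]:
    """Pass the text to the model, then hand the audio back in fixed-size chunks."""
    audio = synthesize(text)
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


def fake_model(text: str) -> bytes:
    # Placeholder only: two bytes of silence per character instead of real speech.
    return b"\x00\x00" * len(text)


if __name__ == "__main__":
    chunks = list(bridge("Hello from the bridge", fake_model, chunk_size=16))
    print(len(chunks), "chunks,", sum(len(c) for c in chunks), "bytes")
```

The calling app only ever sees the chunks coming out of `bridge`, which is why it cannot tell an AI voice from a built in one.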
Everything runs on your computer. No internet needed after the initial setup. No cloud services. Your text never leaves your machine.
Want to understand the full architecture? Read the Deep Dive Manual, a comprehensive technical reference covering every layer of the system.
- Windows 10 or 11 (64 bit)
- About 1.5 GB of free disk space (plus about 3 GB for Qwen3-TTS if enabled)
- An internet connection for the first setup (to download the AI model)
- A reasonably modern computer (a dedicated GPU helps but is not required)
- For Qwen3-TTS voices and voice cloning: an NVIDIA GPU with at least 4 GB of VRAM
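If you want to check the disk space requirement before installing, a minimal sketch like the one below works. It is not part of VoiceLink. The function names are made up for this example, and it assumes 1 GB means 1024³ bytes, which may not match how the installer counts.

```python
import shutil

GB = 1024 ** 3                 # assumption: binary gigabytes
BASE_BYTES = int(1.5 * GB)     # "about 1.5 GB" for the base install
QWEN3_EXTRA_BYTES = 3 * GB     # roughly 3 GB more if Qwen3-TTS is enabled


def required_bytes(qwen3_enabled: bool = False) -> int:
    """Approximate free space needed, in bytes."""
    return BASE_BYTES + (QWEN3_EXTRA_BYTES if qwen3_enabled else 0)


def has_enough_space(free_bytes: int, qwen3_enabled: bool = False) -> bool:
    """True if the given amount of free space covers the estimate."""
    return free_bytes >= required_bytes(qwen3_enabled)


if __name__ == "__main__":
    # Checks the drive holding the current directory; pass another path if needed.
    free = shutil.disk_usage(".").free
    print("Kokoro only:", has_enough_space(free))
    print("With Qwen3-TTS:", has_enough_space(free, qwen3_enabled=True))
```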
- Download the installer from the latest release
- Run it as Administrator (right click → Run as administrator)
- Follow the setup wizard (it handles everything automatically):
  - Downloads a self contained Python environment
  - Installs all dependencies
  - Downloads the AI voice model and all voice packs
  - Starts the local voice server
  - Registers 11 AI voices in Windows
- Open any app that reads text aloud (Thorium Reader, Edge, Narrator, Balabolka, etc.)
- Pick a VoiceLink voice from the voice list
- Enjoy!
No terminal commands. No configuration files. No Python installation. Just install and go.
| Location | What is there |
|---|---|
| C:\ProgramData\VoiceLink\python\ | Self contained Python environment with all packages |
| C:\ProgramData\VoiceLink\models\ | AI voice model and voice pack data |
| C:\ProgramData\VoiceLink\server\ | Local voice server |
| C:\Program Files\VoiceLink\ | The app itself |
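To confirm that an install put everything where the table says, a small check like this can help. It is a sketch rather than an official tool: `INSTALL_LOCATIONS` simply copies the four paths above, and `check_locations` is a hypothetical helper that reports which folders exist.

```python
from pathlib import Path
from typing import Callable, Dict

# The four locations listed in the table above.
INSTALL_LOCATIONS = {
    r"C:\ProgramData\VoiceLink\python": "Self contained Python environment",
    r"C:\ProgramData\VoiceLink\models": "AI voice model and voice pack data",
    r"C:\ProgramData\VoiceLink\server": "Local voice server",
    r"C:\Program Files\VoiceLink": "The app itself",
}


def check_locations(
    exists: Callable[[str], bool] = lambda p: Path(p).is_dir(),
) -> Dict[str, bool]:
    """Map each expected folder to whether it is present on this machine."""
    return {path: exists(path) for path in INSTALL_LOCATIONS}


if __name__ == "__main__":
    for path, found in check_locations().items():
        status = "found" if found else "missing"
        print(f"{status:8} {path}  ({INSTALL_LOCATIONS[path]})")
```

The `exists` parameter is there only so the check can be tried without touching the real file system.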
VoiceLink ships with 11 voices powered by the Kokoro model, plus 6 additional voices and voice cloning through Qwen3-TTS (requires NVIDIA GPU):
| Voice | Accent | Gender | Description |
|---|---|---|---|
| Heart | 🇺🇸 American | Female | Warm, expressive (the default voice) |
| Bella | 🇺🇸 American | Female | Clear and professional |
| Nicole | 🇺🇸 American | Female | Smooth and calm |
| Sarah | 🇺🇸 American | Female | Friendly, conversational |
| Sky | 🇺🇸 American | Female | Light and youthful |
| Adam | 🇺🇸 American | Male | Natural, conversational |
| Michael | 🇺🇸 American | Male | Deep and authoritative |
| Emma | 🇬🇧 British | Female | Classic British English |
| Isabella | 🇬🇧 British | Female | Elegant and refined |
| George | 🇬🇧 British | Male | Traditional British English |
| Lewis | 🇬🇧 British | Male | Warm British voice |
| Voice | Gender | Description |
|---|---|---|
| Serena | Female | Warm and gentle |
| Vivian | Female | Bright and expressive |
| Aiden | Male | Clear American midrange |
| Ryan | Male | Dynamic with strong rhythm |
| Dylan | Male | Youthful and natural |
| Eric | Male | Lively with bright timbre |
Plus voice cloning: clone any voice from a 3-second audio clip. Qwen3-TTS is accelerated with CUDA graphs via faster-qwen3-tts for ~1x realtime generation.
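As a rough guide to what "~1x realtime" means, the sketch below computes a speed factor as audio length divided by generation time. That reading of the figure is an assumption, and `realtime_factor` is a name chosen for this example, not something VoiceLink exposes.

```python
def realtime_factor(audio_seconds: float, generation_seconds: float) -> float:
    """Seconds of audio produced per second of compute (assumed definition)."""
    if generation_seconds <= 0:
        raise ValueError("generation_seconds must be positive")
    return audio_seconds / generation_seconds


if __name__ == "__main__":
    # A 10 second clip that takes 10 seconds to generate runs at 1x realtime.
    print(realtime_factor(10.0, 10.0))
```

At about 1x, a clip takes roughly as long to generate as it does to play, so playback can keep up as long as generation does not fall behind.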
More voices and additional AI models are planned for future releases.
VoiceLink is functional and usable today. Here is where things stand:
| Component | Status | Details |
|---|---|---|
| AI Voice Server | ✅ | Local server powered by Kokoro (11 voices) and Qwen3-TTS (6 voices + voice cloning) |
| Qwen3-TTS Engine | ✅ | Optional GPU-accelerated engine with CUDA graphs (~1x realtime on RTX 4060) |
| Windows Voice Driver | ✅ | Registered as a standard Windows voice, works in any compatible app |
| Desktop App | ✅ | Dashboard, voice manager, voice studio (clone voices), system tray, server controls |
| Installer | ✅ | Setup wizard that downloads and configures everything automatically |
| CI/CD Pipeline | ✅ | Automated builds and releases through GitHub Actions |
Check TASKS.md for the detailed roadmap.
I was reading ebooks in Thorium Reader and the built in Windows voices were genuinely painful to listen to. AI voices that sound incredible exist as open source projects, but there was no simple way to plug them into everyday Windows apps.
So I built the bridge myself. This project is as much about understanding the technology deeply as it is about shipping something useful. If you want to see how every piece fits together, the Deep Dive Manual covers it all.
This is an open project. Whether you write code, design interfaces, or just want better voices on Windows, you are welcome here. Open an issue, start a discussion, or star the repo if you think this should exist.
MIT. See LICENSE for details.
Built by Manveer Anand

