This script uses Kokoro-FastAPI to read clipboard content aloud with a customizable voice or a combination of voices. It listens for a hotkey press (Ctrl+Shift+Space) to trigger the text-to-speech process.
- Reads the current clipboard content aloud.
- Pause/Resume: Press the hotkey again while playing to pause, press again to resume.
- Stop Playback: Press `Escape` to completely stop current playback.
- Supports combined voices (e.g., `af_sky+af_bella`).
- Configurable hotkey (`Ctrl+Shift+Space` by default).
- Supports multiple audio formats (`mp3` by default).
- Audio device selection on startup.
- Clipboard Access: The script uses `pyperclip` to read text from the clipboard.
- Text-to-Speech: The clipboard content is sent to Kokoro-FastAPI for speech generation.
- Audio Playback: The generated audio is played immediately using `sounddevice` with pause/resume support.
- Playback Control: Audio is played in chunks, allowing for responsive pause/resume/stop functionality.
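The chunked playback idea above can be sketched as follows. This is a minimal illustration, not the script's actual code: the `ChunkedPlayer` class, its method names, and the `sink` callable are all hypothetical, and the sink stands in for whatever writes samples to the audio device.

```python
import threading


class ChunkedPlayer:
    """Hypothetical sketch: feed audio to a sink in small chunks so that
    pause/stop flags are checked between chunks."""

    def __init__(self, samples, chunk_size, sink):
        self.samples = samples
        self.chunk_size = chunk_size
        self.sink = sink                      # callable that receives one chunk
        self.position = 0                     # index of the next sample to play
        self.resume_event = threading.Event()
        self.resume_event.set()               # set = playing, cleared = paused
        self.stop_event = threading.Event()

    def pause(self):
        self.resume_event.clear()

    def resume(self):
        self.resume_event.set()

    def stop(self):
        self.stop_event.set()
        self.resume_event.set()               # wake a paused loop so it can exit

    def play(self):
        while self.position < len(self.samples) and not self.stop_event.is_set():
            self.resume_event.wait()          # blocks here while paused
            if self.stop_event.is_set():
                break
            chunk = self.samples[self.position:self.position + self.chunk_size]
            self.sink(chunk)
            self.position += len(chunk)
```

Because the loop only blocks between chunks, a smaller `chunk_size` gives faster reaction to pause and stop at the cost of more sink calls. With `sounddevice`, the sink could plausibly be the `write` method of an open `OutputStream`, with `play()` running on a background thread while hotkey callbacks call `pause()`, `resume()`, and `stop()`.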
Install the required Python libraries:

    pip install pyperclip requests keyboard pydub sounddevice numpy

Your mileage may vary depending on your OS and package versions. If modules are still missing after installation, install them with pip as well. This is a very basic script that requires little beyond Kokoro-FastAPI (https://github.com/remsky/Kokoro-FastAPI).
Ensure that:
- The Kokoro-FastAPI service is running locally or is accessible at the configured `API_URL`.
- Your desired voice packs are installed and available in Kokoro-FastAPI.
- Run the script:
python clip_read.py
- Choose your audio output device (default or select from list).
- Copy any text to your clipboard.
- Press `Ctrl+Shift+Space` to hear the clipboard content read aloud.
- Control playback:
  - Press `Ctrl+Shift+Space` again to pause playback.
  - Press `Ctrl+Shift+Space` again to resume from where you paused.
  - Press `Escape` to stop playback completely.
- Press `Shift+Esc` to exit the program.
Update the following parameters in the script to customize behavior:
- `API_URL`: URL of the Kokoro-FastAPI server (default: `http://localhost:8880/v1/audio/speech`).
- `VOICE`: Set to a single voice or combine multiple voices with a `+` (e.g., `af_sky+af_bella`).
- `RESPONSE_FORMAT`: Choose the desired audio format (e.g., `mp3`, `wav`).
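To show how these three settings might fit together, here is a rough sketch of building the speech request. It assumes the endpoint accepts an OpenAI-style JSON body with `model`, `input`, `voice`, and `response_format` fields, and that the model name is `kokoro`; both are assumptions to check against your Kokoro-FastAPI version. The helper names `build_speech_request` and `fetch_speech` are hypothetical, and the sketch uses the standard library's `urllib` rather than `requests` only to stay dependency-free.

```python
import json
import urllib.request

API_URL = "http://localhost:8880/v1/audio/speech"
VOICE = "af_sky+af_bella"        # single voice, or several joined with "+"
RESPONSE_FORMAT = "mp3"


def build_speech_request(text, url=API_URL, voice=VOICE,
                         response_format=RESPONSE_FORMAT):
    """Build (but do not send) a POST request for speech generation."""
    payload = {
        "model": "kokoro",       # assumed model name, not confirmed by this README
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_speech(text, timeout=60):
    """Send the request and return the raw audio bytes (needs a running server)."""
    with urllib.request.urlopen(build_speech_request(text), timeout=timeout) as resp:
        return resp.read()
```

Keeping request construction separate from sending makes it easy to inspect the payload without a server running.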
The default hotkeys are:
- `Ctrl+Shift+Space`: Read clipboard / Pause / Resume playback
- `Escape`: Stop current playback
- `Shift+Esc`: Exit program
You can modify them in the following lines:

    keyboard.add_hotkey("ctrl+shift+space", read_clipboard_aloud)
    keyboard.add_hotkey("esc", stop_playback)
    keyboard.add_hotkey("shift+esc", close_program)

For a combined voice configuration:

    VOICE = "af_sky+af_bella"  # Combines two voices

To run the script, copy some text to the clipboard, press Ctrl+Shift+Space, and enjoy the audio playback with full pause/resume control.
- If the clipboard is empty or contains non-text content, the script will notify you and do nothing.
- Ensure Kokoro-FastAPI is running and accessible before running the script.
- The script automatically saves generated audio files to the `saved_audio/` directory.
- Audio device selection is available on startup for better compatibility.
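The empty-clipboard behavior noted above could be handled with a small guard like the one below. This is only a sketch: `speakable_text` is a hypothetical helper, and it assumes the clipboard read yields an empty string (or a non-string value) when there is no usable text.

```python
def speakable_text(raw):
    """Return cleaned clipboard text, or None if there is nothing to read aloud."""
    if not isinstance(raw, str):
        return None              # non-text clipboard content
    text = raw.strip()
    return text or None          # whitespace-only counts as empty


def handle_clipboard(raw):
    """Report what would happen for a given clipboard value."""
    text = speakable_text(raw)
    if text is None:
        return "Clipboard is empty or not text; nothing to read."
    return f"Reading {len(text)} characters aloud."
```

In the script, `raw` would come from the clipboard read, and the notification would be printed instead of returned.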
- Toggle Playback: `Ctrl+Shift+Space` acts as a smart toggle: it starts playback if nothing is playing, pauses if playing, and resumes if paused.
- Stop Playback: `Escape` completely stops current playback (cannot resume from this point).
- Exit Program: `Shift+Esc` terminates the program and all playback.
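The toggle and stop behavior described above amounts to a three-state machine. The sketch below illustrates those transitions only; the `State` enum and the `on_toggle`/`on_stop` function names are hypothetical and do not come from the script.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    PLAYING = "playing"
    PAUSED = "paused"


def on_toggle(state):
    """Ctrl+Shift+Space: start if idle, pause if playing, resume if paused."""
    if state is State.IDLE:
        return State.PLAYING
    if state is State.PLAYING:
        return State.PAUSED
    return State.PLAYING         # PAUSED resumes from the saved position


def on_stop(state):
    """Escape: always return to idle; a later toggle starts a fresh read."""
    return State.IDLE
```

Modeling the hotkeys as pure transition functions keeps the logic easy to test separately from audio output.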
If the tool is helpful, consider supporting it on Ko-fi.