Hackathon starter kit combining Cloudflare Agents SDK with ElevenLabs APIs. Four demos in one app — voice chat with speech-to-text, soundscape generation, AI character creation, and music composition.
npm install
Create a .env file with your ElevenLabs API key (get one free at elevenlabs.io):
ELEVENLABS_API_KEY=your-key-here
npm start
Open http://localhost:5173.
AI chat agent powered by Workers AI. Every response is automatically spoken aloud via ElevenLabs TTS. Tap the mic to speak — audio streams to ElevenLabs Realtime STT via a WebSocket proxy through the agent, with live partial transcripts as you talk. Pick from available ElevenLabs voices.
ElevenLabs APIs: Text-to-Speech, Realtime Speech-to-Text, Voice Search
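One way a client might fold live partial transcripts into the text it displays is sketched below. The event shape (`partial` / `final`), the `TranscriptState` type, and both function names are hypothetical and chosen for illustration; they are not the ElevenLabs Realtime STT message schema, which may differ.

```typescript
// Hypothetical event shape; the real STT message schema may differ.
type TranscriptEvent =
  | { kind: "partial"; text: string }
  | { kind: "final"; text: string };

interface TranscriptState {
  committed: string[]; // finalized utterances, in order
  partial: string;     // the in-progress utterance, replaced on every update
}

const emptyTranscript: TranscriptState = { committed: [], partial: "" };

// Pure reducer: a partial overwrites the previous partial,
// a final is appended to the committed list and clears the partial.
function applyTranscriptEvent(
  state: TranscriptState,
  event: TranscriptEvent
): TranscriptState {
  if (event.kind === "partial") {
    return { committed: state.committed, partial: event.text };
  }
  return { committed: [...state.committed, event.text], partial: "" };
}

// Joins committed text and the current partial for display.
function renderTranscript(state: TranscriptState): string {
  return [...state.committed, state.partial]
    .filter((s) => s.length > 0)
    .join(" ");
}
```

Keeping the partial separate from the committed text means a revised guess from the recognizer replaces the previous one instead of piling up in the UI.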
Describe a scene and the AI expands it into narration + ambient sound effect prompts. ElevenLabs generates the narration (TTS) and each ambient layer (Sound Effects API) in parallel. Play them together to hear the full scene.
ElevenLabs APIs: Text-to-Speech, Text-to-Sound-Effects
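The parallel generation step described above can be sketched roughly as follows. The `AudioGenerator` signature, the `ScenePlan` and `SceneAudio` shapes, and `generateScene` are assumptions made for this example; in the real app the two generator functions would wrap the ElevenLabs TTS and Sound Effects calls, which are not shown here.

```typescript
// Hypothetical generator signature: takes a text prompt, resolves to audio bytes.
type AudioGenerator = (prompt: string) => Promise<Uint8Array>;

interface ScenePlan {
  narration: string;        // narration text produced by the scene expansion
  ambientPrompts: string[]; // one prompt per ambient sound layer
}

interface SceneAudio {
  narration: Uint8Array;
  layers: Uint8Array[];
}

async function generateScene(
  plan: ScenePlan,
  tts: AudioGenerator,
  sfx: AudioGenerator
): Promise<SceneAudio> {
  // Start every request before awaiting any of them so they run concurrently.
  const narrationPromise = tts(plan.narration);
  const layerPromises = plan.ambientPrompts.map((prompt) => sfx(prompt));

  const [narration, layers] = await Promise.all([
    narrationPromise,
    Promise.all(layerPromises),
  ]);
  return { narration, layers };
}
```

`Promise.all` rejects as soon as any request fails. If a scene should still play when one ambient layer fails, `Promise.allSettled` is a reasonable alternative.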
Design a custom AI character in two steps: describe a personality (Workers AI generates a system prompt) and a voice (ElevenLabs Voice Design generates previews). Pick your favorite voice, name the character, then chat with them — every response spoken in the custom voice.
ElevenLabs APIs: Voice Design, Voice Creation, Text-to-Speech
Compose original music from a text prompt. Choose duration (15s–2min), toggle instrumental mode, and ElevenLabs generates a full track. Build a library of saved tracks.
ElevenLabs APIs: Music Composition
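A small guard for the duration control might look like the sketch below. It assumes the compose request takes its length in milliseconds. The constant names and `toDurationMs` are hypothetical; only the 15 s to 2 min range comes from the description above.

```typescript
// Range taken from the UI description: 15 seconds to 2 minutes.
const MIN_SECONDS = 15;
const MAX_SECONDS = 120;

// Clamps a requested duration into the supported range and converts it to
// milliseconds (assumed unit for the compose request).
function toDurationMs(seconds: number): number {
  if (!Number.isFinite(seconds)) {
    throw new RangeError("duration must be a finite number");
  }
  const clamped = Math.min(MAX_SECONDS, Math.max(MIN_SECONDS, seconds));
  return Math.round(clamped * 1000);
}
```

Clamping on the server as well as in the UI keeps out-of-range values from reaching the API if the agent method is called directly.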
src/
server.ts # Worker entry — exports agents, routes requests
agents/
voice-chat.ts # AIChatAgent + TTS + realtime STT WebSocket proxy
soundscape.ts # Agent with scene expansion + SFX generation
character.ts # AIChatAgent with voice design + character chat
music.ts # Agent with music composition + track library
lib/
elevenlabs.ts # Shared client factory + audio encoding
components/
audio-player.tsx # Reusable play/speak buttons
tabs/
voice-chat.tsx # Chat UI with mic input + auto-speak
soundscape.tsx # Scene builder with layered audio
character.tsx # Two-phase: design then chat
music.tsx # Compose + library UI
app.tsx # Tab shell
client.tsx # React entry
styles.css # Tailwind + Kumo
- AIChatAgent — persistent AI chat with streaming (Voice Chat, Character)
- Agent — stateful Durable Object with RPC (Soundscape, Music)
- @callable() — typed server methods callable from the browser
- setState / useAgent — real-time state sync between agent and UI
- useAgentChat — React hook for chat with streaming, history, and stop/resume
- onMessage — custom WebSocket message handling for audio chunk streaming
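As an illustration of the custom WebSocket handling mentioned for onMessage, the sketch below separates binary audio frames from JSON control frames. It is a standalone helper, not the starter kit's code: `routeMessage`, `ControlMessage`, and the `start` / `stop` message types are hypothetical, and it assumes incoming messages arrive as either a string or an `ArrayBuffer`.

```typescript
// Hypothetical control messages sent as JSON text frames.
type ControlMessage = { type: "start" } | { type: "stop" };

type Routed =
  | { kind: "audio"; bytes: Uint8Array }
  | { kind: "control"; message: ControlMessage }
  | { kind: "ignored" };

function routeMessage(message: string | ArrayBuffer): Routed {
  // Binary frames are treated as raw audio chunks.
  if (typeof message !== "string") {
    return { kind: "audio", bytes: new Uint8Array(message) };
  }
  // Text frames are expected to be JSON control messages.
  try {
    const parsed: unknown = JSON.parse(message);
    if (typeof parsed === "object" && parsed !== null && "type" in parsed) {
      const type = (parsed as { type: unknown }).type;
      if (type === "start") return { kind: "control", message: { type: "start" } };
      if (type === "stop") return { kind: "control", message: { type: "stop" } };
    }
  } catch {
    // Not valid JSON; fall through and ignore the frame.
  }
  return { kind: "ignored" };
}
```

An agent's message handler could call a helper like this first, forward `audio` results to the upstream STT socket, and let anything `ignored` fall through to the default chat handling.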
- Text-to-Dialogue — use textToDialogue.convert to generate multi-speaker podcasts
- Speech-to-Speech — record yourself and transform into a character voice
- Dubbing — transcribe → translate → re-voice in another language
- Collaborative soundscapes — multiple users build a scene via shared agent name
- Character gallery — save and share characters via URL
- AI DJ — compose mood-appropriate background music during conversations
npx wrangler r2 bucket create elevenlabs-audio
npx wrangler secret put ELEVENLABS_API_KEY
npm run deploy