Automated Google Slides → Narrated Videos → Final Course Video
`course_pipeline` is a deterministic, reproducible, dependency-aware video generation pipeline that converts a Google Slides deck into a fully narrated course video.
It supports:
- Human voice recording
- macOS native TTS (`say`)
- ElevenLabs high-quality TTS
- Incremental rebuilds
- Slide-level redo
- Fully ordered, dependency-driven execution using `doit`
This is designed for course creators, educators, and AI-first content pipelines.
Features:

- 🔗 Google Slides → Video (single source of truth)
- 🧠 Dependency-aware pipeline (no reruns unless needed)
- 🔢 Strict numeric ordering (1 → N slides, no randomness)
- 🎙️ Multiple audio modes:
  - Interactive human recording
  - macOS `say`
  - ElevenLabs API (studio-quality voices)
- 🎞️ Per-slide video generation
- 🎬 Final concatenation into `final.mp4`
- ♻️ Incremental & resumable (crash-safe)
- 🧹 Redo individual slides easily
Pipeline overview:

```
Google Slides URL
      │
      ▼
extract_notes → notes/1.txt ... notes/N.txt
screenshot    → screenshots/1.png ... N.png
audio / tts_* → audio/1.wav ... N.wav
video         → videos/1.mp4 ... N.mp4
final         → final.mp4
```
Each step is a `doit` task with explicit dependencies.
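For example, a per-slide stage can be declared as a `doit` sub-task whose outputs are rebuilt only when its inputs change. The sketch below is illustrative; the actual task definitions, slide-count handling, and helper names in this repo may differ.

```python
# dodo.py (illustrative sketch, not the repo's actual task definitions)
SLIDE_COUNT = 3  # in the real pipeline this would come from the deck

def render_slide_video(n: int) -> None:
    """Placeholder for the ffmpeg call that builds videos/<n>.mp4."""
    ...

def task_video():
    """One sub-task per slide; doit re-runs it only when its inputs change."""
    for n in range(1, SLIDE_COUNT + 1):
        yield {
            "name": str(n),
            "file_dep": [f"screenshots/{n}.png", f"audio/{n}.wav"],
            "targets": [f"videos/{n}.mp4"],
            "actions": [(render_slide_video, [n])],
        }
```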
Requirements:

- Python 3.11+
- macOS (for `say`; optional if using ElevenLabs)
- ffmpeg
- Google Cloud credentials (Slides + Drive API)
Install ffmpeg:

```bash
brew install ffmpeg
```

Clone and set up the project:

```bash
git clone https://github.com/<your-org>/course_pipeline.git
cd course_pipeline
python -m venv .venv
source .venv/bin/activate
pip install uv
uv sync
```

In the Google Cloud Console, enable:
- Google Slides API
- Google Drive API
Create OAuth credentials and save them as `credentials.json`.

The first run will open a browser for authentication and cache tokens locally.
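Under the hood this is the standard installed-app OAuth flow; a minimal sketch, assuming `google-auth-oauthlib` and a `token.json` cache (the repo's exact scopes and filenames may differ):

```python
import os

from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow

SCOPES = [
    "https://www.googleapis.com/auth/presentations.readonly",
    "https://www.googleapis.com/auth/drive.readonly",
]

def get_credentials() -> Credentials:
    """Reuse cached credentials if present, otherwise run the browser flow."""
    if os.path.exists("token.json"):
        return Credentials.from_authorized_user_file("token.json", SCOPES)
    flow = InstalledAppFlow.from_client_secrets_file("credentials.json", SCOPES)
    creds = flow.run_local_server(port=0)  # opens the browser on first run
    with open("token.json", "w") as f:
        f.write(creds.to_json())
    return creds
```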
If using ElevenLabs, set your API key:

```bash
export ELEVENLABS_API_KEY="your_api_key_here"
```

Edit settings in `scripts/settings.py`.
Typical options:
- Video resolution (e.g. `1920x1080`)
- FPS
- Output directories
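A plausible shape for `scripts/settings.py`; the actual option names and defaults may differ:

```python
# scripts/settings.py — illustrative values only; option names are assumptions.
RESOLUTION = "1920x1080"   # output video resolution
FPS = 30                   # frame rate for the still-image video
BUILD_DIR = "build"        # where per-deck artifacts are written
NOTES_DIR = "notes"
SCREENSHOTS_DIR = "screenshots"
AUDIO_DIR = "audio"
VIDEOS_DIR = "videos"
```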
Run the full pipeline:

```bash
uv run doit
```

This will:

- Extract notes
- Take slide screenshots
- Generate audio
- Create per-slide videos
- Concatenate into `final.mp4`
Record your own narration interactively:

```bash
uv run doit audio
```

- Displays slide text
- Records mic input
- Options:
  - Enter → save & continue
  - r → re-record
  - s → skip
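A minimal sketch of the recording step, assuming `sounddevice` and `soundfile`; the repo's actual recorder and prompts may differ:

```python
import queue

import sounddevice as sd
import soundfile as sf

SAMPLE_RATE = 44100

def record_until_enter(path: str) -> None:
    """Capture mono mic input until the user presses Enter, then write a WAV."""
    frames = queue.Queue()

    def callback(indata, n_frames, time_info, status):
        frames.put(indata.copy())

    with sd.InputStream(samplerate=SAMPLE_RATE, channels=1, callback=callback):
        input("Recording... press Enter to stop: ")

    with sf.SoundFile(path, "w", samplerate=SAMPLE_RATE, channels=1) as f:
        while not frames.empty():
            f.write(frames.get())
```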
Generate narration with macOS TTS:

```bash
uv run doit tts_slides
```

Uses the native `say` command.
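Roughly what that step does per slide (a sketch; the voice, rate, and conversion options used by the actual task are assumptions):

```python
import subprocess

def synthesize_slide(slide_num: int) -> None:
    """Speak a slide's notes with macOS `say`, then normalize to WAV."""
    notes = f"notes/{slide_num}.txt"
    aiff = f"audio/{slide_num}.aiff"
    wav = f"audio/{slide_num}.wav"
    # `say -f` reads the text from a file; `-o` writes an AIFF audio file.
    subprocess.run(["say", "-f", notes, "-o", aiff], check=True)
    # Convert AIFF → WAV so every audio backend produces the same format.
    subprocess.run(["ffmpeg", "-y", "-i", aiff, wav], check=True)
```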
Generate narration with ElevenLabs:

```bash
uv run doit elevenlabs_tts
```

- Studio-quality voices
- Handles empty slides safely
- Converts MP3 → WAV automatically
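A rough sketch of what the underlying call might look like, using the public ElevenLabs REST endpoint; the voice ID, model, and empty-slide handling below are assumptions, not the task's actual implementation:

```python
import os
import subprocess

import requests

VOICE_ID = "your_voice_id"  # assumption: configured elsewhere (e.g. settings.py)

def elevenlabs_slide(slide_num: int, text: str) -> None:
    """Synthesize one slide's narration and convert it to WAV."""
    if not text.strip():
        return  # the real task handles empty slides; it may emit silence instead
    resp = requests.post(
        f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
        headers={"xi-api-key": os.environ["ELEVENLABS_API_KEY"]},
        json={"text": text, "model_id": "eleven_multilingual_v2"},
        timeout=120,
    )
    resp.raise_for_status()
    mp3 = f"audio/{slide_num}.mp3"
    with open(mp3, "wb") as f:
        f.write(resp.content)  # the endpoint returns MP3 audio bytes
    # MP3 → WAV so downstream video tasks see a uniform format.
    subprocess.run(["ffmpeg", "-y", "-i", mp3, f"audio/{slide_num}.wav"], check=True)
```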
Force regeneration:
```bash
uv run doit elevenlabs_tts --force
```

Generate per-slide videos:
```bash
uv run doit video
```

Each slide:

```
1.png + 1.wav → 1.mp4
```
Optimized for:
- QuickTime
- VLC
- YouTube compatibility
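Conceptually, each slide video pairs a looped still image with its narration; a sketch of the kind of ffmpeg invocation involved (the task's exact flags may differ, but `yuv420p`, AAC, and `+faststart` are what keep the output broadly playable):

```python
import subprocess

def build_slide_video(slide_num: int) -> None:
    """Render one MP4 from a slide screenshot and its narration track."""
    subprocess.run([
        "ffmpeg", "-y",
        "-loop", "1", "-i", f"screenshots/{slide_num}.png",   # still image
        "-i", f"audio/{slide_num}.wav",                        # narration
        "-c:v", "libx264", "-tune", "stillimage", "-pix_fmt", "yuv420p",
        "-c:a", "aac", "-movflags", "+faststart",
        "-shortest", f"videos/{slide_num}.mp4",
    ], check=True)
```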
Concatenate everything into the final video:

```bash
uv run doit final
```

Result:

```
build/<deck_id>/final.mp4
```
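The concatenation step typically uses ffmpeg's concat demuxer with stream copy, so nothing is re-encoded unless requested; a sketch (the list-file name and output paths are assumptions):

```python
import subprocess

def concat_slides(num_slides: int) -> None:
    """Stitch per-slide MP4s into final.mp4 without re-encoding."""
    with open("concat.txt", "w") as f:
        for n in range(1, num_slides + 1):
            f.write(f"file 'videos/{n}.mp4'\n")
    subprocess.run(
        ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
         "-i", "concat.txt", "-c", "copy", "final.mp4"],
        check=True,
    )
```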
Force re-encode (if needed):
```bash
uv run doit final --reencode
```

Redo slide 12 completely:

```bash
uv run doit redo --slide 12
uv run doit
```

Redo only audio:

```bash
uv run doit redo --slide 12 --what audio
```

To list all available tasks:

```bash
doit list
```

Example:
```
setup
extract_notes
screenshot
audio
tts_slides
elevenlabs_tts
video
final
redo
```
This pipeline does not rely on ad-hoc scripts.
`doit` gives:
- True dependency graphs
- Incremental builds
- Crash-safe resumption
- Deterministic ordering
This makes it suitable for large decks (100+ slides) and CI-like automation.
Design principles:

- Single source of truth: Google Slides
- Artifacts are explicit: text, image, audio, video
- No hidden state
- No silent reprocessing
- Human-friendly override at every stage
MIT License. Free to use, modify, and distribute.
PRs welcome for:
- Windows/Linux TTS backends
- Speaker diarization
- Subtitle (SRT/VTT) generation
- Video transitions
- Parallel execution
Built with:

- Google Slides API
- FFmpeg
- ElevenLabs
- `doit` task runner