cloudxlab/course_pipeline

🎥 Course Pipeline

Automated Google Slides → Narrated Videos → Final Course Video

course_pipeline is a deterministic, reproducible, dependency-aware video generation pipeline that converts a Google Slides deck into a fully narrated course video.

It supports:

  • Human voice recording
  • macOS native TTS (say)
  • ElevenLabs high-quality TTS
  • Incremental rebuilds
  • Slide-level redo
  • Fully ordered, dependency-driven execution using doit

It is designed for course creators, educators, and AI-first content workflows.


✨ Features

  • 🔗 Google Slides → Video (single source of truth)

  • 🧠 Dependency-aware pipeline (no reruns unless needed)

  • 🔢 Strict numeric ordering (1 → N slides, no randomness)

  • 🎙️ Multiple audio modes:

    • Interactive human recording
    • macOS say
    • ElevenLabs API (studio-quality voices)
  • 🎞️ Per-slide video generation

  • 🎬 Final concatenation into final.mp4

  • ♻️ Incremental & resumable (crash-safe)

  • 🧹 Redo individual slides easily
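The strict numeric ordering above matters because a plain string sort puts 10.png before 2.png. A minimal sketch of numeric ordering, assuming artifacts are named by slide number as in this README; the helper names are hypothetical and not taken from the repository:

```python
from pathlib import Path


def slide_number(path: Path) -> int:
    """Return the slide number encoded in a file name such as '12.png'."""
    return int(path.stem)


def ordered_slides(paths):
    """Sort slide artifacts numerically so that slide 2 comes before slide 10."""
    return sorted(paths, key=slide_number)


files = [Path("screenshots/10.png"), Path("screenshots/2.png"), Path("screenshots/1.png")]
print([p.name for p in ordered_slides(files)])  # ['1.png', '2.png', '10.png']
```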


📦 Pipeline Overview

Google Slides URL
        │
        ▼
extract_notes     → notes/1.txt ... notes/N.txt
screenshot        → screenshots/1.png ... screenshots/N.png
audio / tts_*     → audio/1.wav ... audio/N.wav
video             → videos/1.mp4 ... videos/N.mp4
final             → final.mp4

Each step is a doit task with explicit dependencies.
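As an illustration of what "explicit dependencies" can look like in doit, here is a sketch of a per-slide video task in dodo.py style. doit discovers functions named task_* and reads file_dep and targets from the dictionaries they yield. The build directory, slide count, and placeholder action are assumptions for the example, not the repository's actual task definitions:

```python
from pathlib import Path

BUILD = Path("build/demo_deck")  # hypothetical output directory
SLIDE_COUNT = 3                  # hypothetical; a real pipeline would derive this from the deck


def task_video():
    """Yield one doit sub-task per slide: N.png + N.wav -> N.mp4."""
    for n in range(1, SLIDE_COUNT + 1):
        png = BUILD / "screenshots" / f"{n}.png"
        wav = BUILD / "audio" / f"{n}.wav"
        mp4 = BUILD / "videos" / f"{n}.mp4"
        yield {
            "name": str(n),                         # sub-task name, e.g. video:1
            "file_dep": [str(png), str(wav)],       # rerun when either input changes
            "targets": [str(mp4)],                  # rerun when the output is missing
            "actions": [f"echo would render {mp4}"],  # placeholder shell action
        }
```

With definitions of this shape, doit skips a slide whose inputs are unchanged and whose target already exists, which is what makes incremental rebuilds possible.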


🛠️ Installation

Prerequisites

  • Python 3.11+
  • macOS (for say; optional if using ElevenLabs)
  • ffmpeg
  • Google Cloud credentials (Slides + Drive API)

Install ffmpeg:

brew install ffmpeg

Clone & setup

git clone https://github.com/<your-org>/course_pipeline.git
cd course_pipeline
python -m venv .venv
source .venv/bin/activate
pip install uv
uv sync

🔐 Credentials Setup

Google APIs

Enable:

  • Google Slides API
  • Google Drive API

Create OAuth credentials and save as:

credentials.json

The first run opens a browser for authentication and caches the tokens locally.


ElevenLabs (optional but recommended)

export ELEVENLABS_API_KEY="your_api_key_here"
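A task that depends on this variable can fail early with a clear message instead of partway through a deck. A small sketch, assuming the key is read from the environment exactly as exported above; the function name is hypothetical:

```python
import os


def elevenlabs_api_key(env=os.environ) -> str:
    """Return the ElevenLabs API key, or raise a descriptive error if it is missing."""
    key = env.get("ELEVENLABS_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ELEVENLABS_API_KEY is not set; export it before running the ElevenLabs TTS task"
        )
    return key
```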

⚙️ Configuration

Edit settings in:

scripts/settings.py

Typical options:

  • Video resolution (e.g. 1920x1080)
  • FPS
  • Output directories
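For orientation, a settings module covering these options might look like the sketch below. Every field name and default here is an assumption made for illustration (including the 30 FPS value); check scripts/settings.py for the real names:

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings object; field names are illustrative only."""
    width: int = 1920
    height: int = 1080
    fps: int = 30                      # assumed default, not taken from the project
    build_root: Path = Path("build")   # per-deck output lives under build/<deck_id>/

    def deck_dir(self, deck_id: str) -> Path:
        """Directory that holds all artifacts for one deck."""
        return self.build_root / deck_id


SETTINGS = Settings()
```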

🚀 Usage

Run the full pipeline

uv run doit

This will:

  1. Extract notes
  2. Take slide screenshots
  3. Generate audio
  4. Create per-slide videos
  5. Concatenate into final.mp4
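To make step 1 concrete: in a Google Slides API presentation response, each slide carries a notes page whose notesProperties.speakerNotesObjectId points at the shape that holds the speaker notes. The sketch below pulls the text out of one slide dictionary. It is an illustration of the response layout under that assumption, not the repository's extract_notes code, and the function name is hypothetical:

```python
def speaker_notes(slide: dict) -> str:
    """Return the speaker-notes text of one slide from a presentations.get response."""
    notes_page = slide.get("slideProperties", {}).get("notesPage", {})
    notes_id = notes_page.get("notesProperties", {}).get("speakerNotesObjectId")
    if notes_id is None:
        return ""  # slide has no notes shape
    for element in notes_page.get("pageElements", []):
        if element.get("objectId") != notes_id:
            continue
        runs = element.get("shape", {}).get("text", {}).get("textElements", [])
        # Only textRun elements carry content; paragraph markers are skipped.
        return "".join(r["textRun"]["content"] for r in runs if "textRun" in r).strip()
    return ""
```

Returning an empty string for slides without notes keeps the downstream audio step in control of how silent slides are handled.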

🎙️ Audio Options

1️⃣ Interactive Human Recording

uv run doit audio

  • Displays slide text

  • Records mic input

  • Options:

    • Enter → save & continue
    • r → re-record
    • s → skip

2️⃣ macOS Text-to-Speech

uv run doit tts_slides

Uses native say command.
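As a rough sketch of what such a call can look like, the helper below builds a say command that reads a notes file and writes a WAV file. The flags are standard say options, but the chosen sample rate, the paths, and the function name are assumptions; the project's actual invocation may differ:

```python
def say_cmd(notes_file: str, wav_file: str, sample_rate: int = 44100) -> list[str]:
    """Build a macOS `say` command that renders a text file to 16-bit PCM WAV."""
    return [
        "say",
        "-f", notes_file,                        # read the text to speak from a file
        "-o", wav_file,                          # write audio to a file instead of the speakers
        "--file-format=WAVE",
        f"--data-format=LEI16@{sample_rate}",    # little-endian 16-bit integer PCM
    ]


print(" ".join(say_cmd("notes/1.txt", "audio/1.wav")))
```

On macOS the list can be passed to subprocess.run(cmd, check=True).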


3️⃣ ElevenLabs TTS (recommended)

uv run doit elevenlabs_tts

  • Studio-quality voices
  • Handles empty slides safely
  • Converts MP3 → WAV automatically
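One way to implement the last two points is sketched below: convert the downloaded MP3 with ffmpeg, and write a short silent WAV when a slide has no notes. The silence fallback, its duration, and the function names are assumptions; the README does not say how the task treats empty slides:

```python
def mp3_to_wav_cmd(mp3: str, wav: str, sample_rate: int = 44100) -> list[str]:
    """ffmpeg command that converts an MP3 file to mono WAV."""
    return ["ffmpeg", "-y", "-i", mp3, "-ar", str(sample_rate), "-ac", "1", wav]


def silence_cmd(wav: str, seconds: float = 1.0, sample_rate: int = 44100) -> list[str]:
    """ffmpeg command that writes a silent WAV using the anullsrc source."""
    return [
        "ffmpeg", "-y",
        "-f", "lavfi", "-i", f"anullsrc=r={sample_rate}:cl=mono",
        "-t", str(seconds),
        wav,
    ]


def audio_cmd(notes_text: str, mp3: str, wav: str) -> list[str]:
    """Pick the command for one slide: silence for empty notes, conversion otherwise."""
    if not notes_text.strip():
        return silence_cmd(wav)
    return mp3_to_wav_cmd(mp3, wav)
```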

Force regeneration:

uv run doit elevenlabs_tts --force

🎞️ Video Generation

Generate per-slide videos:

uv run doit video

Each slide:

1.png + 1.wav → 1.mp4

Optimized for:

  • QuickTime
  • VLC
  • YouTube compatibility
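A common ffmpeg recipe for turning a still image plus audio into a broadly playable MP4 is sketched here. H.264 with yuv420p pixels and AAC audio is the combination QuickTime and YouTube generally accept. The exact flags the pipeline uses are not stated in this README, so treat the command and the function name as assumptions:

```python
def slide_video_cmd(png: str, wav: str, mp4: str, fps: int = 30) -> list[str]:
    """ffmpeg command that renders one slide image with its narration."""
    return [
        "ffmpeg", "-y",
        "-loop", "1", "-framerate", str(fps), "-i", png,  # repeat the still image
        "-i", wav,
        "-c:v", "libx264", "-tune", "stillimage",
        "-pix_fmt", "yuv420p",        # widest player compatibility; needs even dimensions
        "-c:a", "aac",
        "-shortest",                  # stop when the narration ends
        "-movflags", "+faststart",    # move the index to the front for streaming
        mp4,
    ]


print(" ".join(slide_video_cmd("screenshots/1.png", "audio/1.wav", "videos/1.mp4")))
```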

🎬 Final Output

uv run doit final

Result:

build/<deck_id>/final.mp4

Force re-encode (if needed):

uv run doit final --reencode
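The difference between the two modes can be sketched with ffmpeg's concat demuxer: stream copy is fast but requires every clip to share the same codec parameters, while re-encoding is slower and tolerates mismatches. Mapping --reencode onto these two commands is an assumption, and the function names are hypothetical:

```python
from pathlib import Path


def concat_list(videos) -> str:
    """Contents of a concat-demuxer list file, in numeric slide order."""
    ordered = sorted(videos, key=lambda p: int(Path(p).stem))
    return "".join(f"file '{Path(p).as_posix()}'\n" for p in ordered)


def concat_cmd(list_file: str, output: str, reencode: bool = False) -> list[str]:
    """ffmpeg command that joins the listed clips into one file."""
    codec = (
        ["-c:v", "libx264", "-pix_fmt", "yuv420p", "-c:a", "aac"]
        if reencode
        else ["-c", "copy"]  # no re-encode; clips must have matching streams
    )
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", list_file, *codec, output]


print(concat_list(["videos/10.mp4", "videos/2.mp4", "videos/1.mp4"]), end="")
```

Relative paths in the list file are resolved against the list file's own directory, so absolute paths are the safer choice.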

♻️ Redo a Single Slide

Redo slide 12 completely:

uv run doit redo --slide 12
uv run doit

Redo only audio:

uv run doit redo --slide 12 --what audio

📋 Available Tasks

uv run doit list

Example output:

setup
extract_notes
screenshot
audio
tts_slides
elevenlabs_tts
video
final
redo

🧠 Why doit?

This pipeline does not rely on ad-hoc scripts.

doit gives:

  • True dependency graphs
  • Incremental builds
  • Crash-safe resumption
  • Deterministic ordering

This makes it suitable for large decks (100+ slides) and CI-like automation.


🧩 Design Principles

  • Single source of truth: Google Slides
  • Artifacts are explicit: text, image, audio, video
  • No hidden state
  • No silent reprocessing
  • Human-friendly override at every stage

📄 License

MIT License. Free to use, modify, and distribute.


🤝 Contributing

PRs welcome for:

  • Windows/Linux TTS backends
  • Speaker diarization
  • Subtitle (SRT/VTT) generation
  • Video transitions
  • Parallel execution

🙌 Acknowledgements

  • Google Slides API
  • FFmpeg
  • ElevenLabs
  • doit task runner
