dpshade/permaweb-llm-fuel

🚀 Permaweb LLM Fuel

An interactive tool for selecting and curating Permaweb documentation into llms.txt format for AI training.

🎯 Features

  • 🌐 Multi-site crawling: Automatically crawl documentation from Hyperbeam, AO, ArIO, and core Arweave sites
  • 🎯 Smart filtering: Quality scoring and content enhancement for optimal AI training data
  • 📊 Interactive selection: Browse and filter documentation with real-time search
  • 📄 llms.txt generation: Export curated content in the standard format for AI training
  • 📱 Responsive design: Works on desktop and mobile devices
  • 🚀 Multiple deployment options: Deploy to Vercel or Arweave with flexible configurations

🏗️ Development Setup

Prerequisites

  • Bun (v1.0+) - Install Bun
  • Node.js (v18+) - For fallback compatibility
  • Git - For version control

Quick Start

# Clone the repository
git clone https://github.com/your-org/permaweb-llm-fuel.git
cd permaweb-llm-fuel

# Install dependencies
bun install

# Start development server
bun run dev

# Open in browser
open http://localhost:4321

Available Scripts

# Development
bun run dev              # Start development server
bun run preview          # Preview production build locally
bun run preview:local    # Preview with custom server

# Building
bun run build            # Build for production
bun run build:vercel     # Build for Vercel deployment
bun run clean            # Clean build artifacts

# Testing
bun run test             # Run tests once
bun run test:watch       # Run tests in watch mode
# Add --ui or --coverage flags as needed:
# bun run test --ui       # Run tests with UI
# bun run test --coverage # Run tests with coverage

# Linting & Validation
bun run lint             # Run Astro linter
bun run validate         # Run tests + lint + build

# Documentation Crawling
bun run crawl            # Show help and crawl all sites (pretty JSON)
bun run crawl:prod       # Crawl all sites (minified JSON for production)
bun run crawl <site>     # Crawl specific site (hyperbeam, ao, ario, arweave)
bun run crawl --force    # Force reindex all sites
bun run crawl <site> --force  # Force reindex specific site

# Deployment (via GitHub Actions)
bun run deploy:preview   # Deploy to preview (push to preview branch)
bun run deploy:prod      # Deploy to production (push to master branch)

🚀 CI/CD Pipeline

This repository uses GitHub Actions to automate testing, building, and deployment.

Workflow Overview

graph TD
    A[Push to Branch] --> B{Branch Type?}
    B -->|Feature Branch| C[Create PR]
    B -->|Preview Branch| D[Deploy to Preview]
    B -->|Master Branch| E[Deploy to Production]
    
    C --> F[Run Tests & Build]
    F --> G[Comment on PR]
    G --> H[Merge to Master]
    H --> E
    
    D --> I[Vercel Preview]
    E --> J[Vercel Production]
    E --> K[ArNS Deployment]
    
    L[Daily Cron] --> M[Crawl Documentation]
    M --> N[Update Index]
    N --> O[Deploy Updates]

Environment Structure

| Environment | Platform | URL | Deployment |
|-------------|----------|-----|------------|
| Development | Local | localhost:4321 | Manual (bun run dev) |
| Preview | Vercel | Vercel preview URLs | Manual (bun run deploy:preview) |
| Production | Arweave | fuel_permawebllms.ar.io | Manual (bun run deploy:prod) |

GitHub Actions Workflows

1. Main Deployment Pipeline (.github/workflows/deploy.yml)

Triggers:

  • Pull requests to master
  • Pushes to master or preview branches
  • Manual workflow dispatch

Jobs:

  • test-and-build: Validates PRs and feature branches
  • deploy: Deploys to preview or production environments
  • cleanup: Maintains workflow history
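
The branch routing described by the workflow diagram and the triggers above can be sketched as a small function. This is only an illustration of the rules as the README states them; the function name `resolveDeployTarget` and its return values are hypothetical and are not taken from deploy.yml.

```javascript
// Hypothetical helper: maps a Git event to the deployment target described
// in the README. Names and return values are assumptions, not the real workflow.
function resolveDeployTarget(branch, eventName = 'push') {
  // Pull requests are validated (tests + build) but never deployed.
  if (eventName === 'pull_request') return 'test-only';
  if (branch === 'master') return 'production';
  if (branch === 'preview') return 'preview';
  // Feature branches reach production only after a PR is merged to master.
  return 'test-only';
}

console.log(resolveDeployTarget('preview'));                  // preview
console.log(resolveDeployTarget('master'));                   // production
console.log(resolveDeployTarget('feature/amazing-feature'));  // test-only
console.log(resolveDeployTarget('master', 'pull_request'));   // test-only
```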

2. Daily Documentation Crawl (.github/workflows/daily-crawl-deploy.yml)

Triggers:

  • Daily at 2 AM UTC
  • Manual trigger

Process:

  1. Crawls all configured documentation sites
  2. Generates fresh documentation index
  3. Uploads index to dedicated ArNS endpoint
  4. Reports crawl statistics

Deployment Commands

Preview Deployment (Vercel)

# Deploy to Vercel for fast iteration and testing
bun run deploy:preview

Production Deployment (Arweave)

# Deploy to permanent Arweave storage
bun run deploy:prod

Interactive Deployment

# Choose deployment target interactively
bun run deploy
# Select: 1) Preview (Vercel) or 2) Production (Arweave)

Required GitHub Secrets

Configure these secrets in your GitHub repository settings:

# Vercel Deployment
VERCEL_TOKEN=your_vercel_token
VERCEL_ORG_ID=team_6GvOyT5ARQJH1mo4x6fXQqzE
VERCEL_PROJECT_ID=prj_eFRPzU5WfC6v7l4GNqrtzkSUqdcT

# ArNS Deployment (when enabled)
DEPLOY_KEY=your_arweave_wallet_jwk
ANT_PROCESS=your_ant_process_id

📁 Project Structure

permaweb-llm-fuel/
├── .github/
│   └── workflows/            # CI/CD workflows
├── public/                   # Static assets
│   ├── crawl-config.json     # Crawl configuration
│   ├── docs-index.json       # Generated documentation index
│   └── favicon.svg
├── scripts/                  # Build and deployment scripts
│   ├── deploy-preview.sh     # Preview deployment
│   ├── deploy-production.sh  # Production deployment
│   └── post-build.js         # Post-build optimization
├── src/
│   ├── pages/                # Astro pages
│   ├── styles/               # CSS styles
│   └── utils/                # Utility functions
│       ├── crawler.js        # Documentation crawler
│       ├── content-enhancer.js
│       ├── quality-scorer.js
│       └── batch-processor.js
├── test/                     # Test files
├── astro.config.mjs          # Astro configuration
├── package.json              # Dependencies and scripts
├── preview-server.js         # Development preview server
└── vitest.config.js          # Test configuration

🧪 Testing

The project uses Vitest for testing, with the jsdom environment for DOM tests.

# Run all tests
bun run test

# Watch mode for development
bun run test:watch

# Run with coverage
bun run test:coverage

# Interactive UI
bun run test:ui

Test Categories

  • Unit Tests: Individual function testing
  • Integration Tests: Component interaction testing
  • Crawler Tests: Documentation crawling validation
  • Content Tests: Quality scoring and enhancement validation

🔧 Configuration

Crawl Configuration (public/crawl-config.json)

Configure documentation sources:

{
  "hyperbeam": {
    "url": "https://docs.hyperbeam.xyz",
    "selectors": {
      "content": "main",
      "title": "h1"
    }
  }
}
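
As a rough illustration of how such a file could be sanity-checked before a crawl, the sketch below verifies that every site entry has the fields shown in the example. Only `url`, `selectors.content`, and `selectors.title` come from the README; the function name `validateCrawlConfig` and the validation rules are assumptions, and the project's crawler may check more or less than this.

```javascript
// Sketch: report site entries that lack the fields shown in the example config.
// validateCrawlConfig is a hypothetical helper, not part of the project's crawler.
function validateCrawlConfig(config) {
  const errors = [];
  for (const [site, entry] of Object.entries(config)) {
    try {
      new URL(entry?.url); // throws a TypeError for a missing or malformed URL
    } catch {
      errors.push(`${site}: invalid or missing "url"`);
    }
    for (const key of ['content', 'title']) {
      const selector = entry?.selectors?.[key];
      if (typeof selector !== 'string' || selector.trim() === '') {
        errors.push(`${site}: missing selector "${key}"`);
      }
    }
  }
  return errors;
}

const sample = {
  hyperbeam: {
    url: 'https://docs.hyperbeam.xyz',
    selectors: { content: 'main', title: 'h1' },
  },
  broken: { url: 'not a url', selectors: { content: 'main' } },
};

console.log(validateCrawlConfig(sample));
// [ 'broken: invalid or missing "url"', 'broken: missing selector "title"' ]
```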

Build Configuration (astro.config.mjs)

Astro configuration for static site generation:

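import { defineConfig } from 'astro/config';

export default defineConfig({
  // Emit a fully static site
  output: 'static',
  build: {
    // Inline all stylesheets into the HTML output
    inlineStylesheets: 'always'
  }
});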

🌐 Deployment

Vercel Deployment

The project deploys to Vercel for both preview and production environments:

  • Preview: Automatic deployment on preview branch pushes
  • Production: Automatic deployment on master branch pushes

ArNS Deployment (Coming Soon)

Future deployment to Arweave Permaweb with ArNS:

# Will be enabled when ready
npx permaweb-deploy \
  --arns-name=permaweb-llm-fuel \
  --ant-process=$ANT_PROCESS \
  --deploy-folder=dist

🤝 Contributing

Development Workflow

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Make your changes
  4. Run tests: bun run validate
  5. Commit changes: git commit -m 'feat: add amazing feature'
  6. Push to branch: git push origin feature/amazing-feature
  7. Create Pull Request

Code Standards

  • Use kebab-case for file names
  • Use camelCase for variables and functions
  • Use PascalCase for classes and constructors
  • Write tests for new functionality
  • Follow existing code style and patterns

Commit Messages

Use conventional commit format:

feat: add new documentation source
fix: resolve crawler timeout issue
docs: update deployment instructions
test: add unit tests for quality scorer
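
A commit subject can be checked against this format with a short regular expression. The sketch below is one possible check, not a hook shipped with the repository: the name `isConventionalCommit` is hypothetical, and the allowed types are limited to the four that appear in the examples above, so a real setup would likely accept more.

```javascript
// Types taken from the README examples; extending this list is an assumption
// left to the project (e.g. chore, refactor).
const ALLOWED_TYPES = ['feat', 'fix', 'docs', 'test'];

// Shape: <type>(optional scope)(optional !): <description>
const COMMIT_PATTERN = /^([a-z]+)(\([\w.-]+\))?(!)?: (\S.*)$/;

function isConventionalCommit(subject) {
  const match = COMMIT_PATTERN.exec(subject);
  return match !== null && ALLOWED_TYPES.includes(match[1]);
}

console.log(isConventionalCommit('feat: add new documentation source'));   // true
console.log(isConventionalCommit('fix(crawler): resolve timeout issue'));  // true
console.log(isConventionalCommit('Add amazing feature'));                  // false
```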

📊 Monitoring

Build Status

Monitor deployment status:

  • GitHub Actions: Check workflow runs
  • Vercel Dashboard: Monitor deployments
  • Error Tracking: Review build logs

Performance Metrics

  • Build time optimization
  • Bundle size monitoring
  • Crawl success rates
  • Site performance metrics

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


Built with ❤️ for the Permaweb ecosystem

Architecture

Permaweb LLM Fuel keeps its documentation index in a single file:

  • 📁 Public Access: public/docs-index.json is served at /docs-index.json for frontend consumption and external API access
  • 🔄 Automatic Generation: Crawl processes generate and maintain the index file automatically
  • ⚡ Optimized Loading: The frontend loads the index via fetch for dynamic content rendering

This simplified structure eliminates redundancy while maintaining all functionality.
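
Loading the index from the frontend or from an external consumer might look like the sketch below. The README does not document the index schema, so the parsed JSON is returned unchanged; the function name `loadDocsIndex` and the injectable `fetchImpl` parameter are assumptions made so the example can run without a network.

```javascript
// Sketch: fetch the generated index from its public path.
// loadDocsIndex is a hypothetical helper; only the /docs-index.json path
// comes from the README.
async function loadDocsIndex(fetchImpl = globalThis.fetch, baseUrl = '') {
  const response = await fetchImpl(`${baseUrl}/docs-index.json`);
  if (!response.ok) {
    throw new Error(`Failed to load docs index: HTTP ${response.status}`);
  }
  // Schema is undocumented here, so hand back whatever the file contains.
  return response.json();
}
```

In the browser the defaults resolve against the current origin, so `loadDocsIndex()` is enough. Outside the browser, pass an absolute `baseUrl` such as the local dev server address.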

🔧 UI Customization via Query Parameters

You can customize the Permaweb LLM Fuel UI by supplying query parameters in the URL. This is useful for embedding the tool in iframes, theming, or hiding certain UI elements.

| Parameter | Example | Description |
|-----------|---------|-------------|
| iframe | ?iframe=true | Enables the iframe-optimized UI (removes the header, uses a compact layout, etc.) |
| hide-header | ?hide-header=1 | Hides the main page header/title |
| minimal | ?minimal=true | Hides the main header for a compact layout |
| theme | ?theme=dark | Forces the color theme (light or dark) |
| translucent | ?translucent=1 | Enables a translucent background (for overlays/embeds) |
| accent | ?accent=%23ff6600 | Sets the accent color (hex, URL-encoded, e.g. %23ff6600) |
| bg-color | ?bg-color=%23f5f5f5 | Sets the background color (hex, URL-encoded) |
| text-color | ?text-color=%23000000 | Sets the text color (hex, URL-encoded) |

Example Usage

  • Minimal mode:
    https://your-llm-fuel.app/?minimal=true
    
  • Iframe mode with dark theme:
    https://your-llm-fuel.app/?iframe=true&theme=dark
    
  • Hide header and use a custom accent color:
    https://your-llm-fuel.app/?hide-header=1&accent=%23ff6600
    
  • Translucent overlay with custom background:
    https://your-llm-fuel.app/?translucent=1&bg-color=%23ffffff
    

You can combine multiple parameters as needed. Most parameters are compatible with each other.
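
To make the parameter handling concrete, here is a minimal sketch of reading these values with the standard `URLSearchParams` API. It is not the project's implementation: the function name `parseUiOptions`, the option names in the returned object, and the choice to accept only `true` and `1` as truthy values are all assumptions.

```javascript
// Sketch: turn the documented query parameters into a plain options object.
// parseUiOptions and the returned property names are hypothetical.
function parseUiOptions(search) {
  const params = new URLSearchParams(search);

  // Assumption: only "true" and "1" switch a flag on.
  const flag = (name) => ['true', '1'].includes(params.get(name));

  // URLSearchParams already decodes %23 to "#"; accept 3- or 6-digit hex only.
  const hexColor = (name) => {
    const value = params.get(name) ?? '';
    return /^#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})$/.test(value) ? value : null;
  };

  const theme = params.get('theme');

  return {
    iframe: flag('iframe'),
    hideHeader: flag('hide-header'),
    minimal: flag('minimal'),
    theme: theme === 'light' || theme === 'dark' ? theme : null,
    translucent: flag('translucent'),
    accent: hexColor('accent'),
    bgColor: hexColor('bg-color'),
    textColor: hexColor('text-color'),
  };
}

const options = parseUiOptions('?hide-header=1&accent=%23ff6600&theme=dark');
console.log(options.hideHeader, options.accent, options.theme);
// true #ff6600 dark
```

Returning `null` for a missing or malformed value lets the caller fall back to the default styling instead of applying a broken color.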
