Ragbot is the chat-led runtime of synthesis engineering — the open methodology for systematic human-AI collaboration on complex work. It is the workbench where humans and AI do the craft of synthesis one collaborative turn at a time, in the user's own workspace, with skills, agent execution, durable memory, and bi-directional MCP. Workspace-aware. Local-first with frontier fallback. Production-grade observability. MIT-licensed.
Developed by Rajiv Pant. See INSTALL.md for setup, CONFIGURE.md for keys and providers, and CONTRIBUTING.md before opening a PR.
Ragbot v3.4.0 (May 14, 2026) is the next-major-features release. It moves the project from a polished 2024-paradigm chat-with-RAG product to a 2026-shaped conversational AI runtime: explicit agent loop, first-class MCP in both directions, an executable skills runtime, cross-workspace synthesis with visible confidentiality boundaries, durable memory beyond vector RAG, and the production-grade signals that make the architecture legible to engineering leadership.
Read the v3.4.0 release notes → · GitHub release →
- Agent loop runtime. Hand-rolled FSM — no LangGraph, no CrewAI, no AutoGen. Multi-step planning, sub-agent dispatch, sandboxed code execution (E2B / Daytona / DisabledSandbox), durable checkpoints, deterministic replay via `ragbot agent replay <task_id>`. 45 tests across the agent loop core and capabilities surface.
- First-class MCP — client and server. All six primitives (tools, resources, prompts, Roots, Sampling, Elicitation) plus Tasks. OAuth 2.1 + Dynamic Client Registration for remote servers. As a server, Ragbot exposes its workspace surface to Claude Code, Cursor, ChatGPT desktop, Gemini CLI, and any MCP-aware peer over stdio or HTTP/SSE.
- Skills as runtime. Progressive-disclosure execution of SKILL.md capabilities — Ragbot becomes the third compatible runtime for the SKILL.md format (after Claude Code and Codex CLI). Six starter skills ship in the box; `npx skills add synthesisengineering/synthesis-skills` pulls in 32 more.
- Cross-workspace synthesis. Multi-workspace chat with per-workspace `routing.yaml`, four-level confidentiality (public / personal / client-confidential / air-gapped), and an append-only audit log at `~/.synthesis/cross-workspace-audit.jsonl`. The confidentiality gate fires before retrieval, so denied workspace combinations never read content. Citations name the source workspace explicitly.
- Three-tier memory beyond RAG. Vector RAG + entity graph + session/working memory. A consolidation pass between sessions distills durable facts from the previous session into the entity graph — Anthropic's "Dreaming" pattern. Mem0 and Letta integrations are available behind the abstraction.
- Production-grade observability. OpenTelemetry GenAI semantic conventions on every model call, retrieval step, guardrail check, and tool dispatch. OTLP gRPC export to a bundled Jaeger in the docker-compose stack; `OTEL_EXPORTER_OTLP_ENDPOINT` redirects to Phoenix, Langfuse, Datadog, or Honeycomb. Prometheus exposition at `/api/metrics`; cache stats at `/api/metrics/cache`.
- Keyboard shortcut layer. A coherent set across the web UI: `⌘K` model picker, `⌘J` workspace switch, `⌘/` message-history search, `⌘N` new chat, `⌘B` background the current run, `⌘.` cancel the current run, `⌘?` help overlay.
- Open-weights additions. Llama 4, Qwen3, DeepSeek-V3, Mistral Large, and updated Gemma 4 entries in `engines.yaml` with Ollama 0.19 MLX backend notes. Full hardware-sizing matrix in `docs/open-weights-sizing.md`.
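The pre-retrieval confidentiality gate described above can be sketched in a few lines. This is an illustrative assumption, not Ragbot's actual API: the function name `gate_workspaces`, the dict shapes, and the specific mixing rules (air-gapped never combines; two client-confidential workspaces never combine) are made up for the example, but the key property matches the release note — the check runs, and the audit line is written, before any content is read.

```python
import json
import time

def gate_workspaces(workspaces, audit_path=None):
    """Return True only if this workspace combination may be queried together.

    Runs BEFORE retrieval: a denial means no workspace content is ever read.
    Each decision is appended to a JSONL audit log when a path is given.
    """
    levels = [ws["confidentiality"] for ws in workspaces]
    allowed = True
    # Hypothetical rules: air-gapped workspaces never mix with anything,
    # and two client-confidential workspaces never mix with each other.
    if "air-gapped" in levels and len(workspaces) > 1:
        allowed = False
    if levels.count("client-confidential") > 1:
        allowed = False
    record = {
        "ts": time.time(),
        "workspaces": [ws["name"] for ws in workspaces],
        "allowed": allowed,
    }
    if audit_path:  # append-only: one JSON object per line
        with open(audit_path, "a") as f:
            f.write(json.dumps(record) + "\n")
    return allowed
```

Because the gate sits in front of retrieval rather than filtering results afterwards, a denied combination leaves no trace of either workspace's content in the conversation.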
Ragbot v3.3 (May 2026) adds first-class local model support and a redesigned model picker:
- Local Gemma 4 via Ollama. A new `ollama` engine ships Google's Gemma 4 family (E4B, 26B MoE, 31B Dense) as first-class models alongside Anthropic, OpenAI, and Google. No API key required; LiteLLM routes via the `ollama_chat/` prefix. The Docker stack reaches host Ollama via `host.docker.internal:11434` out of the box (configurable with `OLLAMA_API_BASE` if Ollama runs elsewhere on your LAN).
- Redesigned model picker. A single rich dropdown replaces the three-step Provider → Category → Model cascade. Display names (`Claude Opus 4.7` instead of `claude-opus-4-7`). Pinned and Recent sections at the top. Type-ahead search. `⌘K`/`Ctrl+K` global shortcut. Per-row badges for tier (Fast / Balanced / Powerful), context window, 🧠 thinking, 🏠 local.
- User preferences API. New `/api/preferences/pinned-models` and `/api/preferences/recent-models` endpoints persist your model selections across sessions in `~/.synthesis/ragbot.yaml`.
- Thinking control moved adjacent to Model. Renders inline below the picker, only for thinking-capable models — so the control isn't there when it has no effect.
- Bug fix: non-flagship GPT-5.x and Gemini models no longer return empty content on long-context RAG calls. The default `reasoning_effort` for these models is now the lowest declared mode (`minimal`) rather than unset, so the provider's own reasoning default doesn't consume the entire output-token budget.
- Security. LiteLLM pinned `>=1.83.0` in requirements (excludes the compromised 1.82.7 / 1.82.8 range from the March 2026 supply-chain incident).
The model picker, opened — one dropdown grouped by provider with Pinned, Recent, type-ahead search, and capability badges:
Type-ahead filters across the full list. Searching `gemma` surfaces all three local Gemma 4 variants:
Selecting a local model surfaces the 🏠 badge on the trigger and hides the Thinking control (Gemma 4 doesn't expose thinking effort):
Advanced panel — per-provider API-key status, including the new local-only Ollama entry:
Ragbot v3.2 (April 2026) ships a one-command demo mode and a refreshed screenshot set captured against it:
- Demo mode via `RAGBOT_DEMO=1` (or `ragbot --demo`) ships a small bundled workspace and skill in `demo/`, hard-isolates discovery from the user's real workspaces, and surfaces an unmistakable banner in the Web UI. Anyone can clone the repo, set an API key, and have a working chat with RAG retrieval in under a minute — no database or workspace setup needed. `/health` and `/api/config` report `demo_mode` so any consumer (UI, ops dashboards, screenshot tools) can render the right affordances.
- Twenty new tests lock in the discovery-isolation contract so a future change can't accidentally let real workspace or skill names leak through when demo mode is on.
A live conversation in demo mode showing RAG retrieval citing the bundled sample documents:
Ragbot v3.1 (April 2026) adds an LLM backend abstraction that decouples ragbot from any single provider gateway:
- Swappable LLM backends via `RAGBOT_LLM_BACKEND={litellm|direct}`. The default `litellm` backend keeps the broadest provider/model coverage; the `direct` backend calls the Anthropic, OpenAI, and google-genai SDKs directly with no third-party dependency. Adding alternatives (Bifrost, Portkey, OpenRouter) is a one-file change.
- Web UI controls for reasoning effort and the cross-workspace skills toggle, alongside the existing workspace/model picker. `/api/chat` accepts `thinking_effort` and `additional_workspaces` fields directly.
Strategic note on LiteLLM in 2026: it remains a defensible default because of provider/model coverage, but the March-2026 supply-chain incident (versions 1.82.7–1.82.8) and the API-compatibility lag for Claude 4.7+'s thinking.type.adaptive shape are real frictions. Pinning >=1.83.0 avoids the compromised range; the abstraction layer makes a future swap a configuration change, not a code rewrite.
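A minimal sketch of what that backend seam could look like. The `LLMBackend` protocol, the registry dict, and `get_backend` are illustrative assumptions, not Ragbot's actual interfaces; the point is that swapping gateways touches one registry entry, not call sites.

```python
from typing import Protocol

class LLMBackend(Protocol):
    """Anything that can turn (model, messages) into a completion string."""
    def complete(self, model: str, messages: list[dict], **kwargs) -> str: ...

class LiteLLMBackend:
    def complete(self, model, messages, **kwargs):
        import litellm  # broadest provider/model coverage via one dependency
        resp = litellm.completion(model=model, messages=messages, **kwargs)
        return resp.choices[0].message.content

class DirectBackend:
    def complete(self, model, messages, **kwargs):
        # Would dispatch to the anthropic / openai / google-genai SDKs
        # directly; stubbed here to keep the sketch self-contained.
        raise NotImplementedError

BACKENDS = {"litellm": LiteLLMBackend, "direct": DirectBackend}

def get_backend(name: str = "litellm") -> LLMBackend:
    # RAGBOT_LLM_BACKEND would select the entry; adding Bifrost, Portkey,
    # or OpenRouter is one more line in this registry.
    return BACKENDS[name]()
```

Call sites depend only on the protocol, so a future gateway swap stays a configuration change rather than a code rewrite.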
Ragbot v3.0 (April 2026) ships three major upgrades over v2:
- Pgvector by default. PostgreSQL with the `pgvector` extension is the default vector backend, replacing embedded Qdrant. Native full-text search via `tsvector` + GIN replaces in-process BM25. The legacy embedded Qdrant backend remains as an opt-in fallback (`RAGBOT_VECTOR_BACKEND=qdrant`).
- Agent Skills as first-class content. Ragbot discovers and indexes Agent Skills (`SKILL.md` plus references and scripts) from `~/.synthesis/skills`, `~/.claude/skills`, and plugin caches. New `ragbot skills {list,info,index}` CLI. The compiler can include skills via a `sources.skills` block in `compile-config.yaml`.
- Workspace-rooted layout. AI Knowledge repos are discovered across `~/workspaces/*/ai-knowledge-*` and via the synthesis-engineering shared `~/.synthesis/console.yaml` source list. Configuration moved to `~/.synthesis/` (legacy `~/.config/ragbot/` falls through).
Plus reasoning-effort wiring (Claude 4.x adaptive thinking, GPT-5.5 reasoning, Gemini 3.x thinking levels) — see `--thinking-effort` and `RAGBOT_THINKING_EFFORT`.
Ragbot is a reference implementation of the synthesis-engineering methodology, focused on the conversational interaction primitive. Sibling reference implementations cover other primitives: synthesis-console for direct manipulation (browse and edit), Ragenie for the procedural primitive (workflow definition with autonomous execution), and synthesis-skills as the portable capability format consumed by every runtime and by external SKILL.md-compatible agents (Claude Code, Codex CLI, Cursor, Gemini CLI). The family of reference implementations will grow as the methodology and the AI landscape evolve. All implementations share the ~/.synthesis/ config home, the ai-knowledge-* workspace model, and a Python substrate library, and they integrate through Model Context Protocol (MCP) calls and a filesystem-as-source-of-truth contract.
Ragbot is developed using Synthesis Engineering (also known as Synthesis Coding)—a systematic approach that combines human architectural expertise with AI-assisted implementation. This methodology ensures that while AI accelerates development velocity, engineers maintain architectural authority, enforce quality standards, and deeply understand every component of the system.
Key principles applied in Ragbot's development:
- Human-defined architecture with AI-accelerated implementation
- Systematic quality assurance regardless of code origin
- Context preservation across development sessions
- Iterative refinement based on real-world usage
Learn more about this approach:
- Synthesis Engineering: The Professional Practice
- The Organizational Framework
- Technical Implementation with Claude Code
Code Contributors & Collaborators
How to Contribute
Your code contributions are welcome! Please read CONTRIBUTING.md for important safety guidelines (especially about not committing personal data), then fork the repository and submit a pull request with your improvements.
Get Ragbot running in 5 minutes:
# 1. Clone this repository
git clone https://github.com/synthesisengineering/ragbot.git
cd ragbot
# 2. Set up your API keys
cp .env.docker .env
# Edit .env and add at least one API key (OpenAI, Anthropic, or Gemini)
# 3. Get starter templates from ai-knowledge-ragbot
git clone https://github.com/rajivpant/ai-knowledge-ragbot.git ~/ai-knowledge-ragbot
cp -r ~/ai-knowledge-ragbot/source/datasets/templates/ datasets/my-data/
cp ~/ai-knowledge-ragbot/source/instructions/templates/default.md instructions/
# 4. Customize with your information
# Edit the files in datasets/my-data/ with your personal details
# 5. Start Ragbot with Docker
docker-compose up -d
# 6. Access the web interface
open http://localhost:3000

If you want to keep your data in a separate directory or private repository:
# 1. Clone Ragbot
git clone https://github.com/synthesisengineering/ragbot.git
cd ragbot
# 2. Create your data directory
mkdir ~/ragbot-data
# Or clone your private data repo: git clone <your-private-repo> ~/ragbot-data
# 3. Set up Docker override
cp docker-compose.override.example.yml docker-compose.override.yml
# Edit docker-compose.override.yml to point to your data directory
# 4. Organize your data (get templates from ai-knowledge-ragbot)
git clone https://github.com/rajivpant/ai-knowledge-ragbot.git ~/ai-knowledge-ragbot
cp -r ~/ai-knowledge-ragbot/source/datasets/templates/* ~/ragbot-data/datasets/
cp ~/ai-knowledge-ragbot/source/instructions/templates/default.md ~/ragbot-data/instructions/
# 5. Configure API keys
cp .env.docker .env
# Edit .env with your API keys
# 6. Start Ragbot
docker-compose up -d

- 📖 Knowledge Base: Get templates and runbooks from ai-knowledge-ragbot
- 🎓 Understand the philosophy: Read docs/DATA_ORGANIZATION.md
- 🐳 Docker deployment: See README-DOCKER.md for deployment guide
- 🤝 Contributing safely: Read CONTRIBUTING.md before contributing
- ⚙️ Detailed setup: Follow the installation guide and configuration guide
Want to evaluate ragbot end-to-end without setting up a workspace, a database, or any data? Use demo mode:
git clone https://github.com/synthesisengineering/ragbot.git
cd ragbot
cp .env.example .env
# Edit .env to set at least one API key (Anthropic, OpenAI, or Google).
# Install Python deps (in your preferred virtualenv):
pip install -r requirements.txt
# Start the bundled Postgres + ragbot stack:
docker compose up -d
# Run any subcommand in demo mode:
RAGBOT_DEMO=1 python3 src/ragbot.py db status
RAGBOT_DEMO=1 python3 src/ragbot.py skills list
RAGBOT_DEMO=1 python3 src/ragbot.py chat -p "What is ragbot?"

Or for the Web UI:
RAGBOT_DEMO=1 python3 -m uvicorn src.api.main:app --port 8000 &
cd web && npm install && NEXT_PUBLIC_API_URL=http://localhost:8000 npm run dev
# Open http://localhost:3000 — you'll see a yellow "🎭 Demo mode" banner.

Demo mode hard-isolates discovery to the bundled `demo/ai-knowledge-demo/` workspace and `demo/skills/ragbot-demo-skill/`. Real workspaces on the host are invisible to discovery while `RAGBOT_DEMO=1` is set, so the demo is safe to drive in front of an audience or to capture screenshots from. Unset the env var to return to your real workspaces.
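The isolation contract is simple enough to sketch. The function and constant names below are illustrative assumptions, not Ragbot internals; what matters is that the env-var check gates discovery itself, so real workspace names never enter the candidate list.

```python
import os

DEMO_WORKSPACES = ["demo/ai-knowledge-demo"]

def discover_workspaces(real_workspaces):
    """Return discoverable workspaces, honouring RAGBOT_DEMO.

    When RAGBOT_DEMO=1, only the bundled demo workspace is visible;
    real workspaces on the host are filtered out entirely, so nothing
    can leak into a live demo or a screenshot session.
    """
    if os.environ.get("RAGBOT_DEMO") == "1":
        return list(DEMO_WORKSPACES)
    return list(real_workspaces)
```

Gating at the discovery step (rather than hiding entries in the UI) is what makes the leak-proofing testable: the tests only need to assert on the returned list.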
Ragbot's user configuration lives under ~/.synthesis/ (a shared home for synthesis-engineering tools — keys are reused by other tools in the family without duplication):
~/.synthesis/
├── keys.yaml # API keys (per-provider; per-workspace overrides supported)
├── ragbot.yaml # Ragbot user prefs (default_workspace, etc.)
└── console.yaml # Synthesis-console source list (optional; ragbot reads it for repo discovery)
Legacy ~/.config/ragbot/{keys,config}.yaml is still read as a fallback so existing setups keep working.
Ragbot's default vector store is PostgreSQL with the pgvector extension. The schema is shared across workspaces (one chunks table with a workspace column, an HNSW vector index for cosine ANN, and a generated tsvector + GIN index for native full-text search). Migrations are applied idempotently on first connection.
For local development without Docker, install pgvector for your PostgreSQL and point RAGBOT_DATABASE_URL at it. With Docker Compose, the bundled postgres service starts automatically. See CONFIGURE.md for both paths.
Use `ragbot db status` to confirm the active backend; the legacy embedded Qdrant backend remains available via `RAGBOT_VECTOR_BACKEND=qdrant`.
Ragbot indexes Agent Skills (directories containing SKILL.md) as first-class content. The full directory tree is honoured — references/**/*.md and bundled scripts (*.py, *.sh, etc.) are all indexed and become queryable via RAG.
ragbot skills list # show all discovered skills
ragbot skills info <skill-name> # full details for one skill
ragbot skills index              # index all skills into the 'skills' workspace

Discovery sources (later wins on name collision):

- `~/.synthesis/skills/` (synthesis-engineering shared install)
- `~/.claude/skills/` (Claude Code private)
- `~/.claude/plugins/cache/<vendor>/skills/` (plugin-installed)
- Per-workspace roots declared in `compile-config.yaml` (`sources.skills.roots`)
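The "later wins" precedence can be sketched as a left-to-right dict merge. The function name and the name-to-path dict shape are illustrative assumptions:

```python
def merge_skill_sources(*sources):
    """Merge skill registries in discovery order.

    Each source maps skill name -> path. Later sources override earlier
    ones, so a plugin-installed or per-workspace skill shadows a shared
    install of the same name.
    """
    merged = {}
    for source in sources:  # passed in the discovery order listed above
        merged.update(source)
    return merged

shared = {"summarize": "~/.synthesis/skills/summarize"}
private = {"summarize": "~/.claude/skills/summarize"}
# private is later in the order, so it wins the name collision:
skills = merge_skill_sources(shared, private)
```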
When the skills workspace has indexed content, `ragbot chat` automatically merges its results with the user's selected workspace via cross-workspace retrieval. Disable per-call with `--no-skills` or programmatically via `additional_workspaces=[]`.
Models that advertise thinking support in `engines.yaml` (Claude Sonnet 4.6, Claude Opus 4.7, GPT-5.5, GPT-5.5-pro, Gemini 3.x) are wired through LiteLLM's `reasoning_effort` parameter. Defaults: flagship models → medium, non-flagship with thinking → off, models without thinking metadata → silent (no params sent).
ragbot chat --thinking-effort high -p "explain this..." # explicit high effort
RAGBOT_THINKING_EFFORT=low ragbot chat -p "explain this..." # globally low
ragbot chat --thinking-effort off -p "..."                   # disable on a flagship

Ragbot implements a production-grade, multi-stage RAG pipeline based on research from leading AI systems including Perplexity, ChatGPT, Claude, and Gemini. Unlike simple RAG implementations, Ragbot uses sophisticated techniques proven to significantly improve retrieval accuracy.
Query → Phase 1 (Foundation) → Phase 2 (Query Intel) → Phase 3 (Hybrid Retrieval) → Generate (Response) → Phase 4 (Verify with Confidence & CRAG) → Response
Four-Phase Pipeline:
| Phase | Description | Key Techniques |
|---|---|---|
| Phase 1 | Foundation | Query preprocessing, full document retrieval, 16K context budget |
| Phase 2 | Query Intelligence | LLM planner, multi-query expansion (5-7 variations), HyDE |
| Phase 3 | Hybrid Retrieval | BM25 + Vector search, Reciprocal Rank Fusion, LLM reranking |
| Phase 4 | Verification | Hallucination detection, confidence scoring, CRAG loop |
Based on benchmarks from Anthropic, Microsoft, and other research:
| Technique | Impact |
|---|---|
| Contextual embeddings | 35% fewer retrieval failures |
| Hybrid search + reranking | 67% fewer retrieval failures |
| Query rewriting (multi-query) | +21 NDCG points |
- Query Preprocessing: Expands contractions ("what's" → "what is"), extracts key terms
- Document Detection: Recognizes "show me my biography" style queries
- Full Document Retrieval: Returns complete documents instead of fragments when appropriate
- Enhanced Embeddings: Includes filename and title in embeddings for better matching
- LLM Query Planner: Analyzes intent, determines retrieval strategy
- Multi-Query Expansion: Generates 5-7 query variations for better recall
- HyDE (Hypothetical Document Embeddings): Generates hypothetical answers for semantic search
- Provider-Agnostic: Uses fast model from same provider as user's selection
- Dual Search: Combines semantic (vector) and lexical (BM25) search
- Reciprocal Rank Fusion: Merges results from both search methods
- LLM Reranking: Scores relevance 0-10, reorders by combined score
- Result: Best of both semantic understanding and exact keyword matching
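The fusion step above can be sketched in a few lines. This is a standard Reciprocal Rank Fusion implementation; the `k=60` constant follows the original RRF paper and is an assumption about Ragbot's choice, not confirmed by the source.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears
    in, so items ranked highly by BOTH BM25 and vector search rise to
    the top without any score normalisation between the two systems.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["d1", "d2", "d3"]     # lexical ranking
vector = ["d2", "d3", "d1"]   # semantic ranking
fused = reciprocal_rank_fusion([bm25, vector])
```

Here `d2` wins because it is near the top of both lists, even though neither list ranked it first by itself.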
- Claim Extraction: Identifies factual claims in generated responses
- Evidence Matching: Checks each claim against retrieved context
- Confidence Scoring: 0.0-1.0 score based on claim verification
- CRAG (Corrective RAG): Re-retrieves for low-confidence responses (<0.7)
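The verification loop above can be sketched as follows. Claim extraction and evidence matching are stubbed with a simple substring check here, where the real pipeline uses an LLM for both; all function names and the single-retry shape are illustrative assumptions. Only the 0.7 threshold comes from the source.

```python
def confidence(claims, context):
    """Fraction of claims supported by the retrieved context (0.0-1.0)."""
    if not claims:
        return 1.0
    supported = sum(1 for claim in claims if claim.lower() in context.lower())
    return supported / len(claims)

def answer_with_crag(claims, context, retrieve_more, threshold=0.7):
    """Score the answer; if confidence falls below the threshold,
    re-retrieve once (the corrective step in CRAG) and re-score."""
    score = confidence(claims, context)
    if score < threshold:
        context = context + " " + retrieve_more()
        score = confidence(claims, context)
    return score

score = answer_with_crag(
    claims=["ragbot uses pgvector"],
    context="Ragbot stores embeddings elsewhere.",
    retrieve_more=lambda: "Ragbot uses pgvector as its default vector store.",
)
```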
- Select a workspace in the sidebar
- Click "Index Workspace" in Advanced Settings to build the index (first time only)
- Enable "Enable RAG" checkbox
- Adjust "RAG context tokens" slider to control how much context is retrieved
| Setting | Default | Description |
|---|---|---|
| Enable RAG | On | Toggle RAG-augmented responses |
| RAG context tokens | 16000 | Maximum tokens for retrieved context |
| Confidence threshold | 0.7 | CRAG triggers below this score |
| Embedding model | all-MiniLM-L6-v2 | 384-dimension embeddings |
- Vector Database: PostgreSQL with `pgvector` by default; the legacy embedded Qdrant backend (local file-based storage at `/app/qdrant_data`) remains available via `RAGBOT_VECTOR_BACKEND=qdrant`
- Embedding Model: sentence-transformers `all-MiniLM-L6-v2` (80 MB, 384 dimensions)
- Chunking: ~500 tokens per chunk with 50-token overlap
- Similarity: Cosine distance for semantic matching
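The chunking scheme above can be sketched as a sliding window. This uses a plain token list as a stand-in for the real tokenizer, and the function name is illustrative:

```python
def chunk_tokens(tokens, size=500, overlap=50):
    """Split a token list into overlapping chunks.

    Each chunk starts (size - overlap) tokens after the previous one,
    so neighbouring chunks share `overlap` tokens of context and no
    sentence straddling a boundary is lost to both chunks.
    """
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break
    return chunks

tokens = [f"t{i}" for i in range(1000)]
chunks = chunk_tokens(tokens)  # 3 chunks: [0:500], [450:950], [900:1000]
```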
For the complete technical architecture, see docs/rag-architecture.md.
Ragbot integrates with the AI Knowledge ecosystem for managing knowledge bases across multiple workspaces.
The ai-knowledge-ragbot repository contains open-source runbooks, templates, and guides that ship with Ragbot:
- Instruction templates - Starter configurations for AI assistants
- Dataset templates - Personal and professional profile templates
- Runbooks - Procedures for content creation, communication, system configuration
- Guides - Reference materials for working with AI
Personal ai-knowledge repos can inherit from ai-knowledge-ragbot to get these shared resources while adding private content.
The AI Knowledge system manages content across multiple workspaces using a three-part architecture:
| Operation | Where | When |
|---|---|---|
| Knowledge concatenation (`all-knowledge.md`) | CI/CD (GitHub Actions) | Every push to `source/` |
| Instruction compilation | Local (`ragbot compile`) | When instructions change (rare) |
| RAG indexing | Local (`ragbot index`) | When content changes + RAG needed |
Key concept: Edit source/ files directly. Knowledge concatenation is automatic. See docs/compilation-guide.md for details.
ai-knowledge-{workspace}/
├── source/ # Your source files (authoritative)
│ ├── instructions/ # WHO - Identity, persona, rules
│ ├── runbooks/ # HOW - Procedures, workflows
│ └── datasets/ # WHAT - Reference knowledge
├── compiled/ # Auto-generated
│ └── {project}/
│ └── instructions/ # LLM-specific (claude.md, chatgpt.md, gemini.md)
└── all-knowledge.md # Concatenated knowledge (CI/CD via GitHub Actions)
Quick examples:
# Compile instructions for a project
ragbot compile --project {name} --no-llm
# Index workspace for RAG
ragbot index --workspace {name}

For detailed setup instructions, see the LLM Project Setup Guide.
Ragbot automatically discovers AI Knowledge repositories by convention:
- Mount your `ai-knowledge` parent directory to `/app/ai-knowledge`
- Ragbot scans for directories matching `ai-knowledge-{workspace}`
- Each discovered repo provides instructions and knowledge for that workspace
# docker-compose.override.yml
services:
ragbot-web:
volumes:
- ${HOME}/workspaces:/root/workspaces:ro
- ./workspaces:/app/workspaces:ro

Create `workspace.yaml` files to customize workspace behavior:
# workspaces/my-project/workspace.yaml
name: My Project
description: Project-specific AI assistant
status: active
type: work
inherits_from:
  - personal # Inherit from personal workspace

| Content Type | Loading Method | Use Case |
|---|---|---|
| Instructions | Always loaded | Core identity and behavior |
| Datasets | Direct or RAG | Small: direct, Large: RAG |
| Runbooks | RAG retrieval | Retrieved when relevant |
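The `inherits_from` example above could resolve as a recursive merge. The merge semantics sketched here (parents first, in listed order, child keys win on conflict) are an assumption for illustration, not Ragbot's documented behaviour:

```python
def resolve_workspace(name, configs):
    """Merge a workspace config with its inherited parents.

    Parents are applied first, in the order listed under inherits_from,
    so the child's own keys win on conflict.
    """
    config = configs[name]
    merged = {}
    for parent in config.get("inherits_from", []):
        merged.update(resolve_workspace(parent, configs))
    merged.update({k: v for k, v in config.items() if k != "inherits_from"})
    return merged

configs = {
    "personal": {"name": "Personal", "type": "personal"},
    "my-project": {"name": "My Project", "type": "work",
                   "inherits_from": ["personal"]},
}
resolved = resolve_workspace("my-project", configs)
```

Here `my-project` keeps its own `name` and `type` while silently picking up any keys that only the `personal` parent defines.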
Ragbot supports models from Anthropic (Claude), OpenAI (GPT and reasoning models), Google (Gemini), and local open-weights models via Ollama. The authoritative list — model IDs, context windows, thinking-mode support, tier badges, and defaults — lives in engines.yaml. v3.3's redesigned model picker reads from the same file at runtime, so what you see in the UI matches what is configured in the repo.
Adding or updating models is an engines.yaml change, not a code change. See the v3.3 release notes for the local-model integration details.
Ragbot v3.4 ships expanded local-model coverage out of the box. The ollama engine in engines.yaml includes Gemma 4 (E4B, 26B MoE, 31B Dense), Llama 4 (Scout, Maverick), Qwen3.6 (27B Dense, 35B-A3B MoE), DeepSeek V3.2, and Mistral (Small 4, Medium 3.5, Large 3). Each entry carries MLX-backend notes, license info, recommended quantization tags, and a real-world parameter count.
Choosing a model for your Mac. Open-weights inference on Apple Silicon is bound by unified memory and bandwidth. A 16 GB Mac mini runs Gemma 4 E4B; a Mac Studio with 256 GB unified memory runs DeepSeek V3.2 and Mistral Large 3. The sizing matrix at docs/sizing-matrix.md maps every model in engines.yaml to each Mac hardware profile (Mac mini 16/32/64 GB, MacBook Air 24/32 GB, MacBook Pro M4 Pro 48 / M5 Max 128 GB, Mac Studio M3 Ultra 192/256 GB), with per-model memory footprints at FP16 and Q4, expected MLX tokens-per-second, and comfortable / tight / Q4-only / won't fit verdicts. Start there if you're trying to pick a model for your Mac, or pick a Mac for a model.
Read the installation guide and the configuration and personalization guide.
The screenshots below were captured against the bundled demo workspace.
The model-picker and advanced-panel shots reflect v3.3 (May 2026); the
chat-and-skills shots are still v3.2 captures since those features didn't
change. The full sets are at screenshots/v3.3/ and
screenshots/v3.2/.
Settings panel and welcome state (v3.3)
Model picker open — Pinned, Recent, and by-provider sections (v3.3)
A chat that retrieves from the bundled sample documents (v3.2)
Advanced settings expanded (v3.3)
Cross-workspace skills auto-include (v3.2)
A follow-up question retrieves from the bundled demo skill via the cross-workspace fan-out:
The CLI uses workspaces with RAG (Retrieval-Augmented Generation) and automatically loads LLM-specific instructions based on the model you're using.
ragbot chat [options]
Input Options:
-p, --prompt PROMPT Prompt text
-f, --prompt_file FILE Read prompt from file
-i, --interactive Interactive mode with history
--stdin Read prompt from stdin
Workspace & Knowledge:
-profile NAME Workspace to use (auto-loads instructions and enables RAG)
--rag / --no-rag Enable/disable RAG retrieval (default: enabled)
Model Selection:
-e {openai,anthropic,google} Engine/provider
-m MODEL Model name (or 'flagship' for best)
Custom Instructions:
-c PATH [PATH ...] Explicit instruction files (overrides auto-loading)
-nc Disable all instructions
The recommended way to use the CLI is with workspaces:
# Chat with a workspace - instructions auto-loaded, RAG enabled
ragbot chat -profile personal -p "What are my travel preferences?"
# Use Anthropic Claude (loads claude.md instructions)
ragbot chat -profile personal -e anthropic -p "Summarize my work history"
# Use OpenAI GPT-5.2 (loads chatgpt.md instructions)
ragbot chat -profile personal -e openai -m gpt-5.2 -p "Summarize my work history"
# Use Google Gemini (loads gemini.md instructions)
ragbot chat -profile personal -e google -p "Summarize my work history"

The system automatically loads the correct instruction file based on the LLM:
| Engine | Instruction File |
|---|---|
| anthropic | compiled/{workspace}/instructions/claude.md |
| openai | compiled/{workspace}/instructions/chatgpt.md |
| google | compiled/{workspace}/instructions/gemini.md |
Maintain conversation history across multiple prompts:
ragbot chat -profile personal -i
> Tell me about my professional background
Ragbot.AI: [response based on RAG-retrieved knowledge]
> Summarize it in 3 bullet points
Ragbot.AI: [continues with context]
> /save session.json
Conversation saved to ...
> /quit

Full help output for ragbot chat:
$ ragbot chat --help
usage: ragbot chat [-h] [-ls] [-p PROMPT | -f PROMPT_FILE | -i | --stdin]
[-profile PROFILE] [-c [CUSTOM_INSTRUCTIONS ...]] [-nc]
[--rag] [--no-rag]
[-e {openai,anthropic,google}] [-m MODEL] [-t TEMPERATURE]
[-mt MAX_TOKENS] [-l LOAD]
Ragbot.AI is an augmented brain and assistant. Learn more at https://ragbot.ai
options:
-h, --help show this help message and exit
-ls, --list-saved List all the currently saved JSON files.
-p, --prompt The user's input prompt
-f, --prompt_file Read prompt from a file
-i, --interactive Enable interactive mode with conversation history
--stdin Read prompt from stdin
-profile Workspace name (enables RAG and auto-loads instructions)
-c Custom instruction file paths (overrides auto-loading)
-nc Disable custom instructions
--rag Enable RAG retrieval (default)
--no-rag Disable RAG - instructions only
-e {openai,anthropic,google} LLM engine/provider
-m MODEL Model name or 'flagship'
-t TEMPERATURE Creativity (0-2)
-mt MAX_TOKENS Max response tokens
-l LOAD              Load previous session from file

Knowledge is retrieved via RAG (Retrieval-Augmented Generation) from indexed workspace content:
ragbot chat -profile personal -p "What are my travel preferences?"
# RAG enabled for workspace: personal
# [Response based on retrieved knowledge]

Ragbot is by Synthesis Engineering · synthesisengineering.org · synthesiscoding.org · MIT License









