RAG in 10 lines. No PhD required.
A dead-simple Python library and CLI that lets you build Retrieval-Augmented Generation pipelines in under 10 lines of code. Chat with your documents, codebase, or any text data.
```bash
pip install easyrag
```

```python
from easyrag import RAG

rag = RAG()
rag.ingest("./docs")
answer = rag.ask("How does authentication work?")
print(answer)
```

That's it. Five lines. No API keys. Runs locally.
Problem: A new developer joins your team. Your project has 200+ pages of docs, runbooks, API specs, and inline code comments spread across dozens of files. They spend 2 weeks just reading before they can contribute.
Solution: Feed everything into EasyRAG. Now they ask questions and get instant, cited answers.
```python
from easyrag import RAG

# One-time setup (takes ~60 seconds)
rag = RAG(llm="ollama:llama3")
rag.ingest("./docs")  # Markdown, PDFs, runbooks
rag.ingest("./src")   # Source code with comments

# New developer asks questions on day one
result = rag.ask("How do I deploy to production?")
print(result.answer)
# => "Merge to main (auto-deploys to staging), verify on staging,
#     then trigger the 'Deploy to Production' GitHub Actions workflow.
#     Requires 2 approvals. ECS auto-rolls back within 5 min on failure. [1][2]"

# Every answer cites its sources
for src in result.sources:
    print(f"  - {src.metadata['file_name']} (score: {src.score:.3f})")
# - deployment.md (score: 0.575)
# - ci_cd.md (score: 0.545)
```

Other use cases: internal Q&A bots, support knowledge bases, research paper chat, codebase exploration.
```bash
easyrag ingest ./docs                    # Ingest documents
easyrag ask "What is the retry policy?"  # Ask a question
easyrag chat                             # Interactive chat
easyrag search "authentication flow"     # Semantic search (no LLM)
easyrag serve --port 8000                # Start API server
```

| Feature | EasyRAG | LangChain | LlamaIndex |
|---|---|---|---|
| Lines to first RAG | 5 | 50+ | 30+ |
| Zero-config setup | Yes | No | No |
| Built-in vector store | Yes | No | No |
| Local-first (no API keys) | Yes | No | No |
| CLI included | Yes | No | No |
| API server included | Yes | No | No |
Supported formats:

| Category | Formats |
|---|---|
| Documents | PDF, Markdown, Plain text, CSV, JSON |
| Code | Python, JavaScript, TypeScript, Go, Rust, Java, C/C++, Ruby, PHP, and more |
| Config | YAML, TOML, XML, HTML, INI |
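All of these land in the same index, so one question can draw on docs, code, and configuration at once. A short sketch using only the `ingest()`/`ask()` calls shown above (the `./config` path is a placeholder):

```python
from easyrag import RAG

rag = RAG()
rag.ingest("./docs")    # Markdown and PDFs
rag.ingest("./src")     # source code
rag.ingest("./config")  # YAML / TOML / INI (placeholder path)

# One question can now be answered from any of those sources.
print(rag.ask("Which port does the API server listen on?"))
```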
```python
from easyrag import RAG

# Minimal -- everything has sensible defaults
rag = RAG()

# Full control
rag = RAG(
    embedding_model="openai:text-embedding-3-small",
    llm="openai:gpt-4o-mini",
    chunk_size=512,
    chunk_overlap=50,
    top_k=5,
    db_path="./my_vectors.db",
    rerank=True,
)
```

Embedding models:

| Provider | Model | API Key |
|---|---|---|
| Local (default) | all-MiniLM-L6-v2 | No |
| OpenAI | text-embedding-3-small, text-embedding-3-large | Yes |
LLM providers:

| Provider | Models | API Key |
|---|---|---|
| Ollama (default) | llama3, mistral, phi3, etc. | No |
| OpenAI | gpt-4o, gpt-4o-mini | Yes |
| Anthropic | claude-sonnet-4-20250514, claude-haiku-4-5-20251001 | Yes |
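Models are selected with the same `provider:model` strings used in the configuration above. A sketch (it assumes hosted providers pick up the usual `OPENAI_API_KEY`-style environment variables, and that Anthropic models follow the same `anthropic:<model>` naming; neither is shown explicitly here):

```python
from easyrag import RAG

# Fully local defaults: MiniLM embeddings + Ollama, no API keys needed.
local_rag = RAG()

# Hosted models, named with the provider:model convention from the tables above.
# Assumes OPENAI_API_KEY is set in the environment.
hosted_rag = RAG(
    embedding_model="openai:text-embedding-3-large",
    llm="openai:gpt-4o",
)

# Anthropic presumably follows the same pattern, e.g.:
# rag = RAG(llm="anthropic:claude-sonnet-4-20250514")
```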
Vector stores:

| Store | Best For | Dependencies |
|---|---|---|
| NumPy (default) | Most use cases, portable, single file | None |
| ChromaDB | Large datasets, production | `pip install easyrag[chroma]` |
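Because the default NumPy store lives in a single file (the `db_path` shown in the configuration above), a natural pattern is to ingest once and reuse the index later. This is a sketch that assumes an existing index at `db_path` is reloaded when a `RAG` instance is constructed with the same path:

```python
from easyrag import RAG

# First run: build the index and persist it to a single file.
rag = RAG(db_path="./my_vectors.db")
rag.ingest("./docs")

# Later run (new process): point at the same file and query it again.
# Assumes the constructor reloads an existing index at db_path.
rag = RAG(db_path="./my_vectors.db")
print(rag.ask("What is the retry policy?"))
```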
Stream answers token by token:

```python
for token in rag.ask_stream("What is the retry policy?"):
    print(token, end="", flush=True)
```

Start the API server:

```bash
easyrag serve --port 8000
```

| Method | Path | Description |
|---|---|---|
| GET | /health | Health check |
| GET | /info | Database info |
| POST | /ingest | Ingest documents |
| POST | /ask | Ask a question |
| POST | /search | Semantic search |
| WS | /ws/ask | Streaming answers via WebSocket |
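For `/ws/ask`, a minimal streaming client might look like the sketch below. It uses the third-party `websockets` package; the request payload shape is an assumption that mirrors the POST `/ask` body, and the server may frame tokens differently:

```python
import asyncio
import json

import websockets  # pip install websockets


async def stream_answer(question: str) -> None:
    # Path taken from the endpoint table above.
    async with websockets.connect("ws://localhost:8000/ws/ask") as ws:
        # Assumed payload: same JSON shape as POST /ask.
        await ws.send(json.dumps({"question": question, "top_k": 5}))
        # Print tokens as the server streams them back.
        async for token in ws:
            print(token, end="", flush=True)


asyncio.run(stream_answer("How does auth work?"))
```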
```bash
curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "How does auth work?", "top_k": 5}'
```

Run with Docker:

```bash
docker build -t easyrag .
docker run -p 8000:8000 -v ./data:/data easyrag
```

Development setup:

```bash
git clone https://github.com/aymenhmaidiwastaken/easyrag.git
cd easyrag
pip install -e ".[dev,all]"
pytest
```