This document explains critical vector dimension compatibility issues when switching between different embedding models in Flexible GraphRAG.
When switching between different LLM providers or embedding models, you MUST delete existing vector indexes because different models produce embeddings with different dimensions.
Vector databases create indexes optimized for specific dimensions. When you change embedding models, the new embeddings won't fit the existing index structure, causing errors like:
- `Dimension mismatch error`
- `Vector size incompatible with index`
- `Index dimension does not match embedding dimension`
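This incompatibility is a hard failure, not a warning: a vector whose length differs from the index's dimension is rejected outright. A toy sketch in plain Python (illustrative only, not Flexible GraphRAG code) shows the same failure mode:

```python
def dot(a: list[float], b: list[float]) -> float:
    """Inner product, refusing vectors of different lengths."""
    if len(a) != len(b):
        raise ValueError(f"Dimension mismatch: index expects {len(a)}, got {len(b)}")
    return sum(x * y for x, y in zip(a, b))

stored = [0.1] * 1536  # e.g. a text-embedding-3-small vector already in the index
query = [0.1] * 384    # e.g. an all-minilm vector after switching to Ollama

try:
    dot(stored, query)
except ValueError as err:
    print(err)  # Dimension mismatch: index expects 1536, got 384
```

A real vector database raises the equivalent error at insert or query time, which is why the old index must be deleted rather than reused.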
OpenAI:
- text-embedding-3-large: 3072 dimensions
- text-embedding-3-small: 1536 dimensions (default)
- text-embedding-ada-002: 1536 dimensions

Ollama:
- all-minilm: 384 dimensions (default)
- nomic-embed-text: 768 dimensions
- mxbai-embed-large: 1024 dimensions

Azure OpenAI:
- Same as OpenAI models: 1536 or 3072 dimensions

Other providers:
- Default fallback: 1536 dimensions
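The lists above can be condensed into a lookup with the same 1536-dimension fallback. This helper is hypothetical (the model names and dimensions are from the lists above; the function itself is not part of the codebase):

```python
# Per-model embedding dimensions, as listed above
MODEL_DIMENSIONS = {
    # OpenAI
    "text-embedding-3-large": 3072,
    "text-embedding-3-small": 1536,
    "text-embedding-ada-002": 1536,
    # Ollama
    "all-minilm": 384,
    "nomic-embed-text": 768,
    "mxbai-embed-large": 1024,
}

def embedding_dimension(model: str, default: int = 1536) -> int:
    """Return the expected vector dimension, falling back to 1536."""
    return MODEL_DIMENSIONS.get(model, default)

print(embedding_dimension("all-minilm"))     # 384
print(embedding_dimension("unknown-model"))  # 1536 (fallback)
```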
When frequently switching between embedding models (OpenAI ↔ Ollama), choose databases with user-friendly deletion:
| Database | Deletion Method | Difficulty | Dashboard |
|---|---|---|---|
| Qdrant ✅ | One-click collection deletion | ⭐ Easy | Web UI |
| Milvus ✅ | Professional drop operations | ⭐⭐ Moderate | Attu Dashboard |
| Weaviate ✅ | Schema-based deletion | ⭐⭐ Moderate | Console |
| Chroma | HTTP mode: API deletion; local mode: file cleanup | ⭐⭐ Moderate | Swagger API (HTTP) |
| LanceDB | File/table deletion | ⭐⭐ Moderate | Viewer + Files |
| PostgreSQL ❌ | SQL commands required | ⭐⭐⭐ Advanced | pgAdmin |
| Pinecone | Cloud console only | ⭐⭐ Moderate | Web Console |
💡 Recommendation: Use Qdrant or Milvus for the easiest vector cleanup when switching embedding models.
Using Qdrant Dashboard:
- Open Qdrant Dashboard: http://localhost:6333/dashboard
- Go to "Collections" tab
- Find `hybrid_search_vector` (or your collection name) in the collections list
- Click the three-dot (⋮) menu next to the collection
- Select "Delete"
- Confirm the deletion
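Qdrant also exposes collection deletion over its REST API (`DELETE /collections/{name}`). A stdlib-only sketch that builds such a request; actually sending it with `urllib.request.urlopen(req)` requires a running Qdrant on the default port:

```python
from urllib.request import Request

def qdrant_delete_request(collection: str,
                          host: str = "localhost",
                          port: int = 6333) -> Request:
    # Qdrant deletes a collection via DELETE /collections/{name}
    return Request(f"http://{host}:{port}/collections/{collection}",
                   method="DELETE")

req = qdrant_delete_request("hybrid_search_vector")
print(req.get_method(), req.full_url)
# To actually delete: urllib.request.urlopen(req)
```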
Using Neo4j Browser:
- Open Neo4j Browser: http://localhost:7474 (or your Neo4j port)
- Login with your credentials
- Drop Vector Index:
  - Run: `SHOW INDEXES`
  - Run: `DROP INDEX hybrid_search_vector IF EXISTS`
  - Run: `SHOW INDEXES` to verify cleanup
Using Kibana Dashboard:
- Open Kibana: http://localhost:5601 (if Kibana is running)
- Choose "Management" from the main menu
- Click "Index Management"
- Select `hybrid_search_vector` from the indices list
- Choose "Manage index" (blue button)
- Choose "Delete index"
- Confirm the deletion
Alternative - Using Elasticsearch REST API:
```shell
# Delete the vector index via curl
curl -X DELETE "http://localhost:9200/hybrid_search_vector"
```

Using OpenSearch Dashboards:
- Open OpenSearch Dashboards: http://localhost:5601 (if running) or http://localhost:9201/_dashboards
- Go to "Index Management" (in the main menu or under "Management")
- Click on "Indices" tab
- Find `hybrid_search_vector` in the indices list
- Click the checkbox next to the index
- Click "Actions" → "Delete"
- Confirm the deletion by typing "delete"
Alternative - Using OpenSearch REST API:
```shell
# Delete the vector index via curl
curl -X DELETE "http://localhost:9201/hybrid_search_vector"
```

Chroma supports two deployment modes with different cleanup approaches:
Local Mode (PersistentClient) - File System Cleanup:
```shell
# Delete Chroma directory (contains all vector data)
rm -rf ./chroma_db

# Or on Windows (cmd)
rmdir /s /q .\chroma_db

# Or on Windows PowerShell
Remove-Item -Path .\chroma_db -Recurse -Force

# Verify cleanup
ls -la  # Should not show chroma_db directory
```

HTTP Mode (HttpClient) - Using curl or Swagger API:
```shell
# List all collections
curl "http://localhost:8001/api/v2/tenants/default_tenant/databases/default_database/collections"

# Delete specific collection
curl -X DELETE "http://localhost:8001/api/v2/tenants/default_tenant/databases/default_database/collections/hybrid_search"
```

Via Swagger UI (http://localhost:8001/docs):
- Find the DELETE endpoint for collections
- Enter tenant: `default_tenant`
- Enter database: `default_database`
- Enter collection: `hybrid_search`
- Execute
Alternative - Using Python API (for both modes):
```python
import chromadb

# For Local Mode (PersistentClient)
client = chromadb.PersistentClient(path="./chroma_db")

# For HTTP Mode (HttpClient)
# client = chromadb.HttpClient(host="localhost", port=8001)

# Delete collection
client.delete_collection("hybrid_search")

# Verify
print(client.list_collections())  # Should not include hybrid_search
```

Via Milvus Attu Dashboard (http://localhost:3003):
- Open Attu Dashboard at http://localhost:3003
- Navigate to the Collections page
- Find your collection (typically `hybrid_search`)
- Click the "Drop" button next to the collection
- Confirm the deletion by typing the collection name
- Click "Drop Collection" to confirm
Alternative - Using the Milvus REST API:

```shell
# Drop the collection via Milvus's HTTP API
curl -X DELETE "http://localhost:19530/v1/collection" \
  -H "Content-Type: application/json" \
  -d '{"collection_name": "hybrid_search"}'
```

Via Weaviate Console (http://localhost:8081/console):
- Open Weaviate Console at http://localhost:8081/console
- Navigate to the Schema section
- Find your class (typically `HybridSearch` or `Documents`)
- Click the "Delete Class" button
- Confirm deletion - this removes all vectors in the class
Alternative - Using Weaviate API:
```shell
# Delete entire class (removes all vectors)
curl -X DELETE "http://localhost:8081/v1/schema/HybridSearch"
```

Via pgAdmin (http://localhost:5050):
- Open pgAdmin at http://localhost:5050
- Login with `admin@flexible-graphrag.com` / `admin`
- Connect to the PostgreSQL server (`postgres:5432`)
- Navigate to Tables in the database
- Find your vector table (e.g., `hybrid_search_vectors`)
- Right-click → Delete/Drop → Cascade
- Confirm deletion
Alternative - Using SQL Commands:
```sql
-- Delete all vectors from the table
DELETE FROM hybrid_search_vectors;

-- Or drop the entire table
DROP TABLE IF EXISTS hybrid_search_vectors CASCADE;

-- Verify cleanup
\dt -- List tables to confirm deletion
```

Reference: n8n Community - Deleting pgvector content
Via Pinecone Console (https://app.pinecone.io):
- Log in to the Pinecone Console at https://app.pinecone.io
- Navigate to the Indexes page in the left navigation
- Find your index (typically `hybrid-search`)
- Click the three vertical dots (•••) to the right of the index name
- Select "Delete" from the dropdown menu
- Confirm deletion in the dialog box
⚠️ Warning: This is permanent and irreversible!
Note: Pinecone is a managed service - no local deletion needed.
Via LanceDB Viewer (http://localhost:3005):
- Open LanceDB Viewer at http://localhost:3005
- Navigate to the Tables section
- Find your table (typically `hybrid_search`)
- Click the "Delete Table" button
- Confirm deletion
Alternative - File System Cleanup:
```shell
# Delete LanceDB directory (contains all vector data)
rm -rf ./lancedb

# Or on Windows
rmdir /s /q .\lancedb

# Verify cleanup
ls -la  # Should not show lancedb directory
```

When switching embedding models, follow this process:
```shell
# Export any important data before deletion
# (Implementation depends on your database)
```

```shell
# Edit your .env file
LLM_PROVIDER=ollama         # Changing from openai to ollama
EMBEDDING_MODEL=all-minilm  # 384 dimensions
```

Choose the appropriate cleanup method from above based on your vector database.
```shell
# Restart your application
cd flexible-graphrag
uv run start.py
```

```shell
# Re-process your documents with the new embedding model
curl -X POST "http://localhost:8000/api/ingest" \
  -H "Content-Type: application/json" \
  -d '{"data_source": "filesystem", "paths": ["./your_documents"]}'
```

If the old index was not cleaned up first, errors like these appear:

```
Vector dimension mismatch: expected 1536, got 384
Vector index dimension (1536) does not match embedding dimension (384)
mapper_parsing_exception: dimension mismatch
```
The system automatically detects embedding dimensions in flexible-graphrag/factories.py:
```python
def get_embedding_dimension(llm_provider: LLMProvider, llm_config: Dict[str, Any]) -> int:
    if llm_provider == LLMProvider.OPENAI:
        return 1536  # or 3072 for large models
    elif llm_provider == LLMProvider.OLLAMA:
        return 384   # default for all-minilm
    # ... other providers
```

The dimension is automatically applied to vector database configurations in config.py:

```python
"embed_dim": 1536 if self.llm_provider == LLMProvider.OPENAI else 384
```

When using Ollama embeddings with Ladybug (GRAPH_DB=ladybug) and a separate VECTOR_DB (for example Qdrant), use one embedding model end-to-end and set EMBEDDING_DIMENSION to match (for example 384 for all-minilm, 768 for nomic-embed-text). If you change embedding models or dimensions, clear the vector index data and remove or recreate the Ladybug .lbug file before re-ingesting.
Ladybug can store vectors on chunk nodes when LADYBUG_USE_VECTOR_INDEX=true; those vectors must use the same embedding model and dimension as your configured VECTOR_DB.
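A cheap pre-ingest guard is to embed one probe string and compare the result's length to the configured EMBEDDING_DIMENSION before writing anything. This helper is a hypothetical sketch, not part of the codebase:

```python
def check_embedding_dimension(sample_vector: list[float], configured_dim: int) -> None:
    """Fail fast before ingest instead of mid-run at the vector index."""
    actual = len(sample_vector)
    if actual != configured_dim:
        raise ValueError(
            f"EMBEDDING_DIMENSION is {configured_dim}, but the embedding "
            f"model returned {actual}-dimensional vectors; clean the vector "
            f"index (and the Ladybug .lbug file) before re-ingesting."
        )

# In practice the probe would come from your embedding model, e.g.
# probe = embed_model.get_text_embedding("dimension probe")
probe = [0.0] * 768                    # nomic-embed-text output length
check_embedding_dimension(probe, 768)  # passes silently
```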
- Plan Your Embedding Model: Choose your embedding model before ingesting large document collections
- Test with Small Data: Verify compatibility with a small test dataset first
- Document Your Configuration: Keep track of which embedding model you're using
- Backup Strategy: Consider backup procedures if you need to preserve processed data
- Environment Separation: Use different databases/collections for different embedding models
- Consistent Naming: Use explicit collection/database names to avoid defaults mismatches
- Ollama + Ladybug: Align embedding dimensions across Ladybug and `VECTOR_DB` before large ingests
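The environment-separation and consistent-naming practices can be combined by baking the model and dimension into the collection name, so two embedding models can never collide in one index. A hypothetical naming helper (not part of the codebase):

```python
def collection_name(base: str, model: str, dim: int) -> str:
    """Embed model + dimension in the name so two models never collide."""
    safe_model = model.replace("-", "_").replace(".", "_")
    return f"{base}__{safe_model}__{dim}"

print(collection_name("hybrid_search", "all-minilm", 384))
# hybrid_search__all_minilm__384
print(collection_name("hybrid_search", "text-embedding-3-small", 1536))
# hybrid_search__text_embedding_3_small__1536
```

With this scheme, switching models creates a fresh, empty collection instead of colliding with the old index, at the cost of manually cleaning up retired collections.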
After switching models and cleaning databases, verify the setup:
```shell
# Test with a small document
curl -X POST "http://localhost:8000/api/test-sample" \
  -H "Content-Type: application/json" \
  -d '{}'

# Check system status
curl "http://localhost:8000/api/status"
```

- Main README - Full system setup
- Neo4j Cleanup - Detailed Neo4j cleanup procedures
- Docker Setup - Container-based deployment
- Configuration Guide - Environment configuration