Feature Request: Support external API embedding providers (OpenAI/Ollama compatible) #521

Description

@jsty409

Summary

QMD currently only supports local GGUF models via node-llama-cpp for embeddings. It would be great to support external API-based embedding providers (OpenAI-compatible, Ollama, etc.) as an alternative.

Use Case

Many users already have powerful embedding models running externally:

  • Ollama hosting bge-m3 (1024-dim) on a NAS/server
  • OpenAI text-embedding-3-small/large
  • Self-hosted TEI (Text Embeddings Inference) endpoints

These often provide better embedding quality than the local 300M/0.6B GGUF models, especially for non-English content (CJK languages).

Current Workaround

There is no real workaround today. Users who need high-quality embeddings plus reranking cannot use QMD because:

  1. Local embeddinggemma-300M has limited CJK coverage
  2. Qwen3-Embedding-0.6B is better but still weaker than bge-m3 (1024-dim)
  3. No way to point QMD at an external embedding API

Proposed Solution

Add an environment variable or config option like:

# Use Ollama OpenAI-compatible API for embeddings
export QMD_EMBED_PROVIDER="openai"
export QMD_EMBED_API_URL="http://192.168.1.13:11434/v1"
export QMD_EMBED_MODEL="bge-m3"
export QMD_EMBED_DIMENSIONS=1024

This would allow QMD to call external embedding APIs while keeping reranking and query expansion local (which is a great design choice).
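To make the proposal concrete, here is a minimal sketch of how the variables above could map onto an OpenAI-compatible `POST /embeddings` call. None of this is existing QMD code: the function names, the `EmbedConfig` shape, and the optional `QMD_EMBED_API_KEY` variable are all hypothetical. It also assumes `QMD_EMBED_DIMENSIONS` is only used to validate the returned vectors rather than being sent to the server, since not every OpenAI-compatible backend accepts a `dimensions` field.

```typescript
// Hypothetical sketch: these names are illustrative and do not exist in QMD.
interface EmbedConfig {
  provider: string;
  apiUrl: string;      // base URL, e.g. ".../v1", without a trailing slash
  model: string;
  dimensions?: number; // expected vector length, checked on the response
  apiKey?: string;     // assumed QMD_EMBED_API_KEY, not needed for Ollama
}

type Env = Record<string, string | undefined>;

// Read the proposed variables; null means "keep using the local GGUF model".
function resolveEmbedConfig(env: Env): EmbedConfig | null {
  const provider = env.QMD_EMBED_PROVIDER;
  if (!provider) return null;
  const apiUrl = env.QMD_EMBED_API_URL;
  const model = env.QMD_EMBED_MODEL;
  if (!apiUrl || !model) {
    throw new Error("QMD_EMBED_API_URL and QMD_EMBED_MODEL must be set");
  }
  const raw = env.QMD_EMBED_DIMENSIONS;
  const dimensions = raw ? Number(raw) : undefined;
  if (dimensions !== undefined && (!Number.isInteger(dimensions) || dimensions <= 0)) {
    throw new Error("QMD_EMBED_DIMENSIONS must be a positive integer");
  }
  return {
    provider,
    apiUrl: apiUrl.replace(/\/+$/, ""),
    model,
    dimensions,
    apiKey: env.QMD_EMBED_API_KEY,
  };
}

interface RequestInitLike {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Build the request for the OpenAI-style embeddings endpoint.
function buildEmbeddingRequest(cfg: EmbedConfig, texts: string[]): { url: string; init: RequestInitLike } {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (cfg.apiKey) headers["Authorization"] = `Bearer ${cfg.apiKey}`;
  return {
    url: `${cfg.apiUrl}/embeddings`,
    init: { method: "POST", headers, body: JSON.stringify({ model: cfg.model, input: texts }) },
  };
}

interface EmbeddingResponse {
  data: { index: number; embedding: number[] }[];
}

// Restore input order via `index` and check the vector length if one was configured.
function parseEmbeddingResponse(cfg: EmbedConfig, json: EmbeddingResponse): number[][] {
  const vectors = [...json.data].sort((a, b) => a.index - b.index).map((d) => d.embedding);
  for (const v of vectors) {
    if (cfg.dimensions !== undefined && v.length !== cfg.dimensions) {
      throw new Error(`Expected ${cfg.dimensions} dimensions, got ${v.length}`);
    }
  }
  return vectors;
}

type FetchLike = (
  url: string,
  init: RequestInitLike,
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// The fetch implementation is passed in so the function is easy to stub in tests.
async function embedRemote(cfg: EmbedConfig, texts: string[], fetchImpl: FetchLike): Promise<number[][]> {
  const { url, init } = buildEmbeddingRequest(cfg, texts);
  const res = await fetchImpl(url, init);
  if (!res.ok) throw new Error(`Embedding request failed: HTTP ${res.status}`);
  return parseEmbeddingResponse(cfg, (await res.json()) as EmbeddingResponse);
}
```

With the example configuration, a caller would run `resolveEmbedConfig(process.env)` once at startup and, if the result is non-null, pass it to `embedRemote(cfg, chunks, fetch)` in place of the local embedding step. Reranking and query expansion would stay untouched. The dimension check also guards against mixing vectors of different sizes in an existing index after a model change.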

Environment

  • QMD version: 2.1.0
  • OS: Ubuntu 24.04 (x64)
  • Node: 22.22.0

Thanks for the great tool! The hybrid search + query expansion + reranking pipeline is excellent.
