
[Feature Request]: Support LLMs, embeddings & reranking models served through vLLM #4316

@K-Mistele

Description

Is there an existing issue for the same feature request?

  • I have checked the existing issues.

Is your feature request related to a problem?

No response

Describe the feature you'd like

It would be great if RAGflow could support LLMs served through vLLM's OpenAI-compatible API server, as well as embedding models and re-ranking models served through vLLM.

Describe implementation you've considered

vLLM is the most popular open-source framework for high-performance, GPU-accelerated serving of LLMs and other AI models, and it would be great to have support for it.

vLLM has a well-documented, OpenAI-compatible API for LLM serving and embeddings, plus a custom API for serving reranking models.
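To illustrate, here is a minimal sketch of how an integration could call a vLLM server for all three model types. The base URL, model names, and the rerank endpoint path/payload are assumptions for illustration only; the exact rerank/score API shape should be checked against the vLLM documentation for the deployed version.

```python
# Sketch: calling a vLLM OpenAI-compatible server for chat, embeddings, and reranking.
# Assumptions: server at localhost:8000, example model names, "/rerank" endpoint path.
import requests
from openai import OpenAI

VLLM_BASE_URL = "http://localhost:8000/v1"  # assumed vLLM server address

# vLLM accepts any API key unless one is configured on the server.
client = OpenAI(base_url=VLLM_BASE_URL, api_key="EMPTY")

# Chat completion via the OpenAI-compatible endpoint.
chat = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # whichever model vLLM is serving
    messages=[{"role": "user", "content": "Summarize RAG in one sentence."}],
)
print(chat.choices[0].message.content)

# Embeddings via the OpenAI-compatible endpoint (requires an embedding model in vLLM).
emb = client.embeddings.create(
    model="BAAI/bge-m3",  # example embedding model
    input=["retrieval augmented generation"],
)
print(len(emb.data[0].embedding))

# Reranking: vLLM exposes a non-OpenAI endpoint for cross-encoder scoring.
# The path and payload below are assumptions to verify against the vLLM docs.
resp = requests.post(
    f"{VLLM_BASE_URL}/rerank",
    json={
        "model": "BAAI/bge-reranker-v2-m3",  # example reranker model
        "query": "what is RAG?",
        "documents": [
            "RAG combines retrieval with generation.",
            "Unrelated text.",
        ],
    },
    timeout=30,
)
print(resp.json())
```

Because the chat and embedding endpoints follow the OpenAI format, RAGflow could likely reuse its existing OpenAI-style provider with a configurable base URL; only the reranking endpoint would need a dedicated client.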

Documentation, adoption, use case

No response

Additional information

No response
