Is there an existing issue for the same feature request?
Is your feature request related to a problem?
No response
Describe the feature you'd like
It would be great if RAGFlow could support LLMs served through vLLM's OpenAI-compatible API server, and if RAGFlow could also support embedding models and re-ranking models served through vLLM.
Describe implementation you've considered
vLLM is one of the most popular open-source frameworks for high-performance, GPU-accelerated serving of LLMs and other AI models, so support for it would benefit many users.
vLLM exposes a well-documented API for LLM serving and embeddings (compatible with OpenAI's format), and a custom API for serving re-ranking models:
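To illustrate what the integration would talk to, here is a minimal sketch of the request bodies the three APIs mentioned above expect. The model names and the local base URL are placeholders, and the exact rerank route should be checked against vLLM's current docs; this only builds the JSON payloads, it does not contact a server.

```python
import json

# Placeholder base URL for a locally running vLLM server (assumption).
VLLM_BASE_URL = "http://localhost:8000"

def chat_payload(model: str, prompt: str) -> dict:
    """OpenAI-style chat-completion body (POST /v1/chat/completions)."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def embedding_payload(model: str, texts: list[str]) -> dict:
    """OpenAI-style embeddings body (POST /v1/embeddings)."""
    return {"model": model, "input": texts}

def rerank_payload(model: str, query: str, documents: list[str]) -> dict:
    """Body for vLLM's custom rerank endpoint (route name is an assumption)."""
    return {"model": model, "query": query, "documents": documents}

if __name__ == "__main__":
    # Model name below is a placeholder, not a recommendation.
    body = chat_payload("my-org/my-chat-model", "Hello")
    print(json.dumps(body))
```

Since the chat and embedding payloads follow OpenAI's schema, RAGFlow could likely reuse its existing OpenAI-compatible provider path for those two, adding custom handling only for re-ranking.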
Documentation, adoption, use case
No response
Additional information
No response