Support AsyncPipeline for RESTful API #270

@toilaluan

Are you planning to support this feature?
I want to use FastGen in my app, but it does not currently support serving a RESTful API asynchronously.
vLLM supports this very well: https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py

I have also used Ray to deploy a server that uses MIIPipeline with dynamic batching, but its performance is far behind vLLM with default settings.
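
For reference, the dynamic batching pattern meant here can be sketched roughly as follows. This is only a minimal illustration using the Python standard library, not the actual Ray deployment and not a MII or FastGen API: `MicroBatcher`, `fake_generate`, `max_batch_size`, and `max_wait_s` are all hypothetical names, and `fake_generate` stands in for whatever blocking batched call the pipeline exposes.

```python
import asyncio
from typing import Callable, List


class MicroBatcher:
    """Groups concurrent requests and sends them through one batched call."""

    def __init__(self, batch_fn: Callable[[List[str]], List[str]],
                 max_batch_size: int = 8, max_wait_s: float = 0.01):
        self.batch_fn = batch_fn              # hypothetical blocking batched generate call
        self.max_batch_size = max_batch_size  # upper bound on requests per batch
        self.max_wait_s = max_wait_s          # how long to wait for a batch to fill
        self.queue: asyncio.Queue = asyncio.Queue()

    async def submit(self, prompt: str) -> str:
        """Called once per request; resolves when that request's output is ready."""
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((prompt, fut))
        return await fut

    async def run(self) -> None:
        """Background loop: collect a batch, run it, hand results back."""
        loop = asyncio.get_running_loop()
        while True:
            batch = [await self.queue.get()]  # block until at least one request arrives
            deadline = loop.time() + self.max_wait_s
            while len(batch) < self.max_batch_size:
                timeout = deadline - loop.time()
                if timeout <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self.queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            prompts = [prompt for prompt, _ in batch]
            try:
                # Run the blocking call in a worker thread so the event loop stays responsive.
                outputs = await loop.run_in_executor(None, self.batch_fn, prompts)
            except Exception as exc:
                for _, fut in batch:
                    if not fut.done():
                        fut.set_exception(exc)
                continue
            for (_, fut), out in zip(batch, outputs):
                if not fut.done():
                    fut.set_result(out)


def fake_generate(prompts: List[str]) -> List[str]:
    # Stand-in for a real batched model call (assumption, not a MII function).
    return [p.upper() for p in prompts]


async def demo() -> List[str]:
    batcher = MicroBatcher(fake_generate, max_batch_size=4)
    worker = asyncio.create_task(batcher.run())
    results = await asyncio.gather(*(batcher.submit(f"prompt {i}") for i in range(6)))
    worker.cancel()
    return results


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In a setup like this, each HTTP handler would just `await batcher.submit(prompt)`. The batch is formed per call, so new requests cannot join a generation that is already running.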
