Compiled model with torch.compile, unfortunately without performance improvements #2131
Conversation
Force-pushed from 2bf9d5c to 179a630.
Using your code, I ran vicuna-7b on one L40 (torch.__version__ 2.1.0+cu121, vllm 0.2.2). It seems that using torch.compile brings no performance improvement compared to before.
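For reference, a minimal latency-comparison sketch along these lines (not the benchmark actually run above; the toy model, shapes, and iteration count are illustrative assumptions, and a CUDA GPU is assumed):

```python
import time
import torch

# Illustrative stand-in for a decoder layer; sizes are assumptions.
model = torch.nn.TransformerEncoderLayer(d_model=4096, nhead=32).cuda().eval()
x = torch.randn(1, 128, 4096, device="cuda")

compiled = torch.compile(model)  # default inductor backend

def bench(fn, iters=50):
    # Warm up so one-time compilation cost is excluded from the measurement.
    for _ in range(3):
        fn(x)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        fn(x)
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters

with torch.no_grad():
    print(f"eager:    {bench(model) * 1e3:.2f} ms")
    print(f"compiled: {bench(compiled) * 1e3:.2f} ms")
```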
Force-pushed from 179a630 to 45ee43e.
For the latest version, v0.2.7, is there any meaningful acceleration from the compiler?
Force-pushed from 45ee43e to 2637c51.
Force-pushed from 2637c51 to 73f0f1a.
This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!
This pull request has merge conflicts that must be resolved before it can be merged.
A follow-up of #42 cc @zhuohan123
torch.jit.script and TorchScript can't be used, as the forward methods take parameters whose types TorchScript does not support (https://pytorch.org/docs/stable/jit_language_reference.html#supported-type). torch.jit.trace looks even more challenging. I was only able to make it run by using torch.compile with a minimal @torch.compiler.disable addition. Unfortunately, I only see performance degradation (RTX 3090, llama).
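A minimal sketch of that pattern, assuming a hypothetical module with one dynamo-incompatible method (the names and the toy op are illustrative, not the PR's actual code):

```python
import torch

class ToyDecoderLayer(torch.nn.Module):
    def __init__(self, hidden_size: int = 4096):
        super().__init__()
        self.proj = torch.nn.Linear(hidden_size, hidden_size)

    @torch.compiler.disable
    def _unsupported_op(self, x: torch.Tensor) -> torch.Tensor:
        # Kept in eager mode: stands in for a custom kernel or
        # data-dependent branch that dynamo cannot trace.
        return x * torch.sigmoid(x)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Everything except the disabled method gets compiled;
        # the disabled call introduces a graph break.
        return self._unsupported_op(self.proj(x))

model = torch.compile(ToyDecoderLayer())
out = model(torch.randn(2, 4096))
```

Each `@torch.compiler.disable` region forces a graph break, which is one plausible reason compilation yields no speedup here: the model ends up split into many small graphs instead of one large fused one.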
This PR can be considered a first step toward using torch.compiler for further improvements. BTW, the onnxrt backend returns …
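For context, a minimal sketch of selecting an alternative torch.compile backend; whether onnxrt works for a given model depends on the installed onnxruntime and its operator coverage (an assumption here, not a claim about vLLM):

```python
import torch

model = torch.nn.Linear(16, 16)

# Inspect the backends registered with dynamo,
# e.g. ['cudagraphs', 'inductor', 'onnxrt', ...].
print(torch._dynamo.list_backends())

compiled = torch.compile(model, backend="onnxrt")
# Raises at the first call if onnxruntime is not installed.
y = compiled(torch.randn(4, 16))
```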