[skyrl-train] Support older vllm versions till 0.9.2#671

Merged
SumanthRH merged 4 commits into NovaSky-AI:main from SumanthRH:support-vllm-old
Nov 17, 2025

Conversation


@SumanthRH SumanthRH commented Nov 15, 2025

What does this PR do?

Adds support for older vLLM versions, down to 0.9.2.

With these changes, I am able to use older vLLM versions as follows:

```bash
uv run --isolated --extra vllm --extra dev --with vllm==0.9.2 --with transformers==4.53.0 --with torch==2.7.0 --with "flash-attn@https://github.com/Dao-AILab/flash-attention/releases/download/v2.8.3/flash_attn-2.8.3+cu12torch2.7cxx11abiTRUE-cp312-cp312-linux_x86_64.whl" -- pytest -s -vvv tests/gpu/gpu_ci/test_engine_generation.py::test_token_based_generation -m "vllm"
```

Will add docs for this in a future PR
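The review below notes that this PR introduces version-aware logic to handle API differences between vLLM releases. As a rough, hypothetical sketch of that pattern (function and constant names here are illustrative, not SkyRL's actual code), gating on the installed vLLM version might look like:

```python
# Hypothetical sketch of version-gated dispatch across vLLM releases.
# Not SkyRL's actual implementation.

NEW_API_MIN = (0, 10, 0)  # assumed cutoff: release where newer engine APIs appeared


def parse_version(v: str) -> tuple:
    """Turn a version string like "0.9.2" into a comparable tuple (0, 9, 2).

    Pre-release suffixes (rc/dev) are stripped for simplicity.
    """
    parts = []
    for piece in v.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def uses_new_engine_api(vllm_version: str) -> bool:
    """Return True when the installed vLLM is at or past the API cutoff."""
    return parse_version(vllm_version) >= NEW_API_MIN


# Usage: branch on the installed version, e.g.
#   if uses_new_engine_api(vllm.__version__): <new code path> else: <0.9.2 path>
```

In practice a project might use `packaging.version.Version` for the comparison instead of hand-rolled tuples; the tuple form above just keeps the sketch dependency-free.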

@SumanthRH SumanthRH changed the title [skyrl-train] Support older vllm versions [skyrl-train] Support older vllm versions till 0.9.2 Nov 15, 2025

In some cases, you may encounter "illegal memory access" errors with vLLM >= 0.10.0: https://github.com/vllm-project/vllm/issues/23814. Currently, we recommend working around this by downgrading to vLLM 0.9.2.

With SkyRL, this can be done with the following overrides:
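The exact override flags are not shown in this excerpt. As a reference point, the pinned-dependency invocation from the PR description performs the same downgrade (the pytest target here is the PR's GPU CI test; adapt the trailing command to your own entry point):

```shell
uv run --isolated --extra vllm --extra dev --with vllm==0.9.2 --with transformers==4.53.0 --with torch==2.7.0 --with "flash-attn@https://github.com/Dao-AILab/flash-attention/releases/download/v2.8.3/flash_attn-2.8.3+cu12torch2.7cxx11abiTRUE-cp312-cp312-linux_x86_64.whl" -- pytest -s -vvv tests/gpu/gpu_ci/test_engine_generation.py::test_token_based_generation -m "vllm"
```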

A todo here is to mention the exact commit from which this override is supported. This will be done after merging the PR.

@SumanthRH SumanthRH marked this pull request as ready for review November 15, 2025 02:40
@SumanthRH

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request adds support for older versions of vLLM, specifically down to version 0.9.2. The changes introduce version-aware logic to handle API differences, update the CI pipeline to include tests for the older vLLM version, and add relevant troubleshooting documentation. My review focuses on improving the maintainability of the CI script, correcting a broken link in the documentation, and suggesting refactoring to reduce code duplication in the Python source.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@CharlieFRuan CharlieFRuan left a comment


LGTM! This is quite useful (e.g. in clusters where you need to build things from source, you might prefer to run older versions). Left two comments just for my own understanding.

@SumanthRH SumanthRH merged commit 2ddddb9 into NovaSky-AI:main Nov 17, 2025
3 checks passed
li-boxuan pushed a commit to li-boxuan/SkyRL that referenced this pull request Nov 23, 2025
dzorlu pushed a commit to fleet-ai/SkyRL that referenced this pull request Feb 4, 2026