-
Notifications
You must be signed in to change notification settings - Fork 74
Adds smoke test workflow and tests #424
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
This PR removes the reusable workflow contributions in #419 but keeps the smoke testing. |
ced2123 to
f832e78
Compare
f832e78 to
782e425
Compare
|
This PR needs to go in so that we can test the manual functionality of the smoke-test addition. It works on a dev machine. |
tests/test_smoke/test_train.py
Outdated
|
|
||
|
|
||
| @pytest.fixture(scope="module") | ||
| def custom_tmp_path(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Apologies for the drive-by comment. The rest of this PR is over my head 😆 . I was just reading this PR and noticed something related to this fixture.)
The implementation of this method looks similar to pytest's own tmp_path fixture. I wondered if you could call custom_tmp_path(tmp_path) and simply return tmp_path here, or even use tmp_path_factory if you want a session-scoped fixture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's been a while since I wrote this- I kinda remember doing exactly what you recommend and having the behavior be strange, and this was a convenient workaround. I don't really remember why though- I obviously should have made a note!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a note to revisit and then it's ok
cdoern
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initial pass, some of these are nits, some are comments for the future. Overall I like the shape of this
| ./scripts/e2e-ci.sh -mp | ||
| # HACK(osilkin): The above test runs the medium workflow test which does not actually test the training library. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, if this job doesn't use the training library should we remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this uses --pipeline full which uses the full loop from ilab (sorry I did this 😆 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this might be tangential to this PR but might be nice to see a green CI on this by just removing the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we do that in a follow-up so we're doing on thing at a time in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure!
| on: | ||
| # TEMP - only runs when manually invoked | ||
| # and only runs against branches in the repo. | ||
| workflow_dispatch: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can add on_pull so that it runs and passes on this PR!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true- I should do that
| ipykernel | ||
| jupyter | ||
|
|
||
| huggingface_hub |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want a version for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only if we know there's a minimal version that we have to have.
| from instructlab.training.main_ds import run_training | ||
|
|
||
| MINIMAL_TRAINING_ARGS = { | ||
| "max_seq_len": 140, # this config fits nicely on 4xL40s and may need modification for other setups |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this makes sense, but if we want to use these PROFILES then we should scope work to lift and shift the system profiles from ilab to the training repo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true, let's scope that for future work
tests/test_smoke/test_train.py
Outdated
|
|
||
|
|
||
| @pytest.fixture(scope="module") | ||
| def prepared_data_dir(custom_tmp_path: pathlib.Path) -> pathlib.Path: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing docstring
tests/test_smoke/test_train.py
Outdated
| return pathlib.Path(__file__).resolve() | ||
|
|
||
|
|
||
| def data_in_repo_path() -> pathlib.Path: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing docstring
| return data_in_repo_path | ||
|
|
||
|
|
||
| def chat_template_in_repo_path() -> pathlib.Path: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docstring plz
tests/test_smoke/test_train.py
Outdated
| assert True | ||
|
|
||
|
|
||
| @pytest.mark.slow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we aren't skipping this one but its also running training, why is that?
.github/workflows/smoke-tests.yaml
Outdated
| name: "Run smoke tests via Tox::pytest" | ||
| # These tests will be long running and require accelerated hardware. | ||
| # They will help to verify that the library is *functionally* correct but | ||
| # will not try to verify that the libary is *correct*. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: library
I don't understand the distinction. :) Functionally correct is correct; just maybe not "non-functionally" correct (benchmarks, code quality etc.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True- I was trying to express that this will run tests that execute the code but won't benchmark the output model to see if it's better. That'd be my definition of a correctness test for this.
.github/workflows/smoke-tests.yaml
Outdated
| - name: "Verify cuda environment is setup" | ||
| run: | | ||
| export CUDA_HOME="/usr/local/cuda" | ||
| export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use $CUDA_HOME when defining LD_LIBRARY_PATH:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$CUDA_HOME/lib64:$CUDA_HOME/extras/CUPTI/lib64"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gotcha, updated and changed the bash variable expansion to be correctly scoped
.github/workflows/smoke-tests.yaml
Outdated
| run: | | ||
| df -h | ||
| - name: "Run unit tests with Tox and Pytest" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not "unit"
| ipykernel | ||
| jupyter | ||
|
|
||
| huggingface_hub |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only if we know there's a minimal version that we have to have.
| @@ -0,0 +1,227 @@ | |||
| # Standard | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: please rename test_smoke and test_unit into smoke and unit. It's duplicative.
| pytest-asyncio | ||
| pytest-cov | ||
| pytest-html | ||
| -r requirements-dev.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should some of these go to requirements-dev.txt itself? (what's the distinction between putting dependencies here and in that requirements file?)
| deps = | ||
| pytest | ||
| pytest-asyncio | ||
| pytest-cov |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we use cov / html anywhere?
tox.ini
Outdated
| # `--` delimits flags that are meant for tox vs. those that are positional arguments for | ||
| # the command that's being run in the environment. | ||
|
|
||
| # format, check, and linting targets don't build and install the project to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this comment belongs to the next section
| [testenv:py3-smoke] | ||
| description = run accelerated smoke tests with pytest | ||
| passenv = | ||
| HF_HOME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could define it once in testenv section not to repeat
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm isolating this to the smoketest case specifically- I'm thinking that if it needs to be more broadly available we can change it as you say
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair but should you then passenv for unit tests like it's done here?
| pytest-cov | ||
| pytest-html | ||
| -r requirements-dev.txt | ||
| -r requirements-cuda.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if I want to run with a different accelerator? is it a choice to be made by the code that sets the test environment up? (github workflow?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think the workflow has to call specific code. This is set to cuda because we only have cuda runners
|
I'd like to see @courtneypacheco or @ktdreyer to confirm the AWS runner configuration here is valid. And if any of this actions code could be shifted to common actions defined for the whole org. |
Test groups are divided into three categories: 1) unit tests 2) smoke tests 3) benchmark tests They each have a dedicated tox entrypoint. Adds outer product of [FSDP, DeepSpeed] x [CPU offload, Not] test matrix. DEEPSPEED TESTS ARE BROKEN IN THIS COMMIT and are marked xFail- to be fixed in another, later commit. Signed-off-by: James Kunstle <[email protected]>
782e425 to
6036810
Compare
.github/workflows/smoke.yaml
Outdated
| # These tests will be long running and require accelerated hardware. | ||
|
|
||
| on: | ||
| # TEMP - only runs when manually invoked |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it? isn't pull_request_target now triggering it automatically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could definitely update this comment to be more accurate, but I do think it could actually be valuable to have workflow_dispatch in addition to pull_request_target. If you have workflow_dispatch set, then you can trigger these smoke tests any time on existing branches. This could be particularly useful if you want to quickly rerun the smoke tests against a release branch when you know there has been a CUDA update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll update the comment
| [testenv:py3-smoke] | ||
| description = run accelerated smoke tests with pytest | ||
| passenv = | ||
| HF_HOME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair but should you then passenv for unit tests like it's done here?
.github/workflows/smoke.yaml
Outdated
| - name: "Install packages" | ||
| run: | | ||
| cat /etc/os-release | ||
| sudo dnf install -y gcc gcc-c++ make git python3.11 python3.11-devel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been trying to clean this up in instructlab itself (instructlab/instructlab#3140), so when I see it here, we can also simplify it:
| sudo dnf install -y gcc gcc-c++ make git python3.11 python3.11-devel | |
| sudo dnf install -y gcc gcc-c++ make git-core python3.11 python3.11-devel |
users can dispatch a workflow that runs smoke tests against a selected branch Signed-off-by: James Kunstle <[email protected]>
6036810 to
b95099b
Compare
adds a smoke-test.yaml workflow that's very similar to unit-test.yaml but calls a different tox+pytest path and requires a cuda runner.