Adds smoke test workflow and tests #424

JamesKunstle · 2025-02-04T21:28:23Z

adds a smoke-test.yaml workflow that's very similar to unit-test.yaml but calls a different tox+pytest path and requires a cuda runner.

JamesKunstle · 2025-02-04T21:29:22Z

This PR removes the reusable workflow contributions in #419 but keeps the smoke testing.

JamesKunstle · 2025-02-04T21:39:01Z

This PR needs to go in so that we can test the manual functionality of the smoke-test addition. It works on a dev machine.

ktdreyer · 2025-02-07T16:25:58Z

tests/test_smoke/test_train.py

+
+
+@pytest.fixture(scope="module")
+def custom_tmp_path():


(Apologies for the drive-by comment. The rest of this PR is over my head 😆 . I was just reading this PR and noticed something related to this fixture.)

The implementation of this method looks similar to pytest's own tmp_path fixture. I wondered if you could call custom_tmp_path(tmp_path) and simply return tmp_path here, or even use tmp_path_factory if you want a session-scoped fixture.

It's been a while since I wrote this- I kinda remember doing exactly what you recommend and having the behavior be strange, and this was a convenient workaround. I don't really remember why though- I obviously should have made a note!

please add a note to revisit and then it's ok
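
For reference, a minimal sketch of the `tmp_path_factory` variant suggested above. It is an illustration, not the code that was merged: `make_data_dir` and the `"smoke"` basename are hypothetical names. One possible cause of the strange behavior is that pytest's `tmp_path` is function-scoped, so a module-scoped fixture cannot request it directly, whereas `tmp_path_factory` is session-scoped.

```python
import pathlib

import pytest


def make_data_dir(base: pathlib.Path) -> pathlib.Path:
    """Create a `data` subdirectory under `base` (hypothetical helper)."""
    data_dir = base / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir


@pytest.fixture(scope="module")
def custom_tmp_path(tmp_path_factory) -> pathlib.Path:
    """Module-scoped temporary directory created by pytest."""
    # tmp_path is function-scoped, so a module-scoped fixture cannot request it;
    # tmp_path_factory is session-scoped and can be used from any scope.
    # NOTE: revisit whether this can replace the hand-rolled fixture.
    return tmp_path_factory.mktemp("smoke")


@pytest.fixture(scope="module")
def prepared_data_dir(custom_tmp_path: pathlib.Path) -> pathlib.Path:
    """Directory that smoke tests in this module can write prepared data into."""
    return make_data_dir(custom_tmp_path)
```

Keeping the directory setup in a plain function means it can be exercised without going through pytest's fixture machinery.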

cdoern

initial pass, some of these are nits, some are comments for the future. Overall I like the shape of this

cdoern · 2025-04-08T14:38:56Z

.github/workflows/e2e-nvidia-l4-x1.yml

          ./scripts/e2e-ci.sh -mp
-          
+
          # HACK(osilkin): The above test runs the medium workflow test which does not actually test the training library.


so, if this job doesn't use the training library should we remove it?

this uses --pipeline full which uses the full loop from ilab (sorry I did this 😆 )

this might be tangential to this PR but might be nice to see a green CI on this by just removing the test.

Should we do that in a follow-up so we're doing one thing at a time in this PR?

cdoern · 2025-04-08T14:41:57Z

.github/workflows/smoke-tests.yaml

+on:
+  # TEMP - only runs when manually invoked
+  # and only runs against branches in the repo.
+  workflow_dispatch:


you can add on_pull so that it runs and passes on this PR!

true- I should do that

cdoern · 2025-04-08T14:43:58Z

requirements-dev.txt

 ipykernel
 jupyter

+huggingface_hub


do we want a version for that?

only if we know there's a minimal version that we have to have.

cdoern · 2025-04-08T14:48:12Z

tests/test_smoke/test_train.py

+from instructlab.training.main_ds import run_training
+
+MINIMAL_TRAINING_ARGS = {
+    "max_seq_len": 140,  # this config fits nicely on 4xL40s and may need modification for other setups


this makes sense, but if we want to use these PROFILES then we should scope work to lift and shift the system profiles from ilab to the training repo

true, let's scope that for future work

cdoern · 2025-04-08T14:51:18Z

tests/test_smoke/test_train.py

+
+
+@pytest.fixture(scope="module")
+def prepared_data_dir(custom_tmp_path: pathlib.Path) -> pathlib.Path:


missing docstring

cdoern · 2025-04-08T14:51:32Z

tests/test_smoke/test_train.py

+    return pathlib.Path(__file__).resolve()
+
+
+def data_in_repo_path() -> pathlib.Path:


missing docstring

cdoern · 2025-04-08T14:51:44Z

tests/test_smoke/test_train.py

+    return data_in_repo_path
+
+
+def chat_template_in_repo_path() -> pathlib.Path:


docstring plz

cdoern · 2025-04-08T14:52:53Z

tests/test_smoke/test_train.py

+    assert True
+
+
+@pytest.mark.slow


so we aren't skipping this one but it's also running training, why is that?
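
For context on the question above: `@pytest.mark.slow` only labels a test, and pytest does not skip marked tests unless something acts on the marker. A minimal `conftest.py` sketch of one common opt-in pattern follows. The `--runslow` flag and the `should_skip_slow` helper are assumptions for illustration and are not part of this PR.

```python
import pytest


def pytest_addoption(parser):
    # Hypothetical flag: slow tests only run when it is passed.
    parser.addoption(
        "--runslow", action="store_true", default=False, help="run tests marked slow"
    )


def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "slow: long-running test that runs training")


def should_skip_slow(is_slow: bool, run_slow: bool) -> bool:
    """A slow test is skipped unless the opt-in flag was given."""
    return is_slow and not run_slow


def pytest_collection_modifyitems(config, items):
    run_slow = config.getoption("--runslow")
    skip_slow = pytest.mark.skip(reason="need --runslow option to run")
    for item in items:
        if should_skip_slow("slow" in item.keywords, run_slow):
            item.add_marker(skip_slow)
```

Without a hook like this, or a `-m "not slow"` selection on the command line, a test marked `slow` runs like any other.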

booxter · 2025-04-08T15:37:46Z

.github/workflows/smoke-tests.yaml

+name: "Run smoke tests via Tox::pytest"
+# These tests will be long running and require accelerated hardware.
+# They will help to verify that the library is *functionally* correct but
+# will not try to verify that the libary is *correct*.


nit: library

I don't understand the distinction. :) Functionally correct is correct; just maybe not "non-functionally" correct (benchmarks, code quality etc.)

True- I was trying to express that this will run tests that execute the code but won't benchmark the output model to see if it's better. That'd be my definition of a correctness test for this.

booxter · 2025-04-08T15:42:30Z

.github/workflows/smoke-tests.yaml

+      - name: "Verify cuda environment is setup"
+        run: |
+          export CUDA_HOME="/usr/local/cuda"
+          export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"


nit: use $CUDA_HOME when defining LD_LIBRARY_PATH:

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$CUDA_HOME/lib64:$CUDA_HOME/extras/CUPTI/lib64"

gotcha, updated and changed the bash variable expansion to be correctly scoped
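
A small Python sketch of the same idea, deriving both library directories from `CUDA_HOME` instead of repeating the prefix. The function names are hypothetical and this is not the workflow's actual code. It also drops empty entries, since an unset `LD_LIBRARY_PATH` would otherwise leave a leading colon, which the dynamic linker treats as the current directory.

```python
import os
from pathlib import PurePosixPath


def cuda_library_dirs(cuda_home: str) -> list[str]:
    """Library directories derived from a single CUDA_HOME value."""
    home = PurePosixPath(cuda_home)
    return [str(home / "lib64"), str(home / "extras" / "CUPTI" / "lib64")]


def extend_ld_library_path(existing: str, cuda_home: str) -> str:
    """Append the CUDA library dirs to an existing LD_LIBRARY_PATH value."""
    # Drop empty entries so an unset variable does not produce a leading colon.
    parts = [p for p in existing.split(":") if p]
    for lib_dir in cuda_library_dirs(cuda_home):
        if lib_dir not in parts:
            parts.append(lib_dir)
    return ":".join(parts)


if __name__ == "__main__":
    cuda_home = os.environ.get("CUDA_HOME", "/usr/local/cuda")
    print(extend_ld_library_path(os.environ.get("LD_LIBRARY_PATH", ""), cuda_home))
```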

booxter · 2025-04-08T15:44:53Z

.github/workflows/smoke-tests.yaml

+        run: |
+          df -h
+
+      - name: "Run unit tests with Tox and Pytest"


booxter · 2025-04-08T15:46:48Z

requirements-dev.txt

 ipykernel
 jupyter

+huggingface_hub


only if we know there's a minimal version that we have to have.

booxter · 2025-04-08T15:47:47Z

tests/test_smoke/test_train.py

@@ -0,0 +1,227 @@
+# Standard


nit: please rename test_smoke and test_unit into smoke and unit. It's duplicative.

booxter · 2025-04-08T16:06:21Z

tox.ini

+    pytest-asyncio
+    pytest-cov
+    pytest-html
+    -r requirements-dev.txt


Should some of these go to requirements-dev.txt itself? (what's the distinction between putting dependencies here and in that requirements file?)

booxter · 2025-04-08T16:06:50Z

tox.ini

+deps = 
+    pytest
+    pytest-asyncio
+    pytest-cov


do we use cov / html anywhere?

booxter · 2025-04-08T16:07:49Z

tox.ini

 # `--` delimits flags that are meant for tox vs. those that are positional arguments for
 # the command that's being run in the environment.

 # format, check, and linting targets don't build and install the project to


this comment belongs to the next section

booxter · 2025-04-08T16:08:29Z

tox.ini

+[testenv:py3-smoke]
+description = run accelerated smoke tests with pytest
+passenv =
+	HF_HOME


you could define it once in testenv section not to repeat

I'm isolating this to the smoketest case specifically- I'm thinking that if it needs to be more broadly available we can change it as you say

fair but should you then passenv for unit tests like it's done here?

booxter · 2025-04-08T16:09:11Z

tox.ini

+    pytest-cov
+    pytest-html
+    -r requirements-dev.txt
+    -r requirements-cuda.txt


what if I want to run with a different accelerator? is it a choice to be made by the code that sets the test environment up? (github workflow?)

Yeah I think the workflow has to call specific code. This is set to cuda because we only have cuda runners

booxter · 2025-04-08T16:15:04Z

I'd like @courtneypacheco or @ktdreyer to confirm that the AWS runner configuration here is valid, and whether any of this Actions code could be shifted to common actions defined for the whole org.

Test groups are divided into three categories:
1) unit tests
2) smoke tests
3) benchmark tests
They each have a dedicated tox entrypoint.

Adds outer product of [FSDP, DeepSpeed] x [CPU offload, Not] test matrix.

DEEPSPEED TESTS ARE BROKEN IN THIS COMMIT and are marked xFail- to be fixed in another, later commit.

Signed-off-by: James Kunstle <[email protected]>
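
A hedged sketch of how such a matrix could be expressed with `pytest.mark.parametrize`, with the DeepSpeed cases marked `xfail`. The names `BACKENDS`, `CPU_OFFLOAD`, `build_matrix`, and `test_training_matrix` are illustrative assumptions and may not match the merged test file. The test body is a placeholder rather than a real training run.

```python
import itertools

import pytest

BACKENDS = ("fsdp", "deepspeed")
CPU_OFFLOAD = (False, True)


def build_matrix():
    """Outer product of backend x CPU offload, with DeepSpeed cases xfail-marked."""
    cases = []
    for backend, offload in itertools.product(BACKENDS, CPU_OFFLOAD):
        marks = ()
        if backend == "deepspeed":
            # Known-broken cases stay visible in reports instead of being deleted.
            marks = (pytest.mark.xfail(reason="DeepSpeed tests are broken; fix later"),)
        cases.append(
            pytest.param(backend, offload, marks=marks, id=f"{backend}-offload={offload}")
        )
    return cases


@pytest.mark.parametrize("backend,cpu_offload", build_matrix())
def test_training_matrix(backend: str, cpu_offload: bool) -> None:
    # Placeholder body: a real smoke test would launch training here.
    assert backend in BACKENDS
    assert isinstance(cpu_offload, bool)
```

Marking cases `xfail` rather than skipping them means pytest still runs them and reports an unexpected pass once DeepSpeed is fixed.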

booxter · 2025-04-08T22:13:07Z

.github/workflows/smoke.yaml

+# These tests will be long running and require accelerated hardware.
+
+on:
+  # TEMP - only runs when manually invoked


is it? isn't pull_request_target now triggering it automatically?

We could definitely update this comment to be more accurate, but I do think it could actually be valuable to have workflow_dispatch in addition to pull_request_target. If you have workflow_dispatch set, then you can trigger these smoke tests any time on existing branches. This could be particularly useful if you want to quickly rerun the smoke tests against a release branch when you know there has been a CUDA update.

I'll update the comment

booxter · 2025-04-08T22:17:42Z

tox.ini

+[testenv:py3-smoke]
+description = run accelerated smoke tests with pytest
+passenv =
+	HF_HOME


fair but should you then passenv for unit tests like it's done here?

ktdreyer · 2025-04-10T12:54:50Z

.github/workflows/smoke.yaml

+      - name: "Install packages"
+        run: |
+          cat /etc/os-release
+          sudo dnf install -y gcc gcc-c++ make git python3.11 python3.11-devel


I've been trying to clean this up in instructlab itself (instructlab/instructlab#3140), so when I see it here, we can also simplify it:

Suggested change

-          sudo dnf install -y gcc gcc-c++ make git python3.11 python3.11-devel
+          sudo dnf install -y gcc gcc-c++ make git-core python3.11 python3.11-devel

mergify bot added the CI/CD, testing, ci-failure, and dependencies labels Feb 4, 2025

JamesKunstle force-pushed the smoke-testing branch from ced2123 to f832e78 Compare February 4, 2025 21:33

mergify bot removed the ci-failure label Feb 4, 2025

JamesKunstle force-pushed the smoke-testing branch from f832e78 to 782e425 Compare February 4, 2025 21:35

JamesKunstle mentioned this pull request Feb 4, 2025

Refactor unit test workflow; add smoke test workflow #419

Closed

ktdreyer reviewed Feb 7, 2025

JamesKunstle requested review from cdoern and nathan-weinberg April 8, 2025 04:42

cdoern reviewed Apr 8, 2025

booxter reviewed Apr 8, 2025

JamesKunstle force-pushed the smoke-testing branch from 782e425 to 6036810 Compare April 8, 2025 19:46

JamesKunstle mentioned this pull request Apr 8, 2025

remove e2e tests that rely on the instructlab/instructlab repo that don't test training code #447

Closed

mergify bot added the ci-failure label Apr 8, 2025

booxter reviewed Apr 8, 2025

ktdreyer approved these changes Apr 10, 2025

mergify bot added the one-approval label Apr 10, 2025

booxter approved these changes Apr 10, 2025

adds smoke test workflow

b95099b

users can dispatch a workflow that runs smoke tests against a selected branch

Signed-off-by: James Kunstle <[email protected]>

JamesKunstle force-pushed the smoke-testing branch from 6036810 to b95099b Compare April 10, 2025 21:43

mergify bot removed the ci-failure label Apr 10, 2025

JamesKunstle requested a review from cdoern April 10, 2025 21:44

JamesKunstle self-assigned this Apr 10, 2025

JamesKunstle merged commit fce38cf into main Apr 10, 2025
17 checks passed

JamesKunstle deleted the smoke-testing branch April 10, 2025 23:14
