Skip to content

CI: Explore consolidating Linux/Windows test workflows? #603

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
leofang opened this issue May 4, 2025 · 8 comments · Fixed by #645
Closed

CI: Explore consolidating Linux/Windows test workflows? #603

leofang opened this issue May 4, 2025 · 8 comments · Fixed by #645
Assignees
Labels
CI/CD CI/CD infrastructure enhancement Any code-related improvements

Comments

@leofang
Copy link
Member

leofang commented May 4, 2025

#600 opens up the door toward a single (Bash-based) implementation of the test workflow, which was previously not possible due to lack of Git Bash on the GH-hosted Windows GPU runner. It was the main reason that we had to re-implement test-wheel-windows.yml in Powershell in the first place (#444 (comment)). Note that our build workflow is already cross-platform (and Bash-based, since all of our build runners have Bash pre-installed -- trivial on Linux, and thanks to GH-hosted runner images on Windows).

There might still be a few kinks, so a naive unification likely won't work (ex: container vs VM image, gcc vs msvc, ...). But I think most job steps are enough identical that we could probably extract out a reusable workflow, e.g. test-wheel-common.yml, for both OS to call.

@leofang leofang added CI/CD CI/CD infrastructure enhancement Any code-related improvements triage Needs the team's attention labels May 4, 2025
@leofang
Copy link
Member Author

leofang commented May 4, 2025

(Assigning to Marcus to triage/evaluate/decide in light of the ongoing CI refactoring #576.)

@leofang
Copy link
Member Author

leofang commented May 4, 2025

also cc @shwina for vis

@leofang
Copy link
Member Author

leofang commented May 15, 2025

#638 is one example showing the drawback of the status quo: Instead of just updating one file, we need to update two.

@cryos
Copy link
Collaborator

cryos commented May 16, 2025

Yes moving all the Windows stuff to bash, and unifying on some common scripts would help a lot.

@leofang
Copy link
Member Author

leofang commented May 16, 2025

There might still be a few kinks, so a naive unification likely won't work (ex: container vs VM image, gcc vs msvc, ...).

As a reminder to myself why an indirection might still be needed: actions/runner#904 (comment)

@cryos
Copy link
Collaborator

cryos commented May 16, 2025

There might still be a few kinks, so a naive unification likely won't work (ex: container vs VM image, gcc vs msvc, ...).

As a reminder to myself why an indirection might still be needed: actions/runner#904 (comment)

I was more thinking at the bash script level for more reuse/central place to manage install, build, etc. Not container or action level where I don't see a full merge being feasible.

@leofang
Copy link
Member Author

leofang commented May 16, 2025

I guess you're right about this. The reusable bits will likely have to happen at the bash script level, not at the action or workflow level, because this is another thing that cannot be expressed today 😞 (jobs.<job_id>.container is not allowed when invoking a reusable workflow):

  test-linux:
    # use container 
    container:
      image: ubuntu:24.04
    uses:
      ./.github/workflows/test-wheel-common.yml
    ...

  test-windows:
    # don't use container 
    uses:
      ./.github/workflows/test-wheel-common.yml
    ...

However, I really love the capability of accessing the workflow yaml variables inside a bash snippet, like

          if [[ "${{ inputs.host-platform }}" == linux* ]]; then
            # do this
          elif [[ "${{ inputs.host-platform }}" == win* ]]; then
            # do that
          fi

It is so very neat (ex: the variables are statically expanded). Sounds like we'll be losing this 🥲

@cryos
Copy link
Collaborator

cryos commented May 16, 2025

We can just add these things to the env though, and then use them offering more standardization. It means that the bash scripts can be more reusable, for example reproducing pieces of CI locally in docker containers etc is more approachable. I don't think losing YAML variables is a huge thing, and in many workflows I tend to add them to the environment to avoid this.

@cryos cryos removed the triage Needs the team's attention label May 28, 2025
@leofang leofang added this to the cuda.core beta 4 milestone May 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI/CD CI/CD infrastructure enhancement Any code-related improvements
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants