
Migrate from Azure to GH Actions #1306


Merged (24 commits) on Jun 15, 2022

Conversation

@kandersolar (Member) commented Sep 11, 2021

  • [ ] Closes #xxxx
  • I am familiar with the contributing guidelines
  • [ ] Tests added
  • [ ] Updates entries to docs/sphinx/source/api.rst for API changes.
  • Adds description and name entries in the appropriate "what's new" file in docs/sphinx/source/whatsnew for all changes. Includes link to the GitHub Issue with :issue:`num` or this Pull Request with :pull:`num`. Includes contributor name and/or GitHub username (link with :ghuser:`user`).
  • [ ] New code is fully documented. Includes numpydoc compliant docstrings, examples, and comments where necessary.
  • Pull request is nearly complete and ready for detailed review.
  • Maintainer: Appropriate GitHub Labels and Milestone are assigned to the Pull Request and linked Issue.

I've been playing with this on my fork and I think all of these are working correctly:

  • conda+bare for linux, conda for windows, bare for mac
  • 3.6-min only on conda+linux
  • --remote-data only on conda+linux
  • github secrets for API credentials

All in one workflow, so no separate yml files for each platform. To me this seems like a good thing, but maybe others would prefer separate workflows?
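
Roughly, the single-workflow setup boils down to a job matrix with excludes and includes, something like the abridged sketch below (versions, keys, and filenames are illustrative, not the exact file):

```yaml
name: pytest

on: [pull_request, push]

jobs:
  test:
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
        environment-type: [conda, bare]
        python-version: ["3.7", "3.8", "3.9"]
        exclude:
          # windows is conda-only, mac is bare-only
          - os: windows-latest
            environment-type: bare
          - os: macos-latest
            environment-type: conda
        include:
          # the minimum-requirements job runs only on conda + linux
          - os: ubuntu-latest
            environment-type: conda
            python-version: "3.6"
            suffix: -min
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v3
      # environment setup and the pytest invocation differ by
      # matrix.environment-type and are omitted here
```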

To-do:

  • Configure secrets for this repo: NREL_API_KEY, BSRN_FTP_USERNAME, BSRN_FTP_PASSWORD. Someone else will have to do this as I don't have admin access to pvlib/pvlib-python. (note that the jobs are going to fail until this happens)
  • Get codecov working
  • Remove Azure config from this repo
  • Disable Azure on their side?

@kandersolar (Member Author)

@wholmgren when you get a chance could you set up repo secrets for PSM3 and BSRN? I don't want to fiddle with codecov until the tests pass as expected. No rush of course :)

@wholmgren (Member)

@kanderso-nrel sorry for the delayed response. I just added the secrets. I'm not sure whether either CI is actually case-sensitive, but GitHub appears to require all caps for secret names, so it's now NRELAPIKEY rather than nrelApiKey.

@kandersolar (Member Author)

There's a bit of a problem -- it turns out GitHub secrets aren't available to PRs opened from forks. I think the values are being set to the empty string. This page lists some workarounds: https://securitylab.github.com/research/github-actions-preventing-pwn-requests/

But it got me wondering whether we should stop running the --remote-data tests on every pull request anyway. They're slow to execute, relatively brittle compared to the other tests, and GitHub is right that making secrets available to any PR is generally a bad idea (although our current set of secrets is pretty low stakes). One alternative to running --remote-data tests on PRs is to run them on an independent schedule, cron-style; since that would run from "within" pvlib, I think it should have access to secrets. Here are the relevant docs: https://docs.github.com/en/actions/learn-github-actions/events-that-trigger-workflows#schedule
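
For reference, the schedule trigger would look something like this minimal sketch (the cron expression and omitted setup steps are placeholders):

```yaml
on:
  schedule:
    # for example, run the remote-data tests once a day at 06:00 UTC
    - cron: "0 6 * * *"
  # plus a manual trigger as an escape hatch for maintainers
  workflow_dispatch:

jobs:
  test-remote-data:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # (python/environment setup omitted)
      # scheduled runs execute in the pvlib/pvlib-python context,
      # so repository secrets are available here
      - run: pytest pvlib --remote-data
        env:
          NREL_API_KEY: ${{ secrets.NREL_API_KEY }}
```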

One downside I can think of is that it would complicate calculating coverage on PRs, but maybe that is easy to solve with a sprinkling of # pragma: no cover.

I guess another option is to skip only the tests that require secrets, which are a subset of the --remote-data tests. Any thoughts?
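
If we went that route, one way to branch on secret availability at the workflow level is to map the secret into env and gate the steps on it; a rough sketch (the requires_api_key marker is hypothetical, just to illustrate selecting the subset):

```yaml
jobs:
  remote-data:
    runs-on: ubuntu-latest
    env:
      # on a PR from a fork this resolves to an empty string
      NREL_API_KEY: ${{ secrets.NREL_API_KEY }}
    steps:
      - uses: actions/checkout@v3
      # (environment setup omitted)
      # "requires_api_key" is a hypothetical pytest marker, purely for illustration
      - name: remote-data tests that need no credentials
        if: ${{ env.NREL_API_KEY == '' }}
        run: pytest pvlib --remote-data -m "not requires_api_key"
      - name: all remote-data tests
        if: ${{ env.NREL_API_KEY != '' }}
        run: pytest pvlib --remote-data
```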

@kandersolar (Member Author)

Unrelated to this GitHub Actions business, Azure and RTD are both timing out on things related to forecasts:

HTTPSConnectionPool(host='thredds.ucar.edu', port=443): Max retries exceeded with url: /thredds/catalog.xml (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f0e7865c110>: Failed to establish a new connection: [Errno 110] Connection timed out'))

https://dev.azure.com/solararbiter/pvlib%20python/_build/results?buildId=5556&view=logs&j=e1592cb8-2816-5754-b393-3839a187d454&t=377c4fd6-97bd-5996-bc02-4d072a8786ea&l=30

https://readthedocs.org/projects/pvlib-python/builds/14923696/

@wholmgren (Member)

The linked securitylab article suggested applying labels as a workaround. It mentioned that approach can be prone to human error, but I don't think that matters here. Anyway, my thought is that we create a new "remote data" label and maintainers apply it only to PRs that need remote data testing. Then GitHub runs a test job with --remote-data and coverage reporting from within the pvlib context, so it has access to the secrets. And PRs that don't need remote data are no longer slowed down by the requirement.

This is also another reason to prefer mocked remote data tests.

Finally I support running jobs on a schedule in general, but I don't think that solves this problem because we really need the coverage report before the PR is merged.
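
In workflow terms, that label-gated approach comes out roughly like the sketch below (the label name and the if condition are the important parts; the rest is illustrative):

```yaml
on:
  pull_request_target:
    types: [labeled]

jobs:
  test-remote-data:
    # only runs once a maintainer has applied the "remote-data" label
    if: contains(github.event.pull_request.labels.*.name, 'remote-data')
    runs-on: ubuntu-latest
    steps:
      # check out the PR head; pull_request_target checks out the base branch by default
      - uses: actions/checkout@v3
        with:
          ref: ${{ github.event.pull_request.head.sha }}
      # (environment setup omitted)
      - run: pytest pvlib --remote-data
        env:
          NREL_API_KEY: ${{ secrets.NREL_API_KEY }}
```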

@kandersolar (Member Author)

@wholmgren do you agree that these -F adder -F subtractor -F conda flags are not necessary:

- script: |
    bash <(curl https://codecov.io/bash) -t bbc2bdbe-5e67-4fef-9cb7-f52fe0b703a8 -f coverage.xml -F adder -F subtractor -F conda

Maybe a copy/paste artifact from here? https://github.com/codecov/example-azure-pipelines

They show up as independent coverage calcs here: https://app.codecov.io/gh/pvlib/pvlib-python/compare/1306

Asking because if --remote-data is to be run on only a subset of PRs, I think codecov flags might be a good mechanism for tracking iotools and non-iotools coverage separately.
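
Concretely, each test group's upload step could carry its own flag, something like this sketch with the codecov GitHub action (flag names are placeholders):

```yaml
      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: coverage.xml
          # one flag per test group so codecov can track them independently
          flags: core   # the remote-data workflow would use a different flag here
          fail_ci_if_error: true
```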

@wholmgren (Member)

@kanderso-nrel I think you're probably right, but it's been a long time since I carefully looked at any of this. It does feel like it's suboptimal.

@kandersolar added the remote-data (triggers --remote-data pytests) and development_workflow labels on Apr 19, 2022
@wholmgren (Member)

Check out this mamba github action as an alternative to the conda action. I don't have experience with either action, but I've recently started using mamba and it is way faster than conda for non-trivial environments. I'm guessing it would take a minute off the CI runs that don't use a cached environment. But certainly fine to stick with the conda action.
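
For future reference, the swap might look something like the untested sketch below (mamba-org/setup-micromamba is assumed here and may not be the exact action linked above; the environment filename is illustrative):

```yaml
      - uses: mamba-org/setup-micromamba@v1
        with:
          # the environment filename is illustrative
          environment-file: ci/requirements-py3.10.yml
          cache-environment: true
```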

@kandersolar added this to the 0.9.2 milestone on Jun 12, 2022
@kandersolar (Member Author)

I think this is finally ready for review. The main challenge in supporting two sets of tests (one required, one optional) was getting coverage calculation and reporting to do something that made sense. I ended up with one GitHub Actions workflow per test group, tracking coverage for each of the two groups using codecov's "flags" feature. Unfortunately there's a lot of copy/paste between the two workflow files; since the two workflows are triggered differently, I think that might just be something we have to live with. Hopefully the notes in the files are sufficient explanation for the rest of what's going on here.
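
The codecov side of that bookkeeping looks roughly like the abridged codecov.yml sketch below (flag names, paths, and carryforward settings are illustrative rather than the exact configuration):

```yaml
# codecov.yml (abridged)
flags:
  core:
    paths:
      - pvlib/
  remote-data:
    paths:
      - pvlib/iotools/
    # reuse the previous upload when the optional remote-data job doesn't run,
    # so an unlabeled PR doesn't register a spurious coverage drop
    carryforward: true
```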

Here are two example PRs I opened over on my fork:

  • GitHub Actions test: drop core coverage  kandersolar/pvlib-python#6
    • A simple check that the "core" coverage calculation and reporting successfully flags a coverage drop.
  • GitHub Actions test: iotools error  kandersolar/pvlib-python#7
    • A test that broken iotools tests don't fail the PR checks until the remote-data label is added. Note that GitHub's display of that PR history is a little misleading: all checks were passing prior to labeling the PR with remote-data, and only once the remote-data label was added were the corresponding tests run and the failure reported. So it did what I wanted, although that's not necessarily clear from looking at that page now.

Still to-do: nix Azure. I can remove the Azure config files in this PR but I think someone else will need to disable the integration on Azure's side.

> Check out this mamba github action as an alternative to the conda action.

Looks cool and worth exploring (conda is ~70% of total time in some of these jobs!), but I would rather explore it in a separate PR 👍

@kandersolar marked this pull request as ready for review June 12, 2022 19:06
@wholmgren (Member) left a comment

Thanks @kanderso-nrel for pushing this. I can turn off the Azure integration once this is merged. Seems to me that we should go ahead and remove the Azure config in this PR.

@wholmgren (Member)

Regarding the copy-paste for the remote data workflow, did you look into https://docs.github.com/en/actions/using-workflows/reusing-workflows#using-inputs-and-secrets-in-a-reusable-workflow ? My impression is that it provides enough configurability that it might work, but I wouldn't be surprised if I'm missing something.
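
For posterity, the reusable-workflow shape would be roughly the sketch below (the input and secret plumbing is invented for illustration):

```yaml
# .github/workflows/pytest-reusable.yml  (hypothetical filename)
on:
  workflow_call:
    inputs:
      pytest-args:
        type: string
        default: ""
    secrets:
      NREL_API_KEY:
        required: false

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # (environment setup omitted)
      - run: pytest pvlib ${{ inputs.pytest-args }}
        env:
          NREL_API_KEY: ${{ secrets.NREL_API_KEY }}

# and then a thin caller in each triggering workflow:
#
#   jobs:
#     remote:
#       uses: ./.github/workflows/pytest-reusable.yml
#       with:
#         pytest-args: --remote-data
#       secrets:
#         NREL_API_KEY: ${{ secrets.NREL_API_KEY }}
```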

@kandersolar (Member Author)

While deleting the Azure config I noticed that I had forgotten about the Publish job, which makes sure the package can be built successfully. I think I've modified the existing PyPI deploy job so that it's suitable for PRs as well. There are a handful of warnings in that job that we should probably address before too long: https://github.com/pvlib/pvlib-python/runs/6886769268?check_suite_focus=true#step:6:393
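
The PR-time packaging check amounts to something like this sketch of a job (details of the real job may differ):

```yaml
  build-sdist:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      - run: python -m pip install build twine
      # build sdist + wheel and run basic metadata checks on the result
      - run: python -m build
      - run: python -m twine check dist/*
```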

> Regarding the copy-paste for the remote data workflow, did you look into https://docs.github.com/en/actions/using-workflows/reusing-workflows#using-inputs-and-secrets-in-a-reusable-workflow ?

I did not look into that. I agree it seems like it could work for us. Perhaps we can take that as another possible future improvement in the interest of getting this merged before I lose momentum again :)

@wholmgren (Member) commented Jun 14, 2022

> the Publish job that makes sure the package can be built successfully.

The point of that job was to actually publish to PyPI - it had nothing to do with testing. Testing that the package can be built sounds like a good idea, but I think that would be a separate job, perhaps failing on warnings if possible. There's probably some GitHub Actions dependency mechanism that lets a publish job leverage a build job. All of that is certainly fine to leave for future work.
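
As a sketch of that dependency shape (the publish action, tag trigger, and secret name are assumptions):

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      - run: python -m pip install build
      - run: python -m build
      - uses: actions/upload-artifact@v3
        with:
          name: dist
          path: dist/

  publish:
    # reuse the artifacts from the build job; only publish on version tags
    needs: build
    if: startsWith(github.ref, 'refs/tags/v')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v3
        with:
          name: dist
          path: dist/
      - uses: pypa/gh-action-pypi-publish@release/v1
        with:
          password: ${{ secrets.PYPI_API_TOKEN }}  # assumed secret name
```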

> Perhaps we can take that as another possible future improvement in the interest of getting this merged before I lose momentum again :)

+1

@kandersolar (Member Author)

Ok, I guess it's time to hit merge here. Since this is a change that is not very easy to test, I suggest we all scrutinize the CI more than usual for a while to make sure it's doing what we want it to do :) We'll also need to merge/rebase open PRs on master for the new CI to kick in.

I'll submit issues to track the "future improvement" items (mamba, reusable workflows, distribution builds) shortly.

@wholmgren (Member)

I deleted the Azure DevOps project.

kandersolar added a commit to kandersolar/pvlib-python that referenced this pull request Jun 16, 2022
* add pytest workflow

* add some comments

* set fail-fast to false

* see if secrets work

* move --remote-data tests to other workflow

* remove secrets from default pytest workflow

* simplify bare environment installation

* simplify

* un-simplify

* thrashing around the "debug the CI" morass

* fix macos py3.6

* disable codecov upload in azure pipelines

* enable codecov in github actions

* fix janky coverage hack

* undo unnecessary changes to .coveragerc

* add label reminder to PR template

* rework workflows and codecov

* whatsnew

* remove azure config files

* don't pytest on all pushes, only master

* build sdist on PRs

* switch to more convenient environment filenames

* use most recent checkout action
@kandersolar mentioned this pull request Jun 22, 2022
@kandersolar mentioned this pull request Aug 17, 2022

on:
  pull_request_target:
    types: [labeled]
Member

@kanderso-nrel if we were to delete this line, then we wouldn't have to add and remove the remote-data label every time to trigger the tests.

For PRs that do NOT have the remote-data label, the remote tests would simply be skipped due to the if statement on line 66 of the workflow file: [screenshot of the if condition]

Maybe there are some security aspects that I'm not aware of.

@kandersolar (Member Author)

Having thought about this more, I think this was intentional on my part (see the comments at the top of the workflow file). Removing this line would allow a scenario where someone can submit a benign PR to lure us into adding the remote-data label, then push another commit with malicious content and have the workflow run without our sign off. The way it is now, the workflow cannot run without a maintainer explicitly triggering it.

Whether keeping that security hole plugged is worth the annoyance of repeated labeling is another question...

Member

PyPI's recent labeling of pvlib as a critical project has me feeling like we should err on the side of caution here. Tests should have a minimum of real API calls anyways - the vast majority should be mocked.
