Is scraping permitted within GitHub Actions usage? #183117
Why are you starting this discussion?
Question
What GitHub Actions topic or product is this about?
General
Discussion Details
Hi, I am an early career journalist creating a public-interest dashboard.
Replies: 2 comments 1 reply
Yes, this is absolutely permitted and is well within the acceptable use policy for GitHub Actions.
Why your project is permitted
Key things to keep in mind (Responsible Use)
Best Practices for your Workflow
Example "Clean Commit" Logic
name: Commit and Push if changed
Your project sounds like a great service to the community. Good luck with your dashboard!
If this answer clarifies your concerns, please consider marking this as the accepted answer so it can help other journalists and researchers find this information easily!
Hi, thanks for explaining the context so clearly.

Short answer: yes, what you’re describing is generally permitted.

GitHub Actions is intended for CI, automation, and scheduled maintenance tasks, and a small, infrequent scraper that updates a CSV on a fixed schedule fits well within that scope. Running a lightweight job every two weeks, or even monthly, to refresh data for a GitHub Pages project is well within normal and responsible usage.

A few practical points to keep in mind:

Resource usage: Keep the job lightweight and short-running, which it sounds like you already are. Avoid long loops, heavy parallelism, or frequent retries.
Scheduling: A biweekly or monthly cron schedule is very reasonable and unlikely to raise any flags.
Scraping behavior: Make sure you’re respecting the target site’s robots.txt and terms of service, and avoid aggressive request rates.
Transparency: Document in your repo what the Action does and why. This is especially helpful for public-interest projects like yours.

As long as the workflow isn’t being used to provide a commercial scraping service, overwhelm a third-party site, or bypass access controls, it’s squarely within acceptable use.

This sounds like a thoughtful, low-impact use of Actions for a public-interest project. You should be fine.
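
For the scraping-behavior point above, here is a minimal sketch of what "respect robots.txt and avoid aggressive request rates" could look like in a Python scraper. It uses only the standard library. The bot name `dashboard-refresh-bot`, the helper names, and the 5-second fallback delay are all assumptions made for illustration, not anything prescribed by GitHub or by the target site.

```python
import time
import urllib.request
from urllib.robotparser import RobotFileParser

# Hypothetical bot name; choose one that identifies your own project.
USER_AGENT = "dashboard-refresh-bot"


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse the text of a robots.txt file into a rule checker."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def permitted(parser: RobotFileParser, urls: list[str]) -> list[str]:
    """Keep only the URLs that robots.txt allows this user agent to fetch."""
    return [url for url in urls if parser.can_fetch(USER_AGENT, url)]


def pause_seconds(parser: RobotFileParser, default: float = 5.0) -> float:
    """Use the site's Crawl-delay if it declares one, else an assumed default."""
    delay = parser.crawl_delay(USER_AGENT)
    return float(delay) if delay is not None else default


def fetch_all(parser: RobotFileParser, urls: list[str]) -> dict[str, bytes]:
    """Fetch permitted URLs one at a time, sleeping between requests."""
    pages = {}
    for url in permitted(parser, urls):
        request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
        with urllib.request.urlopen(request, timeout=30) as response:
            pages[url] = response.read()
        # Sequential requests with a pause keep the load on the site low.
        time.sleep(pause_seconds(parser))
    return pages
```

In a real run you would first download the site's own robots.txt and pass its text to `load_rules`. Note that this only covers robots.txt and pacing; the site's terms of service still need to be read by a person.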
Yes, this is absolutely permitted and is well within the acceptable use policy for GitHub Actions.
Why your project is permitted
GitHub’s Terms of Service generally prohibit using Actions for "mining" or "server-like" tasks that consume excessive resources. However, running a small script every two weeks to update a CSV is considered a low-impact, productive use of the platform. Since your work is for a public-interest dashboard, it aligns perfectly with GitHub's mission of supporting open data and transparency.
Key things to keep in mind (Responsible Use)
Even though it is permitted, you should follow these "best practices" to ensure your workflow stays healthy: