Merged
28 commits
82ddfdb
Init tilegym CI
arjkesh Dec 10, 2025
bb5f66c
black code formatting
arjkesh Dec 10, 2025
6264b2c
Merge branch 'main' of github.com:NVIDIA/TileGym into tilegym_ci_init
arjkesh Dec 10, 2025
9c9e570
Test signed commit
arjkesh Dec 10, 2025
0987a19
update image names
arjkesh Dec 10, 2025
1a2e7ce
reduce image size
arjkesh Dec 10, 2025
cdeba44
debug parsing logic
arjkesh Dec 10, 2025
42c632f
grab PR number from branch name
arjkesh Dec 10, 2025
da1f37a
Parallelize tests, utilize build caches, optimize dockerfile
arjkesh Dec 10, 2025
173ca7b
empty trigger
arjkesh Dec 10, 2025
3d48592
update linter, move pr images to separate repo
arjkesh Dec 10, 2025
d653790
use darker instead of black formatting
arjkesh Dec 10, 2025
b91fde2
Add infra tests, modularize further
arjkesh Dec 11, 2025
bc782e2
Add infra tests, unify naming, add verified tag for nightly
arjkesh Dec 11, 2025
8de6ac0
format infra scripts
arjkesh Dec 11, 2025
00e7f0e
update naming, perms
arjkesh Dec 11, 2025
5e36b90
test trigger ci
arjkesh Dec 11, 2025
63c6639
sequential-ize benchmark tests, improve reporting, rename readme, ren…
arjkesh Dec 11, 2025
d3ba035
Merge branch 'main' of github.com:NVIDIA/TileGym into tilegym_ci_init
arjkesh Dec 11, 2025
c03370b
Fix benchmark reporting
arjkesh Dec 11, 2025
11c023b
restructure benchmark file checkout, add logs
arjkesh Dec 11, 2025
6806a7b
reformat 120 line length
arjkesh Dec 12, 2025
b66efcb
update
arjkesh Dec 12, 2025
a2f303b
add line length 120 to dark checker
arjkesh Dec 12, 2025
0daef68
revert dockerfile changes, install xdist in CI yaml
arjkesh Dec 12, 2025
fd88ad2
retrigger build
arjkesh Dec 12, 2025
8a19e9b
Revert "revert dockerfile changes, install xdist in CI yaml"
arjkesh Dec 12, 2025
a276ae1
go back to optimized dockerfile config
arjkesh Dec 12, 2025
124 changes: 124 additions & 0 deletions .github/infra_overview.md
@@ -0,0 +1,124 @@
# GitHub Workflows & Infrastructure

This directory contains CI/CD workflows, utility scripts, and infrastructure tests for the TileGym repository.

## Workflows

### `tilegym-ci.yml`
**Main CI workflow** - Builds Docker images and runs tests.

**Jobs:**
- `config` - Parses PR body for CI configuration options
- `build` - Builds `tilegym` Docker image and pushes to GHCR
- `test-ops` - Runs ops tests (`pytest -s tests/ops`)
- `test-benchmark` - Runs benchmark tests sequentially (`tests/benchmark/run_all.sh`)

**Scripts used:**
- `scripts/parse_pr_config.py` - Parse PR body config
- `scripts/check_image_exists.py` - Skip nightly builds if tests already passed

**Test Results:**
- **ops-test-results:** JUnit XML + HTML report with test pass/fail status (visible in "Checks" tab)
- **benchmark-results:** Individual `*_results.txt` files containing performance tables with TFLOPS/GBps metrics for each benchmark (downloadable artifacts)
- **Benchmark summary:** Formatted markdown tables visible in the workflow "Summary" tab

---

### `tilegym-ci-infra-tests.yml`
**Infrastructure validation** - Ensures code quality and validates CI scripts.

**Jobs:**
- `python-formatting` - Runs `darker` with `isort` for incremental formatting checks
- `utility-scripts-tests` - Runs pytest on all infrastructure tests

**Triggers:** Push to `main`, push to `pull-request/*` branches

**Tests:**
- All utility scripts in `scripts/`

---

### `tilegym-ghcr-cleanup.yml`
**GHCR maintenance** - Cleans up old Docker images to save storage.

**Jobs:**
- `cleanup` - Deletes stale PR images and untracked images

**Triggers:** Daily at 2 AM UTC, manual

**Scripts used:**
- `scripts/cleanup_stale_images.py` - Delete closed PR and untracked images

**Cleanup rules:**
- Images for closed PRs (`pr-*` tags)
- Untracked images (no `pr-*`, `latest`, or `-verified` tags, older than 7 days)
- Verified images (`*-verified` tags) are kept indefinitely
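
The cleanup rules above can be read as a single decision function. The sketch below is illustrative only and is not the code in `scripts/cleanup_stale_images.py`: the function name `should_delete`, its signature, the reason strings, and the precedence between rules (a `-verified` tag wins over everything else) are assumptions.

```python
from datetime import datetime, timedelta, timezone


def should_delete(tags, created_at, open_prs, max_age_days=7, now=None):
    """Apply the cleanup rules to one image and return (delete, reason).

    tags:       list of tag strings attached to the image
    created_at: timezone-aware datetime of image creation
    open_prs:   set of PR numbers that are still open
    """
    now = now or datetime.now(timezone.utc)

    # Verified images are kept indefinitely (assumed to take precedence).
    if any(tag.endswith("-verified") for tag in tags):
        return False, "verified image, kept indefinitely"

    # Images tagged pr-<N> are deleted only once every such PR is closed.
    pr_tags = [tag for tag in tags if tag.startswith("pr-")]
    if pr_tags:
        numbers = [int(tag[3:]) for tag in pr_tags if tag[3:].isdigit()]
        if numbers and all(number not in open_prs for number in numbers):
            return True, "closed PR"
        return False, "PR image still in use"

    # The image currently tagged latest is never treated as untracked.
    if "latest" in tags:
        return False, "latest image"

    # Everything else is untracked and is deleted once it is old enough.
    if now - created_at > timedelta(days=max_age_days):
        return True, "untracked and older than threshold"
    return False, "untracked but too recent"
```

Passing `now` explicitly keeps the function deterministic, which makes the age threshold easy to unit test.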

---

## Scripts

Located in `scripts/`, these Python utilities are used by workflows:

- **`parse_pr_config.py`** - Extract CI configuration from PR descriptions
- **`check_image_exists.py`** - Check if Docker images exist in GHCR
- **`cleanup_stale_images.py`** - Delete stale Docker images from GHCR
- **`format_benchmark_summary.py`** - Parse benchmark results and format as markdown tables for GitHub Actions summary
- **`utils.py`** - Shared utilities (GitHub token, API headers, outputs)

All scripts include docstrings and are covered by the tests in `infra_tests/`.
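
As a rough illustration of the three responsibilities listed for `utils.py` (GitHub token, API headers, outputs), a minimal sketch might look like the following. The function names and error message are hypothetical and may not match the real module. The header values follow GitHub's REST API documentation, and step outputs are written by appending `name=value` lines to the file named by the `GITHUB_OUTPUT` environment variable.

```python
import os


def get_github_token():
    """Read the token from the environment; exit if it is missing (assumed behavior)."""
    token = os.environ.get("GITHUB_TOKEN")
    if not token:
        raise SystemExit("GITHUB_TOKEN is not set")
    return token


def api_headers(token):
    """Headers for authenticated GitHub REST API requests."""
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
    }


def format_output(name, value):
    """Render one step output line; booleans become lowercase true/false."""
    if isinstance(value, bool):
        value = "true" if value else "false"
    return f"{name}={value}"


def write_output(name, value):
    """Append the output to $GITHUB_OUTPUT, or print it when running locally."""
    line = format_output(name, value)
    path = os.environ.get("GITHUB_OUTPUT")
    if path:
        with open(path, "a", encoding="utf-8") as handle:
            handle.write(line + "\n")
    else:
        print(line)
```

Falling back to `print` when `GITHUB_OUTPUT` is unset is a design choice for local runs, not something the workflows require.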

---

## Infrastructure Tests

Located in `infra_tests/`, these pytest-based tests validate all CI scripts including:

- PR config parsing logic
- Image existence checks and latest tag validation
- Image cleanup logic (verified tag preservation, untracked image detection)
- Shared utility functions

**Run locally:**
```bash
pytest .github/infra_tests/ -v
```

Tests are independent of the main TileGym package (no torch/CUDA dependencies).

**Test results:** Available in GitHub Actions UI under "Checks" tab and as downloadable artifacts (`infra-test-results`).

---

## PR Configuration

Control CI behavior by adding a YAML config block to your PR description:

```yaml
config:
  build: true
  test: ["ops", "benchmark"]
```

**Options:**
- `build: false` - Skip the build and pull the latest image from GHCR
- `test: ["ops"]` - Run only ops tests
- `test: []` - Skip all tests

See `.github/pull_request_template.md` for the full template.
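
To make the option semantics concrete, here is a minimal, standard-library-only sketch of how such a block could be parsed. It is not the implementation in `scripts/parse_pr_config.py`: the function name, the defaults used when no block is present (build and run all tests), and the choice to read values with `json.loads` (which works because `true`, `false`, and flow-style lists of quoted strings are also valid JSON) are all assumptions.

```python
import json
import re

# Assumed defaults when the PR description contains no config block.
DEFAULT_CONFIG = {"build": True, "test": ["ops", "benchmark"]}

# Matches "build: <value>" or "test: <value>", with optional indentation.
KEY_VALUE = re.compile(r"^\s*(build|test):\s*(.+?)\s*$")


def parse_pr_config(body):
    """Extract `build` and `test` from the `config:` block of a PR description."""
    config = {"build": DEFAULT_CONFIG["build"], "test": list(DEFAULT_CONFIG["test"])}
    in_block = False
    for line in (body or "").splitlines():
        if not in_block:
            in_block = line.strip() == "config:"
            continue
        match = KEY_VALUE.match(line)
        if match is None:
            break  # the first line that is not a known key ends the block
        key, raw_value = match.groups()
        try:
            value = json.loads(raw_value)
        except json.JSONDecodeError:
            continue  # keep the default when the value is not JSON-compatible
        if key == "build" and isinstance(value, bool):
            config["build"] = value
        elif key == "test" and isinstance(value, list) and all(isinstance(v, str) for v in value):
            config["test"] = value
    return config
```

Malformed values fall back to the defaults rather than failing the workflow, a lenient choice that a real parser might reasonably make stricter.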

---

## Docker Images

- **Nightly images:** `ghcr.io/<owner>/tilegym:<SHA>`, `ghcr.io/<owner>/tilegym:nightly-<DATETIME>`
- **Verified images:** `ghcr.io/<owner>/tilegym:<SHA>-verified` (permanent proof that tests passed)
- **Latest verified:** `ghcr.io/<owner>/tilegym:latest` (points to the newest passing build)

**Tagging strategy:**
- Build pushes: `<SHA>`, `nightly-<DATETIME>`
- After tests pass: `latest` and `<SHA>-verified` tags are added
- `latest` moves to newest passing build
- `<SHA>-verified` is permanent (useful for auditing and rollbacks)
- Nightly builds skip if `latest` already points to current SHA
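
The tagging strategy can be summarized in a few small functions. This is a sketch under stated assumptions, not the workflow's actual logic: the function names are hypothetical, the `<DATETIME>` format (`%Y%m%d-%H%M%S`) is a guess, and the skip check assumes the comparison is made on image digests, with `None` standing for a tag that does not exist.

```python
from datetime import datetime, timezone


def build_tags(sha, built_at):
    """Tags pushed by the build job: the commit SHA and a nightly timestamp."""
    stamp = built_at.strftime("%Y%m%d-%H%M%S")  # assumed format for <DATETIME>
    return [sha, f"nightly-{stamp}"]


def verified_tags(sha):
    """Tags added only after the tests pass."""
    return ["latest", f"{sha}-verified"]


def should_skip_nightly(latest_digest, sha_digest):
    """Skip the nightly build when `latest` already resolves to the current SHA's image.

    Either digest is None when the corresponding tag does not exist.
    """
    return latest_digest is not None and latest_digest == sha_digest


if __name__ == "__main__":
    built = datetime(2025, 12, 10, 2, 0, 0, tzinfo=timezone.utc)
    print(build_tags("abc123", built))
    print(verified_tags("abc123"))
```

Because `<SHA>-verified` is never moved, it stays a stable reference for rollbacks even after `latest` advances.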

128 changes: 128 additions & 0 deletions .github/infra_tests/test_check_image_exists.py
@@ -0,0 +1,128 @@
"""Unit tests for check_image_exists.py"""

import json
import os
import sys
import tempfile
from unittest.mock import MagicMock
from unittest.mock import patch

import pytest

# Add scripts directory to path
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "../scripts"))

import check_image_exists


class TestCheckImageExists:
    """Tests for check_image_exists.py"""

    def test_get_inputs_success(self):
        """Test successful input retrieval."""
        env = {
            "REGISTRY_IMAGE": "ghcr.io/test/image",
            "IMAGE_TAG": "abc123",
            "GITHUB_TOKEN": "token123",
            "IS_PR": "false",
        }
        with patch.dict(os.environ, env):
            registry, tag, token, is_pr = check_image_exists.get_inputs()
            assert registry == "ghcr.io/test/image"
            assert tag == "abc123"
            assert token == "token123"
            assert is_pr is False

    def test_get_inputs_missing_env_vars(self):
        """Test that missing env vars causes exit."""
        with patch.dict(os.environ, {}, clear=True):
            with pytest.raises(SystemExit):
                check_image_exists.get_inputs()

    def test_should_skip_build_pr_context(self):
        """Test that PR context skips the check."""
        skipped = check_image_exists.should_skip_build("ghcr.io/test/image", "tag", "token", is_pr=True)
        assert skipped is False

    @patch("check_image_exists.check_image_exists")
    def test_should_skip_build_image_exists(self, mock_check):
        """Test that existing images are skipped."""
        mock_check.return_value = True
        skipped = check_image_exists.should_skip_build("ghcr.io/test/image", "tag", "token", is_pr=False)
        assert skipped is True

    @patch("check_image_exists.check_image_exists")
    def test_should_skip_build_image_not_exists(self, mock_check):
        """Test that non-existing images are not skipped."""
        mock_check.return_value = False
        skipped = check_image_exists.should_skip_build("ghcr.io/test/image", "tag", "token", is_pr=False)
        assert skipped is False

    @patch("check_image_exists.subprocess.run")
    def test_check_image_exists_latest_matches_sha(self, mock_run):
        """Test that check returns True when 'latest' points to current SHA."""
        # Mock login
        login_result = MagicMock(returncode=0)
        # Mock latest inspect
        latest_result = MagicMock(returncode=0, stdout=json.dumps({"config": {"digest": "sha256:abc123"}}).encode())
        # Mock SHA inspect
        sha_result = MagicMock(returncode=0, stdout=json.dumps({"config": {"digest": "sha256:abc123"}}).encode())
        mock_run.side_effect = [login_result, latest_result, sha_result]

        result = check_image_exists.check_image_exists("ghcr.io/test/image", "sha123", "token")
        assert result is True

    @patch("check_image_exists.subprocess.run")
    def test_check_image_exists_latest_different_sha(self, mock_run):
        """Test that check returns False when 'latest' points to different SHA."""
        # Mock login
        login_result = MagicMock(returncode=0)
        # Mock latest inspect (different digest)
        latest_result = MagicMock(returncode=0, stdout=json.dumps({"config": {"digest": "sha256:abc123"}}).encode())
        # Mock SHA inspect (different digest)
        sha_result = MagicMock(returncode=0, stdout=json.dumps({"config": {"digest": "sha256:def456"}}).encode())
        mock_run.side_effect = [login_result, latest_result, sha_result]

        result = check_image_exists.check_image_exists("ghcr.io/test/image", "sha456", "token")
        assert result is False

    @patch("check_image_exists.subprocess.run")
    def test_check_image_exists_no_latest(self, mock_run):
        """Test that check returns False when 'latest' tag doesn't exist."""
        # Mock login
        login_result = MagicMock(returncode=0)
        # Mock latest inspect (not found)
        latest_result = MagicMock(returncode=1)
        mock_run.side_effect = [login_result, latest_result]

        result = check_image_exists.check_image_exists("ghcr.io/test/image", "sha123", "token")
        assert result is False

    @patch("check_image_exists.subprocess.run")
    def test_check_image_exists_no_sha(self, mock_run):
        """Test that check returns False when SHA tag doesn't exist."""
        # Mock login
        login_result = MagicMock(returncode=0)
        # Mock latest inspect (exists)
        latest_result = MagicMock(returncode=0, stdout=json.dumps({"config": {"digest": "sha256:abc123"}}).encode())
        # Mock SHA inspect (not found)
        sha_result = MagicMock(returncode=1)
        mock_run.side_effect = [login_result, latest_result, sha_result]

        result = check_image_exists.check_image_exists("ghcr.io/test/image", "sha123", "token")
        assert result is False

    def test_write_output(self):
        """Test output file writing."""
        with tempfile.NamedTemporaryFile(mode="w", delete=False) as f:
            output_file = f.name

        try:
            with patch.dict(os.environ, {"GITHUB_OUTPUT": output_file}):
                check_image_exists.write_output(True)

            with open(output_file) as f:
                content = f.read()
                assert "skipped=true" in content
        finally:
            os.unlink(output_file)
74 changes: 74 additions & 0 deletions .github/infra_tests/test_cleanup_stale_images.py
@@ -0,0 +1,74 @@
"""Unit tests for cleanup_stale_images.py"""

import os
import sys
from datetime import datetime
from datetime import timedelta
from unittest.mock import MagicMock
from unittest.mock import patch

import pytest

# Add scripts directory to path
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "../scripts"))

import cleanup_stale_images


class TestCleanupStaleImages:
    """Tests for cleanup_stale_images.py"""

    def test_should_delete_closed_pr_image(self):
        """Test detection of closed PR images."""
        open_prs = {1, 2, 3}

        # Closed PR
        should_delete, reason = cleanup_stale_images.should_delete_closed_pr_image(["pr-4"], open_prs)
        assert should_delete is True
        assert "Closed PR" in reason

        # Open PR
        should_delete, reason = cleanup_stale_images.should_delete_closed_pr_image(["pr-2"], open_prs)
        assert should_delete is False

    def test_should_delete_untracked_image_with_latest(self):
        """Test that images with 'latest' tag are not deleted."""
        old_date = (datetime.now() - timedelta(days=10)).isoformat() + "Z"
        should_delete, _ = cleanup_stale_images.should_delete_untracked_image(["latest", "abc123"], old_date, 7)
        assert should_delete is False

    def test_should_delete_untracked_image_with_pr_tag(self):
        """Test that images with pr-* tags are not deleted."""
        old_date = (datetime.now() - timedelta(days=10)).isoformat() + "Z"
        should_delete, _ = cleanup_stale_images.should_delete_untracked_image(["pr-1", "abc123"], old_date, 7)
        assert should_delete is False

    def test_should_delete_untracked_image_with_verified_tag(self):
        """Test that images with -verified tags are not deleted."""
        old_date = (datetime.now() - timedelta(days=10)).isoformat() + "Z"
        should_delete, _ = cleanup_stale_images.should_delete_untracked_image(["abc123-verified"], old_date, 7)
        assert should_delete is False

    def test_should_delete_untracked_image_old_enough(self):
        """Test that old untracked images are marked for deletion."""
        old_date = (datetime.now() - timedelta(days=10)).isoformat() + "Z"
        should_delete, reason = cleanup_stale_images.should_delete_untracked_image(["abc123", "def456"], old_date, 7)
        assert should_delete is True
        assert "Untracked" in reason

    def test_should_delete_untracked_image_too_recent(self):
        """Test that recent untracked images are not deleted."""
        recent_date = (datetime.now() - timedelta(days=3)).isoformat() + "Z"
        should_delete, _ = cleanup_stale_images.should_delete_untracked_image(["abc123"], recent_date, 7)
        assert should_delete is False

    @patch("cleanup_stale_images.requests.get")
    def test_get_open_pr_numbers(self, mock_get):
        """Test fetching open PR numbers."""
        mock_response = MagicMock()
        mock_response.json.return_value = [{"number": 1}, {"number": 2}, {"number": 3}]
        mock_response.raise_for_status = MagicMock()
        mock_get.return_value = mock_response

        pr_numbers = cleanup_stale_images.get_open_pr_numbers("owner", "repo", "token")
        assert pr_numbers == {1, 2, 3}