ci-analysis skill: let the agent reason about its own tools (dotnet#124398)

lewing · Copilot · richlander · commit be256d2f795f · 2026-02-13T21:40:28.000-08:00
Refactors the ci-analysis skill to remove explicit MCP tool name references from all documentation.

### Why

The agent has MCP tool descriptions in its context at runtime, so it already knows what each tool does and what parameters it takes. Skills should provide domain knowledge the agent *doesn't* have: gotchas, priority orderings, data locations, and anti-patterns. Re-documenting tool parameters or providing step-by-step "call tool X then tool Y" recipes is fragile (breaks when tools change), redundant, and overly prescriptive.

### What changed

- Replaced tool call chains with action descriptions
- Replaced parameter-level details with workflow guidance
- Subagent delegation prompts describe goals, not tool calls
- Kept all domain-specific gotchas, anti-patterns, and priority orderings

**Net result: 89 lines removed, 45 added**, which leaves less to maintain and less to break when tools change.

### Testing

Multi-model tested with Claude Sonnet 4 and GPT-5 against real CI investigation (PR dotnet#124095). Both correctly identified and used the right tools for all scenarios without explicit tool names in the skill.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
diff --git a/.github/skills/ci-analysis/SKILL.md b/.github/skills/ci-analysis/SKILL.md
@@ -75,7 +75,7 @@ The script operates in three distinct modes depending on what information you ha
 ## What the Script Does
 
 ### PR Analysis Mode (`-PRNumber`)
-1. Discovers AzDO builds associated with the PR (via `gh pr checks`, or `pull_request_read` with method `get_status` — finds failing builds and one non-failing build as fallback; for full build history, use `azure-devops-pipelines_get_builds`)
+1. Discovers AzDO builds associated with the PR (from GitHub check status; for full build history, query AzDO builds API)
 2. Fetches Build Analysis for known issues
 3. Gets failed jobs from Azure DevOps timeline
 4. **Separates canceled jobs from failed jobs** (canceled may be dependency-canceled or timeout-canceled)
@@ -102,7 +102,7 @@ The script operates in three distinct modes depending on what information you ha
 
 **Build Analysis check status**: The "Build Analysis" GitHub check is **green** only when *every* failure is matched to a known issue. If it's **red**, at least one failure is unaccounted for — do NOT claim "all failures are known issues" just because some known issues were found. You must verify each failing job is covered by a specific known issue before calling it safe to retry.
 
-**Canceled/timed-out jobs**: Jobs canceled due to earlier stage failures or AzDO timeouts. Dependency-canceled jobs don't need investigation. **Timeout-canceled jobs may have all-passing Helix results** — the "failure" is just the AzDO job wrapper timing out, not actual test failures. To verify: use `hlx_status` on each Helix job in the timed-out build. If all work items passed, the build effectively passed.
+**Canceled/timed-out jobs**: Jobs canceled due to earlier stage failures or AzDO timeouts. Dependency-canceled jobs don't need investigation. **Timeout-canceled jobs may have all-passing Helix results** — the "failure" is just the AzDO job wrapper timing out, not actual test failures. To verify: use `hlx_status` on each Helix job in the timed-out build (include passed work items). If all work items passed, the build effectively passed.
 
 > ❌ **Don't dismiss timed-out builds.** A build marked "failed" due to a 3-hour AzDO timeout can have 100% passing Helix work items. Check before concluding it failed.
 
@@ -128,12 +128,11 @@ Error categories: `test-failure`, `build-error`, `test-timeout`, `crash` (exit c
 
 When an AzDO job is canceled (timeout) or Helix work items show `Crash` (exit code -4), the tests may have actually passed. Follow this procedure:
 
-1. **Find the Helix job IDs** — Read the AzDO "Send to Helix" step log (use `azure-devops-pipelines_get_build_log_by_id`) and search for lines containing `Sent Helix Job`. Extract the job GUIDs.
+1. **Find the Helix job IDs** — Read the AzDO "Send to Helix" step log and search for lines containing `Sent Helix Job`. Extract the job GUIDs.
 
-2. **Check Helix job status** — Use `hlx_batch_status` (accepts comma-separated job IDs) or `hlx_status` per job. Look at `failedCount` vs `passedCount`.
+2. **Check Helix job status** — Get pass/fail summary for each job. Look at `failedCount` vs `passedCount`.
 
-3. **For work items marked Crash/Failed** — Use `hlx_files` to check if `testResults.xml` was uploaded. If it exists:
-   - Download it with `hlx_download_url`
+3. **For work items marked Crash/Failed** — Check if tests actually passed despite the crash. Try structured test results first (TRX parsing), then search for pass/fail counts in result files without downloading, then download as last resort:
    - Parse the XML: `total`, `passed`, `failed` attributes on the `<assembly>` element
    - If `failed=0` and `passed > 0`, the tests passed — the "crash" is the wrapper timing out after test completion
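
The pass/fail check in step 3 can be sketched in a few lines. This is a minimal illustration, not part of the skill: the function name `summarize_test_results` and the sample XML are hypothetical, and it assumes only what the step states, namely that `total`, `passed`, and `failed` are attributes on each `<assembly>` element.

```python
import xml.etree.ElementTree as ET


def summarize_test_results(xml_text: str) -> dict:
    """Sum total/passed/failed across every <assembly> element in the file."""
    root = ET.fromstring(xml_text)
    totals = {"total": 0, "passed": 0, "failed": 0}
    # iter() also yields the root itself if the root is an <assembly> element.
    for assembly in root.iter("assembly"):
        for key in ("total", "passed", "failed"):
            totals[key] += int(assembly.get(key, "0"))
    # Rule from the step above: failed=0 and passed>0 means the tests passed
    # and the "crash" was the wrapper timing out after completion.
    totals["tests_passed"] = totals["failed"] == 0 and totals["passed"] > 0
    return totals


# Hypothetical sample; real files carry many more attributes and child elements.
sample = """<assemblies>
  <assembly name="Example.Tests.dll" total="3" passed="3" failed="0" />
</assemblies>"""

print(summarize_test_results(sample))
```

A file with no `<assembly>` elements yields `passed == 0`, so it is reported as not passed instead of being treated as a silent success.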
 
@@ -255,6 +254,6 @@ Before stating a failure's cause, verify your claim:
 1. Check if same test fails on the target branch before assuming transient
 2. Look for `[ActiveIssue]` attributes for known skipped tests
 3. Use `-SearchMihuBot` for semantic search of related issues
-4. Use the binlog MCP tools (`mcp-binlog-tool-*`) to search binlogs for Helix job IDs, build errors, and properties
+4. Use binlog analysis tools to search binlogs for Helix job IDs, build errors, and properties
 5. `gh pr checks --json` valid fields: `bucket`, `completedAt`, `description`, `event`, `link`, `name`, `startedAt`, `state`, `workflow` — no `conclusion` field, `state` has `SUCCESS`/`FAILURE` directly
 6. "Canceled" ≠ "Failed" — canceled jobs may have recoverable Helix results. Check artifacts before concluding results are lost.
diff --git a/.github/skills/ci-analysis/references/azure-cli.md b/.github/skills/ci-analysis/references/azure-cli.md
@@ -1,6 +1,6 @@
 # Deep Investigation with Azure CLI
 
-The AzDO MCP tools (`azure-devops-pipelines_*`) handle most pipeline queries directly. This reference covers the Azure CLI fallback for cases where MCP tools are unavailable or the endpoint isn't exposed (e.g., downloading artifacts, inspecting pipeline definitions).
+The AzDO MCP tools handle most pipeline queries directly. This reference covers the Azure CLI fallback for cases where MCP tools are unavailable or the endpoint isn't exposed (e.g., downloading artifacts, inspecting pipeline definitions).
 
 When the CI script and GitHub APIs aren't enough (e.g., investigating internal pipeline definitions or downloading build artifacts), use the Azure CLI with the `azure-devops` extension.
 
@@ -80,10 +80,11 @@ All dotnet repos that use arcade put their pipeline definitions under `eng/pipel
 az pipelines show --id 1330 --org $org -p $project --query "{yamlPath:process.yamlFilename, repo:repository.name}" -o table
 
 # Fetch the YAML from the repo (example: dotnet/runtime's runtime-official pipeline)
-#   github-mcp-server-get_file_contents owner:dotnet repo:runtime path:eng/pipelines/runtime-official.yml
+#   Read the pipeline YAML from the repo to understand build stages and conditions
+#   e.g., eng/pipelines/runtime-official.yml in dotnet/runtime
 
 # For VMR unified builds, the YAML is in dotnet/dotnet:
-#   github-mcp-server-get_file_contents owner:dotnet repo:dotnet path:eng/pipelines/unified-build.yml
+#   eng/pipelines/unified-build.yml
 
 # Templates are usually in eng/pipelines/common/ or eng/pipelines/templates/
 ```
diff --git a/.github/skills/ci-analysis/references/binlog-comparison.md b/.github/skills/ci-analysis/references/binlog-comparison.md
@@ -24,11 +24,7 @@ When the failing work item's Helix job ID isn't visible (e.g., canceled jobs, or
    az pipelines runs artifact list --run-id $buildId --org "https://dev.azure.com/dnceng-public" -p public --query "[].name" -o tsv
    az pipelines runs artifact download --run-id $buildId --artifact-name "TestBuild_linux_x64" --path "$env:TEMP\artifact" --org "https://dev.azure.com/dnceng-public" -p public
    ```
-2. Load the binlog and search for job IDs:
-   ```
-   mcp-binlog-tool-load_binlog  path:"$env:TEMP\artifact\...\SendToHelix.binlog"
-   mcp-binlog-tool-search_binlog  binlog_file:"..."  query:"Sent Helix Job"
-   ```
+2. Load the `SendToHelix.binlog` and search for `Sent Helix Job` to find the GUIDs.
 3. Query each Helix job GUID with the CI script:
    ```
    ./scripts/Get-CIStatus.ps1 -HelixJob "{GUID}" -FindBinlogs
@@ -49,17 +45,9 @@ Launch two `task` subagents (can run in parallel), each with a prompt like:
 Download the msbuild.binlog from Helix job {JOB_ID} work item {WORK_ITEM}.
 Use the CI skill script to get the artifact URL:
   ./scripts/Get-CIStatus.ps1 -HelixJob "{JOB_ID}" -WorkItem "{WORK_ITEM}"
-Download the binlog URL to $env:TEMP\{label}.binlog.
-Load it with the binlog MCP server (mcp-binlog-tool-load_binlog).
-Search for the {TASK_NAME} task (mcp-binlog-tool-search_tasks_by_name).
-Get full task details (mcp-binlog-tool-list_tasks_in_target) for the target containing the task.
-Extract the CommandLineArguments parameter value.
-Normalize paths:
-  - Replace Helix work dirs (/datadisks/disk1/work/XXXXXXXX) with {W}
-  - Replace runfile hashes (Program-[a-f0-9]+) with Program-{H}
-  - Replace temp dir names (dotnetSdkTests.[a-zA-Z0-9]+) with dotnetSdkTests.{T}
+Download the binlog, load it, find the {TASK_NAME} task, and extract CommandLineArguments.
+Normalize paths (see table below) and sort args.
 Parse into individual args using regex: (?:"[^"]+"|/[^\s]+|[^\s]+)
-Sort the list and return it.
 Report the total arg count prominently.
 ```
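
For illustration, the normalize, parse, and sort steps of the prompt above might look like the sketch below. The three substitutions and the arg regex are the ones shown in this diff. The character class used for the `XXXXXXXX` work-directory segment, the escaped `.` in the temp-directory pattern, and the function name `normalized_args` are assumptions made for the example.

```python
import re

# (pattern, placeholder) pairs taken from the path-normalization rules in this diff.
NORMALIZATIONS = [
    # Assumption: the XXXXXXXX work-dir segment is alphanumeric.
    (re.compile(r"/datadisks/disk1/work/[A-Za-z0-9]+"), "{W}"),
    (re.compile(r"Program-[a-f0-9]+"), "Program-{H}"),
    # The "." is escaped here so that it matches only a literal dot.
    (re.compile(r"dotnetSdkTests\.[a-zA-Z0-9]+"), "dotnetSdkTests.{T}"),
]

# Arg-splitting regex quoted in the prompt: quoted strings, /switches, bare tokens.
ARG_PATTERN = re.compile(r'(?:"[^"]+"|/[^\s]+|[^\s]+)')


def normalized_args(command_line: str) -> list[str]:
    """Normalize machine-specific paths, split into args, and sort them."""
    for pattern, placeholder in NORMALIZATIONS:
        command_line = pattern.sub(placeholder, command_line)
    return sorted(ARG_PATTERN.findall(command_line))


# Hypothetical command lines from two builds that differ by one switch.
passing = normalized_args(
    '/nologo /out:/datadisks/disk1/work/A1B2C3D4/w/Program-abc123.dll "/r:my lib.dll"'
)
failing = normalized_args(
    '"/r:my lib.dll" /out:/datadisks/disk1/work/FFFF0000/w/Program-9f9f.dll /nologo /debug+'
)

print("arg counts:", len(passing), len(failing))
print("only in one build:", sorted(set(passing) ^ set(failing)))
```

Sorting after normalization makes the two lists directly comparable, so a set difference (or `Compare-Object` in PowerShell) isolates the args that actually differ between builds.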
 
@@ -69,26 +57,13 @@ Report the total arg count prominently.
 
 With two normalized arg lists, `Compare-Object` instantly reveals the difference.
 
-## Useful Binlog MCP Queries
-
-After loading a binlog with `mcp-binlog-tool-load_binlog`, use these queries (pass the loaded path as `binlog_file`):
-
-```
-# Find all invocations of a specific task
-mcp-binlog-tool-search_tasks_by_name  binlog_file:"$env:TEMP\my.binlog"  taskName:"Csc"
-
-# Search for a property value
-mcp-binlog-tool-search_binlog  binlog_file:"..."  query:"analysislevel"
+## Common Binlog Search Patterns
 
-# Find what happened inside a specific target
-mcp-binlog-tool-search_binlog  binlog_file:"..."  query:"under($target AddGlobalAnalyzerConfigForPackage_MicrosoftCodeAnalysisNetAnalyzers)"
+When investigating binlogs, these search query patterns are most useful:
 
-# Get all properties matching a pattern
-mcp-binlog-tool-search_binlog  binlog_file:"..."  query:"GlobalAnalyzerConfig"
-
-# List tasks in a target (returns full parameter details including CommandLineArguments)
-mcp-binlog-tool-list_tasks_in_target  binlog_file:"..."  projectId:22  targetId:167
-```
+- Search for a property: `analysislevel`
+- Search within a target: `under($target AddGlobalAnalyzerConfigForPackage_MicrosoftCodeAnalysisNetAnalyzers)`
+- Find all properties matching a pattern: `GlobalAnalyzerConfig`
 
 ## Path Normalization
 
@@ -141,4 +116,4 @@ Same MSBuild property resolution + different files on disk = different build beh
 
 > ❌ **Don't assume the MSBuild property diff explains the behavior diff.** Two branches can compute identical property values but produce different outputs because of different files on disk, different NuGet packages, or different task assemblies. Compare the actual task invocation.
 
-> ❌ **Don't load large binlogs and browse them interactively in main context.** Use targeted searches: `mcp-binlog-tool-search_tasks_by_name` for a specific task, `mcp-binlog-tool-search_binlog` with a focused query. Get in, get the data, get out.
+> ❌ **Don't load large binlogs and browse them interactively in main context.** Use targeted searches rather than browsing interactively. Get in, get the data, get out.
diff --git a/.github/skills/ci-analysis/references/build-progression-analysis.md b/.github/skills/ci-analysis/references/build-progression-analysis.md
@@ -18,18 +18,11 @@ On large PRs, the user is usually iterating toward a solution. The recent builds
 
 ### Step 1: List builds for the PR
 
-`gh pr checks` only shows checks for the current HEAD SHA. To see the full build history, use AzDO MCP or CLI:
+`gh pr checks` only shows checks for the current HEAD SHA. To see the full build history, use AzDO or CLI:
 
-**With AzDO MCP (preferred):**
-```
-azure-devops-pipelines_get_builds with:
-  project: "public"
-  branchName: "refs/pull/{PR}/merge"
-  top: 20
-  queryOrder: "QueueTimeDescending"
-```
+**With AzDO (preferred):**
 
-The response includes `triggerInfo` with `pr.sourceSha` — the PR's HEAD commit for each build.
+Query AzDO for builds on `refs/pull/{PR}/merge` branch, sorted by queue time descending. The response includes `triggerInfo` with `pr.sourceSha` — the PR's HEAD commit for each build.
 
 **Without MCP (fallback):**
 ```powershell
@@ -40,7 +33,7 @@ az pipelines runs list --branch "refs/pull/{PR}/merge" --top 20 --org $org -p $p
 
 ### Step 2: Map builds to the PR's head commit
 
-Each build's `triggerInfo` contains `pr.sourceSha` — the PR's HEAD commit when the build was triggered. Extract it from the `azure-devops-pipelines_get_builds` response or the `az` JSON output.
+Each build's `triggerInfo` contains `pr.sourceSha` — the PR's HEAD commit when the build was triggered. Extract it from the build response or CLI output.
 
 > ⚠️ **`sourceVersion` is the merge commit**, not the PR's head commit. Use `triggerInfo.'pr.sourceSha'` instead.
 
@@ -52,30 +45,17 @@ Each build's `triggerInfo` contains `pr.sourceSha` — the PR's HEAD commit when
 
 For the current/latest build, the merge ref (`refs/pull/{PR}/merge`) is available via the GitHub API. The merge commit's first parent is the target branch HEAD at the time GitHub computed the merge:
 
-```
-gh api repos/{OWNER}/{REPO}/git/commits/{sourceVersion} --jq '.parents[0].sha'
-```
-
-Or with GitHub MCP: `get_commit` with the `sourceVersion` SHA — the first parent in the response is the target branch HEAD.
-
-Where `sourceVersion` is the merge commit SHA from the AzDO build (not `pr.sourceSha`). This is simpler than parsing checkout logs.
+Look up the merge commit's parents — the first parent is the target branch HEAD. The `sourceVersion` from the AzDO build is the merge commit SHA (not `pr.sourceSha`). This is simpler than parsing checkout logs.
 
 > ⚠️ **This only works for the latest build.** GitHub recomputes `refs/pull/{PR}/merge` on each push, so the merge commit changes. For historical builds in a progression analysis, the merge ref no longer reflects what was built — use the checkout log method below.
 
 **For historical builds — extract from checkout logs:**
 
 The AzDO build API doesn't expose the target branch SHA. Extract it from the checkout task log.
 
-**With AzDO MCP (preferred):**
-```
-azure-devops-pipelines_get_build_log_by_id with:
-  project: "public"
-  buildId: {BUILD_ID}
-  logId: 5
-  startLine: 500
-```
+**With AzDO (preferred):**
 
-Search the output for the merge line:
+Fetch the checkout task log (typically log ID 5) for the build. Search the output for the merge line:
 ```
 HEAD is now at {mergeCommit} Merge {prSourceSha} into {targetBranchHead}
 ```
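
As a rough sketch, the merge line can be pulled out of a checkout log with a regular expression. The function name `extract_target_head` and the sample log are hypothetical, and the pattern assumes the three placeholders are hexadecimal SHAs (possibly abbreviated). The returned dictionary mirrors the JSON shape that the delegation prompt in this change asks subagents to return.

```python
import re

# Assumption: all three placeholders in the merge line are hex SHAs.
MERGE_LINE = re.compile(
    r"HEAD is now at (?P<merge>[0-9a-f]+) "
    r"Merge (?P<pr>[0-9a-f]+) into (?P<target>[0-9a-f]+)"
)


def extract_target_head(log_text: str, build_id: int) -> dict:
    """Return the target branch HEAD recorded in an AzDO checkout log."""
    match = MERGE_LINE.search(log_text)
    if match is None:
        return {
            "buildId": build_id,
            "targetHead": None,
            "error": "merge line not found in log",
        }
    return {
        "buildId": build_id,
        "targetHead": match["target"],
        "mergeCommit": match["merge"],
    }


# Hypothetical log excerpt; real checkout logs contain many unrelated lines.
sample_log = """\
Checking out files: 100% done.
HEAD is now at def5678 Merge abc1234 into 0123abc
"""

print(extract_target_head(sample_log, build_id=1))
```

Returning an explicit `error` entry instead of raising lets a parent agent merge results from many builds even when some logs lack the merge line.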
diff --git a/.github/skills/ci-analysis/references/delegation-patterns.md b/.github/skills/ci-analysis/references/delegation-patterns.md
@@ -13,7 +13,7 @@ Extract all unique test failures from these Helix work items:
 Job: {JOB_ID_1}, Work items: {ITEM_1}, {ITEM_2}
 Job: {JOB_ID_2}, Work items: {ITEM_3}
 
-For each, use hlx_logs with jobId and workItem to get console output.
+For each, search console logs for lines ending with [FAIL] (xUnit format).
 If hlx MCP is not available, fall back to:
   ./scripts/Get-CIStatus.ps1 -HelixJob "{JOB}" -WorkItem "{ITEM}"
 
@@ -34,7 +34,7 @@ Failing build: {BUILD_ID}, job: {JOB_NAME}, work item: {WORK_ITEM}
 
 Steps:
 1. Search for recently merged PRs:
-   github-mcp-server-search_pull_requests query:"is:merged base:{TARGET_BRANCH}" owner:dotnet repo:{REPO}
+   Search for recently merged PRs on {TARGET_BRANCH}
 2. Run: ./scripts/Get-CIStatus.ps1 -PRNumber {MERGED_PR} -Repository "dotnet/{REPO}"
 3. Find the build with same job name that passed
 4. Locate the Helix job ID (may need artifact download — see [azure-cli.md](azure-cli.md))
@@ -53,7 +53,7 @@ If authentication fails or API returns errors, STOP and return the error — don
 ```
 List all changed files on merge PR #{PR_NUMBER} in dotnet/{REPO}.
 
-Use: github-mcp-server-pull_request_read method:get_files owner:dotnet repo:{REPO} pullNumber:{PR_NUMBER}
+Get the list of changed files for PR #{PR_NUMBER} in dotnet/{REPO}
 
 For each file, note: path, change type (added/modified/deleted), lines changed.
 
@@ -74,9 +74,7 @@ Download and analyze binlog from AzDO build {BUILD_ID}, artifact {ARTIFACT_NAME}
 
 Steps:
 1. Download the artifact (see [azure-cli.md](azure-cli.md))
-2. Load: mcp-binlog-tool-load_binlog path:"{BINLOG_PATH}"
-3. Find tasks: mcp-binlog-tool-search_tasks_by_name taskName:"Csc"
-4. Get task parameters: mcp-binlog-tool-get_task_info
+2. Load the binlog, find the {TASK_NAME} task invocations, get full task details including CommandLineArguments.
 
 Return JSON: { "buildId": N, "project": "...", "args": ["..."] }
 ```
@@ -86,8 +84,8 @@ Return JSON: { "buildId": N, "project": "...", "args": ["..."] }
 Check if canceled job "{JOB_NAME}" from build {BUILD_ID} has recoverable Helix results.
 
 Steps:
-1. Use hlx_files with jobId:"{HELIX_JOB_ID}" workItem:"{WORK_ITEM}" to find testResults.xml
-2. Download with hlx_download_url using the testResults.xml URI
+1. Check if TRX test results are available for the work item. Parse them for pass/fail counts.
+2. If no structured results, check for testResults.xml
 3. Parse the XML for pass/fail counts on the <assembly> element
 
 Return JSON: { "jobName": "...", "hasResults": true, "passed": N, "failed": N }
@@ -104,20 +102,19 @@ This pattern scales to any number of builds — launch N subagents for N builds,
 ```
 Extract the target branch HEAD from AzDO build {BUILD_ID}.
 
-Use azure-devops-pipelines_get_build_log_by_id with:
-  project: "public", buildId: {BUILD_ID}, logId: 5, startLine: 500
+Fetch the checkout task log (typically log ID 5, around line 500+)
 
 Search for: "HEAD is now at {mergeCommit} Merge {prSourceSha} into {targetBranchHead}"
 
 Return JSON: { "buildId": N, "targetHead": "abc1234", "mergeCommit": "def5678" }
 Or: { "buildId": N, "targetHead": null, "error": "merge line not found in log 5" }
 ```
 
-Launch one per build in parallel. The main agent combines with `azure-devops-pipelines_get_builds` results to build the full progression table.
+Launch one per build in parallel. The main agent combines with the build list to build the full progression table.
 
 ## General Guidelines
 
-- **Use `general-purpose` agent type** — it has shell + MCP access (`hlx_status`, `azure-devops-pipelines_get_builds`, `mcp-binlog-tool-load_binlog`, etc.)
+- **Use `general-purpose` agent type** — it has shell + MCP access for Helix, AzDO, binlog, and GitHub queries
 - **Run independent tasks in parallel** — the whole point of delegation
 - **Include script paths** — subagents don't inherit skill context
 - **Require structured JSON output** — enables comparison across subagents
diff --git a/.github/skills/ci-analysis/references/helix-artifacts.md b/.github/skills/ci-analysis/references/helix-artifacts.md
@@ -190,10 +190,12 @@ When you download artifacts via MCP tools or manually, the directory structure c
 
 ### Helix Work Item Downloads
 
-Two MCP tools download Helix artifacts:
-- **`hlx_download`** — downloads multiple files from a work item, with optional glob `pattern` (e.g., `pattern:"*.binlog"`). Returns local file paths.
+MCP tools for downloading Helix artifacts:
+- **`hlx_download`** — downloads multiple files from a work item. Returns local file paths.
 - **`hlx_download_url`** — downloads a single file by direct URI (from `hlx_files` output). Use when you know exactly which file you need.
 
+> 💡 **Prefer remote investigation first**: search file contents, parse test results, and search logs remotely before downloading. Only download when you need to load binlogs or do offline analysis.
+
 `hlx_download` saves files to a temp directory. The structure is **flat** — all files from the work item land in one directory:
 
 ```
@@ -208,7 +210,7 @@ C:\...\Temp\helix-{hash}\
 ```
 
 **Key confusion point:** Numbered binlogs (`msbuild0.binlog`, `msbuild1.binlog`) correspond to individual test cases within the work item, not to build phases. A work item like `Microsoft.NET.Build.Tests.dll.18` runs dozens of tests, each invoking MSBuild separately. To map a binlog to a specific test:
-1. Load it with `mcp-binlog-tool-load_binlog`
+1. Load it with the binlog analysis tools
 2. Check the project paths inside — they usually contain the test name
 3. Or check `testResults.xml` to correlate test execution order with binlog numbering
 
@@ -226,7 +228,7 @@ $env:TEMP\TestBuild_linux_x64\
         └── SendToHelix.binlog     # Contains Helix job GUIDs
 ```
 
-**Key confusion point:** The artifact name appears twice in the path (extract folder + subfolder inside the ZIP). Use the full nested path with `mcp-binlog-tool-load_binlog`.
+**Key confusion point:** The artifact name appears twice in the path (extract folder + subfolder inside the ZIP). Use the full nested path when loading binlogs.
 
 ### Mapping Binlogs to Failures
 
diff --git a/.github/skills/ci-analysis/references/manual-investigation.md b/.github/skills/ci-analysis/references/manual-investigation.md