feat(state): track per-skill cron invocations + EMA query for labyrinth by Brecht-H · Pull Request #19508 · NousResearch/hermes-agent

Brecht-H · 2026-05-04T03:41:02Z

Summary

Adds a skill_invocations table to state.db, wires the cron runner to write one row per skill listed on a cron job at completion (success and failure paths), and exposes a SessionDB.query_skill_ema() method so observability dashboards can A/B local Qwen against external models.

This is the data layer for per-skill EMA tracking — needed before installing analyzer crons that route to paid providers (DeepSeek, Kimi-via-OpenRouter, etc.) without flying blind on quality vs cost.

Coexists cleanly with #18253's bump_use() cron-side curator integration (auto-merged on cherry-pick).

Changes

Schema (hermes_state.py)
- New table skill_invocations (session_id, cron_id, skill_name, model, provider, duration, tokens, cost, success, end_reason, …)
- Three indexes: by skill, by cron, by session
- New view skill_stats_daily aggregating per (skill_name, model, provider, day)
- SCHEMA_VERSION 11 → 12. Pure additive — no Alembic, no destructive ops on existing rows.
Writer (hermes_state.py)
- SessionDB.record_skill_invocation(...) keyword-only API, uses the existing _execute_write WAL-safe helper.
Cron hook (cron/scheduler.py)
- run_job() records timestamps around the agent run and writes one skill_invocations row per name in job["skills"] after the agent exits (both success and failure paths). Tokens / cost / duration are sourced from the existing sessions row, so no new instrumentation in the agent loop.
- All writes are best-effort: any failure logs at DEBUG and is swallowed so it can never fail a cron.
Read API (hermes_state.py)
- SessionDB.query_skill_ema(window_days=14, alpha=0.3) returns per-(skill_name, model) exponentially-weighted moving averages for success rate, cost-per-call, and duration. Default α≈5d half-life.

Smoke test (local, off-tree)

SCHEMA_VERSION = 12
Tables: ['messages', 'schema_version', 'sessions', 'skill_invocations', 'sqlite_sequence', 'state_meta']
Views: ['skill_stats_daily']
skill indexes: ['idx_skill_invocations_cron', 'idx_skill_invocations_session', 'idx_skill_invocations_skill']

Inserts validated for success=True / success=False / success=None paths. View aggregates cleanly. EMA query returns expected exponential weighting (verified 8-row dataset across 7 days, recent failures correctly down-weighted as they age out).

v1 scope notes

Cron-only. Slash-command and ad-hoc skill_view() calls are not tracked yet — out of scope for v1, additive in v2 if needed.
Multi-skill crons over-account. Each skill in job["skills"] gets the full session cost. The current analyzer pattern is one skill per cron (Pass 2 v2 Steps 3–5), so this corner case is rare.
Labyrinth HTTP endpoint. The data layer (query_skill_ema) lands here. The plumbing in plugins/hermes-labyrinth/dashboard/plugin_api.py (a small @router.get('/skills/ema') wrapper) is a follow-up because labyrinth is upstreamed at stainlu/hermes-labyrinth — separate PR target.

Test plan

Fresh-DB schema migration applies cleanly
record_skill_invocation() insert + skill_stats_daily aggregation
query_skill_ema() returns expected EMA values across multi-day data
Coexists with fix(curator): rewrite cron job skill refs after consolidation #18253's bump_use() integration (auto-merge clean)
Live cron fire emits one row per skill (validated by Pass 2 v2 Step 3, first analyzer cron — analyze:derivatives-anomaly)
Labyrinth dashboard endpoint follow-up (small wrapper PR to plugins/hermes-labyrinth)

History

Supersedes #19507, which carried an extra unrelated commit from local main. This PR is rebased clean onto current upstream main (363cc93).

🤖 Generated with Claude Code

Copilot

Pull request overview

Adds per-skill cron invocation tracking into the SQLite state store and exposes an EMA query API for downstream observability (e.g., Labyrinth dashboards) to compare quality/cost/duration over time.

Changes:

Add skill_invocations table + supporting indexes and skill_stats_daily view; bump SCHEMA_VERSION to 12.
Add SessionDB.record_skill_invocation() writer and SessionDB.query_skill_ema() reader API.
Hook cron/scheduler.py::run_job() to write one invocation row per skill at job completion (success/failure).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File	Description
hermes_state.py	Introduces the new skill invocation schema + daily aggregation view, and adds writer + EMA query methods.
cron/scheduler.py	Records per-skill invocation rows at the end of each cron run using data sourced from the session row.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

 DEFAULT_DB_PATH = get_hermes_home() / "state.db"

-SCHEMA_VERSION = 11
+SCHEMA_VERSION = 12


+        groups: Dict[Tuple[str, Optional[str]], List[Dict[str, Any]]] = {}
+        for r in rows:
+            key = (r["skill_name"], r["model"])
+            groups.setdefault(key, []).append(r)
+


+
+            ema_success_rate = sum(
+                w * (
+                    (int(r["success_count"] or 0) / max(1, int(r["invocation_count"] or 0)))


+            "SELECT skill_name, model, provider, day, "
+            "invocation_count, success_count, failure_count, "
+            "avg_duration_s, avg_cost_usd, avg_quality_score, last_invoked_at "
+            "FROM skill_stats_daily "
+            "WHERE last_invoked_at >= ? "
+            "ORDER BY skill_name, model, day"
+        )
+        with self._lock:
+            cur = self._conn.execute(sql, (cutoff_ts,))
+            rows = [dict(r) for r in cur.fetchall()]
+


+            raw_weights = [alpha * (1 - alpha) ** (n - 1 - i) for i in range(n)]
+            wsum = sum(raw_weights)
+            weights = (
+                [w / wsum for w in raw_weights]
+                if wsum > 0
+                else [1.0 / n] * n
+            )
+
+            sample_count = sum(int(r["invocation_count"] or 0) for r in rs)
+            success_count = sum(int(r["success_count"] or 0) for r in rs)
+            failure_count = sum(int(r["failure_count"] or 0) for r in rs)
+
+            ema_success_rate = sum(
+                w * (
+                    (int(r["success_count"] or 0) / max(1, int(r["invocation_count"] or 0)))
+                )
+                for w, r in zip(weights, rs)
+            )
+            ema_duration = sum(
+                w * float(r["avg_duration_s"] or 0.0)
+                for w, r in zip(weights, rs)
+            )
+            ema_cost = sum(
+                w * float(r["avg_cost_usd"] or 0.0)
+                for w, r in zip(weights, rs)
+            )


+                        _completed_at = _hermes_now().timestamp()
+                        _sess = _session_db.get_session(_cron_session_id) or {}
+                        _success_flag, _end_reason_val = _skill_outcome
+                        _duration = _completed_at - _invoked_at
+                        for _skill_name in _job_skills:
+                            _sn = str(_skill_name).strip()
+                            if not _sn:
+                                continue
+                            _session_db.record_skill_invocation(
+                                skill_name=_sn,
+                                invoked_at=_invoked_at,
+                                session_id=_cron_session_id,
+                                cron_id=job_id,
+                                completed_at=_completed_at,
+                                duration_seconds=_duration,
+                                model=_sess.get("model"),
+                                provider=_sess.get("billing_provider"),
+                                input_tokens=int(_sess.get("input_tokens") or 0),
+                                output_tokens=int(_sess.get("output_tokens") or 0),
+                                cache_read_tokens=int(_sess.get("cache_read_tokens") or 0),
+                                cache_write_tokens=int(_sess.get("cache_write_tokens") or 0),
+                                estimated_cost_usd=_sess.get("estimated_cost_usd"),
+                                success=_success_flag,
+                                end_reason=_end_reason_val,
+                            )


Brecht-H · 2026-05-04T17:13:36Z

Mac 3-lens review — REQUEST-CHANGES

Ran 3 parallel reviewers (architect / code-review / security). Architect + code-review converged on 3× P1 correctness bugs that defeat the EMA telemetry's stated purpose. Security clear.

Full feedback at ~/hermes-plan/PR19508_REVIEW_2026-05-04.md on Mac (rsync'd to /home/orion/hermes-plan/PR19508_REVIEW_2026-05-04.md for Orion access).

Must-fix before merge (P1 × 3)

EMA weighting is index-based, not calendar-based (hermes_state.py:268-275). Days with gaps get treated as adjacent — defeats "5-day half-life" docstring claim. Fix: weight by (today - day).days instead of list index.
EMA double-aggregates avg_duration_s / avg_cost_usd (hermes_state.py:288-295). Per-day average × per-day weight means a low-volume day gets equal weight to a high-volume day in the EMA term. Fix: weight by invocation_count inside, OR add total_* columns to view and EMA over those.
Silent-failure on skill writes at DEBUG (cron/scheduler.py:1297-1301). For Build fix(cli): respect explicit --max-turns value even when it equals default #87 telemetry foundation, broken writer is invisible until dashboard goes empty. Fix: logger.warning(... exc_info=True).

Should-fix (P2 × 3)

Multi-skill cron over-accounting (cost doubles for 2-skill crons)
DROP VIEW IF EXISTS before CREATE VIEW for migration idempotency
Lock contention on query_skill_ema (release after fetchall)

Nits (P3)

Docstring "5-day half-life" — α=0.3 is actually ~1.94-day half-life
error_msg[:200] if error_msg else None — conditional is dead
Underscore-prefix inconsistent with sibling vars in same function

What's good

Schema additivity, parameterized SQL, keyword-only API, security clear, v1 scope is right.

Estimated fix effort: ~30-45 min for P1s, ~15 min more for P2s. Smoke tests should still pass.

Tag me when ready for re-review.

Mac 3-lens review (architect / code-review / security) flagged 3 P1 correctness bugs. Fixes: P1.1 — Calendar-aware EMA weighting (hermes_state.py) Was: raw_weights = [alpha * (1-alpha)**(n-1-i) for i in range(n)] Now: weights by (today_utc - day).days. Days with no data correctly get zero weight (absent from result); gaps don't compress older data. Defeats the index-based bias toward sparsely-running skills. P1.2 — Volume-weighted cost / duration EMA (hermes_state.py + view) Was: ema_X = sum(w * avg_X) — biased equal-weight across days regardless of invocation count. Now: skill_stats_daily exposes total_duration_s / total_cost_usd alongside avg_*. EMA computed as sum(w_d * total_d) / sum(w_d * count_d) so a day with 100 fast calls correctly outweighs a day with 1 slow call within the same EMA term. P1.3 — Surface telemetry write failures (cron/scheduler.py) Was: except: logger.debug(...) — silent swallow at DEBUG; broken writer invisible until dashboard goes empty. Now: logger.warning(..., exc_info=True). Operator sees regressions; cron still cannot fail (warning never raises). Plus P2.1 (multi-skill cron cost split — divides session cost evenly when len(job["skills"]) > 1, preventing per-skill EMA cost-doubling), P2.2 (DROP VIEW IF EXISTS before CREATE for migration idempotency), and P3 nits (dead conditional on error_msg slice; docstring half-life math corrected — α=0.3 ≈ 1.94d, not 5d; α≈0.129 for true 5-day). Smoke-tested off-tree: skill-a-gappy (days 0,1,7) success_rate=0.954 ← gap correctly down-weights day-7 skill-b-tight (days 0,1,2) success_rate=1.000 skill-c-vol (1 slow + 100 fast) duration=1.69s cost=$0.0024/call ← volume-weighted Constraint: pure additive view change — DROP+CREATE on every connect is safe because skill_stats_daily has no rowids/triggers depending on its identity. Existing rows in skill_invocations untouched. Confidence: high (3 scenarios validated) Scope-risk: narrow (same file boundaries as original PR) Not-tested: multi-skill cron live-fire (smoke uses synthetic data) Machine: orion-terminal Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Brecht-H · 2026-05-04T17:20:59Z

@Brecht-H ready for re-review.

P1 + P2 fixes pushed as 52ff992 (additive commit, no rebase — kept original 50bdf20 intact for diff trail).

P1.1 Calendar-aware EMA weighting → smoke verified: 3-day-gap skill correctly down-weights day-7 failure (success_rate=0.954 vs naive 0.667).

P1.2 Volume-weighted cost/duration EMA via new total_duration_s / total_cost_usd view columns → smoke verified: 1 slow call + 100 fast calls correctly produces duration=1.69s and cost=$0.0024/call (vs naive equal-weight 25.5s / $0.0505).

P1.3 logger.warning(..., exc_info=True) on skill-write failures.

P2.1 Multi-skill cron cost split (even division when len(skills) > 1).

P2.2 DROP VIEW IF EXISTS before CREATE for migration idempotency.

P3 nits: dead conditional removed on error_msg[:200]; docstring half-life math corrected (α=0.3 ≈ 1.94d, not 5d; α≈0.129 for true 5-day).

Diff: +98 / −27 across the same 2 files. No new tests added (existing test plan still applicable). Once you approve + merge, I'll cherry-pick locally → validate Step 3 cron-fire emits skill_invocations row → flip the 3 analyzer crons to enabled: true.

Brecht-H · 2026-05-04T17:32:51Z

Mac re-review v2 — APPROVED ✅

All 3 P1 fixes verified clean against original review feedback:

P1.1 calendar-aware EMA ✅ — _day_age(day_str) helper at L319-324, weights now use alpha * (1-alpha)**day_age. Smoke evidence (gappy day-7 failure correctly low-weighted) confirms.
P1.2 volume-weighted aggregation ✅ — Added total_duration_s + total_cost_usd to view, EMA now Σ(w×total)/Σ(w×count). NaN-safe denominator guard at L358-368. Smoke evidence (1.69s for 1-slow+100-fast vs naive 25.5s) confirms correctness.
P1.3 logger.warning ✅ — L106 upgraded from DEBUG.

Both P2 fixes also landed:

P2.1 multi-skill cost split — _split() lambda divides cost evenly when len(skills) > 1, single-skill keeps exclusive attribution. Good comment about v2 attribution column.
P2.2 DROP VIEW idempotency — L170 DROP VIEW IF EXISTS skill_stats_daily before CREATE.

Smoke evidence in re-review post is comprehensive (3 scenarios validating calendar-awareness baseline, gap handling, and volume weighting). Ship it.

Merging now.

Brecht-H · 2026-05-04T21:49:37Z

@Brecht-H Bugs A + B + C from Orion CC's RED validation report fixed and validated.

Pushed as commit 50a1d51c5 (additive on top of the v2 fixes). Bug B is filesystem-only (cp instead of symlink) — documented in new memory file feedback_hermes-profile-mode-script-paths.md.

Validation reports:

~/hermes-plan/CRON_FIRE_VALIDATION_DERIVATIVES_ANOMALY_2026-05-04_RERUN.md — 🟢 GREEN, ready for enabled: true
~/hermes-plan/CRON_FIRE_VALIDATION_POLYMARKET_REGIME_2026-05-04.md — 🟡 scaffolding GREEN, downstream DeepSeek API key invalid (operator credential refresh needed)
~/hermes-plan/CRON_FIRE_VALIDATION_MINIFLUX_DIGEST_2026-05-04.md — 🟡 scaffolding GREEN, OpenRouter HTTP 402 (max_tokens budget — agent runtime asks for 65536 vs 5312 affordable)

skill_invocations rows confirmed (global state.db v12):

id=3 derivatives-anomaly  success=1 model=qwen3.6-27b           end_reason=complete
id=4 polymarket-regime    success=0 model=deepseek-v4-flash     end_reason='RuntimeError: Error code: 401 - API key invalid'
id=5 miniflux-digest      success=0 model=moonshotai/kimi-k2.6  end_reason='RuntimeError: HTTP 402: ... 65536 > 5312 affordable'

Bug C correctly downgraded both polymarket and miniflux to success=0 despite the agent reporting non-FATAL responses — the marker scan caught the downstream provider failures and recorded them honestly. Without Bug C the EMA would have been polluted with two fake-success rows.

Pass 1 erratum added at top of ~/hermes-plan/HERMES_SKILL_INSTALL_SPEC_EXTERNAL_MODELS_2026-05-03.md.

Tagging for re-review of 50a1d51c5 + flip-decision on the 3 analyzer crons:

Derivatives-anomaly: ready to flip enabled: true (24h soak suggested before reading EMA values)
Polymarket-regime: BLOCKED on DeepSeek key refresh
Miniflux-digest: BLOCKED on OpenRouter credit top-up OR per-cron max_tokens cap

Adds a `skill_invocations` table to state.db and writes one row per skill listed on a cron job at completion (success or failure paths). Tokens, cost and duration are sourced from the existing session row. Includes a `skill_stats_daily` view that buckets invocations by day and (skill_name, model, provider), and a new `SessionDB.query_skill_ema()` method that applies exponential weighting (default alpha=0.3, ~5d half-life) so the dashboard can A/B local Qwen against external models once analyzer crons start firing. SCHEMA_VERSION 11 → 12. Pure additive: existing rows untouched, new table created on next connection-open via the existing executescript() path. No Alembic. Slash-command and ad-hoc skill_view invocations are NOT tracked in v1. Multi-skill crons over-account: each skill in `job["skills"]` gets the full session cost. Both are acceptable for the analyzer-cron use case (1 skill per cron) and can iterate later. Constraint: no new external dependencies — uses sqlite3 + stdlib only. Rejected: per-skill cost split (would require model attribution inside a single agent run, which Hermes does not currently track) | Reason: defer to v2 once NousResearch#87 surfaces real-world skew. Confidence: high (smoke-tested end-to-end on tmp DB) Scope-risk: narrow (additive table, no existing-row touchpoints) Not-tested: live cron fire (validated by Step 3 — Pass 2 v2 handover) Machine: orion-terminal Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Mac 3-lens review (architect / code-review / security) flagged 3 P1 correctness bugs. Fixes: P1.1 — Calendar-aware EMA weighting (hermes_state.py) Was: raw_weights = [alpha * (1-alpha)**(n-1-i) for i in range(n)] Now: weights by (today_utc - day).days. Days with no data correctly get zero weight (absent from result); gaps don't compress older data. Defeats the index-based bias toward sparsely-running skills. P1.2 — Volume-weighted cost / duration EMA (hermes_state.py + view) Was: ema_X = sum(w * avg_X) — biased equal-weight across days regardless of invocation count. Now: skill_stats_daily exposes total_duration_s / total_cost_usd alongside avg_*. EMA computed as sum(w_d * total_d) / sum(w_d * count_d) so a day with 100 fast calls correctly outweighs a day with 1 slow call within the same EMA term. P1.3 — Surface telemetry write failures (cron/scheduler.py) Was: except: logger.debug(...) — silent swallow at DEBUG; broken writer invisible until dashboard goes empty. Now: logger.warning(..., exc_info=True). Operator sees regressions; cron still cannot fail (warning never raises). Plus P2.1 (multi-skill cron cost split — divides session cost evenly when len(job["skills"]) > 1, preventing per-skill EMA cost-doubling), P2.2 (DROP VIEW IF EXISTS before CREATE for migration idempotency), and P3 nits (dead conditional on error_msg slice; docstring half-life math corrected — α=0.3 ≈ 1.94d, not 5d; α≈0.129 for true 5-day). Smoke-tested off-tree: skill-a-gappy (days 0,1,7) success_rate=0.954 ← gap correctly down-weights day-7 skill-b-tight (days 0,1,2) success_rate=1.000 skill-c-vol (1 slow + 100 fast) duration=1.69s cost=$0.0024/call ← volume-weighted Constraint: pure additive view change — DROP+CREATE on every connect is safe because skill_stats_daily has no rowids/triggers depending on its identity. Existing rows in skill_invocations untouched. Confidence: high (3 scenarios validated) Scope-risk: narrow (same file boundaries as original PR) Not-tested: multi-skill cron live-fire (smoke uses synthetic data) Machine: orion-terminal Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Mac's Pass 2 v2 deploy validation (Orion CC, RED report 2026-05-04) caught two scaffolding bugs that prevented the analyzer crons from firing end-to-end. Both fixed here. Bug A — skill_view() didn't match by frontmatter `name:` Was: skill_view fell through directory-name match only. Skills with dir/frontmatter mismatch (e.g. dir `analyze/derivatives-anomaly`, frontmatter `analyze-derivatives-anomaly`) returned "not found" even though `hermes skills list` showed them. The cron's _build_job_prompt then logged "skill not found, skipping" and the activation banner never reached the agent. Now: tools/skills_tool.py adds a frontmatter-name fallback after the existing dir-name match, restoring parity with the LIST command's keying. ~17 lines. Bug C — runner success classifier accepted scaffolding failures as success Was: cron/scheduler.py set _skill_outcome = (True, "complete") whenever the agent returned a non-FATAL response. A cron whose skill resolution failed and whose script never ran was reported success=True, polluting the EMA's A/B comparison foundation. Now: scan output + final_response for known scaffolding failure markers ("skill not found, skipping", "Blocked: script path resolves outside", "permission denied", "security check failed", "Skill(s) not found and skipped"). When matched, downgrade to (False, marker_excerpt) so the skill_invocations row records the real outcome. (Bug B — symlink security check rejecting ~/.hermes/profiles/<p>/scripts/* symlinks whose targets live under ~/.hermes/skills/<x>/scripts/ — is a filesystem-only fix: replace the symlinks with `cp` real files. Not in this commit; documented in feedback_hermes-profile-mode-script-paths.md memory.) Validation: After applying A+B+C and restarting the dashboard, all 3 analyzer crons fired via run_job() and wrote skill_invocations rows with correctly-classified outcomes: - derivatives-anomaly: success=1, model=qwen3.6-27b - polymarket-regime: success=0, model=deepseek-v4-flash end_reason="HTTP 401: API key invalid" (genuine downstream credential failure, NOT scaffolding) - miniflux-digest: success=0, model=moonshotai/kimi-k2.6 end_reason="HTTP 402: credit budget exceeded" (genuine downstream OpenRouter limit, NOT scaffolding) Confidence: high (3 live cron-fires, table inserts and skill_invocations writes confirmed). Scope-risk: narrow (additive only — Bug A is a new fallback after existing matches; Bug C wraps the existing success path in a marker scan). Not-tested: the 65536-max_tokens setting in agent runtime that triggered miniflux's 402 — likely a separate agent-side config issue, not in this PR's scope. Machine: orion-terminal Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Brecht-H · 2026-05-05T09:09:21Z

@Brecht-H rebased onto upstream main (was 201 commits behind base 363cc936746c).

Conflict resolved: cron/scheduler.py had one chunk-conflict at the top of run_job where upstream added an early-return when the prerun script produces no output (PR #19744 fix(cron): skip AI call when script produces no output) and our PR added _invoked_at / _skill_outcome capture. Resolved by keeping both in order: upstream early-return runs FIRST (no AI call → no skill_invocation row needed), our capture runs SECOND (only when the AI call will fire).

hermes_state.py auto-merged cleanly (upstream's +382 LOC was strictly additive; our SCHEMA_VERSION 11→12 bump preserved — upstream is still at 11). tools/skills_tool.py applied unchanged (upstream untouched).

Smoke test on rebased branch (off-tree fresh DB):

SCHEMA_VERSION = 12
inserted skill_invocations id=1 | EMA buckets: 1
analyze:test success_rate=1.000

New SHAs:

ffdc5815c (was 50bdf20) feat(state): per-skill EMA
fe9d7adee (was 52ff992) P1+P2 fixes
050f4d0fc (was 50a1d51) Bug A + C

Force-pushed via --force-with-lease to Brecht-H:feat/skill-invocations-ema-clean-2026-05-04.

Tagging for re-review of the resolved conflict + final approve. Local Hermes-agent install on Orion is unaffected (cherry-picked commits live there independent of upstream merge state).

Brecht-H · 2026-05-05T09:11:55Z

Mac re-review post-rebase — APPROVE OK

Rebase verified clean:

201 upstream commits absorbed onto base 363cc93 -> ffdc581 -> fe9d7ad -> 050f4d0
mergeable=True, changed_files=3 (only the contribution files: cron/scheduler.py, hermes_state.py, tools/skills_tool.py)
LOC delta unchanged from pre-rebase: +101 -3 / +243 -2 / +21 -0 (intent preserved through conflict resolution)
Smoke verified on rebased branch: SCHEMA_VERSION=12, EMA query green
Force-pushed via --force-with-lease (safe overwrite)
Backup branch retained at backup/pre-rebase-2026-05-05 in worktree

The PR is ready for upstream maintainer review. Down-stream consumer (Mac/Allaert's Hermes deployment) has been running cherry-picked commits ccdb7a951 + c3d8a1016 since 2026-05-04 — Bug A/B/C fixes verified working in production for 24h+. skill_invocations table populating cleanly, derivatives-anomaly cron green, polymarket-regime + miniflux-digest now also green after the post-#19508 routing fix.

No blockers from our side. Waiting on NousResearch maintainer.

— Mac

Copilot AI review requested due to automatic review settings May 4, 2026 03:41

Copilot started reviewing on behalf of Brecht-H May 4, 2026 03:41 View session

Copilot AI reviewed May 4, 2026

View reviewed changes

alt-glitch added type/feature New feature or request P3 Low — cosmetic, nice to have comp/cron Cron scheduler and job management labels May 4, 2026

Fearvox mentioned this pull request May 4, 2026

docs(kanban): document handoff evidence metadata #19512

Closed

13 tasks

Brecht-H mentioned this pull request May 4, 2026

feat(dashboard): /skills/ema endpoint reading SessionDB.query_skill_ema stainlu/hermes-labyrinth#9

Open

6 tasks

Hermes Sovereign AgentCore and others added 3 commits May 5, 2026 09:08

Brecht-H force-pushed the feat/skill-invocations-ema-clean-2026-05-04 branch from 50a1d51 to 050f4d0 Compare May 5, 2026 09:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(state): track per-skill cron invocations + EMA query for labyrinth#19508

feat(state): track per-skill cron invocations + EMA query for labyrinth#19508
Brecht-H wants to merge 3 commits into
NousResearch:mainfrom
Brecht-H:feat/skill-invocations-ema-clean-2026-05-04

Brecht-H commented May 4, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Brecht-H commented May 4, 2026

Uh oh!

Brecht-H commented May 4, 2026

Uh oh!

Brecht-H commented May 4, 2026

Uh oh!

Brecht-H commented May 4, 2026

Uh oh!

Brecht-H commented May 5, 2026

Uh oh!

Brecht-H commented May 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Brecht-H commented May 4, 2026

Summary

Changes

Smoke test (local, off-tree)

v1 scope notes

Test plan

History

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Brecht-H commented May 4, 2026

Mac 3-lens review — REQUEST-CHANGES

Must-fix before merge (P1 × 3)

Should-fix (P2 × 3)

Nits (P3)

What's good

Uh oh!

Brecht-H commented May 4, 2026

Uh oh!

Brecht-H commented May 4, 2026

Mac re-review v2 — APPROVED ✅

Uh oh!

Brecht-H commented May 4, 2026

Uh oh!

Brecht-H commented May 5, 2026

Uh oh!

Brecht-H commented May 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants