Hi CrewAI maintainers,
I am Adam Lin, maintainer of Agent Threat Rules (ATR), an open Apache 2.0 detection standard for AI agent threats. ATR currently ships 330 community-maintained rules across ten attack categories: prompt injection, tool poisoning, skill compromise, context exfiltration, agent manipulation, privilege escalation, excessive autonomy, model abuse, model security, and data poisoning. Repo: https://github.com/Agent-Threat-Rule/agent-threat-rules. Microsoft agent-governance-toolkit and Cisco AI Defense skill-scanner reference ATR rules as an upstream source.
I noticed lib/crewai/src/crewai/security/ has fingerprint and security_config in place, with security_config.py listing scoping rules and impersonation and delegation as future work. ATR rules are catalog-driven YAML pattern definitions rather than scope or delegation mechanisms, but they could plug into a hook on the agent or task lifecycle to flag adversarial inputs in user messages, tool outputs, retrieved documents, and inter-agent messages.
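To make "catalog-driven YAML pattern definitions" concrete, here is a minimal sketch of what a loaded rule and matcher could look like. The rule ID, field names, and patterns below are illustrative placeholders, not the actual ATR schema:

```python
import re

# Hypothetical rule, mirroring the shape an ATR YAML entry might take
# once loaded; field names and the rule ID are illustrative only.
RULE = {
    "id": "atr-pi-001",
    "category": "prompt-injection",
    "severity": "high",
    "patterns": [
        r"ignore (all )?previous instructions",
        r"disregard your system prompt",
    ],
}


def scan(text, rules):
    """Return the IDs of rules whose patterns match anywhere in text."""
    hits = []
    for rule in rules:
        if any(re.search(p, text, re.IGNORECASE) for p in rule["patterns"]):
            hits.append(rule["id"])
    return hits
```

The same scan function would run unchanged at any of the hook sites, since every site ultimately hands it a string (user message, tool output, retrieved document, or inter-agent message).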
Before opening a PR, I would like to confirm scope and architecture with you. Open questions follow.
Where should detection hooks attach in the crew lifecycle? Candidate sites: pre-task input scan on Task.execute, pre-tool-call scan on Tool input, post-tool-call scan on Tool output, and inter-agent message scan when delegation is used. The closest existing seam is the GuardrailProvider work hinted at in issue 4877.
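As a sketch of the pre-task site, a scanner could be wrapped as a guardrail-style callable returning a pass/fail plus detail. The `(bool, str)` return shape is an assumption about the eventual GuardrailProvider contract, not a documented CrewAI API, and `scan_text` stands in for the ATR matcher:

```python
from typing import Callable, List, Tuple


def make_input_guardrail(
    scan_text: Callable[[str], List[str]],
) -> Callable[[str], Tuple[bool, str]]:
    """Wrap an ATR-style scanner as a (passed, detail) guardrail callable.

    The (bool, str) contract is assumed here for illustration; the real
    shape would follow whatever GuardrailProvider lands on.
    """
    def guardrail(task_input: str) -> Tuple[bool, str]:
        hits = scan_text(task_input)
        if hits:
            return False, "ATR rules triggered: " + ", ".join(hits)
        return True, "clean"
    return guardrail


# Usage with a stub scanner standing in for the real matcher:
stub = lambda t: ["atr-pi-001"] if "ignore previous" in t.lower() else []
guardrail = make_input_guardrail(stub)
```

The same wrapper shape would serve the pre-tool-call, post-tool-call, and inter-agent sites; only the string being scanned differs.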
What is the preferred dependency posture? ATR rules and the lightweight matcher are pure Python with no heavy native deps. The detection engine could be vendored as a small module under crewai.security.atr, or wired through an optional extra so users opt in with pip install crewai[security].
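If the optional-extra route is preferred, core could degrade gracefully when the extra is absent via the usual import-guard idiom. The module path `crewai.security.atr` is the hypothetical location from this proposal, not an existing package:

```python
# Guard idiom for an optional extra; crewai.security.atr is the
# hypothetical module path proposed above, not an existing package.
try:
    from crewai.security.atr import AtrScanner  # type: ignore
except ImportError:
    AtrScanner = None  # security extra not installed


def maybe_scan(text: str):
    """Scan text if the extra is installed; otherwise pass through."""
    if AtrScanner is None:
        return []  # opt-in feature absent: nothing is flagged
    return AtrScanner().scan(text)
```

This keeps the import cost and dependency surface at zero for users who never opt in with pip install crewai[security].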
Is a thin first cut acceptable, scoped to one of the ten ATR categories (for example prompt-injection only, the largest at 109 rules), as proof of fit before broadening? I would aim for under 120 lines of Python plus tests.
Would the maintainers prefer this lives in lib/crewai-tools as a tool integration rather than in lib/crewai core? The crewai-tools/security folder already holds safe_path; an atr_input_scanner could sit alongside.
The companion v0.3 OSCAL catalog at https://github.com/Agent-Threat-Rule/ai-rmf-oscal-catalog (CC0) maps the ATR rule set to NIST AI RMF controls, which may help with the compliance positioning (EU AI Act, NIST AI RMF) that came up in issue 5360.
Apache 2.0 on the ATR side. Happy to follow whatever direction the maintainers prefer. The llm-generated label is applied per the AI contribution policy in CONTRIBUTING.md.
Thanks,
Adam Lin (adam@agentthreatrule.org)