[FEATURE][SECURITY]: MCP server source code scanner - Semgrep/Bandit integration #2217

# 🔌 Plugin: MCP Server Source Code Scanner - Semgrep/Bandit Integration

## Goal

Implement a **gateway plugin** that performs static analysis on MCP server source code using **Semgrep**, **Bandit**, or other SAST tools to detect security vulnerabilities, code quality issues, and dangerous patterns before servers are added to the gateway.

## Why Now?

1. **Code-Level Vulnerabilities**: Container scanning misses application-level issues like SQL injection, command injection, and insecure deserialization
2. **MCP-Specific Risks**: MCP servers execute tools on behalf of AI agents, so a code vulnerability can have amplified impact
3. **Shift-Left Security**: Catching issues in code before deployment is cheaper than runtime detection
4. **GitHub Integration**: Many MCP servers are deployed from GitHub repos, so scanning the source before registration is a natural fit
5. **Existing Code Safety Plugin**: The `code_safety_linter` plugin detects patterns in outputs, but nothing analyzes server source code before deployment

---

## 📖 User Stories

<details>
<summary><strong>US-1: Security Engineer - Scan Source Code for Vulnerabilities</strong></summary>

**As a** Security Engineer
**I want** MCP server source code scanned for security issues
**So that** vulnerabilities are caught before deployment

**Acceptance Criteria:**

```gherkin
Given an MCP server from a GitHub repository:
  source:
    type: github
    repo: org/mcp-server
    branch: main
When the source scan runs:
Then the scanner should:
  - Clone the repository
  - Detect primary language
  - Run appropriate scanners (Semgrep, Bandit)
  - Return findings with:
    - Rule ID and severity
    - File path and line numbers
    - Code snippet
    - Remediation guidance
  - Block if critical findings exist
```

</details>

<details>
<summary><strong>US-2: Developer - View Scan Findings with Remediation</strong></summary>

**As a** Developer
**I want** actionable scan findings with code context
**So that** I can quickly fix security issues

**Acceptance Criteria:**

```gherkin
Given a scan has completed with findings:
When I view the assessment report:
Then I see for each finding:
  - Severity badge (CRITICAL/HIGH/MEDIUM/LOW)
  - Rule description
  - File path with clickable line number
  - Code snippet with highlighted issue
  - Remediation suggestion
  - Link to rule documentation
```

</details>
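
The fields listed in US-1 and US-2 suggest a normalized finding record shared by all scanners. The sketch below is one possible shape, not the plugin's actual model: the `Finding` class, its field names, the `SEVERITY_ORDER` scale (taken from the badge list in US-2), and `has_blocking_findings` are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed severity scale, lowest to highest, from the badges in US-2.
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


@dataclass(frozen=True)
class Finding:
    """One normalized scanner result (hypothetical field names)."""
    scanner: str                    # e.g. "semgrep" or "bandit"
    rule_id: str
    severity: str                   # one of SEVERITY_ORDER
    file_path: str
    start_line: int
    end_line: int
    snippet: str = ""
    remediation: str = ""
    doc_url: Optional[str] = None   # link to rule documentation, if any


def has_blocking_findings(findings, block_at="CRITICAL"):
    """Return True if any finding is at or above the blocking severity."""
    floor = SEVERITY_ORDER.index(block_at)
    return any(SEVERITY_ORDER.index(f.severity) >= floor for f in findings)


findings = [
    Finding("bandit", "B602", "HIGH", "server/tools.py", 42, 42,
            snippet="subprocess.run(cmd, shell=True)",
            remediation="Pass an argument list and drop shell=True."),
]
print(has_blocking_findings(findings))          # False: no CRITICAL finding
print(has_blocking_findings(findings, "HIGH"))  # True
```

Keeping the record immutable (`frozen=True`) makes findings hashable, which helps later when deduplicating or caching them.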

---

## 🏗 Architecture

### Supported Scanners

| Scanner | Languages | Output Format |
|---------|-----------|---------------|
| Semgrep | Python, JavaScript, Go, Java, etc. | SARIF, JSON |
| Bandit | Python | JSON |
| ESLint (security) | JavaScript/TypeScript | JSON |
| CodeQL | Multiple | SARIF |
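
Scanner selection could be driven by a simple file-extension census of the cloned repository. The following is a minimal sketch under stated assumptions: the extension map, the skipped directory names, the `"eslint-security"` label, and both function names are hypothetical, and Semgrep is assumed to run for every repository as in the flow diagram.

```python
from collections import Counter
from pathlib import Path

# Assumed extension map; a real plugin would likely cover more languages.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".js": "javascript",
    ".jsx": "javascript",
    ".ts": "typescript",
    ".tsx": "typescript",
    ".go": "go",
    ".java": "java",
}

# Language-specific scanners from the table above (labels are illustrative).
LANGUAGE_SCANNERS = {
    "python": ["bandit"],
    "javascript": ["eslint-security"],
    "typescript": ["eslint-security"],
}

# Directories that usually hold vendored or generated code.
SKIPPED_DIRS = {".git", "node_modules", "venv", ".venv", "__pycache__"}


def detect_languages(repo_root):
    """Count source files per language, skipping vendored and VCS directories."""
    counts = Counter()
    for path in Path(repo_root).rglob("*"):
        if any(part in SKIPPED_DIRS for part in path.relative_to(repo_root).parts):
            continue
        language = EXTENSION_TO_LANGUAGE.get(path.suffix.lower())
        if language and path.is_file():
            counts[language] += 1
    return counts


def select_scanners(language_counts):
    """Always include Semgrep; add language-specific scanners when detected."""
    scanners = ["semgrep"]
    for language in language_counts:
        for scanner in LANGUAGE_SCANNERS.get(language, []):
            if scanner not in scanners:
                scanners.append(scanner)
    return scanners
```

For a repository containing only Python files, `select_scanners(detect_languages(path))` would return `["semgrep", "bandit"]`.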

### Plugin Flow

```mermaid
sequenceDiagram
    participant Gateway as Gateway
    participant Plugin as SourceScannerPlugin
    participant Git as Git
    participant Semgrep as Semgrep
    participant Bandit as Bandit

    Gateway->>Plugin: server_pre_register(github_repo)
    Plugin->>Git: Clone repository
    Plugin->>Plugin: Detect languages
    
    par Python detected
        Plugin->>Bandit: bandit -r . -f json
        Bandit-->>Plugin: Python findings
    and All languages
        Plugin->>Semgrep: semgrep --config p/security-audit --sarif
        Semgrep-->>Plugin: SARIF findings
    end
    
    Plugin->>Plugin: Merge and deduplicate
    Plugin->>Plugin: Check severity threshold
    Plugin->>Git: Cleanup temp directory
    Plugin-->>Gateway: Findings or block
```
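
The "Merge and deduplicate" step could look roughly like the sketch below. It reads the standard SARIF 2.1.0 layout (`runs[].results[]`) and Bandit's JSON report (`results[]` with `test_id`, `issue_severity`, `filename`, `line_number`), and converts both into plain dictionaries. The record keys and function names are assumptions made for illustration. Deduplication here only collapses the same rule at the same location, which covers overlapping Semgrep rulesets; matching equivalent rules across different scanners would need an extra mapping that is not shown.

```python
import json
import posixpath


def parse_sarif(sarif_text, scanner="semgrep"):
    """Extract one record per result from a SARIF 2.1.0 log."""
    records = []
    for run in json.loads(sarif_text).get("runs", []):
        for result in run.get("results", []):
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            records.append({
                "scanner": scanner,
                "rule_id": result.get("ruleId", ""),
                # SARIF treats a missing level as "warning".
                "level": result.get("level", "warning"),
                "path": physical.get("artifactLocation", {}).get("uri", ""),
                "line": physical.get("region", {}).get("startLine", 0),
                "message": result.get("message", {}).get("text", ""),
            })
    return records


def parse_bandit(bandit_text):
    """Extract one record per issue from a Bandit JSON report."""
    records = []
    for issue in json.loads(bandit_text).get("results", []):
        records.append({
            "scanner": "bandit",
            "rule_id": issue.get("test_id", ""),
            "level": issue.get("issue_severity", "").lower(),
            "path": issue.get("filename", ""),
            "line": issue.get("line_number", 0),
            "message": issue.get("issue_text", ""),
        })
    return records


def merge_findings(*record_lists):
    """Concatenate records, dropping repeats of a rule at the same location."""
    seen, merged = set(), []
    for records in record_lists:
        for record in records:
            # Normalize "./a.py" and "a.py" to the same key.
            key = (record["rule_id"], posixpath.normpath(record["path"]), record["line"])
            if key not in seen:
                seen.add(key)
                merged.append(record)
    return merged
```

Usage would be along the lines of `merge_findings(parse_sarif(semgrep_output), parse_bandit(bandit_output))`, with the severity threshold applied to the merged list afterwards.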

---

## 📋 Implementation Tasks

- [ ] Create `plugins/source_scanner/` directory structure
- [ ] Implement `SourceScannerPlugin` class
- [ ] Add Semgrep CLI wrapper with SARIF parsing
- [ ] Add Bandit CLI wrapper for Python
- [ ] Implement language detection logic
- [ ] Add Git clone with authentication support
- [ ] Add branch/tag/commit checkout
- [ ] Implement finding deduplication
- [ ] Add severity filtering and thresholds
- [ ] Implement temp directory cleanup
- [ ] Create MCP-specific Semgrep rules (optional)
- [ ] Add scan result caching by commit SHA
- [ ] Create Admin UI for findings display
- [ ] Write unit tests
- [ ] Write integration tests with vulnerable repos
- [ ] Create README.md
- [ ] Pass `make verify` checks
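
For the clone and cleanup tasks, a minimal sketch might look like the following. It assumes public `org/name` GitHub repositories and a branch or tag name; authentication and checkout of an arbitrary commit SHA are deliberately left out, since `git clone --branch` accepts branches and tags only. The function names and validation patterns are hypothetical. Inputs are validated before they reach the command line because the repository and ref come from user-supplied registration data.

```python
import os
import re
import shutil
import subprocess
import tempfile

# Assumed patterns: conservative allow-lists, not GitHub's full naming rules.
REPO_PATTERN = re.compile(r"[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+")
REF_PATTERN = re.compile(r"[A-Za-z0-9_.][A-Za-z0-9_./-]*")


def build_clone_command(repo, ref, dest):
    """Build a shallow `git clone` argument list for a branch or tag."""
    if not REPO_PATTERN.fullmatch(repo):
        raise ValueError(f"unexpected repository name: {repo!r}")
    if not REF_PATTERN.fullmatch(ref):
        # Rejects refs starting with "-" so they cannot be read as options.
        raise ValueError(f"unexpected ref: {ref!r}")
    url = f"https://github.com/{repo}.git"
    return ["git", "clone", "--depth", "1", "--branch", ref, url, dest]


def clone_repo(repo, ref="main", timeout_seconds=120):
    """Clone into a fresh temp directory and return its path.

    Raises ValueError, subprocess.TimeoutExpired, or
    subprocess.CalledProcessError; the temp directory is removed on failure.
    """
    dest = tempfile.mkdtemp(prefix="source-scan-")
    # GIT_TERMINAL_PROMPT=0 makes git fail instead of prompting for credentials.
    env = dict(os.environ, GIT_TERMINAL_PROMPT="0")
    try:
        subprocess.run(build_clone_command(repo, ref, dest), check=True,
                       timeout=timeout_seconds, env=env, capture_output=True)
    except Exception:
        shutil.rmtree(dest, ignore_errors=True)
        raise
    return dest
```

The caller would remain responsible for removing the returned directory after scanning, for example with `shutil.rmtree(dest, ignore_errors=True)` in a `finally` block.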

---

## ⚙️ Configuration Example

```yaml
plugins:
  - name: "SourceScannerPlugin"
    kind: "plugins.source_scanner.source_scanner.SourceScannerPlugin"
    hooks:
      - server_pre_register
      - catalog_pre_deploy
    mode: "enforce"
    priority: 15
    
    config:
      # Scanner selection
      scanners:
        semgrep:
          enabled: true
          rulesets:
            - "p/security-audit"
            - "p/owasp-top-ten"
            - "p/python"
            - "p/javascript"
          extra_args: []
        bandit:
          enabled: true
          severity: "medium"
          confidence: "medium"
      
      # Severity settings
      severity_threshold: "WARNING"  # Minimum Semgrep severity to report: ERROR | WARNING | INFO
      fail_on_critical: true
      
      # Repository settings
      clone_timeout_seconds: 120
      scan_timeout_seconds: 600
      max_repo_size_mb: 500
      
      # Git authentication
      github_token_env: "GITHUB_TOKEN"
      
      # Caching
      cache_by_commit: true
      cache_ttl_hours: 168  # 1 week
```
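
The `cache_by_commit` and `cache_ttl_hours` settings imply a cache keyed by repository and commit SHA, since results for an unchanged commit stay valid. A minimal in-memory sketch follows; the `ScanCache` class and its methods are hypothetical, and a real deployment would more likely persist entries in the gateway's database or a shared cache.

```python
import time


class ScanCache:
    """In-memory scan-result cache keyed by (repo, commit SHA); illustrative only."""

    def __init__(self, ttl_hours=168, clock=time.time):
        self._ttl_seconds = ttl_hours * 3600
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}

    def get(self, repo, commit_sha):
        """Return cached findings, or None if absent or older than the TTL."""
        entry = self._entries.get((repo, commit_sha))
        if entry is None:
            return None
        stored_at, findings = entry
        if self._clock() - stored_at > self._ttl_seconds:
            del self._entries[(repo, commit_sha)]
            return None
        return findings

    def put(self, repo, commit_sha, findings):
        """Store findings for this exact commit."""
        self._entries[(repo, commit_sha)] = (self._clock(), findings)
```

Keying on the commit SHA rather than the branch name means a new push to `main` misses the cache and triggers a fresh scan.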

---

## ✅ Success Criteria

- [ ] Semgrep integration with security rulesets
- [ ] Bandit integration for Python projects
- [ ] Language detection selects appropriate scanners
- [ ] Git clone with branch/tag support
- [ ] SARIF output parsing
- [ ] Findings stored in assessment database
- [ ] Admin UI shows findings with code context
- [ ] 80%+ test coverage
- [ ] Documentation complete

---

## 🔗 Related Issues

- #2215 - Epic: MCP Server Security Posture Assessment
- #2110 - Secure MCP Runtime
- `code_safety_linter` plugin (existing)
