
A2A Protocol Technology Compatibility Kit (TCK)

A comprehensive test suite for validating compliance with the A2A (Agent2Agent) JSON-RPC protocol specification, offering progressive validation and detailed compliance reporting.

Overview

The A2A Protocol TCK is a sophisticated validation framework that provides:

  • 📋 Categorized Testing: Clear separation of mandatory vs. optional requirements
  • 🎯 Capability-Based Validation: Smart test execution based on Agent Card declarations
  • 📊 Compliance Reporting: Detailed assessment with actionable recommendations
  • 🚀 Progressive Enhancement: Four-tier compliance levels for informed deployment decisions

The TCK transforms A2A specification compliance from guesswork into a clear, structured validation process.

🔄 Two Main Workflows

1. Testing Your A2A Implementation (You're likely here for this)

Use the TCK to validate your A2A implementation:

./run_tck.py --sut-url http://localhost:9999 --category all --compliance-report report.json

2. Managing A2A Specification Updates (Advanced/Maintainer workflow)

Use the TCK to keep the test suite aligned with new versions of the A2A specification:

  • 📖 Complete Specification Update Workflow

  • 🔍 Check spec changes: util_scripts/check_spec_changes.py

  • 📥 Update baseline: util_scripts/update_current_spec.py --version "v1.x"


✨ Key Features

πŸ” Intelligent Test Categorization

  • πŸ”΄ MANDATORY: Must pass for A2A compliance (JSON-RPC 2.0 + A2A core)
  • πŸ”„ CAPABILITIES: Conditional mandatory based on Agent Card declarations
  • πŸ›‘οΈ QUALITY: Production readiness indicators (optional)
  • 🎨 FEATURES: Optional implementation completeness (informational)

🧠 Capability-Based Test Logic

  • Smart Execution: Tests skip when a capability is not declared and become mandatory when it is
  • False Advertising Detection: Catches capabilities that are declared but not implemented
  • Honest Validation: Only tests what the implementation actually claims to support

📈 Compliance Levels & Scoring

  • 🔴 NON_COMPLIANT: Any mandatory failure (Not A2A Compliant)
  • 🟡 MANDATORY: Basic compliance (A2A Core Compliant)
  • 🟢 RECOMMENDED: Production-ready (A2A Recommended Compliant)
  • 🏆 FULL_FEATURED: Complete implementation (A2A Fully Compliant)

📋 Comprehensive Reporting

  • Weighted compliance scoring
  • Specification reference citations
  • Actionable fix recommendations
  • Deployment readiness guidance

Requirements

  • Python: 3.8+
  • uv: Recommended for environment management
  • SUT: Running A2A implementation with accessible HTTP/HTTPS endpoint

Installation

  1. Install uv:

    # Install uv (see https://github.com/astral-sh/uv#installation)
    curl -LsSf https://astral.sh/uv/install.sh | sh
    # Or: pipx install uv
    # Or: brew install uv
  2. Clone and setup:

    git clone https://github.com/a2aproject/a2a-tck.git
    cd a2a-tck
    
    # Create virtual environment
    uv venv
    source .venv/bin/activate  # Linux/macOS
    # .venv\Scripts\activate   # Windows
    
    # Install dependencies
    uv pip install -e .
  3. Configure environment (optional):

    # Copy example environment file and customize
    cp .env.example .env
    # Edit .env to set timeout values and other configuration
  4. Start your A2A implementation (System Under Test):

    # Example using the included Python SUT
    cd python-sut/tck_core_agent
    uv run .

Preparing and Running Your SUT (System Under Test) with run_sut.py

Note: The run_sut.py script requires the PyYAML package. You can install it using uv pip install pyyaml or pip install pyyaml.

To simplify the process of testing various A2A implementations, this TCK includes a utility script run_sut.py. This Python script automates the download (or update), build, and execution of a System Under Test (SUT) based on a configuration file.

SUTs will be cloned or updated into a directory named SUT/ created in the root of this TCK repository.

Configuration (sut_config.yaml)

You need to create a YAML configuration file (e.g., my_sut_config.yaml) to define how your SUT should be handled. A template is available at sut_config_template.yaml.

The configuration file supports the following fields:

  • sut_name (string, mandatory): A descriptive name for your SUT. This name will also be used as the directory name for the SUT within the SUT/ folder (e.g., SUT/my_agent).
  • github_repo (string, mandatory): The HTTPS or SSH URL of the git repository where the SUT source code is hosted.
  • git_ref (string, optional): A specific git branch, tag, or commit hash to checkout after cloning/fetching. If omitted, the repository's default branch will be used.
  • prerequisites_script (string, mandatory): Path to the script that handles prerequisite installation and building the SUT. This path is relative to the root of the SUT's cloned repository (e.g., scripts/build.sh or setup/prepare_env.py).
  • prerequisites_interpreter (string, optional): The interpreter to use for the prerequisites_script (e.g., bash, python3, powershell.exe). If omitted, the script will be executed directly (e.g., ./scripts/build.sh). Ensure the script is executable and has a valid shebang in this case.
  • prerequisites_args (string, optional): A string of arguments to pass to the prerequisites_script (e.g., "--version 1.2 --no-cache").
  • run_script (string, mandatory): Path to the script that starts the SUT. This path is relative to the root of the SUT's cloned repository (e.g., scripts/run.sh or app/start_server.py).
  • run_interpreter (string, optional): The interpreter to use for the run_script.
  • run_args (string, optional): A string of arguments to pass to the run_script (e.g., "--port 8080 --debug").

Example sut_config.yaml:

sut_name: "example_agent"
github_repo: "https://github.com/your_org/example_agent_repo.git"
git_ref: "v1.0.0" # Optional: checkout tag v1.0.0
prerequisites_script: "bin/setup.sh"
prerequisites_interpreter: "bash"
prerequisites_args: "--fast"
run_script: "bin/start.py"
run_interpreter: "python3"
run_args: "--host 0.0.0.0 --port 9000"

SUT Script Requirements

  • Prerequisites Script: This script is responsible for all steps required to build your SUT and install its dependencies. It should exit with a status code of 0 on success and any non-zero status code on failure. If it fails, run_sut.py will terminate.
  • Run Script: This script should start your SUT. Typically, it will launch a server or application that runs in the foreground; a minimal sketch follows this list. The run_sut.py script will wait for this script to terminate (e.g., by Ctrl+C or if the SUT exits itself).
  • Directly Executable Scripts: If you omit the *_interpreter for a script, ensure the script file has execute permissions (e.g., chmod +x your_script.sh) and, for shell scripts on Unix-like systems, includes a valid shebang (e.g., #!/bin/bash).
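
For illustration, here is a minimal run script satisfying that contract, written in Python. The file name, port, and handler are hypothetical; a real SUT would serve the full A2A endpoint:

#!/usr/bin/env python3
# Hypothetical bin/start.py: runs in the foreground so run_sut.py can wait
# on it, and exits non-zero if the server fails to start.
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"agent up")

try:
    HTTPServer(("0.0.0.0", 9000), Handler).serve_forever()
except Exception as exc:
    print(f"SUT failed to start: {exc}", file=sys.stderr)
    sys.exit(1)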

Usage

Once you have your SUT configuration file ready, you can run your SUT using:

python run_sut.py path/to/your_sut_config.yaml

For example:

python run_sut.py sut_configs/my_python_agent_config.yaml

This will:

  1. Clone the SUT from github_repo into SUT/<sut_name>/ (or update if it already exists).
  2. Checkout the specified git_ref (if any).
  3. Execute the prerequisites_script within the SUT's directory.
  4. Execute the run_script within the SUT's directory to start the SUT.

You can then proceed to run the TCK tests against your SUT.

📋 SUT Requirements

Before running tests, ensure your A2A implementation meets the SUT Requirements. This includes:

  • Streaming Duration: Tasks with message IDs starting with "test-resubscribe-message-id" must run for ≥ 2 × TCK_STREAMING_TIMEOUT seconds (a sketch follows this list)
  • Environment Variables: Optional support for TCK_STREAMING_TIMEOUT configuration
  • Test Patterns: Proper handling of TCK-specific message ID patterns
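
As a rough sketch of how a SUT might satisfy the streaming-duration rule (the function and task loop are hypothetical, not prescribed by the spec):

import asyncio
import os

# Mirror the TCK's base timeout (defaults to 2.0 seconds).
STREAMING_TIMEOUT = float(os.getenv("TCK_STREAMING_TIMEOUT", "2.0"))

async def process_task(message_id: str) -> None:
    # Tasks triggered by resubscribe tests must stay running long enough
    # for the TCK to disconnect and reattach mid-task.
    if message_id.startswith("test-resubscribe-message-id"):
        await asyncio.sleep(2 * STREAMING_TIMEOUT)
    # ... normal task processing continues here ...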

📖 Read Full SUT Requirements →

🚀 Quick Start

1. Check A2A Compliance (Start Here!)

./run_tck.py --sut-url http://localhost:9999 --category mandatory

Result: ✅ Pass = A2A compliant, ❌ Fail = NOT A2A compliant

2. Validate Capability Honesty

./run_tck.py --sut-url http://localhost:9999 --category capabilities

Result: Ensures declared capabilities actually work (prevents false advertising)

3. Assess Production Readiness

./run_tck.py --sut-url http://localhost:9999 --category quality

Result: Identifies issues that may affect production deployment

4. Generate Comprehensive Report

./run_tck.py --sut-url http://localhost:9999 --category all --compliance-report compliance.json

Result: Complete assessment with compliance level and recommendations

📖 Command Reference

Core Commands

# Get help and understand test categories
./run_tck.py --explain

# Test specific category
./run_tck.py --sut-url URL --category CATEGORY

# Available categories:
#   mandatory    - A2A compliance validation (MUST pass)  
#   capabilities - Capability honesty check (conditional mandatory)
#   quality      - Production readiness assessment
#   features     - Optional feature completeness
#   all          - Complete validation workflow

Advanced Options

# Generate detailed compliance report
./run_tck.py --sut-url URL --category all --compliance-report report.json

# Verbose output with detailed logging
./run_tck.py --sut-url URL --category mandatory --verbose

# Generate HTML report (additional)
./run_tck.py --sut-url URL --category all --report

# Skip Agent Card fetching (for non-standard implementations)  
./run_tck.py --sut-url URL --category mandatory --skip-agent-card

βš™οΈ Environment Configuration

Using Environment Variables

The TCK supports configuration via environment variables and .env files for flexible timeout and behavior customization.

Setting up environment configuration:

# Copy the example file
cp .env.example .env

# Edit the file to customize settings
nano .env  # or your preferred editor

Available environment variables:

Variable                 Description                                       Default   Examples
TCK_STREAMING_TIMEOUT    Base timeout for SSE streaming tests (seconds)    2.0       1.0 (fast), 5.0 (slow), 10.0 (debug)

Timeout behavior (see the sketch below):

  • Short timeout: TCK_STREAMING_TIMEOUT * 0.5 - Used for basic streaming operations
  • Normal timeout: TCK_STREAMING_TIMEOUT * 1.0 - Used for standard SSE client operations
  • Async timeout: TCK_STREAMING_TIMEOUT * 1.0 - Used for asyncio.wait_for operations
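
These multipliers are trivial to compute; a minimal sketch (variable names are illustrative, not the TCK's internals):

import os

base = float(os.getenv("TCK_STREAMING_TIMEOUT", "2.0"))
short_timeout = base * 0.5   # basic streaming operations
normal_timeout = base * 1.0  # standard SSE client operations
async_timeout = base * 1.0   # asyncio.wait_for() deadlines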

Usage examples:

# Use .env file (recommended)
echo "TCK_STREAMING_TIMEOUT=5.0" > .env
./run_tck.py --sut-url URL --category capabilities

# Set directly for single run
TCK_STREAMING_TIMEOUT=1.0 ./run_tck.py --sut-url URL --category capabilities

# Debug with very slow timeouts
TCK_STREAMING_TIMEOUT=30.0 ./run_tck.py --sut-url URL --category capabilities --verbose

When to adjust timeouts:

  • Decrease (1.0): Fast CI/CD pipelines, local development
  • Increase (5.0+): Slow networks, debugging, resource-constrained environments
  • Debug (10.0+): Detailed troubleshooting, step-through debugging

🎯 Understanding Test Categories

🔴 MANDATORY Tests - Core A2A Compliance

Purpose: Validate core A2A specification requirements
Impact: Failure = NOT A2A compliant
Location: tests/mandatory/

Includes:

  • JSON-RPC 2.0 compliance (tests/mandatory/jsonrpc/)
  • A2A protocol core methods (tests/mandatory/protocol/)
  • Agent Card required fields
  • Core message/send functionality
  • Task management (get/cancel)

Example Failures:

  • test_task_history_length → SDK doesn't implement the historyLength parameter (see the request sketch below)
  • test_mandatory_fields_present → Agent Card missing required fields
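
For example, the historyLength check reduces to a tasks/get call like this sketch (using httpx; the SUT URL and task ID are placeholders):

import httpx

# Ask the SUT to return a task with history truncated to the last 2 messages.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tasks/get",
    "params": {"id": "task-123", "historyLength": 2},
}
response = httpx.post("http://localhost:9999", json=request).json()
history = response.get("result", {}).get("history", [])
assert len(history) <= 2, "SUT ignored the historyLength parameter"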

🔄 CAPABILITY Tests - Conditional Mandatory

Purpose: Validate declared capabilities work correctly
Impact: Failure = False advertising
Logic: Skip if not declared, mandatory if declared
Location: tests/optional/capabilities/

Capability Validation:

{
  "capabilities": {
    "streaming": true,         ← Must pass streaming tests
    "pushNotifications": false ← Streaming tests will skip
  }
}
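
A minimal sketch of how a client can read these declarations (the URL is a placeholder; this is not the TCK's actual skip logic):

import httpx

# Fetch the Agent Card from the well-known location and inspect capabilities.
card = httpx.get("http://localhost:9999/.well-known/agent.json").json()
if card.get("capabilities", {}).get("streaming"):
    print("streaming declared: streaming tests become mandatory")
else:
    print("streaming not declared: streaming tests will skip")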

Includes:

  • Streaming support (message/stream, tasks/resubscribe; a client sketch follows this list)
  • Push notification configuration
  • File/data modality support
  • Authentication methods
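
To see what the streaming checks exercise, here is a bare-bones message/stream client sketch. It assumes standard SSE framing over a JSON-RPC POST; the payload is abbreviated and hypothetical:

import json
import httpx

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "message/stream",
    "params": {
        "message": {
            "role": "user",
            "parts": [{"kind": "text", "text": "hello"}],
            "messageId": "msg-1",
        }
    },
}
with httpx.stream("POST", "http://localhost:9999", json=request,
                  headers={"Accept": "text/event-stream"}) as resp:
    for line in resp.iter_lines():
        if line.startswith("data: "):
            print(json.loads(line[len("data: "):]))  # status/artifact events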

πŸ›‘οΈ QUALITY Tests - Production Readiness

Purpose: Assess implementation robustness
Impact: Never blocks compliance, indicates production issues
Location: tests/optional/quality/

Quality Areas:

  • Concurrent request handling
  • Edge case robustness
  • Unicode/special character support
  • Boundary value handling
  • Error recovery and resilience

🎨 FEATURE Tests - Optional Implementation

Purpose: Measure optional feature completeness
Impact: Purely informational
Location: tests/optional/features/

Includes:

  • Convenience features
  • Enhanced error messages
  • SDK-specific capabilities
  • Optional protocol extensions

📊 Compliance Levels

🔴 NON_COMPLIANT - Not A2A Compliant

  • Criteria: Any mandatory test failure
  • Business Impact: Cannot be used for A2A integrations
  • Action: Fix mandatory failures immediately

🟡 MANDATORY - A2A Core Compliant

  • Criteria: 100% mandatory test pass rate
  • Business Impact: Basic A2A integration support
  • Suitable For: Development and testing environments
  • Next Step: Address capability validation

🟢 RECOMMENDED - A2A Recommended Compliant

  • Criteria: Mandatory (100%) + Capability (≥85%) + Quality (≥75%)
  • Business Impact: Production-ready with confidence
  • Suitable For: Staging and careful production deployment
  • Next Step: Enhance feature completeness

πŸ† FULL_FEATURED - A2A Fully Compliant

  • Criteria: Capability (β‰₯95%) + Quality (β‰₯90%) + Feature (β‰₯80%)
  • Business Impact: Complete A2A implementation
  • Suitable For: Full production deployment with confidence

📋 Compliance Report

When you run with --compliance-report, you get a JSON report containing:

{
  "summary": {
    "compliance_level": "RECOMMENDED",
    "overall_score": 87.5,
    "mandatory_score": 100.0,
    "capability_score": 90.0,
    "quality_score": 75.0,
    "feature_score": 60.0
  },
  "recommendations": [
    "βœ… Ready for staging deployment",
    "⚠️ Address 2 quality issues before production",
    "πŸ’‘ Consider implementing 3 additional features"
  ],
  "next_steps": [
    "Fix Unicode handling in task storage",
    "Improve concurrent request performance",
    "Consider implementing authentication capability"
  ]
}

🔄 CI/CD Integration

Basic CI Pipeline (Compliance Gate)

#!/bin/bash
# Block deployment if not A2A compliant
./run_tck.py --sut-url $SUT_URL --category mandatory
if [ $? -ne 0 ]; then
    echo "❌ NOT A2A compliant - blocking deployment"
    exit 1
fi
echo "βœ… A2A compliant - deployment approved"

Advanced CI Pipeline (Environment-Aware)

#!/bin/bash
# Generate compliance report and make environment-specific decisions
./run_tck.py --sut-url $SUT_URL --category all --compliance-report compliance.json

COMPLIANCE_LEVEL=$(jq -r '.summary.compliance_level' compliance.json)

case $COMPLIANCE_LEVEL in
    "NON_COMPLIANT")
        echo "❌ Not A2A compliant - blocking all deployments"
        exit 1
        ;;
    "MANDATORY")
        echo "🟑 Basic compliance - dev/test only"
        [[ "$ENVIRONMENT" == "production" ]] && exit 1
        ;;
    "RECOMMENDED")
        echo "🟒 Recommended - staging approved"
        ;;
    "FULL_FEATURED")
        echo "πŸ† Full compliance - production approved"
        ;;
esac

πŸ› οΈ Troubleshooting

Common Issues

Streaming tests skipping:

# Check Agent Card capabilities
curl $SUT_URL/.well-known/agent.json | jq .capabilities
# If streaming: false, tests will skip (this is correct!)

Quality tests failing but compliance achieved:

# This is expected - quality tests don't block compliance
# Address quality issues for production readiness

Tests not being discovered:

# Ensure proper installation
pip install -e .

# Check test discovery
pytest --collect-only tests/mandatory/

Running Individual Tests for Debugging

When debugging specific test failures, you can run individual tests with detailed output:

Run a single test with verbose output and debug information:

# Using run_tck.py with verbose mode (shows print() and logger.info() messages)
python run_tck.py --sut-url http://localhost:9999 --category capabilities --verbose-log

# Run specific test directly with pytest
python -m pytest tests/optional/capabilities/test_streaming_methods.py::test_message_stream_basic \
    --sut-url http://localhost:9999 -s -v --log-cli-level=INFO

Run all tests in a specific file:

python -m pytest tests/optional/capabilities/test_streaming_methods.py \
    --sut-url http://localhost:9999 -s -v --log-cli-level=INFO

Debug options explained:

  • -s: Shows print() statements during test execution
  • -v: Verbose test output with detailed test names and outcomes
  • --log-cli-level=INFO: Shows logger.info() and other log messages
  • --tb=short: Shorter traceback format (default in run_tck.py)

Run with different log levels:

# Show DEBUG level logs (very detailed)
python -m pytest tests/path/to/test.py --sut-url URL -s -v --log-cli-level=DEBUG

# Show only WARNING and ERROR logs
python -m pytest tests/path/to/test.py --sut-url URL -s -v --log-cli-level=WARNING

🤝 Contributing

  1. Fork the repository
  2. Follow Test Documentation Standards
  3. Add tests with proper categorization and specification references
  4. Submit pull request with clear specification citations

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.


🎯 Quick Decision Guide

Just want A2A compliance?

./run_tck.py --sut-url URL --category mandatory

Planning production deployment?

./run_tck.py --sut-url URL --category all --compliance-report report.json

Debugging capability issues?

./run_tck.py --sut-url URL --category capabilities --verbose

Want comprehensive assessment?

./run_tck.py --sut-url URL --explain  # Learn about categories first
./run_tck.py --sut-url URL --category all --compliance-report full_report.json

The A2A TCK transforms specification compliance from confusion into clarity. 🚀
