
aws-samples/sample-verified-access-jwt-private-distribution

AVA JWT Key Private Distribution

What This Is

A controlled network bridge that enables workloads running in air-gapped VPCs (no Internet Gateway, no NAT) to validate AWS Verified Access (AVA) signed JWTs.

AVA publishes its ES384 signing public keys on an internet-reachable endpoint, which workloads inside air-gapped VPCs cannot reach directly. This sample provides two pieces: a narrowly-scoped Lambda function with internet egress (the "fetcher") and a consumer-side Interface VPC Endpoint for the Lambda service. The workload invokes the fetcher over AWS PrivateLink, so no traffic leaves the AWS network from the consumer side.

The fetcher accepts only a UUID-shaped kid, constructs the AVA URL server-side, fetches the PEM, validates it is an ES384 public key, and returns it. It cannot be coerced into fetching anything else.
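
The kid check and server-side URL construction described above can be sketched roughly as follows. This is an illustration, not the handler shipped in src/fetcher/: the names region_from_arn and build_key_url are hypothetical, the UUID pattern is the one listed under Error Codes, and the URL layout (the kid appended as the path on the regional public-keys host shown in the architecture diagram) is an assumption.

```python
import re

# UUID shape, as listed for InvalidKid in the Error Codes section.
_KID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def region_from_arn(function_arn: str) -> str:
    """Return the region field of a commercial-partition Lambda function ARN."""
    # Expected layout: arn:aws:lambda:<region>:<account>:function:<name>
    parts = function_arn.split(":")
    if len(parts) < 7 or parts[:3] != ["arn", "aws", "lambda"]:
        raise ValueError("not a commercial-partition Lambda function ARN")
    return parts[3]


def build_key_url(function_arn: str, kid: object) -> str:
    """Build the public-key URL from the fetcher's own ARN and a validated kid."""
    # fullmatch anchors both ends, so trailing newlines or path segments fail.
    if not isinstance(kid, str) or _KID_RE.fullmatch(kid) is None:
        raise ValueError("InvalidKid")
    region = region_from_arn(function_arn)
    # Assumed URL layout; the caller contributes nothing except the kid.
    return f"https://public-keys.prod.verified-access.{region}.amazonaws.com/{kid}"
```

Because the host is derived from the function's own ARN and the kid must match the full pattern, a value such as "../other-path" or a full URL is rejected before any request is made.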

Architecture

flowchart LR
    subgraph consumer["Consumer Account"]
        subgraph vpc["Air-Gapped VPC · no Internet route"]
            workload["Workload
            EC2 / ECS / Lambda"]
            vpce["Lambda
            VPC Endpoint"]
        end
    end

    subgraph producer["Producer Account"]
        fetcher["Fetcher Lambda
        python3.14 · arm64"]
    end

    ava[/"AVA Public Keys
    public-keys.prod.verified-access.
    {REGION}.amazonaws.com"/]

    workload -- "InvokeFunction {kid}" --> vpce
    vpce -- "PrivateLink" --> fetcher
    fetcher -- "HTTPS GET" --> ava

Key facts:

  • The consumer VPC has no IGW and no NAT. Its only egress path for this purpose is the Lambda Interface VPC Endpoint.
  • The VPC endpoint policy restricts lambda:InvokeFunction to the fetcher's ARN. Any other Lambda ARN is denied at the endpoint.
  • The fetcher runs in the Lambda-managed network (not VPC-attached). It is the only component with internet reachability.
  • The fetcher has no function URL, no API Gateway, no ALB, and no event source mapping. Its only invocation path is lambda:InvokeFunction.
  • Cross-account trust is explicit: the fetcher's resource policy allows only the declared consumer principal ARNs.
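
The two policy-based controls in the list above could look roughly like the documents below, written here as Python dictionaries. The function names are hypothetical, and the exact statements generated by the CDK and Terraform code may differ; treat this as a sketch of the shape only.

```python
import json


def endpoint_policy(fetcher_arn: str) -> dict:
    """VPC endpoint policy: only InvokeFunction on the fetcher ARN is allowed."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": "lambda:InvokeFunction",
                "Resource": fetcher_arn,
            }
        ],
    }


def resource_policy_statement(fetcher_arn: str, consumer_arns: list[str]) -> dict:
    """Fetcher resource-policy statement: only the listed principals may invoke."""
    return {
        "Effect": "Allow",
        "Principal": {"AWS": list(consumer_arns)},
        "Action": "lambda:InvokeFunction",
        "Resource": fetcher_arn,
    }


if __name__ == "__main__":
    # Placeholder ARNs for illustration only.
    arn = "arn:aws:lambda:us-east-1:111111111111:function:fetcher"
    print(json.dumps(endpoint_policy(arn), indent=2))
```

The endpoint policy limits what can pass through the endpoint, while the resource policy limits who may invoke the function; a request has to satisfy both, plus the caller's own IAM policy.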

Scope

This sample supports the commercial AWS partition (aws) only. GovCloud (aws-us-gov), China (aws-cn), and other non-commercial partitions are out of scope.

Threat Model Summary

This system is a network bridge from an air-gapped VPC to the public internet. The design ensures it cannot be abused as a generic egress primitive or arbitrary-fetch gadget. Three primary threats drive the architecture:

| Threat | Description | Mitigation |
| --- | --- | --- |
| T1: SSRF via caller-supplied URL | A caller attempts to supply a URL or hostname to coerce the fetcher into reaching an attacker-controlled endpoint. | The URL is constructed entirely server-side from the region (parsed from the fetcher's own ARN) and the validated kid. The caller has no URL, scheme, host, or path input. |
| T4: Fetcher used as generic egress | A compromised or malicious principal invokes the fetcher to use it as an arbitrary internet fetch tool. | The fetcher has exactly one behavior: fetch the AVA public-keys URL constructed from a UUID-shaped kid. There is no "arbitrary URL" code path. The resource policy further restricts invoke to designated consumer principals. |
| T6: VPCE used to invoke other Lambdas | A workload in the consumer VPC attempts to use the Lambda VPC endpoint to invoke Lambda functions other than the fetcher. | The VPC endpoint policy scopes lambda:InvokeFunction to the fetcher ARN only. Invocation of any other ARN is denied at the endpoint. |

Additional threats (path injection via crafted kid, upstream response tampering, key staleness, log leakage, cross-account misconfiguration) are addressed in the design but are secondary to the three above.

Deployment

This sample ships two equivalent IaC implementations. Choose whichever matches your toolchain:

| Option | Directory | Docs |
| --- | --- | --- |
| CDK (Python) | infra/ | Below |
| Terraform | terraform/ | terraform/README.md |

Both share the same handler code in src/fetcher/.

CDK Deployment

Prerequisites

  • AWS CDK v2 installed
  • uv installed (Python project manager)
  • Two AWS accounts (or one account playing both roles): a producer account for the fetcher and a consumer account for the air-gapped VPC workload
  • AWS CLI profiles configured for each account with credentials that have sufficient permissions to deploy CDK stacks

1. Install dependencies

uv sync

2. Bootstrap CDK in both accounts

Both accounts must be bootstrapped in the target region before the first deployment.

# Producer account
uv run cdk bootstrap aws://<PRODUCER_ACCOUNT_ID>/<REGION> \
  --profile <PRODUCER_PROFILE>

# Consumer account
uv run cdk bootstrap aws://<CONSUMER_ACCOUNT_ID>/<REGION> \
  --profile <CONSUMER_PROFILE>

3. Deploy the FetcherStack in the producer account

CDK resolves the account ID from the profile's credentials.

uv run cdk deploy FetcherStack \
  --profile <PRODUCER_PROFILE> \
  --region <REGION> \
  --context consumer_principal_arns='["arn:aws:iam::<CONSUMER_ACCOUNT_ID>:role/<CONSUMER_ROLE_NAME>"]'

Capture the FetcherFunctionArn output from the deployment.

4. Deploy the ConsumerStack in the consumer account

uv run cdk deploy ConsumerStack \
  --profile <CONSUMER_PROFILE> \
  --region <REGION> \
  --context fetcher_function_arn="<FETCHER_FUNCTION_ARN_FROM_STEP_3>" \
  --context vpc_id="<YOUR_VPC_ID>" \
  --context vpc_cidr="<YOUR_VPC_CIDR>" \
  --context subnet_ids='["<SUBNET_1>","<SUBNET_2>"]'

This creates the Lambda Interface VPC Endpoint (with a policy scoped to the fetcher ARN) and a security group allowing HTTPS (443) from the VPC CIDR to the endpoint ENIs.

Important: The ConsumerStack does not create an IAM role. Your workload's existing execution role must include the following policy statement to invoke the fetcher:

{
  "Effect": "Allow",
  "Action": "lambda:InvokeFunction",
  "Resource": "<FETCHER_FUNCTION_ARN_FROM_STEP_3>"
}

Consumer Integration Snippet

The following Python snippet shows how a workload inside the air-gapped VPC invokes the fetcher and verifies an AVA JWT. This is a reference only — it is not deployed as infrastructure.

import json
import os
from collections import OrderedDict

import boto3
import jwt

FETCHER_ARN     = os.environ["AVA_FETCHER_ARN"]
EXPECTED_SIGNER = os.environ["AVA_INSTANCE_ARN"]

_lambda = boto3.client("lambda")

# Process-local LRU cache: kid -> PEM string.
# Bounded to _MAX_CACHE_SIZE entries so that long-running processes do not
# accumulate stale keys indefinitely after AVA rotates its signing key.
# Override via the AVA_PEM_CACHE_SIZE environment variable if needed.
_MAX_CACHE_SIZE = int(os.environ.get("AVA_PEM_CACHE_SIZE", "10"))
_cache: OrderedDict[str, str] = OrderedDict()


def _resolve_pem(kid: str) -> str:
    """Resolve a PEM for the given kid, using a bounded local cache."""
    cached = _cache.get(kid)
    if cached is not None:
        _cache.move_to_end(kid)  # mark as recently used
        return cached

    resp = _lambda.invoke(
        FunctionName=FETCHER_ARN,
        InvocationType="RequestResponse",
        Payload=json.dumps({"kid": kid}).encode(),
    )
    body = json.loads(resp["Payload"].read())

    if "error" in body:
        raise RuntimeError(f"fetcher error: {body['error']}")

    pem = body["pem"]
    _cache[kid] = pem
    if len(_cache) > _MAX_CACHE_SIZE:
        _cache.popitem(last=False)
    return pem


def verify_ava_jwt(token: str) -> dict:
    """Verify an AVA-signed JWT using the fetcher for key resolution."""
    # 1. Extract kid from the JWT header (no signature check yet).
    header = jwt.get_unverified_header(token)
    kid = header["kid"]

    # 2. Resolve PEM via the fetcher (cache hit or remote invoke).
    pem = _resolve_pem(kid)

    # 3. Verify signature. jwt.decode enforces exp by default.
    claims = jwt.decode(
        token,
        pem,
        algorithms=["ES384"],
        options={"require": ["exp", "iss"]},
    )

    # 4. Verify the signer claim matches the expected AVA instance ARN.
    if claims.get("signer") != EXPECTED_SIGNER:
        raise jwt.InvalidIssuerError("signer ARN mismatch")

    return claims

Key points:

  • kid is extracted from the JWT header via jwt.get_unverified_header.
  • The cache is keyed by kid alone. Each kid identifies a unique signing key — the fetcher already binds to a specific region via its own ARN, so there is no ambiguity.
  • The cache is bounded to _MAX_CACHE_SIZE entries using an OrderedDict as a simple LRU. This prevents unbounded memory growth in long-running processes as AVA rotates signing keys (see below).
  • On cache miss, the fetcher is invoked via boto3 with InvocationType="RequestResponse" and a payload of {"kid": kid}.
  • JWT verification uses jwt.decode with algorithms=["ES384"].
  • The signer claim is verified against the AVA_INSTANCE_ARN environment variable, as required by the AWS Verified Access documentation.
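
One case the snippet above does not cover is an unhandled exception inside the fetcher itself. The Lambda Invoke API then still returns a response, but sets a FunctionError field, and the payload carries runtime error details instead of the fetcher's structured error object. A minimal sketch of a stricter parser follows; the names parse_fetcher_response and FetcherError, and the client-side code "MalformedResponse", are hypothetical.

```python
import io
import json


class FetcherError(RuntimeError):
    """Raised when the fetcher reports an error or did not run cleanly."""

    def __init__(self, code: str, message: str = "") -> None:
        super().__init__(f"{code}: {message}" if message else code)
        self.code = code


def parse_fetcher_response(resp: dict) -> str:
    """Return the PEM from a Lambda invoke response, or raise FetcherError."""
    body = json.loads(resp["Payload"].read())

    # FunctionError is set by Lambda when the handler raised or crashed.
    if resp.get("FunctionError"):
        detail = body.get("errorType", "") if isinstance(body, dict) else ""
        raise FetcherError("FunctionError", str(detail))

    if not isinstance(body, dict):
        raise FetcherError("MalformedResponse")
    if "error" in body:
        raise FetcherError(str(body["error"]), str(body.get("message", "")))

    pem = body.get("pem")
    if not isinstance(pem, str):
        raise FetcherError("MalformedResponse")
    return pem


# Stand-in for a boto3 response; io.BytesIO mimics the payload's read() method.
fake = {
    "StatusCode": 200,
    "Payload": io.BytesIO(b'{"error": "InvalidKid", "message": "bad kid"}'),
}
try:
    parse_fetcher_response(fake)
except FetcherError as exc:
    print(exc.code)
```

Keeping the error code on the exception lets the caller decide whether to retry, alarm, or reject the token without parsing message strings.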

Key rotation behavior

AVA rotates its signing keys approximately every 7 days by issuing a new kid. The old key remains valid during an overlap period so that in-flight JWTs can still be verified. Because each kid maps to a unique, immutable public key, cached entries never become incorrect — they just become unused once AVA stops signing with that kid.

For long-running processes (web servers, ECS tasks, daemons), the bounded LRU cache ensures that old entries are evicted naturally as new kid values arrive. The default cache size of 10 is generous — under normal rotation cadence there are at most 2 active kid values at any time (current + previous overlap). Set the AVA_PEM_CACHE_SIZE environment variable to override the default if your workload handles traffic from multiple AVA instances or has unusual rotation patterns.

Short-lived processes (Lambda functions, batch jobs) do not need cache pruning because the cache is discarded when the process exits.
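
The eviction behavior described above can be observed in isolation with a small, self-contained sketch. The BoundedPemCache class is hypothetical; it mirrors the OrderedDict logic of the integration snippet and uses placeholder strings instead of real kids and PEMs.

```python
from collections import OrderedDict
from typing import Optional


class BoundedPemCache:
    """Least-recently-used cache mapping kid -> PEM, capped at max_size entries."""

    def __init__(self, max_size: int = 10) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._max_size = max_size
        self._entries: OrderedDict[str, str] = OrderedDict()

    def get(self, kid: str) -> Optional[str]:
        pem = self._entries.get(kid)
        if pem is not None:
            self._entries.move_to_end(kid)  # mark as recently used
        return pem

    def put(self, kid: str, pem: str) -> None:
        self._entries[kid] = pem
        self._entries.move_to_end(kid)
        if len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # drop the least recently used

    def __contains__(self, kid: object) -> bool:
        return kid in self._entries

    def __len__(self) -> int:
        return len(self._entries)


# With room for two entries, a third kid pushes out whichever was used least recently.
cache = BoundedPemCache(max_size=2)
cache.put("kid-a", "pem-a")
cache.put("kid-b", "pem-b")
cache.get("kid-a")           # kid-a is now the most recently used
cache.put("kid-c", "pem-c")  # evicts kid-b
print("kid-b" in cache, "kid-a" in cache)
```

A kid that has been evicted is simply fetched again on its next use, so a cache that is too small costs extra invocations rather than correctness.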

Error Codes

The fetcher returns structured error responses as {"error": "<code>", "message": "<detail>"}.

| Error Code | When Returned |
| --- | --- |
| InvalidKid | The kid field is missing, is not a string, or does not match the UUID shape ([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}). Also returned when the payload is not a JSON object. |
| ExtraFields | The payload contains keys other than kid. |
| UpstreamNotFound | The AVA public-keys endpoint returned HTTP 404 for the requested kid. The key does not exist or has been rotated out. |
| UpstreamError | The AVA public-keys endpoint returned a non-2xx, non-404 HTTP status. Retry with exponential backoff (max 2 retries). |
| UpstreamTimeout | The HTTPS request to the AVA public-keys endpoint timed out. Retry with exponential backoff (max 2 retries). |
| InvalidPem | The upstream response did not decode as UTF-8 or did not conform to the PEM SubjectPublicKeyInfo envelope. This is alarm-worthy: the upstream endpoint may be misbehaving. Do not retry. |
| InvalidKeyType | The upstream response parsed as a valid PEM public key but the algorithm/curve is not ES384 (SECP384R1). This is alarm-worthy. Do not retry. |

Error messages never include caller-supplied input values.
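
The retry guidance in the table can be encoded on the consumer side along these lines. This is a sketch under stated assumptions: call_with_retry and its parameters are hypothetical, invoke stands for any callable that returns the fetcher's decoded JSON response, and the base delay is an arbitrary choice.

```python
import time
from typing import Callable

# Codes the table marks as retryable; every other code is returned immediately.
RETRYABLE = frozenset({"UpstreamError", "UpstreamTimeout"})
MAX_RETRIES = 2


def call_with_retry(
    invoke: Callable[[], dict],
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call invoke(), retrying retryable fetcher errors with exponential backoff."""
    attempt = 0
    while True:
        body = invoke()
        code = body.get("error")
        # Success, a non-retryable error, or an exhausted retry budget ends the loop.
        if code not in RETRYABLE or attempt >= MAX_RETRIES:
            return body
        sleep(base_delay * (2 ** attempt))  # 0.5 s, then 1.0 s with the defaults
        attempt += 1
```

The sleep parameter exists only so the delay can be replaced in tests. Codes such as InvalidPem and InvalidKeyType fall through on the first attempt, which matches the "do not retry" guidance and leaves alarming to the caller.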

License

See LICENSE.
