| 1 | +# :material-auto-fix: Asset Transformations Configuration |
| 2 | + |
| 3 | +The `transformations` section of the OWASP Amass `config.yaml` file is one of the most powerful parts of the collection engine. It controls **how data flows through the system**, defines **which types of assets can be transformed into others**, and sets **constraints like freshness (TTL), trustworthiness (confidence), and urgency (priority)** on those transformations. |
| 4 | + |
| 5 | +This section empowers users to **customize and optimize their data collection workflows** based on their goals, risk tolerance, and update requirements. |
| 6 | + |
| 7 | +## :material-help-circle-outline: What Are Transformations? |
| 8 | + |
| 9 | +In Amass, the data collection process is modeled as a pipeline of **asset transformations**. Each observed asset (such as an IP address, domain, or ASN) can trigger **handlers**, which attempt to enrich, correlate, or expand on that asset by transforming it into new assets. |
| 10 | + |
| 11 | +For example: |
| 12 | +- A discovered `FQDN` might trigger a handler that looks up its DNS records (`FQDN->DNS`) |
| 13 | +- A known `AutonomousSystem` could be transformed into its RDAP metadata (`AutonomousSystem->RDAP`) |
| 14 | +- A `Product` seen on a web service might lead to discovery of its `ProductRelease` metadata |
| 15 | + |
| 16 | +The `transformations` section defines: |
| 17 | +- **Which transformations are allowed** |
| 18 | +- **How frequently** each transformation should be retried (`ttl`) |
| 19 | +- **How much confidence** the system should have in the results (`confidence`) |
| 20 | +- **How important** the transformation is (`priority`) |
| 21 | + |
| 22 | +## :material-cog-outline: Configuration Overview |
| 23 | + |
| 24 | +Here's the structure of a typical configuration: |
| 25 | + |
| 26 | +```yaml |
| 27 | +options: |
| 28 | +  default_transform_values: |
| 29 | +    ttl: 1440 # in minutes (default is 1 day) |
| 30 | +    confidence: 50 # default confidence threshold (0–100%) |
| 31 | +    priority: 5 # default priority level (1=low, 10=high) |
| 32 | + |
| 33 | +transformations: |
| 34 | +  FQDN->DNS: |
| 35 | +    ttl: 1440 |
| 36 | +  AutonomousSystem->RDAP: |
| 37 | +    ttl: 43200 # 30 days |
| 38 | +  Identifier->GLEIF: |
| 39 | +    ttl: 43200 |
| 40 | +  Product->ALL: |
| 41 | +    ttl: 10080 # 7 days |
| 42 | +``` |
| 43 | + |
| 44 | +### :gear: `default_transform_values` |
| 45 | + |
| 46 | +This section defines fallback values used when no custom values are given for a transformation. |
| 47 | + |
| 48 | +| Key | Type | Description | |
| 49 | +| ------------ | ------- | --------------------------------------------------------------------------------------------------------------------------- | |
| 50 | +| `ttl` | integer | Time-to-live (in minutes) for data freshness. Once it expires, the data is fetched again from the source instead of being reused. | |
| 51 | +| `confidence` | integer | Minimum confidence (0–100%) required to accept the result of a transformation. | |
| 52 | +| `priority` | integer | Priority score (1=lowest, 10=highest) that may influence queue ordering in some future extensions. | |
| 53 | + |
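The fallback behavior described above can be illustrated with a short Python sketch. It is not Amass's internal code: the `effective_values` helper and the `DEFAULTS` dictionary are hypothetical names that simply mirror the YAML keys, and the sketch assumes that any key missing from a transformation entry takes the value from `default_transform_values`.

```python
# Illustrative sketch only; names are hypothetical and mirror the YAML keys.
DEFAULTS = {"ttl": 1440, "confidence": 50, "priority": 5}


def effective_values(overrides, defaults=DEFAULTS):
    """Return a copy of the defaults with per-transformation overrides applied."""
    merged = dict(defaults)
    # An entry with no body (for example `IPAddress->ALL:`) loads as a YAML
    # null, which Python YAML loaders represent as None.
    merged.update(overrides or {})
    return merged


rdap = effective_values({"ttl": 43200})  # only ttl is overridden
bare = effective_values(None)            # nothing overridden
```

Here `rdap` keeps the default `confidence` and `priority` while using the 30-day `ttl`, and `bare` is identical to the defaults.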
| 54 | +## :hammer_and_wrench: Defining Transformations |
| 55 | + |
| 56 | +Each transformation follows this format: |
| 57 | + |
| 58 | +```yaml |
| 59 | +<SourceAssetType>-><TargetAssetType|Plugin Name>: |
| 60 | +  ttl: <int> # Optional override of default |
| 61 | +  confidence: <int> # Optional override |
| 62 | +  priority: <int> # Optional override |
| 63 | +``` |
| 64 | + |
| 65 | +Use `->ALL` as a wildcard to enable **all available transformations from a given source asset type**. |
| 66 | + |
| 67 | +Example: |
| 68 | + |
| 69 | +```yaml |
| 70 | +FQDN->ALL: |
| 71 | +``` |
| 72 | + |
| 73 | +This enables all known FQDN transformations (such as `FQDN->IPAddress` and `FQDN->DomainRecord`). |
| 74 | + |
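As a rough illustration of the key format and the `->ALL` wildcard, the following Python sketch parses transformation keys and checks whether a given source/target pair is enabled. The `parse_key` and `is_enabled` helpers are hypothetical, and the sketch assumes that a pair is permitted only when it is listed explicitly or covered by the source type's wildcard; the actual engine may apply additional rules.

```python
# Illustrative sketch only; helper names are hypothetical.
def parse_key(key):
    """Split a key such as 'FQDN->DNS' into ('FQDN', 'DNS')."""
    source, sep, target = key.partition("->")
    source, target = source.strip(), target.strip()
    if not sep or not source or not target:
        raise ValueError(f"expected '<Source>-><Target>', got {key!r}")
    return source, target


def is_enabled(transformations, source, target):
    """True if the exact pair or the source's ->ALL wildcard is configured."""
    pairs = {parse_key(key) for key in transformations}
    return (source, target) in pairs or (source, "ALL") in pairs


# Same shape as the YAML mapping; a key with no body loads as None.
config = {"FQDN->DNS": {"ttl": 1440}, "IPAddress->ALL": None}
```

With this `config`, `is_enabled(config, "IPAddress", "RDAP")` is true because of the wildcard, while `is_enabled(config, "FQDN", "RDAP")` is false because only `FQDN->DNS` is listed.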
| 75 | +## :page_facing_up: Example Config Breakdown |
| 76 | + |
| 77 | +```yaml |
| 78 | +transformations: |
| 79 | +  FQDN->DNS: |
| 80 | +    ttl: 1440 # 1 day |
| 81 | +  FQDN->DomainRecord: |
| 82 | +    ttl: 43200 # 30 days |
| 83 | +  IPAddress->ALL: |
| 84 | +  TLSCertificate->ALL: |
| 85 | +    ttl: 10080 # 7 days |
| 86 | +``` |
| 87 | + |
| 88 | +### Explanation: |
| 89 | + |
| 90 | +* **FQDN->DNS**: Amass resolves DNS for fully qualified domain names and reuses the stored results for one day before querying again. |
| 91 | +* **FQDN->DomainRecord**: Domain ownership records are more stable, so these are refreshed only after 30 days. |
| 92 | +* **IPAddress->ALL**: All available transformations for IPs are enabled (e.g., geolocation, RDAP, reverse DNS). No values are overridden, so the defaults apply. |
| 93 | +* **TLSCertificate->ALL**: Certificates fetched from services are considered fresh for 7 days, then fetched again. |
| 94 | + |
| 95 | +## :material-refresh: What is `ttl`? |
| 96 | + |
| 97 | +`ttl` (Time To Live) controls **how often a transformation can be retried** from the original source: |
| 98 | + |
| 99 | +* If the data is **still fresh** (within TTL), Amass will use the previously stored result from the database. |
| 100 | +* If the TTL has **expired**, the handler will attempt to **re-run the transformation**, such as querying the data source again. |
| 101 | + |
| 102 | +This keeps Amass from repeating queries unnecessarily and limits the bandwidth and load placed on data sources. |
| 103 | + |
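The freshness rule can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the engine's implementation: the `is_fresh` helper is hypothetical, timestamps are assumed to be timezone-aware, and data whose age equals the TTL exactly is treated as expired.

```python
from datetime import datetime, timedelta, timezone


# Illustrative sketch only; `is_fresh` is a hypothetical helper.
def is_fresh(last_fetched, ttl_minutes, now=None):
    """True while the stored result is younger than its TTL (in minutes)."""
    now = now or datetime.now(timezone.utc)
    return now - last_fetched < timedelta(minutes=ttl_minutes)


now = datetime(2025, 1, 10, 12, 0, tzinfo=timezone.utc)
fetched_2h_ago = now - timedelta(hours=2)
fetched_2d_ago = now - timedelta(days=2)
```

With a `ttl` of `1440`, the result fetched two hours ago would be reused and the one fetched two days ago would be queried again; with a `ttl` of `43200`, both would still count as fresh.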
| 104 | +## :material-shield-check: What is `confidence`? |
| 105 | + |
| 106 | +`confidence` helps Amass **filter out noisy or speculative results**. Some plugins or handlers may return results with associated confidence scores. |
| 107 | + |
| 108 | +* If a handler returns a result with `confidence: 40` and your threshold is `50`, **the result is ignored**. |
| 109 | +* Use this to reduce false positives or to tune behavior in environments where high data quality is crucial. |
| 110 | + |
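A minimal Python sketch of threshold filtering follows. The `accept` helper, the result dictionaries, and the `example.com` names are hypothetical; the sketch assumes each result carries a numeric score and that a score equal to the threshold is accepted, in line with the threshold being a minimum.

```python
# Illustrative sketch only; the result shape and helper name are hypothetical.
def accept(results, threshold):
    """Keep results whose confidence meets or exceeds the threshold."""
    return [r for r in results if r["confidence"] >= threshold]


results = [
    {"asset": "a.example.com", "confidence": 40},
    {"asset": "b.example.com", "confidence": 50},
    {"asset": "c.example.com", "confidence": 90},
]
kept = [r["asset"] for r in accept(results, 50)]
```

Raising the threshold trades coverage for precision: at `50` the first result is dropped, and at `70` only the last one would remain.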
| 111 | +## :material-star: What is `priority`? |
| 112 | + |
| 113 | +The `priority` value is a **relative score (1–10)** indicating how important a transformation is. The engine does not strictly enforce it today, but recording it allows **future prioritization of more urgent or valuable tasks**, such as scanning attack surfaces or refreshing high-risk domains. |
| 114 | + |
| 115 | +## :material-clipboard-list-outline: Tips and Best Practices |
| 116 | + |
| 117 | +* ✅ Use `->ALL` to simplify enabling all known transformations for an asset type. |
| 118 | +* ✅ Use higher TTLs (e.g., 30+ days) for records that rarely change (e.g., `RDAP`, `GLEIF`, `DomainRecord`). |
| 119 | +* ⚠️ Keep `ttl` low (e.g., 60–1440 min) for time-sensitive records like DNS, services, or IP geolocation. |
| 120 | +* ✅ Set `confidence` thresholds higher (e.g., 70–90) in production pipelines where trust is critical. |
| 121 | +* ✅ Consider adjusting `priority` for critical infrastructure or high-value assets. |
| 122 | + |
| 123 | +## :material-rocket-launch: Summary |
| 124 | + |
| 125 | +The `transformations` section of the Amass configuration lets users **shape the intelligence collection process**, optimize for **freshness vs. efficiency**, and **control data quality** through TTLs and confidence scoring. |
| 126 | + |
| 127 | +It is a key part of how Amass turns passive and active discoveries into structured asset graphs that can drive attack surface monitoring, red teaming, or asset attribution. |
| 128 | + |
| 129 | +--- |
| 130 | + |
| 131 | +For a full list of supported asset types, refer to the [Open Asset Model documentation](../open_asset_model/index.md). |
| 132 | + |
| 133 | +*© 2025 Jeff Foley — Licensed under Apache 2.0.* |