Collects resource counts (EC2, Lambda, ECS tasks, EKS nodes/pods, RDS, ElastiCache, ALB/NLB, NAT Gateways, SQS, SNS) and CloudWatch metric time series across AWS accounts. Produces a CSV, a text summary, and a JSON summary suitable for handing off to commercial/sizing teams.
This section is for the AI agent running the sizing tool on behalf of a customer. Follow these steps in order. Each step tells you exactly what to do and when to ask the customer for input.
Try the approaches below in order until one works. Do not ask the customer which approach to use — just try them and move on.
a. Run `aws sts get-caller-identity` to check if credentials already exist.
b. Run `python discover_accounts.py` — it auto-detects Organizations, SSO,
or named CLI profiles and writes accounts.json.
c. If discovery finds zero accounts, check for an AWS SSO instance:
aws sso-admin list-instances --region <region>
for each of us-east-1, eu-west-1, us-west-2, eu-central-1.
If an instance is found, configure an SSO profile in ~/.aws/config using
the identity store ID as the start URL prefix
(https://<identity-store-id>.awsapps.com/start), then run `aws sso login`.
After login, use `aws sso list-accounts` and `aws sso list-account-roles`
to discover account IDs and role names, then create a profile per account.
d. If none of the above work, ask the customer for credentials or to run
`aws configure` before proceeding.
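Steps a and c above can be scripted. The following is a minimal sketch, assuming the AWS CLI is on the PATH; the helper names (`run_aws`, `current_identity`, `find_sso_instance`) are hypothetical and not part of the sizing tool.

```python
import json
import subprocess

# Regions to probe for an SSO instance, as listed in step c.
SSO_REGIONS = ["us-east-1", "eu-west-1", "us-west-2", "eu-central-1"]


def run_aws(args):
    """Run an AWS CLI command; return its parsed JSON output, or None on failure."""
    try:
        proc = subprocess.run(
            ["aws", *args, "--output", "json"],
            capture_output=True, text=True, timeout=60,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return None  # CLI not installed, or the call hung
    if proc.returncode != 0:
        return None  # no credentials, access denied, etc.
    try:
        return json.loads(proc.stdout)
    except json.JSONDecodeError:
        return None


def current_identity(run=run_aws):
    """Step a: return the caller's ARN if credentials already work, else None."""
    result = run(["sts", "get-caller-identity"])
    return result.get("Arn") if result else None


def find_sso_instance(run=run_aws, regions=SSO_REGIONS):
    """Step c: return (region, identity_store_id) for the first SSO instance found."""
    for region in regions:
        result = run(["sso-admin", "list-instances", "--region", region])
        for instance in (result or {}).get("Instances", []):
            return region, instance["IdentityStoreId"]
    return None
```

The `run` parameter is injectable so the probing logic can be exercised without real AWS access. If `current_identity()` returns `None`, move on to step b; `find_sso_instance()` is only needed when discovery finds zero accounts.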
Once AWS access is established:

    python discover_accounts.py --output accounts.json

Review the output. If the customer asked to scope to specific accounts or
exclude dev/sandbox, edit accounts.json before proceeding.
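Editing `accounts.json` by hand works for a few accounts; for larger estates a small filter may help. This sketch assumes the file is a JSON list of objects and that each object carries a `"name"` field. Both are assumptions about the schema, so check the real file first. The function names are hypothetical.

```python
import json


def filter_accounts(accounts, exclude_keywords=("dev", "sandbox")):
    """Drop accounts whose name contains any excluded keyword (case-insensitive)."""
    kept = []
    for account in accounts:
        # "name" is an assumed key; entries without it are always kept.
        name = str(account.get("name", "")).lower()
        if not any(keyword in name for keyword in exclude_keywords):
            kept.append(account)
    return kept


def rewrite_accounts_file(path="accounts.json", exclude_keywords=("dev", "sandbox")):
    """Filter the file in place and return how many accounts were removed."""
    with open(path) as fh:
        accounts = json.load(fh)
    kept = filter_accounts(accounts, exclude_keywords)
    with open(path, "w") as fh:
        json.dump(kept, fh, indent=2)
    return len(accounts) - len(kept)
```

Substring matching is deliberately crude (`"dev"` also matches `"devices"`), so confirm the list of removed accounts with the customer before running the sizing script.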
- If `accounts.json` entries have a `"profile"` key, use `--profiles`:

      python collect_aws_sizing.py \
        --profiles <profile1> <profile2> ... \
        --regions <regions> --interval 300 --workers 10 --output sizing_results

- If using cross-account role assumption, use `--role-name` + `--accounts-file`:

      python collect_aws_sizing.py \
        --role-name CoralogixSizingRole \
        --accounts-file accounts.json \
        --regions <regions> --interval 300 --workers 20 --output sizing_results

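The choice between the two modes can be expressed as a small helper that builds the command line. This is a sketch only: `build_command` is a hypothetical name, the accounts are assumed to be a list of dicts, and the rule "use `--profiles` only when every entry has a profile" is an assumption rather than documented behaviour. The flag names and worker counts are taken from the two commands above.

```python
def build_command(accounts, regions, role_name="CoralogixSizingRole",
                  accounts_file="accounts.json", output="sizing_results"):
    """Return the collect_aws_sizing.py argument list for the given accounts."""
    cmd = ["python", "collect_aws_sizing.py"]
    profiles = [a["profile"] for a in accounts if a.get("profile")]
    if profiles and len(profiles) == len(accounts):
        # Every entry carries a profile: scan with each profile's own credentials.
        cmd += ["--profiles", *profiles, "--workers", "10"]
    else:
        # Otherwise fall back to cross-account role assumption.
        cmd += ["--role-name", role_name,
                "--accounts-file", accounts_file, "--workers", "20"]
    if regions:
        # Omitting --regions lets the script use its default region set.
        cmd += ["--regions", *regions]
    cmd += ["--interval", "300", "--output", output]
    return cmd
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`.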
If the customer specified regions, use those. Otherwise, start with a broad
default set and let the script scan all of them. The script defaults to 16
common regions if --regions is omitted. For a faster initial run, start with
the regions where the customer's workloads are deployed (ask if unsure).
Run the sizing script and check the output:
- If all account-region pairs show as SKIPPED, the IAM role is missing.
  Deploy `iam_sizing_role.yaml` via CloudFormation into the accounts, or switch to `--profiles` mode.
- If some accounts are skipped, note them in the report; the role may still be deploying or an SCP may be blocking access.
- If the run succeeds, review `sizing_results_summary.txt` and confirm the resource counts look reasonable for the customer's estate.
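The three outcomes above amount to a simple decision rule. The sketch below assumes the per-pair results have already been loaded into a dict mapping `(account_id, region)` to `"OK"` or `"SKIPPED"`; that structure, the status strings, and the returned labels are hypothetical, since the exact layout of the script's output is not shown here.

```python
def triage(statuses):
    """Map per-(account, region) statuses to the next step described above.

    Returns (action, skipped_account_ids).
    """
    if not statuses:
        return "no-data", []
    # An account counts as skipped if any of its regions was skipped.
    skipped = sorted({acct for (acct, _region), s in statuses.items()
                      if s == "SKIPPED"})
    if all(s == "SKIPPED" for s in statuses.values()):
        return "deploy-role-or-use-profiles", skipped
    if skipped:
        return "note-skipped-in-report", skipped
    return "review-summary", []
```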
After a successful run, build a canvas report (.canvas.tsx) from the live
data in sizing_results_summary.json. The report should include:
- Headline stats (hosts, containers, pods, nodes, lambdas, total resources)
- Full resource inventory table (show all types, even zeros)
- Per-account breakdown (resource counts and metric series per account)
- CloudWatch namespace breakdown with a bar chart
- Sizing calculation at both 300s and 60s intervals
- List of output files generated
After the run, do not leave SSO tokens, credentials, or account lists lying
around unless the customer wants them for future runs. If you created
temporary profiles in ~/.aws/config, mention that to the customer.

    pip install boto3 tqdm

`discover_accounts.py` automatically finds your accounts using the best
available method:

    python discover_accounts.py

It tries, in order:
- AWS Organizations: if you have `organizations:ListAccounts` access
- AWS SSO (Identity Center): reads your cached SSO token after `aws sso login`
- Named CLI profiles: reads `~/.aws/config` and calls `sts:GetCallerIdentity` per profile
You can also target specific profiles:

    python discover_accounts.py --profiles default prod staging
    python discover_accounts.py --from-profiles   # auto-discover all profiles

Output: `accounts.json`. Feed it to the sizing script with `--accounts-file`.
No IAM role deployment needed. Each profile's own credentials scan that account.

    python collect_aws_sizing.py \
      --profiles default prod staging \
      --regions eu-west-1 us-east-1 \
      --interval 300 \
      --workers 10 \
      --output sizing_results

Use CloudFormation StackSets to deploy `iam_sizing_role.yaml` across all
accounts in scope. Pass your management/tooling account ID as `TrustedAccountId`.

    aws cloudformation create-stack-set \
      --stack-set-name CoralogixSizing \
      --template-body file://iam_sizing_role.yaml \
      --parameters ParameterKey=TrustedAccountId,ParameterValue=<YOUR_ACCOUNT_ID> \
      --capabilities CAPABILITY_NAMED_IAM \
      --permission-model SERVICE_MANAGED \
      --auto-deployment Enabled=true,RetainStacksOnAccountRemoval=false

Then deploy to all OUs or accounts:

    aws cloudformation create-stack-instances \
      --stack-set-name CoralogixSizing \
      --deployment-targets OrganizationalUnitIds=<ROOT_OU_ID> \
      --regions us-east-1

    python discover_accounts.py --output accounts.json
    # optionally edit accounts.json to remove sandboxes/dev

    python collect_aws_sizing.py \
      --role-name CoralogixSizingRole \
      --accounts-file accounts.json \
      --interval 300 \
      --workers 20 \
      --output sizing_results

| Resource | API | Metric source |
|---|---|---|
| EC2 Instances | `ec2:DescribeInstances` | `AWS/EC2` |
| Lambda Functions | `lambda:ListFunctions` | `AWS/Lambda` |
| ECS Running Tasks | `ecs:DescribeClusters` (statistics) | `AWS/ECS` |
| EKS Nodes | `eks:DescribeNodegroup` (desiredSize) | `AWS/EKS` |
| EKS Running Pods | `cloudwatch:GetMetricData` (ContainerInsights) | `ContainerInsights` |
| RDS Instances | `rds:DescribeDBInstances` | `AWS/RDS` |
| ElastiCache Nodes | `elasticache:DescribeCacheClusters` | `AWS/ElastiCache` |
| Load Balancers | `elbv2:DescribeLoadBalancers` | `AWS/ApplicationELB`, `AWS/NetworkELB` |
| NAT Gateways | `ec2:DescribeNatGateways` | `AWS/NATGateway` |
| SQS Queues | `sqs:ListQueues` | `AWS/SQS` |
| SNS Topics | `sns:ListTopics` | `AWS/SNS` |

EKS pod counts require ContainerInsights to be enabled on the cluster. If it is not enabled, the pod count falls back to 0.
| File | Contents |
|---|---|
| `sizing_results_resources.csv` | One row per account/region: all resource counts (hosts, pods, nodes, etc.) |
| `sizing_results_metrics.csv` | One row per account/region/namespace: CloudWatch metric series detail |
| `sizing_results_summary.txt` | Human-readable report: resource inventory + metric series |
| `sizing_results_summary.json` | Machine-readable totals with per-account breakdown |

Daily metric samples = Total metric time series × (86,400 ÷ collection interval)
300s interval → 288 data points/day
60s interval → 1,440 data points/day
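As a worked example of the formula above, the sketch below computes daily samples at both intervals. The figure of 12,000 metric time series is purely illustrative and does not come from any real run; `daily_samples` is a hypothetical helper name.

```python
SECONDS_PER_DAY = 86_400


def daily_samples(total_series, interval_seconds):
    """Daily metric samples = total series x data points per series per day."""
    points_per_day = SECONDS_PER_DAY // interval_seconds
    return total_series * points_per_day


total_series = 12_000  # illustrative value only

print(daily_samples(total_series, 300))  # 12,000 x 288   = 3456000
print(daily_samples(total_series, 60))   # 12,000 x 1,440 = 17280000
```

Moving from a 300s to a 60s interval multiplies the daily sample count by five, which is why the report shows both figures.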
| Accounts | Regions | Recommended workers |
|---|---|---|
| < 50 | any | 10 |
| 50–200 | < 10 | 20 |
| 200–600 | any | 30–40 |
| 600+ | any | 40–50 |
AWS API rate limits are low but apply per account (for example, CloudWatch ListMetrics = 25 TPS).
Workers are spread across accounts, so parallelism is safe at these levels.
If you see throttling errors in the detail CSV, reduce --workers.
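The worker table can be encoded as a lookup for use in automation. This is a sketch under two assumptions that the table leaves open: the boundary values 200 and 600 are assigned to the higher band, and the 50–200 band returns 20 workers regardless of region count (the table only covers fewer than 10 regions there). `recommended_workers` is a hypothetical name.

```python
def recommended_workers(accounts):
    """Return the (low, high) recommended --workers range for an account count."""
    if accounts < 50:
        return (10, 10)
    if accounts < 200:
        # The table specifies this band only for < 10 regions; reused here
        # for larger region counts as an assumption.
        return (20, 20)
    if accounts < 600:
        return (30, 40)
    return (40, 50)
```

Starting at the low end of the range and increasing only if no throttling errors appear is the more conservative choice.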
Accounts appear as SKIPPED in the detail CSV when:
- The role does not exist in that account yet (StackSet still deploying)
- The account is suspended or restricted
- STS assume-role is blocked by an SCP
Check sizing_results_summary.txt for the skipped count and first 20 errors.