usage of distinct hurts performance #32937

@freedge

Terraform Version

Terraform v1.5.0-dev
on linux_amd64

Terraform Configuration Files

https://github.com/freedge/tftest

We use a large expression that calls the distinct function:

locals {
  address_map = {
    for address in distinct(flatten([for endpoint in var.content : endpoint])) : address.name => address
  }
}

resource "null_resource" "null_objects" {
  lifecycle {
    create_before_destroy = true
  }
  for_each = local.address_map
  triggers = {
    name = each.key
  }
}

Debug Output

...
2023-03-29T06:23:54.760Z [TRACE] vertex "module.addresses.local.address_map (expand)": starting visit (*terraform.nodeExpandLocal)
2023-03-29T06:23:54.760Z [TRACE] vertex "module.addresses.local.address_map (expand)": expanding dynamic subgraph
2023-03-29T06:23:54.760Z [TRACE] Expanding local: adding module.addresses.local.address_map as *terraform.NodeLocal
2023-03-29T06:23:54.760Z [TRACE] vertex "module.addresses.local.address_map (expand)": entering dynamic subgraph
2023-03-29T06:23:54.760Z [TRACE] vertex "module.addresses.local.address_map": starting visit (*terraform.NodeLocal)
2023-03-29T06:23:59.003Z [TRACE] dag/walk: vertex "provider["registry.terraform.io/hashicorp/null"] (close)" is waiting for "module.addresses.null_resource.null_objects (expand)"
2023-03-29T06:23:59.003Z [TRACE] dag/walk: vertex "root" is waiting for "provider["registry.terraform.io/hashicorp/null"] (close)"
2023-03-29T06:23:59.003Z [TRACE] dag/walk: vertex "module.addresses (close)" is waiting for "module.addresses.null_resource.null_objects (expand)"
2023-03-29T06:23:59.041Z [TRACE] dag/walk: vertex "module.addresses.null_resource.null_objects (expand)" is waiting for "module.addresses.local.address_map (expand)"
2023-03-29T06:23:59.760Z [TRACE] dag/walk: vertex "root" is waiting for "module.addresses.local.address_map"
...

Expected Behavior

terraform plan, apply, and show should complete in a reasonable amount of time.

Actual Behavior

Terraform consumes CPU for several minutes, which is unreasonably slow.

[image: plot of plan time against input size, with distinct]

This is aggravated by the use of the distinct function. Without it, the run time is more reasonable:

[image: plot of plan time against input size, without distinct]

  • x: the number of elements in the content input array
  • y: the time in seconds to complete terraform plan

Steps to Reproduce

git clone https://github.com/freedge/tftest.git
cd tftest
terraform init
terraform plan

Initially reproduced on v1.4.2, and also on the current main branch.

Additional Context

This was first observed while using the https://github.com/PaloAltoNetworks/terraform-provider-panos provider.
We are considering running a preprocessing script to clean up our input content, so that Terraform consumes pre-processed data that is already unique, but it would be great to have improvements in Terraform itself.
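
As an alternative to preprocessing, the deduplication could be done inside the configuration without calling distinct, using the grouping mode of the for expression (the trailing ...). This is a minimal sketch that has not been benchmarked here. It assumes that address.name uniquely identifies an address, and the local name address_groups is hypothetical. Note one behavioral difference: the distinct version fails with a duplicate-key error when two different addresses share a name, while this version silently keeps the first one.

```hcl
locals {
  # Hypothetical intermediate value: group every address under its name.
  # The trailing "..." enables grouping mode, so duplicate keys are
  # collected into a list instead of raising an error.
  address_groups = {
    for address in flatten([for endpoint in var.content : endpoint]) : address.name => address...
  }

  # Keep the first address seen for each name (assumes entries that share
  # a name are identical, which is what distinct relied on as well).
  address_map = {
    for name, addresses in local.address_groups : name => addresses[0]
  }
}
```

The null_resource block can keep using for_each = local.address_map unchanged.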

References
