fix(tg_connection): handle create timeout by checking existing connec…#6703

Merged
hkantare merged 3 commits into IBM-Cloud:master from
anaghajoshiibm:tgw-timeout-issue
Mar 26, 2026

Conversation

@anaghajoshiibm
Contributor

@anaghajoshiibm anaghajoshiibm commented Mar 17, 2026

Summary

Fixes an issue where Transit Gateway connection creation fails with "connection already exists" if a timeout occurs during create.

Fix

If create times out, the provider now checks whether the connection was successfully created in the backend.
If it exists, it is adopted into state instead of retrying creation.

Testing

  • Verified using reduced create timeout (20s)
  • Resource successfully created and adopted into state
  • No duplicate creation errors observed

@hkantare
Collaborator

@anaghajoshiibm
If you handle it this way, you will also continue when the wait fails for some other reason and the connection is not in the desired state. Instead, why can't we add another comma-separated string, connectionAlreadyExists, to the wait logic?

Target:     []string{isTransitGatewayConnectionAttached, ""},

@anaghajoshiibm
Contributor Author

Thanks for the suggestion! @hkantare
As per my understanding, we are not adding something like connectionAlreadyExists to the wait Target because the wait logic (StateChangeConf) operates purely on resource states returned by the GET API (such as pending, attached, failed). The "connection already exists" scenario is not a resource state, but an error returned by the CREATE API (HTTP 409), so it cannot be handled within the state-based wait mechanism.

To address the concern about wait failures and ensuring we don’t proceed incorrectly, I’ve added explicit post-timeout validation:
  • After a wait failure, we perform a GET to verify the actual backend state
  • If the status is attached or pending, we safely adopt the resource
  • If the status is failed, we return an error and stop

I validated this behaviour with forced timeout scenarios:

  • Timeout occurs → GET confirms resource in pending → adopted successfully
  • Reapply shows no duplicate create and state remains consistent

Please let me know if you would prefer a different approach; I’m happy to align and make changes accordingly.

@hkantare hkantare merged commit 37c7964 into IBM-Cloud:master Mar 26, 2026
1 check passed