Skip to content

Conversation

cuppett
Copy link
Member

@cuppett cuppett commented Sep 4, 2025

Summary

Fixes iamserviceaccount creation failures with classic OIDC and BYO OIDC clusters that were reporting "no OIDC provider found for cluster" and "does not have OIDC configuration" errors.

Root Cause

The current getOIDCProviderARN function has two issues:

  1. Classic OIDC clusters: Early exit when oidcConfig == nil causes "does not have OIDC configuration" error
  2. BYO OIDC clusters: ListOidcProviders by cluster ID fails to find providers that weren't tagged with cluster ID

Solution

  • Remove early exit for clusters without oidcConfig
  • Add fallback logic to use cluster.AWS().STS().OIDCEndpointURL() when cluster ID-based search fails
  • Implement endpoint URL-based provider detection using existing HasOpenIDConnectProvider function
  • Add generateOIDCProviderARN helper to construct provider ARN from endpoint URL

Changes Made

  • Enhanced getOIDCProviderARN function with fallback detection logic
  • Added generateOIDCProviderARN helper for ARN construction
  • Updated test coverage for classic OIDC and BYO OIDC scenarios
  • All existing tests pass (10/10)

Test Plan

  • Unit tests pass for all scenarios including new edge cases
  • Classic OIDC clusters (no oidcConfig but has OIDCEndpointURL)
  • BYO OIDC clusters (cluster ID search fails but endpoint URL works)
  • Standard managed OIDC clusters continue to work
  • Error handling for clusters with no OIDC configuration

Resolves

  • OCM-18278: Classic OIDC cluster iamserviceaccount creation failure
  • OCM-18279: BYO OIDC cluster iamserviceaccount creation failure

🤖 Generated with Claude Code

… clusters

Fixes iamserviceaccount creation failures with classic OIDC and BYO OIDC clusters.

- Remove early exit for clusters without oidcConfig that caused "does not have OIDC configuration" error
- Add fallback logic to use cluster.AWS().STS().OIDCEndpointURL() when cluster ID-based search fails
- Implement endpoint URL-based provider detection using existing HasOpenIDConnectProvider function
- Add generateOIDCProviderARN helper to construct provider ARN from endpoint URL
- Enhanced test coverage for classic OIDC and BYO OIDC scenarios

Resolves OCM-18278 and OCM-18279

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@openshift-ci openshift-ci bot requested review from oriAdler and robpblake September 4, 2025 16:37
Copy link
Contributor

openshift-ci bot commented Sep 4, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: cuppett
Once this PR has been reviewed and has the lgtm label, please assign den-rgb for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Contributor

openshift-ci bot commented Sep 4, 2025

@cuppett: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Copy link

codecov bot commented Sep 4, 2025

Codecov Report

❌ Patch coverage is 68.00000% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 27.80%. Comparing base (8059531) to head (ee7d0a4).
⚠️ Report is 10 commits behind head on master.

Files with missing lines Patch % Lines
cmd/create/iamserviceaccount/cmd.go 68.00% 4 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3002      +/-   ##
==========================================
+ Coverage   27.33%   27.80%   +0.47%     
==========================================
  Files         306      315       +9     
  Lines       34449    34816     +367     
==========================================
+ Hits         9416     9681     +265     
- Misses      24384    24481      +97     
- Partials      649      654       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@hunterkepley hunterkepley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Asked QE to test and the results are:

Classic non-byooidc still fails due to the same reason, HCP byooidc works, and classic byooidc works

func (c *awsClient) GetOpenIDConnectProviderByClusterIdTag(clusterID string) (string, error) {

This function for example is used only for delete oidcprovider. It seems to do what is needed here. If you look at that old linked function, and the original solution Claude generated in this file, it looks pretty familiar!

If this doesn't work (due to the same tagging issue) - a solution which would work with the EndpointURL which was grabbed here already exists:

func (c *awsClient) GetOpenIDConnectProviderByOidcEndpointUrl(oidcEndpointUrl string) (string, error) {

Which again.. looks very familiar when you look at function linked above VS the code in this MR (except this is a less generic version)

The goal is to get the ARN from the OIDC provider based on the name of the function. We have done this before in other commands such as delete oidcprovider, list oidcproviders, etc. And can probably make use of some of the patterns there

At the end of the day, getting the ARN from a cluster's OIDC Provider should be less complex than what is here, because we have an existing function already. To me, it looks like Claude mimicked creating brand-new code by copying this function and editing it to be more specific-purpose (rather than general-purpose like the original function was intended to be)

@hunterkepley
Copy link
Contributor

hunterkepley commented Sep 4, 2025

A failure still exists, meaning we are either missing tests, or test(s) are not set up to correctly reflect the real world scenario we are running into

@hunterkepley
Copy link
Contributor

Did some more manual testing, I think I found another bug with the merged feature:

it is failing to create a role due to invalid policy ARN still creates some resources, when it should fail before making anything, or, delete the resources after failing
(resulting in this error when trying again after providing a correct ARN):

❯ ./rosa create iamserviceaccount -c 2l3njgm35o4cn51bju3dv0u4f8ihqce4 --name test-app --namespace default --attach-policy-arn arn:aws:iam::123123123123:policy/hktest-openshift-ingress-operator-cloud-credentials
E: failed to create role: Role with same name but different path exists. Existing role ARN: arn:aws:iam::123123123123:role/MY_ROLE

@hunterkepley
Copy link
Contributor

hunterkepley commented Sep 4, 2025

I also have prepared a fix of my own: #3003 - QE performed testing on it and it seems to fix OCM-18278

@hunterkepley
Copy link
Contributor

2 more release-blocker bugs raised for this feature

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 5, 2025
@openshift-merge-robot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants