Contributor

@alebedev87 alebedev87 commented Apr 15, 2025

This PR re-applies #2261, which was reverted in #2277.

PR which fixed the missing GatewayAPI featuregate in Default on HyperShift: openshift/hypershift#6035.
PR which added HyperShift conformance tests to the api presubmits: openshift/release#63810.

openshift-ci-robot added the jira/valid-reference label (indicates that this PR references a valid Jira ticket of any type) on Apr 15, 2025

openshift-ci-robot commented Apr 15, 2025

@alebedev87: This pull request references NE-2009 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.

In response to this:

This PR re-applies #2261, which was reverted in #2277.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Contributor

openshift-ci bot commented Apr 15, 2025

Hello @alebedev87! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

openshift-ci bot added the size/S label (denotes a PR that changes 10-29 lines, ignoring generated files) on Apr 15, 2025
openshift-ci bot requested review from deads2k and JoelSpeed on April 15, 2025 08:04
@JoelSpeed
Contributor

/lgtm
/hold until the HyperShift conformance tests are green

openshift-ci bot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) on Apr 15, 2025
openshift-ci bot added the lgtm label (indicates that a PR is ready to be merged) on Apr 15, 2025
Contributor

openshift-ci bot commented Apr 15, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alebedev87, JoelSpeed

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) on Apr 15, 2025

openshift-ci-robot commented Apr 15, 2025

@alebedev87: This pull request references NE-2009 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.

In response to this:

This PR re-applies #2261, which was reverted in #2277.

PR which fixed the missing GatewayAPI featuregate in Default on HyperShift: openshift/hypershift#6035.
PR which added HyperShift conformance tests to the api presubmits: openshift/release#63810.


@alebedev87
Contributor Author

Failed HyperShift conformance test: link.
Ingress operator logs: logs.

Error from the logs:

ValidatingAdmissionPolicy 'openshift-ingress-operator-gatewayapi-crd-admission' with binding 'openshift-ingress-operator-gatewayapi-crd-admission' denied request: this user must have both \"authentication.kubernetes.io/node-name\" and \"authentication.kubernetes.io/pod-name\" claims, failed to create CRD httproutes.gateway.networking.k8s.io: customresourcedefinitions.apiextensions.k8s.io \"httproutes.gateway.networking.k8s.io\" is forbidden: ValidatingAdmissionPolicy 'openshift-ingress-operator-gatewayapi-crd-admission' with binding 'openshift-ingress-operator-gatewayapi-crd-admission' denied request: this user must have both \"authentication.kubernetes.io/node-name\" and \"authentication.kubernetes.io/pod-name\" claims,

PR to add a dedicated VAP for HyperShift: openshift/cluster-ingress-operator#1221.
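
The denial message above names two claims that the requesting user must carry. As a rough illustration only, the condition it describes can be sketched in Python. This is not the actual policy expression from cluster-ingress-operator: the function names and the assumption that `extra` maps claim names to lists of values are hypothetical.

```python
# Hypothetical sketch: mirrors the condition stated in the denial message,
# not the real ValidatingAdmissionPolicy expression.
REQUIRED_CLAIMS = (
    "authentication.kubernetes.io/node-name",
    "authentication.kubernetes.io/pod-name",
)


def missing_claims(extra):
    """Return the required claims that are absent or empty in `extra`.

    `extra` is assumed to map claim names to lists of string values.
    """
    return [claim for claim in REQUIRED_CLAIMS if not extra.get(claim)]


def has_required_claims(extra):
    """Return True only if both required claims are present and non-empty."""
    return not missing_claims(extra)
```

Under this reading, a request whose user info lacks either claim would be rejected, which matches the "must have both" wording in the log.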

@alebedev87
Contributor Author

/retest

@alebedev87
Contributor Author

/retest-required

@alebedev87
Contributor Author

/retest-required

@alebedev87
Contributor Author

/tide refresh

@alebedev87
Contributor Author

/test e2e-upgrade-out-of-change

@alebedev87
Contributor Author

/test e2e-aws-ovn-hypershift

@alebedev87
Contributor Author

/test e2e-aws-serial-techpreview

Contributor

candita commented Apr 16, 2025

Requested Core Networking team look at https://issues.redhat.com/browse/OCPBUGS-53279 for the e2e-upgrade-out-of-change issue, which is consistently having network issues that impact etcd and seem caused by the error "(container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: no CNI configuration file in /etc/kubernetes/cni/net.d/. Has your network provider started?)"

@lihongan
Contributor

/retest-required

Contributor Author

alebedev87 commented Apr 17, 2025

aws-ovn-hypershift-conformance is passing GatewayAPI tests:

curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_api/2281/pull-ci-openshift-api-master-e2e-aws-ovn-hypershift-conformance/1912696578448035840/artifacts/e2e-aws-ovn-hypershift-conformance/conformance-tests/artifacts/e2e.log | grep GatewayAPI
started: 0/57/530 "[sig-network][OCPFeatureGate:GatewayAPI][Feature:Router][apigroup:gateway.networking.k8s.io] Verify Gateway API CRDs and ensure CRD of standard group can not be created [Suite:openshift/conformance/parallel]"
passed: (2.4s) 2025-04-17T03:48:53 "[sig-network][OCPFeatureGate:GatewayAPI][Feature:Router][apigroup:gateway.networking.k8s.io] Verify Gateway API CRDs and ensure CRD of standard group can not be created [Suite:openshift/conformance/parallel]"
started: 0/367/530 "[sig-network][OCPFeatureGate:GatewayAPI][Feature:Router][apigroup:gateway.networking.k8s.io] Verify Gateway API CRDs and ensure CRD of experimental group is not installed [Suite:openshift/conformance/parallel]"
passed: (2.5s) 2025-04-17T03:52:18 "[sig-network][OCPFeatureGate:GatewayAPI][Feature:Router][apigroup:gateway.networking.k8s.io] Verify Gateway API CRDs and ensure CRD of experimental group is not installed [Suite:openshift/conformance/parallel]"
started: 0/373/530 "[sig-network][OCPFeatureGate:GatewayAPI][Feature:Router][apigroup:gateway.networking.k8s.io] Verify Gateway API CRDs and ensure required CRDs should already be installed [Suite:openshift/conformance/parallel]"
passed: (2.3s) 2025-04-17T03:52:22 "[sig-network][OCPFeatureGate:GatewayAPI][Feature:Router][apigroup:gateway.networking.k8s.io] Verify Gateway API CRDs and ensure required CRDs should already be installed [Suite:openshift/conformance/parallel]"
started: 0/435/530 "[sig-network][OCPFeatureGate:GatewayAPI][Feature:Router][apigroup:gateway.networking.k8s.io] Verify Gateway API CRDs and ensure existing CRDs can not be deleted [Suite:openshift/conformance/parallel]"
passed: (2.7s) 2025-04-17T03:53:10 "[sig-network][OCPFeatureGate:GatewayAPI][Feature:Router][apigroup:gateway.networking.k8s.io] Verify Gateway API CRDs and ensure existing CRDs can not be deleted [Suite:openshift/conformance/parallel]"
started: 0/483/530 "[sig-network][OCPFeatureGate:GatewayAPI][Feature:Router][apigroup:gateway.networking.k8s.io] Verify Gateway API CRDs and ensure existing CRDs can not be updated [Suite:openshift/conformance/parallel]"
passed: (2.8s) 2025-04-17T03:53:49 "[sig-network][OCPFeatureGate:GatewayAPI][Feature:Router][apigroup:gateway.networking.k8s.io] Verify Gateway API CRDs and ensure existing CRDs can not be updated [Suite:openshift/conformance/parallel]"
started: 0/513/530 "[sig-network][OCPFeatureGate:GatewayAPI][Feature:Router][apigroup:gateway.networking.k8s.io] Verify Gateway API CRDs and ensure CRD of experimental group can not be created [Suite:openshift/conformance/parallel]"
passed: (2.5s) 2025-04-17T03:54:23 "[sig-network][OCPFeatureGate:GatewayAPI][Feature:Router][apigroup:gateway.networking.k8s.io] Verify Gateway API CRDs and ensure CRD of experimental group can not be created [Suite:openshift/conformance/parallel]"
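
The same check can be done programmatically instead of reading the grep output by eye. The following is a minimal Python sketch, assuming the line shapes shown above (`started: ... "<name>"` and `passed: (...) <timestamp> "<name>"`). The `failed` and `skipped` prefixes are an assumption, since only `started` and `passed` appear in this excerpt, and the function names are hypothetical.

```python
import re

# Assumed line shapes, taken from the grep output above:
#   started: 0/57/530 "<test name>"
#   passed: (2.4s) 2025-04-17T03:48:53 "<test name>"
# The "failed" and "skipped" prefixes are assumptions.
LINE_RE = re.compile(r'^(started|passed|failed|skipped): .*?"(.*)"\s*$')


def summarize(lines, marker="GatewayAPI"):
    """Map each test name containing `marker` to its last seen status."""
    status = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and marker in match.group(2):
            status[match.group(2)] = match.group(1)
    return status


def unfinished(status):
    """Return the sorted names of tests whose final status is not 'passed'."""
    return sorted(name for name, state in status.items() if state != "passed")
```

Feeding the lines of e2e.log to `summarize` and then calling `unfinished` on the result would return an empty list when every started GatewayAPI test also reported `passed`.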

In fact, all the e2e tests passed:

INFO[2025-04-17T03:13:16Z] Running step e2e-aws-ovn-hypershift-conformance-conformance-tests. 
INFO[2025-04-17T04:09:20Z] Step e2e-aws-ovn-hypershift-conformance-conformance-tests succeeded after 56m4s. 
INFO[2025-04-17T04:09:20Z] Step phase test succeeded after 56m4s. 

However, some CI job steps failed due to CI image registry problems:

* 2025-04-17T05:10:20Z 3x kubelet: Failed to pull image "registry.ci.openshift.org/ci/entrypoint-wrapper:latest": initializing source docker://registry.ci.openshift.org/ci/entrypoint-wrapper:latest: reading manifest latest in registry.ci.openshift.org/ci/entrypoint-wrapper: manifest unknown
* 2025-04-17T05:10:20Z 3x kubelet: Error: ErrImagePull
* 2025-04-17T06:04:46Z 242x kubelet: Back-off pulling image "registry.ci.openshift.org/ci/entrypoint-wrapper:latest"
* 2025-04-17T05:10:43Z 4x kubelet: Error: ImagePullBackOff
INFO[2025-04-17T06:09:38Z] Step e2e-aws-ovn-hypershift-conformance-hypershift-debug failed after 1h0m7s.

@alebedev87
Contributor Author

aws-serial-techpreview and upgrade-out-of-change failed because they hit the AWS lease quota:

step e2e-aws-serial-techpreview failed: failed to acquire lease for "aws-quota-slice": resources not found 

/test e2e-aws-serial-techpreview
/test e2e-upgrade-out-of-change

@JoelSpeed
Contributor

/retest

@JoelSpeed
Contributor

Still seem to be seeing payload building failures in CI here :/

@alebedev87
Contributor Author

Still seem to be seeing payload building failures in CI here :/

Yes, the appci incident is ongoing.

@alebedev87
Contributor Author

/test e2e-aws-ovn-hypershift-conformance

@alebedev87
Contributor Author

Nice try but no, the CI is still missing some images:

  * could not run steps: step [release:latest] failed: failed to wait for importing imagestreamtags [cluster-version-operator, cli] on ci-op-0nzb46m3/stable: failed to import tag(s) [cli,cluster-version-operator] on image stream ci-op-0nzb46m3/stable because of missing definition in the spec 

Contributor

candita commented Apr 17, 2025

Can't hurt to try.
/test e2e-aws-ovn-hypershift-conformance

@alebedev87
Contributor Author

/retest-required

Contributor

candita commented Apr 17, 2025

Let's wait to try re-running e2e-aws-ovn-hypershift until after openshift/origin#29683 merges.

The hypershift tests were failing due to crashing pods:

util.go:669: Container csi-driver in pod aws-ebs-csi-driver-controller-7988c49799-wkc6j has a restartCount > 0 (2)
--- FAIL: TestNodePool/HostedCluster0/ValidateHostedCluster/EnsureNoCrashingPods (0.14s)

util.go:669: Container csi-driver in pod aws-ebs-csi-driver-controller-54c466c784-fgxvn has a restartCount > 0 (3)
--- FAIL: TestCreateClusterPrivate/ValidateHostedCluster/EnsureNoCrashingPods (0.10s)

cc @JoelSpeed

@alebedev87
Contributor Author

e2e-upgrade-out-of-change passed the e2e test:

INFO[2025-04-17T21:30:19Z] Running step e2e-upgrade-out-of-change-openshift-e2e-test. 
INFO[2025-04-17T22:48:34Z] Step e2e-upgrade-out-of-change-openshift-e2e-test succeeded after 1h18m14s. 
INFO[2025-04-17T22:48:34Z] Step phase test succeeded after 1h18m14s. 

But the job ran too long (>4h), which seems to be the case for some other runs as well. Increasing the timeout should be considered.

Contributor

lihongan commented Apr 18, 2025

/retest-required

e2e-aws-serial-techpreview timed out (it ran longer than 4h).

@alebedev87
Contributor Author

/unhold

The hypershift conformance is green as requested by Joel.

openshift-ci bot removed the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) on Apr 18, 2025
@lihongan
Contributor

/test e2e-aws-serial-techpreview

(install failed)

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD b7680e1 and 2 for PR HEAD 95cc95c in total

@lihongan
Contributor

The install failed because the AWS IAM role quota was exceeded:

failed to create IAM master role: failed to create master role: LimitExceeded: Cannot exceed quota for RolesPerAccount: 1000

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD b7680e1 and 2 for PR HEAD 95cc95c in total

@alebedev87
Contributor Author

Let's wait to try re-running e2e-aws-ovn-hypershift until after openshift/origin#29683 merges.

e2e-aws-ovn-hypershift is green now even though openshift/origin#29683 is still not merged. I think the problems were somewhere else.

@alebedev87
Contributor Author

The AWS IAM role quota issue was reported in the test-platform forum: Slack thread.

@alebedev87
Contributor Author

/retest-required

Contributor

openshift-ci bot commented Apr 18, 2025

@alebedev87: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
ci/prow/okd-scos-e2e-aws-ovn | 95cc95c | link | false | /test okd-scos-e2e-aws-ovn

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@JoelSpeed
Contributor

/override ci/prow/e2e-aws-serial-techpreview

This change should not affect tech preview, as the feature is already available in tech preview. There seem to be persistent errors with infrastructure provisioning at the moment.

/override ci/prow/e2e-upgrade-out-of-change

This has previously passed, but is now facing infrastructure issues. Since this was already promoted and then reverted, I'm fairly confident that it will be safe to override.

Contributor

openshift-ci bot commented Apr 18, 2025

@JoelSpeed: Overrode contexts on behalf of JoelSpeed: ci/prow/e2e-aws-serial-techpreview, ci/prow/e2e-upgrade-out-of-change

In response to this:

/override ci/prow/e2e-aws-serial-techpreview

This change should not affect tech preview, as the feature is already available in tech preview. There seem to be persistent errors with infrastructure provisioning at the moment.

/override ci/prow/e2e-upgrade-out-of-change

This has previously passed, but is now facing infrastructure issues. Since this was already promoted and then reverted, I'm fairly confident that it will be safe to override.


openshift-merge-bot merged commit f636181 into openshift:master on Apr 18, 2025
23 of 24 checks passed
@openshift-bot

[ART PR BUILD NOTIFIER]

Distgit: ose-cluster-config-api
This PR has been included in build ose-cluster-config-api-container-v4.19.0-202504181514.p0.gf636181.assembly.stream.el9.
All builds following this will include this PR.
