wking (Member) commented Jun 18, 2025

Make it clearer how --accept ConditionalUpdateRisk maps to a risk like NonZonalAzureMachineSetScaling getting accepted, by turning the previous:

Reason: accepted NonZonalAzureMachineSetScaling

into:

Reason: accepted NonZonalAzureMachineSetScaling via ConditionalUpdateRisk

Eventually we'll have an API that allows us to use the conditional-update risk name itself (e.g. --accept NonZonalAzureMachineSetScaling, OTA-1543, openshift/enhancements#1807), but this via... context will hopefully help avoid confusion in the meantime.
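As a rough illustration of the wording change described above, here is a minimal Go sketch. The function names reasonBefore and reasonAfter are hypothetical and are not taken from the oc source; only the two output strings come from this description.

```go
package main

import "fmt"

// reasonBefore mimics the earlier wording: only the risk name is shown.
// Hypothetical helper, not the actual oc implementation.
func reasonBefore(risk string) string {
	return fmt.Sprintf("accepted %s", risk)
}

// reasonAfter mimics the new wording: the risk name plus the --accept
// value that caused it to be accepted.
// Hypothetical helper, not the actual oc implementation.
func reasonAfter(risk, acceptedVia string) string {
	return fmt.Sprintf("accepted %s via %s", risk, acceptedVia)
}

func main() {
	fmt.Println("Reason:", reasonBefore("NonZonalAzureMachineSetScaling"))
	fmt.Println("Reason:", reasonAfter("NonZonalAzureMachineSetScaling", "ConditionalUpdateRisk"))
}
```

The second argument keeps the user-facing --accept value next to the risk name, which is the context this change aims to surface.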

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 18, 2025
openshift-ci-robot commented Jun 18, 2025

@wking: This pull request references OTA-1575 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.20.0" version, but no target version was set.

@openshift-ci openshift-ci bot requested review from ardaguclu and ingvagabund June 18, 2025 00:50
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 18, 2025
@JianLi-RH

Hi @wking, I tried two scenarios, and in both the accepted reason does not show a risk name:

  1. The reason is AdminAckRequired.
  2. The reason is MultipleReasons.

After accepting ConditionalUpdateRisk, the reason changes to "Reason: accepted %s via ConditionalUpdateRisk", with the %s left unexpanded.

Scenario 1, AdminAckRequired:

[jianl@jianl-thinkpadt14gen4 418]$ ./oc adm upgrade recommend --version 4.20.0-ec.0 --token=$token
The following conditions found no cause for concern in updating this cluster to later releases: recommended/CriticalAlerts (AsExpected), recommended/NodeAlerts (AsExpected), recommended/PodDisruptionBudgetAlerts (AsExpected), recommended/PodImagePullAlerts (AsExpected)

Upstream update service: https://raw.githubusercontent.com/JianLi-RH/ota/92fc74a3746b0fa652e03b1bc19b690698c594e3/cincy-82705.json
Channel: channel-a

Update to 4.20.0-ec.0 Recommended=False:
Image: quay.io/openshift-release-dev/ocp-release@sha256:b5bea7566302a82159c42b9fbf37aef3b819cb5a22a9ffd34081b0a2192a071a
Release URL: 
Reason: AdminAckRequired
Message: Kubernetes 1.32 and therefore OpenShift 4.19 remove several APIs which require admin consideration. Please see the knowledge article https://access.redhat.com/articles/7112216 for details and instructions. This cluster is GCP or AWS but lacks a boot image configuration. OCP will automatically opt this cluster into boot image management in 4.19. Please add a configuration to disable boot image updates if this is not desired. See https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html/machine_configuration/mco-update-boot-images#mco-update-boot-images-disable_machine-configs-configure for more details.
error: issues that apply to this cluster but which were not included in --accept: ConditionalUpdateRisk
[jianl@jianl-thinkpadt14gen4 418]$ 
[jianl@jianl-thinkpadt14gen4 418]$ 
[jianl@jianl-thinkpadt14gen4 418]$ ./oc adm upgrade recommend --version 4.20.0-ec.0 --accept ConditionalUpdateRisk --token=$token
The following conditions found no cause for concern in updating this cluster to later releases: recommended/CriticalAlerts (AsExpected), recommended/NodeAlerts (AsExpected), recommended/PodDisruptionBudgetAlerts (AsExpected), recommended/PodImagePullAlerts (AsExpected)

Upstream update service: https://raw.githubusercontent.com/JianLi-RH/ota/92fc74a3746b0fa652e03b1bc19b690698c594e3/cincy-82705.json
Channel: channel-a

Update to 4.20.0-ec.0 Recommended=False:
Image: quay.io/openshift-release-dev/ocp-release@sha256:b5bea7566302a82159c42b9fbf37aef3b819cb5a22a9ffd34081b0a2192a071a
Release URL: 
Reason: accepted %s via ConditionalUpdateRisk
Message: Kubernetes 1.32 and therefore OpenShift 4.19 remove several APIs which require admin consideration. Please see the knowledge article https://access.redhat.com/articles/7112216 for details and instructions. This cluster is GCP or AWS but lacks a boot image configuration. OCP will automatically opt this cluster into boot image management in 4.19. Please add a configuration to disable boot image updates if this is not desired. See https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html/machine_configuration/mco-update-boot-images#mco-update-boot-images-disable_machine-configs-configure for more details.
Update to 4.20.0-ec.0 has no known issues relevant to this cluster other than the accepted ConditionalUpdateRisk.
[jianl@jianl-thinkpadt14gen4 418]$ 

Scenario 2, MultipleReasons:

[jianl@jianl-thinkpadt14gen4 418]$ ./oc adm upgrade recommend --version 4.20.0-ec.3 --token=$token
The following conditions found no cause for concern in updating this cluster to later releases: recommended/CriticalAlerts (AsExpected), recommended/NodeAlerts (AsExpected), recommended/PodDisruptionBudgetAlerts (AsExpected), recommended/PodImagePullAlerts (AsExpected)

Upstream update service: https://raw.githubusercontent.com/JianLi-RH/ota/92fc74a3746b0fa652e03b1bc19b690698c594e3/cincy-82705.json
Channel: channel-a

Update to 4.20.0-ec.3 Recommended=False:
Image: quay.io/openshift-release-dev/ocp-release@sha256:4dfd7223e883a685c7be0906b09d573ef24bdb8f7fcfb1876e198bed5352ba55
Release URL: 
Reason: MultipleReasons
Message: Kubernetes 1.32 and therefore OpenShift 4.19 remove several APIs which require admin consideration. Please see the knowledge article https://access.redhat.com/articles/7112216 for details and instructions. This cluster is GCP or AWS but lacks a boot image configuration. OCP will automatically opt this cluster into boot image management in 4.19. Please add a configuration to disable boot image updates if this is not desired. See https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html/machine_configuration/mco-update-boot-images#mco-update-boot-images-disable_machine-configs-configure for more details.
  
  Too many CI failures on this release, so do not update to it https://amd64.ocp.releases.ci.openshift.org/releasestream/4-dev-preview/release/4.20.0-ec.3
  
  On clusters on default invoker user, this imaginary bug can happen. https://bug.example.com/a
error: issues that apply to this cluster but which were not included in --accept: ConditionalUpdateRisk
[jianl@jianl-thinkpadt14gen4 418]$ 
[jianl@jianl-thinkpadt14gen4 418]$ 
[jianl@jianl-thinkpadt14gen4 418]$ 
[jianl@jianl-thinkpadt14gen4 418]$ ./oc adm upgrade recommend --version 4.20.0-ec.3 --accept ConditionalUpdateRisk --token=$token
The following conditions found no cause for concern in updating this cluster to later releases: recommended/CriticalAlerts (AsExpected), recommended/NodeAlerts (AsExpected), recommended/PodDisruptionBudgetAlerts (AsExpected), recommended/PodImagePullAlerts (AsExpected)

Upstream update service: https://raw.githubusercontent.com/JianLi-RH/ota/92fc74a3746b0fa652e03b1bc19b690698c594e3/cincy-82705.json
Channel: channel-a

Update to 4.20.0-ec.3 Recommended=False:
Image: quay.io/openshift-release-dev/ocp-release@sha256:4dfd7223e883a685c7be0906b09d573ef24bdb8f7fcfb1876e198bed5352ba55
Release URL: 
Reason: accepted %s via ConditionalUpdateRisk
Message: Kubernetes 1.32 and therefore OpenShift 4.19 remove several APIs which require admin consideration. Please see the knowledge article https://access.redhat.com/articles/7112216 for details and instructions. This cluster is GCP or AWS but lacks a boot image configuration. OCP will automatically opt this cluster into boot image management in 4.19. Please add a configuration to disable boot image updates if this is not desired. See https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html/machine_configuration/mco-update-boot-images#mco-update-boot-images-disable_machine-configs-configure for more details.
  
  Too many CI failures on this release, so do not update to it https://amd64.ocp.releases.ci.openshift.org/releasestream/4-dev-preview/release/4.20.0-ec.3
  
  On clusters on default invoker user, this imaginary bug can happen. https://bug.example.com/a
Update to 4.20.0-ec.3 has no known issues relevant to this cluster other than the accepted ConditionalUpdateRisk.
[jianl@jianl-thinkpadt14gen4 418]$ 

…isk"

Make it clearer how '--accept ConditionalUpdateRisk' maps to a risk
like NonZonalAzureMachineSetScaling getting accepted, by turning the
previous:

  Reason: accepted NonZonalAzureMachineSetScaling

into:

  Reason: accepted NonZonalAzureMachineSetScaling via ConditionalUpdateRisk

Eventually we'll have an API that allows us to use the
conditional-update risk name itself (e.g. '--accept
NonZonalAzureMachineSetScaling') [1,2], but this 'via...' context will
hopefully help avoid confusion in the meantime.

[1]: openshift/enhancements#1807
[2]: https://issues.redhat.com/browse/OTA-1543
@wking wking force-pushed the accepted-via-ConditionalUpdateRisk branch from 37b8da4 to 7698d9a Compare July 7, 2025 18:09
wking (Member, Author) commented Jul 7, 2025

Reason: accepted %s via ConditionalUpdateRisk

Oops, thanks for catching that. Fixed in 37b8da4 -> 7698d9a.
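The thread does not say what caused the literal %s. As a hedged illustration only, the Go sketch below shows one common way such output can arise (a format string returned without going through fmt.Sprintf) alongside a simple check on the rendered text. All three function names are hypothetical, and the buggy variant is an assumption, not the code from 37b8da4.

```go
package main

import (
	"fmt"
	"strings"
)

// buggyReason is an assumed failure mode: the format string is
// concatenated and returned as-is, so the %s verb is never expanded
// and the reason argument is silently ignored.
func buggyReason(reason, acceptedVia string) string {
	return "accepted %s via " + acceptedVia
}

// fixedReason expands both placeholders with fmt.Sprintf.
func fixedReason(reason, acceptedVia string) string {
	return fmt.Sprintf("accepted %s via %s", reason, acceptedVia)
}

// hasLiteralVerb reports whether rendered output still contains an
// unexpanded "%s", which a test on the final text could assert against.
func hasLiteralVerb(rendered string) bool {
	return strings.Contains(rendered, "%s")
}

func main() {
	for _, reason := range []string{"AdminAckRequired", "MultipleReasons"} {
		bad := buggyReason(reason, "ConditionalUpdateRisk")
		good := fixedReason(reason, "ConditionalUpdateRisk")
		fmt.Printf("%q literal verb: %v\n", bad, hasLiteralVerb(bad))
		fmt.Printf("%q literal verb: %v\n", good, hasLiteralVerb(good))
	}
}
```

A check like hasLiteralVerb on the rendered Reason line would flag both scenarios reported above, whatever the underlying cause was.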

@hongkailiu
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 7, 2025
openshift-ci bot (Contributor) commented Jul 7, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hongkailiu, wking

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD f0e1051 and 2 for PR HEAD 7698d9a in total

@JianLi-RH

Thank you @wking, it works fine now.

Default behavior, recommend with --version and no --accept:

[jianl@jianl-thinkpadt14gen4 420]$ ./oc adm upgrade recommend --version 4.999.999
Failed to check for at least some preconditions: failed to get alerts from Thanos: no token is currently in use for this session
Upstream update service: https://raw.githubusercontent.com/JianLi-RH/ota/refs/heads/main/OCP-82705.json
Channel: stable-4.18

Update to 4.999.999 Recommended=False:
Image: quay.io/openshift-release-dev/ocp-release@sha256:9d24a8cdd67b8f18c99547d5910e4863e7aab5bd888e26670a00dbda0a9d4687
Release URL: 
Reason: MultipleReasons
Message: Too many CI failures on this release, so do not update to it https://amd64.ocp.releases.ci.openshift.org/releasestream/4.19.0-0.nightly/release/4.19.0-0.nightly-2025-06-16-060026
  
  On clusters on default invoker user, this imaginary bug can happen. https://bug.example.com/a
error: issues that apply to this cluster but which were not included in --accept: ConditionalUpdateRisk,FailedToCompletePrecheck
[jianl@jianl-thinkpadt14gen4 420]$ 

--accept ConditionalUpdateRisk

[jianl@jianl-thinkpadt14gen4 420]$ ./oc adm upgrade recommend --version 4.999.999 --accept ConditionalUpdateRisk
Failed to check for at least some preconditions: failed to get alerts from Thanos: no token is currently in use for this session
Upstream update service: https://raw.githubusercontent.com/JianLi-RH/ota/refs/heads/main/OCP-82705.json
Channel: stable-4.18

Update to 4.999.999 Recommended=False:
Image: quay.io/openshift-release-dev/ocp-release@sha256:9d24a8cdd67b8f18c99547d5910e4863e7aab5bd888e26670a00dbda0a9d4687
Release URL: 
Reason: accepted MultipleReasons via ConditionalUpdateRisk
Message: Too many CI failures on this release, so do not update to it https://amd64.ocp.releases.ci.openshift.org/releasestream/4.19.0-0.nightly/release/4.19.0-0.nightly-2025-06-16-060026
  
  On clusters on default invoker user, this imaginary bug can happen. https://bug.example.com/a
error: issues that apply to this cluster but which were not included in --accept: FailedToCompletePrecheck
[jianl@jianl-thinkpadt14gen4 420]$

--accept FailedToCompletePrecheck

[jianl@jianl-thinkpadt14gen4 420]$ ./oc adm upgrade recommend --version 4.999.999 --accept FailedToCompletePrecheck
Failed to check for at least some preconditions: failed to get alerts from Thanos: no token is currently in use for this session
Upstream update service: https://raw.githubusercontent.com/JianLi-RH/ota/refs/heads/main/OCP-82705.json
Channel: stable-4.18

Update to 4.999.999 Recommended=False:
Image: quay.io/openshift-release-dev/ocp-release@sha256:9d24a8cdd67b8f18c99547d5910e4863e7aab5bd888e26670a00dbda0a9d4687
Release URL: 
Reason: MultipleReasons
Message: Too many CI failures on this release, so do not update to it https://amd64.ocp.releases.ci.openshift.org/releasestream/4.19.0-0.nightly/release/4.19.0-0.nightly-2025-06-16-060026
  
  On clusters on default invoker user, this imaginary bug can happen. https://bug.example.com/a
error: issues that apply to this cluster but which were not included in --accept: ConditionalUpdateRisk
[jianl@jianl-thinkpadt14gen4 420]$ 

--accept ConditionalUpdateRisk --quiet

[jianl@jianl-thinkpadt14gen4 420]$ ./oc adm upgrade recommend --version 4.999.999 --accept ConditionalUpdateRisk --quiet
error: issues that apply to this cluster but which were not included in --accept: FailedToCompletePrecheck
[jianl@jianl-thinkpadt14gen4 420]$

Accept all risks (with --quiet):

[jianl@jianl-thinkpadt14gen4 420]$ ./oc adm upgrade recommend --version 4.999.999 --accept ConditionalUpdateRisk,FailedToCompletePrecheck --quiet
[jianl@jianl-thinkpadt14gen4 420]$ 
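The transcripts above suggest that the command compares the issues that apply to the cluster against the comma-separated --accept list and reports whatever is left over. Below is a minimal Go sketch of that comparison, inferred only from the observed output. The names unaccepted and check are hypothetical, and the sorted, comma-joined ordering is an assumption based on the error lines shown.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// unaccepted returns the issues that apply to the cluster but do not
// appear in the comma-separated accept list, sorted by name.
// Hypothetical helper, not the actual oc implementation.
func unaccepted(issues []string, accept string) []string {
	accepted := map[string]bool{}
	for _, name := range strings.Split(accept, ",") {
		if name != "" {
			accepted[name] = true
		}
	}
	var missing []string
	for _, issue := range issues {
		if !accepted[issue] {
			missing = append(missing, issue)
		}
	}
	sort.Strings(missing)
	return missing
}

// check returns an error worded like the one in the transcripts when
// any applicable issue was not accepted, and nil otherwise.
func check(issues []string, accept string) error {
	if missing := unaccepted(issues, accept); len(missing) > 0 {
		return fmt.Errorf("issues that apply to this cluster but which were not included in --accept: %s", strings.Join(missing, ","))
	}
	return nil
}

func main() {
	issues := []string{"ConditionalUpdateRisk", "FailedToCompletePrecheck"}
	for _, accept := range []string{
		"",
		"ConditionalUpdateRisk",
		"FailedToCompletePrecheck",
		"ConditionalUpdateRisk,FailedToCompletePrecheck",
	} {
		fmt.Printf("--accept %q: %v\n", accept, check(issues, accept))
	}
}
```

The four accept values in main mirror the four invocations tested above, from no --accept at all through accepting both issues.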

@JianLi-RH

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Jul 8, 2025
@JianLi-RH

/retest

@JianLi-RH

One more test, accepting all risks without --quiet:

[jianl@jianl-thinkpadt14gen4 420]$ ./oc adm upgrade recommend --version 4.999.999 --accept ConditionalUpdateRisk,FailedToCompletePrecheck
Failed to check for at least some preconditions: failed to get alerts from Thanos: no token is currently in use for this session
Upstream update service: https://raw.githubusercontent.com/JianLi-RH/ota/refs/heads/main/OCP-82705.json
Channel: stable-4.18

Update to 4.999.999 Recommended=False:
Image: quay.io/openshift-release-dev/ocp-release@sha256:9d24a8cdd67b8f18c99547d5910e4863e7aab5bd888e26670a00dbda0a9d4687
Release URL: 
Reason: accepted MultipleReasons via ConditionalUpdateRisk
Message: Too many CI failures on this release, so do not update to it https://amd64.ocp.releases.ci.openshift.org/releasestream/4.19.0-0.nightly/release/4.19.0-0.nightly-2025-06-16-060026
  
  On clusters on default invoker user, this imaginary bug can happen. https://bug.example.com/a
Update to 4.999.999 has no known issues relevant to this cluster other than the accepted ConditionalUpdateRisk,FailedToCompletePrecheck.
[jianl@jianl-thinkpadt14gen4 420]$ 

@petr-muller
Member

/cc

@openshift-ci openshift-ci bot requested a review from petr-muller July 8, 2025 11:37
wking (Member, Author) commented Jul 8, 2025

The e2e-agnostic-ovn-cmd [sig-cli][Feature:LegacyCommandTests][Disruptive][Serial] test-cmd: test/cmd/images.sh [apigroup:image.openshift.io] failures are unrelated to my change:

  FAILURE after 60.000s: test/cmd/images.sh:56: executing 'oc get imagestreamtags wildfly:latest' expecting success; re-trying every 0.2s until completion or 60.000s: the command timed out
  Standard output from the command:
  Standard error from the command:
  Error from server (NotFound): imagestreamtags.image.openshift.io "wildfly:latest" not found
  ... repeated 177 times
  [ERROR] hack/lib/cmd.sh:114: `return "${return_code}"` exited with status 1.

wking (Member, Author) commented Jul 8, 2025

Seems like the wildfly bit is from here, based on this fixture. But I'm not sure how to debug the wildfly:latest ImageStreamTag failure. From the test-case's stdout:

  I0708 15:23:04.047546 1418 client.go:451] Project "e2e-test-test-cmd-dqkrm" has been fully provisioned.

but that namespace was cleaned up as part of the test-case's teardown:

STEP: Destroying namespace "e2e-test-test-cmd-dqkrm" for this suite. @ 07/08/25 15:24:11.538

so there's not much left in gathered artifacts to talk about what went wrong. The test-case stdout also includes:

    STEP: Collecting events from namespace "e2e-test-test-cmd-dqkrm". @ 07/08/25 15:24:11.41
    STEP: Found 6 events. @ 07/08/25 15:24:11.42
...

but none of those 6 Events sound like they're talking about ImageStreams or Builds to me.

openshift-ci bot (Contributor) commented Jul 17, 2025

@wking: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
ci/prow/e2e-metal-ipi-ovn-ipv6 | 7698d9a | link | false | /test e2e-metal-ipi-ovn-ipv6

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 8a28796 and 1 for PR HEAD 7698d9a in total

@openshift-merge-bot openshift-merge-bot bot merged commit 9808979 into openshift:main Jul 17, 2025
17 of 18 checks passed
@wking wking deleted the accepted-via-ConditionalUpdateRisk branch July 17, 2025 19:56