Revert "Update wait_for_status_success() call to look at both type and status for status.conditions" #2372

Closed
wants to merge 3 commits into from

Conversation

myakove
Collaborator

@myakove myakove commented Apr 1, 2025

Reverts #2289

Summary by CodeRabbit

  • Refactor
    • Streamlined the network configuration status check for faster and more reliable detection of successful configurations, enhancing overall responsiveness.
    • Improved logging for better traceability of status updates.

coderabbitai bot commented Apr 1, 2025

Walkthrough

This change updates the implementation of the wait_for_status_success method in the node network configuration policy module. The updated method now invokes wait_for_configuration_conditions_unknown_or_progressing before sampling the status with a shortened interval of one second. It simplifies the logic by directly comparing the status value to predefined constants (e.g., SUCCESSFULLY_CONFIGURED, NO_MATCHING_NODE, FAILED_TO_CONFIGURE) rather than iterating over condition objects. No changes were made to the declarations of exported or public entities.
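
The difference between the two checking styles can be sketched as follows. This is an illustration only, not the code from the PR: the condition payload, the literal reason strings, and both helper names are assumptions made for the example.

```python
# Hypothetical payload shaped like a Kubernetes-style status.conditions list.
SAMPLE_CONDITIONS = [
    {"type": "Available", "status": "True", "reason": "SuccessfullyConfigured"},
    {"type": "Degraded", "status": "False", "reason": "SuccessfullyConfigured"},
]

# Assumed literal value; the real constant lives in the library's Reason class.
SUCCESSFULLY_CONFIGURED = "SuccessfullyConfigured"


def succeeded_by_conditions(conditions):
    """Style used before the revert: inspect both type and status of each condition."""
    return any(
        cond.get("type") == "Available" and cond.get("status") == "True"
        for cond in conditions
    )


def succeeded_by_status(status):
    """Style restored by the revert: compare a single status value to a constant."""
    return status == SUCCESSFULLY_CONFIGURED
```

The first style is stricter about which condition carries the result; the second is shorter but relies on the status value already being derived from the right condition.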

Changes

File(s) | Change Summary
ocp_resources/node_network_configuration_policy.py | Updated wait_for_status_success to call wait_for_configuration_conditions_unknown_or_progressing and use a 1-second sampling interval with direct status checks.

Suggested labels

verified, can-be-merged, approved-myakove

Suggested reviewers

  • rnetser
  • EdDev
  • hmeir

@redhat-qe-bot1

Report bugs in Issues

The following are automatically added:

  • Add reviewers from OWNER file (in the root of the repository) under reviewers section.
  • Set PR size label.
  • New issue is created for the PR. (Closed when PR is merged/closed)
  • Run pre-commit if .pre-commit-config.yaml exists in the repo.

Available user actions:

  • To mark the PR as WIP, comment /wip on the PR; to remove the mark, comment /wip cancel.
  • To block merging of the PR, comment /hold; to unblock merging, comment /hold cancel.
  • To mark the PR as verified, comment /verified on the PR; to un-verify, comment /verified cancel.
    The verified label is removed on each new commit push.
  • To cherry-pick a merged PR, comment /cherry-pick <target branch to cherry-pick to> in the PR.
    • Multiple target branches can be cherry-picked, separated by spaces. (/cherry-pick branch1 branch2)
    • The cherry-pick starts when the PR is merged.
  • To build and push a container image, comment /build-and-push-container in the PR (the tag will be the PR number).
    • You can add extra args to the Podman build command.
      • Example: /build-and-push-container --build-arg OPENSHIFT_PYTHON_WRAPPER_COMMIT=<commit_hash>
  • To add a label by comment, use /<label name>; to remove it, use /<label name> cancel.
  • To assign reviewers based on the OWNERS file, use /assign-reviewers.
  • To check whether the PR can be merged, use /check-can-merge.
  • To assign a reviewer to the PR, use /assign-reviewer @<reviewer>.
Supported /retest check runs
  • /retest tox: Retest tox
  • /retest python-module-install: Retest python-module-install
  • /retest conventional-title: Retest conventional-title
  • /retest all: Retest all
Supported labels
  • hold
  • verified
  • wip
  • lgtm

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
ocp_resources/node_network_configuration_policy.py (1)

407-433: Consider adding a docstring to explain the method's behavior

While the implementation is clear, adding a docstring would improve maintainability by explaining the method's purpose, expected behavior, return values, and possible exceptions.

def wait_for_status_success(self):
+    """
+    Wait for the Node Network Configuration Policy to reach a successful state.
+    
+    This method first waits for the configuration conditions to be ready (unknown or progressing),
+    then samples the status until it's successfully configured or fails.
+    
+    Returns:
+        str: The status reason (SUCCESSFULLY_CONFIGURED) when successful
+        
+    Raises:
+        NNCPConfigurationFailed: If the configuration fails or no matching node is found
+        TimeoutExpiredError: If the configuration doesn't complete within the timeout
+    """
    failed_condition_reason = self.Conditions.Reason.FAILED_TO_CONFIGURE
    no_match_node_condition_reason = self.Conditions.Reason.NO_MATCHING_NODE
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 93bf8d1 and 0d37192.

📒 Files selected for processing (1)
  • ocp_resources/node_network_configuration_policy.py (3 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
ocp_resources/node_network_configuration_policy.py (4)
ocp_resources/exceptions.py (1)
  • NNCPConfigurationFailed (54-55)
ocp_resources/node_network_state.py (1)
  • NodeNetworkState (12-112)
ocp_resources/resource.py (6)
  • Resource (302-1295)
  • ResourceEditor (1411-1624)
  • patches (1450-1452)
  • update (911-925)
  • update (1454-1501)
  • status (899-909)
ocp_resources/utils/resource_constants.py (1)
  • Reason (43-47)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: python-module-install
  • GitHub Check: tox
  • GitHub Check: conventional-title
🔇 Additional comments (4)
ocp_resources/node_network_configuration_policy.py (4)

5-5: Updated imports to include retry from timeout_sampler

The inclusion of retry from timeout_sampler module is consistent with the requirements for the refactored code. This import is needed to support the updated implementation in this file.


14-14: Added TIMEOUT_5SEC constant import

The addition of the TIMEOUT_5SEC constant is appropriate. This constant is used in the _wait_for_nncp_status_update method (line 349), providing a defined interval for retries.


328-329: Commented out status transition time tracking code

The code for tracking and comparing the last successful transition time has been commented out. This is consistent with the reversion of changes from PR #2289, removing the mechanism that waited for NNCP status updates after interface changes.

Also applies to: 334-336
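
For readers unfamiliar with the mechanism being disabled, a transition-time check of this kind can be sketched roughly as below. This is a guess at the general idea, not the removed code: the helper names, the condition layout, and the timestamp format are all assumptions.

```python
from datetime import datetime

# Assumed timestamp layout, e.g. "2025-04-01T10:00:00Z".
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def available_transition_time(conditions):
    """Return lastTransitionTime of the Available=True condition, or None if absent."""
    for cond in conditions:
        if cond.get("type") == "Available" and cond.get("status") == "True":
            return datetime.strptime(cond["lastTransitionTime"], TIME_FORMAT)
    return None


def status_updated_since(conditions, initial_transition_time):
    """True once the Available condition carries a newer transition time."""
    current = available_transition_time(conditions)
    return current is not None and current > initial_transition_time
```

Recording the time before the update and polling `status_updated_since` afterwards guards against reading a stale success status left over from the previous configuration.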


411-426: Reverted wait_for_status_success implementation

This change reverts to a cleaner implementation of the status checking logic:

  1. First waits for the configuration conditions to report unknown or progressing, so that a stale status from a previous configuration is not sampled (avoiding race conditions)
  2. Uses a 1-second sampling interval instead of 5 seconds for more responsive status checking
  3. Directly compares the status value to predefined constants rather than iterating through condition objects

The implementation is more straightforward while maintaining the core functionality of waiting until the NNCP is either successfully configured or fails.
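
The sampling loop described above might look roughly like the sketch below. It is not the library's implementation: the exception class, the literal reason strings, the default timeout, and the injectable `get_status`, `sleep`, and `clock` parameters are assumptions chosen to keep the example self-contained.

```python
import time

# Assumed literal values for the reason constants named in the review.
SUCCESSFULLY_CONFIGURED = "SuccessfullyConfigured"
FAILED_TO_CONFIGURE = "FailedToConfigure"
NO_MATCHING_NODE = "NoMatchingNode"


class ConfigurationFailed(Exception):
    """Hypothetical stand-in for NNCPConfigurationFailed."""


def wait_for_status_success(get_status, timeout=240.0, interval=1.0,
                            sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() every `interval` seconds until success, failure, or timeout."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status == SUCCESSFULLY_CONFIGURED:
            return status
        if status in (FAILED_TO_CONFIGURE, NO_MATCHING_NODE):
            raise ConfigurationFailed(f"policy ended with reason {status}")
        if clock() >= deadline:
            raise TimeoutError(f"status still {status!r} after {timeout} seconds")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters is only a convenience for testing the loop without real delays.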

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
ocp_resources/node_network_configuration_policy.py (2)

14-14: Import includes unused constant

The import statement has been modified to include TIMEOUT_5SEC, but this constant appears to be used only in the _wait_for_nncp_status_update method, which is not directly called in the active code (it's commented out at line 336).

Consider removing unused imports to keep the code clean:

-from ocp_resources.utils.constants import TIMEOUT_1MINUTE, TIMEOUT_4MINUTES, TIMEOUT_5SEC
+from ocp_resources.utils.constants import TIMEOUT_1MINUTE, TIMEOUT_4MINUTES

394-396: Inappropriate log level for non-error message

Similar to the previous comment, the code is using logger.error for what appears to be informational output.

-self.logger.error(f"absent_ifaces: Before update resource: {initial_success_status_time}")
+self.logger.info(f"absent_ifaces: Before update resource: {initial_success_status_time}")
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0d37192 and bda2775.

📒 Files selected for processing (1)
  • ocp_resources/node_network_configuration_policy.py (4 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
ocp_resources/node_network_configuration_policy.py (3)
ocp_resources/exceptions.py (1)
  • NNCPConfigurationFailed (54-55)
ocp_resources/node_network_state.py (1)
  • NodeNetworkState (12-112)
ocp_resources/resource.py (5)
  • Resource (302-1295)
  • patches (1450-1452)
  • update (911-925)
  • update (1454-1501)
  • status (899-909)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: python-module-install
  • GitHub Check: tox
  • GitHub Check: conventional-title
🔇 Additional comments (2)
ocp_resources/node_network_configuration_policy.py (2)

5-5:

✅ Verification successful

Verify if all imported modules are necessary

The imports seem to have been modified to include TimeoutWatch and retry, but it's unclear if these are still required after the reversion of PR #2289. TimeoutWatch is used in wait_for_configuration_conditions_unknown_or_progressing, but retry doesn't appear to be used directly in this file.


🏁 Script executed:

#!/bin/bash
# Check for any usages of the retry function in this file
grep -n "retry(" ocp_resources/node_network_configuration_policy.py

Length of output: 85


Clarification on Imported Modules Necessity

After verifying the code, it's confirmed that the retry function imported from timeout_sampler is used in the file. Specifically, the @retry( decorator appears on line 348, making its inclusion necessary along with TimeoutWatch which is used in wait_for_configuration_conditions_unknown_or_progressing.

  • Usage of retry found at line 348.
  • TimeoutWatch remains essential in its respective function.
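
As background on what a retry-style decorator does, here is a minimal standard-library sketch. It is deliberately not the timeout_sampler API: the name `retry_until_truthy` and its parameters are hypothetical, and the real `retry` may differ in signature and behaviour.

```python
import functools
import time


def retry_until_truthy(wait_timeout, sleep, exceptions=(Exception,)):
    """Re-invoke the wrapped function until it returns a truthy value or time runs out."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            deadline = time.monotonic() + wait_timeout
            while True:
                try:
                    result = func(*args, **kwargs)
                    if result:
                        return result
                except exceptions:
                    pass  # swallow the listed exceptions and try again
                if time.monotonic() >= deadline:
                    raise TimeoutError(f"{func.__name__} did not succeed in {wait_timeout}s")
                time.sleep(sleep)
        return wrapper
    return decorator
```

A method such as _wait_for_nncp_status_update would be decorated with a timeout and a sleep interval (TIMEOUT_5SEC in the file under review) and simply return a truthy value once the awaited state is observed.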

414-429:

❓ Verification inconclusive

Significantly altered status checking logic

The implementation of wait_for_status_success has been completely rewritten. The key changes are:

  1. Added an initial call to wait_for_configuration_conditions_unknown_or_progressing
  2. Changed the sleep interval from the previous 5 seconds (likely) to 1 second
  3. Now directly checks the status value instead of iterating through condition objects

This change affects the core behavior of how the system waits for status updates. Testing should verify that this reversion addresses the issues encountered with PR #2289.

Since this is a reversion PR, ensure that thorough testing is performed to verify that the issues with PR #2289 are resolved and that this implementation works as expected with different timing conditions.


🏁 Script executed:

#!/bin/bash
# Description: Check for any references to this method in test files

# Check for test calls to wait_for_status_success method
echo "Looking for test references to wait_for_status_success method..."
rg -n "wait_for_status_success" --type py tests/

Length of output: 187


Critical: Manual Verification Required for Revised Wait Logic

The revised logic in wait_for_status_success shows significant changes:

  • An initial call to wait_for_configuration_conditions_unknown_or_progressing has been added.
  • The sleep interval is now 1 second instead of an assumed 5 seconds.
  • The status check now directly compares status values instead of iterating through condition objects.

Action Items:

  • Ensure that the new timing behavior has been thoroughly verified in real-world conditions.
  • Although searching the test files did not reveal direct references to wait_for_status_success, please confirm manually (or enhance tests) that the altered method correctly resolves the issues observed in PR #2289 ("Update wait_for_status_success() call to look at both type and status for status.conditions").
  • Verify that the early call to wait_for_configuration_conditions_unknown_or_progressing along with the changed sleep interval provides sufficient stability across various timing scenarios.

Comment on lines +329 to 337
         self.logger.error(f"absent_ifaces: Before update resource: {initial_success_status_time}")
         ResourceEditor(
             patches={self: {"spec": {"desiredState": {"interfaces": self.desired_state["interfaces"]}}}}
         ).update()

         # If the NNCP failed on setup, then its tear-down AVAIALBLE status will necessarily be the first.
-        if initial_success_status_time:
-            self._wait_for_nncp_status_update(initial_transition_time=initial_success_status_time)
+        # if initial_success_status_time:
+        #     self._wait_for_nncp_status_update(initial_transition_time=initial_success_status_time)

🛠️ Refactor suggestion

Inconsistent error logging and commented code

There are two issues here:

  1. The log message is using logger.error for what appears to be informational output
  2. The code for waiting for NNCP status updates has been commented out rather than removed

If this is a revert PR, consider doing both of the following:

  1. Properly removing the commented-out code rather than keeping it commented
  2. Using the appropriate log level (info instead of error) for non-error messages
-self.logger.error(f"absent_ifaces: Before update resource: {initial_success_status_time}")
+self.logger.info(f"absent_ifaces: Before update resource: {initial_success_status_time}")

-# if initial_success_status_time:
-#     self._wait_for_nncp_status_update(initial_transition_time=initial_success_status_time)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        self.logger.error(f"absent_ifaces: Before update resource: {initial_success_status_time}")
-        ResourceEditor(
-            patches={self: {"spec": {"desiredState": {"interfaces": self.desired_state["interfaces"]}}}}
-        ).update()
-        # If the NNCP failed on setup, then its tear-down AVAIALBLE status will necessarily be the first.
-        if initial_success_status_time:
-            self._wait_for_nncp_status_update(initial_transition_time=initial_success_status_time)
-        # if initial_success_status_time:
-        #     self._wait_for_nncp_status_update(initial_transition_time=initial_success_status_time)
+        self.logger.info(f"absent_ifaces: Before update resource: {initial_success_status_time}")
+        ResourceEditor(
+            patches={self: {"spec": {"desiredState": {"interfaces": self.desired_state["interfaces"]}}}}
+        ).update()
+        # If the NNCP failed on setup, then its tear-down AVAIALBLE status will necessarily be the first.

@dbasunag
Contributor

dbasunag commented Apr 1, 2025

This does not look like a clean revert, as it is removing other people's changes as well. Especially almost 3 months after the merge, maybe fixing what does not work would be simpler?

@myakove
Collaborator Author

myakove commented Apr 2, 2025

This does not look like a clean revert, as it is removing other people's changes as well. Especially almost 3 months after the merge, maybe fixing what does not work would be simpler?

Thanks, this is only for testing.

@myakove
Collaborator Author

myakove commented Apr 2, 2025

/hold

5 participants