-
Notifications
You must be signed in to change notification settings - Fork 630
clear stale policy status after the policy is orphaned #12883
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
clear stale policy status after the policy is orphaned #12883
Conversation
| } | ||
|
|
||
| // Create status marker if existing status has kgateway controller | ||
| var statusMarker *struct{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's define this as a concrete type StatusMarker that can be reused across different files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Defined in #12852, will update this once the other merged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/pluginsdk/types.go
Outdated
| GlobalPolicies func(krt.HandlerContext) ir.PolicyIR | ||
| Policies krt.Collection[ir.PolicyWrapper] | ||
| // ProcessPolicyStatusMarkers add empty reports for policies to clear stale status | ||
| ProcessPolicyStatusMarkers func(krt.HandlerContext, *reports.ReportMap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name could be changed to something more expressive, such as AddStaleStatusMarker
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is not to add the marker but the call back to process the markers. I think its important to specify its for policy in the name, so do you think changing this to ProcessPolicyStaleStatusMarkers is a bit too long?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On reading this further, it appears that the function simply creates an empty report for policies with status referencing our controller and relies on BuildPolicyStatus to remove the stale status. Is that correct? If not, where does the clearing of stale status happen?
ProcessPolicyStaleStatusMarker seems reasonable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that is correct, empty policy report entry in reportmap means empty status, so status syncer essentially clears the statuses.
Will change to ProcessPolicyStaleStatusMarker
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would make sense to add a unit test for BuildPolicyStatus to ensure that an empty report does not add the status from currentStatus passed to it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done and done
test/e2e/features/cors/suite.go
Outdated
| } | ||
|
|
||
| // TestTrafficPolicyClearStaleStatus verifies that stale status is cleared when targetRef becomes invalid | ||
| func (s *testingSuite) TestTrafficPolicyClearStaleStatus() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we test these using translator tests or a new suite for statuses? This test is not specific to the Cors feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, will move to a new suite
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think we should consider moving the e2e tests for stale policy and route status to a standalone status suite or to a "scenario" style e2e test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, I see the updated structure. I think this makes sense as defined in this PR for the policy tests.
I still think we should reevaluate where the HTTPRoute stale status test lives, but unrelated to this PR.
Signed-off-by: Alexander Liu <[email protected]>
Signed-off-by: Alexander Liu <[email protected]>
Signed-off-by: Alexander Liu <[email protected]>
Signed-off-by: Alexander Liu <[email protected]>
Signed-off-by: Alexander Liu <[email protected]>
2b45988 to
a0d74fe
Compare
internal/kgateway/extensions2/plugins/backendconfigpolicy/plugin.go
Outdated
Show resolved
Hide resolved
internal/kgateway/extensions2/plugins/backendtlspolicy/plugin.go
Outdated
Show resolved
Hide resolved
test/e2e/features/cors/suite.go
Outdated
| } | ||
|
|
||
| // TestTrafficPolicyClearStaleStatus verifies that stale status is cleared when targetRef becomes invalid | ||
| func (s *testingSuite) TestTrafficPolicyClearStaleStatus() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, I see the updated structure. I think this makes sense as defined in this PR for the policy tests.
I still think we should reevaluate where the HTTPRoute stale status test lives, but unrelated to this PR.
internal/kgateway/extensions2/plugins/backendconfigpolicy/plugin.go
Outdated
Show resolved
Hide resolved
| s.commonCols.Routes.ProcessHTTPRouteStatusMarkers(objStatus, merged) | ||
|
|
||
| for _, plugin := range s.plugins.ContributesPolicies { | ||
| if plugin.ProcessPolicyStaleStatusMarkers != nil && plugin.ProcessBackend == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this done to exclude backend policy plugins?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, backend policy and traffic policies uses different status reporting queue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a small concern with this approach, if we have plugins that contribute both backends and policy will we potentially lose this functionality?
Wondering how we can make this more explicit.
One option would be to have an explicit processBackendPolicyMarkers vs. processPolicyMarkers.
What do you think? Is that overkill?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the functionality will be fine as long as we make sure to report / generate its status in either backend or general status report, and given its gonna contribute to backend most likely it will be reported as a part of backend status reporting, which is what I have here.
We might need to have explicit processBackendPolicyMarkers vs processPolicyMarkers and detailed if check in that case, but right now I didn't want to add more fields into the plugin unless unnecessary, so I went with a check here. Not entirely a fan of checking the ProcessBackend either so if you think have separate is better we can do that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with your assessment.
Given it's not likely we'll hit this specific scenario anytime soon, I think we can move forward with this as-is.
Signed-off-by: Alexander Liu <[email protected]>
Signed-off-by: Alexander Liu <[email protected]>
lgadban
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work, thanks!
I'll approve with one minor question.
Let's chat tomorrow and either merge or follow up.
| s.commonCols.Routes.ProcessHTTPRouteStatusMarkers(objStatus, merged) | ||
|
|
||
| for _, plugin := range s.plugins.ContributesPolicies { | ||
| if plugin.ProcessPolicyStaleStatusMarkers != nil && plugin.ProcessBackend == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a small concern with this approach, if we have plugins that contribute both backends and policy will we potentially lose this functionality?
Wondering how we can make this more explicit.
One option would be to have an explicit processBackendPolicyMarkers vs. processPolicyMarkers.
What do you think? Is that overkill?
Description
When TrafficPolicy or HTTPListenerPolicy become orphan, ie all its TargetRefs are all invalid, it will not be picked up by translation phase due to its reverse lookup mechanism and thus will not receive any status update. This means any previous status message becomes stale and still remains on the policy's status. This PR resolves this problem by implementing the status collection approach discussed in the design doc linked below. Now the policy's status will be correctly cleared if the policy ever become orphaned.
Design doc: EP-11747
Fixes #11402
Changes
Change Type
/kind fix
Changelog