draft of changes to simplify the recording mechanism #2005

shawkins · 2023-08-08T12:25:56Z

cc @csviri @metacosm It seems like we can simply some of the mechanics discussed on fabric8io/kubernetes-client#4077 what are your thoughts on the following

Filtering your own creates / updates:

Add an annotation to the resource

https://github.com/operator-framework/java-operator-sdk/compare/main...shawkins:eventing?expand=1#diff-aa20588ab4b1ff4f171a897d4042d1b055c02737b067d09aaeb9cfd770adf3a0R132

This is checked later - if it matches, then we skip the event. If another actor modifies the resource and leaves this annotation alone or removes it, then we'll know it's an event that this KubernetesDependentResource was not responsible for.

https://github.com/operator-framework/java-operator-sdk/compare/main...shawkins:eventing?expand=1#diff-5947c6b7b60c804e59df2f5cfe62719dc51d3cc50b5e6cb1ddd1bc41323d3b77R162

Pros: Removes complexity associated with event recording.

Cons: Could be seen as "polluting" managed resources. Someone could also be "adversaral" and update this annotation for their own create / updates to make the operator ignore them.

Alternative: Don't worry about doing this - just allow the temporary cache check and the desired matcher minimize the api server calls. If the event comes before you get a chance to update the temporary cache, then you'll just end up processing it.

Update the temporary cache based solely upon the resourceVersion

https://github.com/operator-framework/java-operator-sdk/compare/main...shawkins:eventing?expand=1#diff-5947c6b7b60c804e59df2f5cfe62719dc51d3cc50b5e6cb1ddd1bc41323d3b77R267

Pros: Again removes the complexity associated with event recording.

Cons: The logic may move further ahead by basing a change on what is in the temporary cache - without first receiving the event. Once the original event is recieved, that will move the cache state backwards (although that same problem exists currently I believe).

Alternative: Parse the resourceVersion to determine the actual latest state - but this may break at any time if the api server either obfuscates or moves away from the etcd incrementing resourceVersions.

All operations are locked

https://github.com/operator-framework/java-operator-sdk/compare/main...shawkins:eventing?expand=1#diff-aa20588ab4b1ff4f171a897d4042d1b055c02737b067d09aaeb9cfd770adf3a0R131

I changed the non-SSA case to be locked, but I'm guessing it was probably unlocked for a reason.

Cons: If the Watch lags, or you mishandle the temporary cache, and some other actor makes change to the resource, you will not be able to reconcile that resource until the Watch is re-established. Also for SSA and a reconstructive controller it mostly shouldn't matter if you know the current state, you just need to assert your desired state.

csviri · 2023-08-08T13:10:07Z

This is checked later - if it matches, then we skip the event. If another actor modifies the resource and leaves this annotation alone or removes it, then we'll know it's an event that this KubernetesDependentResource was not responsible for.

This makes sense, seems radically simplifies the solution. However there might be some edge cases when this might not work. For example if an event received from previous update later (this probably a extremely rare case) but resource was already used from the cache during the update. (Although this is probably an issue also now).

Regarding polluting of the resource tbh I would not worry to much, this is a higher level abstraction, such an inconvenience I think is acceptable, at least I don't see any issues why it would not be. But, yes the current solution is nicer in this regard.

Alternative: Don't worry about doing this - just allow the temporary cache check and the desired matcher minimize the api server calls. If the event comes before you get a chance to update the temporary cache, then you'll just end up processing it.

I think this is a valid approach too, although once I measured this, and roughly in 50% cases I actually received the event before the resource ended up in cache. What was very surprising for me. Might worth to just repeat the experiment and see if this is the case also for DR (I was doing it for primary resources). If this works in most of the cases, we could just do it this way.

Cons: The logic may move further ahead by basing a change on what is in the temporary cache - without first receiving the event. Once the original event is recieved, that will move the cache state backwards (although that same problem exists currently I believe).

This is not the case, there is now guaranteed "monotonity". But for that need the optimistic locking. Actually if case of event recording we might not need that, might be possible without (for updates), but something have to think about.

I changed the non-SSA case to be locked, but I'm guessing it was probably unlocked for a reason.

I don't get this, it is locked now.

csviri · 2023-08-08T13:11:15Z

Alternative: Parse the resourceVersion to determine the actual latest state - but this may break at any time if the api server either obfuscates or moves away from the etcd incrementing resourceVersions.

The first implementations were doing this, not sure about solution like k3s and others if still use resourceVersion as a number in the backgroun, rather would keep to the contract.

csviri · 2023-08-08T13:15:54Z

But overall, I think we can simplify this, on the other hand it is working now. All the other solutions are simpler, but makes compromises that the event recording does not. And it's implemented already. Soo 🤷

metacosm · 2023-08-08T13:27:05Z

But overall, I think we can simplify this, on the other hand it is working now. All the other solutions are simpler, but makes compromises that the event recording does not. And it's implemented already. Soo 🤷

While I agree on the general approach that if it's not broken, we shouldn't fix it, I also think there's value, moving forward, to simplify things if we can as this will make it easier for people to contribute and maintain the code.

shawkins · 2023-08-08T13:29:01Z

This is not the case, there is now guaranteed "monotonity". But for that need the optimistic locking. Actually if case of event recording we might not need that, might be possible without (for updates), but something have to think about.

That's not what I'm saying. If you have the sequence:

update temporary cache to v1
some other event triggers reconcilitation, you work off of v1 without yet seeing the v1 event and create version v2 - arguably this may be indicative of problem with the matching logic, but as we have seen this does now happen with server side apply
now you get the event for v1 - you will remove the temporary cache entry and the actual cache will be at v1.

until you get the event for v2, you are operating off of a stale version.

I don't get this, it is locked now.

It is not -

java-operator-sdk/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/processing/dependent/kubernetes/KubernetesDependentResource.java

Line 157 in b66ef93

updatedResource = prepare(updatedActual, primary, "Updating").replace();

is calling replace, not update. Is there an invocation of lockResourceVersion that I'm missing?

csviri · 2023-08-08T13:31:29Z

is calling replace, not update. Is there an invocation of lockResourceVersion that I'm missing?

will create a PR to replace this to update

csviri · 2023-08-08T13:34:44Z

But overall, I think we can simplify this, on the other hand it is working now. All the other solutions are simpler, but makes compromises that the event recording does not. And it's implemented already. Soo shrug

While I agree on the general approach that if it's not broken, we shouldn't fix it, I also think there's value, moving forward, to simplify things if we can as this will make it easier for people to contribute and maintain the code.

I actually don't like that proverb :) I mean that we do some other compromises.

csviri · 2023-08-08T13:42:14Z

That's not what I'm saying. If you have the sequence:

update temporary cache to v1

some other event triggers reconcilitation, you work off of v1 without yet seeing the v1 event and create version v2 - arguably this may be indicative of problem with the matching logic, but as we have seen this does now happen with server side apply

now you get the event for v1 - you will remove the temporary cache entry and the actual cache will be at v1.

until you get the event for v2, you are operating off of a stale version.

yes, this is true, we could avoid this just by parsing the resource version.

csviri · 2023-08-08T14:12:03Z

Maybe we can discuss this on next community / fabric8 meeting? regarding what approach to take?

shawkins · 2023-08-08T17:03:07Z

Made a couple of changes:

aligned with fix: use update instead of replace in DR #2006 (it still makes sense to me though to just move up the call that is setting the resourceVersion for SSA - even if you change how the matcher works later that would still be correct)
because of where the tracking annotation is set, it needs filtered out of the SSA matching logic - that was causing the test failure as it would keep thinking the configmap needed updated, which had the side effect of deleting the pods. There's probably a better way to do this.

The integration tests are failing due to the changes to CreateUpdateEventFilterTestReconciler - that was manually exercising the dependent logic. With that removed there are 2 executions because we can't tell without the annotation that we're responsible for the create - but as expected the second execution does nothing.

As for the issue of putting the entry unconditionally into cache: #2006 (comment) that's done here conditionally - although with slightly different logic than what is in TemporaryResourceCache.putUpdatedResource In fact I don't think the code in this pr requires explicit optomistic locking for correctness - you should see the same eventual correctness either way.

While I agree on the general approach that if it's not broken, we shouldn't fix it, I also think there's value, moving forward, to simplify things if we can as this will make it easier for people to contribute and maintain the code.

I wouldn't say that things are broken, but there are implications to locking, temporary cache, and the SSA matching that don't degrade nicely - if the temporary cache gets off it blocks reconciliation when locked, and if the SSA matching is off then we can get stuck in a reconciliation loop. Ideally we want things to degrade as much as possible to simply an unomptimized state - I'm thinking that reducing some of the complexity may help with that.

csviri · 2023-08-09T10:52:39Z

...n/java/io/javaoperatorsdk/operator/processing/event/source/informer/InformerEventSource.java

+        if (oldObject == null && parts.length == 1) {
+          known = true;
+        } else if (oldObject != null && parts.length == 2
+            && oldObject.getMetadata().getResourceVersion().equals(parts[1])) {


Hmm, maybe I'm not following, but this shouldn't be newObject here what is compared with previous. So that we know that we just received has the same version as in the annotation (own). Basically saying that if the resource version did not change since we updated the resource we can skip.

Sorry ignore this, brain lag from my side.

The annotation added to what should become v2 will reference reference v1. If we are getting the event before the blocking update has completed, then newObject here is v2 and as long as the old is v1 (our basis for the change), we'll skip propagation.

The problematic case here (which we talked about a little with the current code, but with respect to the temporaryCache) is if the event for v2 is skipped. Normally this shouldn't happen. However it's possible when the informer must fully relist and thus skips ahead, which will generate an update event where the old is the current cache item v1 and the new is the latest, v3. That event would still be seen as "known". I don't think there's a great way to overcome that with local knowledge.

I don't think there's a great way to overcome that with local knowledge.

What I mean by that is that there is missing information from the event. Just like onDelete provides unknownFinalState, onUpdate/onAdd could convey whether they are from a relist such that we know the prior state isn't exact.

Yes. theoretically we could cache v2 version (we know that from update response) , and if the oldVersion == annotation version but newObjectVersion != v2 ==> emit the event

The remaining - exceedingly narrow case where this doesn't work:

starting with the temporary cache item as null, and informer cache state at v1

a full relist of the informer needs to be initiated - just a restart of the watch shouldn't cause a problem

the dependent resource initiates an update of v1, which will result in v2 - but the blocking call does not complete yet

on the api server in response to v2, v3 is generated by some other actor

the relist is processed such that it starts at v3, and emits an add event where the old state is v1 and the new is v3. The logic will detect this as "known", since it matches state expectations - temporary cache item is null and v1 matches the annotation.

This is what I mean to cover by this:

Yes. theoretically we could cache v2 version (we know that from update response) , and if the oldVersion == annotation version but newObjectVersion != v2 ==> emit the event

Yes. theoretically we could cache v2 version (we know that from update response)

You won't cache that because the update call is completing at the end of the above sequence and by that time the informer cache has already moved ahead to v3.

I mean caching it explicitly from the response.

there would be still a minor chance I guess that we receive v3 before the v2 is cached, but that is extremely low. (Probably won't happen in our lifetime :) )

there would be still a minor chance I guess that we receive v3 before the v2 is cached, but that is extremely low. (Probably won't happen in our lifetime :) )

Yes this is what I mean by extremely narrow.

...n/java/io/javaoperatorsdk/operator/processing/event/source/informer/InformerEventSource.java

csviri · 2023-08-09T10:58:57Z

Have to take a look on the test, but in general I like this idea very much. Simplifies the code, there is overhead with the annotation but that is negligible. Added some points to clarify, have to think through some cases, but I think we should go this way.

csviri · 2023-08-09T11:09:24Z

One additional though regarding optimistic locking (ssa version) , we saw sometimes that users use some data from the actual resource to create the desired for update. Without the optimistic locking, this might lead to some unexpected behavior, if some would need more strict consistency. So thinking about a feature flag for this (will create a separate issue). Just wanted to hear some opinions in case.

shawkins · 2023-08-10T11:37:57Z

To move this out of draft there are a couple of things to cover:

should the feature of adding the annotation be enabled/disabled via flag? If it's turned off, that just means we won't skip the event if it comes earlier than the update response. As mentioned above that may not be a big deal - operators that are working off of caches and that have effective matchers will just use a few extra cpu cycles.
Is there a better way to do this: https://github.com/operator-framework/java-operator-sdk/pull/2005/files#diff-7568f2dc58ce3a6a5d7ce2d5227aeb49402136b091fb19a4d8cadb28351ce709R93 - and does it need to be accountted for in the non-SSA case as well?
what tests to include - integration? or will some unit tests suffice

csviri · 2023-08-10T12:37:55Z

@shawkins one more note, it would be better to target the next branch with this PR, there is already a small feature there where will be probably conflicts but nothing too serious, in next we are developing the next minor/major version usually. Would like to have this rather as part of next minor version.

shawkins · 2023-08-10T13:10:23Z

Moving the last comment over to the new pr and closing this one.

openshift-ci bot requested review from adam-sandor and andreaTP August 8, 2023 12:26

csviri mentioned this pull request Aug 8, 2023

Should we do optimistic locking in KubernetesDependentResource when event recording is on? #2007

Closed

shawkins force-pushed the eventing branch from 6f12061 to de8b8ed Compare August 8, 2023 16:31

csviri requested changes Aug 9, 2023

View reviewed changes

shawkins force-pushed the eventing branch 2 times, most recently from f93fddc to 6c4b1d0 Compare August 9, 2023 12:51

draft of changes to simplify the recording mechanism

3714043

shawkins force-pushed the eventing branch from 6c4b1d0 to 3714043 Compare August 9, 2023 14:28

shawkins mentioned this pull request Aug 10, 2023

changes to simplify the recording mechanism #2012

Merged

shawkins closed this Aug 10, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

draft of changes to simplify the recording mechanism #2005

draft of changes to simplify the recording mechanism #2005

shawkins commented Aug 8, 2023

csviri commented Aug 8, 2023 •

edited

Loading

csviri commented Aug 8, 2023 •

edited

Loading

csviri commented Aug 8, 2023

metacosm commented Aug 8, 2023

shawkins commented Aug 8, 2023

csviri commented Aug 8, 2023

csviri commented Aug 8, 2023 •

edited

Loading

csviri commented Aug 8, 2023

csviri commented Aug 8, 2023

shawkins commented Aug 8, 2023 •

edited

Loading

csviri Aug 9, 2023

csviri Aug 9, 2023

shawkins Aug 9, 2023

shawkins Aug 9, 2023

csviri Aug 9, 2023 •

edited

Loading

csviri Aug 10, 2023

shawkins Aug 10, 2023

csviri Aug 10, 2023

csviri Aug 10, 2023

shawkins Aug 10, 2023

csviri commented Aug 9, 2023

csviri commented Aug 9, 2023

shawkins commented Aug 10, 2023

csviri commented Aug 10, 2023

shawkins commented Aug 10, 2023

draft of changes to simplify the recording mechanism #2005

draft of changes to simplify the recording mechanism #2005

Conversation

shawkins commented Aug 8, 2023

csviri commented Aug 8, 2023 • edited Loading

csviri commented Aug 8, 2023 • edited Loading

csviri commented Aug 8, 2023

metacosm commented Aug 8, 2023

shawkins commented Aug 8, 2023

csviri commented Aug 8, 2023

csviri commented Aug 8, 2023 • edited Loading

csviri commented Aug 8, 2023

csviri commented Aug 8, 2023

shawkins commented Aug 8, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

csviri Aug 9, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

csviri commented Aug 9, 2023

csviri commented Aug 9, 2023

shawkins commented Aug 10, 2023

csviri commented Aug 10, 2023

shawkins commented Aug 10, 2023

csviri commented Aug 8, 2023 •

edited

Loading

csviri commented Aug 8, 2023 •

edited

Loading

csviri commented Aug 8, 2023 •

edited

Loading

shawkins commented Aug 8, 2023 •

edited

Loading

csviri Aug 9, 2023 •

edited

Loading