-
Notifications
You must be signed in to change notification settings - Fork 562
OCPEDGE-1775: Enable separation of conflicting enum values, and drop DualReplica to DevPreview #2283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello @JoelSpeed! Some important instructions when contributing to openshift/api: |
/retitle OCPEDGE-1637: Enable separation of conflicting enum values, and drop DualReplica to DevPreview |
@JoelSpeed: This pull request references OCPEDGE-1637 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/retest-required |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions to try and build a context window of why these changes are necessary
// +openshift:validation:FeatureGateAwareEnum:featureGate=HighlyAvailableArbiter;DualReplica,enum=HighlyAvailable;HighlyAvailableArbiter;SingleReplica;DualReplica;External | ||
// +openshift:validation:FeatureGateAwareEnum:featureGate=HighlyAvailableArbiter,enum=HighlyAvailable;HighlyAvailableArbiter;SingleReplica;External | ||
// +openshift:validation:FeatureGateAwareEnum:featureGate=DualReplica,enum=HighlyAvailable;SingleReplica;DualReplica;External | ||
// +openshift:validation:FeatureGateAwareEnum:requiredFeatureGate=HighlyAvailableArbiter;DualReplica,enum=HighlyAvailable;HighlyAvailableArbiter;SingleReplica;DualReplica;External |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to introduce a dependence of HighlyAvailableArbiter
AND DualReplica
here?
Please correct me if I am wrong, but it seems like the goal here is to be able to have the values enforced by the two feature gates be in different feature set stages. It seems like a logical OR here is what we want and the manifests should be merged appropriately according to the feature gates enabled in each feature set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I gather this is more of a workaround due to a limitation of the upstream merge logic. The intention here is to describe in the marker api we want (HighlyAvailableArbiter
OR DualReplica
OR (HighlyAvailableArbiter
AND DualReplica
)).
That info is used to inform how we merge in the file, the reason this came about is because we can not merge in enum values, those simply get overwritten by the previous value. So this approach allows us to say (DevPreview = HighlyAvailableArbiter
AND DualReplica
) but (TechPreview = HighlyAvailableArbiter
). Otherwise if we do an OR approach with something like featureGate=HighlyAvailableArbiter;DualReplica
then only HighlyAvailableArbiter
gets applied for all of the featuresets due to the merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks for the explanation. I did some additional digging and it looks like we essentially mimic a server side apply for all the "partial" manifests. Because there is no specific SSA tags for the enum field I assume it defaults to atomic
meaning the entire list is replaced on each merge.
If this workaround gets us where we want to go, this seems fine to me temporarily. Longer term, I think I'd like to see if we can figure out a way to do this "better". I find the Foo+Bar.yaml
file and the intermediate Foo+Bar
feature gate to be a bit confusing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have intermediate Foo+Bar
files for other cases, for example, where an XValidation needs to apply only when both gates are enabled.
This change allows us to be more specific in the cases where gates are being enabled together, allowing us to say, "If only this gate, do X, if this gate and another gate, do Y"
It potentially could be more automated, but we would have to work out a better way to merge the gates in, which would be awkward given how much of the current logic is based on the SSA based merge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I think we're on the same page, I'll lgtm this PR for now but I'll keep the jira ticket referenced in this TODO in the backlog so we can come back to it or remove the TODO so it's not a done link.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JoelSpeed yeah, that makes sense. If we can't reasonably "do better" based on our current approach then it's fine. Just calling out something I've noticed that seems like a bit of a rougher edge.
// Sort the manifests such that any combination of feature gates is applied after the gates that it combines. | ||
// This means that if we have a file that is "foo+bar" and a file that is "foo", the "foo+bar" file will be applied last. | ||
// This enables more speicfic handling for combinations of feature gates that affect the same field. | ||
sort.Slice(partialManifestFiles, func(i, j int) bool { | ||
// Get the name of the files without the ".yaml" suffix. | ||
// This should be the name of the feature gate, or, a list of feature gates separated by `+`. | ||
iBase := strings.TrimSuffix(filepath.Base(partialManifestFiles[i].Name()), ".yaml") | ||
jBase := strings.TrimSuffix(filepath.Base(partialManifestFiles[j].Name()), ".yaml") | ||
|
||
return strings.Contains(jBase, "+") && strings.Contains(jBase, iBase) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it that sorting in this way is effective?
If we are doing a proper manifest merge, wouldn't applying foo+bar
followed by foo
be a no-op when applying foo
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the reason why this is ordered and the partial manifests that are a union of two feature gates are applied last is because the manifest merge can not merge enum values, so if two files are operating on the same list of enums the last one overrides the previous one. So we order and ensure the union file is applied last since that's the desired enum. Some more context on this starts in this conversation #2196 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Egli has it here
This is basically allowing us to override the merge of the individual gates by saying that something with multiple gates should take priority, when both gates are enabled
Hey @everettraven I tried to answer some of your questions, hope that provides some info, but I'm also relatively new to the codegen logic of this repo, @JoelSpeed would be the best one to provide better insight. |
/retest-required |
/retitle OCPEDGE-1775: Enable separation of conflicting enum values, and drop DualReplica to DevPreview Updating title to new bug so that we keep the original bug in the backlog since it was referenced in the TODO for the feature. This will be returned to to make sure we re-evaluate this approach or come to some consensus on a new solution. |
@JoelSpeed: This pull request references OCPEDGE-1775 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/test verify images |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: everettraven, JoelSpeed The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@eggfoobar Heads up that there is an active incident causing CI to be down. AFAIK, running tests and retests right now are likely to fail. |
@everettraven Would you have the ability to skip the |
@eggfoobar I'll give it a shot /override verify-crd-schema |
@everettraven: everettraven unauthorized: /override is restricted to Repo administrators, approvers in top level OWNERS file, and the following github teams:openshift: openshift-release-oversight openshift-staff-engineers. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
1 similar comment
/override ci/prow/verify-crd-schema Dropping a feature from TPNU, to be expected that it complains of enum removal. The enum will come back and hasn't shipped outside of nightlies and ECs yet |
@JoelSpeed: Overrode contexts on behalf of JoelSpeed: ci/prow/verify-crd-schema In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/retest |
/retest-required |
1 similar comment
/retest-required |
Merging to avoid retest loop. |
[ART PR BUILD NOTIFIER] Distgit: ose-cluster-config-api |
@JoelSpeed: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
…e Feature Set` to correlate with openshift/api#2283
…e Feature Set` to correlate with openshift/api#2283
…e Feature Set` to correlate with openshift/api#2283
…e Feature Set` to correlate with openshift/api#2283
…e Feature Set` to correlate with openshift/api#2283
CC @eggfoobar
This updates the way we apply manifests to make sure that where we have combinations (A+B), that these are applied after the constituent gates.
This means that we can control the order, and, where there are multiple gates affecting a single validation (enum), we can choose how to apply for either of the gates being enabled, or both gates being enabled.
This also drops the DualReplica gate to DevPreview, which is why we needed to find a solution for this