MON-4343: Cleanup deprecate pa config #2651
jan--f commented Aug 26, 2025
- I added CHANGELOG entry for this change.
- No user facing changes, so no entry in CHANGELOG was needed.
The final bit of #2648.
thanks, some suggestions
pkg/manifests/config.go
Outdated
var warning *InvalidConfigWarning
wCmc := defaultClusterMonitoringConfiguration()
wErr := UnmarshalStrict(content, &wCmc)
no need to propagate warnings for this.
we already capture the error below in err := UnmarshalStrict(content, &cmc)
hmm I see, I suppose I didn't understand #2592 fully.
in #2592, we had 2 different unmarshallers; we propagated the failures of the newer/stricter one as warnings only, to block upgrades until the configs are fixed.
now the newer/stricter one is the default/only one and we can (should) propagate its failures as errors.
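To illustrate the point above, here is a minimal sketch of a single strict decoder whose failures are returned as plain errors rather than warnings. It is not the operator's code: `ClusterConfig`, `PrometheusK8s`, and `parseStrict` are hypothetical names, and the standard library's `encoding/json` with `DisallowUnknownFields` stands in for the PR's `UnmarshalStrict`, whose implementation is not shown in this conversation.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// PrometheusK8s and ClusterConfig are hypothetical, trimmed-down stand-ins
// for the operator's configuration types.
type PrometheusK8s struct {
	Retention string `json:"retention"`
}

type ClusterConfig struct {
	PrometheusK8s *PrometheusK8s `json:"prometheusK8s"`
}

// parseStrict decodes content with a single strict decoder. Any unknown
// field is reported as an error; there is no second, lenient pass and no
// warning channel.
// Assumption: encoding/json is only a stand-in here. It matches keys
// case-insensitively, so unlike a strict YAML decoder it would not flag a
// field that merely has the wrong case.
func parseStrict(content []byte) (*ClusterConfig, error) {
	dec := json.NewDecoder(bytes.NewReader(content))
	dec.DisallowUnknownFields()

	cfg := &ClusterConfig{}
	if err := dec.Decode(cfg); err != nil {
		return nil, fmt.Errorf("invalid configuration: %w", err)
	}
	return cfg, nil
}

func main() {
	// A well-formed document decodes without error.
	_, err := parseStrict([]byte(`{"prometheusK8s": {"retention": "15d"}}`))
	fmt.Println("valid config:", err)

	// A misspelled field is rejected with an error, not downgraded to a warning.
	_, err = parseStrict([]byte(`{"prometheusK8s": {"retentoin": "15d"}}`))
	fmt.Println("misspelled field:", err)
}
```

With only one decoder in play, the caller has a single failure path to handle, which is why a parallel warning for the same unmarshalling failure would be redundant.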
pkg/manifests/config.go
Outdated
var warning *InvalidConfigWarning
wU := &UserWorkloadConfiguration{}
wErr := UnmarshalStrict([]byte(content), &wU)
same here, having err := UnmarshalStrict([]byte(content), &u) below is enough, no need to have it as a warning as well.
pkg/manifests/config_test.go
Outdated
@@ -265,7 +296,12 @@ thanosRuler:
for _, tc := range tcs {
t.Run(tc.name, func(t *testing.T) {
- c, err := NewUserConfigFromString(tc.configString())
+ c, warning, err := NewUserConfigFromString(tc.configString())
I think that warnings in all tests that don't have a promadapter field set should be nil
test/e2e/config_test.go
Outdated
f.AssertOperatorConditionReason(configv1.OperatorDegraded, "InvalidConfiguration")
f.AssertOperatorConditionReason(configv1.OperatorAvailable, "InvalidConfiguration")
// Check that the previous setup hasn't been reverted
f.AssertStatefulsetExists("prometheus-user-workload", f.UserWorkloadMonitoringNs)(t)
t.Log("invalid configuration with field of the wrong case")
no need to have those, as we don't want to propagate the same unmarshalling errors as warnings, see above.
test/e2e/config_test.go
Outdated
// Restore the first configuration.
f.MustCreateOrUpdateConfigMap(t, getUserWorkloadEnabledConfigMap(t, f))
t.Log("asserting that CMO goes back healthy after the configuration is fixed")
f.AssertOperatorCondition(configv1.OperatorDegraded, configv1.ConditionFalse)(t)
f.AssertOperatorCondition(configv1.OperatorAvailable, configv1.ConditionTrue)(t)
// Once the config is adjusted, the operator becomes Upgradeable.
we could keep this and replace it with a config with promAdapter set; check it goes Upgradeable=false when set and true when removed.
I see you added those checks in TestClusterMonitoringDeprecatedConfig
f.AssertOperatorCondition(configv1.OperatorDegraded, configv1.ConditionTrue)(t)
f.AssertOperatorCondition(configv1.OperatorAvailable, configv1.ConditionFalse)(t)
f.AssertOperatorConditionReason(configv1.OperatorDegraded, "UserWorkloadInvalidConfiguration")
f.AssertOperatorConditionReason(configv1.OperatorAvailable, "UserWorkloadInvalidConfiguration")
// Even when an invalid configuration is caught by both unmarshallers, the operator is still set
no longer needed, we just have one unmarshaller now and it only yields errors, no need to have warnings for it. see above.
2ddd33a to 07f5388
if c.ClusterMonitoringConfiguration.PrometheusK8sConfig.CollectionProfile != FullCollectionProfile && !c.CollectionProfilesFeatureGateEnabled {
- return fmt.Errorf("%w: collectionProfiles is currently a TechPreview feature behind the \"MetricsCollectionProfiles\" feature-gate, to be able to use a profile different from the default (\"full\") please enable it first", ErrConfigValidation)
+ return nil, fmt.Errorf("%w: collectionProfiles is currently a TechPreview feature behind the \"MetricsCollectionProfiles\" feature-gate, to be able to use a profile different from the default (\"full\") please enable it first", ErrConfigValidation)
not related to the PR: I think collectionProfile is GA in 4.19 and I think this logic can be cleaned now
cc @rexagod (I can create a ticket if needed)
@@ -791,7 +798,7 @@ func TestCollectionProfilePreCheck(t *testing.T) {
t.Run(tc.name, func(t *testing.T) {
c, err := NewConfigFromString(tc.config, true)
require.NoError(t, err)
- err = c.Precheck()
+ _, err = c.Precheck()
nit: we can set the warning and ensure it's nil, an easy extra check is always welcome.
test/e2e/config_test.go
Outdated
@@ -71,6 +71,9 @@ func TestClusterMonitoringOperatorConfiguration(t *testing.T) {
t.Log("asserting that CMO goes degraded after an invalid configuration is pushed")
f.AssertOperatorCondition(configv1.OperatorDegraded, configv1.ConditionTrue)(t)
f.AssertOperatorCondition(configv1.OperatorAvailable, configv1.ConditionFalse)(t)
// Even when an invalid configuration is caught by both unmarshallers, the operator is still set
leftover? also I don't think we do/need to make Upgradeable=false, not for such a case.
Yeah I see what you mean...thanks for your patience 😓
t.Log("restoring the initial configurations")
uwmCM.Data["config.yaml"] = ``
f.MustCreateOrUpdateConfigMap(t, uwmCM)
f.MustCreateOrUpdateConfigMap(t, getUserWorkloadEnabledConfigMap(t, f))
nit: do we need this f.MustCreateOrUpdateConfigMap(t, getUserWorkloadEnabledConfigMap(t, f)) line?
f.MustCreateOrUpdateConfigMap(t, getUserWorkloadEnabledConfigMap(t, f))
f.AssertOperatorCondition(configv1.OperatorDegraded, configv1.ConditionFalse)(t)
f.AssertOperatorCondition(configv1.OperatorAvailable, configv1.ConditionTrue)(t)
// Once the config is adjusted, the operator becomes Upgradeable.
nit: I think the operator should always stay Upgradeable when the config cannot be deserialized
07f5388 to ce0b4a1
/retest
/lgtm
… monitoring and set CMO to Upgradeable=false if errors are found The strict unmarshaller is currently in advisory mode. Setting CMO to Upgradeable=false ensures that configurations meet the new unmarshaller checks in 4.18 before it is set to be the default in 4.19. (cherry picked from commit 8e7bb77)
…config Signed-off-by: Jan Fajerski <[email protected]>
Signed-off-by: Jan Fajerski <[email protected]>
ce0b4a1 to 4370de4
rebased. @machine424 requires your lgtm again.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: jan--f, machine424
/label acknowledge-critical-fixes-only
/retitle MON-4343: Cleanup deprecate pa config
@jan--f: This pull request references MON-4343 which is a valid jira issue.
@jan--f: The following tests failed, say
92f7f83 into openshift:main
Apologies but this took out payloads due to ROSA use of the deprecated config: https://issues.redhat.com/browse/OCPBUGS-61135 Per org policy I need to pursue a revert.