-
Notifications
You must be signed in to change notification settings - Fork 107
✨ Support default mode webhook networking configuration #1035
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
✨ Support default mode webhook networking configuration #1035
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: bhperry The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
ed144a9
to
57533ab
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1035 +/- ##
==========================================
- Coverage 63.98% 63.84% -0.15%
==========================================
Files 198 198
Lines 19631 19678 +47
==========================================
+ Hits 12561 12563 +2
- Misses 6020 6065 +45
Partials 1050 1050
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
57533ab
to
455701f
Compare
WalkthroughThis change introduces configurable health probe and metrics bind addresses, as well as host networking options, for webhook servers in both registration and work components. The configuration is exposed through CRD schema extensions, deployment manifest templating, Go API types, and command-line flags, enabling dynamic port and address assignment for health checks and metrics endpoints. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant CRD
participant Operator
participant Deployment
participant Webhook Pod
User->>CRD: Set deployOption.default.registrationWebhookConfiguration.healthProbeBindAddress/metricsBindAddress/port/hostNetwork
CRD-->>Operator: Passes configuration via API
Operator->>Deployment: Renders manifests with configured values
Deployment->>Webhook Pod: Starts pod with specified ports, addresses, and host networking
Webhook Pod->>Webhook Pod: Listens on configured health probe and metrics addresses/ports
Assessment against linked issues
Assessment against linked issues: Out-of-scope changes
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (23)
🚧 Files skipped from review as they are similar to previous changes (23)
⏰ Context from checks skipped due to timeout of 90000ms (6)
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
manifests/config.go (1)
59-85
: Robust port parsing implementation with potential enhancement.The parsing logic correctly uses
net.SplitHostPort
and handles errors gracefully by returning port 0. This is a reasonable approach since:8000
format addresses will parse correctly, and malformed addresses safely default to disabled state.Consider logging parsing errors for better observability:
func parseHostPort(address string) (host string, port int32, err error) { host, portStr, err := net.SplitHostPort(address) if err != nil { + // Consider logging: "Failed to parse address %s: %v", address, err return host, port, err } port64, err := strconv.ParseInt(portStr, 10, 32) if err != nil { + // Consider logging: "Failed to parse port %s: %v", portStr, err return host, port, err } return host, int32(port64), nil }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
go.sum
is excluded by!**/*.sum
📒 Files selected for processing (23)
deploy/cluster-manager/chart/cluster-manager/crds/0000_01_operator.open-cluster-management.io_clustermanagers.crd.yaml
(1 hunks)deploy/cluster-manager/config/crds/0000_01_operator.open-cluster-management.io_clustermanagers.crd.yaml
(1 hunks)deploy/cluster-manager/olm-catalog/latest/manifests/cluster-manager.clusterserviceversion.yaml
(1 hunks)deploy/cluster-manager/olm-catalog/latest/manifests/operator.open-cluster-management.io_clustermanagers.yaml
(1 hunks)deploy/klusterlet/olm-catalog/latest/manifests/klusterlet.clusterserviceversion.yaml
(1 hunks)go.mod
(1 hunks)manifests/cluster-manager/hub/cluster-manager-registration-webhook-service.yaml
(1 hunks)manifests/cluster-manager/hub/cluster-manager-work-webhook-service.yaml
(1 hunks)manifests/cluster-manager/hub/cluster-manager-work-webhook-validatingconfiguration.yaml
(1 hunks)manifests/cluster-manager/management/cluster-manager-registration-webhook-deployment.yaml
(4 hunks)manifests/cluster-manager/management/cluster-manager-work-webhook-deployment.yaml
(5 hunks)manifests/config.go
(2 hunks)pkg/operator/operators/clustermanager/controllers/clustermanagercontroller/clustermanager_controller.go
(3 hunks)pkg/registration/webhook/option.go
(1 hunks)pkg/registration/webhook/start.go
(3 hunks)pkg/work/webhook/option.go
(1 hunks)pkg/work/webhook/start.go
(3 hunks)vendor/modules.txt
(2 hunks)vendor/open-cluster-management.io/api/operator/v1/0000_01_operator.open-cluster-management.io_clustermanagers.crd.yaml
(1 hunks)vendor/open-cluster-management.io/api/operator/v1/types_clustermanager.go
(3 hunks)vendor/open-cluster-management.io/api/operator/v1/zz_generated.deepcopy.go
(3 hunks)vendor/open-cluster-management.io/api/operator/v1/zz_generated.swagger_doc_generated.go
(3 hunks)vendor/open-cluster-management.io/api/work/v1/types.go
(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (7)
- GitHub Check: verify
- GitHub Check: integration
- GitHub Check: build
- GitHub Check: unit
- GitHub Check: e2e-singleton
- GitHub Check: e2e-hosted
- GitHub Check: e2e
🔇 Additional comments (50)
vendor/modules.txt (1)
1721-1722
: ```shell
#!/bin/bashCheck if the replace directive for open-cluster-management.io/api is still in go.mod
grep -n "replace .*open-cluster-management.io/api" go.mod
</details> <details> <summary>deploy/cluster-manager/olm-catalog/latest/manifests/cluster-manager.clusterserviceversion.yaml (1)</summary> `62-63`: **Timestamp bump only – LGTM** `createdAt` annotation updated; no functional impact. </details> <details> <summary>deploy/klusterlet/olm-catalog/latest/manifests/klusterlet.clusterserviceversion.yaml (1)</summary> `34-34`: **LGTM: Metadata timestamp update.** This is a standard timestamp refresh for the ClusterServiceVersion metadata with no functional impact. </details> <details> <summary>manifests/cluster-manager/hub/cluster-manager-registration-webhook-service.yaml (1)</summary> `17-18`: **LGTM: Port configuration made configurable.** The templating correctly replaces hardcoded ports with configurable values from `.RegistrationWebhook.Port`, maintaining consistency between `port` and `targetPort` fields. </details> <details> <summary>manifests/cluster-manager/hub/cluster-manager-work-webhook-service.yaml (1)</summary> `17-18`: **LGTM: Consistent port templating with registration webhook.** The port configuration correctly uses `.WorkWebhook.Port` template variable, maintaining consistency with the registration webhook service pattern. </details> <details> <summary>manifests/cluster-manager/hub/cluster-manager-work-webhook-validatingconfiguration.yaml (1)</summary> `20-21`: **LGTM: Proper separation of work and registration webhook configurations.** The changes correctly align the work webhook to use work-specific port and CA bundle configurations instead of registration webhook values, improving separation of concerns. </details> <details> <summary>pkg/registration/webhook/start.go (3)</summary> `14-14`: **LGTM: Proper metrics server import.** The metrics server import is correctly added to support the new configurable metrics bind address. --- `40-43`: **LGTM: Well-structured configurable address options.** The health probe and metrics bind addresses are properly configured using the new configurable options, replacing hardcoded values with flexible configuration. --- `59-70`: **LGTM: Proper conditional health check registration.** The conditional registration of health and readiness checks is well-implemented, preventing registration when health probes are disabled (`""` or `"0"`), which is the correct approach for optional health check endpoints. </details> <details> <summary>pkg/work/webhook/start.go (3)</summary> `16-16`: **LGTM: Added metrics server import.** The import for the metrics server package is correctly added to support the new metrics configuration. --- `45-48`: **LGTM: Configurable health probe and metrics bind addresses.** The hardcoded health probe address is properly replaced with the configurable `c.HealthProbeBindAddr`, and the new metrics server configuration follows the controller-runtime patterns correctly. --- `64-75`: **Good defensive programming for health probe registration.** The conditional registration logic properly handles cases where health probes should be disabled (empty string or "0" value), preventing unnecessary health check handlers from being added. </details> <details> <summary>vendor/open-cluster-management.io/api/work/v1/types.go (2)</summary> `506-509`: **LGTM: Work completion constant addition.** The `WorkComplete` constant and its documentation are well-defined and follow the existing pattern for work status constants. --- `511-516`: **LGTM: Work manifests completion constant and improved documentation.** The `WorkManifestsComplete` constant is well-documented and the expanded comment section provides good clarity about the prefixing convention for condition rule evaluation reasons. </details> <details> <summary>pkg/registration/webhook/option.go (3)</summary> `7-10`: **LGTM: Added configurable bind address fields.** The new `MetricsBindAddr` and `HealthProbeBindAddr` fields are properly added to the Options struct, maintaining consistency with the work webhook options structure. --- `16-19`: **LGTM: Reasonable default values.** The default values for metrics (":8080") and health probe (":8000") bind addresses are standard and reasonable choices that avoid common port conflicts. --- `25-28`: **LGTM: Well-named command line flags.** The command line flags `--metrics-bind-address` and `--health-probe-bind-address` follow proper kebab-case naming conventions and have clear descriptions. </details> <details> <summary>manifests/cluster-manager/management/cluster-manager-work-webhook-deployment.yaml (4)</summary> `16-25`: **Good conditional rolling update strategy for host networking.** The conditional rolling update strategy appropriately adjusts `maxUnavailable` based on replica count when host networking is enabled, which is important for avoiding deployment disruptions in host network scenarios. --- `78-80`: **LGTM: Dynamic webhook configuration.** The hardcoded webhook arguments are properly replaced with configurable template variables, enabling flexible port and bind address configuration. --- `97-111`: **Good conditional health probe configuration.** The health probe configuration is properly conditioned on `HealthProbePort` being greater than 0, which aligns with the webhook startup logic that conditionally registers health probes. --- `137-140`: **Proper host networking configuration.** The conditional host network and `ClusterFirstWithHostNet` DNS policy configuration is correctly implemented for scenarios requiring host networking access. </details> <details> <summary>deploy/cluster-manager/config/crds/0000_01_operator.open-cluster-management.io_clustermanagers.crd.yaml (3)</summary> `94-127`: **LGTM: Well-structured registration webhook configuration schema.** The registration webhook configuration schema is comprehensive, with appropriate default values, clear descriptions, and proper validation. The port validation with a maximum of 65535 is correct for TCP ports. --- `128-157`: **LGTM: Symmetric work webhook configuration schema.** The work webhook configuration schema mirrors the registration webhook configuration perfectly, maintaining consistency across the API. The default values (:8000 for health probes, :8080 for metrics, 9443 for webhook port) align with the webhook option defaults. --- `94-158`: **Verify consistency across CRD files.** Ensure that this CRD schema extension is consistently applied across all CRD files mentioned in the AI summary to maintain schema consistency. ```shell #!/bin/bash # Description: Verify that the default webhook configuration schema is consistent across all CRD files. # Find all ClusterManager CRD files fd -e yaml -e yml "clustermanagers.*crd" . | head -10 # Check if the default webhook configuration schema exists in other CRD files rg -A 20 "registrationWebhookConfiguration:" deploy/ vendor/ | head -50
vendor/open-cluster-management.io/api/operator/v1/zz_generated.swagger_doc_generated.go (3)
51-54
: LGTM! Documentation properly updated for new Default mode configuration.The swagger documentation correctly describes the new
default
field in ClusterManagerDeployOption and provides clear context about its purpose.
101-109
: LGTM! Well-documented API type for Default mode configuration.The DefaultClusterManagerConfiguration documentation is comprehensive and clearly describes the webhook configuration options for Default mode.
197-207
: LGTM! Comprehensive webhook configuration documentation.The WebhookDefaultConfiguration documentation includes all the necessary details including default values, field purposes, and important usage notes like the EKS with Calico CNI requirement for hostNetwork.
pkg/work/webhook/option.go (3)
7-12
: LGTM! Clean addition of configurable bind address fields.The new
MetricsBindAddr
andHealthProbeBindAddr
fields are properly integrated into the Options struct, following the existing pattern.
17-21
: LGTM! Default values consistent with CRD schema.The default values
:8080
for metrics and:8000
for health probes match the defaults specified in the CRD schema, ensuring consistency across the implementation.
27-30
: LGTM! Command-line flags properly implemented.The new flags follow the existing naming convention and provide clear descriptions for operators configuring the webhook server.
deploy/cluster-manager/chart/cluster-manager/crds/0000_01_operator.open-cluster-management.io_clustermanagers.crd.yaml (1)
94-158
: LGTM! Comprehensive CRD schema for Default mode webhook configuration.The new
default
configuration section provides excellent flexibility for webhook customization with:
- Consistent structure for both registration and work webhooks
- Sensible default values (port: 9443, health probe:
:8000
, metrics::8080
)- Proper validation with port ranges and format constraints
- Clear documentation explaining the hostNetwork use case for EKS with Calico CNI
- Well-structured schema that allows disabling health checks and metrics by setting values to "0" or ""
The schema design enables the flexibility needed for different deployment scenarios while maintaining good defaults.
manifests/cluster-manager/management/cluster-manager-registration-webhook-deployment.yaml (5)
16-25
: LGTM! Smart rolling update strategy for hostNetwork mode.The conditional rolling update strategy correctly addresses hostNetwork constraints where only one pod can run per node, using appropriate maxUnavailable values based on replica count.
78-80
: LGTM! Properly templated webhook server arguments.The webhook server arguments now use the configurable values from
.RegistrationWebhook
instead of hardcoded values, enabling dynamic configuration based on the ClusterManager CRD settings.
112-126
: LGTM! Conditional health probes with proper port configuration.The health probes are correctly configured to:
- Only run when
.RegistrationWebhook.HealthProbePort > 0
(allowing users to disable health checks)- Use the templated health probe port value
- Maintain the existing probe timing and endpoints
This provides flexibility while maintaining operational visibility when health checks are enabled.
128-128
: LGTM! Container port uses templated configuration.The container port correctly uses the templated
.RegistrationWebhook.Port
value, ensuring consistency with the webhook server configuration.
139-142
: LGTM! Proper hostNetwork configuration.The conditional hostNetwork configuration correctly:
- Enables
hostNetwork: true
when configured- Sets
dnsPolicy: ClusterFirstWithHostNet
for proper DNS resolution in hostNetwork mode- Addresses the EKS with Calico CNI use case mentioned in the PR objectives
This implementation ensures webhooks can be reached by the Kubernetes API server in environments that require hostNetwork mode.
vendor/open-cluster-management.io/api/operator/v1/0000_01_operator.open-cluster-management.io_clustermanagers.crd.yaml (1)
94-158
: Well-designed webhook configuration schema for Default mode.The new Default mode webhook configuration schema is comprehensive and well-structured. The field defaults are sensible (
:8000
for health probes,:8080
for metrics,9443
for webhook port), and the validation constraints are appropriate.manifests/config.go (2)
3-8
: Proper imports for address parsing functionality.The addition of
net
andstrconv
imports is appropriate for the new address parsing functionality.
50-57
: Well-structured Webhook struct extensions.The new fields properly extend the Webhook configuration to support host networking and configurable bind addresses for health probes and metrics.
deploy/cluster-manager/olm-catalog/latest/manifests/operator.open-cluster-management.io_clustermanagers.yaml (1)
94-158
: Consistent schema definition across deployment methods.The OLM catalog CRD schema correctly mirrors the vendor CRD definition, ensuring consistent webhook configuration options across different deployment methods (direct and OLM-based).
vendor/open-cluster-management.io/api/operator/v1/zz_generated.deepcopy.go (2)
171-175
: Correct deep copy handling for new Default field.The auto-generated deep copy method properly handles the new
Default
field inClusterManagerDeployOption
, ensuring safe copying of the webhook configuration.
298-314
: Proper deep copy methods for new webhook configuration types.The auto-generated deep copy methods for
DefaultClusterManagerConfiguration
andWebhookDefaultConfiguration
follow standard Go conventions and correctly implement the required functionality.Also applies to: 793-807
vendor/open-cluster-management.io/api/operator/v1/types_clustermanager.go (3)
235-244
: Well-designed API structure for Default mode configuration.The
DefaultClusterManagerConfiguration
struct follows the established pattern fromHostedClusterManagerConfiguration
and provides clear separation between registration and work webhook configurations. The optional fields and documentation are appropriate.
257-282
: Comprehensive webhook configuration with appropriate defaults and validation.The
WebhookDefaultConfiguration
struct provides all necessary fields for webhook customization in Default mode:
- Port validation with reasonable maximum (65535)
- Sensible default values matching controller constants
- Clear documentation explaining the HostNetwork requirement for EKS/Calico scenarios
- Proper optional field annotations
The design enables the flexibility required for different Kubernetes networking environments.
315-318
: Consistent API extension following existing patterns.The addition of the
Default
field toClusterManagerDeployOption
maintains symmetry with the existingHosted
field and follows Kubernetes API conventions with proper optional annotations.pkg/operator/operators/clustermanager/controllers/clustermanagercontroller/clustermanager_controller.go (5)
44-47
: Well-defined constants for default webhook configuration.The new constants provide sensible defaults:
- Port 9443 is the standard webhook port
- Health probe and metrics addresses use different ports to avoid conflicts
- Constants align with the kubebuilder defaults in the API types
162-162
: Good refactoring to centralize webhook configuration logic.Replacing the inline webhook configuration with a dedicated function improves code organization and makes the webhook configuration handling more maintainable. The function call cleanly separates concerns.
381-405
: Well-structured webhook configuration handling for different deployment modes.The
webhookConfigurations
function properly handles both Default and Hosted modes:
- Provides appropriate defaults when configuration is not specified
- Calls mode-specific conversion functions for clean separation of logic
- Maintains backward compatibility by defaulting to standard webhook port
The function signature clearly indicates it returns configurations for both registration and work webhooks.
407-416
: Correct hosted mode webhook configuration conversion.The
convertHostedWebhookConfiguration
function maintains the existing logic for hosted mode, correctly handling address, port, and IP format detection. The function preserves the current behavior for hosted deployments.
418-427
: Appropriate default mode webhook configuration conversion.The
convertDefaultWebhookConfiguration
function correctly maps the new API fields to the manifest webhook structure:
- Maps all relevant fields (port, health probe address, metrics address, host network)
- The comment clearly explains the difference from hosted mode
- Enables the required flexibility for different Kubernetes networking environments
Signed-off-by: Ben Perry <[email protected]>
455701f
to
0a659b3
Compare
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Summary
Add support for customization of network config on webhooks running in Default installation mode (i.e. running in hub cluster)
The primary reason to add this is because EKS with Calico CNI requires that webhooks run with
hostNetwork: true
, otherwise the webhook pods are unreachable from the k8s apiserver.In order to support the webhooks running in hostNetwork mode, it is important to be able to configure the ports that they bind to to avoid conflicts on nodes. To run in hostNetwork mode, a user might set clustermanager something like this:
This ensures the two webhooks will not conflict with each other, or other open ports on the nodes (depending on the particular setup of the cluster of course)
Related issue(s)
Related to (but does not fix) #1032
Depends on API change: open-cluster-management-io/api#379
Summary by CodeRabbit
New Features
Improvements
Chores