Commit dbdc267

KEP-3107: SecretRef field addition to NodeExpandVolume request
Issue #3107
Other comments: Prototype with working implementation kubernetes/kubernetes#105963
Signed-off-by: Humble Chirammal <[email protected]>
1 parent 366d029 commit dbdc267

2 files changed

+320
-0
lines changed

Lines changed: 278 additions & 0 deletions
@@ -0,0 +1,278 @@
# NodeExpandSecret for CSI Driver

## Table of Contents

<!-- toc -->
- [Release Signoff Checklist](#release-signoff-checklist)
- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
- [Proposal](#proposal)
  - [User stories](#user-stories)
    - [story 1](#story-1)
    - [story 2](#story-2)
    - [story 3](#story-3)
  - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional)
  - [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
  - [Test Plan](#test-plan)
  - [Graduation Criteria](#graduation-criteria)
    - [Alpha](#alpha)
    - [Beta](#beta)
    - [GA](#ga)
    - [Deprecation](#deprecation)
  - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
  - [Version Skew Strategy](#version-skew-strategy)
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
  - [Feature Enablement and Rollback](#feature-enablement-and-rollback)
  - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
  - [Monitoring Requirements](#monitoring-requirements)
  - [Dependencies](#dependencies)
  - [Scalability](#scalability)
  - [Troubleshooting](#troubleshooting)
- [Implementation History](#implementation-history)
- [Drawbacks](#drawbacks)
- [Alternatives](#alternatives)
- [Infrastructure Needed (Optional)](#infrastructure-needed-optional)
<!-- /toc -->

## Release Signoff Checklist

## Summary

This KEP proposes adding a `NodeExpandSecretRef` field to the CSI persistent
volume source, so that the CSI client can send the referenced secret to CSI
drivers as part of the `NodeExpandVolume` request and drivers can use it
during node expansion.

## Motivation

### Goals

- Introduce `secretRef` in the CSI persistent volume source.
- Allow CSI drivers to receive the secret referenced by `secretRef`, sent
  from the kubelet as part of the `NodeExpandVolume` operation.
- Support per-PVC secrets for volume resizing, similar to CSI attach and
  detach. To that end, this proposal extends the `CSIPersistentVolumeSource`
  object to contain `NodeExpandSecretRef`.

### Non-Goals

- Other CSI calls, e.g. `NodeStageVolume`, will not carry the secretRef
  in the request; this proposal is limited to the `NodeExpandVolume` operation.

## Proposal

Currently, CSI drivers have no way to make use of a secretRef during a node
operation (for example, node expansion), because the corresponding CSI request
does not carry a secret or credentials. Kubernetes CSI already implements a
similar mechanism for controller-side operations: a secretRef field is available
in the CSI PV source and is used when the `ControllerExpandVolume` request is
sent to the CSI driver. A similar field is missing from the node expansion
request.

### User stories

#### story 1
- At times, the CSI driver needs to check the actual size of the backend volume/image
  before proceeding with the filesystem resize, to avoid false-positive results from
  the resize operation.

#### story 2
- An encrypted device with LUKS needs the passphrase in order to resize
  the device on the node.

#### story 3
- For various validations at node expansion time, the CSI driver has to connect
  to the backend storage cluster. If the secretRef is part of the node expansion
  request, the CSI driver can use it to connect to the storage cluster
  and perform the cluster operations.

### Notes/Constraints/Caveats (Optional)

### Risks and Mitigations

## Design Details

```go
// pkg/apis/core/types.go

type CSIPersistentVolumeSource struct {
	// ... existing fields elided ...

	// nodeExpandSecretRef is a reference to the secret object containing sensitive
	// information to pass to the CSI driver to complete CSI node expansion.
	// +optional
	NodeExpandSecretRef *SecretReference
}
```
The `NodeExpandSecretRef` field above is optional.

To enable `NodeExpandSecretRef`, a new feature gate (`CSINodeExpandSecret`) has to be
introduced.

When the feature gate is enabled, the secret referenced by this field will be added
to the `NodeExpandVolume` request.
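
The gating described above could look roughly like the following on the kubelet side. This is only a sketch under stated assumptions: `nodeExpandRequest`, `secretGetter`, and `buildNodeExpandRequest` are hypothetical names introduced for illustration, the request type is a simplified stand-in for the CSI message, and the feature gate is reduced to a plain boolean.

```go
package main

import (
	"errors"
	"fmt"
)

// SecretReference mirrors the name/namespace shape of the Kubernetes type.
type SecretReference struct {
	Name      string
	Namespace string
}

// nodeExpandRequest is a hypothetical stand-in for the CSI
// NodeExpandVolumeRequest message; only the fields used here are modeled.
type nodeExpandRequest struct {
	VolumeID   string
	VolumePath string
	Secrets    map[string]string
}

// secretGetter is a hypothetical hook that returns the data of a secret.
type secretGetter func(namespace, name string) (map[string]string, error)

// buildNodeExpandRequest attaches secrets only when the feature gate is
// enabled and the volume carries a secret reference. In every other case the
// request is built without secrets.
func buildNodeExpandRequest(volumeID, volumePath string, ref *SecretReference,
	gateEnabled bool, get secretGetter) (*nodeExpandRequest, error) {
	req := &nodeExpandRequest{VolumeID: volumeID, VolumePath: volumePath}
	if !gateEnabled || ref == nil {
		return req, nil
	}
	secrets, err := get(ref.Namespace, ref.Name)
	if err != nil {
		return nil, fmt.Errorf("fetching secret %s/%s: %w", ref.Namespace, ref.Name, err)
	}
	req.Secrets = secrets
	return req, nil
}

func main() {
	// Fake secret store used only for this demonstration.
	get := func(namespace, name string) (map[string]string, error) {
		if namespace == "storage" && name == "expand-creds" {
			return map[string]string{"userKey": "example"}, nil
		}
		return nil, errors.New("secret not found")
	}
	ref := &SecretReference{Name: "expand-creds", Namespace: "storage"}
	for _, enabled := range []bool{false, true} {
		req, err := buildNodeExpandRequest("vol-1", "/mnt/vol-1", ref, enabled, get)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("gate=%v secrets=%d\n", enabled, len(req.Secrets))
	}
}
```

With the gate disabled, a set reference is simply ignored, which matches the rollback behaviour this KEP describes for the field.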

The secret reference will be taken from the StorageClass parameters
`csi.storage.k8s.io/node-expand-secret-name` and
`csi.storage.k8s.io/node-expand-secret-namespace`. Resizing secrets will support
the same templating rules as attach and detach, as documented at
<https://kubernetes-csi.github.io/docs/secrets-and-credentials.html#controller-publishunpublish-secret>.
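
As an illustration of how the two StorageClass parameters might be resolved into a secret reference, here is a minimal sketch. The function name `resolveNodeExpandSecretRef` is hypothetical, the requirement that both parameters be set together is an assumption, and only the `${pv.name}`, `${pvc.namespace}`, and `${pvc.name}` tokens are handled (the annotation-based token is left out).

```go
package main

import (
	"fmt"
	"strings"
)

const (
	nodeExpandSecretNameKey      = "csi.storage.k8s.io/node-expand-secret-name"
	nodeExpandSecretNamespaceKey = "csi.storage.k8s.io/node-expand-secret-namespace"
)

// SecretReference mirrors the name/namespace shape of the Kubernetes type.
type SecretReference struct {
	Name      string
	Namespace string
}

// resolveNodeExpandSecretRef (hypothetical) expands the templated StorageClass
// parameters into a concrete reference. It returns nil when neither parameter
// is present, meaning no secret is configured.
func resolveNodeExpandSecretRef(params map[string]string,
	pvName, pvcName, pvcNamespace string) (*SecretReference, error) {
	nameTmpl, hasName := params[nodeExpandSecretNameKey]
	nsTmpl, hasNamespace := params[nodeExpandSecretNamespaceKey]
	if !hasName && !hasNamespace {
		return nil, nil
	}
	// Assumption: a name without a namespace (or the reverse) is an error.
	if !hasName || !hasNamespace {
		return nil, fmt.Errorf("%q and %q must be set together",
			nodeExpandSecretNameKey, nodeExpandSecretNamespaceKey)
	}
	// The namespace template accepts fewer tokens than the name template.
	nsTokens := strings.NewReplacer(
		"${pv.name}", pvName,
		"${pvc.namespace}", pvcNamespace,
	)
	nameTokens := strings.NewReplacer(
		"${pv.name}", pvName,
		"${pvc.namespace}", pvcNamespace,
		"${pvc.name}", pvcName,
	)
	return &SecretReference{
		Name:      nameTokens.Replace(nameTmpl),
		Namespace: nsTokens.Replace(nsTmpl),
	}, nil
}

func main() {
	params := map[string]string{
		nodeExpandSecretNameKey:      "${pvc.name}-expand",
		nodeExpandSecretNamespaceKey: "${pvc.namespace}",
	}
	ref, err := resolveNodeExpandSecretRef(params, "pv-123", "data", "apps")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s/%s\n", ref.Namespace, ref.Name)
}
```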

CSI volumes that require secrets for online expansion will have the
`NodeExpandSecretRef` field set. If it is not set, the `NodeExpandVolume` CSI RPC
call will be made without a secret. Existing validation of the PersistentVolume
object will be relaxed to allow setting `NodeExpandSecretRef` for the first time,
so that CSI volume expansion can be supported for existing PVs.
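
One possible reading of the relaxed update validation is sketched below. The function `validateNodeExpandSecretRefUpdate` is a hypothetical name, and the rule that the field becomes immutable once set is an assumption made for the sketch, not something this KEP specifies.

```go
package main

import "fmt"

// SecretReference mirrors the name/namespace shape of the Kubernetes type.
type SecretReference struct {
	Name      string
	Namespace string
}

// validateNodeExpandSecretRefUpdate (hypothetical) allows the field to go
// from unset to set on an existing PV. Assumption: once a reference is set,
// changing or clearing it is rejected.
func validateNodeExpandSecretRefUpdate(oldRef, newRef *SecretReference) error {
	if oldRef == nil {
		// Unset before the update: setting it for the first time is allowed.
		return nil
	}
	if newRef == nil || *oldRef != *newRef {
		return fmt.Errorf("nodeExpandSecretRef cannot be changed once set")
	}
	return nil
}

func main() {
	a := &SecretReference{Name: "expand-creds", Namespace: "storage"}
	b := &SecretReference{Name: "other", Namespace: "storage"}
	fmt.Println("first set:", validateNodeExpandSecretRefUpdate(nil, a))
	fmt.Println("unchanged:", validateNodeExpandSecretRefUpdate(a, a))
	fmt.Println("changed:  ", validateNodeExpandSecretRefUpdate(a, b))
}
```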

CSI Spec 1.5 added the field below, which COs can use to pass the node expand
secret:

```
message NodeExpandVolumeRequest {
  ...
  // Secrets required by plugin to complete node expand volume request.
  // This field is OPTIONAL. Refer to the `Secrets Requirements`
  // section on how to use this field.
  map<string, string> secrets = 6
    [(csi_secret) = true, (alpha_field) = true];
}
```
Kubernetes will use this field to pass the secret referenced by
`NodeExpandSecretRef` in the `NodeExpandVolume` request.
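
On the driver side, consuming the `secrets` map could look roughly like this, using the LUKS story as the example. This is a sketch only: `nodeExpandVolumeRequest` is a simplified stand-in for the generated CSI message, and the key name `encryptionPassphrase` is a hypothetical choice that each driver would define for itself.

```go
package main

import (
	"errors"
	"fmt"
)

// nodeExpandVolumeRequest is a simplified stand-in for the CSI message;
// only the fields used in this sketch are modeled.
type nodeExpandVolumeRequest struct {
	VolumeID   string
	VolumePath string
	Secrets    map[string]string
}

// passphraseKey is a hypothetical key name chosen for this example.
const passphraseKey = "encryptionPassphrase"

// passphraseFromRequest extracts the passphrase a driver would need before
// resizing an encrypted device, and fails clearly when it is absent.
func passphraseFromRequest(req *nodeExpandVolumeRequest) (string, error) {
	if len(req.Secrets) == 0 {
		return "", errors.New("no secrets supplied in the node expand request")
	}
	passphrase, ok := req.Secrets[passphraseKey]
	if !ok || passphrase == "" {
		return "", fmt.Errorf("secret key %q is missing or empty", passphraseKey)
	}
	return passphrase, nil
}

func main() {
	req := &nodeExpandVolumeRequest{
		VolumeID:   "vol-1",
		VolumePath: "/mnt/vol-1",
		Secrets:    map[string]string{passphraseKey: "example-passphrase"},
	}
	passphrase, err := passphraseFromRequest(req)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Avoid printing the secret itself; report only that it was received.
	fmt.Println("passphrase received, length:", len(passphrase))
}
```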

### Test Plan
- Unit tests around all the added logic in the kubelet.
- Unit tests around all the added logic in the API server.
- E2E tests around `NodeExpandVolume` to make sure the field value is passed
  and can be used.

### Graduation Criteria

#### Alpha

- Implemented the feature.
- Wrote all the unit and E2E tests.

#### Beta

- Deployed the feature in production and went through at least one minor
  Kubernetes version.

#### GA

#### Deprecation

### Upgrade / Downgrade Strategy

### Version Skew Strategy

## Production Readiness Review Questionnaire

### Feature Enablement and Rollback

- **How can this feature be enabled / disabled in a live cluster?**

  - Feature gate name: CSINodeExpandSecret
  - Components depending on the feature gate: kubelet, kube-apiserver
  - Will enabling / disabling the feature require downtime of the control
    plane? no.
  - Will enabling / disabling the feature require downtime or reprovisioning
    of a node? yes.

- **Does enabling the feature change any default behavior?** no.

- **Can the feature be disabled once it has been enabled (i.e. can we roll
  back the enablement)?** yes. If the feature gate is rolled back while the
  `NodeExpandSecretRef` field is set, the field will still exist but will be
  ignored.

- **What happens if we reenable the feature if it was previously rolled
  back?** nothing, as long as the new `NodeExpandSecretRef` field is not used.

- **Are there any tests for feature enablement/disablement?** yes, unit tests
  will cover this.

### Rollout, Upgrade and Rollback Planning

TBD

###### How can a rollout or rollback fail? Can it impact already running workloads?

TBD

###### What specific metrics should inform a rollback?

TBD

###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

TBD

###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

TBD

### Monitoring Requirements

TBD

###### How can an operator determine if the feature is in use by workloads?

TBD

###### How can someone using this feature know that it is working for their instance?

TBD

###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?

TBD

###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

TBD

###### Are there any missing metrics that would be useful to have to improve observability of this feature?

TBD

### Dependencies

TBD

###### Does this feature depend on any specific services running in the cluster?

TBD

### Scalability

- **Will enabling / using this feature result in any new API calls?**
  no.

- **Will enabling / using this feature result in introducing new API types?**
  no.

- **Will enabling / using this feature result in any new calls to the cloud
  provider?** no.

- **Will enabling / using this feature result in increasing size or count of
  the existing API objects?** no.

- **Will enabling / using this feature result in increasing time taken by any
  operations covered by [existing SLIs/SLOs]?** no.

- **Will enabling / using this feature result in non-negligible increase of
  resource usage (CPU, RAM, disk, IO, ...) in any components?** no.

### Troubleshooting

## Implementation History

- 18/01/2022: Implementation started

## Drawbacks

## Alternatives

1. Instead of receiving the secret in the node expansion request, CSI drivers
   can store it somewhere in the cluster and use it during node expansion.
   However, this is a hacky approach and not what CSI driver authors want.

## Infrastructure Needed (Optional)

---
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
title: SecretRef field addition to NodeExpandVolume request
kep-number: 3107
authors:
  - "@humblec"
owning-sig: sig-storage
participating-sigs:
  - sig-storage
  - sig-api
status: provisional
creation-date: 2022-01-23
reviewers:
  - TBD
approvers:
  - TBD

see-also:
  - TBD

# The target maturity stage in the current dev cycle for this KEP.
stage: alpha

# The most recent milestone for which work toward delivery of this KEP has been
# done. This can be the current (upcoming) milestone, if it is being actively
# worked on.
latest-milestone: "v1.24"

# The milestone at which this feature was, or is targeted to be, at each stage.
milestone:
  alpha: "v1.24"
  beta: "v1.25"
  stable: "v1.26"

# The following PRR answers are required at alpha release
# List the feature gate name and the components for which it must be enabled
feature-gates:
  - name: CSINodeExpandSecret
    components:
      - kubelet
disable-supported: true

# The following PRR answers are required at beta release
metrics: []
