Commit fbbaf83

KEP-2033: KubeletInUserNamespace: promote to beta
Signed-off-by: Akihiro Suda <[email protected]>
1 parent 0afcd1b commit fbbaf83

3 files changed: +98 -44 lines changed
Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1,3 +1,5 @@
11
kep-number: 2033
22
alpha:
33
approver: "@ehashman"
4+
beta:
5+
approver: "@soltysh"

keps/sig-node/2033-kubelet-in-userns-aka-rootless/README.md

Lines changed: 89 additions & 41 deletions
Original file line number | Diff line number | Diff line change
@@ -148,20 +148,20 @@ checklist items _must_ be updated for the enhancement to be released.
148148

149149
Items marked with (R) are required *prior to targeting to a milestone / release*.
150150

151-
- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
152-
- [ ] (R) KEP approvers have approved the KEP status as `implementable`
153-
- [ ] (R) Design details are appropriately documented
154-
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
155-
- [ ] e2e Tests for all Beta API Operations (endpoints)
156-
- [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
157-
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
151+
- [X] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
152+
- [X] (R) KEP approvers have approved the KEP status as `implementable`
153+
- [X] (R) Design details are appropriately documented
154+
- [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
155+
- [N/A] e2e Tests for all Beta API Operations (endpoints)
156+
- [N/A] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
157+
- [N/A] (R) Minimum Two Week Window for GA e2e tests to prove flake free
158158
- [ ] (R) Graduation criteria is in place
159-
- [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
159+
- [N/A] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
160160
- [ ] (R) Production readiness review completed
161161
- [ ] (R) Production readiness review approved
162-
- [ ] "Implementation History" section is up-to-date for milestone
163-
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
164-
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
162+
- [X] "Implementation History" section is up-to-date for milestone
163+
- [X] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
164+
- [X] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
165165

166166
<!--
167167
**Note:** This checklist is iterative and should be reviewed and updated every time this enhancement is being considered for a milestone.
@@ -493,7 +493,11 @@ The patch modifies `kubelet` to ignore errors that happens during setting the fo
493493
#### kube-proxy
494494
Patch: ["kube-proxy: allow running in userns"](https://github.com/rootless-containers/usernetes/blob/v20210303.0/src/patches/kubernetes/0002-kube-proxy-allow-running-in-userns.patch)
495495

496-
The patch modifies `kube-proxy` to ignore an error during setting `RLIMIT_NOFILE`.
496+
The patch modifies `kube-proxy` (`userspace` mode) to ignore an error that occurs while setting `RLIMIT_NOFILE`.
497+
No change is needed for non-userspace mode.
498+
499+
> **Note**
500+
> The `userspace` proxy mode was removed in Kubernetes v1.26.
497501

498502
### Test Plan
499503

@@ -508,19 +512,19 @@ when drafting this test plan.
508512
[testing-guidelines]: https://git.k8s.io/community/contributors/devel/sig-testing/testing.md
509513
-->
510514

511-
[ ] I/we understand the owners of the involved components may require updates to
515+
[X] I/we understand the owners of the involved components may require updates to
512516
existing tests to make this code solid enough prior to committing the changes necessary
513517
to implement this enhancement.
514518

515-
Tests are present in several subproject repos and third party repos:
516-
- https://github.com/kubernetes-sigs/kind/blob/v0.17.0/.github/workflows/cgroup2.yaml#L24
517-
- https://github.com/kubernetes/minikube/blob/v1.29.0/.github/workflows/pr.yml#L293-L410
518-
- https://github.com/k3s-io/k3s/blob/v1.26.1+k3s1/.github/workflows/cgroup.yaml#L92-L99
519-
- https://github.com/rootless-containers/usernetes/blob/v20221007.0/.cirrus.yml
519+
See [e2e tests](#e2e-tests) below.
520520

521-
Tests will be added to `kubernetes/test-infra` as well when the [`k8s-infra-prow-build`](https://github.com/kubernetes/k8s.io/blob/a071c4ed0823f193ee29e2f14e191be42dc1a1f0/infra/gcp/terraform/k8s-infra-prow-build/main.tf#L78) cluster
522-
is upgraded to use cgroup v2.
523-
This will probably automatically happen when [GKE bumps up their "regular" channel to Kubernetes v1.26 or later](https://cloud.google.com/kubernetes-engine/docs/how-to/node-system-config).
521+
Additional tests are present in several subproject repos and third-party repos:
522+
- https://github.com/kubernetes-sigs/kind/blob/v0.29.0/.github/workflows/vm.yaml#L24
523+
- https://github.com/kubernetes/minikube/blob/v1.36.0/.github/workflows/pr.yml#L299-L415
524+
- https://github.com/k3s-io/k3s/blob/v1.33.1%2Bk3s1/.github/workflows/e2e.yaml#L56
525+
- https://github.com/rootless-containers/usernetes/blob/gen2-v20250501.0/.github/workflows/main.yaml
526+
- Covers multi-node clusters with Flannel (VXLAN)
527+
- Covers several host distributions (Ubuntu, CentOS Stream, and Fedora)
524528

525529
##### Prerequisite testing updates
526530

@@ -550,7 +554,7 @@ This can inform certain test coverage improvements that we want to do before
550554
extending the production code to implement this enhancement.
551555
-->
552556

553-
- `<package>`: `<date>` - `<test coverage>`
557+
N/A, as unit tests do not make sense here.
554558

555559
##### Integration tests
556560

@@ -576,7 +580,7 @@ This can be done with:
576580
- a search in the Kubernetes bug triage tool (https://storage.googleapis.com/k8s-triage/index.html)
577581
-->
578582

579-
- [test name](https://github.com/kubernetes/kubernetes/blob/2334b8469e1983c525c0c6382125710093a25883/test/integration/...): [integration master](https://testgrid.k8s.io/sig-release-master-blocking#integration-master?include-filter-by-regex=MyCoolFeature), [triage search](https://storage.googleapis.com/k8s-triage/index.html?test=MyCoolFeature)
583+
N/A, as integration tests do not make sense here.
580584

581585
##### e2e tests
582586

@@ -595,7 +599,31 @@ We expect no non-infra related flakes in the last month as a GA graduation crite
595599
If e2e tests are not necessary or useful, explain why.
596600
-->
597601

598-
- [test name](https://github.com/kubernetes/kubernetes/blob/2334b8469e1983c525c0c6382125710093a25883/test/e2e/...): [SIG ...](https://testgrid.k8s.io/sig-...?include-filter-by-regex=MyCoolFeature), [triage search](https://storage.googleapis.com/k8s-triage/index.html?test=MyCoolFeature)
602+
`NodeConformance` tests are executed using [kubetest2-kindinv](https://github.com/rootless-containers/kubetest2-kindinv).
603+
604+
"kindinv" stands for "Kubernetes in (Rootless) Docker in (GCE) VM".
605+
A GCE VM is used to provide systemd, which Rootless Docker requires to set up cgroup v2.
606+
607+
```bash
608+
exec kubetest2 kindinv \
609+
--boskos-location=http://boskos.test-pods.svc.cluster.local \
610+
--gcp-zone=us-central1-b \
611+
--instance-image=ubuntu-os-cloud/ubuntu-2204-lts \
612+
--instance-type=n2-standard-4 \
613+
--kind-rootless \
614+
--user=rootless \
615+
--build \
616+
--up \
617+
--down \
618+
--test=ginkgo \
619+
-- \
620+
--focus-regex='\[NodeConformance\]' \
621+
--skip-regex='\[Environment:NotInUserNS\]|\[Slow\]' \
622+
--parallel=8
623+
```
624+
625+
- Prow manifest: https://github.com/kubernetes/test-infra/blob/4b7824ff1cfe00c36062035ab6aea3bb6c2e6ba2/config/jobs/kubernetes/sig-testing/kubernetes-kind.yaml#L615-L678
626+
- Logs: https://prow.k8s.io/job-history/gs/kubernetes-ci-logs/logs/ci-kubernetes-e2e-kind-rootless
599627

600628
### Graduation Criteria
601629

@@ -677,9 +705,7 @@ in back-to-back releases.
677705

678706
- Beta: e2e tests coverage.
679707
Requires [the cgroup v2 KEP](../20191118-cgroups-v2.md ) to reach Beta or GA.
680-
To move to beta, we need clarity if we intend to define two separate types of conformance suites:
681-
- kubernetes clusters that can run privileged workloads
682-
- kubernetes cluster that are restricted to run unprivileged workloads only
708+
The e2e coverage is provided by the `NodeConformance` tests (see above).
683709

684710
- GA: Assuming no negative user feedback based on production experience, promote after >= 2 releases in beta.
685711
Requires [the cgroup v2 KEP](../20191118-cgroups-v2.md ) to reach GA.
@@ -715,7 +741,8 @@ enhancement:
715741
CRI or CNI may require updating that component before the kubelet.
716742
-->
717743

718-
N/A
744+
N/A.
745+
This KEP only affects the internals of the kubelet, and does not affect any API.
719746

720747
## Production Readiness Review Questionnaire
721748

@@ -761,7 +788,7 @@ well as the [existing list] of feature gates.
761788

762789
- [X] Feature gate (also fill in values in `kep.yaml`)
763790
- Feature gate name: `KubeletInUserNamespace`
764-
- Components depending on the feature gate:
791+
- Components depending on the feature gate: kubelet
765792
- [ ] Other
766793
- Describe the mechanism:
767794
- Will enabling / disabling the feature require downtime of the control
@@ -784,7 +811,8 @@ Any change of default behavior may be surprising to users or break existing
784811
automations, so be extremely careful here.
785812
-->
786813

787-
During Alpha, we will document what workloads will work and what will not work.
814+
The limitations are the same as those of Rootless Docker, Podman, etc.
815+
See <https://rootlesscontaine.rs/caveats/>.
788816

789817
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
790818

@@ -799,11 +827,11 @@ feature.
799827
NOTE: Also set `disable-supported` to `true` or `false` in `kep.yaml`.
800828
-->
801829

802-
N/A, as switching back rootless to rootful requires redeploying the kubelet, and vice versa.
830+
Yes, by turning off the feature gate.
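
As an illustration only, the gate can be toggled through the kubelet configuration file. The sketch below prints a minimal `KubeletConfiguration` fragment; the helper name `kubelet_userns_config` is hypothetical and not part of this KEP, and a real node configuration would carry many more fields.

```bash
# Hypothetical helper: print a minimal KubeletConfiguration fragment that
# sets the KubeletInUserNamespace feature gate to "true" or "false".
kubelet_userns_config() {
  case "$1" in
    true|false) ;;
    *) echo "usage: kubelet_userns_config true|false" >&2; return 1 ;;
  esac
  cat <<EOF
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  KubeletInUserNamespace: $1
EOF
}
```

For example, `kubelet_userns_config false` prints a fragment with the gate turned off. The kubelet has to be restarted with the updated configuration for the change to take effect.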
803831

804832
###### What happens if we reenable the feature if it was previously rolled back?
805833

806-
N/A.
834+
Nothing happens.
807835

808836
###### Are there any tests for feature enablement/disablement?
809837

@@ -820,17 +848,14 @@ You can take a look at one potential example of such test in:
820848
https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282
821849
-->
822850

823-
CI will run `kind` (Kubernetes in Docker) tests with Rootless Docker/Podman.
824-
Tests with a real cluster will be added later as well.
851+
Yes. See [Test Plan](#test-plan).
825852

826853
### Rollout, Upgrade and Rollback Planning
827854

828855
<!--
829856
This section must be completed when targeting beta to a release.
830857
-->
831858

832-
This section will be fulfilled when targeting beta graduation to a release.
833-
834859
###### How can a rollout or rollback fail? Can it impact already running workloads?
835860

836861
<!--
@@ -843,13 +868,22 @@ rollout. Similarly, consider large clusters and how enablement/disablement
843868
will rollout across nodes.
844869
-->
845870

871+
Rollout: rolling out requires creating a new node instance inside a user namespace (UserNS).
872+
Typical failures:
873+
- [subuids are not allocated](https://rootlesscontaine.rs/getting-started/common/subuid/)
874+
- [cgroup v2 delegation is not enabled](https://rootlesscontaine.rs/getting-started/common/cgroup2/)
875+
876+
Rollback: this question is not applicable. Rolling back requires recreating a new node instance.
877+
846878
###### What specific metrics should inform a rollback?
847879

848880
<!--
849881
What signals should users be paying attention to when the feature is young
850882
that might indicate a serious problem?
851883
-->
852884

885+
CrashLoopBackOffs
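
One possible way to watch this signal on a single node is to count the pods that `kubectl` reports as crash-looping. This is a minimal sketch: the helper name `count_crashloops` is hypothetical, and it assumes the default column layout of `kubectl get pods --all-namespaces` (NAMESPACE, NAME, READY, STATUS, ...).

```bash
# Hypothetical helper: count pods whose STATUS column mentions CrashLoopBackOff.
# Reads `kubectl get pods --all-namespaces` style output on stdin and skips
# the header row. Also matches variants such as Init:CrashLoopBackOff.
count_crashloops() {
  awk 'NR > 1 && $4 ~ /CrashLoopBackOff/ { n++ } END { print n + 0 }'
}
```

Usage, with `<node>` standing in for the name of the rootless node: `kubectl get pods --all-namespaces --field-selector spec.nodeName=<node> | count_crashloops`.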
886+
853887
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
854888

855889
<!--
@@ -858,12 +892,16 @@ Longer term, we may want to require automated upgrade/rollback tests, but we
858892
are missing a bunch of machinery and tooling and can't do that now.
859893
-->
860894

895+
This question is not applicable. Rolling out and rolling back both require creating a new node instance.
896+
861897
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
862898

863899
<!--
864900
Even if applying deprecation policies, they may still surprise some users.
865901
-->
866902

903+
No
904+
867905
### Monitoring Requirements
868906

869907
<!--
@@ -881,7 +919,7 @@ checking if there are objects with field X set) may be a last resort. Avoid
881919
logs or events for this purpose.
882920
-->
883921

884-
N/A
922+
An operator can check whether a Pod is running on a node whose kubelet runs in a user namespace (UserNS).
885923

886924
###### How can someone using this feature know that it is working for their instance?
887925

@@ -894,8 +932,8 @@ and operation of this feature.
894932
Recall that end users cannot usually observe component logs or access metrics.
895933
-->
896934

897-
- [ ] Events
898-
- Event Reason:
935+
- [X] Events
936+
- Event Reason: No CrashLoopBackOff
899937
- [ ] API .status
900938
- Condition name:
901939
- Other field:
@@ -919,7 +957,7 @@ These goals will help you determine what you need to measure (SLIs) in the next
919957
question.
920958
-->
921959

922-
N/A
960+
99.9% of `/health` requests per day finish with a 200 status code.
923961

924962
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
925963

@@ -941,7 +979,7 @@ Describe the metrics themselves and the reasons why they weren't added (e.g., co
941979
implementation difficulties, etc.).
942980
-->
943981

944-
N/A
982+
No
945983

946984
### Dependencies
947985

@@ -1058,6 +1096,8 @@ Think about adding additional work or introducing new steps in between
10581096
[existing SLIs/SLOs]: https://git.k8s.io/community/sig-scalability/slos/slos.md#kubernetes-slisslos
10591097
-->
10601098

1099+
No.
1100+
10611101
###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
10621102

10631103
<!--
@@ -1084,6 +1124,8 @@ Are there any tests that were run/should be run to understand performance charac
10841124
and validate the declared limits?
10851125
-->
10861126

1127+
No
1128+
10871129
### Troubleshooting
10881130

10891131
<!--
@@ -1120,6 +1162,10 @@ Same as traditional rootful Kubernetes.
11201162

11211163
###### What steps should be taken if SLOs are not being met to determine the problem?
11221164

1165+
- Make sure that supported versions of the components are used
1166+
- [Make sure that more than 65536 subuids are allocated](https://rootlesscontaine.rs/getting-started/common/subuid/)
1167+
- [Make sure that cgroup v2 delegation is enabled](https://rootlesscontaine.rs/getting-started/common/cgroup2/)
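
The last two checks can be scripted roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, entries in `/etc/subuid` are assumed to be keyed by user name in the `name:start:count` form, the 65536 threshold is treated as a minimum, and the `cgroup.controllers` path assumes a systemd user session on a cgroup v2 host.

```bash
# Sum the subordinate UID ranges allocated to a user.
# Each /etc/subuid line has the form "name:start:count".
count_subuids() {
  local user="$1" file="${2:-/etc/subuid}"
  awk -F: -v u="$user" '$1 == u { n += $3 } END { print n + 0 }' "$file"
}

# Print the cgroup v2 controllers delegated to the user's systemd instance.
# Prints nothing if the file does not exist (e.g. cgroup v1 or no user session).
delegated_controllers() {
  local uid="$1" root="${2:-/sys/fs/cgroup}"
  cat "$root/user.slice/user-$uid.slice/user@$uid.service/cgroup.controllers" 2>/dev/null || true
}

# Report both checks for the current user; returns non-zero on a likely problem.
check_rootless_prereqs() {
  local n ctrl
  n="$(count_subuids "$(id -un)")"
  ctrl="$(delegated_controllers "$(id -u)")"
  echo "subuids: $n"
  echo "delegated controllers: ${ctrl:-none}"
  [ "$n" -ge 65536 ] && [ -n "$ctrl" ]
}
```

Running `check_rootless_prereqs` as the rootless user prints both values. Which controllers must appear in the delegated list depends on the setup, so the sketch only checks that the list is non-empty.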
1168+
11231169
## Implementation History
11241170

11251171
<!--
@@ -1140,6 +1186,8 @@ Major milestones might include:
11401186
- 2019-11-19: @giuseppe submitted [cgroup v2 KEP](https://github.com/kubernetes/enhancements/pull/1370)
11411187
- 2019-11-19: present KEP to SIG-node (cgroup v2 version)
11421188
- 2020-07-07: the cgroup v2 support is in `implementable` status
1189+
- 2021-08-04: Kubernetes v1.22 (Alpha)
1190+
- 2025-12-XX: Kubernetes v1.35 (Beta)
11431191

11441192
## Drawbacks
11451193

keps/sig-node/2033-kubelet-in-userns-aka-rootless/kep.yaml

Lines changed: 7 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -15,7 +15,7 @@ reviewers:
1515
- "@dims"
1616
- "@sftim"
1717
approvers:
18-
- TBD
18+
- "@soltysh"
1919
see-also:
2020
# `add KEP for cgroups v2 support`
2121
- "https://github.com/kubernetes/enhancements/pull/1370"
@@ -24,16 +24,17 @@ replaces:
2424
- "https://github.com/kubernetes/enhancements/pull/1084"
2525

2626
# The target maturity stage in the current dev cycle for this KEP.
27-
stage: alpha
27+
stage: beta
2828

2929
# The most recent milestone for which work toward delivery of this KEP has been
3030
# done. This can be the current (upcoming) milestone, if it is being actively
3131
# worked on.
32-
latest-milestone: "v1.22"
32+
latest-milestone: "v1.35"
3333

3434
# The milestone at which this feature was, or is targeted to be, at each stage.
3535
milestone:
3636
alpha: "v1.22"
37+
beta: "v1.35"
3738

3839
# The following PRR answers are required at alpha release
3940
# List the feature gate name and the components for which it must be enabled
@@ -42,3 +43,6 @@ feature-gates:
4243
components:
4344
- kubelet
4445
disable-supported: true
46+
47+
# The following PRR answers are required at beta release
48+
metrics: []
