-
Notifications
You must be signed in to change notification settings - Fork 220
Add k8s name related formats #384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
are non-standard format strings supposed to be namespaced or not? what are the implications for CRDs created using these format strings prior to this, allowing in data that ignores those format requirements, then auto-tightening validation when upgrading to a control plane with this support added? |
I have not found any mention of namespacing in the specs. xref https://datatracker.ietf.org/doc/html/draft-bhutton-json-schema-validation-01#name-custom-format-attributes I am open to naming suggestions for these formats though. I had briefly considered something like "kubernetes-name" but couldn't think of anything that aided rather than harmed clarity on what the format actually is.
Yes, exactly. Today, supported formats is used strip out unsupported |
/assign @apelisse Would you be willing to review? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good thanks
Thanks, looks better to me :-) |
pkg/validation/strfmt/default.go
Outdated
@@ -124,6 +124,48 @@ func IsEmail(str string) bool { | |||
return e == nil && addr.Address != "" | |||
} | |||
|
|||
const dns1123LabelFmt string = "[a-z0-9]([-a-z0-9]*[a-z0-9])?" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we're copying/pasting from apimachinery validation, are we inheriting these issues:
- This requires things to be lowercase, which is more strict than the RFC
kubernetes/kubernetes#79351 (comment)
- This tolerates individual labels exceeding 63 chars, which is not as strict as the RFC
Should we fix those issues before adopting those formats here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We intend to publish OpenAPI the in-tree types using these formats. So I think we're stuck inheriting the it all.
That said, we shouldn't using the RFC identifiers for formats that are not exactly compliant with the RFC. I'll propose some alternative names.
pkg/validation/strfmt/default.go
Outdated
dns1123subdomain := DNS1123Subdomain("") | ||
Default.Add("dns1123subdomain", &dns1123subdomain, IsDNS1123Subdomain) | ||
|
||
dns1123label := DNS1123Label("") | ||
Default.Add("dns1123label", &dns1123label, IsDNS1123Label) | ||
|
||
dns1035label := DNS1035Label("") | ||
Default.Add("dns1035label", &dns1035label, IsDNS1035Label) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if these get added to the default set of recognized formats here, how do we roll them out in a controlled way in kubernetes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was held until CRD Ratcheting was promoted to beta in 1.30, we'll leverage that capability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll update this code so we can manage the newly added formats separate from the pre-existing ones.
I'm not sure I understand how that makes existing persisted CR data safe. However long we take to roll out support for the format, at some point, would we start treating existing persisted data that didn't match the specified format as invalid? |
We could solve this together with the more general problem of CRD validation ratcheting. I.e. the case where a CRD author adds a format to a CRD, and there already persisted CR data which then immediately becomes invalid. The case we are discussing is similar except that instead of the CRD author adding a format, the CRD author already had a non-validating format in the CRD and then Kubernetes changed the format to be validating. Ratcheting rule would be: if any CR data fails validation but is unchanged from the persisted value, we ignore the validation failure and allow the update. EDIT: This is also what we need for rollback. Note that because any format is allowed on any CRD, this would allow for rollback to arbitrary old versions. |
I agree a solution to that would resolve the issue with this. Shouldn't we get that solution in place before adding this? This would be the first example I can think of where we made a change that would reinterpret existing persisted CRDs in ways that would invalidate existing data |
SGTM. I'll poke around and see who might be able to implement. |
/close |
@jpbetz: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Just updating progress on this front was made with CRD Ratcheting Alpha added in 1.28: https://kep.k8s.io/4008 Will need to wait for it to mature before using it for name formats, but in the interim we can use inline OpenAPI schemas to describe the regex/cel expressions used for most formats. |
/reopen CRD Ratcheting was promoted to beta in 1.30, freeing us up to resume progress on this. |
@jpbetz: Reopened this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Now that we have CRD ratcheting, it would be good to give guidance around when expanded openapi formats are the preferred way to model a validation, and when a CEL library is. |
We are missing well documented guidance. Let's get that going. I like the idea of CEL "supplementing" OpenAPI formats only were clearly needed. Related- We've fallen behind on supporting formats. kubernetes/kubernetes#130639 describes where we haveCEL support for formats that SHOULD have an OpenAPI format. To help catch up, we've added emulation version support to formats: kubernetes/kubernetes#130783. Combined with ratcheting, this should give us all the machinery we need to add the formats we agree should be added. |
cc @thockin for the Declarative Validation aspect. We anticipate needing OpenAPI formats for the built-in formats that we validate with hand written code so that we can publish the declarative validation rules through OpenAPI. |
Is there an extension-mechanism for format names? Like The ultimate goal, I think is unification. CRDs can be written saying "this field is validated by format=x-k8s-dns-label" and we will do that server-side; Builtin types can express the same thing in a different way; apiserver publishes this in a way that an "enlightened" client can handle if they wanted to to do shift-left validation (e.g. as a git pre-commit). |
This PR has been refreshed and is ready for review. I've gone with "k8s.io/" as the prefix after consulting a few LLMs about widely used conventions . The main feedback was that standard bodies are steering away from "x-" prefixes, and that the charset in this prefix is within norms. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple thoughts on names. Over time we have used 3 different DNS formats. But what we use is not really DNS compatible, it's DNS-ish.
-
Should we continue with the "DNS" nomenclature? This looks like a significant move towards publishing this terminology so this may be our last chance to name the formats better. Or it might be too late already.
-
Should we nominate one of the 3 as the "normal" format (e.g.
k8s.io/dns-label
) and let the others be wierd?
Since DNS 1123 subdomain is the type we use most often for resource name validation, how about a format name that emphasizes that? "k8s.io/basic-name" or something? |
How about:
|
I am terrible at naming things, but a few quick thoughts:
Then the name could just be "dns-label" or "dns-style-label" or something? Would love alternatives to "qualified name" too :) |
Maybe we just stick with subdomain then? We could consider "dns-full-name" or something. But if we're avoiding "qualified", is "full name" any better?
+1
Dropping 1035 from the format list SGTM.
Curious what value to the user using the term "DNS" adds? That aside, how about:
? |
My current concerns:
|
I agree that invoking DNS is a bit confusing, especially if we invoke a particular detailed RFC which we then don't quite implement correctly (waves vaguely at all the issues complaining our 1123 validation... isn't) I also agree that "label" is likely to confuse people in the context of Kubernetes. Perhaps "segment" would be clearer? Taking a step back, I'm a little fuzzy on which of these goals we're trying to accomplish:
Example bugs:
Example inconsistencies / variants:
|
Jordan, I think this was aiming at:
If we can find a better name to use in this that doesn't reference DNS and doesn't abuse the word "label" (which I never really recognized before but now cannot unsee), then that's great. If we don't need to handle Service, perhaps we can use 1.34 to alpha-gate service name ratcheting to the 1123 -ish spec. 1.34 would tolerate services with 1123-style names, but not allow new ones. 1.35 would allow new usage. Unless we can lean on compat-version to save the alpha-release? @danwinship @aojea Then we only have to name one thing instead of two. "kubernetes short name" vs "... long name" ? |
I've updated this to use |
I think either way, we should probably ignore Service for purposes of this PR; either we manage to fix Service and it becomes just a |
I asked OAI/learn.openapis.org#128 to see if there is a real norm here. I like the k8s-short-name + k8s-long-name ? k8s-shortname + k8s-longname ? k8s-name + k8s-long-name ? k8s-shortname + k8s-name |
We've gone full circle. This is what I'd proposed originally. |
We have two open questions in this thread: 1) Overall notationAlmost full circle - after looking at all the openapi and JSON schema got repos and specs I figured:
As such, I think the lowest risk and most non-contentious path is to declare a prefix (eg. I think this should extend into declarative validation, and all our custom formats should be "k8s-something" Agree or discuss more? 2) Rename "DNS label" and "DNS subdomain"I can't come up with anything more semantically meaningful that "short" and "long". Maybe just more descriptive. We want to eschew the words "DNS" and "label" in the name. "object name" isn't right because we use them other places. Even "name" is questionable. The short form is similar to a hostname, but we should avoid that, too since it is not exact. When it comes to object names we use subdomain more than label (sadly) but for other fields we seem to use label more (totally haphazard estimate by line-counting calls to each func). So I am fine with "k8s-short-name" and "k8s-long-name" or maybe with dropping either "long" or "short" (making that one more default-ish). Or I am fine with "k8s-name" (or "k8s-short-name") and "k8s-dotted-name" or something similarly descriptive. You are closest to the consequences of this decision, Joe, so I defer. |
Apologies for the delay, I lost track of this one and am just circling back now.
Agree.
I also could live with any of these variations. I'm going to go with While I'm tempted to have just |
|
||
// KubernetesExtensions is the formats registry for JSON Schema formats | ||
// extensions defined by the Kubernetes project. | ||
var KubernetesExtensions = NewSeededFormats(nil, nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is peer to the existing Default
var? looks like we have a lot of things using that other var... will they need to switch to this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just reviewed the usages. I think it's best to just collapse these into the Default
set. We still have kubernetes/kubernetes#130783 for ompat-version guided enablement.
pkg/validation/strfmt/format.go
Outdated
case string: | ||
*r = T(v) | ||
default: | ||
return fmt.Errorf("cannot sql.Scan() strfmt.StringFormat from: %#v", v) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤨 are the sql.Scan errors load-bearing or can we stop outputting that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason most of the string formats in kube-openapi support sql.Scanner
and output errors like this. But a few don't. I'll just not support it and open an issue to drop it from the other string formats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the API validation functions were exposed in a low-deps library, would we take a dep on it here? /lgtm |
This adds the most heavily used name formats in k8s:
Note: These formats implemented in k8s in ways that do not exactly match the RFCs (kubernetes/kubernetes#71140,
kubernetes/kubernetes#79351). But, intend to use these formats to communicate the existing Kubernetes validation rules via OpenAPI, so we want to keep them as-is.
Benefits:
xref: https://json-schema.org/draft/2020-12/json-schema-validation#name-custom-format-attributes, https://spec.openapis.org/registry/format/