-
Notifications
You must be signed in to change notification settings - Fork 267
Proposal: Support Resource Relationships #670
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,243 @@ | ||
# [Design] Support Resource Relationships | ||
|
||
## Problem Description | ||
|
||
In micro service world, often users need to be using multiple services to run an application or Job. For example, to create a model in SageMaker, you could use a combination of SageMaker, IAM, S3, ECR and VPC services. Some of these resources need to be created in serial manner because output of one API(e.g. IAM role ARN, vpc-id) is an input to the another API. | ||
|
||
As an ACK user, I want to enable to be able to express the relationship between resources in ACK CRs, so that I can submit multiple Specs related my application/infrastructure at once without waiting for another resources to get created. | ||
|
||
## Proposed Implementation | ||
Kubernetes style of expressing resource relationship is by using [object references](https://godoc.org/k8s.io/api/core/v1#ObjectReference), [typedLocalObjectReference](https://godoc.org/k8s.io/api/core/v1#TypedLocalObjectReference) or [localObjectReference](https://godoc.org/k8s.io/api/core/v1#LocalObjectReference). Some examples of this are [Secrets](https://github.com/kubernetes/api/blob/4e9f5db10201ce6270b160e96e3f99ddb580135e/core/v1/types.go#L861) and [configmaps](https://github.com/kubernetes/api/blob/4e9f5db10201ce6270b160e96e3f99ddb580135e/core/v1/types.go#L1999-L2016) in K8s API. In the following sections, we present how operator pattern can be used in conjunction with references in ACK to support relationsip between resources. | ||
|
||
### Modification to CRDs | ||
Currently, each top level resource in API model of the AWS service has a CRD. The field types in the CRD are structs, lists or primitive datatypes. A resource in K8s can be identified by a combination of metadata - `APIVersion, Kind, Name and Namespace`. Since AWS APIs have a deterministic input shape, APIVersion and Kind can be fixed at code generation time while Namespace and Name of resource will be a user input at runtime. To facilitate this, we propose to extend the K8s style of referencing resources in ACK by introducing a new datatype `AWSResourceReference` in `apis/core/v1alpha1/common.go`: | ||
|
||
``` | ||
// AWSResourceReference represents a reference to another CR which refers to a AWS resource in Kubernetes | ||
type AWSResourceReference struct { | ||
// Namespace of the resource being referred to | ||
Namespace string `json:"namespace"` | ||
// Name of the resource being referred to | ||
Name string `json:"name"` | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How about we use the https://github.com/aws-controllers-k8s/runtime/pull/2/files We could structure an type AWSResourceReference struct {
// AWS contains identifiers of the resource in the AWS API that are
// being referred to
AWS *AWSIdentifiers `json:"aws,omitempty"`
// Name contains the (metadata) name of the Kubernetes resource being
// referred to
Name *string `json:"kubernetes,omitempty"`
} Imagine that the Let's say a Kubernetes user wants to create a SageMaker Endpoint and wants to reference an IAM Role via the Role's ARN. They might submit an Endpoint that looks like this: metadata:
name: my-endpoint
spec:
endpointName: my-endpoint
iamRole:
aws:
arn: "arn:aws:iam::123456789012:role/S3Access" On the other hand. let's say the user wants to create a SageMaker Endpoint and wants to reference an They might submit an Endpoint that looks like this: metadata:
name: my-endpoint
spec:
endpointName: my-endpoint
iamRole:
name: some-iam-role |
||
``` | ||
|
||
While we want to allow ACK users to benefit from references, there might be instances where some of the related resources are not created in the same cluster or do not have ACK service controller yet. The following two approaches show how we can account for this flexibility: | ||
|
||
Consider Endpoint resource with following input shape where IAMRole needs to be a referencce | ||
|
||
``` | ||
type EndpointSpec { | ||
EndpointName *string `json:"endpointName,omitempty"`, | ||
IAMRole *string `json:"IAMRole,omitempty"`, | ||
... | ||
} | ||
``` | ||
|
||
#### Approach 1 | ||
Change the type of the referred field, `IAMRole` -> `IAMRoleRef` | ||
|
||
``` | ||
type EndpointSpec { | ||
EndpointName *string `json:"endpointName,omitempty"`, | ||
IAMRoleRef *IAMRoleRef `json:"IAMRoleRef,omitempty"`, | ||
... | ||
} | ||
``` | ||
|
||
In `apis/core/v1alpha1/common.go` | ||
``` | ||
// AWSIdentifiers is used to identify the type of referred field | ||
type AWSIdentifiers struct { | ||
// ARN is the AWS Resource Name for the resource. It is a globally | ||
// unique identifier and generated by AWS service. | ||
ARN *AWSResourceName `json:"arn"` | ||
// Name is a user-supplied string identifier for the resource. It may | ||
// or may not be globally unique, depending on the type of resource. | ||
Name *string `json:"name"` | ||
// ID is an identifier that has been generated by the AWS service for | ||
// the resource. ID fields are *usually*, but not always, globally unique. | ||
// Sometimes ID fields are numeric, other times they are strings. | ||
ID *string `json:"id"` | ||
} | ||
``` | ||
|
||
and then in, `services/<AWS_SERVICE>/apis/v1alpha1/types.go`, we will add: | ||
|
||
``` | ||
// IAMRoleReference is a pointer to a IAMRole resource in the Kubernetes API | ||
// or the AWS API | ||
type IAMRoleReference struct { | ||
// KubernetesReference indicates the Kubernetes internal identifiers for | ||
// the IAMRole CR that references the AWS resource. | ||
// If nil, the IAMRole CR has not yet been created by an ACK controller | ||
// controller and the `External` field below should be non-nil. | ||
KubernetesReference *ackcorev1alpha1.AWSResourceReferencce `json:"kubernetesReference,omitempty" | ||
|
||
// External indicates the AWS external identifiers for the IAMRole. | ||
// This field can be an Name, ARN or id depending on API input shape and | ||
// is used for resources which are not managed by ACK controller. | ||
// If nil, the `KubernetesReferences` field must not be nil. | ||
External *ackv1alpha1.AWSIdentifiers `json:"external,omitempty"` | ||
} | ||
``` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm thinking we don't actually need resource-specific Reference types. A single |
||
|
||
External field is beneficial for following reasons: | ||
|
||
* Service controller does not exist for this type of resource or is planned in future. This leaves a way for the Go types to be future proof. | ||
* User wants to use a resource which has been created by a different infrastructure team(using another Kubernetes cluster /CFN/Terraform/console). | ||
|
||
#### Approach 2 | ||
Leave the default field for the external case and have an additional field to specify Kubernetes reference. | ||
|
||
``` | ||
type EndpointSpec { | ||
EndpointName *string `json:"endpointName,omitempty"`, | ||
// Use to specify AWS external identifier | ||
IAMRole *string `json:"IAMRole,omitempty"`, | ||
// Use to specify KubernetesReference | ||
IAMRoleRef *ackv1alpha1.AWSResourceReference `json:"IAMRoleRef,omitempty"`, | ||
... | ||
} | ||
``` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this is certainly the only backwards-compatible way of handling object references, since it adds a field instead of changing a data type. However, since we have no controllers using anything other than alpha-series API versions yet, I don't think this is as important a consideration. |
||
Either IAMRole or IAMRoleRef needs to be non-nil. If both are specified by mistake, controller will give preference to external i.e. IAMRole. | ||
|
||
Note: The default IAMRole field will be modified to be optional if it is required in AWS API. | ||
|
||
### Modifications to Generator Config | ||
|
||
Introduce a new field “references” to generator config under resources which contain instructions to the code-generator about related resources. References would be a map keyed by path to field name which should be converted to a reference and value being a referenceConfig which consists instructions about the value to be substituted. ReferenceConfig will consist of the following fields: | ||
|
||
* value: full path to the value from the referenced resource object to be substituted for parent resource field name. | ||
* Full path is needed because value might be a generated field like ARN, id (like vpc-id) which will be present in the Status or can be names (like modelName) which are provided by user as part of spec | ||
* APIVersion and Kind: | ||
* Kind is pre-determined based on the AWS API parameter. So we dont need this to be input by the user | ||
* APIVersion needs to be fixed at code generation time so that controller is not affected by updates to another service controller. Kubernetes follows a [hub and spoke design](https://book.kubebuilder.io/multiversion-tutorial/conversion-concepts.html#what-does-that-have-to-do-with-webhooks) allows converting CRs from one version to other using webhooks. | ||
|
||
``` | ||
# Initial draft, expect to be changed | ||
|
||
resources: | ||
Endpoint: | ||
references: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am wondering whether we can find a way to get this information from the API spec, i.e. the model JSON, somehow. I know our friends at Azure Service Operator are trying to convince their teams to share some kind of metadata, like I can imagine that model JSON is published for many SDKs to use is pretty constrained due to having to be very generic, but from service owner perspective, it might be more manageable for other tooling they need to maintain to have the reference metadata in the source of truth. Though it may not be worth it either since the change there is probably higher cost than here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @muvaf yeah, I've considered this. If all the AWS API models were consistent, this would be possible... :) However, the reality is that the APIs have inconsistent patterns in how they refer to other objects, which makes this kind of thing pretty difficult to standardize on. |
||
IAMRole: | ||
# include Spec./Status. because it depends if | ||
# it is generated param e.g. ARN/Id (generally part of Status) or | ||
# provided by user e.g. Name (generally part of Spec) | ||
value: Status.ackResourceMetadata.arn | ||
APIVersion: iam.services.k8s.aws/v1alpha1 | ||
Kind: IAMRole | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Instead of having this information on the |
||
ignore: | ||
resource_names: | ||
- Algorithm | ||
- App | ||
- AutoMLJob | ||
``` | ||
|
||
### Controller Implemention | ||
|
||
A sharedInformer cache will be added to the service controller for each type(APIVersion, Kind) of resource a service can depend on. This cache will use an event listener to store and update the CR objects according to the operations(add/update/delete). | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would focus primarily on standardizing the interfaces for taking an Under no circumstances should ANY data other than a referred-to resource's identifiers be duplicated into referring resources. In other words, only synchronize the state of the link itself, not the state of the linked item. |
||
|
||
ResourceManagers in controller will have a pointer to the cache which will be used to read the referred resource. This cache will be read in normal reconciliation loop of the referencing/parent resource. If the referred object exists in cache, it will be used to substitute the value in parent resource else the parent resource will requeueAfter interval. | ||
* Example: X refers to Y. The service controller for X will have a shared informer cache for Y type of CRs. Reconcile loop in X will look if Y exists in cache. | ||
* If Y does not exist - X will RequeueAfter a fixed time interval | ||
* If Y exists in cache - X will fetch the value to be substituted and proceed with reconciliation | ||
|
||
Currently ACK controllers do not requeue after the resource creation is successful, which means if the underlaying resource (Y) is deleted or updated those events will not get trigger any update in X. These updates will only get picked if X reconciles or user explicitly resubmits X | ||
* Hence, we propose to add either a controller level or resource level flag in generator config which will add a periodic requeue even after resource creation is successful | ||
|
||
#### Advantages: | ||
|
||
* Efficient since the sharedInformer will use a push model to maintain the cache compared to pull model where each resource manager querys the API server for the resource | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't recommend storing the full object in a cache. Only the identifiers (both AWS and Kubernetes) for referred objects. |
||
* Allows submitting multiple CRs simultaneously without waiting for ARN or id to pre-exist. Name is pre-determined user input | ||
* In addition to supporting references, periodic requeue after resource creation can be used for | ||
* Drift detection | ||
* Detect destructive operations [#82](https://github.com/aws/aws-controllers-k8s/issues/82) | ||
|
||
#### Limitation (to be solved): | ||
|
||
Resource exists in cache/etcd not a sufficient condition if X refers to a field in Y.Spec. E.g. | ||
* Endpoint resource (X) refers to Spec.ModelName from resource (Y). User submitted X & Y to cluster. Y.Spec has modelName when the CR is submitted but resource does not get created in AWS because of error. Since X is only looking for is Y to exist in shared informer cache, X will run create API by looking at Y.Spec.ModelName & will go into an exponential back off instead of periodic requeue | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is why
|
||
* If X had a dependency on Y's ARN/id (returned by AWS) then it would have requeueAfter() until Y.Status.*.ARN != nil | ||
|
||
Although, we do not want to complicate things by mixing the state of the referred-to resource with the state of the referencing resource, if the reference is created from a spec field, it needs to be treated like ARN/id. Is there a better way to solve this ? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right, exactly what I am saying in my comment above :) |
||
|
||
#### Exploration | ||
Taking deployment in K8s as an example to see behaviour | ||
|
||
* ConfigMap in Deployment: | ||
* This is a referenced resource. [Updating configmap does not update the deployment/restart the pods](https://github.com/kubernetes/kubernetes/issues/22368). Doesnt look like there will be a fix as well (issue opened 4 years ago). There are [custom controllers](https://github.com/stakater/Reloader) to deal with this issue. | ||
|
||
* ReplicaSet in Deployment: | ||
* This is not a referenced resource but controller watches its resources. If you delete a pod from the replicaset, it gets recreated. This cannot happen unless controller is continuously watching the pods. | ||
* ``` | ||
k create deploy -—image nginx nginx | ||
k scale deploy nginx -—replicas=3 | ||
k get pods | ||
k delete pod <> | ||
``` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In the case of the ConfigMap referred to in a Deployment being changed and the Deployment controller not taking any action, that is a tricky issue indeed because there is a camp of people that want the Deployment controller to terminate the Pods that refer to the old ConfigMap value (and have the Deployment controller naturally spawn replacement Pods that will have the latest version of the ConfigMap). And there is a camp of people that do not want the Deployment controller to do that and instead want ConfigMaps to be content-addressable and immutable -- in other words, if a Pod is launched by a Deployment that refers (via the Pod template) to ConfigMap with content at version X, then all Pods in that Deployment controller should be launched referring to that specific content in ConfigMap version X. Only if the Deployment's Pod template ConfigMap version changes should the deployment controller delete Pods and have them reload with the newly-referred ConfigMap version. I would prefer that this proposal focus mostly on the structure of the AWSResourceReference object and walk through whether the fields contained in an AWSResourceReference will be sufficient for a controller to actuate some behaviour based on changes to the desired state of a referring resource or the observed state of a referred-to resource. Let's not get too in the weeds about implementation details about how controllers should/would respond to those changes. |
||
|
||
## In Scope | ||
* Ability to change field type from API specification to object reference. This includes top level fields in CRD as well as nested fields | ||
* Declarative syntax for generator Config to make this approach scalable and consistent across services | ||
* Best practice guide which lists when this should be used | ||
* Unit tests for code generator and integration tests for a specific controller (e.g. SageMaker) | ||
|
||
### Sample Specs after implementation: | ||
|
||
Using Kubernetes Reference: | ||
``` | ||
--- | ||
apiVersion: sagemaker.services.k8s.aws/v1alpha1 | ||
kind: EndpointConfig | ||
metadata: | ||
name: xgboost-endpoint-config | ||
namespace: xyz-namespace | ||
spec: | ||
endpointConfigName: def-endpoint-config | ||
productionVariants: | ||
- variantName: AllTraffic | ||
modelName: trained-model | ||
initialInstanceCount: 1 | ||
instanceType: ml.r5.large | ||
initialVariantWeight: 1 | ||
--- | ||
apiVersion: sagemaker.services.k8s.aws/v1alpha1 | ||
kind: Endpoint | ||
metadata: | ||
name: xgboost-endpoint | ||
spec: | ||
endpointName: my-endpoint | ||
endpointConfigNameRef: | ||
name: xgboost-endpoint-config | ||
namespace: xyz-namespace | ||
``` | ||
|
||
Using external: | ||
``` | ||
apiVersion: sagemaker.services.k8s.aws/v1alpha1 | ||
kind: Endpoint | ||
metadata: | ||
name: xgboost-endpoint | ||
spec: | ||
endpointName: xgboost-endpoint | ||
endpointConfigNameRef: | ||
external: abcd-endpoint-config | ||
``` | ||
|
||
## Out of Scope | ||
* Upgrading any service controller to use references | ||
* E2E tests for every service controller | ||
|
||
## Test Plan | ||
TBD | ||
|
||
## Discussion | ||
|
||
Within an API, there are multiple AWS resources which are from same and across services. e.g. Model resource in SageMaker has modelPackageName from SageMaker and executionRoleARN, subnetId and securityGroupId from IAM, VPC and EC2 services respectively. Which of these should have Kubernetes reference option? | ||
|
||
Here are some options: | ||
|
||
1. **Option A**: Produce a best practice guide which suggests(does not mandate) every AWS resource should be a reference even if a service controller is not available since the external field leaves way to enable references in future. | ||
1. **Option B (Proposed)**: In addition to a, let service owners choose the resources they want to convert to reference | ||
1. **Option C**: For consistency, requirement for all AWS resources to be converted to references going forward. This can be a breaking change |
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Kubernetes, namespaces are security boundary. So, the ability to reference an object in another namespace might result in unexpected consequences. In fact, this is one of the main reasons that managed resources are cluster-scoped in Crossplane, so that they can be shared and everyone knows they can be shared, similar to how you can use other team's resources in AWS console.
We actually had a chat in sig-api-machinery about this because I was looking for a more explicit statement in the API conventions and now there is an open PR to Kubernetes API conventions doc listing the reasons why references across namespaces should be avoided. A concrete example is the ownership relation built via object reference, i.e.
ownerReference
. It's explicitly forbidden for that reference to span across namespaces.It has some consequences in controller deployment as well. It allows a controller to see and fetch values from other resources that it otherwise might not have permission for. For example, I set up a SageMaker controller in my namespace
A
and reference a VPC in namespaceB
:VPC
RBAC permissions either cluster-wide or in namespaceB
so that it can query that custom resource.VPC
in namespaceB
, i.e. it could be in another AWS account or region or just simply nodescribe-vpc
permission on SageMaker IAM role.That kind of impersonation on controller level has to happen for any kind of referencing mechanism since it's the controller that fetches the value, but in native Kubernetes controllers it's always either in the same namespace or namespace->cluster in order not to breach the namespace boundary expectation of users.
To conclude, I'd suggest having only the
Name
in this struct so that teams do not end up referencing resources in other namespaces while the owner of those namespaces feel a false sense of security.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@surajkota I think @muvaf makes an excellent point here and we should remove Namespace from the reference type.
Also consider the updated guidance at https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#object-references