-
Notifications
You must be signed in to change notification settings - Fork 82
OADP-363 plugin-registry design, registry deployment removal proposal #737
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
42d2973
5439f64
bb371c3
ba9decb
8a1f280
78e4976
3792d17
944b93b
328600a
7134028
4900189
3909a0f
db558d7
24901aa
4b73182
a83b80f
befe8ea
9214251
4ce1705
7d12ce2
5b1d38b
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,129 @@ | ||
--- | ||
title: plugin-registry | ||
authors: | ||
- @kaovilai | ||
reviewers: | ||
- @shubham-pampattiwar | ||
- @eemcmullan | ||
- @savitharaghunathan | ||
- @dymurray | ||
- @shawn-hurley | ||
approvers: | ||
- @dymurray | ||
- @shawn-hurley | ||
- @shubham-pampattiwar | ||
creation-date: 2022-06-14 | ||
last-updated: 2022-06-14 | ||
status: implementable | ||
--- | ||
|
||
# Plugin Registry | ||
|
||
Plugin registry enhancement replaces current usage of [registry](https://github.com/konveyor/distribution/) [deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/) insecure [routes](https://access.redhat.com/documentation/en-us/openshift_container_platform/4.10/html/networking/configuring-routes) by talking to registry API server inside the velero container itself via openshift-velero-plugin. | ||
|
||
## Release Signoff Checklist | ||
|
||
- [x] Enhancement is `implementable` | ||
- [ ] Design details are appropriately documented from clear requirements | ||
- [ ] Test plan is defined | ||
- [ ] User-facing documentation is created | ||
|
||
## Open Questions [optional] | ||
|
||
Do we want registry logs to be included in velero backup and/or restore logs? | ||
|
||
## Summary | ||
|
||
removes a listening HTTP server and connect [openshift-velero-plugin](https://github.com/openshift/openshift-velero-plugin/) directly with the [distribution/distribution](https://github.com/distribution/distribution) code to serve requests from [container/image/copy](https://github.com/containers/image/tree/main/copy) method as needed. A "library" to help achieve this is [kaovilai/udistribution](https://github.com/kaovilai/udistribution) which may be moved to konveyor org. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So basically. Current workflow is, |
||
|
||
## Motivation | ||
|
||
Current image backup workflow depends on registry deployment with exposed insecure routes allowing anyone knowing the URL of the exposed routes to push or pull images to the registry. | ||
kaovilai marked this conversation as resolved.
Show resolved
Hide resolved
|
||
There are some transient errors such as pod health probe timeouts that are resolved by this design. | ||
kaovilai marked this conversation as resolved.
Show resolved
Hide resolved
|
||
This approach will eliminate registry deployment dependency which the operator will no longer have to reconcile. | ||
|
||
### Goals | ||
|
||
- Exposed routes are eliminated. | ||
- Remove dependency on registry deployment. | ||
- Be backwards compatible with existing image backup workflow. | ||
- images backed up will continue to be in the same directory in object storage as they were prior to this enhancement. | ||
- Deduplication of layers/images across backups to the same backup storage location. | ||
|
||
<!-- ### Non-Goals --> | ||
## Proposal | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. More from a high level this proposal is to
|
||
We propose that insecure routes are eliminated by removing registry deployment, and to instead have openshift-velero-plugin make network requests to registry APIs internally via [golang ServeHTTP function provided by distribution/distribution](https://github.com/distribution/distribution/blob/3e4f8a0ab1476a9516d02d76fbf869480541657f/registry/handlers/app.go#L624) which will then through storage drivers such as [`s3-aws`](https://github.com/distribution/distribution/blob/main/registry/storage/driver/s3-aws/s3.go),`gcs` , and `azure` write to desired object storage. | ||
|
||
|
||
### Implementation Details/Notes/Constraints [optional] | ||
|
||
First, a udistribution client are to be initialized using [`udistribution.NewTransportFromNewConfig()`](https://github.com/kaovilai/udistribution/blob/2b5e16ac1f8efa0bbcdc513ae7103d4f56f3befa/pkg/image/udistribution/docker_transport.go#L50) function. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just a suggestion, a short paragraph that summarizes what There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see its more explained in line 88, but in this section stay away from in depth code references just give a high level description of the library There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Had to reread section description in https://github.com/konveyor/enhancements/blob/master/guidelines/enhancement_template.md |
||
|
||
```go | ||
ut, err := udistribution.NewTransportFromNewConfig("", os.Environ()) | ||
defer ut.Deregister() | ||
if err != nil { | ||
t.Errorf("failed to create transport with environment variables: %v", err) | ||
} | ||
``` | ||
|
||
This function [initializes Registry App](https://github.com/kaovilai/udistribution/blob/2b5e16ac1f8efa0bbcdc513ae7103d4f56f3befa/pkg/client/client.go#L58) and [registers a new container/images transport](https://github.com/kaovilai/udistribution/blob/2b5e16ac1f8efa0bbcdc513ae7103d4f56f3befa/pkg/image/udistribution/docker_transport.go#L61) that contains a `docker://` transport with [modified `makeRequestsToResolvedURLOnce()` function](https://github.com/kaovilai/udistribution/blob/d7f491d7c354caa1df6893d20c735b9c08c20108/pkg/image/udistribution/docker_client.go#L559) designed to use [ServeHTTP method available from the initialized Registry App](https://github.com/distribution/distribution/blob/3e4f8a0ab1476a9516d02d76fbf869480541657f/registry/handlers/app.go#L624) instead of communicating over to a listening web server through a route to a registry deployment which was insecure. | ||
|
||
This approach does not require an exposed port on the container. | ||
|
||
The logs relating to registry will be mixed into velero container logs. It is possbile to set log level. Could be done by setting `REGISTRY_LOG_LEVEL` environment variable on the velero container, or by discovering velero log level from velero container itself. | ||
|
||
### Security, Risks, and Mitigations | ||
|
||
As more backups and/or new Backup Storage Location are added and thus caching more initialized registries configurations, we may want to periodically garbage collect unused or outdated backup storage location registry configurations to optimize memory usage. | ||
|
||
## Design Details | ||
|
||
[](https://lucid.app/documents/view/14dd9d7d-38a5-40f7-a378-c89a99a1ace6) | ||
|
||
[udistribution](https://github.com/kaovilai/udistribution) library has been created to help with the following: | ||
- Initializing Registry [App](https://github.com/distribution/distribution/blob/b5e2f3f33dbc80d2c40b5d550541467477d5d36e/registry/handlers/app.go#L58) from configuration resolved from environment variables passed via [function parameters](https://github.com/kaovilai/udistribution/blob/d7f491d7c354caa1df6893d20c735b9c08c20108/pkg/client/client.go#L58). | ||
- Add a custom container/images docker transport with modified [makeRequestToResolvedURLOnce()](https://github.com/kaovilai/udistribution/blob/d7f491d7c354caa1df6893d20c735b9c08c20108/pkg/image/udistribution/docker_client.go#L559) to use [Registry App ServeHTTP method](https://github.com/kaovilai/udistribution/blob/d7f491d7c354caa1df6893d20c735b9c08c20108/pkg/image/udistribution/docker_client.go#L613) instead of sending HTTP requests to a listening HTTP server. | ||
|
||
OADP-Operator controller manager will | ||
* set a single environment variable on the velero container to tell openshift-velero-plugin to initialize [udistribution transport](https://github.com/kaovilai/udistribution/blob/main/pkg/image/udistribution/docker_transport.go#L36) to be used to talk to storage drivers such as s3 | ||
|
||
openshift-velero-plugin will | ||
* resolve registry secret for a given Backup Storage Location by looking up the secret in the namespace of the Backup Storage Location and use them as parameters to initialize [udistribution transport](https://github.com/kaovilai/udistribution/blob/main/pkg/image/udistribution/docker_transport.go#L36) to be used to talk to storage drivers such as s3, gcs, and azure | ||
* substitute transport name given to `image.Copy()` from `docker://` to [generated udistribution transport name](https://github.com/openshift/openshift-velero-plugin/blob/1600327cb3f6f9f60ade880aef8fe16d34e6fb04/velero-plugins/imagecopy/imagestream.go#L102) for bsl when plugin registry is requested. | ||
|
||
### Test Plan | ||
- Check for no user experience changes | ||
- Image backup continues to work and upload to s3/azure/gcp | ||
- ex: backup image with oadp-1.0 using registry deployment and restore using oadp-1.1 without registry deployment | ||
- e2e test that verify backed up internal images exists in the object store | ||
#### Test Changes | ||
- Remove check for ready registry deployments | ||
- Deploy a dummy bsl registry deployment and check for removal | ||
|
||
### Upgrade / Downgrade Strategy | ||
|
||
Upon upgrading, operator will attempt to remove remaining registry deployments specified in dpa BSLs. | ||
|
||
<!-- | ||
## Implementation History | ||
|
||
Major milestones in the life cycle of a proposal should be tracked in `Implementation | ||
History`. | ||
--> | ||
|
||
## Drawbacks | ||
|
||
TBC | ||
|
||
## Alternatives | ||
|
||
- Creating another Custom Resource from openshift-velero-plugin, watching for CR from oadp-operator controller-manager and acting upon the requests | ||
- [openshift/oadp-operator#618: ImageStreamsBackup Custom Resource design.](https://github.com/openshift/oadp-operator/pull/618) | ||
- Reasons against | ||
- Yet another custom resource/CRD. Increased debug complexity. | ||
- Sidecar registry container inside velero pod | ||
- [move registry container inside velero pod by kaovilai · Pull Request #700 · openshift/oadp-operator (github.com)](https://github.com/openshift/oadp-operator/pull/700) | ||
- Reasons against | ||
- As BSLs are added, velero pod has to be recreated, thus causing running backups to fail or has to wait for backups to complete before a new BSL can have a working registry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would love to have the registry logs, +1 from my side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, lets not call them registry logs, these are technically plugin logs, right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I think there are 2 things we are talking about here.
I think 1. should be in the backup/restore for sure. 2. feels overly verbose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Plugin logs. However the logs generated from registry will not be in backup logs in object storage since the velero backup logger isn't being passed to the ServeHTTP method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do want them but I believe that per current implementation it is not possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the positive side, this means that we can leave log level as debug for the registry component without additional configuration flags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah let's leave it as a potential future enhancement. Kinda opens the door for some other mechanism within velero plugins to pass logs back