Goals
This FR outlines our strategy for migrating customers' namespaces from their self-hosted Temporal instance to Temporal Cloud. We assume the following requirements:
- No changes needed in customer’s application code.
- Live migration handled by Temporal end-to-end via internal workflows.
- Signals and queries to workflows continue to be handled during migration, and customers don’t need to modify code.
- Minimize the setup and coordination required from customers, with the caveat that the self-hosted server must run a version of Temporal that supports migration to Cloud.
- Customers have full control over the migration process. They can decide when to hand over, and they can abort or roll back (reverse) the migration before handing over.
- Migration of namespaces in the other direction, i.e. from Temporal Cloud to self-hosted Temporal is not in the scope of this FR.
Glossary of terms
- Migration server: A single-tenant Temporal server that can access both the self-hosted server and the Temporal Cloud server via secure network connections.
- Migration proxy: By default, a migration server requires admin access to a self-hosted server and vice versa during migration. To enhance security, we will introduce proxies between the self-hosted server and migration server: a customer-side proxy and a cloud-side proxy.
User flows
- Temporal will coordinate with the customer prior to the migration and create a migration server. This could take several hours.
- Customers have to install the migration proxy to allow connections between the self-hosted server and migration server (steps may vary based on the customer's network setup).
StartMigrationRequest: Customer initializes the request to migrate namespace(s) from self-hosted server to cloud.
- Migration request includes namespace(s) to migrate, the migration endpoint/cert.
- Temporal creates a cloud namespace with a “non-active” namespace status. Customers set the right permissions and access controls for their cloud namespace.
GetMigrationResponse: Customers can monitor the progress of workflow replication, time remaining for completion, and when the migration is complete.
HandoverNamespaceRequest: Customers can hand over back and forth between their source namespace and the cloud namespace. This provides time to validate everything is working as expected in cloud.
- Customer has to switch worker traffic from self-hosted to cloud endpoint to validate that everything is working correctly.
- Customer updates the Temporal client in the application code to connect to the cloud namespace endpoint instead of the self-hosted namespace endpoint.
ConfirmMigrationRequest or AbortMigrationRequest: Customers confirm and complete the migration to cloud, or they can decide to abort the migration. Common reasons to abort migration may include: wrong namespace was migrated, or replication errors.
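To make the flow above easier to reason about, here is a minimal Python sketch of which call a customer might issue next in each migration state. The state names mirror the `Migration.State` enum proposed in the Cloud APIs section. The call names (`GetMigration`, `HandoverNamespace`, `ConfirmMigration`, `AbortMigration`) and the state-to-call mapping are assumptions read off the user flow; the FR does not specify which calls are valid in which state.

```python
from enum import Enum


class State(Enum):
    """Mirrors the Migration.State enum values proposed in this FR."""
    MIGRATION_STARTED = 1
    REPLICATION_IN_PROGRESS = 2
    WAITING_FOR_HANDOVER = 3
    HANDOVER_IN_PROGRESS = 4
    READY_FOR_CONFIRMATION = 5
    COMPLETE = 6
    FAILED = 7
    ABORT_IN_PROGRESS = 8
    ABORTED = 9


# ASSUMPTION: this mapping is an illustrative reading of the user flow,
# not behavior specified by the FR.
NEXT_CALLS = {
    State.MIGRATION_STARTED: ["GetMigration", "AbortMigration"],
    State.REPLICATION_IN_PROGRESS: ["GetMigration", "AbortMigration"],
    State.WAITING_FOR_HANDOVER: ["HandoverNamespace", "AbortMigration"],
    State.HANDOVER_IN_PROGRESS: ["GetMigration"],
    State.READY_FOR_CONFIRMATION: [
        "HandoverNamespace",  # hand back to the source namespace
        "ConfirmMigration",
        "AbortMigration",
    ],
    State.ABORT_IN_PROGRESS: ["GetMigration"],
}


def next_calls(state: State) -> list[str]:
    """Return the calls a customer might issue next; terminal states yield none."""
    return NEXT_CALLS.get(state, [])
```

For example, `next_calls(State.COMPLETE)` returns an empty list because, under this reading, a completed migration accepts no further calls.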
Cloud APIs for migration
message StartMigrationRequest {
// The migration specification.
temporal.api.cloud.namespace.v1.MigrationSpec spec = 1;
// The id to use for this async operation.
// Optional, if not provided a random id will be generated.
string async_operation_id = 2;
}
message StartMigrationResponse {
// The migration id.
string migration_id = 1;
// The cloud namespace.
string namespace = 2;
// The async operation.
temporal.api.cloud.operation.v1.AsyncOperation async_operation = 3;
}
message MigrationSpec {
oneof variant {
// Details for migration from self-hosted to cloud.
MigrationToCloudSpec to_cloud_spec = 1;
}
// The id of the migration endpoint used for connecting
// the self-hosted Temporal cluster to Temporal cloud.
string migration_endpoint_id = 3;
}
message MigrationToCloudSpec {
// The source namespace name for the migration.
string source_namespace = 1;
// Details for the namespace that will be created as a result of the migration.
NamespaceSpec target_namespace_spec = 2;
}
message GetMigrationRequest {
// The migration id.
string migration_id = 1;
}
message GetMigrationResponse {
// The migration.
temporal.api.cloud.namespace.v1.Migration migration = 1;
}
message GetMigrationsRequest {
// The requested size of the page to retrieve.
// Cannot exceed 1000.
// Optional, defaults to 100.
int32 page_size = 1;
// The page token if this is continuing from another response.
// Optional, defaults to empty.
string page_token = 2;
}
message GetMigrationsResponse {
// The list of migrations.
repeated temporal.api.cloud.namespace.v1.Migration migrations = 1;
// The next page's token.
string next_page_token = 2;
}
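As an illustration of the paging contract in `GetMigrationsRequest` / `GetMigrationsResponse`, the Python sketch below walks all pages. `fetch_page` is a hypothetical stand-in for the GetMigrations call, responses are modeled as plain dicts, and treating an empty `next_page_token` as the end of the list is an assumption (the FR does not state it).

```python
from typing import Callable, Iterator

MAX_PAGE_SIZE = 1000      # "Cannot exceed 1000" per GetMigrationsRequest
DEFAULT_PAGE_SIZE = 100   # "defaults to 100" per GetMigrationsRequest


def iter_migrations(
    fetch_page: Callable[[int, str], dict],
    page_size: int = DEFAULT_PAGE_SIZE,
) -> Iterator[dict]:
    """Yield every migration, following next_page_token across pages."""
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE_SIZE}")
    token = ""  # an empty token requests the first page
    while True:
        resp = fetch_page(page_size, token)
        yield from resp.get("migrations", [])
        token = resp.get("next_page_token", "")
        if not token:  # ASSUMPTION: empty token means no more pages
            return


def make_fake_fetch(all_items: list) -> Callable[[int, str], dict]:
    """Hypothetical in-memory backend, used only to exercise the loop."""
    def fetch(page_size: int, page_token: str) -> dict:
        start = int(page_token) if page_token else 0
        end = start + page_size
        return {
            "migrations": all_items[start:end],
            "next_page_token": str(end) if end < len(all_items) else "",
        }
    return fetch
```

A caller would pass a function that issues the real request in place of `make_fake_fetch(...)`.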
message Migration {
// The unique id of this migration.
string migration_id = 1;
// The MigrationSpec provided in the StartMigrationRequest.
MigrationSpec spec = 2;
// The state of the migration.
State state = 3;
// The source and destination replicas involved in the migration.
repeated MigrationReplica replicas = 4;
// The number of workflows replicated.
int64 replicated_workflows = 5;
// The number of workflows remaining.
int64 replicated_workflows_remaining = 6;
// An error message if the migration failed.
string failure_message = 7;
enum State {
STATE_UNSPECIFIED = 0;
STATE_MIGRATION_STARTED = 1;
STATE_REPLICATION_IN_PROGRESS = 2;
STATE_WAITING_FOR_HANDOVER = 3;
STATE_HANDOVER_IN_PROGRESS = 4;
STATE_READY_FOR_CONFIRMATION = 5;
STATE_COMPLETE = 6;
STATE_FAILED = 7;
STATE_ABORT_IN_PROGRESS = 8;
STATE_ABORTED = 9;
}
}
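A client monitoring a migration could derive a progress figure from the two counters on `Migration`. The Python sketch below assumes the total is `replicated_workflows + replicated_workflows_remaining` and that `STATE_COMPLETE`, `STATE_FAILED`, and `STATE_ABORTED` are the terminal states; the FR states neither explicitly, and the helper names are hypothetical.

```python
from typing import Optional

# ASSUMPTION: these are the states after which polling can stop.
TERMINAL_STATES = {"STATE_COMPLETE", "STATE_FAILED", "STATE_ABORTED"}


def replication_progress(replicated: int, remaining: int) -> Optional[float]:
    """Return the replicated fraction in [0, 1], or None if no counts are known yet."""
    if replicated < 0 or remaining < 0:
        raise ValueError("workflow counts must be non-negative")
    total = replicated + remaining  # ASSUMPTION: the counters sum to the total
    if total == 0:
        return None
    return replicated / total


def is_terminal(state_name: str) -> bool:
    """True if the migration has reached a state that will not change again."""
    return state_name in TERMINAL_STATES
```

A polling loop would call GetMigration periodically, report `replication_progress(...)`, and stop once `is_terminal(...)` returns true.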
message MigrationReplica {
// The id of this replica. Indicates whether the replica is on the source
// or destination side of the migration.
string id = 1; // e.g. "source" / "target"
// The state of this replica.
State state = 2;
enum State {
STATE_UNSPECIFIED = 0;
STATE_ACTIVE = 1;
STATE_PASSIVE_OUT_OF_SYNC = 2;
STATE_PASSIVE_IN_SYNC = 3;
// If aborted migration, or if replication failed.
STATE_ABANDONED = 4;
}
}
message HandoverNamespaceRequest {
// The migration id.
string id = 1;
// The id of replica to make active.
string to_replica_id = 2;
// The id to use for this async operation.
// Optional, if not provided a random id will be generated.
string async_operation_id = 3;
}
message HandoverNamespaceResponse {
// The async operation.
temporal.api.cloud.operation.v1.AsyncOperation async_operation = 1;
}
message ConfirmMigrationRequest {
// The migration id.
string migration_id = 1;
// The id to use for this async operation.
// Optional, if not provided a random id will be generated.
string async_operation_id = 2;
}
message ConfirmMigrationResponse {
// The async operation.
temporal.api.cloud.operation.v1.AsyncOperation async_operation = 1;
}
message AbortMigrationRequest {
// The migration id.
string migration_id = 1;
// The id to use for this async operation.
// Optional, if not provided a random id will be generated.
string async_operation_id = 2;
}
message AbortMigrationResponse {
// The async operation.
temporal.api.cloud.operation.v1.AsyncOperation async_operation = 1;
}
Pre-requisites & limitations
Refer to this document for pre-requisites and limitations of using this migration tool.