Releases: temporalio/temporal

v1.29.1

29 Oct 23:47
4c24037

Release Highlights

This patch release fixes an issue in Priority and Workflow Versioning.
It also fixes a bug in Workflow Retry for versioned workflows.

Full Changelog: v1.29.0...v1.29.1

Helpful links to get you started with Temporal

Temporal Docs
Server
Docker Compose
Helm Chart

Docker images for this release

Server
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools

v1.29.0

03 Oct 17:33

⚠️💥 UPCOMING BREAKING CHANGES 💥⚠️

Starting with the next server release, 1.30.0, Temporal Docker images will be slimmed down for security reasons: we are removing the binaries and packages that don’t strictly need to be included. This includes:

temporalio/server

  • temporal CLI - included in admin-tools
  • tctl and tctl-authorization-plugin - both are deprecated CLIs
  • dockerize - used for templating the configuration, functionality that is now inlined in the server codebase
  • curl - not part of the Temporal distribution

temporalio/admin-tools

  • tctl and tctl-authorization-plugin
  • python3
  • libev
  • curl
  • jq
  • yq
  • mysql-client
  • postgresql-client
  • expat
  • tini
  • cqlsh

Task queue fairness - pre-release

Description:

Task queue fairness allows you to control the execution order of workflows, activities, and child workflows within a single task queue by assigning fairness keys and weights. Note that priority keys take precedence over fairness assignments.

Fairness can be attached to workflows and activities using the latest versions of most SDKs. For fairness to take effect on the server, set the dynamic config matching.enableFairness to true on specific task queues, on namespaces, or globally.

⚠️ Turning the feature on or off will cause currently backlogged tasks to be lost, which can cause workflows to get stuck. This limitation will be lifted in future releases.

See more usage details here: Temporal - Task Queue Fairness Guide (Pre-Release)

Rollout operational task: (schema upgrade)

Versioning improvement

Bug fixes

  • RampingVersionPercentageChangedTime is now visible from the Routing Config and can be accessed via a DescribeDeployment call (#8089)
  • Correct the drainage status of a deactivated version when it gets rolled back (#8119)
  • Fix max deployment count check to not block new versions (#7841)
  • Do not set sticky queue if a deployment transition is ongoing (#7852)
  • Do not bypass task generation when version changes during a deployment transition (#7890)
  • Make unsupported and deprecated Deployment APIs return an Unimplemented error (#8009)
  • Update DeploymentName from Override and fix batch UpdateOptions serialization bug (#7910)

Improvements

  • Queries on drained + poller-less version to respond with descriptive error message (#7946)
  • SetWorkerDeploymentManager: Help coordinate multiple tenant writes to the k8s worker controller without overwriting each other’s changes (#8278)
  • Allow people to revert a task queue back to the un-versioned state without needing to set a 0% ramped non-nil Ramping Version (#8172)
  • Allow users to be able to opt-in to activating versions that have not yet had any pollers (#8254)

Cloud rollout operational task:

Task queue config

Description:

Task queues can now be configured via the new UpdateTaskQueueConfig endpoint. It allows configuring (1) the queue’s maximum requests per second and (2) the default maximum requests per second for fairness keys (this feature is in pre-release). For example, using the Temporal CLI 1.5.0 or later:

temporal task-queue config set --task-queue foo --task-queue-type bar --queue-rate-limit 100 --queue-rate-limit-reason "throttling"

Note that the rate limit set for the task queue through the API takes precedence over the rate limit set via the worker’s TaskQueueActivitiesPerSecond option. If the API rate limit is later unset, the task queue falls back to the worker’s rate limit, if one is set; otherwise the system’s default limit applies.
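The precedence described above can be sketched in a few lines of Go. This is only an illustration of the fallback order, not server code: the function name effectiveRateLimit, the use of nil for "unset", and the default value of 1000 are all assumptions made for the example.

```go
package main

import "fmt"

// effectiveRateLimit returns the rate limit that applies to a task queue.
// A nil pointer means "not set". The name and signature are hypothetical.
func effectiveRateLimit(apiLimit, workerLimit *float64, systemDefault float64) float64 {
	if apiLimit != nil {
		return *apiLimit // the limit set through the API always wins
	}
	if workerLimit != nil {
		return *workerLimit // otherwise use the worker's own limit
	}
	return systemDefault // nothing set: the system default applies
}

func main() {
	api, worker := 100.0, 50.0
	const systemDefault = 1000.0 // made-up value for the example

	fmt.Println(effectiveRateLimit(&api, &worker, systemDefault))
	fmt.Println(effectiveRateLimit(nil, &worker, systemDefault))
	fmt.Println(effectiveRateLimit(nil, nil, systemDefault))
}
```

The three calls in main walk through the three cases in order: API limit set, API limit unset with a worker limit, and neither set.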

Authorizer

Description: Added a new dynamic config (frontend.exposeAuthorizerErrors) to control whether the frontend authorization interceptor should propagate errors returned by the Authorizer component as-is or wrap them with a PermissionDenied service error. Default is false, meaning all errors will be wrapped with PermissionDenied, which matches current behavior to avoid breaking any custom Authorizer implementers.
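A minimal sketch of the two behaviors, assuming a boolean that mirrors frontend.exposeAuthorizerErrors. The function and error values below are hypothetical stand-ins, and the sketch simplifies wrapping to returning a generic PermissionDenied error in place of the original.

```go
package main

import (
	"errors"
	"fmt"
)

// errPermissionDenied stands in for the PermissionDenied service error.
var errPermissionDenied = errors.New("PermissionDenied")

// errCustom stands in for an error returned by a custom Authorizer.
var errCustom = errors.New("tenant quota exceeded")

// interceptAuthorizerError models the choice made by the interceptor.
// exposeAuthorizerErrors mirrors the dynamic config of the same name.
func interceptAuthorizerError(err error, exposeAuthorizerErrors bool) error {
	if err == nil {
		return nil
	}
	if exposeAuthorizerErrors {
		return err // propagate the Authorizer's error as-is
	}
	// Default: hide the original error behind PermissionDenied
	// (a simplification of the wrapping described above).
	return errPermissionDenied
}

func main() {
	fmt.Println(interceptAuthorizerError(errCustom, false))
	fmt.Println(interceptAuthorizerError(errCustom, true))
}
```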

Eager workflow start public release (on by default)

Description: Eager workflow start is a latency optimization for a worker and a starter that are colocated in the same process. If the starter (client) requests eager execution and a worker slot is available, the client will ask the server to start the workflow eagerly. If permitted (the dynamic config is on and there is no start delay), the first workflow task will be returned inline to be processed by the colocated worker.

This feature is now on by default and the EagerExecutionAccepted flag has been added to WorkflowExecutionStartedEventAttributes for debugging purposes (#8056)

To turn this feature off set the dynamic config system.enableEagerWorkflowStart to false.
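The conditions for an eager start can be summarized as a single predicate. The sketch below is an illustration only; the type and function names are hypothetical, and in practice the slot check happens in the client while the config and start-delay checks happen on the server.

```go
package main

import (
	"fmt"
	"time"
)

// startRequest holds the parts of a start request that matter here.
// The type and its fields are hypothetical.
type startRequest struct {
	RequestEager bool          // the starter asked for eager execution
	StartDelay   time.Duration // zero means "start immediately"
}

// eagerStartPermitted reports whether the first workflow task would be
// returned inline to the colocated worker.
func eagerStartPermitted(req startRequest, slotAvailable, enableEagerWorkflowStart bool) bool {
	return req.RequestEager &&
		slotAvailable && // a local worker slot is free
		enableEagerWorkflowStart && // system.enableEagerWorkflowStart is true
		req.StartDelay == 0 // workflows with a start delay are not eager
}

func main() {
	fmt.Println(eagerStartPermitted(startRequest{RequestEager: true}, true, true))
	fmt.Println(eagerStartPermitted(startRequest{RequestEager: true, StartDelay: time.Minute}, true, true))
}
```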

Activity and workflow metrics changes

Description: A number of activity and workflow metrics were added and activity_e2e_latency has been deprecated. (#8196, #8185)

Deprecated Metrics

  • activity_e2e_latency → Deprecated, replaced with better named activity_start_to_close_latency

New Metrics

  • activity_start_to_close_latency: Per attempt latency from activity start to close
  • activity_schedule_to_close_latency: End-to-end duration, including retries and backoff.
  • activity_success: Number of succeeded activities
  • activity_fail: Number of final failures for activities
  • activity_timeout: Incremented on the final activity timeout tagged by timeout_type
  • activity_task_fail: Failures for activities including retries
  • activity_task_timeout: Number of activity attempt timeouts, tagged by timeout_type.
  • activity_cancel: Number of canceled activities
  • workflow_duration: End to end latency of workflow
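To make the difference between the two latency metrics concrete, here is a small sketch that computes both from a made-up retry timeline. The attempt type and the helper functions are hypothetical; they only mirror the definitions listed above and are not how the server records metrics.

```go
package main

import (
	"fmt"
	"time"
)

// attempt records when one activity attempt started and closed,
// as offsets from the moment the activity was scheduled.
type attempt struct {
	Started time.Duration
	Closed  time.Duration
}

// startToClose is the per-attempt latency
// (what activity_start_to_close_latency describes).
func startToClose(a attempt) time.Duration {
	return a.Closed - a.Started
}

// scheduleToClose is the end-to-end duration, including retries and backoff
// (what activity_schedule_to_close_latency describes).
func scheduleToClose(attempts []attempt) time.Duration {
	if len(attempts) == 0 {
		return 0
	}
	return attempts[len(attempts)-1].Closed
}

func main() {
	attempts := []attempt{
		{Started: 1 * time.Second, Closed: 3 * time.Second},  // first attempt fails
		{Started: 8 * time.Second, Closed: 10 * time.Second}, // retry after backoff succeeds
	}
	for i, a := range attempts {
		fmt.Printf("attempt %d start-to-close: %v\n", i+1, startToClose(a))
	}
	fmt.Println("schedule-to-close:", scheduleToClose(attempts))
}
```

In this timeline each attempt takes 2 seconds, while the end-to-end duration is 10 seconds because it also counts queueing time and the backoff between attempts.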

Batch operation improvement

Description: Removed the internal BatchParams go struct in favor of a protobuf struct for safer serialization. Optimized the batch operation processing with proactive page fetching, removing the worker wait for new pages to be completed. (#8144, #8081).

Batch Activity Reset and Update Options

Description: Support activity reset and update-options on the server side. (#8061)

Nexus

Description:

  • Fixed a bug where the standby outbound task executor could sometimes return a NamespaceNotActive error.
  • Fixed a data race in Nexus disallowed headers dynamic config.
  • All remote frontend HTTP calls will always attempt to use HTTP2 by default.
  • Changed the default to true for dynamic config component.nexusoperations.recordCancelRequestCompletionEvents. This config will be removed in a future release.
  • Fixed a bug for forwarded Nexus operation completion HTTP requests that contained a failure. The failure will now be reconstructed instead of reusing the original HTTP request body.
  • Added logic to forward Nexus HTTP requests using the same dispatch type (by endpoint or by namespace+task queue) as the original request. This behavior is behind a dynamic config, frontend.nexusForwardRequestUseEndpointDispatch, because endpoints currently do not support replication and therefore forwarding requests by endpoint will not work out of the box.
  • The original HTTP request headers will now be passed through as-is, without sanitization, for forwarded requests.
  • Bug fix: a new workflow task will be generated for NexusOperationCancelRequestCompleted|Failed events.

Visibility

Description: ScanWorkflowExecutions has been removed from Visibility. The API is still available, but it simply calls ListWorkflowExecutions.

Worker Insights

Description: Worker insights is an umbrella project that addresses the operational complexities and toil of worker management (tuning performance knobs, scaling workers). Our end goal is to automate this process for our users. As a first step, this release adds the capability to propagate the worker state (configuration + SDK metrics) to the server via a heartbeat mechanism. This state is stored in-memory on the matching servers, and can be queried by the user.

To achieve this, this release introduces 3 APIs:

  1. RecordWorkerHeartbeat: To send heartbeat to the matching service. Used by SDK.
  2. ListWorkers: To query the state of 1 or more workers that match a predicate.
  3. DescribeWorker: To query the state of a specific worker.

These APIs are disabled by default.

The flags that enable them are frontend.WorkerHeartbeatsEnabled and frontend.ListWorkersEnabled.

Docker images for this release (use the tag 1.29.0)

Server
Server With Auto Setup (what is Auto-Setup?)
[Admin-Tools](https://hu...

v1.27.3

20 Aug 22:29

Release Highlights

  • Limit the number of parts allowed for auth token by @picatz in #8122

Docker images for this release (use the tag 1.27.3)

Server
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools

Full Changelog: v1.27.2...v1.27.3

v1.26.3

20 Aug 22:30

Release Highlights

  • Limit the number of parts allowed for auth token by @picatz in #8122

Docker images for this release (use the tag 1.26.3)

Server
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools

Full Changelog: v1.26.2...v1.26.3

v1.28.1

06 Aug 19:12

This patch release fixes a few Workflow Update, Worker Deployment, Scheduler, Matching, and security bugs.

Docker images for this release (use the tag 1.28.1)

Server
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools

Full Changelog: v1.28.0...v1.28.1

v1.28.0

27 Jun 21:12
b036505

Schema changes

Before upgrading your Temporal Cluster to v1.28.0, you must upgrade your core schemas to the following:

  • MySQL schema v1.17
  • PostgreSQL schema v1.17
  • Cassandra schema v1.12

Please see our upgrade documentation for the necessary steps to upgrade your schemas.

Deprecation Announcements

Deprecating old Versioning APIs: APIs related to previous versions of Worker Versioning are deprecated, as detailed below.

The following APIs related to the December 2024 pre-release of Worker Versioning have been deprecated and are now no longer supported:

  • DescribeDeployment
  • ListDeployments
  • GetDeploymentReachability
  • GetCurrentDeployment
  • SetCurrentDeployment

The following APIs are now deprecated and will be removed once the latest APIs reach General Availability in the coming months:

  • UpdateWorkerVersioningRules
  • GetWorkerVersioningRules
  • UpdateWorkerBuildIdCompatibility
  • GetWorkerBuildIdCompatibility
  • GetWorkerTaskReachability

Release Highlights

Update-With-Start GA

Update-With-Start sends a Workflow Update that checks whether an already-running Workflow with that ID exists. If it does, the Update is processed. If not, it starts a new Workflow Execution with the supplied ID. When starting a new Workflow, it immediately processes the Update.

Update-With-Start is great for latency-sensitive use cases.
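The semantics can be illustrated with a toy in-memory model. This is a sketch of the behavior described above, not SDK or server code; the server type and the updateWithStart method are hypothetical.

```go
package main

import "fmt"

// server is a toy in-memory stand-in for the Temporal server.
type server struct {
	// running maps a workflow ID to the updates it has processed so far.
	running map[string][]string
}

// updateWithStart processes the update on the running workflow with the
// given ID, starting that workflow first if it does not exist yet.
// It reports whether a new workflow was started.
func (s *server) updateWithStart(workflowID, update string) (started bool) {
	if _, ok := s.running[workflowID]; !ok {
		s.running[workflowID] = nil // start a new workflow execution
		started = true
	}
	// In both cases the update is processed immediately.
	s.running[workflowID] = append(s.running[workflowID], update)
	return started
}

func main() {
	s := &server{running: map[string][]string{}}
	fmt.Println(s.updateWithStart("order-42", "add-item")) // prints "true"
	fmt.Println(s.updateWithStart("order-42", "add-item")) // prints "false"
}
```

The first call starts the workflow and applies the update in one round trip; the second call finds the workflow already running and only applies the update.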

Nexus - Back Multiple Operations by Single Workflow

Now you can have multiple callers starting operations backed by a single workflow. When the handler tries to start a workflow that is already running, with the “use existing” conflict policy, the server will attach the caller’s callback to the running workflow. When the workflow completes, the server will call all attached callbacks to notify the callers with the workflow result.

Here’s an example using Go SDK (available in v1.34.0+):

package sample

import (
	"context"

	"github.com/nexus-rpc/sdk-go/nexus"
	enumspb "go.temporal.io/api/enums/v1"
	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/temporalnexus"
	"go.temporal.io/sdk/workflow"
)

// SampleWorkflowInput and SampleWorkflow are placeholders for your own
// workflow; they are only here so that the snippet compiles.
type SampleWorkflowInput struct{}

func SampleWorkflow(ctx workflow.Context, input SampleWorkflowInput) (string, error) {
	return "done", nil
}

var sampleOperation = temporalnexus.NewWorkflowRunOperation(
	"sample-operation",
	SampleWorkflow,
	func(
		ctx context.Context,
		input SampleWorkflowInput,
		options nexus.StartOperationOptions,
	) (client.StartWorkflowOptions, error) {
		return client.StartWorkflowOptions{
			// Workflow ID is used as idempotency key.
			ID: "sample-workflow-id",
			// If a workflow with the same ID is already running, the callback is attached to that running workflow.
			// Otherwise, a new workflow is started.
			WorkflowIDConflictPolicy: enumspb.WORKFLOW_ID_CONFLICT_POLICY_USE_EXISTING,
		}, nil
	},
)

The handler workflow will return a RequestIdReferenceLink to the caller. This is an indirect link to the history event that attached the callback in the handler workflow. Links can provide information about the handler workflow to the caller. In order to get the exact event history, there is now the RequestIdInfos map in the WorkflowExtendedInfo field of a DescribeWorkflowExecutionResponse. To enable RequestIdReferenceLink, you have to set the dynamic config history.enableRequestIdRefLinks to true (this might become enabled by default in a future release).

Nexus - Callback ↔ Link Association

Nexus links were previously stored in the history event, without a direct association to the callback they arrived with. The server now stores the Nexus links together with the callback. With this direct association you can, for example, easily find the caller that triggered the Nexus workflow by following the links from the callback. This feature requires the latest version of the Go SDK (v1.35.0+) and the Java SDK (v1.30.0+).

Nexus - Cancellation Types

The server now supports Nexus operation cancellation types. These are specified when starting an operation and indicate what should happen to Nexus operations when the parent context that started them is cancelled. To use them, you must be using an SDK version that supports them. Available cancellation types are:

  • Abandon - Do not request cancellation of the operation
  • TryCancel - Request cancellation of the operation and immediately report cancellation to callers
  • WaitRequested - Request cancellation and wait until the cancellation request has been received by the operation handler
  • WaitCompleted - Request cancellation and wait for the operation to complete. The operation may or may not complete as cancelled. This is the default, and it is the behavior of server versions <1.28

For the WaitRequested type to work, you must set the dynamic config component.nexusoperations.recordCancelRequestCompletionEvents to true (default false).

Nexus - Miscellaneous

  • Nexus callback request processing logic will now attempt to deserialize failure information from the body of failed callback HTTP requests. This means that Nexus operation handlers should now see more informative messages about why their callback failed to be delivered to callers.
  • Links added by Nexus operations should now be appropriately propagated across continue-as-new, workflow reset, and workflow retries.

Versioning / Safe-Deploy Public Preview

The following Worker Versioning APIs have graduated to the Public Preview stage. Production usage is encouraged, but note that limited changes might be made to the APIs before General Availability in the coming months.

  • ListWorkerDeployments
  • DescribeWorkerDeployment
  • DescribeWorkerDeploymentVersion
  • SetWorkerDeploymentCurrentVersion
  • SetWorkerDeploymentRampingVersion
  • UpdateWorkerVersionMetadata
  • DeleteWorkerDeployment
  • DeleteWorkerDeploymentVersion

Using Worker Versioning: Find instructions in https://docs.temporal.io/worker-versioning.

Operator notes:

  • The following configs need to be enabled:
    • frontend.workerVersioningWorkflowAPIs (default: true)
    • system.enableDeploymentVersions (default: true)
  • Knobs:
    • matching.maxDeployments controls the maximum number of worker deployments that the server allows to be registered in a single namespace (default: 100, safe to increase to much higher values)
    • matching.maxVersionsInDeployment controls the maximum number of versions that the server allows to be registered in a single worker deployment (default: 100, unsafe to increase beyond a few 100s)
    • matching.maxTaskQueuesInDeploymentVersion controls the maximum number of task queues that the server allows to be registered in a single worker deployment version (default: 100, unsafe to increase beyond a few 100s)
    • matching.wv.VersionDrainageStatusVisibilityGracePeriod controls how long the system waits before checking the drainage status of a version that has just entered the DRAINING state (default: 3 minutes, setting a very low value might cause the status to become DRAINED incorrectly)
    • matching.wv.VersionDrainageStatusRefreshInterval controls the interval used for checking drainage status (default: 3 minutes, lowering the value will increase load on the Visibility database)

Please see deprecation warnings regarding earlier versions of Temporal versioning APIs.

⚠️ Important for Worker Versioning users (v1.27.x → v1.28.0)

If you used Worker Versioning in v1.27.x, you must delete all Worker Deployments (via DeleteWorkerDeployment) before upgrading to v1.28.0, then recreate them after. This is due to breaking changes between v1.27.2 and v1.28.0.

If you already upgraded or need help, ask in #safe-deploys on Community Slack.

Simple priority for task queues - pre-release

Simple priority allows you to control the execution order of workflows, activities, and child workflows based on assigned priority values within a single task queue. You can select a priority level in the integer range [1,5]. A lower value implies higher priority.
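As an illustration of "a lower value implies higher priority", the sketch below orders a backlog by priority value. It is not the matcher's implementation: the task type is hypothetical, and keeping arrival order among tasks of equal priority is an assumption made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// task is a hypothetical backlog entry.
type task struct {
	ID       string
	Priority int // in the range [1,5]; a lower value means higher priority
}

// dispatchOrder returns the tasks sorted so that lower priority values come
// first. The stable sort keeps arrival order within one priority level,
// which is an assumption of this sketch.
func dispatchOrder(backlog []task) []task {
	ordered := append([]task(nil), backlog...) // copy; leave the input untouched
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].Priority < ordered[j].Priority
	})
	return ordered
}

func main() {
	backlog := []task{{"a", 3}, {"b", 1}, {"c", 3}, {"d", 5}}
	for _, t := range dispatchOrder(backlog) {
		fmt.Println(t.ID, t.Priority)
	}
}
```

With this backlog, task b (priority 1) is dispatched first, then a and c (priority 3) in arrival order, and d (priority 5) last.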

Priority can be attached to workflows and activities using the latest versions of most SDKs. In order for priority to take effect on the server, you need to switch to the new implementation of the matching service: set the dynamic config matching.useNewMatcher to true either on specific task queues, namespaces, or globally. After the new matcher has been turned on for a task queue, turning it off will cause tasks with non-default priority to be temporarily lost until it’s turned on again.

When the setting is changed, the implementation will be switched immediately, which may cause a momentary disruption in task dispatch.

Besides enabling priority, the new matcher will have a different profile of persistence operations, and slightly different behavior with task forwarding and some other edge cases. If you see performance regressions or unexpected behavior with the new matcher, please let us know.

See more usage details here: Temporal - Task Queue Priority Guide (Pre-Release)

Operator commands

  • Improvements to the activity in DescribeWorkflow:
    • Add activity options to the pending activity info
      • Task Queue name
      • All timeouts
      • Retry Policy
    • Add PauseInfo to the pending activity info
      • Timestamp
      • Identity
      • Pause reason
    • Add PAUSED and PAUSE_REQUESTED to activity state. This makes it possible to distinguish a paused activity from one whose pause request has been received while it is still running on the worker.
  • Send ActivityPause/ActivityReset flag in heartbeat response. This notifies the workers about activity state.
  • Add the ability to restore activity options to their original state for the ResetActivity and UpdateActivityOptions commands.
  • Add a flag to CLI to restore activity options for the activity res...

v1.27.2

28 Mar 16:36

This patch release fixes a few minor Worker Deployment and Nexus bugs.

Docker images for this release (use the tag 1.27.2)

Server
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools

Full Changelog: v1.27.1...v1.27.2

v1.27.1

26 Feb 23:01

Schema Changes

Before upgrading your Temporal Cluster to v1.27.1, you must upgrade your core schema if you are using MySQL or PostgreSQL, and your visibility schema to the following:

  • Core:
    • MySQL schema v1.16
    • PostgreSQL schema v1.16
  • Visibility:
    • Elasticsearch schema v9
    • MySQL schema v1.9
    • PostgreSQL schema v1.9

Please see our upgrade documentation for the necessary steps to upgrade your schemas.

NOTE: The upgrade to MySQL and PostgreSQL Visibility schemas may come with temporary performance degradation because of creation of a new column _version which has default values. Consider performing the schema upgrades when load is low. There are protective mechanisms in place to account for timeouts from any VisibilityStore.

Deprecation of Visibility Scan APIs

Visibility Scan APIs have been deprecated in favor of List Workflow APIs. Visibility Scan APIs will be removed in a future version. Migration to List Workflow APIs will be required in future versions.

Nexus GA

Nexus is now GA with a stable server API.

Read more on how to disable Nexus or how to operate it.

Notable features and bug fixes since v1.26.2:

  • The server now allows a maximum of 30 pending Nexus operations per workflow by default, as opposed to the previous limit of 30 total.
  • Add support for attaching callbacks, request IDs, and links to a workflow via StartWorkflowExecutionRequest.OnConflictOptions with WorkflowIdConflictPolicy of USE_EXISTING. This allows multiple operations to be backed by the same workflow.
  • Misc small features and tests.

Safe Deploys

The following APIs are added for Worker Versioning. All APIs are experimental and not yet recommended for production usage. You need to set the dynamic configs system.enableDeployments and system.enableDeploymentVersions in order to use them.

  • ListWorkerDeployments
  • DescribeWorkerDeployment
  • DescribeWorkerDeploymentVersion
  • SetWorkerDeploymentCurrentVersion
  • SetWorkerDeploymentRampingVersion
  • UpdateWorkerVersionMetadata
  • DeleteWorkerDeployment
  • DeleteWorkerDeploymentVersion

The following APIs are now deprecated and replaced by above:

  • DescribeDeployment
  • ListDeployments
  • GetDeploymentReachability
  • GetCurrentDeployment
  • SetCurrentDeployment

Activity Commands (pre-release)

Changes to the Activity Commands — a set of APIs designed to resolve issues related to activity execution. The following APIs were updated:

  • UpdateActivityOptionsById was renamed to UpdateActivityOptions. This API can be used by the client to update the options of an activity while activity is running.
  • PauseActivityById was renamed to PauseActivity. With this API, if the Activity is not currently running (e.g. because it previously failed and is waiting for the next retry), it will not be run again until it is unpaused.
    • activity_type parameter was added. If this parameter is set - all running activities of this type will be paused.
  • UnpauseActivityById was renamed to UnpauseActivity. With this API clients can re-schedule a previously-paused Activity for execution.
    • no_wait parameter was removed
    • activity_type parameter was added. If this parameter is set - all paused activities of this type will be unpaused.
    • match-all parameter was added. If this parameter is set - all paused activities will be unpaused.
    • jitter parameter was added. If set, the activity will start at a random time within the specified jitter duration.
  • ResetActivityById was renamed to ResetActivity. With this API clients can reset the execution of an activity, specified by its ID or type.
    • no_wait parameter was removed
    • activity_type parameter was added. If this parameter is provided - all paused activities of this type will be unpaused.
    • keep_paused parameter was added. If this parameter is provided - all paused activities will stay paused. Otherwise they will be unpaused.
    • jitter parameter was added. If set, the activity will start at a random time within the specified jitter duration.
  • New batch operation was introduced - BatchOperationUnpauseActivities.
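The jitter parameter mentioned in the list above can be pictured as a random delay drawn from the jitter window. The sketch below assumes a uniform distribution over [0, jitter], which the release notes do not specify; the function name is hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// jitteredDelay picks a random delay between 0 and jitter (inclusive).
// A uniform distribution is an assumption of this sketch.
func jitteredDelay(jitter time.Duration) time.Duration {
	if jitter <= 0 {
		return 0 // no jitter requested: start right away
	}
	return time.Duration(rand.Int63n(int64(jitter) + 1))
}

func main() {
	// Spread three unpaused activities over a 30-second window.
	for i := 0; i < 3; i++ {
		fmt.Println("start after", jitteredDelay(30*time.Second))
	}
}
```

Spreading start times this way avoids unpausing or resetting many activities at the same instant.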

Workflow Reset with children

In this release we have added the functionality to reset a workflow with a pending child.

Prior to this release, resetting to a point between child workflow initiated and child workflow completed was not supported (the reset operation would fail). In the current release, the reset operation allows this case, and the parent reconnects to the running child after the reset. The new run of the parent will receive the child’s completion event and result (if any) from the child.

The feature is gated behind a per-namespace boolean dynamic configuration, AllowResetWithPendingChildren, which is enabled by default for all namespaces.

Note: If you are using Go-SDK and are relying on the SDK to generate child workflow IDs then you need to update it to the latest version to be able to use this feature. Other SDKs don’t need any upgrade to use this feature.

Delete namespace improvement

  1. Delete workflow executions RPS is now dynamic. Previously, frontend.deleteNamespaceDeleteActivityRPS was read only once when namespace deletion started, and subsequent changes to this dynamic config didn't affect the ongoing deletion. This was inconvenient for large namespaces since the default RPS is only 100. Now the RPS can be adjusted on the fly.

    Please note: Since deletion of Workflow Execution is an asynchronous process, this RPS controls the rate at which delete execution tasks are created. Decreasing this value (for example, from 1000 to 10) won't immediately slow down the process, as existing tasks in the transfer queue must be processed first.

  2. DeleteExecutionsWorkflow now supports a stats query to track its progress. Since this Workflow can run for hours after a namespace is marked as deleted, it was previously difficult to monitor how many Workflow Executions remained. The new query handler provides current statistics about total and remaining executions.

  3. Metrics and logging have been enhanced for better actionability. Key improvements include:

    1. Monitor ReclaimResources workflow failures using the metrics reclaim_resources_namespace_delete_failure and reclaim_resources_delete_executions_failure.
    2. Track DeleteExecutionsWorkflow progress using the metrics delete_executions_success and delete_executions_failure.
  4. Business critical namespaces can be protected from deletion. Use dynamic config to list namespaces which are not deletable:

    worker.protectedNamespaces:
      - value:
          - critical_namespace
          - just_very_important_namespace
  5. Sleep duration in ReclaimResourcesWorkflow now supports dynamic changes. If a namespace delete delay was mistakenly set too long, you can now modify it after the Workflow has started. Use this command to update the delay to a new value (10 hours in this example):

    temporal workflow update --namespace temporal-system --name update_namespace_delete_delay --workflow-id temporal-sys-reclaim-namespace-resources-workflow/default-deleted-93f5e --input '"10h"'
    

    Or use this command to remove the delay entirely:

    temporal workflow update --namespace temporal-system --name update_namespace_delete_delay --workflow-id temporal-sys-reclaim-namespace-resources-workflow/default-deleted-93f5e --input '"0"'
    

    Please note: The new delay starts from when it is set, not from when the original timer was created. For example, if the Workflow has already slept for 2 hours and the timer is updated to 10h, it will sleep for another 10 hours, not 8.
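    The arithmetic in the note above, written out as a tiny Go sketch (the function name is hypothetical):

```go
package main

import (
	"fmt"
	"time"
)

// totalSleep returns how long the workflow sleeps overall when the delay is
// updated after it has already slept for sleptBeforeUpdate. The new delay
// counts from the moment of the update, so the two durations add up.
func totalSleep(sleptBeforeUpdate, newDelay time.Duration) time.Duration {
	return sleptBeforeUpdate + newDelay
}

func main() {
	// Already slept 2 hours, then the delay is updated to 10h.
	fmt.Println(totalSleep(2*time.Hour, 10*time.Hour)) // prints "12h0m0s"
}
```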

Scheduled Actions

  • Scheduler workflow version has been updated.
    • FutureActionTimes now accounts for a schedule’s update time and RemainingActions.
    • ScheduleActionResult now includes a WorkflowExecutionStatus field, providing an eventually-consistent view of a workflow’s status within List results.
  • Bugfix: Queries on schedules might have been delayed after dynamic config changes were applied to per-namespace workers. An anti-entropy mechanism has been added to the per-namespace worker component.

Fix out-of-order Visibility tasks with SQL database

All SQL stores used for Visibility had the rare possibility to perform updates to a workflow's visibility state out-of-order. This could result in Workflows occasionally appearing to have state that is out of date.

Implementation

This has been fixed by adding a new column _version to all SQL store implementations. Queries to update Visibility data now ensure the _version advances before performing any writes.

Updates to Visibility data are prepared by checking actual Workflow state. Therefore, when a write is rejected for being out-of-order, we know the VisibilityStore already contains an equal or more up-to-date state, so we drop out-of-order updates silently.
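The version check can be sketched with an in-memory map standing in for the SQL table. This is an illustration of the idea only; the store and visibilityRow types are hypothetical, and the real implementation performs the comparison inside the SQL write.

```go
package main

import "fmt"

// visibilityRow is a simplified visibility record.
type visibilityRow struct {
	Version int64  // plays the role of the _version column
	Status  string // example payload
}

// store is an in-memory stand-in for a SQL VisibilityStore.
type store struct {
	rows map[string]visibilityRow
}

// upsert writes the incoming row only if its version advances past the
// stored one. It reports whether the write was applied; stale or duplicate
// writes are dropped silently.
func (s *store) upsert(workflowID string, incoming visibilityRow) bool {
	current, ok := s.rows[workflowID]
	if ok && incoming.Version <= current.Version {
		return false // the store already holds equal or newer state
	}
	s.rows[workflowID] = incoming
	return true
}

func main() {
	s := &store{rows: map[string]visibilityRow{}}
	fmt.Println(s.upsert("wf-1", visibilityRow{Version: 2, Status: "Completed"}))
	// A delayed, older update arrives out of order and is ignored.
	fmt.Println(s.upsert("wf-1", visibilityRow{Version: 1, Status: "Running"}))
	fmt.Println(s.rows["wf-1"].Status)
}
```

After both writes the row still shows the newer Completed state, which is the out-of-date-state problem this change prevents.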

Important Notes

  • Make sure to upgrade your schemas before advancing your Temporal version. See the Schema Changes section.
  • This will likely reduce the performance of SQL VisibilityStores.
  • We suggest Elasticsearch as the performant solution for VisibilityStore.

Docker images for this release (use the...

v1.27.0

22 Feb 00:18

Pre-release

Caution

Please DO NOT use this release if you are using SQL-based (PostgreSQL, MySQL, or SQLite) persistence. Update directly to v1.27.1.
This release made changes to the SQL schemas, and there's a bug when upgrading from an older version.

Schema Changes

Before upgrading your Temporal Cluster to v1.27.0, you must upgrade your core schema if you are using MySQL or PostgreSQL, and your visibility schema to the following:

  • Core:
    • MySQL schema v1.15
    • PostgreSQL schema v1.15
  • Visibility:
    • Elasticsearch schema v9
    • MySQL schema v1.9
    • PostgreSQL schema v1.9

Please see our upgrade documentation for the necessary steps to upgrade your schemas.

NOTE: The upgrade to MySQL and PostgreSQL Visibility schemas may come with temporary performance degradation because of creation of a new column _version which has default values. Consider performing the schema upgrades when load is low. There are protective mechanisms in place to account for timeouts from any VisibilityStore.

Deprecation of Visibility Scan APIs

Visibility Scan APIs have been deprecated in favor of List Workflow APIs. Visibility Scan APIs will be removed in a future version. Migration to List Workflow APIs will be required in future versions.

Nexus GA

Nexus is now GA with a stable server API.

Read more on how to disable Nexus or how to operate it.

Notable features and bug fixes since v1.26.2:

  • The server now allows a maximum of 30 pending Nexus operations per workflow by default, as opposed to the previous limit of 30 total.
  • Add support for attaching callbacks, request IDs, and links to a workflow via StartWorkflowExecutionRequest.OnConflictOptions with WorkflowIdConflictPolicy of USE_EXISTING. This allows multiple operations to be backed by the same workflow.
  • Misc small features and tests.

Safe Deploys

The following APIs are added for Worker Versioning. All APIs are experimental and not yet recommended for production usage. You need to set the dynamic configs system.enableDeployments and system.enableDeploymentVersions in order to use them.

  • ListWorkerDeployments
  • DescribeWorkerDeployment
  • DescribeWorkerDeploymentVersion
  • SetWorkerDeploymentCurrentVersion
  • SetWorkerDeploymentRampingVersion
  • UpdateWorkerVersionMetadata
  • DeleteWorkerDeployment
  • DeleteWorkerDeploymentVersion

The following APIs are now deprecated and replaced by the APIs above:

  • DescribeDeployment
  • ListDeployments
  • GetDeploymentReachability
  • GetCurrentDeployment
  • SetCurrentDeployment

Activity Commands (pre-release)

This release changes the Activity Commands, a set of APIs designed to resolve issues related to activity execution. The following APIs were updated:

  • UpdateActivityOptionsById was renamed to UpdateActivityOptions. This API can be used by the client to update the options of an activity while the activity is running.
  • PauseActivityById was renamed to PauseActivity. With this API, if the Activity is not currently running (e.g. because it previously failed and is waiting for the next retry), it will not be run again until it is unpaused.
    • activity_type parameter was added. If this parameter is set - all running activities of this type will be paused.
  • UnpauseActivityById was renamed to UnpauseActivity. With this API clients can re-schedule a previously-paused Activity for execution.
    • no_wait parameter was removed
    • activity_type parameter was added. If this parameter is set - all paused activities of this type will be unpaused.
    • match-all parameter was added. If this parameter is set - all paused activities will be unpaused.
    • jitter parameter was added. If set, the activity will start at a random time within the specified jitter duration.
  • ResetActivityById was renamed to ResetActivity. With this API clients can unpause the execution of a previously paused activity, specified by its ID.
    • no_wait parameter was removed
    • activity_type parameter was added. If this parameter is provided - all paused activities of this type will be unpaused.
    • keep_paused parameter was added. If this parameter is provided - all paused activities will stay paused. Otherwise they will be unpaused.
    • jitter parameter was added. If set, the activity will start at a random time within the specified jitter duration.
  • New batch operation was introduced - BatchOperationUnpauseActivities.
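
The jitter parameter described above can be pictured with a minimal sketch. It assumes a uniform distribution over the jitter window, which the notes do not state, and the function name is hypothetical rather than part of the server or any SDK.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// jitteredDelay returns a random delay in the half-open range [0, jitter).
// Assumption: the delay is drawn uniformly; the release notes only say
// "a random time within the specified jitter duration".
func jitteredDelay(jitter time.Duration) time.Duration {
	if jitter <= 0 {
		return 0 // no jitter requested: start immediately
	}
	return time.Duration(rand.Int63n(int64(jitter)))
}

func main() {
	d := jitteredDelay(30 * time.Second)
	// The chosen delay always falls inside the jitter window.
	fmt.Println(d >= 0 && d < 30*time.Second) // true
}
```

Spreading start times this way avoids unpausing or resetting a large group of activities at the same instant.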

Workflow Reset with children

In this release we have added the functionality to reset a workflow with a pending child.

Prior to this release, resetting to a point between child workflow initiated and child workflow completed was not supported (the reset operation would fail). In the current release, the reset operation allows this case, and after the reset the parent reconnects to the running child. The new run of the parent will receive the child’s completion event and result (if any) from the child.

The feature is gated behind a per namespace boolean dynamic configuration AllowResetWithPendingChildren which is enabled by default for all namespaces.

Note: If you are using the Go SDK and rely on the SDK to generate child workflow IDs, you need to update it to the latest version to use this feature. Other SDKs don’t need an upgrade to use this feature.

Delete namespace improvement

  1. Delete workflow executions RPS is now dynamic. Previously, frontend.deleteNamespaceDeleteActivityRPS was read only once when namespace deletion started, and subsequent changes to this dynamic config didn't affect the ongoing deletion. This was inconvenient for large namespaces since the default RPS is only 100. Now the RPS can be adjusted on the fly.

    Please note: Since deletion of Workflow Execution is an asynchronous process, this RPS controls the rate at which delete execution tasks are created. Decreasing this value (for example, from 1000 to 10) won't immediately slow down the process, as existing tasks in the transfer queue must be processed first.

  2. DeleteExecutionsWorkflow now supports a stats query to track its progress. Since this Workflow can run for hours after a namespace is marked as deleted, it was previously difficult to monitor how many Workflow Executions remained. The new query handler provides current statistics about total and remaining executions.

  3. Metrics and logging have been enhanced for better actionability. Key improvements include:

    1. Monitor ReclaimResources workflow failures using the metrics reclaim_resources_namespace_delete_failure and reclaim_resources_delete_executions_failure.
    2. Track DeleteExecutionsWorkflow progress using the metrics delete_executions_success and delete_executions_failure.
  4. Business-critical namespaces can be protected from deletion. Use dynamic config to list namespaces which are not deletable:

    worker.protectedNamespaces:
      - value:
          - critical_namespace
          - just_very_important_namespace
  5. Sleep duration in ReclaimResourcesWorkflow now supports dynamic changes. If a namespace delete delay was mistakenly set too long, you can now modify it after the Workflow has started. Use this command to update the delay to a new value (10 hours in this example):

    temporal workflow update --namespace temporal-system --name update_namespace_delete_delay --workflow-id temporal-sys-reclaim-namespace-resources-workflow/default-deleted-93f5e --input '"10h"'
    

    Or use this command to remove the delay entirely:

    temporal workflow update --namespace temporal-system --name update_namespace_delete_delay --workflow-id temporal-sys-reclaim-namespace-resources-workflow/default-deleted-93f5e --input '"0"'
    

    Please note: The new delay starts from when it is set, not from when the original timer was created. For example, if the Workflow has already slept for 2 hours and the timer is updated to 10h, it will sleep for another 10 hours, not 8.
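
The timer arithmetic in the note above can be worked through in a few lines. The function name is hypothetical; the numbers are the ones from the example (2 hours already slept, delay updated to 10h).

```go
package main

import (
	"fmt"
	"time"
)

// totalSleep returns how long the workflow sleeps overall when the delay is
// updated after it has already slept for elapsed. The new delay is counted
// from the moment of the update, so the elapsed time is not subtracted.
func totalSleep(elapsed, newDelay time.Duration) time.Duration {
	return elapsed + newDelay
}

func main() {
	elapsed := 2 * time.Hour
	newDelay := 10 * time.Hour
	fmt.Println("remaining after update:", newDelay)           // remaining after update: 10h0m0s
	fmt.Println("total sleep:", totalSleep(elapsed, newDelay)) // total sleep: 12h0m0s
}
```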

Scheduled Actions

  • Scheduler workflow version has been updated.
    • FutureActionTimes now accounts for a schedule’s update time and RemainingActions.
    • ScheduleActionResult now includes a WorkflowExecutionStatus field, providing an eventually-consistent view of a workflow’s status within List results.
  • Bugfix: Queries on schedules might have been delayed after dynamic config changes were applied to per-namespace workers. An anti-entropy mechanism has been added to the per-namespace worker component.

Fix out-of-order Visibility tasks with SQL database

All SQL stores used for Visibility had the rare possibility to perform updates to a workflow's visibility state out-of-order. This could result in Workflows occasionally appearing to have state that is out of date.

Implementation

This has been fixed by adding a new column _version to all SQL store implementations. Queries to update Visibility data now ensure the _version advances before performing any writes.

Updates to Visibility data are prepared by checking actual Workflow state. Therefore when a write is rejected for being out-of-order, we know the VisibilityStore already contains an equal or more up-to-date state, so we drop out-of-order updates silently.
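
The version check can be sketched with an in-memory stand-in for the SQL table. This is an illustration of the technique only: the types and method names are hypothetical, and the real implementation performs the comparison inside the SQL statements against the _version column.

```go
package main

import "fmt"

// record is a hypothetical visibility row with only the fields needed here.
type record struct {
	version int64  // stands in for the _version column
	status  string // stands in for the rest of the visibility state
}

// store is an in-memory stand-in for a SQL visibility table keyed by run ID.
type store map[string]record

// upsert writes rec only if its version is newer than the stored one.
// It returns false when the write is dropped as out-of-order.
func (s store) upsert(runID string, rec record) bool {
	if cur, ok := s[runID]; ok && rec.version <= cur.version {
		return false // stored state is equal or more up to date
	}
	s[runID] = rec
	return true
}

func main() {
	s := store{}
	fmt.Println(s.upsert("run-1", record{version: 2, status: "Completed"})) // true
	fmt.Println(s.upsert("run-1", record{version: 1, status: "Running"}))   // false
	fmt.Println(s["run-1"].status)                                          // Completed
}
```

The late "Running" write is dropped without an error, so the stored state never moves backwards.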

Important Notes

  • Make sure to upgrade your Schemas before advancing your Temporal version. See the Schema Changes section.
  • This will likely reduce the performance of SQL VisibilityStores.
  • We suggest Elasticsearch as the performant solution for VisibilityStore.

...

v1.26.2

23 Dec 18:23

Visibility schema changes

The TemporalPauseInfo column was added to visibility. It holds a search attribute related to paused entities in Temporal workflows.

Before upgrading your Temporal Cluster to v1.26.2, you must upgrade your visibility schemas to the following:

  • Visibility:
    • Elasticsearch schema v8
    • MySQL schema v1.7
    • PostgreSQL schema v1.7

Batch metrics (wide events)

PR: #6655

Description: Extended metrics.Handler interface with a new StartBatch method. StartBatch returns a BatchHandler that can be used to send a sequence of metrics as a single event when Close() is called on the batch. All provided metric handlers have been updated with the new interface and simply send metrics individually.

💥 BREAKING CHANGE 💥 If you provide a custom metrics handler with temporal.WithCustomMetricsHandler(metricsHandler) you will need to implement StartBatch() on that handler. See the tally metrics handler for an example of this.
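
A pass-through batch of the kind described above ("simply send metrics individually") might look like the following sketch. The interfaces here are simplified stand-ins written for this example; the real metrics.Handler interface has more methods and different signatures, so treat the tally metrics handler in the server repository as the authoritative reference.

```go
package main

import "fmt"

// Handler is a simplified stand-in for a metrics handler (assumption: the
// real interface is larger; only one metric type is modeled here).
type Handler interface {
	Counter(name string, delta int64)
	StartBatch(name string) BatchHandler
}

// BatchHandler is a simplified stand-in: a Handler that is closed when the
// batch ends.
type BatchHandler interface {
	Handler
	Close() error
}

// logHandler is a hypothetical custom handler that records what it emits.
type logHandler struct{ emitted []string }

func (h *logHandler) Counter(name string, delta int64) {
	h.emitted = append(h.emitted, fmt.Sprintf("%s+%d", name, delta))
}

// StartBatch returns a pass-through batch: every metric is forwarded to the
// underlying handler right away instead of being buffered until Close.
func (h *logHandler) StartBatch(string) BatchHandler { return passThroughBatch{h} }

// passThroughBatch embeds the parent handler and adds a no-op Close.
type passThroughBatch struct{ Handler }

func (passThroughBatch) Close() error { return nil }

func main() {
	h := &logHandler{}
	b := h.StartBatch("request")
	b.Counter("tasks", 1)
	b.Counter("errors", 2)
	_ = b.Close()
	fmt.Println(h.emitted) // [tasks+1 errors+2]
}
```

A handler that supports wide events would instead buffer the metrics in the batch and emit them as one event from Close.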

Versioning / safe deploy (pre-release)

The following EXPERIMENTAL Versioning APIs are added in this release:

  • Deployment APIs, for managing worker deployments:
    • DescribeDeployment
    • ListDeployments
    • GetCurrentDeployment
    • SetCurrentDeployment
    • GetDeploymentReachability
  • UpdateWorkflowExecutionOptions API (and its batch mode) for setting versioning override for executions.

Documentation is not available at this point. Do not use the above APIs in production.

These APIs require the following configs to be enabled: system.enableDeployments and frontend.workerVersioningWorkflowAPIs.

Workflow Update GA

Description:

Workflow Update enables a gRPC client of a Workflow Execution to issue requests to that Workflow Execution and receive a response. These requests are delivered to and processed by a client-specified Workflow Execution. Updates are differentiated from Queries in that the processing of an Update is allowed to modify the state of a Workflow Execution. Updates are different from Signals in that an Update returns a response.

Any gRPC client can invoke Updates via the WorkflowService.UpdateWorkflowExecution API. Additionally, past Update requests can be observed via the WorkflowService.PollWorkflowExecutionUpdate API. The wait stage option determines whether these APIs respond once the Update has been accepted or once it has completed.

Note that an Update only becomes durable once it has been accepted; until then, it will not appear in the Workflow history. SDKs will automatically retry to ensure Update requests complete.

The execution and retention of Updates is configured via two optional dynamic configuration values:

  • history.maxTotalUpdates controls the total number of Updates that a single Workflow Execution can support. The default is 2000.
  • history.maxInFlightUpdates controls the number of Updates that can be “in-flight” (that is, concurrently executing, not having completed) for a given Workflow Execution. The default is 10.
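
How the two limits interact can be sketched as a simple admission check. The notes do not describe how the server enforces them, so the types, function name, and error messages below are hypothetical; only the two default values come from the list above.

```go
package main

import (
	"errors"
	"fmt"
)

// Defaults quoted in the release notes for the two dynamic config values.
const (
	maxTotalUpdates    = 2000 // history.maxTotalUpdates
	maxInFlightUpdates = 10   // history.maxInFlightUpdates
)

// updateCounts is a hypothetical per-workflow tally of Updates.
type updateCounts struct {
	total    int // Updates admitted so far by this Workflow Execution
	inFlight int // Updates admitted but not yet completed
}

// admit reports whether one more Update fits within both limits.
func admit(c updateCounts) error {
	if c.total >= maxTotalUpdates {
		return errors.New("total update limit reached")
	}
	if c.inFlight >= maxInFlightUpdates {
		return errors.New("in-flight update limit reached")
	}
	return nil
}

func main() {
	fmt.Println(admit(updateCounts{total: 50, inFlight: 3}))   // within both limits
	fmt.Println(admit(updateCounts{total: 50, inFlight: 10}))  // too many in flight
	fmt.Println(admit(updateCounts{total: 2000, inFlight: 0})) // lifetime total exhausted
}
```

The in-flight limit recovers as Updates complete, while the total limit is only reset by a new Workflow Execution in this sketch.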

Since the 1.25 release, several minor bugs have been fixed.

Workflow Update-With-Start (public preview)

Update-With-Start sends a Workflow Update that checks whether an already-running Workflow with that ID exists. If it does, the Update is processed. If not, it starts a new Workflow Execution with the supplied ID. When starting a new Workflow, it immediately processes the Update.

Update-With-Start is great for latency-sensitive use cases.
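
The control flow of Update-With-Start can be sketched with an in-memory model. Everything here is hypothetical and exists only to show the branch between "already running" and "start, then update"; it is not the server or SDK API.

```go
package main

import "fmt"

// workflow is a hypothetical running Workflow Execution that records the
// Updates it has processed.
type workflow struct {
	id      string
	updates []string
}

// server is an in-memory stand-in that tracks running workflows by ID.
type server struct {
	running map[string]*workflow
}

// updateWithStart delivers update to the running workflow with the given ID,
// starting a new one first if none exists. It reports whether it started one.
func (s *server) updateWithStart(id, update string) (started bool) {
	wf, ok := s.running[id]
	if !ok {
		wf = &workflow{id: id}
		s.running[id] = wf
		started = true
	}
	// In both cases the Update is processed immediately.
	wf.updates = append(wf.updates, update)
	return started
}

func main() {
	s := &server{running: map[string]*workflow{}}
	fmt.Println(s.updateWithStart("cart-42", "add-item"))    // true: workflow started
	fmt.Println(s.updateWithStart("cart-42", "remove-item")) // false: already running
	fmt.Println(s.running["cart-42"].updates)                // [add-item remove-item]
}
```

Folding the start and the Update into one request removes a round trip, which is where the latency benefit comes from.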

Activity API (pre-release)

Description:

We introduce the Activity API, a set of APIs designed to resolve issues related to activity execution. The following APIs were introduced:

  • UpdateActivityOptionsById. This API can be used by the client to update the options of an activity while the activity is running.
    • The following options are supported:
      • task-queue
      • schedule-to-close-timeout
      • schedule-to-start-timeout
      • start-to-close-timeout
      • heartbeat-timeout
      • retry-initial-interval
      • retry-maximum-interval
      • retry-backoff-coefficient
      • retry-maximum-attempts
    • Partial updates are supported.
    • Timeout updates are effective immediately.
  • PauseActivityById. With this API, if the Activity is not currently running (e.g. because it previously failed and is waiting for the next retry), it will not be run again until it is unpaused.
    • However, if the Activity is currently running, it will run to completion. If the Activity is on its last retry attempt and fails, the failure will be returned to the caller, just as if the Activity had not been paused. Otherwise, if it fails, it will enter a paused state.
  • UnpauseActivityById. With this API clients can re-schedule a previously-paused Activity for execution.
    • If the Activity is not running and has exceeded its retry timeout, it will be scheduled immediately. Otherwise, it will be scheduled when its retry timeout expires. The Activity can be retried immediately using --no-wait.
  • ResetActivityById. With this API clients can unpause the execution of a previously paused activity, specified by its ID.
    • Resetting an activity means:
      • number of attempts will be reset to 0.
      • activity timeouts will be reset.
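
The rules given above for an Activity that is paused while an attempt is running can be summarised as a small decision function. The function and the outcome strings are hypothetical and only restate the rules from the PauseActivityById description.

```go
package main

import "fmt"

// afterAttempt returns what happens to an Activity once its current attempt
// finishes, following the pause rules in the notes.
func afterAttempt(paused, succeeded, lastAttempt bool) string {
	switch {
	case succeeded:
		return "completed" // a running attempt always runs to completion
	case lastAttempt:
		return "failed" // failure is returned as if the Activity was never paused
	case paused:
		return "paused" // no further attempts until it is unpaused
	default:
		return "retry scheduled" // normal retry behaviour
	}
}

func main() {
	fmt.Println(afterAttempt(true, true, false))   // completed
	fmt.Println(afterAttempt(true, false, true))   // failed
	fmt.Println(afterAttempt(true, false, false))  // paused
	fmt.Println(afterAttempt(false, false, false)) // retry scheduled
}
```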

Documentation is not available at this point. Do not use the above APIs in production.

Nexus

Nexus is still in public preview for this release, but has now been enabled by default.

Read more here on how to disable Nexus or how to operate it here.

Notable features and bug fixes since v1.25.2:

  • Allow completing a Nexus operation after workflow reset (#6434).
  • Fix a few issues around replication and cancelation.
  • Record NexusOperationStarted event and link when async operation completion races with the sync response path.
  • Support for full Temporal Failure rehydration and encryption. Nexus handlers and workflows backing operations will now expose the full error details by default. Caller workflows will also rehydrate the original error to allow for better E2E debugging experience.
  • Allow skipping application of Nexus events when resetting a workflow.

Notable bug fixes

Primary engineer: @prathyushpv

Fixed a "table not found" bug in SQLite.

Durations with mismatched seconds and nanoseconds signs will now fail validation and return an InvalidArgument error.
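
The sign rule can be sketched as follows. The function name and error text are hypothetical; the check itself follows the protobuf Duration convention that a non-zero nanos value must have the same sign as a non-zero seconds value.

```go
package main

import (
	"errors"
	"fmt"
)

// validateDuration rejects a seconds/nanos pair whose signs disagree.
// A zero in either field is compatible with any sign in the other.
func validateDuration(seconds int64, nanos int32) error {
	if (seconds > 0 && nanos < 0) || (seconds < 0 && nanos > 0) {
		return errors.New("invalid argument: seconds and nanos have mismatched signs")
	}
	return nil
}

func main() {
	fmt.Println(validateDuration(1, 500) == nil)  // true: both positive
	fmt.Println(validateDuration(-1, -500) == nil) // true: both negative
	fmt.Println(validateDuration(0, -500) == nil) // true: zero seconds
	fmt.Println(validateDuration(1, -500) == nil) // false: mismatched signs
}
```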

Helpful links to get you started with Temporal

Temporal Docs
Server
Docker Compose
Helm Chart

Docker images for this release (use the tag 1.26.2)

Server
Server With Auto Setup (what is Auto-Setup?)
Admin-Tools

Full Changelog: v1.25.2...v1.26.2