Skip to content

Conversation

@lina-temporal
Copy link
Contributor

What changed?

  • Frontend workflow_handler can now route to CHASM scheduler, depending on experiment flag/dynamic config
  • Scheduler functional test has been updated to exercise both V1 and V2 scheduler
  • Small fixes throughout scheduler codebase to address differences noticed during functional testing
  • Nexus callback mechanisms updated to allow the CHASM internal callback URL

I plan to address the TODO around LastCompletionResult's assertion in a separate follow-up PR.

How did you test it?

  • built
  • run locally and tested manually
  • covered by existing tests
  • added new unit test(s)
  • added new functional test(s)

Potential risks

  • Existing V1 tests are preserved, to show that the V1 codepath still works as-is. All new CHASM handling logic falls back to V1 in absence of a positive response. CHASM can only be opted in for a schedule during creation, controlled by dynamic config/experimental header.

@lina-temporal lina-temporal requested review from a team as code owners November 24, 2025 22:16
Comment on lines +547 to +548
delete(attributes, sadefs.TemporalNamespaceDivision)
delete(attributes, sadefs.TemporalSchedulePaused)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These shouldn't come back from visibility... they are system fields, not user fields.
Are you seeing those come back?

delete(attributes, sadefs.TemporalNamespaceDivision)
delete(attributes, sadefs.TemporalSchedulePaused)

if s.Schedule.Policies == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not call setNullableFields again here?

s.Schedule.Policies.OverlapPolicy = s.overlapPolicy()
}
if !s.Schedule.GetPolicies().GetCatchupWindow().IsValid() {
// TODO - this should be set from Tweakables.DefaultCatchupWindow.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not pass this into the function?

Comment on lines +639 to +642
visibility := chasm.NewVisibility(ctx)
s.Visibility = chasm.NewComponentField(ctx, visibility)
visibility.SetSearchAttributes(ctx, req.FrontendRequest.GetSearchAttributes().GetIndexedFields())
visibility.SetMemo(ctx, oldMemo)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
visibility := chasm.NewVisibility(ctx)
s.Visibility = chasm.NewComponentField(ctx, visibility)
visibility.SetSearchAttributes(ctx, req.FrontendRequest.GetSearchAttributes().GetIndexedFields())
visibility.SetMemo(ctx, oldMemo)
visibility := chasm.NewVisibilityWithData(ctx, req.FrontendRequest.GetSearchAttributes().GetIndexedFields(), oldMemo)
s.Visibility = chasm.NewComponentField(ctx, visibility)

@@ -0,0 +1,15 @@
package chasm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kinda hate that we have to put this here but this in not blocking the PR.

Comment on lines +1244 to +1246
return metadata.NewOutgoingContext(baseCtx, metadata.Pairs(
headers.ExperimentHeaderName, "chasm-scheduler",
))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a way to run the scheduler just with overriding the dynamic config without these specific headers, no?

// as part of StartWorkflowExecution, but here it must be done separately (to
// maintain equal payload size limits between versions).
switch action := request.GetSchedule().GetAction().GetAction().(type) {
case *schedulepb.ScheduleAction_StartWorkflow:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also verify the memo and header size. Those are also payloads.

func (wh *WorkflowHandler) DescribeSchedule(ctx context.Context, request *workflowservice.DescribeScheduleRequest) (_ *workflowservice.DescribeScheduleResponse, retError error) {
defer log.CapturePanic(wh.logger, &retError)

// Prefer CHASM scheduler if enabled.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we should first check if CHASM scheduler is enabled to avoid every call to the schedule going through this new code path, especially before this code has been thoroughly tested.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also move the check if schedules are enabled to the top of this function (or any other function that handles schedules). Users may have disabled the feature completely for some arbitrary reason.


if !wh.config.EnableSchedules(request.Namespace) {
return nil, errSchedulesNotAllowed
res, err := wh.updateScheduleCHASM(ctx, request)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about handling not found errors here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about request validation? (and make sure that you are not unaliasing search attributes on the CHASM path)

trigger.ScheduledTime = timestamppb.Now()
}

res, err := wh.patchScheduleCHASM(ctx, request)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing validation and not found error checks.
Also don't issue these RPCs unless CHASM scheduler is enabled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants