Skip to content

Conversation

@samanbarghi
Copy link
Contributor

@samanbarghi samanbarghi commented Apr 24, 2023

What changed?
Skipping over entier time range if schedule is paused. Also, batch getNextTime queries inside SideEffect and cache them.

Why?
if a schedule is paused for a very long time, when it wakes up it'll try to process the whole range of time it was paused. It will record a marker for each time. if there's many thousands, this could make the workflow task fail and the workflow get stuck. (and besides is just a waste of events.). Also, adding a Marker for getNextTime is expensive, batching and caching allows us to reduce the overhead.

How did you test it?
Unit tests

Potential risks

Is hotfix candidate?
No

@samanbarghi samanbarghi requested a review from a team as a code owner April 24, 2023 21:47
@samanbarghi samanbarghi force-pushed the feature/skip-over-entinre-time-range-while-paused branch from 233ba95 to 39036c3 Compare April 24, 2023 21:48
@samanbarghi samanbarghi requested a review from dnr April 24, 2023 21:49
Copy link
Contributor

@dnr dnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good, except we need to use versioning (with tweakables) to preserve the old logic for replay

Copy link
Contributor

@dnr dnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to invalidate the cache when Version changes also (for upgrades)

Do all the existing tests pass? This is just an internal optimization in some sense, so it doesn't necessarily need new tests, but it would be good to at least verify the marker behavior manually, if not an automated test for it

@samanbarghi samanbarghi changed the title Skip over entier time range if paused Skip over entier time range if paused and batch and cache time queries May 3, 2023
Copy link
Contributor

@dnr dnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, the map is nice and easy.

btw, there's a typo in the PR title

@samanbarghi samanbarghi changed the title Skip over entier time range if paused and batch and cache time queries Skip over entire time range if paused and batch and cache time queries May 3, 2023
// add/process maxRuns schedules
for j := 0; j < maxRuns; j++ {
runStartTime := time.Date(2022, 5, i, j, 17+j%2, 0, 0, time.UTC)
runs = append(runs, workflowRun{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad these don't have to be in order, that'd be even more annoying

delayedCallbacks := make([]delayedCallback, backfillIterations)

expected := runIterations
// schedule a call back every hour to spray backfills among scheduled runs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one nitpick: can you make the backfill only start after 2022-06-01 15:00:00? i.e. I want at least one long stretch where the cache runs out and has to refill naturally. (or even better, use the value of maxNextTimeResultCacheSize.) I think it may not actually be covering that branch anymore

t = next.Next
}
return results
}).Get(&s.nextTimeResultCache))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one more thing: I think that Get into a map may actually add entries instead of resetting the map (it ends up in the data converter which does json.Unmarshal), so the map might keep growing. I think we should explicitly clear the map before Get just in case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea, will do.

@samanbarghi samanbarghi merged commit dc7f28d into temporalio:master May 10, 2023
rodrigozhou pushed a commit that referenced this pull request May 13, 2023
#4215)

* Skip over entier time range if paused

* Refactor

* versioning using tweakables

* batch next time calculation in side effect

* Add version and a time generator to cache time results

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Refactor based on PR comments

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Add enum for versioning

* Fix a bug with skipping schedules

* Additional check for Equal

* Test cache

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Use map and add backfill to test

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Update name

* Improve backfill injection and add some randomness

* Clear the map before re-populating

* set map to nil

* update tests to refill cache

---------

Co-authored-by: David Reiss <[email protected]>
rodrigozhou pushed a commit that referenced this pull request May 16, 2023
#4215)

* Skip over entier time range if paused

* Refactor

* versioning using tweakables

* batch next time calculation in side effect

* Add version and a time generator to cache time results

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Refactor based on PR comments

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Add enum for versioning

* Fix a bug with skipping schedules

* Additional check for Equal

* Test cache

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Use map and add backfill to test

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Update service/worker/scheduler/workflow.go

Co-authored-by: David Reiss <[email protected]>

* Update name

* Improve backfill injection and add some randomness

* Clear the map before re-populating

* set map to nil

* update tests to refill cache

---------

Co-authored-by: David Reiss <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants