Consider non default sets reachable by new workflows for a while after they stop being queue default #4545
// ReachabilityQueryBuildIdLimit limits the number of build ids that can be requested in a single call to the
// GetWorkerTaskReachability API.
ReachabilityQueryBuildIdLimit = "limit.reachabilityQueryBuildIds"
// ReachabilityQuerySetDurationSinceDefault is the minimum period since a version set was demoted from being the
the two configs represent the same concept, right? why not collapse them?
Removing a build id is more severe, I think we want different configs for them.
ok, that's fair
common/util/util.go
Outdated
}

// ReduceSliceInitial reduces a slice using given reducer function and initial value.
func ReduceSliceInitial[T any, A any](in []T, initializer A, reducer func(A, T) A) A {
- func ReduceSliceInitial[T any, A any](in []T, initializer A, reducer func(A, T) A) A {
+ func ReduceSlice[T any, A any](in []T, initializer A, reducer func(A, T) A) A {
Reduce always needs an initial value, no need to include it in the name
There's also reduce from T[] to T that can use the first element as the initializer. I didn't add it to our "godash" lib though.
I'd rather force callers to do reduce(slice[1:], slice[0], f) so then it's clearly the caller's responsibility to do something for the empty slice
okay
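The outcome of this thread can be sketched as follows. The ReduceSlice signature is the one from the suggestion above; the function body and the maxOf caller are assumptions added for illustration, showing how a caller seeds the reduction with the first element and handles the empty slice itself.

```go
package main

import (
	"errors"
	"fmt"
)

// ReduceSlice folds `in` into a single value, starting from `initializer`.
// The body is an assumed implementation, not copied from the PR.
func ReduceSlice[T any, A any](in []T, initializer A, reducer func(A, T) A) A {
	acc := initializer
	for _, v := range in {
		acc = reducer(acc, v)
	}
	return acc
}

// maxOf is a hypothetical caller that uses the first element as the initial
// value, so the empty-slice case is explicitly its own responsibility.
func maxOf(values []int) (int, error) {
	if len(values) == 0 {
		return 0, errors.New("maxOf: empty slice")
	}
	return ReduceSlice(values[1:], values[0], func(acc, v int) int {
		if v > acc {
			return v
		}
		return acc
	}), nil
}

func main() {
	sum := ReduceSlice([]int{1, 2, 3}, 0, func(acc, v int) int { return acc + v })
	fmt.Println(sum) // prints 6

	m, err := maxOf([]int{4, 9, 2})
	fmt.Println(m, err) // prints 9 <nil>

	_, err = maxOf(nil)
	fmt.Println(err) // prints maxOf: empty slice
}
```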
Namespace: ns.Name().String(),
TaskQueue: taskQueue,
},
value, err := f.matchingClient.GetTaskQueueUserData(ctx, &matchingservice.GetTaskQueueUserDataRequest{
do you think it makes sense to use the more general GetTaskQueueUserData in frontend's GetWorkerBuildIdCompatibility also and remove that specific rpc from matching? it's not really doing anything besides ToBuildIdOrderingResponse, which frontend could call.
Sure, that makes sense to me.
TBH, I would just send all of the user-facing user data info in DescribeTaskQueue and not have a special API for GetWorkerBuildIdCompatibility.
In any case, I wouldn't change in this PR.
We'd need to leave the matching RPC for this minor version because 1.21 was already released.
you mean not have GetWorkerBuildIdCompatibility in workflowservice (frontend)? I like that but we released it already, can we remove it? for matchingservice we need a separate GetTaskQueueUserData for long polls anyway so might as well use that
We marked this whole feature as experimental (API-wise), we can still change but we'll likely need to deprecate APIs and keep them around for a while since users will start relying on them fairly soon.
},
value, err := f.matchingClient.GetTaskQueueUserData(ctx, &matchingservice.GetTaskQueueUserDataRequest{
NamespaceId: ns.ID().String(),
TaskQueue: taskQueue,
this always goes to the root (it did before too). probably okay but we could consider load-balancing this
I'm more concerned with hammering visibility than getting this info from matching but I do see your point.
Do you think it's worth it?
nah, you're right this should be cheap compared to visibility
// Finds the version in the version sets, returning (set index, index within that set)
// Returns -1, -1 if not found.
func findVersion(data *persistencespb.VersioningData, buildID string) (setIndex, indexInSet int) {
nice!
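Since only the signature is visible in the diff, here is a sketch of what a lookup with this contract might look like. The versioningData and versionSet types are hypothetical simplified stand-ins for the persistence protos, and the body is an assumption based on the doc comment, not the code under review.

```go
package main

import "fmt"

// versionSet and versioningData are hypothetical stand-ins for the
// persistence types referenced in the PR.
type versionSet struct {
	buildIDs []string
}

type versioningData struct {
	versionSets []versionSet
}

// findVersion returns (set index, index within that set) for buildID,
// or (-1, -1) if the build ID is not present in any set.
func findVersion(data *versioningData, buildID string) (setIndex, indexInSet int) {
	if data == nil {
		return -1, -1
	}
	for si, set := range data.versionSets {
		for bi, id := range set.buildIDs {
			if id == buildID {
				return si, bi
			}
		}
	}
	return -1, -1
}

func main() {
	data := &versioningData{versionSets: []versionSet{
		{buildIDs: []string{"v1.0", "v1.1"}},
		{buildIDs: []string{"v2.0"}},
	}}

	fmt.Println(findVersion(data, "v1.1")) // prints 0 1
	fmt.Println(findVersion(data, "v2.0")) // prints 1 0
	fmt.Println(findVersion(data, "v3.0")) // prints -1 -1
}
```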
  // 1. There are no workflows currently marked as open in the visibility store but a worker for the demoted version
  // is currently processing a task.
- // 2. There are delays in the asynchrnous visiblity task processor.
+ // 2. There are delays in the visibility task processor (which is asynchronous).
oh, can you make the same change on the copied line (214)?
…r they stop being queue default
Co-authored-by: David Reiss
31a4aaa to 4a826ce
…r they stop being queue default (#4545)
What changed?
See title. Also added a frontend.reachabilityQuerySetDurationSinceDefault dynamic config with a default of 5 minutes.
Why?
Docstring has an explanation:
How did you test it?
Added a couple of tests