-
Notifications
You must be signed in to change notification settings - Fork 16
Added include-retry-scenarios option to throughput_stress scenario
#229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
20765ba to
7e2604d
Compare
|
Curious what you think about the timeouts. I'm a little bit wary that they're kind of baked into tps now, which raises the floor run length time. In practice though, probably not an issue given that it's expected to run for extended periods of time |
--include-retry-scenarios flag to throughput_stress scenario
--include-retry-scenarios flag to throughput_stress scenario--include-retry-scenarios option to throughput_stress scenario
|
Was concerned about the runtime, so decided to put this behind an option |
scenarios/throughput_stress.go
Outdated
| DelayActivityWithCancellation(1*time.Second, 10*time.Second), | ||
| // Test activity retry: fails, succeeds on retry | ||
| RetryableErrorActivity(1, RemoteActivityWithRetry(1*time.Second, 2, 500*time.Millisecond, 1.0)), | ||
| // Test activity timeout with retry: times out on 1st attempt, succeeds on 2nd | ||
| TimeoutActivity(1, RemoteActivityWithRetry(1*time.Second, 2, 500*time.Millisecond, 1.0)), | ||
| // Test heartbeat timeout: skips heartbeats on 1st attempt, sends them on 2nd | ||
| HeartbeatActivity(1, RemoteActivityWithHeartbeat(10*time.Second, 1*time.Second, 2, 500*time.Millisecond, 1.0)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stylistic: If these helpers are only used here I'd probably just inline it. Having helpers for the DSL feels a little over-verbose unless the helper is doing a lot of heavy lifting or re-used a lot (which I believe is the case with the ones already present). That said I can see how it makes this nice too, so, 🤷 . Curious if @stephanos has an opinion.
| // Activity that runs too long for N attempts (causing timeout), then completes. | ||
| // Tests StartToCloseTimeout behavior with retries. | ||
| // On failing attempts, the activity runs until context cancellation (timeout). | ||
| // On success, it runs for duration < StartToCloseTimeout (runs for 0.5x StartToClose timeout). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // On success, it runs for duration < StartToCloseTimeout (runs for 0.5x StartToClose timeout). | |
| // On success, it runs for 0.5x StartToClose timeout. |
If it always does that, just say that.
That said - I would probably just parameterize the runtime too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added fields for success/failure durations
|
|
||
| // Activity that skips heartbeats for N attempts (causing heartbeat timeout), then sends them. | ||
| // Tests HeartbeatTimeout behavior with retries. | ||
| // Duration is derived from activity info's HeartbeatTimeout (runs for 2x heartbeat timeout). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // Duration is derived from activity info's HeartbeatTimeout (runs for 2x heartbeat timeout). | |
| // On success, runs for 2x heartbeat timeout. |
To be consistent with the other one - but, again I would also probably just parameterize here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
--include-retry-scenarios option to throughput_stress scenarioinclude-retry-scenarios option to throughput_stress scenario
Sushisource
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good to me
6e9e887 to
a378468
Compare
Included these actions (and a cancellable activity) to throughput stress
a378468 to
192a528
Compare
What was changed
Added
include-retry-scenariosoption tothroughput_stressscenario.Setting
--option include-retry-scenarios=trueadds activities that fail and retry, as actions tothroughput_stress.NOTE: enabling this option increases runtime of a single iteration from ~23s to ~44s!
New activity actions were added to for this fail-and-retry coverage:
The added proto definitions are the same. I've separated them because they cover different activity failure cases, which I think would be neater to handle separately in the workers rather than a single failure activity. It also allows them to be expanded separately and more specifically.
I've added these actions to the existing throughput stress scenario, as well as a cancellable activity action.
Added to corresponding logic to each worker to handle these new activity actions.
Why?
Provide additional activity coverage, particularly in cases where the activity does not succeed on the first attempt.
Ran the existing
throughput_stress_test.gotest.Ran:
Having trouble running throughput stress on other language workers (even before this change). Will address small fixes in subsequent PRs.
No