Bucketize autoscaling metrics by timeframe not by pod name. #3289
knative-prow-robot
left a comment
@markusthoemmes: 0 warnings.
Details
In response to this:
Fixes #2977
Proposed Changes
Stats are averaged in each specific timeframe vs. averaged over the whole window. See the linked issue for more in-depth information
Release Note
TBD
0fde3c4 to a61fb92
Unrelated failure /test pull-knative-serving-integration-tests
/assign @yanweiguo Please let me know what you think.
  go func() {
-     if err := generateTraffic(ctx, int(numPods*10), 30*time.Second, stopChan); err != nil {
+     if err := generateTraffic(ctx, int(numPods*10), 60*time.Second, stopChan); err != nil {
These changes stabilize the autoscaling tests. They have recently been adjusted to continue generating more traffic as soon as we hit the desired replica count. However, that's only been done on "Replicas", so we're in danger of overflowing if the pod takes a while to come up.
Likewise, 30s of traffic can be juuuuuust about enough to cause us to scale up. After 60s it's guaranteed to (for the default window sizes).
vagababov
left a comment
Mostly superficial comments. I need to re-read the PR for the logic part, though it mostly makes sense to me.
      kubeInformer.Core().V1().Endpoints().Informer().GetIndexer().Add(ep)
  }

  func roundedNow() time.Time {
Is the reason for using roundedNow that it prevents flakiness, because some stats could fall outside the scale window if now() is used directly?
Yes, it basically normalizes the instances of "now" so the test doesn't depend on exactly when it is executed. Especially when adding to "now" in the tests, we otherwise risk jumping into other buckets in the calculation. It makes the test deterministic.
vagababov
left a comment
/lgtm
The following is the coverage report on pkg/.
/lgtm
/lgtm
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: markusthoemmes, srinivashegde86
The full list of commands accepted by this bot can be found here.
The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Fixes #2977
Fixes #2379
Proposed Changes
Stats are averaged in each specific timeframe vs. averaged over the whole window. See the linked issue for more in-depth information
Release Note