Skip to content

Add generic buffer.TypedRingGrowing and shrinkable buffer.Ring #323

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

nojnhuh
Copy link

@nojnhuh nojnhuh commented Mar 25, 2025

What type of PR is this?
/kind feature

What this PR does / why we need it:

This PR adds a new type, buffer.Ring which is a buffer.RingGrowing with a type parameter (generics) and the capability to shrink its underlying buffer when all elements have been read. It should be able to replace the scheduler's use of queue.FIFO (and its copy in k8s.io/dynamic-resource-allocation/internal/queue). Behavior of the existing buffer.RingGrowing should be unchanged.

Which issue(s) this PR fixes:

First step of kubernetes/kubernetes#131032

Special notes for your reviewer:

I verified that this new implementation passes the existing unit tests for k8s.io/client-go/tools/cache where buffer.RingGrowing is currently in use without any changes to that package, and for k8s.io/kubernetes/pkg/scheduler/util/assumecache and k8s.io/dynamic-resource-allocation/resourceslice/tracker replacing its use of queue.FIFO: kubernetes/kubernetes@master...nojnhuh:kubernetes:typed-ring-buffer.

Release note:

Add new, generic buffer.TypedRingGrowing type to succeed buffer.RingGrowing which is now deprecated. Add new buffer.Ring type, which is equivalent to buffer.TypedRingGrowing, but shrinks to its original size after its elements have been totally consumed.

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 25, 2025
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 25, 2025
@nojnhuh
Copy link
Author

nojnhuh commented Mar 25, 2025

/cc @pohly

@k8s-ci-robot k8s-ci-robot requested a review from pohly March 25, 2025 03:27
@nojnhuh
Copy link
Author

nojnhuh commented Apr 14, 2025

AFAICT the apidiff failure is a false-positive, maybe it doesn't fully understand generics? I've at least checked that client-go will still compile without any changes with this PR.

@apelisse @cheftako If you could PTAL that would be great, thanks!

@pohly
Copy link
Contributor

pohly commented May 20, 2025

/wg device-management
/priority important-soon

We need this to replace some custom FIFO implementation in two different places (scheduler plugin and ResourceSlice tracker). Would be nice to get into 1.34 soon to give us time to adapt k/k.

@k8s-ci-robot k8s-ci-robot added wg/device-management Categorizes an issue or PR as relevant to WG Device Management. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels May 20, 2025
Copy link
Contributor

@pohly pohly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me with perhaps one small doc tweak.

Please squash into one commit to make it ready for merging.

@nojnhuh nojnhuh force-pushed the typed-ring-buffer branch from 133f6ff to bb2bb1b Compare May 20, 2025 18:22
@nojnhuh
Copy link
Author

nojnhuh commented May 20, 2025

Looks good to me with perhaps one small doc tweak.

Please squash into one commit to make it ready for merging.

Updated and squashed.

@nojnhuh nojnhuh force-pushed the typed-ring-buffer branch from bb2bb1b to 69bc812 Compare May 20, 2025 18:27
Copy link
Contributor

@pohly pohly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

I think all my comments and concerns have been addressed. But it's been a while, so a second top-to-bottom pass from an approver would be useful.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 20, 2025
@pohly pohly moved this to 👀 In review in Dynamic Resource Allocation May 21, 2025
@pohly
Copy link
Contributor

pohly commented May 21, 2025

/assign @aojea

For a second review and approval.

}
r.readable--
element := r.data[r.beg]
r.data[r.beg] = nil // Remove reference to the object to help GC
var zero T
r.data[r.beg] = zero // Remove reference to the object to help GC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is only true if T is a pointer, no?

Copy link
Member

@ash2k ash2k May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. e.g. string, slice, map, any struct that has such fields and/or pointers.

(I'm not the author of the PR but I wrote the original code)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, yeah , I mean , if we have an int this will be 0 , right :) ... maybe pedantic, or nitpicking, just for correctness

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ash2k: perhaps you can help with the review then?

Copy link
Member

@ash2k ash2k May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pohly I did have a look earlier today and I have the same questions re. allocations. Overall it looks good.

Comment on lines 81 to 87
if newN == 0 {
newN = 1
}
newData := make([]T, newN)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why before we didn't initialize to 1 when it was zero?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that before this PR we assumed that the initial size is some non-zero value. With this PR the RingOptions may not have the InitialSize set. In that case the NewRing constructor will use 0 as the initial size and multiplying by 2 wouldn't grow it :)

I think it'd be better to use some sane defaults for all parameters in the constructor rather than hacking around like this in the middle of the logic.

Same with NormalSize - I don't think it makes sense to shrink to 0 and then grow to 1, 2, 4, 8, etc. Why not use some sane size if nothing was provided? Maybe 32 or something sufficiently large.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that an initial size and normal size of 0 is probably not ideal in most cases. For something as generic as this though, I don't think we can realistically determine "sane" defaults for initial size or normal size for everywhere this could be used.

Would it be better to require the initial size and normal size to be set? That would at least prevent users from assuming the implicit default of 0 for each.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current default is effectively 1 for the initial size. We all seem to agree this is not a good default for any use case. I think it'd be beneficial to have the default set (for both) to 16 or 32 - even if the buffer is not used that much, it's not wasting a lot of space. If this default does not work for someone, they can set the value they want explicitly (just as you are suggesting). WDYT?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If 0 isn't good enough (which I think is similar to how the Go runtime behaves with plain slices), I'd still prefer requiring those parameters be set by the user over trying to determine defaults which are more acceptable.

Copy link
Member

@ash2k ash2k May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be nice to keep the zero value of the type a valid, working instance i.e. ideally it shouldn't blow up if was not created via the constructor.

If we agree on the above, then we only need to change that if to use some other value, not 1. But what about normalSize? Isn't there a performance bug now where any buffer would be thrown away once empty and replaced with a zero length slice when default values are used? This is not ideal, right? And if we write to it, it'll have to immediately grow/allocate the slice again. And then if we read - it'll throw away the slice again. So, we have a pathological case where each read of the last element generates garbage for GC. Am I missing something?

If we want to solve the above, we have to treat normalSize==0 as "use a default value" for this parameter too. What is the default here? We have to come up with two values now. 16 for both?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing: does it make sense to allow normalSize to be smaller than initialSize? We don't want to start with e.g. 100 and then shrink to e.g. 40 when it's empty. That just doesn't make sense. Shall we set normalSize=max(normalSize, initialSize)?

Copy link
Member

@aojea aojea May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those are good questions, the problem to put this in this repo is that there is a strong assumption of stability and correctness, if this is going to be used only in one place why don't you create this only for self consumption in the k/k repo? just putting it here to revendor for k/k adds a lot of maintanance burden. ...

EDIT answering myself, this came here because of the opposite reasons kubernetes/kubernetes#73209 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we agree on the above, then we only need to change that if to use some other value, not 1. But what about normalSize? Isn't there a performance bug now where any buffer would be thrown away once empty and replaced with a zero length slice when

what about I think that provide some more obvious default, I put 16 as it seems we are leaning towards that value, no other reason

diff --git a/buffer/ring_growing.go b/buffer/ring_growing.go
index 97dcd6e..0f6d31d 100644
--- a/buffer/ring_growing.go
+++ b/buffer/ring_growing.go
@@ -16,6 +16,9 @@ limitations under the License.
 
 package buffer
 
+// defaultRingSize defines the default ring size if not specified
+const defaultRingSize = 16
+
 // RingGrowingOptions sets parameters for [RingGrowing] and
 // [TypedRingGrowing].
 type RingGrowingOptions struct {
@@ -79,7 +82,7 @@ func (r *TypedRingGrowing[T]) WriteOne(data T) {
                // Time to grow
                newN := r.n * 2
                if newN == 0 {
-                       newN = 1
+                       newN = defaultRingSize
                }
                newData := make([]T, newN)
                to := r.beg + r.readable

One more thing: does it make sense to allow normalSize to be smaller than initialSize? We don't want to start with e.g. 100 and then shrink to e.g. 40 when it's empty.

I'm not fan of offering an API that is able to do that and then change the behavior behind the scenes, despite totally agree does not make sense I expect that if someone sets those parameters is because they want just to do that

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made that change to add a bigger default and also added the benchmarks.

Comment on lines +143 to +152
if r.growing.readable == 0 && r.growing.n > r.normalSize {
// The buffer is empty. Reallocate a new buffer so the old one can be
// garbage collected.
r.growing.data = make([]T, r.normalSize)
r.growing.n = r.normalSize
r.growing.beg = 0
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this effective? allocating and deallocating memory vs reusing it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I understood this change is that the idea is to shrink the hugely expanded buffer back to some typical size. E.g. there was a spike of usage (as it happens when e.g. a controller starts and fills up a workqueue but the workers only start when all informers have synced). Waiting for it to get to 0 first allows to eliminate the need for copying the data, which is good.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I.e. it is less effective but the goal is to reduce the amount of ram used so we have to free the "big" buffer and allocate a new one that is "normal".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this effective? allocating and deallocating memory vs reusing it?

A new, smaller array gets allocated here so the old, larger array can be garbage collected. I don't think it's possible to inform the Go runtime that part of an array backing a slice can be garbage collected, only that the entire backing array can be by removing all references to any part of the backing array.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

@nojnhuh nojnhuh force-pushed the typed-ring-buffer branch from 69bc812 to ed4fc2c Compare May 27, 2025 20:55
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 27, 2025
@ash2k
Copy link
Member

ash2k commented May 28, 2025

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 28, 2025
@aojea
Copy link
Member

aojea commented May 29, 2025

ok, this seems to work as expected, it sacrifices allocations for long term memory usage

const (
	spikeSize   = 100 // Number of items to write during a spike
	normalSize  = 64  // Normal capacity for the Ring type after shrinking
	initialSize = 16  // Initial capacity for buffers
)

func BenchmarkTypedRingGrowing_Spike(b *testing.B) {
	b.ReportAllocs()
	var item int // ensure item is used

	for i := 0; i < b.N; i++ {
		buffer := NewTypedRingGrowing[int](RingGrowingOptions{InitialSize: initialSize})

		for j := 0; j < spikeSize; j++ {
			buffer.WriteOne(j)
		}

		for buffer.Len() > 0 {
			item, _ = buffer.ReadOne()
		}
	}
	_ = item // use item
}

func BenchmarkRing_Spike_And_Shrink(b *testing.B) {
	b.ReportAllocs()
	var item int // ensure item is used

	for i := 0; i < b.N; i++ {
		// Create a new buffer for each benchmark iteration
		buffer := NewRing[int](RingOptions{
			InitialSize: initialSize,
			NormalSize:  normalSize,
		})

		for j := 0; j < spikeSize; j++ {
			buffer.WriteOne(j)
		}

		for buffer.Len() > 0 {
			item, _ = buffer.ReadOne()
		}
	}
	_ = item // use item
}
go test  -run=^$ -bench=. k8s.io/utils/buffer -v -count 1
goos: linux
goarch: amd64
pkg: k8s.io/utils/buffer
cpu: Intel(R) Xeon(R) CPU @ 2.60GHz
BenchmarkTypedRingGrowing_Spike
BenchmarkTypedRingGrowing_Spike-48        953743              1094 ns/op            1920 B/op          4 allocs/op
BenchmarkRing_Spike_And_Shrink
BenchmarkRing_Spike_And_Shrink-48         819721              1325 ns/op            2432 B/op          5 allocs/op
PASS
ok      k8s.io/utils/buffer     2.166s

The apidiff does not seems to cause any issue with the existing code as it is able to infer the type IIRC
It also fixes a bug where a buffer not initialized or initialized as zero will not be able to grow

There is an outstanding discussion about the default size, #323 (comment)

@aojea
Copy link
Member

aojea commented May 29, 2025

+1 if we change the magic 1 by a constant and we use a more larger default as 16 per example https://github.com/kubernetes/utils/pull/323/files#r2113869115

@nojnhuh nojnhuh force-pushed the typed-ring-buffer branch from ed4fc2c to aace85d Compare May 30, 2025 04:09
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 30, 2025
if expected, actual := x, g.Len(); expected != actual {
t.Fatalf("expected Len to be %d, got %d", expected, actual)
}
if expected, actual := 16, g.Cap(); expected != actual {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note to self, this is 16 because is the next power of 2 that can accomodate 10 elements, x is 10

if read != v {
t.Fatalf("expected %#v==%#v", read, v)
}
read++
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note to self, we write the sequence of the index for loop, 1 2 3 ... so we check we read exactly those values

if x != read {
t.Fatalf("expected to have read %d items: %d", x, read)
}
if expected, actual := 0, g.Len(); expected != actual {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buffer should be empty

if expected, actual := 0, g.Len(); expected != actual {
t.Fatalf("expected Len to be %d, got %d", expected, actual)
}
if expected, actual := normalSize, g.Cap(); expected != actual {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but the capacity should be equal to the configured normal size

@aojea
Copy link
Member

aojea commented May 30, 2025

/approve

@ash2k for final lgtm

Thanks

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 30, 2025
@ash2k
Copy link
Member

ash2k commented May 30, 2025

/lgtm

Thank you, @nojnhuh!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 30, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aojea, ash2k, nojnhuh, pohly

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@aojea
Copy link
Member

aojea commented May 30, 2025

/override

@k8s-ci-robot
Copy link
Contributor

@aojea: /override requires failed status contexts to operate on, but none was given

In response to this:

/override

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@aojea
Copy link
Member

aojea commented May 30, 2025

/override apidiff

The apidiff does not seems to cause any issue with the existing code as it is able to infer the type IIRC

@k8s-ci-robot
Copy link
Contributor

@aojea: aojea unauthorized: /override is restricted to Repo administrators.

In response to this:

/override apidiff

The apidiff does not seems to cause any issue with the existing code as it is able to infer the type IIRC

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@aojea
Copy link
Member

aojea commented May 30, 2025

@dims we need your help :)

@aojea
Copy link
Member

aojea commented May 30, 2025

hmm, this is an incompatible change based on https://go-review.googlesource.com/c/tools/+/618215/3/internal/apidiff/testdata/tests.go#535

@nojnhuh should we leave the old buffer as it was to avoid breaking compatibility?

data []interface{}
//
// Deprecated: Use TypedRingGrowing[any] instead.
type RingGrowing = TypedRingGrowing[any]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our case we don't have ~ on the type parameter, so maybe it's not a problem? https://go.dev/play/p/cTywdiHoQp_l this works fine (removed ~).

A similar change was made to gRPC and it seems nothing broke: https://github.com/grpc/grpc-go/pull/7057/files#diff-c5003637b707b222097960cf01b1d09d77126e39ff0073bff748bbc84951e6cfR74.

Disclaimer: I'm not an expert on Go generics by any means.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

me neither, just why I'm broadcasting :) I feel that since this is only used in one place and I already tested that there is no need to adapt the code after vendor with this change, we can override , @dims what do you think

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aojea @ash2k as long as you can confirm that just updating the library does not break existing code, we can land this. looks like you both have tried it already. So we should be good to go.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did replace the vendor in k/k with this branch and binaries compiled just fine

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. wg/device-management Categorizes an issue or PR as relevant to WG Device Management.
Projects
Status: 👀 In review
Development

Successfully merging this pull request may close these issues.

6 participants