Implement BadgerDB garbage collection #757
Conversation
// upper bound for loop
const maxDiscards = 1000
I don't know if this is a reasonable value of discards, but it felt like the controller could sample, delete, and rewrite 1000 value log files relatively quickly in its own goroutine without impacting the controller-runtime too much.
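For reference, a minimal sketch of the bounded loop this constant guards, assuming Badger v3's API; the function shape and the 0.5 discard ratio here are illustrative, not necessarily the exact code in this PR:

package database

import (
	"errors"

	badger "github.com/dgraph-io/badger/v3"
)

// upper bound for loop
const maxDiscards = 1000

// discardValueLogFiles asks Badger to rewrite value log files until nothing
// is left to discard, but never loops more than maxDiscards times.
func discardValueLogFiles(db *badger.DB) (int, error) {
	for i := 0; i < maxDiscards; i++ {
		// RunValueLogGC rewrites at most one value log file per call and
		// returns badger.ErrNoRewrite when no file meets the discard ratio.
		err := db.RunValueLogGC(0.5)
		if errors.Is(err, badger.ErrNoRewrite) {
			return i, nil
		}
		if err != nil {
			return i, err
		}
	}
	return maxDiscards, nil
}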
Force-pushed from 1f78ec0 to 9ff8d70
"github.com/go-logr/logr/testr" | ||
) | ||
|
||
func TestBadgerGarbageCollectorDoesStop(t *testing.T) { |
Example successful run (2.3 seconds):
go test -run '^TestBadgerGarbageCollectorDoesStop$' github.com/fluxcd/image-reflector-controller/internal/database -v -count 1
=== RUN TestBadgerGarbageCollectorDoesStop
badger 2025/05/07 00:04:06 INFO: All 0 tables opened in 0s
badger 2025/05/07 00:04:06 INFO: Discard stats nextEmptySlot: 0
badger 2025/05/07 00:04:06 INFO: Set nextTxnTs to 0
badger_gc.go:61: test-badger-gc: "level"=0 "msg"="Starting Badger GC"
badger_gc.go:84: test-badger-gc: "level"=0 "msg"="Ran Badger GC" "discarded_vlogs"=0
badger_gc_test.go:47: wrote tags successfully
badger_gc.go:84: test-badger-gc: "level"=0 "msg"="Ran Badger GC" "discarded_vlogs"=0
badger_gc.go:84: test-badger-gc: "level"=0 "msg"="Ran Badger GC" "discarded_vlogs"=0
badger_gc_test.go:52: waiting for GC stop
badger_gc.go:70: test-badger-gc: "level"=0 "msg"="Stopped Badger GC"
badger_gc_test.go:57: GC Stopped
badger 2025/05/07 00:04:08 INFO: Lifetime L0 stalled for: 0s
badger 2025/05/07 00:04:08 INFO:
Level 0 [ ]: NumTables: 01. Size: 216 B of 0 B. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 64 MiB
Level 1 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 2 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 3 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 4 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 5 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 6 [B]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level Done
--- PASS: TestBadgerGarbageCollectorDoesStop (2.06s)
PASS
ok      github.com/fluxcd/image-reflector-controller/internal/database 2.355s
Synthetic failure (sleep for a minute when stopped) (7.3 seconds):
go test -run '^TestBadgerGarbageCollectorDoesStop$' github.com/fluxcd/image-reflector-controller/internal/database -v -count 1
=== RUN TestBadgerGarbageCollectorDoesStop
badger 2025/05/06 23:46:31 INFO: All 0 tables opened in 0s
badger 2025/05/06 23:46:31 INFO: Discard stats nextEmptySlot: 0
badger 2025/05/06 23:46:31 INFO: Set nextTxnTs to 0
badger_gc.go:61: test-badger-gc: "level"=0 "msg"="Starting Badger GC"
badger_gc.go:85: test-badger-gc: "level"=0 "msg"="Ran Badger GC" "discarded_vlogs"=0
badger_gc_test.go:47: wrote tags successfully
badger_gc.go:85: test-badger-gc: "level"=0 "msg"="Ran Badger GC" "discarded_vlogs"=0
badger_gc.go:85: test-badger-gc: "level"=0 "msg"="Ran Badger GC" "discarded_vlogs"=0
badger_gc.go:85: test-badger-gc: "level"=0 "msg"="Ran Badger GC" "discarded_vlogs"=0
badger_gc_test.go:52: waiting for GC stop
badger_gc_test.go:55: GC did not stop
badger 2025/05/06 23:46:38 INFO: Lifetime L0 stalled for: 0s
badger 2025/05/06 23:46:38 INFO:
Level 0 [ ]: NumTables: 01. Size: 216 B of 0 B. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 64 MiB
Level 1 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 2 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 3 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 4 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 5 [ ]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level 6 [B]: NumTables: 00. Size: 0 B of 10 MiB. Score: 0.00->0.00 StaleData: 0 B Target FileSize: 2.0 MiB
Level Done
--- FAIL: TestBadgerGarbageCollectorDoesStop (7.04s)
FAIL
FAIL    github.com/fluxcd/image-reflector-controller/internal/database 7.332s
FAIL
Force-pushed from d507687 to 3b04a83
Notably, because we are using the controller manager, the Garbage Collector Runnable will only run when the controller is the Leader. This is probably okay behavior, but if we want the GC to run all the time (e.g. when the controller becomes a follower), we can implement controller-runtime's LeaderElectionRunnable interface.
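For reference, the opt-out would look roughly like this; a sketch only, not part of this patch, and the BadgerGarbageCollector stub stands in for the GC type the patch adds:

package database

import "sigs.k8s.io/controller-runtime/pkg/manager"

// BadgerGarbageCollector is a stub standing in for the GC type added by this patch.
type BadgerGarbageCollector struct{}

// Returning false tells the manager to start this Runnable on followers as
// well as on the elected leader.
func (gc *BadgerGarbageCollector) NeedLeaderElection() bool {
	return false
}

// compile-time check that the opt-out interface is satisfied
var _ manager.LeaderElectionRunnable = (*BadgerGarbageCollector)(nil)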
main.go (outdated)
@@ -94,6 +99,7 @@ func main() {
	flag.StringVar(&healthAddr, "health-addr", ":9440", "The address the health endpoint binds to.")
	flag.StringVar(&storagePath, "storage-path", "/data", "Where to store the persistent database of image metadata")
	flag.Int64Var(&storageValueLogFileSize, "storage-value-log-file-size", 1<<28, "Set the database's memory mapped value log file size in bytes. Effective memory usage is about two times this size.")
	flag.Uint16Var(&gcInterval, "gc-interval", 1, "The number of minutes to wait between garbage collections. 0 disables the garbage collector.")
I'm wondering if the 1m interval is too aggressive and whether it will increase CPU usage unnecessarily for most users. How about we go with 5m as the default, and document the flag here (those that have a massive amount of images could set it to 1m)?
After running some load-tests and considering how small image tag data is, I don't think it's very common for this GC code to run with the default BadgerDB table/LSM/vlog settings. The database has to contain a large number of image tags and exceed the default memory limit of 1GB before it starts flushing to vlog files.
I've updated the default interval to 10 minutes.
My hypothesis right now is that end-users who are seeing this in production either have:
a) a very large number of image repos, or
b) a constantly changing set of repos with tag lists larger than 1MB each
@stealthybox can you please rebase with upstream main?
Rebased -- load-testing code/notes are in #763.

It's a separate issue, but I think we could implement some much more conservative defaults for Badger right now. With the Badger defaults, a synthetic test in a 1GB-limited Docker container can OOM with just 400 repos and ~17MB of tag data, using fewer than 60 value changes across the repos.

This isn't because of the garbage collector. Most tag values will be under the 1MB value log threshold and get stored directly next to their repo key in the memtables. With the default table sizes and quantities, this fills memory up past 1GB before anything at all is marked to be discarded or flushed to disk. I can double-check, but I'm pretty sure it OOMs even if I disable the GC process, because the GC is basically a no-op for this scenario.
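To make the "more conservative defaults" idea concrete, here is a rough sketch of the kind of Badger options tuning being discussed, assuming badger/v3's options API; the function name and the specific values are illustrative, not tested recommendations:

package database

import badger "github.com/dgraph-io/badger/v3"

// openWithConservativeDefaults trades some write throughput for a lower
// resident-memory ceiling by shrinking the in-memory tables and pushing
// large tag lists out to value log files sooner.
func openWithConservativeDefaults(path string) (*badger.DB, error) {
	opts := badger.DefaultOptions(path).
		WithMemTableSize(16 << 20).   // 16 MiB memtables instead of the 64 MiB default
		WithNumMemtables(2).          // fewer memtables held in RAM at once
		WithValueThreshold(64 << 10). // values larger than 64 KiB go to the value log
		WithValueLogFileSize(256 << 20)
	return badger.Open(opts)
}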
LGTM
Thanks @stealthybox
@stealthybox please rebase your fork with upstream main, I can't merge as your branch is out-of-date.
Signed-off-by: leigh capili <[email protected]>
Something is broken on GitHub's side with this PR.
Editing the PR and re-selecting ...
Garbage collection of the BadgerDB WAL/value-logs is our responsibility to orchestrate.
This patch implements it in a similar manner to dgraph.
db.RunValueLogGC() will rewrite only a single file. We don't have to guard against concurrent runs of it, but we do need to run it multiple times until there is nothing left to rewrite.

gc.discardValueLogFiles() does this, and we wrap it in a goroutine that waits gc.Interval between each run. Using a timer instead of a ticker means a GC run can never overrun itself, even if there are lots of files or the disk is very slow.

I've bounded the number of discards to prevent bugs in library code and infrastructure we don't control from causing an infinite loop.
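Roughly, the wrapper looks like the following compressed sketch; the BadgerGarbageCollector type, its Interval/DB/Log fields, and the exact log messages are placeholders rather than a copy of the patch, and it reuses the bounded discardValueLogFiles loop sketched earlier:

package database

import (
	"context"
	"time"

	badger "github.com/dgraph-io/badger/v3"
	"github.com/go-logr/logr"
)

// BadgerGarbageCollector sketches the GC Runnable described above.
type BadgerGarbageCollector struct {
	Interval time.Duration
	DB       *badger.DB
	Log      logr.Logger
}

// Start runs the GC loop until ctx is canceled. The timer is only reset
// after a pass finishes, so a slow pass can never overlap the next one.
func (gc *BadgerGarbageCollector) Start(ctx context.Context) error {
	gc.Log.Info("Starting Badger GC")
	timer := time.NewTimer(gc.Interval)
	defer timer.Stop()
	for {
		select {
		case <-ctx.Done():
			gc.Log.Info("Stopped Badger GC")
			return nil
		case <-timer.C:
			// bounded loop over db.RunValueLogGC, as sketched earlier
			discarded, err := discardValueLogFiles(gc.DB)
			if err != nil {
				gc.Log.Error(err, "Badger GC failed")
			} else {
				gc.Log.Info("Ran Badger GC", "discarded_vlogs", discarded)
			}
			timer.Reset(gc.Interval)
		}
	}
}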
The Garbage Collector is added to the controller manager by implementing controller-runtime's Runnable interface. This means that controller-runtime will start the GC together with the manager and signal it to stop when the manager shuts down.

Notably, the GC will shut down while other reconcilers and database writes may still be occurring.
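For readers less familiar with controller-runtime, the wiring is roughly this; the AddToManager helper is illustrative, the patch itself wires the GC up in main.go:

package database

import "sigs.k8s.io/controller-runtime/pkg/manager"

// compile-time check: anything with a Start(ctx context.Context) error method
// satisfies controller-runtime's Runnable interface.
var _ manager.Runnable = (*BadgerGarbageCollector)(nil)

// AddToManager registers the GC so the manager starts it alongside the
// controllers and cancels its context on shutdown.
func AddToManager(mgr manager.Manager, gc *BadgerGarbageCollector) error {
	return mgr.Add(gc)
}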
Now that I've updated this patch, the channel implementation is much simpler than using a mutex (#758).
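For contrast with the mutex approach in #758, a minimal, generic sketch of channel-based stop signaling; the stoppable type and WaitForStop name are hypothetical, not the patch's exact API:

package database

import (
	"context"
	"time"
)

type stoppable struct {
	stopped chan struct{}
}

func (s *stoppable) run(ctx context.Context) {
	defer close(s.stopped) // closing the channel signals exit exactly once
	<-ctx.Done()           // stand-in for the GC loop
}

// WaitForStop blocks until the goroutine has exited or the timeout fires.
// No mutex is needed: a closed channel can be read safely from any goroutine.
func (s *stoppable) WaitForStop(timeout time.Duration) bool {
	select {
	case <-s.stopped:
		return true
	case <-time.After(timeout):
		return false
	}
}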
I've added a test that starts the GC against a temporary Badger database, writes some tags, and verifies that the GC stops when asked.
This test could be extended to write a benchmark that also exercises the value log discard by running the test for longer and writing millions of tags/repos.
Alternatively, we could run the GC as a full e2e/scale-test with an image on an actual cluster, reading from OCI registries.
Related #748