
Proposal: Early Compaction of Stale Series from the Head Block #55


Open

codesome wants to merge 1 commit into main from codesome/stale-series-compaction

Conversation

@codesome (Member) commented Jul 4, 2025

@codesome force-pushed the codesome/stale-series-compaction branch from ebbfe83 to 11dd563 on July 8, 2025 at 19:21
@machine424 (Member) left a comment:

Thanks for this.
Some questions/suggestions.
I think we can start with tracking those stale series via a metric #55 (comment).

For the rest of the changes, if it's easy to put together, having a PoC would be really helpful to see things more clearly and to start gathering meaningful measurements.


### Alternative for tracking stale series

Consider when the last sample was scraped, *in addition to* the above proposal.
A member commented:

Because it's not really an alternative, maybe have it under "# Future Consideration" or somewhere else instead.


Implementation detail: if the usual head compaction is about to happen soon, we should skip the stale series compaction and simply wait for it. The buffer can be hardcoded.
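
A minimal sketch of how that guard could look; the function name, the way the next compaction time is obtained, and the 30-minute buffer value are all assumptions, not part of the proposal:

```go
package tsdb

import "time"

// staleCompactionBuffer is a hypothetical hardcoded buffer; the proposal only
// says the buffer can be hardcoded, not what its value should be.
const staleCompactionBuffer = 30 * time.Minute

// shouldRunStaleCompaction skips the early (stale series) compaction when the
// usual head compaction is expected to run within the buffer anyway.
func shouldRunStaleCompaction(now, nextHeadCompaction time.Time) bool {
	return nextHeadCompaction.Sub(now) > staleCompactionBuffer
}
```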

## Alternatives
A member commented:

We could also mention (allowing users to) reduce/tweak storage.tsdb.min-block-duration and why it cannot help here.

A member commented:

Funnily enough, we already adjust storage.tsdb.min-block-duration to 1h as a mitigation. We still have issues where a team will roll out, roll back, and roll out again within a single hour, causing a huge bump in head series. This typically leads to an OOM crashing Prometheus with tens of millions of stale series.


### Compacting Stale Series

We will have two thresholds to trigger stale series compaction, `p%` and `q%`, with `q > p` (both expressed as the percentage of total head series that are stale). Both will be configurable and default to 0% (meaning stale series compaction is disabled).
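
As an illustration only (the type and field names are hypothetical, not taken from the proposal), the two thresholds could surface in configuration roughly like this:

```go
package tsdb

// StaleSeriesCompactionConfig is a hypothetical shape for the two thresholds.
// Both default to 0, which means stale series compaction is disabled.
type StaleSeriesCompactionConfig struct {
	// TriggerPercent is p: the percentage of head series that are stale at
	// which a stale series compaction is triggered (Part 1).
	TriggerPercent float64
	// ForcePercent is q (q > p); its exact role is described elsewhere in
	// the proposal.
	ForcePercent float64
}
```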
A member commented:

I think it'd be more user-friendly if we just allow enabling the feature and have Prometheus choose the appropriate threshold (like the 3/2 ratio we currently have, for example).


## Goals

* Have a simple and efficient mechanism in the TSDB to track and identify stale series.
A member commented:

I remember @SuperQ mentioning that somewhere, but it'd be great if we could start with a metric for that; it'll help us decide on the logic.

A member commented:

Yes, my idea was to start with a metric.

A member commented:

Note that there is also `scrape_series_added`.
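
A minimal sketch of what such a starting-point metric could look like, using client_golang; the metric name is hypothetical and does not exist in Prometheus today, and the real TSDB would register it through its own registerer:

```go
package tsdb

import "github.com/prometheus/client_golang/prometheus"

// headStaleSeries would track how many head series are currently considered
// stale, giving the data needed to decide on the compaction logic.
var headStaleSeries = prometheus.NewGauge(prometheus.GaugeOpts{
	Name: "prometheus_tsdb_head_stale_series", // hypothetical name
	Help: "Number of series in the head block currently considered stale.",
})

func init() {
	prometheus.MustRegister(headStaleSeries)
}

// The head would call headStaleSeries.Inc() when a series is marked stale and
// headStaleSeries.Dec() when it receives a new sample again.
```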


**Part 1**

At a regular interval (say 15 minutes), we check whether the number of stale series has crossed `p%` of the total series. If it has, we trigger a compaction that simply flushes these stale series into a block and removes them from the Head block (this can produce more than one block if the series cross the block boundary). We skip WAL truncation and m-map file truncation at this stage and let the usual compaction cycle handle them. How we drop these compacted series during WAL replay is TBD during implementation (it may need a new WAL record or use tombstone records).
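
A rough sketch, under assumed names, of what this periodic check could look like; the `headStats` interface and its methods are invented for illustration, and the real logic would live inside the head/compactor:

```go
package tsdb

import "time"

// headStats is a hypothetical stand-in for the parts of the Head this loop needs.
type headStats interface {
	StaleSeries() uint64       // series currently considered stale
	TotalSeries() uint64       // total series in the head
	CompactStaleSeries() error // flush stale series into block(s) and drop them from the head
}

// runStaleSeriesCompactionLoop checks at a regular interval whether stale
// series have crossed p% of the total and, if so, triggers the early
// compaction. WAL and m-map file truncation are left to the usual cycle.
func runStaleSeriesCompactionLoop(h headStats, p float64, stop <-chan struct{}) error {
	ticker := time.NewTicker(15 * time.Minute) // "say 15 minutes" from the proposal
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return nil
		case <-ticker.C:
			total := h.TotalSeries()
			if total == 0 {
				continue
			}
			if float64(h.StaleSeries())/float64(total)*100 >= p {
				if err := h.CompactStaleSeries(); err != nil {
					return err
				}
			}
		}
	}
}
```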
A member commented:

IIUC, we'll be dropping the records during replay; otherwise, "the restarts on OOM or scale up take too long" from the Why section should be removed.


Consider when the last sample was scraped, *in addition to* the above proposal.

For edge cases where we did not put the staleness markers, we can look at the difference between the last sample timestamp of the series and the max time of the head block, and if it crosses a threshold, call the series stale. For example, if a series did not get a sample for 5 minutes (i.e. the head's max time is 5 minutes ahead of the series' last sample timestamp), we consider it stale.
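
A small illustration of that fallback check; the 5-minute threshold comes from the example above, while the constant and function names are assumptions:

```go
package tsdb

// staleAfterMs is the assumed threshold (5 minutes, in milliseconds, matching
// the example): a series with no staleness marker whose last sample is this
// far behind the head's max time is treated as stale.
const staleAfterMs = int64(5 * 60 * 1000)

// seriesLooksStale reports whether a series should be considered stale based
// purely on timestamps (used only when staleness markers were not written).
func seriesLooksStale(lastSampleT, headMaxT int64) bool {
	return headMaxT-lastSampleT >= staleAfterMs
}
```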
A member commented:

We'll need to align with the scrape logic that deals with staleness when staleness markers couldn't be inserted.


**Part 1**

At a regular interval (say 15 minutes), we check whether the number of stale series has crossed `p%` of the total series. If it has, we trigger a compaction that simply flushes these stale series into a block and removes them from the Head block (this can produce more than one block if the series cross the block boundary). We skip WAL truncation and m-map file truncation at this stage and let the usual compaction cycle handle them. How we drop these compacted series during WAL replay is TBD during implementation (it may need a new WAL record or use tombstone records).
A member commented:

Would the blocks be overlapping and merged during a normal compaction? We'd also need to take the merging overhead into account.
