feat(v2): tenant-wide datasets #3885

kolesnikovae · 2025-02-03T09:27:11Z

The PR adds dataset indices: a TSDB index structurally identical to the one we use for storing series indices, but with postings pointing to datasets within the block. This index helps serve queries that do not specify which dataset is being queried (we currently create datasets based on the service_name label).

The index is also used for TSDB queries that do not rely on the SeriesIndex: specifically, LabelNames, LabelValue, and SeriesBy APIs. This is particularly useful for our old UX, where we need to generate suggestions for query input _ for example, listing all values for the pod label.

The dataset index is written at the tenant level. To enable this, I introduce the concept of tenant-wide datasets, or anonymous datasets: data regions attributed to the tenant itself. This allows storing data that is intended to be accessed globally for the tenant (i.e., when no specific query expression is provided). I plan to leverage this approach to implement #3256.

The dataset index is only generated during compaction. While we could create it in the segment writer, I have decided to postpone this for now. If any issues arise, they should be easy to address – I've modified the way datasets are flushed into segments, so we now get a stream of "flushes" and can produce the index in the same way it is done in the compactors.

In the PR, I also optimized the way blocks are compacted, eliminating most of the staging files: as a result the overall job duration (and resource usage) has decreased:

The reason is quite obvious:

I experimented with direct writes to object storage, but the client we use does not handle them well. As a result, I had to revert the optimization – something we should explore further in the future.

# Conflicts: # pkg/experiment/ingester/segment.go

pkg/experiment/query_backend/query.go

aleks-p

LGTM! I left a few questions / minor remarks that are not blockers to move this forward.

pkg/frontend/read_path/query_frontend/query_frontend.go

pkg/experiment/block/compaction.go

pkg/experiment/block/compaction_test.go

aleks-p · 2025-02-11T21:03:22Z

pkg/experiment/block/writer.go

-		}
-	}
-	return toc, err
+func (w *Writer) ReadFrom(r io.Reader) (n int64, err error) {


nit, is this technically WriteFromReader or similar? (I know it was already named this way, just noticing it now)

Not sure what you're referring to. The io.ReaderFrom interface is mainly implemented by types that are also io.Writer. And io.WriterTo interface is mainly implemented by types that are also io.Reader.

I've removed this bit either way – now that we do not use staging files, it is not super helpful.

I am saying it is strange for a method that does writing to be called ReadFrom, although I might be misunderstanding what it does. If this is a Go convention, then it is fine - though I would still find it strange :)

Completely unrelated, but since we've already started this conversation:

ReaderFrom and WriterTo are primarily meant for io.Copy and very useful: they allow to delegate copying to the implementation in:

io.Copy(dst io.Writer, src io.Reader)

https://cs.opensource.google/go/go/+/refs/tags/go1.24.0:src/io/io.go;l=407-416

This might be confusing at times.

For example, bytes.Buffer.ReaderFrom will actually grow your buffer by at least bytes.MinRead extra space, if you call ReadFrom, or copy to the buffer, and the source does not implement WriterTo (which would be called before dst.ReadFrom in io.Copy).

// MinRead is the minimum slice size passed to a [Buffer.Read] call by // [Buffer.ReadFrom]. As long as the [Buffer] has at least MinRead bytes beyond // what is required to hold the contents of r, [Buffer.ReadFrom] will not grow the // underlying buffer. const MinRead = 512

https://cs.opensource.google/go/go/+/refs/tags/go1.24.0:src/bytes/buffer.go;l=196-206

This is quite a pain – when you allocate a buffer that you know has sufficient capacity (e.g., 8MB), but it gets reallocated to 8MB + 512 bytes. This is especially frustrating when you're optimizing for performance and using a buffer pool (since most of them rely on size classes).

This is one of the reasons why we have our own "yet-another" buffer pool:
https://github.com/grafana/pyroscope/blob/main/pkg/util/bufferpool/pool.go#L34

This trick helped to achieve quite good inuse/alloc ratio:

Alloc space:

Inuse space:

Nice, thanks for explaining!

This trick helped to achieve quite good inuse/alloc ratio:

This seems like a useful "metric" that could be extracted from profiling data, e.g., for automated analysis / recommendations.

pkg/experiment/ingester/segment.go

pkg/experiment/query_backend/query.go

kolesnikovae · 2025-02-12T08:40:01Z

Thank you for the review Aleks!

kolesnikovae added 7 commits January 19, 2025 16:08

reuse getSeriesLabels

ccab3d1

add segment-writer worker pool for head flushes

1e1a990

stream dataset flushes in segment-writer

882097e

feat(v2): dataset indices

394dab0

error handling

45192ea

add block query context

3ab3fb0

store metadata in the block

aab405e

kolesnikovae changed the title ~~feat(v2): dataset indices~~ feat(v2): tenant-wide datasets Feb 4, 2025

kolesnikovae added 20 commits February 5, 2025 13:53

Merge branch 'main' into feat/tenant-wide-tsdb-index

28b7835

fix compaction error handling

be7b455

extend metadata query interface

93f66db

add symdb write closer option

b3183ae

fallback to dataset index query

42138cd

explicit format handling

676c024

fix not found err handling in compaction worker

9fbf325

stream output object at compaction

1b01c99

fix metadata encoding

4c0bcf5

block reader refactoring

61e2066

fix metadata label query

830532d

revert stream object upload

c6af723

free up dataset after compaction

838177c

fix up object size in metadata

c00890a

add dataset tenant and name getters

17134d4

add a note about the dataset index use

3555f1a

restrict metadata query expression

6389652

Merge branch 'main' into feat/tenant-wide-tsdb-index

8aa3b51

# Conflicts: # pkg/experiment/ingester/segment.go

add block reader unit test suite

6a2c76f

fix metadata access race

4d3ad13

kolesnikovae marked this pull request as ready for review February 11, 2025 11:53

kolesnikovae requested a review from a team as a code owner February 11, 2025 11:53

kolesnikovae commented Feb 11, 2025

View reviewed changes

pkg/experiment/query_backend/query.go Outdated Show resolved Hide resolved

aleks-p approved these changes Feb 11, 2025

View reviewed changes

nits

73cc609

kolesnikovae force-pushed the feat/tenant-wide-tsdb-index branch from 1ffc672 to 73cc609 Compare February 12, 2025 08:31

improve segment writer shutdown

ba8c885

kolesnikovae force-pushed the feat/tenant-wide-tsdb-index branch from ab53ba1 to ba8c885 Compare February 17, 2025 07:24

kolesnikovae merged commit 8eb99e0 into main Feb 17, 2025
21 checks passed

kolesnikovae deleted the feat/tenant-wide-tsdb-index branch February 17, 2025 08:58

shelldandy pushed a commit to shelldandy/pyroscope that referenced this pull request Mar 14, 2025

feat(v2): tenant-wide datasets (grafana#3885)

d1e9506

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(v2): tenant-wide datasets #3885

feat(v2): tenant-wide datasets #3885

Uh oh!

kolesnikovae commented Feb 3, 2025 •

edited

Loading

Uh oh!

Uh oh!

aleks-p left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

aleks-p Feb 11, 2025

Uh oh!

kolesnikovae Feb 12, 2025

Uh oh!

aleks-p Feb 13, 2025

Uh oh!

kolesnikovae Feb 14, 2025

Uh oh!

aleks-p Feb 14, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

kolesnikovae commented Feb 12, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

feat(v2): tenant-wide datasets #3885

feat(v2): tenant-wide datasets #3885

Uh oh!

Conversation

kolesnikovae commented Feb 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

aleks-p left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

aleks-p Feb 11, 2025

Choose a reason for hiding this comment

Uh oh!

kolesnikovae Feb 12, 2025

Choose a reason for hiding this comment

Uh oh!

aleks-p Feb 13, 2025

Choose a reason for hiding this comment

Uh oh!

kolesnikovae Feb 14, 2025

Choose a reason for hiding this comment

Uh oh!

aleks-p Feb 14, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

kolesnikovae commented Feb 12, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

kolesnikovae commented Feb 3, 2025 •

edited

Loading