Skip to content

Conversation

@kolesnikovae
Copy link
Collaborator

@kolesnikovae kolesnikovae commented Feb 3, 2025

The PR adds dataset indices: a TSDB index structurally identical to the one we use for storing series indices, but with postings pointing to datasets within the block. This index helps serve queries that do not specify which dataset is being queried (we currently create datasets based on the service_name label).

The index is also used for TSDB queries that do not rely on the SeriesIndex: specifically, LabelNames, LabelValue, and SeriesBy APIs. This is particularly useful for our old UX, where we need to generate suggestions for query input _ for example, listing all values for the pod label.

The dataset index is written at the tenant level. To enable this, I introduce the concept of tenant-wide datasets, or anonymous datasets: data regions attributed to the tenant itself. This allows storing data that is intended to be accessed globally for the tenant (i.e., when no specific query expression is provided). I plan to leverage this approach to implement #3256.

The dataset index is only generated during compaction. While we could create it in the segment writer, I have decided to postpone this for now. If any issues arise, they should be easy to address – I've modified the way datasets are flushed into segments, so we now get a stream of "flushes" and can produce the index in the same way it is done in the compactors.

In the PR, I also optimized the way blocks are compacted, eliminating most of the staging files: as a result the overall job duration (and resource usage) has decreased:
image
The reason is quite obvious:
image

I experimented with direct writes to object storage, but the client we use does not handle them well. As a result, I had to revert the optimization – something we should explore further in the future.

@kolesnikovae kolesnikovae changed the title feat(v2): dataset indices feat(v2): tenant-wide datasets Feb 4, 2025
@kolesnikovae kolesnikovae marked this pull request as ready for review February 11, 2025 11:53
@kolesnikovae kolesnikovae requested a review from a team as a code owner February 11, 2025 11:53
Copy link
Contributor

@aleks-p aleks-p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I left a few questions / minor remarks that are not blockers to move this forward.

}
}
return toc, err
func (w *Writer) ReadFrom(r io.Reader) (n int64, err error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, is this technically WriteFromReader or similar? (I know it was already named this way, just noticing it now)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you're referring to. The io.ReaderFrom interface is mainly implemented by types that are also io.Writer. And io.WriterTo interface is mainly implemented by types that are also io.Reader.

I've removed this bit either way – now that we do not use staging files, it is not super helpful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am saying it is strange for a method that does writing to be called ReadFrom, although I might be misunderstanding what it does. If this is a Go convention, then it is fine - though I would still find it strange :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely unrelated, but since we've already started this conversation:

ReaderFrom and WriterTo are primarily meant for io.Copy and very useful: they allow to delegate copying to the implementation in:

io.Copy(dst io.Writer, src io.Reader)

https://cs.opensource.google/go/go/+/refs/tags/go1.24.0:src/io/io.go;l=407-416

This might be confusing at times.

For example, bytes.Buffer.ReaderFrom will actually grow your buffer by at least bytes.MinRead extra space, if you call ReadFrom, or copy to the buffer, and the source does not implement WriterTo (which would be called before dst.ReadFrom in io.Copy).

// MinRead is the minimum slice size passed to a [Buffer.Read] call by
// [Buffer.ReadFrom]. As long as the [Buffer] has at least MinRead bytes beyond
// what is required to hold the contents of r, [Buffer.ReadFrom] will not grow the
// underlying buffer.
const MinRead = 512

https://cs.opensource.google/go/go/+/refs/tags/go1.24.0:src/bytes/buffer.go;l=196-206

This is quite a pain – when you allocate a buffer that you know has sufficient capacity (e.g., 8MB), but it gets reallocated to 8MB + 512 bytes. This is especially frustrating when you're optimizing for performance and using a buffer pool (since most of them rely on size classes).

This is one of the reasons why we have our own "yet-another" buffer pool:
https://github.com/grafana/pyroscope/blob/main/pkg/util/bufferpool/pool.go#L34

This trick helped to achieve quite good inuse/alloc ratio:

Alloc space:
Screenshot 2025-02-14 at 13 11 05

Inuse space:
image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks for explaining!

This trick helped to achieve quite good inuse/alloc ratio:

This seems like a useful "metric" that could be extracted from profiling data, e.g., for automated analysis / recommendations.

@kolesnikovae kolesnikovae force-pushed the feat/tenant-wide-tsdb-index branch from 1ffc672 to 73cc609 Compare February 12, 2025 08:31
@kolesnikovae
Copy link
Collaborator Author

Thank you for the review Aleks!

@kolesnikovae kolesnikovae force-pushed the feat/tenant-wide-tsdb-index branch from ab53ba1 to ba8c885 Compare February 17, 2025 07:24
@kolesnikovae kolesnikovae merged commit 8eb99e0 into main Feb 17, 2025
21 checks passed
@kolesnikovae kolesnikovae deleted the feat/tenant-wide-tsdb-index branch February 17, 2025 08:58
shelldandy pushed a commit to shelldandy/pyroscope that referenced this pull request Mar 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants