Improve performance of Bazel query streamed protobuf output #24304

Open
keithl-stripe opened this issue Nov 12, 2024 · 0 comments
Labels: P2, team-Core, team-Performance, type: feature request

keithl-stripe commented Nov 12, 2024

Description of the feature request:

(description copied from related issue #24293)

Our repository contains about 700,000 targets. We use the output of bazel query to improve CI performance by restricting the Bazel build to changed targets and their transitive dependencies (similar to bazel-diff).

Specifically, we run:

bazel query --output=streamed_proto //...

This produces a 6.8 GB file. From a roughly cold start, the run takes:

  • 16 seconds to download/unpack external repos
  • 14 seconds to parse all the BUILD.bazel files
  • 4 seconds to evaluate the query expression
  • 1 minute, 35 seconds to render the protos to stdout

We'd like to speed up this last step, which accounts for about 74% of wall time (95 of 129 seconds).
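For reference, the 74% figure can be reproduced from the timings listed above. This is only a quick arithmetic sketch; it assumes the four phases run one after another and together account for all of the wall time, and the `phases` dictionary and `share` helper are illustrative names, not anything from Bazel.

```python
# Phase timings in seconds, taken from the breakdown in this issue.
phases = {
    "download/unpack external repos": 16,
    "parse BUILD.bazel files": 14,
    "evaluate query expression": 4,
    "render protos to stdout": 95,  # 1 minute, 35 seconds
}

# Assumption: phases are sequential, so their sum is the total wall time.
total = sum(phases.values())


def share(name):
    """Return a phase's share of total wall time as a rounded percentage."""
    return round(100 * phases[name] / total)


for name, seconds in phases.items():
    print(f"{name}: {seconds} s ({share(name)}%)")
```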

Through Java profiling (via YourKit and Java Flight Recorder) we've noticed that Bazel spends a lot of wall time on a single thread doing:

  1. Constructing protobuf Build.Target objects in memory
  2. Rendering said proto to varint-delimited wire format
  3. Actually writing to stdout
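To make steps 2 and 3 concrete, here is a minimal sketch of varint-delimited framing: each record is written as a base-128 varint length prefix followed by the record's bytes. It is not Bazel's implementation (which is in Java); the payloads are opaque byte strings standing in for serialized Build.Target messages, and `encode_varint`, `write_delimited`, and `read_delimited` are hypothetical helper names chosen for illustration.

```python
import io


def encode_varint(n):
    """Encode a non-negative int as a protobuf-style base-128 varint."""
    if n < 0:
        raise ValueError("varint lengths must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def write_delimited(stream, payload):
    """Write one record: varint length prefix, then the payload bytes."""
    stream.write(encode_varint(len(payload)))
    stream.write(payload)


def read_delimited(stream):
    """Yield each payload from a varint-delimited stream until EOF."""
    while True:
        length = 0
        shift = 0
        while True:
            b = stream.read(1)
            if not b:
                if shift == 0:
                    return  # clean EOF on a record boundary
                raise EOFError("truncated varint length prefix")
            length |= (b[0] & 0x7F) << shift
            if not b[0] & 0x80:
                break
            shift += 7
        payload = stream.read(length)
        if len(payload) != length:
            raise EOFError("truncated payload")
        yield payload


# Round trip through an in-memory buffer standing in for stdout.
records = [b"//pkg:a", b"", b"x" * 300]
buf = io.BytesIO()
for record in records:
    write_delimited(buf, record)
buf.seek(0)
decoded = list(read_delimited(buf))
```

Because every record carries its own length prefix, a consumer can process targets one at a time without loading the whole 6.8 GB output into memory. In the Java protobuf library, `writeDelimitedTo` produces the same framing.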

[Screenshot, 2024-11-11 5:52:59 PM; image not preserved]

Which category does this issue belong to?

Core, Performance

What underlying problem are you trying to solve with this feature?

Improve bazel query --output=streamed_proto performance

Which operating system are you running Bazel on?

Linux Ubuntu 24.04.1

What is the output of bazel info release?

release 7.2.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response
