[DO NOT MERGE] Experimentally force non-leaf frame pointers #115521

Open
wants to merge 1 commit into
base: master
from

Conversation

workingjubilee
Member

Continuing the experiment of #114323

r? @ghost

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 3, 2023
@workingjubilee
Member Author

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 3, 2023
@bors
Collaborator

bors commented Sep 3, 2023

⌛ Trying commit cbbed36 with merge d346bd31885b2a741517e772e9ac19eeecc088b5...

@workingjubilee workingjubilee changed the title from "Experimentally force non-leaf frame pointers" to "[DO NOT MERGE] Experimentally force non-leaf frame pointers" Sep 3, 2023
@bors
Collaborator

bors commented Sep 4, 2023

☀️ Try build successful - checks-actions
Build commit: d346bd31885b2a741517e772e9ac19eeecc088b5 (d346bd31885b2a741517e772e9ac19eeecc088b5)

@rust-timer
Collaborator

Finished benchmarking commit (d346bd31885b2a741517e772e9ac19eeecc088b5): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|:-:|:-:|:-:|:-:|
| Regressions ❌ (primary) | 1.6% | [0.3%, 2.3%] | 215 |
| Regressions ❌ (secondary) | 1.5% | [0.4%, 2.5%] | 189 |
| Improvements ✅ (primary) | -5.4% | [-10.4%, -0.4%] | 2 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 1.5% | [-10.4%, 2.3%] | 217 |
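
As a rough cross-check of the table above, the "All (primary)" mean can be approximated from the two primary rows by weighting each row's mean by its benchmark count. The sketch below assumes that reading of the table; `Row` and `combined_mean` are hypothetical names for illustration and are not part of rustc-perf. The inputs are already rounded, so the result is only approximate.

```rust
/// One row of a perf summary table: the mean change (in percent) and the
/// number of benchmarks that contributed to it. Hypothetical helper type.
struct Row {
    mean_pct: f64,
    count: u32,
}

/// Count-weighted mean of several rows, in percent.
/// Returns `None` when the rows contain no benchmarks at all.
fn combined_mean(rows: &[Row]) -> Option<f64> {
    let total: u32 = rows.iter().map(|r| r.count).sum();
    if total == 0 {
        return None;
    }
    let weighted: f64 = rows
        .iter()
        .map(|r| r.mean_pct * f64::from(r.count))
        .sum();
    Some(weighted / f64::from(total))
}

fn main() {
    // Primary instruction-count rows copied from the table above.
    let rows = [
        Row { mean_pct: 1.6, count: 215 }, // regressions
        Row { mean_pct: -5.4, count: 2 },  // improvements
    ];
    if let Some(mean) = combined_mean(&rows) {
        println!("combined primary mean: {mean:.1}%");
    }
}
```

With these inputs the combined value lands close to the 1.5% reported in the "All" row, which suggests the two improvements barely offset the 215 regressions.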

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|:-:|:-:|:-:|:-:|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.2% | [1.7%, 6.0%] | 3 |
| Improvements ✅ (primary) | -0.8% | [-1.4%, -0.1%] | 2 |
| Improvements ✅ (secondary) | -4.9% | [-6.9%, -2.7%] | 3 |
| All ❌✅ (primary) | -0.8% | [-1.4%, -0.1%] | 2 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|:-:|:-:|:-:|:-:|
| Regressions ❌ (primary) | 1.1% | [0.5%, 1.6%] | 14 |
| Regressions ❌ (secondary) | 2.6% | [1.0%, 3.2%] | 6 |
| Improvements ✅ (primary) | -8.1% | [-8.1%, -8.1%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.5% | [-8.1%, 1.6%] | 15 |

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|:-:|:-:|:-:|:-:|
| Regressions ❌ (primary) | 0.0% | [0.0%, 0.0%] | 12 |
| Regressions ❌ (secondary) | 0.2% | [0.1%, 0.5%] | 9 |
| Improvements ✅ (primary) | -0.3% | [-0.5%, -0.0%] | 89 |
| Improvements ✅ (secondary) | -0.3% | [-3.0%, -0.0%] | 32 |
| All ❌✅ (primary) | -0.2% | [-0.5%, 0.0%] | 101 |

Bootstrap: 628.382s -> 631.014s (0.42%)
Artifact size: 316.56 MiB -> 314.59 MiB (-0.62%)
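
The percentages on the two summary lines above are plain relative changes. A minimal sketch of that arithmetic follows; `percent_change` is a hypothetical helper, not a rustc-perf function, and the inputs are the rounded figures shown in the comment, so the last digit can differ from what the bot computed on unrounded data.

```rust
/// Relative change from `before` to `after`, expressed in percent.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // Values copied from the "Bootstrap" and "Artifact size" lines above.
    let bootstrap = percent_change(628.382, 631.014);
    let artifact = percent_change(316.56, 314.59);
    println!("bootstrap: {bootstrap:+.2}%");
    println!("artifact size: {artifact:+.2}%");
}
```

A positive value means the run with forced non-leaf frame pointers was slower or larger, a negative value means it was faster or smaller.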

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Sep 4, 2023
@workingjubilee
Member Author

Much better, but still a pretty hard sell.

@workingjubilee workingjubilee restored the force-trunk-frame-pointers branch May 4, 2024 04:07
@workingjubilee workingjubilee reopened this May 4, 2024
@workingjubilee workingjubilee force-pushed the force-trunk-frame-pointers branch from cbbed36 to 2455c14 Compare May 4, 2024 04:08
@rust-log-analyzer
Collaborator

The job mingw-check-tidy failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
Getting action download info
Download action repository 'msys2/[email protected]' (SHA:cc11e9188b693c2b100158c3322424c4cc1dadea)
Download action repository 'actions/checkout@v4' (SHA:0ad4b8fadaa221de15dcec353f45205ec38ea70b)
Download action repository 'actions/upload-artifact@v4' (SHA:65462800fd760344b1a7b4382951275a0abb4808)
Complete job name: PR - mingw-check-tidy
git config --global core.autocrlf false
shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
---
COPY scripts/sccache.sh /scripts/
RUN sh /scripts/sccache.sh

COPY host-x86_64/mingw-check/reuse-requirements.txt /tmp/
RUN pip3 install --no-deps --no-cache-dir --require-hashes -r /tmp/reuse-requirements.txt \
    && pip3 install virtualenv
COPY host-x86_64/mingw-check/validate-toolstate.sh /scripts/
COPY host-x86_64/mingw-check/validate-error-codes.sh /scripts/

# NOTE: intentionally uses python2 for x.py so we can test it still works.
# validate-toolstate only runs in our CI, so it's ok for it to only support python3.
ENV SCRIPT TIDY_PRINT_DIFF=1 python2.7 ../x.py test \
           --stage 0 src/tools/tidy tidyselftest --extra-checks=py:lint
# This file is autogenerated by pip-compile with Python 3.10
# by the following command:
#
#    pip-compile --allow-unsafe --generate-hashes reuse-requirements.in
---

#12 [5/8] COPY host-x86_64/mingw-check/reuse-requirements.txt /tmp/
#12 DONE 0.0s

#13 [6/8] RUN pip3 install --no-deps --no-cache-dir --require-hashes -r /tmp/reuse-requirements.txt     && pip3 install virtualenv
#13 0.407   Downloading binaryornot-0.4.4-py2.py3-none-any.whl (9.0 kB)
#13 0.419 Collecting boolean-py==4.0
#13 0.423   Downloading boolean.py-4.0-py3-none-any.whl (25 kB)
#13 0.437 Collecting chardet==5.1.0
---
#13 3.409 Building wheels for collected packages: reuse
#13 3.410   Building wheel for reuse (pyproject.toml): started
#13 3.732   Building wheel for reuse (pyproject.toml): finished with status 'done'
#13 3.733   Created wheel for reuse: filename=reuse-1.1.0-cp310-cp310-manylinux_2_35_x86_64.whl size=181117 sha256=f5f58750481f69515c2c0d1d503daf565e2565c370d07fc6aeb95fe3498b4269
#13 3.733   Stored in directory: /tmp/pip-ephem-wheel-cache-zpb0ngqr/wheels/c2/3c/b9/1120c2ab4bd82694f7e6f0537dc5b9a085c13e2c69a8d0c76d
#13 3.736 Installing collected packages: boolean-py, binaryornot, setuptools, reuse, python-debian, markupsafe, license-expression, jinja2, chardet
#13 3.758   Attempting uninstall: setuptools
#13 3.759     Found existing installation: setuptools 59.6.0
#13 3.760     Not uninstalling setuptools at /usr/lib/python3/dist-packages, outside environment /usr
---
#13 4.974   Downloading virtualenv-20.26.1-py3-none-any.whl (3.9 MB)
#13 5.043      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.9/3.9 MB 58.9 MB/s eta 0:00:00
#13 5.092 Collecting filelock<4,>=3.12.2
#13 5.096   Downloading filelock-3.14.0-py3-none-any.whl (12 kB)
#13 5.124 Collecting platformdirs<5,>=3.9.1
#13 5.127   Downloading platformdirs-4.2.1-py3-none-any.whl (17 kB)
#13 5.144 Collecting distlib<1,>=0.3.7
#13 5.147   Downloading distlib-0.3.8-py2.py3-none-any.whl (468 kB)
#13 5.153      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 468.9/468.9 KB 123.5 MB/s eta 0:00:00
#13 5.240 Installing collected packages: distlib, platformdirs, filelock, virtualenv
#13 5.401 Successfully installed distlib-0.3.8 filelock-3.14.0 platformdirs-4.2.1 virtualenv-20.26.1
#13 DONE 5.5s

#14 [7/8] COPY host-x86_64/mingw-check/validate-toolstate.sh /scripts/
#14 DONE 0.0s
---
DirectMap4k:      192448 kB
DirectMap2M:     7147520 kB
DirectMap1G:    11534336 kB
##[endgroup]
Executing TIDY_PRINT_DIFF=1 python2.7 ../x.py test            --stage 0 src/tools/tidy tidyselftest --extra-checks=py:lint
+ TIDY_PRINT_DIFF=1 python2.7 ../x.py test --stage 0 src/tools/tidy tidyselftest --extra-checks=py:lint
    Finished `dev` profile [unoptimized] target(s) in 0.03s
##[endgroup]
downloading https://ci-artifacts.rust-lang.org/rustc-builds-alt/09cd00fea4aecaa6707f122d7e143196b8a12ee2/rust-dev-nightly-x86_64-unknown-linux-gnu.tar.xz
extracting /checkout/obj/build/cache/llvm-09cd00fea4aecaa6707f122d7e143196b8a12ee2-true/rust-dev-nightly-x86_64-unknown-linux-gnu.tar.xz to /checkout/obj/build/x86_64-unknown-linux-gnu/ci-llvm
---
##[endgroup]
fmt check
tidy check
tidy: Skipping binary file check, read-only filesystem
##[error]tidy error: /checkout/compiler/rustc_codegen_llvm/src/attributes.rs:116: TODO is used for tasks that should be done before merging a PR; If you want to leave a message in the codebase use FIXME
removing old virtual environment
creating virtual environment at '/checkout/obj/build/venv' using 'python3.10'
Requirement already satisfied: pip in ./build/venv/lib/python3.10/site-packages (24.0)
Collecting black==23.3.0 (from -r /checkout/src/tools/tidy/config/requirements.txt (line 7))
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 44.3 MB/s eta 0:00:00
Collecting click==8.1.3 (from -r /checkout/src/tools/tidy/config/requirements.txt (line 34))
  Downloading click-8.1.3-py3-none-any.whl (96 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 96.6/96.6 kB 38.0 MB/s eta 0:00:00
Collecting importlib-metadata==6.7.0 (from -r /checkout/src/tools/tidy/config/requirements.txt (line 38))
  Downloading importlib_metadata-6.7.0-py3-none-any.whl (22 kB)
Collecting mypy-extensions==1.0.0 (from -r /checkout/src/tools/tidy/config/requirements.txt (line 42))
  Downloading mypy_extensions-1.0.0-py3-none-any.whl (4.7 kB)
Collecting packaging==23.1 (from -r /checkout/src/tools/tidy/config/requirements.txt (line 46))
  Downloading packaging-23.1-py3-none-any.whl (48 kB)
Collecting pathspec==0.11.1 (from -r /checkout/src/tools/tidy/config/requirements.txt (line 50))
  Downloading pathspec-0.11.1-py3-none-any.whl (29 kB)
Collecting platformdirs==3.6.0 (from -r /checkout/src/tools/tidy/config/requirements.txt (line 54))
  Downloading platformdirs-3.6.0-py3-none-any.whl (16 kB)
Collecting ruff==0.0.272 (from -r /checkout/src/tools/tidy/config/requirements.txt (line 58))
  Downloading ruff-0.0.272-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.9 MB)
Collecting tomli==2.0.1 (from -r /checkout/src/tools/tidy/config/requirements.txt (line 77))
  Downloading tomli-2.0.1-py3-none-any.whl (12 kB)
Collecting typed-ast==1.5.4 (from -r /checkout/src/tools/tidy/config/requirements.txt (line 81))
  Downloading typed_ast-1.5.4-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (877 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 877.7/877.7 kB 138.9 MB/s eta 0:00:00
Collecting typing-extensions==4.6.3 (from -r /checkout/src/tools/tidy/config/requirements.txt (line 107))
  Downloading typing_extensions-4.6.3-py3-none-any.whl (31 kB)
Collecting zipp==3.15.0 (from -r /checkout/src/tools/tidy/config/requirements.txt (line 114))
  Downloading zipp-3.15.0-py3-none-any.whl (6.8 kB)
Installing collected packages: zipp, typing-extensions, typed-ast, tomli, ruff, platformdirs, pathspec, packaging, mypy-extensions, click, importlib-metadata, black
Successfully installed black-23.3.0 click-8.1.3 importlib-metadata-6.7.0 mypy-extensions-1.0.0 packaging-23.1 pathspec-0.11.1 platformdirs-3.6.0 ruff-0.0.272 tomli-2.0.1 typed-ast-1.5.4 typing-extensions-4.6.3 zipp-3.15.0
some tidy checks failed
Build completed unsuccessfully in 0:00:58
  local time: Sat May  4 04:13:15 UTC 2024
  network time: Sat, 04 May 2024 04:13:15 GMT

@workingjubilee
Member Author

I have returned!
LLVM 18 landed.

Supposedly, Fuchsia (and indeed Google) enables this everywhere and "there is no significant performance impact when enabling frame pointers w/ -momit-leaf-frame-pointer or the Rust equivalent of NonLeaf". They even ran LLVM for quite a while. I know they tend to run LLVM at HEAD? Let's try this bench again, then!

@bors try @rust-timer queue
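
For readers unfamiliar with the terminology, here is a simplified, standalone sketch of the three frame-pointer levels mentioned above and the values LLVM accepts for its `"frame-pointer"` function attribute (`"non-leaf"` and `"all"`). It is loosely modelled on the idea behind this experiment and is not copied from the compiler: the enum, `llvm_frame_pointer_attr`, and `force_non_leaf` are illustrative names and should be treated as assumptions.

```rust
/// Frame-pointer policy, ordered from weakest to strongest guarantee.
/// Illustrative type for this sketch only.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum FramePointer {
    /// The backend may omit frame pointers wherever it likes.
    MayOmit,
    /// Keep frame pointers in functions that call other functions.
    NonLeaf,
    /// Keep frame pointers in every function.
    Always,
}

/// Value for LLVM's `"frame-pointer"` function attribute, or `None`
/// when the attribute would simply be left off.
fn llvm_frame_pointer_attr(fp: FramePointer) -> Option<&'static str> {
    match fp {
        FramePointer::MayOmit => None,
        FramePointer::NonLeaf => Some("non-leaf"),
        FramePointer::Always => Some("all"),
    }
}

/// Hypothetical version of "forcing" non-leaf frame pointers:
/// raise the requested policy to at least `NonLeaf`, never lower it.
fn force_non_leaf(requested: FramePointer) -> FramePointer {
    requested.max(FramePointer::NonLeaf)
}

fn main() {
    for requested in [
        FramePointer::MayOmit,
        FramePointer::NonLeaf,
        FramePointer::Always,
    ] {
        let effective = force_non_leaf(requested);
        println!(
            "{requested:?} -> {effective:?} (attribute: {:?})",
            llvm_frame_pointer_attr(effective)
        );
    }
}
```

Deriving `Ord` on the enum keeps the "never weaken the policy" rule to a single `max` call, which is one reasonable design for a setting that only ever gets stricter.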

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 4, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request May 4, 2024
…nters, r=<try>

[DO NOT MERGE] Experimentally force non-leaf frame pointers

Continuing the experiment of rust-lang#114323

r? `@ghost`
@bors
Collaborator

bors commented May 4, 2024

⌛ Trying commit 2455c14 with merge 80a8029...

@bors
Collaborator

bors commented May 4, 2024

☀️ Try build successful - checks-actions
Build commit: 80a8029 (80a802916b0785f44141a810f2f4fb5f56479de5)

@rust-timer
Collaborator

Finished benchmarking commit (80a8029): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|:-:|:-:|:-:|:-:|
| Regressions ❌ (primary) | 1.3% | [0.6%, 2.2%] | 227 |
| Regressions ❌ (secondary) | 1.4% | [0.3%, 2.6%] | 206 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 1.3% | [0.6%, 2.2%] | 227 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|:-:|:-:|:-:|:-:|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -4.1% | [-4.1%, -4.1%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -4.1% | [-4.1%, -4.1%] | 1 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|:-:|:-:|:-:|:-:|
| Regressions ❌ (primary) | 1.6% | [0.9%, 2.6%] | 20 |
| Regressions ❌ (secondary) | 4.7% | [0.7%, 12.2%] | 20 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 1.6% | [0.9%, 2.6%] | 20 |

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|:-:|:-:|:-:|:-:|
| Regressions ❌ (primary) | 0.0% | [0.0%, 0.0%] | 5 |
| Regressions ❌ (secondary) | 0.3% | [0.0%, 0.5%] | 3 |
| Improvements ✅ (primary) | -0.3% | [-0.7%, -0.0%] | 92 |
| Improvements ✅ (secondary) | -0.5% | [-2.9%, -0.0%] | 13 |
| All ❌✅ (primary) | -0.3% | [-0.7%, 0.0%] | 97 |

Bootstrap: 678.263s -> 676.946s (-0.19%)
Artifact size: 315.83 MiB -> 313.92 MiB (-0.61%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 4, 2024
@Kobzol
Contributor

Kobzol commented May 4, 2024

How does this reduce binary size, lol.

@Noratrieb
Member

Noratrieb commented May 4, 2024

this doesn't really accurately capture the perf impact tbh, as we're measuring both the effect of compiling the rustc binary with them and the effect of rustc compiling something with frame pointers - not that meaningful. it would be more insightful to patch bootstrap to pass the frame pointer build flags to rustc, so we can measure the rustc binary without disturbing the way rustc compiles things - if that's what you want to measure

@Noratrieb
Member

Noratrieb commented May 4, 2024

but it looks like from bitmaps incremental (which shows up in cycles and wall time as well) that try_mark_previous_green (a function that's burning hot in incremental) is being hit the most, so I'm inclined to believe that at least this result is directly from the rustc binary getting slower

> 4,849,672  <rustc_query_system::dep_graph::graph::DepGraphData<rustc_middle::dep_graph::DepsType>>::try_mark_previous_green::<rustc_query_impl::plumbing::QueryCtxt>:???

>   328,497  rustc_query_system::query::plumbing::try_execute_query::<rustc_query_impl::DynamicConfig<rustc_query_system::query::caches::VecCache<rustc_span::def_id::LocalDefId, rustc_middle::query::erase::Erased<[u8; 8]>>, false, false, false>, rustc_query_impl::plumbing::QueryCtxt, true>:???

@workingjubilee
Member Author

...hmm.

@workingjubilee
Member Author

> How does this reduce binary size, lol.

I suspect, by being told to include frame pointers, LLVM is making more deliberate decisions about inlining since it knows that now that costs a few more instructions.
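
One way to picture how inlining interacts with a non-leaf policy: a function that makes no calls is a leaf, and inlining a callee can turn its caller into a leaf as well. The sketch below is only an illustration of that distinction. The function names are made up, `#[inline(never)]` is a hint rather than a guarantee, and whether any particular function actually keeps a frame pointer depends on the backend and target.

```rust
/// Makes no calls, so it is a leaf function. Under a non-leaf policy the
/// backend would be free to skip the frame-pointer setup here.
#[inline(never)]
fn leaf(x: u64) -> u64 {
    x.wrapping_mul(3).wrapping_add(1)
}

/// Calls `leaf` twice without inlining it, so it is a non-leaf function and
/// would be expected to keep a frame pointer under that policy.
#[inline(never)]
fn non_leaf(x: u64) -> u64 {
    leaf(x) + leaf(x + 1)
}

/// The same arithmetic with the callee folded in by hand. With no calls
/// left, this function is a leaf again.
#[inline(never)]
fn non_leaf_inlined(x: u64) -> u64 {
    x.wrapping_mul(3).wrapping_add(1) + (x + 1).wrapping_mul(3).wrapping_add(1)
}

fn main() {
    // Both versions compute the same value; only the call structure differs.
    assert_eq!(non_leaf(5), non_leaf_inlined(5));
    println!("non_leaf(5) = {}", non_leaf(5));
}
```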

@workingjubilee
Member Author

> this doesn't really accurately capture the perf impact tbh, as we're measuring both the effect of compiling the rustc binary with them and the effect of rustc compiling something with frame pointers - not that meaningful. it would be more insightful to patch bootstrap to pass the frame pointer build flags to rustc, so we can measure the rustc binary without disturbing the way rustc compiles things - if that's what you want to measure

There's no -Cforce-frame-pointers=trunk unfortunately.

@apiraino
Contributor

apiraino commented Jun 6, 2024

Switching to waiting on author since this looks like a work in progress, with review by the author themself.

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 6, 2024
@Dylan-DPC Dylan-DPC added S-experimental Status: Ongoing experiment that does not require reviewing and won't be merged in its current state. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 31, 2024
Labels
perf-regression Performance regression. S-experimental Status: Ongoing experiment that does not require reviewing and won't be merged in its current state. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
9 participants