Skip to content

Performance regression in "Add externfn macro and correctly label fixed_stack_segments" #8782

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
DaGenix opened this issue Aug 27, 2013 · 4 comments
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@DaGenix
Copy link

DaGenix commented Aug 27, 2013

I've noticed a minor performance regression in 0479d94 or maybe 303f650. Unfortunately, 303f650 doesn't compile, so, I can't be quite sure. The revision before that, c178b52, is fine. I've also tested it against current master, 1ac7d80.

I don't have a small test case that demonstrates the issue. All I have is a large one. Check out the "performance-regression" branch of https://github.com/DaGenix/rust-working and then run the benchmark on the test.rc file - the relevant item is aes::bench::aes_bench_x8 (there is lots of messy code in there; I apologize for that, I use that repository mostly as a backup for my main computer).

All tests were run with --opt-level 3 on a Core i5 under 64-bit Ubuntu. Performance is likely drastically different on an architecture where LLVM can't use the SSE instructions, so, I don't know how well this reproduces outside of x86_64.

c178b52 (last fast revision):

83.49 MB/s

0479d94 (first slow revision that compiles):

80.60 MB/s

1ac7d80 (current master):

79.60 MB/s

Here is where it gets interesting: the code being benchmarked is mostly just the contents of the aessafe.rs file. I took code from that file and everything it depends on and moved it into a single file at: https://gist.github.com/DaGenix/6348471. When I benchmark this file using master, I get the faster (83.71 MB/s) performance.

This certainly isn't the ideal test case for this issue, although I'm not sure how to chop down the code to get a better test case at this point.

@brson
Copy link
Contributor

brson commented Aug 29, 2013

Great bug report.

@DaGenix
Copy link
Author

DaGenix commented Aug 30, 2013

Updated on 8/29:

I re-ran all my tests on the named revisions above, and also against current master. I'm not sure why, but current master is much slower than before.

I put all my code into my github repository in the "performance-regression" branch of https://github.com/DaGenix/rust-working. Due to recent changes in master, the original code no longer compiles on master, so, I've updated the code a tiny little bit from before. The tests for test.rc were run as:

../path/to/rustc --opt-level 3 --test test.rc >/dev/null 2>&1 && ./test --bench x8

and the tests for aesbench2.rs were run as:

../path/to/rustc --opt-level 3 --test aesbench2.rs >/dev/null 2>&1 && ./aesbench2 --bench x8

Same as before, test.rc and aesbench2.rs contain pretty much identical code, except that the code in test.rc is split across multiple files and with aesbench2.rs everything is in a single file.

From run to run, test results would vary by up to around 0.5 MB/s, so, I wouldn't read too much in to tiny variations. I just tried to report the value that was pretty much the average. If anyone tries to run the benchmarks, I have them setup to report 100x the MB/s value, so, they will report something like 8000 MB/s which is really 80.00 MB/s.

commit test.rc (MB/s) aesbench2.rs (MB/s)
c178b52 (last fast revision) 84.0 83.4
0479d94 (first slower revision) 79.8 83.4
5ef8cdb (current master) 63.5 63.8

@nikomatsakis
Copy link
Contributor

cc me

@alexcrichton
Copy link
Member

Closing, these commits have since been removed.

flip1995 pushed a commit to flip1995/rust that referenced this issue May 5, 2022
Move only_used_in_recursion to nursery

r? `@llogiq`

I still think this is the right thing to do for this lint. See: rust-lang#8782

I want to get this merged before the sync on Thursday, because I want to backport this and the last fix rust-lang#8691 to beta.

changelog: Move [`only_used_in_recursion`] to nursery
Jarcho pushed a commit to Jarcho/rust that referenced this issue Aug 29, 2022
Rework `only_used_in_recursion`

fixes rust-lang#8782
fixes rust-lang#8629
fixes rust-lang#8560
fixes rust-lang#8556

This is a complete rewrite of the lint. This loses some capabilities of the old implementation. Namely the ability to track through tuple and slice patterns, as well as the ability to trace through assignments.

The two reported bugs are fixed with this. One was caused by using the name of the method rather than resolving to the `DefId` of the called method. The second was cause by using the existence of a cycle in the dependency graph to determine whether the parameter was used in recursion even though there were other ways to create a cycle in the graph.

Implementation wise this switches from using a visitor to walking up the tree from every use of each parameter until it has been determined the parameter is used for something other than recursion. This is likely to perform better as it avoids walking the entire function a second time, and it is unlikely to walk up the HIR tree very much. Some cases would perform worse though.

cc `@buttercrab`

changelog: Scale back `only_used_in_recursion` to fix false positives
changelog: Move `only_used_in_recursion` back to `complexity`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code.
Projects
None yet
Development

No branches or pull requests

4 participants