Reduce text size for fail invocations #15983
Passing one pointer takes less code than one pointer and an integer.
Produces very clean asm, but makes bigger binaries.
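The idea behind these commits can be sketched in present-day Rust. This is only an illustration under stated assumptions: `begin_unwind_old`, `begin_unwind_new`, `fail_old!` and `fail_new!` are hypothetical stand-ins, not the real runtime entry points, and they return a `String` instead of unwinding so the result can be inspected. A `&str` is a pointer plus a length, so the old call shape sets up several argument words at every call site, while the new shape sets up a single pointer to a per-call-site static.

```rust
// Hypothetical stand-ins for the runtime entry points; not the real std::rt API.

/// Old shape: the call site passes file and line as separate arguments.
#[inline(never)]
fn begin_unwind_old(msg: &str, file: &'static str, line: u32) -> String {
    format!("failed at '{}', {}:{}", msg, file, line)
}

/// New shape: the call site passes one pointer to a static (file, line) pair.
#[inline(never)]
fn begin_unwind_new(msg: &str, file_line: &'static (&'static str, u32)) -> String {
    let (file, line) = *file_line;
    format!("failed at '{}', {}:{}", msg, file, line)
}

macro_rules! fail_old {
    ($msg:expr) => {
        begin_unwind_old($msg, file!(), line!())
    };
}

macro_rules! fail_new {
    ($msg:expr) => {{
        // One static per expansion: less code at the call site, more data in the binary.
        static FILE_LINE: (&'static str, u32) = (file!(), line!());
        begin_unwind_new($msg, &FILE_LINE)
    }};
}

fn main() {
    println!("{}", fail_old!("guh"));
    println!("{}", fail_new!("guh"));
}
```

Both macros report the same location information; the difference is only in how much argument-passing code each expansion emits versus how much static data it adds.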
One thing that might reduce overall bin size is to just use short variable names instead of e.g.
Size difference in stripped binaries:
Seems like on balance a win to me; instruction cache footprint is important. (Though I don't know why
@@ -58,7 +62,8 @@ macro_rules! fail(
 // up with the number of calls to fail!()
 #[inline(always)]
 fn run_fmt(fmt: &::std::fmt::Arguments) -> ! {
-::std::rt::begin_unwind_fmt(fmt, file!(), line!())
+static file_line: (&'static str, uint) = (file!(), line!());
+::std::rt::begin_unwind_fmt(fmt, &file_line)
Could you ALL_CAPS `file_line` in all these usages?
I agree, these size wins seem great!
This looks like it's unoptimized, are you sure the assembly outputs you generated were optimized versions?
@alexcrichton argh, no the asm I reported was unoptimized. I always forget that we don't optimize by default.
This reverts commit c61f976. Conflicts: src/librustrt/unwind.rs src/libstd/macros.rs
Feedback addressed.
@@ -33,7 +33,8 @@ macro_rules! fail(
 // up with the number of calls to fail!()
 #[inline(always)]
 fn run_fmt(fmt: &::std::fmt::Arguments) -> ! {
-::core::failure::begin_unwind(fmt, file!(), line!())
+static file_line: (&'static str, uint) = (file!(), line!());
FILE_LINE? (is this not warned about?)
Fixed another static name.
A few refactorings to decrease text size and increase data size. I'm not sure about this tradeoff. Various stats below. cc @pcwalton

This reduces the code needed to pass arguments for `fail!()`, `fail!("{}", ...)`, and to a lesser extent `fail!("...")`. Still more work to be done on compiler-generated failures and the `fail!("...")` case.

do_fail_empty:

```
#[inline(never)]
fn do_fail_empty() {
    fail!()
}
```

do_fail_empty before:

```
leaq 8(%rsp), %rdi
movabsq $13, %rsi
leaq "str\"str\"(1494)"(%rip), %rax
movq %rax, 8(%rsp)
movq $19, 16(%rsp)
callq _ZN6unwind31begin_unwind_no_time_to_explain20h57030457935ab6111SdE@PLT
```

do_fail_empty after:

```
leaq _ZN13do_fail_empty9file_line20h339df6a0541e837eIaaE(%rip), %rdi
callq _ZN6unwind31begin_unwind_no_time_to_explain20h33184cfdcce4dfd8QTdE@PLT
```

do_fail_fmt:

```
#[inline(never)]
fn do_fail_fmt() {
    fail!("guh{}", "faw")
}
```

do_fail_fmt before:

```
... (snip lots of fmt stuff)
callq _ZN3fmt22Arguments$LT$$x27a$GT$3new20he09b3a3f473879c41paE
leaq 144(%rsp), %rsi
movabsq $23, %rdx
leaq "str\"str\"(1494)"(%rip), %rax
leaq 32(%rsp), %rcx
movq %rcx, 160(%rsp)
movq 160(%rsp), %rdi
movq %rax, 144(%rsp)
movq $19, 152(%rsp)
callq _ZN6unwind16begin_unwind_fmt20h3ebeb42f4d189b2buQdE@PLT
```

do_fail_fmt after:

```
... (snip lots of fmt stuff)
callq _ZN3fmt22Arguments$LT$$x27a$GT$3new20h42e5bb8d1711ee61OqaE
leaq _ZN11do_fail_fmt7run_fmt9file_line20h339df6a0541e837eFbaE(%rip), %rsi
leaq 32(%rsp), %rax
movq %rax, 144(%rsp)
movq 144(%rsp), %rdi
callq _ZN6unwind16begin_unwind_fmt20hfdcadc14d188656biRdE@PLT
```

File size increases.
file size before:

```
-rw-rw-r-- 1 brian brian 100501740 Jul 24 23:28 /home/brian/dev/rust2/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc-4e7c5e5c.rlib
-rwxrwxr-x 1 brian brian 21201780 Jul 24 23:27 /home/brian/dev/rust2/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc-4e7c5e5c.so
```

file size after:

```
-rw-rw-r-- 1 brian brian 101542484 Jul 25 00:34 x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc-4e7c5e5c.rlib
-rwxrwxr-x 1 brian brian 21348862 Jul 25 00:34 x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc-4e7c5e5c.so
```

Text size decreases by 52486 while data size increases by 143686.

section size before:

```
text     data    bss dec      hex     filename
12712262 5924997 368 18637627 11c633b x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc-4e7c5e5c.so
```

section size after:

```
text     data    bss dec      hex     filename
12659776 6068683 368 18728827 11dc77b /home/brian/dev/rust/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc-4e7c5e5c.so
```

I don't know if anything can be learned from these benchmarks. Looks like a wash.

std bench before:

```
test collections::hashmap::bench::find_existing ... bench: 43452 ns/iter (+/- 2423)
test collections::hashmap::bench::find_nonexisting ... bench: 42416 ns/iter (+/- 3996)
test collections::hashmap::bench::find_pop_insert ... bench: 214 ns/iter (+/- 11)
test collections::hashmap::bench::hashmap_as_queue ... bench: 123 ns/iter (+/- 6)
test collections::hashmap::bench::insert ... bench: 153 ns/iter (+/- 14)
test collections::hashmap::bench::new_drop ... bench: 547 ns/iter (+/- 259)
test collections::hashmap::bench::new_insert_drop ... bench: 682 ns/iter (+/- 366)
test io::buffered::test::bench_buffered_reader ... bench: 1046 ns/iter (+/- 86)
test io::buffered::test::bench_buffered_stream ... bench: 2156 ns/iter (+/- 801)
test io::buffered::test::bench_buffered_writer ... bench: 1057 ns/iter (+/- 75)
test io::extensions::bench::u64_from_be_bytes_4_aligned ... bench: 80 ns/iter (+/- 5)
test io::extensions::bench::u64_from_be_bytes_4_unaligned ... bench: 81 ns/iter (+/- 6)
test io::extensions::bench::u64_from_be_bytes_7_aligned ... bench: 80 ns/iter (+/- 4)
test io::extensions::bench::u64_from_be_bytes_7_unaligned ... bench: 69 ns/iter (+/- 4)
test io::extensions::bench::u64_from_be_bytes_8_aligned ... bench: 69 ns/iter (+/- 3)
test io::extensions::bench::u64_from_be_bytes_8_unaligned ... bench: 81 ns/iter (+/- 4)
test io::mem::test::bench_buf_reader ... bench: 628 ns/iter (+/- 18)
test io::mem::test::bench_buf_writer ... bench: 478 ns/iter (+/- 19)
test io::mem::test::bench_mem_reader ... bench: 712 ns/iter (+/- 44)
test io::mem::test::bench_mem_writer_001_0000 ... bench: 31 ns/iter (+/- 1)
test io::mem::test::bench_mem_writer_001_0010 ... bench: 51 ns/iter (+/- 3)
test io::mem::test::bench_mem_writer_001_0100 ... bench: 121 ns/iter (+/- 8)
test io::mem::test::bench_mem_writer_001_1000 ... bench: 774 ns/iter (+/- 47)
test io::mem::test::bench_mem_writer_100_0000 ... bench: 756 ns/iter (+/- 50)
test io::mem::test::bench_mem_writer_100_0010 ... bench: 2726 ns/iter (+/- 198)
test io::mem::test::bench_mem_writer_100_0100 ... bench: 8961 ns/iter (+/- 712)
test io::mem::test::bench_mem_writer_100_1000 ... bench: 105673 ns/iter (+/- 24711)
test num::bench::bench_pow_function ... bench: 5849 ns/iter (+/- 371)
test num::strconv::bench::f64::float_to_string ... bench: 662 ns/iter (+/- 202)
test num::strconv::bench::int::to_str_base_36 ... bench: 424 ns/iter (+/- 7)
test num::strconv::bench::int::to_str_bin ... bench: 1227 ns/iter (+/- 80)
test num::strconv::bench::int::to_str_dec ... bench: 466 ns/iter (+/- 13)
test num::strconv::bench::int::to_str_hex ... bench: 498 ns/iter (+/- 22)
test num::strconv::bench::int::to_str_oct ... bench: 502 ns/iter (+/- 229)
test num::strconv::bench::uint::to_str_base_36 ... bench: 375 ns/iter (+/- 7)
test num::strconv::bench::uint::to_str_bin ... bench: 1011 ns/iter (+/- 590)
test num::strconv::bench::uint::to_str_dec ... bench: 407 ns/iter (+/- 17)
test num::strconv::bench::uint::to_str_hex ... bench: 442 ns/iter (+/- 7)
test num::strconv::bench::uint::to_str_oct ... bench: 433 ns/iter (+/- 46)
test path::posix::bench::ends_with_path_home_dir ... bench: 167 ns/iter (+/- 10)
test path::posix::bench::ends_with_path_missmatch_jome_home ... bench: 148 ns/iter (+/- 6)
test path::posix::bench::is_ancestor_of_path_with_10_dirs ... bench: 221 ns/iter (+/- 31)
test path::posix::bench::join_abs_path_home_dir ... bench: 144 ns/iter (+/- 23)
test path::posix::bench::join_home_dir ... bench: 196 ns/iter (+/- 9)
test path::posix::bench::join_many_abs_path_home_dir ... bench: 143 ns/iter (+/- 6)
test path::posix::bench::join_many_home_dir ... bench: 195 ns/iter (+/- 8)
test path::posix::bench::path_relative_from_backward ... bench: 248 ns/iter (+/- 10)
test path::posix::bench::path_relative_from_forward ... bench: 241 ns/iter (+/- 13)
test path::posix::bench::path_relative_from_same_level ... bench: 296 ns/iter (+/- 11)
test path::posix::bench::push_abs_path_home_dir ... bench: 104 ns/iter (+/- 7)
test path::posix::bench::push_home_dir ... bench: 27311 ns/iter (+/- 2727)
test path::posix::bench::push_many_abs_path_home_dir ... bench: 109 ns/iter (+/- 5)
test path::posix::bench::push_many_home_dir ... bench: 23263 ns/iter (+/- 1726)
test rand::bench::rand_isaac ... bench: 884 ns/iter (+/- 31) = 904 MB/s
test rand::bench::rand_isaac64 ... bench: 440 ns/iter (+/- 126) = 1818 MB/s
test rand::bench::rand_shuffle_100 ... bench: 2518 ns/iter (+/- 1371)
test rand::bench::rand_std ... bench: 429 ns/iter (+/- 17) = 1864 MB/s
test rand::bench::rand_xorshift ... bench: 0 ns/iter (+/- 0) = 800000 MB/s
```

std bench after:

```
test collections::hashmap::bench::find_existing ... bench: 43635 ns/iter (+/- 4508)
test collections::hashmap::bench::find_nonexisting ... bench: 42323 ns/iter (+/- 1753)
test collections::hashmap::bench::find_pop_insert ... bench: 216 ns/iter (+/- 11)
test collections::hashmap::bench::hashmap_as_queue ... bench: 125 ns/iter (+/- 8)
test collections::hashmap::bench::insert ... bench: 153 ns/iter (+/- 63)
test collections::hashmap::bench::new_drop ... bench: 517 ns/iter (+/- 282)
test collections::hashmap::bench::new_insert_drop ... bench: 734 ns/iter (+/- 264)
test io::buffered::test::bench_buffered_reader ... bench: 1063 ns/iter (+/- 206)
test io::buffered::test::bench_buffered_stream ... bench: 2321 ns/iter (+/- 2302)
test io::buffered::test::bench_buffered_writer ... bench: 1060 ns/iter (+/- 24)
test io::extensions::bench::u64_from_be_bytes_4_aligned ... bench: 69 ns/iter (+/- 2)
test io::extensions::bench::u64_from_be_bytes_4_unaligned ... bench: 81 ns/iter (+/- 7)
test io::extensions::bench::u64_from_be_bytes_7_aligned ... bench: 70 ns/iter (+/- 5)
test io::extensions::bench::u64_from_be_bytes_7_unaligned ... bench: 69 ns/iter (+/- 5)
test io::extensions::bench::u64_from_be_bytes_8_aligned ... bench: 80 ns/iter (+/- 6)
test io::extensions::bench::u64_from_be_bytes_8_unaligned ... bench: 81 ns/iter (+/- 5)
test io::mem::test::bench_buf_reader ... bench: 663 ns/iter (+/- 44)
test io::mem::test::bench_buf_writer ... bench: 489 ns/iter (+/- 17)
test io::mem::test::bench_mem_reader ... bench: 700 ns/iter (+/- 23)
test io::mem::test::bench_mem_writer_001_0000 ... bench: 31 ns/iter (+/- 3)
test io::mem::test::bench_mem_writer_001_0010 ... bench: 49 ns/iter (+/- 5)
test io::mem::test::bench_mem_writer_001_0100 ... bench: 112 ns/iter (+/- 6)
test io::mem::test::bench_mem_writer_001_1000 ... bench: 765 ns/iter (+/- 59)
test io::mem::test::bench_mem_writer_100_0000 ... bench: 727 ns/iter (+/- 54)
test io::mem::test::bench_mem_writer_100_0010 ... bench: 2586 ns/iter (+/- 215)
test io::mem::test::bench_mem_writer_100_0100 ... bench: 8846 ns/iter (+/- 439)
test io::mem::test::bench_mem_writer_100_1000 ... bench: 105747 ns/iter (+/- 17443)
test num::bench::bench_pow_function ... bench: 5844 ns/iter (+/- 421)
test num::strconv::bench::f64::float_to_string ... bench: 669 ns/iter (+/- 571)
test num::strconv::bench::int::to_str_base_36 ... bench: 417 ns/iter (+/- 24)
test num::strconv::bench::int::to_str_bin ... bench: 1216 ns/iter (+/- 36)
test num::strconv::bench::int::to_str_dec ... bench: 466 ns/iter (+/- 24)
test num::strconv::bench::int::to_str_hex ... bench: 492 ns/iter (+/- 8)
test num::strconv::bench::int::to_str_oct ... bench: 496 ns/iter (+/- 295)
test num::strconv::bench::uint::to_str_base_36 ... bench: 366 ns/iter (+/- 8)
test num::strconv::bench::uint::to_str_bin ... bench: 1005 ns/iter (+/- 69)
test num::strconv::bench::uint::to_str_dec ... bench: 396 ns/iter (+/- 20)
test num::strconv::bench::uint::to_str_hex ... bench: 435 ns/iter (+/- 4)
test num::strconv::bench::uint::to_str_oct ... bench: 436 ns/iter (+/- 451)
test path::posix::bench::ends_with_path_home_dir ... bench: 171 ns/iter (+/- 6)
test path::posix::bench::ends_with_path_missmatch_jome_home ... bench: 152 ns/iter (+/- 6)
test path::posix::bench::is_ancestor_of_path_with_10_dirs ... bench: 215 ns/iter (+/- 8)
test path::posix::bench::join_abs_path_home_dir ... bench: 143 ns/iter (+/- 6)
test path::posix::bench::join_home_dir ... bench: 192 ns/iter (+/- 29)
test path::posix::bench::join_many_abs_path_home_dir ... bench: 144 ns/iter (+/- 9)
test path::posix::bench::join_many_home_dir ... bench: 194 ns/iter (+/- 19)
test path::posix::bench::path_relative_from_backward ... bench: 254 ns/iter (+/- 15)
test path::posix::bench::path_relative_from_forward ... bench: 244 ns/iter (+/- 17)
test path::posix::bench::path_relative_from_same_level ... bench: 293 ns/iter (+/- 27)
test path::posix::bench::push_abs_path_home_dir ... bench: 108 ns/iter (+/- 5)
test path::posix::bench::push_home_dir ... bench: 32292 ns/iter (+/- 4361)
test path::posix::bench::push_many_abs_path_home_dir ... bench: 108 ns/iter (+/- 6)
test path::posix::bench::push_many_home_dir ... bench: 20305 ns/iter (+/- 1331)
test rand::bench::rand_isaac ... bench: 888 ns/iter (+/- 35) = 900 MB/s
test rand::bench::rand_isaac64 ... bench: 439 ns/iter (+/- 17) = 1822 MB/s
test rand::bench::rand_shuffle_100 ... bench: 2582 ns/iter (+/- 1001)
test rand::bench::rand_std ... bench: 431 ns/iter (+/- 93) = 1856 MB/s
test rand::bench::rand_xorshift ... bench: 0 ns/iter (+/- 0) = 800000 MB/s
```
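The quoted size deltas can be rechecked from the two `size` rows in the description. The following is a small worked check, not part of the PR; the `Sections` type and `deltas` helper are hypothetical names introduced only for this illustration. It also derives the net change, which the description does not state directly.

```rust
/// One row of `size` output, in bytes (hypothetical helper type).
struct Sections {
    text: u64,
    data: u64,
    bss: u64,
}

impl Sections {
    /// Sum of all sections, matching the `dec` column.
    fn total(&self) -> u64 {
        self.text + self.data + self.bss
    }
}

// Values copied from the "section size before/after" rows.
const BEFORE: Sections = Sections { text: 12_712_262, data: 5_924_997, bss: 368 };
const AFTER: Sections = Sections { text: 12_659_776, data: 6_068_683, bss: 368 };

/// Returns (text bytes saved, data bytes added, net growth).
/// Assumes text shrank and data grew, as in this PR; other inputs could underflow.
fn deltas(before: &Sections, after: &Sections) -> (u64, u64, u64) {
    (
        before.text - after.text,
        after.data - before.data,
        after.total() - before.total(),
    )
}

fn main() {
    let (text_saved, data_added, net) = deltas(&BEFORE, &AFTER);
    println!("text -{} data +{} net +{}", text_saved, data_added, net);
}
```

By these numbers the data section grows by more than the text section shrinks, which is consistent with the reported increase in file size.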