Skip to content

Fix multiple unnecessary calls to strnom::skip_whitespace #226

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 5, 2020
Merged

Fix multiple unnecessary calls to strnom::skip_whitespace #226

merged 1 commit into from
May 5, 2020

Conversation

zzau13
Copy link
Contributor

@zzau13 zzau13 commented May 4, 2020

#126

fn main() {
    let path = "../rust/src/librustc_ast/util/parser.rs";
    let content = std::fs::read_to_string(path).unwrap();
    for _ in 0..1000 {
        content.parse::<proc_macro2::TokenStream>().unwrap();
    }
}
cargo build --release  
valgrind --tool=callgrind target/release/bench_proc
callgrind_annotate --auto=yes callgrind.out.*

This one step, it eat about 300M ir.

I can improve it much more but it needs a refactor, the alt! macros they are not efficient and the common logic is repeated many times, the most called skip_whitespace but there are also others like fallback::token_kind, fallback::op_char ...

If you like, I continue.

Copy link
Owner

@dtolnay dtolnay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Some notes below.

src/fallback.rs Outdated
if input.is_empty() {
break;
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? It looks like input.is_empty() would only be true at most once in the entire parse so I don't see a way that this would affect performance. If you omit it, the next loop iteration will fail to parse a token_tree and break, which is still correct behavior.

src/fallback.rs Outdated
Comment on lines 843 to 849
let input_no_ws = skip_whitespace(input);
if input_no_ws.rest.len() == 0 {
break;
}
if let Ok((a, tokens)) = doc_comment(input_no_ws) {
if let Ok((a, tokens)) = doc_comment(input) {
input = a;
trees.extend(tokens);
continue;
}

let (a, tt) = match token_tree(input_no_ws) {
let (a, tt) = match token_tree(input) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Help me understand this change: This causes all whitespace and comments to be parsed twice as many times, doesn't it? The previous behavior made 3 calls to skip_whitespace but only 1 of them would do any work. The new behavior makes 2 calls to skip_whitespace where the second one redoes all the work already done by the first one. Seems like which one is better would depend heavily on the characteristics of the input.

Instead we might want to split some of these functions like what we have for symbol_leading_ws vs symbol below, and always call the right one depending on whether whitespace has already been processed at the call site.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should go where it was.
I want to remove the call to skip_whitespace at spanned. I just removed it to make sure it worked since the logic is complex.

src/fallback.rs Outdated
@@ -919,7 +919,7 @@ named!(group -> Group, alt!(
));

fn symbol_leading_ws(input: Cursor) -> PResult<TokenTree> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function should be deleted if it no longer does anything different from symbol.

@zzau13 zzau13 changed the title Fix multiple unnecessary calls to strnom::skip_whitespace [WIP] Fix multiple unnecessary calls to strnom::skip_whitespace May 5, 2020
@zzau13
Copy link
Contributor Author

zzau13 commented May 5, 2020

Even though the use of whitespace is still very exaggerated, I don't want to change too much at once to avoid breaking anything. I have left it in a single place in token_stream. The results:

Old

Ir
--------------------------------------------------------------------------------
4,866,615,165  PROGRAM TOTALS

--------------------------------------------------------------------------------
Ir             file:function
--------------------------------------------------------------------------------
2,004,232,000  ???:proc_macro2::strnom::whitespace

New

Ir
--------------------------------------------------------------------------------
4,334,090,565  PROGRAM TOTALS

--------------------------------------------------------------------------------
Ir             file:function
--------------------------------------------------------------------------------
1,541,298,000  ???:proc_macro2::strnom::whitespace

@zzau13 zzau13 changed the title [WIP] Fix multiple unnecessary calls to strnom::skip_whitespace Fix multiple unnecessary calls to strnom::skip_whitespace May 5, 2020
@zzau13 zzau13 requested a review from dtolnay May 5, 2020 16:59
Copy link
Owner

@dtolnay dtolnay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@dtolnay dtolnay merged commit c3f5670 into dtolnay:master May 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants