[BUG] perf: having many expressions is slow to lex #513

Closed
JohelEGP opened this issue Jun 14, 2023 · 2 comments · Fixed by #514
Labels
bug Something isn't working

JohelEGP commented Jun 14, 2023

Title: perf: having many expressions is slow to lex.

Description:

On my system,
the following test takes the longest time
to run through Cppfront and
to build with the Cpp1 compiler.

For the following table:

  • Cppfront is built with the respective Cpp1 compiler configuration.
  • LLVM is shorthand for Clang and Libc++.

| mixed-as-for-variant-20-types | LLVM 17, Debug | LLVM 17, Release | GCC 13, Debug | GCC 13, Release |
| --- | --- | --- | --- | --- |
| Cppfront time | 11.28 s | 0.94 s | 3.44 s | 0.45 s |
| Cpp1 compiler time | 14.66 s | 4.68 s | 5.44 s | 2.48 s |

In Debug mode,
the time to lower mixed-as-for-variant-20-types
is very close to that of building the lowered Cpp1 (which is heavy on template instantiations).

Minimal reproducer (https://cpp2.godbolt.org/z/EvqKM4jMe):

main: (args) = {
  x :== args.argc;
  std::cout << "(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$\n";
  std::cout << "(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$\n";
  std::cout << "(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$\n";
  std::cout << "(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$\n";
  std::cout << "(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$\n";
  std::cout << "(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$\n";
  std::cout << "(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$\n";
  std::cout << "(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$\n";
  std::cout << "(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$(x)$\n";
}

[ Note:
During the lex phase, (x)$(x)$ becomes cpp2::to_string(x) + cpp2::to_string(x).
-- end note ]

Commands:
cppfront main.cpp2
clang++17 -std=c++23 -stdlib=libc++ -lc++abi -pedantic-errors -Wall -Wextra -Wconversion -I . main.cpp

Expected result:

Enough performance that lowering Cpp2 doesn't rival template-heavy Cpp1.

Actual result and error:

For the following table:

  • Cppfront is built with the respective Cpp1 compiler configuration.
  • LLVM is shorthand for Clang and Libc++.

| minimal reproducer | LLVM 17, Debug | LLVM 17, Release | GCC 13, Debug | GCC 13, Release |
| --- | --- | --- | --- | --- |
| Cppfront time | 14.30 s | 1.06 s | 2.53 s | 0.34 s |
| Cpp1 compiler time | 17.20 s | 4.20 s | 4.23 s | 2.10 s |

The fault seems to lie in the use of std::regex.
In the following flame graph,
highlighted in blue is the compilation of a constant "is a keyword?" regex, which is recompiled on every iteration.
The neighboring std::regex_search calls still seem to amount to ~½ of the total time.

[Image: flame graph of the Cppfront run]
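
To make the two costs above concrete, here is a minimal sketch of the difference between constructing the regex on every call and constructing it once. The function names and the keyword pattern are hypothetical placeholders, not Cppfront's actual code. Hoisting the regex only removes the recompilation; the std::regex_search cost stays.

```cpp
#include <regex>
#include <string>

// Hypothetical pattern for illustration only; Cppfront's real keyword
// regex is not shown in this issue.
static constexpr char keyword_pattern[] = "^(if|else|for|while|return)\\b";

// The regex is an automatic object: the pattern is recompiled on every call.
bool starts_with_keyword_automatic(std::string const& text) {
    std::regex const keys(keyword_pattern);
    return std::regex_search(text, keys);
}

// The regex is a function-local static: it is compiled once, on first use.
// Every call still pays for std::regex_search.
bool starts_with_keyword_static(std::string const& text) {
    static std::regex const keys(keyword_pattern);
    return std::regex_search(text, keys);
}
```

Both functions return the same answer for any input; only the per-call construction cost differs.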

@JohelEGP JohelEGP added the bug Something isn't working label Jun 14, 2023

JohelEGP commented Jun 16, 2023

For the following table:

  • Cppfront is built with the respective Cpp1 compiler configuration.
  • LLVM is shorthand for Clang and Libc++.

| reflect.h | LLVM 17, Debug | LLVM 17, Release | GCC 13, Debug | GCC 13, Release |
| --- | --- | --- | --- | --- |
| Cppfront time | 13.19 s | 1.27 s | 6.03 s | 0.75 s |

JohelEGP commented Jun 16, 2023

For the following table:

  • Cppfront is built with the respective Cpp1 compiler configuration.
  • LLVM is shorthand for Clang and Libc++.

Legend:

  • main: The regex is an automatic object (branch main).
  • static: The regex is a static object.
  • find_if: The keywords are stored in a static vector and queried with std::find_if.

| mixed-as-for-variant-20-types | LLVM 17, Debug | LLVM 17, Release | GCC 13, Debug | GCC 13, Release |
| --- | --- | --- | --- | --- |
| Cppfront time (main) | 11.28 s | 0.94 s | 3.44 s | 0.45 s |
| Cppfront time (static) | 10.89 s | 0.79 s | 1.47 s | 0.19 s |
| Cppfront time (find_if) | 0.08 s | 0.01 s | 0.08 s | 0.01 s |
| Cpp1 compiler time (main) | 14.66 s | 4.68 s | 5.44 s | 2.48 s |
| Cpp1 compiler time (static) | 14.71 s | 4.22 s | 3.47 s | 2.21 s |
| Cpp1 compiler time (find_if) | 3.90 s | 3.37 s | 2.24 s | 2.19 s |

| #513 minimal reproducer | LLVM 17, Debug | LLVM 17, Release | GCC 13, Debug | GCC 13, Release |
| --- | --- | --- | --- | --- |
| Cppfront time (main) | 14.30 s | 1.06 s | 2.53 s | 0.34 s |
| Cppfront time (static) | 14.00 s | 0.98 s | 1.76 s | 0.23 s |
| Cppfront time (find_if) | 0.03 s | 0.02 s | 0.04 s | 0.01 s |
| Cpp1 compiler time (main) | 17.20 s | 4.20 s | 4.23 s | 2.10 s |
| Cpp1 compiler time (static) | 17.49 s | 4.41 s | 3.84 s | 2.34 s |
| Cpp1 compiler time (find_if) | 3.66 s | 6.48 s | 2.11 s | 1.85 s |

| reflect.h | LLVM 17, Debug | LLVM 17, Release | GCC 13, Debug | GCC 13, Release |
| --- | --- | --- | --- | --- |
| Cppfront time (main) | 13.19 s | 1.27 s | 6.03 s | 0.75 s |
| Cppfront time (static) | 12.11 s | 0.84 s | 5.71 s | 0.70 s |
| Cppfront time (find_if) | 0.19 s | 0.02 s | 0.19 s | 0.01 s |
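
For reference, the find_if variant from the legend could look roughly like the sketch below. It assumes a plain prefix check followed by a non-identifier character. The function name and the keyword list are hypothetical, and the change that actually fixed this issue may differ.

```cpp
#include <algorithm>
#include <cctype>
#include <string_view>
#include <vector>

// Returns true if `text` begins with one of the keywords and the keyword
// is not merely the prefix of a longer identifier.
bool starts_with_keyword(std::string_view text) {
    // Hypothetical subset of keywords, built once on first use.
    static std::vector<std::string_view> const keys = {
        "if", "else", "for", "while", "return"};

    auto is_identifier_char = [](char c) {
        return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
    };

    auto it = std::find_if(keys.begin(), keys.end(), [&](std::string_view k) {
        // substr clamps the count, so a `text` shorter than `k` is safe here.
        return text.substr(0, k.size()) == k
            && (text.size() == k.size() || !is_identifier_char(text[k.size()]));
    });
    return it != keys.end();
}
```

This involves no regex construction and no std::regex_search: each call is a handful of short string comparisons.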
