[C++20] [constexpr] Should we cache the result of constexpr/consteval expression? #61425


Open
ChuanqiXu9 opened this issue Mar 15, 2023 · 15 comments
Labels
c++20 clang:frontend Language frontend issues, e.g. anything involving "Sema" consteval C++20 consteval

Comments

@ChuanqiXu9
Member

ChuanqiXu9 commented Mar 15, 2023

I found this one while investigating build times with modules.

The reproducer:

// a.cpp
constexpr bool f() {
	for (unsigned n = 0; n != 1'000'000; ++n) {
	}
	return true;
}

template<typename>
struct s {
	static constexpr auto a = f();
#ifndef NO_MULTIPLE_CALL
	static constexpr auto b = f();
	static constexpr auto c = f();
	static constexpr auto d = f();
#endif
};

template struct s<int>;
template struct s<long>;

Let's run it with:

set -x
echo "Build time for duplicated"
time clang++ -std=c++20 a.cpp -c -o a.o
echo "Build time for not duplicated"
time clang++ -std=c++20 a.cpp -c -o a.o -DNO_MULTIPLE_CALL

The result on my machine shows:

Build time for duplicated

real	0m11.753s
user	0m11.611s
sys	0m0.016s
Build time for not duplicated

real	0m3.631s
user	0m3.576s
sys	0m0.013s

So clearly Clang doesn't cache the result of constexpr/consteval evaluation. This is significant when we try to use metaprogramming techniques. Is there any special reason?

@ChuanqiXu9 ChuanqiXu9 added c++20 consteval C++20 consteval labels Mar 15, 2023
@llvmbot
Member

llvmbot commented Mar 15, 2023

@llvm/issue-subscribers-c-20

@tbaederr
Contributor

I tried doing exactly that by simply adding an Expr* -> APValue cache in ASTContext, but there were some test failures, so I didn't proceed further. I think the problem is that you'd need to add a few flags to the hash key, e.g. whether something is only being evaluated for overflow, etc.

@EugeneZelenko EugeneZelenko added the clang:frontend Language frontend issues, e.g. anything involving "Sema" label Mar 15, 2023
@llvmbot
Member

llvmbot commented Mar 15, 2023

@llvm/issue-subscribers-clang-frontend

@tbaederr
Contributor

Also relevant discussion: #13222

@ChuanqiXu9
Member Author

That discussion didn't go into much detail. Let's mark it as a duplicate and track the topic in this one. (That one was filed by llvmbot, so it looks a bit outdated to me.)

@royjacobson
Contributor

royjacobson commented Mar 17, 2023

What happens if a template specialization is added between calls? Or with all the 'unique unevaluated lambda types' tricks? (https://github.com/DaemonSnake/unconstexpr-cpp20 for example)

C++23 also has static constexpr variables, but I'm not sure how they work here; constant evaluation doesn't allow mutable statics.

I guess it might be possible to compute when caching is safe, but I don't think it can work trivially.

There's also an easy workaround here with constexpr variables.

@ChuanqiXu9
Member Author

Sent https://reviews.llvm.org/D146410 to get a more precise feel for the proposed solution. There are several problems with the patch, but it solves the problem in this issue and passes all tests except CodeGenCXX/builtin_LINE.cpp. So I guess it might be a potential solution.

After applying the above patch, the output for the reproducer would be:

Build time for duplicated

real	0m1.825s
user	0m1.796s
sys	0m0.011s
Build time for not duplicated

real	0m1.829s
user	0m1.801s
sys	0m0.008s

It looks pretty good to me.


What happens if a template specialization is added between calls?

Explicit specialization of functions after instantiation is not allowed. I guess you meant to ask about the following case?

constexpr bool f(short) {
	for (unsigned n = 0; n != 1'000'000; ++n) {
	}
	return true;
}

constexpr auto a = f((int)43);
constexpr auto b = f((int)43);
constexpr auto c = f((int)43);
constexpr auto d = f((int)43);

constexpr bool f(int) {
	for (unsigned n = 0; n != 1'000'000; ++n) {
	}
	return false;
}

constexpr auto e = f((int)43);
constexpr auto g = f((int)43);
constexpr auto h = f((int)43);
constexpr auto i = f((int)43);

It takes 3.6s to compile this, so each of the two constexpr functions is evaluated exactly once.


Although the probability should be pretty low, the biggest problem with this solution is that the current hash algorithm is not collision-free in theory. It implies that the compiler may treat two different constexpr function calls as the same. On the other hand, ODRHash, as the name implies, has been used to perform ODR checking in modules for a pretty long time (including Clang modules, so the period is quite long in practice). If we can't accept the potential collisions (I think we will choose this path), we can pick a stronger hash algorithm like SHA-256, or use llvm::FoldingSetNodeID directly. But even SHA-256 can produce collisions, so we can't be completely sure either.

As for llvm::FoldingSetNodeID, it is actually a vector of unsigned used to identify a unique entity. That sounds great, doesn't it? The problem is that it is not good for modules. Since my intention is to reduce duplicated computation within modules, it would make the BMI (the byproduct of module compilation) much larger, and the size of the BMI is already big.

So personally, I prefer the current hash values (unsigned).

The second problem may be style. To be honest, the patch above is a bit ugly, and there are many more places I didn't touch. I'd like to move the process of evaluating expressions into ASTContext rather than Expr, but that requires a lot of changes, which I don't like. Still, I guess we have to do so if we want to make this work.

The other problem I can see is correctness. This may be the most important thing, but it is hard to discuss without the details.

For CodeGenCXX/builtin_LINE.cpp, the problem is:

constexpr int get_line_constexpr(int l = __builtin_LINE()) {
  return l;
}

int global_two = get_line_constexpr();
const int global_three(get_line_constexpr());

The second call to get_line_constexpr() would return the cached result from the first call, and I don't have a good solution for this.

CC: @AaronBallman @erichkeane @cor3ntin

@cor3ntin
Contributor

We'd need to be very careful.
constexpr functions are not pure, and neither are consteval functions. I think the main observable effect right now is source_location, but reflection, logging, and other things may affect the result of constant evaluation.

I'm not sure users can get themselves into a situation where constant evaluation would lead to a different result the second time if the first evaluation was successful, so we might be able to have a list of builtins that exclude an evaluation from caching. Maybe that would be enough?

@erichkeane
Collaborator

I think this probably needs an RFC. But as Corentin says, I'm VERY concerned here.

I think a less dangerous solution is the one @tbaederr is working on: the new bytecode-based interpreter, which can be cached better and executed more quickly.

@tbaederr
Contributor

tbaederr commented Mar 30, 2023

FYI the results with the new constant expression interpreter are:

# time gcc -std=c++20 -c ./a.cpp -fconstexpr-loop-limit=1000000
real    0m0.395s
user    0m0.364s
sys     0m0.030s

# time gcc -std=c++20 -c ./a.cpp -fconstexpr-loop-limit=1000000 -DNO_MULTIPLE_CALL
real    0m0.395s
user    0m0.366s
sys     0m0.028s

# time bin/clang++ -std=c++20 -c ./a.cpp
real    0m4.713s
user    0m4.693s
sys     0m0.014s

# time bin/clang++ -std=c++20 -c ./a.cpp -DNO_MULTIPLE_CALL
real    0m1.939s
user    0m1.926s
sys     0m0.010s

# time bin/clang++ -std=c++20 -c ./a.cpp -fexperimental-new-constant-interpreter
real    0m0.991s
user    0m0.979s
sys     0m0.011s

# time bin/clang++ -std=c++20 -c ./a.cpp -fexperimental-new-constant-interpreter -DNO_MULTIPLE_CALL
real    0m0.330s
user    0m0.317s
sys     0m0.013s

@cor3ntin
Contributor

@tbaederr do we know what kind of witchcraft GCC is performing? Those are very impressive performance numbers.

@tbaederr
Contributor

You were too fast, I just updated them. My clang numbers were with msan enabled, sorry.

@ChuanqiXu9
Member Author

@tbaederr the numbers look pretty good! Is there an RFC or a design doc where I can learn the higher-level design? And if -fexperimental-new-constant-interpreter is going to be enabled by default, I don't think I need to work on this.

@tbaederr
Contributor

In terms of high-level documentation, there is https://clang.llvm.org/docs/ConstantInterpreter.html, but large parts of that are out of date. I haven't found the time to update it yet.

@neko-para

I believe this would be quite useful if the compiler could promise to do so.

constexpr std::string func() { ... }
constexpr size_t N = func().size();
template <size_t M>
constexpr std::array<char, M> make_data() {
    std::string str = func();
    std::array<char, M> result{};  // std::array has no range constructor
    std::copy(str.begin(), str.end(), result.begin());  // needs <algorithm>
    result[M - 1] = 0;
    return result;
}
constexpr auto data = make_data<N + 1>();

As shown above, I have to evaluate func twice so that I can store the std::string into a std::array, which can cross the boundary between compile time and runtime.


8 participants