Closed
When you are down to a few nanoseconds per operation, it could be the exact alignment of the code in memory, it could be the CPU's memory caching algorithm, it could be the branch predictor, it could be the phase of the moon. Microbenchmarking is hard.
We don't use the issue tracker to discuss issues like this. Use a forum instead. If you identify a problem in the compiler or assembler or linker, then by all means open an issue with details. But the odds are that this is effectively chance, and not something that can be fixed.
Originally posted by @ianlancetaylor in #39059 (comment)
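
To see the kind of noise the comment describes, here is a minimal sketch that times the same tiny operation several times in one process and reports how far the per-run averages drift apart. Everything in it is an assumption made for illustration and does not come from the original issue: the function `work`, the helpers `nsPerOp` and `spread`, and the run and iteration counts. The numbers it prints will differ from machine to machine and from run to run, which is the point.

```go
package main

import (
	"fmt"
	"time"
)

// sink receives the final result so the compiler cannot discard the loop.
var sink int

// work is a hypothetical stand-in for an operation that costs a few nanoseconds.
func work(x int) int {
	return x*31 + 7
}

// nsPerOp times iters calls of work and returns the average nanoseconds per call.
func nsPerOp(iters int) float64 {
	acc := 0
	start := time.Now()
	for i := 0; i < iters; i++ {
		acc = work(acc + i)
	}
	elapsed := time.Since(start)
	sink = acc
	return float64(elapsed.Nanoseconds()) / float64(iters)
}

// spread returns the smallest sample, the largest sample, and the
// relative gap between them, (max-min)/min.
func spread(samples []float64) (lo, hi, rel float64) {
	if len(samples) == 0 {
		return 0, 0, 0
	}
	lo, hi = samples[0], samples[0]
	for _, s := range samples[1:] {
		if s < lo {
			lo = s
		}
		if s > hi {
			hi = s
		}
	}
	if lo > 0 {
		rel = (hi - lo) / lo
	}
	return lo, hi, rel
}

func main() {
	const runs, iters = 10, 5_000_000 // arbitrary values chosen for the sketch
	samples := make([]float64, 0, runs)
	for r := 0; r < runs; r++ {
		samples = append(samples, nsPerOp(iters))
	}
	lo, hi, rel := spread(samples)
	fmt.Printf("min %.3f ns/op, max %.3f ns/op, spread %.1f%%\n", lo, hi, rel*100)
}
```

If the spread between identical runs is comparable to the difference you are trying to explain, the difference is probably within the noise. For real measurements, the usual route is `go test -bench` with `-count` to repeat each benchmark, then comparing the results with `benchstat` rather than eyeballing single runs.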