Various small optimizations #9605
This gets Context size down to 72 bytes from 80, assuming a 12-byte header and 8-byte rounding. In the type/* benchmark this gives a saving of ~700K contexts * 8 bytes = 5.4M vs. a store allocation increase of 11K * ~90 bytes = 1M (approx.).
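The size arithmetic above can be sketched as follows. The 12-byte header and 8-byte rounding come from the description; the payload sizes (16 vs. 15 four-byte fields) are hypothetical values chosen only because they round to 80 and 72 bytes, and `ContextSizeEstimate` is an illustrative name, not something in the compiler.

```scala
object ContextSizeEstimate:
  /** Rounds `bytes` up to the next multiple of `alignment`. */
  def roundUp(bytes: Int, alignment: Int): Int =
    (bytes + alignment - 1) / alignment * alignment

  /** Estimated object size: header plus payload, rounded up to the alignment. */
  def objectSize(payloadBytes: Int, headerBytes: Int = 12, alignment: Int = 8): Int =
    roundUp(headerBytes + payloadBytes, alignment)

  // Hypothetical payloads (assumed, not taken from the PR): 16 vs. 15 four-byte fields.
  val before: Int = objectSize(16 * 4) // 12 + 64 = 76, rounded up to 80
  val after: Int  = objectSize(15 * 4) // 12 + 60 = 72, already aligned

  // Approximate counts quoted in the description.
  val savedBytes: Long = 700000L * (before - after) // context savings
  val addedBytes: Long = 11000L * 90                // extra store allocations
```

With these assumed inputs the context savings come to roughly 5.6 million bytes against roughly 1 million bytes of extra store allocation, which is in line with the approximate figures quoted above.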
It's fairly hot code, and eliminating the closure also avoids Int boxing.
The previously optimized apply function was not tail recursive since it was not final.
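A minimal sketch of why `final` matters here, using a hypothetical `Summer` class rather than the accumulator from this PR: a self-recursive method can only be turned into a loop if it cannot be overridden, so it has to be `final`, `private`, or a local method. The `@tailrec` annotation makes the compiler report an error if the rewrite is not possible.

```scala
import scala.annotation.tailrec

class Summer:
  // `final` rules out overriding, so the self-call can be compiled to a loop.
  // Removing `final` makes the @tailrec check fail at compile time.
  @tailrec
  final def sum(acc: Long, xs: List[Int]): Long = xs match
    case x :: rest => sum(acc + x, rest)
    case Nil       => acc
```

Because the recursion is compiled to a loop, calling `sum` on a list with a million elements does not grow the stack.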
test performance please
performance test scheduled: 13 job(s) in queue, 1 running.
LGTM
def fold(x: X, trees: List[Tree]): X = trees match
  case tree :: rest => fold(apply(x, tree), rest)
  case Nil => x
fold(x, trees)
In ee4b125, I'm trying:
var acc = x
var list = trees
while (!list.isEmpty) do
  acc = apply(acc, list.head)
  list = list.tail
acc
After tail-call optimization, the two versions are almost the same. Let's see if there is a difference in the benchmarks.
I don't think there will be. If anything, the tail recursive version should be faster since it does a single type test per iteration instead of one each in isEmpty, head, and tail.
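For comparison, here is a self-contained sketch of the two variants discussed in this thread. `FoldVariants`, `foldRec`, `foldLoop`, and the `op` parameter are illustrative names; `op` stands in for the accumulator's `apply` method, which the real code calls directly rather than through a function value.

```scala
import scala.annotation.tailrec

object FoldVariants:
  /** Pattern-matching version: a single match on the list per iteration. */
  def foldRec[X, T](x: X, items: List[T])(op: (X, T) => X): X =
    @tailrec def fold(acc: X, rest: List[T]): X = rest match
      case head :: tail => fold(op(acc, head), tail)
      case Nil          => acc
    fold(x, items)

  /** Explicit loop version: calls isEmpty, head, and tail on each iteration. */
  def foldLoop[X, T](x: X, items: List[T])(op: (X, T) => X): X =
    var acc = x
    var list = items
    while !list.isEmpty do
      acc = op(acc, list.head)
      list = list.tail
    acc
```

Both variants visit the elements left to right and return the same result; any performance difference between them would have to be measured, for example with the benchmark runs requested in this PR.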
In #9565, we get a slight speedup for Dotty.
Performance test finished successfully: Visit http://dotty-bench.epfl.ch/9605/ to see the changes. Benchmarks is based on merging with master (f2018f0)
test performance please
performance test scheduled: 15 job(s) in queue, 1 running.
Performance test finished successfully: Visit http://dotty-bench.epfl.ch/9605/ to see the changes. Benchmarks is based on merging with master (1431be2)
None of these will move the needle much on its own, but they don't make the code more complicated either. So it is just attention to detail.