akarnokd commented Jul 4, 2018

This PR adds special implementations of the Return, Throw, Append and Prepend operators that inline the behavior of scheduling on the ImmediateScheduler. This improves performance and reduces memory usage by not allocating an ImmediateAsyncScheduler that is thrown away immediately, and by not allocating an unnecessary IdentitySink.

Benchmark results:

BenchmarkDotNet=v0.10.14, OS=Windows 10.0.17134
Intel Core i7-4790 CPU 3.60GHz (Haswell), 1 CPU, 8 logical and 4 physical cores
Frequency=3513595 Hz, Resolution=284.6088 ns, Timer=TSC
  [Host]     : .NET Framework 4.6.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3110.0
  DefaultJob : .NET Framework 4.6.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3110.0

Before

| Method | Mean | Error | StdDev | Median | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---:|---:|---:|---:|---:|---:|---:|---:|
| Return_Immediate | 1,406.0 ns | 27.624 ns | 72.287 ns | 1,384.1 ns | 0.1736 | 0.0381 | 0.0191 | 808 B |
| Return_CurrentThread | 551.6 ns | 2.766 ns | 2.587 ns | 551.7 ns | 0.1392 | - | - | 584 B |
| Return_EventLoop | 1,109.4 ns | 6.880 ns | 6.436 ns | 1,110.2 ns | 0.1392 | - | - | 585 B |
| Return_TaskPool | 1,791.3 ns | 7.471 ns | 6.988 ns | 1,790.8 ns | 0.1793 | 0.0019 | - | 784 B |
| Return_ThreadPool | 1,556.2 ns | 7.648 ns | 7.154 ns | 1,556.1 ns | 0.1354 | 0.0076 | 0.0019 | 592 B |
| Throw_Immediate | 1,384.1 ns | 7.853 ns | 6.961 ns | 1,384.0 ns | 0.1736 | 0.0381 | 0.0191 | 808 B |
| Throw_CurrentThread | 543.8 ns | 1.886 ns | 1.764 ns | 543.5 ns | 0.1392 | - | - | 584 B |
| Throw_EventLoop | 1,097.0 ns | 7.096 ns | 6.638 ns | 1,094.1 ns | 0.1392 | - | - | 585 B |
| Throw_TaskPool | 1,794.1 ns | 5.266 ns | 4.925 ns | 1,794.3 ns | 0.1793 | 0.0019 | - | 784 B |
| Throw_ThreadPool | 1,547.0 ns | 8.790 ns | 8.222 ns | 1,548.9 ns | 0.1354 | 0.0095 | 0.0019 | 592 B |
| Prepend_Immediate | 2,957.6 ns | 14.329 ns | 12.702 ns | 2,954.3 ns | 0.2975 | 0.0648 | 0.0305 | 1485 B |
| Prepend_CurrentThread | 1,069.4 ns | 7.772 ns | 6.068 ns | 1,067.6 ns | 0.1945 | - | - | 824 B |
| Prepend_EventLoop | 1,854.6 ns | 18.271 ns | 17.090 ns | 1,850.4 ns | 0.2022 | - | - | 859 B |
| Prepend_TaskPool | 3,744.0 ns | 21.445 ns | 19.010 ns | 3,738.7 ns | 0.2937 | 0.0038 | - | 1262 B |
| Prepend_ThreadPool | 3,039.0 ns | 12.504 ns | 11.084 ns | 3,040.0 ns | 0.1984 | 0.0191 | 0.0038 | 881 B |
| Append_Immediate | 2,978.5 ns | 16.442 ns | 15.380 ns | 2,976.6 ns | 0.2975 | 0.0648 | 0.0305 | 1485 B |
| Append_CurrentThread | 1,032.0 ns | 4.986 ns | 4.664 ns | 1,030.7 ns | 0.1945 | - | - | 824 B |
| Append_EventLoop | 1,676.3 ns | 12.798 ns | 10.687 ns | 1,674.8 ns | 0.2041 | - | - | 858 B |
| Append_TaskPool | 3,596.5 ns | 6.415 ns | 5.687 ns | 3,596.8 ns | 0.2937 | 0.0038 | - | 1262 B |
| Append_ThreadPool | 2,888.5 ns | 7.143 ns | 6.682 ns | 2,888.3 ns | 0.1984 | 0.0191 | 0.0038 | 881 B |

After

| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---:|---:|---:|---:|---:|---:|---:|
| Return_Immediate | 240.1 ns | 3.267 ns | 3.056 ns | 0.0665 | - | - | 280 B |
| Return_CurrentThread | 576.5 ns | 7.347 ns | 6.872 ns | 0.1392 | - | - | 584 B |
| Return_EventLoop | 1,138.5 ns | 10.967 ns | 10.258 ns | 0.1392 | - | - | 585 B |
| Return_TaskPool | 1,825.4 ns | 8.272 ns | 7.333 ns | 0.1793 | 0.0019 | - | 784 B |
| Return_ThreadPool | 1,589.6 ns | 6.414 ns | 5.999 ns | 0.1354 | 0.0095 | 0.0019 | 592 B |
| Throw_Immediate | 230.9 ns | 1.249 ns | 1.169 ns | 0.0663 | - | - | 280 B |
| Throw_CurrentThread | 556.9 ns | 4.745 ns | 4.207 ns | 0.1392 | - | - | 584 B |
| Throw_EventLoop | 1,123.8 ns | 9.712 ns | 9.085 ns | 0.1392 | - | - | 585 B |
| Throw_TaskPool | 1,814.4 ns | 8.434 ns | 7.889 ns | 0.1793 | 0.0019 | - | 784 B |
| Throw_ThreadPool | 1,595.0 ns | 8.309 ns | 7.366 ns | 0.1354 | 0.0095 | 0.0019 | 592 B |
| Prepend_Immediate | 404.3 ns | 2.493 ns | 2.210 ns | 0.0801 | - | - | 336 B |
| Prepend_CurrentThread | 1,055.8 ns | 4.180 ns | 3.910 ns | 0.1945 | - | - | 824 B |
| Prepend_EventLoop | 1,898.5 ns | 12.197 ns | 10.185 ns | 0.2041 | - | - | 859 B |
| Prepend_TaskPool | 3,716.6 ns | 16.270 ns | 12.702 ns | 0.2937 | 0.0038 | - | 1262 B |
| Prepend_ThreadPool | 3,113.0 ns | 8.101 ns | 7.181 ns | 0.1984 | 0.0191 | 0.0038 | 880 B |
| Append_Immediate | 402.9 ns | 3.107 ns | 2.906 ns | 0.0801 | - | - | 336 B |
| Append_CurrentThread | 1,045.1 ns | 5.246 ns | 4.650 ns | 0.1945 | - | - | 824 B |
| Append_EventLoop | 1,710.2 ns | 14.968 ns | 12.499 ns | 0.2041 | - | - | 858 B |
| Append_TaskPool | 3,711.2 ns | 11.675 ns | 9.749 ns | 0.2937 | 0.0038 | - | 1263 B |
| Append_ThreadPool | 2,938.2 ns | 5.124 ns | 4.543 ns | 0.1984 | 0.0191 | 0.0038 | 881 B |

Relevant changes

| Method | Mean | Error | StdDev | Median | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---:|---:|---:|---:|---:|---:|---:|---:|
| Return_Immediate | 1,406.0 ns | 27.624 ns | 72.287 ns | 1,384.1 ns | 0.1736 | 0.0381 | 0.0191 | 808 B |
| Return_Immediate_After | 240.1 ns | 3.267 ns | 3.056 ns | - | 0.0665 | - | - | 280 B |
| Throw_Immediate | 1,384.1 ns | 7.853 ns | 6.961 ns | 1,384.0 ns | 0.1736 | 0.0381 | 0.0191 | 808 B |
| Throw_Immediate_After | 230.9 ns | 1.249 ns | 1.169 ns | - | 0.0663 | - | - | 280 B |
| Prepend_Immediate | 2,957.6 ns | 14.329 ns | 12.702 ns | 2,954.3 ns | 0.2975 | 0.0648 | 0.0305 | 1485 B |
| Prepend_Immediate_After | 404.3 ns | 2.493 ns | 2.210 ns | - | 0.0801 | - | - | 336 B |
| Append_Immediate | 2,978.5 ns | 16.442 ns | 15.380 ns | 2,976.6 ns | 0.2975 | 0.0648 | 0.0305 | 1485 B |
| Append_Immediate_After | 402.9 ns | 3.107 ns | 2.906 ns | - | 0.0801 | - | - | 336 B |
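
As a rough reading of the rows above, the ratios below are computed here from the listed means and allocated bytes. They are not part of the original benchmark output and are rounded to one decimal place.

```math
\begin{aligned}
\text{Return: } & \frac{1406.0\ \text{ns}}{240.1\ \text{ns}} \approx 5.9\times \text{ faster}, & \frac{808\ \text{B}}{280\ \text{B}} &\approx 2.9\times \text{ less memory} \\
\text{Throw: } & \frac{1384.1\ \text{ns}}{230.9\ \text{ns}} \approx 6.0\times \text{ faster}, & \frac{808\ \text{B}}{280\ \text{B}} &\approx 2.9\times \text{ less memory} \\
\text{Prepend: } & \frac{2957.6\ \text{ns}}{404.3\ \text{ns}} \approx 7.3\times \text{ faster}, & \frac{1485\ \text{B}}{336\ \text{B}} &\approx 4.4\times \text{ less memory} \\
\text{Append: } & \frac{2978.5\ \text{ns}}{402.9\ \text{ns}} \approx 7.4\times \text{ faster}, & \frac{1485\ \text{B}}{336\ \text{B}} &\approx 4.4\times \text{ less memory}
\end{aligned}
```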

@danielcweber

I'm fine with inlining it into Return and Throw because they are very basic, and I would merge that right away, but for Append and Prepend, I really don't know. We want to lower the barriers for writing new operators, this is just a bad sign for new contributors if it would now take separate classes for special schedulers. Also, there is a lot of code duplication in AppendPrepend. Can't ScheduleAction do the check on Immediate? The performance penalty is ok, it would just be a reference comparison.

danielcweber commented Jul 4, 2018

Or at least, see whether the duplicated code can be pushed down to a base class.

akarnokd commented Jul 4, 2018

> Can't ScheduleAction do the check on Immediate?

That's a run-time check, whereas this PR performs the check once at assembly time and picks the optimized version, which from then on remains the same for all further subscriptions.

> Can't ScheduleAction do the check on Immediate? The performance penalty is ok, it would just be a reference comparison.

It's not just the check, but also any other code and fields that can be eliminated in a dedicated implementation.

> We want to lower the barriers for writing new operators, this is just a bad sign for new contributors if it would now take separate classes for special schedulers.

Writing new operators is one thing, optimizing them is another.
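
For readers following the thread, the difference between a run-time check and an assembly-time check could look roughly like the sketch below. It is not the code from this PR and does not use the Rx.NET types: `IMiniScheduler`, `MiniImmediateScheduler`, `ReturnScheduled<T>`, `ReturnImmediate<T>` and `MiniObservable` are hypothetical names made up for illustration, and only `System.IObservable<T>` and `System.IObserver<T>` are real. The factory does the reference comparison once and returns a dedicated class that has no scheduler field and allocates no closure per subscription.

```csharp
using System;

// Hypothetical stand-in for a scheduler abstraction; NOT the Rx.NET IScheduler.
public interface IMiniScheduler
{
    IDisposable Schedule(Action action);
}

public sealed class NoopDisposable : IDisposable
{
    public static readonly NoopDisposable Instance = new NoopDisposable();
    public void Dispose() { }
}

// Hypothetical immediate scheduler: runs the action synchronously on the caller.
public sealed class MiniImmediateScheduler : IMiniScheduler
{
    public static readonly MiniImmediateScheduler Instance = new MiniImmediateScheduler();

    public IDisposable Schedule(Action action)
    {
        action();
        return NoopDisposable.Instance;
    }
}

// General version: every Subscribe goes through the scheduler and allocates a closure.
public sealed class ReturnScheduled<T> : IObservable<T>
{
    private readonly T _value;
    private readonly IMiniScheduler _scheduler;

    public ReturnScheduled(T value, IMiniScheduler scheduler)
    {
        _value = value;
        _scheduler = scheduler;
    }

    public IDisposable Subscribe(IObserver<T> observer)
    {
        return _scheduler.Schedule(() =>
        {
            observer.OnNext(_value);
            observer.OnCompleted();
        });
    }
}

// Dedicated version: no scheduler field, no closure, signals emitted inline.
public sealed class ReturnImmediate<T> : IObservable<T>
{
    private readonly T _value;

    public ReturnImmediate(T value) { _value = value; }

    public IDisposable Subscribe(IObserver<T> observer)
    {
        observer.OnNext(_value);
        observer.OnCompleted();
        return NoopDisposable.Instance;
    }
}

public static class MiniObservable
{
    // The reference comparison runs once, when the sequence is assembled,
    // not on every Subscribe call.
    public static IObservable<T> Return<T>(T value, IMiniScheduler scheduler)
    {
        if (ReferenceEquals(scheduler, MiniImmediateScheduler.Instance))
        {
            return new ReturnImmediate<T>(value);
        }
        return new ReturnScheduled<T>(value, scheduler);
    }
}

public static class Program
{
    public static void Main()
    {
        var source = MiniObservable.Return(42, MiniImmediateScheduler.Instance);
        // The optimized class was picked at assembly time.
        Console.WriteLine(source is ReturnImmediate<int>);
    }
}
```

The trade-off discussed in the thread is visible here: the dedicated class duplicates the emission logic of the general one, in exchange for a smaller object and a shorter subscription path.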

danielcweber merged commit c3844b4 into dotnet:master on Jul 4, 2018