Initial commit benchmark2 (.not core) #2
base: master
@Anderman wow, there are some cool tricks in here. I will probably wait to merge until I have some time to do refactoring and understand your changes. Could you share some new results for the latest changes you have made? Perhaps in the original clr thread? Will you be doing more work on this?
@Anderman I gave this a quick look. The tricks to call assembly code inline in C# are very impressive. I do have one concern and that is the use of
One way to avoid some issues with
@nietras That could be a problem, but I don't see why this depends on the version of Windows. There is another way to get the same counters: http://stackoverflow.com/questions/26618991/measure-cpu-cycles-of-a-function-call/26619409#26619409
this is the same as RDTSC see On Aug 29, 2016 21:50, "Thom Kiesewetter" [email protected] wrote:
OK, I know that. That's why some people say to use the clock counter instead of nanoseconds. I found a document from a professor who did a lot of CPU performance testing over many years. That's also why warm-up tests are needed. I do 50 cycles to minimize the overhead, and each test is 100 ms. So there will be a lot of tests, and some of them will not be influenced by the system.
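The batching idea in the comment above (warm up first, time a batch of 50 calls at once so the timer overhead is spread out, keep sampling for a fixed window, and rely on the fact that some samples escape system noise) can be sketched roughly as follows. This is only an illustration in Python, not the C# code from the PR: it uses `time.perf_counter_ns` instead of RDTSC, and the function name `measure_min_ns` and its parameters are hypothetical.

```python
import time


def measure_min_ns(func, batch=50, window_ns=100_000_000, warmup=3):
    """Return (best nanoseconds per call, number of samples taken).

    Hypothetical helper: batch=50 and window_ns=100 ms mirror the numbers
    quoted in the thread; perf_counter_ns stands in for a cycle counter.
    """
    # Warm-up runs so one-time costs (caches, lazy initialization) are not measured.
    for _ in range(warmup):
        func()

    samples = []
    deadline = time.perf_counter_ns() + window_ns
    while True:
        start = time.perf_counter_ns()
        for _ in range(batch):
            func()
        elapsed = time.perf_counter_ns() - start
        # Per-call time; reading the timer is amortized over the whole batch.
        samples.append(elapsed / batch)
        if time.perf_counter_ns() >= deadline:
            break

    # The minimum is the sample least disturbed by other activity on the machine.
    return min(samples), len(samples)
```

Taking the minimum rather than the mean is one common way to act on "some are not influenced by the system"; whether the PR does exactly this is not stated in the thread.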
Are you referring to Agner Fog (http://www.agner.org/optimize/)?
Yes, but his use cases are different. As far as I can tell, he is profiling short native code, not a managed runtime. Additionally, he can call
Does that mean that each measurement spans ~100 ms? Perhaps I should ask differently: why did you want to change to using I am not saying that You probably already have seen what In any case, I think your approach with specific benchmarks for memory copying makes a lot of sense.
I don't see what the managed runtime has to do with our memory tests. QCalls and P/Invoke have some overhead, but it is only a number of extra instructions.
Run a release build with F5. When the exception is shown, open it with VS and step out four times (Shift+F11).
The tick counter on the stopwatch is 100-200 times slower than the RDTSC counter. I didn't get repeatable results when I used the stopwatch. The test must run longer because the counter runs at a slower speed, but then other code running on my computer has the chance to influence the results. Total clock cycles of the code = 8*50 = 400, so the clock timer will measure 600 clock cycles. With the stopwatch I get only 2 or 3 ticks. So in 100 ms the test will run about 100 ms / 600 cycles, which is about 100 ms / 300 ns = 333333 times. I hope this makes it clearer why I took this solution.
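The arithmetic in the comment above can be checked with a few lines of Python. The 2 GHz clock frequency is an assumption inferred from the statement that 600 cycles take about 300 ns; it is not given in the thread, and the function names are hypothetical.

```python
# Values quoted in the comment; CLOCK_HZ is an inferred assumption
# (600 cycles in about 300 ns implies roughly 2 GHz).
CYCLES_PER_RUN = 600
CLOCK_HZ = 2_000_000_000
WINDOW_NS = 100_000_000  # one 100 ms test


def run_duration_ns(cycles, clock_hz):
    """Duration of one measured run in whole nanoseconds."""
    return cycles * 1_000_000_000 // clock_hz


def runs_per_window(window_ns, run_ns):
    """How many complete runs fit into one test window."""
    return window_ns // run_ns


run_ns = run_duration_ns(CYCLES_PER_RUN, CLOCK_HZ)
print(run_ns)                              # 300
print(runs_per_window(WINDOW_NS, run_ns))  # 333333
```

At 2 or 3 stopwatch ticks per run, a single tick of error is a large fraction of the measurement, which is consistent with the non-repeatable results described above.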
Maybe interesting
Yes, it has an optimized copy function mentioned in one of the reports. It could be used for inspiration, although it uses aligned instructions etc. that are not available in .NET.
@Anderman there is a great (and long) blog post about timers and e.g. all the issues with Including latency and resolution issues.
Simple benchmark. Allows faster testing and makes it easy to generate an Excel sheet.
Randomized/cached testing is still manual.