Skip to content

Performance Improvements #391

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Apr 25, 2016
Merged

Conversation

amadeus
Copy link
Collaborator

@amadeus amadeus commented Apr 22, 2016

I recently learned a bit about the :syntime command, and decided to use it to profile our syntax file. I've found some things that I think we can really optimize and I actually got some fairly significant optimizations just in this PR.

I am using a fully compiled version of PhaserJS for these benchmarks. I think it's safe to say it's a huge file and I considered it a good way to benchmark performance. As an aside, if there are files you find performance to be pretty bad, you should link it here so I can run these benchmarks and look for other optimizations.

The steps to reproduce the tests are as follows

  1. Open the phaser.js file
  2. Press gg to scroll to the top (I would additionally do :set nowrap as well)
  3. Start the syntax profiler: :syntime on
  4. Scroll quickly to around line 10,000 (I could've gone further, but I found it to be a pretty healthy test time)
  5. Get the profiles results using :syntime report

Here are before and after results for the most intensive selectors:

Before

TOTAL      COUNT  MATCH   SLOWEST     AVERAGE   NAME               PATTERN
1.617329   62434  0       0.002201    0.000026  jsGlobalObjects    \%(Intl\.\)\@<=\(Collator\|DateTimeFormat\|
1.047933   19970  2490    0.002397    0.000052  jsFuncAssignExpr   \v%(%([a-zA-Z_$]\k*\.)*[a-zA-Z_$]\k*\s*\=\s
1.028959   147679 93522   0.002242    0.000007  jsOperator         \(!\||\|&\|+\|-\|<\|>\|=\|%\|\/\|*\|\~\|\^\
0.977351   69643  10656   0.002030    0.000014  jsRegexpString     \(\(\(return\|case\)\s\+\)\@<=\|\(\([)\]"']
0.932609   62434  0       0.002367    0.000015  jsArrowFuncArgs    \(\k\)\+\s*\(=>\)\@=
0.930731   109058 49930   0.003152    0.000009  jsFuncCall         \k\+\%(\s*(\)\@=
0.701920   97547  35215   0.001612    0.000007  jsAssignmentExpr   \v%([a-zA-Z_$]\k*\.)*[a-zA-Z_$]\k*\s*\=\>@!
0.513394   192983 163196   0.001123   0.000003  jsNoise            \%(:\|,\|\;\|\.\)
0.439069   62434  0       0.001627    0.000007  jsTaggedTemplate   \k\+\(\(\n\|\s\)\+\)\?`
0.398041   103527 40664   0.001752    0.000004  jsNumber           \<-\=\d\+\(L\|[eE][+-]\=\d\+\)\=\>\|\<0[xX]
0.205140   63112  687     0.001655    0.000003  jsFloat            \<-\=\%(\d\+\.\d\+\|\d\+\.\|\.\d\+\)\%([eE]
0.189091   40580  23100   0.002135    0.000005  jsAssignExpIdent   \v[a-zA-Z_$]\k*\ze%(\s*\=)
0.140040   35583  3553    0.001768    0.000004  jsDocTags          @\(abstract\|access\|accessor\|author\|clas
0.132321   20175  1427    0.001212    0.000007  jsObjectKey        \<[a-zA-Z_$][0-9a-zA-Z_$]*\>\(\s*:\)\@=
0.089218   63451  2689    0.001222    0.000001  jsLineComment      ^\s*\/\/       

After

TOTAL      COUNT  MATCH   SLOWEST     AVERAGE   NAME               PATTERN
0.887965   68688  12559   0.002036    0.000013  jsRegexpString     \%(\%(\%(return\|case\)\s\+\)\@<=\|\%(\%([)\]"']\|\d\|\w
0.855625   109576 53343   0.002364    0.000008  jsFuncCall         \k\+\%(\s*(\)\@=
0.671938   59407  0       0.002029    0.000011  jsArrowFuncArgs    \k\+\s*\%(=>\)\@=
0.379020   137695 86113   0.001926    0.000003  jsOperator         [\!\|\&\+\-\<\>\=\%\/\*\~\^]\{1}
0.376395   101537 41691   0.001237    0.000004  jsNumber           \<-\=\d\+\(L\|[eE][+-]\=\d\+\)\=\>\|\<0[xX]\x\+\>
0.295642   59407  0       0.002014    0.000005  jsTaggedTemplate   \k\+\%([\n\s]\+\)\?`
0.284822   181546 152855   0.001993   0.000002  jsNoise            [:,\;\.]\{1}
0.194653   60391  991     0.001991    0.000003  jsFloat            \<-\=\%(\d\+\.\d\+\|\d\+\.\|\.\d\+\)\%([eE][+-]\=\d\+\)\
0.139054   21099  1720    0.002092    0.000007  jsObjectKey        \<[a-zA-Z_$][0-9a-zA-Z_$]*\>\(\s*:\)\@=
0.128952   35864  3521    0.002004    0.000004  jsDocTags          @\(abstract\|access\|accessor\|author\|classdesc\|consta
0.093106   60401  2413    0.001991    0.000002  jsLineComment      ^\s*\/\/

In summary, I found a few rules that seem to be just hanging around not matching anything (probably from a bygone era of function matching) that could easily be removed and also found a few ways to optimize a few of the hotter important contenders.

I am open to more optimization suggestions if you guys have them! I will let this sit for a bit though, to ensure I haven't really broken anything, but so far everything seems very solid.

@amadeus
Copy link
Collaborator Author

amadeus commented Apr 22, 2016

There's actually a very potentially valid reason to remove the jsFuncCall selector and perhaps figure out a way to re-jigger the jsRegexpString selector.

I know the jsFuncCall was something that was a pretty hot topic a while back, so I didn't remove it, but it does come in as our nastiest selector with very little utility.

Perhaps we hide it behind some global variable, so it's off by default and the people that want it can turn it on?

@amadeus amadeus force-pushed the performance-improvements branch 2 times, most recently from dc82a3d to 3cb1ea8 Compare April 22, 2016 22:16
@davidchambers
Copy link
Collaborator

I had no idea about :syntime. Very cool! Vim is so deep.

Only regexp literals are noticeably slow for me. I find there's sometimes a lag when I attempt to interact with text following a regexp literal. Does anyone else experience this?

@amadeus
Copy link
Collaborator Author

amadeus commented Apr 22, 2016

I've noticed that long lines tend to lag syntax highlighting, but generally the regexes I've used in JS have been pretty small and I have never noticed any serious performance issues, although I think it can be confirmed by the fact that jsRegexpString is a always at the top of the regex checks.

This PR might help a bit with this? I think the reason that jsRegexpString is so gnarly is that massive look behind. I am a bit afraid to mess with it more since I know it has incorporated a lot of various fixes over the years for bizarre regex edge cases.

@amadeus amadeus force-pushed the performance-improvements branch from 3cb1ea8 to 1ee5815 Compare April 23, 2016 22:27
@amadeus amadeus mentioned this pull request Apr 23, 2016
@amadeus amadeus force-pushed the performance-improvements branch 2 times, most recently from 01615e4 to a4005cc Compare April 24, 2016 18:04
@bounceme
Copy link
Collaborator

bounceme commented Apr 24, 2016

Little improvements could be made with trying to replace all the '@=>' and '@<!'. I've got no other suggestions though,I don't know anything about syntax files

@amadeus
Copy link
Collaborator Author

amadeus commented Apr 24, 2016

@bounceme do you know what we could replace the look-behinds with?

@bounceme
Copy link
Collaborator

according to ':h @<=' and ':h @<!' there are a few methods more performant like '\zs' and limiting the distance of the look behind in 'bytes'

@amadeus amadeus force-pushed the performance-improvements branch 2 times, most recently from cba3f07 to 678bb6f Compare April 24, 2016 20:29
@amadeus
Copy link
Collaborator Author

amadeus commented Apr 24, 2016

@bounceme I did end up playing around with the \zs stuff, but I could never get it to work properly. If you'd like to take a stab at it, feel free.

I did add some limiters to the jsRegexpString, specifically I added a lookbehind limit of 50 bytes. I am not sure if there's a value we should use, but at least at 50 it's probably safe for most scenarios where someone might want to do some funky alignment thing? It did also provide a nice incremental performance improvement in my tests.

@amadeus amadeus force-pushed the performance-improvements branch from b0a741f to c2bc4c7 Compare April 25, 2016 18:33
amadeus added 8 commits April 25, 2016 11:33
Not using groups seems to help a bunch with performance
I think these matchers where for functions, and from another era of
project ownership. They added a TON of performance overhead, and I could
never replicate a scenario where they matched.
We had some special matching for Intl and certain submethods. It seems
needless to do this for the following reasons:

* The look behind added massive overhead (especially since the rule was
  not contained)
* We never do any special matching for any other global objects, so why
  these?
Shaved off some average time by removing the groups, which weren't
really needed to begin with
The optimizations it provides really doesn't seem to help much and
usually just results in weird syntax matching errors unless you run
`syntax sync fromstart`
Another match that turns up pretty high in the list of total time spent
under `syntime`
@amadeus amadeus merged commit 09fde19 into pangloss:develop Apr 25, 2016
@amadeus amadeus deleted the performance-improvements branch April 25, 2016 18:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants