Make sphinx-lint faster #76
Comments
See also python/cpython#110617; this seems to be partly a problem in pre-commit.

Very nice! python/cpython#110617 sped things up on my laptop from around 22 seconds to around 17 seconds. That still feels painfully slow for local usage, though; let's see if we can improve on it even further ;)

Also: we should document somewhere that people are recommended to add
I did some profiling of sphinx-lint; here are the results (sorted by which functions have the most cumulative time spent in them).

Script I used for profiling (it has to be run from a directory that includes a CPython clone):

```python
import cProfile
import sys  # present in the original snippet, though unused below

from sphinxlint.__main__ import main

# A handful of the larger files in the CPython documentation.
files = [f"cpython/Doc/library/{module}.rst" for module in ("os", "typing", "sqlite3", "stdtypes", "argparse", "enum")]
files.append("cpython/Doc/reference/datamodel.rst")

# Build the call as a string for cProfile.run(); "foo" stands in for argv[0].
cmd = f'main({["foo"] + files})'
cProfile.run(cmd, sort="cumulative")
```
Similarly, you could profile with
This creates a JSON profile file that can be loaded into the online viewer (https://www.speedscope.app/).
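As one possible way to get such a file (a sketch, not necessarily the tool that comment had in mind), pyinstrument, which is also mentioned later in this thread, can render a profile in speedscope's JSON format, assuming a pyinstrument version recent enough to ship the speedscope renderer:

```python
# Sketch: profile a sphinx-lint run with pyinstrument and export speedscope JSON.
# Assumes pyinstrument provides SpeedscopeRenderer and that the file paths exist
# (same CPython-clone layout as the cProfile script above).
from pyinstrument import Profiler
from pyinstrument.renderers import SpeedscopeRenderer

from sphinxlint.__main__ import main

files = [f"cpython/Doc/library/{module}.rst" for module in ("os", "typing")]

profiler = Profiler()
profiler.start()
main(["foo"] + files)  # "foo" is a placeholder program name, as in the script above
profiler.stop()

# Write a JSON profile that can be dragged into https://www.speedscope.app/
with open("sphinx-lint.speedscope.json", "w") as f:
    f.write(profiler.output(SpeedscopeRenderer()))
```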
I have a patch locally that seems to provide a 24% speedup just by pre-compiling some regexes in
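The patch itself isn't shown here; as a general illustration of the technique (not the actual sphinx-lint patch, and using a made-up pattern), moving `re.compile()` out of the per-line hot path avoids repeating the pattern lookup on every call:

```python
import re

# Hypothetical pattern, for illustration only -- not one of sphinx-lint's real checkers.
ROLE_WITH_NO_BACKTICKS = re.compile(r":[a-z]+:[^`]")  # compiled once, at import time


def check_line_slow(line: str) -> bool:
    # re.match() with a string pattern re-does a cache lookup on every call.
    return re.match(r":[a-z]+:[^`]", line) is not None


def check_line_fast(line: str) -> bool:
    # Using the precompiled pattern skips that lookup in the hot loop.
    return ROLE_WITH_NO_BACKTICKS.match(line) is not None
```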
Enjoy dinner! And sorry for stepping on your toes, but I also went in the same direction after a quick profiling (using pyinstrument) and already created #77. I don't mind if yours gets picked up though, whatever makes the world faster.

I can't see anything else that provides an easy speedup here (and the impact of these PRs combined is great!), so I'll close this once #76 (comment) is done. Doing #76 (comment) first requires us to figure out which is faster, though: adding
As we discussed briefly at the sprint, I think that

To determine the optimal cache size, you could inspect the cache info of each cache to see the number of hits and misses, or profile the code, look at the number of calls for each cached function, and divide by the number of documents to get an idea of how many items each document contributes.

As we were leaving, I was about to suggest that you test it on these repos, to get an idea of both real-world performance improvements and real-world file sizes that you can use to decide the cache size.

cc @AlexWaygood, @hugovk
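As a sketch of that inspection step (the cached function and the cache size below are made up for illustration, not taken from sphinx-lint's code), `functools.lru_cache` exposes the hit/miss counters through `cache_info()`:

```python
from functools import lru_cache


# Hypothetical cached helper -- stands in for whichever functions end up cached.
@lru_cache(maxsize=256)
def paragraph_kind(paragraph: str) -> str:
    return "code-block" if paragraph.lstrip().startswith(".. code-block::") else "prose"


# ... run the checkers over a few real-world documents, then:
info = paragraph_kind.cache_info()
print(info)  # CacheInfo(hits=..., misses=..., maxsize=256, currsize=...)

# A high miss rate with currsize == maxsize suggests the cache is too small;
# dividing total calls by the number of documents processed gives a rough
# per-document working-set size on which to base maxsize.
total_calls = info.hits + info.misses
hit_rate = info.hits / total_calls if total_calls else 0.0
print(f"hit rate: {hit_rate:.1%}")
```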
After all the optimizations,
We're running sphinx-lint as part of pre-commit at CPython, and it's a really useful check. However, it's by far the slowest check if you run `pre-commit run --all-files` locally. It would be great if we could speed it up somehow! (Note: I haven't looked into this at all yet; I don't know if there are any speedups that would be easily doable.)
locally. It would be great if we could speed it up somehow! (Note: I haven't looked into this at all yet; I don't know if there are any speedups that would be easily doable.)The text was updated successfully, but these errors were encountered: