[clang-doc] Clang-Doc is extremely slow in generating LLVM docs #96570
Comments
The map-reduce architecture was intended to improve scalability. TBH, I'm a bit surprised that the reduction step, which is doing far less work, is taking so long and runs so inefficiently. While you raise some valid points w.r.t. the visitor pattern, I'm not completely convinced that will improve things all that much. Is it possible to narrow down what in the reduction step is taking so long, or to determine what redundant work is being done? As mentioned before, I have a suspicion that 1) we've chosen a naive data structure somewhere in the reduction stage, or 2) we're recomputing things many times. For example, I have a suspicion that we generate documentation
I think the majority of the time in the reduction phase is spent decoding the bitcode produced during the mapping phase. Perhaps there's a faster way to decode it. I also noticed that clang-doc is generating documentation for anonymous namespaces, which it shouldn't do by default. Skipping those could also be an easy way of improving performance.
The map and reduce phases are implemented as in-memory data structures. However, the design allows for extensions to perform the mapping and reducing at scale. The map phase is performed in the Clang in-memory tool framework, while the reduce phase is performed using an in-memory index. Both the framework and the index inherit from more generic classes, and thus can be swapped out in-place with distributed implementations of the tool framework and index, respectively. If we merged the two phases, we would lose that option. I agree with @ilovepi that we should instead investigate why the reduce phase is so expensive.
Are you able to see what bitcode files it generated and is trying to read? I know these are in-memory structures, but perhaps we can log something and arrive at a similar result. My big worry is that we're doing something naive and processing the same data multiple times.
I could try modifying clang-doc to emit the intermediate bitcode to the filesystem and check whether any of the files are identical, to see if we're doing duplicate work.
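One way to run the duplicate check described above, as a minimal sketch: treat each emitted bitcode blob as an opaque byte string and count exact repeats. The function name `countDuplicateBlobs` is hypothetical and not part of clang-doc; the sketch also assumes the blobs have already been read into memory.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper, not a clang-doc API. Each string stands in for the
// raw bytes of one intermediate bitcode blob. Returns how many blobs are
// byte-for-byte repeats of a blob seen earlier in the list.
std::size_t countDuplicateBlobs(const std::vector<std::string> &Blobs) {
  std::unordered_set<std::string> Seen;
  std::size_t Duplicates = 0;
  for (const std::string &Blob : Blobs) {
    // insert() reports whether the blob was new; a failed insert is a repeat.
    if (!Seen.insert(Blob).second)
      ++Duplicates;
  }
  return Duplicates;
}
```

A non-zero count would only show that identical records are being emitted more than once; it would not by itself say where in the mapping phase they come from.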
I'm looking at the code in I also see heap allocations in the reduce step, via Lastly, I'm curious about the sizes of |
It appears we're doing duplicate work in the mapping phase of clang-doc. Even for small projects like the basic e2e project, clang-doc emits duplicate IR bitcode.
@ilovepi, @petrhosek
Currently clang-doc is extremely slow at producing the LLVM docs. It consumes a ton of memory and kills my machine every time I attempt to generate them. This is extremely unusual, since doxygen and other equivalent tools such as hdoc are able to produce the LLVM docs with significantly less memory and time. This presents a tremendous challenge if we want clang-doc to eventually replace the doxygen-generated docs for LLVM.
During my investigation I found that clang-doc performs similarly to equivalent tools on small projects, but on large projects its performance deviates drastically.
I was able to use the LLVM TimeTraceProfiler to get a profile of clang-doc while it was attempting to generate the LLVM documentation.
Here's the tracing JSON file: https://drive.google.com/file/d/1GSO-eRKiSv_g4yxtHTXFuz1yi7SXpeMH/view?usp=sharing
The majority of the runtime appears to be dominated by the reduction phase, with reading the bitcode taking up a lot of that time.
I think this problem is due to the way clang-doc is architected.
Currently clang-doc uses RecursiveASTVisitor to visit every Namespace/Record/Method/Function/Typedef/TypeAlias Decl in the compilation database. Every time clang-doc visits a Decl, it saves an Info for the Decl itself, creates an Info for the parent Decl, and inserts the Decl's Info into the parent's.
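A toy model of the scheme just described may make the cost easier to see. It is a sketch under simplifying assumptions: `Info`, `mapDecl`, and `mergeFragments` are hypothetical stand-ins rather than clang-doc's real types or functions, and the bitcode serialization between the two steps is left out.

```cpp
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for a clang-doc Info; the field names are hypothetical.
struct Info {
  std::string ID;                    // stand-in for the symbol's unique ID
  std::vector<std::string> Children; // IDs of child declarations
};

// "Map" step for one declaration: emit its own Info plus a parent stub that
// lists only this declaration as a child.
std::vector<Info> mapDecl(const std::string &ID, const std::string &ParentID) {
  return {Info{ID, {}}, Info{ParentID, {ID}}};
}

// "Reduce" step: group fragments by ID and concatenate their child lists.
std::map<std::string, Info> mergeFragments(const std::vector<Info> &Fragments) {
  std::map<std::string, Info> Merged;
  for (const Info &F : Fragments) {
    Info &Out = Merged[F.ID];
    Out.ID = F.ID;
    Out.Children.insert(Out.Children.end(), F.Children.begin(),
                        F.Children.end());
  }
  return Merged;
}
```

In this model a namespace with N direct members yields N separate parent stubs, so the reduce step has to read and merge N fragments just to rebuild one child list.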
Instead of doing this, I think we should compute all child records, typedefs, etc. whenever clang-doc's RecursiveASTVisitor visits a namespace or record, rather than attempting a merge afterwards. This is the conventional approach taken by most other doc generators that use libtooling to generate documentation.
This approach has the advantage of removing the reduction step that gathers all the info needed for a Decl, which should speed up clang-doc.
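The proposal can be sketched on a toy declaration tree. This is only an illustration: `Decl`, `Info`, and `collect` are hypothetical names, and the tree is a stand-in for the Clang AST rather than its real API.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy declaration tree; a stand-in for the Clang AST, not its real API.
struct Decl {
  std::string ID;
  std::vector<Decl> Members; // direct child declarations
};

// Simplified stand-in for a clang-doc Info; the field names are hypothetical.
struct Info {
  std::string ID;
  std::vector<std::string> Children;
};

// Visit a declaration once, record all of its direct members at that point,
// then recurse into them. No parent stubs are emitted for later merging.
void collect(const Decl &D, std::map<std::string, Info> &Out) {
  Info I{D.ID, {}};
  for (const Decl &M : D.Members)
    I.Children.push_back(M.ID);
  Out.emplace(D.ID, std::move(I));
  for (const Decl &M : D.Members)
    collect(M, Out);
}
```

Within a single tree each Info is complete as soon as its declaration has been visited. One caveat the sketch does not cover: a namespace reopened in several translation units would presumably still need its per-file results combined in some way.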