
[clang-doc] Clang-Doc is extremely slow in generating LLVM docs #96570

Closed

PeterChou1 opened this issue Jun 24, 2024 · 7 comments

Comments

@PeterChou1
Contributor

PeterChou1 commented Jun 24, 2024

@ilovepi, @petrhosek

Currently clang-doc is extremely slow at producing LLVM docs; it consumes a ton of memory and kills my machine every time I attempt to generate them. This is very unusual, since doxygen and other equivalent tools such as hdoc are able to produce LLVM docs with significantly less memory and in far less time. This presents a tremendous challenge if we want clang-doc to eventually replace the doxygen-generated docs for LLVM.

During my investigation I found that clang-doc's performance on small projects is similar to equivalent tools, but on large projects its performance deviates drastically.

I was able to use LLVM's TimeTraceProfiler to capture a profile of clang-doc while it was generating the LLVM documentation.

here's the tracing json file: https://drive.google.com/file/d/1GSO-eRKiSv_g4yxtHTXFuz1yi7SXpeMH/view?usp=sharing
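(For reference, a minimal sketch of how a tool's phases can be instrumented with LLVM's TimeTraceProfiler from `llvm/Support/TimeProfiler.h` to produce a trace like this one. The scope names, granularity, and output path below are illustrative, not clang-doc's actual code.)

```cpp
#include "llvm/Support/Error.h"
#include "llvm/Support/TimeProfiler.h"

void runPhases() {
  {
    llvm::TimeTraceScope Scope("Map");    // times the mapping phase
    // ... run AST visitation / bitcode emission ...
  }
  {
    llvm::TimeTraceScope Scope("Reduce"); // times the reduction phase
    // ... read bitcode and merge Infos ...
  }
}

int main() {
  llvm::timeTraceProfilerInitialize(/*TimeTraceGranularity=*/500, "clang-doc");
  runPhases();
  // Writes a Chrome-trace JSON that chrome://tracing or Perfetto can open.
  if (llvm::Error E =
          llvm::timeTraceProfilerWrite("clang-doc.time-trace.json", "-"))
    llvm::consumeError(std::move(E));
  llvm::timeTraceProfilerCleanup();
  return 0;
}
```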

[screenshot of the time trace]

The majority of the runtime appears to be dominated by the reduction phase, with reading the bitcode accounting for a lot of that time. I think this problem is due to the way clang-doc is architected.

Currently clang-doc uses a RecursiveASTVisitor to visit every Namespace/Record/Method/Function/Typedef/TypeAlias Decl in the compilation database. Every time clang-doc visits a Decl, it saves an Info for the Decl itself, creates an Info for the parent Decl, and inserts the child into it.

Instead of doing this, I think we should compute all child records, typedefs, etc. whenever clang-doc's RecursiveASTVisitor visits a namespace or record, rather than attempting to merge everything afterwards. This is the conventional approach taken by most other doc generators that use libtooling to generate documentation.
This approach has the advantage of removing the reduction step needed to gather all the info for a Decl, which should speed up clang-doc.
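For illustration, a minimal sketch of what that single-pass approach could look like. `RecordInfo` and `FunctionInfo` here are simplified stand-ins, not clang-doc's real Info types.

```cpp
#include "clang/AST/DeclCXX.h"
#include "clang/AST/RecursiveASTVisitor.h"
#include "llvm/Support/Casting.h"
#include <string>
#include <vector>

struct FunctionInfo { std::string Name; };
struct RecordInfo {
  std::string Name;
  std::vector<FunctionInfo> Methods;
};

class DirectVisitor : public clang::RecursiveASTVisitor<DirectVisitor> {
public:
  std::vector<RecordInfo> Records;

  bool VisitCXXRecordDecl(clang::CXXRecordDecl *D) {
    if (!D->isThisDeclarationADefinition())
      return true; // skip forward declarations
    RecordInfo RI;
    RI.Name = D->getNameAsString();
    // Walk the record's children once, while we already have the parent,
    // so no later merge step is needed.
    for (const clang::Decl *Child : D->decls()) {
      if (const auto *M = llvm::dyn_cast<clang::CXXMethodDecl>(Child))
        RI.Methods.push_back({M->getNameAsString()});
      // ... nested records, typedefs, enums, etc. would be handled here ...
    }
    Records.push_back(std::move(RI));
    return true;
  }
};
```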

@PeterChou1 PeterChou1 changed the title [clang-doc] Clang-Doc is extremely slow in producing generating LLVM docs [clang-doc] Clang-Doc is extremely slow in generating LLVM docs Jun 24, 2024
@ilovepi
Contributor

ilovepi commented Jun 24, 2024

The map-reduce architecture was intended to improve scalability. TBH, I'm a bit surprised that the reduction step, which is doing far less work, is taking so long and running so inefficiently. While you raise some valid points w.r.t. the visitor pattern, I'm not completely convinced that will improve things all that much.

Is it possible to narrow down what in the reduction step is taking so long, or to determine what redundant work is being done? As mentioned before, I suspect that either 1) we've chosen a naive data structure somewhere in the reduction stage, or 2) we're recomputing things many times. For example, I suspect we generate documentation Info for the same item multiple times rather than skipping items we've already processed.

@PeterChou1
Contributor Author

I think the majority of the time in the reduction phase is spent decoding the bitcode that was emitted during the mapping phase. Perhaps there's a faster way to decode it. I also noticed that clang-doc generates documentation for anonymous namespaces, which it shouldn't do by default; fixing that could also be an easy way to improve performance.
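For illustration, one way a visitor could skip anonymous namespaces (and everything inside them) by default. The option name is hypothetical, not an existing clang-doc flag.

```cpp
#include "clang/AST/Decl.h"
#include "clang/AST/RecursiveASTVisitor.h"

// Hypothetical opt-in flag; clang-doc's real option (if any) may differ.
static bool DocumentAnonymousNamespaces = false;

class SkipAnonVisitor : public clang::RecursiveASTVisitor<SkipAnonVisitor> {
public:
  bool TraverseNamespaceDecl(clang::NamespaceDecl *D) {
    // Returning true without recursing skips the namespace's entire subtree,
    // so none of its contents are mapped or reduced.
    if (D->isAnonymousNamespace() && !DocumentAnonymousNamespaces)
      return true;
    return RecursiveASTVisitor::TraverseNamespaceDecl(D);
  }
};
```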

@petrhosek
Member

The map and reduce phases are implemented as in-memory data structures. However, the design allows for extensions that perform the mapping and reducing at scale. The map phase runs in Clang's in-memory tool framework, while the reduce phase uses an in-memory index. Both the framework and the index inherit from more generic classes, and thus can be swapped out in place with distributed implementations of the tool framework and index, respectively. If we merged the two phases, we would lose that option. I agree with @ilovepi that we should instead investigate why the reduce phase is so expensive.
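For illustration, a minimal sketch of that extension point, using simplified stand-in interfaces rather than the real clang/clang-doc classes.

```cpp
#include <map>
#include <string>

// Abstract sink the map phase writes into; the in-memory implementation
// could in principle be swapped for a distributed one.
class ResultStore {
public:
  virtual ~ResultStore() = default;
  virtual void addResult(const std::string &USR,
                         const std::string &Bitcode) = 0;
};

class InMemoryResultStore : public ResultStore {
  std::multimap<std::string, std::string> USRToBitcode;

public:
  void addResult(const std::string &USR, const std::string &Bitcode) override {
    USRToBitcode.emplace(USR, Bitcode); // local input for the reduce phase
  }
};

// A distributed implementation could instead ship each (USR, bitcode) pair
// to a remote index, without changing the map phase at all.
```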

@ilovepi
Contributor

ilovepi commented Jun 24, 2024

Are you able to see which bitcode files it generates and is trying to read? I know these are in-memory structures, but perhaps we can log something and arrive at a similar result. My big worry is that we're doing something naive and processing the same data multiple times.

@PeterChou1
Contributor Author

PeterChou1 commented Jun 24, 2024

I could try modifying clang-doc to emit the intermediate bitcode to the filesystem and check whether any of the files are identical, to see if we're doing duplicate work.
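For illustration, a minimal sketch of one way to count duplicates without writing anything to disk, by hashing the serialized buffers. The type of `USRToBitcode` shown here is an assumption and may not match clang-doc's actual map.

```cpp
#include "llvm/ADT/Hashing.h"
#include "llvm/ADT/StringMap.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/raw_ostream.h"
#include <map>
#include <string>
#include <vector>

void countDuplicateBitcode(
    const llvm::StringMap<std::vector<std::string>> &USRToBitcode) {
  std::map<size_t, unsigned> Seen; // hash of a bitcode fragment -> count
  for (const auto &Entry : USRToBitcode)
    for (const std::string &Buffer : Entry.getValue())
      ++Seen[llvm::hash_value(llvm::StringRef(Buffer))];

  for (const auto &KV : Seen)
    if (KV.second > 1)
      llvm::errs() << "bitcode fragment repeated " << KV.second << " times\n";
}
```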

@ilovepi
Contributor

ilovepi commented Jun 25, 2024

I'm looking at the code in ClangDocMain.cpp and in the ClangDocBitcodeReader. Both use std::vector quite a lot. I wonder how large these vectors typically are. For instance, it may make sense to reserve a common size to avoid resizing the vector needlessly. In some cases an llvm::SmallVector<T> could also be a good choice.

I also see heap allocations in the reduce step, via std::make_unique<T>, and code that, I assume, introduces lots of temporaries in Representation.cpp in the various merge routines.

Lastly, I'm curious about the sizes of USRToBitcode and Group. If USRToBitcode is small and Group is large, then we're not parallelizing this very efficiently. I'm not sure that's the case here, but it would be good to know.
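For illustration, a minimal sketch of the reserve/SmallVector suggestions above; the sizes and element types are illustrative, not taken from clang-doc.

```cpp
#include "llvm/ADT/SmallVector.h"
#include <string>
#include <vector>

// Reserving up front avoids repeated reallocation when the final size is known.
std::vector<std::string> decodeBlocks(unsigned NumBlocks) {
  std::vector<std::string> Blocks;
  Blocks.reserve(NumBlocks); // one allocation instead of geometric growth
  for (unsigned I = 0; I != NumBlocks; ++I)
    Blocks.push_back("block " + std::to_string(I));
  return Blocks;
}

// For containers that are usually tiny, SmallVector keeps elements inline
// and avoids heap allocation entirely in the common case.
llvm::SmallVector<std::string, 4> TemplateParams;
```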

@PeterChou1
Contributor Author

PeterChou1 commented Jun 25, 2024

It appears we're doing duplicate work in the mapping phase of clang-doc. Even for a small project like the basic e2e test project, clang-doc emits duplicate IR bitcode.
Shouldn't this be impossible? Could the RecursiveASTVisitor be visiting the same Decl twice?
