-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[clang-doc] Improve clang-doc performance through memoization #96809
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang-tools-extra Author: None (PeterChou1) ChangesThis patch adds a short circuit which address performance problem referenced in From my local testing it cut the runtime of generating LLVM docs from 10 hours to 30 mins Full diff: https://github.com/llvm/llvm-project/pull/96809.diff 1 Files Affected:
diff --git a/clang-tools-extra/clang-doc/Mapper.cpp b/clang-tools-extra/clang-doc/Mapper.cpp
index bb8b7952980ac..b9ca6edd03b8c 100644
--- a/clang-tools-extra/clang-doc/Mapper.cpp
+++ b/clang-tools-extra/clang-doc/Mapper.cpp
@@ -12,11 +12,17 @@
#include "clang/AST/Comment.h"
#include "clang/Index/USRGeneration.h"
#include "llvm/ADT/StringExtras.h"
-#include "llvm/Support/Error.h"
+#include "llvm/Support/Mutex.h"
+#include <unordered_set>
namespace clang {
namespace doc {
+
+static std::unordered_set<std::string> USRVisited;
+
+static llvm::sys::Mutex USRVisitedGuard;
+
void MapASTVisitor::HandleTranslationUnit(ASTContext &Context) {
TraverseDecl(Context.getTranslationUnitDecl());
}
@@ -34,6 +40,17 @@ template <typename T> bool MapASTVisitor::mapDecl(const T *D) {
// If there is an error generating a USR for the decl, skip this decl.
if (index::generateUSRForDecl(D, USR))
return true;
+
+ // Prevent Visiting USR twice
+ {
+ std::lock_guard<llvm::sys::Mutex> Guard(USRVisitedGuard);
+ std::string Visited = USR.str().str();
+ if (USRVisited.count(Visited)) {
+ return true;
+ }
+ USRVisited.insert(Visited);
+ }
+
bool IsFileInRootDir;
llvm::SmallString<128> File =
getFile(D, D->getASTContext(), CDCtx.SourceRoot, IsFileInRootDir);
@@ -41,6 +58,7 @@ template <typename T> bool MapASTVisitor::mapDecl(const T *D) {
getLine(D, D->getASTContext()), File,
IsFileInRootDir, CDCtx.PublicOnly);
+
// A null in place of I indicates that the serializer is skipping this decl
// for some reason (e.g. we're only reporting public decls).
if (I.first)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this is likely a step in the right direction, I'm a bit skeptical that this patch is entirely correct.
First, I think we need to understand why we only want to cache/memoize the USR strings from definitions.
Second, this requires some more significant testing than is currently present for clang-doc. Additionally, we'd need to evaluate such an invasive change on multiple projects, and ensure that the documentation output has not changed due to this patch. Even then, I'd wouldn't be confident that we'd gotten it correct until tests were in place, and we could examine the differences with, and without this patch.
Lastly, I think its worth discussing the memoization/caching strategy used here, and if its appropriate/scalable. Other clang tools use different mechanisms, and it would be good to avoid reinventing the wheel if we can just adopt one of those that also happens to be well tested, such as clangd's indexing and the search/caching mechanisms used in scandeps.
// We considered a USR to be visited only when its defined | ||
if (IsDefinition) | ||
USRVisited.insert(Visited); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this logic makes sense. Why is it unreasonable to expect a USR to be valid only when its a definition? Is there some logic that prevents things that aren't definitions from being able to hold documentation? What I'm worried about, is the case where we have several USRs that contain different documentation bits that would have been merged in the past, but now wont.
If there is a concrete reason, please document that here in the comments, and in the commit message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My logic was that definition of the USR is parsed last, so when the ASTVisitor visits the definition it would have already parsed every other USR that points to the same declaration. So we can safely short circuit, since every other fragments of USR would've been parsed already
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My logic was that definition of the USR is parsed last, so when the ASTVisitor visits the definition it would have already parsed every other USR that points to the same declaration. So we can safely short circuit, since every other fragments of USR would've been parsed already
Is it really parsed last in all cases? Isn't it possible to have multiple of these definitions, depending on the scope of the AST construct and the target options. For instance, what if some code is compiled w/ a particular #define
enabled and that provides different documentation than was found previously, but is compiled elsewhere without that define? I can easily imagine that affecting header code, where documentation is likely to be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I hadn't considered that this was my clumsy attempt at trying to remove the redundant work done by the ASTVisitor. Since we are undeniably doing some redundant work. if you take a look at the Shape class from the e2e test you'll see that we visit the declaration 3 times once for parsing the initial file and the twice more for each subclass. Is there any other mechanism that prevents this type of behaviour? This essentially what this patch is trying to accomplish
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is why we need more tests that exercise a diverse set of behavior. Some of the different targets should control the documentation bits w/ defines, and we should also make sure that we're covering all the ways a USR could be brought in. For instance, if there was only one base class that brought in Shape
, I'm not sure how much better you could do, since I'd assume it would only be included once, and then parsed in full the one time.
For more complicated arrangements, we'd need to be sure the definitions, and the documentation would be resolved the same way in all cases. I'm not aware of a mechanism that does that off the top of my head, but I would expect the clangd
indexing mechanic probably has a way to handle this on some level. I'd also expect that the logic in scan-deps
must have some mechanism for not processing files multiple times.
I'm not confident enough to say what you're proposing is outright wrong, but I'm also not confident that its correct, either. What I'm saying is that we need to be sure, that when we see a definition in clang-doc
that it won't change from out beneath us. That said, given the merge logic in the reduction phase, perhaps if the USR is complete(i.e. has no missing fields), the memoization is sufficient.
// Prevent Visiting USR twice | ||
{ | ||
std::lock_guard<llvm::sys::Mutex> Guard(USRVisitedGuard); | ||
std::string Visited = USR.str().str(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the logic here. We're using SmallString<128>
above, but now you're memoizing a copy of the USR string contents in a Set, using an std::string
. Is there a reason we need 2 string types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moreover, can we just construct a single owned string, and use a ‘StringRef’ to handle the rest of the operations? We may need to refactor some of the code slightly, but I think that should be possible.
I agreed I'm working on a patch adding more e2e test to better evaluate the output, I'm also working on comparing the current output of the llvm doc with the previous I'll update the thread for more updates |
Here's the diff for the doc output for the different branches |
CC: @sam-mccall We are currently facing some performance issues with clang-doc (a documentation generator built on top of libtooling). The problem we're having is that it can take up to 10 hours from clang-doc to run through LLVM compilation database, whereas other tools similar tools can do this much faster. Is there a better way to approach this? |
✅ With the latest revision this PR passed the C/C++ code formatter. |
I did some more investigating and ran this patch against several different projects like llvm itself, and other well know libraries like zlib. The only functional difference I found was differences in the USR generated and generation order in the table genned files The one exceptions to this is we accidently overlooked anonymous typedef like
This causes the recursiveASTVisitor to visit the declaration once as a typedef and another time as a record. Originally we had the following code which guards against emitting duplicate code when visiting an anonmymous typedef. To prevent this I added a check for if a decl is an anonymous typedef and cause it to always process the decl regardless or not if its in the memoization list |
f860632
to
74c7e08
Compare
Ok nevermind, disregard the above comment I was wrong about the mechanism of the bug. test.c
with cause it to interpret Foo as @nonymous_record_XXXXXXX. The reason why that is because when checking anonymous typedef name we're explicitly casting it to a CXXDecl. (see code here) Since the above is c code it won't be cast to CXXRecordDecl and hences clang-doc will treat it as an anonymous record. This becomes a problem when we run the LLVM compilation database. As an example, a record that we were not properly serializing before was LLVMOrcCDependenceMapPair. When clang-doc runs the compilation database of LLVM it first comes across the file OrcV2CBindingsRemovableCode.c under the directory llvm/examples/OrcV2Examples/OrcV2CBindingsRemovableCode. This file causes recursiveASTVisitor to visit LLVMOrcCDependenceMapPair but because its c code its name is not found and it is incorrectly serialized into @nonymous_record_XXXXX. Normally this is not an issue for clang-doc, since we'll visit the record multiple times and the next time we visit a cpp file using LLVMOrcCDependenceMapPair , clang-doc will correctly fill in the name of the declaration. Since this patch changes the behaviour to only visiting the a typedef declaration at most once, The workaround I added still works since when it comes to anonymous typedefs we skip out on memomization. It would be better if we just correctly handle the case of parsing anonymous typedef in C. But this approach is more conservative and safer since it reverts to old behaviour of clang-doc instead of introducing some other bug with parsing typedefs. |
|
||
namespace clang { | ||
namespace doc { | ||
|
||
static llvm::StringSet USRVisited; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static llvm::StringSet USRVisited; | |
static llvm::StringSet<> USRVisited; |
There's a warning about type deduction (-Wctad-maybe-unsupported
)
@@ -53,30 +80,34 @@ template <typename T> bool MapASTVisitor::mapDecl(const T *D) { | |||
} | |||
|
|||
bool MapASTVisitor::VisitNamespaceDecl(const NamespaceDecl *D) { | |||
return mapDecl(D); | |||
return mapDecl(D, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for constant params you should have a comment, per our style guide.
return mapDecl(D, true); | |
return mapDecl(D, /*isDefinition=*/true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like there's a few clang-format errors, so lets address those, too.
} | ||
|
||
bool MapASTVisitor::VisitTypedefDecl(const TypedefDecl *D) { | ||
return mapDecl(D); | ||
return mapDecl(D, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here and below.
if (const TypedefNameDecl *TD = C->getTypedefNameForAnonDecl()) { | ||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (const TypedefNameDecl *TD = C->getTypedefNameForAnonDecl()) { | |
return true; | |
} | |
return C->getTypedefNameForAnonDecl(); |
Isn't this just the same. You still need the outer return, but this doesn't need a branch and temporary.
Thanks for the detailed analysis. With the current version of the patch, are there any difference between what we get on main and what we get w/ this patch? |
When sorted by USR, there are no differences between the index between this patch and what's on main, |
I'm not sure I completely follow. are you saying that when you sort the contents of all the files you no longer have any differences? So you're sure the only differences in output are due to ordering? I'd suppose that is rather expected from a racy framework, like clang-doc. Once the tests PRs are landed, can you rebase this patch, so we can see the affects on the larger tests? Once that's done, we should be able to move forward with this, assuming we don't find any material differences in generated documentation other than ordering. |
92c461f
to
6de47d3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can confirm that the only differences in documentation generation are due to ordering, then I think it makes sense to land this patch.
@@ -243,6 +243,7 @@ Example usage for a project using a compile commands database: | |||
|
|||
// Fail early if an invalid format was provided. | |||
std::string Format = getFormatString(); | |||
llvm::outs() << "Unoptimized\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: leftover from debugging?
Seems to be a formatting issue. |
Oh, can you also update the title and commit message? You're not really adding a short circuit, you're just memoizing visited items, and avoiding reprocessing them. A bit more context, and a summary of your findings on the changes to output would also be good to have in the commit log. |
ee21eae
to
3fd1d92
Compare
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D60250689
This patch adds a short circuit which address performance problem referenced in
issue: #96570
We memoize the results after visiting a USR definition. Since the definition is always parsed last so we can safely exit earlier next time we visit the USR because everything relevant to the doc generator would've been parsed already.
From my local testing it cut the runtime of generating LLVM docs from 10 hours to around 1 hour
The only differences between this output and the last are changes due to the racy nature of clang-docs doc generation. Because it is multithreaded the generated infos may be reordered in a different way, but the results remain the same