Commit 42d32a4

docs: complete contributor guide
1 parent 608a925 commit 42d32a4

4 files changed

+282
-1
lines changed

docs/antora-playbook.yml

Lines changed: 5 additions & 0 deletions
@@ -77,6 +77,11 @@ ui:
7777
7878
antora:
7979
extensions:
80+
- require: '@sntke/antora-mermaid-extension' # <1>
81+
mermaid_library_url: https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs # <2>
82+
script_stem: header-scripts # <3>
83+
mermaid_initialize_options: # <4>
84+
start_on_load: true
8085
- require: '@antora/lunr-extension' # https://gitlab.com/antora/antora-lunr-extension
8186
index_latest_only: true
8287
asciidoc:

docs/local-antora-playbook.yml

Lines changed: 5 additions & 0 deletions
@@ -74,6 +74,11 @@ ui:
7474
7575
antora:
7676
extensions:
77+
- require: '@sntke/antora-mermaid-extension' # <1>
78+
mermaid_library_url: https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs # <2>
79+
script_stem: header-scripts # <3>
80+
mermaid_initialize_options: # <4>
81+
start_on_load: true
7782
- require: '@antora/lunr-extension' # https://gitlab.com/antora/antora-lunr-extension
7883
index_latest_only: true
7984
asciidoc:
Lines changed: 271 additions & 1 deletion
@@ -1,5 +1,275 @@
1-
= Contribute
1+
= Contributor's Guide
22

3+
This page contains information for contributors to the MrDocs project.
4+
It is intended to provide an overview of the codebase and the process of adding new features.
35

6+
== Codebase Overview
47

8+
The MrDocs codebase is divided into several modules:
59

10+
[mermaid]
11+
....
12+
graph TD
13+
CL[Command Line Arguments] --> P
14+
CF[Configuration File] --> P
15+
P[Options] --> E
16+
P --> CD
17+
P --> G
18+
CD[Compilation Database] --> E
19+
E[Extract Symbols] -->|Corpus| G
20+
G[Generator] --> D(Documentation)
21+
....
22+
23+
This section provides an overview of each module and how they interact with each other in the MrDocs codebase.
24+
25+
[#options]
26+
=== Parsing options
27+
28+
MrDocs options affect the behavior of the compilation database, how symbols are extracted, and how the documentation is generated.
29+
They are parsed from the command line and configuration file.
30+
31+
The main entry point of MrDocs is the `DoGenerateAction` function in `src/tool/GenerateAction.cpp`.
32+
It loads the options, creates the compilation database, and runs the extraction and generation steps.
33+
The options are formed from a combination of command line arguments and configuration file settings.
34+
35+
==== Command Line Options
36+
37+
Command line and common options are defined in `src/tool/ToolArgs.hpp`.
38+
The `ToolArgs` class uses the `llvm::cl` library to define and parse the command line arguments.
39+
40+
==== Configuration File
41+
42+
Common options are defined in `mrdocs/Config.hpp`.
43+
The `Config` class represents all public options that could be defined in a configuration file.
44+
It also provides a representation plugins can use to access public options from the command line or configuration file.
45+
46+
The function `clang::mrdocs::loadConfig` is also provided to parse all public options from a YAML configuration file.
47+
48+
Internally, MrDocs uses the derived `clang::mrdocs::ConfigImpl` class (`src/lib/Lib/ConfigImpl.hpp`) to also store the private representation of parsed options, such as filters.
49+
50+
==== Finalizing Options
51+
52+
Common options are stored in the `Config` class, while the `ToolArgs` class stores common options and the command line options.
53+
For instance, the `config` option can only be set from the command line, as it would be illogical to expect the location of the configuration file to be defined in the configuration file itself.
54+
On the other hand, the `output` option can be set from both the command line and the configuration file so that the user can define a default output location in the configuration file.
55+
56+
Thus, after the command line and configuration file options are parsed, they are finalized in the `DoGenerateAction` function by calling `ToolArgs::apply`, which overrides the configuration file options in `Config` with the command line options, when applicable.
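
The override rule described above can be sketched as follows.
This is a minimal illustration, assuming C++17; `ExampleConfig`, `ExampleToolArgs`, and their members are hypothetical stand-ins and not the real `Config` and `ToolArgs` interfaces.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for the public `Config` options.
struct ExampleConfig {
    std::string output; // value read from the configuration file
};

// Hypothetical stand-in for `ToolArgs`.
struct ExampleToolArgs {
    std::optional<std::string> output; // empty when not passed on the command line

    // Copy every option that was set on the command line into the config.
    void apply(ExampleConfig& config) const {
        if (output)
            config.output = *output;
    }
};

// Usage sketch: the configuration file value survives unless overridden.
inline void exampleFinalizeOptions() {
    ExampleConfig config{"docs/reference"};
    ExampleToolArgs args;
    args.apply(config);
    assert(config.output == "docs/reference");

    args.output = "build/reference";
    args.apply(config);
    assert(config.output == "build/reference");
}
```

Keeping "was this option given?" separate from the option's value is what lets a configuration file default survive when the flag is absent.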
57+
58+
As a last step, `DoGenerateAction` converts the public `Config` settings into a `ConfigImpl` object, which is used by the rest of the program with the parsed options.
59+
60+
[#extract_symbols]
61+
=== Extracting Symbols
62+
63+
At this stage, the clang frontend is used to parse the source code and generate an AST.
64+
The AST information is extracted and stored in a `Corpus` object (`mrdocs/Corpus.hpp`).
65+
66+
[#compilation_database]
67+
==== Compilation Database
68+
69+
The second step in `DoGenerateAction` is to create a `CompilationDatabase` object, so we can extract symbols from its source files.
70+
There are multiple possible sources for this file according to the configuration options: the file might be read directly from the path specified in the options, or it might be generated by MrDocs from build scripts.
71+
72+
Whatever the source, a derived `MrDocsCompilationDatabase` object (`src/lib/Lib/MrDocsCompilationDatabase.hpp`) is created to represent the compilation database.
73+
The difference between the original `CompilationDatabase` and the `MrDocsCompilationDatabase` is that the latter includes a number of pre-processing steps to filter and transform compilation commands.
74+
75+
For each compilation command:
76+
77+
* Command line arguments are adjusted
78+
** Warnings are suppressed
79+
** Additional defines are added
80+
** Implicit include directories are added
81+
** Unrecognized arguments are removed
82+
* Paths are normalized
83+
* Non-C++ files are filtered out
84+
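
A rough sketch of these pre-processing steps is shown here.
The `ExampleCommand` type, the extension list, and the `-DEXAMPLE_DEFINE` flag are assumptions made for illustration; the real logic lives in `MrDocsCompilationDatabase` and handles more cases, such as implicit include directories and unrecognized arguments.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified compilation command.
struct ExampleCommand {
    std::string file;
    std::vector<std::string> args;
};

// True when `file` ends with one of a few common C++ source extensions.
// The extension list is an assumption for this sketch.
inline bool isCppSource(const std::string& file) {
    static const char* const extensions[] = {".cpp", ".cc", ".cxx"};
    for (const char* ext : extensions) {
        const std::string e(ext);
        if (file.size() >= e.size() &&
            file.compare(file.size() - e.size(), e.size(), e) == 0)
            return true;
    }
    return false;
}

// Filter and transform the commands, mirroring the list above.
inline std::vector<ExampleCommand> adjustCommands(std::vector<ExampleCommand> commands) {
    std::vector<ExampleCommand> result;
    for (ExampleCommand& cmd : commands) {
        if (!isCppSource(cmd.file))
            continue;                                  // filter out non-C++ files
        std::replace(cmd.file.begin(), cmd.file.end(), '\\', '/'); // normalize path
        cmd.args.push_back("-w");                      // clang flag: suppress warnings
        cmd.args.push_back("-DEXAMPLE_DEFINE");        // hypothetical extra define
        result.push_back(std::move(cmd));
    }
    return result;
}

// Usage sketch: the C file is dropped and the C++ command is adjusted.
inline void exampleAdjustCommands() {
    std::vector<ExampleCommand> in = {{"src\\a.cpp", {"clang++"}}, {"b.c", {"clang"}}};
    const std::vector<ExampleCommand> out = adjustCommands(in);
    assert(out.size() == 1);
    assert(out[0].file == "src/a.cpp");
}
```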
85+
[#info_nodes]
86+
==== Info Nodes
87+
88+
MrDocs represents each C++ symbol or construct as an `Info` node (`mrdocs/Metadata/Info.hpp`).
89+
MrDocs currently defines the following `Info` nodes:
90+
91+
[c-preprocessor]
92+
====
93+
94+
[cols="1,3,2"]
95+
|===
96+
| Name | Description | Declaration
97+
98+
#define INFO_PASCAL_AND_DESC(Type, Desc) | `pass:[Type]pass:[Info]` | Desc | `mrdocs/Metadata/pass:[Type].hpp`
99+
100+
include::partial$InfoNodes.inc[]
101+
102+
|===
103+
====
104+
105+
`Info` can not only represent direct AST symbols but also {cpp} constructs that need to be inferred from these symbols.
106+
Nodes in the first category will typically be created in the initial extraction step, and nodes in the second category will be created in the finalization step.
107+
108+
When defining a new `Info` type, it is important to consider how this type will be supported in all other modules of the codebase, including the AST visitor, the bitcode writer, generators, tests, and the documentation.
109+
The script `.github/check_info_nodes_support.sh` will attempt to infer whether most of these features have been implemented for each node type.
110+
111+
==== Clang LibTooling
112+
113+
MrDocs uses Clang to extract `Info` objects from the {cpp} AST.
114+
Clang offers two https://clang.llvm.org/docs/Tooling.html[interfaces] to access the C++ AST: the https://clang.llvm.org/doxygen/group__CINDEX.html[`LibClang`] and https://clang.llvm.org/docs/LibTooling.html[`LibTooling`] libraries.
115+
MrDocs uses the latter, as it provides full control over the AST traversal process at the cost of an unstable API.
116+
117+
In LibTooling, once we have a <<compilation_database>>, we can create a `ClangTool` object to run the Clang frontend on a set of source files.
118+
119+
[source,c++]
120+
----
121+
clang::tooling::ClangTool Tool(compilationDatabase, sourceFiles);
122+
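// newFrontendActionFactory returns a std::unique_ptr<FrontendActionFactory>
auto actionFactory = clang::tooling::newFrontendActionFactory<clang::SyntaxOnlyAction>();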
123+
return Tool.run(actionFactory.get());
124+
----
125+
126+
The `clang::tooling::ClangTool::run` method takes a `clang::tooling::ToolAction` object that defines how to process the AST.
127+
The action object usually comes from a `clang::tooling::FrontendActionFactory`.
128+
In the example above, the `SyntaxOnlyAction` is used to parse the source code and generate the AST without any further processing.
129+
130+
In MrDocs, this process happens in `clang::mrdocs::CorpusImpl::build` (`src/lib/Lib/CorpusImpl.cpp`), where we call `Tool.run` for each object in the database with our custom `ASTAction` action and `ASTActionFactory` factory (`src/lib/AST/ASTVisitor.cpp`).
131+
132+
==== AST Traversal
133+
134+
While `ASTAction` is the entry point for processing the AST, the real work is done by the `ASTVisitor` class.
135+
As the AST is generated, it is traversed by the `ASTVisitor` class.
136+
137+
The entry point of this class is `ASTVisitor::build`, which recursively calls `ASTVisitor::traverseDecl` for the root `clang::TranslationUnitDecl` node of the translation unit.
138+
During the AST traversal stage, the complete AST generated by the clang frontend is walked beginning with this root `TranslationUnitDecl` node.
139+
140+
Each `clang` node is converted into a `<<info_nodes,mrdocs::Info>>` node, which is then stored with any relevant information in a `mrdocs::Corpus` object.
141+
142+
==== USR Generation
143+
144+
It is during this stage that USRs (Unified Symbol Resolution identifiers) are generated and hashed with SHA1 to form the 160-bit `SymbolID` for an entity.
145+
Except for built-in types, *all* entities referenced in the corpus will be traversed and assigned a `SymbolID`, including those from the standard library.
146+
This is necessary to generate the full interface for user-defined types.
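
To make the size of the identifier concrete, the sketch below stores a 160-bit digest as 20 bytes and renders it as hexadecimal text.
`ExampleSymbolID` and `toHex` are hypothetical names used only for illustration, and the SHA1 hashing step itself is not shown.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// 160 bits are 20 bytes.
constexpr std::size_t kExampleSymbolIDBytes = 160 / 8;

// Hypothetical stand-in for `SymbolID`: the raw bytes of a SHA1 digest.
using ExampleSymbolID = std::array<std::uint8_t, kExampleSymbolIDBytes>;

// Render the identifier as 40 lowercase hexadecimal characters.
inline std::string toHex(const ExampleSymbolID& id) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(id.size() * 2);
    for (std::uint8_t byte : id) {
        out.push_back(digits[byte >> 4]);   // high nibble
        out.push_back(digits[byte & 0x0F]); // low nibble
    }
    return out;
}

// Usage sketch: a zero-initialized ID prints as 40 zeros.
inline void exampleSymbolID() {
    const ExampleSymbolID id{};
    assert(toHex(id) == std::string(40, '0'));
}
```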
147+
148+
==== Bitcode
149+
150+
To maximize the size of the code base MrDocs is capable of processing, `Info`
151+
types generated during traversal are serialized to a compressed bitcode representation.
152+
153+
The `ASTVisitor` reports each new `Info` object to the `BitcodeExecutionContext` (`src/lib/Lib/ExecutionContext.cpp`) which serializes it to the bitcode file.
154+
155+
==== Finalizing the Corpus
156+
157+
After running the AST traversal on all translation units, `CorpusImpl::build` contains finalization steps for the `Corpus` object.
158+
At this point, we process C++ constructs that are not directly represented in the AST.
159+
160+
The first finalization step happens in `BitcodeExecutionContext::reportEnd` (`src/lib/Lib/ExecutionContext.cpp`), where the `Info` objects with the same `SymbolID` are merged.
161+
The merging step is necessary as there may be multiple declarations of the same entity.
162+
For instance, this represents the case where a function is declared at different points in the code base and might have different attributes or comments.
163+
At this step, the doc comments are also finalized.
164+
Each `Info` object has a pointer to its `Javadoc` object (`mrdocs/Metadata/Javadoc.hpp`), which is a representation of the documentation comments.
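
The merging idea can be illustrated with a small sketch.
`ExampleInfo` is a hypothetical, heavily simplified record, and the merge policy shown (keep the first non-empty doc comment, concatenate locations) is an assumption and not necessarily the policy MrDocs applies.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, heavily simplified `Info` record.
struct ExampleInfo {
    std::string id;                     // stands in for the SymbolID
    std::string doc;                    // doc comment, possibly empty
    std::vector<std::string> locations; // where the entity was declared
};

// Fold `other` into `into`; both must describe the same entity.
inline void mergeInfo(ExampleInfo& into, const ExampleInfo& other) {
    assert(into.id == other.id);
    if (into.doc.empty())
        into.doc = other.doc;           // keep the first non-empty comment
    into.locations.insert(into.locations.end(),
                          other.locations.begin(), other.locations.end());
}

// Merge all reported records so that each ID appears exactly once.
inline std::map<std::string, ExampleInfo>
mergeAll(const std::vector<ExampleInfo>& reported) {
    std::map<std::string, ExampleInfo> result;
    for (const ExampleInfo& info : reported) {
        auto it = result.find(info.id);
        if (it == result.end())
            result.emplace(info.id, info);
        else
            mergeInfo(it->second, info);
    }
    return result;
}
```

For example, a function declared in a header without a comment and defined in a source file with one would end up as a single record carrying the comment and both locations.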
165+
166+
After AST traversal and `Info` merging, the result is stored as a map of `Info` objects indexed by their respective `SymbolID`.
167+
A second finalization step is then performed in `clang::mrdocs::finalize` (`src/lib/Metadata/Finalize.cpp`), where any references to `SymbolID` objects that don't exist are removed.
168+
This is necessary because the AST traversal will generate references to entities that should be filtered and are not present in the corpus.
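
As an illustration of this clean-up, the sketch below drops every member reference whose target is missing from the map.
`ExampleNode` and `ExampleCorpus` are hypothetical types; the real `finalize` step works on the actual `Info` hierarchy and covers more kinds of references.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical node that refers to other nodes by ID.
struct ExampleNode {
    std::vector<std::string> members;
};

// Hypothetical corpus: nodes indexed by their ID.
using ExampleCorpus = std::map<std::string, ExampleNode>;

// Remove references to IDs that are not present in the corpus.
inline void removeDanglingReferences(ExampleCorpus& corpus) {
    for (auto& entry : corpus) {
        std::vector<std::string>& members = entry.second.members;
        members.erase(
            std::remove_if(members.begin(), members.end(),
                           [&corpus](const std::string& id) {
                               return corpus.count(id) == 0;
                           }),
            members.end());
    }
}

// Usage sketch: "hidden" was filtered out of the corpus, so the reference goes too.
inline void exampleRemoveDanglingReferences() {
    ExampleCorpus corpus;
    corpus["ns"].members = {"f", "hidden"};
    corpus["f"]; // "f" exists, "hidden" does not
    removeDanglingReferences(corpus);
    assert(corpus["ns"].members == std::vector<std::string>{"f"});
}
```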
169+
170+
At this point, the `Corpus` object contains representations of all entities in the code base and further semantic {cpp} constructs that are not directly represented in the AST can be inferred.
171+
172+
=== Generators
173+
174+
Documentation generators may traverse this structure by calling `Corpus::traverse` with a `Corpus::Visitor` derived visitor and the `SymbolID` of the entity to visit (e.g. the global namespace).
175+
176+
Documentation generators are responsible for traversing the corpus and generating documentation in the desired format.
177+
178+
The API for documentation generators is defined in `mrdocs/Generator.hpp`.
179+
180+
=== Directory Layout
181+
182+
The MrDocs codebase is organized as follows:
183+
184+
==== `include/`—The main include directory
185+
186+
This directory contains the public headers for the MrDocs library.
187+
188+
* `include/mrdocs/`—The core library headers
189+
** `include/mrdocs/ADT`—Data Structures
190+
** `include/mrdocs/Dom`—The Document Object Model for Abstract Trees
191+
** `include/mrdocs/Metadata`—`Info` nodes and metadata classes
192+
** `include/mrdocs/Support`—Various utility classes
193+
194+
==== `src/`—The main source directory
195+
196+
This directory contains the source code for the MrDocs library and private headers.
197+
198+
* `src/lib/`—The core library
199+
** `src/lib/AST/`—The AST traversal code
200+
** `src/lib/Dom/`—The Document Object Model for Abstract Trees
201+
** `src/lib/Gen/`—Generators
202+
** `src/lib/Lib/`—The core library classes
203+
** `src/lib/Metadata/`—`Info` nodes and metadata classes
204+
** `src/lib/Support/`—Various utility classes
205+
* `src/test/`—The test directory
206+
* `src/test_suite/`—The library used for testing
207+
* `src/tool/`—The main program
208+
209+
==== `share/`—Shared resources
210+
211+
This directory contains shared resources for the documentation generators and utilities for developers.
212+
Its subdirectories are installed in the `share` directory of the installation.
213+
214+
* `share/`—Shared resources for the documentation generators
215+
* `share/cmake/`—CMake modules to generate the documentation
216+
* `share/gdb/`—GDB pretty printers
217+
* `share/mrdocs/`—Shared resources for the documentation generators
218+
219+
==== `docs`—Documentation
220+
221+
This directory contains the documentation for the MrDocs project.
222+
The documentation is written in AsciiDoc and can be built using the Antora tool.
223+
224+
* `docs/`—Documentation configuration files and scripts
225+
** `docs/modules/`—The documentation asciidoc files
226+
** `docs/extensions`—Antora extensions for the documentation
227+
228+
==== `third-party/`—Helpers for third-party libraries
229+
230+
This directory contains build scripts and configuration files for third-party libraries.
231+
232+
* `third-party/`—Third-party libraries
233+
** `third-party/llvm/`—CMake Presets for LLVM
234+
** `third-party/duktape/`—CMake scripts for Duktape
235+
** `third-party/lua/`—A bundled Lua interpreter
236+
237+
== Coding Standards
238+
239+
=== Paths
240+
241+
The AST visitor and metadata all use forward slashes to represent file pathnames, even on Windows.
242+
This is so the generated reference documentation does not vary based on the platform.
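
A minimal way to apply this convention to a path stored as a string is sketched here.
The helper name `toForwardSlashes` is hypothetical; MrDocs may well provide its own utilities for this.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Return a copy of `path` with every backslash replaced by a forward slash.
inline std::string toForwardSlashes(std::string path) {
    std::replace(path.begin(), path.end(), '\\', '/');
    return path;
}

// Usage sketch: Windows-style and POSIX-style inputs give the same form.
inline void exampleForwardSlashes() {
    assert(toForwardSlashes("include\\mrdocs\\Corpus.hpp") == "include/mrdocs/Corpus.hpp");
    assert(toForwardSlashes("include/mrdocs/Corpus.hpp") == "include/mrdocs/Corpus.hpp");
}
```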
243+
244+
=== Exceptions
245+
246+
Errors thrown by the program should always have type `Exception`.
247+
Objects of this type are capable of transporting an `Error` object.
248+
This is important for the scripting to work; exceptions are used to propagate errors from library code to scripts and back to the invoking code.
249+
For exceptional cases, these thrown exceptions should be left uncaught.
250+
The tool installs an uncaught exception handler that prints a stack trace and exits the process immediately.
251+
252+
=== Testing
253+
254+
All new features should be accompanied by tests.
255+
The `mrdocs-test` target is used to run the test suites.
256+
This target has its entry point in `src/test/TestMain.cpp`, which can take two paths:
257+
258+
* Golden testing: When input paths are provided to the test executable via the command line, the test suite runs `DoTestAction()`, which iterates over all files in `test-files` and compares the output generated from each input source file with the expected XML output file.
259+
* Unit testing: When no input paths are provided, all unit tests will be run via `unit_test_main()`, defined by our test-suite library in `src/test_suite/test_suite.cpp`.
260+
261+
The fixtures for golden testing are defined in `test-files/golden-tests`, where files in each directory have the following format:
262+
263+
* `mrdocs.yml`: Basic configuration options for all files in this directory.
264+
* `<filename>.cpp`: The input source file to extract symbols from.
265+
* `<filename>.xml`: The expected XML output file generated with the XML generator.
266+
* `<filename>.bad.xml`: The test output file generated when the test fails.
267+
* `<filename>.yml`: Extra configuration options for this specific file.
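
The naming scheme in this list can be sketched as a small helper that derives the sibling file names from a source path.
`withExtension` is a hypothetical function written for this illustration and is not part of the test suite.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Replace the extension of `source` with `ext` (which includes the leading dot).
// Only a dot that appears after the last '/' counts as an extension.
inline std::string withExtension(const std::string& source, const std::string& ext) {
    const std::size_t slash = source.find_last_of('/');
    const std::size_t dot = source.find_last_of('.');
    const bool hasExtension =
        dot != std::string::npos &&
        (slash == std::string::npos || dot > slash);
    return (hasExtension ? source.substr(0, dot) : source) + ext;
}

// Usage sketch: files that accompany one golden test input.
inline void exampleGoldenFiles() {
    const std::string source = "test-files/golden-tests/example/function.cpp";
    assert(withExtension(source, ".xml") == "test-files/golden-tests/example/function.xml");
    assert(withExtension(source, ".bad.xml") == "test-files/golden-tests/example/function.bad.xml");
    assert(withExtension(source, ".yml") == "test-files/golden-tests/example/function.yml");
}
```

The directory `example` and the file `function.cpp` are made-up names used only to show the pattern.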
268+
269+
== Contributing
270+
271+
If you find a bug or have a feature request, please open an issue on the MrDocs GitHub repository: https://github.com/cppalliance/mrdocs/issues
272+
273+
If you would like to contribute a feature or bug fix, please open a pull request on the MrDocs GitHub repository: https://github.com/cppalliance/mrdocs/pulls
274+
275+
If you would like to discuss a feature or bug fix before opening a pull request, discussions happen in the `#mrdocs` channel on the Cpplang Slack: https://cpplang.slack.com/
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
../../../../include/mrdocs/Metadata/InfoNodes.inc
