|
1 |
| -= Contribute |
| 1 | += Contributor's Guide |
2 | 2 |
|
| 3 | +This page contains information for contributors to the MrDocs project. |
| 4 | +It is intended to provide an overview of the codebase and the process of adding new features. |
3 | 5 |
|
| 6 | +== Codebase Overview |
4 | 7 |
|
| 8 | +The MrDocs codebase is divided into several modules: |
5 | 9 |
|
| 10 | +[mermaid] |
| 11 | +.... |
| 12 | +graph TD |
| 13 | + CL[Command Line Arguments] --> P |
| 14 | + CF[Configuration File] --> P |
| 15 | + P[Options] --> E |
| 16 | + P --> CD |
| 17 | + P --> G |
| 18 | + CD[Compilation Database] --> E |
| 19 | + E[Extract Symbols] -->|Corpus| G |
| 20 | + G[Generator] --> D(Documentation) |
| 21 | +.... |
| 22 | + |
| 23 | +This section provides an overview of each module and how they interact with each other in the MrDocs codebase. |
| 24 | + |
| 25 | +[#options] |
| 26 | +=== Parsing options |
| 27 | + |
| 28 | +MrDocs options affect the behavior of the compilation database, how symbols are extracted, and how the documentation is generated. |
| 29 | +They are parsed from the command line and configuration file. |
| 30 | + |
| 31 | +The main entry point of MrDocs is the `DoGenerateAction` function in `src/tool/GenerateAction.cpp`. |
| 32 | +It loads the options, creates the compilation database, and runs the extraction and generation steps. |
| 33 | +The options are formed from a combination of command line arguments and configuration file settings. |
| 34 | + |
| 35 | +==== Command Line Options |
| 36 | + |
| 37 | +Command line and common options are defined in `src/tool/ToolArgs.hpp`. |
| 38 | +The `ToolArgs` class uses the `llvm::cl` library to define and parse the command line arguments. |
| 39 | + |
| 40 | +==== Configuration File |
| 41 | + |
| 42 | +Common options are defined in `mrdocs/Config.hpp`. |
| 43 | +The `Config` class represents all public options that could be defined in a configuration file. |
| 44 | +It also provides a representation plugins can use to access public options from the command line or configuration file. |
| 45 | + |
| 46 | +The function `clang::mrdocs::loadConfig` is also provided to parse all public options from a YAML configuration file. |
| 47 | + |
| 48 | +Internally, MrDocs uses the derived `clang::mrdocs::ConfigImpl` class (`src/lib/Lib/ConfigImpl.hpp`) to also store the private representation of parsed options, such as filters. |
| 49 | + |
| 50 | +==== Finalizing Options |
| 51 | + |
| 52 | +Common options are stored in the `Config` class, while the `ToolArgs` class stores common options and the command line options. |
| 53 | +For instance, the `config` option can only be set from the command line, as it would be illogical to expect the location of the configuration file to be defined in the configuration file itself. |
| 54 | +On the other hand, the `output` option can be set from both the command line and the configuration file so that the user can define a default output location in the configuration file. |
| 55 | + |
| 56 | +Thus, after the command line and configuration file options are parsed, they are finalized in the `DoGenerateAction` function by calling `ToolArgs::apply`, which overrides the configuration file options in `Config` with the command line options, when applicable. |
| 57 | + |
| 58 | +As a last step, `DoGenerateAction` converts the public `Config` settings into a `ConfigImpl` object, which is used by the rest of the program with the parsed options. |
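The override rule described above can be sketched in isolation. This is only an illustration under stated assumptions: `FileOptions`, `CliOptions`, and `applyOverrides` are hypothetical names and do not mirror the real `Config`, `ToolArgs`, or `ToolArgs::apply` interfaces.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for public options read from the configuration file.
struct FileOptions {
    std::string output = "docs";
    bool verbose = false;
};

// Hypothetical stand-in for command line arguments. std::optional lets us
// distinguish "not passed" from "passed explicitly".
struct CliOptions {
    std::optional<std::string> output;
    std::optional<bool> verbose;
};

// Command line values win over configuration file values when present.
inline FileOptions applyOverrides(FileOptions file, CliOptions const& cli) {
    if (cli.output)
        file.output = *cli.output;
    if (cli.verbose)
        file.verbose = *cli.verbose;
    return file;
}

// Usage: only `output` was passed on the command line.
inline void exampleUsage() {
    CliOptions cli;
    cli.output = "build/reference";
    FileOptions merged = applyOverrides(FileOptions{}, cli);
    assert(merged.output == "build/reference");
    assert(!merged.verbose); // untouched: keeps the configuration file value
}
```

Keeping the command line values optional is one possible design: it makes "not provided" explicit, so a default value in the configuration file is never overwritten by accident.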
| 59 | + |
| 60 | +[#extract_symbols] |
| 61 | +=== Extracting Symbols |
| 62 | + |
| 63 | +At this stage, the clang frontend is used to parse the source code and generate an AST. |
| 64 | +The AST information is extracted and stored in a `Corpus` object (`mrdocs/Corpus.hpp`). |
| 65 | + |
| 66 | +[#compilation_database] |
| 67 | +==== Compilation Database |
| 68 | + |
| 69 | +The second step in `DoGenerateAction` is to create a `CompilationDatabase` object, so we can extract symbols from its source files. |
| 70 | +There are multiple possible sources for this file according to the configuration options: the file might be read directly from the path specified in the options, or it might be generated by MrDocs from build scripts. |
| 71 | + |
| 72 | +Whatever the source, a derived `MrDocsCompilationDatabase` object (`src/lib/Lib/MrDocsCompilationDatabase.hpp`) is created to represent the compilation database. |
| 73 | +The difference between the original `CompilationDatabase` and the `MrDocsCompilationDatabase` is that the latter includes a number of pre-processing steps to filter and transform compilation commands. |
| 74 | + |
| 75 | +For each compilation command: |
| 76 | + |
| 77 | +* Command line arguments are adjusted |
| 78 | +** Warnings are suppressed |
| 79 | +** Additional defines are added |
| 80 | +** Implicit include directories are added |
| 81 | +** Unrecognized arguments are removed |
| 82 | +* Paths are normalized |
| 83 | +* Non-C++ files are filtered out |
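Some of the adjustments listed above can be sketched with plain strings. The helper names, the extension list, and the choice of flags are assumptions made for illustration. The actual logic in `MrDocsCompilationDatabase` is more involved.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: true when the file name has a typical C++ source extension.
inline bool looksLikeCppSource(std::string const& file) {
    static const std::vector<std::string> exts = {".cpp", ".cc", ".cxx", ".c++"};
    return std::any_of(exts.begin(), exts.end(), [&](std::string const& ext) {
        return file.size() > ext.size() &&
               file.compare(file.size() - ext.size(), ext.size(), ext) == 0;
    });
}

// Hypothetical adjustment of one command line: drop -W... flags, suppress all
// warnings with -w, and append extra defines.
// Sketch only: prefixes such as -Wl, are not warnings and would need
// special handling in real code.
inline std::vector<std::string> adjustArguments(
    std::vector<std::string> const& args,
    std::vector<std::string> const& defines)
{
    assert(!args.empty() && "a command starts with the compiler executable");
    std::vector<std::string> out;
    for (std::string const& arg : args) {
        if (arg.rfind("-W", 0) == 0) // starts with "-W"
            continue;
        out.push_back(arg);
    }
    out.push_back("-w");
    for (std::string const& d : defines)
        out.push_back("-D" + d);
    return out;
}
```

Given `{"clang++", "-Wall", "-c", "a.cpp"}` and the hypothetical define `DEMO_DEFINE`, this sketch yields `{"clang++", "-c", "a.cpp", "-w", "-DDEMO_DEFINE"}`.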
| 84 | + |
| 85 | +[#info_nodes] |
| 86 | +==== Info Nodes |
| 87 | + |
| 88 | +MrDocs represents each C++ symbol or construct as an `Info` node (`mrdocs/Metadata/Info.hpp`). |
| 89 | +MrDocs currently defines the following `Info` nodes: |
| 90 | + |
| 91 | +[c-preprocessor] |
| 92 | +==== |
| 93 | +
|
| 94 | +[cols="1,3,2"] |
| 95 | +|=== |
| 96 | +| Name | Description | Declaration |
| 97 | +
|
| 98 | +#define INFO_PASCAL_AND_DESC(Type, Desc) | `pass:[Type]pass:[Info]` | Desc | `mrdocs/Metadata/pass:[Type].hpp` |
| 99 | +
|
| 100 | +include::partial$InfoNodes.inc[] |
| 101 | +
|
| 102 | +|=== |
| 103 | +==== |
| 104 | + |
| 105 | +`Info` can not only represent direct AST symbols but also {cpp} constructs that need to be inferred from these symbols. |
| 106 | +Nodes in the first category will typically be created in the initial extraction step, and nodes in the second category will be created in the finalization step. |
| 107 | + |
| 108 | +When defining a new `Info` type, it is important to consider how this type will be supported in all other modules of the codebase, including the AST visitor, the bitcode writer, generators, tests, and the documentation. |
| 109 | +The script `.github/check_info_nodes_support.sh` will attempt to infer whether most of these features have been implemented for each node type. |
| 110 | + |
| 111 | +==== Clang LibTooling |
| 112 | + |
| 113 | +MrDocs uses Clang to extract `Info` objects from the {cpp} AST. |
| 114 | +Clang offers two https://clang.llvm.org/docs/Tooling.html[interfaces] to access the C++ AST: the https://clang.llvm.org/doxygen/group__CINDEX.html[`LibClang`] and https://clang.llvm.org/docs/LibTooling.html[`LibTooling`] libraries. |
| 115 | +MrDocs uses the latter, as it provides full control over the AST traversal process at the cost of an unstable API. |
| 116 | + |
| 117 | +In LibTooling, once we have a <<compilation_database>>, we can create a `ClangTool` object to run the Clang frontend on a set of source files. |
| 118 | + |
| 119 | +[source,c++] |
| 120 | +---- |
| 121 | +clang::tooling::ClangTool Tool(compilationDatabase, sourceFiles); |
| 122 | +auto actionFactory = clang::tooling::newFrontendActionFactory<clang::SyntaxOnlyAction>(); |
| 123 | +return Tool.run(actionFactory.get()); |
| 124 | +---- |
| 125 | + |
| 126 | +The `clang::tooling::ClangTool::run` method takes a `clang::tooling::ToolAction` object that defines how to process the AST. |
| 127 | +The action object usually comes from a `clang::tooling::FrontendActionFactory`. |
| 128 | +In the example above, the `SyntaxOnlyAction` is used to parse the source code and generate the AST without any further processing. |
| 129 | + |
| 130 | +In MrDocs, this process happens in `clang::mrdocs::CorpusImpl::build` (`src/lib/Lib/CorpusImpl.cpp`), where we call `Tool.run` for each object in the database with our custom `ASTAction` action and `ASTActionFactory` factory (`src/lib/AST/ASTVisitor.cpp`). |
| 131 | + |
| 132 | +==== AST Traversal |
| 133 | + |
| 134 | +While `ASTAction` is the entry point for processing the AST, the real work is done by the `ASTVisitor` class. |
| 135 | +As the AST is generated, it is traversed by the `ASTVisitor` class. |
| 136 | + |
| 137 | +The entry point of this class is `ASTVisitor::build`, which recursively calls `ASTVisitor::traverseDecl` for the root `clang::TranslationUnitDecl` node of the translation unit. |
| 138 | +During the AST traversal stage, the complete AST generated by the clang frontend is walked beginning with this root `TranslationUnitDecl` node. |
| 139 | + |
| 140 | +Each `clang` node is converted into a `<<info_nodes,mrdocs::Info>>` node, which is then stored with any relevant information in a `mrdocs::Corpus` object. |
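The recursive conversion described above can be pictured with a toy tree. Everything here is hypothetical: `ToyDecl`, `ToyInfo`, and `ToyCorpus` only stand in for Clang declarations, `Info` nodes, and the `Corpus`. The real `ASTVisitor` handles far more cases.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a Clang declaration node.
struct ToyDecl {
    std::string name;
    std::vector<ToyDecl> children;
};

// Hypothetical stand-in for an Info node: a qualified name plus member names.
struct ToyInfo {
    std::string qualifiedName;
    std::vector<std::string> members;
};

// Hypothetical stand-in for the corpus: Info nodes keyed by qualified name.
using ToyCorpus = std::map<std::string, ToyInfo>;

// Recursively converts a declaration and its children into Info nodes.
// The root (translation unit) is expected to have an empty name.
inline void traverseDecl(ToyDecl const& decl, std::string const& scope,
                         ToyCorpus& corpus) {
    std::string const qualified =
        scope.empty() ? decl.name : scope + "::" + decl.name;
    // References into std::map stay valid while other entries are inserted.
    ToyInfo& info = corpus[qualified];
    info.qualifiedName = qualified;
    for (ToyDecl const& child : decl.children) {
        info.members.push_back(child.name);
        traverseDecl(child, qualified, corpus);
    }
}
```

Calling `traverseDecl(root, "", corpus)` on a root that contains a namespace `ns` with a function `f` produces three entries in this sketch: the root itself, `ns`, and `ns::f`.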
| 141 | + |
| 142 | +==== USR Generation |
| 143 | + |
| 143 | +It is during this stage that USRs (Unified Symbol Resolution strings) are generated and hashed with SHA1 to form the 160-bit `SymbolID` for an entity. |
| 144 | +Except for built-in types, *all* entities referenced in the corpus will be traversed and assigned a `SymbolID`, including those from the standard library. |
| 146 | +This is necessary to generate the full interface for user-defined types. |
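To make the "hash a USR into 160 bits" step concrete, here is a minimal sketch using a textbook SHA-1 implementation. It is not the code MrDocs uses. `ToySymbolID`, `symbolIdFor`, and `toHex` are hypothetical names, and the sketch assumes the USR is available as a plain string.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// A 160-bit identifier: 20 bytes, the size of a SHA-1 digest.
using ToySymbolID = std::array<std::uint8_t, 20>;

inline std::uint32_t rotl(std::uint32_t x, int n) {
    return (x << n) | (x >> (32 - n));
}

// Textbook SHA-1 (FIPS 180-4) over a byte string.
inline ToySymbolID sha1(std::string const& input) {
    std::uint32_t h[5] = {0x67452301u, 0xEFCDAB89u, 0x98BADCFEu,
                          0x10325476u, 0xC3D2E1F0u};
    // Padding: 0x80, zeros up to 56 mod 64, then the bit length (big endian).
    std::vector<std::uint8_t> msg(input.begin(), input.end());
    std::uint64_t const bitLen = static_cast<std::uint64_t>(input.size()) * 8;
    msg.push_back(0x80);
    while (msg.size() % 64 != 56)
        msg.push_back(0x00);
    for (int shift = 56; shift >= 0; shift -= 8)
        msg.push_back(static_cast<std::uint8_t>(bitLen >> shift));

    for (std::size_t off = 0; off < msg.size(); off += 64) {
        std::uint32_t w[80];
        for (int i = 0; i < 16; ++i)
            w[i] = (std::uint32_t(msg[off + 4 * i]) << 24) |
                   (std::uint32_t(msg[off + 4 * i + 1]) << 16) |
                   (std::uint32_t(msg[off + 4 * i + 2]) << 8) |
                   std::uint32_t(msg[off + 4 * i + 3]);
        for (int i = 16; i < 80; ++i)
            w[i] = rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);

        std::uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
        for (int i = 0; i < 80; ++i) {
            std::uint32_t f, k;
            if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999u; }
            else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1u; }
            else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDCu; }
            else             { f = b ^ c ^ d;                   k = 0xCA62C1D6u; }
            std::uint32_t const t = rotl(a, 5) + f + e + k + w[i];
            e = d; d = c; c = rotl(b, 30); b = a; a = t;
        }
        h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
    }

    ToySymbolID id{};
    for (int i = 0; i < 5; ++i)
        for (int j = 0; j < 4; ++j)
            id[4 * i + j] = static_cast<std::uint8_t>(h[i] >> (24 - 8 * j));
    return id;
}

// Hypothetical wrapper: an entity's ID is the digest of its USR string.
inline ToySymbolID symbolIdFor(std::string const& usr) {
    assert(!usr.empty() && "every traversed entity is expected to have a USR");
    return sha1(usr);
}

// Renders the ID as 40 lowercase hexadecimal digits.
inline std::string toHex(ToySymbolID const& id) {
    static char const digits[] = "0123456789abcdef";
    std::string out;
    for (std::uint8_t byte : id) {
        out.push_back(digits[byte >> 4]);
        out.push_back(digits[byte & 0x0F]);
    }
    return out;
}
```

Because the digest depends only on the USR text, two translation units that see the same entity compute the same ID, which is the property the later merging step relies on.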
| 147 | + |
| 148 | +==== Bitcode |
| 149 | + |
| 150 | +To maximize the size of the code base MrDocs is capable of processing, `Info` |
| 151 | +types generated during traversal are serialized to a compressed bitcode representation. |
| 152 | + |
| 153 | +The `ASTVisitor` reports each new `Info` object to the `BitcodeExecutionContext` (`src/lib/Lib/ExecutionContext.cpp`) which serializes it to the bitcode file. |
| 154 | + |
| 155 | +==== Finalizing the Corpus |
| 156 | + |
| 157 | +After running the AST traversal on all translation units, `CorpusImpl::build` runs finalization steps for the `Corpus` object. |
| 158 | +At this point, we process C++ constructs that are not directly represented in the AST. |
| 159 | + |
| 160 | +The first finalization step happens in `BitcodeExecutionContext::reportEnd` (`src/lib/Lib/ExecutionContext.cpp`), where the `Info` objects with the same `SymbolID` are merged. |
| 161 | +The merging step is necessary as there may be multiple declarations of the same entity. |
| 162 | +For instance, a function may be declared at different points in the code base, and each declaration might have different attributes or comments. |
| 163 | +At this step, the doc comments are also finalized. |
| 164 | +Each `Info` object has a pointer to its `Javadoc` object (`mrdocs/Metadata/Javadoc.hpp`), which is a representation of the documentation comments. |
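The merging step can be sketched as follows. The reduced `ToyInfo` type and the merge policy (concatenate locations, keep the first non-empty comment) are assumptions chosen for illustration, not the rules MrDocs actually applies.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, reduced Info: an ID, the locations where the entity was
// seen, and an optional doc comment.
struct ToyInfo {
    std::string id;                     // stands in for the SymbolID
    std::vector<std::string> locations; // one entry per declaration seen
    std::string doc;                    // empty when undocumented
};

// Merges `other` into `target`. Both must describe the same entity.
inline void merge(ToyInfo& target, ToyInfo const& other) {
    assert(target.id == other.id);
    target.locations.insert(target.locations.end(),
                            other.locations.begin(), other.locations.end());
    if (target.doc.empty())
        target.doc = other.doc; // keep the first non-empty comment
}

// Collapses a list of reported Info objects into one entry per ID.
inline std::map<std::string, ToyInfo> mergeAll(
    std::vector<ToyInfo> const& reported)
{
    std::map<std::string, ToyInfo> result;
    for (ToyInfo const& info : reported) {
        auto it = result.find(info.id);
        if (it == result.end())
            result.emplace(info.id, info);
        else
            merge(it->second, info);
    }
    return result;
}
```

If `f` is reported once from a header without a comment and once from a source file with one, the merged entry in this sketch lists both locations and carries the comment.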
| 165 | + |
| 166 | +After AST traversal and `Info` merging, the result is stored as a map of `Info` objects indexed by their respective `SymbolID`. |
| 167 | +A second finalization step is then performed in `clang::mrdocs::finalize` (`src/lib/Metadata/Finalize.cpp`), where any references to `SymbolID` objects that don't exist are removed. |
| 168 | +This is necessary because the AST traversal will generate references to entities that should be filtered and are not present in the corpus. |
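Removing references to missing IDs can be sketched with the erase-remove idiom. `ToyInfo`, `ToyCorpus`, and `removeDanglingReferences` are hypothetical names, and a single flat list of references is assumed for simplicity.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, reduced Info: only the IDs it refers to.
struct ToyInfo {
    std::vector<std::string> references;
};

// Hypothetical corpus: Info objects indexed by their ID.
using ToyCorpus = std::map<std::string, ToyInfo>;

// Removes every reference whose target ID is not a key in the corpus.
inline void removeDanglingReferences(ToyCorpus& corpus) {
    for (auto& entry : corpus) {
        std::vector<std::string>& refs = entry.second.references;
        refs.erase(
            std::remove_if(refs.begin(), refs.end(),
                           [&corpus](std::string const& id) {
                               return corpus.count(id) == 0;
                           }),
            refs.end());
    }
}
```

In this sketch, an entry `A` that refers to `B` and to a filtered entity keeps only `B` after the call, provided `B` is present in the corpus.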
| 169 | + |
| 170 | +At this point, the `Corpus` object contains representations of all entities in the code base and further semantic {cpp} constructs that are not directly represented in the AST can be inferred. |
| 171 | + |
| 172 | +=== Generators |
| 173 | + |
| 174 | +Documentation generators are responsible for traversing the corpus and generating documentation in the desired format. |
| 175 | + |
| 176 | +Generators may traverse the `Corpus` by calling `Corpus::traverse` with a `Corpus::Visitor` derived visitor and the `SymbolID` of the entity to visit (e.g. the global namespace). |
| 177 | + |
| 178 | +The API for documentation generators is defined in `mrdocs/Generator.hpp`. |
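The visitor-based traversal can be illustrated with a toy corpus. The types below are hypothetical and only loosely modelled on the description above. They are not the interfaces declared in `mrdocs/Corpus.hpp` or `mrdocs/Generator.hpp`.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, reduced Info: a name and the IDs of its members.
struct ToyInfo {
    std::string name;
    std::vector<int> members;
};

struct ToyCorpus {
    std::map<int, ToyInfo> infos;

    // Hypothetical visitor interface.
    struct Visitor {
        virtual ~Visitor() = default;
        virtual void visit(ToyInfo const& info, int depth) = 0;
    };

    // Visits the entity with the given ID, then its members, depth first.
    void traverse(Visitor& visitor, int id, int depth = 0) const {
        auto it = infos.find(id);
        if (it == infos.end())
            return;
        visitor.visit(it->second, depth);
        for (int member : it->second.members)
            traverse(visitor, member, depth + 1);
    }
};

// A toy "generator" that renders an indented outline of the corpus.
struct OutlineGenerator : ToyCorpus::Visitor {
    std::string out;
    void visit(ToyInfo const& info, int depth) override {
        out += std::string(static_cast<std::size_t>(depth) * 2, ' ');
        out += info.name + "\n";
    }
};
```

Starting the traversal at the ID chosen for the global namespace (0 in this sketch) visits every reachable entity once, indenting each nesting level by two spaces.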
| 179 | + |
| 180 | +=== Directory Layout |
| 181 | + |
| 182 | +The MrDocs codebase is organized as follows: |
| 183 | + |
| 184 | +==== `include/`—The main include directory |
| 185 | + |
| 186 | +This directory contains the public headers for the MrDocs library. |
| 187 | + |
| 188 | +* `include/mrdocs/`—The core library headers |
| 189 | +** `include/mrdocs/ADT`—Data Structures |
| 190 | +** `include/mrdocs/Dom`—The Document Object Model for Abstract Trees |
| 191 | +** `include/mrdocs/Metadata`—`Info` nodes and metadata classes |
| 192 | +** `include/mrdocs/Support`—Various utility classes |
| 193 | + |
| 194 | +==== `src/`—The main source directory |
| 195 | + |
| 196 | +This directory contains the source code for the MrDocs library and private headers. |
| 197 | + |
| 198 | +* `src/lib/`—The core library |
| 199 | +** `src/lib/AST/`—The AST traversal code |
| 200 | +** `src/lib/Dom/`—The Document Object Model for Abstract Trees |
| 201 | +** `src/lib/Gen/`—Generators |
| 202 | +** `src/lib/Lib/`—The core library classes |
| 203 | +** `src/lib/Metadata/`—`Info` nodes and metadata classes |
| 204 | +** `src/lib/Support/`—Various utility classes |
| 205 | +* `src/test/`—The test directory |
| 206 | +* `src/test_suite/`—The library used for testing |
| 207 | +* `src/tool/`—The main program |
| 208 | + |
| 209 | +==== `share/`—Shared resources |
| 210 | + |
| 211 | +This directory contains shared resources for the documentation generators and utilities for developers. |
| 212 | +Its subdirectories are installed in the `share` directory of the installation. |
| 213 | + |
| 214 | +* `share/`—Shared resources for the documentation generators |
| 215 | +* `share/cmake/`—CMake modules to generate the documentation |
| 216 | +* `share/gdb/`—GDB pretty printers |
| 217 | +* `share/mrdocs/`—Shared resources for the documentation generators |
| 218 | + |
| 219 | +==== `docs`—Documentation |
| 220 | + |
| 221 | +This directory contains the documentation for the MrDocs project. |
| 222 | +The documentation is written in AsciiDoc and can be built using the Antora tool. |
| 223 | + |
| 224 | +* `docs/`—Documentation configuration files and scripts |
| 225 | +** `docs/modules/`—The documentation asciidoc files |
| 226 | +** `docs/extensions`—Antora extensions for the documentation |
| 227 | + |
| 227 | +==== `third-party/`—Helpers for third-party libraries |
| 229 | + |
| 230 | +This directory contains build scripts and configuration files for third-party libraries. |
| 231 | + |
| 232 | +* `third-party/`—Third-party libraries |
| 233 | +** `third-party/llvm/`—CMake Presets for LLVM |
| 234 | +** `third-party/duktape/`—CMake scripts for Duktape |
| 235 | +** `third-party/lua/`—A bundled Lua interpreter |
| 236 | + |
| 237 | +== Coding Standards |
| 238 | + |
| 239 | +=== Paths |
| 240 | + |
| 241 | +The AST visitor and metadata all use forward slashes to represent file pathnames, even on Windows. |
| 242 | +This is so the generated reference documentation does not vary based on the platform. |
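A minimal sketch of this convention is shown below. `toForwardSlashes` is a hypothetical helper, not a function from the MrDocs codebase. It assumes every backslash in the input is a path separator, which holds for Windows-style paths but not for POSIX file names that legitimately contain a backslash.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: returns the path with every backslash replaced by a
// forward slash, so the same text is produced on every platform.
inline std::string toForwardSlashes(std::string path) {
    std::replace(path.begin(), path.end(), '\\', '/');
    return path;
}
```

For example, `toForwardSlashes("C:\\src\\lib\\AST.cpp")` returns `C:/src/lib/AST.cpp`, and a path that already uses forward slashes is returned unchanged.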
| 243 | + |
| 244 | +=== Exceptions |
| 245 | + |
| 246 | +Errors thrown by the program should always have type `Exception`. |
| 247 | +Objects of this type are capable of transporting an `Error` object. |
| 248 | +This is important for the scripting to work; exceptions are used to propagate errors from library code to scripts and back to the invoking code. |
| 249 | +For exceptional cases, these thrown exceptions should be uncaught. |
| 250 | +The tool installs an uncaught exception handler that prints a stack trace and exits the process immediately. |
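The idea of an exception type that transports an error object can be sketched as follows. `ToyError`, `ToyException`, and the two functions are hypothetical and do not reproduce the real `Error` and `Exception` classes.

```cpp
#include <cassert>
#include <exception>
#include <string>
#include <utility>

// Hypothetical, reduced Error: just a failure message.
class ToyError {
    std::string message_;
public:
    explicit ToyError(std::string message) : message_(std::move(message)) {}
    std::string const& message() const noexcept { return message_; }
};

// Hypothetical exception type that transports a ToyError across a throw.
class ToyException : public std::exception {
    ToyError error_;
public:
    explicit ToyException(ToyError error) : error_(std::move(error)) {}
    ToyError const& error() const noexcept { return error_; }
    char const* what() const noexcept override {
        return error_.message().c_str();
    }
};

// Library-style code: report failure by throwing the wrapper type.
inline int parsePositive(int value) {
    if (value <= 0)
        throw ToyException(ToyError("value must be positive"));
    return value;
}

// Caller-style code: recover the error object from the exception.
inline std::string describeFailure(int value) {
    try {
        parsePositive(value);
        return "ok";
    } catch (ToyException const& ex) {
        return ex.error().message();
    }
}
```

Using a single exception type means a boundary such as a script binding only needs one `catch` clause to turn a failure back into an error value.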
| 251 | + |
| 252 | +=== Testing |
| 253 | + |
| 254 | +All new features should be accompanied by tests. |
| 255 | +The `mrdocs-test` target is used to run the test suites. |
| 256 | +This target has its entry point in `src/test/TestMain.cpp`, which can take two paths: |
| 257 | + |
| 257 | +* Golden testing: When input paths are provided to the test executable via the command line, the test suite runs `DoTestAction()`, which iterates over all files in `test-files` and compares the output generated from each input source file with the expected XML output file. |
| 258 | +* Unit testing: When no input paths are provided, all unit tests are run via `unit_test_main()`, defined by our test-suite library in `src/test_suite/test_suite.cpp`. |
| 260 | + |
| 261 | +The fixtures for golden testing are defined in `test-files/golden-tests`, where files in each directory have the following format: |
| 262 | + |
| 263 | +* `mrdocs.yml`: Basic configuration options for all files in this directory. |
| 264 | +* `<filename>.cpp`: The input source file to extract symbols from. |
| 265 | +* `<filename>.xml`: The expected XML output file generated with the XML generator. |
| 266 | +* `<filename>.bad.xml`: The test output file generated when the test fails. |
| 267 | +* `<filename>.yml`: Extra configuration options for this specific file. |
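The file naming scheme above can be sketched with `std::filesystem`. The helper names are hypothetical, and the comparison is reduced to plain string equality. The real `DoTestAction()` may normalize or compare the output differently.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: `<filename>.cpp` -> `<filename>.xml`.
inline fs::path expectedXmlFor(fs::path source) {
    return source.replace_extension(".xml");
}

// Hypothetical helper: `<filename>.cpp` -> `<filename>.bad.xml`.
inline fs::path badXmlFor(fs::path source) {
    return source.replace_extension(".bad.xml");
}

// Result of comparing generated output against the expected fixture.
enum class GoldenResult { Match, Mismatch };

// Sketch of the comparison: an exact, byte-for-byte match is assumed.
inline GoldenResult compareGolden(std::string const& generated,
                                  std::string const& expected) {
    return generated == expected ? GoldenResult::Match
                                 : GoldenResult::Mismatch;
}
```

With these helpers, a mismatch for `function.cpp` would be reported against `function.xml`, and the generated output would be written next to it as `function.bad.xml` for inspection.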
| 268 | + |
| 269 | +== Contributing |
| 270 | + |
| 271 | +If you find a bug or have a feature request, please open an issue on the MrDocs GitHub repository: https://github.com/cppalliance/mrdocs/issues |
| 272 | + |
| 273 | +If you would like to contribute a feature or bug fix, please open a pull request on the MrDocs GitHub repository: https://github.com/cppalliance/mrdocs/pulls |
| 274 | + |
| 275 | +If you would like to discuss a feature or bug fix before opening a pull request, discussions happen in the `#mrdocs` channel on the Cpplang Slack: https://cpplang.slack.com/ |