docs: Add initial fern docs (#676)
**Try out this PR.** Quick install:

```bash
pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@78262cfa6c18c7f7a297c1516b53b994c9944981
```

Recommended with a virtual environment (using uv):

```bash
uv venv --python 3.12 && source .venv/bin/activate
uv pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@78262cfa6c18c7f7a297c1516b53b994c9944981
```
> **Note: Reviews paused.** This branch appears to be under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings.
**Walkthrough**

Introduces a Fern-based documentation framework via configuration files and a comprehensive migration skill document, while systematically standardizing documentation formatting across the codebase through comment header conversions, hyphenated filename conventions, admonition block updates, and corresponding link/path adjustments.
**Estimated code review effort:** 🎯 3 (Moderate) | ⏱️ ~30 minutes
**Pre-merge checks:** ✅ 3 passed

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 5
> **Note:** Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
🤖 Fix all issues with AI agents
In `@fern/pages/reproducibility.md`:
- Around line 76-80: The relative links in fern/pages/reproducibility.md (the
references to ../tests/integration/test_random_generator_canary.py,
../tests/integration/test_deterministic_behavior.py, and
../src/aiperf/common/random_generator.py) will 404 in the published docs; update
each to use the absolute GitHub URL for the target file in the correct
repository/branch (per the Fern migration guide) so they point to the canonical
locations (e.g., the absolute URLs for test_random_generator_canary.py,
test_deterministic_behavior.py, and aiperf/common/random_generator.py) instead
of the relative ../ paths.
In `@fern/pages/server-metrics/server-metrics-reference.md`:
- Around line 545-547: Update the two broken Markdown links in the paragraph
that currently point to server_metrics_json_schema.md and
server_metrics_parquet_schema.md: replace those underscored filenames with the
correct hyphenated filenames server-metrics-json-schema.md and
server-metrics-parquet-schema.md so the links resolve to the existing files
(look for the link text referencing "JSON Schema Reference" and "Parquet Schema
Reference" in this file).
In `@fern/pages/tutorials/custom-dataset.md`:
- Around line 122-131: Replace the hardcoded developer home path shown in the
sample output blocks in fern/pages/tutorials/custom-dataset.md (the CLI Command
/ Benchmark Duration / CSV Export / JSON Export / Log File example lines that
currently contain "/home/lkomali/aiperf/...") with a generic placeholder such as
"/home/user/aiperf/..." or use relative paths like "artifacts/..." so the sample
output no longer exposes a username; update all three identical output blocks
(the CLI/sample outputs around the CSV Export, JSON Export and Log File entries)
to use the chosen placeholder.
In `@fern/pages/tutorials/multi-url-load-balancing.md`:
- Around line 15-78: The markdown has unclosed and mis-placed code fences
causing rendering issues; close the first ```bash fence immediately after the
first command block (the "Round-robin across two servers" aiperf command), then
wrap its sample output in a separate ```text block and close it; open a new
```bash for the "Multi-GPU scaling on a single node" aiperf command and close it
after that command, then wrap its sample output in a ```text block and close it;
remove the stray trailing ``` so there are exactly paired fences for each
command and each sample output in the multi-url-load-balancing.md content.
In `@fern/pages/tutorials/prefix-synthesis.md`:
- Around line 232-248: Scenario 1 ("Simulate High Cache Hit Rate") incorrectly
uses --synthesis-prefix-root-multiplier 5 which, per the documentation for
--synthesis-prefix-root-multiplier, splits traces across multiple trees and
reduces cache hits; fix by either changing the scenario title/description to
reflect that multiplier=5 simulates a lower cache hit rate, or change the flag
value in Scenario 1 to --synthesis-prefix-root-multiplier 1 to actually simulate
high cache hit rate, and ensure the scenario text mentions the chosen behavior
to match the doc for --synthesis-prefix-root-multiplier.
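The behavior the prompt above relies on (splitting traces across more prefix-tree roots reduces the number of request pairs that can share a cached prefix) can be illustrated with a toy counting model; this is not AIPerf's implementation, only the combinatorial intuition:

```python
from collections import Counter

def shared_prefix_pairs(num_traces: int, root_multiplier: int) -> int:
    """Toy model: traces are assigned round-robin to `root_multiplier`
    independent prefix trees; only same-tree pairs can share a cached prefix."""
    tree_sizes = Counter(i % root_multiplier for i in range(num_traces))
    return sum(n * (n - 1) // 2 for n in tree_sizes.values())

print(shared_prefix_pairs(100, 1))  # 4950 -> every pair can share a prefix
print(shared_prefix_pairs(100, 5))  # 950  -> far fewer potential cache hits
```

This is why multiplier 5 belongs in a low-cache-hit scenario, and multiplier 1 in a high-cache-hit one.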
🟡 Minor comments (26)
fern/pages/tutorials/rankings.md-53-71 (1)
`53-71`: ⚠️ Potential issue | 🟡 Minor

**Add a language tag to the fenced block.**

Markdownlint flags the sample output block as missing a language. Use `text` (or `bash` if you want formatting) to satisfy MD040.

✅ Suggested fix:
````diff
-```
+```text
 INFO Starting AIPerf System
 INFO AIPerf System is PROFILING
 Profiling: 10/10 |████████████████████████| 100% [00:02<00:00]
 INFO Benchmark completed successfully
 INFO Results saved to: artifacts/BAAI_bge-reranker-base-rankings/
 NVIDIA AIPerf | LLM Metrics
 ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┓
 ┃ Metric                     ┃  avg  ┃  min  ┃  max  ┃  p99  ┃  p50  ┃
 ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━┩
 │ Request Latency (ms)       │ 52.34 │ 45.12 │ 68.45 │ 65.23 │ 51.89 │
 │ Request Throughput (req/s) │  5.12 │   -   │   -   │   -   │   -   │
 └────────────────────────────┴───────┴───────┴───────┴───────┴───────┘
 JSON Export: artifacts/BAAI_bge-reranker-base-rankings/profile_export_aiperf.json
````

fern/pages/tutorials/user-centric-timing.md-237-239 (1)

`237-239`: ⚠️ Potential issue | 🟡 Minor

**Hyphenate compound modifier in heading.**

Use "High-Throughput" as a compound adjective.

✏️ Suggested fix:

```diff
-### High Throughput Cache Test
+### High-Throughput Cache Test
```

fern/pages/tutorials/template-endpoint.md-75-75 (1)
`75-75`: ⚠️ Potential issue | 🟡 Minor

**Clarify JMESPath array indexing in examples.**

The documentation shows two different JMESPath patterns for extracting embeddings:

- Line 75: `data[0].embedding` (extracts the first element's embedding)
- Line 117: `data[].embedding` (extracts all embeddings)

These serve different purposes but the distinction isn't explained. Consider clarifying when to use indexed access (`[0]`) versus array projection (`[]`) to help users understand which pattern fits their needs.

Also applies to: 117-117
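For readers unfamiliar with JMESPath, here is a plain-Python sketch of what each expression evaluates to (the response layout is an assumed example, not taken from the tutorial):

```python
# Plain-Python equivalents of the two JMESPath expressions, to show the
# difference in what each selects.
response = {
    "data": [
        {"index": 0, "embedding": [0.1, 0.2]},
        {"index": 1, "embedding": [0.3, 0.4]},
    ]
}

# data[0].embedding: indexed access, returns only the first element's embedding
first = response["data"][0]["embedding"]

# data[].embedding: array projection, returns every element's embedding
all_embeddings = [item["embedding"] for item in response["data"]]

print(first)           # [0.1, 0.2]
print(all_embeddings)  # [[0.1, 0.2], [0.3, 0.4]]
```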
fern/pages/tutorials/vision.md-55-55 (1)
`55-55`: ⚠️ Potential issue | 🟡 Minor

**Specify fenced code block languages for sample outputs.**

Markdownlint flags these blocks as missing a language. Use `text` or `console` to satisfy MD040.

✅ Suggested fix:

````diff
-```
+```text
 INFO Starting AIPerf System
 ...
 JSON Export: artifacts/Qwen_Qwen2-VL-2B-Instruct-chat-concurrency4/profile_export_aiperf.json
````

````diff
-```
+```text
 INFO Starting AIPerf System
 ...
 JSON Export: artifacts/Qwen_Qwen2-VL-2B-Instruct-chat-concurrency1/profile_export_aiperf.json
````

Also applies to: 108-108

fern/pages/tutorials/multi-turn.md-405-405 (1)

`405-405`: ⚠️ Potential issue | 🟡 Minor

**Document the `--conversation-turn-delay-ratio` parameter in the Core Parameters section.**

This parameter is mentioned at line 405 but is missing from the "Turn Delays" subsection (lines 59-69) where the other delay parameters (`--conversation-turn-delay-mean` and `--conversation-turn-delay-stddev`) are documented. Add documentation for this parameter alongside the other turn delay options for consistency and completeness.

fern/pages/tutorials/fixed-schedule.md-139-163 (1)

`139-163`: ⚠️ Potential issue | 🟡 Minor

**Sample output entry count appears inconsistent with the schedule data.**

The schedule defined earlier has entries at timestamps: 0, 500, 750, 1000, 1250, 2000, 2500, 3000, 4000, 5000. With `--fixed-schedule-start-offset 2000` and `--fixed-schedule-end-offset 4000`, the filtered entries should include at least timestamps 2000, 2500, and 3000 (3+ entries depending on boundary inclusivity), but line 143 says "Filtered to 2 entries." Please verify and correct the sample output to match the expected filtering behavior.

fern/pages/tutorials/fixed-schedule.md-122-134 (1)

`122-134`: ⚠️ Potential issue | 🟡 Minor

**Incorrect comment: "2s to 6s" should be "2s to 4s".**

Line 123 says `# Execute schedule from 2s to 6s window`, but the actual offsets are `--fixed-schedule-start-offset 2000` (2s) and `--fixed-schedule-end-offset 4000` (4s).
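As a quick sanity check on the entry-count comment above, the window filtering can be computed directly in Python; whichever boundary convention AIPerf uses (an assumption either way), the result is more than the "2 entries" the sample output reports:

```python
# Timestamps (ms) from the schedule in the tutorial, filtered to the window
# given by --fixed-schedule-start-offset 2000 / --fixed-schedule-end-offset 4000.
timestamps = [0, 500, 750, 1000, 1250, 2000, 2500, 3000, 4000, 5000]
start, end = 2000, 4000

inclusive_end = [t for t in timestamps if start <= t <= end]
exclusive_end = [t for t in timestamps if start <= t < end]

print(inclusive_end)  # [2000, 2500, 3000, 4000]
print(exclusive_end)  # [2000, 2500, 3000]
```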
Proposed fix:

```diff
-# Execute schedule from 2s to 6s window
+# Execute schedule from 2s to 4s window
```

fern/pages/tutorials/local-tokenizer.md-82-89 (1)
`82-89`: ⚠️ Potential issue | 🟡 Minor

**Sample output table is missing its top border row.**

Other sample outputs in the tutorials include the full table frame starting with `┏━━━...`. Here the table begins directly with `┃` (line 83), missing the opening border. This inconsistency could confuse readers.

Proposed fix — add the missing top border:

```diff
 NVIDIA AIPerf | LLM Metrics
+┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓
 ┃ Metric                     ┃   avg   ┃   min   ┃   max   ┃   p99   ┃   p50   ┃
```

fern/pages/tutorials/time-based-benchmarking.md-169-169 (1)
`169-169`: ⚠️ Potential issue | 🟡 Minor

**Fix broken link: `benchmark_modes` should be `benchmark-modes`.**

The path `../benchmark_modes/timing-modes-reference.md` uses an underscore, but the actual directory in `fern/pages/` is `benchmark-modes` (with a hyphen).

Proposed fix:

```diff
-- [Timing Modes Reference](../benchmark_modes/timing-modes-reference.md) — Complete CLI compatibility matrix
+- [Timing Modes Reference](../benchmark-modes/timing-modes-reference.md) — Complete CLI compatibility matrix
```

fern/pages/tutorials/working-with-profile-exports.md-96-106 (1)
`96-106`: ⚠️ Potential issue | 🟡 Minor

**Fix duplicate metric in example.**

Lines 98 and 101 both show `time_to_first_token` with different values and units. This appears to be a copy-paste error. The second occurrence should likely be `time_to_second_token` or another distinct metric.

🔧 Proposed fix:

```diff
 "metrics": {
   "input_sequence_length": {"value": 550, "unit": "tokens"},
   "time_to_first_token": {"value": 255.88656799999998, "unit": "ms"},
   "request_latency": {"value": 297.52522799999997, "unit": "ms"},
   "output_token_count": {"value": 9, "unit": "tokens"},
-  "time_to_first_token": {"value": 4.8984369999999995, "unit": "ms"},
+  "time_to_second_token": {"value": 4.8984369999999995, "unit": "ms"},
   "inter_chunk_latency": {"value": [4.898437, 5.316006, 4.801489, 5.674918, 4.811467, 5.097998, 5.504797, 5.533548], "unit": "ms"},
   "output_sequence_length": {"value": 9, "unit": "tokens"},
   "inter_token_latency": {"value": 5.2048325, "unit": "ms"},
   "output_token_throughput_per_user": {"value": 192.1291415237666, "unit": "tokens/sec/user"}
 },
```

fern/pages/tutorials/plot.md-82-95 (1)
`82-95`: ⚠️ Potential issue | 🟡 Minor

**Fix extra closing backticks.**

Line 95 has three closing backticks when only one set is needed to close the output block that started on Line 86. This will cause rendering issues.

🔧 Proposed fix:

````diff
 INFO Using dark theme
 INFO Found 3 runs to compare
 INFO Generating 3 comparison plots
 INFO Successfully generated 3 plots
 INFO Plots saved to: artifacts/sweep_qwen/plots/
-```
````

fern/pages/tutorials/timeslices.md-12-13 (1)

`12-13`: ⚠️ Potential issue | 🟡 Minor

**Use hyphenated compound adjective.**

"Equal-duration segments" reads cleaner than "equal duration segments."

fern/pages/benchmark-modes/trace-replay.md-47-52 (1)

`47-52`: ⚠️ Potential issue | 🟡 Minor

**`hash_ids` listed under "Required fields" but marked "(optional)" — confusing.**

The section heading says "Required fields for trace replay" but `hash_ids` is annotated as `(optional)`. Consider either moving it to a separate "Optional fields" list or changing the heading to "Fields for trace replay."

Proposed fix:

```diff
-Required fields for trace replay:
+Fields for trace replay:
 - `timestamp`: Request arrival time in milliseconds
 - `input_length`: Number of input tokens
 - `output_length`: Number of output tokens
-- `hash_ids`: List of block hashes (optional)
+- `hash_ids`: List of block hashes *(optional)*
```

fern/pages/server-metrics/server-metrics-json-schema.md-38-43 (1)
`38-43`: ⚠️ Potential issue | 🟡 Minor

**Fix broken internal links — filenames use hyphens, not underscores.**

Three links in the related documentation section reference files with underscores, but the actual filenames use hyphens. These links will produce 404s and need to be corrected.

Proposed fix:

```diff
-The Parquet format exports raw time-series data with delta calculations in columnar format, optimized for SQL analytics with DuckDB, pandas, or Polars. See [Parquet Schema Reference](server_metrics_parquet_schema.md) for the complete schema.
+The Parquet format exports raw time-series data with delta calculations in columnar format, optimized for SQL analytics with DuckDB, pandas, or Polars. See [Parquet Schema Reference](server-metrics-parquet-schema.md) for the complete schema.

 **Related documentation:**
 - [Server Metrics Tutorial](server-metrics.md) - Quick start guide and usage examples
-- [Server Metrics Reference](server_metrics_reference.md) - Metric definitions by backend (vLLM, SGLang, TRT-LLM, Dynamo)
-- [Parquet Schema Reference](server_metrics_parquet_schema.md) - Raw time-series data schema
+- [Server Metrics Reference](server-metrics-reference.md) - Metric definitions by backend (vLLM, SGLang, TRT-LLM, Dynamo)
+- [Parquet Schema Reference](server-metrics-parquet-schema.md) - Raw time-series data schema
```

fern/pages/metrics-reference.md-216-216 (1)
`216-216`: ⚠️ Potential issue | 🟡 Minor

**Use hyphenated compound modifiers (e.g., "Inter-Token").**

These headings should be hyphenated for correctness and consistency (Inter-Token, Inter-Chunk, Token-Based).

Proposed fix:

```diff
-### Inter Token Latency (ITL)
+### Inter-Token Latency (ITL)

-### Inter Chunk Latency (ICL)
+### Inter-Chunk Latency (ICL)

-## Token Based Metrics
+## Token-Based Metrics
```

Also applies to: 240-240, 298-298
fern/pages/server-metrics/server-metrics.md-158-158 (1)
`158-158`: ⚠️ Potential issue | 🟡 Minor

**Fix broken Parquet schema link.**

The link uses underscore naming and likely doesn't resolve; the file in this PR uses hyphens.

Proposed fix:

```diff
-See [Parquet Schema Reference](server_metrics_parquet_schema.md) for complete schema, metadata, and query examples.
+See [Parquet Schema Reference](server-metrics-parquet-schema.md) for complete schema, metadata, and query examples.
```

fern/pages/cli-options.md-431-434 (1)
`431-434`: ⚠️ Potential issue | 🟡 Minor

**Fix missing space after period.**

This reads as a typo in the rendered docs.

Proposed fix:

```diff
-Note that due to the prefix and user prompts being concatenated, the number of tokens in the final prompt may be off by one.Mutually exclusive with `--shared-system-prompt-length`/`--user-context-prompt-length`.
+Note that due to the prefix and user prompts being concatenated, the number of tokens in the final prompt may be off by one. Mutually exclusive with `--shared-system-prompt-length`/`--user-context-prompt-length`.
```

fern/pages/tutorials/arrival-patterns.md-392-396 (1)
`392-396`: ⚠️ Potential issue | 🟡 Minor

**Fix broken relative link to timing modes reference.**

The directory name is `benchmark-modes`, not `benchmark_modes`.

Proposed fix:

```diff
-- [Timing Modes Reference](../benchmark_modes/timing-modes-reference.md) — Complete CLI compatibility matrix
+- [Timing Modes Reference](../benchmark-modes/timing-modes-reference.md) — Complete CLI compatibility matrix
```

fern/pages/benchmark-modes/timing-modes-reference.md-83-84 (1)
`83-84`: ⚠️ Potential issue | 🟡 Minor

**Remove blank line inside blockquote.**

markdownlint MD028 flags this; keep the blockquote contiguous.

Proposed fix:

```diff
-> **Important**: If `--concurrency` is not set, session concurrency limiting is **disabled** (unlimited). For `--user-centric-rate` mode, consider setting `--concurrency` to at least `--num-users` to ensure all users can have in-flight requests.
-
-> **See also**: [Prefill Concurrency Tutorial](../tutorials/prefill-concurrency.md) for detailed guidance on memory-safe long-context benchmarking.
+> **Important**: If `--concurrency` is not set, session concurrency limiting is **disabled** (unlimited). For `--user-centric-rate` mode, consider setting `--concurrency` to at least `--num-users` to ensure all users can have in-flight requests.
+> **See also**: [Prefill Concurrency Tutorial](../tutorials/prefill-concurrency.md) for detailed guidance on memory-safe long-context benchmarking.
```

fern/pages/diagrams/metrics-flow.md-34-49 (1)
`34-49`: ⚠️ Potential issue | 🟡 Minor

**Stage numbering jumps from "Stage 2" to "Stage 4" — missing "Stage 3".**

Line 34 comments `Stage 2` and Line 48 comments `Stage 4`. Either rename to `Stage 3` or add the missing stage.

fern/pages/migrating.md-1-4 (1)
`1-4`: ⚠️ Potential issue | 🟡 Minor

**Copyright year range inconsistent with other files in this PR.**

This file uses `2024-2025` while most other new files in this PR use `2025-2026` or `2025`. If `2024` is intentional (e.g., content originated in 2024), this is fine — just flagging the inconsistency for confirmation.

fern/pages/server-metrics/server-metrics-reference.md-156-158 (1)
`156-158`: ⚠️ Potential issue | 🟡 Minor

**Broken link fragment: `#histogram-buckets` heading does not exist.**

The link on Line 158 points to `[Histogram Buckets](#histogram-buckets)`, but there is no `## Histogram Buckets` heading in this document. The bucket definitions are inline within metric tables. Either add a dedicated heading or update the link to point to an existing anchor (e.g., `#metric-interpretation-guide`).

.cursor/skills/docs-to-fern/SKILL.md-967-972 (1)
`967-972`: ⚠️ Potential issue | 🟡 Minor

**File appears truncated — item 6 ends mid-sentence.**

Line 972 ends without a period and the numbered list seems incomplete. The closing `` ``` `` for the skill file may also be missing.

Proposed fix:

```diff
-6. **Get logo assets early.** The NVIDIA logo SVGs and favicon are required before `fern docs dev` will render correctly. Copy from an existing NVIDIA Fern project or request from design
+6. **Get logo assets early.** The NVIDIA logo SVGs and favicon are required before `fern docs dev` will render correctly. Copy from an existing NVIDIA Fern project or request from design.
```

fern/pages/diagrams/metrics-flow.md-75-84 (1)
`75-84`: ⚠️ Potential issue | 🟡 Minor

**Style assignments reference undefined nodes `I1` and `F`.**

Line 82 applies the `statistics` class to `I1`, but only `I2` is defined (Line 49). Line 83 applies `transport` to `F`, which doesn't exist anywhere in the diagram. These appear to be remnants of a previous diagram revision. Mermaid will silently ignore them, but they're misleading.

Proposed fix:

```diff
-    class I1,G statistics
+    class G statistics
-    class E1,E2,E3,F,L transport
+    class E1,E2,E3,L transport
```

fern/pages/server-metrics/server-metrics-reference.md-166-168 (1)
`166-168`: ⚠️ Potential issue | 🟡 Minor

**Heading hierarchy: `Dynamo Frontend` should be `###` (h3), not `##` (h2).**

`Dynamo Frontend` (Line 168) is logically a subsection of `Detailed Metric Definitions` (Line 166), as shown in the TOC (Line 13). The same applies to the other backend headings (`Dynamo Component`, `vLLM`, `SGLang`, `TensorRT-LLM`, `KVBM`). Using h2 for both the parent and children flattens the hierarchy and may cause Fern's sidebar to display them incorrectly.

Proposed fix (apply to all backend subsection headings):

```diff
-## Dynamo Frontend
+### Dynamo Frontend
```

Similarly for `## Dynamo Component`, `## vLLM`, `## SGLang`, `## TensorRT-LLM`, `## KVBM`.

fern/pages/api/synthesis.md-556-557 (1)
`556-557`: ⚠️ Potential issue | 🟡 Minor

**Fix broken documentation link in "See Also" section.**

The link to `../benchmark_modes/trace_replay.md` is broken. The correct path is `../benchmark-modes/trace-replay.md` (using dashes instead of underscores in both the directory and filename).

Corrected snippet:

```markdown
- [Prefix Synthesis Tutorial](../tutorials/prefix-synthesis.md)
- [Trace Replay](../benchmark-modes/trace-replay.md)
```
🧹 Nitpick comments (44)
fern/pages/tutorials/user-centric-timing.md (1)
`91-93`: Add languages to fenced code blocks (MD040).

markdownlint warns when fenced blocks omit a language; please tag them (e.g., `text` for diagrams/outputs, `bash` for commands).

🔧 Proposed updates:
````diff
-```
+```text
 turn_gap = num_users / user_centric_rate
````

````diff
-```
+```text
 Evaluate: Benchmark Execution Timeline (t=0 to t=30s)
 ---------------------------------------------------------------------
 TIME (s) >>>    0    1    2    3    4    5    6    7    8    9   10   11   12 ...
@@
 RESULT: Immediate mix of fresh sessions (User 16) and deep sessions (User 14), with users finishing and churning naturally from t=6s onwards.
````

````diff
-```
+```text
 ┌─────────────────────────────────────────────────────────────┐
 │ Shared System Prompt (1000 tokens) │ ← Same across ALL users
 │ "You are a helpful assistant..."   │ (KV cache shared prefix)
@@
 └─────────────────────────────────────────────────────────────┘
````

````diff
-```
+```text
 INFO Starting AIPerf System
 INFO User-centric mode: 15 users, 1.0 req/s (15.0s turn gap per user)
@@
 JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-rate1.0/profile_export_aiperf.json
````

````diff
-```
+```text
 INFO Starting AIPerf System
 INFO User-centric mode: 15 users, 4.0 req/s (3.75s turn gap per user)
@@
 JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-rate4.0/profile_export_aiperf.json
````

````diff
-```
+```text
 INFO Starting AIPerf System
 INFO User-centric mode: 15 users, 0.5 req/s (30.0s turn gap per user)
@@
 JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-rate0.5/profile_export_aiperf.json
````

Also applies to: 106-129, 145-156, 207-231, 259-283, 309-333

fern/pages/tutorials/template-endpoint.md (2)

`91-97`: Consider documenting when named content variables are populated.

The named content variables (`query`, `queries`, `passage`, `passages`, etc.) are listed but lack context about:

- When these variables are populated vs. being `None`
- How they differ from generic `text`/`texts` variables
- What input format is required to use them

Adding a brief explanation would help users understand when to use `query` vs `text` in their templates.
---

`222-222`: Expand guidance on `tojson` filter usage.

The troubleshooting section states "Use `|tojson` filter for string or nullable values," but the examples throughout the document show `tojson` being used for all JSON-serializable types (strings, lists, dicts, numbers). Consider revising to: "Use `|tojson` filter for all values to ensure proper JSON serialization and escaping" to better align with the "Always use `|tojson`" tip on line 214 and the actual usage patterns shown in the examples.

fern/pages/tutorials/vision.md (1)

`18-22`: Pin the vLLM Docker image tag to a specific version for reproducibility.

The `:latest` tag is a moving target and can cause non-deterministic behavior in the tutorial. Use a version-pinned tag instead, such as `:v0.15.1` (the current stable release), which aligns with vLLM's official documentation recommendations for reproducible deployments.

fern/pages/tutorials/multi-turn.md (2)

`108-129`: Add language identifier to sample output code block.

The sample output code block should have a language identifier (e.g., `text` or `console`) for better syntax highlighting and consistency with Markdown best practices.

📝 Proposed fix:

````diff
 **Sample Output (Successful Run):**
-```
+```text
 INFO Starting AIPerf System
````
`20-29`: Optional: Remove blank line inside blockquote for consistency.

There's a blank line between the two blockquote sections (line 29). While this doesn't affect functionality, removing it would align with Markdown best practices and silence the linter warning.

📝 Proposed fix:

```diff
 > These options are mutually exclusive in their intent - use `--request-count` for single-turn benchmarking and `--conversation-num` for multi-turn benchmarking to avoid confusion.
-
 > [!NOTE]
 > **Dataset Generation vs Request Execution**
```

fern/pages/tutorials/ui-types.md (1)
`83-93`: Add a language identifier to fenced code blocks.

The sample output code blocks at lines 83 and 116 lack a language specifier, which triggers MD040 warnings. Adding `text` (or `console`) satisfies the linter and can improve rendering.

Proposed fix:

````diff
-```
+```text
 INFO Starting AIPerf System
````

Apply the same change at line 116.
fern/pages/tutorials/fixed-schedule.md (1)
`87-110`: Add language identifier to sample output code blocks.

Lines 87 and 139 open fenced code blocks without a language specifier (MD040). Use ` ```text ` for linting compliance.

fern/pages/tutorials/goodput.md (1)
`54-74`: Add language identifier to the output code block.

Line 56 opens a fenced code block without a language specifier. Use ` ```text ` for consistency with linting rules (MD040).

fern/pages/tutorials/time-based-benchmarking.md (1)
`26-35`: Add language identifiers to fenced code blocks.

The ASCII diagram (line 26) and sample output blocks (lines 76, 121) lack language specifiers (MD040). Use ` ```text ` for these blocks.

Also applies to: 76-104, 121-149
fern/pages/tutorials/local-tokenizer.md (1)
`30-36`: Add language identifiers to fenced code blocks.

The directory listing (line 30) and sample output (line 71) lack language specifiers (MD040). Use ` ```text ` for these.

Also applies to: 71-92
fern/pages/tutorials/custom-dataset.md (1)
`90-131`: Add language identifiers to sample output code blocks.

Lines 90, 171, and 265 open fenced code blocks without a language specifier (MD040). Use ` ```text `.

Also applies to: 171-210, 265-305
fern/pages/tutorials/audio.md (1)
`97-109`: Move the fixture path note before the code example.

The note on Line 101 warns users to replace fixture paths, but it appears after the code example that contains those paths (Lines 105-109). Users who copy-paste the example first may miss this critical instruction.
Consider moving the note to Line 97 (before the code example) to prevent confusion.
📝 Suggested reordering
## Profile with Custom Input File AIPerf can automatically load and encode audio files from local paths. +> **Note:** The example below uses paths from the AIPerf test fixtures directory. Replace these with paths to your own audio files. + {/* aiperf-run-vllm-audio-openai-endpoint-server */} ```bash cat <<EOF > inputs.jsonl {"texts": ["Transcribe this."], "audios": ["/fixtures/audio/test_audio_1s.wav"]} {"texts": ["What is said?"], "audios": ["/fixtures/audio/test_audio_2.wav"]} {"texts": ["Summarize."], "audios": ["/fixtures/audio/test_audio_3.wav"]} EOF -aiperf profile \ - --model Qwen/Qwen2-Audio-7B-Instruct \ - --endpoint-type chat \ - --input-file inputs.jsonl \ - --custom-dataset-type single_turn \ - --streaming \ - --url localhost:8000 \ - --request-count 3 -``` -{/* /aiperf-run-vllm-audio-openai-endpoint-server */} - -AIPerf will automatically: -- Load the audio files from the specified paths -- Convert them to base64 format -- Send them to the model endpoint - -> **Note:** The example below uses paths from the AIPerf test fixtures directory. Replace these with paths to your own audio files. - -{/* aiperf-run-vllm-audio-openai-endpoint-server */} -```bash -cat <<EOF > inputs.jsonl -{"texts": ["Transcribe this."], "audios": ["/fixtures/audio/test_audio_1s.wav"]} -{"texts": ["What is said?"], "audios": ["/fixtures/audio/test_audio_2.wav"]} -{"texts": ["Summarize."], "audios": ["/fixtures/audio/test_audio_3.wav"]} -EOF - aiperf profile \ --model Qwen/Qwen2-Audio-7B-Instruct \ --endpoint-type chat \ --input-file inputs.jsonl \ --custom-dataset-type single_turn \ --streaming \ --url localhost:8000 \ --request-count 3{/* /aiperf-run-vllm-audio-openai-endpoint-server */}
+AIPerf will automatically:
+- Load the audio files from the specified paths
+- Convert them to base64 format
+- Send them to the model endpoint</details> </blockquote></details> <details> <summary>fern/pages/tutorials/timeslices.md (1)</summary><blockquote> `63-76`: **Add a language identifier to the sample output fence.** Use ```text (or ```console) for the log block to satisfy MD040 and keep lint clean. </blockquote></details> <details> <summary>fern/pages/tutorials/sglang-video-generation.md (1)</summary><blockquote> `104-173`: **Add language identifiers to output/log fences.** Use ```text for the Uvicorn log snippet and the sample output table to satisfy MD040. </blockquote></details> <details> <summary>fern/pages/tutorials/openai-text-endpoints.md (1)</summary><blockquote> `47-68`: **Add language identifiers to the sample output fences.** Use ```text for the log/output blocks to clear MD040. Also applies to: 117-139 </blockquote></details> <details> <summary>fern/pages/tutorials/sglang-image-generation.md (1)</summary><blockquote> `96-139`: **Add language identifiers to fenced blocks.** Use ```text for sample outputs and ```json for prompt examples to satisfy MD040 and improve readability. Also applies to: 224-249 </blockquote></details> <details> <summary>fern/pages/tutorials/sequence-distributions.md (1)</summary><blockquote> `45-73`: **Add language identifiers to fenced blocks.** Use ```text for the string-format examples and the sample output to resolve MD040. Also applies to: 103-127 </blockquote></details> <details> <summary>fern/pages/tutorials/gpu-telemetry.md (1)</summary><blockquote> `159-193`: **Add language identifiers to output fences.** Use ```text for console/CSV output snippets to satisfy MD040. Also applies to: 499-538 </blockquote></details> <details> <summary>fern/pages/tutorials/custom-prompt-benchmarking.md (1)</summary><blockquote> `68-97`: **Add a language identifier to the sample output fence.** Use ```text for the log/output block to satisfy MD040. 
</blockquote></details> <details> <summary>fern/pages/comprehensive-llm-benchmarking.md (1)</summary><blockquote> `87-107`: **Prefer #### for subsections under ## in this doc.** This file uses many ### headings directly under ## (e.g., “### Command”, “### Parameters Explained”). Consider switching to #### for consistency with the docs style preference. Based on learnings: “In the aiperf repository's docs/metrics_reference.md file, the maintainer prefers using h4 headings (####) for subsections under h2 headings instead of h3 (###) for better visual sizing and readability, even though this violates markdownlint rule MD001.” Also applies to: 118-140 </blockquote></details> <details> <summary>fern/pages/server-metrics/server-metrics-json-schema.md (1)</summary><blockquote> `1-2`: **Minor: Copyright year inconsistency.** This file uses `2025` only, while most other new files in this PR use `2025-2026`. </blockquote></details> <details> <summary>fern/pages/tutorials/prefix-synthesis.md (1)</summary><blockquote> `39-66`: **Add language specifiers to fenced code blocks.** Multiple output/error blocks lack a language identifier (flagged by markdownlint MD040). Use ` ```text ` for terminal output and error message blocks (lines 39, 112, 378, 387, 394). Also applies to: 112-136, 378-400 </blockquote></details> <details> <summary>fern/pages/tutorials/warmup.md (1)</summary><blockquote> `14-26`: **Add language specifiers to fenced code blocks.** Several output and diagram blocks lack language identifiers (markdownlint MD040). Use ` ```text ` for ASCII art diagrams (lines 14, 254) and terminal output blocks (lines 54, 122, 157, 194, 233). 
Also applies to: 54-70, 122-138, 157-173, 194-210, 233-250, 254-264 </blockquote></details> <details> <summary>fern/pages/index.md (1)</summary><blockquote> `6-20`: **Landing page is sparse — consider adding navigation links and install instructions.** This is the docs entry point, but it lacks links to the key sections defined in `next.yml` (Tutorials, CLI Options, Architecture, etc.) and provides no concrete install command or API example. Users landing here have no actionable path forward. Consider at minimum: - A `pip install aiperf` (or equivalent) command in Quick Start - Links to the tutorials section, CLI reference, and architecture page - A brief description of what AIPerf benchmarks (LLM inference servers) </blockquote></details> <details> <summary>fern/pages/plugins/plugin-system.md (1)</summary><blockquote> `44-52`: **Add language identifiers to fenced code blocks.** Several fences are missing a language tag, which triggers MD040 and reduces syntax highlighting consistency. <details> <summary>Proposed fix</summary> ```diff -``` +```text Registry (singleton) └── Package (1+) ─── discovered via entry points └── Manifest (1+ per package) ─── plugins.yaml files └── Category (1+) └── Entry (1+) ─── PluginEntry ├── Class ─── lazy-loaded Python class └── Metadata ─── optional typed config -``` +``` -``` +```text Entry Points → plugins.yaml → Pydantic Validation → Registry ↓ get_class() → Import Module → Cache -``` +``` -``` +```text TypeNotFoundError: Type 'my_plugin' not found for category 'endpoint'. -``` +``` -``` +```text ImportError: Failed to import module for endpoint:my_plugin -``` +``` -``` +```text AttributeError: Class 'MyClass' not found -``` +```Also applies to: 70-74, 387-395, 399-407, 410-417
fern/pages/metrics-reference.md (1)
141-282: Consider h4 subsections under h2 for readability (project style).This file uses h3 for subsections under h2; project preference is to use h4 for better visual sizing.
Based on learnings: “In the aiperf repository's docs/metrics_reference.md file, the maintainer prefers using h4 headings (####) for subsections under h2 headings instead of h3 (###) for better visual sizing and readability, even though this violates markdownlint rule MD001.”
Also applies to: 298-1187
fern/pages/reference/tokenizer-auto-detection.md (1)
`34-37`: Add language identifiers to fenced output blocks. This resolves MD040 and improves readability.
Proposed fix
````diff
-```
+```text
 INFO ✓ Tokenizer Qwen/Qwen3-0.6B detected for qwen3-0.6b
 INFO 1 tokenizer validated • 1 resolved • 0.3s
-```
+```
-```
+```text
 ╭──────────────────────────────── Ambiguous Tokenizer Name ─────────────────────────────────╮
 ...
 ╰───────────────────────────────────────────────────────────────────────────────────────────╯
-```
+```
-```
+```text
 ╭───────────────────────────────── Gated Repository ──────────────────────────────────╮
 ...
 ╰─────────────────────────────────────────────────────────────────────────────────────╯
-```
+```
````

Also applies to: 40-60, 63-82
fern/pages/tutorials/request-rate-concurrency.md (1)
`97-115`: Add language identifiers to output fences. These look like console output; use
`text` for consistent highlighting and to satisfy MD040. Also applies to: 141-159
fern/pages/tutorials/prefill-concurrency.md (1)
`14-25`: Add language identifiers to fenced blocks. Use
`text` for diagrams/output and `bash` for CLI blocks to satisfy MD040. Also applies to: 33-52, 100-121, 147-168, 172-181, 202-224
fern/pages/server-metrics/server-metrics.md (1)
`71-71`: Add language identifiers to fenced blocks. Use
`bash`/`json` as appropriate to satisfy MD040 and improve readability. Also applies to: 396-396
fern/pages/tutorials/ramping.md (1)
`14-26`: Add language identifiers to fenced blocks. Several fences are missing a language tag; use
`text` for diagrams/output and `bash` for CLI blocks to satisfy MD040. Also applies to: 66-87, 90-99, 115-136, 139-148, 168-190, 209-230
fern/pages/tutorials/request-cancellation.md (1)
`16-30`: Add language tags to fenced blocks. markdownlint flags these blocks for missing language; using
`text` keeps formatting while satisfying MD040.

Proposed fix
````diff
-```
+```text
 T0: Request scheduled
 ...
 T3: Request cancelled if still waiting for response
-```
+```
-```
+```text
 INFO Starting AIPerf System
 ...
 JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-concurrency8/profile_export_aiperf.json
-```
+```
-```
+```text
 INFO Starting AIPerf System
 ...
 JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-concurrency10/profile_export_aiperf.json
-```
+```
-```
+```text
 INFO Starting AIPerf System
 ...
 JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-concurrency15/profile_export_aiperf.json
-```
+```
````
fern/pages/benchmark-modes/timing-modes-reference.md (1)
`248-266`: Add language tag to the ASCII diagram. Helps with MD040 and keeps consistent formatting.
Proposed fix
````diff
-```
+```text
 ┌─────────────────────────────────────────────────────────────────┐
 ...
 └─────────────────────────────────────────────────────────────────┘
-```
+```
````
82-87: Add language tags to ASCII tables.Avoids MD040 while preserving formatting.
Proposed fix
````diff
-```
+```text
 endpoint_url | metric_name | metric_type | unit | description |timestamp_ns | model_name | value | sum | count | bucket_le | bucket_count
 ...
-```
+```
-```
+```text
 endpoint_url | metric_name | metric_type | unit | description | timestamp_ns | model_name | value | sum | count | bucket_le | bucket_count
 ...
-```
+```
````

Also applies to: 91-98
fern/pages/tutorials/http-trace-metrics.md (1)
`34-45`: Add language tags to fenced blocks. Keeps markdownlint happy for diagrams/formulas.
Proposed fix
````diff
-```
+```text
 Request Lifecycle
 ──────────────────────────────────────────────────────────────────────────────►
 ...
-```
+```
-```
+```text
 http_req_duration = response_receive_end_perf_ns - request_send_start_perf_ns
-```
+```
-```
+```text
 http_req_connection_overhead = http_req_blocked + http_req_dns_lookup + http_req_connecting
-```
+```
-```
+```text
 http_req_total = http_req_blocked + http_req_dns_lookup + http_req_connecting
                + http_req_sending + http_req_waiting + http_req_receiving
-```
+```
-```
+```text
 [content chunk 1] ─► included in both metrics
 ...
 [DONE] ─► http_req_total ends here (last network chunk)
-```
+```
````

Also applies to: 88-90, 98-100, 106-109, 137-144
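The additive relationship in the fix above is easy to sanity-check numerically; the millisecond values below are made up for illustration and are not taken from any real benchmark run.

```python
# Hypothetical per-phase timings in milliseconds (illustrative values only),
# combined according to the http_req_total formula quoted above.
components = {
    "http_req_blocked": 0.4,
    "http_req_dns_lookup": 1.2,
    "http_req_connecting": 2.1,
    "http_req_sending": 0.3,
    "http_req_waiting": 85.0,
    "http_req_receiving": 11.0,
}

# Total is the sum of all six phases; connection overhead is the first three.
http_req_total = sum(components.values())
http_req_connection_overhead = (
    components["http_req_blocked"]
    + components["http_req_dns_lookup"]
    + components["http_req_connecting"]
)

print(round(http_req_total, 1))                # 100.0
print(round(http_req_connection_overhead, 1))  # 3.7
```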
fern/pages/tutorials/arrival-patterns.md (1)
`14-20`: Add language tags to diagram blocks. Use
`text` to satisfy MD040 without changing rendering.

Proposed fix
````diff
-```
+```text
 Constant Pattern: Poisson Pattern: Gamma (bursty):
 ...
-```
+```
-```
+```text
 Inter-arrival times: 10 QPS → every 100ms:
 |····|····|····|····|····|····|
 ...
-```
+```
-```
+```text
 Inter-arrival times (exponential): 10 QPS average:
 |··|······|·|···|····|··|·······|···|
 ...
-```
+```
-```
+```text
 Burst mode (concurrency=3):
 [Req1]────────────────────────────▶
 ...
-```
+```
````

Also applies to: 45-49, 65-69, 121-127
fern/pages/tutorials/synthetic-video.md (1)
`67-90`: Add language tag to sample output block. Addresses MD040 while preserving the console output look.
Proposed fix
````diff
-```
+```text
 INFO Starting AIPerf System
 ...
 JSON Export: artifacts/your-model-name-chat-concurrency1/profile_export_aiperf.json
-```
+```
````

fern/docs.yml (1)
`1-23`: Logo and favicon configuration missing.

The SKILL.md guide in this PR lists `logo` and `favicon` as required fields for NVIDIA projects. If assets aren't available yet, consider adding a TODO comment so this doesn't get forgotten. `fern docs dev` may not render correctly without them.

```yaml
# TODO: Add logo and favicon once assets are available
# logo:
#   href: /
#   light: ./assets/img/nvidia-logo.svg
#   dark: ./assets/img/nvidia-logo-dark.svg
#   height: 50
# favicon: ./assets/img/favicon.png
```
`1-3`: Inconsistent SPDX header format across documentation files. This file uses JSX comment syntax
`{/* ... */}` for the SPDX header, while `migrating.md` and `creating-your-first-plugin.md` use YAML frontmatter (`---`). Both work in Fern, but consistency across the doc set would improve maintainability. The same inconsistency appears in `server-metrics-reference.md` and `reproducibility.md`. Pick one approach and apply it uniformly. The YAML frontmatter style is recommended by the SKILL.md guide in this PR.
fern/pages/plugins/creating-your-first-plugin.md (1)
`35-62`: Add language identifiers to fenced code blocks for consistent rendering. Several code blocks (lines 35, 50, 203, 274, 310, 321) lack language identifiers. Adding
`text` will satisfy the markdown linter and may improve Fern's syntax highlighting behavior.

Example fix for directory trees
````diff
-```
+```text
 Package (my-aiperf-plugins)
````
105-112: Clarify the return type format in the example output.The example shows
statsas a plain dictionary, butget_stats()returns aRadixTreeStatsdataclass object. The actual representation would beRadixTreeStats(num_nodes=7, num_leaves=3, total_visits=3, max_depth=3)rather than a dictionary. Consider updating the example to reflect the actual dataclass representation to avoid confusion.📝 Suggested clarification
````diff
 # Get statistics
 stats = tree.get_stats()
-# {
-#     'num_nodes': 7,
-#     'num_leaves': 3,
-#     'total_visits': 3,
-#     'max_depth': 3
-# }
+# RadixTreeStats(num_nodes=7, num_leaves=3, total_visits=3, max_depth=3)
````
`348-353`: Spell out the "ge" abbreviation for better clarity. The abbreviation "ge" (greater than or equal to) is mathematical shorthand that may not be immediately clear to all readers. Consider spelling it out or using the
`>=` symbol for better readability.

📖 Suggested clarification
````diff
 **Fields:**
-- `speedup_ratio: float = 1.0` - Timestamp scaling multiplier (ge 0.0)
-- `prefix_len_multiplier: float = 1.0` - Core prefix length multiplier (ge 0.0)
-- `prefix_root_multiplier: int = 1` - Number of independent trees to distribute traces across (ge 1)
-- `prompt_len_multiplier: float = 1.0` - Leaf prompt length multiplier (ge 0.0)
+- `speedup_ratio: float = 1.0` - Timestamp scaling multiplier (>= 0.0)
+- `prefix_len_multiplier: float = 1.0` - Core prefix length multiplier (>= 0.0)
+- `prefix_root_multiplier: int = 1` - Number of independent trees to distribute traces across (>= 1)
+- `prompt_len_multiplier: float = 1.0` - Leaf prompt length multiplier (>= 0.0)
 - `max_isl: int | None = None` - Maximum input sequence length filter
-- `block_size: int = 512` - KV cache page size (ge 1)
+- `block_size: int = 512` - KV cache page size (>= 1)
````
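For readers unfamiliar with the shorthand, "ge" here is the Pydantic-style "greater than or equal to" lower bound. A plain-Python sketch of that check (the helper name and values are illustrative, not AIPerf API):

```python
# Minimal stand-in for a Pydantic Field(ge=...) lower-bound constraint.
def validate_ge(name: str, value: float, minimum: float) -> float:
    if value < minimum:
        raise ValueError(f"{name} must be >= {minimum}, got {value}")
    return value

# Defaults from the field list above all satisfy their bounds.
speedup_ratio = validate_ge("speedup_ratio", 1.0, 0.0)
prefix_root_multiplier = validate_ge("prefix_root_multiplier", 1, 1)
print(speedup_ratio, prefix_root_multiplier)  # 1.0 1
```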
`314-314`: Clarify the definition of "unique_prefixes". The description "Number of unique prefix patterns (all prefix subsequences)" is slightly ambiguous. Consider clarifying whether this counts all unique prefix subsequences across all requests or something else.
📖 Suggested clarification
For example, if it means all unique prefix subsequences:
````diff
-- `unique_prefixes: int` - Number of unique prefix patterns (all prefix subsequences)
+- `unique_prefixes: int` - Number of unique prefix patterns (counts all unique prefix subsequences across requests)
````

Or if it means something else, adjust accordingly.
Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!
9ea0874 to
6864d56
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
🤖 Fix all issues with AI agents
In `@fern/pages/tutorials/embeddings.md`:
- Around line 81-87: The JSONL example uses the wrong field name "texts" which
will fail the OpenAI embeddings API; update the inputs.jsonl generation in the
here-doc to use the "input" field (either a single string per line like
{"input":"What is artificial intelligence?"} or an array form
{"input":["...","..."]} if batching), replacing all occurrences of {"texts":
[...] } with {"input": ... } so the examples match the verification curl and the
OpenAI embeddings schema.
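A sketch of what the corrected inputs.jsonl lines would look like (the contents are illustrative; the field rename from "texts" to "input" is the only point being shown):

```python
import json

# Each JSONL line uses "input" (a string, or a list of strings for the
# batched form), matching the OpenAI embeddings schema, instead of the
# unsupported "texts" field.
records = [
    {"input": "What is artificial intelligence?"},
    {"input": ["Batched sentence one.", "Batched sentence two."]},
]
jsonl = "\n".join(json.dumps(r) for r in records)
print(jsonl)

# Every line round-trips and carries the corrected field name.
parsed = [json.loads(line) for line in jsonl.splitlines()]
print(all("input" in r and "texts" not in r for r in parsed))  # True
```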
In `@fern/pages/tutorials/sharegpt.md`:
- Around line 19-21: The Docker commands currently use the floating "latest"
tag; replace both occurrences of vllm/vllm-openai:latest with a specific stable
tag (for example vllm/vllm-openai:v0.14.1) and, if you need a CUDA-specific
build, choose the appropriate CUDA variant (e.g., v0.14.1-cu130) so the docker
pull and docker run lines pin the image version for reproducible runs.
🟡 Minor comments (36)
fern/pages/tutorials/prefix-synthesis.md-342-372 (1)
`342-372`: _⚠️ Potential issue_ | _🟡 Minor_

**Guidance vs examples conflict on “extreme” multiplier values.**
The tips recommend avoiding extreme multipliers (typically 0.5–3.0), but earlier examples use
`--synthesis-prefix-root-multiplier 5` and `10`. Please clarify that the 0.5–3.0 guidance applies only to certain multipliers (e.g., prefix length/prompt length), or update the examples/guidance to be consistent.
`49-49`: _⚠️ Potential issue_ | _🟡 Minor_

**Minor wording: add space in “4 fps”.**
“4fps” reads as a typo; use “4 fps”.
fern/pages/tutorials/synthetic-video.md-52-56 (1)
`52-56`: _⚠️ Potential issue_ | _🟡 Minor_

**Clarify that video is supported via AIPerf’s custom chat endpoint, not the vanilla OpenAI API.**
Since the example uses
/v1/chat/completions, add a short note that video inputs are supported by AIPerf’s custom ChatEndpoint extensions (and are not standard OpenAI API behavior) to avoid user confusion.
Based on learnings: “ChatEndpoint … supports video inputs (supports_videos=True) through custom extensions, even though the standard OpenAI /v1/chat/completions API does not natively support raw video inputs.”Also applies to: 356-357
fern/pages/tutorials/huggingface-tgi.md-55-75 (1)
`55-75`: _⚠️ Potential issue_ | _🟡 Minor_

**Add language tags to fenced output blocks.**
markdownlint MD040 flags these blocks; use
`text` for console output.

✅ Suggested fix
````diff
-```
+```text
 INFO Starting AIPerf System
 INFO Using Hugging Face TGI /generate endpoint (non-streaming)
 INFO AIPerf System is PROFILING
@@
 JSON Export: artifacts/TinyLlama_TinyLlama-1.1B-Chat-v1.0-generate-concurrency1/profile_export_aiperf.json
-```
+```
@@
-```
+```text
 INFO Starting AIPerf System
 INFO Using Hugging Face TGI /generate_stream endpoint (streaming)
 INFO AIPerf System is PROFILING
@@
 JSON Export: artifacts/TinyLlama_TinyLlama-1.1B-Chat-v1.0-generate-concurrency1/profile_export_aiperf.json
-```
+```
````

Also applies to: 117-139
fern/pages/tutorials/sequence-distributions.md-45-66 (1)
`45-66`: _⚠️ Potential issue_ | _🟡 Minor_

**Add language identifiers to fenced code blocks (MD040).**
The fenced blocks under “Semicolon Format” and “Bracket Format” are missing languages. Please tag them (e.g.,
`text`) to satisfy the linter.

Suggested fix
````diff
-```
+```text
 "ISL1,OSL1:PROB1;ISL2,OSL2:PROB2;..."
@@
-```
+```text
 "ISL1|STDDEV1,OSL1|STDDEV1:PROB1;ISL2|STDDEV2,OSL2|STDDEV2:PROB2"
@@
-```
+```text
 "[(ISL1,OSL1):PROB1,(ISL2,OSL2):PROB2]"
@@
-```
+```text
 "[(256|10,128|5):60,(512|20,256|15):40]"
````

fern/pages/tutorials/sequence-distributions.md-103-127 (1)
`103-127`: _⚠️ Potential issue_ | _🟡 Minor_

**Add a language tag to the sample output fence (MD040).**
The sample output block should be tagged (e.g.,
`text` or `console`) to satisfy markdownlint.

Suggested fix
-``` +```text INFO Starting AIPerf System ... JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-concurrency1/profile_export_aiperf.json</details> </blockquote></details> <details> <summary>fern/pages/tutorials/sglang-video-generation.md-104-107 (1)</summary><blockquote> `104-107`: _⚠️ Potential issue_ | _🟡 Minor_ **Add language identifiers to fenced code blocks.** markdownlint flags these code fences as missing a language. Add `text` (or a more specific language) to satisfy MD040. <details> <summary>Proposed fix</summary> ```diff -``` +```text Uvicorn running on http://0.0.0.0:30010 (Press CTRL+C to quit)```diff -``` +```text NVIDIA AIPerf | Video Generation Metrics ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━┓ ┃ Metric ┃ avg ┃ min ┃ max ┃ p99 ┃ p90 ┃ p50 ┃ std ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━┩ │ Request Latency (ms) │ 45,234.56 │ 42,123.45 │ 48,567.89 │ 48,432.12 │ 47,654.32 │ 45,012.34 │ 2634.78 │ │ Input Sequence Length (tokens) │ 8.33 │ 7.00 │ 10.00 │ 9.98 │ 9.80 │ 8.00 │ 1.25 │ │ Request Throughput (requests/sec) │ 0.02 │ - │ - │ - │ - │ - │ - │ │ Request Count (requests) │ 3.00 │ - │ - │ - │ - │ - │ - │ └───────────────────────────────────┴───────────┴───────────┴───────────┴───────────┴───────────┴───────────┴─────────┘```diff -``` +```text Downloading: http://localhost:30010/v1/videos/video_abc123/content Saved: /path/to/downloaded_videos/video_abc123.mp4 Downloading: http://localhost:30010/v1/videos/video_def456/content Saved: /path/to/downloaded_videos/video_def456.mp4 Downloading: http://localhost:30010/v1/videos/video_ghi789/content Saved: /path/to/downloaded_videos/video_ghi789.mp4 Videos saved to: /path/to/downloaded_videos</details> Also applies to: 162-173, 347-357 </blockquote></details> <details> <summary>fern/pages/tutorials/custom-prompt-benchmarking.md-69-69 (1)</summary><blockquote> 
`69-69`: _⚠️ Potential issue_ | _🟡 Minor_ **Add a language to the fenced block (MD040).** The sample output fence lacks a language tag. Use `text` to satisfy the linter. <details> <summary>🔧 Suggested change</summary> ```diff -``` +```textfern/pages/tutorials/custom-prompt-benchmarking.md-105-105 (1)
`105-105`: _⚠️ Potential issue_ | _🟡 Minor_

**Promote “Use Cases” to h2 to satisfy heading increment (MD001).**
After an h2 section, the next top-level section should also be h2.
🔧 Suggested change
````diff
-### Use Cases
+## Use Cases
````

fern/pages/tutorials/sharegpt.md-51-74 (1)
`51-74`: _⚠️ Potential issue_ | _🟡 Minor_

**Add a language to the sample output code fence.**
MD040 expects a language; use
`text` for terminal output.

✅ Proposed change
-``` +```textfern/pages/tutorials/rankings.md-53-71 (1)
`53-71`: _⚠️ Potential issue_ | _🟡 Minor_

**Add language specifier to fenced code block.**
The static analysis tool flagged this code block for missing a language specification. For sample output blocks, consider adding
`text` or `console` as the language identifier for better rendering and linter compliance.

📝 Proposed fix
````diff
 **Sample Output (Successful Run):**
-```
+```text
 INFO Starting AIPerf System
````
45-45:⚠️ Potential issue | 🟡 MinorAdd protocol prefix to
--urlparameter.Line 45 uses
--url localhost:8000without a protocol, while the verificationcurlcommand on line 28 useshttp://localhost:8000. AIPerf documentation recommends including the full URL with protocol scheme. Update line 45 to--url http://localhost:8000for consistency and adherence to best practices.fern/pages/tutorials/fixed-schedule.md-138-163 (1)
138-163:⚠️ Potential issue | 🟡 MinorSample output entry count is inconsistent with the schedule data.
The schedule file (lines 60-71) contains entries at timestamps: 0, 500, 750, 1000, 1250, 2000, 2500, 3000, 4000, 5000. With
--fixed-schedule-start-offset 2000and--fixed-schedule-end-offset 4000, timestamps 2000, 2500, 3000, and 4000 fall within the window (3-4 entries depending on boundary inclusivity), but the sample output shows "Filtered to 2 entries" andProfiling: 2/2.Consider updating the sample output to reflect the correct filtered count, or adjust the offset values to match the intended 2-entry result.
fern/pages/tutorial.md-2-2 (1)
`2-2`: _⚠️ Potential issue_ | _🟡 Minor_

**Copyright year range is inconsistent with other new files in this PR.**
Other new files use `2025-2026` or `2026`, but this file has `2024-2025`.
2025-2026or2026, but this file has2024-2025.-# SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.docs/tutorials/sglang-image-generation.md-237-249 (1)
237-249:⚠️ Potential issue | 🟡 MinorAdd alt text to image references for accessibility.
All three image references lack alt text, which is important for accessibility and also helps when images fail to load. The static analysis tool (MD045) flagged this as well.
Proposed fix
- +- +- +fern/pages/benchmark-modes/trace-replay.md-47-52 (1)
47-52:⚠️ Potential issue | 🟡 MinorContradictory labeling:
hash_idslisted under "Required fields" but marked "(optional)".The heading says "Required fields for trace replay" but
hash_idson line 51 is annotated as(optional). Either move it out of the "Required fields" list or change the heading to "Fields for trace replay" / "Supported fields".Suggested fix
````diff
-Required fields for trace replay:
+Supported fields for trace replay:
 - `timestamp`: Request arrival time in milliseconds
 - `input_length`: Number of input tokens
 - `output_length`: Number of output tokens
-- `hash_ids`: List of block hashes (optional)
+- `hash_ids`: List of block hashes _(optional)_
````
237-239:⚠️ Potential issue | 🟡 MinorHyphenate the heading for readability. Use “High-Throughput Cache Test”.
fern/pages/tutorials/multi-url-load-balancing.md-1-4 (1)
`1-4`: _⚠️ Potential issue_ | _🟡 Minor_

**Replace JSX-style comment with standard markdown frontmatter or HTML comment.** In plain
`.md`, `{/* ... */}` renders as text.

Proposed fix
````diff
-{/* # SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-# SPDX-License-Identifier: Apache-2.0 */}
+---
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+---
````

fern/pages/cli-options.md-431-434 (1)
431-434:⚠️ Potential issue | 🟡 MinorFix missing space in the prefix prompt note.
Proposed fix
````diff
-Note that due to the prefix and user prompts being concatenated, the number of tokens in the final prompt may be off by one.Mutually exclusive with `--shared-system-prompt-length`/`--user-context-prompt-length`.
+Note that due to the prefix and user prompts being concatenated, the number of tokens in the final prompt may be off by one. Mutually exclusive with `--shared-system-prompt-length`/`--user-context-prompt-length`.
````

fern/pages/tutorials/local-tokenizer.md-28-92 (1)
`28-92`: _⚠️ Potential issue_ | _🟡 Minor_

**Add language tags to non-bash fences.** This fixes MD040 and improves rendering/linters.
Proposed fix
````diff
-```
+```text
 /path/to/your/local/tokenizer/
 ├── tokenizer.json
 ├── tokenizer_config.json
 ├── vocab.txt (or vocab.json)
 └── config.json
@@
-```
+```text
 INFO Starting AIPerf System
 INFO Loading local tokenizer from: /home/user/tokenizers/llama-2-7b
 INFO Tokenizer loaded successfully (offline mode)
 INFO AIPerf System is PROFILING
@@
 JSON Export: artifacts/llama-2-7b-chat-concurrency4/profile_export_aiperf.json
````
26-149:⚠️ Potential issue | 🟡 MinorAdd language tags to non-bash fences. Keeps markdownlint clean and clarifies intent.
Proposed fix
-``` +```text │ BENCHMARK DURATION │ GRACE PERIOD │ @@ ▲ ▲ Duration expires Grace period ends@@
-+text
INFO Starting AIPerf System
@@
JSON Export: artifacts/Qwen_Qwen2.5-7B-Instruct-chat-concurrency50/profile_export_aiperf.json@@ -``` +```text INFO Starting AIPerf System @@ JSON Export: artifacts/Qwen_Qwen2.5-7B-Instruct-chat-concurrency20/profile_export_aiperf.json</details> </blockquote></details> <details> <summary>fern/pages/tutorials/ramping.md-14-231 (1)</summary><blockquote> `14-231`: _⚠️ Potential issue_ | _🟡 Minor_ **Add language tags to non-bash fences.** This covers the ASCII diagrams and sample outputs. <details> <summary>Proposed fix</summary> ```diff -``` +```text Without ramping: With ramping: @@ 0 ┼──────────────────────▶ 0 ┼●─────────────────────▶ 0 Time 0 30s Time@@
-+text
INFO Starting AIPerf System
@@
JSON Export: artifacts/your-model-chat-concurrency100/profile_export_aiperf.json@@ -``` +```text Concurrency 100 ┤ ●━━━━━━━━━━━ @@ 7.5s 15s 22.5s 30s Time@@
-+text
INFO Starting AIPerf System
@@
JSON Export: artifacts/your-model-chat-rate100/profile_export_aiperf.json@@ -``` +```text Request Rate (QPS) 100 ┤ ●━━━━━━━━━━━ @@ 15s 30s 45s 60s Time@@
-+text
INFO Starting AIPerf System
@@
JSON Export: artifacts/your-model-chat-concurrency100-rate200/profile_export_aiperf.json@@ -``` +```text INFO Starting AIPerf System @@ JSON Export: artifacts/your-model-chat-concurrency100/profile_export_aiperf.json</details> </blockquote></details> <details> <summary>fern/pages/tutorials/user-centric-timing.md-91-333 (1)</summary><blockquote> `91-333`: _⚠️ Potential issue_ | _🟡 Minor_ **Add language tags to non-bash code fences.** This resolves MD040 and makes render intent explicit. <details> <summary>Proposed fix</summary> ```diff -``` +```text turn_gap = num_users / user_centric_rate@@
-+text
Evaluate: Benchmark Execution Timeline (t=0 to t=30s)@@
RESULT:
Immediate mix of fresh sessions (User 16) and deep sessions (User 14),
with users finishing and churning naturally from t=6s onwards.@@ -``` +```text ┌─────────────────────────────────────────────────────────────┐ @@ └─────────────────────────────────────────────────────────────┘@@
-+text
INFO Starting AIPerf System
@@
JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-rate1.0/profile_export_aiperf.json@@ -``` +```text INFO Starting AIPerf System @@ JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-rate4.0/profile_export_aiperf.json@@
-+text
INFO Starting AIPerf System
@@
JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-rate0.5/profile_export_aiperf.jsonfern/pages/plugins/plugin-system.md-42-52 (1)
42-52:⚠️ Potential issue | 🟡 MinorAdd a language to the hierarchy diagram fence (MD040).
Proposed fix
-``` +```text Registry (singleton) └── Package (1+) ─── discovered via entry points └── Manifest (1+ per package) ─── plugins.yaml files └── Category (1+) └── Entry (1+) ─── PluginEntry ├── Class ─── lazy-loaded Python class └── Metadata ─── optional typed config</details> </blockquote></details> <details> <summary>fern/pages/plugins/plugin-system.md-385-412 (1)</summary><blockquote> `385-412`: _⚠️ Potential issue_ | _🟡 Minor_ **Add `text` language to the error snippet fences (MD040).** <details> <summary>Proposed fix</summary> ```diff -``` +```text TypeNotFoundError: Type 'my_plugin' not found for category 'endpoint'.@@
-+text
ImportError: Failed to import module for endpoint:my_plugin@@ -``` +```text AttributeError: Class 'MyClass' not found</details> </blockquote></details> <details> <summary>fern/pages/benchmark-modes/timing-modes-reference.md-248-266 (1)</summary><blockquote> `248-266`: _⚠️ Potential issue_ | _🟡 Minor_ **Add a language to the fenced block (MD040).** The ASCII diagram should declare a language like `text` to satisfy markdownlint and improve renderer consistency. <details> <summary>Proposed fix</summary> ```diff -``` +```text ┌─────────────────────────────────────────────────────────────────┐ │ Which options should I use? │ ├─────────────────────────────────────────────────────────────────┤ @@ └─────────────────────────────────────────────────────────────────┘</details> </blockquote></details> <details> <summary>fern/pages/tutorials/multi-turn.md-107-129 (1)</summary><blockquote> `107-129`: _⚠️ Potential issue_ | _🟡 Minor_ **Add a language to the sample output fence (MD040).** <details> <summary>Proposed fix</summary> ```diff -``` +```text INFO Starting AIPerf System INFO Multi-turn mode: 10 conversations, 3 turns each (30 total requests) @@ JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-concurrency2/profile_export_aiperf.json</details> </blockquote></details> <details> <summary>fern/pages/tutorials/http-trace-metrics.md-34-110 (1)</summary><blockquote> `34-110`: _⚠️ Potential issue_ | _🟡 Minor_ **Add language identifiers to fenced blocks (MD040).** <details> <summary>Proposed fix</summary> ```diff -``` +```text Request Lifecycle ──────────────────────────────────────────────────────────────────────────────► @@@@
-+text
http_req_duration = response_receive_end_perf_ns - request_send_start_perf_ns@@ -``` +```text http_req_connection_overhead = http_req_blocked + http_req_dns_lookup + http_req_connecting@@
-+text
http_req_total = http_req_blocked + http_req_dns_lookup + http_req_connecting
+ http_req_sending + http_req_waiting + http_req_receivingfern/pages/benchmark-modes/timing-modes-reference.md-83-85 (1)
`83-85`: _⚠️ Potential issue_ | _🟡 Minor_

**Remove the blank line inside the blockquote to satisfy MD028.**
markdownlint flags the empty line inside the blockquote; it breaks the blockquote formatting in some renderers.
Proposed fix
````diff
-> **Important**: If `--concurrency` is not set, session concurrency limiting is **disabled** (unlimited). For `--user-centric-rate` mode, consider setting `--concurrency` to at least `--num-users` to ensure all users can have in-flight requests.
-
-> **See also**: [Prefill Concurrency Tutorial](../tutorials/prefill-concurrency.md) for detailed guidance on memory-safe long-context benchmarking.
+> **Important**: If `--concurrency` is not set, session concurrency limiting is **disabled** (unlimited). For `--user-centric-rate` mode, consider setting `--concurrency` to at least `--num-users` to ensure all users can have in-flight requests.
+> **See also**: [Prefill Concurrency Tutorial](../tutorials/prefill-concurrency.md) for detailed guidance on memory-safe long-context benchmarking.
````

.cursor/skills/docs-to-fern/SKILL.md-316-323 (1)
`316-323`: _⚠️ Potential issue_ | _🟡 Minor_

**SPDX bulk-add script won't detect JSX-style SPDX headers.**
The check `head -1 "$f" | grep -q '^---'` only detects YAML frontmatter. Files that already have JSX comment SPDX headers
head -1 "$f" | grep -q '^---'only detects YAML frontmatter. Files that already have JSX comment SPDX headers ({/* SPDX... */}) — like several files in this very PR — would get a duplicate SPDX block prepended.fern/pages/diagrams/metrics-flow.md-48-48 (1)
`48-48`: _⚠️ Potential issue_ | _🟡 Minor_

**Stage numbering jumps from 2 to 4.**
Line 34 labels "Stage 2" and line 48 labels "Stage 4", with no Stage 3. This is likely a numbering error — should this be "Stage 3"?
Proposed fix
````diff
-    %% Stage 4: Summarize Function Processing
+    %% Stage 3: Summarize Function Processing
````

fern/pages/diagrams/metrics-flow.md-82-83 (1)
`82-83`: _⚠️ Potential issue_ | _🟡 Minor_

**Style classes reference undefined nodes `I1` and `F`.**
`class I1,G statistics` references `I1`, which is never defined (only `I2` exists). Similarly, `class E1,E2,E3,F,L transport` references `F`, which has no corresponding node. These appear to be stale references from a prior revision. Mermaid will silently ignore them, but they should be cleaned up to avoid confusion.

Proposed fix
````diff
-    class I1,G statistics
+    class G statistics
-    class E1,E2,E3,F,L transport
+    class E1,E2,E3,L transport
````

fern/pages/server-metrics/server-metrics-reference.md-158-158 (1)
`158-158`: _⚠️ Potential issue_ | _🟡 Minor_

**Broken internal link: the `#histogram-buckets` anchor does not exist.**
The link
`[Histogram Buckets](#histogram-buckets)` on line 158 does not resolve to any heading in this document. There is no `## Histogram Buckets` or similar heading. Consider either adding a dedicated "Histogram Buckets" section or updating this link to point to an existing section (e.g., the per-backend histogram bucket tables).
4-4:⚠️ Potential issue | 🟡 MinorMultiple
#(h1) headings will conflict with Fern's auto-generated title.Fern auto-generates an h1 from the navigation title in
next.yml, so page content should start at h2 (##). This file has four h1 headings (lines 4, 67, 202, 353) which will produce duplicate/conflicting h1 elements and may break the rendered table of contents.Downgrade these to h2:
Proposed fix
````diff
-# GPU Telemetry with AIPerf
+## GPU Telemetry with AIPerf
-# 1: Using Dynamo
+## 1: Using Dynamo
-# 2: Using Other Inference Server
+## 2: Using Other Inference Server
-# 3: Using pynvml (Local GPU Monitoring)
+## 3: Using pynvml (Local GPU Monitoring)
````

(And cascade all sub-headings down one level accordingly.)
Also applies to: 67-67, 202-202, 353-353
fern/pages/server-metrics/server-metrics-reference.md-166-168 (1)
166-168:⚠️ Potential issue | 🟡 Minor
`## Dynamo Frontend` is at the same heading level as its parent `## Detailed Metric Definitions`.
##(h2), making them siblings rather than children. They should be###(h3) to match the TOC hierarchy.This applies to all backend section headings: lines 168, 217, 273, 352, 452, and 478.
Proposed fix for each backend heading
````diff
-## Dynamo Frontend
+### Dynamo Frontend
-## Dynamo Component
+### Dynamo Component
-## vLLM
+### vLLM
-## SGLang
+### SGLang
-## TensorRT-LLM
+### TensorRT-LLM
-## KVBM (KV Block Manager)
+### KVBM (KV Block Manager)
````

(And cascade all sub-headings under these down one level:
`###` → `####`, etc.)

Based on learnings: the maintainer prefers h4 headings (####) for subsections under h2 headings for better visual sizing. If the backend sections become h3, their subsections (currently h3) would naturally become h4, which aligns with this preference.
.cursor/skills/docs-to-fern/SKILL_md-114-118 (1)
`114-118`: _⚠️ Potential issue_ | _🟡 Minor_

**Update Fern CLI version from `3.29.1` to `3.55.7`.**
3.55.7(released January 31, 2026). Update the version to avoid missing recent bug fixes, performance improvements, and security patches.
🧹 Nitpick comments (24)
fern/pages/tutorials/custom-prompt-benchmarking.md (1)
`41-41`: Replace JSX-style markers with HTML comments to fix MD037. The
`{/* ... */}` markers are parsed as emphasis with spaces, triggering MD037. Use HTML comments instead.

🔧 Suggested change
````diff
-{/* aiperf-run-vllm-default-openai-endpoint-server */}
+<!-- aiperf-run-vllm-default-openai-endpoint-server -->
 ...
-{/* /aiperf-run-vllm-default-openai-endpoint-server */}
+<!-- /aiperf-run-vllm-default-openai-endpoint-server -->
````

Also applies to: 66-66
fern/pages/reproducibility.md (2)
16-20: Remove blank line between blockquote declarations. Line 18 contains a blank line between two consecutive blockquote sections. Per Markdown best practices, consecutive blockquotes should not have blank lines between them if they're meant to be separate alert boxes.
📝 Proposed fix
```diff
 > [!IMPORTANT]
 > **Default behavior:** Without `--random-seed`, AIPerf produces **non-deterministic** results. Set `--random-seed <integer>` for reproducibility.
-
 > [!WARNING]
 > **Distributed System Constraints:** Even with `--random-seed`, **performance metrics and worker assignment are NOT reproducible** due to system non-determinism (network timing, async I/O, ZMQ load balancing).
```
60-60: Hyphenate compound adjective. The phrase "credit issuing strategy" should use a hyphen: "credit-issuing strategy" when the compound modifier precedes the noun.
📝 Proposed fix
```diff
-- TimingManager creates credit issuing strategy with RNG-based interval generator
+- TimingManager creates credit-issuing strategy with RNG-based interval generator
```

fern/pages/tutorials/goodput.md (1)
27-32: Pin the Docker image tag for reproducible setup steps. Using `:latest` makes the tutorial non-deterministic; it can silently break when the image changes. The vLLM documentation recommends pinning to a specific version tag.
✅ Suggested doc update
```diff
-docker pull vllm/vllm-openai:latest
-docker run --gpus all -p 8000:8000 vllm/vllm-openai:latest \
+docker pull vllm/vllm-openai:v0.11.2
+docker run --gpus all -p 8000:8000 vllm/vllm-openai:v0.11.2 \
```

fern/pages/tutorials/sharegpt.md (1)
37-48: Replace JSX-style comments to satisfy markdownlint.
`{/* ... */}` triggers MD037 in Markdown; use HTML comments instead.
💡 Proposed change
```diff
-{/* aiperf-run-vllm-default-openai-endpoint-server */}
+<!-- aiperf-run-vllm-default-openai-endpoint-server -->
@@
-{/* /aiperf-run-vllm-default-openai-endpoint-server */}
+<!-- /aiperf-run-vllm-default-openai-endpoint-server -->
```

fern/pages/tutorials/warmup.md (1)
14-26: Specify language identifiers for fenced code blocks. Multiple fenced code blocks lack language specifications, which can affect rendering and accessibility:

- Lines 14-26, 254-264: ASCII art diagrams (consider `text`)
- Lines 54-70, 122-138, 157-173, 194-210, 233-250: Sample output/logs (consider `console`, `text`, or `plaintext`)
📝 Example fixes for ASCII art and sample output
For ASCII art (lines 14-26):
-``` +```text Without warmup: With warmup:For sample output (lines 54-70):
-``` +```console INFO Starting AIPerf SystemAlso applies to: 54-70, 122-138, 157-173, 194-210, 233-250, 254-264
fern/pages/tutorials/embeddings.md (2)
37-37: Unusual comment syntax in Markdown. The JSX-style comments `{/* ... */}` wrapping code blocks are atypical for Markdown. If these are used by Fern's documentation system for code snippet extraction or tagging, this is fine. Otherwise, consider using standard Markdown comment syntax.
103-115: Consider adding metrics table to sample output. The sample output for custom inputs doesn't include a metrics table (unlike the synthetic inputs example on lines 51-71). Consider adding a sample metrics table here for consistency and to show users what results to expect.
fern/pages/tutorials/plot.md (1)
29-95: Consider adding language identifiers to sample output blocks. The sample output blocks at lines 30-40, 46-56, 71-80, and 86-94 are missing language identifiers. Adding `text` or `console` would improve syntax highlighting and accessibility.
````diff
 **Sample Output (Successful Run):**
-```
+```text
 INFO Loading single-run data from: artifacts/Qwen_Qwen3-0.6B-chat-concurrency10/
```
````

fern/pages/tutorials/custom-dataset.md (1)
90-95: Add language identifier to sample output block. The sample output block is missing a language identifier. Adding `text` or `console` improves rendering.
Example fix
````diff
 **Output:**
-```
+```text
 NVIDIA AIPerf | LLM Metrics
```
````

fern/pages/reference/tokenizer-auto-detection.md (1)
31-82: Add language identifiers to output example blocks. The output example blocks at lines 34-37, 40-60, and 63-82 are missing language identifiers. Adding `text` or `console` improves syntax highlighting consistency.
Example fix
````diff
 **Successful resolution:**
-```
+```text
 INFO ✓ Tokenizer Qwen/Qwen3-0.6B detected for qwen3-0.6b
```
````

fern/pages/tutorials/openai-text-endpoints.md (1)
31-68: Inconsistent code block formatting. Lines 35 and 67 use indented code block style, while the rest of the document uses fenced code blocks. The sample output blocks at lines 48-68 are also missing language identifiers.
Standardization suggestion
Ensure all code blocks use fenced style with language identifiers:
- Use `` ```bash `` for shell commands
- Use `` ```text `` for sample output

fern/pages/tutorials/timeslices.md (1)
62-76: Add language identifier to sample output block. The sample output block at lines 63-76 is missing a language identifier. Adding `text` improves consistency with other documentation.
Example fix
````diff
 **Sample Output (Successful Run):**
-```
+```text
 INFO Starting AIPerf System
```
````

fern/pages/comprehensive-llm-benchmarking.md (1)
83-120: Consider using `####` for subsections under `##` to match doc style. Multiple `###` subsections under `## Use Case 1` appear; the repo preference is to use h4 for these nested headings.
Based on learnings: In the aiperf repository's docs/metrics_reference.md file, the maintainer prefers using h4 headings (`####`) for subsections under h2 headings instead of h3 (`###`) for better visual sizing and readability, even though this violates markdownlint rule MD001.
docs/server-metrics/server-metrics-parquet-schema.md (1)
312-312: Vary repeated “For …” sentence openings for readability. Consider rephrasing to avoid three consecutive sentences starting with “For”.
✏️ Possible rewording
```diff
-*For aggregated statistics, see [JSON Schema](server-metrics-json-schema.md). For metric definitions, see [Server Metrics Reference](server-metrics-reference.md). For usage examples, see the [Server Metrics Tutorial](server-metrics.md).*
+*See the [JSON Schema](server-metrics-json-schema.md) for aggregated statistics, the [Server Metrics Reference](server-metrics-reference.md) for metric definitions, and the [Server Metrics Tutorial](server-metrics.md) for usage examples.*
```

docs/server-metrics/server-metrics-reference.md (1)
549-549: Vary repeated “For …” sentence openings for readability. Consider rephrasing to avoid three consecutive sentences starting with “For”.
✏️ Possible rewording
```diff
-*For detailed implementation and usage examples, see the [Server Metrics Tutorial](server-metrics.md). For aggregated statistics, see the [JSON Schema Reference](server-metrics-json-schema.md). For raw time-series analysis, see the [Parquet Schema Reference](server-metrics-parquet-schema.md).*
+*See the [Server Metrics Tutorial](server-metrics.md) for detailed implementation and usage examples, the [JSON Schema Reference](server-metrics-json-schema.md) for aggregated statistics, and the [Parquet Schema Reference](server-metrics-parquet-schema.md) for raw time-series analysis.*
```

fern/pages/server-metrics/server-metrics-parquet-schema.md (3)
82-87: Add a language specifier to the fenced code blocks. These ASCII-table examples lack a language tag, which triggers MD040 warnings. Using an explicit `text` identifier on the opening fence will satisfy the linter and may also render more predictably in Fern's MDX parser.
Line 82:
-``` +```textLine 91:
-``` +```textAlso applies to: 91-98
269-278: Redundant guard on Line 273. The `if i > 0` check on line 273 is always true because line 271-272 already returns when `i == 0`. The ternary fallback to `0` on line 274 is also unreachable. Not a bug, but slightly confusing for readers.
Simplification
```diff
 if count >= target:
     if i == 0:
         return le
-    prev_le = bounds[i-1] if i > 0 else 0
-    prev_count = counts[i-1] if i > 0 else 0
+    prev_le = bounds[i-1]
+    prev_count = counts[i-1]
     # Linear interpolation within bucket
     fraction = (target - prev_count) / (count - prev_count) if count > prev_count else 0
     return prev_le + fraction * (le - prev_le)
```
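To sanity-check the simplified interpolation logic, here is a self-contained sketch of the same technique (the function name, signature, and bucket data are illustrative, not the file's actual API):

```python
def histogram_percentile(bounds, counts, percentile):
    """Estimate a percentile from cumulative histogram buckets.

    bounds: upper edges ("le") of each bucket, ascending.
    counts: cumulative observation counts per bucket (Prometheus-style).
    """
    total = counts[-1]
    target = percentile / 100.0 * total
    for i, (le, count) in enumerate(zip(bounds, counts)):
        if count >= target:
            if i == 0:
                return le
            prev_le = bounds[i - 1]
            prev_count = counts[i - 1]
            # Linear interpolation within the bucket
            fraction = (target - prev_count) / (count - prev_count) if count > prev_count else 0
            return prev_le + fraction * (le - prev_le)
    return bounds[-1]
```

With buckets `[1.0, 2.0, 4.0]` and cumulative counts `[10, 30, 40]`, the p50 target of 20 observations falls halfway through the second bucket, giving an estimate of 1.5.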
1-2: SPDX copyright year says 2025, but other files in this PR use `2025-2026`. The SPDX header here uses `Copyright (c) 2025` while other new files (e.g., `metrics-flow.md`, the SKILL file) use `2025-2026`. Consider aligning the year range for consistency across the new docs.

.cursor/skills/docs-to-fern/SKILL.md (1)
605-627: Link auditor and cross-repo detector scripts use GNU-specific flags.
`grep -oP` (PCRE mode) and `realpath -m` (line 866) are GNU-specific and won't work on macOS with default BSD tools. Since this is a skill doc for internal use, this may be fine — but a brief note about requiring GNU coreutils/grep would help avoid confusion.

fern/pages/tutorials/gpu-telemetry.md (2)
160-193: Add language specifiers to sample output code blocks. The console/table output blocks at lines 160 and 500 lack a language tag (MD040). Use `text` or `console` for these display-only blocks.
Also applies to: 500-521
542-755: Consider truncating the JSON export example. This 213-line JSON blob is quite long for a tutorial page. Consider showing only a representative subset (e.g., 2-3 metrics) with a comment like `// ... additional metrics omitted ...` and a note that the full output is saved to the export file.

fern/pages/diagrams/metrics-flow.md (1)
1-4: SPDX copyright year `2025` — consider updating to `2025-2026` for consistency. Same note as `server-metrics-parquet-schema.md`: other new files in this PR use the `2025-2026` range.

fern/pages/server-metrics/server-metrics-reference.md (1)
71-77: Add language specifiers to fenced code blocks. The latency formula block (line 71) and the histogram bucket list (line 396) lack language tags. Use `text` for these display-only blocks to satisfy MD040.
Proposed fix
Line 71:

````diff
-```
+```text
````

Line 396:

````diff
-```
+```text
````

Also applies to: 396-398
Force-pushed from 6864d56 to f159bd8
Actionable comments posted: 3
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
🤖 Fix all issues with AI agents
In `@fern/pages/genai-perf-feature-comparison.md`:
- Line 137: The table cell for "Arrival Smoothness" uses a backslash-escaped
less-than (`\<1`) which can break Fern's MDX rendering; update the cell content
so the Gamma distribution shape reads `&lt;1=bursty, 1=Poisson, &gt;1=smooth`,
using HTML entities instead of backslash escapes for the `<` and `>` characters
(affecting the line with `--arrival-smoothness` / `--vllm-burstiness` in
genai-perf-feature-comparison.md); apply the same replacement pattern across
other docs referencing CLI options (e.g., cli-options.md) and server-metrics
files to ensure consistent MDX-safe rendering.
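The entity substitution described in the prompt above is a plain string replacement. A hedged sketch of what the fix amounts to (the function name is hypothetical; only the two escape sequences named in the prompt are handled):

```python
def escape_mdx_angles(cell: str) -> str:
    """Replace backslash-escaped angle brackets with MDX-safe HTML entities."""
    return cell.replace(r"\<", "&lt;").replace(r"\>", "&gt;")
```

Applied to the "Arrival Smoothness" cell, this turns `\<1=bursty, 1=Poisson, \>1=smooth` into `&lt;1=bursty, 1=Poisson, &gt;1=smooth`.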
In `@fern/pages/tutorials/arrival-patterns.md`:
- Around line 155-225: The sample output blocks after the initial bash command
blocks are not closed or language-marked, causing Markdown lint errors
(MD040/MD046); locate the example sections around the aiperf profile command
lines (the Run 1 and Run 2 blocks) and close the opening ```bash fence
immediately after the command invocation, then open a separate fenced block with
language "text" for each "**Expected Output (Run X):**" multi-line output;
specifically ensure you insert a closing ``` after the aiperf profile lines and
replace the following -``` / ``` fences with ```text ... ``` for both the Run 1
and Run 2 expected output blocks so each output is properly fenced and
language-tagged.
In `@tools/generate_cli_docs.py`:
- Line 50: The mkdocs navigation still points to the old filename; update the
mkdocs.yml nav entry to match the new OUTPUT_FILE name by changing the reference
from cli_options.md to cli-options.md so the docs build serves the generated
file produced by OUTPUT_FILE in generate_cli_docs.py.
🟡 Minor comments (27)
fern/pages/tutorials/timeslices.md-82-116 (1)
82-116: ⚠️ Potential issue | 🟡 Minor
**Make output file paths consistent with the earlier example.** The sample output shows files under `artifacts/<run>/...`, but the “Output Files” section omits the run directory. This is confusing for users trying to locate artifacts. Please align both sections (either include the run directory or state it’s implied).

fern/pages/tutorials/timeslices.md-12-12 (1)
12-12: ⚠️ Potential issue | 🟡 Minor
**Hyphenate “equal-duration” for grammar.** Line 12 reads “equal duration segments”; hyphenation is preferred here.

fern/pages/tutorials/timeslices.md-63-76 (1)
63-76: ⚠️ Potential issue | 🟡 Minor
**Add a language identifier to the sample output fence.** The fenced block under “Sample Output” lacks a language tag, triggering MD040. Consider `text` or `log`.

fern/pages/tutorials/sequence-distributions.md-103-127 (1)
103-127: ⚠️ Potential issue | 🟡 Minor
**Add a language to the sample output fence.** MD040 flags the sample output block for a missing fence language. Use `text` (or `console`) to satisfy linting.
🔧 Suggested fix
````diff
-**Sample Output (Successful Run):**
-```
+**Sample Output (Successful Run):**
+```text
 INFO Starting AIPerf System
 INFO Using sequence distribution: 70% (ISL~N(64,10), OSL~N(32,8)), 20% (ISL~N(256,40), OSL~N(128,20)), 10% (ISL~N(1024,100), OSL~N(512,50))
 INFO AIPerf System is PROFILING
@@
 JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-concurrency1/profile_export_aiperf.json
```
````

fern/pages/reproducibility.md-60-62 (1)
60-62: ⚠️ Potential issue | 🟡 Minor
**Hyphenate “in-memory” as an adjective.** Improves grammar and readability.
📝 Proposed fix

```diff
-- DatasetManager pre-generates complete dataset using derived RNGs and stores in memory
+- DatasetManager pre-generates complete dataset using derived RNGs and stores in-memory
```

fern/pages/tutorials/working-with-profile-exports.md-128-152 (1)
128-152: ⚠️ Potential issue | 🟡 Minor
**Clarify metrics behavior for failed requests.**
The text says metrics are “always null for failed requests” (Line 128-129), but the failed example includes a metrics object with `error_isl` (Line 150-152). Please reconcile this to avoid reader confusion.
✂️ Possible wording update
```diff
-See the [Complete Metrics Reference](../metrics-reference.md) page for a list of all metrics and their descriptions. Will always be null for failed requests.
+See the [Complete Metrics Reference](../metrics-reference.md) page for a list of all metrics and their descriptions. For failed requests, metrics may be null or include error-related metrics depending on the failure mode.
```

fern/pages/tutorials/working-with-profile-exports.md-96-105 (1)
96-105: ⚠️ Potential issue | 🟡 Minor
**Fix duplicate JSON key in example.**
The successful-record JSON example includes `time_to_first_token` twice (Line 98 and Line 101), which makes the sample invalid/ambiguous. Keep the correct field once to avoid copy-paste errors.
✂️ Suggested fix
```diff
 "metrics": {
     "input_sequence_length": {"value": 550, "unit": "tokens"},
-    "time_to_first_token": {"value": 255.88656799999998, "unit": "ms"},
     "request_latency": {"value": 297.52522799999997, "unit": "ms"},
     "output_token_count": {"value": 9, "unit": "tokens"},
     "time_to_first_token": {"value": 4.8984369999999995, "unit": "ms"},
```

fern/pages/diagrams/metrics-flow.md-82-83 (1)
82-83: ⚠️ Potential issue | 🟡 Minor
**Remove undefined node references from class applications.**
Lines 82 and 83 reference nodes `I1` and `F` in class applications, but these nodes are not defined anywhere in the diagram. These are likely leftover from an earlier version or typos.
- class I1,G statistics + class G statistics - class E1,E2,E3,F,L transport + class E1,E2,E3,L transportfern/pages/diagrams/metrics-flow.md-11-48 (1)
11-48: ⚠️ Potential issue | 🟡 Minor
**Fix the stage numbering inconsistency.**
The diagram jumps from "Stage 1: Distributed Record Processing" (line 11) to "Stage 2: Centralized Results Processing" (line 34) to "Stage 4: Summarize Function Processing" (line 48). Stage 3 is missing, which could confuse readers following the pipeline flow.
📝 Proposed fix: renumber Stage 4 to Stage 3

```diff
-    %% Stage 4: Summarize Function Processing
+    %% Stage 3: Summarize Function Processing
     L --> I2["Summarize Function<br/>summarize()<br/><em>(Process all collected results)</em>"]
```

fern/pages/tutorials/goodput.md-2-2 (1)
2-2: ⚠️ Potential issue | 🟡 Minor
**Copyright year range may be outdated.** This file uses `2024-2025` while other new files in this PR use `2025-2026` or `2026`. Consider updating for consistency.

fern/pages/tutorials/fixed-schedule.md-139-163 (1)
139-163: ⚠️ Potential issue | 🟡 Minor
**Sample output entry count is inconsistent with the input data.**
The schedule defined on Lines 60-71 has entries at timestamps 2000, 2500, 3000, and 4000ms. With `--fixed-schedule-start-offset 2000 --fixed-schedule-end-offset 4000`, the filtered set should contain 3–4 entries (depending on boundary inclusivity), not 2. The progress bar (2/2) and throughput also reflect this incorrect count, which could confuse readers trying to reproduce the example.

fern/pages/tutorials/audio.md-85-92 (1)
85-92: ⚠️ Potential issue | 🟡 Minor
**Sample output contains a local username/path that should be anonymized.** The output references `/home/lkomali/aiperf/artifacts/...`, which exposes an internal username. This appears again in Lines 149-156. Consider replacing with a generic path like `/home/user/aiperf/artifacts/...` or using relative paths like `artifacts/...` to match other tutorials in this PR.

docs/tutorials/sglang-image-generation.md-237-249 (1)
237-249: ⚠️ Potential issue | 🟡 Minor
**Add alt text to images for accessibility.**
The static analysis tool correctly flags that these images lack alt text (MD045). Adding descriptive alt text improves accessibility and provides context when images fail to load.
Proposed fix
Add descriptive alt text inside each `![](...)` image reference.

fern/pages/tutorials/local-tokenizer.md-82-89 (1)
82-89: ⚠️ Potential issue | 🟡 Minor
**Sample output table is missing the top border row.** Other tutorial pages in this PR include the full table border (`┏━━━...┓` header line). This output block is missing it, making the rendering look broken/inconsistent.
NVIDIA AIPerf | LLM Metrics +┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓ ┃ Metric ┃ avg ┃ min ┃ max ┃ p99 ┃ p50 ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━┩fern/pages/tutorials/ui-types.md-55-55 (1)
55-55:⚠️ Potential issue | 🟡 MinorAdd language identifier to fenced code block.
The code block at line 55 lacks a language identifier, which prevents proper syntax highlighting. Based on the content (terminal output), use
textorconsole.📝 Proposed fix
**Note:** Dashboard automatically switches to `simple` when using `--verbose` or `--extra-verbose` in a TTY for better log visibility. ## Simple Lightweight progress bars using TQDM: ```bash aiperf profile \ --model Qwen/Qwen3-0.6B \ --url localhost:8000 \ --endpoint-type chat \ --concurrency 10 \ --request-count 100 \ --streaming \ --ui-type simpleSample Output (Successful Run):
-+text
INFO Starting AIPerf System</details> </blockquote></details> <details> <summary>fern/pages/tutorials/ui-types.md-108-108 (1)</summary><blockquote> `108-108`: _⚠️ Potential issue_ | _🟡 Minor_ **Add language identifier to fenced code block.** The code block at line 108 lacks a language identifier. Based on the content (terminal output with timestamps), use `text` or `console`. <details> <summary>📝 Proposed fix</summary> ```diff aiperf profile \ --model Qwen/Qwen3-0.6B \ --url localhost:8000 \ --endpoint-type chat \ --concurrency 10 \ --request-count 100 \ --streaming \ --ui-type noneSample Output (Successful Run):
-+text
23:07:28.809795 INFO Starting AIPerf System</details> </blockquote></details> <details> <summary>fern/pages/tutorials/multi-turn.md-402-406 (1)</summary><blockquote> `402-406`: _⚠️ Potential issue_ | _🟡 Minor_ **`--conversation-turn-delay-ratio` is mentioned but never documented.** Line 405 introduces `--conversation-turn-delay-ratio` as a control for turn delays, but this flag doesn't appear in the Core Parameters section (lines 41–69) or the Quick Reference (lines 420–440). Either document it alongside the other delay parameters, or remove it from this list if it's not a user-facing option. </blockquote></details> <details> <summary>fern/pages/benchmark-modes/trace-replay.md-47-52 (1)</summary><blockquote> `47-52`: _⚠️ Potential issue_ | _🟡 Minor_ **`hash_ids` listed as required but described as optional.** Line 47 introduces these as "Required fields for trace replay" but line 51 says `hash_ids` is "(optional)". This is contradictory — either move `hash_ids` to a separate "Optional fields" section, or remove the "(optional)" qualifier. </blockquote></details> <details> <summary>fern/pages/tutorials/custom-prompt-benchmarking.md-105-115 (1)</summary><blockquote> `105-115`: _⚠️ Potential issue_ | _🟡 Minor_ **`### Use Cases` should be `## Use Cases` to fix heading hierarchy.** This is a top-level section of the document, not a subsection of "Running the Benchmark." Static analysis also flags this as MD001 (heading increment skipped). <details> <summary>Proposed fix</summary> ```diff -### Use Cases +## Use Casesfern/pages/tutorials/custom-prompt-benchmarking.md-41-43 (1)
41-43:⚠️ Potential issue | 🟡 MinorLine 42 will render as an unintended h1 heading.
The line
# Create an input file with specific text inputssits between the JSX comment and the code fence, so it will be rendered as a Markdown h1 heading in the doc. It looks like it was meant to be a code comment or descriptive text.Move it inside the code block (it's already duplicated as a comment on line 44), or convert it to a plain paragraph.
Proposed fix
{/* aiperf-run-vllm-default-openai-endpoint-server */} -# Create an input file with specific text inputs ```bash # Create an input file to use for benchmarkingfern/pages/tutorials/openai-text-endpoints.md-144-164 (1)
144-164:⚠️ Potential issue | 🟡 MinorTrailing blank lines inside fenced code blocks.
Lines 150–151 and 162–163 have blank lines before the closing
```fence, which will render as empty trailing lines inside the code blocks.Proposed fix
cat <<EOF > inputs.jsonl {"texts": ["How are you?"]} {"texts": ["Give me a poem."]} EOF -Run AIPerf against the Completions endpoint using the custom input file:
aiperf profile \ --model Qwen/Qwen3-0.6B \ --endpoint-type completions \ --endpoint /v1/completions \ --input-file inputs.jsonl \ --custom-dataset-type single_turn \ --url localhost:8000 \ --request-count 10 -</details> </blockquote></details> <details> <summary>fern/pages/tutorials/sglang-image-generation.md-224-249 (1)</summary><blockquote> `224-249`: _⚠️ Potential issue_ | _🟡 Minor_ **Filename mismatch between the script output and the "View the generated images" section.** The `extract_images.py` script (line 207) generates filenames with the pattern `image_{line_num:04d}_{data_idx:02d}.jpg` (e.g., `image_0001_00.jpg`), but lines 237, 243, and 249 reference filenames with three numeric segments (`image_0001_00_00.jpg`, `image_0002_00_00.jpg`, `image_0003_00_00.jpg`), which the script never produces. Additionally, the sample output (lines 226–228) shows all images from `line_num=1` (`image_0001_00.jpg` through `image_0001_02.jpg`), implying a single JSONL record with 3 response images. For 3 separate prompts (one per line), the expected output would be `image_0001_00.jpg`, `image_0002_00.jpg`, `image_0003_00.jpg`. <details> <summary>Proposed fix for the viewing section filenames</summary> ```diff Prompt:{"text": "A serene mountain landscape at sunset"}
-*Generated image: image_0001_00_00.jpg* +*Generated image: image_0001_00.jpg* Prompt:{"text": "A futuristic city with flying cars"}
-*Generated image: image_0002_00_00.jpg* +*Generated image: image_0002_00.jpg* Prompt:{"text": "A cute robot playing with a kitten"}
-*Generated image: image_0003_00_00.jpg* +*Generated image: image_0003_00.jpg*And update the sample output to match one-image-per-prompt:
**Output:**-Extracted: /path/to/extracted_images/image_0001_00.jpg
-Extracted: /path/to/extracted_images/image_0001_01.jpg
-Extracted: /path/to/extracted_images/image_0001_02.jpg
+Extracted: /path/to/extracted_images/image_0001_00.jpg
+Extracted: /path/to/extracted_images/image_0002_00.jpg
+Extracted: /path/to/extracted_images/image_0003_00.jpgfern/pages/server-metrics/server-metrics-parquet-schema.md-82-98 (1)
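The naming mismatch flagged above becomes obvious when the pattern is pinned down. A tiny helper mirroring the f-string the review quotes from `extract_images.py` (the helper itself is hypothetical):

```python
def image_filename(line_num: int, data_idx: int) -> str:
    # Two-segment pattern from the review: JSONL line number, then image index.
    return f"image_{line_num:04d}_{data_idx:02d}.jpg"
```

With one prompt per JSONL line and one image per response, the expected files are `image_0001_00.jpg`, `image_0002_00.jpg`, and `image_0003_00.jpg`; a three-segment name like `image_0001_00_00.jpg` can never be produced.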
82-98: ⚠️ Potential issue | 🟡 Minor
**Add `text` language to table example fences**
Unlabeled fenced blocks trigger MD040 and reduce readability.✅ Suggested fix
-``` +```text endpoint_url | metric_name | metric_type | unit | description |timestamp_ns | model_name | value | sum | count | bucket_le | bucket_count ...```diff -``` +```text endpoint_url | metric_name | metric_type | unit | description | timestamp_ns | model_name | value | sum | count | bucket_le | bucket_count ...</details> </blockquote></details> <details> <summary>fern/pages/reference/tokenizer-auto-detection.md-33-82 (1)</summary><blockquote> `33-82`: _⚠️ Potential issue_ | _🟡 Minor_ **Add languages to fenced output blocks (`text`)** These output examples are unlabeled, triggering MD040 and reducing readability. <details> <summary>✅ Suggested fix</summary> ```diff -``` +```text INFO ✓ Tokenizer Qwen/Qwen3-0.6B detected for qwen3-0.6b INFO 1 tokenizer validated • 1 resolved • 0.3s```diff -``` +```text ╭──────────────────────────────── Ambiguous Tokenizer Name ─────────────────────────────────╮ ... ╰───────────────────────────────────────────────────────────────────────────────────────────╯```diff -``` +```text ╭───────────────────────────────── Gated Repository ──────────────────────────────────╮ ... ╰─────────────────────────────────────────────────────────────────────────────────────╯</details> </blockquote></details> <details> <summary>fern/pages/tutorials/http-trace-metrics.md-371-373 (1)</summary><blockquote> `371-373`: _⚠️ Potential issue_ | _🟡 Minor_ **Broken links: relative source-code paths won't resolve in the Fern docs site.** The Fern-generated site doesn't serve files from the repo's `src/` directory, so `../../src/aiperf/...` links will 404. 
Replace these with absolute GitHub URLs pointing to the source files on the `main` branch: <details> <summary>📝 Suggested fix</summary> ```diff -- [Source: trace_models.py](../../src/aiperf/common/models/trace_models.py) - Trace data model definitions -- [Source: http_trace_metrics.py](../../src/aiperf/metrics/types/http_trace_metrics.py) - HTTP trace metric implementations +- [Source: trace_models.py](https://github.com/ai-dynamo/aiperf/blob/main/src/aiperf/common/models/trace_models.py) - Trace data model definitions +- [Source: http_trace_metrics.py](https://github.com/ai-dynamo/aiperf/blob/main/src/aiperf/metrics/types/http_trace_metrics.py) - HTTP trace metric implementationsfern/pages/server-metrics/server-metrics-reference.md-488-499 (1)
488-499:⚠️ Potential issue | 🟡 MinorAmbiguous
`h2d` abbreviation in KVBM transfer patterns.
h2dis used for two different meanings:
kvbm_offload_blocks_h2d→ Host to Disk (line 491)kvbm_onboard_blocks_h2d→ Host to Device (line 493)The "Block transfer patterns" summary (lines 496–499) lists
h2dtwice with contradictory definitions. Consider adding a brief clarifying note (e.g., "In offload context,d= disk; in onboard context,d= device") or restructuring the summary to disambiguate:📝 Suggested clarification
**Block transfer patterns:** -- **d2d**: Device ↔ Disk (direct, fast path) -- **d2h**: Device → Host (offload to CPU memory) -- **h2d**: Host → Device (onboard from CPU memory) -- **h2d** (disk): Host → Disk (persist to storage) +- **d2d**: Device ↔ Disk (direct, bypassing host memory) +- **d2h**: Device → Host (offload to CPU memory) +- **h2d** (onboard): Host → Device (onboard from CPU memory to GPU) +- **h2d** (offload): Host → Disk (persist from CPU memory to storage)fern/pages/genai-perf-feature-comparison.md-14-14 (1)
14-14: ⚠️ Potential issue | 🟡 Minor
**Inconsistent formatting in legend.**
The "Not Applicable" entry uses both backticks and bold formatting (
**N/A**), while other legend entries use only bold. This creates visual inconsistency.📝 Proposed fix for consistent formatting
-- **`N/A`** **Not Applicable** - Feature not applicable +- ❌ **Not Applicable (N/A)** - Feature not applicableAlternatively, if you want to keep the N/A as code-styled:
```diff
-- **`N/A`** **Not Applicable** - Feature not applicable
+- ❌ **Not Applicable (N/A)** - Feature not applicable
```

Alternatively, if you want to keep the N/A as code-styled:

```diff
-- **`N/A`** **Not Applicable** - Feature not applicable
+- `N/A` **Not Applicable** - Feature not applicable
```
🧹 Nitpick comments (22)
fern/pages/plugins/creating-your-first-plugin.md (1)
260-260: Consider expanding the test fixtures guidance. The tutorial mentions creating fixtures in `conftest.py` but doesn't provide concrete examples. Consider adding a brief code snippet showing how to create `mock_model_endpoint`, `mock_request_info`, and `mock_response` fixtures to make the testing section more actionable.
📝 Example fixture structure
```python
# tests/conftest.py
import pytest
from unittest.mock import Mock

@pytest.fixture
def mock_model_endpoint():
    """Create a mock ModelEndpoint for testing."""
    endpoint = Mock()
    endpoint.primary_model_name = "test-model"
    endpoint.endpoint.streaming = False
    return endpoint

@pytest.fixture
def mock_request_info():
    """Create a mock RequestInfo with sample data."""
    # Return minimal RequestInfo structure
    ...
```
55-55: Minor: Add comma after year for clarity. Per common style guides, a comma should follow the year when using month-day-year format.
📝 Suggested fix
```diff
-**Note**: This was a demo endpoint used for the November 13, 2025 presentation. The cluster has been taken down.
+**Note**: This was a demo endpoint used for the November 13, 2025, presentation. The cluster has been taken down.
```

fern/pages/tutorials/huggingface-tgi.md (4)
15-15: Optional: Consider hyphenating compound adjective. The phrase "full text completion" could be hyphenated as "full-text completion" when used as a compound adjective modifying a noun.
80-80: Format CLI flag as inline code. The `--input-file` option should be formatted with backticks for consistency with other CLI flags in the document.
-You can also provide your own text prompts using the ---input-file option. +You can also provide your own text prompts using the `--input-file` option.
55-75: Add language specifier to code block.The output code block should specify a language for better rendering and accessibility. Use
textorconsolefor terminal output.📝 Proposed fix
**Sample Output (Successful Run):** -``` +```text INFO Starting AIPerf System
117-139: Add language specifier to code block. Similar to the earlier output example, this code block should specify a language for better rendering and accessibility. Use `text` or `console` for terminal output.
📝 Proposed fix

````diff
 **Sample Output (Successful Run):**
-```
+```text
 INFO Starting AIPerf System
```
````

fern/pages/tutorials/prefill-concurrency.md (2)
14-25: Add language identifiers to fenced code blocks for proper rendering. Multiple code blocks containing ASCII diagrams and sample terminal outputs are missing language identifiers. While the static analysis tool flags these as warnings, they are valid issues that affect documentation rendering and syntax highlighting.
Proposed fix for ASCII diagrams and sample outputs
For ASCII art diagrams (lines 14-25, 33-52, 172-181), add a `text` identifier:

````diff
-```
+```text
 Request Lifecycle
```
````

For sample terminal outputs (lines 100-121, 147-168, 202-224), add a `text` or `console` identifier:

````diff
-```
+```text
 INFO Starting AIPerf System
```
````

Also applies to: 33-52, 100-121, 147-168, 172-181, 202-224
56-60: Fix blockquote structure with blank line separator. The blank line at line 58 between two blockquotes creates improper Markdown structure. Either merge the blockquotes or separate them completely.
Proposed fix
Option 1 - Merge into single blockquote:
```diff
 > [!IMPORTANT]
 > Requires `--streaming` to be enabled. Without streaming, AIPerf can't detect when the first token arrives.
-
+>
 > [!WARNING]
 > **Coordinated omission trade-off:** When requests wait for prefill slots...
```
> [!IMPORTANT] > Requires `--streaming` to be enabled. Without streaming, AIPerf can't detect when the first token arrives. > [!WARNING] > **Coordinated omission trade-off:** When requests wait for prefill slots...fern/pages/tutorials/plot.md (2)
30-40: Add language identifiers to sample output code blocks.Multiple sample terminal output blocks are missing language identifiers, which affects proper rendering and syntax highlighting.
Proposed fix
Add
textorconsoleidentifier to sample output blocks:-``` +```text INFO Loading single-run data from: artifacts/Qwen_Qwen3-0.6B-chat-concurrency10/Also applies to: 46-56, 71-80, 86-94, 424-433
251-255: Fix blockquote structures with blank line separators.Multiple locations have blank lines between consecutive blockquotes (lines 253, 372, 439, 442, 445, 448), creating improper Markdown structure. Ensure blockquotes are either merged or properly separated.
Example fix for lines 251-255
> [!NOTE] > When experiment classification is enabled, all multi-run plots automatically group by `experiment_group` (directory name) to preserve individual treatment variants with semantic baseline/treatment colors. - +> > [!TIP] > See the CONFIGURATION GUIDE section in `~/.aiperf/plot_config.yaml` for detailed customization options.Also applies to: 370-374, 437-450
fern/pages/tutorials/synthetic-video.md (1)
66-90: Add language identifiers to sample output code blocks.Sample output blocks at lines 66-90 and sections marked at lines 105, 163, and 348 are missing language identifiers.
Proposed fix
Add
textidentifier to sample output blocks:**Sample Output (Successful Run):** -``` +```text INFO Starting AIPerf SystemAlso applies to: 105-106, 163-164, 348-349
fern/pages/tutorials/request-rate-concurrency.md (2)
27-41: Fix blockquote structures with blank line separators.Blank lines at lines 29 and 32 within blockquote sequences create improper Markdown structure. Merge related blockquotes or ensure proper separation.
Proposed fix
> [!IMPORTANT] > **No catch-up behavior**: When the concurrency limit is reached, the system does not attempt to "catch up" by issuing requests faster once slots free up. The schedule continues at the configured rate. - +> > [!TIP] > **Sustaining max concurrency**: If your request rate is faster than your server's average response time... - +> > [!NOTE] > **Ramp-up time formula**: `ramp_up_time = concurrency / request_rate`
96-115: Add language identifiers to sample output code blocks.Sample output blocks at lines 96-115 and 140-159 are missing language identifiers.
Proposed fix
**Sample Output (Successful Run):** -``` +```text INFO Starting AIPerf SystemAlso applies to: 140-159
fern/pages/tutorials/rankings.md (1)
52-71: Add language identifier to sample output code block.The sample output block at lines 52-71 is missing a language identifier for proper rendering.
Proposed fix
**Sample Output (Successful Run):** -``` +```text INFO Starting AIPerf Systemfern/pages/tutorials/custom-dataset.md (1)
89-131: Add language identifiers to sample output code blocks.Sample output blocks at lines 89-131, 170-210, and 264-305 are missing language identifiers for proper rendering.
Proposed fix
**Output:** -``` +```text NVIDIA AIPerf | LLM MetricsAlso applies to: 170-210, 264-305
fern/pages/tutorials/gpu-telemetry.md (3)
1-2: Use standard Markdown comment syntax.The file uses non-standard
{/* ... */}comment syntax. For Markdown files, use standard HTML comment syntax<!-- ... -->.Proposed fix
-{/* # SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -# SPDX-License-Identifier: Apache-2.0 */} +<!-- SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +SPDX-License-Identifier: Apache-2.0 -->
44-55: Fix blockquote structures with blank line separators.Blank lines at lines 50 and 53 between blockquote sections create improper Markdown structure. Merge related blockquotes or ensure proper separation.
Proposed fix
> [!IMPORTANT] > **DCGM mode (default):** The default endpoints `http://localhost:9400/metrics` and `http://localhost:9401/metrics` are always attempted... -> +> > **pynvml mode:** When using `--gpu-telemetry pynvml`, DCGM endpoints are NOT used... -> +> > To completely disable GPU telemetry collection, use `--no-gpu-telemetry`. - +> > [!NOTE] > When specifying custom DCGM exporter URLs, the `http://` prefix is optional... - +> > [!TIP] > For simple local GPU monitoring without DCGM setup, use `--gpu-telemetry pynvml`...
159-193: Add language identifiers to sample output code blocks.Sample output blocks at lines 159-193, 498-521, and 524-538 are missing language identifiers for proper rendering.
Proposed fix
For terminal output:
**Sample Output (Successful Run):** -``` +```text INFO Starting AIPerf SystemFor CSV output:
## Example CSV Export -``` +```csv Endpoint,GPU_Index,GPU_Name,GPU_UUID,Metric,avg,min,max...Also applies to: 498-521, 524-538
fern/pages/tutorials/multi-turn.md (1)
2-2: Copyright year range inconsistency.This file uses
2024-2025while most other new files in this PR use2025-2026. Consider updating for consistency.fern/pages/tutorials/prefix-synthesis.md (1)
1-3: SPDX header format differs from other files.This file uses a JSX comment (
{/* ... */}) for the SPDX header, while all other new docs in this PR use YAML frontmatter (---delimiters). Consider using the standard frontmatter format for consistency.Proposed fix
-{/* # SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -# SPDX-License-Identifier: Apache-2.0 */} +--- +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +--- + # Prefix Data Synthesis Tutorialfern/pages/server-metrics/server-metrics-reference.md (1)
166-168: Heading level inconsistency: backend sections should be###under "Detailed Metric Definitions".Line 166 introduces
## Detailed Metric Definitionsas a parent section, and the TOC (lines 13–18) indents the backends as children. However,Dynamo Frontend(line 168),Dynamo Component(line 217),vLLM(line 273), etc. are all##— the same level as their parent — so they appear as siblings in the document outline rather than subsections.If you want them nested under "Detailed Metric Definitions", change each to
###. Alternatively, if flat h2 is intentional for visual weight in the rendered Fern site, consider removing the "Detailed Metric Definitions" h2 header (or making it a non-heading intro paragraph) to avoid the structural mismatch.fern/pages/tutorials/http-trace-metrics.md (1)
34-45: Add language identifiers to fenced code blocks for linter compliance.Several fenced code blocks (the lifecycle diagram at line 34, formulas at lines 88/98/106, and the streaming example at line 137) lack a language specifier. Using
textwould satisfy the markdownlint MD040 rule while preserving readability.📝 Example fix (line 34)
-``` +```text Request Lifecycle ──────────...
Converting PR #676 to Config-Only Fern Setup

PR: ai-dynamo/aiperf#676 -- docs: Add initial fern docs

Background: Two Ways to Set Up Fern

When adding Fern to a repo that already has a docs/ directory, there are two ways to wire it up: Approach A copies content into fern/pages/, while Approach B keeps fern/ config-only and points the navigation at the existing docs/.

| | Approach A (Copy) | Approach B (Config-Only) |
|---|---|---|
| Files in PR | 85 (60+ duplicates) | ~25 (config + in-place fixes) |
| Doc edits | Must update both docs/ and fern/pages/ | Edit docs/ once |
| Drift risk | docs/ and fern/pages/ can go out of sync | Impossible -- single source |
| Versioning | Unclear where old versions go | Clean: fern/versions/vX/docs/ snapshots from release branches |
| GitHub rendering | docs/ still visible on GitHub but may diverge from site | docs/ IS the site content |
| PR review burden | Reviewers see every file twice | Reviewers see only real changes |
Fern supports both approaches -- it just needs a relative path from the version YAML to the Markdown file. It does not require content to live inside fern/.
What Needs to Change
The goal is to convert PR #676 from Approach A to Approach B.
Net effect: Remove ~60 duplicated files, shrinking the diff from 85 files to ~25 files.
Step-by-Step Changes
1. Delete fern/pages/ Entirely
```bash
git rm -r fern/pages/
```

This removes all 60+ copied files. Content is already in docs/.
2. Update fern/versions/next.yml
Every path: entry needs to change from ../pages/* to ../../docs/*.
Before:
```yaml
- page: Welcome to AIPerf Documentation
  path: ../pages/index.md
- page: AIPerf Metrics Reference
  path: ../pages/metrics-reference.md
- section: Tutorials
  contents:
    - page: Warmup Phase Configuration
      path: ../pages/tutorials/warmup.md
```

After:
```yaml
- page: Welcome to AIPerf Documentation
  path: ../../docs/index.md
- page: AIPerf Metrics Reference
  path: ../../docs/metrics-reference.md
- section: Tutorials
  contents:
    - page: Warmup Phase Configuration
      path: ../../docs/tutorials/warmup.md
```

The `page:` titles and section structure stay exactly the same. Only the `path:` values change.
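The rewrite can also be scripted; a minimal sketch that mirrors the sed command below and reports how many paths it touched (assumes every nav entry uses the `path: ../pages/...` form):

```python
import re

def rewrite_paths(nav_text: str) -> tuple[str, int]:
    """Point Fern nav entries at docs/ instead of the deleted fern/pages/."""
    return re.subn(r"path: \.\./pages/", "path: ../../docs/", nav_text)

before = "- page: Welcome to AIPerf Documentation\n  path: ../pages/index.md\n"
after, count = rewrite_paths(before)
print(after)   # path now reads ../../docs/index.md
print(count)   # 1
```

Read `fern/versions/next.yml`, pass its text through this, and write it back; a count of zero means nothing matched and the file should be inspected by hand.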
Quick sed command (verify the output before committing):
```bash
sed -i 's|path: \.\./pages/|path: ../../docs/|g' fern/versions/next.yml
```

3. Keep All docs/ Modifications
The PR already makes in-place changes to docs/ files (MDX fixes, SPDX frontmatter, H1 removal). These changes are still needed and should remain in the PR:
- `docs/api/synthesis.md` -- modified
- `docs/benchmark-datasets.md` -- renamed from `benchmark_datasets.md`
- `docs/benchmark-modes/timing-modes-reference.md` -- renamed
- `docs/benchmark-modes/trace-replay.md` -- renamed
- `docs/cli-options.md` -- renamed from `cli_options.md`
- `docs/comprehensive-llm-benchmarking.md` -- modified
- `docs/environment-variables.md` -- renamed
- `docs/metrics-reference.md` -- renamed
- `docs/server-metrics/*.md` -- renamed
- `docs/tutorials/*.md` -- modified (link fixes)
4. Fix Remaining CodeRabbit Issues
These 7 unresolved issues now apply to the docs/ files directly:
4a. Broken Cross-Repo Links
In docs/reproducibility.md (was fern/pages/reproducibility.md), convert relative links that escape docs/:
```diff
-- [test_random_generator_canary.py](../tests/integration/test_random_generator_canary.py)
-- [test_deterministic_behavior.py](../tests/integration/test_deterministic_behavior.py)
+- [test_random_generator_canary.py](https://github.com/ai-dynamo/aiperf/blob/main/tests/integration/test_random_generator_canary.py)
+- [test_deterministic_behavior.py](https://github.com/ai-dynamo/aiperf/blob/main/tests/integration/test_deterministic_behavior.py)
```

And:
```diff
-See [random_generator.py](../src/aiperf/common/random_generator.py)
+See [random_generator.py](https://github.com/ai-dynamo/aiperf/blob/main/src/aiperf/common/random_generator.py)
```

4b. PII Leak in Sample Output
In docs/tutorials/custom-dataset.md, replace developer home paths:
```diff
-/home/lkomali/aiperf/artifacts/Qwen_Qwen3-0.6B-openai-chat-concurrency2/profile_export_aiperf.csv
+artifacts/Qwen_Qwen3-0.6B-openai-chat-concurrency2/profile_export_aiperf.csv
```

Apply to all three output blocks in the file.
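The scrub can be done mechanically; a sketch assuming the leaked paths all follow the `/home/<user>/aiperf/artifacts/...` shape seen in the diff:

```python
import re

# Matches a developer-specific prefix immediately before artifacts/.
HOME_PREFIX = re.compile(r"/home/[^/\s]+/aiperf/(?=artifacts/)")

def scrub_home_paths(text: str) -> str:
    """Strip /home/<user>/aiperf/ prefixes, leaving repo-relative paths."""
    return HOME_PREFIX.sub("", text)

line = "/home/lkomali/aiperf/artifacts/run1/profile_export_aiperf.csv"
print(scrub_home_paths(line))  # artifacts/run1/profile_export_aiperf.csv
```

Run it over each affected file's text and diff the result before committing.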
4c. Code Block Fencing
In docs/tutorials/multi-url-load-balancing.md and docs/tutorials/arrival-patterns.md, close bash code fences before sample output sections and tag output blocks with text:
````diff
 ```bash
 aiperf profile --model llama \
   --url http://server1:8000 \
   --request-rate 20
+```

 **Sample Output:**
-```
+```text
 INFO Starting AIPerf System
 ...
````
#### 4d. Angle Bracket Escaping
In `docs/genai-perf-feature-comparison.md` (and similar files), use `<` instead of `\<`:
```diff
-| Gamma distribution shape: \<1=bursty, 1=Poisson, >1=smooth |
+| Gamma distribution shape: <1=bursty, 1=Poisson, >1=smooth |
```
Apply the same pattern across docs/cli-options.md and server-metrics files.
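One way to apply the unescape across files without touching code samples is to skip fenced blocks; a sketch (assumes `\<` is the only escape that needs reverting):

```python
def unescape_angle_brackets(markdown: str) -> str:
    """Replace \\< with < in prose and table lines, leaving fenced code intact."""
    out, in_fence = [], False
    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # toggle on every fence marker
        out.append(line if in_fence else line.replace("\\<", "<"))
    return "\n".join(out)

row = "| Gamma distribution shape: \\<1=bursty, 1=Poisson, >1=smooth |"
print(unescape_angle_brackets(row))
```

Note this leaves inline code spans alone only by accident of the pattern; review the diff rather than trusting it blindly.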
4e. mkdocs.yml Nav Entry
Update mkdocs.yml to match the renamed file:
```diff
-- CLI Options: cli_options.md
+- CLI Options: cli-options.md
```

4f. Server-Metrics Link Mismatch
In docs/server-metrics/server-metrics-reference.md, fix underscore links:
```diff
-[JSON Schema Reference](server_metrics_json_schema.md)
-[Parquet Schema Reference](server_metrics_parquet_schema.md)
+[JSON Schema Reference](server-metrics-json-schema.md)
+[Parquet Schema Reference](server-metrics-parquet-schema.md)
```

5. Fern-Only Pages
Some pages in fern/pages/ were created specifically for Fern and don't exist in docs/:
- `fern/pages/index.md` -- Fern landing page
- `fern/pages/architecture.md` -- may not exist in `docs/`
- `fern/pages/migrating.md` -- GenAI-Perf migration guide
- `fern/pages/genai-perf-feature-comparison.md`
- `fern/pages/dev/patterns.md`
- `fern/pages/diagrams/metrics-flow.md`
- `fern/pages/diagrams/mixins.md`
For each of these: Check if a corresponding file exists in docs/. If not, either:
- Move it to `docs/` (preferred -- it becomes the source of truth)
- Keep it in `fern/pages/` as a Fern-only page and reference it from `next.yml` as `../pages/filename.md`
6. Update CI Trigger Paths (If Applicable)
If the PR includes a GitHub Actions workflow for publishing, ensure it triggers on both:
paths: ['fern/**', 'docs/**']7. Verify
After making all changes:
# Validate config
fern check --warnings
# Preview locally
fern docs dev --port 3000
# Check every page renders, images load, links workSummary of Changes
| Action | Files Affected | Lines |
|---|---|---|
| Delete fern/pages/ | -60 files | -19k lines |
| Update next.yml paths | 1 file | ~60 path changes |
| Fix CodeRabbit issues in docs/ | ~6 files | ~30 lines |
| Move Fern-only pages to docs/ | ~7 files | net zero |
| Net PR reduction | ~53 fewer files | ~19k fewer lines |
f159bd8 to c754b47
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (9)
docs/tutorials/sglang-image-generation.md (1)
`226-229`: ⚠️ Potential issue | 🟡 Minor

**Filename format mismatch between script output and example media images.**
The extraction script (line 207) produces 2-segment names:
```python
f"image_{line_num:04d}_{data_idx:02d}.jpg"  # → image_0001_00.jpg
```

The sample output (lines 226–229) correctly reflects this. However, the example images embedded in the "View the generated images" section carry 3-segment, hyphenated names (`image-0001-00-00.jpg`). A reader who runs the script will produce `image_0001_00.jpg` and won't recognize a match with the displayed examples.

Either update the media filenames (and the references on lines 237, 243, 249) to match what the script actually generates, or adjust the script's naming to add the trailing segment and align all three artefacts (script code, sample output, and media filenames) to the same convention.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/sglang-image-generation.md` around lines 226 - 229, The example media filenames do not match the extraction script's f-string (f"image_{line_num:04d}_{data_idx:02d}.jpg")—the script produces "image_0001_00.jpg" but the docs show hyphenated "image-0001-00-00.jpg"; fix by either renaming the embedded media and their references on the "View the generated images" lines (currently at the three places called out) to the underscore two-segment form that the script outputs, or change the extraction naming convention in the script to a three-segment hyphenated format (adding the extra trailing index and using "-" separators) and then update the sample output and media references so all three (script, sample output, media files) use the exact same filename pattern; ensure consistency for filenames referenced on the three lines mentioned.docs/tutorials/audio.md (1)
`89-92`: ⚠️ Potential issue | 🟡 Minor

**Developer home path leaks into sample output.**
Lines 89–92 (and 153–156) contain `/home/lkomali/aiperf/artifacts/...`, a developer-specific path. The PR reviewer flagged the same issue as fix 4b for `custom-dataset.md`; the same correction is needed here.

✏️ Suggested fix -- replace with a generic path
```diff
-/home/lkomali/aiperf/artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency4/profile_export_aiperf.csv
+artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency4/profile_export_aiperf.csv
-/home/lkomali/aiperf/artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency4/profile_export_aiperf.json
+artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency4/profile_export_aiperf.json
-/home/lkomali/aiperf/artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency4/logs/aiperf.log
+artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency4/logs/aiperf.log
```

Apply the same substitution to lines 153–156.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/audio.md` around lines 89 - 92, Replace developer-specific absolute paths in the sample output lines that reference profile_export_aiperf.csv, profile_export_aiperf.json, and aiperf.log with a generic placeholder path (e.g., /path/to/artifacts/<run-name>/...) in the docs/tutorials/audio.md content; apply the same substitution to the other matching block later in the file (the lines containing the same three filenames) so both occurrences no longer leak the developer home directory.docs/benchmark-modes/trace-replay.md (1)
`63-111`: ⚠️ Potential issue | 🟡 Minor

**Same duplicate section-ID pattern -- three distinct scenarios share one ID.**
All three `{/* aiperf-run-vllm-default-openai-endpoint-server */}` pairs (lines 63–71, 74–85, 93–111) wrap different scenarios (create trace file / run custom trace / run Mooncake trace) but carry the same marker. As flagged in `openai-text-endpoints.md`, this prevents unique extraction by the e2e parser. Consider using distinct IDs such as `aiperf-run-vllm-trace-create-custom`, `aiperf-run-vllm-trace-run-custom`, and `aiperf-run-vllm-trace-run-mooncake`.
Verify each finding against the current code and only fix it if needed. In `@docs/benchmark-modes/trace-replay.md` around lines 63 - 111, The three code block markers all use the same ID "aiperf-run-vllm-default-openai-endpoint-server", preventing unique extraction; update each pair to unique IDs that reflect their scenario names (e.g., change the create-trace block marker to "aiperf-run-vllm-trace-create-custom", the run-custom-trace block to "aiperf-run-vllm-trace-run-custom", and the Mooncake trace block to "aiperf-run-vllm-trace-run-mooncake"), ensuring both the opening and closing comment markers around each fenced bash block are changed accordingly so the e2e parser can extract them uniquely.docs/tutorials/multi-run-confidence.md (1)
`703-704`: ⚠️ Potential issue | 🟡 Minor

**Broken cross-reference links -- filenames changed in this PR.**
The PR renames `cli_options.md` → `cli-options.md` (mkdocs.yml update, fix 4e) and `metrics_reference.md` → `metrics-reference.md` (confirmed by the `working-with-profile-exports.md` change). Both links on lines 703–704 still point to the old underscore names and will 404.

🔗 Proposed fix
```diff
-- [CLI Options](../cli_options.md) - Full parameter reference
-- [Metrics Reference](../metrics_reference.md) - Detailed metric descriptions
+- [CLI Options](../cli-options.md) - Full parameter reference
+- [Metrics Reference](../metrics-reference.md) - Detailed metric descriptions
```
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/multi-run-confidence.md` around lines 703 - 704, Update the broken cross-reference links in docs/tutorials/multi-run-confidence.md that still point to the old underscored filenames: change ../cli_options.md to ../cli-options.md and ../metrics_reference.md to ../metrics-reference.md (and search the same file for any other occurrences of cli_options or metrics_reference to replace them too) so the links match the renamed files referenced in mkdocs.yml and other docs.docs/tutorials/openai-text-endpoints.md (1)
`31-164`: ⚠️ Potential issue | 🔴 Critical

**All `aiperf-run-vllm-default-openai-endpoint-server` tags across 15+ tutorial and benchmark files extract to the same server identifier, causing parser collisions.**

The parser extracts server name from the tag format
`aiperf-run-{server-name}-endpoint-server`, meaning all 66 occurrences of the identical tag `aiperf-run-vllm-default-openai-endpoint-server` (across openai-text-endpoints.md, multi-turn.md, custom-dataset.md, request-cancellation.md, trace-replay.md, and others) extract to server name `vllm-default`. These commands are then appended to a single list, making it impossible for the e2e test parser to differentiate between distinct scenarios, collapsing multiple test cases into one.
- openai-text-endpoints.md: 4 pairs (chat/completions × synthetic/custom)
- multi-turn.md: 6 pairs
- custom-dataset.md: 3 pairs
- request-cancellation.md: 3 pairs
- trace-replay.md: 3 pairs
- Plus 10+ additional tutorial files with 1–2 pairs each
Each distinct scenario needs a unique identifier. Use a pattern like
aiperf-run-vllm-{scenario}-endpoint-serverto distinguish scenarios within each file and avoid collisions across files.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/openai-text-endpoints.md` around lines 31 - 164, The repeated snippet tag name aiperf-run-vllm-default-openai-endpoint-server causes parser collisions; update each opening and matching closing tag in this file (e.g., the four pairs around the chat synthetic, chat custom, completions synthetic, completions custom blocks) to use unique identifiers following the suggested pattern such as aiperf-run-vllm-chat-synthetic-endpoint-server, aiperf-run-vllm-chat-custom-endpoint-server, aiperf-run-vllm-completions-synthetic-endpoint-server, and aiperf-run-vllm-completions-custom-endpoint-server so the parser extracts distinct server names; ensure every changed opening tag has its corresponding closing tag updated to the exact same new identifier.docs/tutorials/embeddings.md (1)
79-101:⚠️ Potential issue | 🟠 Major
**aiperf-run tag wraps the wrong bash block -- CI parser will extract the `cat` command instead of `aiperf profile`.**

`parser.py`'s `_extract_bash_block` scans forward from the tag line and returns the first fenced `bash` block it encounters. With the opening tag placed at line 79, the first bash block is the `cat <<EOF > inputs.jsonl` heredoc (lines 80–89), not the `aiperf profile` command (lines 92–100). The CI E2E test would register a `cat` command as the server's aiperf run command, causing incorrect test execution.
-{/* aiperf-run-vllm-default-openai-endpoint-server */} ```bash cat <<EOF > inputs.jsonl {"texts": ["What is artificial intelligence?"]} {"texts": ["Explain machine learning."]} {"texts": ["How do neural networks work?"]} {"texts": ["Define deep learning."]} {"texts": ["What are transformers in AI?"]} EOFRun AIPerf using the custom input file:
````diff
-{/* aiperf-run-vllm-default-openai-endpoint-server */}
 ```bash
 cat <<EOF > inputs.jsonl
 {"texts": ["What is artificial intelligence?"]}
 {"texts": ["Explain machine learning."]}
 {"texts": ["How do neural networks work?"]}
 {"texts": ["Define deep learning."]}
 {"texts": ["What are transformers in AI?"]}
 EOF
 ```

 Run AIPerf using the custom input file:
````
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/embeddings.md` around lines 79 - 101, The aiperf-run tag is placed before the heredoc bash block so parser.py::_extract_bash_block picks up the cat <<EOF block instead of the intended aiperf profile command; move the opening tag {/* aiperf-run-vllm-default-openai-endpoint-server */} from above the inputs.jsonl heredoc to immediately before the ```bash block that starts the aiperf profile command (and keep the closing tag after that block) so the parser extracts the correct aiperf profile command for CI.docs/benchmark-datasets.md (1)
`42-42`: ⚠️ Potential issue | 🟡 Minor

**Broken relative link: file was renamed but this reference wasn't updated.**
`benchmark_modes/trace_replay.md` (underscores) no longer exists; the PR renamed it to `benchmark-modes/trace-replay.md`. This link will 404 in both mkdocs and Fern.

🔧 Proposed fix
```diff
- <td>Mooncake trace file <a href="benchmark_modes/trace_replay.md"><code>--input-file your_trace_file.jsonl --custom-dataset-type mooncake_trace</code></a></td>
+ <td>Mooncake trace file <a href="benchmark-modes/trace-replay.md"><code>--input-file your_trace_file.jsonl --custom-dataset-type mooncake_trace</code></a></td>
```
Verify each finding against the current code and only fix it if needed. In `@docs/benchmark-datasets.md` at line 42, Update the broken relative link in the Mooncake trace file row by replacing the old reference "benchmark_modes/trace_replay.md" with the new path "benchmark-modes/trace-replay.md" so the anchor around the code snippet (--input-file your_trace_file.jsonl --custom-dataset-type mooncake_trace) points to the renamed document; ensure the href in that <a> tag is updated accordingly.docs/tutorials/vision.md (1)
`40-53`: ⚠️ Potential issue | 🟡 Minor

**Duplicate marker IDs across different code blocks.**
The marker `aiperf-run-vllm-vision-openai-endpoint-server` is reused on three distinct code blocks (lines 40, 86, 100). If these markers are used for snippet extraction or testing, the duplicate IDs could cause the wrong block to be selected. Consider using unique identifiers (e.g., `aiperf-run-vllm-vision-synthetic`, `aiperf-run-vllm-vision-custom-file`, `aiperf-run-vllm-vision-custom-run`).

Also applies to: 86-96, 100-111
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/vision.md` around lines 40 - 53, Duplicate marker ID "aiperf-run-vllm-vision-openai-endpoint-server" is used for multiple code blocks causing potential snippet extraction collisions; update each marker to a unique identifier (for example replace the three occurrences with "aiperf-run-vllm-vision-synthetic", "aiperf-run-vllm-vision-custom-file", and "aiperf-run-vllm-vision-custom-run") by locating the markers surrounding the code fences (the strings inside {/* ... */}) and renaming them consistently at both the opening and closing markers so each block has a distinct ID.tools/generate_cli_docs.py (1)
428-432:⚠️ Potential issue | 🟡 MinorCommand help text paragraphs not wrapped with
_escape_mdx_prose.Parameter descriptions (line 331) and choice descriptions (line 345) are escaped, but command-level help text at line 432 passes through
normalize_textonly. Theplotcommand's description contains<first_path>, which would break MDX rendering without escaping.Proposed fix
```diff
 if desc:
     for para in desc.split("\n\n"):
         if para.strip():
-            lines.extend([normalize_text(para), ""])
+            lines.extend([_escape_mdx_prose(normalize_text(para)), ""])
```
Verify each finding against the current code and only fix it if needed. In `@tools/generate_cli_docs.py` around lines 428 - 432, The command-level description is not MDX-escaped: when building desc from desc_lines the code uses normalize_text(para) but does not call _escape_mdx_prose, so angle-bracketed tokens (e.g. <first_path>) break rendering; update the block that processes desc (the variables desc, desc_lines and loop that appends to lines) to pass each paragraph through _escape_mdx_prose (e.g. _escape_mdx_prose(normalize_text(para))) before extending lines so command help text is properly escaped for MDX.
🧹 Nitpick comments (4)
docs/tutorials/sglang-image-generation.md (1)
`240-242`: **Specify a language on fenced code blocks containing JSON prompts (MD040).**

Lines 240 and 246 open code fences with no language identifier. Adding `json` makes intent explicit and suppresses the markdownlint warning.

✏️ Proposed fix
````diff
-```
+```json
 {"text": "A futuristic city with flying cars"}
````

````diff
-```
+```json
 {"text": "A cute robot playing with a kitten"}
````

> **Note:** The same fix applies to the first prompt block at line 234.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/sglang-image-generation.md` around lines 240 - 242, Add the missing language identifier "json" to the fenced code blocks that contain JSON prompts (e.g., the blocks containing {"text": "A futuristic city with flying cars"} and {"text": "A cute robot playing with a kitten"} and the earlier prompt at the first block), by changing the opening backtick fences from ``` to ```json so the intent is explicit and the markdownlint MD040 warning is suppressed.

docs/tutorials/embeddings.md (1)

`37-37`: **Recurring MD037 markdownlint false positives from MDX comment syntax (applies to all changed doc files).**

Lines 37, 49, 79, 101 here, and similarly in `docs/tutorials/fixed-schedule.md` (57, 84, 121, 136) and `docs/tutorial.md` (12, 21, 23, 27, 30, 42) all trigger MD037 ("Spaces inside emphasis markers") because markdownlint cannot distinguish `{/* … */}` MDX comment delimiters from `* … *` emphasis syntax. Consider suppressing MD037 in the project's markdownlint config for MDX-style files, e.g. in `.markdownlint-cli2.yaml`:

```yaml
# MDX comment delimiters {/* … */} trigger false MD037 positives
MD037: false
```

Or scope it per-glob if you want to keep the rule for pure-markdown files:

```yaml
overrides:
  - globs: ["docs/**/*.md"]
    config:
      MD037: false
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/embeddings.md` at line 37, Several doc files use MDX-style comment delimiters like {/* aiperf-run-vllm-default-openai-endpoint-server */} which markdownlint mis-parses as emphasis and triggers MD037; update the markdownlint config (.markdownlint-cli2.yaml) to suppress MD037 for MDX-style files by adding MD037: false globally or, preferably, add an overrides entry that disables MD037 for the docs glob (e.g., docs/**/*.md or the MDX-specific glob) so the {/* … */} comments no longer produce false positives while preserving MD037 for pure-markdown files.

docs/comprehensive-llm-benchmarking.md (1)

`1-4`: **Stray `#` prefix on SPDX lines is inconsistent with all other files in this PR.**

Lines 2–3 read `# SPDX-FileCopyrightText:` / `# SPDX-License-Identifier:` -- the `#` prefix is a leftover from a Python/shell-comment-style header and was not stripped during the conversion. Every other converted docs file (e.g., `docs/tutorials/custom-dataset.md`) and the `SPDX_HEADER_MD` constant in `tools/_core.py` omit the `#`.

✏️ Suggested fix

```diff
 {/*
-# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-# SPDX-License-Identifier: Apache-2.0
+SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+SPDX-License-Identifier: Apache-2.0
 */}
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@docs/comprehensive-llm-benchmarking.md` around lines 1 - 4, The SPDX header uses a stray leading "#" on the two SPDX lines; remove the "#" prefix from the lines beginning with "# SPDX-FileCopyrightText:" and "# SPDX-License-Identifier:" so they match the SPDX_HEADER_MD format used elsewhere (i.e., change them to "SPDX-FileCopyrightText: ..." and "SPDX-License-Identifier: ...").

.cursor/skills/docs-to-fern/SKILL_md (1)

`113-118`: **Hardcoded Fern CLI version `3.29.1` will become stale.**

Consider recommending that users check the latest version (`npm show fern-api version`) without hardcoding a specific version, or add a note to update this periodically.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.cursor/skills/docs-to-fern/SKILL_md around lines 113 - 118, The JSON in SKILL_md hardcodes "version": "3.29.1", which will become stale; replace the fixed version with guidance to either leave the version out or use a placeholder and add a note telling users to run `npm show fern-api version` to find the latest Fern CLI version (or to substitute "latest"/their desired version), and update the documentation text around the JSON (the block containing "organization" and "version") to explain how and when to refresh the version value.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.cursor/skills/docs-to-fern/SKILL_md:
- Around line 326-361: The fix_file function currently detects SPDX blocks using
spdx_pattern and replaces them with YAML frontmatter (--- ... ---); change that
behavior so detected SPDX-FileCopyrightText/SPDX-License-Identifier blocks are
converted into JSX block comments like {/* ... */} (preserving the original
lines and ordering) instead of creating a YAML frontmatter block, and ensure the
subsequent generic HTML-to-JSX replacement (the re.sub that turns <!--(.*?)-->
into {/* \1 */}) still matches or is adjusted to avoid double-wrapping SPDX
content.
- Around line 262-280: Update the skill document to remove the duplicate-content
Approach A instructions in "Phase 3: Migrate Content" (specifically "Step 3.1:
Bulk Copy with Hyphen Renaming" and any steps that create or require
fern/pages/* copies) and replace them with guidance for the config-only Approach
B: describe that fern/ holds only configuration, that next.yml navigation
entries should point to the original docs via relative paths (e.g.,
../../docs/), and remove the Definition of Done requirement that every
docs/*.md must have a corresponding fern/pages/*.md; update any example code and
references to "fern/pages/" to instead show the next.yml relative-path pattern.

In @docs/cli-options.md:
- Line 492: The docstring in src/aiperf/common/config/prompt_config.py contains
two adjacent string literals for the --num-prefix-prompts description that
concatenate without a space, producing "off by one.Mutually". Edit the string
literals in the prompt_config module (the doc/description for
--num-prefix-prompts in PromptConfig or the module-level prompt config constant)
to add a trailing space after the period or join them into a single string so
the generated docs read "...off by one. Mutually exclusive..." (ensure the
change targets the adjacent literals shown in the review).In
@docs/diagrams/metrics-flow.md:
- Around line 12-61: The Mermaid diagram connectors were corrupted: every valid
arrow "-->" was replaced by "*/}", breaking edges; restore all "*/}" connectors
back to "-->" within the Mermaid fenced block so nodes like A, B1/B2/B3,
C1/C2/C3, D1/D2/D3, E1/E2/E3, G, H1/H2, I2, J1/J2/J3 and K reconnect correctly.
Locate the mermaid block containing the node IDs (e.g., "MetricRecordProcessor",
"RECORD: RequestLatencyMetric", "AGGREGATE: TotalRequestsMetric",
"MetricResultsDict", "Summarize Function") and perform a global replace of the
connector token "*/}" with the correct Mermaid arrow "-->" (or manually fix each
edge) ensuring spacing/HTML entities remain unchanged. Ensure the fix is applied
to every connector occurrence in the file so the rendered diagram is fully
connected.

In @docs/diagrams/mixins.md:
- Around line 9-31: The Mermaid diagram has accidental "*/}" edge tokens (e.g.,
between "BaseMixin" -> "AIPerfLoggerMixin", "AIPerfLoggerMixin" -> "HooksMixin",
and others like "BaseService" -> "BaseComponentService") which break the
connections; replace all occurrences of the "*/}" connector with the proper
"-->" arrow (preserving surrounding spacing as needed) so every link like A -> B
uses "-->" (ensure entries such as "AIPerfLoggerMixin */} C" become
"AIPerfLoggerMixin --> C" across the file).
In @docs/index.md:
- Around line 1-4: Update the SPDX header comment so the copyright year matches
the other changed docs by replacing the single year "2025" with the range
"2025-2026" in the existing SPDX block (the comment that begins with
"SPDX-FileCopyrightText" and includes "SPDX-License-Identifier: Apache-2.0");
ensure the formatting of the comment remains unchanged aside from the year
range.

In @docs/metrics-reference.md:
- Around line 1-4: The file docs/metrics-reference.md currently uses JSX comment
delimiters `{/*` and `*/}` which render as literal text; replace those JSX
comment markers (the `{/*` at the top and the matching `*/}` at the bottom) with
standard HTML comment markers `<!--` and `-->` so the SPDX header lines remain
hidden in this .md file.
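Several of the SPDX findings in this list request the same rewrite — a leading JSX comment block becoming an HTML comment. A rough sketch of that transformation (the function name and regex are illustrative, not the repo's actual tooling):

```python
import re

def jsx_comment_to_html(text: str) -> str:
    """Turn a leading JSX comment `{/* ... */}` into `<!-- ... -->` so
    plain Markdown renderers (GitHub, mkdocs) hide the SPDX header."""
    return re.sub(
        r"\A\{/\*(.*?)\*/\}",
        lambda m: "<!--" + m.group(1) + "-->",
        text,
        flags=re.DOTALL,
    )
```

Anchoring the pattern at the start of the file (`\A`) keeps the pass from rewriting JSX comments used as snippet markers later in the document.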
In @docs/server-metrics/server-metrics-json-schema.md:
- Around line 1-4: The top-of-file JSX-style comment `{/* ... */}` wrapping
the SPDX lines is being rendered as headings; replace that `{/* ... */}` block
with an HTML comment `<!-- SPDX-FileCopyrightText: ... SPDX-License-Identifier: ... -->` (i.e., wrap the SPDX lines in `<!--` and `-->`) so GitHub will not
render them as Markdown H1s, mirroring the same fix applied to the other docs
file.

In @docs/server-metrics/server-metrics-reference.md:
- Around line 1-4: The SPDX header is currently wrapped with JSX-style comment
markers `{/* ... */}`, which GitHub renders as Markdown H1s; replace the `{/*` and
`*/}` wrapper around the two SPDX lines with a proper Markdown/HTML comment
delimiter so the SPDX-FileCopyrightText and SPDX-License-Identifier lines are
hidden; update the block that contains the `{/*` and `*/}` markers and the two
SPDX lines accordingly so the SPDX metadata remains present but does not render
as headings.

In @docs/tutorials/huggingface-tgi.md:
- Around line 1-4: The JSX-style license header using `{/* ... */}` is being
rendered as literal text by MkDocs; replace the JSX comment block (the four-line
SPDX header wrapped in `{/* */}`) in huggingface-tgi.md and each affected tutorial
file (e.g., the listed tutorials such as multi-turn.md, request-cancellation.md,
fixed-schedule.md, etc.) with an HTML comment block using `<!-- ... -->`
containing the same SPDX lines so MkDocs will ignore it; ensure you preserve the
exact SPDX lines and spacing when converting the comment syntax.

In @docs/tutorials/sglang-image-generation.md:
- Around line 237-249: Replace the three empty image alt texts with descriptive
text taken from the prompt above each image: for the image after the prompt
{"text": "A futuristic city with flying cars"} set alt text to "A futuristic
city with flying cars"; for the image after {"text": "A cute robot playing with
a kitten"} set alt text to "A cute robot playing with a kitten"; and for the
third image use the prompt text that precedes it (use that prompt as the alt
text). Update the three markdown image tags (the `![](...)` occurrences) so the
alt text is included inside the brackets.

In @docs/tutorials/sharegpt.md:
- Line 2: The SPDX copyright header line currently uses a single year "2026"
(the line starting with SPDX-FileCopyrightText), which is inconsistent with the
project's "2025-2026" range; update that line to use "2025-2026" so it matches
other files and maintain consistency across the docs.

In @tools/_core.py:
- Around line 90-95: The SPDX_HEADER_MD constant uses JSX-style block comments
(`{/*`, `*/}`) which render as visible text in standard Markdown; update
SPDX_HEADER_MD in tools/_core.py to use HTML comments (`<!-- ... -->`) so the
header is invisible to mkdocs/GFM, e.g. replace the tuple entries wrapping the
SPDX lines with an HTML comment prefix/suffix; alternatively, if you prefer not
to embed the header in docs, remove SPDX_HEADER_MD and configure Fern to inject
the SPDX info via a Fern-native mechanism (frontmatter/_layout or
fern.config.json header/footer) so MDX and plain Markdown pipelines are both
handled correctly.
Outside diff comments:
In @docs/benchmark-datasets.md:
- Line 42: Update the broken relative link in the Mooncake trace file row by
replacing the old reference "benchmark_modes/trace_replay.md" with the new path
"benchmark-modes/trace-replay.md" so the anchor around the code snippet
(--input-file your_trace_file.jsonl --custom-dataset-type mooncake_trace) points
to the renamed document; ensure the href in that tag is updated accordingly.

In @docs/benchmark-modes/trace-replay.md:
- Around line 63-111: The three code block markers all use the same ID
"aiperf-run-vllm-default-openai-endpoint-server", preventing unique extraction;
update each pair to unique IDs that reflect their scenario names (e.g., change
the create-trace block marker to "aiperf-run-vllm-trace-create-custom", the
run-custom-trace block to "aiperf-run-vllm-trace-run-custom", and the Mooncake
trace block to "aiperf-run-vllm-trace-run-mooncake"), ensuring both the opening
and closing comment markers around each fenced bash block are changed
accordingly so the e2e parser can extract them uniquely.
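Collisions like the one flagged here are easy to catch mechanically before the e2e parser trips over them. A sketch of such a check — the marker regex mirrors the `{/* aiperf-run-... */}` convention described in this review, but it is not the parser's actual implementation:

```python
import re
from collections import Counter

MARKER = re.compile(r"\{/\*\s*(aiperf-run-[\w-]+)\s*\*/\}")

def duplicate_marker_ids(text: str) -> list[str]:
    """Return snippet IDs whose opening marker appears more than once.
    Closing markers ({/* /aiperf-run-... */}) do not match the pattern
    because of the leading slash."""
    counts = Counter(MARKER.findall(text))
    return sorted(name for name, n in counts.items() if n > 1)
```

Running this over a tutorial before commit would surface the repeated `aiperf-run-vllm-default-openai-endpoint-server` IDs immediately.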
In @docs/tutorials/audio.md:
- Around line 89-92: Replace developer-specific absolute paths in the sample
output lines that reference profile_export_aiperf.csv,
profile_export_aiperf.json, and aiperf.log with a generic placeholder path
(e.g., /path/to/artifacts/.../...) in the docs/tutorials/audio.md
content; apply the same substitution to the other matching block later in the
file (the lines containing the same three filenames) so both occurrences no
longer leak the developer home directory.

In @docs/tutorials/embeddings.md:
- Around line 79-101: The aiperf-run tag is placed before the heredoc bash block
so parser.py::_extract_bash_block picks up the cat <<EOF block instead of the
intended aiperf profile command; move the opening tag {/*
aiperf-run-vllm-default-openai-endpoint-server */} from above the inputs.jsonl
heredoc to immediately before the ```bash block that starts the aiperf profile
command (and keep the closing tag after that block) so the parser extracts the
correct aiperf profile command for CI.

In @docs/tutorials/multi-run-confidence.md:
- Around line 703-704: Update the broken cross-reference links in
docs/tutorials/multi-run-confidence.md that still point to the old underscored
filenames: change ../cli_options.md to ../cli-options.md and
../metrics_reference.md to ../metrics-reference.md (and search the same file for
any other occurrences of cli_options or metrics_reference to replace them too)
so the links match the renamed files referenced in mkdocs.yml and other docs.

In @docs/tutorials/openai-text-endpoints.md:
- Around line 31-164: The repeated snippet tag name
aiperf-run-vllm-default-openai-endpoint-server causes parser collisions; update
each opening and matching closing tag in this file (e.g., the four pairs around
the chat synthetic, chat custom, completions synthetic, completions custom
blocks) to use unique identifiers following the suggested pattern such as
aiperf-run-vllm-chat-synthetic-endpoint-server,
aiperf-run-vllm-chat-custom-endpoint-server,
aiperf-run-vllm-completions-synthetic-endpoint-server, and
aiperf-run-vllm-completions-custom-endpoint-server so the parser extracts
distinct server names; ensure every changed opening tag has its corresponding
closing tag updated to the exact same new identifier.

In @docs/tutorials/sglang-image-generation.md:
- Around line 226-229: The example media filenames do not match the extraction
script's f-string (f"image_{line_num:04d}_{data_idx:02d}.jpg")—the script
produces "image_0001_00.jpg" but the docs show hyphenated
"image-0001-00-00.jpg"; fix by either renaming the embedded media and their
references on the "View the generated images" lines (currently at the three
places called out) to the underscore two-segment form that the script outputs,
or change the extraction naming convention in the script to a three-segment
hyphenated format (adding the extra trailing index and using "-" separators) and
then update the sample output and media references so all three (script, sample
output, media files) use the exact same filename pattern; ensure consistency for
filenames referenced on the three lines mentioned.
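The mismatch called out above is easiest to see by evaluating the script's f-string directly. The helper name here is illustrative; the format string is quoted from the finding:

```python
def extracted_image_name(line_num: int, data_idx: int) -> str:
    # Format string quoted from the tutorial's extraction script.
    return f"image_{line_num:04d}_{data_idx:02d}.jpg"
```

Whichever side is changed, the script output, the sample output, and the embedded media references must all end up using this exact pattern.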
In @docs/tutorials/vision.md:
- Around line 40-53: Duplicate marker ID
"aiperf-run-vllm-vision-openai-endpoint-server" is used for multiple code blocks
causing potential snippet extraction collisions; update each marker to a unique
identifier (for example replace the three occurrences with
"aiperf-run-vllm-vision-synthetic", "aiperf-run-vllm-vision-custom-file", and
"aiperf-run-vllm-vision-custom-run") by locating the markers surrounding the
code fences (the strings inside {/* ... */}) and renaming them consistently at
both the opening and closing markers so each block has a distinct ID.

In @tools/generate_cli_docs.py:
- Around line 428-432: The command-level description is not MDX-escaped: when
building desc from desc_lines the code uses normalize_text(para) but does not
call _escape_mdx_prose, so angle-bracketed tokens (e.g. <first_path>) break
rendering; update the block that processes desc (the variables desc, desc_lines
and loop that appends to lines) to pass each paragraph through _escape_mdx_prose
(e.g. _escape_mdx_prose(normalize_text(para))) before extending lines so command
help text is properly escaped for MDX.
Duplicate comments:
In @docs/cli-options.md:
- Around line 1-4: The generated docs include a JSX-style comment block that
breaks mkdocs; update the generator to emit an mkdocs/Fern-compatible header by
changing SPDX_HEADER_MD in tools/_core.py to use a plain Markdown comment or
HTML comment (e.g., `<!-- ... -->`) or remove the JSX braces so the header is
valid Markdown; modify the SPDX_HEADER_MD constant and any code paths that
insert it (references in tools/generate_cli_docs.py) so the output for docs/*.md
no longer contains the JSX-style {/* ... */} wrapper.

In @docs/tutorials/fixed-schedule.md:
- Line 57: MDX-style inline comment "{/*
aiperf-run-vllm-default-openai-endpoint-server */}" in
docs/tutorials/fixed-schedule.md is triggering a known false positive for MD037;
update the markdownlint configuration the same way you did for
docs/tutorials/embeddings.md — add an exception to allow MDX comment tags like
"{/* ... */}" (or disable MD037 for docs/tutorials/*.md) so these MDX comment
tokens are ignored by the linter.
Nitpick comments:
In @.cursor/skills/docs-to-fern/SKILL_md:
- Around line 113-118: The JSON in SKILL_md hardcodes "version": "3.29.1", which
will become stale; replace the fixed version with guidance to either leave the
version out or use a placeholder and add a note telling users to run `npm show fern-api version` to find the latest Fern CLI version (or to substitute
"latest"/their desired version), and update the documentation text around the
JSON (the block containing "organization" and "version") to explain how and when
to refresh the version value.

In @docs/comprehensive-llm-benchmarking.md:
- Around line 1-4: The SPDX header in docs/comprehensive-llm-benchmarking.md
uses a stray leading "#" on the two SPDX lines; remove the "#" prefix from the
lines beginning with "# SPDX-FileCopyrightText:" and "#
SPDX-License-Identifier:" so they match the SPDX_HEADER_MD format used elsewhere
(i.e., change them to "SPDX-FileCopyrightText: ..." and
"SPDX-License-Identifier: ...").In
@docs/tutorials/embeddings.md:
- Line 37: Several doc files use MDX-style comment delimiters like {/*
aiperf-run-vllm-default-openai-endpoint-server */} which markdownlint mis-parses
as emphasis and triggers MD037; update the markdownlint config
(.markdownlint-cli2.yaml) to suppress MD037 for MDX-style files by adding MD037:
false globally or, preferably, add an overrides entry that disables MD037 for
the docs glob (e.g., docs/**/*.md or the MDX-specific glob) so the {/* … */}
comments no longer produce false positives while preserving MD037 for
pure-markdown files.
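For reference, the global form of the suppression described above is a two-line fragment of `.markdownlint-cli2.yaml` — shown as a sketch; the repo's actual config may already have a `config:` section to extend:

```yaml
config:
  # MDX comments like {/* ... */} trip MD037's emphasis heuristic.
  MD037: false
```

The per-glob variant (preserving MD037 for pure-markdown files) would instead live in a directory-scoped config under `docs/`.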
In @docs/tutorials/sglang-image-generation.md:
- Around line 240-242: Add the missing language identifier "json" to the fenced
code blocks that contain JSON prompts (e.g., the blocks containing {"text": "A
futuristic city with flying cars"} and {"text": "A cute robot playing with a
kitten"} and the earlier prompt at the first block), by changing the opening
backtick fences from ``` to ```json so the intent is explicit and the
markdownlint MD040 warning is suppressed.
5483f52 to 44df8df
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
docs/tutorials/arrival-patterns.md (2)
189-200: ⚠️ Potential issue | 🟡 Minor
Use `bash` fence for executable CLI commands. This block is a runnable command example but is fenced as `text`. Please switch it to `bash` for consistency and copy/paste ergonomics.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/arrival-patterns.md` around lines 189 - 200, Update the fenced code block for the runnable CLI example so the language tag is "bash" instead of "text" to improve copy/paste and syntax highlighting; locate the block containing the "aiperf profile \ --model your-model \ --url localhost:8000 \ --endpoint-type chat \ --streaming \ --request-rate 100 \ --arrival-pattern poisson \ --benchmark-duration 60 \ --output-dir results/poisson" command and replace the opening triple-backtick fence from ```text to ```bash.
14-20: 🛠️ Refactor suggestion | 🟠 Major
Convert conceptual ASCII diagrams to Mermaid.
These timeline visuals are ASCII diagrams in a Markdown doc. Please convert them to Mermaid blocks to align with repo docs standards, while keeping terminal-output ASCII tables as-is.
As per coding guidelines, "Use mermaid diagrams instead of ASCII art in markdown files."
Based on learnings, preserve ASCII box-drawing tables inside real command output blocks exactly as written.

Also applies to: 45-49, 65-69, 96-108, 121-127
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/arrival-patterns.md` around lines 14 - 20, Replace the three-block ASCII timeline (the fenced ```text block showing "Constant Pattern", "Poisson Pattern", "Gamma (bursty)" and their arrows) with a mermaid diagram: wrap a mermaid block (```mermaid) and create a left-to-right flowchart (e.g., "flowchart LR") with nodes named "Constant Pattern", "Poisson Pattern", "Gamma (bursty)" connected by arrows and include sublabels "Perfect spacing", "Natural variance", "Clustered bursts" as node descriptions or subnodes so the visual matches the original. Do the same conversion for the other listed ASCII timeline blocks (lines 45-49, 65-69, 96-108, 121-127), but do not change any ASCII box-drawing tables that are inside real command/output fences—leave those exact text blocks untouched.

docs/environment-variables.md (1)
1-23: ⚠️ Potential issue | 🟠 Major
Update the generator to output a JSX `<Warning>` component instead of markdown blockquotes. The file is auto-generated by `tools/generate_env_vars_docs.py`, but currently out of sync. The generator produces `> [!WARNING]` markdown blockquotes (lines 203-205 in `generate_markdown`), while the file contains `<Warning>` JSX components. Update `tools/generate_env_vars_docs.py` to replace the blockquote syntax with:

"<Warning>",
"Environment variable names, default values, and definitions are subject to change.",
"These settings may be modified, renamed, or removed in future releases.",
"</Warning>",

Then regenerate the file with `make generate-env-vars-docs` to sync all changes (the SPDX header format is already correct).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/environment-variables.md` around lines 1 - 23, The generated docs still use markdown blockquote warning syntax; update the generator in tools/generate_env_vars_docs.py (look for the generate_markdown function around the block that emits the warning lines) to emit the JSX <Warning> component lines instead of the "> [!WARNING]" blockquote—replace the three line blockquote output with the four strings: "<Warning>", the warning text line(s), and "</Warning>". After making that change, run make generate-env-vars-docs to re-generate docs so docs/environment-variables.md matches the new JSX warning format.

docs/tutorials/audio.md (1)
89-93: ⚠️ Potential issue | 🟠 Major
Remove user-specific home paths from sample outputs. These paths expose a developer-specific username/location and make the tutorial non-portable. Replace them with neutral placeholders (for example, `/path/to/aiperf/artifacts/...`).

Suggested doc-safe replacement

```diff
-/home/lkomali/aiperf/artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency4/profile_export_aiperf.csv
+/path/to/aiperf/artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency4/profile_export_aiperf.csv
...
-/home/lkomali/aiperf/artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency1/profile_export_aiperf.json
+/path/to/aiperf/artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency1/profile_export_aiperf.json
```

Also applies to: 155-158
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/audio.md` around lines 89 - 93, The sample output contains developer-specific absolute home paths; locate the occurrences of the filenames "profile_export_aiperf.csv", "profile_export_aiperf.json", and "aiperf.log" in docs/tutorials/audio.md and replace their parent paths with neutral placeholders (e.g. "/path/to/aiperf/artifacts/.../profile_export_aiperf.csv", "/path/to/aiperf/artifacts/.../profile_export_aiperf.json", "/path/to/aiperf/artifacts/.../logs/aiperf.log") ensuring you update all instances (including the other occurrences mentioned) so no user-specific home directories remain.

docs/cli-options.md (1)
1-1077: ⚠️ Potential issue | 🟠 Major
This generated file is out of sync with the generator output. CI reports `pre-commit` hook `generate-cli-docs` modified files. Please regenerate and commit updated artifacts before merge.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/cli-options.md` around lines 1 - 1077, The docs/cli-options.md is stale and was modified by the pre-commit `generate-cli-docs` hook; regenerate the CLI docs using the same generator invoked by the pre-commit hook (run the `generate-cli-docs` step or the project script/Makefile target that produces cli docs), replace the checked-in docs/cli-options.md with the newly generated output, and commit the updated file so the pre-commit check passes; ensure you run the pre-commit hooks (or `pre-commit run generate-cli-docs --all-files`) locally before pushing.
♻️ Duplicate comments (7)
docs/tutorials/fixed-schedule.md (1)
57-57: ⚠️ Potential issue | 🟠 Major
Same MDX syntax issue affects benchmark run markers. The benchmark run markers use JSX-style comments (e.g., `{/* aiperf-run-vllm-default-openai-endpoint-server */}`) which will also render as literal text on GitHub. For consistency with the recommendation in the license header comment, these should also use HTML comment syntax if GitHub visibility is required.

Note: The markdownlint MD037 warnings (spaces inside emphasis markers) are false positives caused by the linter misinterpreting JSX comment syntax. If converted to HTML comments, these warnings will resolve.
Also applies to: 84-84, 121-121, 136-136
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/fixed-schedule.md` at line 57, Replace the JSX-style benchmark run markers like "{/* aiperf-run-vllm-default-openai-endpoint-server */}" with HTML comments "<!-- aiperf-run-vllm-default-openai-endpoint-server -->" so they don’t render as literal text on GitHub and so the MDX/markdown linter no longer misinterprets them; find each benchmark marker string (e.g., "aiperf-run-vllm-default-openai-endpoint-server" and the other similar markers present in the file) and convert the surrounding "{/* ... */}" to the HTML "<!-- ... -->" form.

docs/cli-options.md (1)
545-545: ⚠️ Potential issue | 🟡 Minor
Fix the missing space in the generated description (`one.Mutually`). This typo is still present; update the source docstring and regenerate this file.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/cli-options.md` at line 545, Typo: the generated description for the --num-prefix-prompts option contains "one.Mutually" (missing space). Edit the original docstring/option help text where --num-prefix-prompts is defined (search for the string "one.Mutually" or the help text for --num-prefix-prompts) and add the missing space so it reads "one. Mutually exclusive..."; then regenerate the docs/cli-options.md using the project's docs generation task/script to update the generated file so the fix appears in docs.

docs/diagrams/mixins.md (1)
9-31: ⚠️ Potential issue | 🔴 Critical
Fix Mermaid edge syntax (`*/}` → `-->`) to restore diagram rendering. All connectors in this segment use an invalid operator, so links won’t render.

Proposed fix

```diff
- A["BaseMixin<br/><em>Ensures proper inheritance chain</em>"] */} B["AIPerfLoggerMixin<br/><em>Lazy-evaluated logging with f-strings</em>"]
+ A["BaseMixin<br/><em>Ensures proper inheritance chain</em>"] --> B["AIPerfLoggerMixin<br/><em>Lazy-evaluated logging with f-strings</em>"]
```

Apply the same replacement for every `*/}` edge in this diagram.

```bash
#!/bin/bash
set -euo pipefail
rg -n '\*/}' docs/diagrams/mixins.md || true
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/diagrams/mixins.md` around lines 9 - 31, The Mermaid diagram uses invalid edge syntax `*/}`; replace every occurrence of `*/}` with the correct Mermaid connector `-->` for all edges in this block (e.g., the edges connecting BaseMixin → AIPerfLoggerMixin, AIPerfLoggerMixin → HooksMixin/TaskManagerMixin, HooksMixin → AIPerfLifecycleMixin, TaskManagerMixin → AIPerfLifecycleMixin, AIPerfLifecycleMixin → MessageBusClientMixin, MessageBusClientMixin → BaseService, BaseService → BaseComponentService/SystemController, and BaseComponentService → DatasetManager/TimingManager/RecordsManager/RecordProcessor/WorkerManager/Worker) so the diagram renders properly.

.cursor/skills/docs-to-fern/SKILL_md (2)
344-353: ⚠️ Potential issue | 🟡 Minor
SPDX conversion guidance is inconsistent with the repo’s current JSX header style.
This script converts SPDX HTML comments to YAML frontmatter; current docs in this PR are standardized on JSX comment headers. Align the skill instructions to one canonical format.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.cursor/skills/docs-to-fern/SKILL_md around lines 344 - 353, The SPDX conversion currently turns SPDX HTML comment blocks into YAML frontmatter using spdx_pattern/match/spdx_content/spdx_lines, but the repo uses JSX comment headers; update the logic to produce the canonical JSX header instead of YAML frontmatter: when spdx_pattern matches, build a JSX comment block (prefixed lines wrapped as {/* ... */}) using spdx_content/spdx_lines and insert it in place of the match, and ensure the final fallback re.sub(r'<!--(.*?)-->', r'{/* \1 */}', ...) remains or is adjusted to avoid double-wrapping already-converted SPDX blocks.
262-280: ⚠️ Potential issue | 🟠 Major
Update this skill to the config-only Fern model (no `fern/pages` content duplication). These sections still prescribe bulk-copying `docs/` into `fern/pages/` and navigation rooted in `../pages/...`, which conflicts with the agreed single-source docs approach.

Also applies to: 463-565, 666-667
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.cursor/skills/docs-to-fern/SKILL_md around lines 262 - 280, The instructions currently tell users to bulk-copy docs into fern/pages and use ../pages navigation (the find/while loop that builds target="fern/pages/$(...)" and the mkdir/cp steps); replace this with config-only Fern guidance: remove the find/while copy script and any mention of creating or using fern/pages, and instead show how to point the Fern configuration (site/navigation) at the existing docs/ source (and remove ../pages-* links), updating references where navigation items or links use ../pages/... to reference the original docs paths; ensure all mentions of target="fern/pages/..." and the copy/mkdir/cp steps are deleted and replaced with a brief note describing using Fern’s config to map docs as the single source.

docs/tutorials/sglang-image-generation.md (1)
247-247: ⚠️ Potential issue | 🟡 Minor
Add descriptive alt text to the three generated-image references. Line [247], Line [253], and Line [259] still use empty alt text (`![](...)`), which fails accessibility/lint checks.

Also applies to: 253-253, 259-259
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/sglang-image-generation.md` at line 247, Three image markdown references lack alt text; replace the empty brackets for the three images (../media/extracted-images/image-0001-00-00.jpg, image-0001-00-01.jpg, image-0001-00-02.jpg) with short descriptive alt text describing each generated image (e.g. "generated solar system scene" or whatever matches content) so the markdown becomes `![descriptive alt](path)` for each occurrence to satisfy accessibility/lint rules.

docs/server-metrics/server-metrics-reference.md (1)
1-4: ⚠️ Potential issue | 🟠 Major
SPDX wrapper issue is still present here. Line [1]-Line [4] still uses `{/* ... */}` in a `.md` file; this is the same rendering-portability issue previously flagged.
Verify each finding against the current code and only fix it if needed. In `@docs/server-metrics/server-metrics-reference.md` around lines 1 - 4, Replace the invalid JSX-style comment wrapper `{/* ... */}` at the top of the Markdown file with a proper Markdown-safe comment: remove the `{/*` and `*/}` tokens around the SPDX header and put the SPDX lines inside an HTML comment so they won't render (e.g., use an HTML comment containing the SPDX-FileCopyrightText and SPDX-License-Identifier lines); locate the existing block by searching for the `{/*` token and update the SPDX block accordingly (do not leave the JSX-style wrapper or expose raw license lines in rendered output).
🧹 Nitpick comments (3)
docs/tutorials/request-rate-concurrency.md (1)
38-42: Add blank line after table for consistency. The table should be followed by a blank line before the closing `</Note>` tag to comply with MD058 and maintain consistent spacing.

📝 Proposed formatting fix

```diff
 | 100 | 20 req/s | ~5.0 seconds |
+
 </Note>
```
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/request-rate-concurrency.md` around lines 38 - 42, Add a single blank line after the Markdown table (the block starting with "| Concurrency | Request Rate | Ramp-up Time |" and ending with "| 100 | 20 req/s | ~5.0 seconds |") so there is an empty line before the closing </Note> tag to satisfy MD058 and maintain consistent spacing.

.cursor/skills/docs-to-fern/SKILL_md (1)
885-889: Include `docs/**` in the publish trigger paths for config-only docs. If docs content remains in `docs/`, triggering only on `fern/**` will miss content-only documentation updates.

Suggested workflow snippet update

```diff
 on:
   push:
     branches: [main]
-    paths: ['fern/**']
+    paths: ['fern/**', 'docs/**']
```
Verify each finding against the current code and only fix it if needed. In @.cursor/skills/docs-to-fern/SKILL_md around lines 885 - 889, The GitHub Actions workflow trigger only watches fern/** and will miss changes under docs/**; update the workflow's push.paths configuration (the on.push.paths entry that currently reads paths: ['fern/**']) to also include docs/** (e.g., add 'docs/**' to the array) so content-only documentation updates under docs/ will trigger the workflow.

docs/tutorial.md (1)
12-12: Optional: remove spaces inside MDX tag comments to clear MD037 warnings. Line [12], Line [21], Line [23], Line [27], Line [30], and Line [42] use `{/* ... */}` with surrounding spaces; compact form avoids markdownlint noise.

✂️ Suggested cleanup

```diff
-{/* setup-vllm-default-openai-endpoint-server */}
+{/*setup-vllm-default-openai-endpoint-server*/}
...
-{/* /setup-vllm-default-openai-endpoint-server */}
+{/*/setup-vllm-default-openai-endpoint-server*/}
...
-{/* health-check-vllm-default-openai-endpoint-server */}
+{/*health-check-vllm-default-openai-endpoint-server*/}
...
-{/* /health-check-vllm-default-openai-endpoint-server */}
+{/*/health-check-vllm-default-openai-endpoint-server*/}
...
-{/* aiperf-run-vllm-default-openai-endpoint-server */}
+{/*aiperf-run-vllm-default-openai-endpoint-server*/}
...
-{/* /aiperf-run-vllm-default-openai-endpoint-server */}
+{/*/aiperf-run-vllm-default-openai-endpoint-server*/}
```

Also applies to: 21-21, 23-23, 27-27, 30-30, 42-42
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorial.md` at line 12, Replace the MDX comment tags that contain extra spaces so they use a compact form; locate occurrences of the comment pattern `{/* setup-vllm-default-openai-endpoint-server */}` (and the other similar tags on the page) and remove the spaces after the opening brace and before the closing brace so they become `{/*setup-vllm-default-openai-endpoint-server*/}`, repeating the same fix for the other MDX comments mentioned.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/tutorials/custom-prompt-benchmarking.md`:
- Around line 41-43: The Markdown heading "# Create an input file with specific
text inputs" sits outside the fenced code block after the JSX marker {/*
aiperf-run-vllm-default-openai-endpoint-server */} and will render visibly;
either move that line inside the following triple-backtick code fence so it's
part of the code block or replace it with a JSX comment (e.g., wrap it with {/*
... */}) so it does not render; update the snippet containing the JSX marker and
the code fence accordingly to keep comments non-rendering.
In `@docs/tutorials/openai-text-endpoints.md`:
- Around line 1-4: Replace all JSX-style block comments "{/* ... */}" with
standard HTML comments "<!-- ... -->" throughout the document so the Markdown is
valid; search for the "{/*" and "*/}" tokens (they appear at the top and at
other comment spots) and convert each to the corresponding "<!--" and "-->"
form, preserving the original comment text and spacing.
---
Outside diff comments:
In `@docs/cli-options.md`:
- Around line 1-1077: The docs/cli-options.md is stale and was modified by the
pre-commit `generate-cli-docs` hook; regenerate the CLI docs using the same
generator invoked by the pre-commit hook (run the `generate-cli-docs` step or
the project script/Makefile target that produces cli docs), replace the
checked-in docs/cli-options.md with the newly generated output, and commit the
updated file so the pre-commit check passes; ensure you run the pre-commit hooks
(or `pre-commit run generate-cli-docs --all-files`) locally before pushing.
In `@docs/environment-variables.md`:
- Around line 1-23: The generated docs still use markdown blockquote warning
syntax; update the generator in tools/generate_env_vars_docs.py (look for the
generate_markdown function around the block that emits the warning lines) to
emit the JSX <Warning> component lines instead of the "> [!WARNING]"
blockquote; replace the three-line blockquote output with the four strings:
"<Warning>", the warning text line(s), and "</Warning>". After making that
change, run make generate-env-vars-docs to re-generate docs so
docs/environment-variables.md matches the new JSX warning format.
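As a rough sketch of that change (hypothetical: the real `generate_markdown` in `tools/generate_env_vars_docs.py` is certainly structured differently), the warning emitter would produce the JSX admonition lines instead of a blockquote:

```python
# Hypothetical sketch only; names and structure are illustrative,
# not copied from tools/generate_env_vars_docs.py.
def warning_block(text: str) -> list[str]:
    """Emit a Fern <Warning> admonition instead of a '> [!WARNING]' blockquote."""
    return ["<Warning>", text, "</Warning>", ""]

def generate_markdown(warning_text: str) -> str:
    lines = ["# Environment Variables", ""]
    lines.extend(warning_block(warning_text))  # JSX admonition, not '> [!WARNING]'
    return "\n".join(lines)

print(generate_markdown("Environment variables override config file values."))
```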
In `@docs/tutorials/arrival-patterns.md`:
- Around line 189-200: Update the fenced code block for the runnable CLI example
so the language tag is "bash" instead of "text" to improve copy/paste and syntax
highlighting; locate the block containing the "aiperf profile \ --model
your-model \ --url localhost:8000 \ --endpoint-type chat \ --streaming \
--request-rate 100 \ --arrival-pattern poisson \ --benchmark-duration 60 \
--output-dir results/poisson" command and replace the opening triple-backtick
fence from ```text to ```bash.
- Around line 14-20: Replace the three-block ASCII timeline (the fenced ```text
block showing "Constant Pattern", "Poisson Pattern", "Gamma (bursty)" and their
arrows) with a mermaid diagram: wrap a mermaid block (```mermaid) and create a
left-to-right flowchart (e.g., "flowchart LR") with nodes named "Constant
Pattern", "Poisson Pattern", "Gamma (bursty)" connected by arrows and include
sublabels "Perfect spacing", "Natural variance", "Clustered bursts" as node
descriptions or subnodes so the visual matches the original. Do the same
conversion for the other listed ASCII timeline blocks (lines 45-49, 65-69,
96-108, 121-127), but do not change any ASCII box-drawing tables that are inside
real command/output fences—leave those exact text blocks untouched.
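Following that prompt, the first timeline might be sketched as the mermaid block below (node text comes from the original labels; `flowchart LR` and the `<br/>` sublabels are assumptions about how the author wants it rendered):

```mermaid
flowchart LR
    A["Constant Pattern<br/>Perfect spacing"] --> B["Poisson Pattern<br/>Natural variance"]
    B --> C["Gamma (bursty)<br/>Clustered bursts"]
```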
In `@docs/tutorials/audio.md`:
- Around line 89-93: The sample output contains developer-specific absolute home
paths; locate the occurrences of the filenames "profile_export_aiperf.csv",
"profile_export_aiperf.json", and "aiperf.log" in docs/tutorials/audio.md and
replace their parent paths with neutral placeholders (e.g.
"/path/to/aiperf/artifacts/.../profile_export_aiperf.csv",
"/path/to/aiperf/artifacts/.../profile_export_aiperf.json",
"/path/to/aiperf/artifacts/.../logs/aiperf.log") ensuring you update all
instances (including the other occurrences mentioned) so no user-specific home
directories remain.
---
Duplicate comments:
In @.cursor/skills/docs-to-fern/SKILL_md:
- Around line 344-353: The SPDX conversion currently turns SPDX HTML comment
blocks into YAML frontmatter using spdx_pattern/match/spdx_content/spdx_lines,
but the repo uses JSX comment headers; update the logic to produce the canonical
JSX header instead of YAML frontmatter: when spdx_pattern matches, build a JSX
comment block (prefixed lines wrapped as {/* ... */}) using
spdx_content/spdx_lines and insert it in place of the match, and ensure the
final fallback re.sub(r'<!--(.*?)-->', r'{/* \1 */}', ...) remains or is
adjusted to avoid double-wrapping already-converted SPDX blocks.
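A minimal sketch of that conversion logic (helper names like `to_jsx_header` are hypothetical; the skill document describes the behavior, not this exact code):

```python
import re

# Matches SPDX HTML comment blocks such as
# "<!-- SPDX-License-Identifier: Apache-2.0 -->".
SPDX_PATTERN = re.compile(r"<!--\s*(SPDX[^>]*?)-->", re.DOTALL)

def to_jsx_header(markdown: str) -> str:
    # First convert SPDX comment blocks into the canonical JSX header.
    markdown = SPDX_PATTERN.sub(
        lambda m: "{/* " + m.group(1).strip() + " */}", markdown
    )
    # Fallback for remaining HTML comments; already-converted JSX blocks
    # no longer contain '<!--', so they cannot be double-wrapped.
    return re.sub(r"<!--(.*?)-->", r"{/* \1 */}", markdown, flags=re.DOTALL)

print(to_jsx_header("<!-- SPDX-License-Identifier: Apache-2.0 -->\n# Title"))
```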
- Around line 262-280: The instructions currently tell users to bulk-copy docs
into fern/pages and use ../pages navigation (the find/while loop that builds
target="fern/pages/$(...)" and the mkdir/cp steps); replace this with
config-only Fern guidance: remove the find/while copy script and any mention of
creating or using fern/pages, and instead show how to point the Fern
configuration (site/navigation) at the existing docs/ source (and remove
../pages-* links), updating references where navigation items or links use
../pages/... to reference the original docs paths; ensure all mentions of
target="fern/pages/..." and the copy/mkdir/cp steps are deleted and replaced
with a brief note describing using Fern’s config to map docs as the single
source.
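For illustration, a config-only navigation entry might look like the fragment below; the exact keys should be checked against Fern's `docs.yml` schema, and the relative paths here are assumptions:

```yaml
# Hypothetical fern/docs.yml fragment: navigation points at the existing
# docs/ tree instead of copies under fern/pages/.
navigation:
  - section: Tutorials
    contents:
      - page: Quick Start
        path: ../docs/tutorial.md
      - page: Arrival Patterns
        path: ../docs/tutorials/arrival-patterns.md
```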
In `@docs/cli-options.md`:
- Line 545: Typo: the generated description for the --num-prefix-prompts option
contains "one.Mutually" (missing space). Edit the original docstring/option help
text where --num-prefix-prompts is defined (search for the string "one.Mutually"
or the help text for --num-prefix-prompts) and add the missing space so it reads
"one. Mutually exclusive..."; then regenerate the docs/cli-options.md using the
project's docs generation task/script to update the generated file so the fix
appears in docs.
In `@docs/diagrams/mixins.md`:
- Around line 9-31: The Mermaid diagram uses invalid edge syntax `*/}`; replace
every occurrence of `*/}` with the correct Mermaid connector `-->` for all edges
in this block (e.g., the edges connecting BaseMixin → AIPerfLoggerMixin,
AIPerfLoggerMixin → HooksMixin/TaskManagerMixin, HooksMixin →
AIPerfLifecycleMixin, TaskManagerMixin → AIPerfLifecycleMixin,
AIPerfLifecycleMixin → MessageBusClientMixin, MessageBusClientMixin →
BaseService, BaseService → BaseComponentService/SystemController, and
BaseComponentService →
DatasetManager/TimingManager/RecordsManager/RecordProcessor/WorkerManager/Worker)
so the diagram renders properly.
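With that fix applied, the top of the diagram would read roughly as follows (abbreviated; node names are from the comment above, and `flowchart TD` is an assumption about the original diagram type):

```mermaid
flowchart TD
    BaseMixin --> AIPerfLoggerMixin
    AIPerfLoggerMixin --> HooksMixin
    AIPerfLoggerMixin --> TaskManagerMixin
    HooksMixin --> AIPerfLifecycleMixin
    TaskManagerMixin --> AIPerfLifecycleMixin
    AIPerfLifecycleMixin --> MessageBusClientMixin
    MessageBusClientMixin --> BaseService
    BaseService --> BaseComponentService
    BaseService --> SystemController
```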
In `@docs/server-metrics/server-metrics-reference.md`:
- Around line 1-4: Replace the invalid JSX-style comment wrapper `{/* ... */}`
at the top of the Markdown file with a proper Markdown-safe comment: remove the
`{/*` and `*/}` tokens around the SPDX header and put the SPDX lines inside an
HTML comment so they won't render (e.g., use an HTML comment containing the
SPDX-FileCopyrightText and SPDX-License-Identifier lines); locate the existing
block by searching for the `{/*` token and update the SPDX block accordingly (do
not leave the JSX-style wrapper or expose raw license lines in rendered output).
In `@docs/tutorials/fixed-schedule.md`:
- Line 57: Replace the JSX-style benchmark run markers like "{/*
aiperf-run-vllm-default-openai-endpoint-server */}" with HTML comments "<!--
aiperf-run-vllm-default-openai-endpoint-server -->" so they don’t render as
literal text on GitHub and so the MDX/markdown linter no longer misinterprets
them; find each benchmark marker string (e.g.,
"aiperf-run-vllm-default-openai-endpoint-server" and the other similar markers
present in the file) and convert the surrounding "{/* ... */}" to the HTML "<!--
... -->" form.
In `@docs/tutorials/sglang-image-generation.md`:
- Line 247: Three image markdown references lack alt text; replace the empty
brackets for the three images (../media/extracted-images/image-0001-00-00.jpg,
image-0001-00-01.jpg, image-0001-00-02.jpg) with short descriptive alt text
describing each generated image (e.g. "generated solar system scene" or whatever
matches content) so the markdown becomes `![<alt text>](<path>)` for each occurrence to
satisfy accessibility/lint rules.
---
Nitpick comments:
In @.cursor/skills/docs-to-fern/SKILL_md:
- Around line 885-889: The GitHub Actions workflow trigger only watches fern/**
and will miss changes under docs/**; update the workflow's push.paths
configuration (the on.push.paths entry that currently reads paths: ['fern/**'])
to also include docs/** (e.g., add 'docs/**' to the array) so content-only
documentation updates under docs/ will trigger the workflow.
In `@docs/tutorial.md`:
- Line 12: Replace the MDX comment tags that contain extra spaces so they use a
compact form; locate occurrences of the comment pattern `{/* setup-vllm-default-openai-endpoint-server */}` (and the other similar tags on the
page) and remove the spaces after the opening brace and before the closing brace
so they become `{/*setup-vllm-default-openai-endpoint-server*/}`, repeating the
same fix for the other MDX comments mentioned.
In `@docs/tutorials/request-rate-concurrency.md`:
- Around line 38-42: Add a single blank line after the Markdown table (the block
starting with "| Concurrency | Request Rate | Ramp-up Time |" and ending with "|
100 | 20 req/s | ~5.0 seconds |") so there is an empty line before
the closing </Note> tag to satisfy MD058 and maintain consistent spacing.
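A sketch of the fixed span (abbreviated to the table rows quoted in the comment above):

```markdown
| Concurrency | Request Rate | Ramp-up Time |
|-------------|--------------|--------------|
| 100         | 20 req/s     | ~5.0 seconds |

</Note>
```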
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📥 Commits
Reviewing files that changed from the base of the PR and between c754b47e75d79ab6f0d6f47095fc2569efaa35e5 and 44df8dfd670f342fab007a409f0c940792727435.
⛔ Files ignored due to path filters (24)
- `docs/diagrams/plot-examples/multi-run/config-experiment-classification/pareto-curve-throughput-per-gpu-vs-interactivity.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/multi-run/config-experiment-classification/ttft-vs-throughput.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/multi-run/pareto-curve-throughput-per-gpu-vs-interactivity.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/multi-run/pareto-curve-throughput-per-gpu-vs-latency.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/multi-run/theme-dark-mode/pareto-curve-throughput-per-gpu-vs-interactivity.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/multi-run/theme-dark-mode/pareto-curve-throughput-per-gpu-vs-latency.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/multi-run/theme-dark-mode/ttft-vs-throughput.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/multi-run/ttft-vs-throughput.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/dispersed-throughput-over-time.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/gpu-utilization-and-throughput-over-time.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/itl-over-time.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/latency-over-time.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/theme-dark-mode/gpu-utilization-and-throughput-over-time.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/theme-dark-mode/itl-over-time.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/theme-dark-mode/timeslices-itl.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/ttft-over-time.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/ttft-timeline.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/timeslices/timeslices-itl.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/timeslices/timeslices-latency.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/timeslices/timeslices-throughput-warning.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/timeslices/timeslices-ttft.png` is excluded by `!**/*.png`
- `docs/media/extracted-images/image-0001-00-00.jpg` is excluded by `!**/*.jpg`
- `docs/media/extracted-images/image-0002-00-00.jpg` is excluded by `!**/*.jpg`
- `docs/media/extracted-images/image-0003-00-00.jpg` is excluded by `!**/*.jpg`
📒 Files selected for processing (68)
- `.cursor/skills/docs-to-fern/SKILL_md`
- `docs/api/synthesis.md`
- `docs/architecture.md`
- `docs/benchmark-datasets.md`
- `docs/benchmark-modes/timing-modes-reference.md`
- `docs/benchmark-modes/trace-replay.md`
- `docs/cli-options.md`
- `docs/comprehensive-llm-benchmarking.md`
- `docs/dev/patterns.md`
- `docs/diagrams/metrics-flow.md`
- `docs/diagrams/mixins.md`
- `docs/environment-variables.md`
- `docs/genai-perf-feature-comparison.md`
- `docs/index.md`
- `docs/metrics-reference.md`
- `docs/migrating.md`
- `docs/plugins/creating-your-first-plugin.md`
- `docs/plugins/plugin-system.md`
- `docs/reference/tokenizer-auto-detection.md`
- `docs/reproducibility.md`
- `docs/server-metrics/server-metrics-json-schema.md`
- `docs/server-metrics/server-metrics-parquet-schema.md`
- `docs/server-metrics/server-metrics-reference.md`
- `docs/server-metrics/server-metrics.md`
- `docs/tutorial.md`
- `docs/tutorials/arrival-patterns.md`
- `docs/tutorials/audio.md`
- `docs/tutorials/custom-dataset.md`
- `docs/tutorials/custom-prompt-benchmarking.md`
- `docs/tutorials/embeddings.md`
- `docs/tutorials/fixed-schedule.md`
- `docs/tutorials/goodput.md`
- `docs/tutorials/gpu-telemetry.md`
- `docs/tutorials/http-trace-metrics.md`
- `docs/tutorials/huggingface-tgi.md`
- `docs/tutorials/local-tokenizer.md`
- `docs/tutorials/multi-run-confidence.md`
- `docs/tutorials/multi-turn.md`
- `docs/tutorials/multi-url-load-balancing.md`
- `docs/tutorials/openai-text-endpoints.md`
- `docs/tutorials/plot.md`
- `docs/tutorials/prefill-concurrency.md`
- `docs/tutorials/prefix-synthesis.md`
- `docs/tutorials/ramping.md`
- `docs/tutorials/rankings.md`
- `docs/tutorials/request-cancellation.md`
- `docs/tutorials/request-rate-concurrency.md`
- `docs/tutorials/sequence-distributions.md`
- `docs/tutorials/sglang-image-generation.md`
- `docs/tutorials/sglang-video-generation.md`
- `docs/tutorials/sharegpt.md`
- `docs/tutorials/synthetic-video.md`
- `docs/tutorials/template-endpoint.md`
- `docs/tutorials/time-based-benchmarking.md`
- `docs/tutorials/timeslices.md`
- `docs/tutorials/ui-types.md`
- `docs/tutorials/user-centric-timing.md`
- `docs/tutorials/vision.md`
- `docs/tutorials/warmup.md`
- `docs/tutorials/working-with-profile-exports.md`
- `fern/docs.yml`
- `fern/fern.config.json`
- `fern/versions/next.yml`
- `mkdocs.yml`
- `tests/ci/test_docs_end_to_end/parser.py`
- `tools/_core.py`
- `tools/generate_cli_docs.py`
- `tools/generate_env_vars_docs.py`
✅ Files skipped from review due to trivial changes (1)
- docs/dev/patterns.md
🚧 Files skipped from review as they are similar to previous changes (29)
- docs/tutorials/multi-run-confidence.md
- docs/genai-perf-feature-comparison.md
- docs/tutorials/user-centric-timing.md
- docs/reference/tokenizer-auto-detection.md
- docs/benchmark-datasets.md
- docs/plugins/plugin-system.md
- docs/tutorials/huggingface-tgi.md
- docs/tutorials/synthetic-video.md
- fern/fern.config.json
- docs/tutorials/ui-types.md
- docs/architecture.md
- docs/tutorials/gpu-telemetry.md
- fern/docs.yml
- docs/metrics-reference.md
- docs/tutorials/template-endpoint.md
- fern/versions/next.yml
- docs/tutorials/rankings.md
- docs/tutorials/warmup.md
- docs/tutorials/time-based-benchmarking.md
- docs/tutorials/ramping.md
- docs/benchmark-modes/timing-modes-reference.md
- docs/diagrams/metrics-flow.md
- docs/tutorials/working-with-profile-exports.md
- docs/index.md
- docs/reproducibility.md
- docs/tutorials/multi-url-load-balancing.md
- docs/api/synthesis.md
- docs/server-metrics/server-metrics-json-schema.md
- docs/tutorials/prefix-synthesis.md
- 4f0b5f8 to dd3bc7f (Compare)
- 915d0dd to 5a4ca96 (Compare)
- 5a4ca96 to 649cd79 (Compare)
Summary by CodeRabbit
New Features
Documentation