docs: Add initial fern docs (#676)
**Try out this PR.** Quick install:

```bash
pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@78262cfa6c18c7f7a297c1516b53b994c9944981
```

Recommended with a virtual environment (using uv):

```bash
uv venv --python 3.12 && source .venv/bin/activate
uv pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@78262cfa6c18c7f7a297c1516b53b994c9944981
```
> **Note: Reviews paused.** This branch appears to be under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings.
**Walkthrough**

Introduces a Fern-based documentation framework via configuration files and a comprehensive migration skill document, while systematically standardizing documentation formatting across the codebase through comment header conversions, hyphenated filename conventions, admonition block updates, and corresponding link/path adjustments.
**Estimated code review effort:** 🎯 3 (Moderate) | ⏱️ ~30 minutes
**Pre-merge checks:** ✅ 3 passed

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 5
> **Note:** Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
🤖 Fix all issues with AI agents
In `@fern/pages/reproducibility.md`:
- Around line 76-80: The relative links in fern/pages/reproducibility.md (the
references to ../tests/integration/test_random_generator_canary.py,
../tests/integration/test_deterministic_behavior.py, and
../src/aiperf/common/random_generator.py) will 404 in the published docs; update
each to use the absolute GitHub URL for the target file in the correct
repository/branch (per the Fern migration guide) so they point to the canonical
locations (e.g., the absolute URLs for test_random_generator_canary.py,
test_deterministic_behavior.py, and aiperf/common/random_generator.py) instead
of the relative ../ paths.
In `@fern/pages/server-metrics/server-metrics-reference.md`:
- Around line 545-547: Update the two broken Markdown links in the paragraph
that currently point to server_metrics_json_schema.md and
server_metrics_parquet_schema.md: replace those underscored filenames with the
correct hyphenated filenames server-metrics-json-schema.md and
server-metrics-parquet-schema.md so the links resolve to the existing files
(look for the link text referencing "JSON Schema Reference" and "Parquet Schema
Reference" in this file).
In `@fern/pages/tutorials/custom-dataset.md`:
- Around line 122-131: Replace the hardcoded developer home path shown in the
sample output blocks in fern/pages/tutorials/custom-dataset.md (the CLI Command
/ Benchmark Duration / CSV Export / JSON Export / Log File example lines that
currently contain "/home/lkomali/aiperf/...") with a generic placeholder such as
"/home/user/aiperf/..." or use relative paths like "artifacts/..." so the sample
output no longer exposes a username; update all three identical output blocks
(the CLI/sample outputs around the CSV Export, JSON Export and Log File entries)
to use the chosen placeholder.
In `@fern/pages/tutorials/multi-url-load-balancing.md`:
- Around line 15-78: The markdown has unclosed and mis-placed code fences
causing rendering issues; close the first ```bash fence immediately after the
first command block (the "Round-robin across two servers" aiperf command), then
wrap its sample output in a separate ```text block and close it; open a new
```bash for the "Multi-GPU scaling on a single node" aiperf command and close it
after that command, then wrap its sample output in a ```text block and close it;
remove the stray trailing ``` so there are exactly paired fences for each
command and each sample output in the multi-url-load-balancing.md content.
In `@fern/pages/tutorials/prefix-synthesis.md`:
- Around line 232-248: Scenario 1 ("Simulate High Cache Hit Rate") incorrectly
uses --synthesis-prefix-root-multiplier 5 which, per the documentation for
--synthesis-prefix-root-multiplier, splits traces across multiple trees and
reduces cache hits; fix by either changing the scenario title/description to
reflect that multiplier=5 simulates a lower cache hit rate, or change the flag
value in Scenario 1 to --synthesis-prefix-root-multiplier 1 to actually simulate
high cache hit rate, and ensure the scenario text mentions the chosen behavior
to match the doc for --synthesis-prefix-root-multiplier.
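The behavior the prompt above relies on (splitting traces across more prefix-tree roots reduces the number of request pairs that can share a cached prefix) can be illustrated with a toy counting model; this is not AIPerf's implementation, only the combinatorial intuition:

```python
from collections import Counter

def shared_prefix_pairs(num_traces: int, root_multiplier: int) -> int:
    """Toy model: traces are assigned round-robin to `root_multiplier`
    independent prefix trees; only same-tree pairs can share a cached prefix."""
    tree_sizes = Counter(i % root_multiplier for i in range(num_traces))
    return sum(n * (n - 1) // 2 for n in tree_sizes.values())

print(shared_prefix_pairs(100, 1))  # 4950 -> every pair can share a prefix
print(shared_prefix_pairs(100, 5))  # 950  -> far fewer potential cache hits
```

This is why multiplier 5 belongs in a low-cache-hit scenario, and multiplier 1 in a high-cache-hit one.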
🟡 Minor comments (26)
fern/pages/tutorials/rankings.md-53-71 (1)
`53-71`: ⚠️ Potential issue | 🟡 Minor

**Add a language tag to the fenced block.**

Markdownlint flags the sample output block as missing a language. Use `text` (or `bash` if you want formatting) to satisfy MD040.

✅ Suggested fix:
````diff
-```
+```text
 INFO Starting AIPerf System
 INFO AIPerf System is PROFILING
 Profiling: 10/10 |████████████████████████| 100% [00:02<00:00]
 INFO Benchmark completed successfully
 INFO Results saved to: artifacts/BAAI_bge-reranker-base-rankings/
 NVIDIA AIPerf | LLM Metrics
 ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┓
 ┃ Metric                     ┃  avg  ┃  min  ┃  max  ┃  p99  ┃  p50  ┃
 ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━┩
 │ Request Latency (ms)       │ 52.34 │ 45.12 │ 68.45 │ 65.23 │ 51.89 │
 │ Request Throughput (req/s) │  5.12 │   -   │   -   │   -   │   -   │
 └────────────────────────────┴───────┴───────┴───────┴───────┴───────┘
 JSON Export: artifacts/BAAI_bge-reranker-base-rankings/profile_export_aiperf.json
````

fern/pages/tutorials/user-centric-timing.md-237-239 (1)

`237-239`: ⚠️ Potential issue | 🟡 Minor

**Hyphenate compound modifier in heading.**

Use "High-Throughput" as a compound adjective.

✏️ Suggested fix:

```diff
-### High Throughput Cache Test
+### High-Throughput Cache Test
```

fern/pages/tutorials/template-endpoint.md-75-75 (1)
`75-75`: ⚠️ Potential issue | 🟡 Minor

**Clarify JMESPath array indexing in examples.**

The documentation shows two different JMESPath patterns for extracting embeddings:

- Line 75: `data[0].embedding` (extracts the first element's embedding)
- Line 117: `data[].embedding` (extracts all embeddings)

These serve different purposes but the distinction isn't explained. Consider clarifying when to use indexed access (`[0]`) versus array projection (`[]`) to help users understand which pattern fits their needs.

Also applies to: 117-117
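For readers unfamiliar with JMESPath, here is a plain-Python sketch of what each expression evaluates to (the response layout is an assumed example, not taken from the tutorial):

```python
# Plain-Python equivalents of the two JMESPath expressions, to show the
# difference in what each selects.
response = {
    "data": [
        {"index": 0, "embedding": [0.1, 0.2]},
        {"index": 1, "embedding": [0.3, 0.4]},
    ]
}

# data[0].embedding: indexed access, returns only the first element's embedding
first = response["data"][0]["embedding"]

# data[].embedding: array projection, returns every element's embedding
all_embeddings = [item["embedding"] for item in response["data"]]

print(first)           # [0.1, 0.2]
print(all_embeddings)  # [[0.1, 0.2], [0.3, 0.4]]
```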
fern/pages/tutorials/vision.md-55-55 (1)
`55-55`: ⚠️ Potential issue | 🟡 Minor

**Specify fenced code block languages for sample outputs.**

Markdownlint flags these blocks as missing a language. Use `text` or `console` to satisfy MD040.

✅ Suggested fix:

````diff
-```
+```text
 INFO Starting AIPerf System
 ...
 JSON Export: artifacts/Qwen_Qwen2-VL-2B-Instruct-chat-concurrency4/profile_export_aiperf.json
````

````diff
-```
+```text
 INFO Starting AIPerf System
 ...
 JSON Export: artifacts/Qwen_Qwen2-VL-2B-Instruct-chat-concurrency1/profile_export_aiperf.json
````

Also applies to: 108-108

fern/pages/tutorials/multi-turn.md-405-405 (1)

`405-405`: ⚠️ Potential issue | 🟡 Minor

**Document the `--conversation-turn-delay-ratio` parameter in the Core Parameters section.**

This parameter is mentioned at line 405 but is missing from the "Turn Delays" subsection (lines 59-69) where the other delay parameters (`--conversation-turn-delay-mean` and `--conversation-turn-delay-stddev`) are documented. Add documentation for this parameter alongside the other turn delay options for consistency and completeness.

fern/pages/tutorials/fixed-schedule.md-139-163 (1)

`139-163`: ⚠️ Potential issue | 🟡 Minor

**Sample output entry count appears inconsistent with the schedule data.**

The schedule defined earlier has entries at timestamps: 0, 500, 750, 1000, 1250, 2000, 2500, 3000, 4000, 5000. With `--fixed-schedule-start-offset 2000` and `--fixed-schedule-end-offset 4000`, the filtered entries should include at least timestamps 2000, 2500, and 3000 (3+ entries depending on boundary inclusivity), but line 143 says "Filtered to 2 entries." Please verify and correct the sample output to match the expected filtering behavior.

fern/pages/tutorials/fixed-schedule.md-122-134 (1)

`122-134`: ⚠️ Potential issue | 🟡 Minor

**Incorrect comment: "2s to 6s" should be "2s to 4s".**

Line 123 says `# Execute schedule from 2s to 6s window`, but the actual offsets are `--fixed-schedule-start-offset 2000` (2s) and `--fixed-schedule-end-offset 4000` (4s).
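As a quick sanity check on the entry-count comment above, the window filtering can be computed directly in Python; whichever boundary convention AIPerf uses (an assumption either way), the result is more than the "2 entries" the sample output reports:

```python
# Timestamps (ms) from the schedule in the tutorial, filtered to the window
# given by --fixed-schedule-start-offset 2000 / --fixed-schedule-end-offset 4000.
timestamps = [0, 500, 750, 1000, 1250, 2000, 2500, 3000, 4000, 5000]
start, end = 2000, 4000

inclusive_end = [t for t in timestamps if start <= t <= end]
exclusive_end = [t for t in timestamps if start <= t < end]

print(inclusive_end)  # [2000, 2500, 3000, 4000]
print(exclusive_end)  # [2000, 2500, 3000]
```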
Proposed fix:

```diff
-# Execute schedule from 2s to 6s window
+# Execute schedule from 2s to 4s window
```

fern/pages/tutorials/local-tokenizer.md-82-89 (1)
`82-89`: ⚠️ Potential issue | 🟡 Minor

**Sample output table is missing its top border row.**

Other sample outputs in the tutorials include the full table frame starting with `┏━━━...`. Here the table begins directly with `┃` (line 83), missing the opening border. This inconsistency could confuse readers.

Proposed fix — add the missing top border:

```diff
 NVIDIA AIPerf | LLM Metrics
+┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓
 ┃ Metric                     ┃   avg   ┃   min   ┃   max   ┃   p99   ┃   p50   ┃
```

fern/pages/tutorials/time-based-benchmarking.md-169-169 (1)
`169-169`: ⚠️ Potential issue | 🟡 Minor

**Fix broken link: `benchmark_modes` should be `benchmark-modes`.**

The path `../benchmark_modes/timing-modes-reference.md` uses an underscore, but the actual directory in `fern/pages/` is `benchmark-modes` (with a hyphen).

Proposed fix:

```diff
-- [Timing Modes Reference](../benchmark_modes/timing-modes-reference.md) — Complete CLI compatibility matrix
+- [Timing Modes Reference](../benchmark-modes/timing-modes-reference.md) — Complete CLI compatibility matrix
```

fern/pages/tutorials/working-with-profile-exports.md-96-106 (1)
`96-106`: ⚠️ Potential issue | 🟡 Minor

**Fix duplicate metric in example.**

Lines 98 and 101 both show `time_to_first_token` with different values and units. This appears to be a copy-paste error. The second occurrence should likely be `time_to_second_token` or another distinct metric.

🔧 Proposed fix:

```diff
 "metrics": {
   "input_sequence_length": {"value": 550, "unit": "tokens"},
   "time_to_first_token": {"value": 255.88656799999998, "unit": "ms"},
   "request_latency": {"value": 297.52522799999997, "unit": "ms"},
   "output_token_count": {"value": 9, "unit": "tokens"},
-  "time_to_first_token": {"value": 4.8984369999999995, "unit": "ms"},
+  "time_to_second_token": {"value": 4.8984369999999995, "unit": "ms"},
   "inter_chunk_latency": {"value": [4.898437, 5.316006, 4.801489, 5.674918, 4.811467, 5.097998, 5.504797, 5.533548], "unit": "ms"},
   "output_sequence_length": {"value": 9, "unit": "tokens"},
   "inter_token_latency": {"value": 5.2048325, "unit": "ms"},
   "output_token_throughput_per_user": {"value": 192.1291415237666, "unit": "tokens/sec/user"}
 },
```

fern/pages/tutorials/plot.md-82-95 (1)
`82-95`: ⚠️ Potential issue | 🟡 Minor

**Fix extra closing backticks.**

Line 95 has three closing backticks when only one set is needed to close the output block that started on Line 86. This will cause rendering issues.

🔧 Proposed fix:

````diff
 INFO Using dark theme
 INFO Found 3 runs to compare
 INFO Generating 3 comparison plots
 INFO Successfully generated 3 plots
 INFO Plots saved to: artifacts/sweep_qwen/plots/
-```
````

fern/pages/tutorials/timeslices.md-12-13 (1)

`12-13`: ⚠️ Potential issue | 🟡 Minor

**Use hyphenated compound adjective.**

"Equal-duration segments" reads cleaner than "equal duration segments."

fern/pages/benchmark-modes/trace-replay.md-47-52 (1)

`47-52`: ⚠️ Potential issue | 🟡 Minor

**`hash_ids` listed under "Required fields" but marked "(optional)" — confusing.**

The section heading says "Required fields for trace replay" but `hash_ids` is annotated as `(optional)`. Consider either moving it to a separate "Optional fields" list or changing the heading to "Fields for trace replay."

Proposed fix:

```diff
-Required fields for trace replay:
+Fields for trace replay:
 - `timestamp`: Request arrival time in milliseconds
 - `input_length`: Number of input tokens
 - `output_length`: Number of output tokens
-- `hash_ids`: List of block hashes (optional)
+- `hash_ids`: List of block hashes *(optional)*
```

fern/pages/server-metrics/server-metrics-json-schema.md-38-43 (1)
`38-43`: ⚠️ Potential issue | 🟡 Minor

**Fix broken internal links — filenames use hyphens, not underscores.**

Three links in the related documentation section reference files with underscores, but the actual filenames use hyphens. These links will produce 404s and need to be corrected.

Proposed fix:

```diff
-The Parquet format exports raw time-series data with delta calculations in columnar format, optimized for SQL analytics with DuckDB, pandas, or Polars. See [Parquet Schema Reference](server_metrics_parquet_schema.md) for the complete schema.
+The Parquet format exports raw time-series data with delta calculations in columnar format, optimized for SQL analytics with DuckDB, pandas, or Polars. See [Parquet Schema Reference](server-metrics-parquet-schema.md) for the complete schema.

 **Related documentation:**
 - [Server Metrics Tutorial](server-metrics.md) - Quick start guide and usage examples
-- [Server Metrics Reference](server_metrics_reference.md) - Metric definitions by backend (vLLM, SGLang, TRT-LLM, Dynamo)
-- [Parquet Schema Reference](server_metrics_parquet_schema.md) - Raw time-series data schema
+- [Server Metrics Reference](server-metrics-reference.md) - Metric definitions by backend (vLLM, SGLang, TRT-LLM, Dynamo)
+- [Parquet Schema Reference](server-metrics-parquet-schema.md) - Raw time-series data schema
```

fern/pages/metrics-reference.md-216-216 (1)
`216-216`: ⚠️ Potential issue | 🟡 Minor

**Use hyphenated compound modifiers (e.g., "Inter-Token").**

These headings should be hyphenated for correctness and consistency (Inter-Token, Inter-Chunk, Token-Based).

Proposed fix:

```diff
-### Inter Token Latency (ITL)
+### Inter-Token Latency (ITL)

-### Inter Chunk Latency (ICL)
+### Inter-Chunk Latency (ICL)

-## Token Based Metrics
+## Token-Based Metrics
```

Also applies to: 240-240, 298-298
fern/pages/server-metrics/server-metrics.md-158-158 (1)
`158-158`: ⚠️ Potential issue | 🟡 Minor

**Fix broken Parquet schema link.**

The link uses underscore naming and likely doesn't resolve; the file in this PR uses hyphens.

Proposed fix:

```diff
-See [Parquet Schema Reference](server_metrics_parquet_schema.md) for complete schema, metadata, and query examples.
+See [Parquet Schema Reference](server-metrics-parquet-schema.md) for complete schema, metadata, and query examples.
```

fern/pages/cli-options.md-431-434 (1)
`431-434`: ⚠️ Potential issue | 🟡 Minor

**Fix missing space after period.**

This reads as a typo in the rendered docs.

Proposed fix:

```diff
-Note that due to the prefix and user prompts being concatenated, the number of tokens in the final prompt may be off by one.Mutually exclusive with `--shared-system-prompt-length`/`--user-context-prompt-length`.
+Note that due to the prefix and user prompts being concatenated, the number of tokens in the final prompt may be off by one. Mutually exclusive with `--shared-system-prompt-length`/`--user-context-prompt-length`.
```

fern/pages/tutorials/arrival-patterns.md-392-396 (1)
`392-396`: ⚠️ Potential issue | 🟡 Minor

**Fix broken relative link to timing modes reference.**

The directory name is `benchmark-modes`, not `benchmark_modes`.

Proposed fix:

```diff
-- [Timing Modes Reference](../benchmark_modes/timing-modes-reference.md) — Complete CLI compatibility matrix
+- [Timing Modes Reference](../benchmark-modes/timing-modes-reference.md) — Complete CLI compatibility matrix
```

fern/pages/benchmark-modes/timing-modes-reference.md-83-84 (1)
`83-84`: ⚠️ Potential issue | 🟡 Minor

**Remove blank line inside blockquote.**

markdownlint MD028 flags this; keep the blockquote contiguous.

Proposed fix:

```diff
-> **Important**: If `--concurrency` is not set, session concurrency limiting is **disabled** (unlimited). For `--user-centric-rate` mode, consider setting `--concurrency` to at least `--num-users` to ensure all users can have in-flight requests.
-
-> **See also**: [Prefill Concurrency Tutorial](../tutorials/prefill-concurrency.md) for detailed guidance on memory-safe long-context benchmarking.
+> **Important**: If `--concurrency` is not set, session concurrency limiting is **disabled** (unlimited). For `--user-centric-rate` mode, consider setting `--concurrency` to at least `--num-users` to ensure all users can have in-flight requests.
+> **See also**: [Prefill Concurrency Tutorial](../tutorials/prefill-concurrency.md) for detailed guidance on memory-safe long-context benchmarking.
```

fern/pages/diagrams/metrics-flow.md-34-49 (1)
`34-49`: ⚠️ Potential issue | 🟡 Minor

**Stage numbering jumps from "Stage 2" to "Stage 4" — missing "Stage 3".**

Line 34 comments `Stage 2` and Line 48 comments `Stage 4`. Either rename to `Stage 3` or add the missing stage.

fern/pages/migrating.md-1-4 (1)
`1-4`: ⚠️ Potential issue | 🟡 Minor

**Copyright year range inconsistent with other files in this PR.**

This file uses `2024-2025` while most other new files in this PR use `2025-2026` or `2025`. If `2024` is intentional (e.g., content originated in 2024), this is fine — just flagging the inconsistency for confirmation.

fern/pages/server-metrics/server-metrics-reference.md-156-158 (1)
`156-158`: ⚠️ Potential issue | 🟡 Minor

**Broken link fragment: `#histogram-buckets` heading does not exist.**

The link on Line 158 points to `[Histogram Buckets](#histogram-buckets)`, but there is no `## Histogram Buckets` heading in this document. The bucket definitions are inline within metric tables. Either add a dedicated heading or update the link to point to an existing anchor (e.g., `#metric-interpretation-guide`).

.cursor/skills/docs-to-fern/SKILL.md-967-972 (1)
`967-972`: ⚠️ Potential issue | 🟡 Minor

**File appears truncated — item 6 ends mid-sentence.**

Line 972 ends without a period and the numbered list seems incomplete. The closing `` ``` `` for the skill file may also be missing.

Proposed fix:

```diff
-6. **Get logo assets early.** The NVIDIA logo SVGs and favicon are required before `fern docs dev` will render correctly. Copy from an existing NVIDIA Fern project or request from design
+6. **Get logo assets early.** The NVIDIA logo SVGs and favicon are required before `fern docs dev` will render correctly. Copy from an existing NVIDIA Fern project or request from design.
```

fern/pages/diagrams/metrics-flow.md-75-84 (1)
`75-84`: ⚠️ Potential issue | 🟡 Minor

**Style assignments reference undefined nodes `I1` and `F`.**

Line 82 applies the `statistics` class to `I1`, but only `I2` is defined (Line 49). Line 83 applies `transport` to `F`, which doesn't exist anywhere in the diagram. These appear to be remnants of a previous diagram revision. Mermaid will silently ignore them, but they're misleading.

Proposed fix:

```diff
-    class I1,G statistics
+    class G statistics
-    class E1,E2,E3,F,L transport
+    class E1,E2,E3,L transport
```

fern/pages/server-metrics/server-metrics-reference.md-166-168 (1)
`166-168`: ⚠️ Potential issue | 🟡 Minor

**Heading hierarchy: `Dynamo Frontend` should be `###` (h3), not `##` (h2).**

`Dynamo Frontend` (Line 168) is logically a subsection of `Detailed Metric Definitions` (Line 166), as shown in the TOC (Line 13). The same applies to the other backend headings (`Dynamo Component`, `vLLM`, `SGLang`, `TensorRT-LLM`, `KVBM`). Using h2 for both the parent and children flattens the hierarchy and may cause Fern's sidebar to display them incorrectly.

Proposed fix (apply to all backend subsection headings):

```diff
-## Dynamo Frontend
+### Dynamo Frontend
```

Similarly for `## Dynamo Component`, `## vLLM`, `## SGLang`, `## TensorRT-LLM`, `## KVBM`.

fern/pages/api/synthesis.md-556-557 (1)
`556-557`: ⚠️ Potential issue | 🟡 Minor

**Fix broken documentation link in "See Also" section.**

The link to `../benchmark_modes/trace_replay.md` is broken. The correct path is `../benchmark-modes/trace-replay.md` (using dashes instead of underscores in both the directory and filename).

Corrected snippet:

```markdown
- [Prefix Synthesis Tutorial](../tutorials/prefix-synthesis.md)
- [Trace Replay](../benchmark-modes/trace-replay.md)
```
🧹 Nitpick comments (44)
fern/pages/tutorials/user-centric-timing.md (1)
`91-93`: Add languages to fenced code blocks (MD040).

markdownlint warns when fenced blocks omit a language; please tag them (e.g., `text` for diagrams/outputs, `bash` for commands).

🔧 Proposed updates:
````diff
-```
+```text
 turn_gap = num_users / user_centric_rate
````

````diff
-```
+```text
 Evaluate: Benchmark Execution Timeline (t=0 to t=30s)
 ---------------------------------------------------------------------
 TIME (s) >>>    0    1    2    3    4    5    6    7    8    9   10   11   12 ...
@@
 RESULT: Immediate mix of fresh sessions (User 16) and deep sessions (User 14), with users finishing and churning naturally from t=6s onwards.
````

````diff
-```
+```text
 ┌─────────────────────────────────────────────────────────────┐
 │ Shared System Prompt (1000 tokens) │ ← Same across ALL users
 │ "You are a helpful assistant..."   │ (KV cache shared prefix)
@@
 └─────────────────────────────────────────────────────────────┘
````

````diff
-```
+```text
 INFO Starting AIPerf System
 INFO User-centric mode: 15 users, 1.0 req/s (15.0s turn gap per user)
@@
 JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-rate1.0/profile_export_aiperf.json
````

````diff
-```
+```text
 INFO Starting AIPerf System
 INFO User-centric mode: 15 users, 4.0 req/s (3.75s turn gap per user)
@@
 JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-rate4.0/profile_export_aiperf.json
````

````diff
-```
+```text
 INFO Starting AIPerf System
 INFO User-centric mode: 15 users, 0.5 req/s (30.0s turn gap per user)
@@
 JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-rate0.5/profile_export_aiperf.json
````

Also applies to: 106-129, 145-156, 207-231, 259-283, 309-333

fern/pages/tutorials/template-endpoint.md (2)

`91-97`: Consider documenting when named content variables are populated.

The named content variables (`query`, `queries`, `passage`, `passages`, etc.) are listed but lack context about:

- When these variables are populated vs. being `None`
- How they differ from generic `text`/`texts` variables
- What input format is required to use them

Adding a brief explanation would help users understand when to use `query` vs `text` in their templates.
---

`222-222`: Expand guidance on `tojson` filter usage.

The troubleshooting section states "Use `|tojson` filter for string or nullable values," but the examples throughout the document show `tojson` being used for all JSON-serializable types (strings, lists, dicts, numbers). Consider revising to: "Use `|tojson` filter for all values to ensure proper JSON serialization and escaping" to better align with the "Always use `|tojson`" tip on line 214 and the actual usage patterns shown in the examples.

fern/pages/tutorials/vision.md (1)

`18-22`: Pin the vLLM Docker image tag to a specific version for reproducibility.

The `:latest` tag is a moving target and can cause non-deterministic behavior in the tutorial. Use a version-pinned tag instead, such as `:v0.15.1` (the current stable release), which aligns with vLLM's official documentation recommendations for reproducible deployments.

fern/pages/tutorials/multi-turn.md (2)

`108-129`: Add language identifier to sample output code block.

The sample output code block should have a language identifier (e.g., `text` or `console`) for better syntax highlighting and consistency with Markdown best practices.

📝 Proposed fix:

````diff
 **Sample Output (Successful Run):**
-```
+```text
 INFO Starting AIPerf System
````
`20-29`: Optional: Remove blank line inside blockquote for consistency.

There's a blank line between the two blockquote sections (line 29). While this doesn't affect functionality, removing it would align with Markdown best practices and silence the linter warning.

📝 Proposed fix:

```diff
 > These options are mutually exclusive in their intent - use `--request-count` for single-turn benchmarking and `--conversation-num` for multi-turn benchmarking to avoid confusion.
-
 > [!NOTE]
 > **Dataset Generation vs Request Execution**
```

fern/pages/tutorials/ui-types.md (1)
`83-93`: Add a language identifier to fenced code blocks.

The sample output code blocks at lines 83 and 116 lack a language specifier, which triggers MD040 warnings. Adding `text` (or `console`) satisfies the linter and can improve rendering.

Proposed fix:

````diff
-```
+```text
 INFO Starting AIPerf System
````

Apply the same change at line 116.
fern/pages/tutorials/fixed-schedule.md (1)
`87-110`: Add language identifier to sample output code blocks.

Lines 87 and 139 open fenced code blocks without a language specifier (MD040). Use ` ```text ` for linting compliance.

fern/pages/tutorials/goodput.md (1)
`54-74`: Add language identifier to the output code block.

Line 56 opens a fenced code block without a language specifier. Use ` ```text ` for consistency with linting rules (MD040).

fern/pages/tutorials/time-based-benchmarking.md (1)
`26-35`: Add language identifiers to fenced code blocks.

The ASCII diagram (line 26) and sample output blocks (lines 76, 121) lack language specifiers (MD040). Use ` ```text ` for these blocks.

Also applies to: 76-104, 121-149
fern/pages/tutorials/local-tokenizer.md (1)
`30-36`: Add language identifiers to fenced code blocks.

The directory listing (line 30) and sample output (line 71) lack language specifiers (MD040). Use ` ```text ` for these.

Also applies to: 71-92
fern/pages/tutorials/custom-dataset.md (1)
`90-131`: Add language identifiers to sample output code blocks.

Lines 90, 171, and 265 open fenced code blocks without a language specifier (MD040). Use ` ```text `.

Also applies to: 171-210, 265-305
fern/pages/tutorials/audio.md (1)
`97-109`: Move the fixture path note before the code example.

The note on Line 101 warns users to replace fixture paths, but it appears after the code example that contains those paths (Lines 105-109). Users who copy-paste the example first may miss this critical instruction.
Consider moving the note to Line 97 (before the code example) to prevent confusion.
📝 Suggested reordering
## Profile with Custom Input File AIPerf can automatically load and encode audio files from local paths. +> **Note:** The example below uses paths from the AIPerf test fixtures directory. Replace these with paths to your own audio files. + {/* aiperf-run-vllm-audio-openai-endpoint-server */} ```bash cat <<EOF > inputs.jsonl {"texts": ["Transcribe this."], "audios": ["/fixtures/audio/test_audio_1s.wav"]} {"texts": ["What is said?"], "audios": ["/fixtures/audio/test_audio_2.wav"]} {"texts": ["Summarize."], "audios": ["/fixtures/audio/test_audio_3.wav"]} EOF -aiperf profile \ - --model Qwen/Qwen2-Audio-7B-Instruct \ - --endpoint-type chat \ - --input-file inputs.jsonl \ - --custom-dataset-type single_turn \ - --streaming \ - --url localhost:8000 \ - --request-count 3 -``` -{/* /aiperf-run-vllm-audio-openai-endpoint-server */} - -AIPerf will automatically: -- Load the audio files from the specified paths -- Convert them to base64 format -- Send them to the model endpoint - -> **Note:** The example below uses paths from the AIPerf test fixtures directory. Replace these with paths to your own audio files. - -{/* aiperf-run-vllm-audio-openai-endpoint-server */} -```bash -cat <<EOF > inputs.jsonl -{"texts": ["Transcribe this."], "audios": ["/fixtures/audio/test_audio_1s.wav"]} -{"texts": ["What is said?"], "audios": ["/fixtures/audio/test_audio_2.wav"]} -{"texts": ["Summarize."], "audios": ["/fixtures/audio/test_audio_3.wav"]} -EOF - aiperf profile \ --model Qwen/Qwen2-Audio-7B-Instruct \ --endpoint-type chat \ --input-file inputs.jsonl \ --custom-dataset-type single_turn \ --streaming \ --url localhost:8000 \ --request-count 3{/* /aiperf-run-vllm-audio-openai-endpoint-server */}
+AIPerf will automatically:
+- Load the audio files from the specified paths
+- Convert them to base64 format
+- Send them to the model endpoint</details> </blockquote></details> <details> <summary>fern/pages/tutorials/timeslices.md (1)</summary><blockquote> `63-76`: **Add a language identifier to the sample output fence.** Use ```text (or ```console) for the log block to satisfy MD040 and keep lint clean. </blockquote></details> <details> <summary>fern/pages/tutorials/sglang-video-generation.md (1)</summary><blockquote> `104-173`: **Add language identifiers to output/log fences.** Use ```text for the Uvicorn log snippet and the sample output table to satisfy MD040. </blockquote></details> <details> <summary>fern/pages/tutorials/openai-text-endpoints.md (1)</summary><blockquote> `47-68`: **Add language identifiers to the sample output fences.** Use ```text for the log/output blocks to clear MD040. Also applies to: 117-139 </blockquote></details> <details> <summary>fern/pages/tutorials/sglang-image-generation.md (1)</summary><blockquote> `96-139`: **Add language identifiers to fenced blocks.** Use ```text for sample outputs and ```json for prompt examples to satisfy MD040 and improve readability. Also applies to: 224-249 </blockquote></details> <details> <summary>fern/pages/tutorials/sequence-distributions.md (1)</summary><blockquote> `45-73`: **Add language identifiers to fenced blocks.** Use ```text for the string-format examples and the sample output to resolve MD040. Also applies to: 103-127 </blockquote></details> <details> <summary>fern/pages/tutorials/gpu-telemetry.md (1)</summary><blockquote> `159-193`: **Add language identifiers to output fences.** Use ```text for console/CSV output snippets to satisfy MD040. Also applies to: 499-538 </blockquote></details> <details> <summary>fern/pages/tutorials/custom-prompt-benchmarking.md (1)</summary><blockquote> `68-97`: **Add a language identifier to the sample output fence.** Use ```text for the log/output block to satisfy MD040. 
</blockquote></details> <details> <summary>fern/pages/comprehensive-llm-benchmarking.md (1)</summary><blockquote> `87-107`: **Prefer #### for subsections under ## in this doc.** This file uses many ### headings directly under ## (e.g., “### Command”, “### Parameters Explained”). Consider switching to #### for consistency with the docs style preference. Based on learnings: “In the aiperf repository's docs/metrics_reference.md file, the maintainer prefers using h4 headings (####) for subsections under h2 headings instead of h3 (###) for better visual sizing and readability, even though this violates markdownlint rule MD001.” Also applies to: 118-140 </blockquote></details> <details> <summary>fern/pages/server-metrics/server-metrics-json-schema.md (1)</summary><blockquote> `1-2`: **Minor: Copyright year inconsistency.** This file uses `2025` only, while most other new files in this PR use `2025-2026`. </blockquote></details> <details> <summary>fern/pages/tutorials/prefix-synthesis.md (1)</summary><blockquote> `39-66`: **Add language specifiers to fenced code blocks.** Multiple output/error blocks lack a language identifier (flagged by markdownlint MD040). Use ` ```text ` for terminal output and error message blocks (lines 39, 112, 378, 387, 394). Also applies to: 112-136, 378-400 </blockquote></details> <details> <summary>fern/pages/tutorials/warmup.md (1)</summary><blockquote> `14-26`: **Add language specifiers to fenced code blocks.** Several output and diagram blocks lack language identifiers (markdownlint MD040). Use ` ```text ` for ASCII art diagrams (lines 14, 254) and terminal output blocks (lines 54, 122, 157, 194, 233). 
Also applies to: 54-70, 122-138, 157-173, 194-210, 233-250, 254-264 </blockquote></details> <details> <summary>fern/pages/index.md (1)</summary><blockquote> `6-20`: **Landing page is sparse — consider adding navigation links and install instructions.** This is the docs entry point, but it lacks links to the key sections defined in `next.yml` (Tutorials, CLI Options, Architecture, etc.) and provides no concrete install command or API example. Users landing here have no actionable path forward. Consider at minimum: - A `pip install aiperf` (or equivalent) command in Quick Start - Links to the tutorials section, CLI reference, and architecture page - A brief description of what AIPerf benchmarks (LLM inference servers) </blockquote></details> <details> <summary>fern/pages/plugins/plugin-system.md (1)</summary><blockquote> `44-52`: **Add language identifiers to fenced code blocks.** Several fences are missing a language tag, which triggers MD040 and reduces syntax highlighting consistency. <details> <summary>Proposed fix</summary> ```diff -``` +```text Registry (singleton) └── Package (1+) ─── discovered via entry points └── Manifest (1+ per package) ─── plugins.yaml files └── Category (1+) └── Entry (1+) ─── PluginEntry ├── Class ─── lazy-loaded Python class └── Metadata ─── optional typed config -``` +``` -``` +```text Entry Points → plugins.yaml → Pydantic Validation → Registry ↓ get_class() → Import Module → Cache -``` +``` -``` +```text TypeNotFoundError: Type 'my_plugin' not found for category 'endpoint'. -``` +``` -``` +```text ImportError: Failed to import module for endpoint:my_plugin -``` +``` -``` +```text AttributeError: Class 'MyClass' not found -``` +```Also applies to: 70-74, 387-395, 399-407, 410-417
fern/pages/metrics-reference.md (1)
141-282: Consider h4 subsections under h2 for readability (project style).This file uses h3 for subsections under h2; project preference is to use h4 for better visual sizing.
Based on learnings: “In the aiperf repository's docs/metrics_reference.md file, the maintainer prefers using h4 headings (####) for subsections under h2 headings instead of h3 (###) for better visual sizing and readability, even though this violates markdownlint rule MD001.”
Also applies to: 298-1187
fern/pages/reference/tokenizer-auto-detection.md (1)
`34-37`: Add language identifiers to fenced output blocks. This resolves MD040 and improves readability.
Proposed fix
````diff
-```
+```text
 INFO ✓ Tokenizer Qwen/Qwen3-0.6B detected for qwen3-0.6b
 INFO 1 tokenizer validated • 1 resolved • 0.3s
-```
+```
-```
+```text
 ╭──────────────────────────────── Ambiguous Tokenizer Name ─────────────────────────────────╮
 ...
 ╰───────────────────────────────────────────────────────────────────────────────────────────╯
-```
+```
-```
+```text
 ╭───────────────────────────────── Gated Repository ──────────────────────────────────╮
 ...
 ╰─────────────────────────────────────────────────────────────────────────────────────╯
-```
+```
````

Also applies to: 40-60, 63-82
fern/pages/tutorials/request-rate-concurrency.md (1)
`97-115`: Add language identifiers to output fences. These look like console output; use
`text` for consistent highlighting and to satisfy MD040. Also applies to: 141-159
fern/pages/tutorials/prefill-concurrency.md (1)
`14-25`: Add language identifiers to fenced blocks. Use
`text` for diagrams/output and `bash` for CLI blocks to satisfy MD040. Also applies to: 33-52, 100-121, 147-168, 172-181, 202-224
fern/pages/server-metrics/server-metrics.md (1)
`71-71`: Add language identifiers to fenced blocks. Use
`bash`/`json` as appropriate to satisfy MD040 and improve readability. Also applies to: 396-396
fern/pages/tutorials/ramping.md (1)
`14-26`: Add language identifiers to fenced blocks. Several fences are missing a language tag; use
`text` for diagrams/output and `bash` for CLI blocks to satisfy MD040. Also applies to: 66-87, 90-99, 115-136, 139-148, 168-190, 209-230
fern/pages/tutorials/request-cancellation.md (1)
`16-30`: Add language tags to fenced blocks. markdownlint flags these blocks for missing language; using
`text` keeps formatting while satisfying MD040.

Proposed fix
````diff
-```
+```text
 T0: Request scheduled
 ...
 T3: Request cancelled if still waiting for response
-```
+```
-```
+```text
 INFO Starting AIPerf System
 ...
 JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-concurrency8/profile_export_aiperf.json
-```
+```
-```
+```text
 INFO Starting AIPerf System
 ...
 JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-concurrency10/profile_export_aiperf.json
-```
+```
-```
+```text
 INFO Starting AIPerf System
 ...
 JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-concurrency15/profile_export_aiperf.json
-```
+```
````
fern/pages/benchmark-modes/timing-modes-reference.md (1)
`248-266`: Add language tag to the ASCII diagram. Helps with MD040 and keeps consistent formatting.
Proposed fix
````diff
-```
+```text
 ┌─────────────────────────────────────────────────────────────────┐
 ...
 └─────────────────────────────────────────────────────────────────┘
-```
+```
````
82-87: Add language tags to ASCII tables.Avoids MD040 while preserving formatting.
Proposed fix
````diff
-```
+```text
 endpoint_url | metric_name | metric_type | unit | description |timestamp_ns | model_name | value | sum | count | bucket_le | bucket_count
 ...
-```
+```
-```
+```text
 endpoint_url | metric_name | metric_type | unit | description | timestamp_ns | model_name | value | sum | count | bucket_le | bucket_count
 ...
-```
+```
````

Also applies to: 91-98
fern/pages/tutorials/http-trace-metrics.md (1)
`34-45`: Add language tags to fenced blocks. Keeps markdownlint happy for diagrams/formulas.
Proposed fix
````diff
-```
+```text
 Request Lifecycle
 ──────────────────────────────────────────────────────────────────────────────►
 ...
-```
+```
-```
+```text
 http_req_duration = response_receive_end_perf_ns - request_send_start_perf_ns
-```
+```
-```
+```text
 http_req_connection_overhead = http_req_blocked + http_req_dns_lookup + http_req_connecting
-```
+```
-```
+```text
 http_req_total = http_req_blocked + http_req_dns_lookup + http_req_connecting
                + http_req_sending + http_req_waiting + http_req_receiving
-```
+```
-```
+```text
 [content chunk 1] ─► included in both metrics
 ...
 [DONE] ─► http_req_total ends here (last network chunk)
-```
+```
````

Also applies to: 88-90, 98-100, 106-109, 137-144
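The additive relationship in the fix above is easy to sanity-check numerically; the millisecond values below are made up for illustration and are not taken from any real benchmark run.

```python
# Hypothetical per-phase timings in milliseconds (illustrative values only),
# combined according to the http_req_total formula quoted above.
components = {
    "http_req_blocked": 0.4,
    "http_req_dns_lookup": 1.2,
    "http_req_connecting": 2.1,
    "http_req_sending": 0.3,
    "http_req_waiting": 85.0,
    "http_req_receiving": 11.0,
}

# Total is the sum of all six phases; connection overhead is the first three.
http_req_total = sum(components.values())
http_req_connection_overhead = (
    components["http_req_blocked"]
    + components["http_req_dns_lookup"]
    + components["http_req_connecting"]
)

print(round(http_req_total, 1))                # 100.0
print(round(http_req_connection_overhead, 1))  # 3.7
```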
fern/pages/tutorials/arrival-patterns.md (1)
`14-20`: Add language tags to diagram blocks. Use
`text` to satisfy MD040 without changing rendering.

Proposed fix
````diff
-```
+```text
 Constant Pattern: Poisson Pattern: Gamma (bursty):
 ...
-```
+```
-```
+```text
 Inter-arrival times: 10 QPS → every 100ms:
 |····|····|····|····|····|····|
 ...
-```
+```
-```
+```text
 Inter-arrival times (exponential): 10 QPS average:
 |··|······|·|···|····|··|·······|···|
 ...
-```
+```
-```
+```text
 Burst mode (concurrency=3):
 [Req1]────────────────────────────▶
 ...
-```
+```
````

Also applies to: 45-49, 65-69, 121-127
fern/pages/tutorials/synthetic-video.md (1)
`67-90`: Add language tag to sample output block. Addresses MD040 while preserving the console output look.
Proposed fix
````diff
-```
+```text
 INFO Starting AIPerf System
 ...
 JSON Export: artifacts/your-model-name-chat-concurrency1/profile_export_aiperf.json
-```
+```
````

fern/docs.yml (1)
`1-23`: Logo and favicon configuration missing.

The SKILL.md guide in this PR lists `logo` and `favicon` as required fields for NVIDIA projects. If assets aren't available yet, consider adding a TODO comment so this doesn't get forgotten. `fern docs dev` may not render correctly without them.

```yaml
# TODO: Add logo and favicon once assets are available
# logo:
#   href: /
#   light: ./assets/img/nvidia-logo.svg
#   dark: ./assets/img/nvidia-logo-dark.svg
#   height: 50
# favicon: ./assets/img/favicon.png
```
`1-3`: Inconsistent SPDX header format across documentation files. This file uses JSX comment syntax
`{/* ... */}` for the SPDX header, while `migrating.md` and `creating-your-first-plugin.md` use YAML frontmatter (`---`). Both work in Fern, but consistency across the doc set would improve maintainability. The same inconsistency appears in `server-metrics-reference.md` and `reproducibility.md`. Pick one approach and apply it uniformly. The YAML frontmatter style is recommended by the SKILL.md guide in this PR.
fern/pages/plugins/creating-your-first-plugin.md (1)
`35-62`: Add language identifiers to fenced code blocks for consistent rendering. Several code blocks (lines 35, 50, 203, 274, 310, 321) lack language identifiers. Adding
`text` will satisfy the markdown linter and may improve Fern's syntax highlighting behavior.

Example fix for directory trees
````diff
-```
+```text
 Package (my-aiperf-plugins)
````
105-112: Clarify the return type format in the example output.The example shows
statsas a plain dictionary, butget_stats()returns aRadixTreeStatsdataclass object. The actual representation would beRadixTreeStats(num_nodes=7, num_leaves=3, total_visits=3, max_depth=3)rather than a dictionary. Consider updating the example to reflect the actual dataclass representation to avoid confusion.📝 Suggested clarification
````diff
 # Get statistics
 stats = tree.get_stats()
-# {
-#     'num_nodes': 7,
-#     'num_leaves': 3,
-#     'total_visits': 3,
-#     'max_depth': 3
-# }
+# RadixTreeStats(num_nodes=7, num_leaves=3, total_visits=3, max_depth=3)
````
`348-353`: Spell out the "ge" abbreviation for better clarity. The abbreviation "ge" (greater than or equal to) is mathematical shorthand that may not be immediately clear to all readers. Consider spelling it out or using the
`>=` symbol for better readability.

📖 Suggested clarification
````diff
 **Fields:**
-- `speedup_ratio: float = 1.0` - Timestamp scaling multiplier (ge 0.0)
-- `prefix_len_multiplier: float = 1.0` - Core prefix length multiplier (ge 0.0)
-- `prefix_root_multiplier: int = 1` - Number of independent trees to distribute traces across (ge 1)
-- `prompt_len_multiplier: float = 1.0` - Leaf prompt length multiplier (ge 0.0)
+- `speedup_ratio: float = 1.0` - Timestamp scaling multiplier (>= 0.0)
+- `prefix_len_multiplier: float = 1.0` - Core prefix length multiplier (>= 0.0)
+- `prefix_root_multiplier: int = 1` - Number of independent trees to distribute traces across (>= 1)
+- `prompt_len_multiplier: float = 1.0` - Leaf prompt length multiplier (>= 0.0)
 - `max_isl: int | None = None` - Maximum input sequence length filter
-- `block_size: int = 512` - KV cache page size (ge 1)
+- `block_size: int = 512` - KV cache page size (>= 1)
````
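For readers unfamiliar with the shorthand, "ge" here is the Pydantic-style "greater than or equal to" lower bound. A plain-Python sketch of that check (the helper name and values are illustrative, not AIPerf API):

```python
# Minimal stand-in for a Pydantic Field(ge=...) lower-bound constraint.
def validate_ge(name: str, value: float, minimum: float) -> float:
    if value < minimum:
        raise ValueError(f"{name} must be >= {minimum}, got {value}")
    return value

# Defaults from the field list above all satisfy their bounds.
speedup_ratio = validate_ge("speedup_ratio", 1.0, 0.0)
prefix_root_multiplier = validate_ge("prefix_root_multiplier", 1, 1)
print(speedup_ratio, prefix_root_multiplier)  # 1.0 1
```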
`314-314`: Clarify the definition of "unique_prefixes". The description "Number of unique prefix patterns (all prefix subsequences)" is slightly ambiguous. Consider clarifying whether this counts all unique prefix subsequences across all requests or something else.
📖 Suggested clarification
For example, if it means all unique prefix subsequences:
````diff
-- `unique_prefixes: int` - Number of unique prefix patterns (all prefix subsequences)
+- `unique_prefixes: int` - Number of unique prefix patterns (counts all unique prefix subsequences across requests)
````

Or if it means something else, adjust accordingly.
Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!
9ea0874 to
6864d56
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
🤖 Fix all issues with AI agents
In `@fern/pages/tutorials/embeddings.md`:
- Around line 81-87: The JSONL example uses the wrong field name "texts" which
will fail the OpenAI embeddings API; update the inputs.jsonl generation in the
here-doc to use the "input" field (either a single string per line like
{"input":"What is artificial intelligence?"} or an array form
{"input":["...","..."]} if batching), replacing all occurrences of {"texts":
[...] } with {"input": ... } so the examples match the verification curl and the
OpenAI embeddings schema.
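A sketch of what the corrected inputs.jsonl lines would look like (the contents are illustrative; the field rename from "texts" to "input" is the only point being shown):

```python
import json

# Each JSONL line uses "input" (a string, or a list of strings for the
# batched form), matching the OpenAI embeddings schema, instead of the
# unsupported "texts" field.
records = [
    {"input": "What is artificial intelligence?"},
    {"input": ["Batched sentence one.", "Batched sentence two."]},
]
jsonl = "\n".join(json.dumps(r) for r in records)
print(jsonl)

# Every line round-trips and carries the corrected field name.
parsed = [json.loads(line) for line in jsonl.splitlines()]
print(all("input" in r and "texts" not in r for r in parsed))  # True
```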
In `@fern/pages/tutorials/sharegpt.md`:
- Around line 19-21: The Docker commands currently use the floating "latest"
tag; replace both occurrences of vllm/vllm-openai:latest with a specific stable
tag (for example vllm/vllm-openai:v0.14.1) and, if you need a CUDA-specific
build, choose the appropriate CUDA variant (e.g., v0.14.1-cu130) so the docker
pull and docker run lines pin the image version for reproducible runs.
🟡 Minor comments (36)
fern/pages/tutorials/prefix-synthesis.md-342-372 (1)
`342-372`: _⚠️ Potential issue_ | _🟡 Minor_

**Guidance vs examples conflict on “extreme” multiplier values.**
The tips recommend avoiding extreme multipliers (typically 0.5–3.0), but earlier examples use
`--synthesis-prefix-root-multiplier 5` and `10`. Please clarify that the 0.5–3.0 guidance applies only to certain multipliers (e.g., prefix length/prompt length), or update the examples/guidance to be consistent.
`49-49`: _⚠️ Potential issue_ | _🟡 Minor_

**Minor wording: add space in “4 fps”.**
“4fps” reads as a typo; use “4 fps”.
fern/pages/tutorials/synthetic-video.md-52-56 (1)
`52-56`: _⚠️ Potential issue_ | _🟡 Minor_

**Clarify that video is supported via AIPerf’s custom chat endpoint, not the vanilla OpenAI API.**
Since the example uses
/v1/chat/completions, add a short note that video inputs are supported by AIPerf’s custom ChatEndpoint extensions (and are not standard OpenAI API behavior) to avoid user confusion.
Based on learnings: “ChatEndpoint … supports video inputs (supports_videos=True) through custom extensions, even though the standard OpenAI /v1/chat/completions API does not natively support raw video inputs.”Also applies to: 356-357
fern/pages/tutorials/huggingface-tgi.md-55-75 (1)
`55-75`: _⚠️ Potential issue_ | _🟡 Minor_

**Add language tags to fenced output blocks.**
markdownlint MD040 flags these blocks; use
`text` for console output.

✅ Suggested fix
````diff
-```
+```text
 INFO Starting AIPerf System
 INFO Using Hugging Face TGI /generate endpoint (non-streaming)
 INFO AIPerf System is PROFILING
@@
 JSON Export: artifacts/TinyLlama_TinyLlama-1.1B-Chat-v1.0-generate-concurrency1/profile_export_aiperf.json
-```
+```
@@
-```
+```text
 INFO Starting AIPerf System
 INFO Using Hugging Face TGI /generate_stream endpoint (streaming)
 INFO AIPerf System is PROFILING
@@
 JSON Export: artifacts/TinyLlama_TinyLlama-1.1B-Chat-v1.0-generate-concurrency1/profile_export_aiperf.json
-```
+```
````

Also applies to: 117-139
fern/pages/tutorials/sequence-distributions.md-45-66 (1)
`45-66`: _⚠️ Potential issue_ | _🟡 Minor_

**Add language identifiers to fenced code blocks (MD040).**
The fenced blocks under “Semicolon Format” and “Bracket Format” are missing languages. Please tag them (e.g.,
`text`) to satisfy the linter.

Suggested fix
````diff
-```
+```text
 "ISL1,OSL1:PROB1;ISL2,OSL2:PROB2;..."
@@
-```
+```text
 "ISL1|STDDEV1,OSL1|STDDEV1:PROB1;ISL2|STDDEV2,OSL2|STDDEV2:PROB2"
@@
-```
+```text
 "[(ISL1,OSL1):PROB1,(ISL2,OSL2):PROB2]"
@@
-```
+```text
 "[(256|10,128|5):60,(512|20,256|15):40]"
````

fern/pages/tutorials/sequence-distributions.md-103-127 (1)
`103-127`: _⚠️ Potential issue_ | _🟡 Minor_

**Add a language tag to the sample output fence (MD040).**
The sample output block should be tagged (e.g.,
`text` or `console`) to satisfy markdownlint.

Suggested fix
-``` +```text INFO Starting AIPerf System ... JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-concurrency1/profile_export_aiperf.json</details> </blockquote></details> <details> <summary>fern/pages/tutorials/sglang-video-generation.md-104-107 (1)</summary><blockquote> `104-107`: _⚠️ Potential issue_ | _🟡 Minor_ **Add language identifiers to fenced code blocks.** markdownlint flags these code fences as missing a language. Add `text` (or a more specific language) to satisfy MD040. <details> <summary>Proposed fix</summary> ```diff -``` +```text Uvicorn running on http://0.0.0.0:30010 (Press CTRL+C to quit)```diff -``` +```text NVIDIA AIPerf | Video Generation Metrics ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━┓ ┃ Metric ┃ avg ┃ min ┃ max ┃ p99 ┃ p90 ┃ p50 ┃ std ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━┩ │ Request Latency (ms) │ 45,234.56 │ 42,123.45 │ 48,567.89 │ 48,432.12 │ 47,654.32 │ 45,012.34 │ 2634.78 │ │ Input Sequence Length (tokens) │ 8.33 │ 7.00 │ 10.00 │ 9.98 │ 9.80 │ 8.00 │ 1.25 │ │ Request Throughput (requests/sec) │ 0.02 │ - │ - │ - │ - │ - │ - │ │ Request Count (requests) │ 3.00 │ - │ - │ - │ - │ - │ - │ └───────────────────────────────────┴───────────┴───────────┴───────────┴───────────┴───────────┴───────────┴─────────┘```diff -``` +```text Downloading: http://localhost:30010/v1/videos/video_abc123/content Saved: /path/to/downloaded_videos/video_abc123.mp4 Downloading: http://localhost:30010/v1/videos/video_def456/content Saved: /path/to/downloaded_videos/video_def456.mp4 Downloading: http://localhost:30010/v1/videos/video_ghi789/content Saved: /path/to/downloaded_videos/video_ghi789.mp4 Videos saved to: /path/to/downloaded_videos</details> Also applies to: 162-173, 347-357 </blockquote></details> <details> <summary>fern/pages/tutorials/custom-prompt-benchmarking.md-69-69 (1)</summary><blockquote> 
`69-69`: _⚠️ Potential issue_ | _🟡 Minor_ **Add a language to the fenced block (MD040).** The sample output fence lacks a language tag. Use `text` to satisfy the linter. <details> <summary>🔧 Suggested change</summary> ```diff -``` +```textfern/pages/tutorials/custom-prompt-benchmarking.md-105-105 (1)
`105-105`: _⚠️ Potential issue_ | _🟡 Minor_

**Promote “Use Cases” to h2 to satisfy heading increment (MD001).**
After an h2 section, the next top-level section should also be h2.
🔧 Suggested change
````diff
-### Use Cases
+## Use Cases
````

fern/pages/tutorials/sharegpt.md-51-74 (1)
`51-74`: _⚠️ Potential issue_ | _🟡 Minor_

**Add a language to the sample output code fence.**
MD040 expects a language; use
`text` for terminal output.

✅ Proposed change
-``` +```textfern/pages/tutorials/rankings.md-53-71 (1)
`53-71`: _⚠️ Potential issue_ | _🟡 Minor_

**Add language specifier to fenced code block.**
The static analysis tool flagged this code block for missing a language specification. For sample output blocks, consider adding
`text` or `console` as the language identifier for better rendering and linter compliance.

📝 Proposed fix
````diff
 **Sample Output (Successful Run):**
-```
+```text
 INFO Starting AIPerf System
````
45-45:⚠️ Potential issue | 🟡 MinorAdd protocol prefix to
--urlparameter.Line 45 uses
--url localhost:8000without a protocol, while the verificationcurlcommand on line 28 useshttp://localhost:8000. AIPerf documentation recommends including the full URL with protocol scheme. Update line 45 to--url http://localhost:8000for consistency and adherence to best practices.fern/pages/tutorials/fixed-schedule.md-138-163 (1)
138-163:⚠️ Potential issue | 🟡 MinorSample output entry count is inconsistent with the schedule data.
The schedule file (lines 60-71) contains entries at timestamps: 0, 500, 750, 1000, 1250, 2000, 2500, 3000, 4000, 5000. With
--fixed-schedule-start-offset 2000and--fixed-schedule-end-offset 4000, timestamps 2000, 2500, 3000, and 4000 fall within the window (3-4 entries depending on boundary inclusivity), but the sample output shows "Filtered to 2 entries" andProfiling: 2/2.Consider updating the sample output to reflect the correct filtered count, or adjust the offset values to match the intended 2-entry result.
fern/pages/tutorial.md-2-2 (1)
`2-2`: _⚠️ Potential issue_ | _🟡 Minor_

**Copyright year range is inconsistent with other new files in this PR.**
Other new files use `2025-2026` or `2026`, but this file has `2024-2025`.
2025-2026or2026, but this file has2024-2025.-# SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.docs/tutorials/sglang-image-generation.md-237-249 (1)
237-249:⚠️ Potential issue | 🟡 MinorAdd alt text to image references for accessibility.
All three image references lack alt text, which is important for accessibility and also helps when images fail to load. The static analysis tool (MD045) flagged this as well.
Proposed fix
- +- +- +fern/pages/benchmark-modes/trace-replay.md-47-52 (1)
47-52:⚠️ Potential issue | 🟡 MinorContradictory labeling:
hash_idslisted under "Required fields" but marked "(optional)".The heading says "Required fields for trace replay" but
hash_idson line 51 is annotated as(optional). Either move it out of the "Required fields" list or change the heading to "Fields for trace replay" / "Supported fields".Suggested fix
````diff
-Required fields for trace replay:
+Supported fields for trace replay:
 - `timestamp`: Request arrival time in milliseconds
 - `input_length`: Number of input tokens
 - `output_length`: Number of output tokens
-- `hash_ids`: List of block hashes (optional)
+- `hash_ids`: List of block hashes _(optional)_
````
237-239:⚠️ Potential issue | 🟡 MinorHyphenate the heading for readability. Use “High-Throughput Cache Test”.
fern/pages/tutorials/multi-url-load-balancing.md-1-4 (1)
`1-4`: _⚠️ Potential issue_ | _🟡 Minor_

**Replace JSX-style comment with standard markdown frontmatter or HTML comment.** In plain
`.md`, `{/* ... */}` renders as text.

Proposed fix
````diff
-{/* # SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-# SPDX-License-Identifier: Apache-2.0 */}
+---
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+---
````

fern/pages/cli-options.md-431-434 (1)
431-434:⚠️ Potential issue | 🟡 MinorFix missing space in the prefix prompt note.
Proposed fix
````diff
-Note that due to the prefix and user prompts being concatenated, the number of tokens in the final prompt may be off by one.Mutually exclusive with `--shared-system-prompt-length`/`--user-context-prompt-length`.
+Note that due to the prefix and user prompts being concatenated, the number of tokens in the final prompt may be off by one. Mutually exclusive with `--shared-system-prompt-length`/`--user-context-prompt-length`.
````

fern/pages/tutorials/local-tokenizer.md-28-92 (1)
`28-92`: _⚠️ Potential issue_ | _🟡 Minor_

**Add language tags to non-bash fences.** This fixes MD040 and improves rendering/linters.
Proposed fix
````diff
-```
+```text
 /path/to/your/local/tokenizer/
 ├── tokenizer.json
 ├── tokenizer_config.json
 ├── vocab.txt (or vocab.json)
 └── config.json
@@
-```
+```text
 INFO Starting AIPerf System
 INFO Loading local tokenizer from: /home/user/tokenizers/llama-2-7b
 INFO Tokenizer loaded successfully (offline mode)
 INFO AIPerf System is PROFILING
@@
 JSON Export: artifacts/llama-2-7b-chat-concurrency4/profile_export_aiperf.json
````
26-149:⚠️ Potential issue | 🟡 MinorAdd language tags to non-bash fences. Keeps markdownlint clean and clarifies intent.
Proposed fix
-``` +```text │ BENCHMARK DURATION │ GRACE PERIOD │ @@ ▲ ▲ Duration expires Grace period ends@@
-+text
INFO Starting AIPerf System
@@
JSON Export: artifacts/Qwen_Qwen2.5-7B-Instruct-chat-concurrency50/profile_export_aiperf.json@@ -``` +```text INFO Starting AIPerf System @@ JSON Export: artifacts/Qwen_Qwen2.5-7B-Instruct-chat-concurrency20/profile_export_aiperf.json</details> </blockquote></details> <details> <summary>fern/pages/tutorials/ramping.md-14-231 (1)</summary><blockquote> `14-231`: _⚠️ Potential issue_ | _🟡 Minor_ **Add language tags to non-bash fences.** This covers the ASCII diagrams and sample outputs. <details> <summary>Proposed fix</summary> ```diff -``` +```text Without ramping: With ramping: @@ 0 ┼──────────────────────▶ 0 ┼●─────────────────────▶ 0 Time 0 30s Time@@
-+text
INFO Starting AIPerf System
@@
JSON Export: artifacts/your-model-chat-concurrency100/profile_export_aiperf.json@@ -``` +```text Concurrency 100 ┤ ●━━━━━━━━━━━ @@ 7.5s 15s 22.5s 30s Time@@
-+text
INFO Starting AIPerf System
@@
JSON Export: artifacts/your-model-chat-rate100/profile_export_aiperf.json@@ -``` +```text Request Rate (QPS) 100 ┤ ●━━━━━━━━━━━ @@ 15s 30s 45s 60s Time@@
-+text
INFO Starting AIPerf System
@@
JSON Export: artifacts/your-model-chat-concurrency100-rate200/profile_export_aiperf.json@@ -``` +```text INFO Starting AIPerf System @@ JSON Export: artifacts/your-model-chat-concurrency100/profile_export_aiperf.json</details> </blockquote></details> <details> <summary>fern/pages/tutorials/user-centric-timing.md-91-333 (1)</summary><blockquote> `91-333`: _⚠️ Potential issue_ | _🟡 Minor_ **Add language tags to non-bash code fences.** This resolves MD040 and makes render intent explicit. <details> <summary>Proposed fix</summary> ```diff -``` +```text turn_gap = num_users / user_centric_rate@@
-+text
Evaluate: Benchmark Execution Timeline (t=0 to t=30s)@@
RESULT:
Immediate mix of fresh sessions (User 16) and deep sessions (User 14),
with users finishing and churning naturally from t=6s onwards.@@ -``` +```text ┌─────────────────────────────────────────────────────────────┐ @@ └─────────────────────────────────────────────────────────────┘@@
-+text
INFO Starting AIPerf System
@@
JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-rate1.0/profile_export_aiperf.json@@ -``` +```text INFO Starting AIPerf System @@ JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-rate4.0/profile_export_aiperf.json@@
-+text
INFO Starting AIPerf System
@@
JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-rate0.5/profile_export_aiperf.jsonfern/pages/plugins/plugin-system.md-42-52 (1)
42-52:⚠️ Potential issue | 🟡 MinorAdd a language to the hierarchy diagram fence (MD040).
Proposed fix
-``` +```text Registry (singleton) └── Package (1+) ─── discovered via entry points └── Manifest (1+ per package) ─── plugins.yaml files └── Category (1+) └── Entry (1+) ─── PluginEntry ├── Class ─── lazy-loaded Python class └── Metadata ─── optional typed config</details> </blockquote></details> <details> <summary>fern/pages/plugins/plugin-system.md-385-412 (1)</summary><blockquote> `385-412`: _⚠️ Potential issue_ | _🟡 Minor_ **Add `text` language to the error snippet fences (MD040).** <details> <summary>Proposed fix</summary> ```diff -``` +```text TypeNotFoundError: Type 'my_plugin' not found for category 'endpoint'.@@
-+text
ImportError: Failed to import module for endpoint:my_plugin@@ -``` +```text AttributeError: Class 'MyClass' not found</details> </blockquote></details> <details> <summary>fern/pages/benchmark-modes/timing-modes-reference.md-248-266 (1)</summary><blockquote> `248-266`: _⚠️ Potential issue_ | _🟡 Minor_ **Add a language to the fenced block (MD040).** The ASCII diagram should declare a language like `text` to satisfy markdownlint and improve renderer consistency. <details> <summary>Proposed fix</summary> ```diff -``` +```text ┌─────────────────────────────────────────────────────────────────┐ │ Which options should I use? │ ├─────────────────────────────────────────────────────────────────┤ @@ └─────────────────────────────────────────────────────────────────┘</details> </blockquote></details> <details> <summary>fern/pages/tutorials/multi-turn.md-107-129 (1)</summary><blockquote> `107-129`: _⚠️ Potential issue_ | _🟡 Minor_ **Add a language to the sample output fence (MD040).** <details> <summary>Proposed fix</summary> ```diff -``` +```text INFO Starting AIPerf System INFO Multi-turn mode: 10 conversations, 3 turns each (30 total requests) @@ JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-concurrency2/profile_export_aiperf.json</details> </blockquote></details> <details> <summary>fern/pages/tutorials/http-trace-metrics.md-34-110 (1)</summary><blockquote> `34-110`: _⚠️ Potential issue_ | _🟡 Minor_ **Add language identifiers to fenced blocks (MD040).** <details> <summary>Proposed fix</summary> ```diff -``` +```text Request Lifecycle ──────────────────────────────────────────────────────────────────────────────► @@@@
-+text
http_req_duration = response_receive_end_perf_ns - request_send_start_perf_ns@@ -``` +```text http_req_connection_overhead = http_req_blocked + http_req_dns_lookup + http_req_connecting@@
-+text
http_req_total = http_req_blocked + http_req_dns_lookup + http_req_connecting
+ http_req_sending + http_req_waiting + http_req_receivingfern/pages/benchmark-modes/timing-modes-reference.md-83-85 (1)
`83-85`: _⚠️ Potential issue_ | _🟡 Minor_

**Remove the blank line inside the blockquote to satisfy MD028.**
markdownlint flags the empty line inside the blockquote; it breaks the blockquote formatting in some renderers.
Proposed fix
````diff
-> **Important**: If `--concurrency` is not set, session concurrency limiting is **disabled** (unlimited). For `--user-centric-rate` mode, consider setting `--concurrency` to at least `--num-users` to ensure all users can have in-flight requests.
-
-> **See also**: [Prefill Concurrency Tutorial](../tutorials/prefill-concurrency.md) for detailed guidance on memory-safe long-context benchmarking.
+> **Important**: If `--concurrency` is not set, session concurrency limiting is **disabled** (unlimited). For `--user-centric-rate` mode, consider setting `--concurrency` to at least `--num-users` to ensure all users can have in-flight requests.
+> **See also**: [Prefill Concurrency Tutorial](../tutorials/prefill-concurrency.md) for detailed guidance on memory-safe long-context benchmarking.
````

.cursor/skills/docs-to-fern/SKILL.md-316-323 (1)
`316-323`: _⚠️ Potential issue_ | _🟡 Minor_

**SPDX bulk-add script won't detect JSX-style SPDX headers.**
The check `head -1 "$f" | grep -q '^---'` only detects YAML frontmatter. Files that already have JSX comment SPDX headers
head -1 "$f" | grep -q '^---'only detects YAML frontmatter. Files that already have JSX comment SPDX headers ({/* SPDX... */}) — like several files in this very PR — would get a duplicate SPDX block prepended.fern/pages/diagrams/metrics-flow.md-48-48 (1)
`48-48`: _⚠️ Potential issue_ | _🟡 Minor_

**Stage numbering jumps from 2 to 4.**
Line 34 labels "Stage 2" and line 48 labels "Stage 4", with no Stage 3. This is likely a numbering error — should this be "Stage 3"?
Proposed fix
````diff
-    %% Stage 4: Summarize Function Processing
+    %% Stage 3: Summarize Function Processing
````

fern/pages/diagrams/metrics-flow.md-82-83 (1)
`82-83`: _⚠️ Potential issue_ | _🟡 Minor_

**Style classes reference undefined nodes `I1` and `F`.**
`class I1,G statistics` references `I1`, which is never defined (only `I2` exists). Similarly, `class E1,E2,E3,F,L transport` references `F`, which has no corresponding node. These appear to be stale references from a prior revision. Mermaid will silently ignore them, but they should be cleaned up to avoid confusion.

Proposed fix
````diff
-    class I1,G statistics
+    class G statistics
-    class E1,E2,E3,F,L transport
+    class E1,E2,E3,L transport
````

fern/pages/server-metrics/server-metrics-reference.md-158-158 (1)
`158-158`: _⚠️ Potential issue_ | _🟡 Minor_

**Broken internal link: the `#histogram-buckets` anchor does not exist.**
The link
`[Histogram Buckets](#histogram-buckets)` on line 158 does not resolve to any heading in this document. There is no `## Histogram Buckets` or similar heading. Consider either adding a dedicated "Histogram Buckets" section or updating this link to point to an existing section (e.g., the per-backend histogram bucket tables).
4-4:⚠️ Potential issue | 🟡 MinorMultiple
#(h1) headings will conflict with Fern's auto-generated title.Fern auto-generates an h1 from the navigation title in
next.yml, so page content should start at h2 (##). This file has four h1 headings (lines 4, 67, 202, 353) which will produce duplicate/conflicting h1 elements and may break the rendered table of contents.Downgrade these to h2:
Proposed fix
````diff
-# GPU Telemetry with AIPerf
+## GPU Telemetry with AIPerf
-# 1: Using Dynamo
+## 1: Using Dynamo
-# 2: Using Other Inference Server
+## 2: Using Other Inference Server
-# 3: Using pynvml (Local GPU Monitoring)
+## 3: Using pynvml (Local GPU Monitoring)
````

(And cascade all sub-headings down one level accordingly.)
Also applies to: 67-67, 202-202, 353-353
fern/pages/server-metrics/server-metrics-reference.md-166-168 (1)
166-168:⚠️ Potential issue | 🟡 Minor
`## Dynamo Frontend` is at the same heading level as its parent `## Detailed Metric Definitions`.
##(h2), making them siblings rather than children. They should be###(h3) to match the TOC hierarchy.This applies to all backend section headings: lines 168, 217, 273, 352, 452, and 478.
Proposed fix for each backend heading
````diff
-## Dynamo Frontend
+### Dynamo Frontend
-## Dynamo Component
+### Dynamo Component
-## vLLM
+### vLLM
-## SGLang
+### SGLang
-## TensorRT-LLM
+### TensorRT-LLM
-## KVBM (KV Block Manager)
+### KVBM (KV Block Manager)
````

(And cascade all sub-headings under these down one level:
`###` → `####`, etc.)

Based on learnings: the maintainer prefers h4 headings (####) for subsections under h2 headings for better visual sizing. If the backend sections become h3, their subsections (currently h3) would naturally become h4, which aligns with this preference.
.cursor/skills/docs-to-fern/SKILL_md-114-118 (1)
`114-118`: _⚠️ Potential issue_ | _🟡 Minor_

**Update Fern CLI version from `3.29.1` to `3.55.7`.**
3.55.7(released January 31, 2026). Update the version to avoid missing recent bug fixes, performance improvements, and security patches.
🧹 Nitpick comments (24)
fern/pages/tutorials/custom-prompt-benchmarking.md (1)
`41-41`: Replace JSX-style markers with HTML comments to fix MD037. The
`{/* ... */}` markers are parsed as emphasis with spaces, triggering MD037. Use HTML comments instead.

🔧 Suggested change
````diff
-{/* aiperf-run-vllm-default-openai-endpoint-server */}
+<!-- aiperf-run-vllm-default-openai-endpoint-server -->
 ...
-{/* /aiperf-run-vllm-default-openai-endpoint-server */}
+<!-- /aiperf-run-vllm-default-openai-endpoint-server -->
````

Also applies to: 66-66
fern/pages/reproducibility.md (2)
16-20: Remove blank line between blockquote declarations. Line 18 contains a blank line between two consecutive blockquote sections. Per Markdown best practices, consecutive blockquotes should not have blank lines between them if they're meant to be separate alert boxes.
📝 Proposed fix
```diff
 > [!IMPORTANT]
 > **Default behavior:** Without `--random-seed`, AIPerf produces **non-deterministic** results. Set `--random-seed <integer>` for reproducibility.
-
 > [!WARNING]
 > **Distributed System Constraints:** Even with `--random-seed`, **performance metrics and worker assignment are NOT reproducible** due to system non-determinism (network timing, async I/O, ZMQ load balancing).
```
60-60: Hyphenate compound adjective. The phrase "credit issuing strategy" should use a hyphen: "credit-issuing strategy" when the compound modifier precedes the noun.
📝 Proposed fix
```diff
-- TimingManager creates credit issuing strategy with RNG-based interval generator
+- TimingManager creates credit-issuing strategy with RNG-based interval generator
```

fern/pages/tutorials/goodput.md (1)
27-32: Pin the Docker image tag for reproducible setup steps. Using `:latest` makes the tutorial non-deterministic; it can silently break when the image changes. The vLLM documentation recommends pinning to a specific version tag.
✅ Suggested doc update
```diff
-docker pull vllm/vllm-openai:latest
-docker run --gpus all -p 8000:8000 vllm/vllm-openai:latest \
+docker pull vllm/vllm-openai:v0.11.2
+docker run --gpus all -p 8000:8000 vllm/vllm-openai:v0.11.2 \
```

fern/pages/tutorials/sharegpt.md (1)
37-48: Replace JSX-style comments to satisfy markdownlint.
`{/* ... */}` triggers MD037 in Markdown; use HTML comments instead.
💡 Proposed change
```diff
-{/* aiperf-run-vllm-default-openai-endpoint-server */}
+<!-- aiperf-run-vllm-default-openai-endpoint-server -->
@@
-{/* /aiperf-run-vllm-default-openai-endpoint-server */}
+<!-- /aiperf-run-vllm-default-openai-endpoint-server -->
```

fern/pages/tutorials/warmup.md (1)
14-26: Specify language identifiers for fenced code blocks. Multiple fenced code blocks lack language specifications, which can affect rendering and accessibility:

- Lines 14-26, 254-264: ASCII art diagrams (consider `text`)
- Lines 54-70, 122-138, 157-173, 194-210, 233-250: Sample output/logs (consider `console`, `text`, or `plaintext`)
📝 Example fixes for ASCII art and sample output
For ASCII art (lines 14-26):
-``` +```text Without warmup: With warmup:For sample output (lines 54-70):
-``` +```console INFO Starting AIPerf SystemAlso applies to: 54-70, 122-138, 157-173, 194-210, 233-250, 254-264
fern/pages/tutorials/embeddings.md (2)
37-37: Unusual comment syntax in Markdown. The JSX-style comments `{/* ... */}` wrapping code blocks are atypical for Markdown. If these are used by Fern's documentation system for code snippet extraction or tagging, this is fine. Otherwise, consider using standard Markdown comment syntax.
103-115: Consider adding metrics table to sample output. The sample output for custom inputs doesn't include a metrics table (unlike the synthetic inputs example on lines 51-71). Consider adding a sample metrics table here for consistency and to show users what results to expect.
fern/pages/tutorials/plot.md (1)
29-95: Consider adding language identifiers to sample output blocks. The sample output blocks at lines 30-40, 46-56, 71-80, and 86-94 are missing language identifiers. Adding `text` or `console` would improve syntax highlighting and accessibility.
````diff
 **Sample Output (Successful Run):**
-```
+```text
 INFO Loading single-run data from: artifacts/Qwen_Qwen3-0.6B-chat-concurrency10/
```
````

fern/pages/tutorials/custom-dataset.md (1)
90-95: Add language identifier to sample output block. The sample output block is missing a language identifier. Adding `text` or `console` improves rendering.
Example fix
````diff
 **Output:**
-```
+```text
 NVIDIA AIPerf | LLM Metrics
```
````

fern/pages/reference/tokenizer-auto-detection.md (1)
31-82: Add language identifiers to output example blocks. The output example blocks at lines 34-37, 40-60, and 63-82 are missing language identifiers. Adding `text` or `console` improves syntax highlighting consistency.
Example fix
````diff
 **Successful resolution:**
-```
+```text
 INFO ✓ Tokenizer Qwen/Qwen3-0.6B detected for qwen3-0.6b
```
````

fern/pages/tutorials/openai-text-endpoints.md (1)
31-68: Inconsistent code block formatting. Lines 35 and 67 use indented code block style, while the rest of the document uses fenced code blocks. The sample output blocks at lines 48-68 are also missing language identifiers.
Standardization suggestion
Ensure all code blocks use fenced style with language identifiers:
- Use `` ```bash `` for shell commands
- Use `` ```text `` for sample output

fern/pages/tutorials/timeslices.md (1)
62-76: Add language identifier to sample output block. The sample output block at lines 63-76 is missing a language identifier. Adding `text` improves consistency with other documentation.
Example fix
````diff
 **Sample Output (Successful Run):**
-```
+```text
 INFO Starting AIPerf System
```
````

fern/pages/comprehensive-llm-benchmarking.md (1)
83-120: Consider using `####` for subsections under `##` to match doc style. Multiple `###` subsections under `## Use Case 1` appear; the repo preference is to use h4 for these nested headings.
Based on learnings: In the aiperf repository's docs/metrics_reference.md file, the maintainer prefers using h4 headings (`####`) for subsections under h2 headings instead of h3 (`###`) for better visual sizing and readability, even though this violates markdownlint rule MD001.
docs/server-metrics/server-metrics-parquet-schema.md (1)
312-312: Vary repeated “For …” sentence openings for readability. Consider rephrasing to avoid three consecutive sentences starting with “For”.
✏️ Possible rewording
```diff
-*For aggregated statistics, see [JSON Schema](server-metrics-json-schema.md). For metric definitions, see [Server Metrics Reference](server-metrics-reference.md). For usage examples, see the [Server Metrics Tutorial](server-metrics.md).*
+*See the [JSON Schema](server-metrics-json-schema.md) for aggregated statistics, the [Server Metrics Reference](server-metrics-reference.md) for metric definitions, and the [Server Metrics Tutorial](server-metrics.md) for usage examples.*
```

docs/server-metrics/server-metrics-reference.md (1)
549-549: Vary repeated “For …” sentence openings for readability. Consider rephrasing to avoid three consecutive sentences starting with “For”.
✏️ Possible rewording
```diff
-*For detailed implementation and usage examples, see the [Server Metrics Tutorial](server-metrics.md). For aggregated statistics, see the [JSON Schema Reference](server-metrics-json-schema.md). For raw time-series analysis, see the [Parquet Schema Reference](server-metrics-parquet-schema.md).*
+*See the [Server Metrics Tutorial](server-metrics.md) for detailed implementation and usage examples, the [JSON Schema Reference](server-metrics-json-schema.md) for aggregated statistics, and the [Parquet Schema Reference](server-metrics-parquet-schema.md) for raw time-series analysis.*
```

fern/pages/server-metrics/server-metrics-parquet-schema.md (3)
82-87: Add a language specifier to the fenced code blocks. These ASCII-table examples lack a language tag, which triggers MD040 warnings. Using an explicit `text` identifier on the opening fence will satisfy the linter and may also render more predictably in Fern's MDX parser.
Line 82:
-``` +```textLine 91:
-``` +```textAlso applies to: 91-98
269-278: Redundant guard on Line 273. The `if i > 0` check on line 273 is always true because line 271-272 already returns when `i == 0`. The ternary fallback to `0` on line 274 is also unreachable. Not a bug, but slightly confusing for readers.
Simplification
```diff
 if count >= target:
     if i == 0:
         return le
-    prev_le = bounds[i-1] if i > 0 else 0
-    prev_count = counts[i-1] if i > 0 else 0
+    prev_le = bounds[i-1]
+    prev_count = counts[i-1]
     # Linear interpolation within bucket
     fraction = (target - prev_count) / (count - prev_count) if count > prev_count else 0
     return prev_le + fraction * (le - prev_le)
```
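To sanity-check the simplified interpolation logic, here is a self-contained sketch of the same technique (the function name, signature, and bucket data are illustrative, not the file's actual API):

```python
def histogram_percentile(bounds, counts, percentile):
    """Estimate a percentile from cumulative histogram buckets.

    bounds: upper edges ("le") of each bucket, ascending.
    counts: cumulative observation counts per bucket (Prometheus-style).
    """
    total = counts[-1]
    target = percentile / 100.0 * total
    for i, (le, count) in enumerate(zip(bounds, counts)):
        if count >= target:
            if i == 0:
                return le
            prev_le = bounds[i - 1]
            prev_count = counts[i - 1]
            # Linear interpolation within the bucket
            fraction = (target - prev_count) / (count - prev_count) if count > prev_count else 0
            return prev_le + fraction * (le - prev_le)
    return bounds[-1]
```

With buckets `[1.0, 2.0, 4.0]` and cumulative counts `[10, 30, 40]`, the p50 target of 20 observations falls halfway through the second bucket, giving an estimate of 1.5.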
1-2: SPDX copyright year says 2025, but other files in this PR use `2025-2026`. The SPDX header here uses `Copyright (c) 2025` while other new files (e.g., `metrics-flow.md`, the SKILL file) use `2025-2026`. Consider aligning the year range for consistency across the new docs.

.cursor/skills/docs-to-fern/SKILL.md (1)
605-627: Link auditor and cross-repo detector scripts use GNU-specific flags.
`grep -oP` (PCRE mode) and `realpath -m` (line 866) are GNU-specific and won't work on macOS with default BSD tools. Since this is a skill doc for internal use, this may be fine — but a brief note about requiring GNU coreutils/grep would help avoid confusion.

fern/pages/tutorials/gpu-telemetry.md (2)
160-193: Add language specifiers to sample output code blocks. The console/table output blocks at lines 160 and 500 lack a language tag (MD040). Use `text` or `console` for these display-only blocks.
Also applies to: 500-521
542-755: Consider truncating the JSON export example. This 213-line JSON blob is quite long for a tutorial page. Consider showing only a representative subset (e.g., 2-3 metrics) with a comment like `// ... additional metrics omitted ...` and a note that the full output is saved to the export file.

fern/pages/diagrams/metrics-flow.md (1)
1-4: SPDX copyright year `2025` — consider updating to `2025-2026` for consistency. Same note as `server-metrics-parquet-schema.md`: other new files in this PR use the `2025-2026` range.

fern/pages/server-metrics/server-metrics-reference.md (1)
71-77: Add language specifiers to fenced code blocks. The latency formula block (line 71) and the histogram bucket list (line 396) lack language tags. Use `text` for these display-only blocks to satisfy MD040.
Proposed fix
Line 71:

````diff
-```
+```text
````

Line 396:

````diff
-```
+```text
````

Also applies to: 396-398
Force-pushed from 6864d56 to f159bd8
Actionable comments posted: 3
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
🤖 Fix all issues with AI agents
In `@fern/pages/genai-perf-feature-comparison.md`:
- Line 137: The table cell for "Arrival Smoothness" uses a backslash-escaped
less-than (`\<1`) which can break Fern's MDX rendering; update the cell content
so the Gamma distribution shape reads `&lt;1=bursty, 1=Poisson, &gt;1=smooth`,
using HTML entities instead of backslash escapes for the `<` and `>` characters
(affecting the line with `--arrival-smoothness` / `--vllm-burstiness` in
genai-perf-feature-comparison.md); apply the same replacement pattern across
other docs referencing CLI options (e.g., cli-options.md) and server-metrics
files to ensure consistent MDX-safe rendering.
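The entity substitution described in the prompt above is a plain string replacement. A hedged sketch of what the fix amounts to (the function name is hypothetical; only the two escape sequences named in the prompt are handled):

```python
def escape_mdx_angles(cell: str) -> str:
    """Replace backslash-escaped angle brackets with MDX-safe HTML entities."""
    return cell.replace(r"\<", "&lt;").replace(r"\>", "&gt;")
```

Applied to the "Arrival Smoothness" cell, this turns `\<1=bursty, 1=Poisson, \>1=smooth` into `&lt;1=bursty, 1=Poisson, &gt;1=smooth`.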
In `@fern/pages/tutorials/arrival-patterns.md`:
- Around line 155-225: The sample output blocks after the initial bash command
blocks are not closed or language-marked, causing Markdown lint errors
(MD040/MD046); locate the example sections around the aiperf profile command
lines (the Run 1 and Run 2 blocks) and close the opening ```bash fence
immediately after the command invocation, then open a separate fenced block with
language "text" for each "**Expected Output (Run X):**" multi-line output;
specifically ensure you insert a closing ``` after the aiperf profile lines and
replace the following -``` / ``` fences with ```text ... ``` for both the Run 1
and Run 2 expected output blocks so each output is properly fenced and
language-tagged.
In `@tools/generate_cli_docs.py`:
- Line 50: The mkdocs navigation still points to the old filename; update the
mkdocs.yml nav entry to match the new OUTPUT_FILE name by changing the reference
from cli_options.md to cli-options.md so the docs build serves the generated
file produced by OUTPUT_FILE in generate_cli_docs.py.
🟡 Minor comments (27)
fern/pages/tutorials/timeslices.md-82-116 (1)
82-116: ⚠️ Potential issue | 🟡 Minor
**Make output file paths consistent with the earlier example.** The sample output shows files under `artifacts/<run>/...`, but the “Output Files” section omits the run directory. This is confusing for users trying to locate artifacts. Please align both sections (either include the run directory or state it’s implied).

fern/pages/tutorials/timeslices.md-12-12 (1)
12-12: ⚠️ Potential issue | 🟡 Minor
**Hyphenate “equal-duration” for grammar.** Line 12 reads “equal duration segments”; hyphenation is preferred here.

fern/pages/tutorials/timeslices.md-63-76 (1)
63-76: ⚠️ Potential issue | 🟡 Minor
**Add a language identifier to the sample output fence.** The fenced block under “Sample Output” lacks a language tag, triggering MD040. Consider `text` or `log`.

fern/pages/tutorials/sequence-distributions.md-103-127 (1)
103-127: ⚠️ Potential issue | 🟡 Minor
**Add a language to the sample output fence.** MD040 flags the sample output block for a missing fence language. Use `text` (or `console`) to satisfy linting.
🔧 Suggested fix
````diff
-**Sample Output (Successful Run):**
-```
+**Sample Output (Successful Run):**
+```text
 INFO Starting AIPerf System
 INFO Using sequence distribution: 70% (ISL~N(64,10), OSL~N(32,8)), 20% (ISL~N(256,40), OSL~N(128,20)), 10% (ISL~N(1024,100), OSL~N(512,50))
 INFO AIPerf System is PROFILING
@@
 JSON Export: artifacts/Qwen_Qwen3-0.6B-chat-concurrency1/profile_export_aiperf.json
```
````

fern/pages/reproducibility.md-60-62 (1)
60-62: ⚠️ Potential issue | 🟡 Minor
**Hyphenate “in-memory” as an adjective.** Improves grammar and readability.
📝 Proposed fix

```diff
-- DatasetManager pre-generates complete dataset using derived RNGs and stores in memory
+- DatasetManager pre-generates complete dataset using derived RNGs and stores in-memory
```

fern/pages/tutorials/working-with-profile-exports.md-128-152 (1)
128-152: ⚠️ Potential issue | 🟡 Minor
**Clarify metrics behavior for failed requests.**
The text says metrics are “always null for failed requests” (Line 128-129), but the failed example includes a metrics object with `error_isl` (Line 150-152). Please reconcile this to avoid reader confusion.
✂️ Possible wording update
```diff
-See the [Complete Metrics Reference](../metrics-reference.md) page for a list of all metrics and their descriptions. Will always be null for failed requests.
+See the [Complete Metrics Reference](../metrics-reference.md) page for a list of all metrics and their descriptions. For failed requests, metrics may be null or include error-related metrics depending on the failure mode.
```

fern/pages/tutorials/working-with-profile-exports.md-96-105 (1)
96-105: ⚠️ Potential issue | 🟡 Minor
**Fix duplicate JSON key in example.**
The successful-record JSON example includes `time_to_first_token` twice (Line 98 and Line 101), which makes the sample invalid/ambiguous. Keep the correct field once to avoid copy-paste errors.
✂️ Suggested fix
```diff
 "metrics": {
     "input_sequence_length": {"value": 550, "unit": "tokens"},
-    "time_to_first_token": {"value": 255.88656799999998, "unit": "ms"},
     "request_latency": {"value": 297.52522799999997, "unit": "ms"},
     "output_token_count": {"value": 9, "unit": "tokens"},
     "time_to_first_token": {"value": 4.8984369999999995, "unit": "ms"},
```

fern/pages/diagrams/metrics-flow.md-82-83 (1)
82-83: ⚠️ Potential issue | 🟡 Minor
**Remove undefined node references from class applications.**
Lines 82 and 83 reference nodes `I1` and `F` in class applications, but these nodes are not defined anywhere in the diagram. These are likely leftover from an earlier version or typos.
- class I1,G statistics + class G statistics - class E1,E2,E3,F,L transport + class E1,E2,E3,L transportfern/pages/diagrams/metrics-flow.md-11-48 (1)
11-48: ⚠️ Potential issue | 🟡 Minor
**Fix the stage numbering inconsistency.**
The diagram jumps from "Stage 1: Distributed Record Processing" (line 11) to "Stage 2: Centralized Results Processing" (line 34) to "Stage 4: Summarize Function Processing" (line 48). Stage 3 is missing, which could confuse readers following the pipeline flow.
📝 Proposed fix: renumber Stage 4 to Stage 3

```diff
-    %% Stage 4: Summarize Function Processing
+    %% Stage 3: Summarize Function Processing
     L --> I2["Summarize Function<br/>summarize()<br/><em>(Process all collected results)</em>"]
```

fern/pages/tutorials/goodput.md-2-2 (1)
2-2: ⚠️ Potential issue | 🟡 Minor
**Copyright year range may be outdated.** This file uses `2024-2025` while other new files in this PR use `2025-2026` or `2026`. Consider updating for consistency.

fern/pages/tutorials/fixed-schedule.md-139-163 (1)
139-163: ⚠️ Potential issue | 🟡 Minor
**Sample output entry count is inconsistent with the input data.**
The schedule defined on Lines 60-71 has entries at timestamps 2000, 2500, 3000, and 4000ms. With `--fixed-schedule-start-offset 2000 --fixed-schedule-end-offset 4000`, the filtered set should contain 3–4 entries (depending on boundary inclusivity), not 2. The progress bar (2/2) and throughput also reflect this incorrect count, which could confuse readers trying to reproduce the example.

fern/pages/tutorials/audio.md-85-92 (1)
85-92: ⚠️ Potential issue | 🟡 Minor
**Sample output contains a local username/path that should be anonymized.** The output references `/home/lkomali/aiperf/artifacts/...`, which exposes an internal username. This appears again in Lines 149-156. Consider replacing with a generic path like `/home/user/aiperf/artifacts/...` or using relative paths like `artifacts/...` to match other tutorials in this PR.

docs/tutorials/sglang-image-generation.md-237-249 (1)
237-249: ⚠️ Potential issue | 🟡 Minor
**Add alt text to images for accessibility.**
The static analysis tool correctly flags that these images lack alt text (MD045). Adding descriptive alt text improves accessibility and provides context when images fail to load.
Proposed fix
Add descriptive alt text inside each `![](...)` image reference.

fern/pages/tutorials/local-tokenizer.md-82-89 (1)
82-89: ⚠️ Potential issue | 🟡 Minor
**Sample output table is missing the top border row.** Other tutorial pages in this PR include the full table border (`┏━━━...┓` header line). This output block is missing it, making the rendering look broken/inconsistent.
NVIDIA AIPerf | LLM Metrics +┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓ ┃ Metric ┃ avg ┃ min ┃ max ┃ p99 ┃ p50 ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━┩fern/pages/tutorials/ui-types.md-55-55 (1)
55-55:⚠️ Potential issue | 🟡 MinorAdd language identifier to fenced code block.
The code block at line 55 lacks a language identifier, which prevents proper syntax highlighting. Based on the content (terminal output), use
textorconsole.📝 Proposed fix
**Note:** Dashboard automatically switches to `simple` when using `--verbose` or `--extra-verbose` in a TTY for better log visibility. ## Simple Lightweight progress bars using TQDM: ```bash aiperf profile \ --model Qwen/Qwen3-0.6B \ --url localhost:8000 \ --endpoint-type chat \ --concurrency 10 \ --request-count 100 \ --streaming \ --ui-type simpleSample Output (Successful Run):
-+text
INFO Starting AIPerf System</details> </blockquote></details> <details> <summary>fern/pages/tutorials/ui-types.md-108-108 (1)</summary><blockquote> `108-108`: _⚠️ Potential issue_ | _🟡 Minor_ **Add language identifier to fenced code block.** The code block at line 108 lacks a language identifier. Based on the content (terminal output with timestamps), use `text` or `console`. <details> <summary>📝 Proposed fix</summary> ```diff aiperf profile \ --model Qwen/Qwen3-0.6B \ --url localhost:8000 \ --endpoint-type chat \ --concurrency 10 \ --request-count 100 \ --streaming \ --ui-type noneSample Output (Successful Run):
-+text
23:07:28.809795 INFO Starting AIPerf System</details> </blockquote></details> <details> <summary>fern/pages/tutorials/multi-turn.md-402-406 (1)</summary><blockquote> `402-406`: _⚠️ Potential issue_ | _🟡 Minor_ **`--conversation-turn-delay-ratio` is mentioned but never documented.** Line 405 introduces `--conversation-turn-delay-ratio` as a control for turn delays, but this flag doesn't appear in the Core Parameters section (lines 41–69) or the Quick Reference (lines 420–440). Either document it alongside the other delay parameters, or remove it from this list if it's not a user-facing option. </blockquote></details> <details> <summary>fern/pages/benchmark-modes/trace-replay.md-47-52 (1)</summary><blockquote> `47-52`: _⚠️ Potential issue_ | _🟡 Minor_ **`hash_ids` listed as required but described as optional.** Line 47 introduces these as "Required fields for trace replay" but line 51 says `hash_ids` is "(optional)". This is contradictory — either move `hash_ids` to a separate "Optional fields" section, or remove the "(optional)" qualifier. </blockquote></details> <details> <summary>fern/pages/tutorials/custom-prompt-benchmarking.md-105-115 (1)</summary><blockquote> `105-115`: _⚠️ Potential issue_ | _🟡 Minor_ **`### Use Cases` should be `## Use Cases` to fix heading hierarchy.** This is a top-level section of the document, not a subsection of "Running the Benchmark." Static analysis also flags this as MD001 (heading increment skipped). <details> <summary>Proposed fix</summary> ```diff -### Use Cases +## Use Casesfern/pages/tutorials/custom-prompt-benchmarking.md-41-43 (1)
41-43:⚠️ Potential issue | 🟡 MinorLine 42 will render as an unintended h1 heading.
The line
# Create an input file with specific text inputssits between the JSX comment and the code fence, so it will be rendered as a Markdown h1 heading in the doc. It looks like it was meant to be a code comment or descriptive text.Move it inside the code block (it's already duplicated as a comment on line 44), or convert it to a plain paragraph.
Proposed fix
{/* aiperf-run-vllm-default-openai-endpoint-server */} -# Create an input file with specific text inputs ```bash # Create an input file to use for benchmarkingfern/pages/tutorials/openai-text-endpoints.md-144-164 (1)
144-164:⚠️ Potential issue | 🟡 MinorTrailing blank lines inside fenced code blocks.
Lines 150–151 and 162–163 have blank lines before the closing
```fence, which will render as empty trailing lines inside the code blocks.Proposed fix
cat <<EOF > inputs.jsonl {"texts": ["How are you?"]} {"texts": ["Give me a poem."]} EOF -Run AIPerf against the Completions endpoint using the custom input file:
aiperf profile \ --model Qwen/Qwen3-0.6B \ --endpoint-type completions \ --endpoint /v1/completions \ --input-file inputs.jsonl \ --custom-dataset-type single_turn \ --url localhost:8000 \ --request-count 10 -</details> </blockquote></details> <details> <summary>fern/pages/tutorials/sglang-image-generation.md-224-249 (1)</summary><blockquote> `224-249`: _⚠️ Potential issue_ | _🟡 Minor_ **Filename mismatch between the script output and the "View the generated images" section.** The `extract_images.py` script (line 207) generates filenames with the pattern `image_{line_num:04d}_{data_idx:02d}.jpg` (e.g., `image_0001_00.jpg`), but lines 237, 243, and 249 reference filenames with three numeric segments (`image_0001_00_00.jpg`, `image_0002_00_00.jpg`, `image_0003_00_00.jpg`), which the script never produces. Additionally, the sample output (lines 226–228) shows all images from `line_num=1` (`image_0001_00.jpg` through `image_0001_02.jpg`), implying a single JSONL record with 3 response images. For 3 separate prompts (one per line), the expected output would be `image_0001_00.jpg`, `image_0002_00.jpg`, `image_0003_00.jpg`. <details> <summary>Proposed fix for the viewing section filenames</summary> ```diff Prompt:{"text": "A serene mountain landscape at sunset"}
-*Generated image: image_0001_00_00.jpg* +*Generated image: image_0001_00.jpg* Prompt:{"text": "A futuristic city with flying cars"}
-*Generated image: image_0002_00_00.jpg* +*Generated image: image_0002_00.jpg* Prompt:{"text": "A cute robot playing with a kitten"}
-*Generated image: image_0003_00_00.jpg* +*Generated image: image_0003_00.jpg*And update the sample output to match one-image-per-prompt:
**Output:**-Extracted: /path/to/extracted_images/image_0001_00.jpg
-Extracted: /path/to/extracted_images/image_0001_01.jpg
-Extracted: /path/to/extracted_images/image_0001_02.jpg
+Extracted: /path/to/extracted_images/image_0001_00.jpg
+Extracted: /path/to/extracted_images/image_0002_00.jpg
+Extracted: /path/to/extracted_images/image_0003_00.jpgfern/pages/server-metrics/server-metrics-parquet-schema.md-82-98 (1)
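The naming mismatch flagged above becomes obvious when the pattern is pinned down. A tiny helper mirroring the f-string the review quotes from `extract_images.py` (the helper itself is hypothetical):

```python
def image_filename(line_num: int, data_idx: int) -> str:
    # Two-segment pattern from the review: JSONL line number, then image index.
    return f"image_{line_num:04d}_{data_idx:02d}.jpg"
```

With one prompt per JSONL line and one image per response, the expected files are `image_0001_00.jpg`, `image_0002_00.jpg`, and `image_0003_00.jpg`; a three-segment name like `image_0001_00_00.jpg` can never be produced.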
82-98: ⚠️ Potential issue | 🟡 Minor
**Add `text` language to table example fences**
Unlabeled fenced blocks trigger MD040 and reduce readability.✅ Suggested fix
-``` +```text endpoint_url | metric_name | metric_type | unit | description |timestamp_ns | model_name | value | sum | count | bucket_le | bucket_count ...```diff -``` +```text endpoint_url | metric_name | metric_type | unit | description | timestamp_ns | model_name | value | sum | count | bucket_le | bucket_count ...</details> </blockquote></details> <details> <summary>fern/pages/reference/tokenizer-auto-detection.md-33-82 (1)</summary><blockquote> `33-82`: _⚠️ Potential issue_ | _🟡 Minor_ **Add languages to fenced output blocks (`text`)** These output examples are unlabeled, triggering MD040 and reducing readability. <details> <summary>✅ Suggested fix</summary> ```diff -``` +```text INFO ✓ Tokenizer Qwen/Qwen3-0.6B detected for qwen3-0.6b INFO 1 tokenizer validated • 1 resolved • 0.3s```diff -``` +```text ╭──────────────────────────────── Ambiguous Tokenizer Name ─────────────────────────────────╮ ... ╰───────────────────────────────────────────────────────────────────────────────────────────╯```diff -``` +```text ╭───────────────────────────────── Gated Repository ──────────────────────────────────╮ ... ╰─────────────────────────────────────────────────────────────────────────────────────╯</details> </blockquote></details> <details> <summary>fern/pages/tutorials/http-trace-metrics.md-371-373 (1)</summary><blockquote> `371-373`: _⚠️ Potential issue_ | _🟡 Minor_ **Broken links: relative source-code paths won't resolve in the Fern docs site.** The Fern-generated site doesn't serve files from the repo's `src/` directory, so `../../src/aiperf/...` links will 404. 
Replace these with absolute GitHub URLs pointing to the source files on the `main` branch: <details> <summary>📝 Suggested fix</summary> ```diff -- [Source: trace_models.py](../../src/aiperf/common/models/trace_models.py) - Trace data model definitions -- [Source: http_trace_metrics.py](../../src/aiperf/metrics/types/http_trace_metrics.py) - HTTP trace metric implementations +- [Source: trace_models.py](https://github.com/ai-dynamo/aiperf/blob/main/src/aiperf/common/models/trace_models.py) - Trace data model definitions +- [Source: http_trace_metrics.py](https://github.com/ai-dynamo/aiperf/blob/main/src/aiperf/metrics/types/http_trace_metrics.py) - HTTP trace metric implementationsfern/pages/server-metrics/server-metrics-reference.md-488-499 (1)
488-499:⚠️ Potential issue | 🟡 MinorAmbiguous
`h2d` abbreviation in KVBM transfer patterns.
h2dis used for two different meanings:
kvbm_offload_blocks_h2d→ Host to Disk (line 491)kvbm_onboard_blocks_h2d→ Host to Device (line 493)The "Block transfer patterns" summary (lines 496–499) lists
h2dtwice with contradictory definitions. Consider adding a brief clarifying note (e.g., "In offload context,d= disk; in onboard context,d= device") or restructuring the summary to disambiguate:📝 Suggested clarification
**Block transfer patterns:** -- **d2d**: Device ↔ Disk (direct, fast path) -- **d2h**: Device → Host (offload to CPU memory) -- **h2d**: Host → Device (onboard from CPU memory) -- **h2d** (disk): Host → Disk (persist to storage) +- **d2d**: Device ↔ Disk (direct, bypassing host memory) +- **d2h**: Device → Host (offload to CPU memory) +- **h2d** (onboard): Host → Device (onboard from CPU memory to GPU) +- **h2d** (offload): Host → Disk (persist from CPU memory to storage)fern/pages/genai-perf-feature-comparison.md-14-14 (1)
14-14: ⚠️ Potential issue | 🟡 Minor
**Inconsistent formatting in legend.**
The "Not Applicable" entry uses both backticks and bold formatting (
**N/A**), while other legend entries use only bold. This creates visual inconsistency.📝 Proposed fix for consistent formatting
-- **`N/A`** **Not Applicable** - Feature not applicable +- ❌ **Not Applicable (N/A)** - Feature not applicableAlternatively, if you want to keep the N/A as code-styled:
```diff
-- **`N/A`** **Not Applicable** - Feature not applicable
+- ❌ **Not Applicable (N/A)** - Feature not applicable
```

Alternatively, if you want to keep the N/A as code-styled:

```diff
-- **`N/A`** **Not Applicable** - Feature not applicable
+- `N/A` **Not Applicable** - Feature not applicable
```
🧹 Nitpick comments (22)
fern/pages/plugins/creating-your-first-plugin.md (1)
260-260: Consider expanding the test fixtures guidance. The tutorial mentions creating fixtures in `conftest.py` but doesn't provide concrete examples. Consider adding a brief code snippet showing how to create `mock_model_endpoint`, `mock_request_info`, and `mock_response` fixtures to make the testing section more actionable.
📝 Example fixture structure
```python
# tests/conftest.py
import pytest
from unittest.mock import Mock

@pytest.fixture
def mock_model_endpoint():
    """Create a mock ModelEndpoint for testing."""
    endpoint = Mock()
    endpoint.primary_model_name = "test-model"
    endpoint.endpoint.streaming = False
    return endpoint

@pytest.fixture
def mock_request_info():
    """Create a mock RequestInfo with sample data."""
    # Return minimal RequestInfo structure
    ...
```
55-55: Minor: Add comma after year for clarity. Per common style guides, a comma should follow the year when using month-day-year format.
📝 Suggested fix
```diff
-**Note**: This was a demo endpoint used for the November 13, 2025 presentation. The cluster has been taken down.
+**Note**: This was a demo endpoint used for the November 13, 2025, presentation. The cluster has been taken down.
```

fern/pages/tutorials/huggingface-tgi.md (4)
15-15: Optional: Consider hyphenating compound adjective. The phrase "full text completion" could be hyphenated as "full-text completion" when used as a compound adjective modifying a noun.
80-80: Format CLI flag as inline code. The `--input-file` option should be formatted with backticks for consistency with other CLI flags in the document.
-You can also provide your own text prompts using the ---input-file option. +You can also provide your own text prompts using the `--input-file` option.
55-75: Add language specifier to code block.The output code block should specify a language for better rendering and accessibility. Use
textorconsolefor terminal output.📝 Proposed fix
**Sample Output (Successful Run):** -``` +```text INFO Starting AIPerf System
117-139: Add language specifier to code block. Similar to the earlier output example, this code block should specify a language for better rendering and accessibility. Use `text` or `console` for terminal output.
📝 Proposed fix

````diff
 **Sample Output (Successful Run):**
-```
+```text
 INFO Starting AIPerf System
```
````

fern/pages/tutorials/prefill-concurrency.md (2)
14-25: Add language identifiers to fenced code blocks for proper rendering. Multiple code blocks containing ASCII diagrams and sample terminal outputs are missing language identifiers. While the static analysis tool flags these as warnings, they are valid issues that affect documentation rendering and syntax highlighting.
Proposed fix for ASCII diagrams and sample outputs
For ASCII art diagrams (lines 14-25, 33-52, 172-181), add a `text` identifier:

````diff
-```
+```text
 Request Lifecycle
```
````

For sample terminal outputs (lines 100-121, 147-168, 202-224), add a `text` or `console` identifier:

````diff
-```
+```text
 INFO Starting AIPerf System
```
````

Also applies to: 33-52, 100-121, 147-168, 172-181, 202-224
56-60: Fix blockquote structure with blank line separator. The blank line at line 58 between two blockquotes creates improper Markdown structure. Either merge the blockquotes or separate them completely.
Proposed fix
Option 1 - Merge into single blockquote:
```diff
 > [!IMPORTANT]
 > Requires `--streaming` to be enabled. Without streaming, AIPerf can't detect when the first token arrives.
-
+>
 > [!WARNING]
 > **Coordinated omission trade-off:** When requests wait for prefill slots...
```
> [!IMPORTANT] > Requires `--streaming` to be enabled. Without streaming, AIPerf can't detect when the first token arrives. > [!WARNING] > **Coordinated omission trade-off:** When requests wait for prefill slots...fern/pages/tutorials/plot.md (2)
30-40: Add language identifiers to sample output code blocks.Multiple sample terminal output blocks are missing language identifiers, which affects proper rendering and syntax highlighting.
Proposed fix
Add
textorconsoleidentifier to sample output blocks:-``` +```text INFO Loading single-run data from: artifacts/Qwen_Qwen3-0.6B-chat-concurrency10/Also applies to: 46-56, 71-80, 86-94, 424-433
251-255: Fix blockquote structures with blank line separators.Multiple locations have blank lines between consecutive blockquotes (lines 253, 372, 439, 442, 445, 448), creating improper Markdown structure. Ensure blockquotes are either merged or properly separated.
Example fix for lines 251-255
> [!NOTE] > When experiment classification is enabled, all multi-run plots automatically group by `experiment_group` (directory name) to preserve individual treatment variants with semantic baseline/treatment colors. - +> > [!TIP] > See the CONFIGURATION GUIDE section in `~/.aiperf/plot_config.yaml` for detailed customization options.Also applies to: 370-374, 437-450
fern/pages/tutorials/synthetic-video.md (1)
66-90: Add language identifiers to sample output code blocks.Sample output blocks at lines 66-90 and sections marked at lines 105, 163, and 348 are missing language identifiers.
Proposed fix
Add
textidentifier to sample output blocks:**Sample Output (Successful Run):** -``` +```text INFO Starting AIPerf SystemAlso applies to: 105-106, 163-164, 348-349
fern/pages/tutorials/request-rate-concurrency.md (2)
27-41: Fix blockquote structures with blank line separators.Blank lines at lines 29 and 32 within blockquote sequences create improper Markdown structure. Merge related blockquotes or ensure proper separation.
Proposed fix
> [!IMPORTANT] > **No catch-up behavior**: When the concurrency limit is reached, the system does not attempt to "catch up" by issuing requests faster once slots free up. The schedule continues at the configured rate. - +> > [!TIP] > **Sustaining max concurrency**: If your request rate is faster than your server's average response time... - +> > [!NOTE] > **Ramp-up time formula**: `ramp_up_time = concurrency / request_rate`
96-115: Add language identifiers to sample output code blocks.Sample output blocks at lines 96-115 and 140-159 are missing language identifiers.
Proposed fix
**Sample Output (Successful Run):** -``` +```text INFO Starting AIPerf SystemAlso applies to: 140-159
fern/pages/tutorials/rankings.md (1)
52-71: Add language identifier to sample output code block.The sample output block at lines 52-71 is missing a language identifier for proper rendering.
Proposed fix
**Sample Output (Successful Run):** -``` +```text INFO Starting AIPerf Systemfern/pages/tutorials/custom-dataset.md (1)
89-131: Add language identifiers to sample output code blocks.Sample output blocks at lines 89-131, 170-210, and 264-305 are missing language identifiers for proper rendering.
Proposed fix
**Output:** -``` +```text NVIDIA AIPerf | LLM MetricsAlso applies to: 170-210, 264-305
fern/pages/tutorials/gpu-telemetry.md (3)
1-2: Use standard Markdown comment syntax.The file uses non-standard
{/* ... */}comment syntax. For Markdown files, use standard HTML comment syntax<!-- ... -->.Proposed fix
-{/* # SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -# SPDX-License-Identifier: Apache-2.0 */} +<!-- SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +SPDX-License-Identifier: Apache-2.0 -->
44-55: Fix blockquote structures with blank line separators.Blank lines at lines 50 and 53 between blockquote sections create improper Markdown structure. Merge related blockquotes or ensure proper separation.
Proposed fix
> [!IMPORTANT] > **DCGM mode (default):** The default endpoints `http://localhost:9400/metrics` and `http://localhost:9401/metrics` are always attempted... -> +> > **pynvml mode:** When using `--gpu-telemetry pynvml`, DCGM endpoints are NOT used... -> +> > To completely disable GPU telemetry collection, use `--no-gpu-telemetry`. - +> > [!NOTE] > When specifying custom DCGM exporter URLs, the `http://` prefix is optional... - +> > [!TIP] > For simple local GPU monitoring without DCGM setup, use `--gpu-telemetry pynvml`...
159-193: Add language identifiers to sample output code blocks.Sample output blocks at lines 159-193, 498-521, and 524-538 are missing language identifiers for proper rendering.
Proposed fix
For terminal output:
**Sample Output (Successful Run):** -``` +```text INFO Starting AIPerf SystemFor CSV output:
## Example CSV Export -``` +```csv Endpoint,GPU_Index,GPU_Name,GPU_UUID,Metric,avg,min,max...Also applies to: 498-521, 524-538
fern/pages/tutorials/multi-turn.md (1)
2-2: Copyright year range inconsistency.This file uses
2024-2025while most other new files in this PR use2025-2026. Consider updating for consistency.fern/pages/tutorials/prefix-synthesis.md (1)
1-3: SPDX header format differs from other files.This file uses a JSX comment (
{/* ... */}) for the SPDX header, while all other new docs in this PR use YAML frontmatter (---delimiters). Consider using the standard frontmatter format for consistency.Proposed fix
-{/* # SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -# SPDX-License-Identifier: Apache-2.0 */} +--- +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +--- + # Prefix Data Synthesis Tutorialfern/pages/server-metrics/server-metrics-reference.md (1)
166-168: Heading level inconsistency: backend sections should be###under "Detailed Metric Definitions".Line 166 introduces
## Detailed Metric Definitionsas a parent section, and the TOC (lines 13–18) indents the backends as children. However,Dynamo Frontend(line 168),Dynamo Component(line 217),vLLM(line 273), etc. are all##— the same level as their parent — so they appear as siblings in the document outline rather than subsections.If you want them nested under "Detailed Metric Definitions", change each to
###. Alternatively, if flat h2 is intentional for visual weight in the rendered Fern site, consider removing the "Detailed Metric Definitions" h2 header (or making it a non-heading intro paragraph) to avoid the structural mismatch.fern/pages/tutorials/http-trace-metrics.md (1)
34-45: Add language identifiers to fenced code blocks for linter compliance.Several fenced code blocks (the lifecycle diagram at line 34, formulas at lines 88/98/106, and the streaming example at line 137) lack a language specifier. Using
textwould satisfy the markdownlint MD040 rule while preserving readability.📝 Example fix (line 34)
-``` +```text Request Lifecycle ──────────...
Converting PR #676 to Config-Only Fern Setup

PR: ai-dynamo/aiperf#676 -- docs: Add initial fern docs

Background: Two Ways to Set Up Fern

When adding Fern to a repo that already has a docs/ directory, there are two ways to wire it up: Approach A copies content into fern/pages/, while Approach B keeps fern/ config-only and points the navigation at the existing docs/.

| | Approach A (Copy) | Approach B (Config-Only) |
|---|---|---|
| Files in PR | 85 (60+ duplicates) | ~25 (config + in-place fixes) |
| Doc edits | Must update both docs/ and fern/pages/ | Edit docs/ once |
| Drift risk | docs/ and fern/pages/ can go out of sync | Impossible -- single source |
| Versioning | Unclear where old versions go | Clean: fern/versions/vX/docs/ snapshots from release branches |
| GitHub rendering | docs/ still visible on GitHub but may diverge from site | docs/ IS the site content |
| PR review burden | Reviewers see every file twice | Reviewers see only real changes |
Fern supports both approaches -- it just needs a relative path from the version YAML to the Markdown file. It does not require content to live inside fern/.
What Needs to Change
The goal is to convert PR #676 from Approach A to Approach B.
Net effect: Remove ~60 duplicated files, shrinking the diff from 85 files to ~25 files.
Step-by-Step Changes
1. Delete fern/pages/ Entirely
```bash
git rm -r fern/pages/
```

This removes all 60+ copied files. Content is already in docs/.
2. Update fern/versions/next.yml
Every path: entry needs to change from ../pages/* to ../../docs/*.
Before:
```yaml
- page: Welcome to AIPerf Documentation
  path: ../pages/index.md
- page: AIPerf Metrics Reference
  path: ../pages/metrics-reference.md
- section: Tutorials
  contents:
    - page: Warmup Phase Configuration
      path: ../pages/tutorials/warmup.md
```

After:
```yaml
- page: Welcome to AIPerf Documentation
  path: ../../docs/index.md
- page: AIPerf Metrics Reference
  path: ../../docs/metrics-reference.md
- section: Tutorials
  contents:
    - page: Warmup Phase Configuration
      path: ../../docs/tutorials/warmup.md
```

The `page:` titles and section structure stay exactly the same. Only the `path:` values change.
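The rewrite can also be scripted; a minimal sketch that mirrors the sed command below and reports how many paths it touched (assumes every nav entry uses the `path: ../pages/...` form):

```python
import re

def rewrite_paths(nav_text: str) -> tuple[str, int]:
    """Point Fern nav entries at docs/ instead of the deleted fern/pages/."""
    return re.subn(r"path: \.\./pages/", "path: ../../docs/", nav_text)

before = "- page: Welcome to AIPerf Documentation\n  path: ../pages/index.md\n"
after, count = rewrite_paths(before)
print(after)   # path now reads ../../docs/index.md
print(count)   # 1
```

Read `fern/versions/next.yml`, pass its text through this, and write it back; a count of zero means nothing matched and the file should be inspected by hand.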
Quick sed command (verify the output before committing):
```bash
sed -i 's|path: \.\./pages/|path: ../../docs/|g' fern/versions/next.yml
```

3. Keep All docs/ Modifications
The PR already makes in-place changes to docs/ files (MDX fixes, SPDX frontmatter, H1 removal). These changes are still needed and should remain in the PR:
- `docs/api/synthesis.md` -- modified
- `docs/benchmark-datasets.md` -- renamed from `benchmark_datasets.md`
- `docs/benchmark-modes/timing-modes-reference.md` -- renamed
- `docs/benchmark-modes/trace-replay.md` -- renamed
- `docs/cli-options.md` -- renamed from `cli_options.md`
- `docs/comprehensive-llm-benchmarking.md` -- modified
- `docs/environment-variables.md` -- renamed
- `docs/metrics-reference.md` -- renamed
- `docs/server-metrics/*.md` -- renamed
- `docs/tutorials/*.md` -- modified (link fixes)
4. Fix Remaining CodeRabbit Issues
These 7 unresolved issues now apply to the docs/ files directly:
4a. Broken Cross-Repo Links
In docs/reproducibility.md (was fern/pages/reproducibility.md), convert relative links that escape docs/:
```diff
-- [test_random_generator_canary.py](../tests/integration/test_random_generator_canary.py)
-- [test_deterministic_behavior.py](../tests/integration/test_deterministic_behavior.py)
+- [test_random_generator_canary.py](https://github.com/ai-dynamo/aiperf/blob/main/tests/integration/test_random_generator_canary.py)
+- [test_deterministic_behavior.py](https://github.com/ai-dynamo/aiperf/blob/main/tests/integration/test_deterministic_behavior.py)
```

And:
```diff
-See [random_generator.py](../src/aiperf/common/random_generator.py)
+See [random_generator.py](https://github.com/ai-dynamo/aiperf/blob/main/src/aiperf/common/random_generator.py)
```

4b. PII Leak in Sample Output
In docs/tutorials/custom-dataset.md, replace developer home paths:
```diff
-/home/lkomali/aiperf/artifacts/Qwen_Qwen3-0.6B-openai-chat-concurrency2/profile_export_aiperf.csv
+artifacts/Qwen_Qwen3-0.6B-openai-chat-concurrency2/profile_export_aiperf.csv
```

Apply to all three output blocks in the file.
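The scrub can be done mechanically; a sketch assuming the leaked paths all follow the `/home/<user>/aiperf/artifacts/...` shape seen in the diff:

```python
import re

# Matches a developer-specific prefix immediately before artifacts/.
HOME_PREFIX = re.compile(r"/home/[^/\s]+/aiperf/(?=artifacts/)")

def scrub_home_paths(text: str) -> str:
    """Strip /home/<user>/aiperf/ prefixes, leaving repo-relative paths."""
    return HOME_PREFIX.sub("", text)

line = "/home/lkomali/aiperf/artifacts/run1/profile_export_aiperf.csv"
print(scrub_home_paths(line))  # artifacts/run1/profile_export_aiperf.csv
```

Run it over each affected file's text and diff the result before committing.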
4c. Code Block Fencing
In docs/tutorials/multi-url-load-balancing.md and docs/tutorials/arrival-patterns.md, close bash code fences before sample output sections and tag output blocks with text:
````diff
 ```bash
 aiperf profile --model llama \
   --url http://server1:8000 \
   --request-rate 20
+```

 **Sample Output:**
-```
+```text
 INFO Starting AIPerf System
 ...
````
#### 4d. Angle Bracket Escaping
In `docs/genai-perf-feature-comparison.md` (and similar files), use `<` instead of `\<`:
```diff
-| Gamma distribution shape: \<1=bursty, 1=Poisson, >1=smooth |
+| Gamma distribution shape: <1=bursty, 1=Poisson, >1=smooth |
```
Apply the same pattern across docs/cli-options.md and server-metrics files.
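One way to apply the unescape across files without touching code samples is to skip fenced blocks; a sketch (assumes `\<` is the only escape that needs reverting):

```python
def unescape_angle_brackets(markdown: str) -> str:
    """Replace \\< with < in prose and table lines, leaving fenced code intact."""
    out, in_fence = [], False
    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # toggle on every fence marker
        out.append(line if in_fence else line.replace("\\<", "<"))
    return "\n".join(out)

row = "| Gamma distribution shape: \\<1=bursty, 1=Poisson, >1=smooth |"
print(unescape_angle_brackets(row))
```

Note this leaves inline code spans alone only by accident of the pattern; review the diff rather than trusting it blindly.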
4e. mkdocs.yml Nav Entry
Update mkdocs.yml to match the renamed file:
```diff
-- CLI Options: cli_options.md
+- CLI Options: cli-options.md
```

4f. Server-Metrics Link Mismatch
In docs/server-metrics/server-metrics-reference.md, fix underscore links:
```diff
-[JSON Schema Reference](server_metrics_json_schema.md)
-[Parquet Schema Reference](server_metrics_parquet_schema.md)
+[JSON Schema Reference](server-metrics-json-schema.md)
+[Parquet Schema Reference](server-metrics-parquet-schema.md)
```

5. Fern-Only Pages
Some pages in fern/pages/ were created specifically for Fern and don't exist in docs/:
- `fern/pages/index.md` -- Fern landing page
- `fern/pages/architecture.md` -- may not exist in `docs/`
- `fern/pages/migrating.md` -- GenAI-Perf migration guide
- `fern/pages/genai-perf-feature-comparison.md`
- `fern/pages/dev/patterns.md`
- `fern/pages/diagrams/metrics-flow.md`
- `fern/pages/diagrams/mixins.md`
For each of these: Check if a corresponding file exists in docs/. If not, either:
- Move it to `docs/` (preferred -- it becomes the source of truth)
- Keep it in `fern/pages/` as a Fern-only page and reference it from `next.yml` as `../pages/filename.md`
6. Update CI Trigger Paths (If Applicable)
If the PR includes a GitHub Actions workflow for publishing, ensure it triggers on both:
paths: ['fern/**', 'docs/**']7. Verify
After making all changes:
# Validate config
fern check --warnings
# Preview locally
fern docs dev --port 3000
# Check every page renders, images load, links workSummary of Changes
| Action | Files Affected | Lines |
|---|---|---|
| Delete fern/pages/ | -60 files | -19k lines |
| Update next.yml paths | 1 file | ~60 path changes |
| Fix CodeRabbit issues in docs/ | ~6 files | ~30 lines |
| Move Fern-only pages to docs/ | ~7 files | net zero |
| Net PR reduction | ~53 fewer files | ~19k fewer lines |
f159bd8 to c754b47
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (9)
docs/tutorials/sglang-image-generation.md (1)
`226-229`: ⚠️ Potential issue | 🟡 Minor

**Filename format mismatch between script output and example media images.**
The extraction script (line 207) produces 2-segment names:
```python
f"image_{line_num:04d}_{data_idx:02d}.jpg"  # → image_0001_00.jpg
```

The sample output (lines 226–229) correctly reflects this. However, the example images embedded in the "View the generated images" section carry 3-segment, hyphenated names (`image-0001-00-00.jpg`). A reader who runs the script will produce `image_0001_00.jpg` and won't recognize a match with the displayed examples.

Either update the media filenames (and the references on lines 237, 243, 249) to match what the script actually generates, or adjust the script's naming to add the trailing segment and align all three artefacts (script code, sample output, and media filenames) to the same convention.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/sglang-image-generation.md` around lines 226 - 229, The example media filenames do not match the extraction script's f-string (f"image_{line_num:04d}_{data_idx:02d}.jpg")—the script produces "image_0001_00.jpg" but the docs show hyphenated "image-0001-00-00.jpg"; fix by either renaming the embedded media and their references on the "View the generated images" lines (currently at the three places called out) to the underscore two-segment form that the script outputs, or change the extraction naming convention in the script to a three-segment hyphenated format (adding the extra trailing index and using "-" separators) and then update the sample output and media references so all three (script, sample output, media files) use the exact same filename pattern; ensure consistency for filenames referenced on the three lines mentioned.docs/tutorials/audio.md (1)
`89-92`: ⚠️ Potential issue | 🟡 Minor

**Developer home path leaks into sample output.**
Lines 89–92 (and 153–156) contain `/home/lkomali/aiperf/artifacts/...`, a developer-specific path. The PR reviewer flagged the same issue as fix 4b for `custom-dataset.md`; the same correction is needed here.

✏️ Suggested fix -- replace with a generic path
```diff
-/home/lkomali/aiperf/artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency4/profile_export_aiperf.csv
+artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency4/profile_export_aiperf.csv
-/home/lkomali/aiperf/artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency4/profile_export_aiperf.json
+artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency4/profile_export_aiperf.json
-/home/lkomali/aiperf/artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency4/logs/aiperf.log
+artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency4/logs/aiperf.log
```

Apply the same substitution to lines 153–156.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/audio.md` around lines 89 - 92, Replace developer-specific absolute paths in the sample output lines that reference profile_export_aiperf.csv, profile_export_aiperf.json, and aiperf.log with a generic placeholder path (e.g., /path/to/artifacts/<run-name>/...) in the docs/tutorials/audio.md content; apply the same substitution to the other matching block later in the file (the lines containing the same three filenames) so both occurrences no longer leak the developer home directory.docs/benchmark-modes/trace-replay.md (1)
`63-111`: ⚠️ Potential issue | 🟡 Minor

**Same duplicate section-ID pattern -- three distinct scenarios share one ID.**
All three `{/* aiperf-run-vllm-default-openai-endpoint-server */}` pairs (lines 63–71, 74–85, 93–111) wrap different scenarios (create trace file / run custom trace / run Mooncake trace) but carry the same marker. As flagged in `openai-text-endpoints.md`, this prevents unique extraction by the e2e parser. Consider using distinct IDs such as `aiperf-run-vllm-trace-create-custom`, `aiperf-run-vllm-trace-run-custom`, and `aiperf-run-vllm-trace-run-mooncake`.
Verify each finding against the current code and only fix it if needed. In `@docs/benchmark-modes/trace-replay.md` around lines 63 - 111, The three code block markers all use the same ID "aiperf-run-vllm-default-openai-endpoint-server", preventing unique extraction; update each pair to unique IDs that reflect their scenario names (e.g., change the create-trace block marker to "aiperf-run-vllm-trace-create-custom", the run-custom-trace block to "aiperf-run-vllm-trace-run-custom", and the Mooncake trace block to "aiperf-run-vllm-trace-run-mooncake"), ensuring both the opening and closing comment markers around each fenced bash block are changed accordingly so the e2e parser can extract them uniquely.docs/tutorials/multi-run-confidence.md (1)
`703-704`: ⚠️ Potential issue | 🟡 Minor

**Broken cross-reference links -- filenames changed in this PR.**
The PR renames `cli_options.md` → `cli-options.md` (mkdocs.yml update, fix 4e) and `metrics_reference.md` → `metrics-reference.md` (confirmed by the `working-with-profile-exports.md` change). Both links on lines 703–704 still point to the old underscore names and will 404.

🔗 Proposed fix
```diff
-- [CLI Options](../cli_options.md) - Full parameter reference
-- [Metrics Reference](../metrics_reference.md) - Detailed metric descriptions
+- [CLI Options](../cli-options.md) - Full parameter reference
+- [Metrics Reference](../metrics-reference.md) - Detailed metric descriptions
```
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/multi-run-confidence.md` around lines 703 - 704, Update the broken cross-reference links in docs/tutorials/multi-run-confidence.md that still point to the old underscored filenames: change ../cli_options.md to ../cli-options.md and ../metrics_reference.md to ../metrics-reference.md (and search the same file for any other occurrences of cli_options or metrics_reference to replace them too) so the links match the renamed files referenced in mkdocs.yml and other docs.docs/tutorials/openai-text-endpoints.md (1)
`31-164`: ⚠️ Potential issue | 🔴 Critical

**All `aiperf-run-vllm-default-openai-endpoint-server` tags across 15+ tutorial and benchmark files extract to the same server identifier, causing parser collisions.**

The parser extracts server name from the tag format
`aiperf-run-{server-name}-endpoint-server`, meaning all 66 occurrences of the identical tag `aiperf-run-vllm-default-openai-endpoint-server` (across openai-text-endpoints.md, multi-turn.md, custom-dataset.md, request-cancellation.md, trace-replay.md, and others) extract to server name `vllm-default`. These commands are then appended to a single list, making it impossible for the e2e test parser to differentiate between distinct scenarios, collapsing multiple test cases into one.
- openai-text-endpoints.md: 4 pairs (chat/completions × synthetic/custom)
- multi-turn.md: 6 pairs
- custom-dataset.md: 3 pairs
- request-cancellation.md: 3 pairs
- trace-replay.md: 3 pairs
- Plus 10+ additional tutorial files with 1–2 pairs each
Each distinct scenario needs a unique identifier. Use a pattern like
aiperf-run-vllm-{scenario}-endpoint-serverto distinguish scenarios within each file and avoid collisions across files.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/openai-text-endpoints.md` around lines 31 - 164, The repeated snippet tag name aiperf-run-vllm-default-openai-endpoint-server causes parser collisions; update each opening and matching closing tag in this file (e.g., the four pairs around the chat synthetic, chat custom, completions synthetic, completions custom blocks) to use unique identifiers following the suggested pattern such as aiperf-run-vllm-chat-synthetic-endpoint-server, aiperf-run-vllm-chat-custom-endpoint-server, aiperf-run-vllm-completions-synthetic-endpoint-server, and aiperf-run-vllm-completions-custom-endpoint-server so the parser extracts distinct server names; ensure every changed opening tag has its corresponding closing tag updated to the exact same new identifier.docs/tutorials/embeddings.md (1)
79-101:⚠️ Potential issue | 🟠 Major
**aiperf-run tag wraps the wrong bash block -- CI parser will extract the `cat` command instead of `aiperf profile`.**

`parser.py`'s `_extract_bash_block` scans forward from the tag line and returns the first fenced `bash` block it encounters. With the opening tag placed at line 79, the first bash block is the `cat <<EOF > inputs.jsonl` heredoc (lines 80–89), not the `aiperf profile` command (lines 92–100). The CI E2E test would register a `cat` command as the server's aiperf run command, causing incorrect test execution.
-{/* aiperf-run-vllm-default-openai-endpoint-server */} ```bash cat <<EOF > inputs.jsonl {"texts": ["What is artificial intelligence?"]} {"texts": ["Explain machine learning."]} {"texts": ["How do neural networks work?"]} {"texts": ["Define deep learning."]} {"texts": ["What are transformers in AI?"]} EOFRun AIPerf using the custom input file:
````diff
-{/* aiperf-run-vllm-default-openai-endpoint-server */}
 ```bash
 cat <<EOF > inputs.jsonl
 {"texts": ["What is artificial intelligence?"]}
 {"texts": ["Explain machine learning."]}
 {"texts": ["How do neural networks work?"]}
 {"texts": ["Define deep learning."]}
 {"texts": ["What are transformers in AI?"]}
 EOF
 ```

 Run AIPerf using the custom input file:
````
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/embeddings.md` around lines 79 - 101, The aiperf-run tag is placed before the heredoc bash block so parser.py::_extract_bash_block picks up the cat <<EOF block instead of the intended aiperf profile command; move the opening tag {/* aiperf-run-vllm-default-openai-endpoint-server */} from above the inputs.jsonl heredoc to immediately before the ```bash block that starts the aiperf profile command (and keep the closing tag after that block) so the parser extracts the correct aiperf profile command for CI.docs/benchmark-datasets.md (1)
`42-42`: ⚠️ Potential issue | 🟡 Minor

**Broken relative link: file was renamed but this reference wasn't updated.**
`benchmark_modes/trace_replay.md` (underscores) no longer exists; the PR renamed it to `benchmark-modes/trace-replay.md`. This link will 404 in both mkdocs and Fern.

🔧 Proposed fix
```diff
- <td>Mooncake trace file <a href="benchmark_modes/trace_replay.md"><code>--input-file your_trace_file.jsonl --custom-dataset-type mooncake_trace</code></a></td>
+ <td>Mooncake trace file <a href="benchmark-modes/trace-replay.md"><code>--input-file your_trace_file.jsonl --custom-dataset-type mooncake_trace</code></a></td>
```
Verify each finding against the current code and only fix it if needed. In `@docs/benchmark-datasets.md` at line 42, Update the broken relative link in the Mooncake trace file row by replacing the old reference "benchmark_modes/trace_replay.md" with the new path "benchmark-modes/trace-replay.md" so the anchor around the code snippet (--input-file your_trace_file.jsonl --custom-dataset-type mooncake_trace) points to the renamed document; ensure the href in that <a> tag is updated accordingly.docs/tutorials/vision.md (1)
`40-53`: ⚠️ Potential issue | 🟡 Minor

**Duplicate marker IDs across different code blocks.**
The marker `aiperf-run-vllm-vision-openai-endpoint-server` is reused on three distinct code blocks (lines 40, 86, 100). If these markers are used for snippet extraction or testing, the duplicate IDs could cause the wrong block to be selected. Consider using unique identifiers (e.g., `aiperf-run-vllm-vision-synthetic`, `aiperf-run-vllm-vision-custom-file`, `aiperf-run-vllm-vision-custom-run`).

Also applies to: 86-96, 100-111
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/vision.md` around lines 40 - 53, Duplicate marker ID "aiperf-run-vllm-vision-openai-endpoint-server" is used for multiple code blocks causing potential snippet extraction collisions; update each marker to a unique identifier (for example replace the three occurrences with "aiperf-run-vllm-vision-synthetic", "aiperf-run-vllm-vision-custom-file", and "aiperf-run-vllm-vision-custom-run") by locating the markers surrounding the code fences (the strings inside {/* ... */}) and renaming them consistently at both the opening and closing markers so each block has a distinct ID.tools/generate_cli_docs.py (1)
428-432:⚠️ Potential issue | 🟡 MinorCommand help text paragraphs not wrapped with
_escape_mdx_prose.Parameter descriptions (line 331) and choice descriptions (line 345) are escaped, but command-level help text at line 432 passes through
normalize_textonly. Theplotcommand's description contains<first_path>, which would break MDX rendering without escaping.Proposed fix
```diff
 if desc:
     for para in desc.split("\n\n"):
         if para.strip():
-            lines.extend([normalize_text(para), ""])
+            lines.extend([_escape_mdx_prose(normalize_text(para)), ""])
```
Verify each finding against the current code and only fix it if needed. In `@tools/generate_cli_docs.py` around lines 428 - 432, The command-level description is not MDX-escaped: when building desc from desc_lines the code uses normalize_text(para) but does not call _escape_mdx_prose, so angle-bracketed tokens (e.g. <first_path>) break rendering; update the block that processes desc (the variables desc, desc_lines and loop that appends to lines) to pass each paragraph through _escape_mdx_prose (e.g. _escape_mdx_prose(normalize_text(para))) before extending lines so command help text is properly escaped for MDX.
🧹 Nitpick comments (4)
docs/tutorials/sglang-image-generation.md (1)
`240-242`: **Specify a language on fenced code blocks containing JSON prompts (MD040).**

Lines 240 and 246 open code fences with no language identifier. Adding `json` makes intent explicit and suppresses the markdownlint warning.

✏️ Proposed fix
````diff
-```
+```json
 {"text": "A futuristic city with flying cars"}
````

````diff
-```
+```json
 {"text": "A cute robot playing with a kitten"}
````

> **Note:** The same fix applies to the first prompt block at line 234.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/sglang-image-generation.md` around lines 240 - 242, Add the missing language identifier "json" to the fenced code blocks that contain JSON prompts (e.g., the blocks containing {"text": "A futuristic city with flying cars"} and {"text": "A cute robot playing with a kitten"} and the earlier prompt at the first block), by changing the opening backtick fences from ``` to ```json so the intent is explicit and the markdownlint MD040 warning is suppressed.

docs/tutorials/embeddings.md (1)

`37-37`: **Recurring MD037 markdownlint false positives from MDX comment syntax (applies to all changed doc files).**

Lines 37, 49, 79, 101 here, and similarly in `docs/tutorials/fixed-schedule.md` (57, 84, 121, 136) and `docs/tutorial.md` (12, 21, 23, 27, 30, 42) all trigger MD037 ("Spaces inside emphasis markers") because markdownlint cannot distinguish `{/* … */}` MDX comment delimiters from `* … *` emphasis syntax. Consider suppressing MD037 in the project's markdownlint config for MDX-style files, e.g. in `.markdownlint-cli2.yaml`:

```yaml
# MDX comment delimiters {/* … */} trigger false MD037 positives
MD037: false
```

Or scope it per-glob if you want to keep the rule for pure-markdown files:

```yaml
overrides:
  - globs: ["docs/**/*.md"]
    config:
      MD037: false
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/embeddings.md` at line 37, Several doc files use MDX-style comment delimiters like {/* aiperf-run-vllm-default-openai-endpoint-server */} which markdownlint mis-parses as emphasis and triggers MD037; update the markdownlint config (.markdownlint-cli2.yaml) to suppress MD037 for MDX-style files by adding MD037: false globally or, preferably, add an overrides entry that disables MD037 for the docs glob (e.g., docs/**/*.md or the MDX-specific glob) so the {/* … */} comments no longer produce false positives while preserving MD037 for pure-markdown files.

docs/comprehensive-llm-benchmarking.md (1)

`1-4`: **Stray `#` prefix on SPDX lines is inconsistent with all other files in this PR.**

Lines 2–3 read `# SPDX-FileCopyrightText:` / `# SPDX-License-Identifier:` -- the `#` prefix is a leftover from a Python/shell-comment-style header and was not stripped during the conversion. Every other converted docs file (e.g., `docs/tutorials/custom-dataset.md`) and the `SPDX_HEADER_MD` constant in `tools/_core.py` omit the `#`.

✏️ Suggested fix

```diff
 {/*
-# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-# SPDX-License-Identifier: Apache-2.0
+SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+SPDX-License-Identifier: Apache-2.0
 */}
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@docs/comprehensive-llm-benchmarking.md` around lines 1 - 4, The SPDX header uses a stray leading "#" on the two SPDX lines; remove the "#" prefix from the lines beginning with "# SPDX-FileCopyrightText:" and "# SPDX-License-Identifier:" so they match the SPDX_HEADER_MD format used elsewhere (i.e., change them to "SPDX-FileCopyrightText: ..." and "SPDX-License-Identifier: ...").

.cursor/skills/docs-to-fern/SKILL_md (1)

`113-118`: **Hardcoded Fern CLI version `3.29.1` will become stale.**

Consider recommending that users check the latest version (`npm show fern-api version`) without hardcoding a specific version, or add a note to update this periodically.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.cursor/skills/docs-to-fern/SKILL_md around lines 113 - 118, The JSON in SKILL_md hardcodes "version": "3.29.1", which will become stale; replace the fixed version with guidance to either leave the version out or use a placeholder and add a note telling users to run `npm show fern-api version` to find the latest Fern CLI version (or to substitute "latest"/their desired version), and update the documentation text around the JSON (the block containing "organization" and "version") to explain how and when to refresh the version value.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.cursor/skills/docs-to-fern/SKILL_md:
- Around line 326-361: The fix_file function currently detects SPDX blocks using
spdx_pattern and replaces them with YAML frontmatter (--- ... ---); change that
behavior so detected SPDX-FileCopyrightText/SPDX-License-Identifier blocks are
converted into JSX block comments like {/* ... */} (preserving the original
lines and ordering) instead of creating a YAML frontmatter block, and ensure the
subsequent generic HTML-to-JSX replacement (the re.sub that turns <!--(.*?)-->
into {/* \1 */}) still matches or is adjusted to avoid double-wrapping SPDX
content.
- Around line 262-280: Update the skill document to remove the duplicate-content
Approach A instructions in "Phase 3: Migrate Content" (specifically "Step 3.1:
Bulk Copy with Hyphen Renaming" and any steps that create or require
fern/pages/* copies) and replace them with guidance for the config-only Approach
B: describe that fern/ holds only configuration, that next.yml navigation
entries should point to the original docs via relative paths (e.g.,
../../docs/), and remove the Definition of Done requirement that every
docs/*.md must have a corresponding fern/pages/*.md; update any example code and
references to "fern/pages/" to instead show the next.yml relative-path pattern.

In @docs/cli-options.md:
- Line 492: The docstring in src/aiperf/common/config/prompt_config.py contains
two adjacent string literals for the --num-prefix-prompts description that
concatenate without a space, producing "off by one.Mutually". Edit the string
literals in the prompt_config module (the doc/description for
--num-prefix-prompts in PromptConfig or the module-level prompt config constant)
to add a trailing space after the period or join them into a single string so
the generated docs read "...off by one. Mutually exclusive..." (ensure the
change targets the adjacent literals shown in the review).In
@docs/diagrams/metrics-flow.md:
- Around line 12-61: The Mermaid diagram connectors were corrupted: every valid
arrow "-->" was replaced by "*/}", breaking edges; restore all "*/}" connectors
back to "-->" within the Mermaid fenced block so nodes like A, B1/B2/B3,
C1/C2/C3, D1/D2/D3, E1/E2/E3, G, H1/H2, I2, J1/J2/J3 and K reconnect correctly.
Locate the mermaid block containing the node IDs (e.g., "MetricRecordProcessor",
"RECORD: RequestLatencyMetric", "AGGREGATE: TotalRequestsMetric",
"MetricResultsDict", "Summarize Function") and perform a global replace of the
connector token "*/}" with the correct Mermaid arrow "-->" (or manually fix each
edge) ensuring spacing/HTML entities remain unchanged. Ensure the fix is applied
to every connector occurrence in the file so the rendered diagram is fully
connected.

In @docs/diagrams/mixins.md:
- Around line 9-31: The Mermaid diagram has accidental "*/}" edge tokens (e.g.,
between "BaseMixin" -> "AIPerfLoggerMixin", "AIPerfLoggerMixin" -> "HooksMixin",
and others like "BaseService" -> "BaseComponentService") which break the
connections; replace all occurrences of the "*/}" connector with the proper
"-->" arrow (preserving surrounding spacing as needed) so every link like A -> B
uses "-->" (ensure entries such as "AIPerfLoggerMixin */} C" become
"AIPerfLoggerMixin --> C" across the file).
In @docs/index.md:
- Around line 1-4: Update the SPDX header comment so the copyright year matches
the other changed docs by replacing the single year "2025" with the range
"2025-2026" in the existing SPDX block (the comment that begins with
"SPDX-FileCopyrightText" and includes "SPDX-License-Identifier: Apache-2.0");
ensure the formatting of the comment remains unchanged aside from the year
range.

In @docs/metrics-reference.md:
- Around line 1-4: The file docs/metrics-reference.md currently uses JSX comment
delimiters `{/*` and `*/}` which render as literal text; replace those JSX
comment markers (the `{/*` at the top and the matching `*/}` at the bottom) with
standard HTML comment markers `<!--` and `-->` so the SPDX header lines remain
hidden in this .md file.
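Several of the SPDX findings in this list request the same rewrite — a leading JSX comment block becoming an HTML comment. A rough sketch of that transformation (the function name and regex are illustrative, not the repo's actual tooling):

```python
import re

def jsx_comment_to_html(text: str) -> str:
    """Turn a leading JSX comment `{/* ... */}` into `<!-- ... -->` so
    plain Markdown renderers (GitHub, mkdocs) hide the SPDX header."""
    return re.sub(
        r"\A\{/\*(.*?)\*/\}",
        lambda m: "<!--" + m.group(1) + "-->",
        text,
        flags=re.DOTALL,
    )
```

Anchoring the pattern at the start of the file (`\A`) keeps the pass from rewriting JSX comments used as snippet markers later in the document.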
In @docs/server-metrics/server-metrics-json-schema.md:
- Around line 1-4: The top-of-file JSX-style comment `{/* ... */}` wrapping
the SPDX lines is being rendered as headings; replace that `{/* ... */}` block
with an HTML comment `<!-- SPDX-FileCopyrightText: ... SPDX-License-Identifier: ... -->` (i.e., wrap the SPDX lines in `<!--` and `-->`) so GitHub will not
render them as Markdown H1s, mirroring the same fix applied to the other docs
file.

In @docs/server-metrics/server-metrics-reference.md:
- Around line 1-4: The SPDX header is currently wrapped with JSX-style comment
markers `{/* ... */}`, which GitHub renders as Markdown H1s; replace the `{/*` and
`*/}` wrapper around the two SPDX lines with a proper Markdown/HTML comment
delimiter so the SPDX-FileCopyrightText and SPDX-License-Identifier lines are
hidden; update the block that contains the `{/*` and `*/}` markers and the two
SPDX lines accordingly so the SPDX metadata remains present but does not render
as headings.

In @docs/tutorials/huggingface-tgi.md:
- Around line 1-4: The JSX-style license header using `{/* ... */}` is being
rendered as literal text by MkDocs; replace the JSX comment block (the four-line
SPDX header wrapped in `{/* */}`) in huggingface-tgi.md and each affected tutorial
file (e.g., the listed tutorials such as multi-turn.md, request-cancellation.md,
fixed-schedule.md, etc.) with an HTML comment block using `<!-- ... -->`
containing the same SPDX lines so MkDocs will ignore it; ensure you preserve the
exact SPDX lines and spacing when converting the comment syntax.

In @docs/tutorials/sglang-image-generation.md:
- Around line 237-249: Replace the three empty image alt texts with descriptive
text taken from the prompt above each image: for the image after the prompt
{"text": "A futuristic city with flying cars"} set alt text to "A futuristic
city with flying cars"; for the image after {"text": "A cute robot playing with
a kitten"} set alt text to "A cute robot playing with a kitten"; and for the
third image use the prompt text that precedes it (use that prompt as the alt
text). Update the three markdown image tags (the `![](...)` occurrences) so the
alt text is included inside the brackets.

In @docs/tutorials/sharegpt.md:
- Line 2: The SPDX copyright header line currently uses a single year "2026"
(the line starting with SPDX-FileCopyrightText), which is inconsistent with the
project's "2025-2026" range; update that line to use "2025-2026" so it matches
other files and maintain consistency across the docs.

In @tools/_core.py:
- Around line 90-95: The SPDX_HEADER_MD constant uses JSX-style block comments
(`{/*`, `*/}`) which render as visible text in standard Markdown; update
SPDX_HEADER_MD in tools/_core.py to use HTML comments (`<!-- ... -->`) so the
header is invisible to mkdocs/GFM, e.g. replace the tuple entries wrapping the
SPDX lines with an HTML comment prefix/suffix; alternatively, if you prefer not
to embed the header in docs, remove SPDX_HEADER_MD and configure Fern to inject
the SPDX info via a Fern-native mechanism (frontmatter/_layout or
fern.config.json header/footer) so MDX and plain Markdown pipelines are both
handled correctly.
Outside diff comments:
In @docs/benchmark-datasets.md:
- Line 42: Update the broken relative link in the Mooncake trace file row by
replacing the old reference "benchmark_modes/trace_replay.md" with the new path
"benchmark-modes/trace-replay.md" so the anchor around the code snippet
(--input-file your_trace_file.jsonl --custom-dataset-type mooncake_trace) points
to the renamed document; ensure the href in that tag is updated accordingly.

In @docs/benchmark-modes/trace-replay.md:
- Around line 63-111: The three code block markers all use the same ID
"aiperf-run-vllm-default-openai-endpoint-server", preventing unique extraction;
update each pair to unique IDs that reflect their scenario names (e.g., change
the create-trace block marker to "aiperf-run-vllm-trace-create-custom", the
run-custom-trace block to "aiperf-run-vllm-trace-run-custom", and the Mooncake
trace block to "aiperf-run-vllm-trace-run-mooncake"), ensuring both the opening
and closing comment markers around each fenced bash block are changed
accordingly so the e2e parser can extract them uniquely.
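Collisions like the one flagged here are easy to catch mechanically before the e2e parser trips over them. A sketch of such a check — the marker regex mirrors the `{/* aiperf-run-... */}` convention described in this review, but it is not the parser's actual implementation:

```python
import re
from collections import Counter

MARKER = re.compile(r"\{/\*\s*(aiperf-run-[\w-]+)\s*\*/\}")

def duplicate_marker_ids(text: str) -> list[str]:
    """Return snippet IDs whose opening marker appears more than once.
    Closing markers ({/* /aiperf-run-... */}) do not match the pattern
    because of the leading slash."""
    counts = Counter(MARKER.findall(text))
    return sorted(name for name, n in counts.items() if n > 1)
```

Running this over a tutorial before commit would surface the repeated `aiperf-run-vllm-default-openai-endpoint-server` IDs immediately.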
In @docs/tutorials/audio.md:
- Around line 89-92: Replace developer-specific absolute paths in the sample
output lines that reference profile_export_aiperf.csv,
profile_export_aiperf.json, and aiperf.log with a generic placeholder path
(e.g., /path/to/artifacts/.../...) in the docs/tutorials/audio.md
content; apply the same substitution to the other matching block later in the
file (the lines containing the same three filenames) so both occurrences no
longer leak the developer home directory.

In @docs/tutorials/embeddings.md:
- Around line 79-101: The aiperf-run tag is placed before the heredoc bash block
so parser.py::_extract_bash_block picks up the cat <<EOF block instead of the
intended aiperf profile command; move the opening tag {/*
aiperf-run-vllm-default-openai-endpoint-server */} from above the inputs.jsonl
heredoc to immediately before the ```bash block that starts the aiperf profile
command (and keep the closing tag after that block) so the parser extracts the
correct aiperf profile command for CI.

In @docs/tutorials/multi-run-confidence.md:
- Around line 703-704: Update the broken cross-reference links in
docs/tutorials/multi-run-confidence.md that still point to the old underscored
filenames: change ../cli_options.md to ../cli-options.md and
../metrics_reference.md to ../metrics-reference.md (and search the same file for
any other occurrences of cli_options or metrics_reference to replace them too)
so the links match the renamed files referenced in mkdocs.yml and other docs.

In @docs/tutorials/openai-text-endpoints.md:
- Around line 31-164: The repeated snippet tag name
aiperf-run-vllm-default-openai-endpoint-server causes parser collisions; update
each opening and matching closing tag in this file (e.g., the four pairs around
the chat synthetic, chat custom, completions synthetic, completions custom
blocks) to use unique identifiers following the suggested pattern such as
aiperf-run-vllm-chat-synthetic-endpoint-server,
aiperf-run-vllm-chat-custom-endpoint-server,
aiperf-run-vllm-completions-synthetic-endpoint-server, and
aiperf-run-vllm-completions-custom-endpoint-server so the parser extracts
distinct server names; ensure every changed opening tag has its corresponding
closing tag updated to the exact same new identifier.

In @docs/tutorials/sglang-image-generation.md:
- Around line 226-229: The example media filenames do not match the extraction
script's f-string (f"image_{line_num:04d}_{data_idx:02d}.jpg")—the script
produces "image_0001_00.jpg" but the docs show hyphenated
"image-0001-00-00.jpg"; fix by either renaming the embedded media and their
references on the "View the generated images" lines (currently at the three
places called out) to the underscore two-segment form that the script outputs,
or change the extraction naming convention in the script to a three-segment
hyphenated format (adding the extra trailing index and using "-" separators) and
then update the sample output and media references so all three (script, sample
output, media files) use the exact same filename pattern; ensure consistency for
filenames referenced on the three lines mentioned.
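The mismatch called out above is easiest to see by evaluating the script's f-string directly. The helper name here is illustrative; the format string is quoted from the finding:

```python
def extracted_image_name(line_num: int, data_idx: int) -> str:
    # Format string quoted from the tutorial's extraction script.
    return f"image_{line_num:04d}_{data_idx:02d}.jpg"
```

Whichever side is changed, the script output, the sample output, and the embedded media references must all end up using this exact pattern.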
In @docs/tutorials/vision.md:
- Around line 40-53: Duplicate marker ID
"aiperf-run-vllm-vision-openai-endpoint-server" is used for multiple code blocks
causing potential snippet extraction collisions; update each marker to a unique
identifier (for example replace the three occurrences with
"aiperf-run-vllm-vision-synthetic", "aiperf-run-vllm-vision-custom-file", and
"aiperf-run-vllm-vision-custom-run") by locating the markers surrounding the
code fences (the strings inside {/* ... */}) and renaming them consistently at
both the opening and closing markers so each block has a distinct ID.

In @tools/generate_cli_docs.py:
- Around line 428-432: The command-level description is not MDX-escaped: when
building desc from desc_lines the code uses normalize_text(para) but does not
call _escape_mdx_prose, so angle-bracketed tokens (e.g. <first_path>) break
rendering; update the block that processes desc (the variables desc, desc_lines
and loop that appends to lines) to pass each paragraph through _escape_mdx_prose
(e.g. _escape_mdx_prose(normalize_text(para))) before extending lines so command
help text is properly escaped for MDX.
Duplicate comments:
In @docs/cli-options.md:
- Around line 1-4: The generated docs include a JSX-style comment block that
breaks mkdocs; update the generator to emit an mkdocs/Fern-compatible header by
changing SPDX_HEADER_MD in tools/_core.py to use a plain Markdown comment or
HTML comment (e.g., `<!-- ... -->`) or remove the JSX braces so the header is
valid Markdown; modify the SPDX_HEADER_MD constant and any code paths that
insert it (references in tools/generate_cli_docs.py) so the output for docs/*.md
no longer contains the JSX-style {/* ... */} wrapper.

In @docs/tutorials/fixed-schedule.md:
- Line 57: MDX-style inline comment "{/*
aiperf-run-vllm-default-openai-endpoint-server */}" in
docs/tutorials/fixed-schedule.md is triggering a known false positive for MD037;
update the markdownlint configuration the same way you did for
docs/tutorials/embeddings.md — add an exception to allow MDX comment tags like
"{/* ... */}" (or disable MD037 for docs/tutorials/*.md) so these MDX comment
tokens are ignored by the linter.
Nitpick comments:
In @.cursor/skills/docs-to-fern/SKILL_md:
- Around line 113-118: The JSON in SKILL_md hardcodes "version": "3.29.1", which
will become stale; replace the fixed version with guidance to either leave the
version out or use a placeholder and add a note telling users to run `npm show fern-api version` to find the latest Fern CLI version (or to substitute
"latest"/their desired version), and update the documentation text around the
JSON (the block containing "organization" and "version") to explain how and when
to refresh the version value.

In @docs/comprehensive-llm-benchmarking.md:
- Around line 1-4: The SPDX header in docs/comprehensive-llm-benchmarking.md
uses a stray leading "#" on the two SPDX lines; remove the "#" prefix from the
lines beginning with "# SPDX-FileCopyrightText:" and "#
SPDX-License-Identifier:" so they match the SPDX_HEADER_MD format used elsewhere
(i.e., change them to "SPDX-FileCopyrightText: ..." and
"SPDX-License-Identifier: ...").In
@docs/tutorials/embeddings.md:
- Line 37: Several doc files use MDX-style comment delimiters like {/*
aiperf-run-vllm-default-openai-endpoint-server */} which markdownlint mis-parses
as emphasis and triggers MD037; update the markdownlint config
(.markdownlint-cli2.yaml) to suppress MD037 for MDX-style files by adding MD037:
false globally or, preferably, add an overrides entry that disables MD037 for
the docs glob (e.g., docs/**/*.md or the MDX-specific glob) so the {/* … */}
comments no longer produce false positives while preserving MD037 for
pure-markdown files.
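For reference, the global form of the suppression described above is a two-line fragment of `.markdownlint-cli2.yaml` — shown as a sketch; the repo's actual config may already have a `config:` section to extend:

```yaml
config:
  # MDX comments like {/* ... */} trip MD037's emphasis heuristic.
  MD037: false
```

The per-glob variant (preserving MD037 for pure-markdown files) would instead live in a directory-scoped config under `docs/`.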
In @docs/tutorials/sglang-image-generation.md:
- Around line 240-242: Add the missing language identifier "json" to the fenced
code blocks that contain JSON prompts (e.g., the blocks containing {"text": "A
futuristic city with flying cars"} and {"text": "A cute robot playing with a
kitten"} and the earlier prompt at the first block), by changing the opening
backtick fences from ``` to ```json so the intent is explicit and the
markdownlint MD040 warning is suppressed.
5483f52 to 44df8df
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
docs/tutorials/arrival-patterns.md (2)
189-200: ⚠️ Potential issue | 🟡 Minor
Use `bash` fence for executable CLI commands. This block is a runnable command example but is fenced as `text`. Please switch it to `bash` for consistency and copy/paste ergonomics.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/arrival-patterns.md` around lines 189 - 200, Update the fenced code block for the runnable CLI example so the language tag is "bash" instead of "text" to improve copy/paste and syntax highlighting; locate the block containing the "aiperf profile \ --model your-model \ --url localhost:8000 \ --endpoint-type chat \ --streaming \ --request-rate 100 \ --arrival-pattern poisson \ --benchmark-duration 60 \ --output-dir results/poisson" command and replace the opening triple-backtick fence from ```text to ```bash.
14-20: 🛠️ Refactor suggestion | 🟠 Major
Convert conceptual ASCII diagrams to Mermaid.
These timeline visuals are ASCII diagrams in a Markdown doc. Please convert them to Mermaid blocks to align with repo docs standards, while keeping terminal-output ASCII tables as-is.
As per coding guidelines, "Use mermaid diagrams instead of ASCII art in markdown files."
Based on learnings, preserve ASCII box-drawing tables inside real command output blocks exactly as written.

Also applies to: 45-49, 65-69, 96-108, 121-127
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/arrival-patterns.md` around lines 14 - 20, Replace the three-block ASCII timeline (the fenced ```text block showing "Constant Pattern", "Poisson Pattern", "Gamma (bursty)" and their arrows) with a mermaid diagram: wrap a mermaid block (```mermaid) and create a left-to-right flowchart (e.g., "flowchart LR") with nodes named "Constant Pattern", "Poisson Pattern", "Gamma (bursty)" connected by arrows and include sublabels "Perfect spacing", "Natural variance", "Clustered bursts" as node descriptions or subnodes so the visual matches the original. Do the same conversion for the other listed ASCII timeline blocks (lines 45-49, 65-69, 96-108, 121-127), but do not change any ASCII box-drawing tables that are inside real command/output fences—leave those exact text blocks untouched.

docs/environment-variables.md (1)
1-23: ⚠️ Potential issue | 🟠 Major
Update the generator to output a JSX `<Warning>` component instead of markdown blockquotes. The file is auto-generated by `tools/generate_env_vars_docs.py`, but currently out of sync. The generator produces `> [!WARNING]` markdown blockquotes (lines 203-205 in `generate_markdown`), while the file contains `<Warning>` JSX components. Update `tools/generate_env_vars_docs.py` to replace the blockquote syntax with:

"<Warning>",
"Environment variable names, default values, and definitions are subject to change.",
"These settings may be modified, renamed, or removed in future releases.",
"</Warning>",

Then regenerate the file with `make generate-env-vars-docs` to sync all changes (the SPDX header format is already correct).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/environment-variables.md` around lines 1 - 23, The generated docs still use markdown blockquote warning syntax; update the generator in tools/generate_env_vars_docs.py (look for the generate_markdown function around the block that emits the warning lines) to emit the JSX <Warning> component lines instead of the "> [!WARNING]" blockquote—replace the three line blockquote output with the four strings: "<Warning>", the warning text line(s), and "</Warning>". After making that change, run make generate-env-vars-docs to re-generate docs so docs/environment-variables.md matches the new JSX warning format.

docs/tutorials/audio.md (1)
89-93: ⚠️ Potential issue | 🟠 Major
Remove user-specific home paths from sample outputs. These paths expose a developer-specific username/location and make the tutorial non-portable. Replace them with neutral placeholders (for example, `/path/to/aiperf/artifacts/...`).

Suggested doc-safe replacement

```diff
-/home/lkomali/aiperf/artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency4/profile_export_aiperf.csv
+/path/to/aiperf/artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency4/profile_export_aiperf.csv
...
-/home/lkomali/aiperf/artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency1/profile_export_aiperf.json
+/path/to/aiperf/artifacts/Qwen_Qwen2-Audio-7B-Instruct-openai-chat-concurrency1/profile_export_aiperf.json
```

Also applies to: 155-158
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/audio.md` around lines 89 - 93, The sample output contains developer-specific absolute home paths; locate the occurrences of the filenames "profile_export_aiperf.csv", "profile_export_aiperf.json", and "aiperf.log" in docs/tutorials/audio.md and replace their parent paths with neutral placeholders (e.g. "/path/to/aiperf/artifacts/.../profile_export_aiperf.csv", "/path/to/aiperf/artifacts/.../profile_export_aiperf.json", "/path/to/aiperf/artifacts/.../logs/aiperf.log") ensuring you update all instances (including the other occurrences mentioned) so no user-specific home directories remain.

docs/cli-options.md (1)
1-1077: ⚠️ Potential issue | 🟠 Major
This generated file is out of sync with the generator output. CI reports `pre-commit` hook `generate-cli-docs` modified files. Please regenerate and commit updated artifacts before merge.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/cli-options.md` around lines 1 - 1077, The docs/cli-options.md is stale and was modified by the pre-commit `generate-cli-docs` hook; regenerate the CLI docs using the same generator invoked by the pre-commit hook (run the `generate-cli-docs` step or the project script/Makefile target that produces cli docs), replace the checked-in docs/cli-options.md with the newly generated output, and commit the updated file so the pre-commit check passes; ensure you run the pre-commit hooks (or `pre-commit run generate-cli-docs --all-files`) locally before pushing.
♻️ Duplicate comments (7)
docs/tutorials/fixed-schedule.md (1)
57-57: ⚠️ Potential issue | 🟠 Major
Same MDX syntax issue affects benchmark run markers. The benchmark run markers use JSX-style comments (e.g., `{/* aiperf-run-vllm-default-openai-endpoint-server */}`) which will also render as literal text on GitHub. For consistency with the recommendation in the license header comment, these should also use HTML comment syntax if GitHub visibility is required.

Note: The markdownlint MD037 warnings (spaces inside emphasis markers) are false positives caused by the linter misinterpreting JSX comment syntax. If converted to HTML comments, these warnings will resolve.
Also applies to: 84-84, 121-121, 136-136
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/fixed-schedule.md` at line 57, Replace the JSX-style benchmark run markers like "{/* aiperf-run-vllm-default-openai-endpoint-server */}" with HTML comments "<!-- aiperf-run-vllm-default-openai-endpoint-server -->" so they don’t render as literal text on GitHub and so the MDX/markdown linter no longer misinterprets them; find each benchmark marker string (e.g., "aiperf-run-vllm-default-openai-endpoint-server" and the other similar markers present in the file) and convert the surrounding "{/* ... */}" to the HTML "<!-- ... -->" form.

docs/cli-options.md (1)
545-545: ⚠️ Potential issue | 🟡 Minor
Fix the missing space in the generated description (`one.Mutually`). This typo is still present; update the source docstring and regenerate this file.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/cli-options.md` at line 545, Typo: the generated description for the --num-prefix-prompts option contains "one.Mutually" (missing space). Edit the original docstring/option help text where --num-prefix-prompts is defined (search for the string "one.Mutually" or the help text for --num-prefix-prompts) and add the missing space so it reads "one. Mutually exclusive..."; then regenerate the docs/cli-options.md using the project's docs generation task/script to update the generated file so the fix appears in docs.

docs/diagrams/mixins.md (1)
9-31: ⚠️ Potential issue | 🔴 Critical
Fix Mermaid edge syntax (`*/}` → `-->`) to restore diagram rendering. All connectors in this segment use an invalid operator, so links won’t render.

Proposed fix

```diff
- A["BaseMixin<br/><em>Ensures proper inheritance chain</em>"] */} B["AIPerfLoggerMixin<br/><em>Lazy-evaluated logging with f-strings</em>"]
+ A["BaseMixin<br/><em>Ensures proper inheritance chain</em>"] --> B["AIPerfLoggerMixin<br/><em>Lazy-evaluated logging with f-strings</em>"]
```

Apply the same replacement for every `*/}` edge in this diagram.

```bash
#!/bin/bash
set -euo pipefail
rg -n '\*/}' docs/diagrams/mixins.md || true
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/diagrams/mixins.md` around lines 9 - 31, The Mermaid diagram uses invalid edge syntax `*/}`; replace every occurrence of `*/}` with the correct Mermaid connector `-->` for all edges in this block (e.g., the edges connecting BaseMixin → AIPerfLoggerMixin, AIPerfLoggerMixin → HooksMixin/TaskManagerMixin, HooksMixin → AIPerfLifecycleMixin, TaskManagerMixin → AIPerfLifecycleMixin, AIPerfLifecycleMixin → MessageBusClientMixin, MessageBusClientMixin → BaseService, BaseService → BaseComponentService/SystemController, and BaseComponentService → DatasetManager/TimingManager/RecordsManager/RecordProcessor/WorkerManager/Worker) so the diagram renders properly.

.cursor/skills/docs-to-fern/SKILL_md (2)
344-353: ⚠️ Potential issue | 🟡 Minor
SPDX conversion guidance is inconsistent with the repo’s current JSX header style.
This script converts SPDX HTML comments to YAML frontmatter; current docs in this PR are standardized on JSX comment headers. Align the skill instructions to one canonical format.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.cursor/skills/docs-to-fern/SKILL_md around lines 344 - 353, The SPDX conversion currently turns SPDX HTML comment blocks into YAML frontmatter using spdx_pattern/match/spdx_content/spdx_lines, but the repo uses JSX comment headers; update the logic to produce the canonical JSX header instead of YAML frontmatter: when spdx_pattern matches, build a JSX comment block (prefixed lines wrapped as {/* ... */}) using spdx_content/spdx_lines and insert it in place of the match, and ensure the final fallback re.sub(r'<!--(.*?)-->', r'{/* \1 */}', ...) remains or is adjusted to avoid double-wrapping already-converted SPDX blocks.
262-280: ⚠️ Potential issue | 🟠 Major
Update this skill to the config-only Fern model (no `fern/pages` content duplication). These sections still prescribe bulk-copying `docs/` into `fern/pages/` and navigation rooted in `../pages/...`, which conflicts with the agreed single-source docs approach.

Also applies to: 463-565, 666-667
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.cursor/skills/docs-to-fern/SKILL_md around lines 262 - 280, The instructions currently tell users to bulk-copy docs into fern/pages and use ../pages navigation (the find/while loop that builds target="fern/pages/$(...)" and the mkdir/cp steps); replace this with config-only Fern guidance: remove the find/while copy script and any mention of creating or using fern/pages, and instead show how to point the Fern configuration (site/navigation) at the existing docs/ source (and remove ../pages-* links), updating references where navigation items or links use ../pages/... to reference the original docs paths; ensure all mentions of target="fern/pages/..." and the copy/mkdir/cp steps are deleted and replaced with a brief note describing using Fern’s config to map docs as the single source.

docs/tutorials/sglang-image-generation.md (1)
247-247: ⚠️ Potential issue | 🟡 Minor
Add descriptive alt text to the three generated-image references. Line [247], Line [253], and Line [259] still use empty alt text (`![](...)`), which fails accessibility/lint checks.

Also applies to: 253-253, 259-259
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/sglang-image-generation.md` at line 247, Three image markdown references lack alt text; replace the empty brackets for the three images (../media/extracted-images/image-0001-00-00.jpg, image-0001-00-01.jpg, image-0001-00-02.jpg) with short descriptive alt text describing each generated image (e.g. "generated solar system scene" or whatever matches content) so the markdown becomes `![descriptive alt](path)` for each occurrence to satisfy accessibility/lint rules.

docs/server-metrics/server-metrics-reference.md (1)
1-4: ⚠️ Potential issue | 🟠 Major
SPDX wrapper issue is still present here. Line [1]-Line [4] still uses `{/* ... */}` in a `.md` file; this is the same rendering-portability issue previously flagged.
Verify each finding against the current code and only fix it if needed. In `@docs/server-metrics/server-metrics-reference.md` around lines 1 - 4, Replace the invalid JSX-style comment wrapper `{/* ... */}` at the top of the Markdown file with a proper Markdown-safe comment: remove the `{/*` and `*/}` tokens around the SPDX header and put the SPDX lines inside an HTML comment so they won't render (e.g., use an HTML comment containing the SPDX-FileCopyrightText and SPDX-License-Identifier lines); locate the existing block by searching for the `{/*` token and update the SPDX block accordingly (do not leave the JSX-style wrapper or expose raw license lines in rendered output).
🧹 Nitpick comments (3)
docs/tutorials/request-rate-concurrency.md (1)
38-42: Add blank line after table for consistency. The table should be followed by a blank line before the closing `</Note>` tag to comply with MD058 and maintain consistent spacing.

📝 Proposed formatting fix

```diff
 | 100 | 20 req/s | ~5.0 seconds |
+
 </Note>
```
Verify each finding against the current code and only fix it if needed. In `@docs/tutorials/request-rate-concurrency.md` around lines 38 - 42, Add a single blank line after the Markdown table (the block starting with "| Concurrency | Request Rate | Ramp-up Time |" and ending with "| 100 | 20 req/s | ~5.0 seconds |") so there is an empty line before the closing </Note> tag to satisfy MD058 and maintain consistent spacing.

.cursor/skills/docs-to-fern/SKILL_md (1)
885-889: Include `docs/**` in the publish trigger paths for config-only docs. If docs content remains in `docs/`, triggering only on `fern/**` will miss content-only documentation updates.

Suggested workflow snippet update

```diff
 on:
   push:
     branches: [main]
-    paths: ['fern/**']
+    paths: ['fern/**', 'docs/**']
```
Verify each finding against the current code and only fix it if needed. In @.cursor/skills/docs-to-fern/SKILL_md around lines 885 - 889, The GitHub Actions workflow trigger only watches fern/** and will miss changes under docs/**; update the workflow's push.paths configuration (the on.push.paths entry that currently reads paths: ['fern/**']) to also include docs/** (e.g., add 'docs/**' to the array) so content-only documentation updates under docs/ will trigger the workflow.

docs/tutorial.md (1)
12-12: Optional: remove spaces inside MDX tag comments to clear MD037 warnings. Line [12], Line [21], Line [23], Line [27], Line [30], and Line [42] use `{/* ... */}` with surrounding spaces; compact form avoids markdownlint noise.

✂️ Suggested cleanup

```diff
-{/* setup-vllm-default-openai-endpoint-server */}
+{/*setup-vllm-default-openai-endpoint-server*/}
...
-{/* /setup-vllm-default-openai-endpoint-server */}
+{/*/setup-vllm-default-openai-endpoint-server*/}
...
-{/* health-check-vllm-default-openai-endpoint-server */}
+{/*health-check-vllm-default-openai-endpoint-server*/}
...
-{/* /health-check-vllm-default-openai-endpoint-server */}
+{/*/health-check-vllm-default-openai-endpoint-server*/}
...
-{/* aiperf-run-vllm-default-openai-endpoint-server */}
+{/*aiperf-run-vllm-default-openai-endpoint-server*/}
...
-{/* /aiperf-run-vllm-default-openai-endpoint-server */}
+{/*/aiperf-run-vllm-default-openai-endpoint-server*/}
```

Also applies to: 21-21, 23-23, 27-27, 30-30, 42-42
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/tutorial.md` at line 12, Replace the MDX comment tags that contain extra spaces so they use a compact form; locate occurrences of the comment pattern `{/* setup-vllm-default-openai-endpoint-server */}` (and the other similar tags on the page) and remove the spaces after the opening brace and before the closing brace so they become `{/*setup-vllm-default-openai-endpoint-server*/}`, repeating the same fix for the other MDX comments mentioned.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/tutorials/custom-prompt-benchmarking.md`:
- Around line 41-43: The Markdown heading "# Create an input file with specific
text inputs" sits outside the fenced code block after the JSX marker {/*
aiperf-run-vllm-default-openai-endpoint-server */} and will render visibly;
either move that line inside the following triple-backtick code fence so it's
part of the code block or replace it with a JSX comment (e.g., wrap it with {/*
... */}) so it does not render; update the snippet containing the JSX marker and
the code fence accordingly to keep comments non-rendering.
In `@docs/tutorials/openai-text-endpoints.md`:
- Around line 1-4: Replace all JSX-style block comments "{/* ... */}" with
standard HTML comments "<!-- ... -->" throughout the document so the Markdown is
valid; search for the "{/*" and "*/}" tokens (they appear at the top and at
other comment spots) and convert each to the corresponding "<!--" and "-->"
form, preserving the original comment text and spacing.
---
Outside diff comments:
In `@docs/cli-options.md`:
- Around line 1-1077: The docs/cli-options.md is stale and was modified by the
pre-commit `generate-cli-docs` hook; regenerate the CLI docs using the same
generator invoked by the pre-commit hook (run the `generate-cli-docs` step or
the project script/Makefile target that produces cli docs), replace the
checked-in docs/cli-options.md with the newly generated output, and commit the
updated file so the pre-commit check passes; ensure you run the pre-commit hooks
(or `pre-commit run generate-cli-docs --all-files`) locally before pushing.
In `@docs/environment-variables.md`:
- Around line 1-23: The generated docs still use markdown blockquote warning
syntax; update the generator in tools/generate_env_vars_docs.py (look for the
generate_markdown function around the block that emits the warning lines) to
emit the JSX <Warning> component lines instead of the "> [!WARNING]"
blockquote; replace the three-line blockquote output with the four strings:
"<Warning>", the warning text line(s), and "</Warning>". After making that
change, run make generate-env-vars-docs to re-generate docs so
docs/environment-variables.md matches the new JSX warning format.
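As a rough sketch of that change (hypothetical: the real `generate_markdown` in `tools/generate_env_vars_docs.py` is certainly structured differently), the warning emitter would produce the JSX admonition lines instead of a blockquote:

```python
# Hypothetical sketch only; names and structure are illustrative,
# not copied from tools/generate_env_vars_docs.py.
def warning_block(text: str) -> list[str]:
    """Emit a Fern <Warning> admonition instead of a '> [!WARNING]' blockquote."""
    return ["<Warning>", text, "</Warning>", ""]

def generate_markdown(warning_text: str) -> str:
    lines = ["# Environment Variables", ""]
    lines.extend(warning_block(warning_text))  # JSX admonition, not '> [!WARNING]'
    return "\n".join(lines)

print(generate_markdown("Environment variables override config file values."))
```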
In `@docs/tutorials/arrival-patterns.md`:
- Around line 189-200: Update the fenced code block for the runnable CLI example
so the language tag is "bash" instead of "text" to improve copy/paste and syntax
highlighting; locate the block containing the "aiperf profile \ --model
your-model \ --url localhost:8000 \ --endpoint-type chat \ --streaming \
--request-rate 100 \ --arrival-pattern poisson \ --benchmark-duration 60 \
--output-dir results/poisson" command and replace the opening triple-backtick
fence from ```text to ```bash.
- Around line 14-20: Replace the three-block ASCII timeline (the fenced ```text
block showing "Constant Pattern", "Poisson Pattern", "Gamma (bursty)" and their
arrows) with a mermaid diagram: wrap a mermaid block (```mermaid) and create a
left-to-right flowchart (e.g., "flowchart LR") with nodes named "Constant
Pattern", "Poisson Pattern", "Gamma (bursty)" connected by arrows and include
sublabels "Perfect spacing", "Natural variance", "Clustered bursts" as node
descriptions or subnodes so the visual matches the original. Do the same
conversion for the other listed ASCII timeline blocks (lines 45-49, 65-69,
96-108, 121-127), but do not change any ASCII box-drawing tables that are inside
real command/output fences—leave those exact text blocks untouched.
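Following that prompt, the first timeline might be sketched as the mermaid block below (node text comes from the original labels; `flowchart LR` and the `<br/>` sublabels are assumptions about how the author wants it rendered):

```mermaid
flowchart LR
    A["Constant Pattern<br/>Perfect spacing"] --> B["Poisson Pattern<br/>Natural variance"]
    B --> C["Gamma (bursty)<br/>Clustered bursts"]
```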
In `@docs/tutorials/audio.md`:
- Around line 89-93: The sample output contains developer-specific absolute home
paths; locate the occurrences of the filenames "profile_export_aiperf.csv",
"profile_export_aiperf.json", and "aiperf.log" in docs/tutorials/audio.md and
replace their parent paths with neutral placeholders (e.g.
"/path/to/aiperf/artifacts/.../profile_export_aiperf.csv",
"/path/to/aiperf/artifacts/.../profile_export_aiperf.json",
"/path/to/aiperf/artifacts/.../logs/aiperf.log") ensuring you update all
instances (including the other occurrences mentioned) so no user-specific home
directories remain.
---
Duplicate comments:
In @.cursor/skills/docs-to-fern/SKILL_md:
- Around line 344-353: The SPDX conversion currently turns SPDX HTML comment
blocks into YAML frontmatter using spdx_pattern/match/spdx_content/spdx_lines,
but the repo uses JSX comment headers; update the logic to produce the canonical
JSX header instead of YAML frontmatter: when spdx_pattern matches, build a JSX
comment block (prefixed lines wrapped as {/* ... */}) using
spdx_content/spdx_lines and insert it in place of the match, and ensure the
final fallback re.sub(r'<!--(.*?)-->', r'{/* \1 */}', ...) remains or is
adjusted to avoid double-wrapping already-converted SPDX blocks.
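A minimal sketch of that conversion logic (helper names like `to_jsx_header` are hypothetical; the skill document describes the behavior, not this exact code):

```python
import re

# Matches SPDX HTML comment blocks such as
# "<!-- SPDX-License-Identifier: Apache-2.0 -->".
SPDX_PATTERN = re.compile(r"<!--\s*(SPDX[^>]*?)-->", re.DOTALL)

def to_jsx_header(markdown: str) -> str:
    # First convert SPDX comment blocks into the canonical JSX header.
    markdown = SPDX_PATTERN.sub(
        lambda m: "{/* " + m.group(1).strip() + " */}", markdown
    )
    # Fallback for remaining HTML comments; already-converted JSX blocks
    # no longer contain '<!--', so they cannot be double-wrapped.
    return re.sub(r"<!--(.*?)-->", r"{/* \1 */}", markdown, flags=re.DOTALL)

print(to_jsx_header("<!-- SPDX-License-Identifier: Apache-2.0 -->\n# Title"))
```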
- Around line 262-280: The instructions currently tell users to bulk-copy docs
into fern/pages and use ../pages navigation (the find/while loop that builds
target="fern/pages/$(...)" and the mkdir/cp steps); replace this with
config-only Fern guidance: remove the find/while copy script and any mention of
creating or using fern/pages, and instead show how to point the Fern
configuration (site/navigation) at the existing docs/ source (and remove
../pages-* links), updating references where navigation items or links use
../pages/... to reference the original docs paths; ensure all mentions of
target="fern/pages/..." and the copy/mkdir/cp steps are deleted and replaced
with a brief note describing using Fern’s config to map docs as the single
source.
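For illustration, a config-only navigation entry might look like the fragment below; the exact keys should be checked against Fern's `docs.yml` schema, and the relative paths here are assumptions:

```yaml
# Hypothetical fern/docs.yml fragment: navigation points at the existing
# docs/ tree instead of copies under fern/pages/.
navigation:
  - section: Tutorials
    contents:
      - page: Quick Start
        path: ../docs/tutorial.md
      - page: Arrival Patterns
        path: ../docs/tutorials/arrival-patterns.md
```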
In `@docs/cli-options.md`:
- Line 545: Typo: the generated description for the --num-prefix-prompts option
contains "one.Mutually" (missing space). Edit the original docstring/option help
text where --num-prefix-prompts is defined (search for the string "one.Mutually"
or the help text for --num-prefix-prompts) and add the missing space so it reads
"one. Mutually exclusive..."; then regenerate the docs/cli-options.md using the
project's docs generation task/script to update the generated file so the fix
appears in docs.
In `@docs/diagrams/mixins.md`:
- Around line 9-31: The Mermaid diagram uses invalid edge syntax `*/}`; replace
every occurrence of `*/}` with the correct Mermaid connector `-->` for all edges
in this block (e.g., the edges connecting BaseMixin → AIPerfLoggerMixin,
AIPerfLoggerMixin → HooksMixin/TaskManagerMixin, HooksMixin →
AIPerfLifecycleMixin, TaskManagerMixin → AIPerfLifecycleMixin,
AIPerfLifecycleMixin → MessageBusClientMixin, MessageBusClientMixin →
BaseService, BaseService → BaseComponentService/SystemController, and
BaseComponentService →
DatasetManager/TimingManager/RecordsManager/RecordProcessor/WorkerManager/Worker)
so the diagram renders properly.
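With that fix applied, the top of the diagram would read roughly as follows (abbreviated; node names are from the comment above, and `flowchart TD` is an assumption about the original diagram type):

```mermaid
flowchart TD
    BaseMixin --> AIPerfLoggerMixin
    AIPerfLoggerMixin --> HooksMixin
    AIPerfLoggerMixin --> TaskManagerMixin
    HooksMixin --> AIPerfLifecycleMixin
    TaskManagerMixin --> AIPerfLifecycleMixin
    AIPerfLifecycleMixin --> MessageBusClientMixin
    MessageBusClientMixin --> BaseService
    BaseService --> BaseComponentService
    BaseService --> SystemController
```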
In `@docs/server-metrics/server-metrics-reference.md`:
- Around line 1-4: Replace the invalid JSX-style comment wrapper `{/* ... */}`
at the top of the Markdown file with a proper Markdown-safe comment: remove the
`{/*` and `*/}` tokens around the SPDX header and put the SPDX lines inside an
HTML comment so they won't render (e.g., use an HTML comment containing the
SPDX-FileCopyrightText and SPDX-License-Identifier lines); locate the existing
block by searching for the `{/*` token and update the SPDX block accordingly (do
not leave the JSX-style wrapper or expose raw license lines in rendered output).
In `@docs/tutorials/fixed-schedule.md`:
- Line 57: Replace the JSX-style benchmark run markers like "{/*
aiperf-run-vllm-default-openai-endpoint-server */}" with HTML comments "<!--
aiperf-run-vllm-default-openai-endpoint-server -->" so they don’t render as
literal text on GitHub and so the MDX/markdown linter no longer misinterprets
them; find each benchmark marker string (e.g.,
"aiperf-run-vllm-default-openai-endpoint-server" and the other similar markers
present in the file) and convert the surrounding "{/* ... */}" to the HTML "<!--
... -->" form.
In `@docs/tutorials/sglang-image-generation.md`:
- Line 247: Three image markdown references lack alt text; replace the empty
brackets for the three images (../media/extracted-images/image-0001-00-00.jpg,
image-0001-00-01.jpg, image-0001-00-02.jpg) with short descriptive alt text
describing each generated image (e.g. "generated solar system scene" or whatever
matches content) so the markdown becomes `![<alt text>](<path>)` for each occurrence to
satisfy accessibility/lint rules.
---
Nitpick comments:
In @.cursor/skills/docs-to-fern/SKILL_md:
- Around line 885-889: The GitHub Actions workflow trigger only watches fern/**
and will miss changes under docs/**; update the workflow's push.paths
configuration (the on.push.paths entry that currently reads paths: ['fern/**'])
to also include docs/** (e.g., add 'docs/**' to the array) so content-only
documentation updates under docs/ will trigger the workflow.
In `@docs/tutorial.md`:
- Line 12: Replace the MDX comment tags that contain extra spaces so they use a
compact form; locate occurrences of the comment pattern `{/* setup-vllm-default-openai-endpoint-server */}` (and the other similar tags on the
page) and remove the spaces after the opening brace and before the closing brace
so they become `{/*setup-vllm-default-openai-endpoint-server*/}`, repeating the
same fix for the other MDX comments mentioned.
In `@docs/tutorials/request-rate-concurrency.md`:
- Around line 38-42: Add a single blank line after the Markdown table (the block
starting with "| Concurrency | Request Rate | Ramp-up Time |" and ending with "|
100 | 20 req/s | ~5.0 seconds |") so there is an empty line before
the closing </Note> tag to satisfy MD058 and maintain consistent spacing.
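A sketch of the fixed span (abbreviated to the table rows quoted in the comment above):

```markdown
| Concurrency | Request Rate | Ramp-up Time |
|-------------|--------------|--------------|
| 100         | 20 req/s     | ~5.0 seconds |

</Note>
```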
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📥 Commits
Reviewing files that changed from the base of the PR and between c754b47e75d79ab6f0d6f47095fc2569efaa35e5 and 44df8dfd670f342fab007a409f0c940792727435.
⛔ Files ignored due to path filters (24)
- `docs/diagrams/plot-examples/multi-run/config-experiment-classification/pareto-curve-throughput-per-gpu-vs-interactivity.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/multi-run/config-experiment-classification/ttft-vs-throughput.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/multi-run/pareto-curve-throughput-per-gpu-vs-interactivity.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/multi-run/pareto-curve-throughput-per-gpu-vs-latency.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/multi-run/theme-dark-mode/pareto-curve-throughput-per-gpu-vs-interactivity.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/multi-run/theme-dark-mode/pareto-curve-throughput-per-gpu-vs-latency.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/multi-run/theme-dark-mode/ttft-vs-throughput.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/multi-run/ttft-vs-throughput.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/dispersed-throughput-over-time.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/gpu-utilization-and-throughput-over-time.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/itl-over-time.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/latency-over-time.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/theme-dark-mode/gpu-utilization-and-throughput-over-time.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/theme-dark-mode/itl-over-time.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/theme-dark-mode/timeslices-itl.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/ttft-over-time.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/time-series/ttft-timeline.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/timeslices/timeslices-itl.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/timeslices/timeslices-latency.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/timeslices/timeslices-throughput-warning.png` is excluded by `!**/*.png`
- `docs/diagrams/plot-examples/single-run/timeslices/timeslices-ttft.png` is excluded by `!**/*.png`
- `docs/media/extracted-images/image-0001-00-00.jpg` is excluded by `!**/*.jpg`
- `docs/media/extracted-images/image-0002-00-00.jpg` is excluded by `!**/*.jpg`
- `docs/media/extracted-images/image-0003-00-00.jpg` is excluded by `!**/*.jpg`
📒 Files selected for processing (68)
- `.cursor/skills/docs-to-fern/SKILL_md`
- `docs/api/synthesis.md`
- `docs/architecture.md`
- `docs/benchmark-datasets.md`
- `docs/benchmark-modes/timing-modes-reference.md`
- `docs/benchmark-modes/trace-replay.md`
- `docs/cli-options.md`
- `docs/comprehensive-llm-benchmarking.md`
- `docs/dev/patterns.md`
- `docs/diagrams/metrics-flow.md`
- `docs/diagrams/mixins.md`
- `docs/environment-variables.md`
- `docs/genai-perf-feature-comparison.md`
- `docs/index.md`
- `docs/metrics-reference.md`
- `docs/migrating.md`
- `docs/plugins/creating-your-first-plugin.md`
- `docs/plugins/plugin-system.md`
- `docs/reference/tokenizer-auto-detection.md`
- `docs/reproducibility.md`
- `docs/server-metrics/server-metrics-json-schema.md`
- `docs/server-metrics/server-metrics-parquet-schema.md`
- `docs/server-metrics/server-metrics-reference.md`
- `docs/server-metrics/server-metrics.md`
- `docs/tutorial.md`
- `docs/tutorials/arrival-patterns.md`
- `docs/tutorials/audio.md`
- `docs/tutorials/custom-dataset.md`
- `docs/tutorials/custom-prompt-benchmarking.md`
- `docs/tutorials/embeddings.md`
- `docs/tutorials/fixed-schedule.md`
- `docs/tutorials/goodput.md`
- `docs/tutorials/gpu-telemetry.md`
- `docs/tutorials/http-trace-metrics.md`
- `docs/tutorials/huggingface-tgi.md`
- `docs/tutorials/local-tokenizer.md`
- `docs/tutorials/multi-run-confidence.md`
- `docs/tutorials/multi-turn.md`
- `docs/tutorials/multi-url-load-balancing.md`
- `docs/tutorials/openai-text-endpoints.md`
- `docs/tutorials/plot.md`
- `docs/tutorials/prefill-concurrency.md`
- `docs/tutorials/prefix-synthesis.md`
- `docs/tutorials/ramping.md`
- `docs/tutorials/rankings.md`
- `docs/tutorials/request-cancellation.md`
- `docs/tutorials/request-rate-concurrency.md`
- `docs/tutorials/sequence-distributions.md`
- `docs/tutorials/sglang-image-generation.md`
- `docs/tutorials/sglang-video-generation.md`
- `docs/tutorials/sharegpt.md`
- `docs/tutorials/synthetic-video.md`
- `docs/tutorials/template-endpoint.md`
- `docs/tutorials/time-based-benchmarking.md`
- `docs/tutorials/timeslices.md`
- `docs/tutorials/ui-types.md`
- `docs/tutorials/user-centric-timing.md`
- `docs/tutorials/vision.md`
- `docs/tutorials/warmup.md`
- `docs/tutorials/working-with-profile-exports.md`
- `fern/docs.yml`
- `fern/fern.config.json`
- `fern/versions/next.yml`
- `mkdocs.yml`
- `tests/ci/test_docs_end_to_end/parser.py`
- `tools/_core.py`
- `tools/generate_cli_docs.py`
- `tools/generate_env_vars_docs.py`
✅ Files skipped from review due to trivial changes (1)
- docs/dev/patterns.md
🚧 Files skipped from review as they are similar to previous changes (29)
- docs/tutorials/multi-run-confidence.md
- docs/genai-perf-feature-comparison.md
- docs/tutorials/user-centric-timing.md
- docs/reference/tokenizer-auto-detection.md
- docs/benchmark-datasets.md
- docs/plugins/plugin-system.md
- docs/tutorials/huggingface-tgi.md
- docs/tutorials/synthetic-video.md
- fern/fern.config.json
- docs/tutorials/ui-types.md
- docs/architecture.md
- docs/tutorials/gpu-telemetry.md
- fern/docs.yml
- docs/metrics-reference.md
- docs/tutorials/template-endpoint.md
- fern/versions/next.yml
- docs/tutorials/rankings.md
- docs/tutorials/warmup.md
- docs/tutorials/time-based-benchmarking.md
- docs/tutorials/ramping.md
- docs/benchmark-modes/timing-modes-reference.md
- docs/diagrams/metrics-flow.md
- docs/tutorials/working-with-profile-exports.md
- docs/index.md
- docs/reproducibility.md
- docs/tutorials/multi-url-load-balancing.md
- docs/api/synthesis.md
- docs/server-metrics/server-metrics-json-schema.md
- docs/tutorials/prefix-synthesis.md
- 4f0b5f8 to dd3bc7f (Compare)
- 915d0dd to 5a4ca96 (Compare)
- 5a4ca96 to 649cd79 (Compare)
Summary by CodeRabbit
New Features
Documentation