
Conversation

@mazam-lab
Contributor

@mazam-lab mazam-lab commented Nov 5, 2025

Still missing a few values and some additional plots.

Would like some feedback on the general direction of how the plots and tables are going to look as well as information on how to properly

(note that all the changes up to October 14th should already be reflected in this repo via the squashed commit that occurred in my last PR)

Summary by CodeRabbit

  • Documentation
    • Added comprehensive runtime measurement guidance for model fine‑tuning: wall‑clock timings, per‑epoch and full‑run results, sanity checks, and data preprocessing notes.
    • Includes comparative analyses (model and hardware variants), adjusted scenario breakdowns, tables/graphs, multipliers, variability guidance, and references for reproducible performance evaluation.

@coderabbitai

coderabbitai bot commented Nov 5, 2025

Walkthrough

Added a new documentation file providing wall-clock runtime measurements for model fine-tuning, including experiment notes, per‑epoch timings, comparative analyses across hardware and model versions, dataset preprocessing details, and references to graphs and sanity checks. (49 words)
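To make the walkthrough's "multiplier/difference calculations" concrete, here is a minimal sketch of how per-epoch wall-clock times and a hardware multiplier can be derived from logged timestamps. The log format, placeholder durations, and variable names are illustrative assumptions; they are not taken from the example scripts or from the measured data.

```python
from datetime import datetime

# Illustrative timestamps; the real scripts log in their own format.
epoch_start = datetime.fromisoformat("2025-11-05T10:00:00")
epoch_end = datetime.fromisoformat("2025-11-05T10:42:30")

# Wall-clock time for one epoch, in seconds.
epoch_seconds = (epoch_end - epoch_start).total_seconds()

# Each configuration was run twice and the two runs averaged.
run_times = [epoch_seconds, 2531.0]  # second value is a placeholder
avg_seconds = sum(run_times) / len(run_times)

# Hardware multiplier: how much faster one setup is than the other.
a100_avg, h100_avg = 2550.0, 1700.0  # placeholder averages, not measured values
print(f"avg epoch: {avg_seconds:.0f}s, A100/H100 multiplier: {a100_avg / h100_avg:.2f}x")
```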

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| **Runtime Measurement Documentation**<br>`examples/docs/runtime_estimates.md` | Added a new documentation file containing experiment defaults and notes, per‑epoch training time tables, hardware/model/dataset comparison sections (A100 vs H100, Granite 3.3 vs 4, SFT vs OSFT), multiplier/difference calculations, adjusted scenarios, dataset preprocessing and token/sequence considerations, sanity checks, and references to graphs/images. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Areas for attention:
    • Verify numeric timing values and derived multipliers/differences.
    • Confirm referenced images/graphs exist and paths are correct.
    • Check dataset preprocessing descriptions, token/sequence assumptions, and any dataset/model links for accuracy.
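For the dataset preprocessing point above: the doc states both datasets were downloaded from HuggingFace and exported to .jsonl. A minimal sketch of that step, assuming the standard `datasets` package; the output filenames are illustrative and some datasets may require an explicit config name.

```python
from datasets import load_dataset

# Download the training splits referenced in the doc and export them as .jsonl.
# Filenames are illustrative; a config name may be required for some datasets.
table_gpt = load_dataset("LipengCS/Table-GPT", split="train")
table_gpt.to_json("table_gpt_train.jsonl", lines=True)

bespoke = load_dataset("bespokelabs/Bespoke-Stratos-17k", split="train")
bespoke.to_json("bespoke_stratos_17k_train.jsonl", lines=True)
```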

Poem

🐰 I hopped through logs at dawn's first light,
Timings tallied, tables bright,
GPUs humming, epochs align,
Numbers neat, each row and line,
A tiny carrot of insight.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change in the PR: adding documentation for runtime estimates. It is specific, clear, and directly related to the new file being added. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@mazam-lab mazam-lab marked this pull request as ready for review November 7, 2025 18:21
@mazam-lab
Contributor Author

All missing information has been added; this is ready for a full review


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (5)
examples/docs/runtime_estimates.md (5)

9-10: Fix unordered list indentation.

Markdown linting expects 2-space indentation for nested list items, but these lines use 4-space indentation.

Apply this diff to use consistent 2-space indentation:

 - The experiments were conducted by using the default settings provided by `sft_granite_example.py` and `osft_granite_example.py`
-    - For SFT, `max_tokens_per_gpu=25000` and `max_seq_len=20000`
-    - For OSFT, `max_tokens_per_gpu=10000`, `max_seq_len=4096`, and `unfreeze_rank_ratio=0.3`
+  - For SFT, `max_tokens_per_gpu=25000` and `max_seq_len=20000`
+  - For OSFT, `max_tokens_per_gpu=10000`, `max_seq_len=4096`, and `unfreeze_rank_ratio=0.3`
 - **Models**: Two models were tested, **Granite 3.3 8B**, and **Granite 4 Tiny Preview** (a Mixture-of-Experts model that also has 8B Parameters)
 - **Hardware**: Two different hardware configurations were tested, a server with **8x A100s**, and an Openshift cluster with **8x H100s**. 
 - **Datasets**: Two datasets were tested, a simple dataset in Table-GPT and a much larger and longer dataset in Bespoke-Stratos-17k.
-    - Please note that both datasets were obtained by downloading the dataset from HuggingFace and then extracting the .jsonl file. 
+  - Please note that both datasets were obtained by downloading the dataset from HuggingFace and then extracting the .jsonl file. 
 - All experiments were run for the first full epoch two times, with the displayed time being the average of the two times. 
-    - **Please be aware that time for later epochs may vary**
-    - On the A100 machine, the variation between the two runs was negligible, never more than 6 seconds. 
-    - The variation is a bit larger on the H100 machine, especially during the first run of a pod (the first result was discarded and reran if it varied significantly)
+  - **Please be aware that time for later epochs may vary**
+  - On the A100 machine, the variation between the two runs was negligible, never more than 6 seconds. 
+  - The variation is a bit larger on the H100 machine, especially during the first run of a pod (the first result was discarded and reran if it varied significantly)
 - The time measurement is calculated by using the timestamps logged during the training process in the above scripts
 - By default, OSFT makes use of Liger Kernels to improve memory usage and runtime. However, as of Nov 7th 2025, Liger Kernels currently don't have built-in support for Granite 4
-    - As a result, the script was modified for allow Liger Kernels to be disabled for certain experiments
-    - The tables will be updated once support for Liger Kernels is added. 
+  - As a result, the script was modified for allow Liger Kernels to be disabled for certain experiments
+  - The tables will be updated once support for Liger Kernels is added. 
 - Many of these tests had the checkpointing hardcoded to be disabled in the script (set `checkpoint_at_epoch=False` and `accelerate_full_state_at_epoch=False`)
-    - This does not appear to impact runtime of the actual training loop
-    - This was mostly done to conserve disk space due to checkpoints being very large (tens of GB per epoch), which can cause DiskPressure on OpenShift
+  - This does not appear to impact runtime of the actual training loop
+  - This was mostly done to conserve disk space due to checkpoints being very large (tens of GB per epoch), which can cause DiskPressure on OpenShift

Also applies to: 14-18, 21-25
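For readers skimming the diff above, a minimal sketch gathering the quoted defaults in one place. Only the parameter values come from the doc; the dicts and the `run_training` helper are hypothetical stand-ins, not the actual structure of `sft_granite_example.py` or `osft_granite_example.py`.

```python
# Illustrative only: parameter values are the defaults quoted in the doc;
# the surrounding structure is assumed, not the example scripts' real API.
sft_settings = {
    "max_tokens_per_gpu": 25_000,
    "max_seq_len": 20_000,
    # Checkpointing was hardcoded off in many runs to avoid DiskPressure on OpenShift.
    "checkpoint_at_epoch": False,
    "accelerate_full_state_at_epoch": False,
}

osft_settings = {
    "max_tokens_per_gpu": 10_000,
    "max_seq_len": 4096,
    "unfreeze_rank_ratio": 0.3,
    "checkpoint_at_epoch": False,
    "accelerate_full_state_at_epoch": False,
}

def run_training(method: str, settings: dict) -> None:
    """Hypothetical stand-in for invoking the example scripts with these settings."""
    print(f"Running {method} with {settings}")

run_training("SFT", sft_settings)
run_training("OSFT", osft_settings)
```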


113-119: Clarify table column headers for dataset/model breakdowns.

The header "Granite 3.3 Bespoke" and "Granite 4 Bespoke" for columns 3–4 is misleading given that column 2 contains Table-GPT statistics. Since line 109 notes that Granite 3.3 and 4 differ negligibly for Table-GPT, the table structure should more clearly indicate which column represents which dataset.

Revise the table header to be clearer:

-| Stat                 | Table-GPT   | Granite 3.3 Bespoke | Granite 4 Bespoke |
+| Stat                 | Table-GPT | Bespoke (Granite 3.3) | Bespoke (Granite 4) |

Or, if the differences matter, split into separate tables labeled "Table-GPT Stats" and "Bespoke Stats" to remove ambiguity.


150-151: Use hyphenated compound adjectives before nouns.

Grammar: "open source" should be "open-source" when used as a compound adjective modifying a noun.

Apply this diff:

-Granite 3.3 is an open source **8B Parameter Large Language** Instruct model
-Granite 4 is still in preview stages, for these runs we use Tiny Preview, which is an open source **7B Parameter Hybrid Mixture-of-Experts** Instruct Model
+Granite 3.3 is an open-source **8B Parameter Large Language** Instruct model
+Granite 4 is still in preview stages, for these runs we use Tiny Preview, which is an open-source **7B Parameter Hybrid Mixture-of-Experts** Instruct Model

Also, "8B Parameter" and "7B Parameter" should be hyphenated: 8B-Parameter and 7B-Parameter.

-Granite 3.3 is an open-source **8B Parameter Large Language** Instruct model
+Granite 3.3 is an open-source **8B-Parameter Large-Language** Instruct model
-Granite 4 is still in preview stages, for these runs we use Tiny Preview, which is an open-source **7B Parameter Hybrid Mixture-of-Experts** Instruct Model
+Granite 4 is still in preview stages, for these runs we use Tiny Preview, which is an open-source **7B-Parameter Hybrid Mixture-of-Experts** Instruct Model

181-181: Use a regular heading instead of bold text; improve text conciseness.

Line 181 uses bold emphasis to introduce a note, but this should be a proper heading (###). Additionally, "All of the" should be shortened to "All" for clarity.

Apply this diff:

-**All of the measured times are for a single trial only! They are NOT the average of multiple trials**
+### Note on Trial Counts
+
+All measured times are for a single trial only. They are NOT the average of multiple trials.

25-25: Replace vague intensifier with concrete descriptor.

"very large" is imprecise. Use a specific measurement or qualifier.

-    - This was mostly done to conserve disk space due to checkpoints being very large (tens of GB per epoch), which can cause DiskPressure on OpenShift
+  - This was mostly done to conserve disk space due to checkpoints being extremely large (tens of GB per epoch), which can cause DiskPressure on OpenShift

Alternatively, if space is the key constraint, rephrase to:

+  - This was mostly done to conserve disk space; checkpoints are tens of GB per epoch and can cause DiskPressure on OpenShift
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ce5903a and c65a094.

⛔ Files ignored due to path filters (6)
  • examples/docs/osft_datasets.png is excluded by !**/*.png
  • examples/docs/osft_hardware.png is excluded by !**/*.png
  • examples/docs/osft_models.png is excluded by !**/*.png
  • examples/docs/sft_datasets.png is excluded by !**/*.png
  • examples/docs/sft_hardware.png is excluded by !**/*.png
  • examples/docs/sft_models.png is excluded by !**/*.png
📒 Files selected for processing (1)
  • examples/docs/runtime_estimates.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/docs/runtime_estimates.md

[style] ~25-~25: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ...rve disk space due to checkpoints being very large (tens of GB per epoch), which can cause...

(EN_WEAK_ADJECTIVE)


[grammar] ~150-~150: Use a hyphen to join words.
Context: ...3.3 vs Granite 4 Granite 3.3 is an open source 8B Parameter Large Language I...

(QB_NEW_EN_HYPHEN)


[grammar] ~151-~151: Use a hyphen to join words.
Context: ...ns we use Tiny Preview, which is an open source **7B Parameter Hybrid Mixture-of-...

(QB_NEW_EN_HYPHEN)


[style] ~181-~181: Consider removing “of” to be more concise
Context: ...to the full training cycle measures. **All of the measured times are for a single trial o...

(ALL_OF_THE)

🪛 markdownlint-cli2 (0.18.1)
examples/docs/runtime_estimates.md

9-9: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


10-10: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


14-14: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


16-16: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


17-17: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


18-18: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


21-21: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


22-22: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


24-24: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


25-25: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


57-57: Bare URL used

(MD034, no-bare-urls)


69-69: Link and image reference definitions should be needed
Duplicate link or image reference definition: "logo"

(MD053, link-image-reference-definitions)


91-91: Bare URL used

(MD034, no-bare-urls)


91-91: Bare URL used

(MD034, no-bare-urls)


125-125: Link and image reference definitions should be needed
Duplicate link or image reference definition: "logo"

(MD053, link-image-reference-definitions)


129-129: Link and image reference definitions should be needed
Duplicate link or image reference definition: "logo"

(MD053, link-image-reference-definitions)


150-150: Bare URL used

(MD034, no-bare-urls)


151-151: Bare URL used

(MD034, no-bare-urls)


157-157: Link and image reference definitions should be needed
Duplicate link or image reference definition: "logo"

(MD053, link-image-reference-definitions)


161-161: Link and image reference definitions should be needed
Duplicate link or image reference definition: "logo"

(MD053, link-image-reference-definitions)


181-181: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

🔇 Additional comments (1)
examples/docs/runtime_estimates.md (1)

63-70: Verify that referenced image files exist in the repository.

The documentation references PNG files on GitHub (e.g., sft_hardware.png, osft_datasets.png). If these images are placeholders or not yet committed, the links will be broken for readers.

Please confirm that the following image files exist on the main branch at the paths specified:

  • examples/docs/sft_hardware.png
  • examples/docs/osft_hardware.png
  • examples/docs/sft_datasets.png
  • examples/docs/osft_datasets.png
  • examples/docs/sft_models.png
  • examples/docs/osft_models.png

If these files are not yet available, consider adding a note (e.g., "Graphs coming soon") or temporarily removing the image sections to avoid broken links in documentation.

Also applies to: 123-130, 155-162
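One way to script this check, assuming it is run from the repository root; a minimal sketch:

```python
from pathlib import Path

# Image paths referenced by examples/docs/runtime_estimates.md.
expected = [
    "examples/docs/sft_hardware.png",
    "examples/docs/osft_hardware.png",
    "examples/docs/sft_datasets.png",
    "examples/docs/osft_datasets.png",
    "examples/docs/sft_models.png",
    "examples/docs/osft_models.png",
]

missing = [p for p in expected if not Path(p).is_file()]
if missing:
    print("Missing images:", *missing, sep="\n  ")
else:
    print("All referenced images are present.")
```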


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
examples/docs/runtime_estimates.md (1)

57-57: Wrap bare URLs in markdown link syntax

Bare URLs should be wrapped in markdown links for proper rendering and accessibility. This was flagged in a previous review and remains unaddressed.

Apply these diffs:

- The A100 and H100 both contain 80 GB of VRAM per GPUs, and both setups contain 8 GPUs. However, the H100 higher FLOPs and more cores than the A100. See Wikipedia: https://en.wikipedia.org/wiki/Hopper_(microarchitecture)#H100_accelerator_and_DGX_H100
+ The A100 and H100 both contain 80 GB of VRAM per GPUs, and both setups contain 8 GPUs. However, the H100 higher FLOPs and more cores than the A100. See [Wikipedia on H100](https://en.wikipedia.org/wiki/Hopper_(microarchitecture)#H100_accelerator_and_DGX_H100)
- The two datasets used were Table-GPT (https://huggingface.co/datasets/LipengCS/Table-GPT) and Bespoke-Stratos-17k (https://huggingface.co/datasets/bespokelabs/Bespoke-Stratos-17k).
+ The two datasets used were [Table-GPT](https://huggingface.co/datasets/LipengCS/Table-GPT) and [Bespoke-Stratos-17k](https://huggingface.co/datasets/bespokelabs/Bespoke-Stratos-17k).
- Granite 3.3 is an open source **8B Parameter Large Language** Instruct model https://huggingface.co/ibm-granite/granite-3.3-8b-instruct
- Granite 4 is still in preview stages, for these runs we use Tiny Preview, which is an open source **7B Parameter Hybrid Mixture-of-Experts** Instruct Model https://huggingface.co/ibm-granite/granite-4.0-tiny-preview
+ Granite 3.3 is an open source **8B Parameter Large Language** Instruct model [on HuggingFace](https://huggingface.co/ibm-granite/granite-3.3-8b-instruct)
+ Granite 4 is still in preview stages, for these runs we use Tiny Preview, which is an open source **7B Parameter Hybrid Mixture-of-Experts** Instruct Model [on HuggingFace](https://huggingface.co/ibm-granite/granite-4.0-tiny-preview)

Also applies to: 91-91, 150-151
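If it helps, a rough sketch of how the remaining bare URLs could be wrapped mechanically; the regex is a simplification (URLs that themselves contain parentheses, like the Wikipedia Hopper_(microarchitecture) anchor, need hand-editing), so the result should still be reviewed.

```python
import re
from pathlib import Path

# Rough heuristic: wrap bare http(s) URLs in markdown link syntax.
# URLs already written as [text](url) are skipped via the lookbehind.
BARE_URL = re.compile(r"(?<!\]\()https?://[^\s)]+")

doc = Path("examples/docs/runtime_estimates.md")
text = doc.read_text()
fixed = BARE_URL.sub(lambda m: f"[{m.group(0)}]({m.group(0)})", text)
doc.write_text(fixed)
```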

🧹 Nitpick comments (3)
examples/docs/runtime_estimates.md (3)

150-151: Use hyphens in compound modifiers.

Compound adjectives before nouns should be hyphenated for grammatical correctness.

Apply this diff:

- Granite 3.3 is an open source **8B Parameter Large Language** Instruct model [on HuggingFace](https://huggingface.co/ibm-granite/granite-3.3-8b-instruct)
- Granite 4 is still in preview stages, for these runs we use Tiny Preview, which is an open source **7B Parameter Hybrid Mixture-of-Experts** Instruct Model [on HuggingFace](https://huggingface.co/ibm-granite/granite-4.0-tiny-preview)
+ Granite 3.3 is an open source **8B-Parameter Large-Language** Instruct model [on HuggingFace](https://huggingface.co/ibm-granite/granite-3.3-8b-instruct)
+ Granite 4 is still in preview stages, for these runs we use Tiny Preview, which is an open source **7B-Parameter Hybrid Mixture-of-Experts** Instruct Model [on HuggingFace](https://huggingface.co/ibm-granite/granite-4.0-tiny-preview)

181-181: Improve emphasis line wording and consider structured highlighting.

Line 181 uses bold emphasis for an important caveat but could be clearer. Additionally, "All of the" is unnecessarily verbose. Consider a blockquote or code block for better visual hierarchy, or simplify to normal emphasis:

- **All of the measured times are for a single trial only! They are NOT the average of multiple trials**
+ > **Note:** All measured times are for a single trial only, not the average of multiple trials.

Alternatively, if keeping emphasis:

- **All of the measured times are for a single trial only! They are NOT the average of multiple trials**
+ **Note:** All measured times are for a single trial only—not the average of multiple trials.

25-25: Replace weak intensifier "very large" with specific descriptor.

Given that you specify "tens of GB per epoch," the size is already clear. Either remove the intensifier or use a more specific adjective:

- This was mostly done to conserve disk space due to checkpoints being very large (tens of GB per epoch), which can cause DiskPressure on OpenShift
+ This was mostly done to conserve disk space due to checkpoints being substantial (tens of GB per epoch), which can cause DiskPressure on OpenShift

Or simply:

- This was mostly done to conserve disk space due to checkpoints being very large (tens of GB per epoch), which can cause DiskPressure on OpenShift
+ This was mostly done to conserve disk space since checkpoints are quite large—tens of GB per epoch—which can cause DiskPressure on OpenShift.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c65a094 and aee38d9.

📒 Files selected for processing (1)
  • examples/docs/runtime_estimates.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/docs/runtime_estimates.md

[style] ~25-~25: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ...rve disk space due to checkpoints being very large (tens of GB per epoch), which can cause...

(EN_WEAK_ADJECTIVE)


[grammar] ~150-~150: Use a hyphen to join words.
Context: ...3.3 vs Granite 4 Granite 3.3 is an open source 8B Parameter Large Language I...

(QB_NEW_EN_HYPHEN)


[grammar] ~151-~151: Use a hyphen to join words.
Context: ...ns we use Tiny Preview, which is an open source **7B Parameter Hybrid Mixture-of-...

(QB_NEW_EN_HYPHEN)


[style] ~181-~181: Consider removing “of” to be more concise
Context: ...to the full training cycle measures. **All of the measured times are for a single trial o...

(ALL_OF_THE)

🪛 markdownlint-cli2 (0.18.1)
examples/docs/runtime_estimates.md

9-9: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


10-10: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


14-14: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


16-16: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


17-17: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


18-18: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


21-21: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


22-22: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


24-24: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


25-25: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


57-57: Bare URL used

(MD034, no-bare-urls)


91-91: Bare URL used

(MD034, no-bare-urls)


91-91: Bare URL used

(MD034, no-bare-urls)


150-150: Bare URL used

(MD034, no-bare-urls)


151-151: Bare URL used

(MD034, no-bare-urls)


181-181: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

Collaborator

@NikhilNayak-debug NikhilNayak-debug left a comment


Added a couple of comments, otherwise the PR looks good to me!


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (3)
examples/docs/runtime_estimates.md (3)

59-60: Wrap bare URLs in markdown link syntax.

These lines contain bare URLs that should be wrapped in markdown links for proper rendering and accessibility.

Apply this diff:

-Granite 3.3 is an open source **8B Parameter Large Language** Instruct model https://huggingface.co/ibm-granite/granite-3.3-8b-instruct
-Granite 4 is still in preview stages, for these runs we use Tiny Preview, which is an open source **7B Parameter Hybrid Mixture-of-Experts** Instruct Model https://huggingface.co/ibm-granite/granite-4.0-tiny-preview 
+Granite 3.3 is an open source **8B Parameter Large Language** Instruct model [on HuggingFace](https://huggingface.co/ibm-granite/granite-3.3-8b-instruct)
+Granite 4 is still in preview stages, for these runs we use Tiny Preview, which is an open source **7B Parameter Hybrid Mixture-of-Experts** Instruct Model [on HuggingFace](https://huggingface.co/ibm-granite/granite-4.0-tiny-preview)

113-113: Wrap bare URLs in markdown link syntax.

This line contains two bare URLs that should be wrapped in markdown links for proper rendering and accessibility.

Apply this diff:

-The two datasets used were Table-GPT (https://huggingface.co/datasets/LipengCS/Table-GPT) and Bespoke-Stratos-17k (https://huggingface.co/datasets/bespokelabs/Bespoke-Stratos-17k). Both datasets are using the training split. Each dataset was downloaded via Huggingface's datasets package, with the .jsonl file extracted for use in Training-Hub.
+The two datasets used were [Table-GPT](https://huggingface.co/datasets/LipengCS/Table-GPT) and [Bespoke-Stratos-17k](https://huggingface.co/datasets/bespokelabs/Bespoke-Stratos-17k). Both datasets are using the training split. Each dataset was downloaded via Huggingface's datasets package, with the .jsonl file extracted for use in Training-Hub.

88-88: Wrap bare URL in markdown link syntax.

This bare URL should be wrapped in a markdown link for proper rendering and accessibility.

Apply this diff:

-The A100 and H100 both contain 80 GB of VRAM per GPUs, and both setups contain 8 GPUs. However, the H100 higher FLOPs and more cores than the A100. See Wikipedia: https://en.wikipedia.org/wiki/Hopper_(microarchitecture)#H100_accelerator_and_DGX_H100 
+The A100 and H100 both contain 80 GB of VRAM per GPUs, and both setups contain 8 GPUs. However, the H100 higher FLOPs and more cores than the A100. See [Wikipedia on H100 accelerator](https://en.wikipedia.org/wiki/Hopper_(microarchitecture)#H100_accelerator_and_DGX_H100)
🧹 Nitpick comments (1)
examples/docs/runtime_estimates.md (1)

263-263: Convert emphasis to proper heading syntax.

Line 263 uses bold emphasis (**...**) where a proper markdown heading (##) would be more appropriate for section structure.

Apply this diff:

-**All of the measured times are for a single trial only! They are NOT the average of multiple trials**
+## Note on Trial Count
+
+All of the measured times are for a single trial only! They are NOT the average of multiple trials.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aee38d9 and 4a6c77c.

📒 Files selected for processing (1)
  • examples/docs/runtime_estimates.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
examples/docs/runtime_estimates.md

[style] ~20-~20: For conciseness, consider replacing this expression with an adverb.
Context: ...details about this warm-up aren't known at the moment, but will be added when more informatio...

(AT_THE_MOMENT)


[style] ~27-~27: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ...rve disk space due to checkpoints being very large (tens of GB per epoch), which can cause...

(EN_WEAK_ADJECTIVE)


[grammar] ~59-~59: Use a hyphen to join words.
Context: ...3.3 vs Granite 4 Granite 3.3 is an open source 8B Parameter Large Language I...

(QB_NEW_EN_HYPHEN)


[grammar] ~60-~60: Use a hyphen to join words.
Context: ...ns we use Tiny Preview, which is an open source **7B Parameter Hybrid Mixture-of-...

(QB_NEW_EN_HYPHEN)


[style] ~160-~160: This wording can make your sentence hard to follow. Try rephrasing for improved clarity.
Context: ...ens_per_gpu` have an impact on the time due to it effectively serving as a batch size, but `max_seq_len` prev...

(DUE_TO_BECAUSE)


[style] ~263-~263: Consider removing “of” to be more concise
Context: ...to the full training cycle measures. **All of the measured times are for a single trial o...

(ALL_OF_THE)

🪛 markdownlint-cli2 (0.18.1)
examples/docs/runtime_estimates.md

9-9: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


10-10: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


14-14: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


16-16: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


17-17: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


18-18: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


19-19: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


20-20: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


23-23: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


24-24: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


26-26: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


27-27: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


59-59: Bare URL used

(MD034, no-bare-urls)


60-60: Bare URL used

(MD034, no-bare-urls)


88-88: Bare URL used

(MD034, no-bare-urls)


113-113: Bare URL used

(MD034, no-bare-urls)


113-113: Bare URL used

(MD034, no-bare-urls)


263-263: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

Comment on lines +9 to +10
- For SFT, `max_tokens_per_gpu=25000` and `max_seq_len=20000`
- For OSFT, `max_tokens_per_gpu=10000`, `max_seq_len=4096`, and `unfreeze_rank_ratio=0.3`

🛠️ Refactor suggestion | 🟠 Major

Fix markdown list indentation to use 2 spaces instead of 4.

Multiple nested list items use 4-space indentation but should use 2 spaces per markdown conventions for proper rendering and consistency.

Apply this diff to fix the indentation across all nested list items:

 - The experiments were conducted by using the default settings provided by `sft_granite_example.py` and `osft_granite_example.py`
-    - For SFT, `max_tokens_per_gpu=25000` and `max_seq_len=20000`
-    - For OSFT, `max_tokens_per_gpu=10000`, `max_seq_len=4096`, and `unfreeze_rank_ratio=0.3`
+  - For SFT, `max_tokens_per_gpu=25000` and `max_seq_len=20000`
+  - For OSFT, `max_tokens_per_gpu=10000`, `max_seq_len=4096`, and `unfreeze_rank_ratio=0.3`
 - **Models**: Two models were tested, **Granite 3.3 8B**, and **Granite 4 Tiny Preview** (a Mixture-of-Experts model that also has 8B Parameters)
 - **Hardware**: Two different hardware configurations were tested, a server with **8x A100s**, and an Openshift cluster with **8x H100s**. 
 - **Datasets**: Two datasets were tested, a simple dataset in Table-GPT and a much larger and longer dataset in Bespoke-Stratos-17k.
-    - Please note that both datasets were obtained by downloading the dataset from HuggingFace and then extracting the .jsonl file. 
+  - Please note that both datasets were obtained by downloading the dataset from HuggingFace and then extracting the .jsonl file. 
 - All experiments were run for the first full epoch two times, with the displayed time being the average of the two times. 
-    - **Please be aware that time for later epochs may vary**
-    - On the A100 machine, the variation between the two runs was negligible, never more than 6 seconds. 
-    - The variation is a bit larger on the H100 machine, especially during the first run of a pod (the first result was discarded and reran if it varied significantly)
-    - **Granite 4 tends to require some amount of warm-up before its first usage.** The shown times for Granite 4 are for runs where the warm-up does not occur. Typically, the warm-up adds about an additional 1 minute to the runtime.
-        - The reasons for and details about this warm-up aren't known at the moment, but will be added when more information is gathered. Please keep in mind Granite 4 is still in preview. 
+  - **Please be aware that time for later epochs may vary**
+  - On the A100 machine, the variation between the two runs was negligible, never more than 6 seconds. 
+  - The variation is a bit larger on the H100 machine, especially during the first run of a pod (the first result was discarded and reran if it varied significantly)
+  - **Granite 4 tends to require some amount of warm-up before its first usage.** The shown times for Granite 4 are for runs where the warm-up does not occur. Typically, the warm-up adds about an additional 1 minute to the runtime.
+    - The reasons for and details about this warm-up aren't known at the moment, but will be added when more information is gathered. Please keep in mind Granite 4 is still in preview. 
 - The time measurement is calculated by using the timestamps logged during the training process in the above scripts
 - By default, OSFT makes use of Liger Kernels to improve memory usage and runtime. However, as of Nov 7th 2025, Liger Kernels currently don't have built-in support for Granite 4
-    - As a result, the script was modified for allow Liger Kernels to be disabled for certain experiments
-    - The tables will be updated once support for Liger Kernels is added. 
+  - As a result, the script was modified for allow Liger Kernels to be disabled for certain experiments
+  - The tables will be updated once support for Liger Kernels is added. 
 - Many of these tests had the checkpointing hardcoded to be disabled in the script (set `checkpoint_at_epoch=False` and `accelerate_full_state_at_epoch=False`)
-    - This does not appear to impact runtime of the actual training loop
-    - This was mostly done to conserve disk space due to checkpoints being very large (tens of GB per epoch), which can cause DiskPressure on OpenShift
+  - This does not appear to impact runtime of the actual training loop
+  - This was mostly done to conserve disk space due to checkpoints being very large (tens of GB per epoch), which can cause DiskPressure on OpenShift

Also applies to: 14-14, 16-18, 21-22, 24-25

🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

9-9: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


10-10: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

🤖 Prompt for AI Agents
In examples/docs/runtime_estimates.md around lines 9-10 (also apply to 14,
16-18, 21-22, 24-25), several nested markdown list items are indented with 4
spaces; change those nested list item indentations to 2 spaces each so the lists
render correctly and consistently (update every listed line range to use 2-space
indentation for nested items).
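
If fixing the indentation by hand is tedious, a small sketch along these lines would handle the bulk of it; it assumes the nested bullets are exactly the 4- and 8-space cases flagged above, and the result should be reviewed before committing.

```python
import re
from pathlib import Path

doc = Path("examples/docs/runtime_estimates.md")
lines = doc.read_text().splitlines(keepends=True)

fixed = []
for line in lines:
    if re.match(r"^ {8}- ", line):
        # Deepest nested bullets: 8 spaces -> 4 spaces.
        fixed.append(line[4:])
    elif re.match(r"^ {4}- ", line):
        # First-level nested bullets: 4 spaces -> 2 spaces.
        fixed.append(line[2:])
    else:
        fixed.append(line)

doc.write_text("".join(fixed))
```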
