feat(public-docsite-v9): add llms docs #34838

dmytrokirpa · 2025-07-15T16:40:00Z

Previous Behavior

New Behavior

This PR introduces a new CLI tool that extracts documentation from Storybook builds and converts it to LLM-friendly formats following the llmstxt.org specification. The tool processes Storybook production builds to generate comprehensive documentation in plain text format optimized for Large Language Models.

Key Features

✅ Component Documentation: Extracts props, descriptions, and type information from React components
✅ Story Examples: Captures all story variations with complete source code
✅ MDX Support: Processes MDX documentation pages and converts HTML to clean Markdown
✅ Subcomponents: Handles complex components with subcomponents and their props
✅ LLMs.txt Format: Generates summary files following the llmstxt.org specification
✅ Static File Serving: Uses Playwright routing instead of Express for better reliability
✅ Flexible Configuration: Supports CLI arguments and config files

Technical Implementation

Static File Routing: Uses Playwright's page.route() to serve Storybook files without needing a web server
Story Extraction: Accesses Storybook's internal story store (__STORYBOOK_PREVIEW__) for metadata
Content Processing: Converts HTML documentation to clean Markdown using Turndown with GitHub Flavored Markdown support
Storybook Compatibility: Supports both Storybook 7 (storyStore) and Storybook 8+ (storyStoreValue)
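
The static file routing idea can be sketched roughly as follows. This is an illustration only, not the tool's actual code: `resolveStaticFile`, `createStaticHandler`, the `RouteLike` interface, and the `http://localhost` origin are assumptions made for the example. `RouteLike` mirrors the `request().url()` and `fulfill()` methods of Playwright's `Route`, so a handler of this shape could be passed to `page.route()`.

```typescript
// Structural subset of Playwright's `Route` (assumed shape for this sketch).
interface RouteLike {
  request(): { url(): string };
  fulfill(response: { status?: number; path?: string; body?: string }): Promise<void>;
}

// Hypothetical origin: nothing listens here, requests are intercepted instead.
const FAKE_ORIGIN = 'http://localhost';

/** Maps a request URL to a file inside `distPath`, or null if it is out of scope. */
function resolveStaticFile(distPath: string, requestUrl: string): string | null {
  const url = new URL(requestUrl);
  if (url.origin !== FAKE_ORIGIN) return null;
  const segments = decodeURIComponent(url.pathname)
    .split('/')
    .filter(segment => segment !== '' && segment !== '.');
  // Reject anything that tries to climb out of the dist folder.
  if (segments.indexOf('..') !== -1) return null;
  const relative = segments.length === 0 ? 'index.html' : segments.join('/');
  return `${distPath.replace(/\/+$/, '')}/${relative}`;
}

/** Builds a route handler that answers intercepted requests from the dist folder. */
function createStaticHandler(distPath: string) {
  return async (route: RouteLike): Promise<void> => {
    const file = resolveStaticFile(distPath, route.request().url());
    if (file === null) {
      await route.fulfill({ status: 404, body: 'Not found' });
      return;
    }
    // Playwright infers the content type from the file extension when `path` is used.
    await route.fulfill({ path: file });
  };
}

// Possible wiring with Playwright (not executed here):
//   await page.route('**/*', createStaticHandler('storybook-static'));
//   await page.goto('http://localhost/iframe.html');
```

Keeping the URL-to-path mapping in a pure function makes it easy to unit test without launching a browser.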

Output Structure

storybook-static/
├── llms.txt                        # Main summary file (llmstxt.org format)
└── llms/
    ├── components-button.txt       # Individual component docs
    ├── components-accordion.txt
    └── concepts-introduction.txt   # MDX page docs
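
One way to picture how entries map to files under llms/ is a slug derived from a Storybook title. The helper below is purely illustrative: `toDocFileName` and its slug rules are assumptions, not the extractor's actual naming logic.

```typescript
/** Hypothetical helper: turns a Storybook title into a flat .txt file name. */
function toDocFileName(title: string): string {
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse slashes, spaces, and punctuation into single dashes
    .replace(/^-+|-+$/g, ''); // trim leading and trailing dashes
  return `${slug}.txt`;
}

console.log(toDocFileName('Components/Button')); // components-button.txt
console.log(toDocFileName('Concepts/Introduction')); // concepts-introduction.txt
```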

Usage Examples

Basic Usage:

npx storybook-llms-extractor --distPath "storybook-static" --baseUrl "https://storybook.example.com"

# or with refs

npx storybook-llms-extractor \
  --distPath "storybook-static" \
  --baseUrl "https://main.storybook.dev" \
  --refs '{"title":"Charts","url":"https://charts.storybook.dev"}'

With Configuration File:

// storybook-llms.config.js
// @ts-check

/** @type {import('@fluentui/storybook-llms-extractor').Args} */
module.exports = {
  distPath: 'storybook-static',
  baseUrl: 'https://react.fluentui.dev',
  summaryTitle: 'Fluent UI React v9',
  summaryDescription: 'Fluent UI React components documentation',
  refs: [
    { title: 'Charts v9', url: 'https://charts.fluentui.dev' }
  ]
};

Files Added

tools/storybook-llms-extractor/src/cli.ts - CLI entry point and argument processing
tools/storybook-llms-extractor/src/utils.ts - Core extraction and conversion logic
tools/storybook-llms-extractor/src/types.ts - TypeScript type definitions
tools/storybook-llms-extractor/src/index.ts - Package exports
tools/storybook-llms-extractor/src/utils.spec.ts - Unit tests
tools/storybook-llms-extractor/src/__fixtures__/ - Test fixtures
tools/storybook-llms-extractor/README.md - Comprehensive documentation

github-actions · 2025-07-15T16:49:47Z

📊 Bundle size report

✅ No changes found

github-actions · 2025-07-15T19:08:57Z

Pull request demo site: URL

.github/workflows/pr-website-deploy.yml

tudorpopams · 2025-07-16T12:27:04Z

This is awesome! Can we apply the same pattern to composed stories as well? It would be really cool to get this done for charts and contrib as well.

Hotell · 2025-07-17T12:31:10Z

scripts/storybook/src/scripts/generate-llms-docs.ts

+    baseUrl: argv.baseUrl,
+    summaryTitle: argv.summaryTitle,
+    summaryDescription: argv.summaryDescription,
+    refs: parseRefs(argv.refs),


this whole logic could be simplified as we can guarantee what yargs parse will provide

It works well with primitive types, but I had some issues with objects, which is why this is needed: argv.refs has the type string|number[].
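
For context, a parser along these lines would cope with the shapes a CLI parser can hand back for an object-valued option. It is only a sketch: the actual `parseRefs` in the PR is not shown in this thread, so the `Ref` type and the validation rules here are assumptions.

```typescript
interface Ref {
  title: string;
  url: string;
}

function isRef(value: unknown): value is Ref {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as Ref).title === 'string' &&
    typeof (value as Ref).url === 'string'
  );
}

/** Hypothetical parser: accepts undefined, a JSON string, an object, or an array of those. */
function parseRefs(raw: unknown): Ref[] {
  if (raw === undefined || raw === null) return [];
  const items: unknown[] = Array.isArray(raw) ? raw : [raw];
  const refs: Ref[] = [];
  for (const item of items) {
    let parsed: unknown = item;
    if (typeof item === 'string') {
      try {
        parsed = JSON.parse(item);
      } catch {
        throw new Error(`Invalid --refs value (not JSON): ${item}`);
      }
    }
    // A single JSON string may itself contain an array of refs.
    const candidates: unknown[] = Array.isArray(parsed) ? parsed : [parsed];
    for (const candidate of candidates) {
      if (!isRef(candidate)) {
        throw new Error('Each ref needs a string "title" and a string "url"');
      }
      refs.push({ title: candidate.title, url: candidate.url });
    }
  }
  return refs;
}
```

Validating at the boundary means the rest of the code can rely on `Ref[]` regardless of how the option was supplied.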

Hotell · 2025-07-17T12:42:20Z

scripts/storybook/src/scripts/generate-llms-docs.ts

+
+  const stories: StorybookStoreItem[] = await page.evaluate(async () => {
+    // @ts-expect-error - Storybook Client API is not typed
+    await window.__STORYBOOK_CLIENT_API__.storyStore.cacheAllCSFFiles();


This won't work starting with SB v8.

We will need to add similar feature-flagged behaviour, like we did for storywright, where we used the official public API:

https://github.com/microsoft/storywright/pull/74/files#diff-e056b3ef14d67b65de77e3e846aa3bf75c699e5f4b60c6754c83766a306152afR38

That API returns different metadata than the private API currently used, so we need to double-check whether this is feasible.

__STORYBOOK_PREVIEW__.extract() doesn't provide all the data we need, but we could use __STORYBOOK_PREVIEW__.storyStore, which is available for both SB 7 and 8+.
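
A version-tolerant lookup could look roughly like the sketch below. The property names `storyStore` and `storyStoreValue` come from this PR's description; the `PreviewLike` type and the `getStoryStore` helper are hypothetical, and the real preview object exposes far more than is modelled here.

```typescript
// Minimal, assumed shape of window.__STORYBOOK_PREVIEW__ for this sketch.
interface PreviewLike<Store> {
  storyStore?: Store; // per the PR description: Storybook 7
  storyStoreValue?: Store; // per the PR description: Storybook 8+
}

/** Hypothetical helper: returns whichever story store the running Storybook exposes. */
function getStoryStore<Store>(preview: PreviewLike<Store> | undefined): Store {
  // Prefer the newer property, then fall back to the older one.
  const store = preview?.storyStoreValue ?? preview?.storyStore;
  if (store === undefined) {
    throw new Error('Storybook story store not found on the preview object');
  }
  return store;
}
```

Encapsulating the lookup in one place keeps the version check out of the extraction logic.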

dmytrokirpa · 2025-07-17T13:05:18Z

This is awesome! Can we apply the same pattern to composed stories as well? It would be really cool to get this done for charts and contrib as well.

https://fluentuipr.z22.web.core.windows.net/pull/34838/public-docsite-v9/storybook/llms.txt - v9 llms.txt
https://fluentuipr.z22.web.core.windows.net/pull/34838/chart-docsite/storybook/llms.txt - charts llms.txt

Contrib isn't ready yet since it's not in the monorepo, and I'm still figuring out the optimal way to distribute the script if we'll decide to go with it.

Hotell · 2025-07-17T13:09:50Z

distribution:

I don't see how this could possibly work as an SB addon or bundler plugin, because of how it works under the hood.

It's very similar to what storywright does for obtaining screenshots, which is actually desired behaviour as it makes the tool atomic and reusable.

While the implementation is tightly coupled to our full source addon, the addon shouldn't be a prerequisite. The tool should degrade gracefully: if full source exists we process that code, otherwise the standard SB code.

naming of the CLI package, something like: StorybookLLMextractor feels appropriate

storybook composition:

This approach won't scale to linked SBs outside the repo, so it should be the responsibility of each linked (composed) SB to generate the markdown assets as part of its production build.

dmytrokirpa · 2025-07-17T13:20:02Z

Thanks for the feedback @Hotell!

distribution:

I don't see how this could possibly work as an SB addon or bundler plugin, because of how it works under the hood.

It's very similar to what storywright does for obtaining screenshots, which is actually desired behaviour as it makes the tool atomic and reusable.

While the implementation is tightly coupled to our full source addon, the addon shouldn't be a prerequisite. The tool should degrade gracefully: if full source exists we process that code, otherwise the standard SB code.

That makes sense.

naming of the CLI package, something like: StorybookLLMextractor feels appropriate

Agree, do you think it should live in the core monorepo or as a standalone repo?

storybook composition:

This approach won't scale to linked SBs outside the repo, so it should be the responsibility of each linked (composed) SB to generate the markdown assets as part of its production build.

That's exactly how it works at the moment: we use the refs CLI arg only to include links to external (composed) Storybooks in llms.txt, and their assets are generated as part of their own production builds.
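
To illustrate the split, a summary generator might list local docs under one heading and link out to each composed Storybook's own llms.txt under another. The sketch below follows the general llmstxt.org layout (H1 title, blockquote summary, H2 sections of links); `buildSummary`, its option names, and the section headings are assumptions rather than the tool's real output.

```typescript
interface Ref {
  title: string;
  url: string;
}

interface DocEntry {
  title: string;
  fileName: string; // e.g. "components-button.txt"
}

interface SummaryOptions {
  title: string;
  description: string;
  baseUrl: string;
  docs: DocEntry[];
  refs: Ref[];
}

/** Joins a base URL and a relative path with exactly one slash between them. */
function joinUrl(base: string, relative: string): string {
  return `${base.replace(/\/+$/, '')}/${relative.replace(/^\/+/, '')}`;
}

/** Hypothetical generator for the llms.txt summary file. */
function buildSummary(options: SummaryOptions): string {
  const lines = [`# ${options.title}`, '', `> ${options.description}`, '', '## Docs', ''];
  for (const doc of options.docs) {
    lines.push(`- [${doc.title}](${joinUrl(options.baseUrl, `llms/${doc.fileName}`)})`);
  }
  if (options.refs.length > 0) {
    // Composed Storybooks are only linked; their docs come from their own builds.
    lines.push('', '## Composed Storybooks', '');
    for (const ref of options.refs) {
      lines.push(`- [${ref.title}](${joinUrl(ref.url, 'llms.txt')})`);
    }
  }
  return lines.join('\n') + '\n';
}
```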

Hotell · 2025-07-29T12:30:16Z

Agree, do you think it should live in the core monorepo or as a standalone repo?

Let's keep it in the core repo for now, for logistical and distribution simplicity. In the future it might make sense to create a new fluent-storybook-addons repo or something similar.

Hotell

Looking great!

Added some comments/actionables (mainly the SB API simplification/encapsulation).

Food for thought:

With this approach the output is a black box, so what ends up deployed might come as a surprise. Maybe we should consider storing the generated .txt files in git and forcing a re-generation if the content changes (similar to what we have for JSXIntrinsicElement in react-utilities).
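
If the generated files were committed, a CI step could compare a fresh run against the checked-in copies and fail on drift. A rough sketch, with `findStaleDocs` and the in-memory records being hypothetical stand-ins for reading files from disk:

```typescript
// File name -> file content.
type DocFiles = Record<string, string>;

/** Hypothetical check: lists files that were added, removed, or changed by regeneration. */
function findStaleDocs(committed: DocFiles, generated: DocFiles): string[] {
  const names = Object.keys(committed).concat(Object.keys(generated));
  return names
    .filter((name, index) => names.indexOf(name) === index) // de-duplicate
    .filter(name => committed[name] !== generated[name]) // missing on either side counts as stale
    .sort();
}

const stale = findStaleDocs(
  { 'components-button.txt': 'old', 'components-accordion.txt': 'same' },
  { 'components-button.txt': 'new', 'components-accordion.txt': 'same' },
);
console.log(stale); // [ 'components-button.txt' ]
```

A CI job could exit non-zero when the returned list is non-empty, prompting the author to re-run the generator.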

Hotell · 2025-07-29T13:50:24Z

tools/storybook-llms-extractor/src/cli.ts

+      demandOption: true,
+      describe: 'Relative path to the Storybook distribution folder',
+    })
+    .option('baseUrl', {


This is a bit confusing: I thought this was used to actually fetch metadata from that origin, but its only purpose is the text emitted in the .txt file.

Can we improve the docs or rename the property to something more meaningful?

Hotell · 2025-07-29T13:52:22Z

tools/storybook-llms-extractor/src/cli.ts

+    .option('baseUrl', {
+      type: 'string',
+      default: '/',
+      describe: 'Base URL for the Storybook docs',
+    })


Suggested change

-    .option('baseUrl', {
-      type: 'string',
-      default: '/',
-      describe: 'Base URL for the Storybook docs',
-    })
+    .option('summaryBaseUrl', {
+      type: 'string',
+      default: '/',
+      describe: 'Storybook deployed URL for the summary docs',
+    })

Hotell · 2025-07-29T14:24:02Z

tools/storybook-llms-extractor/README.md

+## Requirements
+
+- Node.js 16+
+- Storybook 7+ (supports both Storybook 7 and 8)


The new approach should also cover v9, correct?

I’ve tested this with v7 and v8 for Fluent core and extensions, and will check v9 next.

dmytrokirpa · 2025-07-29T15:38:15Z

with this approach it's a black box that might come as a surprise what the deployed output will be. maybe we should consider actually storing the .txt generation in git and force to re-generate if content changes ( similarly like we have for JSXIntrinsicElement in react-utilities )

That's a valid point about controlling the output, but it would mean core devs need to build a full docsite locally with every component story update PR, right?

dmytrokirpa self-assigned this Jul 15, 2025

dmytrokirpa added Area: Documentation Storybook labels Jul 15, 2025

github-actions bot added the CI label Jul 15, 2025

github-actions bot reviewed Jul 15, 2025

Hotell reviewed Jul 17, 2025

github-actions bot added the NX: core label Jul 28, 2025

dmytrokirpa added 13 commits July 28, 2025 12:08

feat(public-docsite-v9): add llms docs

864c513

fix CI issues

c00d56e

fix dupes

5a9a78a

update yarn.lock

9ab762f

refactor

6acc934

refactor

1299123

fixup

31a5f17

add support for Storybook composition

815a651

use correct URLs and fix type issues

a14aa15

fix dupes

06ca255

refactor: replace Express server with Playwright static routing for Storybook documentation extraction

2013515

refactor: enhance type definitions and improve Storybook data handling

128a156

refactor: move script to a separate package

e9051fa

dmytrokirpa force-pushed the feat/llm-docs branch from 1b87b82 to e9051fa July 28, 2025 10:08

dmytrokirpa added 3 commits July 28, 2025 12:09

revert scripts-storybook change

4a2074a

remove html fixtures

77919db

docs: update README for improved clarity and usage instructions

595d466

dmytrokirpa added 8 commits July 28, 2025 14:59

fix tsconfig.base.all

99d4b5a

add storybook-llms-extractor package as dep for docsites

363b591

dedupe deps

6fc77aa

update lockfile

502a876

Merge branch 'master' into feat/llm-docs

304bedb

prebuild storybook-llms-extractor package

5809223

Merge branch 'master' into feat/llm-docs

0812116

Merge branch 'master' into feat/llm-docs

0d29695

dmytrokirpa requested a review from Hotell July 28, 2025 14:53

dmytrokirpa marked this pull request as ready for review July 28, 2025 15:58

dmytrokirpa requested review from a team as code owners July 28, 2025 15:58

AtishayMsft approved these changes Jul 29, 2025

Hotell reviewed Jul 29, 2025

dmytrokirpa removed the NX: core label Jul 29, 2025

address review comments

cc454e0

github-actions bot added the NX: core label Jul 29, 2025

improve types

b68395f

dmytrokirpa requested a review from Hotell July 29, 2025 15:38

dmytrokirpa added 3 commits July 29, 2025 17:42

rename baseUrl in readme

ff0b9d4

fixup

4b1210f

Merge branch 'master' into feat/llm-docs

af5b0a6
