Plot Gallery Documentation
Problem Statement
The current PyRetailScience documentation includes API reference pages for individual plots, but lacks a visual gallery that helps users quickly discover and understand plotting capabilities. Users need to see what each plot type looks like and understand the major configuration options available, similar to how the Matplotlib plot types gallery presents visualizations.
Proposed Solution
Create a comprehensive plot gallery section in the documentation with:
- Root "Plots" page - Overview gallery showing all available plot types with thumbnail examples
- Individual plot pages - Detailed pages for each plot type showing major configuration examples
The gallery should focus on demonstrating built-in features (e.g., group_col, Series vs DataFrame input, orientation options) rather than general matplotlib customizations (e.g., tick label sizes, kwargs passthrough).
Structure
Navigation Hierarchy
docs/
└── gallery/
    ├── index.md                    # Root gallery page
    ├── plots/
    │   ├── area.ipynb
    │   ├── bar.ipynb
    │   ├── broken_timeline.ipynb
    │   ├── cohort.ipynb
    │   ├── heatmap.ipynb
    │   ├── histogram.ipynb
    │   ├── index_plot.ipynb        # Named to avoid conflict with index.md
    │   ├── line.ipynb
    │   ├── period_on_period.ipynb
    │   ├── price.ipynb
    │   ├── scatter.ipynb
    │   ├── time.ipynb
    │   ├── venn.ipynb
    │   └── waterfall.ipynb
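A small check like the following could confirm that the layout above is complete as the per-plot PRs land. It is only a sketch: the helper names `expected_gallery_files` and `missing_gallery_files` are hypothetical and not part of PyRetailScience, and the file list is copied from the tree above.

```python
from pathlib import Path

# Notebook stems copied from the tree above (index_plot avoids clashing with index.md).
PLOT_NAMES = [
    "area", "bar", "broken_timeline", "cohort", "heatmap", "histogram",
    "index_plot", "line", "period_on_period", "price", "scatter", "time",
    "venn", "waterfall",
]


def expected_gallery_files(docs_root: Path) -> list[Path]:
    """Return every file the gallery layout is expected to contain."""
    gallery = docs_root / "gallery"
    notebooks = [gallery / "plots" / f"{name}.ipynb" for name in PLOT_NAMES]
    return [gallery / "index.md", *notebooks]


def missing_gallery_files(docs_root: Path) -> list[Path]:
    """Return the expected gallery files that do not exist yet."""
    return [path for path in expected_gallery_files(docs_root) if not path.is_file()]
```

Calling `missing_gallery_files(Path("docs"))` from the repository root would list whatever is still outstanding.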
mkdocs.yml Changes
Add new navigation section:
nav:
  - Home: index.md
  - Getting Started:
      - Installation: getting_started/installation.md
      - Options & Configuration: getting_started/options_guide.md
  - Analysis Modules:
      - analysis_modules.md
  - Plot Gallery: # NEW SECTION
      - Overview: gallery/index.md
      - Area Plot: gallery/plots/area.ipynb
      - Bar Plot: gallery/plots/bar.ipynb
      - Broken Timeline: gallery/plots/broken_timeline.ipynb
      - Cohort Plot: gallery/plots/cohort.ipynb
      - Heatmap Plot: gallery/plots/heatmap.ipynb
      - Histogram Plot: gallery/plots/histogram.ipynb
      - Index Plot: gallery/plots/index_plot.ipynb
      - Line Plot: gallery/plots/line.ipynb
      - Period on Period: gallery/plots/period_on_period.ipynb
      - Price Plot: gallery/plots/price.ipynb
      - Scatter Plot: gallery/plots/scatter.ipynb
      - Time Plot: gallery/plots/time.ipynb
      - Venn Diagram: gallery/plots/venn.ipynb
      - Waterfall Plot: gallery/plots/waterfall.ipynb
  - Examples:
      - Customer Retention: examples/retention.ipynb
      # ... rest of examples
  - Reference:
      # ... existing reference sections
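Because the per-plot PRs each touch this nav block, it may be easier to generate the "Plot Gallery" entries from one list than to hand-edit them. The sketch below is an assumption about workflow, not an existing project script; `gallery_nav_lines` is a hypothetical helper, and the titles and paths are copied from the nav section above.

```python
# Display titles and notebook stems, copied from the nav section above.
PLOTS = [
    ("Area Plot", "area"),
    ("Bar Plot", "bar"),
    ("Broken Timeline", "broken_timeline"),
    ("Cohort Plot", "cohort"),
    ("Heatmap Plot", "heatmap"),
    ("Histogram Plot", "histogram"),
    ("Index Plot", "index_plot"),
    ("Line Plot", "line"),
    ("Period on Period", "period_on_period"),
    ("Price Plot", "price"),
    ("Scatter Plot", "scatter"),
    ("Time Plot", "time"),
    ("Venn Diagram", "venn"),
    ("Waterfall Plot", "waterfall"),
]


def gallery_nav_lines() -> list[str]:
    """Build the YAML lines for the Plot Gallery nav section."""
    lines = ["  - Plot Gallery:", "      - Overview: gallery/index.md"]
    lines += [f"      - {title}: gallery/plots/{stem}.ipynb" for title, stem in PLOTS]
    return lines


print("\n".join(gallery_nav_lines()))
```

The printed block can be pasted under `nav:` in mkdocs.yml between the "Analysis Modules" and "Examples" sections.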
Content Format
Root Gallery Page (gallery/index.md)
A markdown page with:
- Brief introduction to PyRetailScience plotting capabilities
- Grid layout showing all plot types with thumbnail images and brief descriptions
- Links to detailed plot pages
Format:
# Plot Gallery
PyRetailScience provides a comprehensive set of plotting functions designed specifically for retail analytics. All plots use a consistent API and come pre-styled with retail-friendly color schemes.
## Plot Types
### Basic Plots
#### [Line Plot](plots/line.ipynb)

Visualize sequential data like daily trends or event impact analysis.
#### [Bar Plot](plots/bar.ipynb)

Compare categorical data with vertical or horizontal bars.
[Continue for all plot types...]
Individual Plot Pages (gallery/plots/*.ipynb)
Each plot page should be a Jupyter notebook containing:
- Title and description - What the plot is used for
- Basic example - Simplest usage
- Major configuration examples - Each in its own section with markdown headers
Standard format for each example:
## [Feature Name]
Brief description of what this configuration does.
# Python code to generate the plot
import pandas as pd
from pyretailscience.plots import line
# Create example data
df = pd.DataFrame({...})
# Generate plot
ax = line.plot(df, value_col="sales", ...)
[Output cell showing the plot image]
Plot-Specific Requirements
Below are the ACTUAL features from the codebase for each plot. Demonstrations should focus on these real capabilities.
Line Plot (gallery/plots/line.ipynb)
Demonstrate:
- Basic line plot with DataFrame (with x_col and value_col)
- Plotting a pandas Series (no value_col needed)
- Using group_col for multiple lines (creates a separate line per group)
- Multiple value columns (value_col as list) - note: cannot combine with group_col
- Index-based plotting (omit x_col, uses DataFrame index)
- Using fill_na_value parameter when pivoting with group_col
Bar Plot (gallery/plots/bar.ipynb)
Demonstrate:
- Basic vertical bar plot
- Horizontal bar plot (orientation="horizontal" or "h")
- Grouped bars using x_col parameter
- Multiple value columns (value_col as list)
- Sorting (show one example: sort_order="descending" or "ascending")
- Data labels: data_label_format="absolute", "percentage_by_bar_group", or "percentage_by_series"
- Hatching patterns (use_hatch=True)
- Stacked bars (via stacked=True kwarg)
Heatmap Plot (gallery/plots/heatmap.ipynb)
Demonstrate:
- Basic heatmap from DataFrame (index=rows, columns=columns)
- Custom colorbar label (cbar_label)
- Custom colorbar format (cbar_format string)
- Cell text annotations (automatically added with auto-contrast black/white text)
Waterfall Plot (gallery/plots/waterfall.ipynb)
Demonstrate:
- Basic waterfall from amounts list and labels list
- Data label formats: data_label_format="absolute", "percentage", or "both"
- Net bar display (display_net_bar=True)
- Net line display (display_net_line=True)
- Removing zero amounts (remove_zero_amounts=True)
Scatter Plot (gallery/plots/scatter.ipynb)
Demonstrate:
- Basic scatter plot (single value_col)
- Multiple scatter series using group_col
- Multiple value columns (value_col as list) - note: cannot combine with group_col
- Point labels using label_col parameter (only works with single value_col)
- Customizing label appearance with label_kwargs
Time Plot (gallery/plots/time.ipynb)
Demonstrate:
- Basic time series (requires transaction_date column and aggregates by period)
- Different aggregation periods: period="D" (daily), "W" (weekly), "M" (monthly)
- Different aggregation functions: agg_func="sum" or "mean"
- Grouping by category with group_col parameter
Area Plot (gallery/plots/area.ipynb)
Demonstrate:
- Basic area plot (single value_col)
- Multiple areas using group_col parameter
- Multiple value columns (value_col as list) - note: cannot combine with group_col
- Stacked areas (via stacked=True kwarg)
- Using x_col vs index
Histogram Plot (gallery/plots/histogram.ipynb)
Demonstrate:
- Basic histogram (single value_col)
- Multiple histograms using group_col
- Multiple value columns (value_col as list) - note: cannot combine with group_col
- Range clipping: range_lower, range_upper, range_method="clip" or "fillna"
- Hatching patterns (use_hatch=True)
- Custom bins (via bins kwarg)
Cohort Plot (gallery/plots/cohort.ipynb)
Demonstrate:
- Basic cohort heatmap from DataFrame
- Percentage display (percentage=True, the default)
- Raw value display (percentage=False)
- The distinctive horizontal line at row 3 (automatically added)
Period on Period Plot (gallery/plots/period_on_period.ipynb)
Demonstrate:
- Basic period-on-period comparison with a list of (start_date, end_date) tuples in the periods parameter
- Overlaying 2-3 different time periods on same chart
- Different line styles automatically applied to each period
Venn Diagram (gallery/plots/venn.ipynb)
Demonstrate:
- 2-set Venn diagram (requires DataFrame with 'groups' and 'percent' columns)
- 3-set Venn diagram
- Euler diagram mode (vary_size=True - sizes proportional to values)
- Custom subset label formatting with subset_label_formatter
Broken Timeline Plot (gallery/plots/broken_timeline.ipynb)
Demonstrate:
- Basic broken timeline showing data availability across categories over time
- Different aggregation periods: period="D" or "W"
- Threshold filtering: threshold_value to hide low-value periods
- Different aggregation functions: agg_func="sum" or other
Index Plot (gallery/plots/index_plot.ipynb)
Demonstrate:
- Basic index plot showing performance relative to baseline (100)
- Sorting options: sort_by="group" or "value"
- Group filtering: exclude_groups or include_only_groups
- Multiple series with series_col parameter
- Highlighting range with highlight_range parameter
- Value filtering: filter_above or filter_below
Price Plot (gallery/plots/price.ipynb)
Demonstrate:
- Basic bubble chart showing price distribution across categories
- Price binning with bins parameter (int for equal-width, list for custom boundaries)
- Grouping by categorical column (group_col)
- Bubble sizes represent percentage of products in each price band
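To make the binning bullets concrete for whoever writes the notebook prose, here is a plain-Python sketch of what "equal-width bins" and "percentage of products in each price band" mean. It is not the library's implementation: `equal_width_edges` and `band_percentages` are hypothetical names, and the edge handling (left-closed bands, last band closed on both sides) is an assumption that may differ from what the price plot actually does.

```python
def equal_width_edges(prices: list[float], n_bins: int) -> list[float]:
    """Split the observed price range into n_bins bands of equal width."""
    low, high = min(prices), max(prices)
    width = (high - low) / n_bins
    edges = [low + i * width for i in range(n_bins)]
    edges.append(high)  # pin the top edge to avoid floating-point drift
    return edges


def band_percentages(prices: list[float], edges: list[float]) -> list[float]:
    """Percentage of prices in each band; prices outside the edges are ignored."""
    counts = [0] * (len(edges) - 1)
    for price in prices:
        for i in range(len(counts)):
            is_last_band = i == len(counts) - 1
            in_band = edges[i] <= price < edges[i + 1]
            if in_band or (is_last_band and price == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    return [100 * count / total if total else 0.0 for count in counts]


# Illustrative retail prices (made-up values).
prices = [10, 20, 35, 50, 65, 80, 95, 100]
edges = equal_width_edges(prices, n_bins=3)       # int: equal-width bands
print(band_percentages(prices, edges))            # [37.5, 25.0, 37.5]
print(band_percentages(prices, [0, 25, 75, 100])) # list: custom boundaries
```

In the gallery notebook itself the same idea would be shown through the plot's bins parameter rather than hand-rolled code.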
Data Guidelines
All example data should:
- Use realistic retail domain values (customer_id, store_id, product names, dates, dollar amounts)
- Be small enough to be clearly readable (typically 5-15 rows)
- Be self-contained within each notebook (no external data files)
- Use descriptive variable names
Import Style (IMPORTANT)
All notebooks MUST use this import pattern:
from pyretailscience.plots import line
# Then call: line.plot(...)
DO NOT use:
import pyretailscience.plots.line as line_plot # ❌ WRONG
# or
import pyretailscience.plots.line # ❌ WRONG
This keeps imports consistent across all documentation and examples.
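A reviewer could enforce this rule mechanically. The sketch below scans the code cells of a notebook for the two disallowed forms. It assumes the notebooks are stored in the standard nbformat 4 JSON layout; `forbidden_imports` is a hypothetical helper, not an existing project check.

```python
import json
import re

# Matches "import pyretailscience.plots.<name>", with or without an "as" alias.
FORBIDDEN_IMPORT = re.compile(r"\s*import\s+pyretailscience\.plots\.\w+")


def forbidden_imports(notebook_json: str) -> list[str]:
    """Return the offending source lines found in the notebook's code cells."""
    notebook = json.loads(notebook_json)
    offending = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown cells may legitimately quote the wrong pattern
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines
        text = source if isinstance(source, str) else "".join(source)
        for src_line in text.splitlines():
            if FORBIDDEN_IMPORT.match(src_line):
                offending.append(src_line.strip())
    return offending
```

Running it over `Path("docs/gallery/plots").glob("*.ipynb")` and failing when any list is non-empty would be one way to wire it into a pre-commit step.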
Example data, following the guidelines above:
# Good - retail domain
df = pd.DataFrame({
    "product": ["Laptop", "Mouse", "Keyboard", "Monitor", "Headphones"],
    "sales": [125000, 15000, 22000, 85000, 18000],
    "category": ["Electronics", "Accessories", "Accessories", "Electronics", "Accessories"],
})
# Avoid - generic placeholders
df = pd.DataFrame({
    "x": ["A", "B", "C", "D", "E"],
    "y": [1, 2, 3, 4, 5],
    "group": ["test", "data", "test", "data", "test"],
})
Technical Implementation
Jupyter Notebooks
- Create actual Jupyter notebook (.ipynb) files, not Python scripts
- Use mkdocs-jupyter plugin (already configured in mkdocs.yml)
- Each notebook should have markdown cells for section headers
- Execute all cells before committing to ensure images are embedded
- No need for %matplotlib inline or other magic commands - plots will render automatically in notebooks
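The "execute all cells before committing" rule can be checked without opening Jupyter. The sketch below assumes the standard nbformat 4 JSON layout, where each code cell carries `execution_count` and `outputs` and image outputs hold an `image/png` entry in their `data` bundle. The function names are hypothetical.

```python
import json
from pathlib import Path


def load_notebook(path: Path) -> dict:
    """Read a .ipynb file as a plain dictionary."""
    return json.loads(path.read_text(encoding="utf-8"))


def _source_text(cell: dict) -> str:
    source = cell.get("source", "")
    return source if isinstance(source, str) else "".join(source)


def unexecuted_cells(notebook: dict) -> list[int]:
    """Indices of non-empty code cells that were never run."""
    return [
        index
        for index, cell in enumerate(notebook.get("cells", []))
        if cell.get("cell_type") == "code"
        and _source_text(cell).strip()
        and cell.get("execution_count") is None
    ]


def count_embedded_images(notebook: dict) -> int:
    """Number of code-cell outputs that embed a PNG image."""
    count = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        for output in cell.get("outputs", []):
            if "image/png" in output.get("data", {}):
                count += 1
    return count
```

A notebook is ready to commit when `unexecuted_cells` returns an empty list and `count_embedded_images` is at least the number of plot examples it contains.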
Image Assets
For the root gallery page thumbnails:
- Store in docs/assets/gallery/
- Generate programmatically or screenshot from notebook outputs
- Optimize images for web (PNG format, reasonable file sizes)
- Consistent thumbnail dimensions (e.g., 400x300px)
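Thumbnail consistency can be verified with the standard library alone, since a PNG file stores its width and height in the IHDR chunk right after the 8-byte signature. The sketch below uses the 400x300 example size from the list above as its default; `png_size` and `mismatched_thumbnails` are hypothetical helper names.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> tuple[int, int]:
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])  # big-endian unsigned ints
    return width, height


def mismatched_thumbnails(
    directory: Path, expected: tuple[int, int] = (400, 300)
) -> list[tuple[str, tuple[int, int]]]:
    """List (file name, actual size) for every PNG that is not the expected size."""
    mismatches = []
    for path in sorted(directory.glob("*.png")):
        with path.open("rb") as handle:
            size = png_size(handle.read(24))  # the header is all that is needed
        if size != expected:
            mismatches.append((path.name, size))
    return mismatches
```

Pointing `mismatched_thumbnails` at `Path("docs/assets/gallery")` gives a quick report of any thumbnails that need regenerating.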
Style Consistency
All plots should:
- Use default PyRetailScience styling (don't override unless demonstrating that feature)
- Include appropriate titles and axis labels
- Be large enough to read clearly in documentation
- Use consistent figure sizes across examples
Implementation Strategy
IMPORTANT: This work should be split into separate PRs - one PR per plot type. This approach:
- Makes reviews manageable and focused
- Allows incremental progress and merging
- Reduces risk of conflicts
- Enables parallel work if multiple contributors are involved
Suggested PR sequence:
- PR 1: Root gallery page structure (docs/gallery/index.md) with placeholder thumbnails and mkdocs.yml updates
- PR 2: Line plot gallery (gallery/plots/line.ipynb)
- PR 3: Bar plot gallery (gallery/plots/bar.ipynb)
- PR 4: Scatter plot gallery (gallery/plots/scatter.ipynb)
- PR 5: Heatmap plot gallery (gallery/plots/heatmap.ipynb)
- PR 6: Time plot gallery (gallery/plots/time.ipynb)
- PR 7: Area plot gallery (gallery/plots/area.ipynb)
- PR 8: Histogram plot gallery (gallery/plots/histogram.ipynb)
- PR 9: Waterfall plot gallery (gallery/plots/waterfall.ipynb)
- PR 10: Cohort plot gallery (gallery/plots/cohort.ipynb)
- PR 11: Venn diagram gallery (gallery/plots/venn.ipynb)
- PR 12: Period on Period plot gallery (gallery/plots/period_on_period.ipynb)
- PR 13: Broken Timeline plot gallery (gallery/plots/broken_timeline.ipynb)
- PR 14: Index plot gallery (gallery/plots/index_plot.ipynb)
- PR 15: Price plot gallery (gallery/plots/price.ipynb)
Each plot PR should:
- Include the complete notebook with all examples
- Update root gallery page with thumbnail and description for that plot
- Execute all notebook cells to embed images
- Ensure docs build successfully
Acceptance Criteria
Out of Scope
- Exhaustive coverage of every parameter and kwarg option
- Demonstrations of general matplotlib customizations (tick sizes, font changes, etc.)
- Interactive plots or widgets
- Performance benchmarking
- Comparison with other plotting libraries
- Style customization guides (covered separately in api/plots/styles/)
Notes
- The tree_diagram.py module should be excluded as it's used internally by the Revenue Tree analysis and not meant for direct user consumption
- The gallery complements (not replaces) the existing API reference documentation
- Examples should be copy-pasteable and runnable by users
- Consider adding a note at the top of each plot page linking to the full API reference for that plot type
Example Notebook Structure
Below is an example showing the structure of a Jupyter notebook (.ipynb file). Create actual .ipynb files in Jupyter, not Python scripts.
First markdown cell:
# Line Plot Gallery
The line plot is used for visualizing sequential data like daily trends or event impact analysis.
It's ideal for time-based sequences or ordered data points.
First code cell:
import pandas as pd
from pyretailscience.plots import line
import matplotlib.pyplot as plt
Markdown cell:
## Basic Line Plot
Plot a single value column from a DataFrame.
Code cell:
df = pd.DataFrame({
    "day": range(1, 8),
    "revenue": [12000, 15000, 13000, 18000, 22000, 19000, 21000],
})
ax = line.plot(
    df,
    x_col="day",
    value_col="revenue",
    title="Daily Revenue",
    x_label="Day",
    y_label="Revenue ($)",
)
plt.show()
Markdown cell:
## Plotting a Series
You can also plot a pandas Series directly.
Code cell:
sales = pd.Series(
    [12000, 15000, 13000, 18000, 22000, 19000, 21000],
    index=range(1, 8),
    name="Sales",
)
ax = line.plot(
    sales,
    title="Daily Sales",
    x_label="Day",
    y_label="Sales ($)",
)
plt.show()
Markdown cell:
## Multiple Lines with group_col
Create separate lines for each category using the group_col parameter.
Code cell:
df_multi = pd.DataFrame({
    "day": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
    "category": ["Electronics", "Apparel"] * 5,
    "revenue": [8000, 4000, 10000, 5000, 8500, 4500, 12000, 6000, 15000, 7000],
})
ax = line.plot(
    df_multi,
    x_col="day",
    value_col="revenue",
    group_col="category",
    title="Revenue by Category",
    x_label="Day",
    y_label="Revenue ($)",
    legend_title="Category",
)
plt.show()
Continue with more examples...
Related Issues
- Existing API reference documentation: docs/api/plots/
- Examples section: docs/examples/ (focuses on analysis workflows, not individual plots)
Labels
type:docs
status:draft (until approved)
Priority
P1 - High priority documentation improvement that will significantly enhance user experience and plot discoverability.