Commit 2ce2eec

mvanwyk and Copilot authored
Added threshold segmentation analysis modules docs (#111)
* docs: added hml seg example to analysis modules page
* Update docs/analysis_modules.md

  Typo fix

  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* docs: added threshold seg example to analysis modules page

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
1 parent becf774 commit 2ce2eec

2 files changed

Lines changed: 1805 additions & 3 deletions

docs/analysis_modules.md

Lines changed: 55 additions & 3 deletions
@@ -462,6 +462,7 @@ rev_tree = revenue_tree.RevenueTree(
 <div class="clear" markdown>
 
 ![HML Segmentation Distribution](assets/images/analysis_modules/hml_segmentation.svg){ align=right loading=lazy width="50%"}
+
 Heavy, Medium, Light (HML) is a segmentation that places customers into groups based on their percentile of spend or the
 number of products they bought. Heavy customers are the top 20% of customers, medium are the next 30%, and light are the
 bottom 50% of customers. These values are chosen based on the proportions of the Pareto distribution. Often, purchase
@@ -515,16 +516,67 @@ bar.plot(
 
 <div class="clear" markdown>
 
-![Image title](https://placehold.co/600x400/EEE/31343C){ align=right loading=lazy width="50%"}
+![Threshold Segmentation Distribution](
+assets/images/analysis_modules/threshold_segmentation.svg
+){align=right loading=lazy width="50%"}
 
-PASTE TEXT HERE
+Threshold Segmentation offers a flexible approach to customer grouping based on custom-defined percentile thresholds.
+Unlike the fixed 20/30/50 split in HML segmentation, Threshold Segmentation allows you to specify your own thresholds
+and segment names, making it adaptable to various business needs.
+
+This flexibility enables you to:
+
+- Create quartile segmentations (e.g., top 25%, next 25%, etc.)
+- Define custom tiers based on your specific business model
+- Segment customers based on alternative metrics beyond spend, such as visit frequency or product variety
+
+Like HML segmentation, the module provides options for handling customers with zero values, allowing you to include
+them with the lowest segment, exclude them entirely, or place them in a separate segment.
 
 </div>
 
 Example:
 
 ```python
-PASTE CODE HERE
+import numpy as np
+import pandas as pd
+
+from pyretailscience.plots import bar
+from pyretailscience.segmentation import ThresholdSegmentation
+
+# Create sample transaction data
+rng = np.random.default_rng(42)
+df = pd.DataFrame(
+    {
+        "customer_id": np.repeat(range(1, 51), 3),  # 50 customers with 3 transactions each
+        "unit_spend": rng.pareto(a=1.5, size=150) * 20,  # Pareto distribution to mimic real spending
+    },
+)
+
+# Create custom segmentation with quartiles
+# Define thresholds at 25%, 50%, 75%, and 100% (quartiles)
+thresholds = [0.25, 0.50, 0.75, 1.0]
+segments = ["Bronze", "Silver", "Gold", "Platinum"]
+
+# Create threshold segmentation
+seg = ThresholdSegmentation(
+    df=df,
+    thresholds=thresholds,
+    segments=segments,
+    zero_value_customers="separate_segment",
+)
+
+# Visualize spend by segment
+bar.plot(
+    seg.df.groupby("segment_name")["unit_spend"].sum(),
+    value_col="unit_spend",
+    source_text="Source: PyRetailScience",
+    sort_order="descending",
+    x_label="",
+    y_label="Segment Spend",
+    title="Customer Value by Segment",
+    rot=0,
+)
 ```
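
For readers who want to see the mechanics behind percentile thresholds, here is a minimal sketch in plain pandas. It is not PyRetailScience's implementation: the sample values, the `"Zero"` label, and the tie-breaking choice (`method="first"`) are assumptions made for illustration. It ranks non-zero customers by spend, bins their percentile ranks at the same quartile thresholds as the example above, and places zero-spend customers in a separate segment.

```python
import pandas as pd

# Hypothetical per-customer spend totals; customer 5 has no spend.
spend = pd.Series(
    [120.0, 45.0, 300.0, 80.0, 0.0, 15.0, 60.0, 210.0, 95.0],
    index=pd.Index(range(1, 10), name="customer_id"),
    name="unit_spend",
)

thresholds = [0.25, 0.50, 0.75, 1.0]  # upper percentile bound of each segment
segments = ["Bronze", "Silver", "Gold", "Platinum"]  # lowest to highest spend

# Set zero-spend customers aside so they do not affect the percentile ranks.
is_zero = spend == 0
non_zero = spend[~is_zero]

# Percentile rank in (0, 1]; method="first" breaks ties by order of appearance.
pct_rank = non_zero.rank(method="first", pct=True)

# Bin the ranks into right-closed intervals (0, 0.25], (0.25, 0.50], and so on.
segment_name = pd.cut(pct_rank, bins=[0.0, *thresholds], labels=segments)

# Recombine with the full customer list and label the zero-spend customers.
result = pd.DataFrame(
    {"unit_spend": spend, "segment_name": segment_name.astype(object)}
)
result.loc[is_zero, "segment_name"] = "Zero"

print(result.sort_values("unit_spend", ascending=False))
```

Changing `thresholds` to `[0.5, 0.8, 1.0]` and `segments` to `["Light", "Medium", "Heavy"]` would reproduce the 50/30/20 split described for HML segmentation.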
 
 ### Segmentation Stats
