@@ -462,6 +462,7 @@ rev_tree = revenue_tree.RevenueTree(
 <div class="clear" markdown>
 
 ![HML Segmentation Distribution](assets/images/analysis_modules/hml_segmentation.svg){ align=right loading=lazy width="50%"}
+
 Heavy, Medium, Light (HML) is a segmentation that places customers into groups based on their percentile of spend or the
 number of products they bought. Heavy customers are the top 20% of customers, medium are the next 30%, and light are the
 bottom 50% of customers. These values are chosen based on the proportions of the Pareto distribution. Often, purchase
@@ -515,16 +516,67 @@ bar.plot(
 
 <div class="clear" markdown>
 
-![Image title](https://placehold.co/600x400/EEE/31343C){ align=right loading=lazy width="50%"}
+![Threshold Segmentation Distribution](
+assets/images/analysis_modules/threshold_segmentation.svg
+){align=right loading=lazy width="50%"}
 
-PASTE TEXT HERE
+Threshold Segmentation offers a flexible approach to customer grouping based on custom-defined percentile thresholds.
+Unlike the fixed 20/30/50 split in HML segmentation, Threshold Segmentation allows you to specify your own thresholds
+and segment names, making it adaptable to various business needs.
+
+This flexibility enables you to:
+
+- Create quartile segmentations (e.g., top 25%, next 25%, etc.)
+- Define custom tiers based on your specific business model
+- Segment customers based on alternative metrics beyond spend, such as visit frequency or product variety
+
+Like HML segmentation, the module provides options for handling customers with zero values, allowing you to include
+them with the lowest segment, exclude them entirely, or place them in a separate segment.
 
 </div>
 
 Example:
 
 ```python
-PASTE CODE HERE
+import numpy as np
+import pandas as pd
+
+from pyretailscience.plots import bar
+from pyretailscience.segmentation import ThresholdSegmentation
+
+# Create sample transaction data
+rng = np.random.default_rng(42)
+df = pd.DataFrame(
+    {
+        "customer_id": np.repeat(range(1, 51), 3),  # 50 customers with 3 transactions each
+        "unit_spend": rng.pareto(a=1.5, size=150) * 20,  # Pareto distribution to mimic real spending
+    },
+)
+
+# Create custom segmentation with quartiles
+# Define thresholds at 25%, 50%, 75%, and 100% (quartiles)
+thresholds = [0.25, 0.50, 0.75, 1.0]
+segments = ["Bronze", "Silver", "Gold", "Platinum"]
+
+# Create threshold segmentation
+seg = ThresholdSegmentation(
+    df=df,
+    thresholds=thresholds,
+    segments=segments,
+    zero_value_customers="separate_segment",
+)
+
+# Visualize spend by segment
+bar.plot(
+    seg.df.groupby("segment_name")["unit_spend"].sum(),
+    value_col="unit_spend",
+    source_text="Source: PyRetailScience",
+    sort_order="descending",
+    x_label="",
+    y_label="Segment Spend",
+    title="Customer Value by Segment",
+    rot=0,
+)
 ```
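
To make the threshold mechanics concrete, here is a minimal pure-Python sketch of percentile-based segment assignment. It illustrates the idea under stated assumptions and is not PyRetailScience's implementation: the helper name `assign_segments`, the rank-divided-by-count percentile definition, and the `"Zero"` segment label are hypothetical choices made for this example.

```python
def assign_segments(spend_by_customer, thresholds, segments, zero_segment="Zero"):
    """Hypothetical helper: map each customer to a segment by percentile rank of spend."""
    if len(thresholds) != len(segments):
        raise ValueError("thresholds and segments must have the same length")
    if thresholds[-1] != 1.0:
        raise ValueError("the last threshold must be 1.0 so every customer is assigned")

    # Assumption: zero-spend customers are placed in their own segment
    result = {cid: zero_segment for cid, spend in spend_by_customer.items() if spend == 0}

    # Rank the remaining customers from lowest to highest spend
    ranked = sorted(
        ((cid, spend) for cid, spend in spend_by_customer.items() if spend != 0),
        key=lambda item: item[1],
    )
    for rank, (cid, _) in enumerate(ranked, start=1):
        percentile = rank / len(ranked)  # assumed percentile definition: rank / count
        # Pick the first segment whose upper threshold covers this percentile
        for threshold, name in zip(thresholds, segments):
            if percentile <= threshold:
                result[cid] = name
                break
    return result


# Eight customers spending 10, 20, ..., 80, plus one customer with no spend
spend = {cid: cid * 10.0 for cid in range(1, 9)}
spend[9] = 0.0

segments_by_customer = assign_segments(
    spend,
    thresholds=[0.25, 0.50, 0.75, 1.0],
    segments=["Bronze", "Silver", "Gold", "Platinum"],
)
print(segments_by_customer)
```

With these inputs the two lowest spenders land in Bronze and the two highest in Platinum, while the zero-spend customer is kept in its own segment.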
 
 ### Segmentation Stats