Added ThresholdSegmentation class #58
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- pyretailscience/segmentation.py (2 hunks)
- tests/test_segmentation.py (3 hunks)
Additional comments not posted (25)
pyretailscience/segmentation.py (9)
70-84: Constructor enhancements and parameter validation look good!
The new parameters and enhanced error handling improve the flexibility of the ThresholdSegmentation class. Ensure that all parameters are correctly passed and utilized in the segmentation logic.

101-103: Validate DataFrame for emptiness.
Good practice to check if the DataFrame is empty before proceeding.

115-117: Check for sufficient customers relative to thresholds.
Ensuring that the number of customers is not less than the number of thresholds is a good validation step.

119-121: Ensure thresholds and segments match.
Validating that the number of thresholds matches the number of segments prevents potential segmentation errors.

123-133: Separate customers with zero spend.
The logic for separating zero spend customers is clear and well-implemented. Ensure that the handling of zero spend customers aligns with the provided options.

136-140: Ensure thresholds cover all values.
Adding a zero threshold if not present ensures that all values are covered.

147-151: Check for unsegmented customers.
Raising an error if some customers are not segmented based on thresholds is a good validation step.

155-155: Combine zero spend customers if needed.
Concatenating the zero spend customers back to the main DataFrame if required is handled well.

158-186: Constructor correctly initializes superclass with default thresholds and segments.
The HMLSegmentation class simplifies segmentation by providing default parameters for thresholds and segments, which are correctly passed to the superclass.

tests/test_segmentation.py (16)
93-113: Comprehensive test for correct segmentation.
The test ensures that customers are correctly segmented based on given thresholds and segments.

114-125: Test for single customer segmentation.
The test correctly raises a ValueError for a DataFrame with only one customer, ensuring thresholds and segments are appropriately validated.

126-170: Test for correct aggregation function.
The test verifies that the correct aggregation function is applied, ensuring flexibility in segmentation criteria.

171-208: Test for merging segment data back into the original DataFrame.
The test ensures that segment data is correctly merged back, validating the integrity of the original DataFrame.

209-224: Test for handling duplicate customer ID entries.
The test ensures that duplicate customer IDs are correctly handled, maintaining the DataFrame's integrity.

225-246: Test for mapping segment names to segment IDs with fixed thresholds.
The test ensures correct mapping of segment names to IDs, validating the consistency of segmentation.

247-255: Test for incomplete threshold coverage.
The test correctly raises an error when thresholds do not cover all values, ensuring comprehensive segmentation.

268-282: Test for handling empty DataFrame.
The test correctly raises an error for an empty DataFrame, ensuring required columns are present.

284-299: Test for excluding zero spend customers.
The test ensures zero spend customers are correctly excluded based on the specified parameter.

301-317: Test for including zero spend customers with light spenders.
The test ensures zero spend customers are correctly included with light spenders based on the specified parameter.

319-334: Test for separating zero spend customers.
The test ensures zero spend customers are correctly separated into their own segment based on the specified parameter.

336-340: Test for missing required columns.
The test correctly raises an error when required columns are missing, ensuring DataFrame integrity.

342-348: Test for single customer segmentation.
The test correctly raises a ValueError for a DataFrame with only one customer, ensuring thresholds and segments are appropriately validated.

350-358: Test for immutability of input DataFrame.
The test ensures the input DataFrame is not altered, maintaining data integrity.

359-372: Test for alternate value column.
The test ensures correct segmentation when an alternate value column is used.

Line range hint 256-266: Test for handling empty DataFrame with errors.
The test correctly raises an error when the DataFrame is missing a required column, ensuring required columns are present.
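
The checks noted for pyretailscience/segmentation.py (empty input, enough customers, matching thresholds and segments, full threshold coverage) can be illustrated with a small standalone sketch. It is not the pyretailscience implementation: the function name segment_by_thresholds, the plain dict input, and the assumption that thresholds are ascending upper bounds on each customer's percentile rank are all hypothetical choices made for illustration.

```python
from bisect import bisect_left


def segment_by_thresholds(spend, thresholds, segments):
    """Map each customer to a segment name by percentile rank of spend.

    Hypothetical helper, not the pyretailscience API.
    spend: dict of customer_id -> total spend.
    thresholds: ascending upper bounds on percentile rank, e.g. [0.5, 0.8, 1.0].
    segments: one segment name per threshold, lowest first.
    """
    if not spend:
        raise ValueError("No customers to segment")
    if len(thresholds) != len(segments):
        raise ValueError("Number of thresholds must match number of segments")
    if len(spend) < len(thresholds):
        raise ValueError("Fewer customers than thresholds")

    # Rank customers from lowest to highest spend; ties keep input order.
    ranked = sorted(spend, key=spend.get)
    result = {}
    for position, customer_id in enumerate(ranked, start=1):
        percentile = position / len(ranked)
        # Index of the first threshold that is >= this customer's percentile.
        idx = bisect_left(thresholds, percentile)
        if idx == len(thresholds):
            raise ValueError("Thresholds do not cover all customers")
        result[customer_id] = segments[idx]
    return result


example = segment_by_thresholds(
    {"a": 100, "b": 200, "c": 300, "d": 400, "e": 500},
    thresholds=[0.5, 0.8, 1.0],
    segments=["Light", "Medium", "Heavy"],
)
```

With five customers and these thresholds, the two lowest spenders land in Light, the next two in Medium, and the top spender in Heavy. Passing an empty dict, a single customer, or thresholds that stop below 1.0 raises ValueError, which mirrors the error paths the tests above exercise.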
448a0dd to fa6d7d7
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- pyretailscience/segmentation.py (2 hunks)
- tests/test_segmentation.py (3 hunks)
Files skipped from review as they are similar to previous changes (1)
- pyretailscience/segmentation.py
Additional comments not posted (21)
tests/test_segmentation.py (21)
93-94: Review class TestThresholdSegmentation
The class TestThresholdSegmentation is introduced to cover the new ThresholdSegmentation class.

96-113: Review method test_correct_segmentation
The method test_correct_segmentation correctly verifies that customers are segmented based on the provided thresholds and segments.

114-125: Review method test_single_customer
The method test_single_customer correctly verifies that a ValueError is raised when attempting to segment a single customer.

126-165: Review method test_correct_aggregation_function
The method test_correct_aggregation_function correctly verifies that the aggregation function is applied and the segmentation is accurate.

166-203: Review method test_correctly_checks_segment_data
The method test_correctly_checks_segment_data correctly verifies that segment data is merged back into the original DataFrame accurately.

204-219: Review method test_handles_dataframe_with_duplicate_customer_id_entries
The method test_handles_dataframe_with_duplicate_customer_id_entries correctly verifies that the segmentation handles duplicate customer IDs.

220-241: Review method test_correctly_maps_segment_names_to_segment_ids_with_fixed_thresholds
The method test_correctly_maps_segment_names_to_segment_ids_with_fixed_thresholds correctly verifies that segment names and IDs are mapped accurately.

242-250: Review method test_thresholds_not_unique
The method test_thresholds_not_unique correctly verifies that a ValueError is raised when the thresholds are not unique.

251-259: Review method test_thresholds_too_few_segments
The method test_thresholds_too_few_segments correctly verifies that a ValueError is raised when the number of segments does not match the number of thresholds.

265-277: Review method test_thresholds_too_too_few_thresholds
The method test_thresholds_too_too_few_thresholds correctly verifies that a ValueError is raised when the number of thresholds does not match the number of segments.

291-292: Review class TestHMLSegmentation
The class TestHMLSegmentation is introduced to cover the new HMLSegmentation class.

299-305: Review method test_no_transactions
The method test_no_transactions correctly verifies that a ValueError is raised when there are no transactions.

307-323: Review method test_handles_zero_spend_customers_are_excluded_in_result
The method test_handles_zero_spend_customers_are_excluded_in_result correctly verifies that zero spend customers are excluded from the segmentation results when zero_value_customers is set to "exclude".

325-340: Review method test_handles_zero_spend_customers_include_with_light
The method test_handles_zero_spend_customers_include_with_light correctly verifies that zero spend customers are included in the "Light" segment when zero_value_customers is set to "include_with_light".

342-357: Review method test_handles_zero_spend_customers_separate_segment
The method test_handles_zero_spend_customers_separate_segment correctly verifies that zero spend customers are placed in a separate segment when zero_value_customers is set to "separate_segment".

359-363: Review method test_raises_value_error_if_required_columns_missing
The method test_raises_value_error_if_required_columns_missing correctly verifies that a ValueError is raised when required columns are missing.

365-371: Review method test_segments_customer_single
The method test_segments_customer_single correctly verifies that a ValueError is raised when the DataFrame contains only one customer.

373-381: Review method test_input_dataframe_not_changed
The method test_input_dataframe_not_changed correctly verifies that the original DataFrame remains unchanged after segmentation.

382-395: Review method test_alternate_value_col
The method test_alternate_value_col correctly verifies that the segmentation works with an alternate value column.

278-279: Review class TestSegTransactionStats
The class TestSegTransactionStats contains tests for the SegTransactionStats class.

Line range hint 278-289: Review method test_handles_empty_dataframe_with_errors
The method test_handles_empty_dataframe_with_errors correctly verifies that a ValueError is raised when the DataFrame is missing a required column.
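
The three zero_value_customers options exercised by the TestHMLSegmentation tests, together with the idea of HMLSegmentation passing fixed defaults to its superclass, can be sketched in plain Python. Everything here is an assumption for illustration and is not taken from the pyretailscience source: the class names ThresholdSegmenter and HMLSegmenter, the dict input, the default thresholds [0.5, 0.8, 1.0], the "separate_segment" default, and the choice to place zero-spend customers in the lowest segment for "include_with_light".

```python
from bisect import bisect_left

ZERO_OPTIONS = ("exclude", "include_with_light", "separate_segment")


class ThresholdSegmenter:
    """Hypothetical stand-in for a threshold-based segmentation class."""

    def __init__(self, spend, thresholds, segments,
                 zero_value_customers="separate_segment"):
        if zero_value_customers not in ZERO_OPTIONS:
            raise ValueError(f"Unknown option: {zero_value_customers!r}")
        if len(thresholds) != len(segments):
            raise ValueError("thresholds and segments must have equal length")

        # Set zero-spend customers aside before ranking the rest.
        zero_ids = [cid for cid, value in spend.items() if value == 0]
        positive = {cid: value for cid, value in spend.items() if value != 0}
        if not positive:
            raise ValueError("No customers with non-zero spend")

        # Rank by spend and pick the first threshold >= the percentile rank.
        ranked = sorted(positive, key=positive.get)
        self.segments = {}
        for position, cid in enumerate(ranked, start=1):
            idx = bisect_left(thresholds, position / len(ranked))
            if idx == len(segments):
                raise ValueError("Thresholds do not cover all customers")
            self.segments[cid] = segments[idx]

        # Add zero-spend customers back according to the chosen option.
        if zero_value_customers == "include_with_light":
            self.segments.update({cid: segments[0] for cid in zero_ids})
        elif zero_value_customers == "separate_segment":
            self.segments.update({cid: "Zero" for cid in zero_ids})
        # "exclude": zero-spend customers are simply left out of the result.


class HMLSegmenter(ThresholdSegmenter):
    """Heavy/Medium/Light variant that only supplies fixed defaults."""

    def __init__(self, spend, zero_value_customers="separate_segment"):
        super().__init__(
            spend,
            thresholds=[0.5, 0.8, 1.0],  # assumed values, for illustration
            segments=["Light", "Medium", "Heavy"],
            zero_value_customers=zero_value_customers,
        )
```

Keeping the zero-spend handling in the base class means the subclass only has to pin the thresholds and segment names, which matches the review's description of HMLSegmentation as a thin wrapper over ThresholdSegmentation.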
* feat: add input validation and tests in HMLSegmentation
* feat: added treshold segmentation creation
PR Type
Enhancement, Tests

Description
- HMLSegmentation to ThresholdSegmentation with enhanced functionality.
- HMLSegmentation as a subclass of ThresholdSegmentation with predefined thresholds and segments.
- ThresholdSegmentation and HMLSegmentation classes to ensure correct segmentation and error handling.

Changes walkthrough 📝
segmentation.py
Refactor and enhance segmentation classes with validation
pyretailscience/segmentation.py
- HMLSegmentation to ThresholdSegmentation.
- and segments.
- HMLSegmentation as a subclass of ThresholdSegmentation.

test_segmentation.py
Add comprehensive tests for segmentation classes
tests/test_segmentation.py
- ThresholdSegmentation class.
- HMLSegmentation class.

Summary by CodeRabbit
New Features
- HMLSegmentation class for streamlined Heavy, Medium, Light, and Zero spenders segmentation.
- ThresholdSegmentation for customizable user-defined thresholds and segments.
Bug Fixes
Tests