Added pandas style option system and updated segmentations module to use it. #66
Conversation
…ntation code and related tests
Walkthrough
The recent updates introduce a comprehensive options management system and enhance the configurability of segmentation functionality within the package.
Changes
Sequence Diagram(s)
sequenceDiagram
participant User
participant Options
participant Segmentation
User->>Options: set_option("column.customer_id", "customer_id")
User->>Options: get_option("column.customer_id")
Options-->>User: "customer_id"
User->>Segmentation: add_segment(dataframe)
Segmentation->>Options: get_option("column.customer_id")
Options-->>Segmentation: "customer_id"
Segmentation->>Segmentation: process segmentation
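The flow in the diagram can be sketched in Python roughly as follows. This is a minimal, self-contained illustration and not the pyretailscience implementation: the `Options` class is a hypothetical stand-in, the default value is an assumption, and `add_segment` works on plain dictionaries rather than a DataFrame, purely to show how segmentation code might look up the customer ID column name at call time.

```python
class Options:
    """Hypothetical stand-in for an options registry (not the library's class)."""

    def __init__(self):
        # The default below is an assumption made for illustration only.
        self._options = {"column.customer_id": "customer_id"}

    def set_option(self, name, value):
        # Reject names that were never registered.
        if name not in self._options:
            raise ValueError(f"Unknown option: {name}")
        self._options[name] = value

    def get_option(self, name):
        if name not in self._options:
            raise ValueError(f"Unknown option: {name}")
        return self._options[name]


options = Options()


def add_segment(rows, segments):
    """Attach a segment to each row, keyed by the configured customer ID column."""
    id_col = options.get_option("column.customer_id")
    return [{**row, "segment": segments.get(row[id_col])} for row in rows]


# The user renames the customer ID column once; add_segment picks it up.
options.set_option("column.customer_id", "cust_id")
rows = [{"cust_id": 1, "spend": 10.0}, {"cust_id": 2, "spend": 3.5}]
result = add_segment(rows, {1: "Heavy", 2: "Light"})
```

Because the column name is read inside `add_segment` rather than at import time, a later `set_option` call takes effect without changing the segmentation code.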
Actionable comments posted: 0
Outside diff range, codebase verification and nitpick comments (2)
pyretailscience/options.py (1)
203-203: Missing Test Coverage
The line where the TOML file is checked for existence is not covered by tests. Specifically, there are no tests that handle the scenario where the TOML file is not found (e.g., tests that involve exception handling or raising errors).
- Ensure that the test cases cover scenarios where the TOML file is not found.
Analysis chain
Missing Test Coverage
The line where the TOML file is checked for existence is not covered by tests.
Ensure that the test cases cover scenarios where the TOML file is found and not found.
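One way to exercise both branches of such an existence check is sketched below. The helper `config_path_or_none` is hypothetical and only mirrors the idea of checking for a `pyretailscience.toml` file before loading it; the real `load_from_project` may be structured differently.

```python
import tempfile
from pathlib import Path


def config_path_or_none(project_root):
    """Hypothetical helper: return the TOML path only when the file exists."""
    if project_root is None:
        return None
    candidate = Path(project_root) / "pyretailscience.toml"
    return candidate if candidate.is_file() else None


with tempfile.TemporaryDirectory() as tmp:
    # Branch 1: the project root exists but holds no config file.
    missing = config_path_or_none(tmp)

    # Branch 2: the config file is present.
    (Path(tmp) / "pyretailscience.toml").write_text("[column]\n")
    found = config_path_or_none(tmp)
    found_name = found.name if found is not None else None
```

A test suite would assert on both outcomes, so the line holding the existence check is executed for the found and the not-found case.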
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify test coverage for TOML file existence check.
# Test: Ensure test cases cover scenarios where the TOML file is found and not found.
rg --type python -A 5 'load_from_toml' | grep -E 'assert|test'
Length of output: 93
Script:
#!/bin/bash
# Description: Verify test coverage for TOML file existence check.
# Test: Ensure test cases cover scenarios where the TOML file is found and not found.
rg --glob '*.py' -A 5 'load_from_toml' | grep -E 'assert|test'
Length of output: 1322
Script:
#!/bin/bash
# Description: Verify test coverage for TOML file existence check, including scenarios where the file is not found.
# Test: Ensure test cases cover scenarios where the TOML file is found and not found.
rg --glob '*.py' -A 5 'load_from_toml' | grep -E 'assert|test|except|raise'
Length of output: 1335
Tools
GitHub Check: codecov/patch
[warning] 203-203: pyretailscience/options.py#L203
Added line #L203 was not covered by tests
pyretailscience/segmentation.py (1)
227-227: Missing Test Coverage
The line where the unit quantity column is checked for existence is not covered by tests.
- No assertions or test cases specifically validate the existence of the unit_quantity column in tests/test_segmentation.py.
Analysis chain
Missing Test Coverage
The line where the unit quantity column is checked for existence is not covered by tests.
Ensure that the test cases cover scenarios where the unit quantity column is checked for existence.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify test coverage for unit quantity column existence check.
# Test: Ensure test cases cover scenarios where the unit quantity column is checked for existence.
rg --type python -A 5 'unit_quantity' | grep -E 'assert|test'
Length of output: 92
Script:
#!/bin/bash
# Description: Verify test coverage for unit quantity column existence check.
# Test: Ensure test cases cover scenarios where the unit quantity column is checked for existence.
rg --glob '*.py' -A 5 'unit_quantity' | grep -E 'assert|test'
Length of output: 1408
Script:
#!/bin/bash
# Description: Verify test coverage for unit quantity column existence check.
# Test: Ensure that test cases specifically validate the existence of the unit quantity column.
rg --glob 'tests/test_segmentation.py' -A 5 'def test_' | grep -E 'unit_quantity|assert'
Length of output: 347
Tools
GitHub Check: codecov/patch
[warning] 227-227: pyretailscience/segmentation.py#L227
Added line #L227 was not covered by tests
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (7)
- pyretailscience/options.py (1 hunks)
- pyretailscience/segmentation.py (13 hunks)
- tests/test_options.py (1 hunks)
- tests/test_segmentation.py (19 hunks)
- tests/toml_files/corrupt.toml (1 hunks)
- tests/toml_files/invalid_option.toml (1 hunks)
- tests/toml_files/valid.toml (1 hunks)
Additional context used
GitHub Check: codecov/patch
pyretailscience/options.py
[warning] 199-199: pyretailscience/options.py#L199
Added line #L199 was not covered by tests
[warning] 203-203: pyretailscience/options.py#L203
Added line #L203 was not covered by tests
pyretailscience/segmentation.py
[warning] 60-60: pyretailscience/segmentation.py#L60
Added line #L60 was not covered by tests
[warning] 72-72: pyretailscience/segmentation.py#L72
Added line #L72 was not covered by tests
[warning] 227-227: pyretailscience/segmentation.py#L227
Added line #L227 was not covered by tests
Additional comments not posted (63)
tests/toml_files/corrupt.toml (1)
1-1: Confirm the purpose of this test case.
The content is not in TOML format and is expected to generate an error. Ensure that this test case is intended to validate error handling for invalid TOML files.
tests/toml_files/invalid_option.toml (1)
1-3: Confirm the purpose of this test case.
The file contains a TOML configuration with an invalid option (unknown_column). Ensure that this test case is intended to validate error handling for invalid options.
tests/toml_files/valid.toml (1)
1-12: Confirm the purpose of this test case.
The file contains a valid TOML configuration with various column options. Ensure that this test case is intended to validate correct option parsing and usage.
tests/test_options.py (20)
15-25: LGTM!
The test method correctly verifies that setting, getting, resetting, and describing an unknown option raises a ValueError.
27-30: LGTM!
The test method correctly verifies that listing all options returns all options.
32-36: LGTM!
The test method correctly verifies that setting an option updates the option value correctly.
38-43: LGTM!
The test method correctly verifies that getting an option retrieves the correct value.
45-51: LGTM!
The test method correctly verifies that resetting an option restores its default value.
53-61: LGTM!
The test method correctly verifies that describing an option provides the correct description and current value.
63-66: LGTM!
The test method correctly verifies that all options have a corresponding description and vice versa.
68-73: LGTM!
The test method correctly verifies that the context manager overrides the option value correctly at the global level.
75-81: LGTM!
The test method correctly verifies that the context manager raises a ValueError when an odd number of arguments is passed.
83-87: LGTM!
The test method correctly verifies that setting an option updates the option value correctly at the global level.
89-97: LGTM!
The test method correctly verifies that getting an option retrieves the correct value at the global level.
99-108: LGTM!
The test method correctly verifies that resetting an option restores its default value at the global level.
110-120: LGTM!
The test method correctly verifies that describing an option provides the correct description and current value at the global level.
122-128: LGTM!
The test method correctly verifies that listing all options returns all options at the global level.
130-134: LGTM!
The test method correctly verifies that loading an invalid TOML file raises a ValueError.
136-145: LGTM!
The test method correctly verifies that loading a valid TOML file updates the options correctly.
146-150: LGTM!
The test method correctly verifies that loading an invalid TOML file with unknown options raises a ValueError.
152-168: LGTM!
The test method correctly verifies that flattening the options dictionary works correctly.
170-175: LGTM!
The fixture correctly ensures that the LRU cache is cleared before and after the test.
176-218: LGTM!
The test methods correctly verify various scenarios for finding the project root, including when the .git directory or pyretailscience.toml file is found, and when no project root is found.
tests/test_segmentation.py (20)
6-6: LGTM! Import statement for get_option.
The import statement for get_option from pyretailscience.options is necessary for the dynamic column name retrieval.
18-22: LGTM! Dynamic column names in base_df method.
The base_df method in TestCalcSegStats has been updated to use get_option for column names, enhancing flexibility and maintainability.
30-35: LGTM! Dynamic column names in expected output.
The test_correctly_calculates_revenue_transactions_customers_per_segment method has been updated to use get_option for expected output column names, ensuring adaptability to changes in column definitions.
47-58: LGTM! Dynamic column names in test_correctly_calculates_revenue_transactions_customers method.
The test_correctly_calculates_revenue_transactions_customers method has been updated to use get_option for column names in the DataFrame and expected output, ensuring adaptability to changes in column definitions.
80-85: LGTM! Dynamic column names in test_handles_dataframe_with_one_segment method.
The test_handles_dataframe_with_one_segment method has been updated to use get_option for expected output column names, ensuring adaptability to changes in column definitions.
99-111: LGTM! Dynamic column names in test_correct_segmentation method.
The test_correct_segmentation method in TestThresholdSegmentation has been updated to use get_option for column names in the DataFrame and value_col parameter, ensuring adaptability to changes in column definitions.
122-122: LGTM! Dynamic column names in test_single_customer method.
The test_single_customer method in TestThresholdSegmentation has been updated to use get_option for column names in the DataFrame, ensuring adaptability to changes in column definitions.
Line range hint 136-154: LGTM! Dynamic column names in test_correct_aggregation_function method.
The test_correct_aggregation_function method in TestThresholdSegmentation has been updated to use get_option for column names in the DataFrame and expected result, ensuring adaptability to changes in column definitions.
Line range hint 176-203: LGTM! Dynamic column names in test_correctly_checks_segment_data method.
The test_correctly_checks_segment_data method in TestThresholdSegmentation has been updated to use get_option for column names in the DataFrame and expected result, ensuring adaptability to changes in column definitions.
212-221: LGTM! Dynamic column names in test_handles_dataframe_with_duplicate_customer_id_entries method.
The test_handles_dataframe_with_duplicate_customer_id_entries method in TestThresholdSegmentation has been updated to use get_option for column names in the DataFrame and value_col parameter, ensuring adaptability to changes in column definitions.
234-240: LGTM! Dynamic column names in test_correctly_maps_segment_names_to_segment_ids_with_fixed_thresholds method.
The test_correctly_maps_segment_names_to_segment_ids_with_fixed_thresholds method in TestThresholdSegmentation has been updated to use get_option for column names in the DataFrame and value_col parameter, ensuring adaptability to changes in column definitions.
260-265: LGTM! Dynamic column names in test_thresholds_not_unique method.
The test_thresholds_not_unique method in TestThresholdSegmentation has been updated to use get_option for column names in the DataFrame, ensuring adaptability to changes in column definitions.
274-279: LGTM! Dynamic column names in test_thresholds_too_few_segments method.
The test_thresholds_too_few_segments method in TestThresholdSegmentation has been updated to use get_option for column names in the DataFrame, ensuring adaptability to changes in column definitions.
293-298: LGTM! Dynamic column names in test_thresholds_too_too_few_thresholds method.
The test_thresholds_too_too_few_thresholds method in TestThresholdSegmentation has been updated to use get_option for column names in the DataFrame, ensuring adaptability to changes in column definitions.
316-318: LGTM! Dynamic column names in test_handles_empty_dataframe_with_errors method.
The test_handles_empty_dataframe_with_errors method in TestSegTransactionStats has been updated to use get_option for column names in the DataFrame, ensuring adaptability to changes in column definitions.
330-335: LGTM! Dynamic column names in base_df method.
The base_df method in TestHMLSegmentation has been updated to use get_option for column names, enhancing flexibility and maintainability.
339-339: LGTM! Dynamic column names in test_no_transactions method.
The test_no_transactions method in TestHMLSegmentation has been updated to use get_option for column names in the DataFrame, ensuring adaptability to changes in column definitions.
400-400: LGTM! Dynamic column names in test_raises_value_error_if_required_columns_missing method.
The test_raises_value_error_if_required_columns_missing method in TestHMLSegmentation has been updated to use get_option for column names in the DataFrame, ensuring adaptability to changes in column definitions.
405-405: LGTM! Dynamic column names in test_segments_customer_single method.
The test_segments_customer_single method in TestHMLSegmentation has been updated to use get_option for column names in the DataFrame, ensuring adaptability to changes in column definitions.
422-422: LGTM! Dynamic column names in test_alternate_value_col method.
The test_alternate_value_col method in TestHMLSegmentation has been updated to use get_option for renaming the column in the DataFrame, ensuring adaptability to changes in column definitions.
pyretailscience/options.py (13)
29-31: Class Documentation
The class Options is well-documented and provides a clear description of its purpose.
32-95: Initialization Logic
The __init__ method correctly initializes the options and descriptions with default values.
97-111: Set Option Logic
The set_option method correctly sets the value of the specified option and handles unknown options appropriately.
113-129: Get Option Logic
The get_option method correctly retrieves the value of the specified option and handles unknown options appropriately.
131-145: Reset Option Logic
The reset_option method correctly resets the specified option to its default value and handles unknown options appropriately.
146-152: List Options Logic
The list_options method correctly lists all available options.
154-170: Describe Option Logic
The describe_option method correctly describes the specified option and handles unknown options appropriately.
172-184: Flatten Options Logic
The flatten_options method correctly flattens nested options into a single dictionary.
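Flattening of this kind can be sketched as a short recursive function. This is an illustrative version only: the signature, the dotted-key separator, and the sample values are assumptions, and the `column.agg.example` key is made up to show a deeper level of nesting.

```python
def flatten_options(nested, parent_key=""):
    """Flatten nested dicts into a single dict with dotted keys."""
    flat = {}
    for key, value in nested.items():
        # Build the dotted path, e.g. "column" + "customer_id".
        full_key = f"{parent_key}.{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into sub-tables and merge their flattened keys.
            flat.update(flatten_options(value, full_key))
        else:
            flat[full_key] = value
    return flat


nested = {
    "column": {
        "customer_id": "cust_id",
        "unit_spend": "spend",
        "agg": {"example": "x"},  # hypothetical deeper level
    }
}
flat = flatten_options(nested)
```

A TOML file parsed into nested tables has the same shape as `nested`, which is why a flattening step of this sort is a natural bridge between the file layout and dotted option names such as `column.customer_id`.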
187-205: Load From Project Logic
The load_from_project method correctly attempts to load options from a pyretailscience.toml file in the project root directory and handles cases where the project root cannot be found.
Tools
GitHub Check: codecov/patch
[warning] 199-199: pyretailscience/options.py#L199
Added line #L199 was not covered by tests
[warning] 203-203: pyretailscience/options.py#L203
Added line #L203 was not covered by tests
207-230: Load From TOML Logic
The load_from_toml method correctly loads options from a TOML file and handles unknown options appropriately.
233-247: Find Project Root Logic
The find_project_root function correctly traverses up the directory tree to find the project root and handles cases where the project root cannot be found.
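An upward directory search of this kind might look like the sketch below. It is written under assumptions: the `start` parameter is added here so the example is self-contained, and the markers checked (a `.git` directory or a `pyretailscience.toml` file) are taken from the test descriptions in this review rather than from the source of the function itself.

```python
import tempfile
from pathlib import Path


def find_project_root(start):
    """Walk upward from `start` until a project marker is found, else return None."""
    current = Path(start).resolve()
    for directory in (current, *current.parents):
        has_git = (directory / ".git").is_dir()
        has_toml = (directory / "pyretailscience.toml").is_file()
        if has_git or has_toml:
            return directory
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / "pyretailscience.toml").write_text("")
    nested_dir = root / "src" / "pkg"
    nested_dir.mkdir(parents=True)
    # Starting two levels down, the search climbs back up to the marker.
    found_root_matches = find_project_root(nested_dir) == root
```

The review mentions an LRU cache being cleared in a test fixture, which suggests the result is cached in the real code; caching is left out here to keep the sketch short.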
328-363: Option Context Logic
The option_context function correctly sets options temporarily and restores them after the context exits. Proper error handling is in place for an odd number of arguments.
199-199: Missing Test Coverage
The line where the project root is checked for existence is not covered by tests.
Ensure that the test cases cover scenarios where the project root is found and not found.
Verification successful
Test Coverage Verified
The test cases in tests/test_options.py cover various scenarios for the find_project_root() function, including when the project root is found and not found.
- test_find_project_root_git_found: Tests finding the project root when the .git directory is found.
- test_find_project_root_toml_found: Tests finding the project root when the pyretailscience.toml file is found.
- test_find_project_root_no_project_found: Tests finding the project root when no project root is found.
- test_find_project_root_found_in_parent: Tests finding the project root when the project root is found in a parent directory.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify test coverage for project root existence check.
# Test: Ensure test cases cover scenarios where the project root is found and not found.
rg --type python -A 5 'find_project_root()' | grep -E 'assert|test'
Length of output: 100
Script:
#!/bin/bash
# Description: Verify test coverage for project root existence check.
# Corrected Test: Ensure test cases cover scenarios where the project root is found and not found.
rg --glob '*.py' -A 5 'find_project_root()' | grep -E 'assert|test'
Length of output: 3487
Tools
GitHub Check: codecov/patch
[warning] 199-199: pyretailscience/options.py#L199
Added line #L199 was not covered by tests
pyretailscience/segmentation.py (7)
35-40: Add Segment Logic
The add_segment method correctly uses get_option for dynamic column referencing and handles the merge operation appropriately.
Line range hint 60-74: Initialization Logic
The __init__ method correctly uses get_option for dynamic column referencing and handles the validation of required columns appropriately.
Tools
GitHub Check: codecov/patch
[warning] 72-72: pyretailscience/segmentation.py#L72
Added line #L72 was not covered by tests
Line range hint 85-151: Initialization Logic
The __init__ method correctly uses get_option for dynamic column referencing and handles the validation of required columns, thresholds, and segments appropriately.
174-192: Initialization Logic
The __init__ method correctly uses get_option for dynamic column referencing and handles the initialization of thresholds and segments appropriately.
220-225: Initialization Logic
The __init__ method correctly uses get_option for dynamic column referencing and handles the validation of required columns appropriately.
Line range hint 245-271: Calculate Segment Stats Logic
The _calc_seg_stats method correctly uses get_option for dynamic column referencing and handles the aggregation and calculation of segment statistics appropriately.
72-72: Missing Test Coverage
The line where the DataFrame is indexed by the customer ID column is not covered by tests.
Ensure that the test cases cover scenarios where the DataFrame is indexed by the customer ID column.
Verification successful
Test Coverage Verified
The test cases in the following files cover scenarios where the DataFrame is indexed by the customer ID column:
tests/test_segmentation.py
tests/test_cross_shop.py
These tests ensure that the DataFrame indexing functionality is adequately covered.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify test coverage for DataFrame indexing by customer ID column.
# Test: Ensure test cases cover scenarios where the DataFrame is indexed by the customer ID column.
rg --type python -A 5 'set_index' | grep -E 'assert|test'
Length of output: 88
Script:
#!/bin/bash
# Step 1: Locate all test files in the repository
fd -e py -g '*test*.py'
# Step 2: Search within these test files for test cases that cover the `set_index` method
rg 'set_index' $(fd -e py -g '*test*.py')
Length of output: 772
Tools
GitHub Check: codecov/patch
[warning] 72-72: pyretailscience/segmentation.py#L72
Added line #L72 was not covered by tests
…use it. (#66)
* feat: add a simplified pandas-like options system
* feat(tests): integrate dynamic column names using get_option in segmentation code and related tests
User description
- … tests/test_segmentation.py to use get_option for dynamic column names.
- … ThresholdSegmentation and HMLSegmentation classes in pyretailscience/segmentation.py to use get_option for column names.
- … get_option.
- … value_col in ThresholdSegmentation and HMLSegmentation to use get_option("column.unit_spend").
PR Type
Enhancement, Tests
Description
- … Options class to manage configurable options, including setting, getting, resetting, listing, and describing options.
- … BaseSegmentation, ThresholdSegmentation, and HMLSegmentation classes to use dynamic column names via get_option.
- … Options module, ensuring correct behavior for all functionalities.
Changes walkthrough 📝
options.py
Introduced a simplified pandas-like options system
pyretailscience/options.py
- … Options class to manage configurable options.
segmentation.py
Integrated dynamic column names using options system
pyretailscience/segmentation.py
- … BaseSegmentation, ThresholdSegmentation, and HMLSegmentation to use dynamic column names via get_option.
- … system.
- … value_col in ThresholdSegmentation and HMLSegmentation using get_option.
test_options.py
Added unit tests for the Options module
tests/test_options.py
- … Options class.
test_segmentation.py
Updated segmentation tests to use dynamic column names
tests/test_segmentation.py
- … get_option for dynamic column names.
corrupt.toml
Added corrupt TOML file for testing
tests/toml_files/corrupt.toml
invalid_option.toml
Added invalid option TOML file for testing
tests/toml_files/invalid_option.toml
valid.toml
Added valid TOML file for testing
tests/toml_files/valid.toml
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Documentation