-
Couldn't load subscription status.
- Fork 1
feat: add extra_aggs parameter to SegTransactionStats #138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Allow users to perform arbitrary Ibis aggregations on specified columns by adding an extra_aggs parameter to SegTransactionStats. This enables functionality like counting distinct stores or products by segment. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
WalkthroughThe changes update the Changes
Sequence Diagram(s)sequenceDiagram
participant U as User
participant S as SegTransactionStats
participant A as _calc_seg_stats
participant D as DataFrame
U->>S: Instantiate with data, segment_col, extra_aggs
S->>S: Validate extra_aggs (column existence, function validity)
S->>A: Call _calc_seg_stats(data, segment_col, extra_aggs)
A->>A: Iterate extra_aggs, apply each aggregation
A->>S: Return aggregated results
S->>D: Include extra aggregation columns in DataFrame output
S->>U: Return complete DataFrame with aggregations
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Tip ⚡🧪 Multi-step agentic review comment chat (experimental)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces an extra_aggs parameter to SegTransactionStats so users can perform arbitrary Ibis aggregations on specified columns.
- Updated the SegTransactionStats class to accept and process extra_aggs in both the constructor and the aggregation function.
- Added tests covering correct functionality as well as error cases for invalid aggregation specifications.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tests/analysis/test_segmentation.py | Adds tests for extra_aggs functionality and error handling. |
| pyretailscience/analysis/segmentation.py | Modifies init and _calc_seg_stats to support extra_aggs. |
Comments suppressed due to low confidence (1)
pyretailscience/analysis/segmentation.py:289
- The docstring for _calc_seg_stats states that it returns a pd.DataFrame, but it actually returns an ibis.Table. Please update the docstring to accurately reflect the return type.
Returns:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
pyretailscience/analysis/segmentation.py (1)
350-353: Consider additional handling for column orderingThe extra aggregation columns are added to the end of the column list. For improved usability, you might want to add a parameter to control the position of these columns in the final DataFrame.
You could enhance column ordering flexibility with something like:
# Add any extra aggregation columns to the column order if hasattr(self, "extra_aggs") and self.extra_aggs: - col_order.extend(self.extra_aggs.keys()) + extra_cols = list(self.extra_aggs.keys()) + # Optional: sort the extra columns alphabetically for consistent output + extra_cols.sort() + col_order.extend(extra_cols)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
pyretailscience/analysis/segmentation.py(6 hunks)tests/analysis/test_segmentation.py(1 hunks)
🔇 Additional comments (8)
tests/analysis/test_segmentation.py (3)
317-371: Well-structured test for extra_aggs functionalityThe test thoroughly verifies both single and multiple aggregation capabilities. It sets clear expected values as constants and includes appropriate assertions for each segment type. The test design effectively validates the core functionality of the new feature.
372-387: Good error case handling for invalid columnsThis test correctly verifies that the appropriate error is raised when an invalid column is specified in extra_aggs. The assertion checking for the presence of "does not exist in the data" in the error message ensures the error is descriptive and helpful.
388-402: Good error case handling for invalid functionsThe test appropriately checks that an invalid aggregation function triggers the expected ValueError with a descriptive message. This helps ensure robust input validation.
pyretailscience/analysis/segmentation.py (5)
192-212: Well-documented parameter addition to init methodThe extra_aggs parameter is properly typed and well-documented with clear examples. The docstring effectively explains the parameter's purpose, expected format, and usage.
228-238: Comprehensive validation for extra_aggs parameterThe validation logic properly checks both that specified columns exist in the data and that the requested aggregation functions are available for those columns. The error messages are clear and descriptive.
240-242: Proper initialization and method call with extra_aggsThe code correctly initializes extra_aggs as an empty dictionary when None is provided and properly passes it to the _calc_seg_stats method.
274-287: Well-documented _calc_seg_stats method parameterThe method signature is properly updated with the extra_aggs parameter, including appropriate type annotations and documentation. This maintains consistency with the class constructor.
309-314: Efficient implementation of extra aggregationsThe code that applies the extra aggregations is concise and follows the same pattern as the base aggregations. It correctly uses the getattr function to dynamically access the aggregation method on the column.
Allow users to perform arbitrary Ibis aggregations on specified columns by adding an extra_aggs parameter to SegTransactionStats. This enables functionality like counting distinct stores or products by segment.
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests