Setup GCP BigQuery Integration Tests for Analysis Modules #244
Conversation
**Walkthrough**

This update introduces comprehensive integration testing for BigQuery compatibility across multiple analysis modules. It adds BigQuery integration tests, supporting fixtures, and a GitHub Actions workflow for automated testing. Documentation and configuration files are updated to guide usage, manage dependencies, and exclude BigQuery tests from pre-commit checks. Minor code refactoring optimizes conditional logic.
**Sequence Diagram(s)**

```mermaid
sequenceDiagram
    participant Dev as Developer
    participant GHA as GitHub Actions
    participant GCP
    participant BQ as BigQuery
    participant TS as Test Suite
    Dev->>GHA: Trigger BigQuery Integration Tests (workflow_dispatch)
    GHA->>GHA: Set up Python, install dependencies
    GHA->>GCP: Authenticate using service account key
    GHA->>BQ: Set up SDK and environment
    GHA->>TS: Run pytest with selected test suite
    TS->>BQ: Query test data
    TS-->>GHA: Report test results
    GHA-->>Dev: Display results in workflow logs
```
**Codecov Report**: All modified and coverable lines are covered by tests ✅
Force-pushed `6a78624` to `af4e56f`
Caution
Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 9
🧹 Nitpick comments (26)
.env_sample (1)
`1-2`: Add documentation and ensure proper file format

Consider adding a comment to explain what GCP_PROJECT_ID is used for and how users can obtain their Google Cloud Project ID for running the integration tests.

```diff
+# Your Google Cloud Platform project ID for BigQuery integration tests
 GCP_PROJECT_ID=
```

tests/integration/bigquery/conftest.py (2)
`30-34`: Add error handling for missing table

Consider adding error handling in case the "test_data.transactions" table doesn't exist. This would provide a more helpful error message to users setting up the integration tests for the first time.

```diff
 @pytest.fixture(scope="session")
 def transactions_table(bigquery_connection):
     """Get the transactions table for testing."""
-    return bigquery_connection.table("test_data.transactions")
+    try:
+        table = bigquery_connection.table("test_data.transactions")
+        # Verify table exists by executing a simple query
+        table.limit(1).execute()
+        return table
+    except Exception as e:
+        logger.error(f"Failed to access test_data.transactions: {e}")
+        logger.error("Please ensure the test_data.transactions table exists in your BigQuery project")
+        raise
```
`1-34`: Consider adding test data schema documentation

It would be helpful to document the expected schema of the "test_data.transactions" table either in this file or in README documentation to help others set up the test environment correctly.

```diff
 """BigQuery integration test fixtures."""
+"""
+Expected schema for test_data.transactions table:
+- transaction_id: STRING
+- customer_id: STRING
+- transaction_dt: TIMESTAMP
+- store_id: STRING
+- product_id: STRING
+- quantity: NUMERIC
+- amount: NUMERIC
+"""
+
 import os
```

tests/integration/bigquery/test_customer_decision_hierarchy.py (2)
`65-83`: Enhance initialization test with more specific assertions

While the test correctly verifies basic initialization with BigQuery data, it could be improved by adding more specific assertions about the initialized object's properties.

```diff
 assert cdh is not None
 assert isinstance(cdh, rp.CustomerDecisionHierarchy)
+assert hasattr(cdh, 'pairs_df'), "The CDH instance should have a pairs_df attribute"
+assert hasattr(cdh, 'distances_matrix'), "The CDH instance should have a distances_matrix attribute"
```
`95-114`: Add assertion for non-empty pairs DataFrame

The conditional check at line 108 is good for avoiding errors when no pairs are found, but it might mask potential issues if pairs_df is unexpectedly empty. Consider adding an assertion to verify that pairs are actually found.

```diff
 )
 assert cols.customer_id in pairs_df.columns
 assert "product_name" in pairs_df.columns
+assert len(pairs_df) > 0, "Expected pairs DataFrame to contain at least some rows"
```
README.md (1)
`286-301`: Comprehensive GitHub Actions instructions with inconsistent formatting

The section on running tests via GitHub Actions is well-documented, but there's inconsistent formatting in the list of required secrets.

```diff
 To run the workflow in GitHub Actions, add these secrets to your repository:
-  - `GCP_SA_KEY`: The entire JSON content of your GCP service account key file
-  - `GCP_PROJECT_ID`: Your GCP project ID
+- `GCP_SA_KEY`: The entire JSON content of your GCP service account key file
+- `GCP_PROJECT_ID`: Your GCP project ID
```
tests/integration/bigquery/test_cohort_analysis.py (2)
`12-24`: Enhance cohort computation test with more specific assertions

While the test correctly verifies basic functionality, it could be improved by adding more specific assertions about the structure of the resulting DataFrame.

```diff
 assert not result.empty, "Cohort table should not be empty for valid BigQuery data"
 assert isinstance(result, pd.DataFrame)
+# Verify the structure of the cohort table
+assert result.index.name == "min_period_shopped", "Index should be named 'min_period_shopped'"
+assert result.columns.dtype == 'int64', "Columns should represent period_since as integers"
+assert result.dtypes.nunique() == 1, "All columns should have the same data type"
```
`38-49`: Enhance percentage test with stronger assertions

The test verifies that percentage values are bounded by 1.0, but it could be improved by also checking that values are non-negative and that the first column has all 1.0 values (the base cohort).

```diff
 assert not result.empty
 assert result.max().max() <= 1.0, "Values should be <= 1 when percentage=True"
+assert result.min().min() >= 0.0, "Values should be >= 0 when percentage=True"
+if 0 in result.columns:
+    assert (result[0] == 1.0).all(), "First period should always be 100% (1.0)"
```

.github/workflows/bigquery-integration.yml (2)
`38-47`: Consider adding dependency caching

The workflow installs dependencies each run, which could be optimized by adding a caching step for Python dependencies.

```diff
 - name: Setup Python
   uses: actions/setup-python@v5
   with:
     python-version: "3.11"
+    cache: 'pip'
+    cache-dependency-path: |
+      pyproject.toml
+      requirements*.txt

 - name: Install uv Package
   run: |
```
`61-67`: Enhance test filtering for robustness

The conditional expression for filtering tests works but could be improved to handle potential issues with special characters in test names.

```diff
 - name: Run Integration Tests
   env:
     TEST_SUITE: ${{ inputs.test_suite }}
   run: |
-    uv run pytest tests/integration/bigquery -v \
-      $(if [ "$TEST_SUITE" != "all" ]; then echo "-k $TEST_SUITE"; fi)
+    if [ "$TEST_SUITE" = "all" ]; then
+      uv run pytest tests/integration/bigquery -v
+    else
+      uv run pytest tests/integration/bigquery -v -k "$TEST_SUITE"
+    fi
```

tests/integration/bigquery/test_composite_rank.py (1)
`31-33`: Reconsider using `customer_id` for ranking

Using `customer_id` as a ranking column doesn't make business sense since it's an identifier, not a metric. In a real-world scenario, you'd typically rank by business metrics like revenue, frequency, or recency. Consider using meaningful business metrics for ranking instead of identifier columns:

```diff
 rank_cols = [
     ("unit_spend", "desc"),
-    ("customer_id", "desc"),
+    ("transaction_frequency", "desc"),  # A more meaningful business metric
     ("spend_per_customer", "desc"),
 ]
```

tests/integration/bigquery/test_product_association.py (2)
`57-58`: Add more descriptive error message for association test skip

The current skip message is not informative enough to help diagnose why the test was skipped.

```diff
 if len(expected_results_df) == 0:
-    pytest.skip("No association rules found in data")
+    pytest.skip("No association rules found in data - the dataset may not have enough transactions or product variety")
```
`298-320`: Consolidate data suitability checks into a utility function

The data suitability checks for category and brand association tests are very similar but repeated with slight differences. This creates potential for inconsistency and makes maintenance harder. Extract the repeated logic into a utility function:

```diff
+    def _check_association_data_suitability(self, df, value_col, skip_message_prefix):
+        """Check if data is suitable for association testing and skip if not."""
+        # Check for missing values
+        if df[value_col].isna().mean() > MISSING_VALUE_THRESHOLD:
+            pytest.skip(f"{skip_message_prefix}: Too many missing values in {value_col}")
+
+        # Check for enough unique values
+        unique_values = df[value_col].nunique()
+        if unique_values < MIN_CATEGORIES_REQUIRED:
+            pytest.skip(f"{skip_message_prefix}: Not enough unique values ({unique_values}) in {value_col}")
+
+        # Check for multi-value transactions
+        value_counts = df.groupby("transaction_id")[value_col].nunique()
+        multi_value_transactions = (value_counts > 1).sum()
+        if multi_value_transactions < MIN_MULTI_CATEGORY_TRANSACTIONS:
+            pytest.skip(f"{skip_message_prefix}: Not enough transactions with multiple values ({multi_value_transactions})")
+
+        return True
+
     def test_real_world_category_association(self, transactions_table):
         """Test real-world category association analysis using BigQuery data."""
         value_col = "category_1_name"
@@ -297,12 +318,7 @@
         df = query.execute()
-        unique_categories = df[value_col].nunique()
-        if unique_categories < MIN_CATEGORIES_REQUIRED:
-            pytest.skip(f"Not enough unique categories ({unique_categories}) for association testing")
-
-        category_counts = df.groupby("transaction_id")[value_col].nunique()
-        multi_category_transactions = (category_counts > 1).sum()
-
-        if multi_category_transactions < MIN_MULTI_CATEGORY_TRANSACTIONS:
-            pytest.skip(f"Not enough transactions with multiple categories ({multi_category_transactions})")
+        # Check if data is suitable for association testing
+        self._check_association_data_suitability(df, value_col, "Category association")
```

Use the same function in the brand association test with similar logic.
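For the brand test, the corresponding call might then be a one-liner (a sketch; `_check_association_data_suitability` is the hypothetical helper proposed above, and the message prefix is illustrative):

```python
# Hypothetical usage inside test_real_world_brand_association,
# mirroring the category check above:
self._check_association_data_suitability(df, "brand_name", "Brand association")
```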
tests/integration/bigquery/test_revenue_tree.py (4)
`11-14`: Use descriptive constant names instead of magic numbers

The constants have generic names that don't convey their purpose in the tests.

```diff
-EXPECTED_LENGTH_2 = 2
-EXPECTED_LENGTH_5 = 5
-EXPECTED_SUM_3 = 3
-EXPECTED_SUM_2 = 2
+PERIODS_COUNT = 2  # Expected number of periods (P1, P2)
+MAX_BRANDS_COUNT = 5  # Expected maximum number of brands in test data
+P1_BRANDS_COUNT = 3  # Expected number of brands in period P1
+P2_BRANDS_COUNT = 2  # Expected number of brands in period P2
```
`173-175`: Use a more general solution for calculating tree_data index

Using a hard-coded index of 0 assumes knowledge about how the data will be structured, which could lead to brittle tests if the implementation changes.

```diff
-    tree_index = 0
-    tree_data = rt.df.to_dict(orient="records")[tree_index]
+    # Use the first row for simple tests without a group_col
+    if rt.df.index.name is None:
+        tree_data = rt.df.iloc[0].to_dict()
+    else:
+        # With a group_col, get the first group
+        tree_data = rt.df.iloc[0].to_dict()
```
`289-294`: Use a more representative test dataset

The test dataset is constructed with arbitrary values that don't reflect real-world scenarios where brands might have clusters of transactions. Create a more meaningful test dataset:

```diff
 test_df = pd.DataFrame(
     {
         "brand_id": [
-            unique_brands[0],
-            unique_brands[0],
-            unique_brands[0],
-            unique_brands[1],
-            unique_brands[1],
-            unique_brands[1],
-            unique_brands[2],
+            unique_brands[0], unique_brands[0], unique_brands[0],  # Brand 0 has transactions in both periods
+            unique_brands[1], unique_brands[1], unique_brands[1],  # Brand 1 has transactions in both periods
+            unique_brands[2], unique_brands[2],  # Brand 2 only has transactions in P1
         ],
-        "customer_id": [1, 2, 3, 4, 5, 6, 7],
-        "transaction_id": [1, 2, 3, 4, 5, 6, 7],
-        "unit_spend": [100.0, 200.0, 300.0, 400.0, 500.0, 600.0, 700.0],
-        "period": ["P1", "P2", "P1", "P2", "P1", "P2", "P1"],
+        "customer_id": [1, 2, 3, 4, 5, 6, 7, 8],
+        "transaction_id": [1, 2, 3, 4, 5, 6, 7, 8],
+        "unit_spend": [100.0, 200.0, 300.0, 400.0, 500.0, 600.0, 700.0, 800.0],
+        "period": ["P1", "P2", "P1", "P2", "P1", "P2", "P1", "P1"],
     },
 )
```
`359-361`: Add tests for edge case price elasticity calculations

The test only verifies a simple case for price elasticity but doesn't test edge cases like zero price change or zero quantity change. Add test cases for edge scenarios:

```diff
 if include_quantity:
     q1, q2 = 10, 12
     p1, p2 = 900.0 / q1, 1100.0 / q2
     expected_elasticity = ((q2 - q1) / ((q2 + q1) / 2)) / ((p2 - p1) / ((p2 + p1) / 2))
     assert round(result["price_elasticity"].iloc[0], 6) == round(expected_elasticity, 6)
+
+    # Test edge cases
+    edge_cases = [
+        # Same price, different quantity
+        {"qty": [10, 12], "spend": [1000, 1200], "expected": float('inf')},
+        # Same quantity, different price
+        {"qty": [10, 10], "spend": [1000, 1200], "expected": 0.0},
+        # No change in either
+        {"qty": [10, 10], "spend": [1000, 1000], "expected": None},
+    ]
+
+    for case in edge_cases:
+        edge_df = pd.DataFrame({
+            cols.agg_customer_id: [3, 3],
+            cols.agg_transaction_id: [3, 3],
+            cols.agg_unit_spend: case["spend"],
+            cols.agg_unit_qty: case["qty"],
+        }, index=["p1", "p2"])
+
+        edge_result = calc_tree_kpis(edge_df, [True, False], [False, True])
+
+        # For the no-change case, the value might be NaN or infinity
+        if case["expected"] is None:
+            assert pd.isna(edge_result["price_elasticity"].iloc[0]) or np.isinf(edge_result["price_elasticity"].iloc[0])
+        else:
+            assert round(edge_result["price_elasticity"].iloc[0], 6) == round(case["expected"], 6)
```

tests/integration/bigquery/test_threshold_segmentation.py (2)
`207-226`: Split test case into two separate tests for clarity

The test combines two different cases that should be tested separately for clarity.

```diff
-def test_thresholds_too_few_segments(self, transactions_table):
-    """Test that the method raises an error when there are too few/many segments for the number of thresholds."""
+def test_thresholds_more_than_segments(self, transactions_table):
+    """Test that the method raises an error when there are more thresholds than segments."""
     query = transactions_table.select(
         transactions_table.customer_id,
         transactions_table.unit_spend,
     ).limit(100)

     df = query.execute()
     thresholds = [0.4, 0.6, 0.8, 1]
     segments = ["Low", "High"]
     with pytest.raises(ValueError):
         ThresholdSegmentation(df, thresholds, segments)
+
+def test_thresholds_fewer_than_segments(self, transactions_table):
+    """Test that the method raises an error when there are fewer thresholds than segments."""
+    query = transactions_table.select(
+        transactions_table.customer_id,
+        transactions_table.unit_spend,
+    ).limit(100)
+
+    df = query.execute()
+    segments = ["Low", "Medium", "High"]
+    thresholds = [0.4, 0.6, 0.8, 1]
+    with pytest.raises(ValueError):
+        ThresholdSegmentation(df, thresholds, segments)
```
`227-246`: Split test case and improve test names for clarity

Similar to the previous comment, this test combines two different cases and has a confusing name.

```diff
-def test_thresholds_too_few_thresholds(self, transactions_table):
-    """Test that the method raises an error when there are too few/many thresholds for the number of segments."""
+def test_fewer_thresholds_than_segments(self, transactions_table):
+    """Test that the method raises an error when there are fewer thresholds than segments."""
     query = transactions_table.select(
         transactions_table.customer_id,
         transactions_table.unit_spend,
     ).limit(100)

     df = query.execute()
     thresholds = [0.4, 1]
     segments = ["Low", "Medium", "High"]
     with pytest.raises(ValueError):
         ThresholdSegmentation(df, thresholds, segments)
+
+def test_more_thresholds_than_segments(self, transactions_table):
+    """Test that the method raises an error when there are more thresholds than segments."""
+    query = transactions_table.select(
+        transactions_table.customer_id,
+        transactions_table.unit_spend,
+    ).limit(100)
+
+    df = query.execute()
+    segments = ["Low", "Medium", "High"]
+    thresholds = [0.2, 0.5, 0.6, 0.8, 1]
+    with pytest.raises(ValueError):
+        ThresholdSegmentation(df, thresholds, segments)
```

tests/integration/bigquery/test_rfm_segmentation.py (3)
`12-12`: Add a comment explaining the purpose of the REASONABLE_THRESHOLD constant

Adding a brief comment would clarify what this threshold is used for and why the value of 0.1 was chosen.

```diff
-REASONABLE_THRESHOLD = 0.1
+# Threshold for comparing frequency values to raw transaction counts
+# 0.1 means ratios as extreme as 1:10 are still considered reasonable
+REASONABLE_THRESHOLD = 0.1
```
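To make the 1:10 semantics concrete, here is a small self-contained illustration of the check as it exists in the test (the counts are made up, not taken from the test data):

```python
# Illustrative values only: a computed frequency of 3 vs. a raw count of 25.
actual_frequency = 3
raw_transaction_count = 25

ratio = max(actual_frequency, 1) / max(raw_transaction_count, 1)          # 0.12
inverse_ratio = max(raw_transaction_count, 1) / max(actual_frequency, 1)  # ~8.33
reasonable = min(ratio, inverse_ratio) >= 0.1                             # True: within a 1:10 band

print(ratio, round(inverse_ratio, 2), reasonable)  # 0.12 8.33 True
```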
`106-112`: Consider simplifying the ratio comparison logic

The current implementation to check if values are reasonably close is clever but somewhat complex. A more straightforward approach might improve readability.

```diff
-    ratio = max(actual_frequency, 1) / max(raw_transaction_count, 1)
-    inverse_ratio = max(raw_transaction_count, 1) / max(actual_frequency, 1)
-    reasonable = min(ratio, inverse_ratio) >= REASONABLE_THRESHOLD
+    # Check if the values are within a factor of 1/REASONABLE_THRESHOLD of each other
+    max_value = max(actual_frequency, raw_transaction_count, 1)
+    min_value = min(actual_frequency, raw_transaction_count, 1)
+    reasonable = min_value / max_value >= REASONABLE_THRESHOLD
```
`167-184`: Consider parameterizing the limit value

The hard-coded limit of 10,000 records may cause unpredictable test durations. A parameterized or configurable value would provide more flexibility.

```diff
+    @pytest.fixture(params=[100, 1000, 10000])
+    def large_data_limit(self, request):
+        """Return different size limits for testing performance."""
+        return request.param
+
-    def test_large_dataset_performance(self, transactions_table):
+    def test_large_dataset_performance(self, transactions_table, large_data_limit):
         """Test RFMSegmentation performance with a larger dataset."""
         query = transactions_table.select(
             transactions_table.transaction_id,
             transactions_table.transaction_date,
             transactions_table.customer_id,
             transactions_table.unit_spend,
-        ).limit(10000)
+        ).limit(large_data_limit)
```

tests/integration/bigquery/test_segstats_segmentation.py (2)
`26-26`: Consider explicitly specifying region values instead of using modulo arithmetic

Using modulo operations on store_id to create test regions makes the test data generation less transparent. Consider using explicit mapping for clarity.

```diff
-    df["region"] = np.where(df["store_id"] % 3 == 0, "North", np.where(df["store_id"] % 3 == 1, "South", "East"))
+    # Map store_id to regions using a more readable approach
+    region_map = {0: "North", 1: "South", 2: "East"}
+    df["region"] = df["store_id"].apply(lambda x: region_map[x % 3])
```
`187-190`: Add comment explaining the brand_name/brand_id fallback logic

The fallback from brand_name to brand_id is important for test resilience, but its purpose isn't documented.

```diff
+    # Some test datasets might have brand_name, others might use brand_id
+    # We're flexible to support both schemas by creating segments from either
     if "brand_name" in transaction_df.columns:
         transaction_df["brand_segment"] = transaction_df["brand_name"].astype(str).str[:2]
     else:
         transaction_df["brand_segment"] = transaction_df["brand_id"].astype(str).str[:2]
```
17-31
: Consider using the Ibis API instead of raw SQL.For consistency with other tests, consider using the Ibis API directly rather than SQL strings.
@pytest.fixture def transactions_df(transactions_table): """Fetch transaction data for testing from BigQuery.""" - query = """ - SELECT - customer_id, - category_1_name, - unit_spend - FROM - test_data.transactions - WHERE - category_1_name IN ('Jeans', 'Shoes', 'Dresses', 'Hats') - LIMIT 100 - """ - return transactions_table.sql(query).execute() + return (transactions_table + .select( + transactions_table.customer_id, + transactions_table.category_1_name, + transactions_table.unit_spend + ) + .filter(transactions_table.category_1_name.isin(['Jeans', 'Shoes', 'Dresses', 'Hats'])) + .limit(100) + .execute())
`156-156`: Simplify integer check for improved readability

The current check for integer values is complex and hard to read. Consider a clearer approach.

```diff
-    assert all(isinstance(val, int | float) and val.is_integer() for val in cross_shop_table[cols.customer_id])
+    # Check that all customer_id counts are integers (even if stored as float)
+    for val in cross_shop_table[cols.customer_id]:
+        assert isinstance(val, (int, float))
+        if isinstance(val, float):
+            assert val.is_integer(), f"Expected integer value, got {val}"
```
🛑 Comments failed to post (9)
tests/integration/bigquery/conftest.py (1)
`11-13`: ⚠️ Potential issue — Remove hardcoded project ID

There's an inconsistency in how BigQuery clients are created. The `bigquery.Client` is initialized with a hardcoded project ID "pyretailscience-infra", but it's never used. Meanwhile, the `bigquery_connection` fixture uses the environment variable.

```diff
 load_dotenv()
-client = bigquery.Client(project="pyretailscience-infra")
```
tests/integration/bigquery/test_composite_rank.py (3)
`46-53`: 🛠️ Refactor suggestion — Verify rank column results beyond just presence

The test only checks for the presence of rank columns but doesn't verify their values or ranking logic. Add assertions to verify the actual ranking logic:

```diff
 for col in expected_rank_columns:
     assert col in cr.df.columns
+
+# Verify that ranks are correctly sorted
+unit_spend_ranks = cr.df["unit_spend_rank"]
+assert unit_spend_ranks.is_monotonic_increasing or unit_spend_ranks.is_monotonic_decreasing
+
+# Verify that composite rank is calculated correctly (e.g., as the mean of individual ranks)
+calculated_mean = cr.df[["unit_spend_rank", "customer_id_rank", "spend_per_customer_rank"]].mean(axis=1)
+assert cr.df["composite_rank"].equals(calculated_mean)
```
`20-22`: ⚠️ Potential issue — Check the calculation of `spend_per_customer`

The calculation divides `unit_spend` by `customer_id`, which is problematic since `customer_id` is typically an identifier rather than a quantity. This division will produce meaningless results that vary based on ID values.

```diff
-    if "spend_per_customer" not in df.columns:
-        df["spend_per_customer"] = df["unit_spend"] / df["customer_id"]
+    if "spend_per_customer" not in df.columns:
+        # Calculate spend per customer by first grouping by customer_id
+        customer_counts = df.groupby("customer_id").size()
+        customer_spend = df.groupby("customer_id")["unit_spend"].sum()
+        spend_per_customer_series = customer_spend / customer_counts
+        # Map the values back to each row
+        df["spend_per_customer"] = df["customer_id"].map(spend_per_customer_series)
```
`76-96`: 🛠️ Refactor suggestion — Enhance tie-breaking test to validate actual ranking behavior

The current test only checks for the presence of rank columns but doesn't verify that tie-breaking behavior is different between the two scenarios. Add assertions to verify the actual tie-breaking behavior:

```diff
 assert "unit_spend_rank" in cr_with_ties.df.columns
 assert "unit_spend_rank" in cr_no_ties.df.columns
+
+# Find duplicate unit_spend values to check tie behavior
+spend_counts = test_transactions_df["unit_spend"].value_counts()
+tied_spend_values = spend_counts[spend_counts > 1].index
+
+if len(tied_spend_values) > 0:
+    tied_value = tied_spend_values[0]
+    tied_rows_with_ties = cr_with_ties.df[cr_with_ties.df["unit_spend"] == tied_value]
+    tied_rows_no_ties = cr_no_ties.df[cr_no_ties.df["unit_spend"] == tied_value]
+
+    # When ignore_ties=False, tied values should have the same rank
+    if len(tied_rows_with_ties) > 1:
+        assert tied_rows_with_ties["unit_spend_rank"].nunique() == 1
+
+    # When ignore_ties=True, tied values should have different ranks
+    if len(tied_rows_no_ties) > 1:
+        assert tied_rows_no_ties["unit_spend_rank"].nunique() == len(tied_rows_no_ties)
```
tests/integration/bigquery/test_product_association.py (1)
`33-43`: 🛠️ Refactor suggestion — Handle potential execution failures in expected_results_df fixture

The fixture doesn't handle potential failure scenarios when executing the association calculation. Add error handling to prevent test failures caused by issues in the fixture:

```diff
     """Dynamically generate expected results based on actual BigQuery data."""
+    try:
         associations_df = ProductAssociation._calc_association(
             df=transactions_df,
             value_col="product",
             group_col="transaction_id",
         )
         if not isinstance(associations_df, pd.DataFrame) and hasattr(associations_df, "execute"):
-            associations_df = associations_df.execute()
+            try:
+                associations_df = associations_df.execute()
+            except Exception as e:
+                pytest.skip(f"Error executing association calculation: {str(e)}")
+                return pd.DataFrame()
         return associations_df
+    except Exception as e:
+        pytest.skip(f"Error calculating product associations: {str(e)}")
+        return pd.DataFrame()
```
tests/integration/bigquery/test_revenue_tree.py (1)
`232-236`: ⚠️ Potential issue — Duplicate assertion check: the same test is performed twice

There's a redundant assertion that checks the same condition twice.

```diff
 assert (
     tree_data[cols.calc_spend_per_cust_p2] - tree_data[cols.calc_spend_per_cust_p1]
     == tree_data[cols.calc_spend_per_cust_diff]
 )
-assert (
-    tree_data[cols.calc_spend_per_cust_p2] - tree_data[cols.calc_spend_per_cust_p1]
-    == tree_data[cols.calc_spend_per_cust_diff]
-)
```
tests/integration/bigquery/test_threshold_segmentation.py (1)
`55-73`: ⚠️ Potential issue — Test name doesn't match test implementation

The test name suggests it's testing with a single customer, but it's actually testing a mismatch between thresholds and segments count.

```diff
-def test_single_customer(self, transactions_table):
-    """Test that the method correctly handles a DataFrame with only one customer."""
+def test_thresholds_segments_mismatch(self, transactions_table):
+    """Test that the method raises an error when thresholds and segments count don't match."""
     query = transactions_table.select(
         transactions_table.customer_id,
         transactions_table.unit_spend,
     ).limit(1)

     df = query.execute()
     thresholds = [0.5, 1]
     segments = ["Low"]
     with pytest.raises(ValueError):
         ThresholdSegmentation(
             df=df,
             thresholds=thresholds,
             segments=segments,
         )
```

Also, add a separate test for the single customer case:

```python
def test_single_customer(self, transactions_table):
    """Test that the method correctly handles a DataFrame with only one customer."""
    query = transactions_table.select(
        transactions_table.customer_id,
        transactions_table.unit_spend,
    ).limit(1)

    df = query.execute()
    thresholds = [1]
    segments = ["Single"]

    # This should work with a single customer
    seg = ThresholdSegmentation(
        df=df,
        thresholds=thresholds,
        segments=segments,
    )
    result_df = seg.df
    assert len(result_df) == 1
    assert result_df["segment_name"].iloc[0] == "Single"
```

tests/integration/bigquery/test_segstats_segmentation.py (1)
`157-162`: 🛠️ Refactor suggestion — Avoid catching generic Exception

Using a broad Exception catch can hide unexpected errors. Specify the exceptions you expect to catch.

```diff
 try:
     seg_stats_no_total.plot("spend")
     plotting_no_total_worked = True
-except Exception as e:  # noqa: BLE001
+except (ValueError, TypeError, RuntimeError, AttributeError, ImportError) as e:
     plotting_no_total_worked = False
     pytest.fail(f"Plotting with calc_total=False failed: {e!s}")
```
tests/integration/bigquery/test_hml_segmentation.py (1)
`34-41`: 🛠️ Refactor suggestion — Extract the zero-spend customer creation to a separate fixture

This code is duplicated in three test methods. Consider extracting it into a reusable fixture.

```diff
+    @pytest.fixture
+    def ensure_zero_spend_customer(self, customers_spend_df):
+        """Ensure at least one zero-spend customer exists in the DataFrame."""
+        if not (customers_spend_df[cols.unit_spend] == 0).any():
+            zero_customer = pd.DataFrame(
+                {
+                    get_option("column.customer_id"): [999999],
+                    cols.unit_spend: [0],
+                },
+            )
+            customers_spend_df = pd.concat([customers_spend_df, zero_customer], ignore_index=True)
+
+        zero_spend_customer_id = customers_spend_df[customers_spend_df[cols.unit_spend] == 0][
+            get_option("column.customer_id")
+        ].iloc[0]
+
+        return customers_spend_df, zero_spend_customer_id
```

Then use it in each test:

```python
def test_handles_zero_spend_customers_are_excluded_in_result(self, ensure_zero_spend_customer):
    """Test that the method correctly handles zero spend customers when zero_value_customers is 'exclude'."""
    customers_spend_df, zero_spend_customer_id = ensure_zero_spend_customer

    hml_segmentation = HMLSegmentation(customers_spend_df, zero_value_customers="exclude")
    result_df = hml_segmentation.df

    assert zero_spend_customer_id not in result_df.index.values
    assert "Heavy" in result_df["segment_name"].values
    assert "Medium" in result_df["segment_name"].values
    assert "Light" in result_df["segment_name"].values
```
Actionable comments posted: 1
🧹 Nitpick comments (3)
tests/integration/bigquery/test_revenue_tree.py (2)
`11-14`: Consider more descriptive constant names

These constants use numeric suffixes that don't clearly indicate their purpose. Consider using more descriptive names that explain what these values represent in the tests.

```diff
-EXPECTED_LENGTH_2 = 2
-EXPECTED_LENGTH_5 = 5
-EXPECTED_SUM_3 = 3
-EXPECTED_SUM_2 = 2
+EXPECTED_PERIODS_COUNT = 2
+EXPECTED_GROUPS_COUNT = 5
+EXPECTED_P1_GROUPS_COUNT = 3
+EXPECTED_P2_GROUPS_COUNT = 2
```
`61-103`: Extract test data setup into helper methods

The test data setup logic is repeated across multiple tests. Consider extracting this common setup code into helper methods to reduce duplication. You could create helper methods for:

- Preparing a DataFrame with basic transaction data
- Adding period flags
- Applying data type conversions

Example refactoring:

```python
def _prepare_base_transaction_df(self, transactions_table, columns, limit=6):
    """Prepare a base transaction DataFrame with common columns."""
    query = transactions_table.select(columns).limit(limit)
    df = query.execute()
    return df

def _add_period_flags(self, df):
    """Add period flags (P1/P2) to the DataFrame."""
    middle_index = len(df) // 2
    df["period"] = ["P1"] * middle_index + ["P2"] * (len(df) - middle_index)
    return df

def _convert_data_types(self, df, cols, include_quantity=False):
    """Apply common data type conversions."""
    df["customer_id"] = df["customer_id"].astype(int)
    df["transaction_id"] = df["transaction_id"].astype(int)
    df["unit_spend"] = df["unit_spend"].astype(float)
    if include_quantity and "unit_quantity" in df.columns:
        df[cols.unit_qty] = df["unit_quantity"].astype(int)
    return df
```

Then, test methods could use these helpers:

```python
def test_agg_data_no_group(self, cols: ColumnHelper, include_quantity: bool, transactions_table):
    """Test the _agg_data method with no group_col using BigQuery data."""
    columns = ["customer_id", "transaction_id", "unit_spend", "transaction_date"]
    if include_quantity:
        columns.append("unit_quantity")

    df = self._prepare_base_transaction_df(transactions_table, columns)
    df = self._add_period_flags(df)
    df = self._convert_data_types(df, cols, include_quantity)
    # Rest of the test...
```

README.md (1)
`285-289`: Fix spacing in list formatting

There's an extra space after the colon before the list marker.

```diff
-To run the workflow in GitHub Actions, add these secrets to your repository: - `GCP_SA_KEY`: The entire JSON content of your GCP service account key file
+To run the workflow in GitHub Actions, add these secrets to your repository:
+- `GCP_SA_KEY`: The entire JSON content of your GCP service account key file
```
📜 Review details

Configuration used: CodeRabbit UI · Review profile: CHILL · Plan: Pro

⛔ Files ignored due to path filters (1)

- uv.lock is excluded by `!**/*.lock`

📒 Files selected for processing (16)

- .env_sample (1 hunks)
- .github/workflows/bigquery-integration.yml (1 hunks)
- .pre-commit-config.yaml (1 hunks)
- README.md (2 hunks)
- pyproject.toml (1 hunks)
- tests/integration/bigquery/conftest.py (1 hunks)
- tests/integration/bigquery/test_cohort_analysis.py (1 hunks)
- tests/integration/bigquery/test_composite_rank.py (1 hunks)
- tests/integration/bigquery/test_cross_shop.py (1 hunks)
- tests/integration/bigquery/test_customer_decision_hierarchy.py (1 hunks)
- tests/integration/bigquery/test_hml_segmentation.py (1 hunks)
- tests/integration/bigquery/test_product_association.py (1 hunks)
- tests/integration/bigquery/test_revenue_tree.py (1 hunks)
- tests/integration/bigquery/test_rfm_segmentation.py (1 hunks)
- tests/integration/bigquery/test_segstats_segmentation.py (1 hunks)
- tests/integration/bigquery/test_threshold_segmentation.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (14)
- .env_sample
- pyproject.toml
- .pre-commit-config.yaml
- tests/integration/bigquery/test_customer_decision_hierarchy.py
- tests/integration/bigquery/conftest.py
- tests/integration/bigquery/test_threshold_segmentation.py
- tests/integration/bigquery/test_cohort_analysis.py
- tests/integration/bigquery/test_segstats_segmentation.py
- tests/integration/bigquery/test_product_association.py
- .github/workflows/bigquery-integration.yml
- tests/integration/bigquery/test_composite_rank.py
- tests/integration/bigquery/test_hml_segmentation.py
- tests/integration/bigquery/test_cross_shop.py
- tests/integration/bigquery/test_rfm_segmentation.py
🔇 Additional comments (5)
tests/integration/bigquery/test_revenue_tree.py (4)
`317-361`: Good test design for calc_tree_kpis function

This test is well-structured with good organization:
- Clear test data setup with explicit values
- Proper handling of the parametrized test cases
- Validation of return type and structure
- Exhaustive column verification
- Advanced verification of elasticity calculations when quantity data is present
`1-9`: Well-structured imports and module docstring

The imports are properly organized with standard library imports first (math), followed by third-party libraries (pandas, pytest), and then project-specific imports. The module docstring clearly describes the purpose of the test file.
`31-44`: Good error handling test

This test properly verifies that the RevenueTree class raises appropriate ValueErrors when required columns are missing, including checking the error message content.
`260-315`: Comprehensive edge case testing

The test successfully verifies an important edge case where some groups are only present in the P1 period but not in P2. This is crucial for ensuring correct handling of imbalanced data across time periods.
README.md (1)
`291-309`: ❓ Verification inconclusive — Verify expected BigQuery table schema

The documentation lists the required table schema for BigQuery integration tests. Ensure that this matches exactly what the tests expect to avoid potential test failures.

🏁 Script executed:

```bash
#!/bin/bash
# Verify all column names mentioned in the test files match those in the README

# Find all required columns mentioned in the test files
echo "Searching for column references in test files..."

# Look for column names in BigQuery test files
rg -o --no-filename "(\"[a-zA-Z0-9_]+\"|\[\"[a-zA-Z0-9_]+\"\])" tests/integration/bigquery/ | sort | uniq
```

Length of output: 2326
Ensure BigQuery integration test schema aligns with README

The grep across `tests/integration/bigquery/` reveals which column names your tests actually reference versus what's documented. There are mismatches that need reconciliation to avoid test failures and confusion:

• README.md (lines 291-309): currently lists
- transaction_id
- transaction_date
- transaction_time
- customer_id
- product_id
- product_name
- category_0_name
- category_0_id
- category_1_name
- category_1_id
- brand_name
- brand_id
- unit_quantity
- unit_cost
- unit_spend
- store_id
• Tests reference (among many others):
- transaction_id, transaction_date
- customer_id, product_id, product_name
- brand_id, brand_name
- store_id, unit_quantity, unit_cost, unit_spend
- plus derived fields like category_segment, brand_segment, avg_unit_cost, composite_rank, etc.
Key discrepancies:
- Missing in tests: `transaction_time`, `category_0_id`, `category_1_id`
- Missing in docs: any derived columns (`category_segment`, `brand_segment`, etc.)

Please verify and update either the README schema or the test assertions so that they match exactly.
- Update README.md to reflect only the columns your tests create/assert
- Or adjust your tests to cover all documented base columns
Force-pushed `af4e56f` to `f51ea78`
Actionable comments posted: 0
🧹 Nitpick comments (7)
tests/integration/bigquery/test_revenue_tree.py (4)
`11-14`: Use more descriptive constant names

The constant names (`EXPECTED_LENGTH_2`, `EXPECTED_LENGTH_5`, etc.) are not very descriptive and don't indicate their purpose in the tests. Consider using more meaningful names that explain what these values represent.

```diff
-EXPECTED_LENGTH_2 = 2
-EXPECTED_LENGTH_5 = 5
-EXPECTED_SUM_3 = 3
-EXPECTED_SUM_2 = 2
+# Constants for expected array lengths and sums in test assertions
+EXPECTED_PERIODS_LENGTH = 2  # Number of periods (p1, p2)
+EXPECTED_TEST_GROUPS_LENGTH = 5  # Number of test groups in the grouped test data
+EXPECTED_P1_GROUPS_COUNT = 3  # Number of groups that appear in period 1
+EXPECTED_P2_GROUPS_COUNT = 2  # Number of groups that appear in period 2
```
`168-203`: The test method could benefit from additional documentation

This test method is comprehensive but could be clearer about what specific aspects of the contribution calculations are being verified. Consider adding more inline comments to explain the logical flow and purpose of each set of assertions.

```diff
 @pytest.mark.parametrize("include_quantity", [True, False])
 def test_contrib_sum(self, cols: ColumnHelper, include_quantity: bool, transactions_table):
     """Test that the contributions add up to the total using BigQuery data."""
+    # This test verifies that the revenue tree components correctly sum up
+    # to their totals and that the calculated differences match the actual differences
+    # between period values
     columns = [
         "customer_id",
         "transaction_id",
         "unit_spend",
         "transaction_date",
     ]
```
`312-356`: Consider improving price elasticity calculation clarity

The price elasticity calculation in `test_calc_tree_kpis_basic` is implemented correctly, but its purpose and the formula used could be better documented for future maintainers.

```diff
 if include_quantity:
     q1, q2 = 10, 12
     p1, p2 = 900.0 / q1, 1100.0 / q2
+    # Calculate price elasticity using arc elasticity formula:
+    # (% change in quantity / % change in price)
+    # Using midpoint formula for percentage changes for more accurate results with large changes
     expected_elasticity = ((q2 - q1) / ((q2 + q1) / 2)) / ((p2 - p1) / ((p2 + p1) / 2))
     assert round(result["price_elasticity"].iloc[0], 6) == round(expected_elasticity, 6)
```
`74-93`: Consider parameterizing test data size

The test uses hardcoded limits for BigQuery data sampling. Consider making these limits configurable parameters for better test flexibility.

```diff
-    query = transactions_table.limit(6)
+    sample_size = 6  # Can be adjusted based on test needs
+    query = transactions_table.limit(sample_size)
```

README.md (3)
`287-287`: Fix spacing after colon in the bullet list

There is an extra space after the colon in this line which was flagged by static analysis.

```diff
-To run the workflow in GitHub Actions, add these secrets to your repository: 
+To run the workflow in GitHub Actions, add these secrets to your repository:
```
`236-244`: Consider adding version requirements for GCP components

The prerequisites section could be enhanced by specifying any version requirements for GCP components or the BigQuery API. You might want to add information about minimum GCP API versions or client library versions required for the tests to run successfully.
`245-271`: Consider mentioning test runtime expectations

The documentation provides clear instructions on running tests but doesn't mention how long they might take to execute or any resource consumption expectations. Consider adding information about typical test duration and any resource consumption implications when running against BigQuery, as this could help developers plan their testing accordingly.
📜 Review details

Configuration used: CodeRabbit UI · Review profile: CHILL · Plan: Pro

⛔ Files ignored due to path filters (1)

- uv.lock is excluded by `!**/*.lock`

📒 Files selected for processing (16)

- .env_sample (1 hunks)
- .github/workflows/bigquery-integration.yml (1 hunks)
- .pre-commit-config.yaml (1 hunks)
- README.md (2 hunks)
- pyproject.toml (1 hunks)
- tests/integration/bigquery/conftest.py (1 hunks)
- tests/integration/bigquery/test_cohort_analysis.py (1 hunks)
- tests/integration/bigquery/test_composite_rank.py (1 hunks)
- tests/integration/bigquery/test_cross_shop.py (1 hunks)
- tests/integration/bigquery/test_customer_decision_hierarchy.py (1 hunks)
- tests/integration/bigquery/test_hml_segmentation.py (1 hunks)
- tests/integration/bigquery/test_product_association.py (1 hunks)
- tests/integration/bigquery/test_revenue_tree.py (1 hunks)
- tests/integration/bigquery/test_rfm_segmentation.py (1 hunks)
- tests/integration/bigquery/test_segstats_segmentation.py (1 hunks)
- tests/integration/bigquery/test_threshold_segmentation.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- tests/integration/bigquery/test_segstats_segmentation.py
🚧 Files skipped from review as they are similar to previous changes (13)
- pyproject.toml
- .env_sample
- .pre-commit-config.yaml
- .github/workflows/bigquery-integration.yml
- tests/integration/bigquery/conftest.py
- tests/integration/bigquery/test_threshold_segmentation.py
- tests/integration/bigquery/test_customer_decision_hierarchy.py
- tests/integration/bigquery/test_cohort_analysis.py
- tests/integration/bigquery/test_rfm_segmentation.py
- tests/integration/bigquery/test_product_association.py
- tests/integration/bigquery/test_composite_rank.py
- tests/integration/bigquery/test_hml_segmentation.py
- tests/integration/bigquery/test_cross_shop.py
🔇 Additional comments (2)
tests/integration/bigquery/test_revenue_tree.py (1)
`225-233`: The previously identified duplicate assertion has been fixed

The code no longer contains the duplicate assertion for `calc_spend_per_cust_diff` that was flagged in previous reviews. Good job addressing this issue.

README.md (1)
`213-312`: Well-structured and comprehensive documentation for BigQuery integration tests

The documentation for BigQuery integration tests is very thorough, covering all aspects from overview to prerequisites, setup, and execution instructions. This will be very helpful for developers working with these tests.
```yaml
- name: Set up Google Cloud SDK
  uses: google-github-actions/setup-gcloud@v2
```
Do we need the Google Cloud SDK? Anytime I've used it before, it's been pretty heavyweight.
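For what it's worth, the BigQuery client library can authenticate from the service-account key alone, so the gcloud setup step may be avoidable. A minimal sketch, assuming the key path is exported as GOOGLE_APPLICATION_CREDENTIALS and the project ID as GCP_PROJECT_ID:

```python
import os

from google.cloud import bigquery

# google-cloud-bigquery picks up GOOGLE_APPLICATION_CREDENTIALS automatically,
# so no gcloud CLI install or `gcloud auth` step is needed in CI.
client = bigquery.Client(project=os.environ["GCP_PROJECT_ID"])
print([d.dataset_id for d in client.list_datasets()])  # smoke check
```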
README.md (Outdated)
```sh
uv pip install -e .
uv pip install "ibis-framework[bigquery]>=10.0.0,<11"
```

I think it'd be better to just run `uv sync`, no?

```suggestion
uv sync
```
README.md (Outdated)

```markdown
The integration tests cover the following analysis modules:

- **Cohort Analysis** - Tests customer cohort retention metrics
- **Cross Shop Analysis** - Tests product/category cross-shopping patterns
- **Customer Analysis** - Tests customer lifetime value and purchase frequency metrics
- **Gain Loss Analysis** - Tests comparative performance analysis
- **Haversine Analysis** - Tests geographic distance calculations
- **Product Association Analysis** - Tests market basket analysis
- **Customer Decision Hierarchy** - Tests customer purchase decision patterns
- **Revenue Tree Analysis** - Tests hierarchical revenue breakdowns
- **Composite Rank Analysis** - Tests weighted ranking of entities
- **Segmentation Analysis** - Tests RFM and value-frequency customer segmentation
```

Let's not list all the modules. I'm a bit worried about the additional steps required every time we implement a new module.

```suggestion
The integration tests cover the analysis modules.
```
README.md (Outdated)

```markdown
## Test Data

The tests expect a BigQuery dataset named `test_data` with a table named `transactions` containing the following columns:

- `transaction_id`
- `transaction_date`
- `transaction_time`
- `customer_id`
- `product_id`
- `product_name`
- `category_0_name`
- `category_0_id`
- `category_1_name`
- `category_1_id`
- `brand_name`
- `brand_id`
- `unit_quantity`
- `unit_cost`
- `unit_spend`
- `store_id`
```

Let's remove this too. The file should be changing in the next several weeks, I hope.
```python
def test_invalid_period(self, transactions_table):
    """Test if an invalid period raises an error."""
    invalid_period = "m"
    with pytest.raises(
        ValueError,
        match=f"Invalid period '{invalid_period}'. Allowed values: {CohortAnalysis.VALID_PERIODS}",
    ):
        CohortAnalysis(
            df=transactions_table,
            aggregation_column="unit_spend",
            period=invalid_period,
        )

def test_cohort_percentage(self, transactions_table):
    """Tests cohort analysis with percentage=True."""
    cohort = CohortAnalysis(
        df=transactions_table,
        aggregation_column="unit_spend",
        agg_func="sum",
        period="month",
        percentage=True,
    )
    result = cohort.table
    assert not result.empty
    assert result.max().max() <= 1.0, "Values should be <= 1 when percentage=True"
```

I think the first test is enough for now. Basically, it just checks that the computation doesn't error on BigQuery.
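A pared-down version keeping only that smoke test might look like this (a sketch; the method name is illustrative and the fixture follows the existing tests):

```python
def test_cohort_computation(self, transactions_table):
    """Smoke test: CohortAnalysis runs against BigQuery without erroring."""
    cohort = CohortAnalysis(
        df=transactions_table,
        aggregation_column="unit_spend",
        agg_func="sum",
        period="month",
    )
    result = cohort.table
    assert not result.empty, "Cohort table should not be empty for valid BigQuery data"
```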
```python
def test_composite_rank_with_bigquery_data(self, test_transactions_df):
    """Test CompositeRank functionality with real BigQuery data.

    This test demonstrates using CompositeRank with BigQuery-sourced data.
    """
    rank_cols = [
        ("unit_spend", "desc"),
        ("customer_id", "desc"),
        ("spend_per_customer", "desc"),
    ]

    cr = CompositeRank(
        df=test_transactions_df,
        rank_cols=rank_cols,
        agg_func="mean",
        ignore_ties=False,
    )

    assert "composite_rank" in cr.df.columns
    assert len(cr.df) > 0

    expected_rank_columns = [
        "unit_spend_rank",
        "customer_id_rank",
        "spend_per_customer_rank",
        "composite_rank",
    ]
    for col in expected_rank_columns:
        assert col in cr.df.columns
```
Doesn't this function overlap with the next one in terms of functionality?
```python
def test_different_agg_functions_with_bigquery(self, test_transactions_df):
    """Test different aggregation functions with BigQuery data."""
    agg_functions = ["mean", "sum", "min", "max"]
```
Can you implement this with pytest parameterize instead of a loop in the function please.
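For illustration, a minimal sketch of what the parametrized version might look like, assuming the `test_transactions_df` fixture and the `CompositeRank` signature used above (the import path is an assumption):

import pytest

from pyretailscience.analysis.composite_rank import CompositeRank  # import path assumed


@pytest.mark.parametrize("agg_func", ["mean", "sum", "min", "max"])
def test_agg_functions_with_bigquery(test_transactions_df, agg_func):
    """Each aggregation function runs as its own test case instead of a loop iteration."""
    cr = CompositeRank(
        df=test_transactions_df,
        rank_cols=[("unit_spend", "desc")],
        agg_func=agg_func,
        ignore_ties=False,
    )
    assert "composite_rank" in cr.df.columns

With this shape, a failure reports which agg_func broke rather than stopping at the first failing iteration of a loop.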
"""Test tie-breaking behavior with BigQuery data.""" | ||
rank_cols = [("unit_spend", "desc")] | ||
|
||
cr_with_ties = CompositeRank( |
Similarly here can you implement with pytest parameterize please.
    def test_ibis_table_input(self, transactions_table):
        """Explicitly test Ibis table input for CompositeRank."""
        cr = CompositeRank(
            df=transactions_table,
            rank_cols=[("unit_spend", "desc"), ("customer_id", "desc")],
            agg_func="mean",
            ignore_ties=False,
        )

        expected_columns = [
            "unit_spend_rank",
            "customer_id_rank",
            "composite_rank",
        ]

        for col in expected_columns:
            assert col in cr.df.columns
I don't think we need this test for now. I think the others cover my big concerns
Actionable comments posted: 0
🧹 Nitpick comments (5)
tests/integration/bigquery/test_haversine.py (1)

20-21: Consider adding a more descriptive assertion message. While the test includes an error message, it could be more descriptive to aid in debugging.

- assert pytest.approx(result, rel=1e-3) == expected_distance, "Distance calculation error"
+ assert pytest.approx(result, rel=1e-3) == expected_distance, f"Distance calculation error: expected {expected_distance}, got {result}"

README.md (1)

275-275: Fix formatting in the list of required secrets. There appears to be extra spacing before the list item.

- To run the workflow in GitHub Actions, add these secrets to your repository:
+ To run the workflow in GitHub Actions, add these secrets to your repository:
tests/integration/bigquery/test_cross_shop.py (3)

28-41: Consider adding assertions to the integration test. The test currently verifies that CrossShop can be instantiated without errors, but doesn't assert any expected behavior or results. While this is in line with the previous review comments to focus on Ibis compatibility, adding a basic assertion would strengthen the test.

- CrossShop(
+ cs = CrossShop(
      df=transactions_df,
      group_1_col="category_1_name",
      group_1_val="Jeans",
      group_2_col="category_1_name",
      group_2_val="Shoes",
  )
+ # Assert that CrossShop object was successfully created with data
+ assert len(cs.cross_shop_df) > 0, "CrossShop should have processed data"

43-58: Consider adding assertions to the three-group test. Similar to the previous test, this one verifies that the three-group configuration doesn't raise errors, but doesn't assert any expected behavior.

- CrossShop(
+ cs = CrossShop(
      df=transactions_df,
      group_1_col="category_1_name",
      group_1_val="Jeans",
      group_2_col="category_1_name",
      group_2_val="Shoes",
      group_3_col="category_1_name",
      group_3_val="Dresses",
  )
+ # Assert that CrossShop correctly handled three groups
+ assert cs.cross_shop_table_df is not None, "CrossShop should generate cross_shop_table_df"

60-74: Verify custom aggregation results in the test. This test validates that custom aggregation functions like "nunique" work with BigQuery, but doesn't assert the expected behavior. Consider adding a simple assertion to confirm the aggregation was applied correctly.

- CrossShop(
+ cs = CrossShop(
      df=transactions_df,
      group_1_col="category_1_name",
      group_1_val="Jeans",
      group_2_col="category_1_name",
      group_2_val="Shoes",
      value_col=cols.customer_id,
      agg_func="nunique",
  )
+ # Assert that customer count aggregation was successful
+ assert cs.cross_shop_table_df is not None, "CrossShop should generate table with customer counts"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
.github/workflows/bigquery-integration.yml (1 hunks)
README.md (2 hunks)
pyretailscience/segmentation/threshold.py (1 hunks)
tests/integration/bigquery/test_cohort_analysis.py (1 hunks)
tests/integration/bigquery/test_composite_rank.py (1 hunks)
tests/integration/bigquery/test_cross_shop.py (1 hunks)
tests/integration/bigquery/test_customer_decision_hierarchy.py (1 hunks)
tests/integration/bigquery/test_haversine.py (1 hunks)
tests/integration/bigquery/test_hml_segmentation.py (1 hunks)
tests/integration/bigquery/test_product_association.py (1 hunks)
tests/integration/bigquery/test_revenue_tree.py (1 hunks)
tests/integration/bigquery/test_rfm_segmentation.py (1 hunks)
tests/integration/bigquery/test_segstats_segmentation.py (1 hunks)
tests/integration/bigquery/test_threshold_segmentation.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- pyretailscience/segmentation/threshold.py
🚧 Files skipped from review as they are similar to previous changes (10)
- tests/integration/bigquery/test_revenue_tree.py
- tests/integration/bigquery/test_customer_decision_hierarchy.py
- .github/workflows/bigquery-integration.yml
- tests/integration/bigquery/test_cohort_analysis.py
- tests/integration/bigquery/test_rfm_segmentation.py
- tests/integration/bigquery/test_product_association.py
- tests/integration/bigquery/test_threshold_segmentation.py
- tests/integration/bigquery/test_hml_segmentation.py
- tests/integration/bigquery/test_composite_rank.py
- tests/integration/bigquery/test_segstats_segmentation.py
🧰 Additional context used
🧬 Code Graph Analysis (2)
tests/integration/bigquery/test_haversine.py (1)
- pyretailscience/analysis/haversine.py (1): haversine_distance (28-55)
tests/integration/bigquery/test_cross_shop.py (3)
- pyretailscience/analysis/cross_shop.py (1): CrossShop (13-200)
- pyretailscience/options.py (1): ColumnHelper (408-560)
- tests/integration/bigquery/conftest.py (1): transactions_table (31-33)
🔇 Additional comments (11)
tests/integration/bigquery/test_haversine.py (4)

1-2: Good documentation. Clear docstring that succinctly describes the purpose of this test file.

9-10: Well-named test function with clear docstring. The test function name and docstring clearly communicate the purpose of the test.

11-14: Good use of Ibis literals for test coordinates. Using ibis.literal with specified type "float64" ensures compatibility with BigQuery's type system. The coordinates represent San Francisco and New York, making this a realistic test case.

16-18: Clean expression construction and execution. The code correctly builds and executes the haversine distance expression using the imported function.

README.md (5)

213-219: Well-structured BigQuery Integration Tests documentation section. Clear introduction to the BigQuery integration tests with a concise overview of their purpose.

221-223: Concise test coverage statement. The simplified statement aligns with previous feedback to avoid listing all modules.

225-233: Comprehensive prerequisites section. Good documentation of all requirements needed to run the integration tests, including GCP account, service account, credentials, and test data.

235-259: Clear manual setup instructions. Step-by-step instructions for setting up authentication, installing dependencies, and running tests. The commands are provided with examples for both running all tests and specific test modules.

261-276: Detailed GitHub Actions workflow documentation. Thorough instructions for running tests via GitHub Actions, including the required repository secrets.
tests/integration/bigquery/test_cross_shop.py (2)

1-7: Clear imports and docstring. Good documentation and appropriate imports for the CrossShop integration tests.

11-26: Well-implemented fixture for test data. The fixture uses appropriate filtering and limiting to create a manageable test dataset while including the necessary categories for cross-shop analysis.
Force-pushed from 7bec26b to a9a6b58.
Actionable comments posted: 0
🧹 Nitpick comments (1)
README.md (1)

275-275: Fix spacing in formatting. There appears to be an extra space after the colon.

- To run the workflow in GitHub Actions, add these secrets to your repository:
+ To run the workflow in GitHub Actions, add these secrets to your repository:
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (15)
.github/workflows/bigquery-integration.yml (1 hunks)
README.md (2 hunks)
pyretailscience/segmentation/threshold.py (1 hunks)
tests/integration/bigquery/test_cohort_analysis.py (1 hunks)
tests/integration/bigquery/test_composite_rank.py (1 hunks)
tests/integration/bigquery/test_cross_shop.py (1 hunks)
tests/integration/bigquery/test_customer_decision_hierarchy.py (1 hunks)
tests/integration/bigquery/test_gain_loss_analysis.py (1 hunks)
tests/integration/bigquery/test_haversine.py (1 hunks)
tests/integration/bigquery/test_hml_segmentation.py (1 hunks)
tests/integration/bigquery/test_product_association.py (1 hunks)
tests/integration/bigquery/test_revenue_tree.py (1 hunks)
tests/integration/bigquery/test_rfm_segmentation.py (1 hunks)
tests/integration/bigquery/test_segstats_segmentation.py (1 hunks)
tests/integration/bigquery/test_threshold_segmentation.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- tests/integration/bigquery/test_rfm_segmentation.py
🚧 Files skipped from review as they are similar to previous changes (12)
- pyretailscience/segmentation/threshold.py
- tests/integration/bigquery/test_customer_decision_hierarchy.py
- tests/integration/bigquery/test_cohort_analysis.py
- tests/integration/bigquery/test_haversine.py
- tests/integration/bigquery/test_revenue_tree.py
- tests/integration/bigquery/test_cross_shop.py
- tests/integration/bigquery/test_threshold_segmentation.py
- tests/integration/bigquery/test_composite_rank.py
- tests/integration/bigquery/test_hml_segmentation.py
- tests/integration/bigquery/test_product_association.py
- .github/workflows/bigquery-integration.yml
- tests/integration/bigquery/test_segstats_segmentation.py
🔇 Additional comments (10)
README.md (3)

213-277: Comprehensive BigQuery integration tests documentation added. This is a well-structured documentation section that clearly explains the purpose, prerequisites, and methods for running the BigQuery integration tests. The section provides both manual setup instructions and GitHub Actions workflow guidance.
248-249: Command matches previous reviewer's suggestion. The command uv sync aligns with murray-ds's previous comment suggesting this format.

223-223: Documentation follows previous reviewer's suggestion. You've implemented murray-ds's suggestion to not list all the modules, which helps reduce maintenance overhead as new modules are added.
tests/integration/bigquery/test_gain_loss_analysis.py (7)

1-11: Well-structured imports and module documentation. The file has appropriate imports and clear module-level documentation explaining the purpose of these integration tests.

16-38: Robust fixture with proper error handling. The BigQuery connection fixture includes appropriate error handling for GoogleCloudError exceptions, ensuring tests are skipped gracefully when data cannot be loaded.

41-104: Comprehensive instantiation test with thorough assertions. This test effectively verifies that the GainLoss class works with BigQuery data, including proper validation of the resulting dataframes and expected columns.

106-158: Well-designed parametrized group column test. The test covers different grouping scenarios and properly verifies the structure of both regular column lists and MultiIndex columns in the results.

160-256: Thorough testing of customer group processing logic. This parametrized test provides comprehensive coverage of various customer movement scenarios, with appropriate assertions for each case.

258-312: Good coverage of aggregation functions and value columns. The test ensures that GainLoss works correctly with different aggregation functions and value columns, including proper handling of missing columns.

277-281: Handles missing columns gracefully. The code appropriately creates the transaction_count column when needed and skips tests when required columns are missing, providing good test robustness.
uv sync
uv sync --group dev
You don't need both here. `uv sync` will also install group dev.
-uv sync
-uv sync --group dev
+uv sync
@@ -25,7 +25,7 @@ repos:
   hooks:
     - id: pytest
       name: pytest
-      entry: uv run pytest --cov=pyretailscience --cov-report=xml --cov-branch tests
+      entry: uv run pytest --cov=pyretailscience --cov-report=xml --cov-branch tests --ignore=tests/integration/bigquery
Eventually we'll also have other DB integration tests. Maybe we just ignore the whole tests/integration directory?
-      entry: uv run pytest --cov=pyretailscience --cov-report=xml --cov-branch tests --ignore=tests/integration/bigquery
+      entry: uv run pytest --cov=pyretailscience --cov-report=xml --cov-branch tests --ignore=tests/integration
@@ -83,14 +83,11 @@ def __init__(
        window = ibis.window(order_by=ibis.asc(df[value_col]))
        df = df.mutate(ptile=ibis.percent_rank().over(window))

        case = ibis.case()

        case_args = []
I think you can do this with a list comprehension, i.e.
-case_args = []
+case_args = [(df["ptile"] <= quantile, segment) for quantile, segment in zip(thresholds, segments, strict=True)]
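For context, a minimal sketch of how the comprehension-built pairs could feed the case expression, assuming the thresholds, segments, and ptile column from the diff above (ibis.case() is the builder this file already uses; the output column name is illustrative):

import ibis

# Build (condition, result) pairs once, then chain them onto the case builder
case_args = [(df["ptile"] <= quantile, segment) for quantile, segment in zip(thresholds, segments, strict=True)]

case_expr = ibis.case()
for condition, segment in case_args:
    case_expr = case_expr.when(condition, segment)

df = df.mutate(segment_name=case_expr.end())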
.env_sample
Outdated
Can you please rename this file to .test_env_sample so it's clear it's only for testing purposes.
    def test_invalid_period(self, transactions_table):
        """Test if an invalid period raises an error."""
        invalid_period = "m"
        with pytest.raises(
            ValueError,
            match=f"Invalid period '{invalid_period}'. Allowed values: {CohortAnalysis.VALID_PERIODS}",
        ):
            CohortAnalysis(
                df=transactions_table,
                aggregation_column="unit_spend",
                period=invalid_period,
            )
Do we need this test? This is only testing Ibis functionality, correct?
    def test_ibis_bigquery_brand_association(self, transactions_table):
        """Test brand association analysis using BigQuery backend.

        This ensures that brand-level association works with BigQuery.
        """
        value_col = "brand_name"
        query = transactions_table.select(
            "transaction_id",
            value_col,
        ).limit(1000)

        df = query.execute()

        if df[value_col].isna().mean() > MAX_NULL_RATIO or df[value_col].nunique() < MIN_UNIQUE_CATEGORIES:
            pytest.skip("Brand data not suitable for association testing")

        associations = ProductAssociation(
            df=df,
            value_col=value_col,
            group_col="transaction_id",
            min_cooccurrences=1,
        )

        result = associations.df
        if hasattr(result, "execute"):
            result = result.execute()

        assert isinstance(result, pd.DataFrame)

        if len(result) > 0:
            assert f"{value_col}_1" in result.columns
            assert f"{value_col}_2" in result.columns
Isn't this test already covered by the above?
"transaction_date", | ||
] | ||
query = transactions_table.select(required_cols).limit(10) | ||
df = query.execute() |
This isn't testing the function with BQ. You need to pass the Ibis table so it does the aggregations in the RevenueTree
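A minimal sketch of what passing the Ibis table through might look like, assuming RevenueTree accepts an Ibis expression (as the reviewer implies) and that the period flag can be derived in Ibis; the cutoff date is a made-up example:

import ibis

# Derive the period column in Ibis so RevenueTree's aggregations run in BigQuery
ibis_table = transactions_table.mutate(
    period=(transactions_table.transaction_date < ibis.date("2023-07-01")).ifelse("P1", "P2"),
)

rt = RevenueTree(
    df=ibis_table,  # pass the Ibis expression, not an executed DataFrame
    period_col="period",
    p1_value="P1",
    p2_value="P2",
)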
    def test_ibis_bigquery_with_quantity(self, cols, transactions_table):
        """Test that RevenueTree works with additional quantity column in BigQuery."""
        required_cols = [
            "customer_id",
            "unit_spend",
            "transaction_id",
            "transaction_date",
            "unit_quantity",
        ]
        query = transactions_table.select(required_cols).limit(15)
        df = query.execute()

        middle_index = len(df) // 2
        df["period"] = ["P1"] * middle_index + ["P2"] * (len(df) - middle_index)

        rt = RevenueTree(
            df=df,
            period_col="period",
            p1_value="P1",
            p2_value="P2",
        )

        assert rt.df is not None
        assert cols.calc_units_per_trans_contrib in rt.df.columns
        assert cols.calc_price_per_unit_contrib in rt.df.columns
I don't think this tests anything additional.
    def test_ibis_bigquery_uneven_periods(self, cols, transactions_table):
        """Test that RevenueTree works with uneven period data in BigQuery."""
        required_cols = [
            "customer_id",
            "unit_spend",
            "transaction_id",
            "transaction_date",
            "brand_id",
        ]
        query = transactions_table.select(required_cols).limit(15)
        df = query.execute()

        df["period"] = ["P1"] * 10 + ["P2"] * (len(df) - 10)

        rt = RevenueTree(
            df=df,
            period_col="period",
            p1_value="P1",
            p2_value="P2",
            group_col="brand_id",
        )

        assert rt.df is not None
        assert len(rt.df) > 0
Why might uneven periods cause an issue with the BQ backend?
    def test_ibis_bigquery_column_selection(self, cols, transactions_table):
        """Test that RevenueTree works with specific column selection in BigQuery."""
        query = transactions_table.select(
            [
                "customer_id",
                "unit_spend",
                "transaction_id",
                "transaction_date",
            ],
        ).limit(10)
        df = query.execute()

        middle_index = len(df) // 2
        df["period"] = ["P1"] * middle_index + ["P2"] * (len(df) - middle_index)

        rt = RevenueTree(
            df=df,
            period_col="period",
            p1_value="P1",
            p2_value="P2",
        )

        assert rt.df is not None
        assert cols.agg_customer_id_p1 in rt.df.columns
        assert cols.agg_customer_id_p2 in rt.df.columns
        assert cols.agg_unit_spend_p1 in rt.df.columns
        assert cols.agg_unit_spend_p2 in rt.df.columns
Why do we need to test this?
Force-pushed from a9a6b58 to 00783ab.
Actionable comments posted: 0
🧹 Nitpick comments (6)
tests/integration/bigquery/test_customer_decision_hierarchy.py (3)

11-19: Good use of parametrization to test various combinations. The test effectively covers multiple product columns, both methods (truncated_svd and yules_q), and both values for exclude_same_transaction_products as requested in the previous review comment.

Consider adding one more parameter combination to test yules_q with exclude_same_transaction=False for more complete coverage.
36-42: Consider validating the output structure. Currently, the test only verifies that the class initializes without errors. Consider adding assertions to verify that the resulting object's structure is as expected (e.g., check that distances are calculated, pairs_df is populated).

Example addition:

  CustomerDecisionHierarchy(
      df=transactions_df,
      product_col=product_col,
      exclude_same_transaction_products=exclude_same_transaction,
      method=method,
      min_var_explained=min_var_explained if min_var_explained is not None else 0.8,
  )
+ # Add assertions to verify that the object was initialized correctly
+ cdh = CustomerDecisionHierarchy(
+     df=transactions_df,
+     product_col=product_col,
+     exclude_same_transaction_products=exclude_same_transaction,
+     method=method,
+     min_var_explained=min_var_explained if min_var_explained is not None else 0.8,
+ )
+ # Verify that the object has the expected structure
+ assert cdh.pairs_df is not None and len(cdh.pairs_df) > 0, "pairs_df should be populated"
+ assert cdh.distances is not None, "distances should be calculated"
43-44: Consider catching more specific exceptions. While using a broad exception handler is acceptable for tests, catching more specific exceptions would provide better insights into what's failing.

- except Exception as e:  # noqa: BLE001
+ except (ValueError, TypeError, ImportError) as e:
      pytest.fail(f"CustomerDecisionHierarchy failed with product_col={product_col}, method={method}: {e}")

README.md (3)
232-233: Consider adding more details about the test dataset. It would be helpful to provide more information about the required test dataset structure or schema to help users set up their test environment correctly.

Consider adding:
- Information about the schema of the transactions table
- Minimum required columns based on the tests
- A sample schema definition or data generation script
240-243: Clarify environment variable usage. The manual setup uses GOOGLE_APPLICATION_CREDENTIALS while GitHub Actions uses GCP_SA_KEY. Consider explaining the reason for this difference.

- Set up authentication:
+ Set up authentication (for manual testing):
      export GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/service-account-key.json
      export GCP_PROJECT_ID=your-project-id
+ Note: GOOGLE_APPLICATION_CREDENTIALS is the standard environment variable used by Google Cloud client libraries, while GitHub Actions uses GCP_SA_KEY to store the same information as a repository secret.

272-276: Fix formatting of list items. There's a formatting issue with the list of GitHub secrets.

  To run the workflow in GitHub Actions, add these secrets to your repository:
  - GCP_SA_KEY: The entire JSON content of your GCP service account key file
  - GCP_PROJECT_ID: Your GCP project ID
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (15)
.github/workflows/bigquery-integration.yml (1 hunks)
README.md (2 hunks)
pyretailscience/segmentation/threshold.py (1 hunks)
tests/integration/bigquery/test_cohort_analysis.py (1 hunks)
tests/integration/bigquery/test_composite_rank.py (1 hunks)
tests/integration/bigquery/test_cross_shop.py (1 hunks)
tests/integration/bigquery/test_customer_decision_hierarchy.py (1 hunks)
tests/integration/bigquery/test_gain_loss_analysis.py (1 hunks)
tests/integration/bigquery/test_haversine.py (1 hunks)
tests/integration/bigquery/test_hml_segmentation.py (1 hunks)
tests/integration/bigquery/test_product_association.py (1 hunks)
tests/integration/bigquery/test_revenue_tree.py (1 hunks)
tests/integration/bigquery/test_rfm_segmentation.py (1 hunks)
tests/integration/bigquery/test_segstats_segmentation.py (1 hunks)
tests/integration/bigquery/test_threshold_segmentation.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (13)
- pyretailscience/segmentation/threshold.py
- tests/integration/bigquery/test_threshold_segmentation.py
- tests/integration/bigquery/test_cross_shop.py
- tests/integration/bigquery/test_segstats_segmentation.py
- tests/integration/bigquery/test_haversine.py
- tests/integration/bigquery/test_rfm_segmentation.py
- tests/integration/bigquery/test_product_association.py
- .github/workflows/bigquery-integration.yml
- tests/integration/bigquery/test_cohort_analysis.py
- tests/integration/bigquery/test_hml_segmentation.py
- tests/integration/bigquery/test_composite_rank.py
- tests/integration/bigquery/test_gain_loss_analysis.py
- tests/integration/bigquery/test_revenue_tree.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/integration/bigquery/test_customer_decision_hierarchy.py (3)
- pyretailscience/analysis/customer_decision_hierarchy.py (1): CustomerDecisionHierarchy (16-289)
- pyretailscience/options.py (1): ColumnHelper (408-560)
- tests/integration/bigquery/conftest.py (1): transactions_table (31-33)
🔇 Additional comments (2)
tests/integration/bigquery/test_customer_decision_hierarchy.py (1)
33-33: ❓ Verification inconclusive. Verify data limit sufficiency.

The 5000 row limit may be sufficient for testing initialization, but consider whether it provides enough data density for meaningful analysis results, especially for sparse customer-product relationships.

You could run a quick check to ensure the data contains sufficient variety. Verify dataset variety in tests/integration/bigquery/test_customer_decision_hierarchy.py:33. The hard-coded limit of 5,000 rows may not cover enough unique customer-product relationships for meaningful assertions. Please manually run a query against your test BigQuery table to confirm:

WITH sample AS (
  SELECT * FROM `your_project.your_dataset.transactions_table` LIMIT 5000
),
per_customer AS (
  SELECT customer_id, COUNT(*) AS txn_count
  FROM sample
  GROUP BY customer_id
)
SELECT
  (SELECT COUNT(DISTINCT customer_id) FROM sample) AS unique_customers,
  (SELECT COUNT(DISTINCT product_name) FROM sample) AS unique_products,
  (SELECT AVG(txn_count) FROM per_customer) AS avg_txns_per_customer;

Adjust the row limit or add more diverse test data if these metrics fall below your expected thresholds.
README.md (1)

217-219: Clear and concise introduction to the BigQuery integration tests. The overview effectively explains the purpose of these tests and their importance in ensuring compatibility with BigQuery.
df = transactions_table.to_pandas()

df["spend_per_customer"] = df["unit_spend"] / df["customer_id"]
This needs to be done in Ibis and passed as an ibis table.
("product_col", "method", "min_var_explained", "exclude_same_transaction"), | ||
[ | ||
("product_name", "truncated_svd", 0.8, True), | ||
("category_1_name", "yules_q", None, True), |
We also need a version where method="yules_q" and exclude_same_transaction=False
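Concretely, that would be one more tuple in the parametrize list above; the product column choice is illustrative:

            ("product_name", "yules_q", None, False),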
|
    @pytest.mark.parametrize(
        ("aggregation_column", "agg_func", "period", "percentage"),
Do we need to test with different columns? Does that affect Ibis functionality? I get that it might give the wrong answer if we're not correctly switching the column, but I don't think that will cause an Ibis failure.
("brand_name", "category_1_name", None, "unit_spend", "sum"), | ||
("category_0_name", "brand_name", None, "unit_quantity", "max"), | ||
("category_1_name", "category_0_name", None, "unit_cost", "mean"), | ||
# Test with 3 groups | ||
("brand_name", "category_0_name", "category_1_name", "unit_spend", "count"), | ||
("category_0_name", "category_1_name", "brand_name", "unit_quantity", "mean"), |
Do we need to test with different columns? Does that affect Ibis functionality? I get that it might give the wrong answer if we're not correctly switching the column, but I don't think that will cause an Ibis failure.
|
    @pytest.mark.parametrize(
        ("focus_brand_id", "comparison_brand_id", "group_col", "value_col,agg_func"),
Do we need to test with different group columns? Does that affect Ibis functionality? I get that it might give the wrong answer if we're not correctly switching the column, but I don't think that will cause an Ibis failure. Same thing with the different brand IDs.
    @pytest.mark.parametrize(
        ("value_col", "group_col", "target_item"),
        [
            ("product_name", "customer_id", None),
Do we need to test with different columns? Does that affect Ibis functionality? I get that it might give the wrong answer if we're not correctly switching the column, but I don't think that will cause an Ibis failure. Target item is good, though.
    @pytest.mark.parametrize(
        ("min_occurrences", "min_cooccurrences", "min_support", "min_confidence", "min_uplift"),
        [
            (1, 1, 0.0, 0.0, 0.0),
            (5, 3, 0.01, 0.05, 1.0),
            (10, 5, 0.02, 0.1, 1.5),
        ],
    )
I think it's enough to test with one set of values so we don't need parameterize here.
("transaction_date", "brand_name"), | ||
("transaction_date", "store_id"), |
I think it's enough to test with None and one group_col value
        limited_transactions = transactions_table.limit(10000)

        try:
            period_values = limited_transactions.select(period_col).distinct().limit(2).execute()
This needs to be an ibis table
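One possible shape, sketched under the assumption that RevenueTree accepts an Ibis table (only the tiny distinct lookup is executed; the transactions stay in BigQuery):

period_values = (
    limited_transactions.select(period_col).distinct().limit(2).execute()[period_col].tolist()
)

rt = RevenueTree(
    df=limited_transactions,  # Ibis table, so the aggregations run in BigQuery
    period_col=period_col,
    p1_value=period_values[0],
    p2_value=period_values[1],
)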
Force-pushed the …ature/bigquery-integration-analysis branch from 6fe7d14 to 54db275.
Actionable comments posted: 3
🧹 Nitpick comments (1)
tests/integration/bigquery/test_threshold_segmentation.py (1)

1-39: Test design is good, but consider enhancing assertions. The parameterized test correctly verifies that ThresholdSegmentation can process BigQuery data with different zero_value_handling options without exceptions. The test is properly structured and uses parameterization as requested in previous reviews.

However, the test could be enhanced to verify that the segmentation produces expected results rather than just checking for exceptions. Consider adding assertions that validate the segment distribution or other key properties of the segmentation output.

You could improve this by:

  ThresholdSegmentation(
      df=limited_table,
      thresholds=[0.33, 0.66],
      segments=["Low", "High"],
      value_col=cols.unit_spend,
      agg_func="mean",
      zero_value_customers=zero_value_handling,
- )
+ ).table.to_pandas().head()  # Verify we can execute the table
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (18)
.env_sample (1 hunks)
.github/workflows/bigquery-integration.yml (1 hunks)
.pre-commit-config.yaml (1 hunks)
README.md (2 hunks)
pyproject.toml (1 hunks)
pyretailscience/segmentation/threshold.py (1 hunks)
tests/integration/bigquery/conftest.py (1 hunks)
tests/integration/bigquery/test_cohort_analysis.py (1 hunks)
tests/integration/bigquery/test_composite_rank.py (1 hunks)
tests/integration/bigquery/test_cross_shop.py (1 hunks)
tests/integration/bigquery/test_customer_decision_hierarchy.py (1 hunks)
tests/integration/bigquery/test_haversine.py (1 hunks)
tests/integration/bigquery/test_hml_segmentation.py (1 hunks)
tests/integration/bigquery/test_product_association.py (1 hunks)
tests/integration/bigquery/test_revenue_tree.py (1 hunks)
tests/integration/bigquery/test_rfm_segmentation.py (1 hunks)
tests/integration/bigquery/test_segstats_segmentation.py (1 hunks)
tests/integration/bigquery/test_threshold_segmentation.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (15)
- .env_sample
- pyproject.toml
- tests/integration/bigquery/test_hml_segmentation.py
- .pre-commit-config.yaml
- tests/integration/bigquery/test_cohort_analysis.py
- pyretailscience/segmentation/threshold.py
- tests/integration/bigquery/test_customer_decision_hierarchy.py
- tests/integration/bigquery/test_rfm_segmentation.py
- tests/integration/bigquery/test_product_association.py
- tests/integration/bigquery/test_cross_shop.py
- tests/integration/bigquery/test_segstats_segmentation.py
- tests/integration/bigquery/test_haversine.py
- tests/integration/bigquery/test_revenue_tree.py
- tests/integration/bigquery/conftest.py
- .github/workflows/bigquery-integration.yml
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/integration/bigquery/test_threshold_segmentation.py (3)
- pyretailscience/options.py (1): ColumnHelper (408-560)
- pyretailscience/segmentation/threshold.py (2): ThresholdSegmentation (22-102), df (98-102)
- tests/integration/bigquery/conftest.py (1): transactions_table (31-33)
🔇 Additional comments (1)
README.md (1)

213-277: Documentation for BigQuery integration tests is clear and complete. The new section provides comprehensive guidance on running the BigQuery integration tests, both manually and through GitHub Actions. The instructions are well-structured with clear prerequisites, setup steps, and execution commands.

The changes address previous review feedback by using uv sync for dependency installation and not listing all individual modules covered by the tests.
ibis_table = transactions_table.mutate(
    spend_per_customer=transactions_table.unit_spend / transactions_table.customer_id,
)
Creating spend_per_customer using division by customer_id looks incorrect.
Dividing unit_spend by customer_id is mathematically questionable, as customer IDs are typically not meaningful numerical values but rather identifiers. This calculation won't produce a meaningful "spend per customer" metric.
Consider one of these alternatives:
- ibis_table = transactions_table.mutate(
- spend_per_customer=transactions_table.unit_spend / transactions_table.customer_id,
- )
+ # Option 1: Create a more meaningful derived metric
+ ibis_table = transactions_table.mutate(
+ spend_per_customer=transactions_table.unit_spend / transactions_table.quantity,
+ )
Or if you need a synthetic column just for testing:
- ibis_table = transactions_table.mutate(
- spend_per_customer=transactions_table.unit_spend / transactions_table.customer_id,
- )
+ # Option 2: Create a synthetic column that's more clearly for testing purposes
+ ibis_table = transactions_table.mutate(
+ spend_per_customer=transactions_table.unit_spend * 0.5, # Arbitrary calculation for testing
+ )
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# Option 1: Create a more meaningful derived metric
ibis_table = transactions_table.mutate(
    spend_per_customer=transactions_table.unit_spend / transactions_table.quantity,
)
# Option 2: Create a synthetic column for testing purposes
ibis_table = transactions_table.mutate(
    spend_per_customer=transactions_table.unit_spend * 0.5,  # Arbitrary calculation for testing
)
🤖 Prompt for AI Agents
In tests/integration/bigquery/test_composite_rank.py around lines 16 to 18, the
calculation of spend_per_customer by dividing unit_spend by customer_id is
incorrect because customer_id is an identifier, not a numeric value for
division. To fix this, replace the division by customer_id with a meaningful
aggregation or calculation, such as summing unit_spend grouped by customer_id or
using a constant or synthetic value if the column is only needed for testing
purposes.
@pytest.mark.parametrize("ignore_ties", [False, True])
def test_tie_handling(test_transactions_table, ignore_ties):
    """Test handling of ties during rank calculation."""
    rank_cols = [("unit_spend", "desc")]
    try:
        result = CompositeRank(
            df=test_transactions_table,
            rank_cols=rank_cols,
            agg_func="mean",
            ignore_ties=ignore_ties,
        )
        assert result is not None
    except Exception as e:  # noqa: BLE001
        pytest.fail(f"CompositeRank failed with ignore_ties={ignore_ties}: {e}")
🛠️ Refactor suggestion
Parameterization is well implemented, but test lacks meaningful assertions.
The parameterized test for tie handling is a good approach but, like the basic test, it only verifies that the CompositeRank class can be instantiated without errors. It does not verify that the tie handling is working as expected with different ignore_ties values.
Consider adding assertions that specifically validate the tie handling behavior:
result = CompositeRank(
df=test_transactions_table,
rank_cols=rank_cols,
agg_func="mean",
ignore_ties=ignore_ties,
)
assert result is not None
+
+ # Execute the table to get results
+ result_df = result.table.to_pandas()
+
+ # Find records with tied values to verify tie handling
+ if ignore_ties:
+ # With ignore_ties=True, ranks should be unique
+ assert result_df["rank"].nunique() == len(result_df)
+ else:
+ # With ignore_ties=False, we might have duplicate ranks for tied values
+ # This is a basic check to ensure the behavior is different
+ # A more comprehensive test would look for specific tied values
+ group_sizes = result_df.groupby("unit_spend").size()
+ has_ties = any(size > 1 for size in group_sizes)
+ if has_ties:
+ # If there are ties in the data, we should have some duplicate ranks
+ assert result_df["rank"].nunique() < len(result_df)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
@pytest.mark.parametrize("ignore_ties", [False, True])
def test_tie_handling(test_transactions_table, ignore_ties):
    """Test handling of ties during rank calculation."""
    rank_cols = [("unit_spend", "desc")]
    try:
        result = CompositeRank(
            df=test_transactions_table,
            rank_cols=rank_cols,
            agg_func="mean",
            ignore_ties=ignore_ties,
        )
        assert result is not None

        # Execute the table to get results
        result_df = result.table.to_pandas()

        # Find records with tied values to verify tie handling
        if ignore_ties:
            # With ignore_ties=True, ranks should be unique
            assert result_df["rank"].nunique() == len(result_df)
        else:
            # With ignore_ties=False, we might have duplicate ranks for tied values
            # This is a basic check to ensure the behavior is different
            # A more comprehensive test would look for specific tied values
            group_sizes = result_df.groupby("unit_spend").size()
            has_ties = any(size > 1 for size in group_sizes)
            if has_ties:
                # If there are ties in the data, we should have some duplicate ranks
                assert result_df["rank"].nunique() < len(result_df)
    except Exception as e:  # noqa: BLE001
        pytest.fail(f"CompositeRank failed with ignore_ties={ignore_ties}: {e}")
🤖 Prompt for AI Agents
In tests/integration/bigquery/test_composite_rank.py around lines 45 to 58, the
test_tie_handling function only checks that CompositeRank instantiates without
error but does not verify correct tie handling behavior. Enhance the test by
adding assertions that confirm the ranking results reflect the expected behavior
when ignore_ties is True versus False, such as checking if tied ranks are
assigned correctly or if ties are ignored as intended. This will ensure the
parameterized test validates the functional impact of the ignore_ties parameter.
feat: Setup GCP BigQuery Integration Tests for Analysis Modules
Summary by CodeRabbit
New Features
Documentation
Tests
Chores