Skip to content

feat: standard bar plot #86

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 49 commits into from
Feb 2, 2025
Merged

feat: standard bar plot #86

merged 49 commits into from
Feb 2, 2025

Conversation

Mrglglglglgl
Copy link
Collaborator

@Mrglglglglgl Mrglglglglgl commented Oct 4, 2024

Summary by CodeRabbit

  • New Features

    • Introduced a new "Bar Plot" section in documentation, detailing its functionality and use cases in retail analytics.
    • Added a new module for creating customizable bar plots, supporting single and grouped configurations.
    • Included new sections on various analytical tools in the documentation.
  • Documentation

    • Enhanced clarity and organization of existing documentation sections.
    • Added navigation entry for "Bar Plot" to improve user access to resources.
  • Tests

    • Established a comprehensive suite of unit tests for the new bar plot functionality.
    • Added new tests for the graph_utils module to ensure formatting and truncation functionalities are reliable.
    • Introduced a new test suite for the tailwind module focusing on color mapping functions.
  • Chores

    • Updated .gitignore to include entries for Claude Sync files.

@Mrglglglglgl Mrglglglglgl linked an issue Oct 4, 2024 that may be closed by this pull request
4 tasks
Copy link

coderabbitai bot commented Oct 4, 2024

Important

Review skipped

Review was skipped due to path filters

⛔ Files ignored due to path filters (1)
  • docs/assets/images/analysis_modules/plots/histogram_plot.svg is excluded by !**/*.svg

CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including **/dist/** will override the default block on the dist directory, by removing the pattern from both the lists.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The pull request introduces several enhancements to the documentation and functionality related to bar plots in the PyRetailScience library. A new section on "Bar Plot" has been added to docs/analysis_modules.md, along with multiple new sections covering various analytical topics. The bar.py module has been created to provide functionality for generating customizable bar plots, and corresponding documentation has been added in docs/api/plots/bar.md. Additionally, unit tests for the bar plot functionality have been implemented in tests/plots/test_bar.py. Changes to the mkdocs.yml file ensure proper navigation to the new documentation.

Changes

File Path Change Summary
docs/analysis_modules.md Added sections on "Bar Plot," "Cross Shop," "Gain Loss," "Revenue Tree," "HML Segmentation," "Threshold Segmentation," "Segmentation Stats," "Purchases Per Customer," "Days Between Purchases," and "Transaction Churn." Reorganized existing sections for clarity.
docs/api/plots/bar.md Introduced a new section for the pyretailscience.plots.bar module, detailing bar plot functionality.
mkdocs.yml Added "Bar Plot" entry under the "Reference" section, linking to api/plots/bar.md.
pyretailscience/plots/bar.py Created a module for customizable bar plots, including functions for plotting and generating labels.
tests/plots/test_bar.py Added unit tests for bar plot functionality, covering various scenarios and edge cases.

Possibly related PRs

  • feat: add standard histogram plot #85: The changes in the main PR and the retrieved PR are related as both introduce new sections in the docs/analysis_modules.md file for different types of plots (bar and histogram), and they both include example code snippets demonstrating their respective functionalities. Additionally, both PRs modify the documentation structure to enhance clarity and usability for users interested in visualizations.
  • feat: add line plot #84: The changes in the main PR, which expand the documentation for bar plots, are related to the retrieved PR that introduces a new line plot module, as both involve enhancements to the plotting capabilities within the pyretailscience library and include new sections in the documentation for their respective plot types.

Suggested labels

documentation, enhancement, Tests, Review effort [1-5]: 4

Suggested reviewers

  • murray-ds

🐇 In the world of plots, we hop and play,
With bar charts brightening the retail way.
New sections sprout, like carrots in spring,
Analytics bloom, oh what joy they bring!
From cross shops to gains, our insights will soar,
With tests to ensure, we’ll always explore! 🌼📊


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

codecov bot commented Oct 4, 2024

Codecov Report

Attention: Patch coverage is 88.88889% with 15 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
pyretailscience/plots/bar.py 87.75% 7 Missing and 5 partials ⚠️
pyretailscience/style/graph_utils.py 91.89% 2 Missing and 1 partial ⚠️
Files with missing lines Coverage Δ
pyretailscience/style/graph_utils.py 76.51% <91.89%> (ø)
pyretailscience/plots/bar.py 87.75% <87.75%> (ø)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

🧹 Outside diff range and nitpick comments (10)
docs/analysis_modules.md (10)

106-106: Minor: Consider wrapping long lines

There are two lines that exceed the recommended 120-character limit:

  1. Line 106: "Bar plots are ideal for visualizing comparisons between categories or groups, showing how metrics such as revenue, sales, or other values vary across different categories. This module allows you to easily group bars by different categories and stack them when comparing multiple metrics. You can also add data labels to display absolute or percentage values for each bar."
  2. Line 114: "This module provides flexibility in customizing legends, axes, and other visual elements, making it easy to represent data across different dimensions, either as grouped or single bar plots."

Consider breaking these lines into shorter segments for improved readability.

Also applies to: 114-114

🧰 Tools
🪛 Markdownlint

106-106: Expected: 120; Actual: 371
Line length

(MD013, line-length)


Line range hint 339-351: TODO: Complete the Cross Shop section

The Cross Shop section currently contains placeholder content. Please add the following:

  1. A description of Cross Shop analysis and its importance in retail analytics.
  2. Key use cases or applications of Cross Shop analysis.
  3. An example code snippet demonstrating how to perform Cross Shop analysis using the PyRetailScience library.
  4. Replace the placeholder image with an actual Cross Shop visualization.

Would you like assistance in drafting the content for this section?

🧰 Tools
🪛 Markdownlint

106-106: Expected: 120; Actual: 371
Line length

(MD013, line-length)


114-114: Expected: 120; Actual: 190
Line length

(MD013, line-length)


Line range hint 353-365: TODO: Complete the Gain Loss section

The Gain Loss section currently contains placeholder content. Please add the following:

  1. An explanation of Gain Loss analysis and its significance in retail analytics.
  2. Key applications or insights that can be derived from Gain Loss analysis.
  3. An example code snippet showing how to perform Gain Loss analysis using the PyRetailScience library.
  4. Replace the placeholder image with a relevant Gain Loss visualization.

Would you like assistance in drafting the content for this section?

🧰 Tools
🪛 Markdownlint

106-106: Expected: 120; Actual: 371
Line length

(MD013, line-length)


114-114: Expected: 120; Actual: 190
Line length

(MD013, line-length)


Line range hint 401-413: TODO: Complete the Revenue Tree section

The Revenue Tree section currently contains placeholder content. Please add the following:

  1. A description of Revenue Tree analysis and its importance in retail analytics.
  2. Key use cases or applications of Revenue Tree analysis.
  3. An example code snippet demonstrating how to create a Revenue Tree using the PyRetailScience library.
  4. Replace the placeholder image with an actual Revenue Tree visualization.

Would you like assistance in drafting the content for this section?

🧰 Tools
🪛 Markdownlint

106-106: Expected: 120; Actual: 371
Line length

(MD013, line-length)


114-114: Expected: 120; Actual: 190
Line length

(MD013, line-length)


Line range hint 415-427: TODO: Complete the HML Segmentation section

The HML Segmentation section currently contains placeholder content. Please add the following:

  1. An explanation of HML (High, Medium, Low) Segmentation and its importance in retail analytics.
  2. Key benefits and applications of HML Segmentation in customer analysis.
  3. An example code snippet showing how to perform HML Segmentation using the PyRetailScience library.
  4. Replace the placeholder image with a relevant HML Segmentation visualization or diagram.

Would you like assistance in drafting the content for this section?

🧰 Tools
🪛 Markdownlint

106-106: Expected: 120; Actual: 371
Line length

(MD013, line-length)


114-114: Expected: 120; Actual: 190
Line length

(MD013, line-length)


Line range hint 429-441: TODO: Complete the Threshold Segmentation section

The Threshold Segmentation section currently contains placeholder content. Please add the following:

  1. A description of Threshold Segmentation and its significance in retail analytics.
  2. Key applications and benefits of using Threshold Segmentation for customer analysis.
  3. An example code snippet demonstrating how to perform Threshold Segmentation using the PyRetailScience library.
  4. Replace the placeholder image with an actual Threshold Segmentation visualization or diagram.

Would you like assistance in drafting the content for this section?

🧰 Tools
🪛 Markdownlint

106-106: Expected: 120; Actual: 371
Line length

(MD013, line-length)


114-114: Expected: 120; Actual: 190
Line length

(MD013, line-length)


Line range hint 443-455: TODO: Complete the Segmentation Stats section

The Segmentation Stats section currently contains placeholder content. Please add the following:

  1. An explanation of Segmentation Stats and their importance in retail analytics.
  2. Key metrics and insights that can be derived from Segmentation Stats.
  3. An example code snippet showing how to calculate and analyze Segmentation Stats using the PyRetailScience library.
  4. Replace the placeholder image with a relevant Segmentation Stats visualization or table.

Would you like assistance in drafting the content for this section?

🧰 Tools
🪛 Markdownlint

106-106: Expected: 120; Actual: 371
Line length

(MD013, line-length)


114-114: Expected: 120; Actual: 190
Line length

(MD013, line-length)


Line range hint 457-469: TODO: Complete the Purchases Per Customer section

The Purchases Per Customer section currently contains placeholder content. Please add the following:

  1. A description of the Purchases Per Customer metric and its significance in retail analytics.
  2. Key insights and applications that can be derived from analyzing Purchases Per Customer.
  3. An example code snippet demonstrating how to calculate and analyze Purchases Per Customer using the PyRetailScience library.
  4. Replace the placeholder image with an actual Purchases Per Customer visualization or chart.

Would you like assistance in drafting the content for this section?

🧰 Tools
🪛 Markdownlint

106-106: Expected: 120; Actual: 371
Line length

(MD013, line-length)


114-114: Expected: 120; Actual: 190
Line length

(MD013, line-length)


Line range hint 471-483: TODO: Complete the Days Between Purchases section

The Days Between Purchases section currently contains placeholder content. Please add the following:

  1. An explanation of the Days Between Purchases metric and its importance in retail analytics.
  2. Key insights and applications that can be derived from analyzing Days Between Purchases.
  3. An example code snippet showing how to calculate and analyze Days Between Purchases using the PyRetailScience library.
  4. Replace the placeholder image with a relevant Days Between Purchases visualization or chart.

Would you like assistance in drafting the content for this section?

🧰 Tools
🪛 Markdownlint

106-106: Expected: 120; Actual: 371
Line length

(MD013, line-length)


114-114: Expected: 120; Actual: 190
Line length

(MD013, line-length)


Line range hint 485-497: TODO: Complete the Transaction Churn section

The Transaction Churn section currently contains placeholder content. Please add the following:

  1. A description of Transaction Churn and its significance in retail analytics.
  2. Key insights and applications that can be derived from analyzing Transaction Churn.
  3. An example code snippet demonstrating how to calculate and analyze Transaction Churn using the PyRetailScience library.
  4. Replace the placeholder image with an actual Transaction Churn visualization or chart.

Would you like assistance in drafting the content for this section?

🧰 Tools
🪛 Markdownlint

106-106: Expected: 120; Actual: 371
Line length

(MD013, line-length)


114-114: Expected: 120; Actual: 190
Line length

(MD013, line-length)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between a9bb418 and f7c0e51.

⛔ Files ignored due to path filters (1)
  • docs/assets/images/analysis_modules/plots/bar_plot.svg is excluded by !**/*.svg
📒 Files selected for processing (5)
  • docs/analysis_modules.md (1 hunks)
  • docs/api/plots/bar.md (1 hunks)
  • mkdocs.yml (1 hunks)
  • pyretailscience/plots/bar.py (1 hunks)
  • tests/plots/test_bar.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • docs/api/plots/bar.md
🧰 Additional context used
🪛 Markdownlint
docs/analysis_modules.md

106-106: Expected: 120; Actual: 371
Line length

(MD013, line-length)


114-114: Expected: 120; Actual: 190
Line length

(MD013, line-length)

🪛 Ruff
tests/plots/test_bar.py

14-14: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)


25-25: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)


31-31: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)


40-40: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)

🔇 Additional comments (5)
mkdocs.yml (1)

28-28: LGTM! Consider verifying the existence of the referenced file.

The addition of the "Bar Plot" entry to the navigation structure is consistent with the existing format and appropriately placed under the "Reference" section. This change aligns well with the PR objective of adding a standard bar plot to the documentation.

To ensure the referenced file exists, you can run the following command:

✅ Verification successful

Verified: The referenced file exists.

The "Bar Plot" entry has been successfully added to the navigation structure, and the referenced file docs/api/plots/bar.md is present.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify the existence of the bar plot documentation file
if [ -f "docs/api/plots/bar.md" ]; then
    echo "The file docs/api/plots/bar.md exists."
else
    echo "Warning: The file docs/api/plots/bar.md does not exist."
fi

Length of output: 121

docs/analysis_modules.md (2)

100-143: Great addition of the Bar Plot section!

The new Bar Plot section is well-structured and informative. It provides a clear explanation of bar plots, their use cases in retail analytics, and includes a helpful example using the pyretailscience.plots.bar module. The content and formatting are consistent with other sections in the document.

🧰 Tools
🪛 Markdownlint

106-106: Expected: 120; Actual: 371
Line length

(MD013, line-length)


114-114: Expected: 120; Actual: 190
Line length

(MD013, line-length)


Line range hint 1-497: Overall document structure and consistency

The document maintains a consistent structure throughout, with each section following a similar format of description, use cases, and code examples. The addition of new sections enhances the comprehensiveness of the documentation for PyRetailScience's analysis modules.

However, there are several placeholder sections that need to be completed to fully realize the document's potential. Consider the following suggestions:

  1. Prioritize completing the placeholder sections to provide a comprehensive guide to all analysis modules.
  2. Ensure that all sections have relevant visualizations or diagrams to illustrate the concepts.
  3. Consider adding a table of contents at the beginning of the document for easy navigation.
  4. Review the completed sections for any potential updates or improvements based on recent developments in the PyRetailScience library.

To ensure consistency across all sections, run the following script:

python' docs/analysis_modules.md

Test: Check for any remaining placeholder content

rg --type markdown 'PASTE (TEXT|CODE) HERE' docs/analysis_modules.md

Test: Verify image paths

rg --type markdown '![Image title]((assets/images/analysis_modules/.*|https://placehold\.co/.*)\)' docs/analysis_modules.md


<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 Markdownlint</summary><blockquote>

106-106: Expected: 120; Actual: 371
Line length

(MD013, line-length)

---

114-114: Expected: 120; Actual: 190
Line length

(MD013, line-length)

</blockquote></details>

</details>

</blockquote></details>
<details>
<summary>pyretailscience/plots/bar.py (1)</summary><blockquote>

`1-38`: **Comprehensive Module Documentation**

The module docstring provides a clear and detailed explanation of the functionality, use cases, and limitations. It will be very helpful for users to understand how to use this module effectively.

</blockquote></details>
<details>
<summary>tests/plots/test_bar.py (1)</summary><blockquote>

`1-321`: **Comprehensive and well-structured test suite**

The test suite provides extensive coverage for the bar plot functionality, covering various scenarios and edge cases. The use of fixtures and mocks is appropriate, and the tests are well-organized.

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 Ruff</summary><blockquote>

14-14: Use `@pytest.fixture` over `@pytest.fixture()`

Remove parentheses

(PT001)

---

25-25: Use `@pytest.fixture` over `@pytest.fixture()`

Remove parentheses

(PT001)

---

31-31: Use `@pytest.fixture` over `@pytest.fixture()`

Remove parentheses

(PT001)

---

40-40: Use `@pytest.fixture` over `@pytest.fixture()`

Remove parentheses

(PT001)

</blockquote></details>

</details>

</blockquote></details>

</blockquote></details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Comment on lines 184 to 190
total_sum = df[value_col[0]].sum() # Precompute sum for percentage calculations
for container in ax.containers:
if data_label_format == "absolute":
labels = [f"{get_bar_value(v):.0f}" for v in container]
elif data_label_format == "percentage":
labels = [f"{(get_bar_value(v) / total_sum):.1%}" for v in container]
ax.bar_label(container, labels=labels, label_type="edge")
Copy link

@coderabbitai coderabbitai bot Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix percentage calculation for multiple 'value_col's

When using data_label_format='percentage' with multiple value_cols (grouped bars), the total sum used for percentage calculations currently only considers the first value column:

total_sum = df[value_col[0]].sum()

This could lead to incorrect percentage labels. To correctly calculate percentages for grouped bars, adjust the calculation to account for all value columns.

Suggested change:

Sum over all specified value_cols:

-    total_sum = df[value_col[0]].sum()
+    total_sum = df[value_col].sum().sum()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mrglglglglgl Does the data_label_format='percentage' work with multiple series? I don't think the AI suggestion is correct btw.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Comment on lines +98 to +102
valid_orientations = ["horizontal", "h", "vertical", "v"]
if orientation not in valid_orientations:
error_msg = f"Invalid orientation: {orientation}. Expected one of {valid_orientations}"
raise ValueError(error_msg)

Copy link

@coderabbitai coderabbitai bot Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Normalize 'orientation' parameter to handle different cases

Currently, the orientation parameter checks against a list of valid orientations in lowercase. If a user passes 'Vertical', 'V', 'Horizontal', or 'H' with different cases, it will result in a ValueError.

Consider normalizing the orientation input to lowercase to make the function more robust.

Suggested change:

+    orientation = orientation.lower()
     valid_orientations = ["horizontal", "h", "vertical", "v"]
     if orientation not in valid_orientations:
         error_msg = f"Invalid orientation: {orientation}. Expected one of {valid_orientations}"
         raise ValueError(error_msg)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
valid_orientations = ["horizontal", "h", "vertical", "v"]
if orientation not in valid_orientations:
error_msg = f"Invalid orientation: {orientation}. Expected one of {valid_orientations}"
raise ValueError(error_msg)
orientation = orientation.lower()
valid_orientations = ["horizontal", "h", "vertical", "v"]
if orientation not in valid_orientations:
error_msg = f"Invalid orientation: {orientation}. Expected one of {valid_orientations}"
raise ValueError(error_msg)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mrglglglglgl I'm a bit torn on these suggestions to make the parameters lower case. Seems like something for the future to be abstracted as a pattern. Or something that already has a standard solution.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Comment on lines +115 to +116
value_col = [value_col] if isinstance(value_col, str) else (["Value"] if value_col is None else value_col)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Simplify 'value_col' initialization for clarity

The current initialization of value_col can be made more readable by simplifying the conditional logic.

Suggested change:

-    value_col = [value_col] if isinstance(value_col, str) else (["Value"] if value_col is None else value_col)
+    if value_col is None:
+        value_col = ["Value"]
+    elif isinstance(value_col, str):
+        value_col = [value_col]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
value_col = [value_col] if isinstance(value_col, str) else (["Value"] if value_col is None else value_col)
if value_col is None:
value_col = ["Value"]
elif isinstance(value_col, str):
value_col = [value_col]

Comment on lines 40 to 41
from typing import Literal

Copy link

@coderabbitai coderabbitai bot Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Import 'Any' from 'typing' and correct type hints

The type hint dict[str, any] should use Any with a capital 'A'. Please import Any from the typing module to ensure correct type annotation.

Suggested change:

Add Any to the imports:

-from typing import Literal
+from typing import Literal, Any

And correct the type hint in the function signature:

-    **kwargs: dict[str, any],
+    **kwargs: dict[str, Any],

Committable suggestion was skipped due to low confidence.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mrglglglglgl I didn't realize this but it's correct! https://stackoverflow.com/a/72334424

Can you make that change please.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Comment on lines +104 to +107
valid_sort_orders = ["ascending", "descending", None]
if sort_order not in valid_sort_orders:
error_msg = f"Invalid sort_order: {sort_order}. Expected one of {valid_sort_orders}"
raise ValueError(error_msg)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Normalize 'sort_order' parameter to handle different cases

Similarly, normalizing the sort_order parameter to lowercase will prevent errors if users input values like 'Ascending' or 'DESCENDING'.

Suggested change:

     valid_sort_orders = ["ascending", "descending", None]
+    if sort_order is not None:
+        sort_order = sort_order.lower()
     if sort_order not in valid_sort_orders:
         error_msg = f"Invalid sort_order: {sort_order}. Expected one of {valid_sort_orders}"
         raise ValueError(error_msg)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
valid_sort_orders = ["ascending", "descending", None]
if sort_order not in valid_sort_orders:
error_msg = f"Invalid sort_order: {sort_order}. Expected one of {valid_sort_orders}"
raise ValueError(error_msg)
valid_sort_orders = ["ascending", "descending", None]
if sort_order is not None:
sort_order = sort_order.lower()
if sort_order not in valid_sort_orders:
error_msg = f"Invalid sort_order: {sort_order}. Expected one of {valid_sort_orders}"
raise ValueError(error_msg)

@pytest.mark.usefixtures("_mock_color_generators", "_mock_gu_functions")
def test_plot_single_bar(sample_dataframe):
"""Test the bar plot function with a single value column."""
_, ax = plt.subplots()
Copy link

@coderabbitai coderabbitai bot Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Close figures after tests to prevent resource warnings

In your test functions, you're creating figures but not closing them after the tests are done. This can lead to resource warnings or a buildup of open figures during testing.

Apply this diff to capture the figure object and close it after each test. For example:

 def test_plot_single_bar(sample_dataframe):
-    _, ax = plt.subplots()
+    fig, ax = plt.subplots()
     ...
+    plt.close(fig)

Also applies to: 69-69, 86-86, 105-105, 122-122, 140-140, 159-159, 176-176, 203-203, 265-265, 280-280, 295-295, 312-312

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mrglglglglgl I think you need to do this as we're not using plt.show anywhere.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

return pd.DataFrame(data)


@pytest.fixture()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Simplify the fixture decorator by removing parentheses

The @pytest.fixture decorator does not require parentheses when no parameters are provided. You can remove the parentheses for simplicity.

Apply this diff to remove the parentheses:

-@pytest.fixture()
+@pytest.fixture
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@pytest.fixture()
@pytest.fixture
🧰 Tools
🪛 Ruff

25-25: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)

from pyretailscience.style import graph_utils as gu


@pytest.fixture()
Copy link

@coderabbitai coderabbitai bot Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Simplify the fixture decorator by removing parentheses

The @pytest.fixture decorator does not require parentheses when no parameters are provided. You can remove the parentheses for simplicity.

Apply this diff to remove the parentheses:

-@pytest.fixture()
+@pytest.fixture
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@pytest.fixture()
@pytest.fixture
🧰 Tools
🪛 Ruff

14-14: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mrglglglglgl Can you make these changes please. I've done it elsewhere.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

return pd.Series([1000, 1500, 2000, 2500], index=["A", "B", "C", "D"])


@pytest.fixture()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Simplify the fixture decorator by removing parentheses

The @pytest.fixture decorator does not require parentheses when no parameters are provided. You can remove the parentheses for simplicity.

Apply this diff to remove the parentheses:

-@pytest.fixture()
+@pytest.fixture
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@pytest.fixture()
@pytest.fixture
🧰 Tools
🪛 Ruff

31-31: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)

mocker.patch("pyretailscience.style.tailwind.get_multi_color_cmap", return_value=multi_color_gen)


@pytest.fixture()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Simplify the fixture decorator by removing parentheses

The @pytest.fixture decorator does not require parentheses when no parameters are provided. You can remove the parentheses for simplicity.

Apply this diff to remove the parentheses:

-@pytest.fixture()
+@pytest.fixture
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@pytest.fixture()
@pytest.fixture
🧰 Tools
🪛 Ruff

40-40: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)

x_label: str | None = None,
y_label: str | None = None,
legend_title: str | None = None,
bar_width: float = 0.35,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you used 0.35 instead of 0.8? Also is there a benefit of adding a new parameter "bar_width" rather than just letting the user pass width and popping that from kwargs?

Comment on lines 40 to 41
from typing import Literal

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mrglglglglgl I didn't realize this but it's correct! https://stackoverflow.com/a/72334424

Can you make that change please.

Comment on lines 28 to 29
- **Pre-Aggregated Data Required**: The module does not perform any data aggregation, so all data must be pre-aggregated before
being passed in for plotting.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the last plot I don't think this is necessary.

Suggested change
- **Pre-Aggregated Data Required**: The module does not perform any data aggregation, so all data must be pre-aggregated before
being passed in for plotting.

Comment on lines 33 to 37
### Additional Features

- **Data Label Customization**: Show either absolute values or percentages as labels for the bars.
- **Legend Customization**: For multiple value columns, you can move the legend outside the plot.
- **Orientation**: Choose between vertical (`"v"`, `"vertical"`) and horizontal (`"h"`, `"horizontal"`) bar orientations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bullets in this section just provide a bit of extra detail on the core features. Can you please merge these bullets above where relevant

Comment on lines +98 to +102
valid_orientations = ["horizontal", "h", "vertical", "v"]
if orientation not in valid_orientations:
error_msg = f"Invalid orientation: {orientation}. Expected one of {valid_orientations}"
raise ValueError(error_msg)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mrglglglglgl I'm a bit torn on these suggestions to make the parameters lower case. Seems like something for the future to be abstracted as a pattern. Or something that already has a standard solution.

ax=ax,
color=[next(cmap) for _ in range(len(value_col))],
legend=(len(value_col) > 1),
width=bar_width,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment above about eliminating the bar_width parameter

Comment on lines 184 to 190
total_sum = df[value_col[0]].sum() # Precompute sum for percentage calculations
for container in ax.containers:
if data_label_format == "absolute":
labels = [f"{get_bar_value(v):.0f}" for v in container]
elif data_label_format == "percentage":
labels = [f"{(get_bar_value(v) / total_sum):.1%}" for v in container]
ax.bar_label(container, labels=labels, label_type="edge")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mrglglglglgl Does the data_label_format='percentage' work with multiple series? I don't think the AI suggestion is correct btw.

Comment on lines 180 to 190

def get_bar_value(v: Rectangle) -> float:
return v.get_height() if plot_kind == "bar" else v.get_width()

total_sum = df[value_col[0]].sum() # Precompute sum for percentage calculations
for container in ax.containers:
if data_label_format == "absolute":
labels = [f"{get_bar_value(v):.0f}" for v in container]
elif data_label_format == "percentage":
labels = [f"{(get_bar_value(v) / total_sum):.1%}" for v in container]
ax.bar_label(container, labels=labels, label_type="edge")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mrglglglglgl In the case total_sum == 0 and data_label_format == "percentage", I think the labels should be blank and there warning.

from pyretailscience.style import graph_utils as gu


@pytest.fixture()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mrglglglglgl Can you make these changes please. I've done it elsewhere.

@pytest.mark.usefixtures("_mock_color_generators", "_mock_gu_functions")
def test_plot_single_bar(sample_dataframe):
"""Test the bar plot function with a single value column."""
_, ax = plt.subplots()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mrglglglglgl I think you need to do this as we're not using plt.show anywhere.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (5)
tests/plots/test_bar.py (5)

14-14: Simplify pytest.fixture decorators

Remove the parentheses from the @pytest.fixture decorators as they are unnecessary when no parameters are provided. This change applies to lines 14, 25, 31, and 40.

Apply this diff to simplify the decorators:

-@pytest.fixture()
+@pytest.fixture

This change improves code readability and follows pytest best practices.

Also applies to: 25-25, 31-31, 40-40

🧰 Tools
🪛 Ruff

14-14: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)


64-77: Close figure after test to prevent resource warnings

In the test_plot_single_bar function, the figure is created but not closed after the test. This can lead to resource warnings or a buildup of open figures during testing.

Consider modifying the test to capture the figure object and close it after assertions:

 def test_plot_single_bar(sample_dataframe):
     """Test the bar plot function with a single value column."""
+    fig, ax = plt.subplots()
     result_ax = bar.plot(
         df=sample_dataframe,
         value_col="sales_q1",
         x_col="product",
         title="Test Single Bar Plot",
+        ax=ax
     )
 
     expected_num_patches = 4
 
     assert isinstance(result_ax, Axes)
     assert len(result_ax.patches) == expected_num_patches
+    plt.close(fig)

This change ensures proper cleanup of resources after each test.


249-269: Enhance sorting verification in test_plot_with_sorting

The test_plot_with_sorting function verifies that the first bar corresponds to the lowest value, but it doesn't fully check if the entire plot is correctly sorted. Consider adding the following assertions to ensure complete sorting:

     expected_order = ["A", "B", "C", "D"]
     actual_order = [label.get_text() for label in result_ax.get_xticklabels()]
     bar_heights = [patch.get_height() for patch in result_ax.patches]
     expected_num_patches = 4
 
     assert isinstance(result_ax, Axes)
     assert len(result_ax.patches) == expected_num_patches
-    assert result_ax.get_xticklabels()[0].get_text() == "A"
+    assert actual_order == expected_order
+    assert bar_heights == sorted(bar_heights)

These additional assertions will ensure that the entire plot is correctly sorted, not just the first bar.


423-423: Add test case for empty value column

Consider adding a test case to verify the behavior when the value column is empty (contains all NaN or null values). This edge case is not currently covered in the test suite.

Here's a suggested implementation:

@pytest.mark.usefixtures("_mock_color_generators", "_mock_gu_functions")
def test_plot_with_empty_value_column(sample_dataframe):
    """Test bar plot when the value column is empty (all NaN)."""
    sample_dataframe['empty_column'] = pd.NA
    
    with pytest.raises(ValueError, match="Cannot plot with empty value column"):
        bar.plot(
            df=sample_dataframe,
            value_col="empty_column",
            x_col="product",
            title="Test Plot with Empty Value Column",
        )

This test will ensure that the bar.plot function handles empty value columns gracefully by raising an appropriate exception.


1-1: Add module-level docstring for test suite overview

Consider adding a module-level docstring at the beginning of the file to provide an overview of the test suite. This will help developers understand the purpose and structure of the tests at a glance.

Add the following docstring at the beginning of the file:

"""
Unit tests for the bar plot module in pyretailscience.plots.bar.

This test suite covers various aspects of the bar plot functionality, including:
- Basic plotting scenarios (single bar, multiple bars)
- Handling of different data types and structures (DataFrame, Series)
- Customization options (orientation, width, sorting, data labels)
- Error handling and edge cases
- Mocking of color generators and graph utility functions

Each test case is designed to verify a specific aspect of the bar plot functionality,
ensuring robustness and correctness of the implementation.
"""

This docstring provides a high-level overview of the test suite, making it easier for developers to understand its scope and purpose.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 4dbc7b3 and dc1edbf.

📒 Files selected for processing (1)
  • tests/plots/test_bar.py (1 hunks)
🧰 Additional context used
🪛 Ruff
tests/plots/test_bar.py

14-14: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)


25-25: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)


31-31: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)


40-40: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)

🔇 Additional comments (1)
tests/plots/test_bar.py (1)

1-479: Excellent test coverage and structure for bar plot module

The test suite for the bar plot module is comprehensive, well-structured, and covers a wide range of scenarios. It includes tests for basic functionality, advanced features, error handling, and edge cases. The use of fixtures and mocking is appropriate and helps isolate the tests from external dependencies.

A few minor improvements have been suggested:

  1. Simplifying pytest.fixture decorators
  2. Closing figures after tests to prevent resource warnings
  3. Enhancing sorting verification in the test_plot_with_sorting function
  4. Adding a test case for an empty value column
  5. Including a module-level docstring for better documentation

Overall, this test suite provides robust coverage for the bar plot module and will help ensure its reliability and correctness.

🧰 Tools
🪛 Ruff

14-14: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)


25-25: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)


31-31: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)


40-40: Use @pytest.fixture over @pytest.fixture()

Remove parentheses

(PT001)

Copy link
Contributor

qodo-merge-pro bot commented Feb 2, 2025

CI Feedback 🧐

(Feedback updated until commit 41d85d3)

A test triggered by this PR failed. Here is an AI-generated analysis of the failure:

Action: Pre-Commit

Failed stage: Run Pre-commit [❌]

Failed test name: trim trailing whitespace

Failure summary:

The action failed because the pre-commit hook 'trim trailing whitespace' detected and modified files
with trailing whitespace. Specifically:

  • The file docs/assets/images/analysis_modules/plots/histogram_plot.svg was modified to remove
    trailing whitespace
  • The pre-commit hooks are configured to fail when they need to modify files, requiring the changes to
    be committed first

  • Relevant error logs:
    1:  ##[group]Operating System
    2:  Ubuntu
    ...
    
    485:  [INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks.
    486:  [INFO] Once installed this environment will be reused.
    487:  [INFO] This may take a few minutes...
    488:  [INFO] Installing environment for https://github.com/kynan/nbstripout.
    489:  [INFO] Once installed this environment will be reused.
    490:  [INFO] This may take a few minutes...
    491:  ruff.....................................................................Passed
    492:  ruff-format..............................................................Passed
    493:  trim trailing whitespace.................................................Failed
    ...
    
    496:  - files were modified by this hook
    497:  Fixing docs/assets/images/analysis_modules/plots/histogram_plot.svg
    498:  fix end of files.........................................................Passed
    499:  fix python encoding pragma...............................................Passed
    500:  check yaml...............................................................Passed
    501:  debug statements (python)................................................Passed
    502:  pytest...................................................................Passed
    503:  nbstripout...............................................................Passed
    504:  ##[error]Process completed with exit code 1.
    

    @mvanwyk mvanwyk self-assigned this Feb 2, 2025
    @mvanwyk mvanwyk merged commit df44318 into main Feb 2, 2025
    4 checks passed
    @mvanwyk mvanwyk deleted the 78-standard-bar-plot branch February 2, 2025 11:47
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    None yet
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    Standard Bar Plot
    2 participants