Skip to content

Settings: Add settings comparison utility#421

Merged
amotl merged 12 commits intomainfrom
settings-compare
Apr 25, 2025
Merged

Settings: Add settings comparison utility#421
amotl merged 12 commits intomainfrom
settings-compare

Conversation

@amotl
Copy link
Copy Markdown
Member

@amotl amotl commented Apr 25, 2025

About

Accompanying the settings extractor with a comparison tool.

Documentation

https://cratedb-toolkit--421.org.readthedocs.build/util/settings.html

References

Screenshots

image

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 25, 2025

Walkthrough

This update introduces a new subsystem for CrateDB Toolkit focused on cluster settings management. It adds a CLI group for settings-related commands, including listing and comparing current cluster settings against documented defaults. The comparison utility handles memory and time settings with configurable tolerances and provides color-coded, grouped output. Supporting changes include new database adapter methods for retrieving cluster settings and heap size, as well as comprehensive tests and documentation. Related tests for the functions CLI command are moved to a more appropriate location. The documentation and dependency configuration are updated to reflect the new settings functionality.

Changes

File(s) Change Summary
cratedb_toolkit/settings/compare.py New module for comparing cluster runtime settings to documented defaults, with utilities for flattening dicts, parsing memory/time values, normalization, and CLI command for comparison.
cratedb_toolkit/settings/cli.py New CLI group for settings commands, with options for verbosity, debug, and subcommands for comparing and listing settings.
cratedb_toolkit/cli.py Registers the new settings CLI group under the main CLI.
cratedb_toolkit/util/database.py Adds get_settings and get_heap_size methods to DatabaseAdapter.
cratedb_toolkit/docs/cli.py Refactors import in the settings command from relative to absolute; minor docstring rewording.
tests/docs/test_functions.py Adds new tests for the ctk docs functions CLI command, checking JSON and Markdown output.
tests/docs/test_settings.py Removes tests for the functions CLI command (now moved to test_functions.py).
tests/settings/test_cli.py New tests for the settings CLI group, verifying list and compare commands and their outputs/logs.
CHANGES.md Adds changelog entry for the new settings comparison utility.
pyproject.toml Adds a new optional dependency group [settings] including cratedb-toolkit[docs-api].
doc/util/settings.md New documentation page describing the settings subsystem, installation, and usage.
doc/util/index.md Adds the "settings" entry to the utilities documentation index.
doc/docs-api.md Hyphenates "runtime-configurable" in documentation.
doc/index.md Updates diagnostics section to reference util/index instead of cmd/index.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant SettingsCLI
    participant DatabaseAdapter
    participant DocsSettingsExtractor

    User->>CLI: ctk settings compare
    CLI->>SettingsCLI: Invoke compare command
    SettingsCLI->>DatabaseAdapter: get_heap_size()
    DatabaseAdapter-->>SettingsCLI: heap size
    SettingsCLI->>DatabaseAdapter: get_settings()
    DatabaseAdapter-->>SettingsCLI: cluster settings
    SettingsCLI->>DocsSettingsExtractor: extract default settings
    DocsSettingsExtractor-->>SettingsCLI: documented defaults
    SettingsCLI->>SettingsCLI: Compare current vs. default settings
    SettingsCLI-->>User: Print grouped, color-coded comparison report
Loading

Possibly related PRs

  • crate/cratedb-toolkit#398: Adds foundational settings extraction functionality, which the new comparison utility builds upon.
  • crate/cratedb-toolkit#399: Introduces the CLI command group for docs, directly related to the CLI integration in this update.
  • crate/cratedb-toolkit#400: Refactors settings extraction logic and adds YAML support, which is extended by the current settings subsystem.

Suggested reviewers

  • WalBeh

Poem

In the warren of settings deep,
A toolkit rabbit takes a leap—
Now it lists and compares with flair,
Colors and docs are everywhere!
With bytes and millis, it’s precise,
Cluster configs checked not once, but twice.
🐇 Hooray for settings, neat and bright—
The toolkit’s future’s looking right!

✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
tests/docs/test_functions.py (1)

30-48: Minor: assert more than success & one key

Right now the test passes as long as one specific function exists. To increase confidence, also validate that the resulting file is non-empty and well-formed, e.g. by checking jsonschema or that at least N functions are present.

cratedb_toolkit/docs/settings/compare.py (3)

70-107: to_bytes() misses some real-world units & edge cases

  1. CrateDB settings occasionally use bytes, kib, mib, etc. These are not recognised.
  2. Values like 0 (without a unit) pass through match but the resulting unit is "b", which is not in multipliers, so the function falls back to int(number) – fine – but the intent deserves a comment.
  3. Consider early-returning 0 instead of None when the string literally equals "0" to make downstream code simpler.

No immediate bug, yet adding the units will avoid silent mis-comparisons.


461-462: Define colour constants once

PURPLE is re-defined for every iteration of the outer loop. Define it next to the other ANSI constants to avoid tiny runtime overhead and keep the colour palette in one place.


32-55: Parsing multiple JSON objects by counting [/] is fragile

parse_multiple_json_objects() assumes top-level JSON arrays and will break if the file contains:

  • Nested arrays whose ] closes before an outer ].
  • Top-level objects ({}) instead of arrays.

Using a streaming parser (ijson.items() or json.loads('[' + ','.join(...) + ']')) would be more robust and simpler to maintain.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 150fbc8 and 6818a47.

📒 Files selected for processing (4)
  • cratedb_toolkit/docs/cli.py (1 hunks)
  • cratedb_toolkit/docs/settings/compare.py (1 hunks)
  • tests/docs/test_functions.py (1 hunks)
  • tests/docs/test_settings.py (0 hunks)
💤 Files with no reviewable changes (1)
  • tests/docs/test_settings.py
🧰 Additional context used
🧬 Code Graph Analysis (2)
cratedb_toolkit/docs/cli.py (1)
cratedb_toolkit/docs/settings/extract.py (1)
  • SettingsExtractor (673-728)
tests/docs/test_functions.py (1)
cratedb_toolkit/docs/cli.py (1)
  • cli (15-19)
⏰ Context from checks skipped due to timeout of 90000ms (11)
  • GitHub Check: Kinesis: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Kinesis: Python 3.9 on OS ubuntu-latest
  • GitHub Check: CFR: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.9 on OS ubuntu-latest
  • GitHub Check: CFR for OS windows-latest
  • GitHub Check: CFR for OS ubuntu-latest
  • GitHub Check: Generic: Python 3.8 on OS ubuntu-latest
  • GitHub Check: build-and-test
  • GitHub Check: CFR for OS macos-latest
  • GitHub Check: CFR for OS macos-13
🔇 Additional comments (1)
cratedb_toolkit/docs/cli.py (1)

104-104: Confirm import path is available after installation

The switch from a relative to an absolute import is fine, but please double-check that cratedb_toolkit.docs.settings.extract is included in pyproject.toml/setup.py so that it is actually shipped with the wheel or sdist; otherwise this CLI entry-point will raise an ImportError once the package is installed.

Comment thread tests/docs/test_functions.py
Comment thread cratedb_toolkit/docs/settings/compare.py
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
cratedb_toolkit/settings/cli.py (1)

13-24: Consider validating the return value of boot_click

boot_click() is invoked and its return value passed straight back to Click.
If boot_click() ever returns None (as the snippet in util/cli.py suggests) this is fine—Click ignores the value—but if a future refactor makes it return something else (e.g. a configured logger) Click will silently swallow it.

A tiny defensive tweak keeps behaviour explicit:

-    return boot_click(ctx, verbose, debug)
+    boot_click(ctx, verbose, debug)  # Ensure side-effects only

Not urgent, just a small robustness improvement.

🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 21-21: cratedb_toolkit/settings/cli.py#L21
Added line #L21 was not covered by tests

cratedb_toolkit/settings/compare.py (1)

62-99: Unit parsing misses common variants & explicit ‘B’ multiplier

to_bytes() overlooks several real-world inputs:

  • IEC units (MiB, GiB) and upper-case MB / GB
  • The plain “b” case is handled by falling through but could be explicit for clarity
  • Values like 8192 (no unit) are treated as bytes, good—consider documenting that

A quick win:

 multipliers = {
-    "kb": 1024,
-    "mb": 1024 * 1024,
-    "gb": 1024 * 1024 * 1024,
-    "tb": 1024 * 1024 * 1024 * 1024,
+    "b": 1,
+    "kb": 1000,
+    "kib": 1024,
+    "mb": 1000**2,
+    "mib": 1024**2,
+    "gb": 1000**3,
+    "gib": 1024**3,
+    "tb": 1000**4,
+    "tib": 1024**4,
 }

…and extend the regex with [kmgt]i?b to catch IEC forms.

🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 64-65: cratedb_toolkit/settings/compare.py#L64-L65
Added lines #L64 - L65 were not covered by tests


[warning] 67-67: cratedb_toolkit/settings/compare.py#L67
Added line #L67 was not covered by tests


[warning] 70-72: cratedb_toolkit/settings/compare.py#L70-L72
Added lines #L70 - L72 were not covered by tests


[warning] 74-78: cratedb_toolkit/settings/compare.py#L74-L78
Added lines #L74 - L78 were not covered by tests


[warning] 81-81: cratedb_toolkit/settings/compare.py#L81
Added line #L81 was not covered by tests


[warning] 89-91: cratedb_toolkit/settings/compare.py#L89-L91
Added lines #L89 - L91 were not covered by tests


[warning] 93-94: cratedb_toolkit/settings/compare.py#L93-L94
Added lines #L93 - L94 were not covered by tests


[warning] 96-98: cratedb_toolkit/settings/compare.py#L96-L98
Added lines #L96 - L98 were not covered by tests

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6818a47 and 23d75b5.

📒 Files selected for processing (4)
  • cratedb_toolkit/cli.py (2 hunks)
  • cratedb_toolkit/docs/cli.py (1 hunks)
  • cratedb_toolkit/settings/cli.py (1 hunks)
  • cratedb_toolkit/settings/compare.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • cratedb_toolkit/cli.py
🧰 Additional context used
🧬 Code Graph Analysis (2)
cratedb_toolkit/settings/cli.py (2)
cratedb_toolkit/util/cli.py (1)
  • boot_click (16-27)
cratedb_toolkit/settings/compare.py (1)
  • compare_cluster_settings (354-466)
cratedb_toolkit/docs/cli.py (1)
cratedb_toolkit/docs/settings.py (1)
  • SettingsExtractor (673-728)
🪛 GitHub Check: codecov/patch
cratedb_toolkit/settings/cli.py

[warning] 21-21: cratedb_toolkit/settings/cli.py#L21
Added line #L21 was not covered by tests

cratedb_toolkit/settings/compare.py

[warning] 26-27: cratedb_toolkit/settings/compare.py#L26-L27
Added lines #L26 - L27 were not covered by tests


[warning] 29-31: cratedb_toolkit/settings/compare.py#L29-L31
Added lines #L29 - L31 were not covered by tests


[warning] 33-45: cratedb_toolkit/settings/compare.py#L33-L45
Added lines #L33 - L45 were not covered by tests


[warning] 47-47: cratedb_toolkit/settings/compare.py#L47
Added line #L47 was not covered by tests


[warning] 52-56: cratedb_toolkit/settings/compare.py#L52-L56
Added lines #L52 - L56 were not covered by tests


[warning] 58-59: cratedb_toolkit/settings/compare.py#L58-L59
Added lines #L58 - L59 were not covered by tests


[warning] 64-65: cratedb_toolkit/settings/compare.py#L64-L65
Added lines #L64 - L65 were not covered by tests


[warning] 67-67: cratedb_toolkit/settings/compare.py#L67
Added line #L67 was not covered by tests


[warning] 70-72: cratedb_toolkit/settings/compare.py#L70-L72
Added lines #L70 - L72 were not covered by tests


[warning] 74-78: cratedb_toolkit/settings/compare.py#L74-L78
Added lines #L74 - L78 were not covered by tests


[warning] 81-81: cratedb_toolkit/settings/compare.py#L81
Added line #L81 was not covered by tests


[warning] 89-91: cratedb_toolkit/settings/compare.py#L89-L91
Added lines #L89 - L91 were not covered by tests


[warning] 93-94: cratedb_toolkit/settings/compare.py#L93-L94
Added lines #L93 - L94 were not covered by tests


[warning] 96-98: cratedb_toolkit/settings/compare.py#L96-L98
Added lines #L96 - L98 were not covered by tests


[warning] 103-104: cratedb_toolkit/settings/compare.py#L103-L104
Added lines #L103 - L104 were not covered by tests


[warning] 106-106: cratedb_toolkit/settings/compare.py#L106
Added line #L106 was not covered by tests


[warning] 109-111: cratedb_toolkit/settings/compare.py#L109-L111
Added lines #L109 - L111 were not covered by tests


[warning] 113-114: cratedb_toolkit/settings/compare.py#L113-L114
Added lines #L113 - L114 were not covered by tests


[warning] 117-117: cratedb_toolkit/settings/compare.py#L117
Added line #L117 was not covered by tests


[warning] 125-127: cratedb_toolkit/settings/compare.py#L125-L127
Added lines #L125 - L127 were not covered by tests


[warning] 132-139: cratedb_toolkit/settings/compare.py#L132-L139
Added lines #L132 - L139 were not covered by tests


[warning] 141-141: cratedb_toolkit/settings/compare.py#L141
Added line #L141 was not covered by tests


[warning] 146-153: cratedb_toolkit/settings/compare.py#L146-L153
Added lines #L146 - L153 were not covered by tests


[warning] 155-156: cratedb_toolkit/settings/compare.py#L155-L156
Added lines #L155 - L156 were not covered by tests


[warning] 159-159: cratedb_toolkit/settings/compare.py#L159
Added line #L159 was not covered by tests


[warning] 162-164: cratedb_toolkit/settings/compare.py#L162-L164
Added lines #L162 - L164 were not covered by tests


[warning] 169-170: cratedb_toolkit/settings/compare.py#L169-L170
Added lines #L169 - L170 were not covered by tests


[warning] 172-173: cratedb_toolkit/settings/compare.py#L172-L173
Added lines #L172 - L173 were not covered by tests

cratedb_toolkit/docs/cli.py

[warning] 104-104: cratedb_toolkit/docs/cli.py#L104
Added line #L104 was not covered by tests

⏰ Context from checks skipped due to timeout of 90000ms (10)
  • GitHub Check: Generic: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Kinesis: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.9 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.8 on OS ubuntu-latest
  • GitHub Check: CFR for OS windows-latest
  • GitHub Check: build-and-test
  • GitHub Check: Kinesis: Python 3.9 on OS ubuntu-latest
  • GitHub Check: CFR for OS ubuntu-latest
  • GitHub Check: CFR for OS macos-latest
  • GitHub Check: CFR for OS macos-13
🔇 Additional comments (3)
cratedb_toolkit/docs/cli.py (1)

100-106: Absolute import improves clarity but slightly reduces package-relative flexibility

Switching from a relative import (from .settings import …) to an absolute package import (from cratedb_toolkit.docs.settings import …) is perfectly valid inside the monorepo and avoids ambiguity when the module is executed as a script.
Be aware, however, that absolute imports make local development a bit trickier when the package is not installed in editable (pip install -e .) mode. If contributors sometimes run python -m cratedb_toolkit.docs.cli … from the source tree, the absolute path still works, so nothing is broken—just worth keeping in mind.

No action required, just informational.

🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 104-104: cratedb_toolkit/docs/cli.py#L104
Added line #L104 was not covered by tests

cratedb_toolkit/settings/cli.py (1)

24-24: Registering the command object directly is 👍

Adding the already-decorated compare_cluster_settings click command via add_command() is the simplest, duplication-free way to expose it—nice!

cratedb_toolkit/settings/compare.py (1)

374-380: Silent acceptance of zero/negative heap size disguises mistakes

When heap_size_bytes is provided but ≤ 0 the subsequent percentage math still runs (or divides by zero as noted above).
Fail fast with a clear message:

-if heap_size_bytes:
+if heap_size_bytes and heap_size_bytes > 0:
     …
 else:
-    print(f"{YELLOW}No heap size provided{RESET}")
+    print(f"{RED}Invalid or missing heap size (--heap-size-bytes must be > 0){RESET}")
+    return

Improves UX and prevents misleading reports.

Comment thread cratedb_toolkit/settings/compare.py Outdated
Comment thread cratedb_toolkit/settings/compare.py
Comment thread cratedb_toolkit/settings/compare.py Outdated
@amotl amotl force-pushed the settings-compare branch from 23d75b5 to f765393 Compare April 25, 2025 15:52
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
cratedb_toolkit/settings/compare.py (2)

253-258: find_cluster_settings() only handles the [{"settings": …}] shape

If the result ever comes back as a bare object ({"settings": …}) or the array contains multiple rows, this helper silently returns None, leading to confusing errors downstream.
See the defensive version proposed in the earlier review comment.


24-47: 🛠️ Refactor suggestion

Parser is still brittle – see earlier review

This manual “square-bracket depth” scanner…

  • ignores { … } root objects,
  • breaks on [ or ] that appear inside quoted strings,
  • loads the complete file into memory,

exactly as pointed out in the previous review. Please consider the streaming json.JSONDecoder approach suggested earlier for robustness and scalability.

🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 26-27: cratedb_toolkit/settings/compare.py#L26-L27
Added lines #L26 - L27 were not covered by tests


[warning] 29-31: cratedb_toolkit/settings/compare.py#L29-L31
Added lines #L29 - L31 were not covered by tests


[warning] 33-45: cratedb_toolkit/settings/compare.py#L33-L45
Added lines #L33 - L45 were not covered by tests


[warning] 47-47: cratedb_toolkit/settings/compare.py#L47
Added line #L47 was not covered by tests

🧹 Nitpick comments (1)
cratedb_toolkit/settings/compare.py (1)

224-227: Edge-case: threshold boundary is excluded

threshold_percent intends to separate “large” from “small” settings, but the comparison uses >:

tolerance = tolerance_percent_large if default_percent > threshold_percent else tolerance_percent_small

A default exactly at the boundary (e.g. 20% when threshold_percent=20) is currently treated as “small”, which might be surprising. Consider changing to >=:

-    tolerance = tolerance_percent_large if default_percent > threshold_percent else tolerance_percent_small
+    tolerance = tolerance_percent_large if default_percent >= threshold_percent else tolerance_percent_small

This makes the rule inclusive and removes the fuzzy off-by-one behaviour.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 23d75b5 and f765393.

📒 Files selected for processing (1)
  • cratedb_toolkit/settings/compare.py (1 hunks)
🧰 Additional context used
🪛 GitHub Check: codecov/patch
cratedb_toolkit/settings/compare.py

[warning] 26-27: cratedb_toolkit/settings/compare.py#L26-L27
Added lines #L26 - L27 were not covered by tests


[warning] 29-31: cratedb_toolkit/settings/compare.py#L29-L31
Added lines #L29 - L31 were not covered by tests


[warning] 33-45: cratedb_toolkit/settings/compare.py#L33-L45
Added lines #L33 - L45 were not covered by tests


[warning] 47-47: cratedb_toolkit/settings/compare.py#L47
Added line #L47 was not covered by tests


[warning] 52-56: cratedb_toolkit/settings/compare.py#L52-L56
Added lines #L52 - L56 were not covered by tests


[warning] 58-59: cratedb_toolkit/settings/compare.py#L58-L59
Added lines #L58 - L59 were not covered by tests


[warning] 64-65: cratedb_toolkit/settings/compare.py#L64-L65
Added lines #L64 - L65 were not covered by tests


[warning] 67-67: cratedb_toolkit/settings/compare.py#L67
Added line #L67 was not covered by tests


[warning] 70-72: cratedb_toolkit/settings/compare.py#L70-L72
Added lines #L70 - L72 were not covered by tests


[warning] 74-78: cratedb_toolkit/settings/compare.py#L74-L78
Added lines #L74 - L78 were not covered by tests


[warning] 81-81: cratedb_toolkit/settings/compare.py#L81
Added line #L81 was not covered by tests


[warning] 89-91: cratedb_toolkit/settings/compare.py#L89-L91
Added lines #L89 - L91 were not covered by tests


[warning] 93-94: cratedb_toolkit/settings/compare.py#L93-L94
Added lines #L93 - L94 were not covered by tests


[warning] 96-98: cratedb_toolkit/settings/compare.py#L96-L98
Added lines #L96 - L98 were not covered by tests


[warning] 103-104: cratedb_toolkit/settings/compare.py#L103-L104
Added lines #L103 - L104 were not covered by tests


[warning] 106-106: cratedb_toolkit/settings/compare.py#L106
Added line #L106 was not covered by tests


[warning] 109-111: cratedb_toolkit/settings/compare.py#L109-L111
Added lines #L109 - L111 were not covered by tests


[warning] 113-114: cratedb_toolkit/settings/compare.py#L113-L114
Added lines #L113 - L114 were not covered by tests


[warning] 117-117: cratedb_toolkit/settings/compare.py#L117
Added line #L117 was not covered by tests


[warning] 125-127: cratedb_toolkit/settings/compare.py#L125-L127
Added lines #L125 - L127 were not covered by tests


[warning] 132-139: cratedb_toolkit/settings/compare.py#L132-L139
Added lines #L132 - L139 were not covered by tests


[warning] 141-141: cratedb_toolkit/settings/compare.py#L141
Added line #L141 was not covered by tests


[warning] 146-153: cratedb_toolkit/settings/compare.py#L146-L153
Added lines #L146 - L153 were not covered by tests


[warning] 155-156: cratedb_toolkit/settings/compare.py#L155-L156
Added lines #L155 - L156 were not covered by tests


[warning] 159-159: cratedb_toolkit/settings/compare.py#L159
Added line #L159 was not covered by tests


[warning] 162-164: cratedb_toolkit/settings/compare.py#L162-L164
Added lines #L162 - L164 were not covered by tests


[warning] 169-170: cratedb_toolkit/settings/compare.py#L169-L170
Added lines #L169 - L170 were not covered by tests


[warning] 172-173: cratedb_toolkit/settings/compare.py#L172-L173
Added lines #L172 - L173 were not covered by tests

⏰ Context from checks skipped due to timeout of 90000ms (10)
  • GitHub Check: Generic: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Kinesis: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.9 on OS ubuntu-latest
  • GitHub Check: Kinesis: Python 3.9 on OS ubuntu-latest
  • GitHub Check: CFR for OS windows-latest
  • GitHub Check: CFR for OS ubuntu-latest
  • GitHub Check: Generic: Python 3.8 on OS ubuntu-latest
  • GitHub Check: build-and-test
  • GitHub Check: CFR for OS macos-latest
  • GitHub Check: CFR for OS macos-13

Comment thread cratedb_toolkit/settings/compare.py Outdated
@amotl amotl force-pushed the settings-compare branch from f765393 to 969f78e Compare April 25, 2025 16:08
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
cratedb_toolkit/settings/compare.py (3)

29-52: Refactor needed for JSON parser to handle edge cases

The current implementation of parse_multiple_json_objects() has several limitations:

  1. It only handles arrays at the root level, ignoring objects like {...}
  2. It will break if brackets appear inside string literals
  3. It loads the entire file into memory, which could be problematic for large files

258-263: Improve defensive handling of cluster settings extraction

The current implementation only handles a specific JSON structure format [{"settings": ...}] and would silently return None for other valid structures.


365-375: ⚠️ Potential issue

Fix UnboundLocalError with color constants

Re-assigning global color constants within the function causes them to be treated as local variables for the entire function body, leading to an UnboundLocalError when they're referenced before assignment.

Instead of modifying global variables, create local aliases:

-    # ANSI color codes
-    if no_color:
-        HEADER = ""
-        BOLD = ""
-        GREEN = ""
-        YELLOW = ""
-        RED = ""
-        BLUE = ""
-        RESET = ""
+    # ANSI color aliases - use local variables to avoid shadowing globals
+    if no_color:
+        header = bold = green = yellow = red = blue = purple = reset = ""
+    else:
+        header, bold, green, yellow, red, blue, purple, reset = (
+            HEADER, BOLD, GREEN, YELLOW, RED, BLUE, PURPLE, RESET
+        )

Then update all subsequent references to use these local aliases rather than the global constants.

🧹 Nitpick comments (4)
cratedb_toolkit/settings/compare.py (4)

229-229: Consider using >= instead of > for threshold comparison

The current threshold comparison only applies the large tolerance when default_percent > threshold_percent, which misses the exact boundary case.

-    tolerance = tolerance_percent_large if default_percent > threshold_percent else tolerance_percent_small
+    tolerance = tolerance_percent_large if default_percent >= threshold_percent else tolerance_percent_small

445-447: Improve memory-related setting detection

The current approach uses a simple string matching to detect memory-related settings, which might incorrectly identify settings that happen to contain these substrings but aren't actually memory-related.

Consider using more precise patterns or a predefined list of known memory-related settings:

-        elif heap_size_bytes and any(
-            mem_term in setting_key for mem_term in ["memory", "heap", "breaker", "limit", "size"]
-        ):
+        elif heap_size_bytes and any(
+            mem_term in setting_key.lower() for mem_term in ["memory", "heap", "breaker"]
+        ) or any(
+            f".{term}" in setting_key.lower() for term in ["limit", "size"]
+        ):

This avoids false positives from settings that happen to contain these terms as substrings in unrelated contexts.


298-301: Avoid repeated string operations in the reporting loop

The code replaces text in the full_line string, which is inefficient when done repeatedly in a loop.

-                # Add yellow color to the default value part
-                full_line = full_line.replace(" (default: ", f" (default: {YELLOW}").replace(")", f"{RESET})")
+                # Format with colors directly
+                default_start = full_line.find(" (default: ")
+                if default_start != -1:
+                    full_line = (
+                        full_line[:default_start] + 
+                        f" (default: {YELLOW}" + 
+                        full_line[default_start + 11:-1] + 
+                        f"{RESET})"
+                    )

419-426: Consider logging skipped settings for better debugging

When settings are skipped (either because they're not in default_settings or have no default value), this information isn't logged, which might make debugging difficult.

         if setting_key not in default_settings:
+            if logger.isEnabledFor(logging.DEBUG):
+                logger.debug(f"Skipping setting {setting_key} - not found in default settings")
             continue

         # Get default value
         default_value = default_settings[setting_key].get("default_value", "")
         if not default_value:
+            if logger.isEnabledFor(logging.DEBUG):
+                logger.debug(f"Skipping setting {setting_key} - no default value")
             continue
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f765393 and 969f78e.

📒 Files selected for processing (1)
  • cratedb_toolkit/settings/compare.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (10)
  • GitHub Check: Kinesis: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.9 on OS ubuntu-latest
  • GitHub Check: CFR for OS windows-latest
  • GitHub Check: Kinesis: Python 3.9 on OS ubuntu-latest
  • GitHub Check: CFR for OS ubuntu-latest
  • GitHub Check: Generic: Python 3.8 on OS ubuntu-latest
  • GitHub Check: CFR for OS macos-latest
  • GitHub Check: build-and-test
  • GitHub Check: CFR for OS macos-13
🔇 Additional comments (5)
cratedb_toolkit/settings/compare.py (5)

106-133: LGTM: Time conversion implementation looks good

The to_milliseconds function correctly handles various time units and provides a robust conversion to milliseconds.


16-24: LGTM: Good module imports organization

The imports are well-organized, with standard library imports first, followed by third-party libraries, and finally project-specific imports.


328-361: LGTM: Well-designed CLI interface with Click

The command definition is well-structured with appropriate arguments and options, including clear help messages and sensible defaults.


429-442: LGTM: Good time-related setting detection

The approach for identifying time-related settings is comprehensive and covers common patterns found in configuration parameters.


205-256: ⚠️ Potential issue

Fix potential division-by-zero error

The compare_memory_settings function checks if heap_size_bytes is not provided at line 221, but it doesn't guard against zero values when performing division at lines 225-226.

-    if current_bytes is None or default_bytes is None:
+    if current_bytes is None or default_bytes is None or not heap_size_bytes:
         return None

-    if not heap_size_bytes:
+    if heap_size_bytes <= 0:
         raise ValueError("Heap size must be provided to compare memory settings.")

Likely an incorrect or invalid review comment.

Comment thread cratedb_toolkit/settings/compare.py
@amotl amotl force-pushed the settings-compare branch from 969f78e to d15074e Compare April 25, 2025 16:21
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (5)
cratedb_toolkit/settings/compare.py (5)

256-261: 🛠️ Refactor suggestion

Improve find_cluster_settings() to handle diverse JSON structures

The current implementation only handles the [{"settings": ...}] shape. If the query output changes to a bare object or the array contains multiple rows, this helper will silently return None.

 def find_cluster_settings(json_objects):
     """Extract cluster settings from parsed JSON objects."""
-    for obj in json_objects:
-        if isinstance(obj, list) and len(obj) > 0 and "settings" in obj[0]:
-            return obj[0].get("settings", {})
+    for obj in json_objects:
+        if isinstance(obj, list):
+            for row in obj:
+                if isinstance(row, dict) and "settings" in row:
+                    return row["settings"]
+        elif isinstance(obj, dict) and "settings" in obj:
+            return obj["settings"]
     return None

This more defensive approach handles both object and array structures, preventing confusing "Could not find cluster settings" errors.


366-374: ⚠️ Potential issue

Fix UnboundLocalError with color constants

Reassigning global variables inside a function makes Python treat them as local variables for the entire function body. This causes an UnboundLocalError when the variables are referenced before assignment at line 375 and below.

-    # ANSI color codes
-    if no_color:
-        HEADER = ""
-        BOLD = ""
-        GREEN = ""
-        YELLOW = ""
-        RED = ""
-        BLUE = ""
-        RESET = ""
+    # ANSI color palette – use *local* variables to avoid shadowing globals
+    if no_color:
+        header = bold = green = yellow = red = blue = reset = purple = ""
+    else:
+        header, bold, green, yellow, red, blue, reset, purple = (
+            HEADER, BOLD, GREEN, YELLOW, RED, BLUE, RESET, PURPLE
+        )

Then, for all subsequent references to the global color constants in this function (like line 375), replace them with these local variables (e.g., use bold instead of BOLD).


396-399: 🛠️ Refactor suggestion

Add error handling for extractor.acquire()

The code calls extractor.acquire() without checking its return value or handling potential acquisition failures.

     try:
         extractor = SettingsExtractor()
-        extractor.acquire()  # .render("json").write(output)
+        acquisition_result = extractor.acquire()
+        if not acquisition_result:
+            msg = f"{RED}Failed to acquire settings from documentation source{RESET}"
+            logger.error(msg)
+            raise click.ClickException(msg)
         default_settings = extractor.thing["settings"]

203-254: ⚠️ Potential issue

Guard against zero division in memory settings comparison

The code doesn't check if heap_size_bytes is zero before performing division at lines 223-224, which could lead to a ZeroDivisionError.

-    if current_bytes is None or default_bytes is None:
+    if current_bytes is None or default_bytes is None or heap_size_bytes == 0:
         return None

     if not heap_size_bytes:
         raise ValueError("Heap size must be provided to compare memory settings.")

Additionally, consider using >= instead of > in the threshold check to avoid boundary issues:

-    tolerance = tolerance_percent_large if default_percent > threshold_percent else tolerance_percent_small
+    tolerance = tolerance_percent_large if default_percent >= threshold_percent else tolerance_percent_small

27-50: 🛠️ Refactor suggestion

Consider using a streaming JSON parser for better reliability

The current JSON parsing implementation has several limitations:

  1. It only processes arrays at the root level ([...]), ignoring standalone objects ({...})
  2. It breaks if square brackets appear inside quoted strings or JSON strings
  3. It reads the entire file into memory, which could cause problems with large diagnostic dumps
-def parse_multiple_json_objects(file_path):
-    """Parse a file with multiple JSON objects/arrays."""
-    with open(file_path, "r") as f:
-        content = f.read()
-
-    json_objects = []
-    depth = 0
-    start = 0
-
-    for i, char in enumerate(content):
-        if char == "[":
-            if depth == 0:
-                start = i
-            depth += 1
-        elif char == "]":
-            depth -= 1
-            if depth == 0:
-                try:
-                    json_obj = json.loads(content[start : i + 1])
-                    json_objects.append(json_obj)
-                except json.JSONDecodeError as e:
-                    print(f"Warning: Failed to parse JSON object at position {start}-{i}: {e}")
+def parse_multiple_json_objects(file_path):
+    """Parse a file with multiple JSON objects/arrays."""
+    json_objects = []
+    with open(file_path) as f:
+        decoder = json.JSONDecoder()
+        buffer = ""
+        for chunk in iter(lambda: f.read(8192), ""):
+            buffer += chunk
+            while buffer.strip():
+                try:
+                    obj, idx = decoder.raw_decode(buffer)
+                    json_objects.append(obj)
+                    buffer = buffer[idx:]
+                except json.JSONDecodeError:
+                    # Need more data
+                    break

This streaming approach handles both objects and arrays, works correctly with nested JSON structures, processes files incrementally, and scales better with large input files.

🧹 Nitpick comments (2)
cratedb_toolkit/settings/compare.py (2)

430-440: Consider using a set for time-related terms

Using a set instead of a list for the time-related terms would make the membership check more efficient, especially as the list of terms grows.

     is_time_setting = any(
         time_term in setting_key.lower()
-        for time_term in [
-            "timeout",
-            "interval",
-            "delay",
-            "expiration",
-            "time",
-            "duration",
-        ]
+        for time_term in {
+            "timeout",
+            "interval",
+            "delay",
+            "expiration",
+            "time",
+            "duration",
+        }
     )

445-447: Consider using a set for memory-related terms

Similar to the time-related terms, using a set for memory-related terms would make the membership check more efficient.

     elif heap_size_bytes and any(
-        mem_term in setting_key for mem_term in ["memory", "heap", "breaker", "limit", "size"]
+        mem_term in setting_key for mem_term in {"memory", "heap", "breaker", "limit", "size"}
     ):
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 969f78e and d15074e.

📒 Files selected for processing (1)
  • cratedb_toolkit/settings/compare.py (1 hunks)
🔇 Additional comments (7)
cratedb_toolkit/settings/compare.py (7)

65-101: LGTM: Memory value conversion handles both absolute and percentage values

The implementation properly converts memory size specifications to bytes, handling both percentage values (relative to heap size) and absolute values with various units (KB, MB, GB, TB). The code includes appropriate error handling for invalid inputs.


104-131: LGTM: Time value conversion handles various time units

The implementation correctly converts time specifications to milliseconds, handling various units (ms, s, m, h, d). The code includes appropriate error handling for invalid inputs and assumes milliseconds when no unit is specified.


147-168: LGTM: Normalized value handling for comparison

The normalize_value function correctly handles different types including nulls, booleans, numbers, and strings. The special handling for disabled values (e.g., "0s (disabled)") and Java float notation is particularly useful.


170-201: LGTM: Time settings comparison with tolerances

The implementation properly compares time-based settings with appropriate tolerances (1%), handling special cases like zero values. The human-readable formatting is a nice touch for usability.


1-14: LGTM: Clear module docstring with usage examples

The module docstring provides a clear explanation of the script's purpose and includes helpful examples of how to generate the required input files.


326-351: LGTM: Well-structured CLI with clear options

The Click command is well-structured with appropriate arguments and options. The help text is clear and descriptive, making it user-friendly.


274-324: LGTM: Organized reporting with categorical grouping

The reporting function organizes settings by category, which makes the output more readable. The color coding, wrapping, and inclusion of explanatory statements from default settings enhance usability.

@amotl amotl force-pushed the settings-compare branch from d15074e to 6fa67ec Compare April 25, 2025 18:04
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🔭 Outside diff range comments (1)
tests/settings/test_cli.py (1)

28-41: 🛠️ Refactor suggestion

Make assertions resilient to JVM size variations

Hard-coding 536_870_912 couples the test to a single heap size. A different CI image or future CrateDB default will break the test.

-    assert "Heap Size: 536_870_912 bytes" in result.output
+    assert re.search(r"Heap Size:\s+\d[\d_,]*\s+bytes", result.output)

(remember to import re at the top).

This still proves that the heap size is printed without locking into an exact value.

♻️ Duplicate comments (1)
cratedb_toolkit/settings/compare.py (1)

185-214: Still vulnerable to a 0-heap division

heap_size_bytes == 0 passes the truthiness check (bool(0) is False) but it reaches the percentage division before the guard, raising ZeroDivisionError.

Move the guard before any arithmetic:

-    current_bytes = to_bytes(str(current_value), heap_size_bytes)
-    default_bytes = to_bytes(default_value, heap_size_bytes)
-
-    if current_bytes is None or default_bytes is None:
-        return None
-
-    if not heap_size_bytes:
-        raise ValueError("Heap size must be provided to compare memory settings.")
+    if not heap_size_bytes:
+        raise ValueError("Heap size must be provided (cannot be 0)")
+
+    current_bytes = to_bytes(str(current_value), heap_size_bytes)
+    default_bytes = to_bytes(default_value, heap_size_bytes)
+
+    if current_bytes is None or default_bytes is None:
+        return None
🧹 Nitpick comments (2)
cratedb_toolkit/settings/compare.py (2)

73-83: Tighten regex and support KiB / MiB etc.

  1. Anchoring the pattern with ^…$ avoids partially-matched garbage like "8gbfoobar".
  2. Many CrateDB settings use binary prefixes (kib, mib, …). Adding them costs little.
-    match = re.match(r"(\d+(?:\.\d+)?)\s*(kb|mb|gb|tb|b)?", value_str, re.IGNORECASE)
+    match = re.match(
+        r"^\s*(\d+(?:\.\d+)?)\s*(kib?|mib?|gib?|tib?|kb|mb|gb|tb|b)?\s*$",
+        value_str,
+        re.IGNORECASE,
+    )

380-443: default_settings keys are already flattened – skip the O(N²) if-in loop

The inner if setting_key not in default_settings line forces an unnecessary search for every key. By intersecting the key sets once, you halve the work and simplify the logic:

-    for setting_key, current_value in sorted(flat_settings.items()):
-        if setting_key not in default_settings:
-            continue
+    for setting_key, current_value in sorted(
+        (k, v) for k, v in flat_settings.items() if k in default_settings
+    ):

For large settings dictionaries this saves a few milliseconds.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d15074e and 6fa67ec.

📒 Files selected for processing (13)
  • CHANGES.md (1 hunks)
  • cratedb_toolkit/cli.py (2 hunks)
  • cratedb_toolkit/docs/cli.py (2 hunks)
  • cratedb_toolkit/settings/cli.py (1 hunks)
  • cratedb_toolkit/settings/compare.py (1 hunks)
  • cratedb_toolkit/util/database.py (1 hunks)
  • doc/docs-api.md (1 hunks)
  • doc/index.md (1 hunks)
  • doc/settings/index.md (1 hunks)
  • pyproject.toml (1 hunks)
  • tests/docs/test_functions.py (1 hunks)
  • tests/docs/test_settings.py (0 hunks)
  • tests/settings/test_cli.py (1 hunks)
💤 Files with no reviewable changes (1)
  • tests/docs/test_settings.py
✅ Files skipped from review due to trivial changes (7)
  • CHANGES.md
  • doc/docs-api.md
  • doc/index.md
  • cratedb_toolkit/docs/cli.py
  • cratedb_toolkit/cli.py
  • doc/settings/index.md
  • pyproject.toml
🚧 Files skipped from review as they are similar to previous changes (2)
  • cratedb_toolkit/settings/cli.py
  • tests/docs/test_functions.py
🧰 Additional context used
🧬 Code Graph Analysis (3)
cratedb_toolkit/util/database.py (1)
cratedb_toolkit/datasets/model.py (1)
  • run_sql (118-120)
tests/settings/test_cli.py (3)
cratedb_toolkit/settings/cli.py (1)
  • cli (18-22)
tests/conftest.py (1)
  • cratedb (79-84)
cratedb_toolkit/model.py (1)
  • dburi (41-45)
cratedb_toolkit/settings/compare.py (3)
cratedb_toolkit/util/database.py (3)
  • DatabaseAdapter (36-383)
  • get_heap_size (217-221)
  • get_settings (211-215)
cratedb_toolkit/docs/cli.py (1)
  • settings (98-112)
cratedb_toolkit/docs/settings.py (2)
  • SettingsExtractor (673-728)
  • acquire (686-705)
⏰ Context from checks skipped due to timeout of 90000ms (26)
  • GitHub Check: Python 3.11, MongoDB 5 on OS ubuntu-latest
  • GitHub Check: Python 3.11, MongoDB 3 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 7 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 6 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 5 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.9 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 4 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 3 on OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 7 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.8 on OS ubuntu-latest
  • GitHub Check: Kinesis: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 6 on OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 5 on OS ubuntu-latest
  • GitHub Check: CFR for OS windows-latest
  • GitHub Check: build-and-test
  • GitHub Check: Kinesis: Python 3.9 on OS ubuntu-latest
  • GitHub Check: CFR for OS ubuntu-latest
  • GitHub Check: Python 3.8, InfluxDB 2.7, OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 4 on OS ubuntu-latest
  • GitHub Check: Python 3.11, LocalStack 3.6, OS ubuntu-latest
  • GitHub Check: Python 3.8, InfluxDB 2.6, OS ubuntu-latest
  • GitHub Check: Python 3.9, LocalStack 3.6, OS ubuntu-latest
  • GitHub Check: CFR for OS macos-latest
  • GitHub Check: Python 3.8, MongoDB 3 on OS ubuntu-latest
  • GitHub Check: CFR for OS macos-13

Comment thread cratedb_toolkit/util/database.py
Comment on lines +217 to +228
def get_heap_size(self) -> int:
"""
Get configured JVM maximum heap size.
"""
return int(self.run_sql("SELECT heap['max'] AS heap_size FROM sys.nodes LIMIT 1", records=True)[0]["heap_size"])

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Handle multi-node clusters and NULL heap sizes

LIMIT 1 silently hides divergent heap sizes in a multi-node cluster.
Consider either:

  • aggregating (MAX(heap['max'])), or
  • validating all nodes agree and warning otherwise.

Also add the same empty-set guard as above.

-        return int(self.run_sql("SELECT heap['max'] AS heap_size FROM sys.nodes LIMIT 1", records=True)[0]["heap_size"])
+        rows = self.run_sql(
+            "SELECT MAX(heap['max']) AS heap_size FROM sys.nodes", records=True
+        )
+        if not rows or rows[0]["heap_size"] is None:
+            raise RuntimeError("Could not determine heap size from sys.nodes")
+        return int(rows[0]["heap_size"])

🏁 Script executed:

#!/bin/bash
# Show context around get_heap_size in cratedb_toolkit/util/database.py
sed -n '150,260p' cratedb_toolkit/util/database.py

Length of output: 3322


Handle multi-node clusters and NULL heap sizes in get_heap_size

The current implementation uses LIMIT 1, which silently hides divergent JVM heap settings across nodes and will break if no rows or a NULL value is returned. Aggregate the heap size (e.g. with MAX) or validate consistency across all nodes, and add a guard for empty or null results.

• File: cratedb_toolkit/util/database.py
• Method: get_heap_size

Suggested diff:

-    def get_heap_size(self) -> int:
-        """
-        Get configured JVM maximum heap size.
-        """
-        return int(self.run_sql(
-            "SELECT heap['max'] AS heap_size FROM sys.nodes LIMIT 1",
-            records=True
-        )[0]["heap_size"])
+    def get_heap_size(self) -> int:
+        """
+        Get configured JVM maximum heap size across all nodes.
+        """
+        rows = self.run_sql(
+            "SELECT MAX(heap['max']) AS heap_size FROM sys.nodes",
+            records=True
+        )
+        if not rows or rows[0]["heap_size"] is None:
+            raise RuntimeError("Could not determine heap size from sys.nodes")
+        return int(rows[0]["heap_size"])
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def get_heap_size(self) -> int:
"""
Get configured JVM maximum heap size.
"""
return int(self.run_sql("SELECT heap['max'] AS heap_size FROM sys.nodes LIMIT 1", records=True)[0]["heap_size"])
def get_heap_size(self) -> int:
"""
Get configured JVM maximum heap size across all nodes.
"""
rows = self.run_sql(
"SELECT MAX(heap['max']) AS heap_size FROM sys.nodes",
records=True
)
if not rows or rows[0]["heap_size"] is None:
raise RuntimeError("Could not determine heap size from sys.nodes")
return int(rows[0]["heap_size"])

Comment thread cratedb_toolkit/settings/compare.py Outdated
Comment thread cratedb_toolkit/settings/compare.py
@amotl amotl force-pushed the settings-compare branch 2 times, most recently from 440d603 to 76f8759 Compare April 25, 2025 18:23
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (6)
cratedb_toolkit/settings/compare.py (6)

345-359: ⚠️ Potential issue

Color handling inconsistency could cause no_color flag to be ineffective

The code initializes a local color instance and modifies it when no_color is True (lines 345-349). However, the subsequent code (lines 351-359) extracts values from the global Color class instead of this local instance. This makes the no_color flag ineffective for subsequent color usage.

-    HEADER, BOLD, GREEN, YELLOW, RED, BLUE, RESET = (
-        Color.HEADER,
-        Color.BOLD,
-        Color.GREEN,
-        Color.YELLOW,
-        Color.RED,
-        Color.BLUE,
-        Color.RESET,
-    )
+    HEADER, BOLD, GREEN, YELLOW, RED, BLUE, RESET = (
+        color.HEADER,
+        color.BOLD,
+        color.GREEN,
+        color.YELLOW,
+        color.RED,
+        color.BLUE,
+        color.RESET,
+    )

361-361: 🛠️ Refactor suggestion

Use local color variables for consistent styling

This line accesses global BOLD, BLUE, and RESET constants, but should use the local variables to respect the no_color flag settings.

-    print(f"{BOLD}Comparing settings in {BLUE}{cratedb_sqlalchemy_url}{RESET}{BOLD} against default settings{RESET}")
+    print(f"{bold}Comparing settings in {blue}{cratedb_sqlalchemy_url}{reset}{bold} against default settings{reset}")

249-256: ⚠️ Potential issue

Report comparison function ignores the passed color instance

The report_comparison function takes a color parameter, but immediately extracts values from the global Color class instead. This means the function ignores any custom color settings. The function should use the passed color instance.

def report_comparison(color: Color, default_settings, non_default_settings):
    BOLD, GREEN, YELLOW, RESET, PURPLE = (
-        Color.BOLD,
-        Color.GREEN,
-        Color.YELLOW,
-        Color.RESET,
-        Color.PURPLE,
+        color.BOLD,
+        color.GREEN,
+        color.YELLOW,
+        color.RESET,
+        color.PURPLE,
    )

373-376: 🛠️ Refactor suggestion

Add error checking for the settings extractor acquisition

The code calls extractor.acquire() but doesn't verify if the acquisition was successful before accessing extractor.thing["settings"], which could cause errors if acquisition fails but doesn't throw an exception.

    try:
        extractor = SettingsExtractor()
-        extractor.acquire()
+        acquisition_result = extractor.acquire()
+        if not acquisition_result:
+            msg = f"{color.RED}Failed to acquire settings from documentation source{color.RESET}"
+            logger.error(msg)
+            raise click.ClickException(msg)
        default_settings = extractor.thing["settings"]

55-63: 🛠️ Refactor suggestion

Inconsistent error handling for percentage memory values

The code returns None when a percentage value has no heap size reference (line 57), but later raises a ValueError for parse errors (line 63). This inconsistency in error handling could lead to confusing behavior in the calling code.

-        if not heap_size_bytes:
-            return None
+        if not heap_size_bytes:
+            raise ValueError("Percentage memory values require a heap size reference")

185-203: ⚠️ Potential issue

Fix potential division-by-zero in memory settings comparison

The function checks for None values in current_bytes or default_bytes before proceeding (line 198-199), but the heap size check and error (line 201-202) happens after this check. This means we could have valid bytes values but a zero heap size, causing a division by zero in lines 205-206.

-    if current_bytes is None or default_bytes is None:
+    if current_bytes is None or default_bytes is None or not heap_size_bytes:
         return None

-    if not heap_size_bytes:
-        raise ValueError("Heap size must be provided to compare memory settings.")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 440d603 and 76f8759.

📒 Files selected for processing (5)
  • cratedb_toolkit/settings/compare.py (1 hunks)
  • cratedb_toolkit/util/database.py (1 hunks)
  • doc/index.md (1 hunks)
  • doc/util/index.md (1 hunks)
  • doc/util/settings.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • doc/index.md
  • doc/util/settings.md
  • doc/util/index.md
  • cratedb_toolkit/util/database.py
⏰ Context from checks skipped due to timeout of 90000ms (22)
  • GitHub Check: Python 3.11, MongoDB 7 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 7 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 6 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 4 on OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 7 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 6 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.9 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.8 on OS ubuntu-latest
  • GitHub Check: Kinesis: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Kinesis: Python 3.9 on OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 5 on OS ubuntu-latest
  • GitHub Check: CFR for OS windows-latest
  • GitHub Check: CFR: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Python 3.11, LocalStack 3.6, OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 4 on OS ubuntu-latest
  • GitHub Check: build-and-test
  • GitHub Check: CFR for OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 3 on OS ubuntu-latest
  • GitHub Check: Python 3.9, LocalStack 3.6, OS ubuntu-latest
  • GitHub Check: CFR for OS macos-latest
  • GitHub Check: CFR for OS macos-13
🔇 Additional comments (4)
cratedb_toolkit/settings/compare.py (4)

129-149: The normalize_value function handles special cases well

The function correctly normalizes different value types for comparison, with special handling for parenthetical content in strings like "0s (disabled)" and Java float notation. This kind of attention to detail improves the comparison accuracy.


415-423: Comprehensive time-related keyword detection

The code identifies time-related settings using a thorough set of keywords, which helps ensure that time settings get the appropriate specialized comparison logic. Good use of a set for efficient lookup.


284-290: Well-implemented text wrapping for readability

The text wrapping configuration ensures that output remains readable with appropriate indentation and line breaks, while avoiding breaking up large numbers or important identifiers.


309-333: Well-designed CLI interface with good documentation

The command options are well-documented with appropriate defaults for the different tolerance parameters. The --no-color flag is a good addition for accessibility and machine parsing of output.

Comment thread cratedb_toolkit/settings/compare.py
@amotl amotl force-pushed the settings-compare branch from 76f8759 to e388415 Compare April 25, 2025 18:29
@amotl amotl requested a review from WalBeh April 25, 2025 18:30
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (3)
cratedb_toolkit/settings/compare.py (3)

452-452: Ensure color settings correctly propagate through function calls

The local color instance is correctly passed to report_comparison, but as shown in previous comments, the function doesn't use it properly.


360-398: 🛠️ Refactor suggestion

Replace global color constants with local variables throughout function

Continue to use the local color variables throughout the print statements in this section to ensure consistency with the color settings.

-    print(f"{BOLD}Comparing settings in {BLUE}{cratedb_sqlalchemy_url}{RESET}{BOLD} against default settings{RESET}")
+    print(f"{color.BOLD}Comparing settings in {color.BLUE}{cratedb_sqlalchemy_url}{color.RESET}{color.BOLD} against default settings{color.RESET}")

     # Other print statements similarly updated...

-    print(f"{BOLD}Heap Size: {GREEN}{formatted_heap} bytes{RESET}")
+    print(f"{color.BOLD}Heap Size: {color.GREEN}{formatted_heap} bytes{color.RESET}")

     # Continue updating all print statements...

249-307: ⚠️ Potential issue

Function doesn't use the passed color parameter

The report_comparison function accepts a color parameter but doesn't use it. Instead, it uses global constants from Color class. This creates inconsistency between what's passed in and what's used internally.

 def report_comparison(color: Color, default_settings, non_default_settings):
-    BOLD, GREEN, YELLOW, RESET, PURPLE = (
-        Color.BOLD,
-        Color.GREEN,
-        Color.YELLOW,
-        Color.RESET,
-        Color.PURPLE,
-    )
+    BOLD, GREEN, YELLOW, RESET, PURPLE = (
+        color.BOLD,
+        color.GREEN,
+        color.YELLOW,
+        color.RESET,
+        color.PURPLE,
+    )
🧹 Nitpick comments (1)
cratedb_toolkit/settings/compare.py (1)

238-247: Consider implementing Color class as a singleton or with constants

The Color class is currently implemented with class attributes. Consider making these constants to better communicate their intended use, or implementing a proper singleton pattern if state management is needed.

-class Color:
-    HEADER = "\033[95m"
-    BOLD = "\033[1m"
-    GREEN = "\033[92m"
-    YELLOW = "\033[93m"
-    RED = "\033[91m"
-    BLUE = "\033[94m"
-    RESET = "\033[0m"
-    PURPLE = "\033[95m"  # Define purple color
+# Define colors as module-level constants
+HEADER = "\033[95m"
+BOLD = "\033[1m" 
+GREEN = "\033[92m"
+YELLOW = "\033[93m"
+RED = "\033[91m"
+BLUE = "\033[94m"
+RESET = "\033[0m"
+PURPLE = "\033[95m"
+
+class Color:
+    """Color constants for terminal output."""
+    HEADER = HEADER
+    BOLD = BOLD
+    GREEN = GREEN
+    YELLOW = YELLOW
+    RED = RED
+    BLUE = BLUE
+    RESET = RESET
+    PURPLE = PURPLE
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76f8759 and e388415.

📒 Files selected for processing (5)
  • cratedb_toolkit/settings/compare.py (1 hunks)
  • cratedb_toolkit/util/database.py (1 hunks)
  • doc/index.md (1 hunks)
  • doc/util/index.md (1 hunks)
  • doc/util/settings.md (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • doc/util/settings.md
🚧 Files skipped from review as they are similar to previous changes (3)
  • doc/util/index.md
  • doc/index.md
  • cratedb_toolkit/util/database.py
🧰 Additional context used
🪛 GitHub Check: codecov/patch
cratedb_toolkit/settings/compare.py

[warning] 37-41: cratedb_toolkit/settings/compare.py#L37-L41
Added lines #L37 - L41 were not covered by tests


[warning] 43-44: cratedb_toolkit/settings/compare.py#L43-L44
Added lines #L43 - L44 were not covered by tests


[warning] 49-50: cratedb_toolkit/settings/compare.py#L49-L50
Added lines #L49 - L50 were not covered by tests


[warning] 52-52: cratedb_toolkit/settings/compare.py#L52
Added line #L52 was not covered by tests


[warning] 55-57: cratedb_toolkit/settings/compare.py#L55-L57
Added lines #L55 - L57 were not covered by tests


[warning] 59-63: cratedb_toolkit/settings/compare.py#L59-L63
Added lines #L59 - L63 were not covered by tests


[warning] 66-66: cratedb_toolkit/settings/compare.py#L66
Added line #L66 was not covered by tests


[warning] 74-76: cratedb_toolkit/settings/compare.py#L74-L76
Added lines #L74 - L76 were not covered by tests


[warning] 78-79: cratedb_toolkit/settings/compare.py#L78-L79
Added lines #L78 - L79 were not covered by tests


[warning] 81-83: cratedb_toolkit/settings/compare.py#L81-L83
Added lines #L81 - L83 were not covered by tests


[warning] 88-89: cratedb_toolkit/settings/compare.py#L88-L89
Added lines #L88 - L89 were not covered by tests


[warning] 91-91: cratedb_toolkit/settings/compare.py#L91
Added line #L91 was not covered by tests


[warning] 94-96: cratedb_toolkit/settings/compare.py#L94-L96
Added lines #L94 - L96 were not covered by tests


[warning] 98-99: cratedb_toolkit/settings/compare.py#L98-L99
Added lines #L98 - L99 were not covered by tests


[warning] 102-102: cratedb_toolkit/settings/compare.py#L102
Added line #L102 was not covered by tests


[warning] 110-112: cratedb_toolkit/settings/compare.py#L110-L112
Added lines #L110 - L112 were not covered by tests


[warning] 117-124: cratedb_toolkit/settings/compare.py#L117-L124
Added lines #L117 - L124 were not covered by tests


[warning] 126-126: cratedb_toolkit/settings/compare.py#L126
Added line #L126 was not covered by tests


[warning] 131-138: cratedb_toolkit/settings/compare.py#L131-L138
Added lines #L131 - L138 were not covered by tests


[warning] 140-141: cratedb_toolkit/settings/compare.py#L140-L141
Added lines #L140 - L141 were not covered by tests


[warning] 144-144: cratedb_toolkit/settings/compare.py#L144
Added line #L144 was not covered by tests


[warning] 147-149: cratedb_toolkit/settings/compare.py#L147-L149
Added lines #L147 - L149 were not covered by tests


[warning] 154-155: cratedb_toolkit/settings/compare.py#L154-L155
Added lines #L154 - L155 were not covered by tests


[warning] 157-158: cratedb_toolkit/settings/compare.py#L157-L158
Added lines #L157 - L158 were not covered by tests


[warning] 161-162: cratedb_toolkit/settings/compare.py#L161-L162
Added lines #L161 - L162 were not covered by tests


[warning] 165-167: cratedb_toolkit/settings/compare.py#L165-L167
Added lines #L165 - L167 were not covered by tests


[warning] 171-172: cratedb_toolkit/settings/compare.py#L171-L172
Added lines #L171 - L172 were not covered by tests


[warning] 174-174: cratedb_toolkit/settings/compare.py#L174
Added line #L174 was not covered by tests

⏰ Context from checks skipped due to timeout of 90000ms (28)
  • GitHub Check: codecov/project
  • GitHub Check: Generic: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.9 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.8 on OS ubuntu-latest
  • GitHub Check: CFR: Python 3.12 on OS ubuntu-latest
  • GitHub Check: CFR: Python 3.9 on OS ubuntu-latest
  • GitHub Check: Kinesis: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Kinesis: Python 3.9 on OS ubuntu-latest
  • GitHub Check: Python 3.11, MongoDB 7 on OS ubuntu-latest
  • GitHub Check: Python 3.11, MongoDB 4 on OS ubuntu-latest
  • GitHub Check: Python 3.11, MongoDB 3 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 7 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 6 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 5 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 4 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 3 on OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 7 on OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 6 on OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 5 on OS ubuntu-latest
  • GitHub Check: Python 3.11, LocalStack 3.6, OS ubuntu-latest
  • GitHub Check: CFR for OS windows-latest
  • GitHub Check: Python 3.8, MongoDB 4 on OS ubuntu-latest
  • GitHub Check: CFR for OS ubuntu-latest
  • GitHub Check: Python 3.8, InfluxDB 2.6, OS ubuntu-latest
  • GitHub Check: Python 3.9, LocalStack 3.6, OS ubuntu-latest
  • GitHub Check: build-and-test
  • GitHub Check: CFR for OS macos-latest
  • GitHub Check: CFR for OS macos-13
🔇 Additional comments (3)
cratedb_toolkit/settings/compare.py (3)

1-20: Well-structured module with comprehensive documentation

Excellent job providing clear documentation with installation instructions and usage examples. The module's purpose is well-defined, and the ruff noqa directive for print statements is appropriate for a CLI tool.


47-64: Good error handling for percentage memory values

The implementation now properly raises a ValueError with a clear message when percentage values are used without a heap size reference. This prevents silent failures and makes debugging easier.

🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 49-50: cratedb_toolkit/settings/compare.py#L49-L50
Added lines #L49 - L50 were not covered by tests


[warning] 52-52: cratedb_toolkit/settings/compare.py#L52
Added line #L52 was not covered by tests


[warning] 55-57: cratedb_toolkit/settings/compare.py#L55-L57
Added lines #L55 - L57 were not covered by tests


[warning] 59-63: cratedb_toolkit/settings/compare.py#L59-L63
Added lines #L59 - L63 were not covered by tests


201-202: Proper validation for heap size

Good work ensuring that heap size is validated before attempting division operations. This prevents potential division by zero errors.

Comment thread cratedb_toolkit/settings/compare.py
Comment thread cratedb_toolkit/settings/compare.py
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (1)
cratedb_toolkit/settings/compare.py (1)

345-359: ⚠️ Potential issue

Fix color handling inconsistency in CLI command

The current implementation creates a local color instance, potentially modifies it based on --no-color, but then ignores it and initializes local variables from the global Color class. This causes --no-color to be ineffective.

     # ANSI color palette – use *local* variables to avoid shadowing globals.
     color = Color()
     if no_color:
         color.HEADER = color.BOLD = color.GREEN = color.YELLOW = color.RED = color.BLUE = color.RESET = color.PURPLE = (
             ""
         )
 
-    HEADER, BOLD, GREEN, YELLOW, RED, BLUE, RESET = (
-        Color.HEADER,
-        Color.BOLD,
-        Color.GREEN,
-        Color.YELLOW,
-        Color.RED,
-        Color.BLUE,
-        Color.RESET,
-    )
+    # Use the local color instance consistently
+    HEADER, BOLD, GREEN, YELLOW, RED, BLUE, RESET = (
+        color.HEADER,
+        color.BOLD,
+        color.GREEN,
+        color.YELLOW,
+        color.RED,
+        color.BLUE,
+        color.RESET,
+    )
🧹 Nitpick comments (2)
cratedb_toolkit/settings/compare.py (2)

412-426: Consider adding explicit rule checking for time settings

The current check for time settings relies on keyword matching in setting names. While effective in most cases, consider adding explicit setting path checking for known time settings that might not contain the time-related keywords.

     # Check if this is a time-related setting
     is_time_setting = any(
         time_term in setting_key.lower()
         for time_term in {
             "timeout",
             "interval",
             "delay",
             "expiration",
             "time",
             "duration",
         }
-    )
+    ) or setting_key in {
+        # Add explicit known time settings that don't match the pattern
+        "discovery.zen.ping.unicast.hosts.connect_all",  # Example of potential setting
+        # Add other known time settings as needed
+    }

427-440: Improve memory setting detection

Similarly to the time settings, consider enhancing the detection of memory settings with explicit setting paths for known memory-related settings that don't contain the current keywords.

     # Check if this is a memory-related setting
     elif heap_size_bytes and any(
-        mem_term in setting_key for mem_term in {"memory", "heap", "breaker", "limit", "size"}
+        mem_term in setting_key.lower() for mem_term in {"memory", "heap", "breaker", "limit", "size", "buffer"}
+    ) or setting_key in {
+        # Add explicit known memory settings that don't match the pattern
+        # Example: "indices.recovery.max_bytes_per_sec"
     ):
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e388415 and c6d758a.

📒 Files selected for processing (1)
  • cratedb_toolkit/settings/compare.py (1 hunks)
🧰 Additional context used
🪛 GitHub Check: codecov/patch
cratedb_toolkit/settings/compare.py

[warning] 37-41: cratedb_toolkit/settings/compare.py#L37-L41
Added lines #L37 - L41 were not covered by tests


[warning] 43-44: cratedb_toolkit/settings/compare.py#L43-L44
Added lines #L43 - L44 were not covered by tests


[warning] 49-50: cratedb_toolkit/settings/compare.py#L49-L50
Added lines #L49 - L50 were not covered by tests


[warning] 52-52: cratedb_toolkit/settings/compare.py#L52
Added line #L52 was not covered by tests


[warning] 55-57: cratedb_toolkit/settings/compare.py#L55-L57
Added lines #L55 - L57 were not covered by tests


[warning] 59-63: cratedb_toolkit/settings/compare.py#L59-L63
Added lines #L59 - L63 were not covered by tests


[warning] 66-66: cratedb_toolkit/settings/compare.py#L66
Added line #L66 was not covered by tests


[warning] 74-76: cratedb_toolkit/settings/compare.py#L74-L76
Added lines #L74 - L76 were not covered by tests


[warning] 78-79: cratedb_toolkit/settings/compare.py#L78-L79
Added lines #L78 - L79 were not covered by tests


[warning] 81-83: cratedb_toolkit/settings/compare.py#L81-L83
Added lines #L81 - L83 were not covered by tests


[warning] 88-89: cratedb_toolkit/settings/compare.py#L88-L89
Added lines #L88 - L89 were not covered by tests


[warning] 91-91: cratedb_toolkit/settings/compare.py#L91
Added line #L91 was not covered by tests


[warning] 94-96: cratedb_toolkit/settings/compare.py#L94-L96
Added lines #L94 - L96 were not covered by tests


[warning] 98-99: cratedb_toolkit/settings/compare.py#L98-L99
Added lines #L98 - L99 were not covered by tests


[warning] 102-102: cratedb_toolkit/settings/compare.py#L102
Added line #L102 was not covered by tests


[warning] 110-112: cratedb_toolkit/settings/compare.py#L110-L112
Added lines #L110 - L112 were not covered by tests


[warning] 117-124: cratedb_toolkit/settings/compare.py#L117-L124
Added lines #L117 - L124 were not covered by tests


[warning] 126-126: cratedb_toolkit/settings/compare.py#L126
Added line #L126 was not covered by tests


[warning] 131-138: cratedb_toolkit/settings/compare.py#L131-L138
Added lines #L131 - L138 were not covered by tests


[warning] 140-141: cratedb_toolkit/settings/compare.py#L140-L141
Added lines #L140 - L141 were not covered by tests


[warning] 144-144: cratedb_toolkit/settings/compare.py#L144
Added line #L144 was not covered by tests


[warning] 147-149: cratedb_toolkit/settings/compare.py#L147-L149
Added lines #L147 - L149 were not covered by tests


[warning] 154-155: cratedb_toolkit/settings/compare.py#L154-L155
Added lines #L154 - L155 were not covered by tests


[warning] 157-158: cratedb_toolkit/settings/compare.py#L157-L158
Added lines #L157 - L158 were not covered by tests


[warning] 161-162: cratedb_toolkit/settings/compare.py#L161-L162
Added lines #L161 - L162 were not covered by tests


[warning] 165-167: cratedb_toolkit/settings/compare.py#L165-L167
Added lines #L165 - L167 were not covered by tests


[warning] 171-172: cratedb_toolkit/settings/compare.py#L171-L172
Added lines #L171 - L172 were not covered by tests


[warning] 174-174: cratedb_toolkit/settings/compare.py#L174
Added line #L174 was not covered by tests

⏰ Context from checks skipped due to timeout of 90000ms (20)
  • GitHub Check: Python 3.9, MongoDB 7 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 6 on OS ubuntu-latest
  • GitHub Check: Kinesis: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Kinesis: Python 3.9 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 3 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 7 on OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 6 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.9 on OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 5 on OS ubuntu-latest
  • GitHub Check: CFR for OS windows-latest
  • GitHub Check: Python 3.11, LocalStack 3.6, OS ubuntu-latest
  • GitHub Check: CFR for OS ubuntu-latest
  • GitHub Check: Python 3.8, InfluxDB 2.6, OS ubuntu-latest
  • GitHub Check: build-and-test
  • GitHub Check: Python 3.9, LocalStack 3.6, OS ubuntu-latest
  • GitHub Check: Generic: Python 3.8 on OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 3 on OS ubuntu-latest
  • GitHub Check: CFR for OS macos-latest
  • GitHub Check: CFR for OS macos-13
🔇 Additional comments (1)
cratedb_toolkit/settings/compare.py (1)

55-63: Consistent error handling for percentage memory values

Good job adding a proper error message when a percentage memory value is used without a heap size reference. This is more explicit than the previous implementation that silently returned None.

🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 55-57: cratedb_toolkit/settings/compare.py#L55-L57
Added lines #L55 - L57 were not covered by tests


[warning] 59-63: cratedb_toolkit/settings/compare.py#L59-L63
Added lines #L59 - L63 were not covered by tests

Comment on lines +185 to +236
def compare_memory_settings(
setting_key,
current_value,
default_value,
heap_size_bytes,
tolerance_percent_large=2.9,
tolerance_percent_small=1,
threshold_percent=20,
):
"""Compare memory-based settings and return formatted output if different."""
current_bytes = to_bytes(str(current_value), heap_size_bytes)
default_bytes = to_bytes(default_value, heap_size_bytes)

if current_bytes is None or default_bytes is None:
return None

if not heap_size_bytes:
raise ValueError("Heap size must be provided to compare memory settings.")

# Calculate percentage of heap
current_percent = (current_bytes / heap_size_bytes) * 100
default_percent = (default_bytes / heap_size_bytes) * 100

# Choose tolerance based on the size of default value
tolerance = tolerance_percent_large if default_percent >= threshold_percent else tolerance_percent_small

# Check if values differ by more than tolerance
if abs(current_percent - default_percent) <= tolerance:
return None

# Format for display with calculated percentage
formatted_current = f"{current_value}"
formatted_default = f"{default_value}"

is_percent_format = "%" in str(default_value)
current_is_percent = "%" in str(current_value)

# Add percentage if absolute value
if not current_is_percent:
formatted_current += f" (~{current_percent:.1f}% of heap)"
# Add byte equivalent if value is percentage
else:
formatted_current += f" (~{current_bytes:,} bytes)"

# Add percentage/bytes for default value
if is_percent_format:
formatted_default += f" (~{default_bytes:,} bytes)"
else:
formatted_default += f" (~{default_percent:.1f}% of heap)"

return f"{setting_key}: {formatted_current} (default: {formatted_default})"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Add test coverage for memory-based settings comparison

The compare_memory_settings function handles the complex logic of comparing memory settings with variable tolerances based on heap size percentages. This logic is central to the utility's functionality and should be thoroughly tested.


🏁 Script executed:

#!/bin/bash
# Check if there are tests for the compare_memory_settings function
echo "Looking for tests covering compare_memory_settings:"
rg -A 3 "test.*compare_memory|compare_memory.*assert" --type py

Length of output: 178


Add test coverage for compare_memory_settings

No existing tests cover compare_memory_settings; its logic is non-trivial and deserves thorough unit tests. Please add a new test module (e.g. tests/test_compare_memory_settings.py) that covers:

  • Cases where to_bytes returns None (expect None).
  • Missing heap_size_bytes (expect ValueError).
  • Differences within both large/small tolerances (expect None).
  • Differences exceeding tolerances (expect formatted output):
    • Absolute values (e.g. bytes → percentage annotations).
    • Percentage inputs (e.g. "25%" → byte annotations).
    • Default as percent vs. absolute, verifying both branches.

Comment on lines +152 to +183
def compare_time_settings(setting_key, current_value, default_value):
"""Compare time-based settings and return formatted output if different."""
current_ms = to_milliseconds(current_value)
default_ms = to_milliseconds(default_value)

if current_ms is None or default_ms is None:
return None

# Special handling for 0 values - always show if one is zero and the other isn't
if (current_ms == 0 and default_ms > 0) or (default_ms == 0 and current_ms > 0):
return f"{setting_key}: {current_value} (default: {default_value})"

# For non-zero values, check if they differ by more than 1%
if current_ms != 0 and default_ms != 0:
if abs(current_ms - default_ms) <= max(current_ms, default_ms) * 0.01:
return None

# Format output based on readability
# If the time value is in milliseconds (likely not human readable), include the conversion
if "ms" in current_value or len(current_value) > 6:
formatted_current = f"{current_value} (~{format_time(current_ms)})"
else:
formatted_current = current_value

if "ms" in default_value or len(default_value) > 6:
formatted_default = f"{default_value} (~{format_time(default_ms)})"
else:
formatted_default = default_value

# If we get here, the values are different
return f"{setting_key}: {formatted_current} (default: {formatted_default})"

Copy link
Copy Markdown

@coderabbitai coderabbitai bot Apr 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Add test coverage for time-based settings comparison

The compare_time_settings function implements important logic for comparing time settings with appropriate tolerance, but lacks test coverage. This is critical functionality that should be verified with unit tests.


🏁 Script executed:

#!/bin/bash
# Check if there are tests for the compare_time_settings function
echo "Looking for tests covering compare_time_settings:"
rg -A 3 "test.*compare_time|compare_time.*assert" --type py

Length of output: 170


**Add comprehensive unit tests for compare_time_settings **

The compare_time_settings function covers several edge cases and formatting branches but currently has no test coverage. Please add tests that verify:

  • Behavior when to_milliseconds returns None for either value → result is None
  • Special zero‐value handling:
    • current_ms == 0 & default_ms > 0
    • default_ms == 0 & current_ms > 0
  • Tolerance check for non‐zero values differing by ≤1% → result is None
  • Difference >1% triggers output, including:
    • Simple values (no “ms” and short strings)
    • Values containing “ms” or longer than 6 characters, with (~formatted) suffix
  • Correct formatting of the (~format_time(...)) conversions

Example test file: tests/test_compare_time_settings.py – cover each branch with clear expected outputs.

🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 154-155: cratedb_toolkit/settings/compare.py#L154-L155
Added lines #L154 - L155 were not covered by tests


[warning] 157-158: cratedb_toolkit/settings/compare.py#L157-L158
Added lines #L157 - L158 were not covered by tests


[warning] 161-162: cratedb_toolkit/settings/compare.py#L161-L162
Added lines #L161 - L162 were not covered by tests


[warning] 165-167: cratedb_toolkit/settings/compare.py#L165-L167
Added lines #L165 - L167 were not covered by tests


[warning] 171-172: cratedb_toolkit/settings/compare.py#L171-L172
Added lines #L171 - L172 were not covered by tests


[warning] 174-174: cratedb_toolkit/settings/compare.py#L174
Added line #L174 was not covered by tests

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai: We are currently skipping comprehensive test cases in favor of a minimal "run-to-completion" test.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amotl I understand your approach. Using minimal "run-to-completion" tests is a valid strategy, especially during active development. Thanks for clarifying your testing approach.


✏️ Learnings added
Learnt from: amotl
PR: crate/cratedb-toolkit#421
File: cratedb_toolkit/settings/compare.py:152-183
Timestamp: 2025-04-25T18:39:11.701Z
Learning: The project is currently favoring minimal "run-to-completion" tests over comprehensive unit tests, especially for new functionality like the settings comparison utility.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment thread cratedb_toolkit/settings/compare.py
Comment thread cratedb_toolkit/settings/compare.py
Comment thread cratedb_toolkit/settings/compare.py
@amotl amotl force-pushed the settings-compare branch from c6d758a to d5193d3 Compare April 25, 2025 18:37
@amotl amotl marked this pull request as ready for review April 25, 2025 18:44
@amotl amotl force-pushed the settings-compare branch from d5193d3 to 65609af Compare April 25, 2025 18:57
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
cratedb_toolkit/settings/compare.py (1)

63-63: Consider enhancing the error message for clarity

The error message could be more specific about what went wrong with the percentage conversion.

-            raise ValueError("Percentage memory values require a heap size reference") from ex
+            raise ValueError(f"Failed to convert '{value_str}' to bytes: Percentage memory values require a heap size reference") from ex
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c6d758a and 65609af.

📒 Files selected for processing (5)
  • cratedb_toolkit/settings/compare.py (1 hunks)
  • cratedb_toolkit/util/database.py (1 hunks)
  • doc/index.md (1 hunks)
  • doc/util/index.md (1 hunks)
  • doc/util/settings.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • doc/index.md
  • doc/util/index.md
  • doc/util/settings.md
  • cratedb_toolkit/util/database.py
🧰 Additional context used
🧠 Learnings (1)
cratedb_toolkit/settings/compare.py (4)
Learnt from: amotl
PR: crate/cratedb-toolkit#421
File: cratedb_toolkit/settings/compare.py:412-447
Timestamp: 2025-04-25T18:36:10.496Z
Learning: The cratedb-toolkit project prefers blackbox "run-to-completion" testing for CLI tools rather than detailed unit tests for individual utility functions. Integration tests that verify the end-to-end functionality are prioritized over testing individual components in isolation.
Learnt from: amotl
PR: crate/cratedb-toolkit#421
File: cratedb_toolkit/settings/compare.py:35-45
Timestamp: 2025-04-25T18:40:39.490Z
Learning: For cratedb-toolkit, amotl prefers starting with minimal "run-to-completion" tests that verify basic functionality before adding more thorough test cases.
Learnt from: amotl
PR: crate/cratedb-toolkit#421
File: cratedb_toolkit/settings/compare.py:129-149
Timestamp: 2025-04-25T18:40:26.454Z
Learning: In the cratedb-toolkit project, comprehensive unit tests for utility functions like `normalize_value` in the settings comparison module are intentionally skipped as per the project's testing strategy.
Learnt from: amotl
PR: crate/cratedb-toolkit#421
File: cratedb_toolkit/settings/compare.py:152-183
Timestamp: 2025-04-25T18:39:11.701Z
Learning: The project is currently favoring minimal "run-to-completion" tests over comprehensive unit tests, especially for new functionality like the settings comparison utility.
⏰ Context from checks skipped due to timeout of 90000ms (22)
  • GitHub Check: Python 3.9, MongoDB 7 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 6 on OS ubuntu-latest
  • GitHub Check: Kinesis: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 5 on OS ubuntu-latest
  • GitHub Check: Kinesis: Python 3.9 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.12 on OS ubuntu-latest
  • GitHub Check: Python 3.9, MongoDB 3 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.9 on OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 7 on OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 6 on OS ubuntu-latest
  • GitHub Check: Generic: Python 3.8 on OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 5 on OS ubuntu-latest
  • GitHub Check: CFR for OS windows-latest
  • GitHub Check: Python 3.11, LocalStack 3.6, OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 4 on OS ubuntu-latest
  • GitHub Check: CFR for OS ubuntu-latest
  • GitHub Check: Python 3.8, InfluxDB 2.6, OS ubuntu-latest
  • GitHub Check: Python 3.9, LocalStack 3.6, OS ubuntu-latest
  • GitHub Check: Python 3.8, MongoDB 3 on OS ubuntu-latest
  • GitHub Check: build-and-test
  • GitHub Check: CFR for OS macos-latest
  • GitHub Check: CFR for OS macos-13
🔇 Additional comments (9)
cratedb_toolkit/settings/compare.py (9)

129-149: Comprehensive value normalization looks good

The function handles various edge cases well, including null values, booleans, Java-style floats, and disabled values.


152-183: Well-implemented time comparison with sensible tolerance

The time comparison logic uses a 1% tolerance threshold for non-zero values and special handling for zero values. The implementation correctly formats output for readability.


185-236: Memory comparison with flexible tolerances is well-designed

The function provides configurable tolerances based on the size of memory settings relative to heap size. It also handles both percentage and absolute representations elegantly.


345-349: Good use of local color instance

Using a local color instance and setting its attributes rather than modifying global constants avoids potential side effects and threading issues.


399-450: Clear and modular comparison workflow

The settings comparison process is well-structured, handling different types of settings appropriately and collecting unique non-default settings using a set.


412-426: Thorough detection logic for time-related settings

The implementation correctly identifies time-related settings through key term matching and delegates to the specialized comparison function.


427-440: Effective detection for memory-related settings

Good pattern matching to identify memory-related settings by common terms, with appropriate forwarding to the specialized comparison function.


442-447: Fallback to standard comparison is well-implemented

The code correctly uses the normalize_value function for standard settings comparison after checking the specialized cases.


452-452: Correctly passing the local color instance

Good job passing the local color instance to report_comparison, ensuring consistent color handling throughout the output.

@amotl amotl merged commit c462b50 into main Apr 25, 2025
42 checks passed
@amotl amotl deleted the settings-compare branch April 25, 2025 19:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants