Settings: Add settings comparison utility by amotl · Pull Request #421 · crate/cratedb-toolkit

amotl · 2025-04-25T15:09:25Z

About

Accompanying the settings extractor with a comparison tool.

Documentation

https://cratedb-toolkit--421.org.readthedocs.build/util/settings.html

References

Docs API: Add CrateDB settings extractor #398

coderabbitai · 2025-04-25T15:09:32Z

Walkthrough

This update introduces a new subsystem for CrateDB Toolkit focused on cluster settings management. It adds a CLI group for settings-related commands, including listing and comparing current cluster settings against documented defaults. The comparison utility handles memory and time settings with configurable tolerances and provides color-coded, grouped output. Supporting changes include new database adapter methods for retrieving cluster settings and heap size, as well as comprehensive tests and documentation. Related tests for the functions CLI command are moved to a more appropriate location. The documentation and dependency configuration are updated to reflect the new settings functionality.

Changes

| File(s) | Change Summary |
| --- | --- |
| `cratedb_toolkit/settings/compare.py` | New module for comparing cluster runtime settings to documented defaults, with utilities for flattening dicts, parsing memory/time values, normalization, and CLI command for comparison. |
| `cratedb_toolkit/settings/cli.py` | New CLI group for settings commands, with options for verbosity, debug, and subcommands for comparing and listing settings. |
| `cratedb_toolkit/cli.py` | Registers the new settings CLI group under the main CLI. |
| `cratedb_toolkit/util/database.py` | Adds `get_settings` and `get_heap_size` methods to `DatabaseAdapter`. |
| `cratedb_toolkit/docs/cli.py` | Refactors import in the `settings` command from relative to absolute; minor docstring rewording. |
| `tests/docs/test_functions.py` | Adds new tests for the `ctk docs functions` CLI command, checking JSON and Markdown output. |
| `tests/docs/test_settings.py` | Removes tests for the functions CLI command (now moved to `test_functions.py`). |
| `tests/settings/test_cli.py` | New tests for the settings CLI group, verifying list and compare commands and their outputs/logs. |
| `CHANGES.md` | Adds changelog entry for the new settings comparison utility. |
| `pyproject.toml` | Adds a new optional dependency group `[settings]` including `cratedb-toolkit[docs-api]`. |
| `doc/util/settings.md` | New documentation page describing the settings subsystem, installation, and usage. |
| `doc/util/index.md` | Adds the "settings" entry to the utilities documentation index. |
| `doc/docs-api.md` | Hyphenates "runtime-configurable" in documentation. |
| `doc/index.md` | Updates diagnostics section to reference `util/index` instead of `cmd/index`. |
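
The summary for `compare.py` mentions utilities for flattening dicts. As a rough illustration only, such a helper might look like the sketch below. The name `flatten_dict`, its signature, and the dot separator are assumptions made for this example, not taken from the PR.

```python
from typing import Any, Dict


def flatten_dict(nested: Dict[str, Any], parent_key: str = "", sep: str = ".") -> Dict[str, Any]:
    """Flatten a nested dict into dotted keys (hypothetical helper, not the PR's code)."""
    flat: Dict[str, Any] = {}
    for key, value in nested.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict) and value:
            # Recurse into non-empty sub-dicts and merge their flattened keys.
            flat.update(flatten_dict(value, full_key, sep))
        else:
            # Scalars, lists, and empty dicts are kept as leaf values.
            flat[full_key] = value
    return flat
```

Flattening both the runtime settings and the documented defaults to the same dotted-key form would allow a key-by-key comparison.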

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant SettingsCLI
    participant DatabaseAdapter
    participant DocsSettingsExtractor

    User->>CLI: ctk settings compare
    CLI->>SettingsCLI: Invoke compare command
    SettingsCLI->>DatabaseAdapter: get_heap_size()
    DatabaseAdapter-->>SettingsCLI: heap size
    SettingsCLI->>DatabaseAdapter: get_settings()
    DatabaseAdapter-->>SettingsCLI: cluster settings
    SettingsCLI->>DocsSettingsExtractor: extract default settings
    DocsSettingsExtractor-->>SettingsCLI: documented defaults
    SettingsCLI->>SettingsCLI: Compare current vs. default settings
    SettingsCLI-->>User: Print grouped, color-coded comparison report

Possibly related PRs

crate/cratedb-toolkit#398: Adds foundational settings extraction functionality, which the new comparison utility builds upon.
crate/cratedb-toolkit#399: Introduces the CLI command group for docs, directly related to the CLI integration in this update.
crate/cratedb-toolkit#400: Refactors settings extraction logic and adds YAML support, which is extended by the current settings subsystem.

Suggested reviewers

WalBeh

Poem

In the warren of settings deep,
A toolkit rabbit takes a leap—
Now it lists and compares with flair,
Colors and docs are everywhere!
With bytes and millis, it’s precise,
Cluster configs checked not once, but twice.
🐇 Hooray for settings, neat and bright—
The toolkit’s future’s looking right!

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (4)

tests/docs/test_functions.py (1)

30-48: Minor: assert more than success & one key

Right now the test passes as long as one specific function exists. To increase confidence, also validate that the resulting file is non-empty and well-formed, e.g. by validating it against a JSON schema or by checking that at least N functions are present.
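
A minimal sketch of such a check, assuming the JSON output is an object whose entries live under a single top-level key. The helper name, the `functions` key, and the `min_count` default are hypothetical and would need adjusting to the real output shape.

```python
import json


def check_functions_output(text: str, key: str = "functions", min_count: int = 10) -> dict:
    """Parse CLI output and run basic shape checks (hypothetical helper)."""
    assert text.strip(), "output is empty"
    # json.loads raises ValueError (JSONDecodeError) if the output is not well-formed.
    payload = json.loads(text)
    entries = payload[key]
    assert len(entries) >= min_count, f"expected at least {min_count} entries, got {len(entries)}"
    return payload
```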

cratedb_toolkit/docs/settings/compare.py (3)

70-107: to_bytes() misses some real-world units & edge cases

CrateDB settings occasionally use bytes, kib, mib, etc. These are not recognised.

Values like 0 (without a unit) still match, but the resulting unit is "b", which is not in multipliers, so the function falls back to int(number). That behaviour is fine, but the intent deserves a comment.

Consider early-returning 0 instead of None when the string literally equals "0" to make downstream code simpler.

No immediate bug, yet adding the units will avoid silent mis-comparisons.

461-462: Define colour constants once

PURPLE is re-defined for every iteration of the outer loop. Define it next to the other ANSI constants to avoid tiny runtime overhead and keep the colour palette in one place.

32-55: Parsing multiple JSON objects by counting [/] is fragile

parse_multiple_json_objects() assumes top-level JSON arrays and will break if the file contains:

Nested arrays whose ] closes before an outer ].

Top-level objects ({}) instead of arrays.

Using a streaming parser such as ijson.items(), or joining the individual documents and parsing them in a single call (json.loads('[' + ','.join(...) + ']')), would be more robust and simpler to maintain.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 150fbc8 and 6818a47.

📒 Files selected for processing (4)

cratedb_toolkit/docs/cli.py (1 hunks)
cratedb_toolkit/docs/settings/compare.py (1 hunks)
tests/docs/test_functions.py (1 hunks)
tests/docs/test_settings.py (0 hunks)

💤 Files with no reviewable changes (1)

tests/docs/test_settings.py

🧰 Additional context used

🧬 Code Graph Analysis (2)

cratedb_toolkit/docs/cli.py (1)

cratedb_toolkit/docs/settings/extract.py (1)

SettingsExtractor (673-728)

tests/docs/test_functions.py (1)

cratedb_toolkit/docs/cli.py (1)

cli (15-19)

⏰ Context from checks skipped due to timeout of 90000ms (11)

GitHub Check: Kinesis: Python 3.12 on OS ubuntu-latest
GitHub Check: Kinesis: Python 3.9 on OS ubuntu-latest
GitHub Check: CFR: Python 3.12 on OS ubuntu-latest
GitHub Check: Generic: Python 3.12 on OS ubuntu-latest
GitHub Check: Generic: Python 3.9 on OS ubuntu-latest
GitHub Check: CFR for OS windows-latest
GitHub Check: CFR for OS ubuntu-latest
GitHub Check: Generic: Python 3.8 on OS ubuntu-latest
GitHub Check: build-and-test
GitHub Check: CFR for OS macos-latest
GitHub Check: CFR for OS macos-13

🔇 Additional comments (1)

cratedb_toolkit/docs/cli.py (1)

104-104: Confirm import path is available after installation

The switch from a relative to an absolute import is fine, but please double-check that cratedb_toolkit.docs.settings.extract is included in pyproject.toml/setup.py so that it is actually shipped with the wheel or sdist; otherwise this CLI entry-point will raise an ImportError once the package is installed.
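
One way to double-check this in an installed environment is to ask the import machinery whether it can locate the module. The sketch below uses only the standard library; the helper name `module_is_available` is an assumption for this example.

```python
import importlib.util


def module_is_available(dotted_name: str) -> bool:
    """Return True if the import system can locate the module.

    The module itself is not imported, but its parent packages are.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of `dotted_name` is missing.
        return False
```

Running `module_is_available("cratedb_toolkit.docs.settings.extract")` against a freshly built wheel, rather than the source tree, would show whether the module is shipped.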

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (2)

cratedb_toolkit/settings/cli.py (1)
13-24: Consider validating the return value of boot_click

boot_click() is invoked and its return value passed straight back to Click.
If boot_click() returns None (as the snippet in util/cli.py suggests), this is fine because Click ignores the value. If a future refactor makes it return something else (e.g. a configured logger), Click will silently discard it.

A tiny defensive tweak keeps behaviour explicit:
-    return boot_click(ctx, verbose, debug)
+    boot_click(ctx, verbose, debug)  # Ensure side-effects only
Not urgent, just a small robustness improvement.

🧰 Tools

🪛 GitHub Check: codecov/patch

[warning] 21-21: cratedb_toolkit/settings/cli.py#L21
Added line #L21 was not covered by tests
cratedb_toolkit/settings/compare.py (1)
62-99: Unit parsing misses common variants & explicit ‘B’ multiplier

to_bytes() overlooks several real-world inputs:

IEC units (MiB, GiB) and upper-case MB / GB

The plain “b” case is handled by falling through but could be explicit for clarity

Values like 8192 (no unit) are treated as bytes, good—consider documenting that

A quick win:
 multipliers = {
-    "kb": 1024,
-    "mb": 1024 * 1024,
-    "gb": 1024 * 1024 * 1024,
-    "tb": 1024 * 1024 * 1024 * 1024,
+    "b": 1,
+    "kb": 1000,
+    "kib": 1024,
+    "mb": 1000**2,
+    "mib": 1024**2,
+    "gb": 1000**3,
+    "gib": 1024**3,
+    "tb": 1000**4,
+    "tib": 1024**4,
 }
…and extend the regex with [kmgt]i?b to catch IEC forms.
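
Put together, the suggestion could look roughly like the sketch below. It is not the PR's `to_bytes()`: the regex, the `MULTIPLIERS` name, and returning None for unparseable input are assumptions. Note that the multipliers follow the diff above, where `kb` becomes 1000, while the original code mapped `kb`/`mb`/`gb`/`tb` to powers of 1024; which reading matches CrateDB's own interpretation should be confirmed before adopting it.

```python
import re
from typing import Optional

# Multipliers as proposed in the review diff (decimal kb/mb/gb/tb, binary kib/mib/gib/tib).
# Assumption: the original code used powers of 1024 for kb/mb/gb/tb.
MULTIPLIERS = {
    "b": 1,
    "kb": 1000, "kib": 1024,
    "mb": 1000**2, "mib": 1024**2,
    "gb": 1000**3, "gib": 1024**3,
    "tb": 1000**4, "tib": 1024**4,
}

# A number, optional whitespace, and an optional unit such as "kb", "MiB", or "b".
_SIZE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([kmgt]i?b|b)?\s*$", re.IGNORECASE)


def to_bytes(value: str) -> Optional[int]:
    """Convert a size string to bytes; return None if it cannot be parsed."""
    match = _SIZE_RE.match(str(value))
    if match is None:
        return None
    number, unit = match.groups()
    # A bare number such as "8192" is treated as a byte count.
    unit = (unit or "b").lower()
    return int(float(number) * MULTIPLIERS[unit])
```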

🧰 Tools

🪛 GitHub Check: codecov/patch

[warning] 64-65: cratedb_toolkit/settings/compare.py#L64-L65
Added lines #L64 - L65 were not covered by tests

[warning] 67-67: cratedb_toolkit/settings/compare.py#L67
Added line #L67 was not covered by tests

[warning] 70-72: cratedb_toolkit/settings/compare.py#L70-L72
Added lines #L70 - L72 were not covered by tests

[warning] 74-78: cratedb_toolkit/settings/compare.py#L74-L78
Added lines #L74 - L78 were not covered by tests

[warning] 81-81: cratedb_toolkit/settings/compare.py#L81
Added line #L81 was not covered by tests

[warning] 89-91: cratedb_toolkit/settings/compare.py#L89-L91
Added lines #L89 - L91 were not covered by tests

[warning] 93-94: cratedb_toolkit/settings/compare.py#L93-L94
Added lines #L93 - L94 were not covered by tests

[warning] 96-98: cratedb_toolkit/settings/compare.py#L96-L98
Added lines #L96 - L98 were not covered by tests

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6818a47 and 23d75b5.

📒 Files selected for processing (4)

cratedb_toolkit/cli.py (2 hunks)
cratedb_toolkit/docs/cli.py (1 hunks)
cratedb_toolkit/settings/cli.py (1 hunks)
cratedb_toolkit/settings/compare.py (1 hunks)

✅ Files skipped from review due to trivial changes (1)

cratedb_toolkit/cli.py

🧰 Additional context used

🧬 Code Graph Analysis (2)

cratedb_toolkit/settings/cli.py (2)

cratedb_toolkit/util/cli.py (1)

boot_click (16-27)

cratedb_toolkit/settings/compare.py (1)

compare_cluster_settings (354-466)

cratedb_toolkit/docs/cli.py (1)

cratedb_toolkit/docs/settings.py (1)

SettingsExtractor (673-728)

🪛 GitHub Check: codecov/patch

cratedb_toolkit/settings/compare.py

[warning] 26-27: cratedb_toolkit/settings/compare.py#L26-L27
Added lines #L26 - L27 were not covered by tests

[warning] 29-31: cratedb_toolkit/settings/compare.py#L29-L31
Added lines #L29 - L31 were not covered by tests

[warning] 33-45: cratedb_toolkit/settings/compare.py#L33-L45
Added lines #L33 - L45 were not covered by tests

[warning] 47-47: cratedb_toolkit/settings/compare.py#L47
Added line #L47 was not covered by tests

[warning] 52-56: cratedb_toolkit/settings/compare.py#L52-L56
Added lines #L52 - L56 were not covered by tests

[warning] 58-59: cratedb_toolkit/settings/compare.py#L58-L59
Added lines #L58 - L59 were not covered by tests

[warning] 103-104: cratedb_toolkit/settings/compare.py#L103-L104
Added lines #L103 - L104 were not covered by tests

[warning] 106-106: cratedb_toolkit/settings/compare.py#L106
Added line #L106 was not covered by tests

[warning] 109-111: cratedb_toolkit/settings/compare.py#L109-L111
Added lines #L109 - L111 were not covered by tests

[warning] 113-114: cratedb_toolkit/settings/compare.py#L113-L114
Added lines #L113 - L114 were not covered by tests

[warning] 117-117: cratedb_toolkit/settings/compare.py#L117
Added line #L117 was not covered by tests

[warning] 125-127: cratedb_toolkit/settings/compare.py#L125-L127
Added lines #L125 - L127 were not covered by tests

[warning] 132-139: cratedb_toolkit/settings/compare.py#L132-L139
Added lines #L132 - L139 were not covered by tests

[warning] 141-141: cratedb_toolkit/settings/compare.py#L141
Added line #L141 was not covered by tests

[warning] 146-153: cratedb_toolkit/settings/compare.py#L146-L153
Added lines #L146 - L153 were not covered by tests

[warning] 155-156: cratedb_toolkit/settings/compare.py#L155-L156
Added lines #L155 - L156 were not covered by tests

[warning] 159-159: cratedb_toolkit/settings/compare.py#L159
Added line #L159 was not covered by tests

[warning] 162-164: cratedb_toolkit/settings/compare.py#L162-L164
Added lines #L162 - L164 were not covered by tests

[warning] 169-170: cratedb_toolkit/settings/compare.py#L169-L170
Added lines #L169 - L170 were not covered by tests

[warning] 172-173: cratedb_toolkit/settings/compare.py#L172-L173
Added lines #L172 - L173 were not covered by tests

cratedb_toolkit/docs/cli.py

[warning] 104-104: cratedb_toolkit/docs/cli.py#L104
Added line #L104 was not covered by tests

⏰ Context from checks skipped due to timeout of 90000ms (10)

GitHub Check: Generic: Python 3.12 on OS ubuntu-latest
GitHub Check: Kinesis: Python 3.12 on OS ubuntu-latest
GitHub Check: Generic: Python 3.9 on OS ubuntu-latest
GitHub Check: Generic: Python 3.8 on OS ubuntu-latest
GitHub Check: CFR for OS windows-latest
GitHub Check: build-and-test
GitHub Check: Kinesis: Python 3.9 on OS ubuntu-latest
GitHub Check: CFR for OS ubuntu-latest
GitHub Check: CFR for OS macos-latest
GitHub Check: CFR for OS macos-13

🔇 Additional comments (3)

cratedb_toolkit/docs/cli.py (1)

100-106: Absolute import improves clarity but slightly reduces package-relative flexibility

Switching from a relative import (from .settings import …) to an absolute package import (from cratedb_toolkit.docs.settings import …) is perfectly valid inside the monorepo and avoids ambiguity when the module is executed as a script.
Be aware, however, that absolute imports make local development a bit trickier when the package is not installed in editable (pip install -e .) mode. If contributors sometimes run python -m cratedb_toolkit.docs.cli … from the source tree, the absolute path still works, so nothing is broken—just worth keeping in mind.

No action required, just informational.

cratedb_toolkit/settings/cli.py (1)

24-24: Registering the command object directly is 👍

Adding the already-decorated compare_cluster_settings click command via add_command() is the simplest, duplication-free way to expose it—nice!
cratedb_toolkit/settings/compare.py (1)

374-380: Silent acceptance of zero/negative heap size disguises mistakes

When heap_size_bytes is provided but ≤ 0 the subsequent percentage math still runs (or divides by zero as noted above).
Fail fast with a clear message:
-if heap_size_bytes:
+if heap_size_bytes and heap_size_bytes > 0:
     …
 else:
-    print(f"{YELLOW}No heap size provided{RESET}")
+    print(f"{RED}Invalid or missing heap size (--heap-size-bytes must be > 0){RESET}")
+    return
Improves UX and prevents misleading reports.
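
The same fail-fast idea, sketched as a standalone helper rather than inline in the command. The function names and the choice to raise `ValueError` instead of printing and returning are assumptions for this example.

```python
from typing import Optional


def validate_heap_size(heap_size_bytes: Optional[int]) -> int:
    """Return the heap size if usable, otherwise raise ValueError (hypothetical helper)."""
    if heap_size_bytes is None or heap_size_bytes <= 0:
        raise ValueError("Invalid or missing heap size (must be > 0 bytes)")
    return heap_size_bytes


def percent_of_heap(value_bytes: int, heap_size_bytes: Optional[int]) -> float:
    """Express a byte value as a percentage of the heap, validating the heap first."""
    heap = validate_heap_size(heap_size_bytes)
    return value_bytes / heap * 100.0
```

Validating once, before any percentage math, keeps the division from ever seeing a zero or negative denominator.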

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (2)

cratedb_toolkit/settings/compare.py (2)

253-258: find_cluster_settings() only handles the [{"settings": …}] shape

If the result ever comes back as a bare object ({"settings": …}) or the array contains multiple rows, this helper silently returns None, leading to confusing errors downstream.
See the defensive version proposed in the earlier review comment.
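
The earlier proposal is not reproduced in this thread. As an independent sketch of what a more tolerant helper might look like, assuming the result is either a single dict or a list of row dicts:

```python
from typing import Any, Dict, Optional


def find_cluster_settings(result: Any) -> Optional[Dict[str, Any]]:
    """Return the first `settings` mapping found in a dict or a list of dicts.

    Illustrative sketch only; not the version from the earlier review.
    """
    # Accept a bare object by treating it as a single-row result.
    rows = [result] if isinstance(result, dict) else result
    if not isinstance(rows, list):
        return None
    for row in rows:
        if isinstance(row, dict) and isinstance(row.get("settings"), dict):
            return row["settings"]
    return None
```

With multiple rows this returns the first match; raising an error on ambiguous input would be an equally reasonable choice.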

24-47: 🛠️ Refactor suggestion

Parser is still brittle – see earlier review

This manual “square-bracket depth” scanner…

ignores { … } root objects,

breaks on [ or ] that appear inside quoted strings,

loads the complete file into memory,

exactly as pointed out in the previous review. Please consider the streaming json.JSONDecoder approach suggested earlier for robustness and scalability.
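
A minimal sketch of the `json.JSONDecoder` idea, using `raw_decode()` to read consecutive documents from one string. The function name `iter_json_documents` is an assumption. This version handles `{ … }` roots and brackets inside quoted strings, but it still expects the whole text in memory; for truly large files an incremental parser such as ijson would be needed.

```python
import json
from typing import Any, Iterator


def iter_json_documents(text: str) -> Iterator[Any]:
    """Yield each top-level JSON document found in `text`, in order."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # Skip whitespace between documents; raw_decode() rejects leading whitespace.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode() returns the parsed value and the index where it ended.
        document, pos = decoder.raw_decode(text, pos)
        yield document
```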

🧹 Nitpick comments (1)

cratedb_toolkit/settings/compare.py (1)
224-227: Edge-case: threshold boundary is excluded

threshold_percent intends to separate “large” from “small” settings, but the comparison uses >:
tolerance = tolerance_percent_large if default_percent > threshold_percent else tolerance_percent_small
A default exactly at the boundary (e.g. 20% when threshold_percent=20) is currently treated as “small”, which might be surprising. Consider changing to >=:
-    tolerance = tolerance_percent_large if default_percent > threshold_percent else tolerance_percent_small
+    tolerance = tolerance_percent_large if default_percent >= threshold_percent else tolerance_percent_small
This makes the rule inclusive and removes the fuzzy off-by-one behaviour.
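
The difference only shows at the boundary itself. A small sketch, reusing the parameter names from the diff above; the function name, the `inclusive` flag, and the default values are assumptions for illustration.

```python
def select_tolerance(
    default_percent: float,
    threshold_percent: float = 20.0,
    tolerance_percent_small: float = 1.0,
    tolerance_percent_large: float = 5.0,
    inclusive: bool = True,
) -> float:
    """Pick the tolerance for a setting based on how large its default is."""
    if inclusive:
        is_large = default_percent >= threshold_percent  # boundary counts as "large"
    else:
        is_large = default_percent > threshold_percent  # boundary counts as "small"
    return tolerance_percent_large if is_large else tolerance_percent_small
```

For a default of exactly 20% with a 20% threshold, the inclusive rule selects the large tolerance and the exclusive rule selects the small one; all other inputs behave identically.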

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 23d75b5 and f765393.

📒 Files selected for processing (1)

cratedb_toolkit/settings/compare.py (1 hunks)