Chore: Refactor ctk.io package #679

Merged
amotl merged 15 commits into main from io-refactor
Mar 7, 2026

Conversation

@amotl
Member

@amotl amotl commented Mar 5, 2026

Just maintenance: Refactoring and spring cleaning around the ctk.io package and friends, nothing wild.

@amotl amotl changed the title from "Io refactor" to "Chore: Refactor ctk.io package" Mar 5, 2026
@coderabbitai

coderabbitai bot commented Mar 5, 2026

Walkthrough

Adds an IoRouter dispatcher and refactors StandaloneCluster/ManagedCluster to delegate load/save operations to it; introduces CloudJob/CloudIoSpecs model, moves BulkProcessor import into cratedb-specific bulk module, adjusts TableAddress truthiness, and updates tests/docs/examples and several imports.
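
The delegation described in the walkthrough can be pictured roughly as follows. This is a minimal sketch, not the code from the PR: `SketchRouter` and `SketchCluster` are hypothetical stand-ins, the router only records calls instead of performing I/O, and using a dataclass `default_factory` for the `_router` attribute is an assumption based on the "default factory" wording in the summary.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Call = Tuple[str, str, Optional[str], Optional[str]]


class SketchRouter:
    """Hypothetical stand-in for a dispatcher; records calls, performs no I/O."""

    def __init__(self) -> None:
        self.calls: List[Call] = []

    def load_table(self, source: str, target: Optional[str] = None,
                   transformation: Optional[str] = None) -> bool:
        self.calls.append(("load_table", source, target, transformation))
        return True

    def save_table(self, source: str, target: Optional[str] = None,
                   transformation: Optional[str] = None) -> bool:
        self.calls.append(("save_table", source, target, transformation))
        return True


@dataclass
class SketchCluster:
    """Hypothetical cluster object that forwards table I/O to its router."""

    address: str
    # Assumption: each cluster instance gets its own router from a factory.
    _router: SketchRouter = field(default_factory=SketchRouter)

    def load_table(self, source: str, target: Optional[str] = None,
                   transformation: Optional[str] = None) -> bool:
        # The cluster holds no backend logic; it only forwards the request.
        return self._router.load_table(source, target, transformation)

    def save_table(self, source: str, target: Optional[str] = None,
                   transformation: Optional[str] = None) -> bool:
        return self._router.save_table(source, target, transformation)
```

With this shape, a test could pass a fake router through the `_router` argument instead of patching module globals.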

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| IoRouter implementation | `cratedb_toolkit/io/router.py` | New IoRouter class with load_table/save_table dispatching to multiple backends by URL scheme (dynamodb, influxdb, kinesis, mongodb(+cdc), deltalake, iceberg, ingestr); dynamic imports and explicit error signaling. |
| Cluster I/O delegation | `cratedb_toolkit/cluster/core.py` | StandaloneCluster/ManagedCluster now delegate load_table/save_table to an instance `_router: IoRouter` (default factory); signatures updated to accept optional target and transformation. |
| Cloud model reorg | `cratedb_toolkit/io/cratedb/cloud/__init__.py`, `cratedb_toolkit/io/cratedb/cloud/model.py` | New CloudJob and CloudIoSpecs defined in model.py; `__init__` re-exports them. CloudJob normalizes job info and exposes id/status/success/message. |
| Bulk & SQL import moves | `cratedb_toolkit/io/cratedb/bulk.py`, `cratedb_toolkit/io/dynamodb/copy.py`, `cratedb_toolkit/io/mongodb/copy.py`, `cratedb_toolkit/io/sql.py` | BulkProcessor import moved to io.cratedb.bulk; bulk.py adds docstring and logger; io.sql stops re-exporting run_sql/DatabaseAdapter. |
| Model tweak | `cratedb_toolkit/model.py` | Added `TableAddress.__bool__` so truthiness depends on presence of table. |
| Tests & fixtures reorg | `tests/conftest.py`, `tests/io/test_file.py`, `tests/io/...`, `tests/cluster/test_import.py` | Added dummy_csv fixture and new file-import tests; removed tests/io/test_import.py; updated many test import paths (awslambda/kinesis, wrap_kinesis, run_sql sources). |
| Kinesis, examples & docs | `cratedb_toolkit/io/kinesis/api.py`, `examples/cdc/...`, `doc/io/dynamodb/cdc-lambda.md` | kinesis_relay now returns True; example Lambda entrypoint/handler and docs updated to CDC-oriented paths; examples/cdc/aws/kinesis_put.py builds CDC payload explicitly. |
| Misc tests updates | `tests/io/dynamodb/test_relay.py`, `tests/io/kinesis/manager.py`, `tests/io/kinesis/test_relay.py`, `tests/io/test_awslambda.py`, `tests/util/test_run_sql.py` | Updated test import sources to new module paths and updated cleanup/import references. |
| CI matrix | `.github/workflows/kinesis.yml` | Commented-out Python 3.10 entry in the workflow matrix (disabled). |
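
The "Model tweak" row is easiest to read with a small example. The sketch below assumes a simplified address type with only `schema` and `table` fields; `SketchTableAddress` and `pick_target` are hypothetical names for illustration, and the real `TableAddress` may carry more fields and logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchTableAddress:
    """Hypothetical, simplified table address."""

    schema: Optional[str] = None
    table: Optional[str] = None

    def __bool__(self) -> bool:
        # Truthiness depends only on whether a table name is present.
        return bool(self.table)


def pick_target(explicit: Optional[SketchTableAddress],
                fallback: SketchTableAddress) -> SketchTableAddress:
    """Use the explicit address only when it actually names a table."""
    # `None` and "schema without table" are both falsy, so one check covers both.
    return explicit if explicit else fallback
```

Defining `__bool__` this way lets callers write `if address:` rather than checking `address is not None and address.table` at every call site.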

Sequence Diagram(s)

sequenceDiagram
    participant Cluster as StandaloneCluster
    participant Router as IoRouter
    participant Kinesis as KinesisAdapter
    participant Dynamo as DynamoDBAdapter
    participant Mongo as MongoDBAdapter
    participant Delta as DeltaLakeAdapter
    Cluster->>Router: load_table(source, target, transformation)
    Router->>Router: parse source URL scheme
    alt kinesis scheme
        Router->>Kinesis: kinesis_relay(source_url, target_url, recipe)
        Kinesis-->>Router: bool
    else dynamodb scheme
        Router->>Dynamo: dynamodb_copy(source, target, progress=True)
        Dynamo-->>Router: bool
    else mongodb or +cdc
        Router->>Mongo: mongodb_copy / mongodb_relay_cdc(...)
        Mongo-->>Router: bool
    else deltalake
        Router->>Delta: from_deltalake(source, target)
        Delta-->>Router: bool
    else unsupported
        Router-->>Cluster: NotImplementedError
    end
    Router-->>Cluster: bool result
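
The dispatch flow in the diagram can be sketched in a few lines. This is an illustration under stated assumptions, not the PR's `IoRouter`: `SketchRouter`, `register`, and the handler signature are hypothetical, and handlers are plain callables registered up front rather than dynamically imported backend modules.

```python
from typing import Callable, Dict, Optional
from urllib.parse import urlsplit

# Hypothetical handler signature: (source_url, target_url) -> success flag.
Handler = Callable[[str, Optional[str]], bool]


class SketchRouter:
    """Hypothetical scheme-based dispatcher mirroring the diagram's branches."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, scheme: str, handler: Handler) -> None:
        self._handlers[scheme.lower()] = handler

    def load_table(self, source_url: str, target_url: Optional[str] = None) -> bool:
        # urlsplit keeps compound schemes such as "mongodb+cdc" intact
        # and returns the scheme in lower case.
        scheme = urlsplit(source_url).scheme
        handler = self._handlers.get(scheme)
        if handler is None:
            # The "unsupported" branch: signal the error explicitly.
            raise NotImplementedError(f"Unsupported source scheme: {scheme!r}")
        return handler(source_url, target_url)
```

A lookup table keyed by scheme keeps the "unsupported" branch in one place, so adding a backend means registering one more handler rather than extending an `if`/`elif` chain.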

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested reviewers

  • surister
  • hammerhead

Poem

🐰 I hopped along the router's wire,
Through Kinesis, Mongo, lakes of fire.
Cloud jobs neat, imports in a row,
Tables arrive where they should go.
A tiny hop — and data flows!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 64.71% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Chore: Refactor ctk.io package' accurately describes the main change (refactoring the IO package) and aligns with the comprehensive refactoring evident in the changeset. |
| Description check | ✅ Passed | The description 'Just maintenance: Refactoring and spring cleaning around the ctk.io package and friends, nothing wild' relates to the changeset, which demonstrates IO package refactoring, reorganization of imports, and cleanup. |
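
For readers who want to reproduce a figure like the docstring coverage above locally, here is a rough sketch using only the standard library. It assumes that coverage means "documented modules, classes, and functions divided by all of them"; the checker used in this PR may count differently (for example, by skipping private or nested definitions), so the numbers need not match. The function names are hypothetical.

```python
import ast
from typing import Tuple


def docstring_coverage(source: str) -> Tuple[int, int]:
    """Return (documented, total) for the module, its classes, and its functions."""
    tree = ast.parse(source)
    definitions = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    # Count the module itself plus every (possibly nested) class and function.
    nodes = [tree] + [node for node in ast.walk(tree) if isinstance(node, definitions)]
    documented = sum(1 for node in nodes if ast.get_docstring(node) is not None)
    return documented, len(nodes)


def coverage_percent(documented: int, total: int) -> float:
    """Coverage as a percentage; an empty input counts as fully covered."""
    return 100.0 if total == 0 else 100.0 * documented / total
```

As an arithmetic illustration only, 11 documented items out of 17 would round to 64.71%; the report does not state the actual counts.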


coderabbitai[bot]

This comment was marked as resolved.

@amotl amotl force-pushed the io-refactor branch 2 times, most recently from 74f2ced to c14fcc4 March 6, 2026 15:02
@amotl amotl marked this pull request as ready for review March 7, 2026 14:48
amotl added 2 commits March 7, 2026 16:25
test_kinesis_latest_dynamodb_cdc_insert_update - assert 0 == 1
@amotl amotl merged commit 341db38 into main Mar 7, 2026
17 of 19 checks passed
@amotl amotl deleted the io-refactor branch March 7, 2026 16:12