Walkthrough
Added a CSV import adapter and its wiring: a new CSV loader module with URL parsing, batching, and optional MacroPipe transformations; router dispatch; tests and fixtures; model and utility adjustments; pyproject and CI tweaks; and a changelog entry. No API-breaking changes to existing public functions beyond adding TableAddress.if_exists.
Sequence Diagram(s)
sequenceDiagram
participant User as "User"
participant Router as "IoRouter"
participant CsvAddr as "CsvFileAddress"
participant Polars as "Polars"
participant Macro as "MacroPipe"
participant Crate as "CrateDB"
User->>Router: load_table(csv_source, target)
Router->>CsvAddr: from_url(csv_source)
CsvAddr->>CsvAddr: parse URI, extract batch/pipeline/storage opts
CsvAddr->>Polars: scan_csv(path, sep, quote, storage_options)
Polars-->>CsvAddr: LazyFrame
alt pipeline present
CsvAddr->>Macro: MacroPipe.from_recipes(pipeline)
Macro->>Macro: apply(LazyFrame)
Macro-->>CsvAddr: transformed LazyFrame
end
CsvAddr->>Polars: collect() / chunk
Polars-->>CsvAddr: DataFrame chunk
CsvAddr->>Crate: polars_to_cratedb(chunk, target_url, if_exists)
Crate-->>CsvAddr: insert result
CsvAddr-->>User: success/failure
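The flow in the diagram (lazy read, optional transformation, chunked insert) can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the adapter's implementation: `read_rows` stands in for `pl.scan_csv`, plain row functions stand in for MacroPipe, and `sink` stands in for the CrateDB insert step. All names in this block (`read_rows`, `batched`, `load`, `sink`) are hypothetical.

```python
import csv
import io
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List, Sequence

Row = Dict[str, str]
Transform = Callable[[Row], Row]


def read_rows(text: str, separator: str = ",", quote_char: str = '"') -> Iterator[Row]:
    """Lazily yield rows from CSV text (stand-in for a lazy CSV scan)."""
    reader = csv.DictReader(io.StringIO(text), delimiter=separator, quotechar=quote_char)
    yield from reader


def batched(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Group rows into lists of at most `size` items."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def load(
    text: str,
    sink: Callable[[List[Row]], None],
    pipeline: Sequence[Transform] = (),
    batch_size: int = 2,
) -> int:
    """Read, optionally transform, and hand rows to `sink` chunk by chunk."""
    rows: Iterable[Row] = read_rows(text)
    # Apply each transformation lazily, in the order given.
    for step in pipeline:
        rows = map(step, rows)
    total = 0
    for chunk in batched(rows, batch_size):
        sink(chunk)  # hypothetical stand-in for the database insert
        total += len(chunk)
    return total
```

Because the rows stay lazy until `batched` pulls them, only one chunk is materialized at a time, which mirrors the collect-per-chunk step shown in the diagram.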
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
def load_table(self) -> pl.LazyFrame:
    """
    Load the CSV file as a Polars LazyFrame.
    """

    # Read from data source.
    lf = pl.scan_csv(
        self.location,
        separator=self.separator,
        quote_char=self.quote_char,
        storage_options=self.storage_options,
    )

    # Optionally apply transformations.
    if self.pipeline:
        from macropipe import MacroPipe

        mp = MacroPipe.from_recipes(*self.pipeline)
        lf = mp.apply(lf)

    return lf
This is where Macropipe comes into play, providing a concisely configurable transformation unit to your ingress channel.
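To make the idea of a per-column transformation concrete, here is a small standard-library sketch. It is an assumption-heavy illustration: the behavior of the two functions is inferred only from the recipe names used in the test fixtures (`json_array_to_wkt_point`, `python_to_json`), and the real MacroPipe operates on Polars LazyFrames rather than plain dictionaries. The `apply_recipes` helper is hypothetical.

```python
import ast
import json
from typing import Callable, Dict, Sequence, Tuple


def json_array_to_wkt_point(value: str) -> str:
    """Assumed behavior: turn a JSON array like "[9.7, 47.4]" into a WKT point."""
    x, y = json.loads(value)
    return f"POINT ({x} {y})"


def python_to_json(value: str) -> str:
    """Assumed behavior: re-encode a Python literal (single quotes, None, True) as JSON."""
    return json.dumps(ast.literal_eval(value))


def apply_recipes(
    row: Dict[str, str],
    recipes: Sequence[Tuple[Callable[[str], str], str]],
) -> Dict[str, str]:
    """Apply (function, column) pairs to a copy of the row, in order."""
    out = dict(row)
    for func, column in recipes:
        out[column] = func(out[column])
    return out


row = {"geo_location": "[9.7, 47.4]", "data": "{'temperature': 21.5, 'note': None}"}
converted = apply_recipes(
    row,
    [(json_array_to_wkt_point, "geo_location"), (python_to_json, "data")],
)
```

`ast.literal_eval` is used instead of `eval` so that only Python literals are accepted, never arbitrary expressions.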
climate_json_json = (
    str(data_folder / "climate_json_json.csv") + "?quote-char='&pipe=json_array_to_wkt_point:geo_location"
)
climate_json_python = (
    str(data_folder / "climate_json_python.csv")
    + '?quote-char="&pipe=json_array_to_wkt_point:geo_location&pipe=python_to_json:data'
)
climate_wkt_json = str(data_folder / "climate_wkt_json.csv") + "?quote-char='"
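The fixture strings above carry their options in the query string: `quote-char` once, and `pipe` repeated for each recipe in `name:column` form. A hedged sketch of how such a source string could be split, using only `urllib.parse`, follows. The function name `parse_csv_source`, the returned dictionary layout, and the default quote character are assumptions made for illustration; they are not taken from the adapter's code.

```python
from typing import Dict, List, Tuple, Union
from urllib.parse import parse_qs, urlsplit

Options = Dict[str, Union[str, List[Tuple[str, str]]]]


def parse_csv_source(source: str) -> Options:
    """Split a CSV source string into location, quote character, and pipeline."""
    parts = urlsplit(source)
    query = parse_qs(parts.query)

    # Each `pipe` value has the form "recipe_name:column_name".
    pipeline: List[Tuple[str, str]] = []
    for recipe in query.get("pipe", []):
        name, _, column = recipe.partition(":")
        pipeline.append((name, column))

    return {
        "location": parts.path,
        # Assumed default: a double quote when `quote-char` is absent.
        "quote_char": query.get("quote-char", ['"'])[0],
        "pipeline": pipeline,
    }


options = parse_csv_source(
    "/data/climate_json_python.csv"
    '?quote-char="&pipe=json_array_to_wkt_point:geo_location&pipe=python_to_json:data'
)
```

`parse_qs` keeps repeated keys as a list in their original order, so the recipes come out in the order they were written. Paths with a drive letter (such as `C:\...`) would need separate handling, since `urlsplit` may read the drive letter as a URL scheme.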
5251ddd to b8ae509
About
Support CSV file imports with special needs.
Poem
I sniff the CSV, a ribboned trail,
Pipes and separators wag their tail,
Polars hum, macro-steps unfold,
Chunks hop into Crate, stories told.
References
Backlog