
feat: implement export-v2 chunked data export flow #7841

Merged
fengjiachun merged 9 commits into GreptimeTeam:main from fengjiachun:feat/export-import-v2-pr2
Mar 31, 2026

Conversation

@fengjiachun
Collaborator

@fengjiachun fengjiachun commented Mar 21, 2026

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

What's changed and what's your intention?

As Gemini said (see the Gemini Code Assist summary in this conversation).

PR Checklist

Please convert it to a draft if some of the following conditions are not met.

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.
  • API changes are backward compatible.
  • Schema or data changes are backward compatible.

@fengjiachun fengjiachun requested a review from a team as a code owner March 21, 2026 04:21
@github-actions github-actions bot added the size/L and docs-not-required (This change does not impact docs.) labels Mar 21, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the export-v2 CLI command by introducing a chunked data export mechanism. This change allows for more granular and resilient data exports, particularly for large datasets, by dividing the export into manageable time-based chunks. It lays the groundwork for a more robust data management pipeline, while temporarily restricting full snapshot imports until the corresponding data import logic is in place.
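To make the idea of time-based chunks concrete, here is a minimal sketch of how a time range could be divided into fixed-size windows. It is not the PR's `chunker` module: the `TimeRange` struct, the `split_into_chunks` function, and the millisecond units are assumptions chosen for illustration.

```rust
/// Hypothetical half-open time range `[start_ms, end_ms)`, in milliseconds
/// since the Unix epoch. Not the type used by the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TimeRange {
    pub start_ms: i64,
    pub end_ms: i64,
}

/// Splits `range` into consecutive windows of at most `window_ms` milliseconds.
/// Returns an empty vector for an empty or inverted range, or a non-positive window.
pub fn split_into_chunks(range: TimeRange, window_ms: i64) -> Vec<TimeRange> {
    let mut chunks = Vec::new();
    if window_ms <= 0 || range.start_ms >= range.end_ms {
        return chunks;
    }
    let mut cursor = range.start_ms;
    while cursor < range.end_ms {
        // saturating_add avoids overflow near i64::MAX; min() clamps the last window.
        let next = cursor.saturating_add(window_ms).min(range.end_ms);
        chunks.push(TimeRange { start_ms: cursor, end_ms: next });
        cursor = next;
    }
    chunks
}
```

With a range of `[0, 2500)` and a 1000 ms window, this sketch yields three chunks, the last one being shorter than the window. Half-open ranges are used so that adjacent chunks never export the same row twice.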

Highlights

  • Chunked Data Export Implementation: Implemented a new chunked data export flow for the export-v2 CLI command, allowing data to be exported in time-based segments.
  • New Modules for Export V2: Introduced chunker, coordinator, and data modules within export_v2 to handle chunk generation, export orchestration, and SQL COPY DATABASE statement construction, respectively.
  • Export Command Enhancements: The export-v2 create command now supports a --chunk-time-window option for defining chunk sizes and includes improved validation for time ranges.
  • Manifest Updates: The export manifest (manifest.rs) has been extended to store chunk metadata, track chunk status (InProgress, Completed, Skipped, Failed), and parse time ranges from input.
  • Import Command Restrictions: The import-v2 command now explicitly rejects importing full snapshots that contain data chunks, as the data import functionality for these snapshots is not yet implemented.
  • Object Storage Utilities: Added create_dir_all and list_files_recursive methods to the SnapshotStorage trait and its OpenDalStorage implementation to support directory creation and file listing in object storage.
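
The manifest and import-restriction highlights above can be pictured with a small sketch. Only the four status names (`InProgress`, `Completed`, `Skipped`, `Failed`) come from the summary; the `ChunkMeta` and `Manifest` shapes, their field names, and the `check_importable` helper are hypothetical and do not reflect the actual `manifest.rs`.

```rust
/// Chunk states named in the PR summary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChunkStatus {
    InProgress,
    Completed,
    Skipped,
    Failed,
}

/// Hypothetical per-chunk metadata; field names are assumptions.
#[derive(Debug, Clone)]
pub struct ChunkMeta {
    pub id: u32,
    pub start_ms: i64,
    pub end_ms: i64,
    pub status: ChunkStatus,
}

/// Hypothetical manifest holding only the chunk list.
#[derive(Debug, Default)]
pub struct Manifest {
    pub chunks: Vec<ChunkMeta>,
}

impl Manifest {
    /// True when every chunk is either completed or skipped.
    /// An empty chunk list counts as complete.
    pub fn is_complete(&self) -> bool {
        self.chunks
            .iter()
            .all(|c| matches!(c.status, ChunkStatus::Completed | ChunkStatus::Skipped))
    }

    /// Mirrors the described import restriction: a snapshot that carries
    /// data chunks is rejected until data import is implemented.
    pub fn check_importable(&self) -> Result<(), String> {
        if self.chunks.is_empty() {
            Ok(())
        } else {
            Err(format!(
                "snapshot contains {} data chunk(s); data import is not supported yet",
                self.chunks.len()
            ))
        }
    }
}
```

Rejecting early with an explicit error, rather than silently importing only the schema, keeps a user from believing a full snapshot was restored when its data was not.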

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request implements the chunked data export flow for export-v2, a significant feature enhancement. The changes are well-structured, introducing new modules for chunking, coordination, and data handling, which improves modularity. The implementation correctly handles resumable exports by updating a manifest file at each step of the process. New error types and comprehensive tests have been added. I've included one suggestion to improve code clarity by reducing duplication in the export coordinator.
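
The resumable behaviour described above, where the manifest is updated at each step, can be sketched roughly as follows. This is an illustration under assumptions, not the PR's coordinator: `run_export`, the `Chunk` struct, the `Pending` state, and the closure-based `export` and `persist` hooks are all hypothetical.

```rust
/// Chunk states; `Pending` is a hypothetical addition for chunks not yet started.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Pending,
    InProgress,
    Completed,
    Skipped,
    Failed,
}

/// Hypothetical chunk record tracked by the manifest.
#[derive(Debug, Clone)]
pub struct Chunk {
    pub id: u32,
    pub status: Status,
}

/// Exports every unfinished chunk in order, persisting the manifest before
/// and after each chunk so an interrupted run can resume where it stopped.
/// `export` returns Ok(true) if data was written and Ok(false) if the chunk was empty.
pub fn run_export<E, P>(chunks: &mut [Chunk], mut export: E, mut persist: P) -> Result<(), String>
where
    E: FnMut(&Chunk) -> Result<bool, String>,
    P: FnMut(&[Chunk]),
{
    for i in 0..chunks.len() {
        // Finished chunks from a previous run are not exported again.
        if matches!(chunks[i].status, Status::Completed | Status::Skipped) {
            continue;
        }
        chunks[i].status = Status::InProgress;
        persist(&*chunks);

        let outcome = export(&chunks[i]);
        chunks[i].status = match &outcome {
            Ok(true) => Status::Completed,
            Ok(false) => Status::Skipped,
            Err(_) => Status::Failed,
        };
        persist(&*chunks);

        // Stop at the first failure; the failed status is already persisted.
        outcome?;
    }
    Ok(())
}
```

Persisting both before and after each chunk means a crash mid-chunk leaves that chunk marked `InProgress`, so a later run retries it instead of trusting partial output.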


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4e6af2c307


@fengjiachun fengjiachun requested review from WenyXu and discord9 March 21, 2026 05:25
@github-actions github-actions bot added size/XL and removed size/L labels Mar 30, 2026
Signed-off-by: jeremyhi <fengjiachun@gmail.com>
@fengjiachun fengjiachun force-pushed the feat/export-import-v2-pr2 branch from 4194dab to e021d7f Compare March 30, 2026 21:20
Contributor

@discord9 discord9 left a comment


rest LGTM

Signed-off-by: jeremyhi <fengjiachun@gmail.com>
@fengjiachun fengjiachun enabled auto-merge March 31, 2026 22:13
@fengjiachun fengjiachun added this pull request to the merge queue Mar 31, 2026
Merged via the queue into GreptimeTeam:main with commit ab10696 Mar 31, 2026
46 checks passed
@fengjiachun fengjiachun deleted the feat/export-import-v2-pr2 branch March 31, 2026 23:18

Labels

docs-not-required (This change does not impact docs.), size/XL


3 participants