Skip to content

feat: add RegionId to FileId #6410

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 27 commits into
base: main
Choose a base branch
from

Conversation

waynexia
Copy link
Member

Signed-off-by: Ruihang Xia [email protected]I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

What's changed and what's your intention?

Add a RegionId field to FileId struct

PR Checklist

Please convert it to a draft if some of the following conditions are not met.

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.
  • API changes are backward compatible.
  • Schema or data changes are backward compatible.

waynexia added 4 commits June 24, 2025 15:39
Signed-off-by: Ruihang Xia <[email protected]>
Signed-off-by: Ruihang Xia <[email protected]>
Signed-off-by: Ruihang Xia <[email protected]>
@waynexia waynexia requested review from v0y4g3r, evenyag and Copilot June 26, 2025 09:56
@github-actions github-actions bot added size/M docs-not-required This change does not impact docs. labels Jun 26, 2025
@waynexia waynexia mentioned this pull request Jun 26, 2025
4 tasks
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors stale manifest structures and updates file identification by adding a RegionId field to the FileId struct. Key changes include removing the old manifest module, updating API paths and file handling functions to use table_dir instead of region_dir, and replacing FileId::random() calls with FileId::new(region_id).

Reviewed Changes

Copilot reviewed 31 out of 31 changed files in this pull request and generated no comments.

File Description
src/store-api/src/manifest.rs Removed stale manifest module
src/store-api/src/lib.rs Commented out manifest module and added manifest version constants
src/mito2/* Updated API and tests to use table_dir and new FileId creation methods
Comments suppressed due to low confidence (2)

src/mito2/src/sst/file.rs:48

  • The doc comment for FileId declares its string representation as {region_id}_{uuid} but the Display implementation formats it as {region_name}/{uuid}. Please update the documentation or the implementation for consistency.
///

src/mito2/src/sst/file.rs:575

  • The test 'test_file_id_serialization' is currently marked as ignored with a TODO comment. It is recommended to either fix or remove the test to ensure adequate test coverage for FileId serialization.
    #[ignore = "TODO: fix this test or remove it"]

@waynexia waynexia changed the title refactor: remove staled manifest structures feat: add RegionId to FileId Jun 27, 2025
waynexia added 2 commits June 27, 2025 10:56
Signed-off-by: Ruihang Xia <[email protected]>
@evenyag
Copy link
Contributor

evenyag commented Jun 27, 2025

I'm going to refactor this by adding another struct with region id and file id. Adding region id to FileId looks strange.

@v0y4g3r Do you have any suggestions for this new struct? RegionFileId, RegionIdAndFileId, FileIdWithRegion.

@v0y4g3r
Copy link
Contributor

v0y4g3r commented Jun 27, 2025

RegionFileId

I'm going to refactor this by adding another struct with region id and file id. Adding region id to FileId looks strange.

@v0y4g3r Do you have any suggestions for this new struct? RegionFileId, RegionIdAndFileId, FileIdWithRegion.

➕1️⃣ for RegionFileId.

evenyag added 2 commits June 27, 2025 22:54
- FileId still only consist of an uuid
- PathProvider accepts RegionFileId and doesn't need to keep a region id
  in it
- All Index applier takes RegionFileId and respects the region id in the RegionFileId
- FileMeta can still derive Serialize/Deserialize
- Refactor the CacheManager to accept RegionFileId

Signed-off-by: evenyag <[email protected]>
@evenyag
Copy link
Contributor

evenyag commented Jun 27, 2025

I'm going to refactor this by adding another struct with region id and file id. Adding region id to FileId looks strange.

Done in a3bfb58

I also modified the code that uses both region id and file id to respect the region id in the RegionFileId, so the touched files are more than refactoring. But it should avoid some work to modify related structs later. We can still use derive(Serialize, Deserialize) after this commit.

@v0y4g3r @waynexia

@evenyag evenyag requested a review from a team as a code owner July 1, 2025 09:16
@evenyag evenyag force-pushed the file-id-with-region branch from dc11254 to 0b23527 Compare July 1, 2025 09:17
evenyag added 9 commits July 1, 2025 17:19
Signed-off-by: evenyag <[email protected]>
Move region_dir_from_table_dir to mito and use join_dir internally

Signed-off-by: evenyag <[email protected]>
Signed-off-by: evenyag <[email protected]>
We can get table_dir and path_type from the access layer

Signed-off-by: evenyag <[email protected]>
Signed-off-by: evenyag <[email protected]>
@evenyag evenyag force-pushed the file-id-with-region branch from 0b23527 to 31eceb5 Compare July 1, 2025 09:19
Signed-off-by: evenyag <[email protected]>
@evenyag evenyag requested a review from zhongzc as a code owner July 1, 2025 09:20
Signed-off-by: evenyag <[email protected]>
@evenyag
Copy link
Contributor

evenyag commented Jul 1, 2025

We have to introduce an enum PathType to compute the region dir in the mito region correctly.

pub enum PathType {
    /// A bare path - the original path of an engine.
    ///
    /// The path prefix is `{table_dir}/{table_id}_{region_sequence}/`.
    Bare,
    /// A path for the data region of a metric engine table.
    ///
    /// The path prefix is `{table_dir}/{table_id}_{region_sequence}/data/`.
    Data,
    /// A path for the metadata region of a metric engine table.
    ///
    /// The path prefix is `{table_dir}/{table_id}_{region_sequence}/metadata/`.
    Metadata,
}

I added the path_type to RegionCreateRequest and RegionOpenRequest. I renamed the region_dir of these requests to table_dir.

Note that when opening a region in the metric engine, we still specify PathType::Bare.

We still need to rename other region_dir fields in the mito crate to table_dir later. This PR is already too large, so I didn't do this refactoring.

@evenyag evenyag force-pushed the file-id-with-region branch from 9c509eb to 3fc2e64 Compare July 1, 2025 11:54
@killme2008
Copy link
Contributor

Lots of conflicts @waynexia

@evenyag
Copy link
Contributor

evenyag commented Jul 7, 2025

Lots of conflicts @waynexia

I'll address them later. It's ok to review before addressing the conflicts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs-not-required This change does not impact docs. size/M
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants