
mito: Write cache for remote object store #2965

@evenyag

Description


What type of enhancement is this?

Performance

What does the enhancement do?

Currently, the mito storage engine writes SST files directly to the remote object store.

pub struct ParquetWriter {
    /// SST output file path.
    file_path: String,
    /// Input data source.
    source: Source,
    /// Region metadata of the source and the target SST.
    metadata: RegionMetadataRef,
    /// Remote object store to write the SST to.
    object_store: ObjectStore,
}

To read the file later, we have to fetch the object from the object store again. If we implement a write-through cache for parquet files, reads can be served from the local copy and we don't need to download the object again.

Implementation challenges

Writing the file to a local cache before uploading might increase the cost of uploading an object and the memory pressure from memtables, since memtables stay in memory until the flush completes.

  • A better approach is to release the memtable once we flush the file to the write cache.
  • We update the manifest only after the object is fully uploaded to the remote object store.

To implement async upload, we need to store additional metadata, such as the flushed sequence and the region id, for files in the write cache. The region edit also requires memtable ids to remove flushed memtables. Since memtable ids are incremented globally, we should switch to using the minimum sequence of a memtable to identify it.

For simplicity, we can implement the sync version first, which returns only after files are uploaded.

Steps

Further discussions

  • If the engine opens a region in a fresh environment without the write cache, replaying the WAL might cause OOM.
  • We can trigger a flush during replay to avoid OOM, and upload the flushed files once we have write permission.
  • Delay uploading level 0 files, since we might compact them later.

Related Issues

It should be part of #2516.
