kvcache-ai
diff --git a/docs/source/design/mooncake-store.md b/docs/source/design/mooncake-store.md
Lines changed: 53 additions & 0 deletions
diff --git a/docs/source/python-api-reference/mooncake-store.md b/docs/source/python-api-reference/mooncake-store.md
Lines changed: 241 additions & 0 deletions
@@ -105,6 +105,25 @@ struct ReplicateConfig {
 };
 ```
 
+### Upsert
+
+```C++
+tl::expected<void, ErrorCode> Upsert(const ObjectKey& key,
+                                     std::vector<Slice>& slices,
+                                     const ReplicateConfig& config);
+
+std::vector<tl::expected<void, ErrorCode>> BatchUpsert(
+    const std::vector<ObjectKey>& keys,
+    std::vector<std::vector<Slice>>& batched_slices,
+    const ReplicateConfig& config);
+```
+
+`Upsert` inserts `key` if it does not exist and updates the existing object if
+it does. It uses the same replication configuration model as `Put`, while
+allowing the store to reuse existing placement for in-place updates when the
+current layout permits it. `BatchUpsert` performs the same operation for
+multiple keys using a shared replication configuration.
+
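The insert-or-update behavior described above can be illustrated with a toy in-memory model. This is only a sketch: `ToyStore` and its return strings are hypothetical, and "the current layout permits it" is modeled here as "the new value has the same byte length", which is an assumption rather than the store's actual rule.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyStore:
    """Hypothetical in-memory stand-in; not the Mooncake Store implementation."""

    objects: Dict[str, bytearray] = field(default_factory=dict)

    def upsert(self, key: str, value: bytes) -> str:
        existing = self.objects.get(key)
        if existing is None:
            # Key is absent: behaves like a plain Put.
            self.objects[key] = bytearray(value)
            return "inserted"
        if len(existing) == len(value):
            # Assumption: an unchanged byte length means the layout permits reuse.
            existing[:] = value  # overwrite the existing allocation in place
            return "updated_in_place"
        # Layout changed: allocate a fresh buffer for the new contents.
        self.objects[key] = bytearray(value)
        return "reallocated"


store = ToyStore()
outcomes = [
    store.upsert("k", b"aaaa"),  # key missing
    store.upsert("k", b"bbbb"),  # same length as the stored value
    store.upsert("k", b"cc"),    # different length
]
print(outcomes)
```

The three calls exercise the insert path, the in-place path, and the reallocation path in turn.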
 ### Remove
 
 ```C++
@@ -516,6 +535,40 @@ The Master Service handles object-related interfaces as follows:
 
 Before writing an object, the Client calls PutStart to request storage space allocation from the Master Service. After completing data writing, the Client calls PutEnd to notify the Master Service to mark the object write as completed.
 
+- Upsert
+
+```C++
+tl::expected<std::vector<Replica::Descriptor>, ErrorCode> UpsertStart(
+    const std::string& key,
+    const std::vector<size_t>& slice_lengths,
+    const ReplicateConfig& config);
+
+std::vector<tl::expected<std::vector<Replica::Descriptor>, ErrorCode>>
+BatchUpsertStart(const std::vector<std::string>& keys,
+                 const std::vector<std::vector<uint64_t>>& slice_lengths,
+                 const ReplicateConfig& config);
+
+tl::expected<void, ErrorCode> UpsertEnd(
+    const std::string& key, ReplicaType replica_type);
+
+std::vector<tl::expected<void, ErrorCode>> BatchUpsertEnd(
+    const std::vector<std::string>& keys);
+
+tl::expected<void, ErrorCode> UpsertRevoke(
+    const std::string& key, ReplicaType replica_type);
+
+std::vector<tl::expected<void, ErrorCode>> BatchUpsertRevoke(
+    const std::vector<std::string>& keys);
+```
+
+`UpsertStart` / `UpsertEnd` / `UpsertRevoke` mirror the existing put lifecycle
+but apply insert-or-update semantics. If the key does not exist, the flow
+behaves like `PutStart`. If the key already exists, the Master may reuse the
+current allocation for an in-place update or allocate new space when the object
+layout changes. The batch variants provide the same control flow for multiple
+keys and are the lower-level primitives used by the high-level `BatchUpsert`
+path.
+
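The control flow around these primitives can be sketched from the client's point of view. `ToyMaster`, `upsert_with_lifecycle`, and the `transfer` callback are hypothetical names used for illustration, and calling revoke when the data transfer fails is an assumption inferred from the put lifecycle, not something the text above specifies.

```python
from typing import Callable, List, Tuple


class ToyMaster:
    """Hypothetical stand-in that only records which lifecycle calls were made."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str]] = []

    def upsert_start(self, key: str, slice_lengths: List[int]) -> None:
        self.calls.append(("start", key))

    def upsert_end(self, key: str) -> None:
        self.calls.append(("end", key))

    def upsert_revoke(self, key: str) -> None:
        self.calls.append(("revoke", key))


def upsert_with_lifecycle(
    master: ToyMaster,
    key: str,
    data: bytes,
    transfer: Callable[[str, bytes], None],
) -> bool:
    # 1. Ask the master for space (new allocation or reuse of the current one).
    master.upsert_start(key, [len(data)])
    try:
        # 2. Write the payload to the replicas returned by the start call.
        transfer(key, data)
    except OSError:
        # 3a. Assumed failure path: tell the master to drop the pending write.
        master.upsert_revoke(key)
        return False
    # 3b. Success path: mark the write as completed.
    master.upsert_end(key)
    return True


def ok_transfer(key: str, data: bytes) -> None:
    return None


def failing_transfer(key: str, data: bytes) -> None:
    raise OSError("simulated transfer failure")


master = ToyMaster()
results = [
    upsert_with_lifecycle(master, "a", b"1234", ok_transfer),
    upsert_with_lifecycle(master, "b", b"5678", failing_transfer),
]
print(results, master.calls)
```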
 - GetReplicaList
 
 ```C++
 
@@ -629,6 +629,120 @@ result = store.put_batch(keys, values)
 
 ---
 
+#### upsert()
+
+Insert a new object if the key does not exist, or update the existing object in place when possible. The upsert methods use the same replication configuration model as `put()`.
+
+Upsert binary data in the distributed storage.
+
+```python
+def upsert(self, key: str, value: bytes, config: ReplicateConfig = None) -> int
+```
+
+**Parameters:**
+- `key` (str): Unique object identifier
+- `value` (bytes): Binary data to insert or update
+- `config` (ReplicateConfig, optional): Replication configuration
+
+**Returns:**
+- `int`: Status code (0 = success, non-zero = error code)
+
+**Example:**
+```python
+config = ReplicateConfig()
+config.replica_num = 2
+
+rc = store.upsert("weights", b"new-bytes", config)
+if rc == 0:
+    print("Upsert succeeded")
+```
+
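Because `upsert()` reports failure through a non-zero return value rather than an exception, callers may want to convert status codes into errors at one place. A minimal sketch follows; `StoreError` and `check_rc` are hypothetical helpers, not part of the Mooncake API, and `-1` is a placeholder rather than a real error code.

```python
class StoreError(RuntimeError):
    """Hypothetical exception type carrying the failed operation and its code."""

    def __init__(self, op: str, key: str, code: int) -> None:
        super().__init__(f"{op}({key!r}) failed with status {code}")
        self.op = op
        self.key = key
        self.code = code


def check_rc(op: str, key: str, rc: int) -> None:
    """Raise StoreError for any non-zero status code; return None on success."""
    if rc != 0:
        raise StoreError(op, key, rc)


# With a real client this would be: check_rc("upsert", "weights", store.upsert(...))
check_rc("upsert", "weights", 0)  # success: no exception

try:
    check_rc("upsert", "weights", -1)  # placeholder failure code
except StoreError as exc:
    caught = exc
    print(exc)
```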
+#### upsert_from()
+
+Upsert object data directly from a pre-allocated buffer (zero-copy).
+
+```python
+def upsert_from(self, key: str, buffer_ptr: int, size: int, config: ReplicateConfig = None) -> int
+```
+
+**Parameters:**
+- `key` (str): Object identifier
+- `buffer_ptr` (int): Memory address of the source buffer
+- `size` (int): Number of bytes to insert or update
+- `config` (ReplicateConfig, optional): Replication configuration
+
+**Returns:**
+- `int`: Status code (0 = success, non-zero = error code)
+
+**Note:** This is the zero-copy counterpart of `upsert()`. As with
+`put_from()`, register the buffer before issuing the request.
+
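One way to obtain the `buffer_ptr` and `size` arguments is from a `ctypes` buffer, sketched below using only the standard library. The store calls are left as comments because they need a running client; the variable name `store` and the use of `register_buffer` as the registration call are assumptions here.

```python
import ctypes

payload = b"new-bytes"

# Pre-allocate a buffer of exactly len(payload) bytes and fill it once.
buf = (ctypes.c_char * len(payload)).from_buffer_copy(payload)

buffer_ptr = ctypes.addressof(buf)  # integer address, as `buffer_ptr` expects
size = ctypes.sizeof(buf)           # number of bytes to upsert

# With a configured client (assumed to be named `store`):
#   store.register_buffer(buffer_ptr, size)  # assumed registration call
#   rc = store.upsert_from("weights", buffer_ptr, size)

# Sanity check: the (buffer_ptr, size) pair describes the intended bytes.
roundtrip = ctypes.string_at(buffer_ptr, size)
print(roundtrip)
```

Keep `buf` referenced for as long as the store may read from it; once the Python object is garbage-collected, the address is no longer valid.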
+#### batch_upsert_from()
+
+Upsert multiple objects directly from pre-allocated buffers.
+
+```python
+def batch_upsert_from(self, keys: List[str], buffer_ptrs: List[int], sizes: List[int],
+                      config: ReplicateConfig = None) -> List[int]
+```
+
+**Parameters:**
+- `keys` (List[str]): List of object identifiers
+- `buffer_ptrs` (List[int]): List of source buffer addresses
+- `sizes` (List[int]): List of byte lengths for each buffer
+- `config` (ReplicateConfig, optional): Replication configuration shared by all objects
+
+**Returns:**
+- `List[int]`: List of status codes for each upsert
+
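Since the batch call returns one status code per object, a caller typically pairs the codes with the keys to find what needs a retry. The helper below is a hypothetical sketch that assumes the returned list is positionally aligned with `keys`; the codes shown are placeholders, not real Mooncake error codes.

```python
from typing import Dict, List


def failed_upserts(keys: List[str], codes: List[int]) -> Dict[str, int]:
    """Map each key whose status code is non-zero to that code."""
    if len(keys) != len(codes):
        raise ValueError("keys and codes must have the same length")
    return {key: code for key, code in zip(keys, codes) if code != 0}


keys = ["layer0", "layer1", "layer2"]
# With a real client: codes = store.batch_upsert_from(keys, buffer_ptrs, sizes)
codes = [0, -1, 0]  # placeholder results for illustration

failures = failed_upserts(keys, codes)
print(failures)
```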
+#### upsert_parts()
+
+Upsert data from multiple buffer parts as a single object (insert or update).
+
+```python
+def upsert_parts(self, key: str, *parts, config: ReplicateConfig = None) -> int
+```
+
+**Parameters:**
+- `key` (str): Object identifier
+- `*parts`: Variable number of bytes-like objects to concatenate
+- `config` (ReplicateConfig, optional): Replication configuration
+
+**Returns:**
+- `int`: Status code (0 = success, non-zero = error code)
+
+**Example:**
+```python
+part1 = b"Hello, "
+part2 = b"World!"
+result = store.upsert_parts("greeting", part1, part2)
+```
+
+#### upsert_batch()
+
+Upsert multiple objects in a single batch operation.
+
+```python
+def upsert_batch(self, keys: List[str], values: List[bytes], config: ReplicateConfig = None) -> int
+```
+
+**Parameters:**
+- `keys` (List[str]): List of object identifiers
+- `values` (List[bytes]): List of binary data to insert or update
+- `config` (ReplicateConfig, optional): Replication configuration for all objects
+
+**Returns:**
+- `int`: Status code (0 = success, non-zero = error code)
+
+**Example:**
+```python
+keys = ["key1", "key2", "key3"]
+values = [b"value1", b"value2", b"value3"]
+result = store.upsert_batch(keys, values)
+```
+
+---
+
 #### get_batch()
 Retrieve multiple objects in a single batch operation.
 
@@ -1533,6 +1647,133 @@ def batch_pub_tensor(self, keys: List[str], tensors_list: List[torch.Tensor], co
 
 ---
 
+#### upsert_tensor()
+
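+Insert a tensor if its key is missing, or update the existing tensor if the key already exists. `upsert_tensor()`, `upsert_tensor_from()`, `batch_upsert_tensor_from()`, and `batch_upsert_tensor()` use the default `ReplicateConfig` and therefore do not take a `config` parameter; `upsert_pub_tensor()` and `batch_upsert_pub_tensor()` accept one.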
+
+Upsert a PyTorch tensor into the store.
+
+```python
+def upsert_tensor(self, key: str, tensor: torch.Tensor) -> int
+```
+
+**Parameters:**
+- `key` (str): Object identifier
+- `tensor` (torch.Tensor): The PyTorch tensor to insert or update
+
+**Returns:**
+- `int`: Status code (0 = success, non-zero = error code)
+
+**Note:** This function requires `torch` to be installed and available in the environment.
+
+#### upsert_tensor_from()
+
+Upsert a tensor directly from a pre-allocated buffer. The buffer layout must be
+`[TensorMetadata][tensor data]`, matching the layout used by
+`get_tensor_into()`.
+
+```python
+def upsert_tensor_from(self, key: str, buffer_ptr: int, size: int) -> int
+```
+
+**Parameters:**
+- `key` (str): Object identifier
+- `buffer_ptr` (int): Buffer pointer containing serialized tensor metadata and payload
+- `size` (int): Actual serialized byte length of the tensor buffer
+
+**Returns:**
+- `int`: Status code (0 = success, non-zero = error code)
+
+**Note:** This function is not supported for the dummy client.
+
+#### batch_upsert_tensor_from()
+
+Upsert multiple tensors directly from pre-allocated buffers. Each buffer must
+use layout `[TensorMetadata][tensor data]`.
+
+```python
+def batch_upsert_tensor_from(self, keys: List[str], buffer_ptrs: List[int], sizes: List[int]) -> List[int]
+```
+
+**Parameters:**
+- `keys` (List[str]): List of object identifiers
+- `buffer_ptrs` (List[int]): List of serialized tensor buffer pointers
+- `sizes` (List[int]): List of actual serialized byte lengths
+
+**Returns:**
+- `List[int]`: List of status codes for each tensor upsert
+
+#### batch_upsert_tensor()
+
+Upsert a batch of PyTorch tensors into the store (insert or update).
+
+```python
+def batch_upsert_tensor(self, keys: List[str], tensors_list: List[torch.Tensor]) -> List[int]
+```
+
+**Parameters:**
+- `keys` (List[str]): List of object identifiers
+- `tensors_list` (List[torch.Tensor]): List of tensors to insert or update
+
+**Returns:**
+- `List[int]`: List of status codes for each tensor operation.
+
+**Note:** This function requires `torch` to be installed and available in the environment. It is not supported for the dummy client.
+
+#### upsert_pub_tensor()
+
+Upsert a PyTorch tensor with configurable replication settings (insert or update).
+
+```python
+def upsert_pub_tensor(self, key: str, tensor: torch.Tensor, config: ReplicateConfig = None) -> int
+```
+
+**Parameters:**
+- `key` (str): Unique object identifier
+- `tensor` (torch.Tensor): PyTorch tensor to insert or update
+- `config` (ReplicateConfig, optional): Replication configuration
+
+**Returns:**
+- `int`: Status code (0 = success, non-zero = error code)
+
+**Note:** This function requires `torch` to be installed and available in the environment. It is not supported for the dummy client.
+
+**Example:**
+```python
+import torch
+from mooncake.store import ReplicateConfig
+
+tensor = torch.randn(100, 100)
+
+config = ReplicateConfig()
+config.replica_num = 2
+config.with_soft_pin = True
+
+result = store.upsert_pub_tensor("my_tensor", tensor, config)
+if result == 0:
+    print("Tensor upserted successfully")
+```
+
+#### batch_upsert_pub_tensor()
+
+Batch upsert PyTorch tensors with configurable replication settings (insert or update).
+
+```python
+def batch_upsert_pub_tensor(self, keys: List[str], tensors_list: List[torch.Tensor], config: ReplicateConfig = None) -> List[int]
+```
+
+**Parameters:**
+- `keys` (List[str]): List of object identifiers
+- `tensors_list` (List[torch.Tensor]): List of tensors to insert or update
+- `config` (ReplicateConfig, optional): Replication configuration
+
+**Returns:**
+- `List[int]`: List of status codes for each tensor operation.
+
+**Note:** This function requires `torch` to be installed and available in the environment. It is not supported for the dummy client.
+
+---
+
 ### PyTorch Tensor Operations (Zero Copy)
 
 These methods provide direct support for storing and retrieving PyTorch tensors. They automatically handle serialization and metadata, and include built-in support for **Tensor Parallelism (TP)** by automatically splitting and reconstructing tensor shards.