Interesting. Is there a reason why we try to solve that on the core level and not in some external module? The main problem is that no matter how you try to solve it on the server side, a proper solution needs too much context from the application logic. See some notes below about that.
You have it in the table where you say which requests can lead to "duplicates", and you say that "replaces" don't have that. The problem isn't just an actual duplicate key. It is the same operation applied twice. The same replace sent 2 times can also mess things up if between them the data was changed to some new state, and then the second replace reverted it back.
UUIDs can clash. It is just a hash function, in general. If you have many clients and enough time, they will clash eventually. Even if you use a time-based UUID, you can't even guarantee that it will be monotonic, because 1) real time can literally be adjusted backwards, and 2) true monotonic time is lost on restart. Tbh, I don't see how this can possibly be solved without being aware of the application logic. Any sort of random/time hashing solution I wouldn't consider safe, as it would be just a time bomb which can suddenly consider a request as "already done", even though it wasn't.
So if this one is specified, the request will be retried potentially forever? Even when a timeout is given?
---
I'll ask the tarantool queue guys whether they would like this feature in core or not, and if yes, which parts of it would be interesting to them (they have deduplication keys there already, implemented on their level). Speaking of UUID not being optimal for deduplication - well, I don't see why we should limit the users to one particular type. They could use literally anything that can be encoded to msgpack and stored into a space (float, double, decimal, uuid, string, unsigned, ...). Also, in my thinking, the ...
---
1. Review

1.1 Retry option in connectors

Firstly) I don't think we should introduce such a high-level API; the responsibility for writing retry logic should be on the client's code, in connectors we just have to give the user a convenient way to implement it. Consider the case where the user wants to execute some custom application logic between retries: they won't have an opportunity to do that with such options. For this I suppose we should allow passing the idempotency token explicitly, e.g.:

```lua
local uuid = require('uuid')

local token = uuid.str()
local ok, err
repeat
    ok, err = pcall(conn.space.bands.insert, conn.space.bands,
        {1, 'Roxette', 1986}, {timeout = 1, idempotency_token = token})
    -- Some custom logic, e.g. how many retries were done, the timeout for
    -- the whole loop and so on...
until ok or err.code ~= 'TimedOut'
```

VShard is one of Tarantool's users, which lacks the retry logic for RW requests. And the options you propose look like the ones we already use there:
P.S. Below you write about net.box, which is actually just a Lua connector to Tarantool. The Python and Lua connectors should have a similar interface, so I didn't understand why they differ. I'm ok with ...

1.2 New IPROTO interface

UUID in the header.
We should allow specifying any kind of idempotency token; the user should decide. As Vlad said, UUIDs may clash, and even if the probability of such a clash is near 0, it can still happen. The result of such a clash would be too serious: the DB will say that the transaction was applied when it actually wasn't. Only the user can decide which data they want to use as the token, and it really depends on the app. UUID looks like a good general solution, but if the user chooses it, then they understand that it could clash. What I'm trying to say is that the token should be completely transparent to the user, and they should decide what to use. For that we should think about using some general type for the ...
Firstly) We cannot do that, it will hit the performance of all requests. I agree that if it's implemented, the idempotency token should be in the HEADER of the xrow, since then we don't need to decode the body (which may be costly). But by default it should be Nil, not encoded at all, not 16 bytes of UUID or a string to decode on every request. A user who doesn't use the feature should not pay with performance, no matter where it's implemented (in Tarantool or VShard).

Secondly) I'm not sure we should limit where this key may exist, let the user decide whether they want ...

1.3 Server side

This one is much more difficult, since it should meet the requirements of any client.
Consider a user who wants to deduplicate the following function (as I said, we need a way to deduplicate calls too):

```lua
function increment()
    _G.counter = _G.counter + 1
end
```

This user doesn't need that value to be saved in the global `_idempotency` space. In order to allow that, I propose to introduce a corresponding option. In addition to that, in order not to make all users pay with perf, I see two options:
Firstly) How will you know when the key was written? We'll probably need a timestamp in the schema of the `_idempotency` space. And I don't really like that we try to do the logic of an above-standing module in the core, we already have the ...

Secondly) This is not enough. The problem here is that it may happen that millions of requests per second will come, and we'll write so many entries to our space that the memory will run out. Users will want to have two options for that:
I'd consider replying with an error when the request was already applied, according to the whole text I've written above.

2. Potential scenarios

Now, let's check out which options we have.
In that variant, we should implement the config options
The problem with the second solution is that users who don't need VShard won't have the deduplication logic. Every user or above-standing Tarantool project (e.g. AEON) will have to implement the deduplication itself, and the API may differ. The third solution propagates that inconsistency even further, everyone has to implement everything from the ground up. The deduplication is possible only via ...

The problem of the first solution is that it's difficult to satisfy the requirements of all users, but it's still possible. It's a step towards unification of the deduplication in Tarantool, we'll have to do that only once and none of the above-standing projects will have to do that themselves. Plus, we get deduplication in all requests, simplifying the users' life (and probably making ...).

I'm here on the side of the first solution, but firstly we should indeed ask users whether they actually need that. But according to the JIRA ticket, some users may want that (but, AFAIU, we cannot ask product managers anymore here). @mrForza, @Gerold103 WDYT?
---
Reviewers
- [ ] Main Reviewer: @Serpentian
- [ ] Second Reviewer: @Gerold103
- [ ] TeamLead: @sergepetrenko
- [ ] CTO: @sergos
Tickets
JIRA: https://jira.vk.team/browse/TNTP-3265
JIRA: https://jira.vk.team/browse/TNT-1374?focusedCommentId=37716675
Changelogs
Summary
Problem
Suppose our system has two main components:
A client (library / driver / connector) to a distributed database.
A distributed database consisting of one master and n-1 replicas.
The client creates a synchronous space consisting, for example, of several integer fields:
`[id, value]`. It then performs a non-idempotent operation, e.g. incrementing the `value` field by 1. Since no existing Tarantool connector implements safe retry logic for non-idempotent requests, and the server side does not provide deduplication mechanisms, this responsibility falls entirely on the client's application code. As a result, the client does not always ensure idempotency of its operations or deduplicate non-idempotent requests, which can lead to data duplication and an inconsistent state both on the client side and in the distributed database.

During the execution of this non-idempotent request, something unexpected may occur between the client service and the distributed database: the network disconnects before the request reaches the database, the database crashes or hangs, or the network fails before the response is sent back. As a result, the client waits for `n` seconds and then receives a `TimedOut` error within its service. It subsequently retries by sending an identical non-idempotent request to the distributed database. By the time the second request is sent, the original issue that caused the `TimedOut` error on the client side has suddenly been resolved: e.g. the network connection is restored, the master instance has restarted, or a heavy transaction has finally completed. Consequently, the master node in the distributed database receives two identical non-idempotent requests, even though the client intended to send only one. Thus, the client expects the `value` field to be incremented by 1, but upon reading the data, it discovers that the `value` field has actually been incremented by 2. This issue is common in distributed databases and has a well-known name: "duplicate requests."

Causes
To begin with, it is worth outlining the main causes that can, to a greater or lesser extent, lead to the problem of duplicate requests and, consequently, to an inconsistent state on both the client and the distributed database sides:
Use of unsafe retry mechanisms
CLIENT side. Most retry mechanisms implemented in connectors or client code do not verify the state of the distributed database after receiving a `TimedOut` error. As a result, a retry request is sent even in cases where the previous request was successfully executed, but its response never reached the client.

Use of non-idempotent requests

CLIENT side. Applying the same non-idempotent request twice leads to a different database state than applying it once. Because connectors do not provide any interface to manually mark requests as idempotent, users may write incorrect business logic, overlooking whether their requests are idempotent or non-idempotent.

No server-side mechanisms for detecting and handling duplicate identical requests

SERVER side. Because the server lacks logic to distinguish between two identical non-idempotent requests that may belong to the same retry session, it ends up executing all of them. This is the final cause leading to the problem, especially when the two previously mentioned issues remain unaddressed.

Preferred solution
Client side (driver)
As an example of a client driver, the following repository will be considered: tarantool-python.
In this connector, the primary entity used to interact with a Tarantool instance is the
`Connection` class object. This object provides the following methods, which enable the execution of Query, DML, and DDL requests; their descriptions are provided below.

First, it is necessary to modify the interface of these methods. From now on, they will accept the following parameters:

- `retry`: whether the request should be automatically retried after a `TimedOut` error.
- `timeout`: the number of seconds to wait before a `TimedOut` error is thrown. If the `retry` parameter is enabled, the request will be retried once this timeout period elapses.

The following algorithms will only take effect when the `retry` and `timeout` parameters of the corresponding `Connection` object methods are set to values different from their defaults.

Description of the retry algorithm for READ requests (`get`, `select`):

Description of the retry algorithm for WRITE requests (`eval`, `execute`, `insert`, `replace`, `delete`, `space`, `update`, `upsert`):

1. A unique 16-byte binary key of type UUID is generated.
2. All request objects (e.g., `RequestInsert`, `RequestUpdate`, etc.) must include an additional attribute, `idempotency_token`, in accordance with the updated IPROTO protocol. The generated unique key is attached to each request object.
3. The request, now containing the token, is sent to the master node. A sketch of this flow is given below.
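A rough sketch in Lua terms of what the connector would do internally for a write request; the helper `send_request` and the exact option name are illustrative assumptions, not an existing API:

```lua
local uuid = require('uuid')

-- Hypothetical driver-internal retry loop for one write request.
local function write_with_retry(conn, request, timeout)
    local token = uuid.bin()            -- 16-byte token, generated once per logical request
    request.idempotency_token = token   -- reused verbatim on every retry
    while true do
        local ok, res = pcall(send_request, conn, request, timeout)
        if ok then
            return res
        end
        if type(res) ~= 'table' or res.code ~= 'TimedOut' then
            error(res)                  -- only timeouts are retried
        end
        -- TimedOut: resend with the same token; the server-side lookup
        -- prevents the write from being applied twice.
    end
end
```

The key property is that the token is generated once and reused, so no matter how many times the request is resent, the server can recognize it as the same logical operation.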
Below is a diagram of the client-side algorithm for clarity:
New IPROTO interface
All IPROTO requests listed below must have an updated header containing an additional binary field,
IPROTO_IDEMPOTENT_TOKEN, of 16 bytes in size (used to encode a UUID value). Since not all requests need to be sent with retry semantics, when a zero UUID (00000000-0000-0000-0000-000000000000) is provided, Tarantool will process the request using the standard algorithm.IPROTO WRITE REQUESTS:
IPROTO WRITE RESPONSE:
If an IPROTO request was sent with retry semantics, the response to the user may take one of the following forms:
Successful execution:
`IPROTO_OK` with a boolean field `IPROTO_DATA` set to `true`.

Failed execution: `IPROTO_ERROR`.

Successful response header:

| Field | Type |
| --- | --- |
| IPROTO_SCHEMA_VERSION | MP_UINT |

Error response header:

| Field | Type |
| --- | --- |
| IPROTO_SCHEMA_VERSION | MP_UINT |
| IPROTO_ERROR_24 | MP_STR |
New Lua interface
To enable users to retry not only write requests but entire transactions, we will extend the interface of the
`box.begin` and `box.atomic` functions by adding an additional parameter, `idempotent_token`, to the `opts` options table.

`box.begin`:
`box.atomic`:
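A minimal sketch of both calls with the proposed option (the `bands` space and its `plays` field are invented for illustration; the option itself does not exist in current Tarantool):

```lua
local uuid = require('uuid')
local token = uuid.str()

-- Explicit transaction carrying an idempotency token.
box.begin({idempotent_token = token})
box.space.bands:update({1}, {{'+', 'plays', 1}})
box.commit()

-- The same with box.atomic: the opts table first, then the function to run.
box.atomic({idempotent_token = token}, function()
    box.space.bands:update({1}, {{'+', 'plays', 1}})
end)
```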
New net.box interface
Similarly, we will extend the interface of all write functions in the
`net.box` API by adding the `idempotent_token` parameter to the `opts` options table.

Server side
Where idempotency tokens are stored:
To enable the database to distinguish identical requests within the same retry session, the idempotency keys provided in request headers must be stored somewhere. It is proposed to store these keys in a global system space named
_idempotency. The structure of this space is shown below.How long idempotency tokens are stored:
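A rough sketch of what such a space could look like; the field names and types below are assumptions for illustration, not the final schema:

```lua
-- Illustrative only: a possible layout for the _idempotency space.
box.schema.space.create('_idempotency', {format = {
    {name = 'token',      type = 'varbinary'},  -- idempotency token from the request header
    {name = 'success',    type = 'boolean'},    -- whether the original execution succeeded
    {name = 'result',     type = 'any', is_nullable = true},  -- saved response payload
    {name = 'created_at', type = 'double'},     -- creation time, used for lifetime-based cleanup
}})
box.space._idempotency:create_index('pk', {
    parts = {{field = 'token', type = 'varbinary'}},
})
```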
It is also important to consider how we will clean up idempotency tokens from the database. To this end, we will introduce a new parameter in
`box.cfg` called `idempotency_lifetime`, which defines the lifetime of idempotency tokens (in seconds). Once this time has elapsed since the creation of a tuple in the `_idempotency` space, triggers will automatically remove the expired tuple from this space (`box.cfg.idempotency_lifetime`).
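For example (a sketch; the option is part of this proposal and is not available in current Tarantool):

```lua
-- Keep idempotency tokens for one hour after they are written.
box.cfg{
    idempotency_lifetime = 3600,
}
```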
Generalized algorithm for handling idempotency keys on the master node:

The process begins by looking up a tuple in the global system space
_idempotencythat matches the current request’s idempotency key. This determines whether the request has already been executed previously.Was a tuple found in the space?
Did the previous execution succeed?
If it did, the saved result is returned to the client (via iproto/net.box/lua).

Execute the current request:
On success, a tuple is written (with `success = true`) in the `_idempotency` space; on failure, a tuple is written (with `success = false`) in the `_idempotency` space.

Next, we replicate all spaces modified as a result of the request or transaction, including
`_idempotency`. If `is_sync` is set to `true`, we return the query result to the client only after a quorum of acknowledgments has been collected. For asynchronous requests, we return the response to the client immediately.

Below is a diagram of the server-side algorithm for clarity:
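In code terms, the same flow could look roughly like the sketch below; `execute` stands in for the actual request execution, transactional and replication details are omitted, and the handling of a previously failed attempt is an assumption:

```lua
local fiber = require('fiber')

-- Sketch of the master-side deduplication around one request,
-- assuming the _idempotency layout sketched earlier.
local function handle_request(request)
    local token = request.idempotency_token
    if token == nil then
        return execute(request)      -- no retry semantics: the usual path
    end
    local prev = box.space._idempotency:get{token}
    if prev ~= nil and prev.success then
        return prev.result           -- duplicate request: return the saved result
    end
    -- First attempt with this token (or the previous attempt failed):
    -- execute and record the outcome so that a later retry can be recognized.
    local ok, result = pcall(execute, request)
    local saved = box.NULL
    if ok and result ~= nil then
        saved = result
    end
    box.space._idempotency:replace{token, ok, saved, fiber.time()}
    if not ok then
        error(result)
    end
    return result
end
```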
Other solutions
Below are various methods and patterns that provide partial or complete solutions to the duplicate request problem, implemented on the client (driver) and server (distributed database) sides in different databases and their SDKs. However, not all of these methods are suitable for Tarantool specifically.
Client side solutions
No Retry mechanisms
PARTIAL solution 🔴

Description: The absence of retry mechanisms is the most primitive and straightforward way to prevent duplicate requests at the driver level. However, this does not provide a 100% guarantee, as the client may implement retry-like patterns in its own application code.
Pros:
Cons:
Places full responsibility for implementing this pattern on the client. There is a high risk that the client will implement it incorrectly and encounter the aforementioned problem again.
Severely limits the functionality of the client driver.
Automatic retry only for READ/idempotent operations
PARTIAL solution 🔴

Description: This approach allows retry logic to be applied only to operations whose semantics guarantee idempotency to the user.
It is important to note that this method is suitable only for drivers that provide a NoSQL-style API with a clear distinction between READ and idempotent operations. This characteristic of the driver ensures a well-defined and finite set of operations that are safe to retry.
However, if the client library exposes only an SQL-like API (i.e., sending textual SQL queries through the driver), we face the challenge that determining the idempotency of an SQL query would require syntactic and semantic analysis, significantly complicating the driver’s logic.
Pros:
Cons:
Does not fully eliminate the problem of data duplication, as users will still implement their own retry logic specifically for non-idempotent operations.
Requires a NoSQL-style API with a clear distinction between operations that are safe to retry and those that are not.
If only an SQL-like API is available, it necessitates syntactic and semantic analysis of queries.
Idempotency flag in retry mechanism
PARTIAL solution 🟠

Description: With this approach, the user can apply retry mechanisms to absolutely any operation (an improvement over Solution 2). However, the responsibility for determining whether an operation is idempotent lies entirely with the user. If the user is confident that the operation is safe to retry, they set the
`idempotency = True` flag, and the request will then be automatically retried internally until it succeeds. However, if the user misjudges the idempotency of the operation, they will encounter the aforementioned problem.

Pros:
A fully implemented retry pattern supporting all operations.
Lower risk of data duplication compared to Solutions 1 and 2.
Cons:
Server side solutions
Conditional Writes with idempotency keys at the data schema level 🟠
Description: This pattern is a special case of "idempotency keys." Its core idea is that the user explicitly designates which tables or spaces should support idempotency tokens. Such tables must include a mandatory unique field that stores these keys at the schema level of the space. In the event of an idempotency key conflict, the user defines their own business logic to resolve it. All write operations (`insert`, `delete`, `replace`, `upsert`, `update`, etc.) must use a non-standard interface or SQL syntax. Example:
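A hypothetical sketch of the idea; the `accounts` space, its fields, and the conflict handling are invented for illustration and are not an interface defined by this RFC:

```lua
local uuid = require('uuid')

-- A user space that carries its own idempotency key as a regular field
-- with a unique secondary index.
box.schema.space.create('accounts', {format = {
    {name = 'id',              type = 'unsigned'},
    {name = 'balance',         type = 'unsigned'},
    {name = 'idempotency_key', type = 'string', is_nullable = true},
}})
box.space.accounts:create_index('pk', {parts = {{field = 'id', type = 'unsigned'}}})
box.space.accounts:create_index('idem', {
    unique = true,
    parts = {{field = 'idempotency_key', type = 'string', is_nullable = true}},
})

-- A retried insert carries the same key; the duplicate-key error signals
-- that the operation has already been applied, and the application decides
-- how to resolve the conflict.
local token = uuid.str()
local ok, err = pcall(function()
    box.space.accounts:insert{1, 100, token}
end)
if not ok then
    -- conflict resolution: user-defined business logic goes here
end
```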
insert,delete,replace,upsert,update, etc.) must use a non-standard interface or SQL syntax. Example:The logic for adding/removing idempotency keys from user-defined spaces differs from the standard approach. In this method, immediately after a retried operation succeeds, the idempotency key field must be cleared from the corresponding tuple to avoid conflicts between requests and transactions belonging to different retry sessions. This cleanup occurs precisely when the client sends an acknowledgment confirming receipt of the database's response.
Pros:
Eliminates the need to create a separate space: idempotency keys are stored directly within the same spaces and tuples that require deduplication (keys are physically co-located with the data).
Enables embedding more sophisticated business logic directly into queries to handle conflicts during retries.
Cons:
Requires modifications or extensions to the SQL or Lua interface so that write operations can support conditional writes and conflict resolution logic.
Business requirements do not always permit schema changes to user-defined spaces at the system level.
External idempotency keys 🔴
Description: This approach involves storing idempotency keys along with operation results not in the database itself, but in an external service, e.g. a cache such as
`Redis` or `Memcached`. Now, before executing a write request or transaction, the master node checks for the idempotency key not in a database space, but in the external cache. If the cache already contains the required token, the database can return the result of the previous request execution, the one whose response never reached the client. If the token is not present in the cache, the master node executes the request and sends two responses: one to the client and another to the caching service.

Pros:
Cons:
Not suitable for Tarantool, since we are not allowed to use third-party products for storing system or user data within it.
Introduces additional failure points due to the inclusion of a new external service in the system.
May cause noticeable performance degradation due to the extra network hop required to communicate with another component in the distributed system.