P2P: Feature: Transaction notice #1499

Conversation
…ing entries" This reverts commit fcc0a7e.
…ze for trx notice to 200
…que since this is used to get a count.
…entation tanked performance.
…kup, so don't make a copy of the connections, just lookup directly in multiindex.
…mved when they expire.
`unordered_flat_set` is much nicer :-)
if (auto tptr = id_idx.find( id ); tptr != id_idx.end()) {      // existing entry for this trx id
   if (tptr->connection_ids.insert(c.connection_id).second)     // first notice from this connection
      ++c.trx_entries_size;
   already_have_trx = tptr->have_trx;
I feel that maybe we should update `tptr->expires` if `tptr->have_trx == false`.
Each node has a configurable amount of time it will keep trx entries around. If we update it then we are not really honoring that limit if they keep being sent to us.
> Each node has a configurable amount of time it will keep trx entries around. If we update it then we are not really honoring that limit if they keep being sent to us.

We store this expiration limit per transaction, not per connection. How can we honor the limit that each node sends?
Each node does not send us an expiration. Our node has an expiration for how long it will remember a trx. I don't think another notice coming into the node should reset the expiration.
Why have the expiration in the notice then? It is somewhat irrelevant imo. I think it would make sense to have the expiration only when we have the transaction (so maybe make it a `std::optional` in `node_transaction_state`, and remove it from the notice message).
I doubt the extra few bytes for the expiration in the notice would make much difference, but it does allow the receiver of the notice to potentially remove it sooner. Note there can/will be many notices sent where a trx is never sent. If the trx fails to subjectively execute on the node then it will never be sent to its peers. So imagine a node that is being spammed with trxs that fail. That node is going to spam all its peers with trx notice messages.
> I doubt the extra few bytes for the expiration in the notice would make much difference

yes, I agree

> but it does allow the receiver of the notice to potentially remove it sooner

Sooner is a doubtful benefit. If a node is spammed with transactions that fail, why would the spammer provide a short expiration time?
I think maybe we should have a member in `node_transaction_state` which would be `timestamp first_notice_received;`, and in `expire_txns` we would remove all transactions where `have_trx == false` and for which `now() - first_notice_received > max_trx_lifetime`.
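A standalone sketch of this suggestion, assuming `std::string` in place of `transaction_id_type` and a plain `unordered_map` in place of the multi-index; the helper name is made up:

```cpp
#include <chrono>
#include <string>
#include <unordered_map>

struct node_transaction_state {
   std::chrono::steady_clock::time_point first_notice_received{}; // set when the first notice arrives
   bool                                  have_trx = false;        // true once the full trx is received
};

// expire_txns()-style sweep: drop notice-only entries older than max_trx_lifetime
inline void expire_notice_only_entries( std::unordered_map<std::string, node_transaction_state>& idx,
                                         std::chrono::seconds max_trx_lifetime ) {
   const auto now = std::chrono::steady_clock::now();
   for( auto it = idx.begin(); it != idx.end(); ) {
      if( !it->second.have_trx && now - it->second.first_notice_received > max_trx_lifetime )
         it = idx.erase( it );
      else
         ++it;
   }
}
```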
In at least one common case of spam trxs they use a small (or default) expiration. If you are spamming a trx so as to hit some defi condition or win some game, you send many trxs that fail. You are not trying to bring down the node, you are just trying to hit a condition on-chain.
Actually, since there is a potentially large delay between the notice and actually receiving the trx, I can see updating the expiration when actually receiving the trx. I'll make that change.
plugins/net_plugin/net_plugin.cpp (Outdated)

auto& conn_idx = connections.get<by_connection_id>();
for (auto [conn_id, count] : expired_trxs_for_connection) {          // count of this connection's expired entries
   if (auto itr = conn_idx.find( conn_id ); itr != conn_idx.end()) {
      itr->c->trx_entries_size -= count;                             // decrement the per-connection entry count
I don't think the notices should affect `trx_entries_size`.

Maybe instead of:

mutable bool have_trx = false;

have:

std::optional<uint32_t> trx_connection_source; // if we have received the trx, connection_id which sent it

Actually, because we can receive the same transaction from multiple connections, we probably should have:

mutable connection_id_set connection_srcs; // connections which have sent us this transaction

This is a lot of effort for managing this `trx_entries_size` count, I wonder if it is worth it?
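A rough sketch of the two shapes being weighed here, using `std::unordered_set` in place of `boost::unordered_flat_set` and omitting the other fields of the entry; the struct names are made up:

```cpp
#include <cstdint>
#include <unordered_set>

using connection_id_set = std::unordered_set<uint32_t>; // stand-in for boost::unordered_flat_set

// current shape: only remember whether the full trx has been received
struct entry_with_flag {
   mutable bool have_trx = false;
};

// suggested shape: remember which connections actually sent us the transaction,
// so notice-only connections never contribute to trx_entries_size
struct entry_with_sources {
   mutable connection_id_set connection_srcs;
   bool have_trx() const { return !connection_srcs.empty(); }
};
```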
There is currently not a use-case for keeping track of who sent us an actual trx.
I must misunderstand the purpose of `c->trx_entries_size`. Why do we need to track this?
It is to prevent someone just spamming the node with transaction_notice_messages until it grinds to a halt.
Why not just have a counter on the `connection`, which is incremented every time we receive a notice, and reset to 0 every `max_trx_lifetime` (we can check whenever we receive a notice on a connection whether the counter needs to be reset by keeping a `notice_counter_last_reset_time`).
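A minimal sketch of that per-connection counter; only `notice_counter_last_reset_time` and `max_trx_lifetime` come from the comment above, the rest is illustrative:

```cpp
#include <chrono>
#include <cstdint>

struct notice_rate_limiter {
   uint64_t                              notice_count = 0;
   std::chrono::steady_clock::time_point notice_counter_last_reset_time = std::chrono::steady_clock::now();

   // called for every received notice; returns false once the connection
   // exceeds max_notices within one max_trx_lifetime window
   bool on_notice( uint64_t max_notices, std::chrono::seconds max_trx_lifetime ) {
      const auto now = std::chrono::steady_clock::now();
      if( now - notice_counter_last_reset_time >= max_trx_lifetime ) {
         notice_count = 0;
         notice_counter_last_reset_time = now;
      }
      return ++notice_count <= max_notices;
   }
};
```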
So we expect an honest node to send us 100,000 notices for transactions that fail within `max_trx_lifetime`?

Maybe we should have a variant of the notice message to notify that the transaction failed and will never be sent, so we can clean it up immediately from our multiindex (in theory we should have received it only from this peer, so the `connection_ids` unordered_flat_map should have only one entry).
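A purely hypothetical shape for such a variant, since no such message exists in the current protocol; a 32-byte array stands in for `transaction_id_type`:

```cpp
#include <array>
#include <cstdint>
#include <vector>

using trx_id = std::array<uint8_t, 32>; // stand-in for transaction_id_type

// hypothetical: the sender dropped these trxs after subjective failure, so the
// receiver can erase the matching notice-only entries immediately
struct transaction_failed_notice_message {
   std::vector<trx_id> trx_ids;
};
```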
This would eliminate the need for the expiration time in the notice.
> So we expect an honest node to send us 100,000 notices for transactions that fail within `max_trx_lifetime`?

Or succeed. We don't remove them on success until they expire either.

The failure notice is interesting. Let's see what @arhag thinks.
> We don't remove them on success until they expire either.

If they succeed, we will receive the transaction and update the expiration in the multi-index accordingly, so we wouldn't rely on the expiration from the notice anyways.
Thinking some more about this over the weekend. We already have `p2p-dedup-cache-expire-time-sec` which defaults to 10 seconds. So by default the longest a trx/trx-notice entry is kept is 10 seconds, not 1hr.

> Why not just have a counter on the `connection`, which is incremented every time we receive a notice, and reset to 0 every `max_trx_lifetime` (we can check whenever we receive a notice on a connection whether the counter needs to be reset by keeping a `notice_counter_last_reset_time`).

This would be much simpler and faster. We could reset the connection entry counter every `p2p-dedup-cache-expire-time-sec`. We set the max on the counter to a hard-coded 200,000. By default that allows for 20,000 TPS. A user could increase that by decreasing the `p2p-dedup-cache-expire-time-sec`.
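A quick check of those numbers (illustrative only):

```cpp
#include <cstdint>

constexpr uint64_t max_entries_per_window = 200'000; // proposed hard-coded cap
constexpr uint64_t dedup_expire_sec       = 10;      // p2p-dedup-cache-expire-time-sec default
constexpr uint64_t allowed_tps            = max_entries_per_window / dedup_expire_sec;
static_assert( allowed_tps == 20'000, "a 200,000 cap per 10s window allows 20,000 TPS" );
```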
…on if we should send it a notice.
…lock to use the 15 seconds instead of 7.
@heifner any thoughts about my telegram note: Another idea which maybe is an alternative to the transaction notice:

This would reduce traffic and the number of i/o requests significantly, and, similarly to the notice, avoid sending a node a transaction it already has. The more I think of it, the more I feel that this would be better than the transaction notice. It reduces the number of i/o requests drastically. Even if we have 2000 trx ids to send every 1/10th of a second (if we are doing 20,000 tps), that is still a single message of size only 64KB, and same for the answer. Also we send all the requested transactions batched together. It is also simpler than the notice feature I think, which requires updating info in the multiindex for every notice, and consuming memory for every transaction/notice. This new suggestion has very minimal overhead.
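A hypothetical sketch of the message shapes this idea implies; none of these types exist in net_plugin today and the names are made up:

```cpp
#include <array>
#include <cstdint>
#include <vector>

using trx_id = std::array<uint8_t, 32>; // stand-in for transaction_id_type

// sent periodically (e.g. every ~1/10 s): ids of trxs received in that interval;
// at 20,000 TPS that is roughly 2000 ids, about 64KB
struct have_trx_ids_message {
   std::vector<trx_id> ids;
};

// reply: the subset of ids the peer does not yet have; the corresponding
// transactions are then sent back batched together
struct request_trx_ids_message {
   std::vector<trx_id> ids;
};
```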
…e can't be trusted, just want to use p2p_dedup_cache_expire_time_us. Also since a node uses the minimum of p2p_dedup_cache_expire_time_us and transaction expiration, there is no need to validate expiration in net_plugin (it will be validated by controller).
… of p2p_dedup_cache_expire_time_us instead of number of entries can be added.
Like use `proto_version_t` instead of `uint16_t`.
Should some tests be added for the new message?
P2P Transaction propagation enhancement

- `transaction_notice_message`: record trx id and connection id but not that we have the transaction

Notes:

- A transaction request message is not needed. Transactions are broadcast to nodes that are known not to have the transaction. A node knows a connection does not have a transaction because it has not received the trx or a `transaction_notice_message` from that connection.
- `transaction_notice_message` is not propagated when received.
- `transaction_notice_message` is only sent when a transaction is received. Since transactions are verified (speculatively executed) before being broadcast, this will normally provide enough time for `transaction_notice_message` to be received from nodes that already have the transaction.
- Backward compatible with previous versions of the net protocol. Will not send `transaction_notice_message` to any connections not at the latest net protocol. Since no `transaction_notice_message` is received from older versions, all transactions will be broadcast to them if not received from them.

Resolves #1475
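A minimal sketch of the pieces described above, with assumed names and an assumed protocol-version constant; the real message layout and version handling live in net_plugin and may differ:

```cpp
#include <array>
#include <cstdint>

using transaction_id_type = std::array<uint8_t, 32>; // stand-in for the real sha256 id
using proto_version_t     = uint16_t;                // per the review comment above

constexpr proto_version_t proto_trx_notice_version = 8; // assumed value, for illustration only

// notice that a transaction with this id has been received; it is not propagated
// further and does not mean the receiver now has the transaction itself
struct transaction_notice_message {
   transaction_id_type id;
};

// backward compatibility: only peers at (or above) the notice-capable protocol
// version are sent notices; older peers simply keep receiving full transactions
inline bool should_send_notice( proto_version_t peer_proto_version ) {
   return peer_proto_version >= proto_trx_notice_version;
}
```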