Skip to content

Conversation

keynslug
Copy link
Contributor

@keynslug keynslug commented Jun 29, 2025

Introduce esockd:open_tcpsocket/3 that starts socket-based listener and acceptor pool.

Unlike other TCP/SSL listeners, this listener is not accompanied by esockd_transport abstraction layer. Instead, the user is required to use socket APIs directly. This is mainly because of a lot of differences in socket-based sockets APIs, asynchronous messaging, and general behavior, which inevitably make attempts to fit them into esockd_transport messy and prone to performance hits.

Currently, tcpsocket listeners:

  • Support usual resource usage and access controls like max_connections, limiter and access_rules.
  • Share the same tcp_options but several options are silently ignored due to incompatibility.
  • Do not support SSL / TLS.
  • However, support Proxy-Protocol v1/2 through esockd_proxy_protocol, with various proxy information available through esockd_socket module.

Part of EMQX-14333.

@keynslug keynslug force-pushed the perf/EMQX-14333/erl-socket branch from 025c5dc to 52a7403 Compare June 30, 2025 08:47
@keynslug keynslug force-pushed the perf/EMQX-14333/erl-socket branch from 52a7403 to 072674c Compare June 30, 2025 08:49
@keynslug keynslug marked this pull request as ready for review June 30, 2025 16:55
@keynslug keynslug force-pushed the perf/EMQX-14333/erl-socket branch from f2c217a to 16f5766 Compare July 1, 2025 11:14
Copy link
Contributor

@thalesmg thalesmg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🌶️ 💪

@keynslug keynslug merged commit a82f047 into emqx:master Jul 2, 2025
3 checks passed
@keynslug keynslug deleted the perf/EMQX-14333/erl-socket branch July 2, 2025 08:57
-spec fast_close(socket()) -> ok.
fast_close(Sock) ->
%% TODO: Research better alternatives.
_Pid = erlang:spawn(socket, close, [Sock]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

socket:close may be a long blocking call. this is why we spawn a process to run?
Double check the side effect of calling from a none-owner process.

review where fast_close is used, maybe not widely used.

I remember QUIC transport has similar problem to solve.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

socket:close may be a long blocking call. this is why we spawn a process to run?

Yes, there's no guaranteed non-blocking close API in socket, unless I'm missing something. I guess another option is to setopt nolinger and close, to force kernel to throw away any buffer and send RST?

On the other hand, it's still somewhat unclear what's the goal of fast_close/1 due to lack of documentation. Is it "send the RST and free OS resources as soon as possible"?

Double check the side effect of calling from a none-owner process.

Concept of "owner" in case of socket sockets is pretty rudimentary AFAICT.

  1. Lifetime of the socket is tied to the lifetime of the owner process.
  2. Only the owner process can change {otp, meta} and {otp, controlling_process}.
    ...And this is basically it.

review where fast_close is used, maybe not widely used.

Pretty extensively, for example in emqx_connection, and in emqx_socket_connection in a similar fashion. FWIW this codepath is now triggered in emqx_client_SUITE:t_congestion_send_timeout, according to the coverage report.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fast_close means close the socket without flush the buffer, I think.
e.g. in emqx, Peer may not recv MQTT.DISCONNECT if we send MQTT.DISCONNECT then fast_close socket.

My defination of the owner is controlling process in https://www.erlang.org/docs/26/man/socket

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fast_close means close the socket without flush the buffer, I think.

Got it. Then I guess "better" fast_close/1 should:

  1. Set socket:setopt(S, {socket, linger}, #{onoff => true, linger => 0}).
  2. Call socket:close(S).

WDYT?

My defination of the owner is controlling process in https://www.erlang.org/docs/26/man/socket

That's what I was talking about as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could leave it.

setopt is extra syscall (I guess) I hope we could avoid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setopt is extra syscall (I guess) I hope we could avoid.

Good point. On the other hand, there's risk of port range exhaustion due to TIME_WAIT IIUC.

Anyway, I already started working on this stuff and drafted #209, let's move discussion there.

TcpOpts = merge_defaults(proplists:get_value(tcp_options, Opts, [])),
case listen(ListenOn, TcpOpts) of
{ok, LSock, SockOpts} ->
_MRef = socket:monitor(LSock),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we ignore MRef? I think monitor handler need match this MRef.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we only ever spawn a single monitor and match on LSock itself. Ultimately, it's important that LSock has been closed, not that some specific monitor has been triggered. Besides, we never need to unmonitor it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants