net: conn.Read() methods seem to thrash CPU, please return on insufficient data #27315
What are you really trying to do? Why are you using SetReadDeadline? How many concurrent connections do you need to support? Have you considered not using SetReadDeadline, but instead using an ordinary Read and sending the results on a channel? Then some other goroutine can use timers and read from the channel as it pleases.
Thanks for the response. Answers below.

> Why are you using SetReadDeadline? How many concurrent connections do you need to support? Have you considered not using SetReadDeadline, but instead using an ordinary Read and sending the results on a channel? Then some other goroutine can use timers and read from the channel as it pleases.

I don't mind essentially forking the functionality from conn out into a new specialized UDP reader, but I believe my extreme scenario is showing an area where there is low-hanging fruit for optimization. Many people out there are likely spinning up a goroutine to process a stream of data from a port. The obvious and simple way to accomplish this is with a read loop continually hitting the port with blocking conn.Read() calls, and this uses quite a bit of CPU. In my case, where I can allow the buffer to fill for a millisecond and then process repeatedly, having a read call that returns immediately when there is insufficient data simplifies the timing logic that would otherwise be needed to determine that the buffer is empty and polling is occurring.
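For concreteness, a minimal sketch of that read-loop pattern (the function names, buffer size, and error handling here are illustrative, not from the original report):

```go
package reader

import (
	"log"
	"net"
)

// readLoop is the "obvious and simple" pattern: block in conn.Read until
// each packet arrives. Easy to write, but the goroutine parks and wakes
// once per packet, which is where the cost shows up at a 40 us interval.
func readLoop(conn *net.UDPConn) {
	pkt := make([]byte, 1500)
	for {
		n, err := conn.Read(pkt) // blocks until a packet is available
		if err != nil {
			log.Println("read:", err)
			return
		}
		process(pkt[:n]) // per-packet parsing
	}
}

func process(p []byte) { /* parsing logic here */ }
```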
Thanks for the detailed reply. I suspect that for your use case you will be better off using syscall.Read directly.
My design when using a non-polled read is more transparent and CPU friendly. I'll look into syscall.Read, though quite a few of the internal helper functions are not exported, so my initial thought is that this will require a bit of code to get working. I haven't looked in detail yet, though. Thanks for the quick replies. Though I am developing on and targeting Linux, we are keeping cross compilation as an option. In addition to simplifying timer logic across operating systems and platforms, a non-polled read might also reduce the large ntoskrnl.exe burden that we see on Windows - up to 5% per stream. I haven't tested this yet. If you think others might benefit and I can help, let me know.
I personally would be extremely reluctant to add a non-polled Read to the net package. Looking back at your initial report, I noticed this:

> when conn.Read() is called prior to a packet arriving the CPU spins on the read until data arrives, or until the ReadDeadline is hit
That is not accurate. The code does not spin.
I didn't mean to imply that SetReadDeadline causes the CPU to spin, but rather that requesting a Read when a packet is not yet available does. The SetReadDeadline only happens rarely; it does not use any noticeable CPU unless set on every read (in which case it does), but I see no reason to set it every time unless using it to set a deadline for slowly arriving messages or a single response. I appreciate the push to use syscall.Read; though initially I thought it would be a bit painful, there are only a couple of lines of code that required a change. There aren't a ton of examples, so I'm pasting the gist below.
```go
func (t *FTDV) run2(conn *net.UDPConn) {
	defer conn.Close()
	file, err := conn.File()
	if err != nil {
		// handle error here
		return
	}
	fd := int(file.Fd())
	syscall.SetNonblock(fd, true) // otherwise it is a blocking read
	pkt := make([]byte, ftdvUDPPktSize)
	// ...
	for {
		n, err := syscall.Read(fd, pkt)
		if err == syscall.EAGAIN {
			// n is -1 with "resource temporarily unavailable" as the error:
			// the buffer is empty, so relinquish, publish metrics, etc.
			continue
		}
		if err != nil {
			// handle other errors here
			continue
		}
		if n <= 0 {
			continue
		}
		// parsing logic here
		// at first I expected to see the UDP packet header, but this wasn't the case.
		// performs the same as adding a return to the core conn.Read() function, thus
		// non-blocking used there as well?
	}
}
```

Though the logic is a little different between the two read methods, I believe most of the CPU performance increase comes from avoiding the polled read. Both variants sleep for 1 ms each time the buffer is emptied; they just detect the empty buffer differently. With the polled read, since each packet arrives at a precise 40 us interval but the processing time is much faster than that, every 25 packets the time to process is compared against 350 us, and if it took longer the read loop sleeps for 1 ms. In the case of syscall.Read, whenever n <= 0 it sleeps for 1 ms.

syscall.Read - 7.5% of a core, 8 cores

You're probably right regarding the integrated poller, etc. For my scenario a non-blocking read would be used, and having it exist within the fabric of the internals code would also be appreciated. This would be a Read that returns when there is no data available. There is no noticeable performance delta between syscall.Read and conn.Read with the continue replaced by a "return n, err". A non-blocking read would also make it possible to use a pseudo real-time scheduler with a hot thread per work queue, though the strongest argument for it is the reduction in CPU usage. There are quite a few projects in Go that stand up read processes on ports. Thanks again, and tell me if I can help with anything.
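A sketch of that batching logic for the polled (blocking) read path, using the constants from the discussion (25 packets, 350 us, 1 ms); the structure itself is illustrative:

```go
package reader

import (
	"net"
	"time"
)

const (
	batchSize = 25                     // packets between timing checks
	threshold = 350 * time.Microsecond // pure processing is far faster than this
	nap       = 1 * time.Millisecond   // lets ~25 packets accumulate at 40 us each
)

// pollLoop reads in batches of 25 packets. If a batch took longer than
// 350 us, the reads must have spent time blocked on an empty socket buffer,
// so it sleeps for 1 ms to let the buffer refill and the next batch drain
// without blocking.
func pollLoop(conn *net.UDPConn, pkt []byte) error {
	for {
		start := time.Now()
		for i := 0; i < batchSize; i++ {
			n, err := conn.Read(pkt)
			if err != nil {
				return err
			}
			_ = pkt[:n] // parsing logic here
		}
		if time.Since(start) > threshold {
			time.Sleep(nap)
		}
	}
}
```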
The CPU doesn't spin there either, though. The goroutine sleeps waiting in the poller. Glad that syscall.Read is working for you.
Hopefully we can delete comments if my ignorance becomes too obvious. Looking at the fd_unix.go file (line 133), in the case where no data is available (potentially just when the connection is busy... not sure on this), the logic for (err != nil), (err == syscall.EAGAIN && fd.pd.pollable()), (err == nil) is taken over and over again. There does not appear to be any spot here where the process would sleep. When calling syscall.Read() directly without syscall.SetNonblock(), the call only returns with data. When blocking is set to false, it starts to return without data and with an error. I guess the for loop in FD.Read() seems like the poller, since this is where syscall.Read is iterated over, but I don't see a place where a sleep happens.
In the syscall.EAGAIN case, the code calls fd.pd.waitRead, which parks the goroutine until the runtime poller reports that the descriptor is readable. The loop does not spin calling syscall.Read repeatedly.
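A simplified paraphrase of the loop in question (from internal/poll/fd_unix.go, Go 1.11 era; details elided, not the verbatim source) shows where that park happens:

```go
// Simplified paraphrase of FD.Read in internal/poll/fd_unix.go; not exact.
for {
	n, err := syscall.Read(fd.Sysfd, p)
	if err != nil {
		n = 0
		if err == syscall.EAGAIN && fd.pd.pollable() {
			// Parks this goroutine in the runtime netpoller (epoll on
			// Linux) until the fd is readable. No CPU is burned here.
			if err = fd.pd.waitRead(fd.isFile); err == nil {
				continue
			}
		}
	}
	err = fd.eofError(n, err)
	return n, err
}
```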
Yes, I see now. If it were possible in the future to see an indication of whether waitRead is being hit, I could use that rather than syscall.Read as an indication that the buffers are empty. Alternatively, some direct call that returns the current conn buffer level would also serve. Thanks for the help.
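As an aside: on Linux, the "current buffer level" query mentioned here can be approximated today with the FIONREAD ioctl via golang.org/x/sys/unix. A sketch, assuming the raw fd obtained via conn.File() as in the gist above:

```go
package reader

import "golang.org/x/sys/unix"

// bytesBuffered reports how many bytes are sitting in the socket's receive
// queue; zero means a blocking Read would have to wait. Linux-specific; fd
// is a raw descriptor such as int(file.Fd()) from the gist above.
func bytesBuffered(fd int) (int, error) {
	return unix.IoctlGetInt(fd, unix.FIONREAD)
}
```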
Update on syscall.Read in Go 1.11. Using the example code above, the syscall.Reads will start failing after about a minute with "resource temporarily unavailable". A band-aid is to occasionally re-request the fd and set it to non-blocking again, but this kind of starts to seem like magic. Do you know why the fd needs to be re-associated with the variable? Are syscalls not touching it in a way that the GC knows about?
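One guess (an assumption, not a confirmed diagnosis of this report): conn.File() returns a duplicate *os.File whose finalizer closes its descriptor once the *os.File becomes unreachable, and the gist above keeps only the integer fd alive after file.Fd(). If that is the cause, keeping the *os.File reachable should avoid the re-association dance:

```go
package reader

import (
	"os"
	"runtime"
	"syscall"
)

// runPinned is the gist's loop with the *os.File kept reachable. If a GC
// finalizer closing the duplicated descriptor is what breaks the reads,
// runtime.KeepAlive pins file (and therefore fd) for the life of the loop.
func runPinned(file *os.File, pkt []byte) {
	fd := int(file.Fd())
	syscall.SetNonblock(fd, true)
	for {
		n, err := syscall.Read(fd, pkt)
		if err == syscall.EAGAIN {
			continue // buffer empty: relinquish/sleep as before
		}
		if err != nil || n == 0 {
			break
		}
		// parsing logic here
	}
	runtime.KeepAlive(file) // file cannot be finalized before this point
}
```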
I want a non-blocking version of Read. In my case, I want to check whether a TCP connection is closed or not before sending a packet on the connection. I tried to implement this with SetReadDeadline. There are two problems:
There is another approach: use a dedicated goroutine for reading.
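A sketch of that approach (names and buffer size illustrative): one goroutine owns the blocking Read and hands packets over a channel, so the consumer gets a non-blocking check in pure Go via select with a default case.

```go
package reader

import "net"

// readToChan blocks in Read on its own goroutine and delivers each packet
// on out. Closing the channel signals a read error or EOF.
func readToChan(conn net.Conn, out chan<- []byte) {
	defer close(out)
	for {
		buf := make([]byte, 1500) // fresh buffer per packet, since it is handed off
		n, err := conn.Read(buf)
		if err != nil {
			return
		}
		out <- buf[:n]
	}
}
```

The consumer can then poll without blocking:

```go
select {
case pkt, ok := <-packets:
	_ = pkt // process; ok == false means the reader stopped
default:
	// no data buffered right now
}
```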
Please open a new issue; perhaps a proposal might be better.
Looks like it's related to #15735.
To clarify - my request is not related to #15735 and, in my opinion, the net.Conn methods already allow for the #15735 request. What I want is non-blocking reads, or the ability to see on the .Read return whether it had to wait. As of now this is hidden. Having this simplifies cases where a programmer does not want to continue with repeated blocking reads when the port buffer is empty and awaiting more data. This can be accomplished through timing logic if one has a fixed-interval stream of data, but the logic can be tricky, as it depends on system burden and the hardware being used. It is not possible to easily write the logic for non-periodic streams. A non-blocking read tells you specifically whether there is data available or not. Another benefit realized with syscall.Read is a reduction in CPU, but this is secondary to the CPU reduction for non-blocking reads, which would also be realized if the net.Conn library supported them.

The CPU benefit I get from using non-blocking reads is a reduction from ~6% to 0.5%. By writing a return-on-no-data into net.Conn there is a similar reduction, though not quite as far, as there is still additional wrapper code around the syscall.Read calls: 6% down to 1% using Go. The additional drop from 1.0% down to 0.5% would not have been worth the time to switch to syscall.Read calls for me.
I doubt #15735 as it stands now allows checking whether any data in the socket is available to read, though that is the main goal of that issue.
If #15735 desires immediate return of whether data is available, then sure. Go does allow for understanding whether data is available or not, just not with a non-blocking return. It is the non-blocking functionality that I desire, primarily for CPU usage reasons. Not needing to handle the non-blocking sleep logic would be even better. Another solution for higher-performance applications might be to just supply a return value specifying whether conn.Read blocked.
I'm sorry, I don't understand what this issue is about now. As @mikioh suggested, if you have a problem using syscall.Read, please open a new issue describing that problem. Thanks.
Problem: CPU is high when processing high-bandwidth streamed data (40 us packet interval with 1500 byte packets). One method to lower CPU is to batch-process buffered data with intermediate sleeps, e.g. sleep for 1 ms, process accumulated data until the buffer is drained, sleep for 1 ms... Non-blocking reads simplify the sleep condition analysis, and Go does not (outside of syscall.Read) support non-blocking reads.

Ian's proposed solution: use syscall.Read for "non-blocking" reads.

Potential problems with this solution:

1. This may only be possible on Linux, as my method of using a file descriptor throws a Windows build error. We decided not to support Windows in the high-performance use case.
2. It seems we need to re-associate the file descriptor periodically to avoid losing the connection. This is only applicable to use cases where the connection is left open for streaming, as ours is. Re-association does not require much CPU, so we re-associate whenever the non-blocking read fails.

Additional benefits of this solution: syscall.Read also short-circuits abstraction layers of the net.Conn infrastructure, and thus is slightly more CPU efficient than enabling non-blocking I/O in the net.Conn packages would be.

Future: If Go were to add a non-blocking read feature to net.Conn, we would consider using it. It would enable cross-OS support, avoid the need to grab file descriptor handles, remove the syscall import, and re-enable cross-OS compilation.

Understand that we are no longer making changes to this portion of our application and are satisfied with the syscall.Read solution. Thanks.
Thanks for the note. We have no current plans for adding non-blocking I/O to the net package.
I have created a new issue: #28607.
It is likely that I missed something, but in implementing a network stream collector which sits and spins on arriving UDP messages (arriving every 40 microseconds), I see quite heavy CPU utilization.
This is using conn.SetReadDeadline with conn.Read. Note that the conn.SetReadDeadline is pulled out to a larger interval of every 25-50 packets rather than every packet. This same interval is used to insert Sleep() calls that allow the port to buffer packets for increased CPU efficiency. Somewhat complicated, but the end CPU utilization was around 1% on Linux, down from utilizing most of a core.
Much of this is because when conn.Read() is called prior to a packet arriving the CPU spins on the read until data arrives, or until the ReadDeadline is hit. This would be much simpler with a method that returns on insufficient data (perhaps conn.TryRead()), or surfacing a conn.Length().
To this end I changed a line in the fd_unix.go file on my local install to achieve a conn.Read method that simply returns when data is unavailable. I understand that this would break backward compatibility, so I am not suggesting a change to conn.Read(), but rather the addition of a method which allows for a non-blocking read call, as this avoids burdening the kernel with repeated polling and in my experience results in fairly substantial CPU gains.
An additional benefit is that on Windows (assuming a similar addition is possible there), where Sleep timeouts and timer precision are both much less granular, this would help even more.
I believe this addition will aid all streaming protocols and applications which want to maximize the number of streams and connections they are processing.
Inside fd_unix.go:
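(The snippet itself did not survive in this copy of the issue. As an illustration only, not the reporter's actual diff: in the FD.Read loop paraphrased earlier in the thread, a change of the kind described would return from the EAGAIN branch instead of parking in the poller.)

```go
// Illustrative reconstruction, not the original snippet: inside the FD.Read
// loop in internal/poll/fd_unix.go, return on EAGAIN instead of blocking.
if err == syscall.EAGAIN && fd.pd.pollable() {
	return 0, err // changed: report "no data" instead of fd.pd.waitRead(...)
}
```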
Am I missing some call which achieves a non-blocking read? Thanks for your time in reading this.