-
Notifications
You must be signed in to change notification settings - Fork 18k
net: ExecIO invalid number of bytes written #18887
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
|
What version of Go? |
A git clone from about a month ago |
Sorry, I meant what version of Windows? |
Hmm, its wine 2.0-rc1 :) |
You think its a good idea to ask wine developers about why some internal method in the deep windows runtime core of golang returns invalid number of bytes? |
It's Wine's job to act like Windows. If a program (e.g. grpc-go's test suite) runs fine on Windows but doesn't run on Wine, then it's a bug in Wine. I don't care how the Wine developers debug it. I'm just saying it's not Go's job to adapt to their differences in behavior. ("bugs") If a user has Wine bugs with a Go program, they can just use their Go program on Linux, FreeBSD, NetBSD, OpenBSD, Mac OS X, Dragonfly BSD, Plan 9, Solaris, or other operating systems that Go natively supports. There's no reason for users to subject themselves to Wine, right? What am I missing? |
I can not ask wine developers to debug issues like network connection write returning invalid number of bytes, since there are so many levels prior calling real syscall which writes the data. |
Why are you using Wine? |
That's actually a pretty frequent issue, and it doesn't look like something is broken (it is not until it is flush call, and as a workaround I will submit a proof-of-concept patch which spins inside Flush() method, well, flushing data into the writer, and only returns after error):
print above was made from the following debug patch:
|
/cc @alexbrainman as FYI |
@bradfitz Since I work on linux and do not really know windows ecosystem, I decided to work with wine (first). |
@bioothod, fair enough. I just want you to know that you're in uncharted waters. Good luck. |
@bradfitz I understand that, and thnk you. Though it is not that much different from some other complex system one has to have to deal with, its just takes some time to get into |
/cc @johnsonj too Jeff, this isn't a priority for us since it works on Windows and only fails on Wine, but it might be interesting to you anyway to verify that we're using the Win32 API correctly and not making too many assumptions. Maybe we're just getting lucky. |
So, if I read your output correctly, WSASend is asked to send 30 (or 37) bytes, and returned that it has sent 38 bytes. That sounds wrong to me. Mind you, the WSASend might be returning error. Does it return error? Alex |
Yup, it looks like it was asked to send 30 bytes, but it returns 38. It looks like it is always 38, but I'm not 100% sure. There is no error in this case, both errno and return value are zero. According to this https://msdn.microsoft.com/en-us/library/windows/desktop/ms742203(v=vs.85).aspx, it is not recommended to use
I will investigate it further. |
@alexbrainman you know, just setting overlapped pointer to nil fixes the problem. No matter what @bradfitz and his fellow legal friends will tell about patches in github, but this is roughly the fix I use, and it passes my grpc test under quite a load of about 50 ping-pong krps which golang under wine never passed before
|
It requires a bit more overlapped work, namely only accept/connect/cancel use this structure now. What do you think, should I submit a patch? |
@bioothod @alexbrainman do we allow the creation of overlapped sockets in the first place? A little grep-ing doesn't show any exposure. This could be a case of Wine not ignoring |
Well, some code uses |
Do you know if this only occurs with multiple writes? It'd be great to see a min-repro or failing test in golang. I'm curious if we're passing Win32 a pointer then re-using it again for a subsequent call before it's completed. We need to allocate a unique
Setting the
|
It does happen with single writer too. It looks like all sockets are potentially overlapped (I did not understand whether it is required or not to set My patch makes it clear that we do not have async sockets, which we did not have anyway - As of whether this bug is a wine bug or not - I'm not sure it is wine, since msdn clearly says that it is possible to have error if both sent bytes and overlapped pointers are set
and the fact that native windows does not expose this behaviour doesn't mean it will not in some future versions. Wine kind of pays it forward :) |
Sorry to clarify- multiple writes meaning more than a single send on a given socket. Not necessarily multiple goroutines because a socket opened with It could simply be a timing issue that doesn't repro in Windows today but I agree we are using the API incorrectly. We shouldn't look at (or pass in a handle for writing) Perhaps we extend |
I didn't check how grpc sends data, but my protocol implies request-response of quite small messages (like 20-30 bytes), so I believe they all are being sent with single write-single read in a loop one after another. Sometimes I got this error quickly after the start, it looks like it could be the race you mentioned but on the server side after it just had accepted the client and started to send him the first message. It can be completely different though. Changing windows code to be fully async is a nice way to go, but it will not be much of a profit, since there are no users out there who might fully utilize this feature - there is no api and there are no other operation systems who might implement it as well. Quite contrary, using common |
No, grpc has separate goroutines doing writes and doing reads. It's not one goroutine alternating between the two tasks. |
As far as I can tell, we use The fix of passing a nil |
You are right, there is async IO and it should work if My patch which dropped overlapped structure from read and write path (actually they both seems to be affected on wine) simply turned async sockets into sync, and it either "fixed" this race or just disabled wine-specific artefacts. It looks like testing this possible overlapped bug will be a bit more challenging - I not only have to allocate overlapped control structure per |
Perhaps Go implementation is at fault here. But we never seen any errors here. Going by https://msdn.microsoft.com/en-us/library/windows/desktop/ms742203(v=vs.85).aspx Overlapped Socket I/O I assumed that we must pass lpNumberOfBytesSent parameter in case IO completes before WSASend finishes. So we don't wait for IO to complete after WSASend returned.
I am not surprised. But that makes WSASend wait until IO completes. So your program will be consuming OS thread here waiting for IO. That is what Go net library is trying to avoid here.
I do not know how Go makes sockets overlapped. Maybe all Windows TCP sockets are OK for non-blocking IO. Maybe the fact that we pass OVERLAPPED everywhere makes them non-blocking.
It is not true. We can reuse lpOverlapped again and again (and we do). As long as we do not use same lpOverlapped for different IOs simultaneously.
You do not have to use completion callback function. You can just use WaitForSingleObject (or any other Wait* function) and WSAGetOverlappedResult to get IO result. Or use CreateIoCompletionPort, GetQueuedCompletionStatus and WSAGetOverlappedResult. You can see Go implementation in netpoll_windows.go in runtime package.
See my explanation above. Alex |
Thanks Alex.
See my comment. If ConnectEx is on a platform and suitable then it is used which makes the socket overlapped. Edit: I see that net.ioSrv.ExecIO blocks until the call is completed which answers my question below. Curious why it's ok to reuse the same Here's psuedocode of what I think is happening today: WSAOVERLAPPED Overlapped;
WSASendTo(SendToSocket, &DataBuf1, 1,
&BytesSent, Flags, &RecvAddr,
RecvAddrSize, &Overlapped, NULL);
WSASendTo(SendToSocket, &DataBuf2, 1,
&BytesSent, Flags, &RecvAddr,
RecvAddrSize, &Overlapped, NULL); Wine keeps a long living reference to the structure on a per-WSASend-call basis. |
We do. windows netFD has pd field that is used to ensure only single write in flight. Alex |
So, @alexbrainman what is the solution for my problem? Setting This is trivially reproducible under wine with several clients and grpc_test server I posted link above. |
As I understand it, setting lpOverlapped to nil makes WSASend wait until IO completes before returning. That, obviously, occupy OS thread. If that suits you, then fine.
WSASend should not return number of bytes sent greater than requested. It is a bug in WSASend, or the way we use WSASend. But I don't see else we can call WSASend. Perhaps you can look at WINE source code (or even debug that call) to understand the problem.
I don't see that as a problem. It is up to the caller to handle this case. If some code fails to do that, I think that code needs adjustment. Alex |
Well, I started to debug wine to find out why
|
Filled wine bug: https://bugs.winehq.org/show_bug.cgi?id=42377 |
Fixed in wine: https://bugs.winehq.org/show_bug.cgi?id=42377#c18 |
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (
go version
)?go version devel +ffedff7 Sun Jan 8 00:01:30 2017 +0000 linux/amd64
What operating system and processor architecture are you using (
go env
)?It is linux, but go was cross-compiled for windows
While debugging bug in grpc-go server bugs grpc/grpc-go#1054 I added several debug prints to find out why
bufio.Flush()
sometimes returnsshort write
error when flushing data into http2 connection, in particular, I checked number of bytes returned fromnetFD.Write()
method, and although most of the time it returned smaller number of bytes than buffer size, I got this error once:I.e.
len(buf)
in this method returns 33, butExecIO
returned 38.How is it possible?
The text was updated successfully, but these errors were encountered: