net: when LookupIP is timed out, all duplicate lookups wait #10117
Comments
What if the timeout is only a temporary network flake? Do you want one temporary failure to affect all the duplicate lookups?
It won't matter. The code says to forget that host lookup, so the current waiters are stranded. Only new calls made after the "Forget" will cause a new lookup.
@minux I see what you are saying. I also noticed that the read of c.dups in the return statement looks racy, since it isn't guarded by a lock.
Here is the stack showing all the goroutines stuck in a lookup with nothing actually doing a lookup.
Which version of Go are you using? The file names suggest you are using a version before Go 1.4. I expect that this was fixed in Go 1.4 by the fix for issue #8602.
This is Go 1.3. You are correct that it looks like with the Forget in there, this problem should go away. But the c.dups on Line 62 in 77595e4
Will close this out since it's fixed in Go 1.4 and beyond.
The reference to c.dups on line 62 is safe because it happens after doCall returns. After doCall returns the entry has been removed from the g.m map, which means that it will never again be modified. All modifications to the entry happen with the lock held, so the lock in doCall provides the necessary happens-before relation.
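To make that argument concrete, here is a simplified sketch of the pattern being described. It is not the actual net package source: the `call` and `group` types, the `do` method, and the `sharedCount` helper are assumptions written for illustration. The point it shows is that `dups` is only incremented while the mutex is held, the entry is deleted from the map under the same mutex, and only then is `dups` read without the lock.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// call is one in-flight lookup (hypothetical, modeled on the discussion).
type call struct {
	wg   sync.WaitGroup
	val  int
	dups int // written only while group.mu is held
}

type group struct {
	mu sync.Mutex
	m  map[string]*call
}

// do runs fn once per key and reports whether the result was shared.
func (g *group) do(key string, fn func() int) (int, bool) {
	g.mu.Lock()
	if g.m == nil {
		g.m = make(map[string]*call)
	}
	if c, ok := g.m[key]; ok {
		c.dups++ // modification happens with the lock held
		g.mu.Unlock()
		c.wg.Wait()
		return c.val, true
	}
	c := new(call)
	c.wg.Add(1)
	g.m[key] = c
	g.mu.Unlock()

	c.val = fn()
	c.wg.Done()

	g.mu.Lock()
	delete(g.m, key) // from here on nobody can find c to modify it
	g.mu.Unlock()

	// Unlocked read: every c.dups++ ran in a critical section that
	// precedes the delete above, so the mutex orders it before this read.
	return c.val, c.dups > 0
}

// sharedCount runs one leader and one duplicate and counts how many of
// them were told the result was shared.
func sharedCount() int {
	g := &group{}
	release := make(chan struct{})
	fn := func() int { <-release; return 42 }

	results := make(chan bool, 2)
	for i := 0; i < 2; i++ {
		go func() {
			_, shared := g.do("example.com", fn)
			results <- shared
		}()
	}

	// Hold fn open until the duplicate has registered itself.
	for {
		g.mu.Lock()
		c, ok := g.m["example.com"]
		ready := ok && c.dups == 1
		g.mu.Unlock()
		if ready {
			break
		}
		time.Sleep(time.Millisecond)
	}
	close(release)

	n := 0
	for i := 0; i < 2; i++ {
		if <-results {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println("callers that saw a shared result:", sharedCount())
}
```

Running a sketch like this under `go run -race` is one way to check the reasoning, since the race detector tracks the same mutex-based happens-before edges.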
It seems that if many goroutines look up the same host, they are all tied to the one goroutine that does the actual DNS resolve. If it succeeds, the answer is provided to everyone and all is happy.
When it times out, however, the other goroutines are not released; each of them must wait out its own timeout as well. It would make more sense to release them immediately with errTimeout.
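The behavior requested here could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the fix that landed in Go 1.4: the `group` and `call` types, the `do` method, the `countTimeouts` helper, and the local `errTimeout` variable are all hypothetical names. The leader applies the timeout once, records the error on the shared entry, and closes a channel so that every duplicate is released at the same moment with the same error.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// errTimeout stands in for the resolver's timeout error (assumed name).
var errTimeout = errors.New("lookup timed out")

// call tracks one in-flight lookup shared by duplicate callers.
type call struct {
	done chan struct{} // closed when the lookup finishes or times out
	val  string
	err  error
}

// group deduplicates concurrent lookups for the same key.
type group struct {
	mu sync.Mutex
	m  map[string]*call
}

// do runs fn once per key. Duplicates wait for the leader and receive
// the leader's result, including a timeout error.
func (g *group) do(key string, timeout time.Duration, fn func() (string, error)) (string, error) {
	g.mu.Lock()
	if g.m == nil {
		g.m = make(map[string]*call)
	}
	if c, ok := g.m[key]; ok {
		g.mu.Unlock()
		<-c.done // released as soon as the leader gives up or succeeds
		return c.val, c.err
	}
	c := &call{done: make(chan struct{})}
	g.m[key] = c
	g.mu.Unlock()

	type result struct {
		val string
		err error
	}
	ch := make(chan result, 1) // buffered so a late fn does not block forever
	go func() {
		v, err := fn()
		ch <- result{v, err}
	}()

	select {
	case r := <-ch:
		c.val, c.err = r.val, r.err
	case <-time.After(timeout):
		c.err = errTimeout
	}

	g.mu.Lock()
	delete(g.m, key) // later callers start a fresh lookup
	g.mu.Unlock()
	close(c.done) // wake every duplicate, whether success or timeout
	return c.val, c.err
}

// countTimeouts starts n concurrent lookups against a slow resolver and
// reports how many of them returned errTimeout.
func countTimeouts(n int) int {
	g := &group{}
	slow := func() (string, error) {
		time.Sleep(500 * time.Millisecond) // far longer than the timeout
		return "192.0.2.1", nil
	}

	var wg sync.WaitGroup
	var mu sync.Mutex
	count := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, err := g.do("example.com", 50*time.Millisecond, slow)
			if errors.Is(err, errTimeout) {
				mu.Lock()
				count++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return count
}

func main() {
	fmt.Println("lookups that timed out:", countTimeouts(8))
}
```

With this shape, a duplicate that joins late waits only for the remainder of the leader's timeout rather than starting its own full timer.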