net: LookupIP("doesnotexist.domain") returns "server misbehaving" when resolv.conf contains search lists #12712
Comments
Without more information or a way to reproduce it, we won't be able to do much. Can you capture the network traffic when you run the Go program, or dig? From that we should be able to write a test, once we see what's arriving on the wire from your DNS server.
If you can tell me how to make a local copy of Go's net package work, I can dump out the relevant data structures and maybe even find the issue myself. Unfortunately, just copying /opt/go/src/net to my local directory and changing import "net" to import "../net" does not work. It says "imports internal/singleflight: use of internal package not allowed". Copying /opt/go/src/internal as well does not help. Copying "net" and "internal" to $GOPATH/src (without changing import "net") does not work either. They are simply ignored and the global system version keeps being used.
You won't be able to do that. Just build Go from source and modify the net package. Run "./make.bash" from the "src" directory to recompile Go. It takes 30-60 seconds to build everything.
I've found the problem. My resolv.conf has a line "search foo bar" for 2 subdomains foo and bar. For whatever reason queries to anything.bar are answered with error code 2 SERVFAIL. LookupIP("doesnotexist.domain") apparently uses the search field in resolv.conf and queries "doesnotexist.domain", "doesnotexist.domain.foo" and "doesnotexist.domain.bar" in sequence and reports the LAST ERROR CODE returned, which happens to be SERVFAIL. If the order in resolv.conf is changed to "search bar foo", LookupIP reports the NXDOMAIN error (no such host). While it is apparent that the DNS server is misbehaving, the current LookupIP() implementation is suboptimal, because:
In any case, I would change LookupIP() so that if the DNS server returns different error codes for the different queries made due to "search", it always returns the error code for the query with the actual argument passed to LookupIP(). This is the only query that is guaranteed to happen and therefore the only code that can be expected to be consistent.
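The order dependence described above can be reproduced without a real DNS server. The following is a minimal sketch, not the net package's actual code: `candidates`, `fakeQuery`, and `lookupLastError` are hypothetical names, the candidate order follows the sequence given in this report, and the fake server is assumed to answer SERVFAIL for anything under "bar" and NXDOMAIN otherwise.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Hypothetical stand-ins for the two errors discussed in this issue.
var (
	errNoSuchHost  = errors.New("no such host")
	errMisbehaving = errors.New("server misbehaving")
)

// candidates lists the names tried for one lookup: the name as given,
// then the name with each search suffix appended, in resolv.conf order.
// (Assumption: this mirrors the order described in the report.)
func candidates(name string, search []string) []string {
	names := []string{name}
	for _, suffix := range search {
		names = append(names, name+"."+suffix)
	}
	return names
}

// fakeQuery stands in for a DNS server that answers SERVFAIL for
// anything under "bar" and NXDOMAIN for everything else.
func fakeQuery(name string) error {
	if strings.HasSuffix(name, ".bar") {
		return errMisbehaving
	}
	return errNoSuchHost
}

// lookupLastError tries every candidate and reports the last error seen.
func lookupLastError(name string, search []string) error {
	var last error
	for _, n := range candidates(name, search) {
		err := fakeQuery(n)
		if err == nil {
			return nil
		}
		last = err
	}
	return last
}

func main() {
	// "search foo bar": the SERVFAIL for the .bar query comes last and wins.
	fmt.Println(lookupLastError("doesnotexist.domain", []string{"foo", "bar"}))
	// "search bar foo": the NXDOMAIN for the .foo query comes last and wins.
	fmt.Println(lookupLastError("doesnotexist.domain", []string{"bar", "foo"}))
}
```

With the search list in the order "foo bar" the sketch reports "server misbehaving"; with "bar foo" it reports "no such host", even though the name passed in is the same.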
Go 1.5 uses Go by default for DNS lookups, and only falls back to libc's resolver for special cases. See https://golang.org/doc/go1.5#net for details. So, it's not surprising that the Go DNS resolver has some rough edges. It's being exercised a lot more than it has been in the past. /cc @mikioh
FWIW, IIRC, troubleshooting tools such as dig and/or drill don't use the search list for super (not sub) domains in resolv.conf by default. It looks like you have some ideas for LookupIP. I'm not the original designer of LookupIP, but I guess he wanted to make LookupIP convenient for name-to-address mapping traversal. If you are thinking of a new API for some purpose (sorry, we cannot change the behavior of LookupIP because it works as a stub resolver for helping Dial), please follow the procedure: https://github.com/golang/proposal#readme Thank you.
Why did you close this bug?
PS: If you need more low-level control over DNS, there are external packages. For example, http://godoc.org/github.com/miekg/dns.
My hand just slipped.
I don't think @mbenkmann wanted low-level control of DNS. I interpreted this bug as Go's native DNS resolver just misbehaving compared to libc.
To repeat, here is what I (and I would assume most application programmers) expect from LookupIP()
I'm repeating myself but I really need to drive this point home: My application, like many, resolves host names provided by users. Users make mistakes, especially typos. When LookupIP() fails I write the error into a log file and users look at that log file. It makes a HUGE difference if that log file says "host not found" or "server misbehaving". It's the difference between allowing the user to quickly realize he's made a typo and fix the problem himself, or forcing the user to file a support request with me that will lead to a fruitless discussion, unpleasant to both sides, about a DNS server and system configuration that neither I nor the user have under our control.
I totally agree with. A packet capture would be helpful for debugging and writing a regression test. |
There's nothing special about the server's reply. It's just an error code 2 SERVFAIL, which triggers the "server misbehaving" branch in net/dnsclient.go:answer(). The answer() function is called for each of the queries attempted during lookup (i.e. for each entry in search) and, as currently implemented, the last error propagates up to be the return from LookupIP(). So for regression testing, just set up your test DNS server to reply SERVFAIL for each query that ends in a certain subdomain and list that subdomain in resolv.conf's search line. Once this issue is fixed, it should no longer matter if the broken subdomain is listed first or last, and unless the argument to LookupIP() explicitly includes the broken subdomain, the SERVFAIL should never propagate up.
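The reply rule for such a test server, and the way a response code turns into the two messages discussed here, can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in net/dnsclient.go: `stubRcode` and `describe` are hypothetical names, and the only facts relied on are the standard DNS response codes (0 success, 2 SERVFAIL, 3 NXDOMAIN) and the two error strings quoted in this issue.

```go
package main

import (
	"fmt"
	"strings"
)

// Standard DNS response codes (RFC 1035).
const (
	rcodeSuccess  = 0
	rcodeServFail = 2
	rcodeNXDomain = 3
)

// stubRcode is the reply rule for a hypothetical test server:
// SERVFAIL for any name under brokenSuffix, NXDOMAIN otherwise.
func stubRcode(qname, brokenSuffix string) int {
	if strings.HasSuffix(qname, "."+brokenSuffix) {
		return rcodeServFail
	}
	return rcodeNXDomain
}

// describe maps a response code to the message a caller would see.
// (Assumption: anything that is neither success nor NXDOMAIN is
// treated as "server misbehaving", as described in this thread.)
func describe(rcode int) string {
	switch rcode {
	case rcodeSuccess:
		return "ok"
	case rcodeNXDomain:
		return "no such host"
	default:
		return "server misbehaving"
	}
}

func main() {
	queries := []string{
		"doesnotexist.domain",
		"doesnotexist.domain.foo",
		"doesnotexist.domain.bar",
	}
	for _, q := range queries {
		fmt.Printf("%s -> %s\n", q, describe(stubRcode(q, "bar")))
	}
}
```

A regression test built on this rule would list "bar" in the search line, once first and once last, and check that the reported error is the same in both cases.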
Related to #12778?
Similar to #12778 I took a quick look at glibc to see how it handles this. If I'm reading/interpreting correctly, there is a special case for SERVFAIL when trying names with |
I did some more digging and was able to repro this using a DNS server built with github.com/miekg/dns. Instead of the current capturing and returning of the last seen error, it probably would make sense to prefer the error encountered when looking up the name closest to what the caller passed in. This would mostly mimic what glibc does (here and here), though it only returns the error encountered when looking up the provided name if it has enough dots. Otherwise it will return the last error encountered while trying names with search domains appended, similar to what the Go implementation is doing now. I'm happy to put a CL together if preference for errors close to user input sounds like a good plan.
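The proposed preference can be sketched in a few lines. This is only an illustration of the idea, not the contents of any CL and not glibc's logic: `lookupPreferOriginal` and `brokenBar` are hypothetical names, the "enough dots" condition is deliberately left out, and the name as given is assumed to be queried first.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Hypothetical stand-ins for the two errors discussed in this issue.
var (
	errNoSuchHost  = errors.New("no such host")
	errMisbehaving = errors.New("server misbehaving")
)

// lookupPreferOriginal tries the name as given, then the name with each
// search suffix appended. If every attempt fails, it returns the error
// from the query for the name as given rather than the last error seen.
func lookupPreferOriginal(name string, search []string, query func(string) error) error {
	first := query(name)
	if first == nil {
		return nil
	}
	for _, suffix := range search {
		if query(name+"."+suffix) == nil {
			return nil
		}
	}
	return first
}

// brokenBar stands in for a server that answers SERVFAIL for anything
// under "bar" and NXDOMAIN for everything else.
func brokenBar(name string) error {
	if strings.HasSuffix(name, ".bar") {
		return errMisbehaving
	}
	return errNoSuchHost
}

func main() {
	// Both search orders now report the same error.
	fmt.Println(lookupPreferOriginal("doesnotexist.domain", []string{"foo", "bar"}, brokenBar))
	fmt.Println(lookupPreferOriginal("doesnotexist.domain", []string{"bar", "foo"}, brokenBar))
}
```

Under these assumptions the result no longer depends on where the broken domain sits in the search list, while a name that does resolve under one of the suffixes still succeeds.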
CL https://golang.org/cl/16953 mentions this issue.
The following test program returns "lookup doesnotexist.domain: no such host" when run with https://storage.googleapis.com/golang/go1.4.2.linux-amd64.tar.gz but returns "lookup doesnotexist.domain on 172.16.2.203:53: server misbehaving" with https://storage.googleapis.com/golang/go1.5.linux-amd64.tar.gz and https://storage.googleapis.com/golang/go1.5.1.linux-amd64.tar.gz
When I change resolv.conf to use nameserver 8.8.8.8 the output is correct. Apparently something has changed in Go 1.5 that prevents it from understanding the reply from our internal DNS server.
nslookup does not have a problem:
dig also has no problem