MFdragonfly: resolver fix for timeouts on unqualified hostnames
res_search only incremented got_servfail for h_errno == TRY_AGAIN *AND* hp->rcode == SERVFAIL. However, there are cases, such as timeouts, where rcode is not always set to SERVFAIL. This leads to inconsistent nameserver operation during multi-domain and truncated dot searches, especially during booting when portions of the network are being brought up simultaneously with DNS lookups.

This patch attempts to correct the problem by unconditionally terminating the search if TRY_AGAIN is returned (after res_query has gone through all retries and name servers) instead of trying other domain elements in the domain search path.

This patch should fix reported problems (which I can reproduce) with some NFS mounts failing during boot. This occurred because mount_nfs thought the lookup of a non-dotted host name had returned a definitive failure when, in fact, it timed out on the first part (host.search.domain.name) and got a definitive host-not-found response on the second part (host.).

Generally speaking, search path name server timeouts can exceed 60 seconds per element, and most machines which consistently time out on earlier portions of a search path are effectively non-operational due to the imposed delays. In these situations it is more important for DNS lookups to return the proper error code than to be able to recover a valid lookup in later portions of the search path.

Obtained from: DragonFly
MFC after: 3 weeks
parent 75988358a2
commit 1cc11684ac
@@ -273,11 +273,24 @@ res_search(name, class, type, answer, anslen)
 				/* keep trying */
 				break;
 			case TRY_AGAIN:
-				if (hp->rcode == SERVFAIL) {
-					/* try next search element, if any */
-					got_servfail++;
-					break;
-				}
+				/*
+				 * This can occur due to a server failure
+				 * (that is, all listed servers have failed),
+				 * or all listed servers have timed out.
+				 * hp->rcode may not be set to SERVFAIL in the
+				 * case of a timeout.
+				 *
+				 * Either way we must terminate the search
+				 * and return TRY_AGAIN in order to avoid
+				 * non-deterministic return codes.  For
+				 * example, loaded name servers or races
+				 * against network startup/validation (dhcp,
+				 * ppp, etc) can cause the search to timeout
+				 * on one search element, e.g. 'fu.bar.com',
+				 * and return a definitive failure on the
+				 * next search element, e.g. 'fu.'.
+				 */
+				++got_servfail;
 				/* FALLTHROUGH */
 			default:
 				/* anything else implies that we're done */
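The behavior change described in the commit message can be illustrated with a small standalone model. This is only a sketch, not the libc code: the names sim_herr, sim_result and sim_search are hypothetical, and the parts of the flow that the hunk does not show (the bare name being tried after the search path, and a recorded got_servfail turning the final error into TRY_AGAIN) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for h_errno values; the real ones are in <netdb.h>. */
enum sim_herr { SIM_OK, SIM_HOST_NOT_FOUND, SIM_TRY_AGAIN };

/* Hypothetical outcome of querying one search-path element. */
struct sim_result {
	enum sim_herr herr;	/* simulated h_errno for this element */
	int rcode_servfail;	/* nonzero if the reply carried SERVFAIL */
};

/*
 * Model of the search loop.  'path' holds the outcome for each search
 * domain in order; 'as_is' is the outcome for the bare name, which this
 * model ASSUMES is tried last.  'patched' selects the new TRY_AGAIN
 * handling (nonzero) or the old one (zero).
 */
enum sim_herr
sim_search(const struct sim_result *path, size_t npath,
    struct sim_result as_is, int patched)
{
	int got_servfail = 0;
	size_t i;

	assert(path != NULL || npath == 0);

	for (i = 0; i < npath; i++) {
		int done = 0;

		switch (path[i].herr) {
		case SIM_OK:
			return (SIM_OK);
		case SIM_HOST_NOT_FOUND:
			/* keep trying */
			break;
		case SIM_TRY_AGAIN:
			if (patched) {
				/* new: always record it and stop searching */
				++got_servfail;
				done = 1;
			} else if (path[i].rcode_servfail) {
				/* old: record it, try next search element */
				got_servfail++;
			} else {
				/* old: timeout, stop without recording it */
				done = 1;
			}
			break;
		}
		if (done)
			break;
	}

	if (as_is.herr == SIM_OK)
		return (SIM_OK);
	/* ASSUMED: a recorded server failure overrides the last error. */
	if (got_servfail)
		return (SIM_TRY_AGAIN);
	return (as_is.herr);
}
```

With a single search domain that times out without SERVFAIL and a bare name that gets host-not-found, the old handling reports SIM_HOST_NOT_FOUND while the patched handling reports SIM_TRY_AGAIN, which matches the mount_nfs failure described above.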