freebsd-dev/contrib/bind9/doc/draft/draft-ietf-dnsop-respsize-02.txt
2005-12-29 04:22:58 +00:00

480 lines
18 KiB
Plaintext
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

DNSOP Working Group Paul Vixie, ISC
INTERNET-DRAFT Akira Kato, WIDE
<draft-ietf-dnsop-respsize-02.txt> July 2005
DNS Response Size Issues
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Copyright Notice
Copyright (C) The Internet Society (2005). All Rights Reserved.
Abstract
With a mandated default minimum maximum message size of 512 octets,
the DNS protocol presents some special problems for zones wishing to
expose a moderate or high number of authority servers (NS RRs). This
document explains the operational issues caused by, or related to
this response size limit.
Expires December 2005 [Page 1]
INTERNET-DRAFT July 2005 RESPSIZE
1 - Introduction and Overview
1.1. The DNS standard (see [RFC1035 4.2.1]) limits message size to 512
octets. Even though this limitation was due to the required minimum UDP
reassembly limit for IPv4, it is a hard DNS protocol limit and is not
implicitly relaxed by changes in transport, for example to IPv6.
1.2. The EDNS0 standard (see [RFC2671 2.3, 4.5]) permits larger
responses by mutual agreement of the requestor and responder. However,
deployment of EDNS0 cannot be expected to reach every Internet resolver
in the short or medium term. The 512 octet message size limit remains
in practical effect at this time.
1.3. Since DNS responses include a copy of the request, the space
available for response data is somewhat less than the full 512 octets.
For negative responses, there is rarely a space constraint. For
positive and delegation responses, though, every octet must be carefully
and sparingly allocated. This document specifically addresses
delegation response sizes.
2 - Delegation Details
2.1. A delegation response will include the following elements:
Header Section: fixed length (12 octets)
Question Section: original query (name, class, type)
Answer Section: (empty)
Authority Section: NS RRset (nameserver names)
Additional Section: A and AAAA RRsets (nameserver addresses)
2.2. If the total response size would exceed 512 octets, and if the data
that would not fit belonged in the question, answer, or authority
section, then the TC bit will be set (indicating truncation) which may
cause the requestor to retry using TCP, depending on what information
was desired and what information was omitted. If a retry using TCP is
needed, the total cost of the transaction is much higher. (See [RFC1123
6.1.3.2] for details on the protocol requirement that UDP be attempted
before falling back to TCP.)
2.3. RRsets are never sent partially unless truncation occurs, in which
case the final apparent RRset in the final nonempty section must be
considered "possibly damaged". With or without truncation, the glue
present in the additional data section should be considered "possibly
incomplete", and requestors should be prepared to re-query for any
damaged or missing RRsets. For multi-transport name or mail services,
Expires December 2005 [Page 2]
INTERNET-DRAFT July 2005 RESPSIZE
this can mean querying for an IPv6 (AAAA) RRset even when an IPv4 (A)
RRset is present.
2.4. DNS label compression allows a domain name to be instantiated only
once per DNS message, and then referenced with a two-octet "pointer"
from other locations in that same DNS message. If all nameserver names
in a message are similar (for example, all ending in ".ROOT-
SERVERS.NET"), then more space will be available for uncompressable data
(such as nameserver addresses).
2.5. The query name can be as long as 255 characters of presentation
data, which can be up to 256 octets of network data. In this worst case
scenario, the question section will be 260 octets in size, which would
leave only 240 octets for the authority and additional sections (after
deducting 12 octets for the fixed length header.)
2.6. Average and maximum question section sizes can be predicted by the
zone owner, since they will know what names actually exist, and can
measure which ones are queried for most often. For cost and performance
reasons, the majority of requests should be satisfied without truncation
or TCP retry.
2.7. Requestors who deliberately send large queries to force truncation
are only increasing their own costs, and cannot effectively attack the
resources of an authority server since the requestor would have to retry
using TCP to complete the attack. An attack that always used TCP would
have a lower cost.
2.8. The minimum useful number of address records is two, since with
only one address, the probability that it would refer to an unreachable
server is too high. Truncation which occurs after two address records
have been added to the additional data section is therefore less
operationally significant than truncation which occurs earlier.
2.9. The best case is no truncation. This is because many requestors
will retry using TCP by reflex, or will automatically re-query for
RRsets that are "possibly truncated", without considering whether the
omitted data was actually necessary.
2.10. Each added NS RR for a zone will add a minimum of between 16 and
44 octets to every untruncated referral or negative response from the
zone's authority servers (16 octets for an NS RR, 16 octets for an A RR,
and 28 octets for an AAAA RR), in addition to whatever space is taken by
the nameserver name (NS NSDNAME and A/AAAA owner name).
Expires December 2005 [Page 3]
INTERNET-DRAFT July 2005 RESPSIZE
3 - Analysis
3.1. An instrumented protocol trace of a best case delegation response
follows. Note that 13 servers are named, and 13 addresses are given.
This query was artificially designed to exactly reach the 512 octet
limit.
;; flags: qr rd; QUERY: 1, ANS: 0, AUTH: 13, ADDIT: 13
;; QUERY SECTION:
;; [23456789.123456789.123456789.\
123456789.123456789.123456789.com A IN] ;; @80
;; AUTHORITY SECTION:
com. 86400 NS E.GTLD-SERVERS.NET. ;; @112
com. 86400 NS F.GTLD-SERVERS.NET. ;; @128
com. 86400 NS G.GTLD-SERVERS.NET. ;; @144
com. 86400 NS H.GTLD-SERVERS.NET. ;; @160
com. 86400 NS I.GTLD-SERVERS.NET. ;; @176
com. 86400 NS J.GTLD-SERVERS.NET. ;; @192
com. 86400 NS K.GTLD-SERVERS.NET. ;; @208
com. 86400 NS L.GTLD-SERVERS.NET. ;; @224
com. 86400 NS M.GTLD-SERVERS.NET. ;; @240
com. 86400 NS A.GTLD-SERVERS.NET. ;; @256
com. 86400 NS B.GTLD-SERVERS.NET. ;; @272
com. 86400 NS C.GTLD-SERVERS.NET. ;; @288
com. 86400 NS D.GTLD-SERVERS.NET. ;; @304
;; ADDITIONAL SECTION:
A.GTLD-SERVERS.NET. 86400 A 192.5.6.30 ;; @320
B.GTLD-SERVERS.NET. 86400 A 192.33.14.30 ;; @336
C.GTLD-SERVERS.NET. 86400 A 192.26.92.30 ;; @352
D.GTLD-SERVERS.NET. 86400 A 192.31.80.30 ;; @368
E.GTLD-SERVERS.NET. 86400 A 192.12.94.30 ;; @384
F.GTLD-SERVERS.NET. 86400 A 192.35.51.30 ;; @400
G.GTLD-SERVERS.NET. 86400 A 192.42.93.30 ;; @416
H.GTLD-SERVERS.NET. 86400 A 192.54.112.30 ;; @432
I.GTLD-SERVERS.NET. 86400 A 192.43.172.30 ;; @448
J.GTLD-SERVERS.NET. 86400 A 192.48.79.30 ;; @464
K.GTLD-SERVERS.NET. 86400 A 192.52.178.30 ;; @480
L.GTLD-SERVERS.NET. 86400 A 192.41.162.30 ;; @496
M.GTLD-SERVERS.NET. 86400 A 192.55.83.30 ;; @512
;; MSG SIZE sent: 80 rcvd: 512
Expires December 2005 [Page 4]
INTERNET-DRAFT July 2005 RESPSIZE
3.2. For longer query names, the number of address records supplied will
be lower. Furthermore, it is only by using a common parent name (which
is GTLD-SERVERS.NET in this example) that all 13 addresses are able to
fit. The following output from a response simulator demonstrates these
properties:
% perl respsize.pl a.dns.br b.dns.br c.dns.br d.dns.br
a.dns.br requires 10 bytes
b.dns.br requires 4 bytes
c.dns.br requires 4 bytes
d.dns.br requires 4 bytes
# of NS: 4
For maximum size query (255 byte):
if only A is considered: # of A is 4 (green)
if A and AAAA are condered: # of A+AAAA is 3 (yellow)
if prefer_glue A is assumed: # of A is 4, # of AAAA is 3 (yellow)
For average size query (64 byte):
if only A is considered: # of A is 4 (green)
if A and AAAA are condered: # of A+AAAA is 4 (green)
if prefer_glue A is assumed: # of A is 4, # of AAAA is 4 (green)
% perl respsize.pl ns-ext.isc.org ns.psg.com ns.ripe.net ns.eu.int
ns-ext.isc.org requires 16 bytes
ns.psg.com requires 12 bytes
ns.ripe.net requires 13 bytes
ns.eu.int requires 11 bytes
# of NS: 4
For maximum size query (255 byte):
if only A is considered: # of A is 4 (green)
if A and AAAA are condered: # of A+AAAA is 3 (yellow)
if prefer_glue A is assumed: # of A is 4, # of AAAA is 2 (yellow)
For average size query (64 byte):
if only A is considered: # of A is 4 (green)
if A and AAAA are condered: # of A+AAAA is 4 (green)
if prefer_glue A is assumed: # of A is 4, # of AAAA is 4 (green)
(Note: The response simulator program is shown in Section 5.)
Here we use the term "green" if all address records could fit, or
"orange" if two or more could fit, or "red" if fewer than two could fit.
It's clear that without a common parent for nameserver names, much space
would be lost. For these examples we use an average/common name size of
15 octets, befitting our assumption of GTLD-SERVERS.NET as our common
parent name.
Expires December 2005 [Page 5]
INTERNET-DRAFT July 2005 RESPSIZE
We're assuming an average query name size of 64 since that is the
typical average maximum size seen in trace data at the time of this
writing. If Internationalized Domain Name (IDN) or any other technology
which results in larger query names be deployed significantly in advance
of EDNS, then new measurements and new estimates will have to be made.
4 - Conclusions
4.1. The current practice of giving all nameserver names a common parent
(such as GTLD-SERVERS.NET or ROOT-SERVERS.NET) saves space in DNS
responses and allows for more nameservers to be enumerated than would
otherwise be possible. (Note that in this case it is wise to serve the
common parent domain's zone from the same servers that are named within
it, in order to limit external dependencies when all your eggs are in a
single basket.)
4.2. Thirteen (13) seems to be the effective maximum number of
nameserver names usable traditional (non-extended) DNS, assuming a
common parent domain name, and given that response truncation is
undesirable as an average case, and assuming mostly IPv4-only
reachability (only A RRs exist, not AAAA RRs).
4.3. Adding two to five IPv6 nameserver address records (AAAA RRs) to a
prototypical delegation that currently contains thirteen (13) IPv4
nameserver addresses (A RRs) for thirteen (13) nameserver names under a
common parent, would not have a significant negative operational impact
on the domain name system.
5 - Source Code
#!/usr/bin/perl
#
# SYNOPSIS
# repsize.pl [ -z zone ] fqdn_ns1 fqdn_ns2 ...
# if all queries are assumed to have zone suffux, such as "jp" in
# JP TLD servers, specify it in -z option
#
use strict;
use Getopt::Std;
my ($sz_msg) = (512);
my ($sz_header, $sz_ptr, $sz_rr_a, $sz_rr_aaaa) = (12, 2, 16, 28);
my ($sz_type, $sz_class, $sz_ttl, $sz_rdlen) = (2, 2, 4, 2);
my (%namedb, $name, $nssect, %opts, $optz);
my $n_ns = 0;
Expires December 2005 [Page 6]
INTERNET-DRAFT July 2005 RESPSIZE
getopt('z', opts);
if (defined($opts{'z'})) {
server_name_len($opts{'z'}); # just register it
}
foreach $name (@ARGV) {
my $len;
$n_ns++;
$len = server_name_len($name);
print "$name requires $len bytes\n";
$nssect += $sz_ptr + $sz_type + $sz_class + $sz_ttl + $sz_rdlen + $len;
}
print "# of NS: $n_ns\n";
arsect(255, $nssect, $n_ns, "maximum");
arsect(64, $nssect, $n_ns, "average");
sub server_name_len {
my ($name) = @_;
my (@labels, $len, $n, $suffix);
$name =~ tr/A-Z/a-z/;
@labels = split(/./, $name);
$len = length(join('.', @labels)) + 2;
for ($n = 0; $#labels >= 0; $n++, shift @labels) {
$suffix = join('.', @labels);
return length($name) - length($suffix) + $sz_ptr
if (defined($namedb{$suffix}));
$namedb{$suffix} = 1;
}
return $len;
}
sub arsect {
my ($sz_query, $nssect, $n_ns, $cond) = @_;
my ($space, $n_a, $n_a_aaaa, $n_p_aaaa, $ansect);
$ansect = $sz_query + 1 + $sz_type + $sz_class;
$space = $sz_msg - $sz_header - $ansect - $nssect;
$n_a = atmost(int($space / $sz_rr_a), $n_ns);
$n_a_aaaa = atmost(int($space / ($sz_rr_a + $sz_rr_aaaa)), $n_ns);
$n_p_aaaa = atmost(int(($space - $sz_rr_a * $n_ns) / $sz_rr_aaaa), $n_ns);
printf "For %s size query (%d byte):\n", $cond, $sz_query;
printf "if only A is considered: ";
printf "# of A is %d (%s)\n", $n_a, &judge($n_a, $n_ns);
printf "if A and AAAA are condered: ";
printf "# of A+AAAA is %d (%s)\n", $n_a_aaaa, &judge($n_a_aaaa, $n_ns);
Expires December 2005 [Page 7]
INTERNET-DRAFT July 2005 RESPSIZE
printf "if prefer_glue A is assumed: ";
printf "# of A is %d, # of AAAA is %d (%s)\n",
$n_a, $n_p_aaaa, &judge($n_p_aaaa, $n_ns);
}
sub judge {
my ($n, $n_ns) = @_;
return "green" if ($n >= $n_ns);
return "yellow" if ($n >= 2);
return "orange" if ($n == 1);
return "red";
}
sub atmost {
my ($a, $b) = @_;
return 0 if ($a < 0);
return $b if ($a > $b);
return $a;
}
Security Considerations
The recommendations contained in this document have no known security
implications.
IANA Considerations
This document does not call for changes or additions to any IANA
registry.
IPR Statement
Copyright (C) The Internet Society (2005). This document is subject to
the rights, licenses and restrictions contained in BCP 78, and except as
set forth therein, the authors retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR
IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Expires December 2005 [Page 8]
INTERNET-DRAFT July 2005 RESPSIZE
Authors' Addresses
Paul Vixie
950 Charter Street
Redwood City, CA 94063
+1 650 423 1301
vixie@isc.org
Akira Kato
University of Tokyo, Information Technology Center
2-11-16 Yayoi Bunkyo
Tokyo 113-8658, JAPAN
+81 3 5841 2750
kato@wide.ad.jp
Expires December 2005 [Page 9]