7/16/2003 -- Without the Domain Name System (DNS), which translates names such as www.google.com into IP addresses (and back again), the Internet would probably be useless to its ostensible masters, although the machines would probably hum along, unperturbed. DNS is a distributed directory service. The question "who is www.google.com" is answered by a distant DNS server; but the question "who is www.yahoo.com" is answered by a different distant server. Since retrieving information from the network necessarily incurs latency, DNS queries occur at a glacial pace, in terms of processor time. For individual users, the delay incurred by waiting for replies to DNS queries is hardly noticeable; however, for web or mail servers that perform millions of queries per day, these delays can be crippling. Additionally, if the DNS server for www.google.com had to answer every query for that name, it would quickly be overloaded.
Therefore, it's useful from both a client and server perspective for clients to store answers to DNS queries locally, once retrieved: the DNS client is then able to answer subsequent queries quickly, and the server is relieved of an additional query. The concept of a DNS cache, described in the defining RFCs (1034 and 1035), provides for this. Caches are simply locations where frequently accessed information is stored for easy retrieval. In this article, we'll describe how to set up a caching-only DNS server, and examine the benefits and pitfalls of doing so. A caching DNS server is simply a special instance of a DNS server that stores responses from other DNS servers, but doesn't publish any DNS information itself.
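To make the idea of a cache concrete, here is a minimal sketch in Python. It is not how BIND stores records; the names TTLCache and cached_lookup, the default lifetime of 300 seconds, and the resolve callable are all assumptions made for illustration. An answer is kept for a limited time, and the upstream lookup is repeated only after the entry has expired.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after a time to live (TTL)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # name -> (answer, expiry time)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        answer, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # stale entry: force a fresh lookup
            return None
        return answer

    def put(self, name, answer, ttl_seconds):
        self._entries[name] = (answer, self._clock() + ttl_seconds)


def cached_lookup(cache, name, resolve, ttl_seconds=300):
    """Return (answer, from_cache); resolve(name) is called only on a miss."""
    answer = cache.get(name)
    if answer is not None:
        return answer, True
    answer = resolve(name)  # the slow path: ask the network
    cache.put(name, answer, ttl_seconds)
    return answer, False
```

A real DNS cache takes the lifetime of each record from the TTL supplied by the answering server rather than from a fixed default.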
The DNS is a set of standards defined in various Requests for Comments, and like most other "Internet Standards," exactly the opposite is true -- instead of a standard, there are multiple competing implementations that conform, more or less, to the ideal. Hereafter, we'll be focusing on the latest version of BIND (9.2.2), the Berkeley Internet Name Domain, which is widely regarded as the reference implementation for the Domain Name System. Since BIND is included in most major Linux distributions and commercial Unix operating systems, we'll only be covering the configuration of BIND, not its compilation from source code. However, those seeking to build their own BIND software can find assistance here, and on the mailing lists maintained by the Internet Software Consortium.
Reading the Manual
Section 3.1.1 of the BIND 9 Administrator Reference Manual is titled "A Caching-only Nameserver." However, it only provides one of the files we need. Red Hat Linux 9 and other operating systems provide graphical tools for configuring BIND and recommend against editing the zone files directly, which is a good idea if you're managing a DNS server that publishes information for a number of zones. However, for a caching-only nameserver there are only two zones, so we shouldn't be intimidated, and will look at the contents of these files in some detail.
BIND requires three files in a caching-only nameserver configuration: db.cache, db.127.0.0, and named.conf. Let's step through these files. The names, to an extent, are arbitrary. In place of the "db.cache" file, for instance, you may also see "named.cache" or "named.root." So long as your configuration files are internally consistent, these names are irrelevant, but using one of the common names increases the clarity of your work for other administrators.
db.cache
The master db.cache file is located at ftp://ftp.rs.internic.net/domain/db.cache. This file simply contains a list of the root DNS servers for the Internet. It should never be modified, though it may need to be updated from the source periodically. The file is available from the same FTP site under several alternate names, including named.cache and named.root.
db.127.0.0
The file db.127.0.0 defines the single zone for which this DNS server will "publish" information: the localhost zone, 127.0.0.xxx. It contains the following text:
$TTL 1D
@       IN SOA  localhost. root.localhost. (
                2002031401 ; Serial
                8h         ; Refresh after 8 hours
                1h         ; Retry after 1 hour
                1w         ; Expire after 1 week
                1d )       ; Minimum TTL (time to live) of 1 day
        IN NS   localhost.
1       IN PTR  localhost.
Without going into excessive detail, it should be apparent that this file contains a single PTR record: the record for localhost. A PTR record associates an IP address with a canonical name. In this case, localhost gets associated with the address 127.0.0.1. Note the "1" at the beginning of the "PTR" line: it is relative to the zone, so it combines with the zone's 127.0.0 prefix to form the full address, 127.0.0.1 (stored in DNS in reversed form as 1.0.0.127.in-addr.arpa). This address is better known as the loopback address: by convention, it always refers to the machine itself.
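The relationship between an address, its reverse-lookup name, and the short label in the zone file can be checked with Python's standard ipaddress module. In this small sketch the helper name ptr_label is an assumption made for illustration; the zone name is the one used for the loopback network in reverse DNS.

```python
import ipaddress


def ptr_label(address, zone="0.0.127.in-addr.arpa"):
    """Return (label, full_name) for the PTR record of `address` in `zone`."""
    # reverse_pointer gives the reversed-octet name used for reverse lookups
    full_name = ipaddress.ip_address(address).reverse_pointer
    suffix = "." + zone
    if not full_name.endswith(suffix):
        raise ValueError(f"{address} does not belong to zone {zone}")
    # What is left once the zone name is removed is the label in the zone file
    return full_name[: -len(suffix)], full_name


label, full_name = ptr_label("127.0.0.1")
print(label)      # 1
print(full_name)  # 1.0.0.127.in-addr.arpa
```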
Those two zone files define the caching-only nameserver: it will only answer authoritatively for a single address, 127.0.0.1. All other queries are resolved by asking other servers, starting with the root servers listed in the db.cache file. The file named.conf contains the rest of our configuration.
named.conf
The nameserver's behavior is still highly customizable, despite the simplicity of the zone configuration, and can be harnessed to improve your understanding of how your machine is functioning. You'll notice that this version of the named.conf file differs slightly from the version in the Administrator's manual. I've numbered the lines for reference:
1 acl "internal" { 127.0.0.1/32; };
2 options {
3 directory "/usr/local/bind/etc";
4 pid-file "/tmp/named.pid";
5 allow-query { "internal"; };
6 listen-on { 127.0.0.1; };
7 statistics-file "/tmp/named.stats";
8 };
9 zone "." {
10 type hint;
11 file "db.cache";
12 };
13 zone "0.0.127.in-addr.arpa" {
14 type master;
15 file "db.127.0.0";
16 notify no;
17 };
18 logging {
19 channel namedlog {
20 file "/tmp/named.log";
21 print-time yes;
22 print-category yes;
23 print-severity yes;
24 };
25 category queries { namedlog; };
26 };
Though intimidating at first, this file is easily broken down into its constituent elements.
Line 1 contains an access control list, or acl, directive. This particular acl contains the IP address 127.0.0.1, which, of course, is the aforementioned loopback address.
Line 2 begins the options section, which contains:
- The location of the configuration directory that contains the zone files.
- The location of a temporary file containing the ID of the server process. This file is created when the nameserver starts, and is checked when stopping the server, to ensure that the correct process is stopped; it has no other importance.
- The allow-query line contains the name of the previously defined acl ("internal"). This indicates that queries to this server will only be allowed from the IP address in the acl: the loopback address. This is an extra, perhaps paranoid security measure: in case a security hole is found in BIND, it's best not to allow the process to respond to requests originating from other machines.
- The IP address on which the caching-only DNS server will listen for incoming queries (notice that this is also the localhost address). This provides another small increment of security.
- The name of the statistics file.
Lines 9 and 13 each begin a section that defines a zone, referring to one of the db files we detailed previously. The "notify" parameter on line 16 simply indicates that this nameserver should not send change notifications to any other nameservers for that zone -- there's no need to share any information with other servers in a caching-only situation.
Line 18 begins the logging section. For nameservers that actually publish information, BIND's logging capabilities are extensive. For a caching-only nameserver, though, it's sufficient to simply log queries, because it won't be involved in zone transfers, DNSSEC, or any of the more advanced features of DNS.
Note that the named.conf file contains nothing related to caching. The default behavior of BIND is to cache answers to queries, and to discard each answer once its time to live expires, so that the next query fetches a fresh copy. The only differences between a caching-only DNS server and a normal one are, as mentioned before, that the caching-only server doesn't publish information about any zones, and only knows about one address -- itself. Other implementations of DNS may differ in how they configure caching -- for instance, djbdns, another popular DNS server, allows you to configure the cache size.
Deploying BIND
Once you have the configuration files in place, run the named-checkconf command included with the distribution. BIND is very sensitive to syntax errors. If there are errors in the named.conf file, named-checkconf will inform you of the fact, like this:
/etc/named.conf:11: missing ';' before 'zone'
After correcting the errors, you can start the name service daemon, named, using the configuration file:
# /usr/sbin/named -c /etc/named.conf
In this case, we've started the server as the root user. Methods for starting the named process as a non-root user, which are recommended in some situations, are covered more extensively in section 7.2 of the Administrator's manual. If the server starts successfully, there will be no notification, but you can check to see if the process is running using the "ps" command:
# ps -ef | grep named
root 21173 1 0 22:08:10 ? 0:03 /usr/sbin/named -c /etc/named.conf
However, we still aren't to the point where the machine will resolve addresses using our new nameserver. In order to do that, we have to configure our machine's resolver -- the "client" half of the DNS equation -- so that it sends DNS queries to our server on the loopback address. On most Unix variants, this is accomplished by editing the /etc/resolv.conf file so that it includes the loopback address as the nameserver address:
domain some.host.com
nameserver 127.0.0.1
search domain1.site.com domain2.site.com site.com
Although I haven't done so here, it's common to include a second "nameserver" line, so that queries will still resolve, albeit more slowly, if something unwonted happens to the caching-only nameserver. Verifying that things are working is then a simple matter of making some DNS queries. You can use the dig tool included with the BIND distribution for this, or use the older nslookup utility found on most systems. Here, we issue two searches for the domain www.google.com using nslookup:
# nslookup
Default Server: localhost
Address: 127.0.0.1
> www.google.com
Server: localhost
Address: 127.0.0.1
Name: www.google.com
Address: 216.239.57.99
> www.google.com
Server: localhost
Address: 127.0.0.1
Non-authoritative answer:
Name: www.google.com
Address: 216.239.57.99
There are two items of note here, disguised as one. Note that the second reply says that it's a non-authoritative answer. This is the client's way of telling you that, the second time, the server didn't actually search for the hostname; it gave you the value out of its cache. The first search, though, was conducted by querying the authoritative server, and its result was placed in the cache.
The difference in time spent for the two queries is measurable, even by crude yardsticks. Here are the results of the GNU "time" command, showing the elapsed and processor time taken by the two consecutive queries:
real 0.074
user 0.006
sys 0.009
real 0.017
user 0.005
sys 0.008
The first query clearly took more real time, though an almost identical amount of processor time. The difference is due to the latency incurred by sending the first query over the network: while the query is being answered by the remote server, the processor sits idle (or on a busy server, moves on to something else). While it doesn't hurt the processor to sit idle, it might annoy your customer.
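The same comparison can be made from a script. The sketch below is only an illustration: the function name time_lookups is an assumption, and the default resolver, socket.gethostbyname, goes through the operating system's resolver, so any other cache on the machine will also affect the numbers.

```python
import socket
import time


def time_lookups(name, resolve=socket.gethostbyname, repeats=2):
    """Resolve `name` several times and return the elapsed seconds per lookup."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        resolve(name)  # the first call may go to the network, later ones to the cache
        timings.append(time.perf_counter() - start)
    return timings
```

Calling time_lookups("www.google.com") on a machine whose resolver points at the caching nameserver should show the first lookup taking noticeably longer than the second, although the exact figures depend on the network.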
One last item of note: some modern operating systems include their own DNS caching facility. Solaris, for instance, provides something called nscd, the name service caching daemon, which caches user, group, and password information in addition to hostnames. To avoid conflicts, it's useful to choose one or the other of these utilities for your system -- there's no point in caching the same information twice.
Logging
The logs produced by a caching-only DNS server can reveal interesting things about the behavior of your application. If you'll recall the resolv.conf file mentioned previously, note that the search order for queries is contained in the last line:
search domain1.site.com domain2.site.com site.com
This list defines the order in which unqualified DNS queries will be searched for: for instance, a search for "mail" would prompt the resolver to ask the DNS server for mail.domain1.site.com, then mail.domain2.site.com, then mail.site.com. This shows up in the log file as follows:
Jun 29 21:25:00.579 queries: info: client 127.0.0.1#61614: query: mail.domain1.site.com IN MX
Jun 29 21:25:00.584 queries: info: client 127.0.0.1#61615: query: mail.domain2.site.com IN MX
Jun 29 21:25:00.585 queries: info: client 127.0.0.1#61616: query: mail.site.com IN MX
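The expansion that produces those three queries can be sketched in a few lines of Python. This is a simplification rather than the resolver's actual algorithm: the function name candidate_names is an assumption, and the sketch ignores details such as the ndots option and the final attempt at the bare name that real resolvers may make.

```python
def candidate_names(name, search_domains):
    """List the fully qualified names to try for `name`, in search order."""
    if name.endswith("."):
        # A trailing dot marks the name as already fully qualified
        return [name.rstrip(".")]
    return [f"{name}.{domain}" for domain in search_domains]


search = ["domain1.site.com", "domain2.site.com", "site.com"]
for candidate in candidate_names("mail", search):
    print(candidate)
```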
If your application is issuing a lot of redundant queries, or is issuing queries for servers that it shouldn't be (a sign you've been hacked?), the logs can reveal it. DNS logs are also interesting for their own sake, as they can show the distribution of client traffic. A useful tool for analysis of DNS logs is Lire, available at the LogReport Foundation.
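For a quick look before reaching for a full analysis tool, a short script can tally the queries in the log. The sketch below assumes the log lines follow the format shown above; the regular expression and the function name count_queries are illustrative, and other BIND versions or logging settings may produce different layouts.

```python
import re
from collections import Counter

# Matches the "client ADDRESS#PORT: query: NAME CLASS TYPE" part of a log line
QUERY_RE = re.compile(
    r"client (?P<client>[^#\s]+)#\d+: query: "
    r"(?P<name>\S+) (?P<rrclass>\S+) (?P<rrtype>\S+)"
)


def count_queries(lines):
    """Count queries per (name, record type); lines that don't match are skipped."""
    counts = Counter()
    for line in lines:
        match = QUERY_RE.search(line)
        if match:
            counts[(match["name"].lower(), match["rrtype"])] += 1
    return counts


SAMPLE = [
    "Jun 29 21:25:00.579 queries: info: client 127.0.0.1#61614: query: mail.domain1.site.com IN MX",
    "Jun 29 21:25:00.584 queries: info: client 127.0.0.1#61615: query: mail.domain2.site.com IN MX",
    "Jun 29 21:25:00.585 queries: info: client 127.0.0.1#61616: query: mail.site.com IN MX",
]

for (name, rrtype), total in count_queries(SAMPLE).most_common():
    print(total, rrtype, name)
```

Names that appear far more often than expected, or that the application has no reason to look up, are the ones worth investigating.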
Management
The last task we should mention, having configured our caching-only DNS server, is management. The BIND distribution contains a utility, rndc, for managing DNS servers. Several of its subcommands are useful:
- rndc status shows the status of the server:
number of zones: 3
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
server is up and running
- rndc stats dumps some statistics to the stats file designated in the named.conf file:
success 65461583
referral 0
nxrrset 782046
nxdomain 2353453
recursion 16216
failure 225
- rndc querylog toggles the logging facility on and off
- rndc dumpdb dumps the cached records to a file named named_dump.db (this is merely of academic interest)
- rndc stop stops the nameserver
Strangely, the inverse for the last command, rndc restart, is not yet implemented, so if you stop the server, you must start it again using the full command line, including the name of the configuration file.
As you can see from the output of rndc stats above, though, there is rarely a need to restart a functioning nameserver. While previous versions of BIND had a nasty tendency to leak memory and eventually crash, the above server has served 65 million requests without a glitch. That is a small number by the standards of large sites, but still respectable, and a sizeable savings in time and bandwidth. Even if you don't consider BIND the fastest, tiniest, or most secure implementation of the DNS standards, the caching techniques described in this article should still apply equally well to your favorite flavor of DNS.