How DNS Servers Work
DNS tracks the name and address of every computer on
the Internet
Neil Randall
September 24, 1996
Imagine, for a moment, that we had to address one another by our Social
Security numbers. It
would work for maybe the first ten people we got to know, but beyond that
we'd likely never
remember what to call anyone. Human beings, for many reasons, remember
names better than
numbers.
Switch to the Internet. You want to launch a telnet session, and you know
the address of the telnet
site at the National Center for Supercomputing Applications is ncsa.uiuc.edu.
Then you decide to
access the Microsoft Web site, and you guess that the address is probably
www.microsoft.com, so
you try it. It works, and you commit both addresses to memory.
What you've done, however, is simply memorize the domain names of the computers
you're
connecting with. From the Internet's standpoint, you haven't actually identified
anything. Computers
on the Internet are identified by numbers, not by names, and the domain
name is merely a
human-friendly pseudonym for the computer's real ID. On the Internet, each
computer is assigned
an Internet Protocol (IP) address, and this numeric identifier differentiates
one computer from
another. You may prefer to know them by name, but the Internet prefers
the vital statistics.
What's in a Name?
An IP number has four parts, with each part of the address
becoming increasingly
machine-specific. For example, the IP number for Microsoft's Web server
is 198.105.232.4, and
the IP number for the ncsa.uiuc.edu machine is 141.142.2.2. Reading from
the left, the leading parts of the number identify the network, which is
assigned to an organization or provider rather than to a geographic region.
The remaining parts get more specific, typically denoting a group of
computers within that network and, finally, the actual machine itself.
For example, in the case of my own organization--the
University of
Waterloo, Canada--all computers share the first two portions of the IP
address (129.97). The two
machines in my office, 129.97.178.94 and 129.97.178.17, both belong to
the 129.97.178 group;
the final portions of the addresses are unique to either computer.
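To make the four-part structure concrete, here is a minimal Python sketch that finds the leading parts two IP numbers have in common. It relies on the standard ipaddress module only for validation; the helper name shared_parts is hypothetical, chosen for this illustration.

```python
import ipaddress


def shared_parts(a: str, b: str) -> list:
    """Return the leading dotted parts that two IPv4 addresses share."""
    # IPv4Address raises ValueError if either string is not a valid address.
    parts_a = str(ipaddress.IPv4Address(a)).split(".")
    parts_b = str(ipaddress.IPv4Address(b)).split(".")
    common = []
    for x, y in zip(parts_a, parts_b):
        if x != y:
            break  # stop at the first part that differs
        common.append(x)
    return common


if __name__ == "__main__":
    # The two office machines mentioned above.
    print(".".join(shared_parts("129.97.178.94", "129.97.178.17")))
```

Run against the two office machines, the sketch reports the 129.97.178 group; only the final part differs.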
Why does this matter? Whenever you specify a domain name in a telnet, finger,
Gopher, FTP, or
Web session, the session doesn't actually begin until the domain name is
translated into its IP
address. This translation is the task of a Domain Name System (DNS) server
or, as is more often
the case, a series of DNS servers, in which the first queries the next
until the correct IP number is
acquired.
Whenever you attempt to send or request a piece of data that contains a
domain name, the DNS
sets its magic in motion. You can avoid the DNS entirely by forgoing the
domain name in favor of
the IP number, but that's unrealistic most of the time, especially since
addresses to which you're
sending data or requests (through e-mail replies or hyperlink clicks, for
example) are nearly always
given to you in domain name format. As soon as you perform the action,
you engage a piece of
software, called a resolver, on the local machine that's been set up as
a DNS server. The resolver,
as its name suggests, tries to resolve the domain name, first by looking
in its DNS database, then, if
that doesn't work, by connecting to external DNS servers.
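From a program's point of view, handing a name to the resolver is a single call. The sketch below uses Python's standard socket.gethostbyname; the wrapper name resolve is hypothetical. Looking up a real domain name needs network access, so the example passes an IP number instead, which the call returns unchanged without consulting the DNS.

```python
import socket


def resolve(name: str) -> str:
    """Ask the system resolver for an IPv4 address for the given name."""
    return socket.gethostbyname(name)


if __name__ == "__main__":
    # An IP number is returned as-is; no DNS query is made.
    print(resolve("129.97.178.94"))
    # A domain name such as "www.microsoft.com" would go through the
    # resolver and requires a working network connection.
```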
You can think of this process as similar to your attempts, as a new manager,
to get information
about a little-known company procedure. Your first step is to ask your
secretary, who, if he or she
doesn't know, would contact someone higher up the chain. That person would
go a step higher,
until eventually someone would be found who knew the answer. Each person
along the way would
ideally store the knowledge for future reference so that the next time
you needed the information, it
would be available from the first person in the chain.
DNS servers around the world have to be made aware of changes as quickly
as possible. Before
DNS servers came along, domain name translation depended entirely on the
host table, a text file
stored as /etc/hosts on your organization's Unix server,
or in a relevant directory on
your PC. The host table listed, line by line, Internet host names and their
associated IP numbers.
The master host table is compiled and stored on the machines at the Network
Information Center
(NIC)--nic.ddn.mil, in netinfo/hosts.txt, and one look at its half-a-megabyte
size will tell you why
you wouldn't want the responsibility of maintaining this thing. As the
Internet grows, domain names
are added hourly (at least), and it's impractical for every host on the
Internet to keep acquiring this
file for its users.
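A host table lookup is easy to sketch in Python. The example below assumes the /etc/hosts layout (an IP number followed by one or more names per line, with # starting a comment); the sample host names are made up for illustration, and the NIC's master file uses its own layout that this sketch does not try to read.

```python
def parse_host_table(text: str) -> dict:
    """Map each host name and alias in a host table to its IP number."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # skip blank or comment-only lines
        ip, *names = line.split()
        for name in names:
            table[name.lower()] = ip
    return table


# Hypothetical sample data in /etc/hosts style.
SAMPLE = """\
# IP number      host name         alias
129.97.178.94    office-a.example  office-a
129.97.178.17    office-b.example
"""

hosts = parse_host_table(SAMPLE)

if __name__ == "__main__":
    print(hosts["office-a"])
```

Every machine that relies on such a file needs its own up-to-date copy, which is exactly the maintenance burden described above.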
Distributed Information
The solution was the DNS server system. Unlike the host table, DNS servers
don't rely on one
large mapping file. Instead, DNS servers contain only a limited amount
of information, because they
know where to find details on domains they have yet to encounter. Whenever
a DNS server gets a
request for a host not contained in its cache, it simply does the sensible
thing and asks someone
who knows. That "someone" is an authoritative server, a server responsible
for maintaining DNS
information. A server is authoritative if, when asked about a name
in its domain, it can state with
certainty whether that name exists.
If the contacted server doesn't contain information for that domain name,
it passes the request to an
authoritative server higher up the chain, forming a series of queries that
continues until the
information is found. In practice, this means that the request can be handled
by any number of
servers, and that this sort of back-and-forth activity happens all day,
every day on the constantly
changing Internet. The server that originally made the request will cache
the information to satisfy
future requests without the need to go to an authoritative server. This
information is set by the DNS
server administrator to time-out after a specified period, to avoid the
problem of fulfilling name
requests with old data.
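The time-out behavior can be sketched as a small cache whose entries expire. This is an illustration under simple assumptions, not how any particular DNS server stores its data: the class name TTLCache and the single time-out value for all entries are choices made for this example.

```python
import time


class TTLCache:
    """Cache of name -> IP answers that expire after ttl seconds."""

    def __init__(self, ttl: float, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock      # injectable so expiry can be tested
        self._entries = {}      # name -> (ip, expiry time)

    def put(self, name: str, ip: str) -> None:
        self._entries[name] = (ip, self.clock() + self.ttl)

    def get(self, name: str):
        entry = self._entries.get(name)
        if entry is None:
            return None         # never seen: caller must ask another server
        ip, expires = entry
        if self.clock() >= expires:
            del self._entries[name]  # stale: force a fresh lookup
            return None
        return ip
```

A get that returns None tells the caller to query an authoritative server again and store the fresh answer with put.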
The DNS translation doesn't take long, but it does add to the time required
for your request to
reach the remote machine. You can perform a quick (though hardly foolproof)
test of this yourself,
by trying to access a Web site first using the domain name--www.microsoft.com,
for
example--then using the IP number--198.105.232.4. If you try this, however,
be sure to close your
browser and then reopen it to initiate a new session; otherwise, you'll
simply load the cached
version of the page. (And keep in mind that delays in loading can result
from any of a number of
factors, so take the results with a grain of salt.)
The most common software for DNS service is Berkeley Internet Name Domain,
better known as
BIND, originally from U.C. Berkeley but now sponsored by the Internet Software
Consortium.
The latest release, 4.9.3, contains the standard Unix version, plus a Windows
NT port. BIND
provides both resolver and name server software, with the resolver doing
the actual queries and the
name server providing the responses. BIND separates name servers into three
types: the primary
server contains all the data about a domain; the secondary server, in effect,
copies the DNS
database from the primary server; and the caching-only server builds a
DNS database exclusively
by caching queries. Only primary and secondary servers are considered authoritative
for their
particular domains.
To understand how DNS servers operate, it's necessary to understand the
domain name hierarchy
itself. At the top of the hierarchy is the root domain. Information on
this domain resides on a select
number of root servers around the Internet. Below the root domain come
the top-level domains,
which are either country codes or organization codes. Examples of country
codes are SG
(Singapore) and CA (Canada), while organization codes include the well-known
COM
(commercial organizations), EDU (educational institutions), GOV (governmental
organizations),
and NET (network organizations), among others. (Note that top-level domains
outside the U.S.
are normally country codes, but that U.S.-based sites usually omit country
codes.) Beneath the
top-level domains are the second-level domains (whitehouse.gov, microsoft.com,
inforamp.net),
and then the third-level domains, and so on down the chain.
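The hierarchy is visible in the name itself when read from right to left. The following Python sketch lists the domains that contain a given name, starting at the top level; the function name domain_levels is hypothetical.

```python
def domain_levels(name: str) -> list:
    """List the domains containing a name, from the top level down."""
    labels = name.rstrip(".").lower().split(".")
    # Build progressively longer suffixes: top level first, full name last.
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


if __name__ == "__main__":
    print(domain_levels("www.microsoft.com"))
```

For www.microsoft.com this yields com, then microsoft.com, then www.microsoft.com.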
If you want to establish a domain name in the U.S., you must contact the
Network Information
Center (NIC). Before it grants your request, it will ensure first that
the name you want isn't already
in use, and second, that at least two servers currently in existence will
serve the new domain name.
When the NIC finally fills the request, it will grant you a second-level
domain, and it will place
pointers to that name in the servers for the top-level domain. For example,
if you request the
domain name mybiz.com, you must first get two name servers somewhere on
the Internet to serve
the information (your ISP's servers can handle this), and then the NIC will
place mybiz in the COM
domain server system, with pointers to those two specified servers.
Once you have the domain in place, you can add any number of subdomains
you wish. You might
want to name one of your machines sales.mybiz.com and another techsupport.mybiz.com
and so
on. You don't need NIC approval for these, and, in fact, the NIC doesn't
care. But if you want
anyone to actually access your subdomains, you have to place information
about them in the
domain immediately above them. In this particular case, IP information about
sales.mybiz.com and
techsupport.mybiz.com must be placed in the servers for mybiz.com. Each
server in the hierarchy
contains a DNS database with entries called NS (name server) records, and
each of these records
contains the name of the domain or subdomain, plus the name of the host
that acts as a server for
that domain or subdomain. In our example, we'll tell the root server that
it can find information
about mybiz.com and all its subdomains on our DNS server, located on the
machine
details.mybiz.com.
Let's see how this all works. Someone at a university on the other side
of the country sees a link on
a Web page that points to your brand-new subdomain, techsupport.mybiz.com.
She clicks on it,
and her local DNS server (most likely located on a machine at the university)
kicks into gear. First,
the server searches its own DNS database for the translation information,
but because it's never
encountered techsupport.mybiz.com before, the server has no record of that
domain's existence
and can't resolve the IP number. What is contained in its DNS database,
however, is the address
of a root server (all DNS servers must be set up with such a reference).
The local DNS server
goes out onto the Internet and queries that root server. The root server
looks in its DNS database
for the COM top-level domain, and it replies with the NS record that tells
the university's DNS
server to query details.mybiz.com for information about mybiz.com. The
university's server does
so, and learns from details.mybiz.com the correct IP address for techsupport.mybiz.com.
At all
stages in this process, the university's DNS server is caching the NS records
so that the next time
anyone from the university needs an IP translation for mybiz.com, details.mybiz.com,
or techsupport.mybiz.com, the information will be available locally.
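The walk-through above can be simulated in a few lines of Python. This is a toy model, not real DNS traffic: the servers are plain dictionaries, the IP numbers are placeholders from the 192.0.2.0/24 documentation range, and the names ROOT, SERVERS, and lookup are hypothetical. For brevity it caches only the final answer, whereas a real server would also cache the NS records it learns along the way.

```python
# Hypothetical data: IP numbers are placeholders, not real assignments.
ROOT = "root-server"
SERVERS = {
    # The root server holds an NS record delegating mybiz.com.
    ROOT: {"ns": {"mybiz.com": "details.mybiz.com"}, "a": {}},
    # details.mybiz.com answers for mybiz.com and its subdomains.
    "details.mybiz.com": {
        "ns": {},
        "a": {
            "details.mybiz.com": "192.0.2.1",
            "sales.mybiz.com": "192.0.2.10",
            "techsupport.mybiz.com": "192.0.2.11",
        },
    },
}


def lookup(name, cache, trace):
    """Resolve a name by starting at the root and following NS referrals."""
    if name in cache:
        trace.append("cache")
        return cache[name]
    server = ROOT
    while True:
        trace.append(server)  # record which server was asked
        records = SERVERS[server]
        if name in records["a"]:
            cache[name] = records["a"][name]  # remember the answer locally
            return cache[name]
        # Look for a delegation whose domain is a suffix of the name.
        referral = next(
            (host for zone, host in records["ns"].items()
             if name == zone or name.endswith("." + zone)),
            None,
        )
        if referral is None:
            return None  # nobody in this toy model knows the name
        server = referral


if __name__ == "__main__":
    cache, trace = {}, []
    print(lookup("techsupport.mybiz.com", cache, trace), trace)
```

The first lookup asks the root server and then details.mybiz.com; a second lookup with the same cache is answered locally.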
As with other Internet protocols, the DNS is outlined in several Internet
Request For Comments
(RFC) documents (initially in RFC 882, 883, and 973). To understand the
workings of a DNS
server, however, RFC 1035 is your best bet. Although you can find RFC 1035
in several places
on the Internet, a nice HTML version is available at www.crynwr.com:80/crynwr/rfc1035/.
As
you might expect, the RFC is quite technical, and you may not be interested
in gaining more than
a general sense of how a DNS server operates. But keep the RFC in mind
in case somewhere
along the line, you decide to become a server administrator.
Neil Randall has authored and coauthored several books about the Internet,
including Using
Microsoft FrontPage, Special Edition, the recent Guide to Netscape Navigator
Gold, and the
upcoming The Soul of the Internet. He can be reached at nrandall@mariner.uwaterloo.ca.
This page is owned by PC Magazine and Ziff Davis Publishing.
It is provided to Rutgers
University students under the fair use rules in the U.S. copyright code,
for educational
purposes only.