● The DNS is an Internet-wide distributed database that translates between domain names and IP addresses.
● Before DNS (1985)
– Host names were kept in a HOSTS.TXT file, the ancestor of, e.g., /etc/hosts
– Every host downloaded the file regularly from a central server through FTP
– As the network grew: many more downloads
– And many more updates
Hence, DNS
But, why not centralize DNS?
– Single point of failure
– Traffic volume
– Distant centralized database
– Single point of update
→ Doesn't scale
DNS goals
● Basically a wide-area distributed database
● Scalability
● Decentralized maintenance
● Robustness
● Global scope
– Names mean the same thing everywhere
Programmer's view of DNS
● Conceptually, programmers can view the DNS database as a collection of millions of host entry structures:
/* DNS host entry structure (abridged; the full definition is in <netdb.h>) */
struct addrinfo {
    int              ai_family;     /* host address type (AF_INET or AF_INET6) */
    socklen_t        ai_addrlen;    /* length of ai_addr, in bytes */
    struct sockaddr *ai_addr;       /* socket address of the host */
    char            *ai_canonname;  /* official (canonical) domain name of host */
    struct addrinfo *ai_next;       /* next entry for this host, or NULL */
};
● Functions for retrieving host entries from DNS:
– getaddrinfo: query key is a DNS host name.
– getnameinfo: query key is an IP address.
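A minimal sketch of how the two functions fit together, assuming a POSIX system. The helper name lookup_ipv4 is made up for illustration: it uses getaddrinfo for the forward lookup and then getnameinfo with NI_NUMERICHOST only to print the resulting address as text.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose getaddrinfo/getnameinfo */
#endif
#include <assert.h>
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Forward lookup: host name -> first IPv4 address as dotted-decimal text.
 * Returns 0 on success, or a non-zero getaddrinfo/getnameinfo error code
 * (printable with gai_strerror). lookup_ipv4 is an illustrative name. */
int lookup_ipv4(const char *host, char *out, size_t outlen)
{
    struct addrinfo hints, *list = NULL;
    int rc;

    assert(host != NULL && out != NULL && outlen > 0);

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;        /* IPv4 entries only */
    hints.ai_socktype = SOCK_STREAM;  /* avoid one entry per socket type */

    rc = getaddrinfo(host, NULL, &hints, &list);
    if (rc != 0)
        return rc;

    /* list is chained through ai_next; this sketch uses the first entry.
     * NI_NUMERICHOST requests the numeric form, so no reverse lookup
     * is performed here. */
    rc = getnameinfo(list->ai_addr, list->ai_addrlen,
                     out, (socklen_t)outlen, NULL, 0, NI_NUMERICHOST);
    freeaddrinfo(list);
    return rc;
}
```

Dropping NI_NUMERICHOST and passing an address obtained elsewhere turns the getnameinfo call into a reverse lookup, where the query key is the IP address.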
DNS message format
DNS header fields
● Identification
– Used to match up request/response
● Flags
– 1 bit to mark query or response (QR)
– 1 bit to mark authoritative or not (AA)
– 1 bit to request recursive resolution (RD)
– 1 bit to indicate support for recursive resolution (RA)
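The identification and flag bits can be sketched as plain bit operations on the first four bytes of the 12-byte header. The bit positions below follow RFC 1035; the function names are illustrative, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit masks within the 16-bit flags word of the DNS header (RFC 1035). */
enum {
    DNS_QR = 0x8000,  /* 0 = query, 1 = response */
    DNS_AA = 0x0400,  /* answer is authoritative */
    DNS_RD = 0x0100,  /* recursion desired (set by the client) */
    DNS_RA = 0x0080   /* recursion available (set by the server) */
};

/* Write identification and flags into header bytes 0..3, big-endian. */
void dns_put_id_flags(uint8_t hdr[12], uint16_t id, uint16_t flags)
{
    assert(hdr != NULL);
    hdr[0] = (uint8_t)(id >> 8);
    hdr[1] = (uint8_t)(id & 0xff);
    hdr[2] = (uint8_t)(flags >> 8);
    hdr[3] = (uint8_t)(flags & 0xff);
}

uint16_t dns_get_id(const uint8_t hdr[12])
{
    return (uint16_t)((hdr[0] << 8) | hdr[1]);
}

uint16_t dns_get_flags(const uint8_t hdr[12])
{
    return (uint16_t)((hdr[2] << 8) | hdr[3]);
}

/* A reply matches a query when the reply has QR set, the query does not,
 * and both carry the same identification. */
int dns_is_reply_to(const uint8_t query[12], const uint8_t reply[12])
{
    return (dns_get_flags(reply) & DNS_QR) != 0
        && (dns_get_flags(query) & DNS_QR) == 0
        && dns_get_id(reply) == dns_get_id(query);
}
```

A client would typically set only DNS_RD in its query and then inspect DNS_AA and DNS_RA in the matching reply.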
DNS Resource Record
RR format: (class, name, value, type, ttl)
● DB contains tuples called resource records (RRs)
● Classes = Internet (IN), Chaosnet (CH), etc.
● Each class defines the value associated with each type
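The RR tuple maps naturally onto a small struct. This is only a sketch: the numeric codes are the RFC 1035 values for a few common classes and types, while the struct layout, the textual value field, and the names rr and rr_format are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum rr_class { RR_CLASS_IN = 1, RR_CLASS_CH = 3 };
enum rr_type  { RR_TYPE_A = 1, RR_TYPE_NS = 2, RR_TYPE_CNAME = 5, RR_TYPE_MX = 15 };

/* One resource record: (class, name, value, type, ttl). */
struct rr {
    const char   *name;    /* owner name, e.g. "www.abc.com.vn." */
    enum rr_class rclass;
    enum rr_type  type;
    uint32_t      ttl;     /* lifetime in seconds */
    const char   *value;   /* kept as text here; meaning depends on class and type */
};

static const char *rr_class_name(enum rr_class c)
{
    switch (c) {
    case RR_CLASS_IN: return "IN";
    case RR_CLASS_CH: return "CH";
    }
    return "?";
}

static const char *rr_type_name(enum rr_type t)
{
    switch (t) {
    case RR_TYPE_A:     return "A";
    case RR_TYPE_NS:    return "NS";
    case RR_TYPE_CNAME: return "CNAME";
    case RR_TYPE_MX:    return "MX";
    }
    return "?";
}

/* Format a record in zone-file order: name ttl class type value.
 * Returns what snprintf returns (the length the full line needs). */
int rr_format(const struct rr *r, char *out, size_t outlen)
{
    assert(r != NULL && out != NULL);
    return snprintf(out, outlen, "%s %lu %s %s %s",
                    r->name, (unsigned long)r->ttl,
                    rr_class_name(r->rclass), rr_type_name(r->type), r->value);
}
```

For an A record in class IN the value is an IPv4 address; for an NS record it is the name of a name server.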
Hierarchy of name servers
The resolution of the hierarchical name space is done by a hierarchy of name servers
● Each server is responsible (authoritative) for a contiguous portion of the DNS namespace, called a zone.
● A DNS server answers queries about hosts in its zone.
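Whether a name belongs to a zone can be sketched as a case-insensitive suffix match on a label boundary. The code below is a simplification under stated assumptions: names are written without a trailing dot, delegation of sub-zones is modelled only by picking the longest matching zone, and both function names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive string equality (DNS names ignore ASCII case). */
static int same_ci(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;  /* equal only if both strings ended together */
}

/* Returns 1 if name is the zone apex or lies below it, else 0. */
int name_in_zone(const char *name, const char *zone)
{
    size_t nlen, zlen;

    assert(name != NULL && zone != NULL);
    nlen = strlen(name);
    zlen = strlen(zone);
    if (zlen == 0)
        return 1;                       /* the root contains every name */
    if (nlen < zlen || !same_ci(name + (nlen - zlen), zone))
        return 0;
    /* The match must start at a label boundary: "xabc.com.vn" is not
     * inside "abc.com.vn". */
    return nlen == zlen || name[nlen - zlen - 1] == '.';
}

/* Index of the most specific (longest) zone containing name, or -1. */
int most_specific_zone(const char *name, const char *const zones[], int nzones)
{
    int best = -1, i;

    for (i = 0; i < nzones; i++) {
        if (name_in_zone(name, zones[i]) &&
            (best < 0 || strlen(zones[i]) > strlen(zones[best])))
            best = i;
    }
    return best;
}
```

With zones "vn", "com.vn" and "abc.com.vn", the name www.abc.com.vn matches all three, and the server for abc.com.vn is the one expected to answer authoritatively.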
DNS server
● For each zone, there must be a primary name server and a secondary name server
– The primary server (master server) maintains a zone file, which holds the information about the zone. Updates are made to the primary server manually.
– The secondary server copies data stored at the primary server.
● Adding a host:
– A host under abc.com.vn is created by the abc.com.vn administrator
– Who created abc.com.vn? The administrator of the parent zone, com.vn
Root name server
● The root name servers know how to find the authoritative name servers for all top-level zones. See http://www.root-servers.org/
● Ten servers were originally in the United States. Three servers were originally located in Stockholm (I), Amsterdam (K), and Tokyo (M).
http://en.wikipedia.org/wiki/Root_nameserver
Server/Resolver
● Each host has a resolver
– Typically a library that applications can link to
– Local name servers hand-configured (e.g. /etc/resolv.conf)
● Name servers
– Either responsible for some zone, or
– Local servers, which look up distant host names for local hosts and typically also answer queries about the local zone
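As a rough illustration of the hand-configured part, a resolver library has to pick the "nameserver" lines out of /etc/resolv.conf. The sketch below handles a single line; parse_nameserver is a hypothetical helper, and real resolvers accept more options than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* If line has the form "nameserver <address>", copy the address into out
 * and return 1. Return 0 for any other line (comments, "search", ...). */
int parse_nameserver(const char *line, char *out, size_t outlen)
{
    char keyword[16], addr[64];

    assert(line != NULL && out != NULL && outlen > 0);

    /* Field widths keep sscanf inside the two local buffers. */
    if (sscanf(line, "%15s %63s", keyword, addr) != 2)
        return 0;
    if (strcmp(keyword, "nameserver") != 0)
        return 0;
    if (strlen(addr) >= outlen)
        return 0;                 /* caller's buffer is too small */
    strcpy(out, addr);
    return 1;
}
```

A caller would read the file line by line (for example with fgets) and keep the addresses for which the function returns 1, in file order.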
Recursive and Iterative Queries
● There are two types of queries (determined by a bit in the query)
– Recursive query: when the name server of a host cannot resolve a query itself, it queries other servers on the resolver's behalf and returns the final answer
– Iterative query: when the name server of a host cannot resolve a query, it returns to the resolver a referral to another server
Recursive query
● If the server cannot supply the answer, it will send the query to the “closest known” authoritative name server.
● Only returns final answer or “not found”
● Puts burden of name resolution on contacted name server
Iterative query
● “I don’t know this name, but ask this server”
● This involves more work for the local DNS server
Workload Impact & Choice
● The local server typically performs recursive resolution
● Root and other distant servers answer iteratively
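The split of work can be sketched with a toy, in-memory model: the local server accepts one recursive query and then follows referrals itself, one iterative query at a time. Everything in the table below (server names, the address 192.0.2.10) is made up for illustration, and no network traffic is involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One rule of a toy server: for names under suffix, it either refers the
 * caller to another server or returns a final address. */
struct toy_rule {
    const char *server;    /* hypothetical server holding this rule */
    const char *suffix;    /* names at or below this suffix match */
    const char *referral;  /* "ask this server instead", or NULL */
    const char *address;   /* final answer, or NULL */
};

static const struct toy_rule TOY_RULES[] = {
    { "root",          "vn",             "ns.vn",         NULL },
    { "ns.vn",         "com.vn",         "ns.com.vn",     NULL },
    { "ns.com.vn",     "abc.com.vn",     "ns.abc.com.vn", NULL },
    { "ns.abc.com.vn", "www.abc.com.vn", NULL,            "192.0.2.10" },
};

static int ends_with(const char *name, const char *suffix)
{
    size_t n = strlen(name), s = strlen(suffix);
    return n >= s && strcmp(name + (n - s), suffix) == 0
        && (n == s || name[n - s - 1] == '.');
}

/* One iterative query: what does this server say about name? */
static const struct toy_rule *ask(const char *server, const char *name)
{
    size_t i;
    for (i = 0; i < sizeof TOY_RULES / sizeof TOY_RULES[0]; i++) {
        if (strcmp(TOY_RULES[i].server, server) == 0 &&
            ends_with(name, TOY_RULES[i].suffix))
            return &TOY_RULES[i];
    }
    return NULL;  /* "not found" */
}

/* What a local server does for one recursive query: start at the root and
 * follow referrals until an answer or "not found". *queries counts the
 * iterative queries sent. Returns the address, or NULL if not found. */
const char *resolve_for_client(const char *name, int *queries)
{
    const char *server = "root";

    assert(name != NULL && queries != NULL);
    *queries = 0;
    while (server != NULL) {
        const struct toy_rule *r = ask(server, name);
        (*queries)++;
        if (r == NULL)
            return NULL;
        if (r->address != NULL)
            return r->address;
        server = r->referral;
    }
    return NULL;
}
```

The client sees a single question and a single answer; the query counter shows how much work the local server did on its behalf, which is the motivation for caching. A real resolver would also bound the number of referrals it follows.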
Workload and Caching
● Are all servers/names likely to be equally popular?
● DNS responses are cached
– Quick response for repeated translations
– Other queries may reuse parts of an earlier lookup, e.g. cached NS records for a domain (an ftp lookup after a www lookup)
● Negative DNS responses are cached as well
– Don’t have to repeat past mistakes
– E.g. misspellings, search strings in resolv.conf
● Cached data periodically times out
– Lifetime (TTL) of data controlled by owner of data
– TTL passed with every record
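A minimal sketch of a TTL-driven cache with negative entries, assuming a tiny fixed-size table and names below 64 characters. The names cache_put and cache_get are illustrative; the current time is passed in explicitly so that expiry is easy to follow.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

#define CACHE_SLOTS 8

struct cache_entry {
    char   name[64];
    char   value[64];  /* empty string marks a negative ("not found") entry */
    time_t expires;    /* absolute time: time stored + TTL from the record */
    int    used;
};

enum cache_result { CACHE_MISS, CACHE_HIT, CACHE_NEGATIVE };

static struct cache_entry cache[CACHE_SLOTS];

/* Store an answer; pass value == NULL for a negative answer.
 * Returns 1 if stored, 0 if the table is full or a string is too long. */
int cache_put(const char *name, const char *value, unsigned ttl, time_t now)
{
    int i, slot = -1;

    assert(name != NULL);
    if (value == NULL)
        value = "";
    if (strlen(name) >= sizeof cache[0].name ||
        strlen(value) >= sizeof cache[0].value)
        return 0;

    for (i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].used && strcmp(cache[i].name, name) == 0) {
            slot = i;             /* overwrite the existing entry */
            break;
        }
        if (slot < 0 && (!cache[i].used || cache[i].expires <= now))
            slot = i;             /* first free or expired slot */
    }
    if (slot < 0)
        return 0;

    strcpy(cache[slot].name, name);
    strcpy(cache[slot].value, value);
    cache[slot].expires = now + (time_t)ttl;
    cache[slot].used = 1;
    return 1;
}

/* Look up name at time now. On CACHE_HIT, *value points at the cached text. */
enum cache_result cache_get(const char *name, time_t now, const char **value)
{
    int i;

    assert(name != NULL && value != NULL);
    for (i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].used && strcmp(cache[i].name, name) == 0) {
            if (cache[i].expires <= now) {
                cache[i].used = 0;          /* timed out: drop it */
                return CACHE_MISS;
            }
            if (cache[i].value[0] == '\0')
                return CACHE_NEGATIVE;      /* known not to exist */
            *value = cache[i].value;
            return CACHE_HIT;
        }
    }
    return CACHE_MISS;
}
```

A negative entry for a misspelled name lets the server answer "not found" immediately until the entry's TTL runs out, instead of repeating the whole lookup.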