Version 31
Reading: Rhodes and Goerzen, chapter 4, "DNS," pages 51 to the top of 56 and 63–66. The rest is optional; you might occasionally need it for reference.
Name resolution is the translation of a host name to a (numeric) IP (v4 or v6) address. How does it work?
Your computer typically "knows" its own name and may know the names of a few others. In Unix this is configured in the file /etc/hosts; Windows keeps a hosts file of its own (typically C:\Windows\System32\drivers\etc\hosts). Additional Unix configuration files are /etc/host.conf and /etc/resolv.conf. (In Windows this is different; what is WINS?)
Usually, a host name lookup begins in the /etc/hosts file, and if the host name is not found there, it goes to DNS. DNS is the Domain Name System; it is a globally distributed, hierarchical database of host names, IP addresses, and related information. A DNS server maintains the host names and IP addresses of hosts within the organization it serves, and also a cache of recently served host names and IP addresses from outside the organization.
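The lookup order just described can be sketched in a few lines of Python. This is only an illustration, not how the system resolver is actually implemented: parse_hosts and lookup are hypothetical helpers, the sample file contents are made up (the 192.0.2.x addresses come from a range reserved for documentation), and the DNS step is represented by whatever function the caller passes in.

```python
def parse_hosts(text):
    """Map each host name in /etc/hosts-style text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split('#', 1)[0].strip()   # strip comments and whitespace
        if not line:
            continue                           # nothing left on this line
        address, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), address)   # first entry wins
    return table


def lookup(name, hosts_table, dns_query):
    """Try the hosts table first; fall back to DNS only on a miss."""
    address = hosts_table.get(name.lower())
    if address is not None:
        return address
    return dns_query(name)


# Made-up file contents, for illustration only.
SAMPLE_HOSTS = """\
127.0.0.1   localhost
192.0.2.10  lettuce.example.test lettuce   # invented entry
"""

hosts_table = parse_hosts(SAMPLE_HOSTS)
```

A name listed in the table never reaches dns_query; any other name is handed straight to it.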
Your computer knows the IP addresses of one or more DNS servers (specified in /etc/resolv.conf, either manually configured or obtained by DHCP). To begin a DNS lookup, it contacts the primary one of these. If that fails, it will try the secondary DNS server, and so on.
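As a rough sketch of the primary/secondary behavior, the code below reads nameserver lines from resolv.conf-style text and asks each server in turn. The helper names (nameservers, first_answer, fake_ask) and the sample configuration are assumptions made for the example; a real resolver also handles timeouts, retries, and the other resolv.conf options.

```python
def nameservers(resolv_conf_text):
    """Return the DNS server addresses listed in resolv.conf-style text, in order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # A server line looks like: nameserver 192.0.2.53
        if len(fields) >= 2 and fields[0] == 'nameserver':
            servers.append(fields[1])
    return servers


def first_answer(name, servers, ask):
    """Ask each server in turn; return the first answer that does not fail."""
    for server in servers:
        try:
            return ask(server, name)
        except OSError:
            continue              # this server failed; try the next one
    raise OSError('no DNS server answered for ' + name)


# Made-up configuration, for illustration only.
SAMPLE_RESOLV_CONF = """\
search example.test
nameserver 192.0.2.53
nameserver 192.0.2.54
"""


def fake_ask(server, name):
    """Stand-in for a real DNS query: the primary server is 'down'."""
    if server == '192.0.2.53':
        raise OSError('timed out')
    return '192.0.2.10'


servers = nameservers(SAMPLE_RESOLV_CONF)
answer = first_answer('www.example.test', servers, fake_ask)
```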
We'll suppose that your query is about www.cs.ucla.edu, and your computer uses dns.myisp.com (actually the IP address of this, of course!) as its primary DNS server.
If dns.myisp.com has the requested information in its cache, it sends the information back.
If it doesn't, it will then query a top-level domain server (in this case, for the .edu top-level domain; if necessary, a root name server first tells it where the .edu servers are), which provides it with the name server for ucla.edu. Then dns.myisp.com contacts the name server for ucla.edu, asking for the IP address of www.cs.ucla.edu. If the name server for ucla.edu has this information, it sends it back; otherwise, it sends back the IP address of the DNS server for cs.ucla.edu. Then dns.myisp.com contacts the name server for cs.ucla.edu and finally (it had better!) gets back the IP address of www.cs.ucla.edu.
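The chain of referrals can be imitated with a toy, in-memory model. Nothing here touches the network: the server names, the ZONES table, and the address 192.0.2.80 are all invented for illustration, and in this toy the ucla.edu server always refers the query onward rather than answering it directly.

```python
# Toy data: each "server" maps a name suffix to an answer or to a referral.
# All server names and addresses below are invented for illustration.
ZONES = {
    'edu-tld-server': {'ucla.edu': ('referral', 'ucla-server')},
    'ucla-server': {'cs.ucla.edu': ('referral', 'cs-ucla-server')},
    'cs-ucla-server': {'www.cs.ucla.edu': ('answer', '192.0.2.80')},
}


def best_match(zone, name):
    """Return the record for the longest suffix of name that the zone knows."""
    labels = name.split('.')
    for i in range(len(labels)):
        suffix = '.'.join(labels[i:])
        if suffix in zone:
            return zone[suffix]
    return None


def resolve(name, cache, start='edu-tld-server'):
    """Follow referrals from server to server, as dns.myisp.com would."""
    if name in cache:
        return cache[name], []        # answered from the cache: nobody asked
    server, path = start, []
    while True:
        path.append(server)
        record = best_match(ZONES[server], name)
        if record is None:
            raise LookupError(name + ' not found')
        kind, value = record
        if kind == 'answer':
            cache[name] = value       # remember it for the next query
            return value, path
        server = value                # referral: ask the next server down


cache = {}
first = resolve('www.cs.ucla.edu', cache)    # walks all three servers
second = resolve('www.cs.ucla.edu', cache)   # served from the cache
```

The second call returns an empty path, which is the point of the cache: a repeated query never leaves dns.myisp.com.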
(p. 55)
The textbook recommends using the function socket.getaddrinfo instead of older functions in the socket library (socket.gethostname, socket.gethostbyname, etc., p. 59). One reason is that getaddrinfo is more flexible in dealing with IPv6 as well as IPv4.
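A small sketch of that flexibility: the same getaddrinfo call accepts an IPv4 or an IPv6 address and reports which family it found. The helper family_of is a hypothetical name, and numeric loopback addresses are used with AI_NUMERICHOST so that no DNS query is made; it assumes a Python build with IPv6 support.

```python
import socket


def family_of(address):
    """Return the address family name getaddrinfo reports for a numeric address."""
    # AI_NUMERICHOST tells getaddrinfo to parse the address without using DNS.
    infos = socket.getaddrinfo(address, 80, type=socket.SOCK_STREAM,
                               flags=socket.AI_NUMERICHOST)
    family, socktype, proto, canonname, sockaddr = infos[0]
    return socket.AddressFamily(family).name


ipv4_family = family_of('127.0.0.1')
ipv6_family = family_of('::1')
```

The caller never has to decide in advance which kind of address it is dealing with; the family comes back as part of the result.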
The example on p. 55 may be confusing.
The usage of this function might be clearer if we allow Python to "deconstruct" the tuple in advance:
>>> from socket import *
>>> infos = getaddrinfo('www.iue.edu', 'http')
>>> for i in infos: print(i)
...
(2, 1, 6, '', ('149.165.1.130', 80))
(2, 2, 17, '', ('149.165.1.130', 80))
(2, 1, 132, '', ('149.165.1.130', 80))
(2, 5, 132, '', ('149.165.1.130', 80))
>>> info = infos[0]
>>> info
(2, 1, 6, '', ('149.165.1.130', 80))
>>> (family, socktype, protocol, canonical_name, sockaddr) = infos[0]
>>> family
2
>>> AF_INET
2
>>> socktype
1
>>> SOCK_STREAM
1
>>> protocol
6
>>> IPPROTO_TCP
6
>>> canonical_name
''
>>> sockaddr
('149.165.1.130', 80)
>>> s = socket(family=family, type=socktype)
>>> s.connect(sockaddr)
>>> s.close()
We got a bunch of addresses (well, address infos) because we weren't specific about the socket type or protocol we wanted.
>>> getaddrinfo('www.iue.edu', 'http', type=SOCK_STREAM)
[(2, 1, 6, '', ('149.165.1.130', 80))]
>>> getaddrinfo('www.iue.edu', 'http', proto=IPPROTO_TCP)
[(2, 1, 6, '', ('149.165.1.130', 80))]
We didn't get a canonical name because we didn't ask for it.
>>> getaddrinfo('www.iue.edu', 'http', type=SOCK_STREAM, flags=AI_CANONNAME)
[(2, 1, 6, 'www.iue.indiana.edu', ('149.165.1.130', 80))]
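If unpacking the five-element tuple by hand gets tedious, one option is to wrap each result in a namedtuple. This is a sketch, not something from the textbook: AddrInfo and named_addrinfo are hypothetical names, and a numeric loopback address with AI_NUMERICHOST is used so the example does not depend on DNS.

```python
from collections import namedtuple
from socket import getaddrinfo, AI_NUMERICHOST, SOCK_STREAM

# Hypothetical field names, in the order getaddrinfo returns them.
AddrInfo = namedtuple('AddrInfo', 'family socktype proto canonname sockaddr')


def named_addrinfo(host, port, **hints):
    """Call getaddrinfo and wrap each result so its fields have names."""
    return [AddrInfo(*info) for info in getaddrinfo(host, port, **hints)]


# No socket type given: expect one entry per socket type the system offers.
everything = named_addrinfo('127.0.0.1', 80, flags=AI_NUMERICHOST)

# Asking for SOCK_STREAM narrows the list, as in the session above.
narrowed = named_addrinfo('127.0.0.1', 80, type=SOCK_STREAM,
                          flags=AI_NUMERICHOST)
```

After this, narrowed[0].sockaddr can be handed to connect, and narrowed[0].family and narrowed[0].socktype to the socket constructor.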
We are here, sort of. A recent article in CACM (cite???) notes that cell phone providers are now running IPv6 (there are not enough IPv4 addresses for all those cell phones, or is it just the smart phones?) and translating between IPv6 and IPv4 when their customers want to visit, say, web sites in the IPv4 world. An incentive exists for web site owners (etc.) to convert their systems to IPv6: all of this address translation slows down the customers who visit on their smart phones! The article suggests starting with one critical service that needs to be converted to IPv6, rather than attempting a massive upgrade of all services at once.
We know that the old IPv4 ways work while we're in the computer lab, but … if you happen to be writing a Python app for a smart phone, you might want to use that new-fangled getaddrinfo.
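One way to write a client that does not care whether it ends up on IPv4 or IPv6 is to try every address getaddrinfo returns until one connects. The function connect_to below is a hypothetical helper written for these notes. To keep the example self-contained, it connects to a throwaway listener on the loopback address instead of a real web site.

```python
import socket


def connect_to(host, port):
    """Try each address getaddrinfo returns until one connection succeeds."""
    last_error = None
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            s = socket.socket(family, socktype, proto)
        except OSError as error:
            last_error = error        # this family is not usable here
            continue
        try:
            s.connect(sockaddr)       # sockaddr may be IPv4 or IPv6
        except OSError as error:
            last_error = error
            s.close()
            continue
        return s
    raise last_error or OSError('no addresses found for ' + host)


# Demonstration against a local listener, so no outside network is needed.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))       # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

client = connect_to('127.0.0.1', port)
peer = client.getpeername()
client.close()
listener.close()
```

The standard library's socket.create_connection does much the same job, so in practice you would probably call that instead of writing the loop yourself.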
Some of the DNS functions in Python that Goerzen describes are also available as Unix shell commands:
$ man ping
$ man hostname
$ man host
$ man nslookup
$ ping www.iue.edu
PING www.iue.indiana.edu (149.165.1.130) 56(84) bytes of data.
64 bytes from www.iue.indiana.edu (149.165.1.130): icmp_seq=1 ttl=127 time=0.247 ms
64 bytes from www.iue.indiana.edu (149.165.1.130): icmp_seq=2 ttl=127 time=0.275 ms
64 bytes from www.iue.indiana.edu (149.165.1.130): icmp_seq=3 ttl=127 time=0.249 ms
^C
--- www.iue.indiana.edu ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2072ms
rtt min/avg/max/mdev = 0.247/0.257/0.275/0.012 ms
$ host www.iue.edu
www.iue.edu is an alias for www.iue.indiana.edu.
www.iue.indiana.edu has address 149.165.1.130
$ nslookup www.iue.edu
Server: 134.68.1.9
Address: 134.68.1.9#53
www.iue.edu canonical name = www.iue.indiana.edu.
Name: www.iue.indiana.edu
Address: 149.165.1.130
$ hostname
lettuce.mis.iue.edu
$
Read their man pages for more detailed information, e.g.,
$ man host
(Type q to quit from the manual pager.)
What lookups do getaddrinfo and gethostbyname perform? What about gethostbyname and getfqdn?