Courses
Wayne State University
College of Lifelong Learning
Interdisciplinary Studies Program
Computers and Society courses, Winter 2001
    ( http://www.cll.wayne.edu/isp/drbowen/casw01)

Mondays, 6 - 9:40 PM in 113 Rackham
Computers and Society
    GST 2710, Section 988, Call Number 95241, 4 credits

Computers and Society
    AGS 3360, Section 983, Call Number 98319, 4 credits

Office hours: Mondays 4 - 6 PM in 113 Rackham


                         Instructor

David R. Bowen
2311 A/AB
Wayne State University
Detroit, MI 48202
Daytime tel: (313) 577-1498
Evening tel: (248) 549-8518
FAX: (313) 577-8585
Home Page:
    http://www.cll.wayne.edu/isp/drbowen

Email: d.r.bowen@wayne.edu

Last updated: 3/5/01

The Internet
Structure, Function and Applications

  1. "Internet" can be used vaguely today. We will use the technical definition here. The Internet connects computers within local networks so that any connected computer can send computer information to any other.
    1. Most computers connected to the Internet are either part of a LAN (Local Area Network - the cloud in the picture below) or are connected via an ISP - Internet Service Provider. In either case, there is a "gateway" computer that can receive information from and direct it to the connected computer using the LAN communications. The gateway computer has the direct internet connection.
      1. For example, the Interdisciplinary Studies Program computer lab in 113 Rackham uses Ethernet networking hardware and Novell NetWare (LAN) networking software. This networking hardware and software work together so that users can print to the common printer. With a shared printer, a better printer can be afforded than would be the case if all users had their own printer. However, LAN software such as Novell NetWare is proprietary (based on trade secrets), and those using other brands could not interconnect without some hardware and software in between them. Even connecting large groups that each use the same proprietary system (hardware and software) is difficult.
      2. Connecting LANs led to the name "Internet" - a network of networks
    2. Each computer connected to the Internet is given a numerical "IP address" or just plain "IP" with the form of four bytes separated by three dots, e.g. 141.217.142.149 for the IP address of the CLL web server. A gateway handles a "family" of IPs. For example, 141.217.142.--- is CLL - College of Lifelong Learning, including Interdisciplinary Studies Program. So, my desktop is 141.217.142.125. Our gateway is 141.217.142.1, and we are all connected together via a Novell NetWare LAN.
      1. LAN - Local Area Network. Shares files, printers. LANs (NetWare, Banyan Vines, Token Ring, AppleTalk) are usually proprietary and cannot interconnect.
      2. How the gateway and workstations communicate is different on each type of LAN, and is the responsibility of the LAN. This is incorporated into the brand.
      3. IP addresses must be unique - no repeats
      4. Organizations can be given control over a "domain" and allocate IP addresses within that domain. For example, 141.217.xxx.yyy is Wayne State University. WSU allocates 141.217.142 to CLL, and CLL allocates the yyy to individual computers.
      5. IPs can be "static" -- unchanging, assigned by network administrator -- or "dynamic" -- assigned from a pool of IPs at each boot up by the router
        1. DHCP (Dynamic Host Configuration Protocol) is the protocol that assigns IP addresses to computers when they boot up. One advantage of this for computer administrators is that it almost eliminates the problem of two computers with the same IP address.
        2. Because of the way another system (the Domain Name System or DNS, see below) works, Internet servers (e.g. web servers and email servers) must have fixed IP addresses.
        3. AOL changes IPs at each "hit"
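The address "families" described above can be checked with Python's standard ipaddress module. This is only a sketch: the /24 and /16 notation is an assumption added here to express "141.217.142.---" and "141.217.xxx.yyy"; the addresses themselves come from the notes.

```python
import ipaddress

# Addresses from the notes; the /24 and /16 prefix lengths are assumptions
# used to express the "family" of addresses handled by each organization.
cll_network = ipaddress.ip_network("141.217.142.0/24")   # CLL family
wsu_network = ipaddress.ip_network("141.217.0.0/16")     # all of Wayne State
desktop = ipaddress.ip_address("141.217.142.125")
gateway = ipaddress.ip_address("141.217.142.1")


def four_bytes(address):
    """Return the four bytes of an IPv4 address as a list of integers."""
    return list(address.packed)


print(four_bytes(desktop))                  # [141, 217, 142, 125]
print(desktop in cll_network)               # True
print(cll_network.subnet_of(wsu_network))   # True
print(cll_network.num_addresses)            # 256
```

Each of the four numbers is one byte, which is why no part of an address can exceed 255.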
    3. Information travels in packets - finite groups of bits. Packets have two parts
      1. Head is standardized - contains "meta information" - "from" address, "to" address, length of packet, time to live, etc.
      2. Body is freeform - contains the information itself
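The head-plus-body idea can be sketched as follows. The header layout here is a hypothetical toy format invented for illustration; it is not the real IP header, which has more fields.

```python
import socket
import struct

# Hypothetical toy header layout (NOT the real IP header):
#   4 bytes "from" address, 4 bytes "to" address,
#   2 bytes body length, 1 byte time to live.
HEADER = struct.Struct("!4s4sHB")


def make_packet(src, dst, body, ttl=64):
    """Prepend a fixed-size, standardized head to a freeform body."""
    head = HEADER.pack(socket.inet_aton(src), socket.inet_aton(dst),
                       len(body), ttl)
    return head + body


def read_packet(packet):
    """Split a packet back into its head fields and its body."""
    src, dst, length, ttl = HEADER.unpack(packet[:HEADER.size])
    body = packet[HEADER.size:HEADER.size + length]
    return socket.inet_ntoa(src), socket.inet_ntoa(dst), ttl, body


packet = make_packet("141.217.142.125", "141.217.142.149", b"hello")
print(HEADER.size)          # 11 bytes of head
print(len(packet))          # 16 bytes in total
print(read_packet(packet))
```

Because the head has a fixed layout, any router can read the addresses without understanding the body.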
    4. Once out of the LAN, packets are steered to destination by Internet routers. Each router has an Internet address.
      1. Each router has a "router table" listing final destinations and the next hop for each. For each packet, the router matches the final destination against the table, then sends the packet on its next hop to the next router.
      2. There are usually many possible routes to destination, so routers have a method of making the choice, usually on the basis of low traffic and therefore probably fastest time
        [Figure: InetWork.gif]
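A router table lookup might be sketched like this. All table entries are invented for illustration, and choosing the most specific matching entry is an assumption about how the match is made; the notes only say that the router matches the final destination.

```python
import ipaddress

# Hypothetical router table: destination network -> next hop.
# Every entry is invented for illustration.
ROUTER_TABLE = {
    ipaddress.ip_network("141.217.142.0/24"): "141.217.142.1",
    ipaddress.ip_network("141.217.0.0/16"): "141.217.1.1",
    ipaddress.ip_network("0.0.0.0/0"): "198.51.100.1",   # default route
}


def next_hop(destination):
    """Return the next hop for a packet's final destination.

    Assumption: when several entries match, the most specific one
    (longest prefix) is used.
    """
    dest = ipaddress.ip_address(destination)
    matches = [net for net in ROUTER_TABLE if dest in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTER_TABLE[best]


print(next_hop("141.217.142.149"))   # 141.217.142.1
print(next_hop("141.217.9.9"))       # 141.217.1.1
print(next_hop("8.8.8.8"))           # 198.51.100.1
```

The router never needs to know the whole route; it only decides where the packet goes next.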
    5. Computers are very inflexible, and must have an explicit order of which computer starts communication and what it does, how the second responds, etc. These are "protocols". The Internet uses the TCP/IP protocol for communication. This is actually two main protocols -- TCP and IP, with a host of others that are lumped in.
      1. IP is the raw transport mechanism, just throwing information out as fast as it can, without asking if it arrives.
      2. TCP sits on top, and uses IP both ways to confirm accurate arrival, and resend if not
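The division of labor between the two protocols can be imitated with a toy simulation. This is not real TCP: the lossy channel, the scripted drop pattern, and the function names are all hypothetical, chosen only to show "send, wait for acknowledgement, resend if needed".

```python
def lossy_channel(drop_pattern):
    """Return a send function that drops packets according to drop_pattern.

    drop_pattern is an iterable of booleans; True means "drop this one".
    This stands in for IP, which gives no guarantee of arrival.
    """
    drops = iter(drop_pattern)
    delivered = []

    def send(packet):
        if next(drops, False):
            return False          # lost in transit, no acknowledgement
        delivered.append(packet)
        return True               # acknowledgement came back

    return send, delivered


def reliable_send(send, packets, max_tries=5):
    """Number each packet and resend it until it is acknowledged."""
    attempts = 0
    for seq, data in enumerate(packets):
        for _ in range(max_tries):
            attempts += 1
            if send((seq, data)):
                break
        else:
            raise RuntimeError(f"packet {seq} never acknowledged")
    return attempts


send, delivered = lossy_channel([False, True, True, False, False])
attempts = reliable_send(send, ["a", "b", "c"])
print(delivered)   # [(0, 'a'), (1, 'b'), (2, 'c')]
print(attempts)    # 5 sends were needed to deliver 3 packets
```

The extra sends are the price of confirmed delivery, which is why the layer on top is slower than the raw transport.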
    6. The Domain Name System (DNS) is another layer of protocol to make remembering addresses easier.
      1. Many servers (see below) have static IPs and employ "dot-com" names, e.g. www.cll.wayne.edu. These are called domain names, because the last group of letters is known as the domain. The following domains are used primarily in the US:
        1. edu
        2. com
        3. org
        4. gov
        5. net
      2. Other countries use two-letter domain, e.g. de
      3. New domains proposed in 1997
        1. firm
        2. store
        3. arts
        4. rec
        5. info
        6. web
        7. nom
          1. The part of the domain name in front of the domain is free form and unconstrained. For example, many people think that the "www" is required for a web site - it is not. As far as the Domain Name System is concerned, that is just letters.
      4. Client software goes to a local Domain Name Server (DNS) to get the IP (numerical) address
        1. Communication with DNS uses TCP/IP
        2. If local DNS does not have entry, kicked up to a higher-level DNS
      5. This only happens the first time during a session. For the rest of that session, the client remembers the IP address
      6. This happens without action from the user
      7. Sequence and errors are:
        1. Client sends out domain name seeking IP address. Local domain name server answers if it is found, kicks up to higher level if not. On-screen message is "Looking up host". If not found, error is "Has no domain name entry"
        2. Once IP address is found, Client goes to that server. On-screen message is "Contacting host", then "Waiting for reply", then "Reading file". If nothing is received from the server, error message is "No response from... Perhaps the server is down."
      8. TCP/IP developed incrementally by decentralized workers and informal groups. Protocols and software are freely and publicly available. Government support, especially at the start, was critical. Very different from proprietary development.
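The name lookup sequence described in this section (ask the local name server, pass the request up if the name is not found, remember the answer for the rest of the session) can be sketched with plain dictionaries. The two tables and the class are hypothetical; only the www.cll.wayne.edu entry comes from the notes.

```python
# Hypothetical name tables. Only the www.cll.wayne.edu entry is from the notes.
LOCAL_DNS = {"www.cll.wayne.edu": "141.217.142.149"}
HIGHER_DNS = {"www.example.org": "192.0.2.10"}   # invented entry


class Client:
    """A client that remembers IP addresses for the rest of its session."""

    def __init__(self):
        self.cache = {}
        self.lookups = 0   # how many times a name server was actually asked

    def resolve(self, name):
        if name in self.cache:
            return self.cache[name]           # no lookup after the first time
        self.lookups += 1
        # Local server answers if it can, otherwise the higher level is asked.
        address = LOCAL_DNS.get(name) or HIGHER_DNS.get(name)
        if address is None:
            raise LookupError(f"{name} has no domain name entry")
        self.cache[name] = address
        return address


client = Client()
print(client.resolve("www.cll.wayne.edu"))   # 141.217.142.149
print(client.resolve("www.cll.wayne.edu"))   # answered from the cache
print(client.lookups)                        # 1
```

A real resolver does the same work over the network, without any action from the user.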
    7. TCP/IP is the basic Internet communication protocol
      1. IP = Internet protocol
        1. One-way communication - packets go from A to B without any checking, implements IP addresses
        2. Not reliable - delivery is not guaranteed
        3. Fast
        4. Can run on many different types of hardware -- telephone wires, coax, optical fiber, etc.
        5. Current version is IPv4, being upgraded to IPv6. IPv4 provides about 4.3 billion addresses (four bytes, or 256^4), which is not enough: growth is rapid, and many allocated addresses are not used. IPv6 addresses are 16 bytes (128 bits), giving about 3.4 × 10^38 addresses (roughly 34 followed by 37 zeroes).
      2. TCP uses ("sits on top of") IP. Transmission Control Protocol. Uses IP in both directions to implement reliable transmission
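The address counts can be verified with a few lines of arithmetic. IPv4 addresses are four bytes wide and IPv6 addresses are 128 bits (sixteen bytes) wide.

```python
ipv4_addresses = 256 ** 4     # four bytes, 256 values each
ipv6_addresses = 2 ** 128     # sixteen bytes, i.e. 128 bits

print(ipv4_addresses)                       # 4294967296 (about 4.3 billion)
print(f"{ipv6_addresses:.1e}")              # 3.4e+38
print(ipv6_addresses // ipv4_addresses == 2 ** 96)   # True
```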
        1. Waits for distant computer to acknowledge ("connection")
        2. Successive packets have sequence numbers.
        3. Flow control - halt, continue
        4. Slower
      3. Audio and video could use another protocol instead of TCP (UDP, the User Datagram Protocol, is the usual choice), because losing a little data is not serious, and faster transmission would more than make up for any "snow"
    8. This is what I will mean by "The Internet" -- a pipeline for delivering information between any two connected computers
      1. By connecting to the Internet, an organization extends it and provides alternate routes. Each organization funds its own part. Backbone is maintained by large communications companies.
      2. Standards were developed by public discussion, are not proprietary, many companies make routers and provide services. (Cisco Systems is largest manufacturer.)
  2. "Applications" are programs that use this transmission mechanism
    1. Peer applications have two computers acting as equals, but this is fairly rare. That is, until Napster and others. With Napster, the basic listing of songs and where to find them is on a central server, but the actual files are transported client-to-client. With Gnutella, everything is client-to-client. Very difficult to sue.
    2. Client-server is much more common
      1. A client requests information from a server, displays information when it is received
      2. Server sits and waits for information request, services request when request is received. Server seems to be simpler, but it must be able to service simultaneous requests, also expected to be very robust -- always available
      3. Clients and servers using the same application protocol are (supposed to be) interchangeable.
        1. E.g., email has three primary protocols
          1. POP (currently POP3) or Post Office Protocol
          2. IMAP (currently IMAP4) or Internet Message Access Protocol
          3. Web-based email such as Hotmail. Becoming very popular since it requires much less configuration than the others, and web-knowledgeable users already know how to use it.
          4. Client and server must be matched at each end, but can be different at the two ends
    3. Email. Client A is the first user, with an account on mail server #1; a second client, B, has an account on mail server #2. A addresses a message to B and sends it by transmitting it to mail server #1, which forwards it to mail server #2. The message waits there until B logs on and picks it up. Messages are simple text, but files can be attached. There are two major protocols -- POP (Post Office Protocol) and IMAP (Internet Message Access Protocol). POP is simpler and more popular; IMAP is more comprehensive and is commonly supposed to be the future. Client and server must use the same protocol, POP or IMAP. Some email server computers run both servers, and some email clients can be configured either way.
      1. Internet email address has two parts separated by @. e.g. d.r.bowen@wayne.edu
        1. part to left of @ is name of account (d.r.bowen)
        2. part to right of @ is email server that account is on (wayne.edu)
          [Figure: Email.gif]
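Splitting an address into its two parts takes one string operation. The function name is hypothetical; the example address is the one from the notes.

```python
def split_email_address(address):
    """Split an Internet email address into (account, server)."""
    account, _, server = address.rpartition("@")
    if not account or not server:
        raise ValueError(f"not a valid email address: {address!r}")
    return account, server


print(split_email_address("d.r.bowen@wayne.edu"))   # ('d.r.bowen', 'wayne.edu')
```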
    4. World Wide Web (the web). The user runs a web client (a.k.a. web browser, e.g. Netscape Communicator or Microsoft Internet Explorer). The user can request a file by (a) typing in its URL, (b) clicking on a link containing the URL as hidden text, or (c) selecting a bookmark, which is the stored URL of a file previously viewed. Server gets file and returns it, client displays it. HyperText Transfer Protocol (HTTP) is the basic web protocol. (HyperText means linked text, but has provisions for graphics and many other extensions.)
      1. Anatomy of a URL (Uniform Resource Locator), which is what you type into the Location or Address window of your browser.
        Example:
        http://www.cll.wayne.edu/isp/drbowen/internet/welcome.htm
        1. http:// - The method (of transfer). Typing http:// is optional in most browsers, which assume it if no method is given. Other methods are
          1. ftp:// (File Transfer Protocol)
          2. telnet:// (Logging into a computer with a command line interface)
          3. gopher:// (Earlier text-based protocol without links inside documents)
          4. file:// (You can open a file directly in your browser to check it out, without going through the web server, and this is the method used in that case.)
        2. www.cll.wayne.edu - Domain Name of the web server. You can also use the numerical IP address, e.g. 141.217.142.149
        3. /isp/drbowen/internet/ - The path of folders to the requested file, from the "document root" folder of the server.
        4. welcome.htm - The name of the requested file. The browser displays files with extensions of htm, html, gif, jpeg, and jpg, and for others, asks if you want to download the file. If no file is listed, web servers are configured with a default file name, which is sent from the folder in the URL.
        5. If the requested filename is the "default" filename, it does not have to be listed. This is good because the user has to type less. If there is no file extension at the end of the URL, the URL is interpreted as requesting the default file name. (Normally the default file name is index.htm or index.html. On the CLL web server, it is welcome.htm)
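The parts of the example URL can be pulled apart with Python's standard urllib.parse module. This is a sketch only: the function name and dictionary keys are hypothetical, and the default-file rule is simplified to "the URL ends in a folder", using welcome.htm as on the CLL server.

```python
import posixpath
from urllib.parse import urlsplit


def anatomy(url, default_file="welcome.htm"):
    """Break a URL into method, server, folder path, and file name.

    Simplification: the default file name is filled in only when the
    URL ends with "/" (no file name given).
    """
    parts = urlsplit(url)
    folder, filename = posixpath.split(parts.path)
    if not filename:
        filename = default_file
    return {
        "method": parts.scheme,
        "server": parts.netloc,
        "folders": folder.rstrip("/") + "/",
        "file": filename,
    }


full = anatomy("http://www.cll.wayne.edu/isp/drbowen/internet/welcome.htm")
short = anatomy("http://www.cll.wayne.edu/isp/drbowen/internet/")
print(full)
print(full == short)   # True: both name the same file
```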
      2. The full URL specifies everything about the requested file. This is an absolute URL. If the requested file is on the same web server, an abbreviated form known as a relative URL can be used. This is particularly useful for creating links and loading images. There are several possible forms for relative URLs, depending on how close the requested page is to the current page.
        1. If the requested file is in the highest-level folder for this web server, only a "/" is necessary, followed by the filename if it is not the default filename
        2. If the requested page is in the same folder, only the name need be given.
        3. If the requested page is in a sub-folder, only the folder path from the folder for the current page, and the file name (if it is not the default file name) need be given. In this case, do not precede the first folder with "/" - that is interpreted as the highest level folder for this web server

        NOTE: Relative URLs are very convenient, because if you develop the web site on one computer and the web server is another computer, then you do not have to worry about the higher-level folders on the web server, which the web master will often be reluctant to divulge (the folder structure is one element needed to hack a web site). Also, if web sites are moved, absolute URLs for the same web site are broken, while relative ones usually survive.
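
The three forms of relative URL can be tried out with urljoin from Python's standard urllib.parse module, which resolves a relative URL against the URL of the current page. The file names index.htm, page2.htm, and images/logo.gif are hypothetical examples.

```python
from urllib.parse import urljoin

# The current page, taken from the example URL in the notes.
current = "http://www.cll.wayne.edu/isp/drbowen/internet/welcome.htm"

# 1. Leading "/" means the highest-level folder of this web server.
print(urljoin(current, "/index.htm"))
# http://www.cll.wayne.edu/index.htm

# 2. A bare file name means the same folder as the current page.
print(urljoin(current, "page2.htm"))
# http://www.cll.wayne.edu/isp/drbowen/internet/page2.htm

# 3. A folder path without a leading "/" means a sub-folder.
print(urljoin(current, "images/logo.gif"))
# http://www.cll.wayne.edu/isp/drbowen/internet/images/logo.gif
```

None of the relative forms mentions the server or the higher-level folders, so they keep working if the whole site is moved.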

      3. Web file format is HTML - HyperText Markup Language. HTML files are simple text files with two types of content
        1. Text appears on the screen as typed except that multiple spaces and line starts (<Enter>) are ignored.
        2. Markup or formatting commands appear inside corner brackets <>, e.g. <center>...</center>
        3. Browser implements formatting commands
        4. Formatting also includes links, graphics, audio, video, accept user input, etc.
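The two kinds of content in an HTML file can be separated with Python's standard html.parser module. This is a rough sketch, not a browser: the class name and the sample page are hypothetical, and a real browser would also start a new line at some tags.

```python
from html.parser import HTMLParser


class MarkupSplitter(HTMLParser):
    """Collect markup commands and displayed text separately."""

    def __init__(self):
        super().__init__()
        self.tags = []   # formatting commands found inside <>
        self.text = []   # everything else

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)

    def displayed_text(self):
        # Multiple spaces and line starts collapse to single spaces.
        return " ".join("".join(self.text).split())


page = "<center>Computers   and\nSociety</center>\n<p>Winter    2001</p>"
splitter = MarkupSplitter()
splitter.feed(page)
splitter.close()
print(splitter.tags)               # ['center', 'p']
print(splitter.displayed_text())   # Computers and Society Winter 2001
```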
      4. For web-based email, browser takes information such as destination and message, sends it to web server, web server transfers it to an email server
      5. On the Internet, web traffic is growing rapidly, doubling approximately every eighteen months or less. It has surpassed the previous leader, email traffic. Other indicators, such as the total number of servers, are growing at similar rates. There are probably several reasons for this popularity
        1. Connects all computer platforms
        2. Ease of use, including interactivity
        3. Colorful, attractive layout
        4. Wide variety of content, including purchasing from home
        5. Ability to search, although organization is not a strong point, hard to focus down on the content you want