Wayne State University
College of Lifelong Learning
Interdisciplinary Studies Program
Instructor email: d.r.bowen@wayne.edu
Instructor tel (WSU) (313) 577-1498 / (Home) (248) 549-8518

Macomb University Center, WSU office (810) 263-6700 / (313) 577-6261
Computers, the Internet, and Society
http://www.cll.wayne.edu/isp/drbowen/inetw00
AGS 3360 Section 301 Call Number 99879, 4 cr
or
ISP 7990 Section 300 Call Number 95259, 4 cr

Last updated: 1/17/2000

The Internet
Structure, Function and Applications

  1. "Internet" can be used vaguely today. We will use the technical definition here. The Internet connects computers within local networks so that any connected computer can send computer information to any other.
    1. Most computers connected to the Internet are either part of a network or are connected via an ISP - Internet Service Provider. In either case, there is a "gateway" computer that can receive information from and direct it to the connected computer. The gateway computer has the direct Internet connection.
    2. Each connected computer is given a numerical "IP address" or just plain "IP" with the form of four bytes (each written as a number from 0 to 255) separated by three dots, e.g. 141.217.142.149 for the IP address of the CLL web server. A gateway handles a "family" of IPs. For example, 141.217.142.--- is CLL - College of Lifelong Learning, including the Interdisciplinary Studies Program. So, my desktop is 141.217.142.125. Within CLL, the gateway is 141.217.142.1, and we are all connected together via a Novell NetWare LAN.

      [Figure: InetWork.gif, two LANs connected through their gateways over the Internet]
      1. Both the left-hand and right-hand LANs in this picture have many computers, only one of which is shown. We want to send information from left-hand computer to right-hand computer over the Internet.
      2. LAN - Local Area Network. Shares files, printers. LANs (NetWare, Token Ring, AppleTalk) are usually proprietary. Proprietary arrangements are trade secrets and so do not interoperate. How the gateway and workstations communicate is different on each type of LAN.
      3. Internet protocols are public domain: how they work is published freely, and anyone can use them.
      4. IP addresses must be unique - no repeats
      5. Organizations can be given control over a "domain" and allocate IP addresses within that domain. For example, 141.217.xxx.yyy is Wayne State University. WSU allocates 141.217.142 to CLL, and CLL allocates the yyy to individual computers.
      6. IPs can be "static" -- unchanging, assigned by network administrator -- or "dynamic" -- assigned from a pool of IPs at each boot up by the router
        1. AOL changes IPs at each "hit"
      7. The left-hand LAN uses its proprietary communications to send the packet from the computer to its gateway. That gateway uses Internet communications to send the packet to the right-hand gateway. The right-hand LAN then uses its own proprietary communications to deliver the packet to the destination computer.
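The address "families" described above can be sketched with Python's standard `ipaddress` module. This is only an illustration: the /16 and /24 prefix lengths are an assumption chosen to match the 141.217.xxx.yyy and 141.217.142.yyy patterns in these notes.

```python
import ipaddress

# Prefix lengths are an assumption that matches the patterns in the notes.
wsu_block = ipaddress.ip_network("141.217.0.0/16")    # Wayne State University
cll_block = ipaddress.ip_network("141.217.142.0/24")  # College of Lifelong Learning

desktop = ipaddress.ip_address("141.217.142.125")
gateway = ipaddress.ip_address("141.217.142.1")

# An IPv4 address is four bytes; .packed exposes those bytes directly.
desktop_bytes = list(desktop.packed)

def same_family(a, b, block):
    """Return True when both addresses fall inside the same address block."""
    return a in block and b in block

print(desktop_bytes)                             # the four bytes of the address
print(desktop in wsu_block, desktop in cll_block)
print(same_family(desktop, gateway, cll_block))
```

The membership test (`in`) is what a gateway conceptually does when deciding whether an address belongs to its own family or must be sent out over the Internet.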
    3. Information travels in packets - finite groups of bits. Packets have two parts
      1. The head (header) is standardized and contains "meta information": the "from" address, the "to" address, the length of the packet, the time to live, etc.
      2. Body is freeform - contains the information itself
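A minimal sketch of the head/body split follows. The header layout used here is hypothetical and much simpler than the real IP header; it only carries the fields named above ("from", "to", length, time to live).

```python
import socket
import struct

# Hypothetical layout for illustration only (NOT the real IPv4 header):
# 4-byte "from" address, 4-byte "to" address, 2-byte body length, 1-byte TTL.
HEADER_FORMAT = "!4s4sHB"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

def build_packet(src, dst, body, ttl=64):
    """Put a standardized head in front of a freeform body."""
    header = struct.pack(HEADER_FORMAT, socket.inet_aton(src),
                         socket.inet_aton(dst), len(body), ttl)
    return header + body

def parse_packet(packet):
    """Split a packet back into its head fields and its body."""
    src, dst, length, ttl = struct.unpack(HEADER_FORMAT, packet[:HEADER_SIZE])
    return {
        "from": socket.inet_ntoa(src),
        "to": socket.inet_ntoa(dst),
        "length": length,
        "ttl": ttl,
        "body": packet[HEADER_SIZE:HEADER_SIZE + length],
    }

packet = build_packet("141.217.142.125", "141.217.142.149", b"hello")
parsed = parse_packet(packet)
print(parsed)
```

Because the head has a fixed format, any router can read the addresses without understanding anything about the body.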
    4. Packets are steered to destination by Internet routers. Each router has an Internet address.
      1. Each router has a "routing table" listing final destinations and the next hop for each. For each packet, the router looks up the final destination in the table, then sends the packet on its next hop to the next router.
      2. There are usually many possible routes to destination, so routers have a method of making the choice, usually on the basis of low traffic and therefore probably fastest time
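The table lookup can be sketched as below. The table entries and next-hop addresses are made up for illustration, and the sketch picks the most specific matching entry; it does not model the traffic-based choice among alternate routes mentioned above.

```python
import ipaddress

# Hypothetical routing table: (destination block, next hop).
# The next-hop addresses are invented for this example.
ROUTING_TABLE = [
    (ipaddress.ip_network("141.217.142.0/24"), "141.217.142.1"),
    (ipaddress.ip_network("141.217.0.0/16"), "141.217.0.1"),
    (ipaddress.ip_network("0.0.0.0/0"), "192.0.2.1"),  # default route
]

def next_hop(destination):
    """Return the next hop for a packet's final destination address."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTING_TABLE if dest in net]
    # Prefer the most specific entry, i.e. the one with the longest prefix.
    _, hop = max(matches, key=lambda entry: entry[0].prefixlen)
    return hop

print(next_hop("141.217.142.149"))
print(next_hop("141.217.10.5"))
print(next_hop("198.51.100.7"))
```

Each router repeats this lookup independently, so no single router needs to know the whole route.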
    5. Computers are very inflexible, and must have an explicit order of which computer starts communication and what it does, how the second responds, etc. These are "protocols". The Internet uses the TCP/IP protocol for communication. This is actually two main protocols -- TCP and IP, with a host of others that are lumped in.
      1. IP is the raw transport mechanism, just throwing information out as fast as it can, without asking if it arrives.
      2. TCP sits on top, and uses IP both ways to confirm accurate arrival, and resend if not
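The division of labor between the two protocols can be sketched with a small simulation. This is a toy model, not real TCP: the function names are hypothetical, losses follow a fixed pattern, and acknowledgements themselves are assumed never to be lost.

```python
import itertools

def unreliable_send(packet, drop_pattern):
    """Stand-in for IP: hands the packet over unless the pattern says it is lost."""
    lost = next(drop_pattern)
    return None if lost else packet

def reliable_send(messages, drop_pattern, max_tries=5):
    """Stand-in for TCP: resend each numbered packet until it is acknowledged."""
    received = []
    attempts = 0
    for seq, data in enumerate(messages):
        for _ in range(max_tries):
            attempts += 1
            arrived = unreliable_send((seq, data), drop_pattern)
            if arrived is not None:
                received.append(arrived[1])  # receiver acknowledges this packet
                break
            # No acknowledgement arrived, so loop around and resend.
        else:
            raise RuntimeError(f"gave up on packet {seq}")
    return received, attempts

# Every third transmission is lost, starting with the first one.
pattern = itertools.cycle([True, False, False])
received, attempts = reliable_send(["pack", "ets", "!"], pattern)
print(received, attempts)
```

Three packets needed five transmissions here, which is the price paid for confirmed arrival.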
    6. The Domain Name System (DNS) is another layer of protocol that makes remembering addresses easier.
      1. Many servers (see below) have static IPs and employ "dot-com" names, e.g. www.cll.wayne.edu. These are called domain names, because the last group of letters is known as the domain. The following domains are used primarily in the US:
        1. edu
        2. com
        3. org
        4. gov
        5. net
      2. Other countries use two-letter domains, e.g. de (Germany)
      3. New domains proposed in 1997
        1. firm
        2. store
        3. arts
        4. rec
        5. info
        6. web
        7. nom
      4. Client software goes to a local Domain Name Server (DNS) to get the IP (numerical) address
        1. Communication with the DNS uses TCP/IP
        2. If the local DNS does not have the entry, the request is kicked up to a higher-level DNS
      5. This only happens the first time during a session. For the rest of that session, the client remembers the IP address
      6. This happens without action from the user
      7. TCP/IP developed incrementally by decentralized workers and informal groups. Protocols and software are freely and publicly available. Government support, especially at the start, was critical. Very different from proprietary development.
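The name lookup steps above (ask the local DNS, fall back to a higher-level DNS, remember the answer for the session) can be sketched as a toy resolver. The class name and the second table entry are hypothetical, and plain dictionaries stand in for real name servers.

```python
class ToyResolver:
    """Client-side name lookup with a per-session cache (illustration only)."""

    def __init__(self, local_dns, higher_dns):
        self.local_dns = local_dns
        self.higher_dns = higher_dns
        self.session_cache = {}
        self.sources = []  # where each answer came from, in order

    def lookup(self, name):
        if name in self.session_cache:
            self.sources.append("cache")
            return self.session_cache[name]
        if name in self.local_dns:
            source, address = "local", self.local_dns[name]
        elif name in self.higher_dns:
            source, address = "higher", self.higher_dns[name]
        else:
            raise LookupError(f"no such name: {name}")
        self.sources.append(source)
        self.session_cache[name] = address  # remembered for the rest of the session
        return address

local_dns = {"www.cll.wayne.edu": "141.217.142.149"}
higher_dns = {"www.example.com": "192.0.2.10"}  # made-up entry for illustration
resolver = ToyResolver(local_dns, higher_dns)
first = resolver.lookup("www.cll.wayne.edu")
second = resolver.lookup("www.cll.wayne.edu")
other = resolver.lookup("www.example.com")
print(first, second, other, resolver.sources)
```

In a real program the whole lookup is usually a single library call, for example `socket.gethostbyname`, which is why the user never sees it happen.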
    7. TCP/IP is the basic Internet communication protocol
      1. IP = Internet protocol
        1. One-way communication - packets go from A to B without any checking, implements IP addresses
        2. Not reliable - delivery is not guaranteed
        3. Fast
        4. Can run on many different types of hardware -- telephone wires, coax, optical fiber, etc.
        5. Current version is IPv4, being upgraded to IPv6. IPv4 provides about 4.3 billion addresses, which is not enough given the Internet's rapid growth and the fact that many allocated addresses are not used. IPv6 provides about 3.4 x 10^38 addresses (a 39-digit number).
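The address counts follow directly from the address sizes: an IPv4 address is 32 bits long and an IPv6 address is 128 bits long. A quick check of the arithmetic:

```python
ipv4_count = 2 ** 32    # 32-bit addresses
ipv6_count = 2 ** 128   # 128-bit addresses

print(f"{ipv4_count:,}")          # exact IPv4 count with thousands separators
print(f"{ipv6_count:.1e}")        # IPv6 count in scientific notation
print(len(str(ipv6_count)))       # number of decimal digits in the IPv6 count
```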
      2. TCP = Transmission Control Protocol. TCP uses ("sits on top of") IP, using it in both directions to implement reliable transmission
        1. Waits for distant computer to acknowledge ("connection")
        2. Successive packets have sequence numbers.
        3. Flow control - halt, continue
        4. Slower
      3. Audio and video could use another protocol instead of TCP, because dropping a bit or two is not serious, and faster transmission would more than make up for any "snow"
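The role of the sequence numbers listed under TCP above can be sketched as follows. This is a simplified model under stated assumptions: the function name is hypothetical, packets are numbered from 0, and the receiver is assumed to know the total number of packets in advance.

```python
def reassemble(packets, total):
    """Rebuild a message from (sequence_number, data) packets.

    Returns (message, missing); message is None while packets are missing.
    """
    by_seq = {}
    for seq, data in packets:
        by_seq.setdefault(seq, data)  # ignore duplicate copies of a packet
    missing = [seq for seq in range(total) if seq not in by_seq]
    if missing:
        return None, missing  # the receiver would ask for these to be resent
    return "".join(by_seq[seq] for seq in range(total)), []

# Packets arrive out of order, and packet 0 arrives twice.
arrived = [(2, "net"), (0, "In"), (1, "ter"), (0, "In")]
message, missing = reassemble(arrived, total=3)
print(message, missing)
```

A protocol for audio or video could skip the "missing" check entirely and simply play whatever arrived, trading accuracy for speed.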
    8. This is what I will mean by "The Internet" -- a pipeline for delivering information between any two connected computers
      1. By connecting to the Internet, an organization extends it and provides alternate routes. Each organization funds its own part. Backbone is maintained by large communications companies.
      2. Standards were developed by public discussion, are not proprietary, many companies make routers and provide services. (Cisco Systems is largest manufacturer.)
  2. "Applications" are programs that use this transmission mechanism
    1. Peer applications have two computers acting as equals, but this is fairly rare
    2. Client-server is much more common
      1. A client requests information from a server, displays information when it is received
      2. Server sits and waits for information request, services request when request is received. Server seems to be simpler, but it must be able to service simultaneous requests, also expected to be very robust -- always available
      3. Clients and servers using the same application protocol are (supposed to be) interchangeable.
        1. E.g., email has three primary access methods
          1. POP (currently POP3) or Post Office Protocol
          2. IMAP (currently IMAP4) or Internet Message Access Protocol
          3. Web-based email such as Hotmail. This is becoming very popular because it requires much less configuration than the others, and web-knowledgeable users already know how to use it.
          4. Client and server must be matched at each end, but can be different at the two ends
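The client-server pattern can be sketched with Python's standard `socket` module. This is a minimal illustration, not a real application protocol: the server handles a single request on the local machine and then exits, whereas a real server must handle many simultaneous requests and stay available. The reply text and function names are made up.

```python
import socket
import threading

def serve_once(listener):
    """Server side: wait for one request, service it, then close the connection."""
    conn, _ = listener.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"You asked for: " + request)

def ask(port, message):
    """Client side: send a request and collect the reply until the server closes."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        chunks = []
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the operating system pick a free port
listener.listen()
port = listener.getsockname()[1]

server_thread = threading.Thread(target=serve_once, args=(listener,))
server_thread.start()
reply = ask(port, b"welcome.htm")
server_thread.join()
listener.close()
print(reply)
```

Any client that sends a request in the form the server expects would get the same service, which is the sense in which clients and servers are interchangeable.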
    3. Email. Client A is the first user, with an account on mail server #1. A second client, B, has an account on mail server #2. A addresses a message to B and sends it by transmitting it to mail server #1, which sends it on to mail server #2. The message waits there until B logs on and picks it up. Messages use simple text, but files can be attached to them. There are two major protocols -- POP (Post Office Protocol) and IMAP (Internet Message Access Protocol). POP is simpler and more popular; IMAP is more comprehensive and is commonly supposed to be the future. Client and server must use the same protocol. Some email server computers run both servers, and some email clients can be configured either way.
      [Figure: Email.gif, client A sending a message to client B through mail servers #1 and #2]
      1. An Internet email address has two parts separated by @, e.g. d.r.bowen@wayne.edu
        1. part to left of @ is name of account (d.r.bowen)
        2. part to right of @ is email server that account is on (wayne.edu)
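Splitting an address into its two parts takes only a few lines. The function name is hypothetical, and the check is deliberately minimal; it does not validate addresses against the full mail standards.

```python
def split_email_address(address):
    """Return (account, server) for an address of the form account@server."""
    account, separator, server = address.rpartition("@")
    if not separator or not account or not server:
        raise ValueError(f"not an email address: {address!r}")
    return account, server

account, server = split_email_address("d.r.bowen@wayne.edu")
print(account)
print(server)
```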
    4. World Wide Web (the web). The user runs a web client (a.k.a. web browser), e.g. Netscape Communicator or Microsoft Internet Explorer. The user can request a file by (a) typing in its URL, (b) clicking on a link containing the URL as hidden text, or (c) selecting a bookmark, which is the stored URL of a file previously viewed. The server gets the file and returns it, and the client displays it. HyperText Transfer Protocol (HTTP) is the basic web protocol. (HyperText means linked text, but has provisions for graphics and many other extensions.)
      1. Anatomy of a URL (Uniform Resource Locator), which is what you type into the Location or Address window of your browser.
        Example:
        http://www.cll.wayne.edu/isp/drbowen/internet/welcome.htm
        1. http:// - The method (of transfer). Typing http:// is optional in most browsers, which assume it when no method is given. Other methods are
          1. ftp:// (File Transfer Protocol)
          2. telnet:// (Logging into a computer with a command line interface)
          3. gopher:// (Earlier text-based protocol without links inside documents)
          4. file:// (You can open a file directly in your browser to check it out, without going through the web server, and this is the method used in that case.)
        2. www.cll.wayne.edu - Domain Name of the web server. You can also use the numerical IP address, e.g. 141.217.142.149
        3. /isp/drbowen/internet/ - The path of folders to the requested file, from the "document root" folder of the server.
        4. welcome.htm - The name of the requested file. The browser displays files with extensions of htm, html, gif, jpeg, and jpg, and for others, asks if you want to download the file. If no file is listed, web servers are configured with a default file name, which is sent from the folder in the URL.
        5. If the requested filename is the "default" filename, it does not have to be listed. This is good because the user has to type less. If there is no file extension at the end of the URL, the URL is interpreted as requesting the default file name. (Normally the default file name is index.htm or index.html. On the CLL web server, it is welcome.htm)
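The parts of the example URL can be pulled apart with Python's standard `urllib.parse` module. The function name and dictionary keys below are made up for illustration, and the default file name is the one these notes give for the CLL web server.

```python
import posixpath
from urllib.parse import urlsplit

DEFAULT_FILE = "welcome.htm"  # default on the CLL web server, per the notes

def url_anatomy(url):
    """Break a URL into method, server, folder path, and file name."""
    parts = urlsplit(url)
    folder_path, file_name = posixpath.split(parts.path)
    return {
        "method": parts.scheme,
        "server": parts.hostname,
        "folders": folder_path,
        "file": file_name or DEFAULT_FILE,  # no file listed: use the default
    }

full = url_anatomy("http://www.cll.wayne.edu/isp/drbowen/internet/welcome.htm")
short = url_anatomy("http://www.cll.wayne.edu/isp/drbowen/internet/")
print(full)
print(short)
```

Both URLs name the same file, because the second one leaves the default file name off.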
      2. The full URL specifies everything about the requested file. This is an absolute URL. If the requested file is on the same web server, an abbreviated form known as a relative URL can be used. This is particularly useful for creating links and loading images. There are several possible forms for relative URLs, depending on how close the requested page is to the current page.
        1. If the requested file is in the highest-level folder for this web server, only a "/" is necessary, followed by the filename if it is not the default filename
        2. If the requested page is in the same folder, only the name need be given.
        3. If the requested page is in a sub-folder, only the folder path from the folder for the current page, and the file name (if it is not the default file name) need be given. In this case, do not precede the first folder with "/" - that is interpreted as the highest level folder for this web server

        NOTE: Relative URLs are very convenient, because if you develop the web site on one computer and the web server is another computer, then you do not have to worry about the higher-level folders on the web server, which the web master will often be reluctant to divulge (the folder structure is one element needed to hack a web site). Also, if web sites are moved, absolute URLs for the same web site are broken, while relative ones usually survive.
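How a browser turns a relative URL into an absolute one can be checked with `urljoin` from Python's standard `urllib.parse` module. The current page is the example URL from these notes; the other file and folder names (index.htm, syllabus.htm, notes/week1.htm) are hypothetical.

```python
from urllib.parse import urljoin

current_page = "http://www.cll.wayne.edu/isp/drbowen/internet/welcome.htm"

# Leading "/" means: start from the highest-level folder of this web server.
top_level = urljoin(current_page, "/index.htm")

# A bare file name means: same folder as the current page.
same_folder = urljoin(current_page, "syllabus.htm")

# A folder path without a leading "/" means: a sub-folder of the current folder.
sub_folder = urljoin(current_page, "notes/week1.htm")

print(top_level)
print(same_folder)
print(sub_folder)
```

None of the relative forms mentions the server name, which is why they keep working when a whole site is moved to another server.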

      3. Web file format is HTML - HyperText Markup Language. HTML files are simple text files with two types of content
        1. Text appears on the screen as typed except that multiple spaces and line breaks (<Enter>) are collapsed into a single space.
        2. Markup or formatting commands ("tags") appear inside angle brackets <>, e.g. <center>...</center>
        3. Browser implements formatting commands
        4. Formatting also includes links, graphics, audio, video, forms that accept user input, etc.
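The two types of content in an HTML file can be separated with Python's standard `html.parser` module. This is a rough sketch, not a browser: the class name and sample page are made up, and joining all text with single spaces only approximates how a browser lays out text.

```python
from html.parser import HTMLParser

class TextAndTags(HTMLParser):
    """Collect markup commands and text separately (illustration only)."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text_pieces = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.text_pieces.append(data)

    def visible_text(self):
        # Runs of spaces and line breaks collapse into single spaces.
        return " ".join(" ".join(self.text_pieces).split())

page = "<center>Hello,    web\n\n   world</center><p>Second   line</p>"
parser = TextAndTags()
parser.feed(page)
parser.close()
print(parser.tags)
print(parser.visible_text())
```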
      4. For web-based email, browser takes information such as destination and message, sends it to web server, web server transfers it to an email server
      5. On the Internet, web traffic is growing rapidly, doubling approximately every eighteen months or less. It has surpassed the previous leader, email traffic. Other indicators, such as the total number of servers, are growing at similar rates. There are probably several reasons for this popularity
        1. Connects all computer platforms
        2. Ease of use, including interactivity
        3. Colorful, attractive layout
        4. Wide variety of content, including purchasing from home
        5. Ability to search, although organization is not a strong point and it can be hard to focus down on the content you want