Notes on WWW Technology

The Web is an example of a client-server system built on TCP/IP.  The WWW also uses higher-level protocols such as HTTP, FTP and Telnet.  A general outline of a web browser is given below.

The controller portion of the web browser receives input from the user.  It can retrieve information from a web server through the HTTP interface.  Some servers may use other interfaces for different types of media, such as streaming audio.  Web documents are formatted using the HyperText Markup Language (HTML).  HTML describes how the data should be displayed.  The HTML interpreter reads the markup tags that specify display characteristics such as font, color and position.  While the World Wide Web has been around only since the early 1990s, your instructor remembers hypertext systems from the early 1970s.
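As a rough illustration of the HTTP interface described above, the following Python sketch builds the text of a minimal request that a browser might send to a server.  The function name build_get_request and the host name www.example.com are assumptions made for this example; they do not come from these notes.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Return the bytes of a minimal HTTP/1.1 GET request for the given path."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, path, protocol version
        f"Host: {host}",          # HTTP/1.1 requires a Host header
        "Connection: close",      # ask the server to close the connection afterwards
        "",                       # the two empty entries produce the blank line
        "",                       # that ends the header section
    ]
    return "\r\n".join(lines).encode("ascii")


if __name__ == "__main__":
    # www.example.com is a placeholder host used only for illustration.
    print(build_get_request("www.example.com", "/comp476.htm").decode("ascii"))
```

In a real browser these bytes would be written to a TCP connection opened to the server, and the server's reply would be handed to the HTML interpreter.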

Hypertext Tags

The general format of a tag is <TAGNAME> to start and </TAGNAME> to end.  A complete document has the following overall structure:

<HTML>
	<HEAD>
		<TITLE>
		a title
		</TITLE>
	</HEAD>
	<BODY>
		body of document here
	</BODY>
</HTML>
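To give a feel for what an HTML interpreter does with these tags, here is a minimal Python sketch that uses the standard html.parser module to record each start tag and how deeply it is nested.  The class name TagOutline and the function outline are assumptions made for this example, and a real browser does far more than this.

```python
from html.parser import HTMLParser


class TagOutline(HTMLParser):
    """Record each start tag together with its nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.entries = []   # list of (depth, tag name) pairs

    def handle_starttag(self, tag, attrs):
        # The parser reports tag names in lower case.
        self.entries.append((self.depth, tag))
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1


def outline(html_text):
    """Return the (depth, tag) pairs for every start tag in html_text."""
    parser = TagOutline()
    parser.feed(html_text)
    parser.close()
    return parser.entries


PAGE = """<HTML>
<HEAD><TITLE>a title</TITLE></HEAD>
<BODY>body of document here</BODY>
</HTML>"""

if __name__ == "__main__":
    for depth, tag in outline(PAGE):
        print("  " * depth + tag)
```

The sketch assumes every start tag has a matching end tag, as in the skeleton document above.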

Pictures are added with

<IMG SRC="filename.gif">

Size and positioning information can also be included in the IMG tag, for example with the WIDTH, HEIGHT and ALIGN attributes.

Pictures or text can contain a link to another file or web page. The HTML code

<a href="comp476.htm">Class Web Page</a>

will display the text "Class Web Page" as a link.  If a user selects this text, the web page comp476.htm will be loaded.  Using your web browser, you can view the HTML source for this web page.  This can usually be done by selecting the menu item View/Source.

Web pages often involve more than one file.  The HTML file contains the text of the web page.  Each graphical image is downloaded from the server as a separate file.
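The following Python sketch shows one way a browser could work out which files a page needs: it scans the HTML for IMG tags and collects their SRC attributes.  The names ImageCollector and files_needed, and the second image logo.gif, are hypothetical and used only for illustration.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect the SRC attribute of every IMG tag in a page."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive in lower case.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.images.append(value)


def files_needed(page_name, html_text):
    """Return the HTML file followed by every image file it refers to."""
    collector = ImageCollector()
    collector.feed(html_text)
    collector.close()
    return [page_name] + collector.images


if __name__ == "__main__":
    sample = '<BODY><IMG SRC="filename.gif"><IMG SRC="logo.gif"></BODY>'
    print(files_needed("comp476.htm", sample))
```

Each name in the returned list corresponds to a separate request to the server.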

Types of Web Documents

Browsers cache web pages locally on the user’s disk. If you request the same page again (e.g. by pressing the BACK button), the page will be loaded from disk instead of being loaded from the server over the network. This significantly speeds up the loading of previously referenced pages. The same action occurs for any graphics or additional files associated with a page. If the same graphic is used on multiple pages, it will only have to be downloaded once.
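The caching behavior described above can be sketched in a few lines of Python.  This is a simplified model, not how any particular browser is implemented: the class PageCache and the function fake_download are assumptions for this example, a dictionary stands in for the files on disk, and the sketch ignores the checks a real browser makes to decide whether a cached copy is still current.

```python
class PageCache:
    """Keep one copy of each downloaded file, keyed by its URL."""

    def __init__(self, download):
        self._download = download   # function that fetches a URL from the server
        self._store = {}            # stands in for the files saved on disk
        self.downloads = 0          # how many times the server was contacted

    def get(self, url):
        # Only go to the server when no local copy exists.
        if url not in self._store:
            self._store[url] = self._download(url)
            self.downloads += 1
        return self._store[url]


def fake_download(url):
    """Placeholder for a real network transfer."""
    return f"contents of {url}"


if __name__ == "__main__":
    cache = PageCache(fake_download)
    cache.get("logo.gif")      # first request: downloaded
    cache.get("logo.gif")      # second request: served from the cache
    print(cache.downloads)
```

Because the graphic is stored under its URL, a second page that uses the same graphic finds it already in the cache.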

Static Documents

Static documents are probably the most common type of web page.  The author of the web document formats it using HTML and saves the document on a web server.  When a user requests the web page, the web server sends a copy of the document to the user.  The same document is transferred to the user each time it is requested.

Dynamic Documents

Dynamic web documents involve the execution of a program on the web server.  This program usually generates the web document that is sent to the client browser.  This is commonly used in connection with web forms.  The user types some information into the form and this data is passed to a program on the server.  The program then generates a web page based on the input.  The results page of a search engine is a common example of a dynamic document.
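As a small illustration of a dynamic document, the Python sketch below generates a results page from a search term.  The list PAGES and the function search_results_page are hypothetical; a real search engine would consult a much larger index, but the idea of building the HTML at request time is the same.

```python
from html import escape

# Hypothetical data that the server-side program searches through.
PAGES = ["Static Documents", "Dynamic Documents", "Active Documents"]


def search_results_page(term):
    """Generate an HTML page listing the entries of PAGES that contain term."""
    matches = [p for p in PAGES if term.lower() in p.lower()]
    items = "".join(f"<LI>{escape(m)}</LI>" for m in matches)
    # escape() keeps characters typed by the user from being read as markup.
    return (
        "<HTML><HEAD><TITLE>Search results</TITLE></HEAD>"
        f"<BODY><P>Results for {escape(term)}:</P><UL>{items}</UL></BODY></HTML>"
    )


if __name__ == "__main__":
    print(search_results_page("dynamic"))
```

Unlike a static document, the page returned here differs from one request to the next, depending on what the user typed.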

The Common Gateway Interface (CGI) is a standard defining how a program interacts with a server to generate dynamic documents.  A CGI program can be written in any programming language and can generate any type of output.

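The output of a CGI program begins with a header.  The header either indicates the type of the document that follows or points the server to a different document, such as: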

Content-type: text/html

Location: /new/file.txt
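The Python sketch below shows how a CGI program might produce these two kinds of output.  The function names html_response and redirect_response are assumptions for this example.  In both cases the header is followed by a blank line, which separates it from the document body.

```python
def html_response(body):
    """Return CGI output for an HTML document: header, blank line, then the body."""
    return "Content-type: text/html\n\n" + body


def redirect_response(location):
    """Return CGI output that points the server at another document."""
    return f"Location: {location}\n\n"


if __name__ == "__main__":
    # A CGI program writes its output to standard output; the server relays it.
    print(html_response("<HTML><BODY>Hello</BODY></HTML>"), end="")
```

A program would normally emit one of the two forms, not both: either a document of its own or a pointer to an existing one.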

The server can pass parameters to a CGI program.  The parameters come from the part of the URL after a "?".

Parameters are passed in environment variables; the text after the "?" is placed in the QUERY_STRING variable.  Forms are web pages that send the contents of fields to a dynamic document, that is, to a program on the server.  Each field has a name.  The browser sends the entered values to the server as:

?fieldname=value&fieldname2=value2
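The following Python sketch shows how a CGI program might decode the field values, assuming the usual convention in which name=value pairs are joined with "&" and the server places the text after the "?" in the QUERY_STRING environment variable.  The function name form_fields is an assumption for this example; parse_qs and urlencode come from the standard urllib.parse module.

```python
import os
from urllib.parse import parse_qs, urlencode


def form_fields(query_string):
    """Decode 'name=value&name2=value2' into a dictionary of field values."""
    parsed = parse_qs(query_string)   # maps each name to a list of values
    # Keep only the first value of each field to keep the sketch simple.
    return {name: values[0] for name, values in parsed.items()}


if __name__ == "__main__":
    # Browser side: encode the fields the way they appear after the "?".
    print(urlencode({"fieldname": "value", "fieldname2": "value 2"}))
    # Server side: under CGI the query text arrives in QUERY_STRING.
    print(form_fields(os.environ.get("QUERY_STRING", "")))
```

Note that special characters are encoded in transit; a space in a field value, for instance, is sent as "+" and decoded back by parse_qs.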

Active Documents

Active documents are programs that are downloaded from the server to the browser and run on the client's machine.  Popular systems for active documents are ActiveX and Java.

Java was developed by Sun Microsystems.