
Session One
Getting Started with CGI Programs

Welcome to Building Web Applications.

This tutorial focuses on building Common Gateway Interface (CGI) applications with dBASE Plus. It consists of ten Sessions and culminates with a Hands-on Project in which you will create your own WebStore application.

In Session One there are three tasks that you need to work on: (1) You will begin reading about the nature of CGI applications; (2) you will install a web server and configure it so that it handles CGI application requests; and finally (3) you will install a sample application that is included with dBASE Plus.

An Overview of CGI

The Common Gateway Interface (CGI) [1] is an interface between your web server and the programs you write. CGI lets those programs process HTML forms or other data coming from clients, and then it lets the CGI programs send a response back to the client. The response can be HTML documents, GIF files, video clips, or any data the client browser can view. This makes your web pages interactive with the user.

The word gateway is used to describe the connection between your program and your HTTP (web) server. Like a gate in a fence between two fields, CGI rests between your program and the server.

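The rest of this section describes the CGI request process, how CGI programs are accessed through URLs, how they accept user input, what data the server sends to them, and how they send output.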

The CGI Request Process In Detail

This section provides an overview of how CGI fits into the interaction between client software, such as Internet Explorer, Netscape Navigator, or Opera, and HTTP servers, such as Apache or IIS. When a client requests a document from a server, the server finds the file and sends it to the client. However, if a client requests a CGI program, the server simply acts as an intermediary between the client and the CGI program.

The following steps give a simplified overview of what happens when a client requests a CGI process.

  1. The client (a web browser) sends a request to the server for a document. If it can, the server responds to the request directly by sending the document.
  2. If the server determines the request isn't for a document it can simply deliver, the server creates a CGI process.
  3. The CGI process turns the request information into environment variables. Next, it establishes a current working directory for the child process. Finally, it establishes pipes (data pathways) between the server and an external CGI program.
  4. After the external CGI program processes the request, it uses the data pathway to send a response back to the server, which, in turn, sends the response back to the client.

In reality, the behavior of the CGI process is a bit more complicated. The specific details that happen at each stage are described next.

The client sends the request. The client sends a request to the server running on your machine. The request might be for a document, or it might be something else, like the contents of an HTML form. If the request is for a regular document (such as an HTML document or a .GIF file), the server sends that document directly back to the client.

If the request is data intended for an external application, then the server needs to use CGI to run that application. For example, the client's request might be to search a database. The CGI application takes the search criteria, searches the database, then sends the results back to the client.

The server creates the CGI process. When a server receives a request that must be handled by an external application (a CGI request), the server creates a copy of itself. This second process is called the CGI process because it is the process in which the CGI program will run. The CGI process has all the same communication pathways that the server process has. The only purpose of the CGI process is to set up communications between the CGI program and the server.

Because it is a copy of the server, the CGI process has access to information about the CGI request. For example, the CGI process knows

The server assigns variables and opens data paths. The CGI process takes the data the server has about the current request and puts it into environment variables. The CGI process enables the client to pass data to the CGI program as standard input. The CGI process also creates data pathways between your CGI program and the server. This means the server can send to the program any encoded form data the client submitted, and the external CGI program can send a reply back to the client via the server.

The CGI program responds to the client. The CGI program takes the data that the server provides through environment variables, standard input, or command-line arguments. It processes the data, contacts any external services it needs to, and then sends a response to the server by way of the data pathways using standard output. The server then takes the program's response, prepends any necessary protocol headers to the output, and sends it back to the client software. Your program can output any type of data it needs to, including HTML, GIFs, or JPEGs.
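
The exchange described above can be sketched in a few lines. The following is a minimal illustration written in Python, not in dBASE Plus, and the function name build_response is hypothetical. It assumes the server has already placed the request data in the standard CGI environment variables REQUEST_METHOD and QUERY_STRING.

```python
import os
import sys


def build_response(environ):
    """Build a CGI reply from the request data found in environ."""
    # The server places request details in environment variables.
    method = environ.get("REQUEST_METHOD", "GET")
    query = environ.get("QUERY_STRING", "")
    # Header first, then one blank line, then the body of the reply.
    header = "Content-type: text/plain\n\n"
    body = "method=" + method + "\nquery=" + query + "\n"
    return header + body


if __name__ == "__main__":
    # Standard output is the data pathway back to the server.
    sys.stdout.write(build_response(os.environ))
```

Keeping the reply-building logic in a function that takes the environment as a parameter makes it possible to try the program from a command prompt without a web server.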

Accessing CGI Programs Through URLs

When responding to a request from a client, the server must figure out if it can handle the request itself or if it must create a CGI process. To determine whether a URL that the client requests refers to a CGI program, the server can use two methods. You can configure the server to use either method.

The method of CGI activation you choose determines only part of the URL used to access your program. URLs to CGI programs can be split into three different parts, shown here in brackets:

   
   [virtual path][extra path information]?[query string]
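
To make the three parts concrete, here is a small Python sketch that splits a URL into them. The function name split_cgi_url and the sample URL are hypothetical. Only the server knows which part of the path maps to the CGI program, so the sketch takes that virtual path as an argument rather than guessing it.

```python
from urllib.parse import urlsplit


def split_cgi_url(url, script_path):
    """Return (virtual path, extra path information, query string).

    script_path is the virtual path the server maps to the CGI program.
    It is an assumption supplied by the caller.
    """
    parts = urlsplit(url)
    if not parts.path.startswith(script_path):
        raise ValueError("URL does not point at " + script_path)
    # Whatever follows the program's own path is extra path information.
    extra_path = parts.path[len(script_path):]
    return script_path, extra_path, parts.query


if __name__ == "__main__":
    url = "http://localhost/signup/signup.exe/extra/info?name=Ann&city=Rome"
    print(split_cgi_url(url, "/signup/signup.exe"))
```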

Accepting user input from URLs

The main type of information the client can send to a CGI program is HTML form data. When the data is typed as form text, the data is encoded using URL encoding. In URL encoding, there are two rules:

If the data comes from an HTML form, then the location of the data varies depending on the method attribute specified with the FORM tag in your HTML document.

Wherever the data comes from, it appears in this form:

   
  name1=value1&name2=value2 ... &nameN=valueN 

If there are any = or & characters in the data, they must be encoded using URL encoding. This avoids ambiguity when your program translates the form data. To properly decode this data, a CGI program should first split it into name-value pairs (eliminating the ampersands), then split each pair into a name and a value, and then apply URL decoding to the name portion and to the value portion of the pair. Fortunately, the dBASE WebClass library handles this task for you.
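
The decoding order described above (split on ampersands, split each pair on the first equals sign, then URL-decode both halves) can be sketched as follows. This is a Python illustration only, not the WebClass implementation, and the function name decode_form_data is hypothetical.

```python
from urllib.parse import unquote_plus


def decode_form_data(data):
    """Turn 'name1=value1&name2=value2' into a list of (name, value) pairs."""
    pairs = []
    for pair in data.split("&"):
        if not pair:
            continue  # skip empty segments such as a trailing '&'
        # Split on the first '=' only; an encoded '=' in the data is still '%3D' here.
        name, _, value = pair.partition("=")
        # Decode after splitting so encoded '&' and '=' cannot be mistaken for separators.
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


if __name__ == "__main__":
    print(decode_form_data("name=Ann+Lee&note=a%3Db%26c"))
```

Decoding before splitting would turn an encoded ampersand back into a separator, which is the ambiguity the encoding is meant to avoid.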

The various form elements have their own rules for determining what value is associated with the name they are given:

Data The Server Sends To The CGI Program

The server sends data to the CGI program in three ways: environment variables, standard input, and command-line arguments.

Environment variables are the most common method used to pass data about a request to your CGI program. This data comes from the server software itself, from the network socket connecting the client to the server, and from the URL that was used to access the CGI program (for example, when using the GET method, the data is sent to the QUERY_STRING variable).

Environment variables are identified by character strings and have character string values.

HTML forms that use the POST method send their encoded information using the standard input. You can use the CONTENT_LENGTH environment variable to determine the number of bytes to read in.

The standard HTML form assumes the content-type is "application/x-www-form-urlencoded," but you can use the standard input to send other types of data to your programs, such as binary data, which you can then save to a file on your server machine.
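
As a rough sketch of reading POST data, the Python function below reads exactly the number of bytes announced by CONTENT_LENGTH and no more. The function name read_post_body is hypothetical. In a real CGI program the stream would be the binary standard input (sys.stdin.buffer in Python); here an in-memory stream stands in for it.

```python
import io


def read_post_body(environ, stdin):
    """Read the request body from a binary stream using CONTENT_LENGTH."""
    # A missing or empty CONTENT_LENGTH is treated as zero bytes.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    return stdin.read(length)


if __name__ == "__main__":
    # Simulated standard input; only the first 8 bytes belong to the request.
    fake_stdin = io.BytesIO(b"name=Ann&city=Rome")
    print(read_post_body({"CONTENT_LENGTH": "8"}, fake_stdin))
```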

Sending Output From CGI Programs

Usually the CGI program's output goes to the client through the server-spawned CGI process. This means your CGI program doesn't have to worry about protocol-specific headers and such. Here we will consider the CGI header which is sent by your program and read by the web server.

CGI generic headers. When the CGI program sends its output to the server, it begins the text with generic headers. A CGI header consists of one or more text lines in this format:

   
   name: value
 

A single blank line signals the end of the header. After the blank line, the server stops parsing your program's header and sends the rest of your data untouched to the client. This means that your program can output any type of data it needs to, including HTML, GIFs, or JPEGs.

Each name-value pair is an HTTP protocol header. You can output any header you want and the server sends it to the client. However, if the server detects odd header lines, the server logs a 500 error and doesn't return any data to the client.

Some of the commonly used HTTP headers are described here. When you output any of these headers, the server doesn't alter their values or their output.

Content-type

This header reports the type of data your program is returning. This is a valid MIME type in the format type/subtype. This header should always be sent from any CGI program.

Example

   
   text/html, text/plain, image/gif, image/jpeg, audio/basic

Content-length

This header reports the length of the data in bytes, not including the header. This header is used when you are streaming a binary file, like an image, from your CGI program.
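
A short Python sketch of a binary reply with both headers follows. The function name build_binary_response is hypothetical, and the six-byte payload is a stand-in for real image data. The length is computed from the payload alone, since the header is not counted.

```python
def build_binary_response(content_type, payload):
    """Return header plus payload as bytes, ready for standard output."""
    lines = [
        "Content-type: " + content_type,
        "Content-length: " + str(len(payload)),  # payload bytes only
        "",  # the empty entries produce the blank line that ends the header
        "",
    ]
    header = "\n".join(lines)
    return header.encode("ascii") + payload


if __name__ == "__main__":
    print(build_binary_response("image/gif", b"GIF89a"))
```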

Expires

This header reports the date on which this file should be considered outdated by the client. You must use a GMT date formatted string of the following form:

    
   Saturday, 12-Nov-94 14:05:51 GMT

dBASE Plus creates this type of date string with the toGMTString() method.
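
For readers who want to see how such a string is assembled, here is a Python sketch that produces the same layout as the example above. It is not the dBASE toGMTString() method, and the function name expires_value is hypothetical. Day and month names are spelled out in the code so the result does not depend on the machine's locale.

```python
from datetime import datetime, timezone

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def expires_value(moment):
    """Format a timezone-aware datetime as 'Saturday, 12-Nov-94 14:05:51 GMT'."""
    utc = moment.astimezone(timezone.utc)  # the header value must be in GMT
    return "%s, %02d-%s-%02d %02d:%02d:%02d GMT" % (
        DAYS[utc.weekday()], utc.day, MONTHS[utc.month - 1],
        utc.year % 100, utc.hour, utc.minute, utc.second)


if __name__ == "__main__":
    print(expires_value(datetime(1994, 11, 12, 14, 5, 51, tzinfo=timezone.utc)))
```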

CGI-specific headers. The headers listed next are special to CGI and make the server act on your program's behalf.

Location

The Location header reports the location of a new file for the server or client to retrieve. This header must be in one of two forms: a complete URL or a virtual path.

If the value is a full URL, such as http://mysrvr/misc/file.html, then the server redirects the client to the new URL (this is transparent to the user). The client then acts as if it had originally requested that redirected URL, so all relative links in the document at that URL are resolved from the directory specified in that URL. For example, if the URL points to an HTML file with relative paths to graphic files, the client locates the files from the directory where the HTML file is.

If the location is a virtual path, such as

    
   /misc/file.html

then the server restarts the request using the virtual path, for example

   
   http://mysrvr/misc/file.html.

However, the client isn't informed of the new location, so any relative links in the document are resolved from the directory of your CGI program, not of the document that is actually being returned. This means images referenced in the document might not work because the client might be looking for them in the wrong directory.
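
The difference between the two forms can be shown with a small Python sketch of how a client resolves a relative link. The CGI program URL http://mysrvr/cgi-bin/show.exe and the file name logo.gif are hypothetical; only http://mysrvr/misc/file.html comes from the example above.

```python
from urllib.parse import urljoin


def resolve_link(base_url, relative_link):
    """Resolve a relative link the way a browser does, from the URL it requested."""
    return urljoin(base_url, relative_link)


if __name__ == "__main__":
    # Full URL in Location: the client knows the document's real address.
    print(resolve_link("http://mysrvr/misc/file.html", "logo.gif"))
    # Virtual path in Location: the client still thinks it is at the CGI program.
    print(resolve_link("http://mysrvr/cgi-bin/show.exe", "logo.gif"))
```

In the second case the client looks for the image next to the CGI program rather than next to the document, which is why the image may fail to load.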

Status

The Status header carries the status code that is returned for every HTTP request. The status code indicates to the client whether the request succeeded or not. If the request was unsuccessful, one of several error codes tells the client what happened. If the request was successful, there are status codes that indicate success and codes that ask for further action from the client. If no Status header line is provided in the CGI header, the default is assumed to be 200 OK unless a Location header with a full URL is present. If such a Location header is present, the default is 302 Found.

The status line has the form nnn reason, where nnn is the three-digit code for the request, and reason is a short string describing the result. The following codes and reasons are commonly recognized by web browsers.
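
The default rule can be written out as a short Python sketch. It models only the rule stated here and is not the code of any particular server. The function name effective_status is hypothetical, and a "full URL" is assumed to mean one that starts with http:// or https://.

```python
def effective_status(headers):
    """Return the status that applies to a CGI header, given as a dict."""
    # An explicit Status header always wins.
    if "Status" in headers:
        return headers["Status"]
    # A Location header with a full URL means a redirect.
    location = headers.get("Location", "")
    if location.startswith("http://") or location.startswith("https://"):
        return "302 Found"
    # Otherwise the request is assumed to have finished normally.
    return "200 OK"


if __name__ == "__main__":
    print(effective_status({"Content-type": "text/html"}))
    print(effective_status({"Location": "http://mysrvr/misc/file.html"}))
```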

200 OK. The request finished normally.
204 No response. The request was understood and processed, but there is no new document to be loaded by the client.
302 Found. The client should look for data at a new URL, given by a Location header.
304 Use local copy. The client sent a request with an if-modified-since header, and the requested data hasn't been modified since the given date.
400 Bad request. The request had illegal or unintelligible HTTP inside.
401 Unauthorized. If access authorization is enabled, the request could not be fulfilled because the user did not provide the proper authorization to access the area. With current authorization schemes, a WWW-Authenticate header must be provided to give the client instructions on how to complete the request with the proper authorization.
403 Forbidden. The client is not allowed to access what it requested.
404 Not found. The client asked for something the server couldn't find.
500 Server error. This is a catch-all error code that indicates something went wrong in the server or the CGI program, and the problem stopped the request from being completed.
501 Not implemented. The client asked the server to perform an action that the server knows about, but can't do.

Sample program output

The following CGI program output sends an HTML document back to the client. Notice the blank line after the header. This lets the server know where the header ends and where the document begins.

   
   Content-type: text/html  

   <title>My little document</title> 
   This is my own little document. Do you like it?

The following output instructs the client to retrieve a different URL. The small HTML fragment at the bottom allows any navigation software that doesn't support redirection to retrieve the given URL.

   
   Location: http://www.sample.org/abc/afile.html 
 
   This document can be accessed at the following 
   <a href="http://www.sample.org/abc/afile.html">location</a>.

 
Note
You can read more about CGI programs in dBASE on The Web by A.A. Katz. This is a very useful document. The first two parts of the book provide an overview of web applications and how dBASE Plus relates to these types of applications. I encourage you to read "Part 1: The Web" and "Part 2: dBASE on The Web," which will take you through page 45. You can download dBASE on The Web from this link: [Download dBASE on the Web (792KB)]

Install a Web server

Now that we have seen how CGI programs work, we need to install a web server. You are welcome to use any web server you want. I am most familiar with Apache, which means that I'll be able to provide the most help with this server.

Microsoft's Internet Information Server (IIS) is also very popular and more than adequate for this Tutorial. Apache, however, has the advantage that it can be run as a service or as a console application. This ability can be very useful for debugging web applications. IIS does not run as a console application, and therefore cannot display error messages at the computer console. You can develop your application with the Apache server and deploy it to an IIS server without any conflicts.

There is an Apache installer on your dBASE Plus CD. The dBASE Plus 2.0 CD includes version 1.3.22 of the Apache server. The distributors of Apache, known as the Apache Group, provide an updated version of their 1.3 server and they provide the Apache 2.0.x web server.

For purposes of this Tutorial, either the 1.3.x or the 2.0.x server is adequate. I've been using the Apache 2.0 server for quite some time now, and for what we will do in this Tutorial, there is no difference between the two versions. On a production server, however, whether you are using a 1.3 or 2.0 server, be sure that you have installed the most up-to-date build. This is to avoid any security vulnerabilities. The installers for these servers are available at the Apache website. [Get the Apache web server]

At the time of this writing Apache 2.0.47 is the most recent build. Look for a file with a name similar to this: apache_2.0.47-win32-x86-no_ssl.msi. This is version 2.0.47 for Windows. It does not include Secure Socket Layer (SSL) support. (SSL is used when encrypted data must be passed between a web browser and a web server. SSL is built into IIS and it can be added to an Apache 1.3.x server. We will not need SSL support in this Tutorial.)

Run the Apache installer now to set up the web server on your development computer. Visit the Apache website for additional information about installing the server.

 
Note

I install the Apache web server in the root of the C drive so that the server files are located in the folder C:\Apache2. This Tutorial will use that path in all of its examples. If you use a different path, be sure to adjust the examples accordingly.


Apache must be configured on your computer so that it will run CGI applications. I am going to suggest some basic settings so that your server will function on a development machine. These settings are not necessarily the settings that you would use on a production server. In addition to the comments below, there is a useful article in the dBASE KnowledgeBase that discusses how to configure an Apache server.

http://www.dbase.com/KnowledgeBase/int/Apache/Apache.htm

Find the folder where you installed Apache (this might be C:\Apache or C:\Program Files\Apache Group\Apache2). There is a subfolder named "conf" which contains the Apache configuration files. In that folder you will find httpd.conf. Open this file with a text editor and set the following "directives":

ServerName

If you are using your development machine and you are not going to make local network calls to the server, you can set this to:

   
   ServerName localhost

If you have a home network, you can use the IP address of the machine on which Apache is installed. For example:

   
   ServerName 192.168.0.2

Below the ServerName directive you should add a setting which will allow suppression of the dBASE Plus runtime engine initialization error dialogs. This is an operating system environment variable which Apache will set each time a CGI application is run.

   
   ServerName localhost
   SetEnv DBASE_SUPPRESS_STARTUP_DIALOGS 1

DocumentRoot

The DocumentRoot directive identifies the folder on your hard drive that will be the topmost folder of your website. When Apache is installed, it creates a default document root named "htdocs." On my computer this folder is "C:/Apache2/htdocs". If you used the default installation location, then the path will be "C:/Program Files/Apache Group/Apache2/htdocs". This folder is where all the web files will be stored for our Tutorial. It also contains the Apache documentation.

If you would like to use a different location for the document root (for example, "d:/webdocs" or "f:/dbaseWeb"), modify the DocumentRoot directive so that it points to the desired folder. Note that Apache uses forward slashes ("/") rather than backslashes ("\") in its configuration file.

   
    DocumentRoot "c:/Apache2/htdocs"
 

Activate CGI script

In the httpd.conf file, find the "Directory Block" for your web site's document root. It will look something like this:

   
   <Directory "C:/Apache2/htdocs">
  
   </Directory>

If you changed the default DocumentRoot location, then you must edit the path in the directory block.

Within the block you will see many comments and some directives. All the directives within a directory block apply to the specified folder and to all its subfolders. The subfolder inherits the settings of the parent folder. (You can override the inherited settings by adding another directory block and changing the directive as desired.)

Within the directory block for your server's document root, find the "Options" directive, which will look similar to this:

   
   Options Indexes FollowSymLinks

Add "ExecCGI" to the end of that line, and then add a new AddHandler line below it. After modification, the lines should look like this:

   
   Options Indexes FollowSymLinks ExecCGI
   AddHandler cgi-script .cgi .exe .dbw 
 

The first modification tells the server that CGI execution is permitted in this particular directory. The second modification tells the server that files with the specified extensions should be handled as CGI scripts. Note that in our case this means that CGI execution is permitted across the entire document root. Normally, you should not do this on a production server. When a web server is exposed to the open network, you should enable CGI execution only for those folders in which executable files are stored.
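
If you want to double-check your edits, a few lines of Python can scan the text of a directory block for the two settings. This is only a convenience sketch, not an Apache tool, and the function name check_cgi_settings is hypothetical. It assumes the directives are spelled exactly as shown above and does not handle other forms that Apache accepts, such as "+ExecCGI".

```python
def check_cgi_settings(directory_block):
    """Return (exec_cgi_enabled, handled_extensions) for the given config text."""
    exec_cgi = False
    extensions = []
    for raw_line in directory_block.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        words = line.split()
        if words[0] == "Options" and "ExecCGI" in words[1:]:
            exec_cgi = True
        elif words[0] == "AddHandler" and words[1:2] == ["cgi-script"]:
            extensions.extend(words[2:])
    return exec_cgi, extensions


if __name__ == "__main__":
    sample = """
    Options Indexes FollowSymLinks ExecCGI
    AddHandler cgi-script .cgi .exe .dbw
    """
    print(check_cgi_settings(sample))
```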

Additional information about running CGI scripts can be found at this link.

Install a Sample Web Application

The final task for Session One is to install the distribution files for this Tutorial and to get a sample application running on your web site.

Download the distribution files and unzip the archive into the document root on your server. Among the different files in this archive, you will find the sample application that is shipped with dBASE Plus. I have modified this application only slightly. If your server is set up correctly, you should be able to get this sample application running on your own web server. The benefit of installing this application is that you can ensure that your web server is running correctly and troubleshoot it if needed. You will also have some model code to refer to as you proceed through the Tutorial Sessions.

Follow these steps to get the sample application working on your web server.

1) Open dBASE Plus and, in the Navigator, change to the following folder:

   
   \Apache2\htdocs\signup\source

2) Switch to the command window and type:

   
   DO BuildAllToSamples

3) Open your web browser and type the following into the location field:

   
   http://localhost/signup/signup.htm

For more detail on this sample application see dBASE on The Web, Part 7: Sample Applications, pp. 115-122.

Summary

In this Session our goal was to ensure that your web server is properly configured. If everything worked as designed, you should have a dBASE Plus CGI application running on your web server.


[1] Portions of the following have been adapted from "Enterprise Server Programmer's Guide"
http://developer.netscape.com



The Legal Stuff: This document is part of the dBASE onLine Training Program created by Michael J. Nuwer. This material is copyright © 2001, 2003 by Michael J. Nuwer. dBASE is copyrighted, trademarked, etc., by dBASE, Inc. This document may not be posted elsewhere without the explicit permission of the author, who retains all rights to the document.