CGI Environment Variables
The defining characteristic of a CGI program is its ability to read and understand data submitted to it from a web form. When a remote client submits a form, the browser bundles it up in a special format and sends it back to the web server. The server then passes it on to your program. Your program must know how to acquire that bundle of data and then unbundle it. The CGI protocol is the language that specifies how the data is bundled and supplied to the CGI program.
In this Session we will discuss the environment variables that your web server will create when a CGI session is generated. Environment variables are a series of hidden values that the web server sends to every CGI program that it runs. Your CGI program (or the dBASE WebClass library) can parse them, and use the data they send.
In order to get a better understanding of the myriad of environment variables, we will begin by building a CGI program that displays many of these variables and their values. Run dBASE Plus and change to the "Source" folder in your server's document root. Then switch to the command window and enter the following two commands:
compile showEnv
build showEnv to ..\app\showEnv.exe WEB
You now have a CGI program ready to run. This program reads a set of CGI environment variables and returns their values in the response page. To see the environment variables, click the following link. (Be sure your web server is running, and use the browser's back button to return to this page.)
Your response page should look very similar to the following printout, except that your values will be slightly different. This is not a complete list of all the environment variables, but it includes most of the more commonly used ones.
SERVER_SOFTWARE = Apache/2.0.28 (Win32)
SERVER_NAME = localhost
SERVER_PROTOCOL = HTTP/1.1
SERVER_PORT = 80
SERVER_ADMIN = mike@home
GATEWAY_INTERFACE = CGI/1.1
DOCUMENT_ROOT = C:/Apache2/htdocs
REQUEST_METHOD = GET
SCRIPT_FILENAME = C:/Apache2/htdocs/App/showEnv.exe
SCRIPT_NAME = /app/showEnv.exe
CONTENT_TYPE =
CONTENT_LENGTH =
QUERY_STRING =
REQUEST_URI = /app/showEnv.exe
PATH_INFO =
PATH_TRANSLATED =
AUTH_TYPE =
REMOTE_HOST =
REMOTE_ADDR = 127.0.0.1
REMOTE_USER =
HTTP_HOST = localhost
HTTP_ACCEPT_CHARSET =
HTTP_ACCEPT_LANGUAGE = en-us
HTTP_CONNECTION = Keep-Alive
HTTP_REFERER =
HTTP_USER_AGENT = Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; DigExt)
Notice that some environment variables give information about your server and will never change from one CGI request to the next (such as SERVER_NAME and SERVER_ADMIN), while others give information about the visitor and will be different every time someone accesses the script. Different web servers set their own environment variables as well, so you should check your server documentation for more information. In addition, some servers, like Apache, let you define your own custom environment variables.
Not all environment variables get set for every CGI program. For example, REMOTE_USER is only set for pages in a directory or subdirectory that's password-protected. And HTTP_REFERER is only set when a server request is made from another web page, rather than from a bookmark or an address typed directly into the location field.
In your dBASE Plus CGI Program, you can use the getenv() function to access any of the CGI environment variables.
cRemoteHost = getenv("REMOTE_ADDR")
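The source of showEnv is not reproduced in this Session. As a rough idea of how such a program can be put together, the following minimal sketch writes a handful of variables to the browser as plain text. It assumes that standard output can be opened with a File object under the name "StdOut", in the same way "StdIn" is opened later in this Session. The names aNames, fOut and cValue are illustrative only.

```dbl
// Minimal sketch, not the actual showEnv source.
aNames = {"SERVER_SOFTWARE", "REQUEST_METHOD", "QUERY_STRING", "REMOTE_ADDR", "HTTP_REFERER"}

fOut = new File()
fOut.open("StdOut", "RA")              // assumption: StdOut is opened like StdIn
fOut.puts("Content-type: text/plain")  // HTTP header expected by the web server
fOut.puts("")                          // a blank line ends the header
for i = 1 to aNames.size
   cValue = getenv(aNames[i])
   if empty(cValue)
      cValue = "(not set)"             // unset variables come back as an empty string
   endif
   fOut.puts(aNames[i] + " = " + cValue)
next
fOut.close()
quit
```

Marking empty values explicitly makes it easier to see which variables the server did not set for a particular request.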
The following list describes many of the common environment variables. If the client sends any HTTP headers along with its request, then these headers are also placed into the environment. The names of these environment variables are the names of the HTTP headers prefixed with HTTP_. All letters in the name are changed to upper case and all hyphens are changed to underscores.
SERVER_SOFTWARE. This environment variable contains the name and version of the server software that your program is running under.
SERVER_NAME. This environment variable contains the domain name or IP address of the server machine.
SERVER_PROTOCOL. This environment variable contains the name and revision of the protocol being used by the client and server.
SERVER_PORT. This environment variable contains the number of the port to which this request was sent.
DOCUMENT_ROOT. This environment variable contains the document root directory under which the current program is executing, as defined in the server's configuration file.
SERVER_ADMIN. The value given to the ServerAdmin (for Apache) directive in the web server configuration file. If the script is running on a virtual host, this will be the value defined for that virtual host.
GATEWAY_INTERFACE. This environment variable contains the revision of the CGI specification supported by the server software.
REQUEST_METHOD. This environment variable contains the name of the method (defined in the HTTP protocol) to be used when accessing URLs on the server. When a hyperlink is clicked, the GET method is used. When a form is submitted, the method used is determined by the METHOD attribute of the FORM tag. CGI programs do not have to deal with the HEAD method directly and can treat it just like the GET method.
SCRIPT_FILENAME. The absolute pathname of the currently executing script.
SCRIPT_NAME. This environment variable contains the virtual path to your program. If your program needs to refer the remote client back to itself, or needs to construct anchors in HTML referring to itself, you can use this variable.
CONTENT_TYPE. If a form is submitted with the POST method, then this environment variable contains the type of data being sent by the client. While clients normally only send "application/x-www-form-urlencoded", this variable can contain any MIME type. To transfer binary data to your CGI program you must use "multipart/form-data".
CONTENT_LENGTH. This environment variable contains the number of bytes being sent by the client. You use this variable to determine the number of bytes you need to read.
QUERY_STRING. This environment variable carries information from an HTML page to your program in these three instances:
- When a page contains links with encoded queries
- When a form is submitted with the GET method
- When a page contains an ISINDEX tag and the user executes a search
REQUEST_URI. The Uniform Resource Identifier (URI) which was given in order to access the program. The URI points the server to the file that contains the CGI program you want to run (or the static document or image to be served).
PATH_INFO. This environment variable contains the extra path information that the server derives from the URL that was used to access the CGI program.
PATH_TRANSLATED. This environment variable contains the actual fully-qualified file name that was translated from the URL. Web servers distinguish between path names used in URLs and file system path names. It is often useful to make your PATH_INFO a virtual path so that the server provides a physical path name in this variable. This way, you can avoid giving file system path names to remote client software.
AUTH_TYPE. If the CGI script is protected by any type of authorization, this environment variable contains the authorization type. Apache web servers support HTTP basic and digest access authorization.
REMOTE_HOST. This environment variable contains the host name of the remote client. This is a fully-qualified domain name such as www.dbase.com (instead of just www, which you might type within your intranet). If no host name information is available, your program should use the REMOTE_ADDR variable instead.
REMOTE_ADDR. This environment variable contains the IP address of the remote host. This information is guaranteed to be present.
REMOTE_USER. This environment variable is set to the HTTP user name of the person using the browser software, and only if access authorization has been activated for this URL. Note that this is not a way to determine the user name of any person accessing your program.
HTTP_HOST. Contents of the Host: header from the current request, if there is one.
HTTP_ACCEPT. This environment variable enumerates the types of data the client can accept. For most client software, this protocol feature has become a bit convoluted and the information isn't always useful.
HTTP_ACCEPT_CHARSET. Contents of the Accept-Charset: header from the current request, if there is one.
HTTP_ACCEPT_LANGUAGE. Contents of the Accept-Language: header from the current request, if there is one. This value can be changed in the client browser's options, when choosing a preferred language.
HTTP_CONNECTION. Contents of the Connection: header from the current request, if there is one.
HTTP_REFERER. The address of the page (if any) which referred the browser to the current page. This is set by the user's browser; not all browsers will set this.
HTTP_USER_AGENT. Contents of the User-Agent: header from the current request, if there is one. This is a string that identifies the browser software being used to access your program and view the current page.
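The naming rule for request headers (prefix HTTP_, upper case, hyphens changed to underscores) is easy to apply in code. The function below is a small sketch of that rule. headerToEnvName is a hypothetical helper written for this illustration; it is not part of dBASE or the WebClass library.

```dbl
// Hypothetical helper: turn an HTTP header name into its CGI variable name.
function headerToEnvName(cHeader)
   local cName, nPos
   cName = upper(cHeader)
   nPos = at("-", cName)
   do while nPos > 0
      cName = stuff(cName, nPos, 1, "_")   // replace one hyphen with an underscore
      nPos = at("-", cName)
   enddo
   return "HTTP_" + cName
```

For example, headerToEnvName("Accept-Language") returns "HTTP_ACCEPT_LANGUAGE", so getenv(headerToEnvName("Accept-Language")) reads the visitor's language preference when the browser sent that header.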
There are two primary ways in which a remote client can submit data to a CGI program. The first way is to use an Anchor tag. This tag lets you define a hypertext link that the user can click to display a document. You define a hypertext link by using the <A> tag with an HREF attribute to indicate the start of the hypertext link, and use the </A> tag to indicate the end of the link. When the user clicks any content between the <A HREF> and </A> tags, the link is activated. The value of the HREF attribute must be a URL or a virtual path. If you want the link to open a new document, the value of HREF should be the URL or virtual path for the destination document.
Using a hyperlink to execute a CGI program utilizes the GET method and populates the QUERY_STRING environment variable. Using the query string is the simplest way of passing data to a CGI program. If you append a question mark (?) to the URL for your applet, then any characters after the question mark will be passed to your applet in the QUERY_STRING environment variable.
You can use the following hyperlink to run the ShowEnv applet and see the values of the environment variables. You will see that "GET" is the value for the REQUEST_METHOD environment variable and "name=value&name1=value1" is the value for the QUERY_STRING environment variable.
REQUEST_METHOD = GET
SCRIPT_FILENAME = c:/Apache2/htdocs/app/showenv.exe
SCRIPT_NAME = /app/showEnv.exe
CONTENT_TYPE =
CONTENT_LENGTH =
QUERY_STRING = name=value&name1=value1
REQUEST_URI = /app/showEnv.exe?name=value&name1=value1
In a dBASE CGI program, you can obtain the data in the QUERY_STRING with code like the following.
cMethod = upper(getEnv("REQUEST_METHOD"))
if cMethod == "GET"
   cEnv = getEnv("QUERY_STRING")
endif
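The query string arrives as a single string of name=value pairs separated by ampersands, so a program usually splits it before using it. The following is a minimal sketch of one way to do that with an associative array. parseQuery is a hypothetical function name, and the sketch does not URL-decode the values.

```dbl
// Hypothetical helper: split "a=1&b=2" into an associative array.
function parseQuery(cQuery)
   local aa, cRest, cPair, nAmp, nEq
   aa = new AssocArray()
   cRest = cQuery
   do while len(cRest) > 0
      nAmp = at("&", cRest)
      if nAmp > 0
         cPair = left(cRest, nAmp - 1)     // text before the next ampersand
         cRest = substr(cRest, nAmp + 1)   // everything after it
      else
         cPair = cRest                     // last (or only) pair
         cRest = ""
      endif
      nEq = at("=", cPair)
      if nEq > 1                           // skip pairs that have no name
         aa[left(cPair, nEq - 1)] = substr(cPair, nEq + 1)
      endif
   enddo
   return aa
```

With the example link above, aData = parseQuery(getenv("QUERY_STRING")) gives an array in which aData["name"] is "value" and aData["name1"] is "value1".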
Although passing data through the query string is among the easiest ways to submit CGI data, there are a few drawbacks to this approach. First, there is a limit on the length of the query string. The limit is not set by CGI itself but by the software that handles the URL: browsers, proxies and web servers each cap the length of a request, and the HTTP/1.1 specification cautions against depending on URLs longer than 255 bytes because some older software may not support them. So you can't count on passing very much data this way.
Second, when you place data in the URL yourself, you are responsible for URL encoding. As we noted in Session One, spaces must be converted to + signs, and punctuation characters must be escaped with the % sign and hexadecimal digits.
Third, the URL, including the query string, is collected in the access logs maintained at the server. If your access logs are public, you may not object to having your hits recorded, but your data might contain information you'd prefer not to expose.
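The URL-encoding rules mentioned above (spaces as + signs, other characters as % followed by two hexadecimal digits) also have to be undone when a program reads a value from the query string itself. The following sketch shows one way to decode such a value. urlDecode is a hypothetical function name, and the sketch assumes the input is well-formed.

```dbl
// Hypothetical helper: decode a URL-encoded value.
function urlDecode(cIn)
   local cOut, i, c
   cOut = ""
   i = 1
   do while i <= len(cIn)
      c = substr(cIn, i, 1)
      do case
      case c == "+"
         cOut += " "                                  // + stands for a space
      case c == "%" and i + 2 <= len(cIn)
         cOut += chr(htoi(substr(cIn, i + 1, 2)))     // %XX is a hexadecimal character code
         i += 2                                       // skip the two hex digits
      otherwise
         cOut += c
      endcase
      i++
   enddo
   return cOut
```

For example, urlDecode("John+Smith%21") returns "John Smith!".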
Because of these limitations, another method of transmitting data to a CGI program was developed, and is now the most common and recommended method. The POST method sends data to a program's Standard Input. It's less public (it's not reported in the server logs), the web browser automatically encodes the data, and in principle there are no length limitations. On the other hand, you cannot supply POST data to a program directly in the URL, as can be done with the query string or path info. You must use a web form to POST data.
You can submit the following form to the showEnv applet to see the values associated with the relevant environment variables. You will see that "POST" is the value for the REQUEST_METHOD environment variable and that the QUERY_STRING environment variable is empty. The form tag for this example is the following:
<FORM METHOD=POST ACTION="http://localhost/app/showEnv.exe">
DOCUMENT_ROOT = C:/Apache2/htdocs
REQUEST_METHOD = POST
SCRIPT_FILENAME = C:/Apache2/app/showenv.exe
SCRIPT_NAME = /app/showEnv.exe
CONTENT_TYPE = application/x-www-form-urlencoded
CONTENT_LENGTH = 27
QUERY_STRING =
REQUEST_URI = /app/showEnv.exe
HTML forms that use the POST method send their encoded information through standard input. To access this data in a dBASE program you must use a file object and read the data as if it were stored on the hard drive. To determine the number of bytes to read, you can use the CONTENT_LENGTH environment variable. In the above example this value is 27. The dBL code to read this information might look like the following:
cMethod = upper(getEnv("REQUEST_METHOD"))
if cMethod == "POST"
   // CONTENT_LENGTH gives the number of bytes waiting on standard input
   nLen = val(getEnv("CONTENT_LENGTH"))
   fIn = new file()
   fIn.Open("StdIn", "RA")
   cEnv = fIn.Read(nLen)
endif
The dBASE WebClass library makes working with the data stuffed in environment variables quite easy. Although we will explore the details of this library in the Sessions that follow, here we should point out that this library does much of the work we have been reviewing. Normally, your CGI application will call a method called "Connect()" from the WebClass library. This method performs a number of important tasks.
First, the method establishes a communication channel with Standard In. Then it determines whether the CGI request uses the GET or POST method and, based on that determination, reads the incoming data. Because the data is URL encoded, the WebClass library then decodes the CGI escape characters embedded in the submitted data.
Next, all the name/value pairs passed in this data are added to an associative array, which you can easily access in your CGI program. Finally, the Connect() method opens a communication channel with Standard Out so that you have a route to send your response page.
There is no doubt that the WebClass library saves the developer a good deal of time by reading and formatting the data sent to your CGI program.
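As a preview, a program built on the library might be as short as the sketch below. Only Connect() is named in this Session. The file name WebClass.cc, the class name CGISession and the stream methods are assumptions here and should be checked against the library shipped with your copy of dBASE Plus.

```dbl
// Sketch only: every name other than connect() is an assumption (see above).
set procedure to WebClass.cc additive
oCGI = new CGISession()
oCGI.connect()                     // reads the GET or POST data into the array

cName = ""
if oCGI.isKey("name")              // "name" is the field used in the earlier examples
   cName = oCGI["name"]
endif

oCGI.streamHeader("Response")      // assumed helpers that write the response page
oCGI.streamBody("name = " + cName)
oCGI.streamFooter()
quit
```

The point of the sketch is the shape of the program: one call replaces the manual reading, splitting and decoding steps reviewed in this Session.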
Another method of sending data to a CGI program is through the PATH_INFO environment variable. Similar to the query string, path info is whatever comes after the program name in the URL. You need to start the path info with a slash (/) to let the web server know where the program name ends.
Try the following hyperlink. It contains both path info and a query string.
REQUEST_URI = /app/showEnv.exe/uploads/dir1/?name=value&name1=value1
PATH_INFO = /uploads/dir1/
QUERY_STRING = name=value&name1=value1
SCRIPT_NAME = /app/showEnv.exe
DOCUMENT_ROOT = C:/Apache2/htdocs
PATH_TRANSLATED = C:\Apache2\htdocs\uploads\dir1
You can see from the output that "/uploads/dir1/" is listed as the value of the PATH_INFO environment variable, and that "name=value&name1=value1" is the query string. You must supply the path info first and the query string second in the URL.
Path info need not be a path to any particular file or directory. You can use it to convey constant information to your CGI programs independent of the information the client sends. Nonetheless, naming a file or directory was its original intent and is still the most common use. For example, many hit counter scripts are installed for system-wide use. If one program serves many users, the program must be told which page is to be counted. Path info will often be the (file system or URL) path to the particular file that contains the current count. Path info has the same disadvantages as the query string. It is not automatically URL encoded, it is subject to the same length limitations, and it is reported in the server logs.
You can also use the extra path info to access the web server's virtual-to-physical path translation. You can send a virtual path as extra path information, so your CGI program can use the path information to access a file on your server machine. This means the server provides the physical path name corresponding to that virtual path in the environment variables by using path translation. Notice in the above printout that the PATH_TRANSLATED environment variable is the concatenation of the DOCUMENT_ROOT and PATH_INFO variables (with the slashes converted to backslashes on a Windows server).
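The relationship between the three variables can be mimicked in a few lines. The sketch below only illustrates the concatenation described above. A real program would simply read PATH_TRANSLATED, since the server may apply aliases and other rules that this hypothetical translatePath function knows nothing about.

```dbl
// Hypothetical helper: join the document root and path info, Windows style.
function translatePath(cRoot, cPathInfo)
   local cPath, nPos
   cPath = cRoot + cPathInfo
   nPos = at("/", cPath)
   do while nPos > 0
      cPath = stuff(cPath, nPos, 1, "\")   // convert one slash to a backslash
      nPos = at("/", cPath)
   enddo
   return cPath
```

Calling translatePath("C:/Apache2/htdocs", "/uploads/dir1/") returns "C:\Apache2\htdocs\uploads\dir1\", which matches the printout apart from the trailing backslash.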
It turns out that you can add extra path info (and a query string) to a URL used to submit a form. The next example form uses METHOD=POST (so the form data is passed to the CGI program through standard input), but it also includes extra path info in the ACTION attribute of the FORM tag.
<FORM METHOD=POST ACTION="http://localhost/app/showEnv.exe/uploads/dir1/">
DOCUMENT_ROOT = c:/Apache2/htdocs
REQUEST_METHOD = POST
SCRIPT_FILENAME = c:/Apache2/htdocs/app/showenv.exe
SCRIPT_NAME = /app/showEnv.exe
CONTENT_TYPE = application/x-www-form-urlencoded
CONTENT_LENGTH = 27
QUERY_STRING =
REQUEST_URI = /app/showEnv.exe/uploads/dir1/
PATH_INFO = /uploads/dir1/
PATH_TRANSLATED = c:\Apache2\htdocs\uploads\dir1
Using Server Authentication
The most common way to screen web clients with Apache is to use basic HTTP authentication. Server authentication can act as a wall to keep unauthorized people out of your sites. It can also be used to customize a site by adding user-specific information to the web pages. To control access to a folder on your web site, you must include the Apache user authentication directives within a directory block. The core configuration directives are as follows:
<Directory "c:/apache/htdocs/app">
   AuthType Basic
   AuthName "My Important Site"
   AuthUserFile "c:/apache/users/users.txt"
   require valid-user
</Directory>
The AuthType directive tells Apache to use basic HTTP authentication. This system is pretty simple. Usernames and passwords are stored in a text file that resides on the same computer that is running the web server. Moreover, basic authentication uses no encryption (the credentials are only Base64-encoded), and therefore passwords are effectively sent in plain-text form.
The AuthName directive assigns a name to the area being protected. You can use any name you want for this area. This text will be sent to the web browser and displayed in the logon dialog box.
The AuthUserFile directive sets the path to the password file. This file must be a simple text file with one username and password on each line, in the format "username:password". You should add a blank line after the list of names. Also note that it is a good idea to locate this file outside of your server's web space. A sample file, shown here with placeholder names, might look like this:
username1:password1
username2:password2
The require directive (note the case) lets you specify which users are allowed access to a protected area. The directive has three valid arguments: user, valid-user, and group. In the sample above, we use the valid-user argument which admits all users whose names are listed in the password file.
The Apache Basic Authentication system also supports organizing users into groups. To implement this feature, create a text file with the following format:
groupname: username1 username2 username3
Use the AuthGroupFile directive to set the path to this file:
AuthGroupFile "c:/apache/users/groups.txt"
Finally, use the group argument in the require directive:
require group groupname
The following are two examples of directory blocks that use groups. In the first example, only users who belong to the group "myGroup" can access the folder. In the second example, any user who belongs to the group "myGroup" can access the folder, and so can the user "nuwer", who may not be a group member.
AuthType Basic
AuthName "My Important Site"
AuthUserFile "c:/apache/users/users.txt"
AuthGroupFile "c:/apache/users/groups.txt"
require group myGroup
AuthType Basic
AuthName "My Important Site"
AuthUserFile "c:/apache/users/users.txt"
require user nuwer
AuthGroupFile "c:/apache/users/groups.txt"
require group myGroup
Authorization Error Message. You can customize any of the error messages that Apache handles with the ErrorDocument directive. By including this directive within the directory block, you can create a different customized error page for each secure folder on your server. The following example returns /ErrorFiles/Error401.htm to the browser when the user fails to authenticate.
ErrorDocument 401 /ErrorFiles/Error401.htm
Notice that this page is located in a non-protected folder on the server. Since the user has failed the authentication process, the web server will not give access to files or documents in the secure folder. This can, however, be confusing. Even though the user does not have access to the files in the protected folder, that folder is still treated as the default. So if your error document has images or other objects and they are not referenced with a full path, the HTML page will not be able to use them.
CGI Environment Variables. One of the standard CGI environment variables is REMOTE_USER. Once a user has been authenticated by the web server, this variable contains the user's username. Moreover, the web server passes this variable to your program each time it is called. Thus it is easy to grab this variable and use it to look up information about this user.
cLookFor = getenv("REMOTE_USER")
q = new query()
q.sql = 'Select * from "c:\apache\data\users.dbf"'
q.active = true
q.rowset.indexName := "username"
q.rowset.findkey(upper(cLookFor))
You can now customize your pages, for example by printing the user's full name at the top of the page.
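A slightly fuller version of the lookup might check that the user was actually found before using the row. The sketch below builds a greeting string under two assumptions that are not stated in this Session: the users table has a character field called "fullname", and the "username" index is built on upper-case user names.

```dbl
// Sketch: build a personal greeting from REMOTE_USER.
// The field name "fullname" is hypothetical.
cLookFor = getenv("REMOTE_USER")
cGreeting = "Welcome"
if not empty(cLookFor)                 // empty when the folder is not protected
   q = new Query()
   q.sql = 'Select * from "c:\apache\data\users.dbf"'
   q.active = true
   q.rowset.indexName := "username"
   if q.rowset.findKey(upper(cLookFor))
      cGreeting += ", " + trim(q.rowset.fields["fullname"].value)
   endif
   q.active = false                    // close the query when done
endif
```

If the user is not in the table, or the folder is not protected, cGreeting simply stays "Welcome", so the page can still be displayed.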