A Tour of HTML Forms and CGI Scripts
Overview
Forms on a Web Page
A Sample Web Form
A CGI Script in Perl
The Preamble

Reading the CGI Data

Perl Variables

Building a Response Page

Script Summary

Installing a CGI script
Exercise 1: Your Form, My Script
Exercise 2: Your Form, Your Script
Other Readings
This is a quick introduction to HTML Forms and CGI Scripts. It reviews
some of the common form elements and then describes how a simple CGI
script interacts with a web form. As a prerequisite, you should already
be acquainted with HTML, since I use it without explanation here. To
create your own CGI scripts, you'll also need to know some programming
language. I use Perl here since it's a fairly readable language.
There are risks associated with CGI scripts. As you'll see, you are
essentially allowing anyone on the Internet to execute a program on
your system, as often as they like. If you write a script with flaws,
it can pose a serious security risk to your account, your files or the
entire system, and can also be a massive drain on system resources.
This document is a simplified introduction to the elements of web
forms and CGI scripts; it's not complete and is not guaranteed to be
accurate.

Overview

It's important first to understand what HTML forms and CGI scripts
are. They are very different, but work closely together. A form is
simply a web page with some additional markup tags to instruct a web
browser how to display the various form elements, such as checkboxes,
selection lists, buttons and user-editable text areas. However, the
web page itself does not process the data, nor does the web server,
which doesn't know what you'd like to do with the user's answers. A
separate program or script must process that data, in whatever way
you wish.
HTML forms are just markup tags on a web page. CGI (Common Gateway
Interface) is the protocol that the web server uses to pass the data
from the form to your script. When the user submits her answers on a
form, the browser bundles them up and sends them to the web server,
which passes them on to your script for processing. A CGI script is
any program which knows how to read that bundle of data.
Some important points are:
- The web page itself does not process the data entered on the
form. Neither does the web server. There must be a separate script
which the web page tells the server to send the data to, and which
knows how to speak the language (CGI) that the server will use to
send the data. You need both the web page and the script.
- For security reasons, most web servers will not execute a file
(even a script or program with the right permissions) unless it is
in a designated directory, or sometimes has a designated filename
extension. Even if you can put a web page on your
system, you may not have write permission in that directory.
You'll have to ask your webmaster where that directory is and how to
write to it. Without this, your forms have nothing to process them,
unless you use a script that's already installed.
- Your script processes the data however you want, and then
almost always returns an acknowledgement page. So the script must
build up and return the HTML source for a web page. You
occasionally see C programs doing this because they are faster to
execute, but shell and Perl
scripts are easier for this kind of text manipulation and are more
commonly used for CGI scripts.
In what follows, we'll first describe the various form tags you can
use in your web page and give the HTML source for a sample page using
most of them. The page works so you can try it
out. We'll then go through the Perl CGI script which does the
processing for the page.

Forms on a Web Page

It's pretty easy to place things like radio
buttons, selection and check boxes, and interactive text areas onto
your web page. It's a little harder to do anything with them, but
we'll get to that later. Right now, we'll talk about how to
incorporate form elements into a web page.
Form elements on a web page are usually enclosed in a single pair of FORM
markup tags, like this:
|
<FORM ACTION="http://www.site/your_script"
METHOD=POST>
...
</FORM> |
All the form tags described below should be inside this FORM
region. The opening form tag specifies an ACTION attribute, which
gives the URL of the CGI script you want to process the user's form
data when she submits it. This is where you supply the link between
your form and your script. The form tag also specifies a
method for sending the data, which can be either GET or POST. The
latter keeps the form data out of the URL and places no practical
limit on its length, so it is recommended.
Inside a FORM region, you can have any HTML elements you wish,
including text, images or links. You can also have various form
elements. Here's a list of the major form elements. Each has an
example of what it looks like on a page, a template for the
corresponding markup tag that's used in the source for your page, and
a brief comment or description.
For most of these markup tags, you'll specify attributes, such as
type, display characteristics, and especially a name and a value. The
name is essentially a variable name, which is passed to your script so
it can refer to the information (the value) the user entered for that
variable. The name can be anything as long as it's different for each
kind of information you want the user to supply.

A Sample Web Form

Here's the HTML source for a simple web page with many
of these form elements. You can try out the page in action.
|
<html><head><title>Your Title</title></head>
<body><h1>Your Heading</h1>
<form action="http://www.halcyon.com/sanford/cgi/perl_form.cgi"
method=post>
Type something here:
<input type="text" name="some_text"
size=30 maxlength=50><p>
Here's a checkbox:
<input type="checkbox" name="box"> <p>
Select one of these:
<select name="choice">
<option selected> Ha
<option> He
<option> Hi
<option> Ho
</select> <p>
Now some radio buttons:
<input type="radio" name="radbut"
value="oop" checked> Oop
<input type="radio" name="radbut"
value="eep"> Eep
<input type="radio" name="radbut"
value="urp"> Urp <p>
<hr>
Finally, you need to submit it:
<input type="submit" value="Send it">
or
<input type="reset" value="Erase all"> <p>
</form>
</body></html> |

A CGI Script in Perl

When the user submits her form data, the browser bundles all her
answers up in a package and sends it to the URL that was
specified in the ACTION attribute. The web server at that address
hands the bundle to the script. CGI is the protocol used for this
handoff, and a script that knows how to unpack the bundle is a CGI script.
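For the curious, here's roughly what that bundle looks like. This is only a minimal sketch: the function names url_encode and encode_form are ones I made up, the field values are invented, and I'm assuming the usual encoding of form data (spaces become +, special characters become %XX, and name=value pairs are joined with &), which is the encoding that the ReadParse function shown later in this tour undoes.

```perl
use strict;
use warnings;

# Hypothetical helper: encode one name or value the way a browser
# typically does when it submits a form.
sub url_encode {
    my ($text) = @_;
    # Everything except letters, digits, '-', '_', '.', '*' and space becomes %XX.
    $text =~ s/([^A-Za-z0-9\-_.* ])/sprintf("%%%02X", ord($1))/ge;
    # Spaces become plus signs.
    $text =~ tr/ /+/;
    return $text;
}

# Hypothetical helper: join name=value pairs with '&'.
# Takes a flat list: name, value, name, value, ...
sub encode_form {
    my @pairs = @_;
    my @parts;
    while (@pairs) {
        my ($name, $value) = splice(@pairs, 0, 2);
        push @parts, url_encode($name) . "=" . url_encode($value);
    }
    return join("&", @parts);
}

# Invented answers for the sample form's fields.
print encode_form(
    some_text => "Hi there & bye",
    box       => "on",
    choice    => "Ha",
    radbut    => "oop",
), "\n";
```

With POST, a string of this shape arrives on the script's standard input; with GET, it arrives in the QUERY_STRING environment variable.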
You needn't be concerned about the details of CGI since in Perl (as in
shell and C) there are packages which can read this bundle for you and
return each item of the user's form data in special variables which you can
manipulate or process in any way you wish. The best known packages for
Perl are cgi-lib.pl,
written by Steven E. Brenner, and CGI.pm,
by Lincoln Stein. These packages contain a number of functions which
are very useful for CGI scripts.
I'm not going to describe all those functions, or even show how to
include them in your script. Including a package is a somewhat
advanced feature of Perl. Also, these packages are popular and there
are a lot of versions around, many of them older versions, which may
not work as I shall describe. Instead, I'll give you the most useful
of those functions, and show you how to copy and use it in your
script.
So here's a line by line account of a simple Perl CGI script.
Remember, you can try out the page in
action.

The Preamble

A Perl CGI script should always begin with the
following lines:
|
#!/usr/bin/perl
# perl_form - a simple illustration of forms and Perl CGI |
The first line is mandatory for any Perl script that is run directly as
a program, as CGI scripts are; it tells the system where to find Perl. You may need to
change the path to Perl for your site. The second line is a comment,
giving the name of the script and what it does. Put lots of comments
in all your scripts and programs. Everything to the right of # is
ignored by Perl, as are blank lines.
|
print "Content-type: text/html\n\n"; |
This line begins some work. The web server knows it is executing a
script, but has no idea what to expect in return. So the script must
first tell the server what is coming, usually a web page of some
sort. The print command simply writes back to whoever executed it, the
web server in this case. It sends the magic words indicating a web
page is about to follow; the two \n characters end that line and add
the blank line that must separate it from the page itself. After the
server sees this, it will pass the
contents of further print statements back to the browser. This is how
a script can return a web page.

Reading the CGI Data

To read the user's form data into your script, it's as simple as this:
|
&ReadParse; |
This is the really useful function from the cgi-lib.pl
package. It reads all the form data from the user and puts them into a
Perl variable called %in . The Perl variable called
%ENV has some good data in it as well. I'll talk about
how to use these in a moment.
In order to use a function, you must define it somewhere. Perl has
special syntax for the use and definition of functions. To
use a Perl function, preface it with an ampersand
(&), as we did above. To define a function, use the
special keyword sub , then the name of the function,
then the block of code which defines the function, enclosed in
braces. Something like this:
|
sub ReadParse {
... a lot of code ...
} |
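To make the call and definition syntax concrete, here's a minimal sketch with a made-up function called page_title (it is not part of cgi-lib.pl). Notice that the definition can come after the place where the function is used. I've also added use strict and my declarations, which the script in this tour does without.

```perl
use strict;
use warnings;

# Call the function with an ampersand, the style used for &ReadParse.
my $first = &page_title("The Response");

# The same call without the ampersand, using parentheses.
my $second = page_title("The Response");

print "$first\n";
print "$second\n";

# The definition, placed at the end of the script.
# Arguments arrive in the special array @_.
sub page_title {
    my ($text) = @_;
    return "<title>$text</title>";
}
```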
Programmers will often place the definitions of their functions at the
end of the script, and we will too, in the Summary section below. I'm not going to explain
how this function works here (though more advanced or bold readers can
view a separate tutorial devoted to reading CGI form data). It's fairly
elegant code, but you really need to know a few of the gory details of
Perl to understand it. Just copy and paste it onto the end of your
script and use it happily somewhere near the beginning.
Use it how? So the form data is in something called
%in . What's that?

Perl Variables

There are three kinds of variables in Perl, distinguished by the
character preceding the variable's name:
- $ : a scalar, like $in. A variable which can contain a string, an
integer or a real number.
- @ : an array, like @in. A simple list of any kind of scalar data.
E.g., the first one, the second one, ...
- % : an associative array, like %in. A list, but instead of being
indexed by numerical order, you refer to the items in the list by any
set of keywords of your choosing. E.g., the red one, the green one,
the blue one, ...
So %in is an associative array which contains the data
the user submitted on the form. The keywords used to access the
elements of this array are just the variable names you specified on
the web form page. If you look at the source for
The Sample Web Form
above, you'll recall that we used variable names of: some_text, box,
choice, and radbut. Consequently, the values that the user submitted
for each of the form elements are stored in $in{some_text},
$in{box}, $in{choice}, $in{radbut} .
You'll notice that to access a particular value in an associative
array, you put the keyword inside braces following the name of the
array. You also use a $ in front instead of a %. Many people find this
confusing, thinking you should use a % instead, but it has a certain
logic when you note that the particular value you want is in fact a
scalar, even though it's coming from an array. You still use % when
you want to refer to the array as a whole, as we'll see.
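Here's a small sketch of the three kinds of variables side by side. The values are invented stand-ins for what a user might submit on the sample form, and the use strict line and my declarations are my own additions that the script in this tour does without.

```perl
use strict;
use warnings;

my $greeting = "hello";                    # scalar: one string or number
my @options  = ("Ha", "He", "Hi", "Ho");   # array: ordered list, indexed from 0
my %in       = (                           # associative array: indexed by keyword
    some_text => "hello",
    choice    => "Ha",
    radbut    => "oop",
);

# A single element is a scalar, so it is written with $ in both cases.
print "Greeting: $greeting\n";
print "Second option: $options[1]\n";
print "Choice: $in{choice}\n";

# The array as a whole keeps its @ (or %).
print "Number of options: ", scalar(@options), "\n";
print "Number of form fields: ", scalar(keys %in), "\n";
```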

Building a Response Page

We can now start processing the form data in the script. In this case,
we'll simply return a page reporting what the data was. To do this, we
first need to start building the HTML source for a web page, to return
to the web server that called the script. Recall Perl's
print function does this:
|
print "<title>The Response</title><h1>The Response</h1><hr>";
print "Here is the form data:<ul>"; |
We want an unordered list of the variable names and corresponding
values that the user submitted. These are just the keywords and their
corresponding values in %in . Rather than coding each keyword
by hand, we can use Perl's built-in functions and control loops, which
make this easy to do and also mean this script will work with
any web form. So you can specify this script as your target
ACTION in the form tag when you try building
your own web form. (That's an important point -- read it again. You'll
find this simple script is quite useful for debugging a web form.)
Perl's built-in function, keys , returns a list of all
keywords in an associative array. Then, Perl's foreach
loop will cycle through every element in a list, and execute a
block of code once for each. Here it is:
|
foreach $key (keys %in) {
    print "<li>$key: $in{$key}";
}
print "</ul>"; |
This says, set the scalar variable, $key , to be
successively, each of the keywords in the associative array,
%in . Then each time, print a <li> tag
followed by the keyword, a colon and a space, then the item in the
associative array corresponding to that keyword. This works, no matter
how many keywords (named variables in your form) there are, or what
they are called. Finally at the end, print one
</ul> to close the unordered list.
Note carefully the variable substitution that occurs in the
print statement. You don't literally print the characters
"$key" since it's a scalar variable. Perl finds the value of
that variable and prints that instead. If you actually wanted to print
out "$", you would need to "escape" it by using "\$" inside the print
statement, so Perl knows you don't want to do variable substitution.
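The same is true for "@", since arrays are substituted as well. A whole
associative array ("%") is not substituted inside a string, so "%" needs
no escaping. On the other hand, Perl does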
print literally any characters it doesn't recognize as a variable,
such as <li>, and the colon and space. Perl makes printing very
easy.
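A short sketch may make the substitution rules easier to see. The variable names, the price and the mail address are invented for illustration, and the use strict line and my declarations are my own additions.

```perl
use strict;
use warnings;

my $key     = "radbut";
my %in      = (radbut => "oop");
my @options = ("Ha", "He", "Hi", "Ho");

my $substituted = "<li>$key: $in{$key}";                  # scalars are replaced by their values
my $list        = "Options: @options";                    # array elements are joined with spaces
my $escaped     = "Price: \$5, mail: me\@example.com";    # backslashes keep $ and @ literal
my $single      = 'Raw: $key and @options';               # single quotes never substitute

print "$substituted\n$list\n$escaped\n$single\n";
```

Single quotes are a handy alternative when a string contains many $ or @ characters and needs no substitution at all.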
I mentioned above that another associative array, called
%ENV , has interesting information as well. As you might
guess, this is a set of environment variables, which the web server
fills in partly from what browsers send
when they request a page. In fact, browsers send this information
for every web page, not just pages with forms, but you
need a CGI script to read it. Are you curious about what your web
browser is saying about you behind your back? Let's find out:
|
print "and here are all the environment variables:<ul>";
foreach $key (keys %ENV) {
    print "<li>$key: $ENV{$key}";
}
print "</ul>"; |
And that concludes our CGI script. If you haven't tried out the page
in action yet,
you should now.
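One caution about the loops above: they print whatever the user typed, exactly as typed. If the user types HTML tags, the browser will treat them as part of your page, which is one of the risks mentioned at the start of this tour. Here's a minimal sketch of one common precaution, replacing the characters that are special in HTML before printing. The function name escape_html and the sample value are my own inventions, not part of cgi-lib.pl.

```perl
use strict;
use warnings;

# Hypothetical helper: make text safe to place inside an HTML page.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;     # must come first, so the entities below are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Invented form data standing in for what ReadParse would put in %in.
my %in = (some_text => '<b>bold</b> & "quoted"');

foreach my $key (keys %in) {
    print "<li>", escape_html($key), ": ", escape_html($in{$key}), "\n";
}
```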

Summary

To bring everything together in one place, here is the script again,
including the definition of the ReadParse function.
|
#!/usr/bin/perl
# perl_form - a simple illustration of forms and Perl CGI

print "Content-type: text/html\n\n";
&ReadParse;

print "<title>The Response</title><h1>The Response</h1><hr>";
print "Here is the form data:<ul>";
foreach $key (keys %in) {
    print "<li>$key: $in{$key}";
}
print "</ul>";

print "and here are all the environment variables:<ul>";
foreach $key (keys %ENV) {
    print "<li>$key: $ENV{$key}";
}
print "</ul>";

# Adapted from cgi-lib.pl by S.E.Brenner@bioc.cam.ac.uk
# Copyright 1994 Steven E. Brenner
sub ReadParse {
    local (*in) = @_ if @_;
    local ($i, $key, $val);

    if ( $ENV{'REQUEST_METHOD'} eq "GET" ) {
        $in = $ENV{'QUERY_STRING'};
    } elsif ($ENV{'REQUEST_METHOD'} eq "POST") {
        read(STDIN,$in,$ENV{'CONTENT_LENGTH'});
    } else {
        # Added for command line debugging
        # Supply name/value form data as a command line argument
        # Format: name1=value1\&name2=value2\&...
        # (need to escape & for shell)
        # Find the first argument that's not a switch (-)
        $in = ( grep( !/^-/, @ARGV )) [0];
        $in =~ s/\\&/&/g;
    }

    @in = split(/&/,$in);

    foreach $i (0 .. $#in) {
        # Convert plus's to spaces
        $in[$i] =~ s/\+/ /g;

        # Split into key and value.
        ($key, $val) = split(/=/,$in[$i],2); # splits on the first =.

        # Convert %XX from hex numbers to alphanumeric
        $key =~ s/%(..)/pack("c",hex($1))/ge;
        $val =~ s/%(..)/pack("c",hex($1))/ge;

        # Associate key and value. \0 is the multiple separator
        $in{$key} .= "\0" if (defined($in{$key}));
        $in{$key} .= $val;
    }

    return length($in);
} |
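One detail in ReadParse deserves a note. When a form sends several values under the same name, they are joined into a single string with \0 between them. Here's a minimal sketch of how you might take them apart again. The function name split_values and the field name toppings are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return every value submitted under one name.
# Takes the combined string that ReadParse stores, e.g. $in{'toppings'}.
sub split_values {
    my ($combined) = @_;
    return () unless defined $combined;   # the field was not submitted at all
    return split(/\0/, $combined);
}

# Simulate what ReadParse stores when a form sends toppings=ham&toppings=egg
my %in = (toppings => "ham\0egg");

my @toppings = split_values($in{'toppings'});
print join(", ", @toppings), "\n";
```

The comments in ReadParse also describe a way to test a script without a web server: run it from the command line with the form data as an argument, for example perl perl_form.cgi some_text=hello\&box=on (assuming you saved the script as perl_form.cgi).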

Installing a CGI Script

Unfortunately, I can't tell you precisely how to install a CGI script
on your web server, or even whether it's possible. Each server is
configured a little differently. You must ask your system
administrator if CGI is enabled on your server (or read the
documentation yourself) and if user-installed CGI scripts are
permitted (some systems permit only the administrator to install CGI
scripts). Then you'll typically need to find out: what's the path to
Perl (needed for the first line of a Perl script), where to place the
script, what to call it, what permissions to set for the script, and
if your script needs to write to some files, how to set those files'
permissions.
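Once you have those answers, a tiny test script is a handy way to check them before installing anything larger. Here's a minimal sketch. The file name hello.cgi and the function hello_page are made up, and the path on the first line is only the common default, so adjust it to whatever your administrator tells you.

```perl
#!/usr/bin/perl
# hello.cgi - hypothetical minimal script for checking a CGI installation
use strict;
use warnings;

# Build the whole response: the Content-type line, a blank line, then the page.
sub hello_page {
    my $page = "Content-type: text/html\n\n";
    $page .= "<title>Hello</title><h1>Hello from CGI</h1>\n";
    $page .= "<p>The script ran at " . scalar(localtime) . "\n";
    return $page;
}

print hello_page();
```

If the browser shows the heading, the path to Perl, the location and the permissions are all right. If it shows the Perl source instead, the server is treating the file as an ordinary page rather than as a script.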
I can briefly illustrate how Apache, a free and popular web server
for Unix but only one of many, might be configured. Apache has
three configuration files, httpd.conf, srm.conf and
access.conf. On my system, all are located in
/etc/httpd/conf/, though they could be anywhere, often
somewhere under /usr/ or /usr/local/.
The second of these files is for server resource management, and
specifies how the server should handle requests from browsers. If a
browser requests a web page, the server returns the html source for
that page, that is, the contents of the file. But if a browser
requests a CGI script, the server must know it's not supposed to
return the contents of the file containing the script, but instead
should run that script as a program, and return to the browser the
results of the program. The server must be told the difference between
an HTML file and a CGI script.
There are a couple of different ways to do this in srm.conf,
as the following two server directives in my configuration
show:
|
ScriptAlias /cgi-bin/ /home/httpd/cgi-bin/
AddHandler cgi-script .cgi |
The first line tells the server that any file it finds in the
directory /home/httpd/cgi-bin/ , which browsers reach through URLs
beginning with /cgi-bin/, is a CGI script to be run as a
program when requested by a browser. The second line tells the server
that any file (anywhere under the server's document root
directory--specified by another server directive) whose name ends in
.cgi is a script to be executed. The ScriptAlias
directory and this .cgi file name extension could be
anything. If your server isn't configured with these directives (or
they have been commented out), then you can't run CGI scripts on your
server. If it has only the first directive, but not the second, and if
you don't have write permission for the cgi-bin directory, then your
system administrator can install CGI scripts but you can't.

Exercise 1: Your Form, My Script

You can use this script, even if you're not on my site. (Web browsers
usually don't care where they send a request, and my server will
accept yours.) So try it with your own form. Compose a web form on
your server and specify
|
http://www.halcyon.com/sanford/cgi/perl_form.cgi |
as the action attribute in your opening <form>
tag. No matter what or how many form elements you use in your own web
page, or what variable names you use for them, the script will report
the form and environment data, just as it did above. Of course, that's
all it will do. For anything more interesting, you'll need to write
your own script.

Exercise 2: Your Form, Your Script

If you have write access to your web server's
cgi-bin directory, you can copy this script to your server and use it
there. You'll need to make your script world-readable and
world-executable: chmod 755 </path/script_name> at
the Unix prompt should do that. You'll also need to ask your
webmaster for the directory or naming conventions and the URL of your
script, which will typically be different from the path to the
file. Then use that URL in the action attribute of your
web page's <form> tag.
To make things more interesting, specialize the script so it only
reports form data specifically mentioned in your web form. For example,
if your form tags have NAME attributes of my_first_tag
and my_second_tag , you could use a print statement in your
Perl script like this:
|
print "<ul>";
print "<li>my_first_tag: $in{'my_first_tag'}";
print "<li>my_second_tag: $in{'my_second_tag'}";
print "</ul>"; |
Try reformatting them, putting them in tables, adding images. Add a
link which points to the contents of the
$ENV{'HTTP_REFERER'} variable. Remember also to copy and
paste the definition of the ReadParse function from
above into your script.
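As a starting point for the table idea, here's a minimal sketch. The function name form_table is made up, and the hash of sample values stands in for the %in that ReadParse would fill. Values are inserted verbatim here to keep the sketch short; a real script should escape HTML special characters first.

```perl
use strict;
use warnings;

# Hypothetical helper: build an HTML table from chosen form fields only.
# $data is a reference to a hash such as %in; @names lists the fields to report.
sub form_table {
    my ($data, @names) = @_;
    my $html = "<table border=1>\n";
    foreach my $name (@names) {
        my $value = defined $data->{$name} ? $data->{$name} : "(not supplied)";
        $html .= "<tr><th>$name</th><td>$value</td></tr>\n";
    }
    return $html . "</table>\n";
}

# Invented form data; the field called extra is deliberately left out of the report.
my %in = (my_first_tag => "red", my_second_tag => "blue", extra => "ignored");

print form_table(\%in, "my_first_tag", "my_second_tag");
```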

Other Readings

For the most authoritative information on CGI, see the collection of references
assembled by the folks at the World Wide Web Consortium. I know of a
few other CGI tutorials on the Web: Learn to Write
CGI-Forms and a CGI and Perl Tutorial.
I like Building-blocks
for CGI Scripts in Perl; it has site-specific material but is
quite good. A sample
chapter from a book on CGI scripts in shell and Perl is available
on the Web. There is also Carlos'
Forms Tutorial, which discusses forms but not CGI. Yet Another HTCYOHP Home
Page discusses CGI scripts written in C. A good FAQ on CGI is A CGI Programmer's
Reference.
You can now continue to the more advanced material in CGI/Perl Tips, Tricks and Techniques or
return to the CGI Resource index.

Copyright 1995-98,
Sanford Morton
Last modified: Tue Mar 24 04:04:56 PDT 1998