packages icon
   cgihtml Documentation
   Eugene Eric Kim, eekim@eekim.com
   v1.69, March 18, 1998

   Documentation for cgihtml, a set of CGI and HTML routines written in
   C.
   ______________________________________________________________________

   Table of Contents


   1. Introduction

      1.1 Where to Find cgihtml

   2. What is cgihtml?

      2.1 What are the advantages of cgihtml?
      2.2 Why C?
      2.3 Latest changes
      2.4 Files included in this package

   3. Installation

      3.1 Requirements
      3.2 Obtaining and unpacking the distribution
      3.3 Compiling the library
         3.3.1 Makefile variables
         3.3.2 Compiling for Win32
         3.3.3 Configuring File Uploadn
         3.3.4 Compiling and Installing
      3.4 Porting

   4. Using cgihtml

      4.1 Basic programming structure
      4.2 Compiling your program

   5. Routines

      5.1 cgi-lib.h
         5.1.1 Library Variables
         5.1.2 Library functions
      5.2 html-lib.h
         5.2.1 Library functions
      5.3 cgi-llist.h
         5.3.1 Library variables
         5.3.2 Library functions
      5.4 string-lib.h
         5.4.1 Library functions

   6. Example programs

      6.1 test.cgi
      6.2 query-results
      6.3 mail.cgi
      6.4 index-sample.cgi
      6.5 ignore.cgi

   7. Miscellaneous

      7.1 Release notes
      7.2 Future releases
      7.3 Credits


   ______________________________________________________________________

   1.  Introduction

   This documentation is also available at
   <http://www.eekim.com/software/cgihtml/cgihtml.html>

   Documentation last updated: March 18, 1998


   1.1.  Where to Find cgihtml

   You can download cgihtml (gzipped, UNIX compressed, or PKZipped) from
   one of the following sites:


      eekim.com
         <ftp://ftp.eekim.com/users/eekim/cgihtml/>

      Harvard Computer Society
         <ftp://hcs.harvard.edu/pub/web/tools/cgihtml/>

      Keith Bunge's Iowa State mirror site
         <ftp://aged.brenton.iastate.edu/>

   Useful information about cgihtml is available at the cgihtml home page
   <http://www.eekim.com/software/cgihtml/>, located at
   <http://www.eekim.com/software/cgihtml/>.


   2.  What is cgihtml?

   cgihtml is a collection of routines for parsing World Wide Web (WWW)
   Common Gateway Interface (CGI) input and outputting HyperText Markup
   Language (HTML).


   2.1.  What are the advantages of cgihtml?

   cgihtml simplifies the task of parsing CGI input and producing HTML
   output.  Tasks which would normally require several lines of C can be
   reduced to just a few.

   Additionally, I have attempted to include general routines which CGI
   programmers often find themselves using.  Consequently, some of the
   complexities of CGI programming are hidden.  On the other hand, if you
   want to know what's going on, source is included.


   2.2.  Why C?

   The purpose of CGI programs is to take data and manipulate it as the
   web programmer desires.  Since CGI programs are often dealing with
   text manipulation, Perl or other scripting languages is an ideal way
   of producing CGI scripts.  (Perl programmers should check out Lincoln
   Stein's CGI.pm <http://www-
   genome.wi.mit.edu/ftp/pub/software/WWW/cgi_docs.html> or Steve
   Brenner's cgi-lib.pl <http://www.bio.cam.ac.uk/web/form.html>.)

   However, interpreted scripting languages tend to be relatively large.
   This rarely has a major effect on the performance of your server
   (unless you are a high-traffic site).  However, if this is a concern
   of your's, a program written in C is often several times smaller than
   the equivalent program written in Perl.  There's definitely a
   performance improvement when using CGI programs written in C, although
   the performance is not always noticeable.
   Additionally, some servers (notably Netsite and Apache) have APIs so
   that CGI programs can be written as extensions to the server, rather
   than as separate programs.  This greatly improves performance,
   especially on high-traffic sites.  The best way to take advantage of
   these APIs is to write your programs in C.

   Or, you might fall under one of the following categories:


   o  You don't know Perl

   o  You don't like Perl

   o  You like C

   In which case, you will hopefully find cgihtml useful.


   2.3.  Latest changes

   See the CHANGES file, included in the distribution.


   2.4.  Files included in this package

   The following files are included in this package:


   3.  Installation

   3.1.  Requirements

   cgihtml was written for Unix machines in C, although it has been
   successfully ported to Windows 95 and NT, VMS, OS-9, and other
   operating systems.  All you need is a C compiler, and you should be
   set.

   By default, cgihtml assumes that the CGI source code goes in the cgi-
   src directory and the binaries in the cgi-bin directory.


   3.2.  Obtaining and unpacking the distribution

   You may find cgihtml.tar.gz at
   <ftp://ftp.eekim.com/users/eekim/cgihtml/>

   To unpack the distribution, you must first gunzip it (using the GNU
   gzip utility) and then untar it.  Copy the distribution into your CGI
   source directory, and try the following command:



        % gzip -dc cgihtml.tar.gz | tar xvf -




   cgihtml is also available in UNIX compressed (.Z) and PKZipped (.zip)
   format.


   3.3.  Compiling the library

   To compile the library, examine the Makefiles in the cgihtml and
   examples directories, and make sure you are satisfied with the
   variables.
   3.3.1.  Makefile variables

   INSTALLDIR in cgihtml's Makefile should point to your CGI source
   directory, while INSTALLDIR in your examples directory should point to
   your server's CGI binary directory.


   3.3.2.  Compiling for Win32

   If you're compiling for Win32 (ie. Windows 95/NT), make sure to
   uncomment the line with -DWINDOWS.


   3.3.3.  Configuring File Uploadn

   By default, the file upload directory is set to /tmp.  To change this
   value, uncomment #-DUPLOADDIR='"/tmp"' in the Makefile and replace
   /tmp with the directory of your choice.  Make sure that whichever
   directory you choose is surrounded by both the single and double
   quotes, ie: '"/foo/bar"'.


   3.3.4.  Compiling and Installing

   When you are satisfied with the Makefiles, type:



        % make cgihtml.a




   This will produce the file cgihtml.a.  To compile the library as well
   as all of the example programs, type:



        % make all




   To install the library and examples, type:



        % make install




   If you want to compile and/or install the example programs separately,
   change to the examples subdirectory and use make there.


   3.4.  Porting

   While compiling the libraries on various Unix machines, you may have
   trouble with the "ranlib" command.  If you system doesn't seem to have
   this command, you most likely don't need.  Set the RANLIB variable in
   the Makefile to "true".

   If you're compiling for Win32, make sure you use the -DWINDOWS
   directive when compiling.

   If you are compiling for DOS/16-bit Windows, VMS, or OS-9, you will
   need to change the filenames to support your OS.


   4.  Using cgihtml

   4.1.  Basic programming structure

   There are standard initialization things that you will want to do when
   using this library.  The following template should give you some idea
   of how to properly use this library.


   ______________________________________________________________________
   /* template using cgihtml.a library */

   #include <stdio.h>    /* standard io functions */
   #include "cgi-lib.h"  /* CGI-related routines */
   #include "html-lib.h" /* HTML-related routines */

   int main()
   {
     llist entries;  /* define a linked list; this is where the entries */
                     /* are stored.                                     */

   /* parse the form data and add it to the list */
     read_cgi_input(&entries);

   /* The data is now in a very usable form.  To search the list entries  */
   /* by name, call the function:                                         */
   /*      cgi_val(entries, "nameofentry")                                */
   /* which returns a pointer to the value associated with "nameofentry". */

     html_header();           /* print HTML MIME header */
     html_begin("Output");    /* send appropriate HTML headers with title */
                              /*   <title>Output</title>                  */

   /* display whatever data you wish here, probably with printf() */

     html_end();   /* send appropriate HTML end footers (</body> </html>) */
     list_clear(&entries);     /* free up the pointers in the linked list */
     return 0;
   }
   ______________________________________________________________________




   4.2.  Compiling your program


   To compile your program with the library, include the file cgihtml.a
   when linking your object files.  For example, if your main object file
   is program.cgi.o, the following should work successfully:



        cc -o program.cgi program.cgi.o cgihtml.a








   5.  Routines

   5.1.  cgi-lib.h

   5.1.1.  Library Variables

   cgi-lib.h defines constants for the standard CGI environment
   variables.  For instance, the value of the environment variable
   QUERY_STRING is stored in the constant QUERY_STRING in cgi-lib.h.
   Here is a list of the constants:


   o  SERVER_SOFTWARE

   o  SERVER_NAME

   o  GATEWAY_INTERFACE

   o  SERVER_PROTOCOL

   o  SERVER_PORT

   o  REQUEST_METHOD

   o  PATH_INFO

   o  PATH_TRANSLATED

   o  SCRIPT_NAME

   o  QUERY_STRING

   o  REMOTE_HOST

   o  REMOTE_ADDR

   o  AUTH_TYPE

   o  REMOTE_USER

   o  REMOTE_IDENT

   o  CONTENT_TYPE

   o  CONTENT_LENGTH

   o  HTTP_USER_AGENT


   5.1.2.  Library functions


   short accept_image();

   accept_image() determines whether the browser will accept pictures.
   It does so by checking the HTTP_ACCEPT environment variable for an
   image MIME type.  It returns a 1 if the browser will accept graphics,
   a 0 otherwise.

   void unescape_url();

   unescape_url() converts escaped URI values into their character form.
   read_cgi_input() calls this function.  You will rarely if ever need to
   access this function directly but it is made available in case you do.


   int read_cgi_input(llist *entries);

   This routine parses the raw CGI data passed from the browser to the
   server and adds each associated name and value to the linked list
   entries.  It will parse information transmitted using both the GET and
   POST method.  If it receives no information, it will return a 0,
   otherwise it returns the number of entries returned.  If it receives a
   badly encoded string, it will return -1.

   If you run your CGI program that calls read_cgi_input() from the
   command line, this function will start an interactive mode so you can
   directly input the CGI input string.  Note that this string must be
   properly encoded.

   read_cgi_input() also handles HTTP file upload correctly.  The file
   will be uploaded to the directory defined by UPLOADDIR in cgi-lib.h
   (/tmp by default).

   char* cgi_val(llist l, char *name);

   cgi_val() searches the linked list for the value of the entry named
   name and returns the value if it finds it.  If it cannot find an entry
   with name name, it returns NULL.

   char** cgi_val_multi(llist l, char *name);

   Same as cgi_val() except will return multiple values with the same
   name to an array of strings.  Will return NULL if it cannot find an
   entry with name name

   char* cgi_name(llist l, char *value);

   Same as cgi_val() except searches for value with specified name.

   char** cgi_name_multi(llist l, char *value);

   Analogous to cgi_multi_val().

   int parse_cookies(llist *entries);

   Parses the environment variable HTTP_COOKIE for cookies.  Returns the
   number of cookies parsed, zero if there are none.

   void print_cgi_env();

   Pretty prints the environment variables defined in cgi-lib.h.  Prints
   "(null)" if the variables are empty.

   void print_entries(llist l);

   This is a generic routine which will iterate through the linked list
   and print each name and associated value in HTML form.  It uses the
   <dl> list format to display the list.

   char* escape_input(char *str);

   escape_input() "escapes" shell metacharacters in the string.  It
   precedes all non-alphanumeric characters with a backslash.  C routines
   including system() and popen() open up a Bourne shell process before
   running.  If you do not escape shell metacharacters in the input
   (prefix metacharacters with a backslash), then malicious users may be
   able to take advantage of your system.

   short is_form_empty(llist l);


   is_form_empty() checks to see whether no names or values were
   submitted.  Note that this is different from submitting a blank form.

   short is_field_exists(llist l,char *str);

   Checks to see whether a field actually exists.  Equivalent to checking
   whether cgi_val() returns "" or NULL.  If it returns "", the field
   exists but is empty; if it returns NULL, the field does not exist.

   short is_field_empty(llist l,char *str);

   Returns 1 (true) if either the field does not exist or the field does
   exist but is empty.


   5.2.  html-lib.h



   5.2.1.  Library functions


   void html_header();

   html_header() prints a MIME compliant header which should precede the
   output of any HTML document from a CGI script.  It simply prints to
   STDOUT:



        Content-Type: text/html




   and a blank line.

   void mime_header(char *mime);

   Allows you to print any MIME header.  For example, if you are about to
   send a GIF image to the standard output from your C CGI program,
   precede your program with:


   ______________________________________________________________________
   mime_header("image/gif");
   /* now you can send your GIF file to stdout */
   ______________________________________________________________________



   mime_header() simply prints Content-Type: followed by your specified
   MIME header and a blank line.

   void nph_header(char *status);

   Sends a standard HTTP header for direct communication with the client
   using no parse header.  status is the status code followed by the
   status message.  For instance, to send a "No Content" header, you
   could use:






   ______________________________________________________________________
   nph_header("204 No Content");
   html_header();
   ______________________________________________________________________



   which would send:



        HTTP/1.0 204 No Content
        Server: CGI using cgihtml
        Content-Type: text/html




   nph_header() does not send a blank line after printing the headers, so
   you must follow it with either another header or a blank line.  Also,
   scripts using this function must have "nph-" preceding their
   filenames.

   void show_html_page(char *loc);

   Sends a "Location: " header.  loc is the location of the HTML file you
   wish sent to the browser.  For example, if you want to send the root
   index file from the CGI program:


   ______________________________________________________________________
   show_html_page("/index.html");
   ______________________________________________________________________



   void status(char *status);

   Sends an HTTP Status header.  status is a status code followed by a
   status message.  For instance, to send a status code of 302 (temporary
   redirection) followed by a location header:


   ______________________________________________________________________
   status("302 Temporarily Moved");
   show_html_page("http://hcs.harvard.edu/");
   ______________________________________________________________________



   status() does not print a blank line following the header, so you must
   follow it with either a0);.er function which does output a blank line
   or an explicit printf("
   void pragma(char *msg);

   Sends an HTTP Pragma header.  Most commonly used to tell the browser
   not to cache the document, ie.:


   ______________________________________________________________________
   pragma("No-cache");
   html_header();
   ______________________________________________________________________


   As with status(), pragma() does not print a blank line folowing the
   header.

   void set_cookie(char *name, char *value, char *expires, char *path,
   char *domain, short secure);

   Sets a cookie using the values given in the parameters.

   void html_begin(char *title);

   html_begin() sends somewhat standard HTML tags which should generally
   be at the top of every HTML file.  It will send:



        <html> <head>
        <title>title</title>
        </head>
        <body>




   void html_end();

   html_end() is the complement to html_begin(), sending the following
   HTML:



        </body> </html>




   Note that neither html_begin() nor html_end() are necessary for your
   CGI scripts to output HTML, but they are good style, and I encourage
   use of these routines.

   void h1(char *header);

   Surrounds header with appropriate headline tags.  Defined for h1() to
   h6().

   void hidden(char *name, char *value);

   Prints a hidden form field with name name and value value.


   5.3.  cgi-llist.h

   For most scripts, with the exception of list_end(), you will most
   likely never have to use any of the link list routines available,
   since cgi-lib.h handles most common linked list manipulation almost
   transparently.  However, you may sometimes want to manipulate the
   information directly or perform special functions on each entry, in
   which case these routines may be useful.


   5.3.1.  Library variables






   ______________________________________________________________________
   typedef struct {
     char *name;
     char *value;
   } entrytype;

   typedef struct _node {
     entrytype entry;
     struct _node* next;
   } node;

   typedef struct {
     node* head;
   } llist;
   ______________________________________________________________________




   5.3.2.  Library functions


   void list_create(llist *l);

   list_create() creates and initializes the list, and it should be
   called at the beginning of every CGI script using this library.

   node* list_next(node* w);

   list_next() returns the next node on the list.

   short on_list(llist *l, node* w);

   on_list() returns a 1 if the node w is on the linked list l;
   otherwise, it returns a 0.

   short on_list_debug(llist *l, node* w);

   The previous routine makes the assumption that my linked list routines
   are bug-free, a possibly bad assumption.  If you are using linked list
   routines and on_list() isn't returning the correct value, try using
   on_list_debug() which makes no assumptions, is almost definitely
   reliable, but is a little slower than the other routine.

   void list_traverse(llist *l, void (*visit)(entrytype item));

   list_traverse() lets you pass a pointer to a function which will
   manipulate each entry on the list.

   To use, you must create a function that will take as its argument a
   variable of type entrytype.  For example, if you wanted to write your
   own print_entries() function, you could do the following:


   ______________________________________________________________________
   void print_element(entrytype item);
   {
     printf("%s = %s0,item.name,item.value);
   }

   void print_entries(llist entries);
   {
     list_traverse(&stuff, print_element);
   }
   ______________________________________________________________________

   node* list_insafter(llist* l, node* w, entrytype item);

   list_insafter() adds the entry item after the node w and returns the
   pointer to the newly created node.  I didn't bother writing a function
   to insert before a node since my CGI functions don't need one.

   void list_clear(llist* l);

   This routine frees up the memory used by the linked list after you are
   finished with it.  It is imperative that you call this function at the
   end of every program which calls read_cgi_input().


   5.4.  string-lib.h

   5.4.1.  Library functions


   char* newstr(char *str);

   newstr() allocates memory and returns a copy of str.  Use this
   function to correctly allocate memory and copy strings.

   char* substr(char *str, int offset, int len);

   Analogous to the Perl substr function.  Finds the substring of str at
   an offset of offset and of length len.  A negative offset will start
   the substring from the end of the string.

   char* replace_ltgt(char *str);

   Replaces all instances of < and > in str with &lt; and &gt;.

   char* lower_case(char *buffer);

   Converts a string from upper to lower case.


   6.  Example programs

   6.1.  test.cgi

   test.cgi is a simple test program.  It will display the CGI
   environment, and if there is any input, it will parse and display
   those values as well.


   6.2.  query-results

   This is a generic forms parser which is useful for testing purposes.
   It will parse both GET and POST forms successfully.  Simply call it as
   the form "action", and it will return all of the names and associated
   values entered by the user.

   query-results also works from the command line.  For instance, if you
   run query-results from the command line, you will see:










   Content-type: text/html

   <html> <head>
   <title>Query Results</title>
   </head>

   <body>

   --- cgihtml Interactive Mode ---
   Enter CGI input string.  Remember to encode appropriate characters.
   Press ENTER when done:




   Suppose you enter the input string:



        name=eugene&age=21




   Then query-results will return:



         Input string: name=eugene&age=21
        String length: 18
        --- end cgihtml Interactive Mode ---

        <h1>Query results</h1>
        <dl>
          <dt> <b>name</b>
          <dd> eugene
          <dt> <b>age</b>
          <dd> 21
        </dl>
        </body> </html>




   This feature is extremely useful if you are debugging code.  query-
   results will also handle file upload properly and transparently.  It
   will save the file to the directory defined by UPLOADDIR (/tmp by
   default).


   6.3.  mail.cgi

   This is a generic comments program which will parse the form, check to
   see if the intended recipient is a valid recipient, and send the e-
   mail if so.

   You will want to edit two things in the source file: WEBADMIN and
   AUTH.  WEBADMIN is the complete e-mail address to which the comments
   should be sent by default.  AUTH is the exact location of the
   authorization file.

   The authorization file is simple a text file with a list of valid e-
   mail recipients.  Users will only be able to use this program to send
   e-mail to those listed in the authorization file.  Your file might
   look like this:

        web@www.company.com
        jschmoe@www.company.com




   In the above case, you would only be able to send e-mail to
   web@www.company.com and jschmoe@www.company.com.  Make sure you
   include the value of WEBADMIN in your authorization file.

   The following are valid variables in your form:


   o  to

   o  name

   o  email

   o  subject

   o  content

   If there is no to variable defined in the form, the mail will be sent
   to WEBADMIN by default.  mail.cgi will reject empty forms.

   mail.cgi adds a "X-Sender:" header on each message so recipients know
   that the message was sent by this program and not by a regular mail
   client.


   6.4.  index-sample.cgi

   Imagemaps have become increasingly popular to use on home pages.
   Unfortunately, imagemaps are not lynx friendly; if you forget to
   include some sort of text index as well, lynx users will not be able
   to access any of your subpages.

   You can circumvent this problem by using a CGI program as your home
   page rather than an HTML page (or by using server-side includes).
   This CGI program will determine whether your browser is a graphics or
   text-browser.  If it is a text-browser, it will send a text HTML file,
   otherwise it will send a graphics HTML file.

   You will need to create two HTML files: a graphical and a text one.
   Place the names of these files in the macros: TEXT_PAGE and
   IMAGE_PAGE.


   6.5.  ignore.cgi

   Sends a status code of 204, signifying no content.  If you use
   imagemaps, you can set "default" to /cgi-bin/ignore.cgi.  Whenever
   someone clicks on a part of the picture which is undefined, the server
   will just ignore the request.


   7.  Miscellaneous

   7.1.  Release notes

   I periodically enhance this library, and I welcome any comments or
   suggestions.  Please e-mail them to eekim@eekim.com.

   This library is e-mail ware.  Please send me e-mail if you use this
   library; I'd really like to hear your comments.  Although I do not
   require it, I would appreciate attribution if you use my code.


   7.2.  Future releases

   This library is nearing its final release.  I hope to include FastCGI
   support, and API support (for Apache, Netscape, and Microsoft
   servers).  I may also attempt to port it to the Macintosh, and I want
   to improve generally portability.  I will most likely rewrite the API
   in the next major release.  Finally, I'd like to improve the
   robustness and fix all the bugs.

   If you have any suggestions, I'd like to hear them.  Feel free to e-
   mail me at eekim@eekim.com


   7.3.  Credits


   Thanks to the countless people who have sent me suggestions and
   comments.  You may contact me via e-mail at eekim@eekim.com.  My web
   page is located at  <http://www.eekim.com/>.