Berkeley Web Template CGI script
The Berkeley EAD Toolkit
University of California, Berkeley
http://sunsite.berkeley.edu/ead/tools
 

Berkeley Web Template installation instructions

The Berkeley Web Template package consists of three components, each installed in its own location: the script files, the templates, and web-accessible components such as images and help files. The .tar.gz file contains three directories, one for each of these components.

Script files

The "scripts" directory contains the main "template" script itself, the extension files, and a file called "ead.dtd" which is actually a perl script (the name is intended to fool Microsoft Internet Explorer; more on this later). The "template" and "ead.dtd" scripts must be copied to your cgi directory. The extensions can be copied to any directory for which your webserver has read and execute permissions. However, for security reasons I strongly recommend that the extension files be copied into a subdirectory within your cgi directory, e.g., cgi-bin/ead_extensions. Extension files all have a .ext extension, e.g., ead.ext. Both the "template" script and the "ead.dtd" script must have executable permissions set. The extension files should not.
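As a rough sketch, the copy and permission steps above might look like the following shell function. The name install_scripts and both of its arguments are made up for illustration, and the function assumes the extension files sit directly inside the unpacked "scripts" directory; adjust the paths and modes to suit your server.

```shell
# Hypothetical helper, not part of the package.
install_scripts() {
    src=$1                      # the unpacked "scripts" directory
    cgi=$2                      # your cgi directory (path is an assumption)
    ext="$cgi/ead_extensions"   # the recommended extension subdirectory

    mkdir -p "$ext" || return 1
    chmod 755 "$ext" || return 1

    cp "$src/template" "$src/ead.dtd" "$cgi/" || return 1
    cp "$src"/*.ext "$ext/" || return 1

    # The two scripts must be executable; the extension files should not be.
    chmod 755 "$cgi/template" "$cgi/ead.dtd" || return 1
    chmod 644 "$ext"/*.ext
}
```

It could then be called as, say, install_scripts ./scripts /path/to/your/cgi-bin.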

Two variables must be set in the "template" script. Set $TemplateDir to the directory where you will copy your templates. Set $ExtensionDir to the directory where you will copy your extension files. You may also need to change the first "hash-bang" line to point to the location of your perl executable, e.g., #!/usr/local/bin/perl
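The two variables are best edited by hand in the script itself. The hash-bang line, on the other hand, can be changed with a small helper such as the one below. This is only a sketch: set_perl_path is a hypothetical function, not something shipped with the package, and it assumes the script already begins with a "#!" line.

```shell
# Hypothetical helper: point a script's first line at a given perl executable.
set_perl_path() {
    script=$1       # e.g. the installed "template" script
    perl_path=$2    # e.g. /usr/local/bin/perl
    tmp="$script.tmp.$$"

    # Refuse to touch a file that does not start with "#!".
    case $(head -n 1 "$script") in
        '#!'*) ;;
        *) echo "no hash-bang line in $script" >&2; return 1 ;;
    esac

    # New first line, followed by everything from line 2 onward.
    { printf '#!%s\n' "$perl_path"; tail -n +2 "$script"; } > "$tmp" || return 1

    # Write back through the original file so its permissions are kept.
    cat "$tmp" > "$script" && rm -f "$tmp"
}
```

If you are unsure where perl lives, running "command -v perl" in a shell on the server will usually tell you.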

Template files

Template files are hierarchical. The master ead.cfg file contains template configurations common to all of your individual repository-level templates, while generic.cfg contains those configurations unique to an individual template. generic.cfg serves as a base from which you can derive individual repository templates. Thus you may make several copies of generic.cfg with appropriate filenames, e.g., manuscripts.cfg, pictorial.cfg, archives.cfg, etc. In this guide I will use the terminology "master" or "top-level" template to refer to ead.cfg, the template which contains configurations common to all sub templates, and "sub template" or "repository template" to refer to the multiple templates containing configurations unique to a particular repository or type of EAD document being encoded.

It is convenient, but not required, to copy your template files to a similarly hierarchical directory structure. The "master" template, e.g., ead.cfg, must be copied into whatever directory you specified in $TemplateDir above. The sub templates can go anywhere, but I recommend a subdirectory within $TemplateDir with the same name as the master template, minus the .cfg extension:


/home/templates/data/ead.cfg
/home/templates/data/ead/manuscripts.cfg
/home/templates/data/ead/pictorial.cfg
/home/templates/data/ead/archives.cfg
etc. ...


/home/templates/data/tei.cfg
/home/templates/data/tei/oral_histories.cfg
/home/templates/data/tei/letters.cfg
/home/templates/data/tei/reports.cfg
etc. ...

Note: these directories and files MUST have permissions set so that they are readable by your webserver, but they should not live in your web data directory. Ask your web manager if you don't know what this means.
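One way to build the recommended layout is sketched below. The function name install_templates and its arguments are hypothetical, and it assumes the unpacked templates directory holds both ead.cfg and generic.cfg. The modes used (755 for directories, 644 for files) make everything world-readable, which works when the webserver runs as a different user; tighten them if your setup allows.

```shell
# Hypothetical helper, not part of the package.
# Usage: install_templates SRC_DIR TEMPLATE_DIR SUBNAME...
install_templates() {
    src=$1     # unpacked directory containing ead.cfg and generic.cfg
    dest=$2    # the directory you set in $TemplateDir
    shift 2

    mkdir -p "$dest/ead" || return 1
    cp "$src/ead.cfg" "$dest/ead.cfg" || return 1

    # Each sub template starts life as a copy of generic.cfg.
    for name in "$@"; do
        cp "$src/generic.cfg" "$dest/ead/$name.cfg" || return 1
    done

    # Readable by the webserver, writable only by the owner.
    chmod 755 "$dest" "$dest/ead" || return 1
    chmod 644 "$dest/ead.cfg" "$dest/ead"/*.cfg
}
```

For example, install_templates ./templates /home/templates/data manuscripts pictorial archives would produce the ead tree shown above, with each sub template still needing to be edited for its repository.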

As noted above, sub templates can actually be located anywhere on your server. Their location must be set in the master template file in the TemplateDir parameter. So for my recommended hierarchical structure described above, you might set TemplateDir to /home/templates/data/ead in ead.cfg:

TemplateDir       /home/templates/data/ead

Web files

The "web" directory contains images and files that must be placed somewhere in your web data directory. Ask your web manager what this means if you don't understand. These files are the various images and help documents associated with a template. Set the IMAGEURL parameter in your master template to the base url where all your template-associated images reside. Set HELP_URL to the base url containing your help documentation. These two parameters can point to the same location if you wish.

[Global Template Variables]
IMAGEURL       http://server.url.edu/templates/images
HELP_URL       http://server.url.edu/templates/help


Invoking the Berkeley Web Templates

The Berkeley Web Templates are hierarchical, composed of a master template and one or more sub templates. The urls for calling these templates are similarly hierarchical. The url consists of the following:

http://[server name]/[cgi directory]/[script name]/[master template name]/[sub template name]

Where [master template name] and [sub template name] are the names of each of those files, respectively, but without the .cfg extension. E.g.,

http://server.name.edu/cgi-bin/template/ead/manuscripts
http://server.name.edu/cgi-bin/template/tei/oral_histories
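To make the mapping from url to files concrete, here is a small shell sketch. It is not the template script's own parsing code: resolve_templates is a hypothetical helper, and it assumes sub templates live in a subdirectory named after the master template, as recommended earlier. The webserver passes everything after the script name to the script as PATH_INFO, so for the first url above PATH_INFO is /ead/manuscripts.

```shell
# Hypothetical helper: map PATH_INFO to the two template files it names.
resolve_templates() {
    path_info=$1       # e.g. /ead/manuscripts
    template_dir=$2    # e.g. /home/templates/data

    rest=${path_info#/}          # strip the leading slash: ead/manuscripts
    case $rest in
        */*) master=${rest%%/*}  # text before the first slash: ead
             sub=${rest#*/}      # text after the first slash: manuscripts
             sub=${sub%/} ;;     # tolerate a trailing slash
        *)   master=$rest
             sub= ;;
    esac
    [ -n "$master" ] || return 1

    printf '%s\n' "$template_dir/$master.cfg"
    [ -z "$sub" ] || printf '%s\n' "$template_dir/$master/$sub.cfg"
}

resolve_templates /ead/manuscripts /home/templates/data
# prints:
#   /home/templates/data/ead.cfg
#   /home/templates/data/ead/manuscripts.cfg
```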

Some server software requires script files to have a distinct extension, e.g., '.pl' for perl scripts. You may rename the main "template" script to whatever you like without any problems. I generally omit the .pl extension from my cgi script filenames simply because I think it makes the URLs look tidier, but that's just my own preference. That's also why I like to use PATH_INFO to reflect hierarchical data: in my opinion it makes the urls tidier and easier to remember.

http://server.name.edu/cgi-bin/template.pl/ead/manuscripts
http://server.name.edu/cgi-bin/template.pl/tei/oral_histories


Microsoft headaches

Microsoft Internet Information Server (IIS)

While the template scripts will run on Unix, Linux, and MS Windows platforms, they will likely not work with Microsoft Internet Information Server (IIS). IIS flouts established web standards and generally does not support the use of so-called PATH_INFO in urls. In Berkeley Web Template urls, /[master template name]/[sub template name] is an example of PATH_INFO (it does not correspond to actual physical locations on the server, but rather is meaningful to the template script only). I have heard it is possible to configure IIS to understand PATH_INFO (at the possible risk of breaking support for ASP) but it is probably not worth the effort. You may wish to explore modifying the template script (and individual templates!) to use an alternate URL syntax that does not use PATH_INFO, e.g.,

http://server.name.edu/scripts/template.pl?master=ead&sub=manuscripts

Don't ask me for advice on how to do this, however. You should probably start by investigating the ParseUrl subroutine, then hunt through the templates changing occurrences of {_FULL_URL}?submit=1 and {_URL}?submit=1 to {_FULL_URL}&submit=1 and {_URL}&submit=1. No doubt there are other little hidden traps squirreled about that need to be modified.
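For anyone attempting this, the two mechanical pieces can be sketched in shell. Neither function is part of the package, both names are hypothetical, and they do not replace the changes needed inside ParseUrl itself. The first pulls one parameter out of a query string in the alternate style shown above; the second performs the ?submit=1 to &submit=1 substitution on a template read from standard input.

```shell
# Hypothetical helpers for the query-string URL style; not part of the package.

# Print the value of one parameter from a query string such as
# "master=ead&sub=manuscripts". No URL-decoding is attempted.
query_param() {
    name=$1
    query=$2
    printf '%s\n' "$query" | tr '&' '\n' | sed -n "s/^$name=//p" | head -n 1
}

# Filter a template on stdin, turning "?submit=1" into "&submit=1" after the
# two URL placeholders. The "&" is escaped because sed otherwise treats it
# as "the matched text" in a replacement.
rewrite_submit_links() {
    sed -e 's/{_FULL_URL}?submit=1/{_FULL_URL}\&submit=1/g' \
        -e 's/{_URL}?submit=1/{_URL}\&submit=1/g'
}
```

A template could be converted with something like rewrite_submit_links < manuscripts.cfg > manuscripts.cfg.new, after which you would compare the two files by hand before replacing the original.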

For more information on Microsoft hijinks, do a Google search for "iis path_info". (Just don't believe Microsoft when it says it does this for "Security Purposes".)

The "ead.dtd" perl script

By default the template script generates its output markup as plain text, that is, it emits a text/plain Content-type header. Surprise, surprise: Microsoft Internet Explorer is the only browser in existence that ignores Content-type! If it detects something that looks like XML markup it will assume you are stupid and don't know what you are doing and "helpfully" ignore your Content-type and interpret the output as XML anyway. There is a great deal of overhead associated with parsing an XML file in the browser. Everybody has experienced long delays and occasional crashes when trying to view even a moderately large XML file in their browser. MSIE goes even further and insists on retrieving and parsing a DTD if one is present in a DOCTYPE declaration. For most of us, our DOCTYPE declarations look like this:

<!DOCTYPE ead PUBLIC "+//ISBN 1-931666-00-8//DTD ead.dtd (Encoded Archival Description
(EAD) Version 2002)//EN" "ead.dtd">

(And I hope people know better than to use a relative path to the DTD such as "../ead.dtd", or a full url such as "http://some.overburdened.site.gov/ead/files/ead.dtd". Both are symptoms of very, very dumb XML parsing software, or of struggling to make a single broken browser display XML files (*cough* MSIE *cough*).)

MSIE will attempt to download an ead.dtd file from your cgi-bin directory, and nothing you can do, no configuration option, will convince it not to try to do this. If it doesn't find one it will generate an error message. How helpful. While figuring out a way to deal with this I stumbled on the solution of creating a cgi script called "ead.dtd" that simply echoes the full EAD DTD file. I wish I could simply recommend that everybody use Firefox instead of MSIE. First of all, that's obnoxious, but second of all, see my to do list with regard to browser support. Until I have more time, MSIE is in fact the better browser to use with the Berkeley EAD Web Templates.
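The packaged "ead.dtd" is a perl script, and you should use it as shipped. Purely to illustrate the idea, here is a shell stand-in that does the same kind of thing: it answers any request with a header followed by the text of the real DTD. The function name serve_dtd, the file location, and the Content-type value are all assumptions made for this sketch; the real script may send something different.

```shell
# Hypothetical stand-in for the idea behind "ead.dtd": a CGI program that
# answers every request with the text of the real DTD.
serve_dtd() {
    dtd_file=$1    # wherever you keep the real EAD DTD (assumed path)

    if [ ! -r "$dtd_file" ]; then
        printf 'Status: 404 Not Found\nContent-type: text/plain\n\n'
        printf 'DTD not found\n'
        return 1
    fi

    # A CGI response is header lines, a blank line, then the body.
    printf 'Content-type: application/xml-dtd\n\n'
    cat "$dtd_file"
}
```

A CGI wrapper built on this would consist of a "#!/bin/sh" line, the function, and a single call such as serve_dtd /path/to/the/real/ead.dtd.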