XML Batch Validator (Schema)
The Berkeley EAD Toolkit
University of California, Berkeley
http://sunsite.berkeley.edu/ead/tools
 

Batch Validate XML Files Against Schemata

This is a small program I wrote that we use internally to batch validate hundreds or thousands of METS documents against the official METS schema and related schemata. When validation is complete, it displays the error messages in a browser window. It is intended for internal use and has not been extensively tested, so your mileage may vary. It should work for validating documents against any XML Schema, not just METS. Xerces can also validate against DTDs, but I have not taken the time to configure this program to do so, which means it will likely not work for EAD.

I have also not configured the program to use XML catalogs, but you can do so by referring to http://xerces.apache.org/xerces2-j/faq-xcatalogs.html. Without a catalog, references to schemata must either be full URLs:

"http://www.loc.gov/standards/mets/mets.xsd"

or, if a relative URL is used, the schema must sit at the matching location in your filesystem. In the following example it must be in the same folder as the instance being validated:

<Export xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="GenDBLoad.xsd">
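For readers who want to see how such location hints are honored outside this tool, here is a minimal Java sketch. It uses the JAXP validation API that ships with the JDK rather than this program's own code, which is not shown here. The class name HintValidation and the method validateUsingHints are hypothetical names chosen for illustration. The sketch assumes the schema named in the hint can be reached, either as a full URL or relative to the instance file.

```java
import java.io.File;
import java.io.IOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the XML Batch Validator itself.
public class HintValidation {

    /** Returns null if the document is valid, otherwise the first error message. */
    public static String validateUsingHints(File instance) throws IOException {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            // The no-argument newSchema() relies on the xsi:schemaLocation and
            // xsi:noNamespaceSchemaLocation hints found in the instance document.
            Schema schema = factory.newSchema();
            Validator validator = schema.newValidator();
            // Passing a File gives the source a system ID, so a relative hint
            // is resolved against the folder that holds the instance.
            validator.validate(new StreamSource(instance));
            return null;
        } catch (SAXException e) {
            // With no ErrorHandler set, the first error is thrown.
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        String result = validateUsingHints(new File(args[0]));
        System.out.println(result == null ? "valid" : result);
    }
}
```

If the schema file is missing from the expected folder, the validator cannot find a declaration for the root element and reports that as an error.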

When batch validation is complete, Internet Explorer will open and display a list of all of the files with errors, the line and character position where each error occurred, and the text of the Xerces message for that error. Xerces is configured to keep validating after the first error rather than aborting, which is the default behavior of conforming XML parsers. That way you see a list of all the errors in your document, not just the first one.
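The idea of reporting every error instead of stopping at the first can be sketched with a SAX ErrorHandler that records problems rather than throwing. This is an illustration under assumptions, not the program's actual source: the class name CollectingValidator and its validate method are hypothetical, and the schema and document are passed as strings only to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical sketch, not the XML Batch Validator's own code.
public class CollectingValidator {

    /** Records each problem with its position instead of throwing on the first. */
    static class Collector implements ErrorHandler {
        final List<String> messages = new ArrayList<>();

        private void record(String level, SAXParseException e) {
            messages.add(level + " line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber() + ": " + e.getMessage());
        }

        @Override public void warning(SAXParseException e) { record("warning", e); }

        // Validity errors are recoverable: record and let validation continue.
        @Override public void error(SAXParseException e) { record("error", e); }

        // Well-formedness errors are fatal: record, then stop.
        @Override public void fatalError(SAXParseException e) throws SAXException {
            record("fatal", e);
            throw e;
        }
    }

    public static List<String> validate(String xsd, String xml)
            throws IOException, SAXException {
        SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
        Validator validator = schema.newValidator();
        Collector collector = new Collector();
        validator.setErrorHandler(collector);
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            // A fatal error was normally recorded by the handler already.
            if (collector.messages.isEmpty()) {
                collector.messages.add("fatal: " + e.getMessage());
            }
        }
        return collector.messages;
    }
}
```

Only validity errors can be skipped this way. A document that is not well-formed still stops the parser at the first fatal error.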



Clicking on any of the line numbers will display the XML document, scrolled to the line where that error occurred, with the offending line highlighted in yellow.


Using the XML Batch Validator


Upon installation, the setup program will create a single icon on your desktop. This icon is used to validate individual XML documents: drag a file from Windows Explorer onto the icon and release it.

However, the batch processing capability is the most useful feature. In the installation folder there is a small batch file called batch_validate.bat. Copy this file from the installation folder and paste it into the folder containing all of the XML files you wish to validate, then double-click it to start the validation process.
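The contents of batch_validate.bat are not shown here, so the following is only a rough Java sketch of what validating every XML file in a folder can look like. The class BatchValidate and its methods are hypothetical names, and the sketch keeps its results in memory instead of writing the temporary files and browser report that the real program produces.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical sketch, not the code behind batch_validate.bat.
public class BatchValidate {

    /** Validates every *.xml file in the folder; returns only the files with errors. */
    public static Map<Path, List<String>> validateFolder(Path folder, Schema schema)
            throws IOException {
        Map<Path, List<String>> failures = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(folder, "*.xml")) {
            for (Path file : files) {
                List<String> errors = validateOne(file, schema);
                if (!errors.isEmpty()) {
                    failures.put(file, errors);
                }
            }
        }
        return failures;
    }

    static List<String> validateOne(Path file, Schema schema) throws IOException {
        final List<String> errors = new ArrayList<>();
        Validator validator = schema.newValidator();
        validator.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
            @Override public void error(SAXParseException e) { errors.add(describe(e)); }
            @Override public void fatalError(SAXParseException e) throws SAXException {
                errors.add(describe(e));
                throw e; // a document that is not well-formed cannot be validated further
            }
        });
        try {
            validator.validate(new StreamSource(file.toFile()));
        } catch (SAXException e) {
            if (errors.isEmpty()) {
                errors.add(e.getMessage());
            }
        }
        return errors;
    }

    static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                + ": " + e.getMessage();
    }

    public static void main(String[] args) throws Exception {
        Path folder = Paths.get(args.length > 0 ? args[0] : ".");
        // The no-argument newSchema() follows the schema location hints in each file.
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI).newSchema();
        validateFolder(folder, schema).forEach((file, errors) -> {
            System.out.println(file.getFileName());
            errors.forEach(message -> System.out.println("  " + message));
        });
    }
}
```

Run with the folder path as the only argument, or with no argument to validate the current folder, which mirrors the idea of dropping the batch file next to the documents.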



Warning

The validation program creates a temporary file for each XML file containing errors. If you are validating thousands of XML files and each one contains an error, thousands of temporary files will be created, slowing the process down enormously. I recommend you first validate a few individual documents to eliminate any consistent errors which might have infiltrated your markup, and then validate the entire batch.



Download and install the XML Batch Validator

Right-click and select "Save As": schema_validate_setup.exe (2/09/2006)

The installer defaults to c:\ as the installation directory. You may choose a more reasonable location. However, since the program creates temporary files within the installation directory each time it is used, be sure that the user has write permission for whichever directory you install it into. Non-privileged users cannot write to any folder within c:\Program Files, so that may be an inappropriate location for installation.
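If you are unsure whether a given account can write to a candidate installation directory, one way to check is to try creating and deleting a file there. The following Java sketch does that. The class name WritableCheck and the method canCreateFiles are hypothetical, and the check is separate from the installer, which performs no such test as far as this page describes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the setup program.
public class WritableCheck {

    /** Returns true if a temporary file can be created in (and removed from) dir. */
    public static boolean canCreateFiles(Path dir) {
        try {
            Path probe = Files.createTempFile(dir, "probe", ".tmp");
            Files.delete(probe);
            return true;
        } catch (IOException | SecurityException e) {
            // Missing directory, denied permission, or a similar problem.
            return false;
        }
    }

    public static void main(String[] args) {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(dir + (canCreateFiles(dir) ? " is writable" : " is NOT writable"));
    }
}
```

Run it as the same user who will run the validator, since permissions differ between accounts.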