Parser Generator
April 9th, 2010
Patrick Stein

Say you’ve got a bunch of different XML formats that you need to read into internal data structures. You could sit down and code a SAX parser from scratch. You could use a DOM parser and then wander through the DOM, filling in your internal data structures as you go. Or you could write a little snippet describing how you’d like to turn your XML into structures and feed it to this parser generator.

Currently, the parser generator emits SAX-based parsers in Lisp or Objective-C. It also creates all of the data types needed for your internal structures, based on your input file.

Obtaining the code

Examples

Here is a simple example. Suppose you wanted to parse the following snippet of XML:

<?xml version="1.0"?>
<dining-room>
  <manufacturer>The Wood Shop</manufacturer>
  <table type="round" wood="maple">
    <price>$199.99</price>
  </table>
  <chair wood="maple">
    <quantity>6</quantity>
    <price>$39.99</price>
  </chair>
</dining-room>

You might think of this as a dining room with a manufacturer, a table, and a chair. If so, you could give the parser generator this input file, dining_room_direct.xml, to parse the XML into structures like this:

   struct Table {
       string shape;
       string wood;
       string price;
   }

   struct Chair {
       integer quantity;
       string wood;
       string price;
   }
       
   struct DiningRoom {
       string manufacturer;
       Table table;
       Chair chair;
   }
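
To give a feel for what the generator saves you from writing, here is a minimal hand-written sketch of a SAX handler that fills in the structures above. It is in Python rather than Lisp or Objective-C, and it is not the generator's output. The names DirectHandler and parse_direct are made up for illustration, and the mapping of the type attribute onto the shape field is an assumption read off the struct definitions.

```python
import xml.sax
from dataclasses import dataclass
from typing import Optional

# The XML snippet from this article, as bytes.
SAMPLE = b"""<?xml version="1.0"?>
<dining-room>
  <manufacturer>The Wood Shop</manufacturer>
  <table type="round" wood="maple">
    <price>$199.99</price>
  </table>
  <chair wood="maple">
    <quantity>6</quantity>
    <price>$39.99</price>
  </chair>
</dining-room>
"""


@dataclass
class Table:
    shape: str = ""
    wood: str = ""
    price: str = ""


@dataclass
class Chair:
    quantity: int = 0
    wood: str = ""
    price: str = ""


@dataclass
class DiningRoom:
    manufacturer: str = ""
    table: Optional[Table] = None
    chair: Optional[Chair] = None


class DirectHandler(xml.sax.ContentHandler):
    """Hypothetical handler: one Table and one Chair slot on the room."""

    def __init__(self):
        super().__init__()
        self.room = DiningRoom()
        self.current = None  # the Table or Chair currently being filled
        self.text = []       # character data collected for the open element

    def startElement(self, name, attrs):
        self.text = []
        if name == "table":
            # Assumption: the "type" attribute becomes the "shape" field.
            self.current = Table(shape=attrs.get("type", ""),
                                 wood=attrs.get("wood", ""))
            self.room.table = self.current
        elif name == "chair":
            self.current = Chair(wood=attrs.get("wood", ""))
            self.room.chair = self.current

    def characters(self, content):
        # SAX may deliver text in several chunks, so accumulate them.
        self.text.append(content)

    def endElement(self, name):
        value = "".join(self.text).strip()
        if name == "manufacturer":
            self.room.manufacturer = value
        elif name == "price" and self.current is not None:
            self.current.price = value
        elif name == "quantity" and isinstance(self.current, Chair):
            self.current.quantity = int(value)
        elif name in ("table", "chair"):
            self.current = None
        self.text = []


def parse_direct(data):
    """Parse XML bytes and return the populated DiningRoom."""
    handler = DirectHandler()
    xml.sax.parseString(data, handler)
    return handler.room
```

Calling parse_direct(SAMPLE) returns a DiningRoom whose table and chair fields hold the parsed pieces. Even for a format this small, the handler needs per-element bookkeeping, which is the part a generator can derive from a short description.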

Alternatively, you might think of this as a dining room with a manufacturer and an array of table and chair entities. If so, you could give the parser generator this input file, dining_room_het.xml, to parse the XML into structures like this:

   struct Table {
       string shape;
       string wood;
       string price;
   }

   struct Chair {
       integer quantity;
       string wood;
       string price;
   }
       
   struct DiningRoom {
       string manufacturer;
       array pieces;  // pieces are either Table or Chair instances
   }
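
As a rough illustration of the heterogeneous layout, the following hand-written Python sketch collects Table and Chair instances into a single pieces list in document order. It is not what the generator emits. HetHandler and parse_het are hypothetical names, and reading the type attribute into shape is an assumption.

```python
import xml.sax
from dataclasses import dataclass, field
from typing import List, Union

# The XML snippet from this article, as bytes.
SAMPLE = b"""<?xml version="1.0"?>
<dining-room>
  <manufacturer>The Wood Shop</manufacturer>
  <table type="round" wood="maple">
    <price>$199.99</price>
  </table>
  <chair wood="maple">
    <quantity>6</quantity>
    <price>$39.99</price>
  </chair>
</dining-room>
"""


@dataclass
class Table:
    shape: str = ""
    wood: str = ""
    price: str = ""


@dataclass
class Chair:
    quantity: int = 0
    wood: str = ""
    price: str = ""


@dataclass
class DiningRoom:
    manufacturer: str = ""
    # Each piece is either a Table or a Chair instance.
    pieces: List[Union[Table, Chair]] = field(default_factory=list)


class HetHandler(xml.sax.ContentHandler):
    """Hypothetical handler: appends mixed piece types to one list."""

    def __init__(self):
        super().__init__()
        self.room = DiningRoom()
        self.text = []

    def startElement(self, name, attrs):
        self.text = []
        if name == "table":
            # Assumption: the "type" attribute becomes the "shape" field.
            self.room.pieces.append(Table(shape=attrs.get("type", ""),
                                          wood=attrs.get("wood", "")))
        elif name == "chair":
            self.room.pieces.append(Chair(wood=attrs.get("wood", "")))

    def characters(self, content):
        self.text.append(content)

    def endElement(self, name):
        value = "".join(self.text).strip()
        # The piece being filled is always the most recently appended one.
        piece = self.room.pieces[-1] if self.room.pieces else None
        if name == "manufacturer":
            self.room.manufacturer = value
        elif name == "price" and piece is not None:
            piece.price = value
        elif name == "quantity" and isinstance(piece, Chair):
            piece.quantity = int(value)
        self.text = []


def parse_het(data):
    """Parse XML bytes and return a DiningRoom with a mixed pieces list."""
    handler = HetHandler()
    xml.sax.parseString(data, handler)
    return handler.room
```

Code that consumes the result has to check the type of each piece (for example with isinstance) before touching fields such as quantity, which only Chair has.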

Or you might think of this as a dining room with a manufacturer and some furniture. In that case, you could give the parser generator this input file, dining_room_hom.xml, to parse the XML into structures like this:

   struct Furniture {
       string shape;     // defaults to ""
       string wood;
       integer quantity; // defaults to 1
       string price;
   }

   struct DiningRoom {
       string manufacturer;
       array pieces;  // pieces are all furniture, either from <table>
                      // or <chair>
   }
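
The homogeneous layout can be sketched by hand in Python as follows, with both <table> and <chair> elements folded into one Furniture type and the defaults noted in the struct comments (an empty shape and a quantity of 1). This is an illustrative sketch, not the generator's output. HomHandler and parse_hom are hypothetical names, and mapping the type attribute onto shape is an assumption.

```python
import xml.sax
from dataclasses import dataclass, field
from typing import List

# The XML snippet from this article, as bytes.
SAMPLE = b"""<?xml version="1.0"?>
<dining-room>
  <manufacturer>The Wood Shop</manufacturer>
  <table type="round" wood="maple">
    <price>$199.99</price>
  </table>
  <chair wood="maple">
    <quantity>6</quantity>
    <price>$39.99</price>
  </chair>
</dining-room>
"""


@dataclass
class Furniture:
    shape: str = ""    # defaults to "" when there is no type attribute
    wood: str = ""
    quantity: int = 1  # defaults to 1 when there is no <quantity> element
    price: str = ""


@dataclass
class DiningRoom:
    manufacturer: str = ""
    pieces: List[Furniture] = field(default_factory=list)


class HomHandler(xml.sax.ContentHandler):
    """Hypothetical handler: <table> and <chair> both become Furniture."""

    def __init__(self):
        super().__init__()
        self.room = DiningRoom()
        self.text = []

    def startElement(self, name, attrs):
        self.text = []
        if name in ("table", "chair"):
            # Assumption: the "type" attribute becomes the "shape" field.
            self.room.pieces.append(Furniture(shape=attrs.get("type", ""),
                                              wood=attrs.get("wood", "")))

    def characters(self, content):
        self.text.append(content)

    def endElement(self, name):
        value = "".join(self.text).strip()
        # The piece being filled is always the most recently appended one.
        piece = self.room.pieces[-1] if self.room.pieces else None
        if name == "manufacturer":
            self.room.manufacturer = value
        elif name == "price" and piece is not None:
            piece.price = value
        elif name == "quantity" and piece is not None:
            piece.quantity = int(value)
        self.text = []


def parse_hom(data):
    """Parse XML bytes and return a DiningRoom holding only Furniture."""
    handler = HomHandler()
    xml.sax.parseString(data, handler)
    return handler.room
```

The trade-off is that consumers can loop over pieces without any type checks, but the result no longer records whether a piece came from <table> or <chair>.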