## Parser Generator released

April 9th, 2010 Patrick Stein

A few weeks back, I described an XML Parser Generator that I was working on. At the time, it could generate the parser it used itself. Now, it’s got Objective-C support and Lisp support. (The Lisp support is slightly better than the Objective-C support right now. With the Objective-C backend, you can create arrays of structs, but not arrays of strings or integers.)

## XML Parser Generator

March 16th, 2010 Patrick Stein

A few years back (for a very generous few), we needed to parse a wide variety of XML strings. It was quite tedious to go from the XML to the native-language representations of the data (even from a DOM version). Furthermore, we needed to parse this XML both in Java and in C++.

I wrote (in Java) an XML parser generator that took an XML description of how you’d like the native-language data structures to look and where in the XML it could find the values for those data structures. The Java code-base for this was ugly, ugly, ugly. I tried several times to clean it up into something publishable, and several more times to get it to generate the parser it used to read its own XML description file. Alas, the meta-ness, combined with the clunky Java code, kept me from completing the circle.

Fast forward to last week. Suddenly, I have a reason to parse a wide variety of XML strings in Objective-C. I certainly didn’t want to pull out the Java parser generator and try to beat it into generating Objective-C, too. That’s fortunate, because I cannot find any of the copies (in various states of repair) that once lurked in ~/src.

What’s a man to do? Write it in Lisp, of course.

### Example

Here’s an example to show how it works. Let’s take some simple XML that lists food items on a menu:

<food name="Belgian Waffles" price="$5.95" calories="650">
<description>two of our famous Belgian Waffles with plenty of real maple syrup</description>
</food>
<!-- ... more food entries, omitted here for brevity ... -->

We craft an XML description of how to go from the XML into a native representation:

<struct name="food item">
<field type="string" name="name" from="@name" />
<field type="string" name="price" from="@price" />
<field type="string" name="description" from="/description/." />
<field type="integer" name="calories" from="@calories" />
</struct>

<array>
<array_element type="food item" from="/food" />
</array>
</field>
</struct>
</parser_generator>

Now, you run the parser generator on the above input file:

% sh parser-generator.sh --language=lisp \

This generates two files for you: types.lisp and reader.lisp. This is what types.lisp looks like:

(:use :common-lisp)
(:export #:food-item
#:name
#:price
#:description
#:calories

(defclass food-item ()
((name :initarg :name :type string)
(price :initarg :price :type string)
(description :initarg :description :type string)
(calories :initarg :calories :type integer)))

I will not bore you with all of reader.lisp as it’s 134 lines of code you never had to write. The only part you need to worry about is the parse function which takes a stream for or pathname to the XML and returns an instance of the menu class. Here is a small snippet though:

;;; =================================================================
;;; food-item struct
;;; =================================================================
(defmethod data progn ((handler sax-handler) (item food-item) path value)
(with-slots (name price description calories) item
(case path
(:|@name| (setf name value))
(:|@price| (setf price value))
(:|/description/.| (setf description value))
(:|@calories| (setf calories (parse-integer value))))))

### Where it’s at

I currently have the parser generator generating its own parser (five times fast). I still have a little bit more that I’d like to add to include assertions for things like the minimum number of elements in an array or the minimum value of an integer. I also have a few kinks to work out so that you can return some type other than an instance of a class for cases like this where the menu class just wraps one item.

My next step, though, is to get it generating Objective-C parsers.

Somewhere in there, I’ll post this to a public git repository.

## Speedy iPhone App Programming

February 4th, 2010 Patrick Stein

Sunday, I decided that I needed a simpler statistics tracking program to keep track of stuff while I’m coaching volleyball. I started out keeping them on paper, but felt that I was staring at the page too often to find where to put a tick mark.

Next, I tried the iVolleyStats Match iPhone app. It is pretty reasonable to use, but it’s got too many controls on the user interface. The only way that I could keep up with a match was to forgo half of the functionality… ignoring who is getting credit for an act, ignoring passing stats altogether, and not recording attack or block attempts at all.

For the past few weeks, I have tried using the Voice Memos application on the iPhone. I narrate the game into my phone as the game goes on. This lets me get really fine resolution of statistics, but it doesn’t give me any information in real-time. When I call a time-out, I am going from memory to say how we’ve been passing or hitting. This takes away the lion’s share of the benefit one gets from gathering statistics at all.

So, my team has a tournament this Saturday. I decided Sunday night that I should try to get together an iPhone app that does what I want. I’ve long been thinking about what I want in a volleyball stat tracking iPhone app. What I want will be a big, big undertaking (read: longer than one week). So, I started studying Apple’s CoreData APIs on Sunday night and Monday morning. Then, I dove in.

Now, my previous application was based upon Cocos2D-iPhone. As such, it didn’t involve any of the Apple UIKit classes or any work with Interface Builder. This application is navigation based with table views and custom table view cells from separate NIB files.

Despite this being my first real foray into the UIKit and CoreData APIs, I’ve got an application that I can use on Saturday. There are two more bits that I will try to add tomorrow, but that I don’t need for Saturday. To package it up for sale, there’s more functionality that I’ll need to add in case you don’t like things in the order that I have them on the screen or in case you want to add your own categories of statistic. And, I need to make some specialized visualization screens for some of the stats.

The screenshot here is the main stat-tracking interface. The stats present are Penn State’s stats from my trial run watching game one of their NCAA semifinal against Hawaii from last December. In retrospect, I think I probably gave them credit for two neutral attacks that I should have called free balls. Other than that, I think it’s pretty good. It’s definitely information that will do me well during time-outs.

I fought for a long time this morning trying to get my UIBarButtonItem to show up in my UINavigationBar for one screen. It turns out that these two methods don’t quite do the same thing on iPhoneOS 3.1.2:

// working version that shows my UIBarButton in the UINavigationBar
- (void)showStatTrackingScreen {
}

// version made of fail that does NOT show my UIBarButton in the UINavigationBar
// until you go forward a screen and pop back to this one.
trackStatsController,
nil]
animated:YES];
}

## Spelling iPhone App sent to Beta Testers

January 28th, 2010 Patrick Stein

I am pleased to say that I just sent my first iPhone app out to some friends to beta test. I expect to forward it along to Apple for inclusion in the App Store some time in the next week or two.

At this point, I am far more comfortable with Objective-C and the Cocoa class hierarchy than I was even a month ago. I still think Objective-C is awful. You take a nice functional Smalltalk-ish language, you throw away most of the functional, you pretend like you have garbage collection when you don’t, you strip out any form of execution control, you add some funky compiler pragma-looking things (including one called synthesize that only fabricates about half of what you’d want it to build), you change the semantics of ->, and then you interleave it with C! Wahoo! Instant headache!

But, after I found the for-each sort of construction, my code got quite a bit simpler. A whole bunch of loops like this:

NSEnumerator* ee = [myArray objectEnumerator];
MyItem* item;
while ( ( item = (MyItem*)[ee nextObject] ) != nil ) {
...
}

went to this:

for ( MyItem* item in myArrayOrEnumerator ) {
...
}

## Casting to Integers Considered Harmful

August 6th, 2009 Patrick Stein

### Background

Many years back, I wrote some ambient music generation code. The basic structure of the code is this: Take one queen and twenty or so drones in a thirty-two dimensional space. Give them each random positions and velocities. Limit the velocity and acceleration of the queen more than you limit the same for the drones. Now, select some point at random for the queen to target. Have the queen accelerate toward that target. Have the drones accelerate toward the queen. Use the average distance from the drones to the queen in the $i$-th dimension as the volume of the $i$-th note, where the notes are logarithmically spaced across one octave. Clip negative volumes to zero. Every so often, or when the queen gets close to the target, give the queen a new target.

It makes for some interesting ambient noise that sounds a bit like movie space noises where the lumbering enemy battleship is looming in orbit as its center portion spins to create artificial gravity within.

I started working on an iPhone application based on this code. The original code was in C++. The conversion to Objective-C was fairly straightforward and fairly painless (as I used the opportunity to try to correct my own faults by breaking things out into separate functions more often).

### Visualization troubles

The original code, though, chose random positions and velocities from uniform distributions. The iPhone app is going to involve visualization as well as auralization. The picture at the right here is a plot of five thousand points with each coordinate selected from a uniform distribution with range [-20,+20]. Because each coordinate is drawn independently from the same bounded range, the points fill a sharp-edged square, which looks very unnatural.

What to do? The obvious answer is to use Gaussian random variables instead of uniform ones. The picture at the right here is five thousand points with each coordinate selected from a Gaussian distribution with a standard deviation of 10. As you can see, this is much more natural looking.

### How did I generate the Gaussians?

I have usually used the Box-Muller method of generating two Gaussian-distributed random variables given two uniformly-distributed random variables:

(defun random-gaussian ()
(let ((u1 (random 1.0))
(u2 (random 1.0)))
(let ((mag (sqrt (* -2.0 (log u1))))
(ang (* 2.0 pi u2)))
(values (* mag (cos ang))
(* mag (sin ang))))))

But, I found an article online that shows a more numerically stable version:

(defun random-gaussian ()
(flet ((pick-in-circle ()
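(loop as u1 = (- (random 2.0) 1.0) ; uniform in [-1, 1), so both signs occur
as u2 = (- (random 2.0) 1.0)
as mag-squared = (+ (* u1 u1) (* u2 u2))
when (< 0.0 mag-squared 1.0) ; also reject the origin, where the log blows up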
return (values u1 u2 mag-squared))))
(multiple-value-bind (u1 u2 mag-squared) (pick-in-circle)
(let ((ww (sqrt (/ (* -2.0 (log mag-squared)) mag-squared))))
(values (* u1 ww)
(* u2 ww))))))

For a quick sanity check, I thought, let’s just make sure it looks like a Gaussian. Here, I showed the code in Lisp, but the original code was in Objective-C. I figured, if I just change the function declaration, I can plop this into a short C program, run a few thousand trials into some histogram buckets, and see what I get.

### The trouble with zero

So, here comes the problem with zero. I had the following main loop:

#define BUCKET_COUNT 33
#define STDDEV       8.0
#define ITERATIONS   100000

for ( ii=0; ii < ITERATIONS; ++ii ) {
int bb = val_to_bucket( STDDEV * gaussian() );
if ( 0 <= bb && bb < BUCKET_COUNT ) {
++buckets[ bb ];
}
}

I now present you with three different implementations of the val_to_bucket() function.

int val_to_bucket( double _val ) {
return (int)_val + ( BUCKET_COUNT / 2 );
}

int val_to_bucket( double _val ) {
return (int)( _val + (int)( BUCKET_COUNT / 2 ) );
}

int val_to_bucket( double _val ) {
return (int)( _val + (int)( BUCKET_COUNT / 2 ) + 1 ) - 1;
}

As you can probably guess, after years of reading trick questions, only the last one actually works as far as my main loop is concerned. Why? Casting a double to an integer in C truncates toward zero, so every number strictly between -1 and +1 becomes zero. That’s twice as big a range as any other integer gets. So, for the first implementation, the middle bucket has about twice as many things in it as it should. For the second implementation, the first bucket has more things in it than it should. For the final implementation, the non-existent bucket before the first one is the overloaded bucket. In the end, I used this implementation instead so that I wouldn’t even bias non-existent buckets:

int val_to_bucket( double _val ) {
return (int)lround(_val) + ( BUCKET_COUNT / 2 );
}