pat | nklein software

USerial Library — v0.3.2011.03.05 March 4th, 2011
Patrick Stein

I have released a new version of my serialization library. I hope no one has dug in too far on using it yet because I rearranged the interface a fair bit in this release. To accommodate more complex serializers and unserializers as well as supporting a with-buffer macro, the buffer is no longer the first argument to the serialize and unserialize methods. Now, it is a &key argument to the serialize and unserialize generics. Further, the serialize and unserialize generics also &allow-other-keys.

In an intervening and unannounced release, I added serializers for slots and accessors.

In this release, I have also really fleshed out the documentation and examples.

For instructions on obtaining and using the USerial library, please refer to the USerial library web page.

Edit: This had been v0.3.2011.03.04, but I made a minor update to add MIT License and correct a few glitches in the docs. Now, it’s v0.3.2011.03.05.

Swim, Zach! Swim! February 26th, 2011
Patrick Stein

It looks like Google Maps is having some longitude problems along the eastern coast of the U.S. This is from the CL-USERS Google Map:

Calculating the mean and variance with one pass February 15th, 2011
Patrick Stein

A friend showed me this about 15 years ago. I use it every time I need to calculate the variance of some data set. I always forget the exact details and have to derive it again. But, it’s easy enough to derive that it’s never a problem.

I had to derive it again on Friday and thought, I should make sure more people get this tool into their utility belts.

First, a quick refresher on what we’re talking about here. The mean $\mu$ of a data set ${ x_1, x_2, \ldots, x_n }$ is defined to be $\frac{1}{n} \sum_{i=1}^n x_i$ . The variance $\sigma^2$ is defined to be $\frac{1}{n} \sum_{i=1}^n (x_i - \mu)^2$ .

A naÃ¯ve approach to calculating the variance then goes something like this:

(defun mean-variance (data)
(flet ((square (x) (* x x)))
(let* ((n (length data))
(sum (reduce #'+ data :initial-value 0))
(mu (/ sum n))
(vv (reduce #'(lambda (accum xi)
(+ accum (square (- xi mu))))
data :initial-value 0)))
(values mu (/ vv n)))))

This code runs through the data list once to count the items, once to calculate the mean, and once to calculate the variance. It is easy to see how we could count the items at the same time we are summing them. It is not as obvious how we can calculate the sum of squared terms involving the mean until we’ve calculated the mean.

If we expand the squared term and pull the constant $\mu$ outside of the summations it ends up in, we find that:

$\frac{\sum (x_i - \mu)^2}{n} = \frac{\sum x_i^2}{n} - 2 \mu \frac{\sum x_i}{n} + \mu^2 \frac{\sum 1}{n}$

When we recognize that $\frac{\sum x_i}{n} = \mu$ and $\sum_{i=1}^n 1 = n$ , we get:

$\sigma^2 = \frac{\sum x_i^2}{n} - \mu^2 = \frac{\sum x_i^2}{n} - \left( \frac{\sum x_i}{n} \right)^2$ .

This leads to the following code:

(defun mean-variance (data)
(flet ((square (x) (* x x)))
(destructuring-bind (n xs x2s)
(reduce #'(lambda (accum xi)
(list (1+ (first accum))
(+ (second accum) xi)
(+ (third accum) (square xi))))
data :initial-value '(0 0 0))
(let ((mu (/ xs n)))
(values mu (- (/ x2s n) (square mu)))))))

The code is not as simple, but you gain a great deal of flexibility. You can easily convert the above concept to continuously track the mean and variance as you iterate through an input stream. You do not have to keep data around to iterate through later. You can deal with things one sample at a time.

The same concept extends to higher-order moments, too.

Happy counting.

Edit: As many have pointed out, this isn’t the most numerically stable way to do this calculation. For my part, I was doing it with Lisp integers, so I’ve got all of the stability I could ever want. 🙂 But, yes…. if you are intending to use these numbers for big-time decision making, you probably want to look up a really stable algorithm.

USerial Library — v0.1.2010.12.26 December 27th, 2010
Patrick Stein

I am putting together a networking library atop usocket for use in a multiplayer Lisp game. So far, I have implemented a library for serializing data into a byte buffer.

Here is a simple example of serializing some things into a buffer:

(make-enum-serializer :opcode (:login :run :jump :logout))
(make-bitfield-serializer :login-flags (:hidden :stay-logged-in))

(serialize* (:opcode :login
:uint32 sequence-number
:login-flags '(:hidden)
:string login-name
:string password) buffer)

You can unserialize those bits into existing places:

(let (opcode sequence-number flags login-name password)
(unserialize* (:opcode opcode
:uint32 sequence-number
:login-flags flags
:string login-name
:string password) buffer)
...)

You can unserialize them into newly-created variables for use within a body:

(unserialize-let* (:opcode opcode
:uint32 sequence-number
:login-flags flags
:string login-name
:string password) buffer
...)

Or, you can unserialize them into a list:

(let ((parts (unserialize-list* (:opcode
:uint32
:login-flags
:string
:string) buffer)))
...)

You can find out more about the serialization library on my unet page.

A Currying Pipeline November 20th, 2010
Patrick Stein

I have been reading a great deal about Haskell and thinking a great deal about a networked Lisp game that I intend to work on soon. For the Lisp project, I will need to serialize and unserialize packets to send them over the network. I re-read the chapter on parsing binary files in Practical Common Lisp and started to think about how I could make readers and writers that worked on buffers. Thanks to the Haskell influence, I was also trying to do this serialization without side effects.

The Goal

I wanted to accomplish something like this without all of the SETF action and verbiage:

(let (vv)
(setf vv (serialize 'real xx vv))
(setf vv (serialize 'real yy vv))
(setf vv (serialize 'real zz vv))
vv)

The First Attempt

Well, also thanks to Haskell, my instinct was to make a CURRY-PIPELINE macro that gets called something like this:

(curry-pipeline nil
(serialize 'real xx)
(serialize 'real yy)
(serialize 'real zz))

and expands into something like this:

(serialize 'real zz (serialize 'real yy (serialize 'real xx nil)))

Unfortunately, this changes the order of evaluation of xx, yy, and zz entirely. So, this was suboptimal. It also involved a fifteen-line macro with macro recursion and two conditionals.

Slightly Cleaner

My next attempt was about a ten-line macro with one conditional that turned it into a bunch of nested LET statements.

(let ((#:V-1 nil))
(let ((#:V-1 (serialize 'real xx #:V-1)))
(let (#:V-1 (serialize 'real yy #:V-1)))
(let (#:V-1 (serialize 'real zz #:V-1)))
#:V-1))))

Cleaner Still

Then, I realized that most of my simple examples would be simplest if I curried into the first argument instead of the last. (In fact, it would have even fixed my order of evaluation problem in the initial version.) And, I realized that I could abandon the nested LET if I used a LET*. Now, I have a six-line macro that I really like.

(defmacro curry-pipeline (initial &body body)
(let ((vv (gensym "VARIABLE-")))
(labels ((curry-and-store (line)
`(,vv (,(first line) ,vv ,@(rest line)))))
`(let* ((,vv ,initial)
,@(mapcar #'curry-and-store body))
,vv))))

So, here is an example that does one of those grade school magic tricks. Pick a number, multiply by five, add six, multiply by four, add nine, and multiply by five. Tell me the answer. I subtract 165 and divide by one hundred to tell you your original number.

(defun magic-trick (n)
(let ((rube (curry-pipeline n
(* 5)
(+ 6)
(* 4)
(+ 9)
(* 5))))
(curry-pipeline rube
(- 165)
(/ 100))))

Back to the Original Problem

Now, my serialize functions can simply take a buffer and return a buffer which is the concatenation of the input buffer and the bytes required to encode the given value. The unserialize is not as nice since I will have to return both a buffer and a value, but I am sure I can work something out using a CONS as an accumulator. And, heck, it is going to kill my performance anyway if I really copy the buffer every time I want to add another item to it. I am probably going to ditch the functional aspect anyway. *shrug*

A New Tack

If you don’t like currying or need to have more control over where the accumulator goes in each step of the chain, you can still get the same kind of chaining if you require a declaration of the variable. And, it simplifies the macro:

(defmacro let-pipeline ((var initial) &body body)
`(let* ((,var ,initial)
,@(mapcar #'(lambda (line) `(,var ,line)) body))
,var))

(defun magic-trick (n)
(let ((rube (let-pipeline (r n)
(* 5 r)
(+ 6 r)
(* 4 r)
(+ 9 r)
(* 5 r))))
(let-pipeline (a rube)
(- a 165)
(/ a 100))))

From there, it is not too big of a leap to allow MULTIPLE-VALUE-BIND instead. To accomplish the unserialize, as I mentioned, I will need to track multiple values. I am now down to a four line macro:

(defmacro mv-pipeline ((vars call &rest rest-pipeline) &body body)
`(multiple-value-bind ,vars ,call
,(if rest-pipeline
`(mv-pipeline ,rest-pipeline ,@body)
`(progn ,@body))))

Here is an easy to follow example that returns the first five Fibonacci numbers:

(mv-pipeline ((f0 f1) (values 0 1)
(f2) (+ f1 f0)
(f3) (+ f2 f1)
(f4) (+ f3 f2))
(list f0 f1 f2 f3 f4))

Now, my unserialize case might look something like this:

(defmethod unserialize ((type (eq 'vector)) buffer)
(mv-pipeline ((xx buffer) (unserialize 'real buffer)
(yy buffer) (unserialize 'real buffer)
(zz buffer) (unserialize 'real buffer))
(values (vector xx yy zz) buffer)))

Conclusion

Luckily, I am not paying myself based on lines of code per hour. Almost every time I have done more work, I have reduced the number of lines of code. I am reminded of this quote from Blaise Pascal: I have made this letter longer, because I have not had the time to make it shorter.

nklein software

software development and consulting

USerial Library — v0.3.2011.03.05 March 4th, 2011
Patrick Stein

Swim, Zach! Swim! February 26th, 2011
Patrick Stein

Calculating the mean and variance with one pass February 15th, 2011
Patrick Stein

USerial Library — v0.1.2010.12.26 December 27th, 2010
Patrick Stein

A Currying Pipeline November 20th, 2010
Patrick Stein

The Goal

The First Attempt

Slightly Cleaner

Cleaner Still

Back to the Original Problem

A New Tack

Conclusion

RSS Feeds

Links

Categories

Recent Posts

Archives

nklein software

software development and consulting

USerial Library — v0.3.2011.03.05 March 4th, 2011 Patrick Stein

Swim, Zach! Swim! February 26th, 2011 Patrick Stein

Calculating the mean and variance with one pass February 15th, 2011 Patrick Stein

USerial Library — v0.1.2010.12.26 December 27th, 2010 Patrick Stein

A Currying Pipeline November 20th, 2010 Patrick Stein

The Goal

The First Attempt

Slightly Cleaner

Cleaner Still

Back to the Original Problem

A New Tack

Conclusion

RSS Feeds

Links

Categories

Recent Posts

Tags

Archives

USerial Library — v0.3.2011.03.05 March 4th, 2011
Patrick Stein

Swim, Zach! Swim! February 26th, 2011
Patrick Stein

Calculating the mean and variance with one pass February 15th, 2011
Patrick Stein

USerial Library — v0.1.2010.12.26 December 27th, 2010
Patrick Stein

A Currying Pipeline November 20th, 2010
Patrick Stein