- Buffer Handling
- Serializing and Unserializing
- Sample Application: Game Protocol
- The USerial home page: http://nklein.com/software/unet/userial/
- The tarball: userial_0.8.2011.06.02.tar.gz
- The signature: userial_0.8.2011.06.02.tar.gz.asc
- The main git repository: http://git.nklein.com/lisp/libs/userial.git/
- Browsable git mirror: https://github.com/nklein/userial
- Issue reporting: https://github.com/nklein/userial/issues
The USerial library is a general purpose library for serializing items into byte buffers and unserializing items back out of byte buffers. The “Buffer Handling” section below describes the various ways one can manipulate USerial buffers. The “Serializing and Unserializing” section below describes the different tools the USerial library provides for creating serializers and the serializers that the library provides out of the box. The “Sample Application: Game Protocol” section below describes how one might put all of these functions to use in preparing network packets for a simple game.
The USerial library uses the ContextL library to provide versioned serialization. The ContextL library allows one to define layers and then define functions within those layers. The serialize and unserialize functions in the USerial library are ContextL layered functions. As such, one can define a hierarchy of ContextL layers and define serialize functions for different items at different points in the hierarchy.
For example, suppose one wanted to implement versions 1.0, 1.1, and 1.0.1 with both 1.1 and 1.0.1 extending 1.0 but unrelated to each other. One could use ContextL’s deflayer macro as follows:
(deflayer :v1.0)
(deflayer :v1.1 (:v1.0))
(deflayer :v1.0.1 (:v1.0))
Then, one could use the USerial library’s make-list-serializer macro to create a serializer that encodes a list of integers as a list of 16-bit integers in version 1.0 and as a list of 32-bit integers in version 1.1. (This assumes one already has serializers for :int16 and :int32, which USerial has by default.)
(make-list-serializer :list-of-int :int16 :layer :v1.0)
(make-list-serializer :list-of-int :int32 :layer :v1.1)
Then, the appropriate serialization would occur depending on which layer is activated in ContextL.
(with-active-layers (:v1.0)
  (serialize :list-of-int '(3 4 5))) => encoded as :int16
(with-active-layers (:v1.0.1)
  (serialize :list-of-int '(3 4 5))) => encoded as :int16
(with-active-layers (:v1.1)
  (serialize :list-of-int '(3 4 5))) => encoded as :int32
Note: if versioning is not yet a consideration, one can omit the layer keyword parameter to make-list-serializer to simply define the default serializer and unserializer.
To serialize, one needs a place to put the data. To unserialize, one needs a place from which to fetch the data. Some libraries choose to implement such things as streams. The USerial library serializes to and unserializes from memory buffers because the primary goal for this library is to facilitate assembly and disassembly of datagram packets.
The USerial library uses adjustable arrays of unsigned bytes with fill pointers. The fill pointer is used to track the current position in the buffer for serializing or unserializing. The buffers are automatically resized to accommodate the serialized data.
The basic types and constants used for buffer-related operations are described in the “Buffer-related Types and Constants” section below.
The USerial library provides a function for allocating a new buffer. This function is described in the “Creating Buffers” section below.
USerial library routines use the buffer in the special variable *buffer*. There is a macro one can use to execute a body of statements with a particular buffer. This macro is described in the “Using a Buffer” section below.
There are a variety of functions provided to allow one to query and manipulate the size of USerial buffers. These functions are described in the “Manipulating and Querying Buffer Sizes” section below.
There are some basic functions for adding an unsigned byte to a buffer and retrieving an unsigned byte from a buffer. Those functions are described below in the “Adding and Retrieving Bytes” section.
The USerial buffers are adjustable arrays of unsigned bytes with fill pointers. The array is adjustable so that it can be easily grown as needed to accommodate serialized data. It has a fill pointer that is used to track the current length of serialized data (as distinguished from the current allocated capacity of the array) or the current point from which data will be unserialized.
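The distinction between the fill pointer and the allocated capacity can be seen with plain Common Lisp arrays. The following is a minimal sketch using only standard functions; the names make-demo-buffer and demo-add-byte are hypothetical and are not part of USerial, whose own implementation may differ.

```lisp
;; Illustrative sketch only; these helpers are not USerial functions.
(defun make-demo-buffer (&optional (capacity 32))
  "Create an adjustable byte vector whose fill pointer starts at zero."
  (make-array capacity
              :element-type '(unsigned-byte 8)
              :adjustable t
              :fill-pointer 0))

(defun demo-add-byte (byte buffer)
  "Append BYTE at the fill pointer, growing the array if it is full."
  (vector-push-extend byte buffer)
  buffer)

(let ((buf (make-demo-buffer 2)))
  (dolist (b '(1 2 3))
    (demo-add-byte b buf))
  ;; LENGTH respects the fill pointer; ARRAY-DIMENSION reports the capacity.
  (list (length buf) (array-dimension buf 0)))
```

Here length reports the three bytes written so far, while array-dimension reports however much space the implementation allocated when it grew the array.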
When one creates a USerial buffer, one can provide the initial capacity for the buffer. If no initial capacity is given for the buffer, a default size is used.
When one is adding bytes to a buffer, it would be very inefficient to reallocate the buffer each time an additional byte of space is needed. To this end, when the USerial library needs to increase the size of a buffer, it adds at least the minimum of the current buffer size and a fixed maximum expansion (8,192 bytes, judging from the example that follows).
For example, if the buffer were currently 256 bytes when the buffer needed to grow by a byte, it would be expanded to 512 bytes. If the buffer were currently 10,000 bytes when the buffer needed to grow by a byte, it would be expanded to 18,192 bytes.
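The growth rule can be written out as a short calculation. This is a sketch, not USerial's code: the function name next-capacity is hypothetical, and the 8,192-byte limit is inferred from the two figures in the example above.

```lisp
;; Hypothetical helper; the 8192 limit is inferred from the worked example.
(defun next-capacity (current &optional (max-expand 8192))
  "Return the capacity after growing a CURRENT-byte buffer by the minimum step."
  (+ current (min current max-expand)))

(list (next-capacity 256)     ; small buffers double
      (next-capacity 10000))  ; large buffers grow by the fixed limit
```

Small buffers therefore double, while large buffers grow by a bounded amount rather than doubling indefinitely.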
The buffer allocator itself is the make-buffer function. It takes an optional parameter specifying the initial capacity of the buffer.
(defun make-buffer (&optional initial-capacity))
As the buffer will be resized as needed, this parameter need not be set high enough to accommodate any and all serializations. It is provided merely to avoid reallocating the buffer several times when one can provide a decent, probable upper bound on the serialized size of the contents.
The following macro allows one to specify the buffer to use for the USerial library calls dynamically within the body.
This macro binds the dynamic variable *buffer* to the given buffer for the duration of the body.
To retrieve the current value of
*buffer*, one can use the function
When serializing a buffer, the buffer-length function returns the current length of the serialized data within the buffer. When unserializing a buffer, the buffer-length function returns the length of the serialized data that has already been unserialized from the buffer.
(defun buffer-length ()
The current allocated size of a buffer can be queried with the buffer-capacity function. One can (setf ...) the buffer-capacity if needed to explicitly modify the amount of buffer space allocated.
(defun buffer-capacity ()
(setf (buffer-capacity) (integer 0 array-dimension-limit))
One can advance the current position within the buffer either to save space for later serialization or to skip over bytes during unserialization.
(defun buffer-advance (&optional (amount 1))
If no amount is specified, the buffer-advance function advances by a single byte.
One can reset the current position within the buffer back to the beginning to begin unserializing a serialized buffer, to fill in places that one skipped during the first stage of serialization, or to re-use the same buffer for the next serialization.
(defun buffer-rewind ()
At its base, the buffer class is an adjustable array of unsigned bytes. To add a byte to a buffer, one can use the following function. This function will expand the buffer if needed, place the given byte at the current fill pointer and advance the fill pointer.
(defun buffer-add-byte (byte &key (buffer *buffer*))
Similarly, to retrieve an unsigned byte from a buffer, one can use the following function. This function will retrieve the byte at the current fill pointer and advance the fill pointer.
(defun buffer-get-byte ())
The ultimate purpose of the USerial library is to allow one to serialize and unserialize data. To this end, the library defines two ContextL layered functions that dispatch on a keyword parameter. These ContextL layered functions are described in the “Serializing and Unserialize Layered Functions” section below.
There are some macros that facilitate serializing and unserializing sequences of items. These macros are described in the “Serializing and Unserializing Multiple Items” section below.
There are other macros which facilitate defining new serialize and unserialize methods for common situations. These macros are described in the “Defining New Serializers” section below.
There is a way to create a pair of functions where one serializes its arguments to a buffer and the other unserializes those arguments and executes a given body. This is described in the “Function Call Serialization” section below.
There are a variety of pre-defined serialize and unserialize methods. These are described in the “Pre-defined Serializers” section below.
The ContextL layered function used to serialize items takes a keyword as its first parameter, a value as its second parameter, and optional key arguments. The keyword is used to dispatch the appropriate implementation of the function for the given value. The serialize methods serialize the value into
*buffer* and return
(define-layered-function serialize (keyword value &key &allow-other-keys))
The ContextL layered function used to unserialize items takes a keyword as its first parameter and optional key arguments. The keyword is used to dispatch the appropriate implementation of the function. The unserialize methods unserialize a value from
*buffer* and return the value and
(define-layered-function unserialize (keyword &key &allow-other-keys))
For most purposes, one wants to serialize more than one thing into a given buffer. The USerial library provides some convenience macros so that one is not forced to explicitly call serialize or unserialize for each item. Here is an example of explicitly calling the serialize method for each item.
(serialize :string login-name)
(serialize :string password)
(serialize :login-flags '(:hidden))
The first such macro is serialize*. With this macro, one specifies the keywords and values explicitly. With it, the above example could be serialized as follows.
(serialize* :string login-name
            :string password
            :login-flags '(:hidden))
To unserialize from the resulting buffer, one could explicitly call unserialize for each item in the buffer storing each item explicitly into a place.
(setf opcode (unserialize :opcode)
login-name (unserialize :string)
password (unserialize :string)
flags (unserialize :login-flags))
To do the same sort of thing more directly, one can use the unserialize* macro. This macro allows one to unserialize from a given buffer into given places using given keywords on which to dispatch.
(unserialize* :opcode opcode
              :string login-name
              :string password
              :login-flags flags)
Another way one might have used explicit calls to unserialize is to bind each variable in a let* as it is unserialized.
(let* ((opcode (unserialize :opcode))
       (login-name (unserialize :string))
       (password (unserialize :string))
       (flags (unserialize :login-flags)))
  ...)
To condense the above, one can use the unserialize-let* macro. It takes a list of keyword/variable-name pairs, a buffer, and a body of statements to execute while the named variables are in scope. Note: the buffer argument here is required, not optional.
Suppose one wanted to unserialize into a list (as this is Lisp after all). One could explicitly call unserialize for each item in the list.
To eliminate a great deal of typing the word unserialize, one can use unserialize-list*. It takes a list of keywords and an optional buffer. It returns a list as the first value and the buffer as the second value.
unserialize-list* is a function rather than a macro (hence the quote before the list in the above example).
For items in classes or structs, one can serialize and unserialize them using slots or accessors. If one had a
person struct with slots
hair-color, one might do either of the following to serialize an instance
Similarly, one could then unserialize that data in either of the following ways.
unserialize-accessors* return the object as the first return value and the buffer as the second return value. One can reasonably use
(make-person) or some other factory in place of the
*person-instance* in the above examples.
This section describes some macros available for creating serialize and unserialize methods. Many of these macros allow the keyword parameters layer and extra. If the layer keyword parameter is given, then the serializer and unserializer generated will be in the ContextL layer specified. If the extra keyword parameter is given, those extra keywords are available for the serializer and unserializer. This will be clearer after some examples.
At the most basic level, one can define a serializer using the define-serializer macro and the corresponding unserializer using the define-unserializer macro.
(define-unserializer (key &key layer extra)
For example, assuming there is already a
:float32 encoder, one might define a serializer that serializes a vector’s offset from a specified reference point and define an unserializer that does the inverse given the same reference point.
(define-serializer (:offset-vector point :extra (reference))
  (serialize* :float32 (- (vec-x point) (vec-x reference))
              :float32 (- (vec-y point) (vec-y reference))))
(define-unserializer (:offset-vector :extra (reference))
(unserialize-let* (:float32 dx :float32 dy) buffer
(make-vec :x (+ (vec-x reference) dx)
:y (+ (vec-y reference) dy))))
If layer :v2.0 extended vectors to three dimensions, then one could add an additional serializer in that layer:
(define-serializer (:offset-vector point :layer :v2.0 :extra (reference))
  (serialize* :float32 (- (vec-x point) (vec-x reference))
              :float32 (- (vec-y point) (vec-y reference))
              :float32 (- (vec-z point) (vec-z reference))))
Almost every protocol requires the encoding and decoding of integer values. To make it easy to create as many of these types as one’s application requires, the USerial library defines a macro which creates a serialize and unserialize method for an integer that is a given number of bytes long. The macro takes two arguments: the key used to specify the method and an integer number of bytes.
For example, to make serialize and unserialize methods for signed bytes and signed quadwords with signed quadwords only available when layer :v1.2 is active, one could simply call:
(make-int-serializer :signed-byte 1)
(make-int-serializer :signed-quadword 8 :layer :v1.2)
This macro will serialize integers in big-endian two’s complement form for the greatest compatibility with standard protocols.
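To make the byte layout concrete, here is a minimal sketch of big-endian two's complement encoding in plain Common Lisp. The function names encode-int-be and decode-int-be are hypothetical; this illustrates the format named above rather than reproducing the code the macro generates.

```lisp
;; Illustrative helpers; not part of USerial.
(defun encode-int-be (value nbytes)
  "Return a list of NBYTES octets for VALUE, most significant octet first."
  (loop for shift from (* 8 (1- nbytes)) downto 0 by 8
        collect (ldb (byte 8 shift) value)))

(defun decode-int-be (octets)
  "Rebuild a signed integer from a non-empty list of big-endian octets."
  (let* ((nbits (* 8 (length octets)))
         (unsigned (reduce (lambda (acc octet) (+ (ash acc 8) octet))
                           octets :initial-value 0)))
    ;; A set top bit means the value is negative in two's complement.
    (if (logbitp (1- nbits) unsigned)
        (- unsigned (ash 1 nbits))
        unsigned)))

(encode-int-be -2 2)          ; octets 255 and 254
(decode-int-be '(255 254))    ; back to -2
```

Because ldb treats negative integers as infinitely sign-extended two's complement values, no special case is needed for negative input on the encoding side.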
Similarly, if one wanted to create serialize and unserialize methods for unsigned bytes and unsigned doublewords, one could use the following macro:
(make-uint-serializer :unsigned-byte 1)
(make-uint-serializer :unsigned-doubleword 4)
Note: the bytes argument to the make-int-serializer and make-uint-serializer macros must be a constant value available at the time the macro is expanded.
To serialize floating point numbers, one must have a function that encodes floating point numbers into an integer representation and a function that decodes the integer representation back into a floating point number. Then, one can use the make-float-serializer macro which takes a key used to specify the method, a Lisp type for the floating point number, a constant number of bytes for the encoded values, an encoder, and a decoder.
For example, the following would create serializers that encode rational numbers (technically not floating point, I know) as 48-bit fixed point numbers with 16-bits devoted to the fractional portion and 32-bits devoted to the integer portion.
#'(lambda (rr) (round (* rr 65536)))
#'(lambda (ii) (/ ii 65536)))
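The encoder and decoder above can be tried on their own to see what the fixed-point representation preserves. In this sketch the two lambdas are simply given the hypothetical names encode-fixed and decode-fixed.

```lisp
;; The two lambdas from the example, named for experimentation.
(defun encode-fixed (rr)
  "Scale RR by 2^16 and round to the nearest integer."
  (round (* rr 65536)))

(defun decode-fixed (ii)
  "Undo the scaling, returning an exact rational."
  (/ ii 65536))

(decode-fixed (encode-fixed 3/2))   ; exactly representable, so 3/2 returns
(decode-fixed (encode-fixed 1/3))   ; rounded to the nearest 1/65536
```

Values that are multiples of 1/65536 round-trip exactly; anything else comes back as the nearest such multiple.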
The USerial library defines macros for helping one encode bit fields (to represent choices where more than one possibility at a time is acceptable) and enumerations (to represent choices where only a single selection can be made). These macros take a keyword used to specify the method and a list of choices.
(make-enum-serializer :direction (:left :right :up :down) &key layer)
With the bit field serializer, one can specify a single option or a list of zero or more options. With the enumeration serializer, one must specify a single option.
(serialize :wants nil)
(serialize :wants '(:tea :sega))
(serialize :direction :up)
When unserializing, the bit field will always return a list even when there is a single item in it as in the
:tea example above.
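One plausible way to picture the difference between the two encodings is sketched below in plain Common Lisp. The function names, the bit assignment, and the :coffee choice are all assumptions made for illustration; the actual wire format USerial produces may differ.

```lisp
;; Illustrative sketch; bit order and names are assumptions, not USerial's.
(defun encode-bitfield (choices selected)
  "Set one bit per selected choice; SELECTED may be a symbol or a list."
  (let ((selected (if (listp selected) selected (list selected))))
    (loop for choice in choices
          for bit from 0
          when (member choice selected)
            sum (ash 1 bit))))

(defun decode-bitfield (choices value)
  "Always return a list, even when only one bit is set."
  (loop for choice in choices
        for bit from 0
        when (logbitp bit value)
          collect choice))

(defun encode-enum (choices selected)
  "An enumeration carries exactly one choice, encoded as its position."
  (or (position selected choices)
      (error "~S is not one of ~S" selected choices)))

(decode-bitfield '(:coffee :tea :sega)
                 (encode-bitfield '(:coffee :tea :sega) :tea))
```

The round trip of a single option yields a one-element list, which mirrors the behavior described above for unserializing bit fields.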
To facilitate serializing and deserializing classes and structs, the USerial library provides macros which create serializers and unserializers for items based on slots or accessors. These macros take a key used to specify the methods, a factory form used by the unserialize method to create a new instance of the class or struct, and a plist of key/name pairs where the name is a slot name for the slot serializers or an accessor name for the accessor serializers and the key with each name specifies how to serialize the value in that slot.
An example will help to clarify the previous paragraph. Suppose one had a simple struct listing a person’s name, age, and favorite color.
One could create the following serialize and unserialize pairs to allow encoding the data for internal use (where all data is available) or for public use (where the age is kept secret).
:string name :uint8 age :string color)
(make-accessor-serializer (:person-public pp
(make-person :age :unknown))
:string person-name :string person-color)
A layer keyword argument can be given following the three required parts of the opening form of each of those macros. The first item in the opening form is the keyword used to identify the serializer and unserializer. The second item is the variable name used as a keyword parameter in the unserialize method to allow one to pass in a pre-initialized version of the object. The third item is a factory form used in the unserializer to create a new instance of a person if one is not given.
(unserialize :person-public :pp *instance*) => *instance*
Here is a simple session showing the above in action. The following code first defines a function which serializes a value using a given key to a new buffer, rewinds the buffer, and unserializes from the buffer using the key.
(serialize key value)
(nth-value 0 (unserialize key))))
CL-USER> (defvar *p* (make-person :name "Patrick"
CL-USER> (u-s :person-internal *p*)
#S(PERSON :NAME "Patrick" :AGE 40 :COLOR "Green")
CL-USER> (u-s :person-public *p*)
#S(PERSON :NAME "Patrick" :AGE :UNKNOWN :COLOR "Green")
One can also define a serialize/unserialize pair for a list where each item uses the same serializer. For example, one can do the following:
(serialize :string-list (list "a" "b" "c" "d"))
make-list-serializer can also take a
layer keyword parameter.
make-list-serializer encodes the length of the list, then the elements. If the length is known from context, one might instead use the make-vector-serializer macro, which serializes a sequence of a given length and unserializes to a vector. The make-vector-serializer macro takes three parameters: the key to use for this serializer, the key used to encode each element of the sequence, and the length of the sequence. For example, one might have something like this:
(make-vector-serializer :pixel :uint8 3)
(make-vector-serializer :pixel :uint8 4 :layer :v2.0)
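The trade-off between the two layouts can be sketched with plain lists of octets. The helper names below are hypothetical, and the single-octet length prefix is a simplifying assumption rather than USerial's actual length encoding.

```lisp
;; Illustrative sketch; a one-octet length prefix is assumed for brevity.
(defun encode-byte-list (items)
  "Length first, then the elements (each assumed to fit in one octet)."
  (cons (length items) (copy-list items)))

(defun decode-byte-list (octets)
  "Read the length prefix, then that many elements."
  (subseq (rest octets) 0 (first octets)))

(defun decode-byte-vector (octets length)
  "No prefix on the wire: LENGTH must be known from context."
  (coerce (subseq octets 0 length) 'vector))

(encode-byte-list '(3 4 5))            ; prefix 3, then the elements
(decode-byte-vector '(255 128 0) 3)    ; a three-component pixel
```

The fixed-length form saves the prefix but only works when both sides agree on the length ahead of time, as with the three- and four-component pixels above.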
The USerial library allows one to create serializers with slots or accessors which allow a portion of the struct (or class) slots to be used as a key on the unserializing side to locate the object. An example should serve to illustrate. Given the above
person struct and a
find-person-by-name function, one might create a serializer which encodes a person’s name and a new color such that the unserializing side will find the person struct for the named person and set the color.
(:string (person-name found-person) pname)
;; store the color and then change it locally
(defparameter *p* (make-person :name "Patrick" :age 40 :color "Green"))
(serialize :set-person-color (make-person :name "Patrick" :color "Orange")
;; restore the serialized color "Green"
=> #S(PERSON :NAME "Patrick" :AGE 40 :COLOR "Orange")
One can incorporate more variables into either the key or the values and use either slots or accessors.
(:string name pname)
(make-key-slot-serializer (:change-color-by-name-and-age found-person
(:string name p-name
:uint8 age p-age)
(find-person-by-name-and-age p-name p-age))
The USerial library allows one to create serializers which unserialize into global variables. For example, suppose one had global variables *moderator* and *author*. One could use the :person-internal serializer above to create a serializer that lets one serialize a person that will be unserialized into one of the above variables.
(defparameter *author* nil)
(make-global-variable-serializer (:global-person :person-internal)
;; serialize the moderator and clear it
(serialize :global-person '*moderator*)
(setf *moderator* nil)
;; rewind the buffer and restore the moderator
*moderator* => #S(PERSON :NAME "Patrick" :AGE 40 :COLOR "Green")
The USerial library allows one to alias serializers. For example, if one were encoding object ids as unsigned 32-bit integers, one could do either of the following:
(make-alias-serializer :object-id :uint32)
The alias is not a direct alias. It will involve one more function call than using the direct serializer, but it provides a convenient, self-documenting way to describe equivalent serializations.
With the define-serializing-funcall macro, one can create a function which serializes its arguments to a buffer and a corresponding function which unserializes those arguments to execute a function body. The macro takes two introductory forms and then a body. The first introductory form contains the name for the serializing function, the name for the unserializing function, and an optional layer keyword argument. The second form contains a USerial-enhanced lambda list for the function. The enhanced lambda list precedes each required, optional, or keyword parameter with a USerial serialization keyword.
For example, given the
person struct used in the previous section, one might choose to do something like this:
(:string name :uint8 age)
(let ((pp (find-person-by-name name)))
(setf (person-age pp) age))
Then, the sending side of the application can serialize a call to this function as follows:
=> #(7 80 97 116 114 105 99 107 39)
The receiving side of the application can unserialize this function and invoke the given function body (assuming that
find-person-by-name succeeds for “Patrick”).
=> #S(PERSON :NAME "Patrick" :AGE 39 :COLOR "Green")
#(7 80 97 116 114 105 99 107 39)
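The byte vector shown above can be read by hand: a length of 7, the seven character codes of "Patrick", and the age 39. The sketch below decodes it with standard Common Lisp only. The function name decode-set-age-call is hypothetical, and the code assumes a single-octet length prefix and ASCII-only names, which holds for this sample but not for strings in general.

```lisp
;; Illustrative decoder for the sample bytes; not a USerial function.
(defun decode-set-age-call (octets)
  "Return the name and age held in OCTETS as two values."
  (let* ((name-length (aref octets 0))
         (name-end (+ 1 name-length))
         (name (map 'string #'code-char (subseq octets 1 name-end)))
         (age (aref octets name-end)))
    (values name age)))

(decode-set-age-call #(7 80 97 116 114 105 99 107 39))
```

Note that nothing in the buffer identifies which function was called; the receiving side has to know which unserializing function to apply.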
One can use required, &optional, &key, and &aux parameters. The required, &optional, and &key parameters must all be given a serialization keyword. The &aux parameters do not use a serialization keyword.
The serialization side does not accept &allow-other-keys. If the enhanced lambda list contains &allow-other-keys, it can be used on the unserializing side.
Note: omitted &optional and &key parameters are given their defaults as compiled on the receiving side. So, if one serializes a function call and sends that buffer to a second machine which has different default values for some parameters, the values used for defaults will be those compiled on the receiving side. As such, the description of default values on the sending side may not necessarily reflect the default values that the receiving side will actually use. It is up to the application programmer whether this possibility is worth the expense of explicitly specifying the parameter on the sending side even if the desired value is the default.
The USerial library defines some commonly required serializers.
For symbols, the USerial library defines serializers (and unserializers) for
(serialize :symbol 'cl:first)
For signed integers, the USerial library defines the following serializers (and unserializers):
:int8 for signed bytes,
:int16 for signed 16-bit integers,
:int32 for signed 32-bit integers,
:int64 for signed 64-bit integers, and
:int for arbitrarily large integers.
For unsigned integers, the USerial library defines the following serializers:
:uint8 for unsigned bytes,
:uint16 for unsigned 16-bit integers,
:uint24 for unsigned 24-bit integers,
:uint32 for unsigned 32-bit integers,
:uint48 for unsigned 48-bit integers,
:uint64 for unsigned 64-bit integers, and
:uint for arbitrarily large unsigned integers.
For floating point numbers, the USerial library defines the
:float32 serializer for encoding
single-float values as 32-bit IEEE floating point numbers and the
:float64 serializer for encoding
double-float values as 64-bit IEEE floating point numbers. The USerial library uses the ieee-float library to encode and decode floating point numbers.
For arbitrary byte sequences, the USerial library defines the :bytes serializer. These are encoded as a :uint length and then the raw bytes. To include a raw sequence of bytes in the serialization without the length ahead of it, one can use the :raw-bytes serializer.
(serialize :raw-bytes bytes-array
           &key (start 0) (end (length bytes-array)))
(unserialize :raw-bytes byte-array
&key output (start 0) (end (length output)))
For strings, the USerial library defines the
:string serializer for encoding strings as UTF-8 encoded sequences of arbitrary length. The USerial library uses the trivial-utf-8 library to encode and decode UTF-8 strings.
For enumerated types, the USerial library defines the :boolean serializer for encoding an option that will be either true or false.
This example shows how one might use the tools above to serialize the data that would need to be exchanged between a client and server to implement a two-player game similar to Milton-Bradley’s Battleship game.
For this game, there will be a server and two clients. Each client will begin the game by placing his ships on a (2K+1)x(2K+1) board. The board will have coordinates ranging from -K through +K along both the X and Y axes. Ships will have to be placed either horizontally or vertically at integer coordinates. All ships are three units in length. It takes only one missile shot to sink a ship.
Once the ships are placed, regular play begins. On each turn of regular play, a client can either ping or fire. Each client begins with a defined amount of energy available with which to ping and a defined number of missiles.
If the client chooses to ping, the client chooses the radius of the ping and its center of origin. The server will calculate the distance from the center of origin to each enemy ship within the specified radius from the origin, round those distances to the nearest integer, and reply to the client with that list.
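The server-side ping calculation might look something like the following sketch. The function name ping-distances and the representation of each ship as an (x y) center are assumptions made for illustration; the text does not say which point of a three-unit ship the distance is measured to.

```lisp
;; Illustrative sketch; ships are assumed to be given as (x y) centers.
(defun ping-distances (cx cy radius ships)
  "Return the rounded distance to each ship within RADIUS of (CX, CY)."
  (loop for (sx sy) in ships
        for distance = (sqrt (+ (expt (- sx cx) 2)
                                (expt (- sy cy) 2)))
        when (<= distance radius)
          collect (round distance)))

;; The ship at (10 10) lies outside the radius and is left out of the reply.
(ping-distances 0 0 5 '((3 4) (10 10) (0 1)))
```

Because only rounded distances are returned, the reply reveals how far away each nearby ship is but not its direction.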
If the client chooses to fire, the client chooses the location upon which to fire. The server will respond to the client to tell him whether the shot was a hit or a miss.
To facilitate handling of received messages, each message will begin with an opcode identifying the message type. Some messages will be sent only from the client to the server. Others will be sent only from the server to the client.
(:login :place-ship :ping :fire))
(:welcome :ack :sunk :shot-results))
The message-receiving portion on the server side could then do something like this:
(ecase (unserialize :client-opcodes)
(:login (handle-login-message message))
(:place-ship (handle-place-ship-message message))
(:ping (handle-ping-message message))
(:fire (handle-fire-message message)))))
To begin a game, the client sends a message to the server with opcode
:login. The message declares the player’s name, which board sizes the client will play, and an optional name of an opponent that the client is waiting to play.
(:small :medium :large :huge))
(defun make-login-message (name &key opponent small medium large huge)
(let ((sizes (append (when small '(:small))
(when medium '(:medium))
(when large '(:large))
(when huge '(:huge)))))
(serialize* :client-opcodes :login
:boolean (if opponent t nil))
(serialize :string opponent))
On the receiving side, the server might do something like the following (given that it already read the opcode from the message as it had in the previous section).
(unserialize-let* (:string name
(assert (plusp (length name)))
(assert (plusp (length sizes)))
(has-opponent (unserialize-let* (:string opponent)
(match-or-queue name sizes opponent)))
(t (match-or-queue name sizes))))))
When the server finds a match for the requested game, it composes welcome messages to each client. The welcome message contains the size of the board in squares, the number of ships each player has, the amount of ping energy each player has, the number of missiles each player has, and the name of the opponent.
(serialize* :server-opcodes :welcome
Suppose the client had a class it was using to track the current state of the game. The client could then use a slot-serializer or accessor-serializer to parse the incoming welcome message.
The client could then handle the welcome message as follows (assuming the opcode has already been unserialized from the message buffer):
(unserialize-let* (:game-state-from-welcome game-state)
;; do anything with this game state here
To place ships, a client specifies the center coordinate of the ship and whether the ship is oriented horizontally or vertically.
(defun make-place-ship-message (x y orientation)
(serialize* :client-opcodes :place-ship
The server could read the coordinates and orientation into local variables before calling a method to add the ship to the map.
(let (x y orientation)
(unserialize* :int8 x
(add-ship-to-map x y
:is-vertical (eql orientation
To perform a ping move, a client encodes a radius and a center for the ping.
(serialize* :client-opcodes :ping
:int8 y))
Here, the server will decode the ping request into a list to send to its routine to calculate the reply.
(unserialize-list* '(:float32 :int8 :int8))))
Supposing the return from
calculate-ping-response is a list of distances to ships, the ack message could be encoded like this:
(serialize* :server-opcodes :ack
:uint16 (length hits))
(mapc #'(lambda (d) (serialize :uint8 d)) hits))
To send a fire message, the client just sends the coordinates of the location upon which to fire.
(serialize* :client-opcodes :fire
If the server determines the shot was a hit, it must send a sunk message to the opponent. Either way, a shot results message must be sent to the client.
(defun make-sunk-message (x y)
(serialize* :server-opcodes :sunk
(defun make-shot-results-message (hit)
(serialize* :server-opcodes :shot-results
:shot-result (if hit :hit :miss))))