Track-Best Library Updated July 8th, 2013
Patrick Stein

I updated my track-best library to allow you to keep all of the the things tied for best. The WITH-TRACK-BEST macro now accepts the :KEEP-TIES keyword parameter.

Here are some examples of using the :KEEP-TIES option. For all of the examples, we will use the same sequence of TRACK calls:

(defun track-numbers ()
  (track :one 1)
  (track :uno 1)
  (track :two 2)
  (track :dos 2)

Here are some calls with :KEEP-TIES as NIL (the default):

(with-track-best (:keep 1 :keep-ties nil) (track-numbers))
=> (values :TWO 2)

(with-track-best (:keep 3 :keep-ties nil) (track-numbers))
=> (values (:TWO :DOS :ONE) (2 2 1))

Here are some calls with :KEEP-TIES as T:

(with-track-best (:keep 1 :keep-ties t) (track-numbers))
=> (values (:TWO :DOS) (2 2))

(with-track-best (:keep 3 :keep-ties t) (track-numbers))
=> (values (:TWO :DOS :ONE :UNO) (2 2 1 1))

Track-Best Library Released May 9th, 2013
Patrick Stein

In doing a problem set from the Internet a few weeks ago, I found myself writing awkward constructions with REDUCE and LOOP to try to find the best one (or two or three) things in a big bag of things based on various criteria: similarity to English, hamming distance from their neighbor, etc.

I wrote a macro that encapsulated the pattern. I’ve reworked that macro into a library for public use.

Acquiring

Using

There are a variety of examples in the README and the tests directory.

Here is one example to pique your interest. Suppose you have some data about the elevations of various cities in various states and you (being a Moxy Fruvous fan) want to know What Is the Lowest Highest Point? Here’s how you might tackle that with the TRACK-BEST library:

(let ((data '(("Alabama"  ("Birmingham" 664)
                          ("Mobile" 218)
                          ("Montegomery" 221))
              ("Alaska"   ("Anchorage" 144)
                          ("Fairbanks" 531))
              ("Arizona"  ("Grand Canyon" 6606)
                          ("Phoenix" 1132)
                          ("Tuscon"  2641)))))
  (with-track-best (:order-by-fn #'<)
    (dolist (state-info data)
      (multiple-value-bind (city altitude)
          (with-track-best ()
            (dolist (city-info (rest state-info))
              (track (first city-info) (second city-info))))
        (track (list (first state-info) city) altitude)))))

With this limited dataset, the end result would be (VALUES '("Alaska" "Fairbanks") 531). The inner WITH-TRACK-BEST finds the highest city in each state. The outer WITH-TRACK-BEST finds the lowest of these.

Building the language up to you May 5th, 2013
Patrick Stein

One of the things that’s often said about Lisp is that it allows you to build up the language to you. Here are two examples that really drive home that point, for me.

The Problem Domain

I read an article about Treaps on reddit yesterday. The article used pretty direct pylisp to present Treaps.

I thought it would be fun to go through the exercise in Common Lisp instead. Pylisp made a number of things awkward that would melt away in Common Lisp.

I began by defining a node structure to encapsulate the information at a given node:

(defstruct node
  (priority (random 1.0d0) :type real :read-only t)
  (key 0 :read-only t)
  (value 0 :read-only t)
  (left nil :type (or null node) :read-only t)
  (right nil :type (or null node) :read-only t))

Then, I defined a top-level structure to hold the root node, track the function used to sort the tree, and track the function used to create a key from a value. I used the convention that two keys are equivalent if neither is less than the other.

(defstruct treap
  (root nil :type (or null node) :read-only t)
  (less-than #'< :read-only t)
  (key #'identity :read-only t))

The First Build Up

The article was using a functional style. As you may have guessed by the liberal use of :read-only t in those DEFSTRUCT clauses, I also used a functional style.

When working with functional data structures, one often needs to copy a whole structure with only slight modifications. Here, Lisp’s keyword arguments made everything simple and clean. I made this functions:

(defun copy-node (node &key (priority (node-priority node))
                            (key (node-key node))
                            (value (node-value node))
                            (left (node-left node))
                            (right (node-right node)))
  (make-node :priority priority
             :key key
             :value value
             :left left
             :right right))

Now, for any node, I could copy it and only have to specify the fields that I wanted to change. Here is a left-rotation using this:

(defun left-rotate (node)
  (let ((left (node-left node)))
    (copy-node left
               :right (copy-node node
                                 :left (node-right left)))))

What other languages let you do anything like that so simply? You could pull it off in Perl if you were willing to have your node be a hash (which, admittedly, you probably are if you’re writing Perl). What other language lets you do anything like that? Even other languages with named arguments don’t let you have the defaults based on other arguments.

The Second Build Up

As with any binary tree, you find yourself having to deal with four specific cases over and over again with Treaps.

  • You ran out of tree
  • Your key is less than the current node’s key
  • Your key is greater than the current node’s key
  • Your key is equivalent to the current node’s key

I wrote myself a TREAP-CASES macro that streamlines all of these checks. It also validates to make sure you don’t have duplicate cases or cases other than these four. It makes sure that no matter which order you write the cases (or even if you leave some out), they end up organized in the order listed above. The four cases are mutually exclusive so the order doesn’t matter except that you want to make sure you haven’t run out of tree before doing comparisons and that once you’re beyond the less-than and greater-than cases you already know you’re in the equivalence case.

With this macro, my TREAP-FIND function looks like this:

(defun treap-find (key treap)
  (check-type treap treap)
  (labels ((treap-node-find (root)
             (treap-cases (key root treap)
               (null (values nil nil))
               (< (treap-node-find (node-left root)))
               (> (treap-node-find (node-right root)))
               (= (values (node-value root) t)))))
    (treap-node-find (treap-root treap))))

The little DSL makes writing and reading the code so much easier. All of the mess of comparing the key to the node’s key is hidden away.

With years of practice and days of debugging, you might be able to pull off some quasi-readable control construct like this using C++ templates. With enough therapy, you could convince yourself you can get a close-enough effect with C-preprocessor macros. In Lisp, it’s the work of minutes (without lying to yourself).

Here is the TREAP-CASES macro for reference/completeness.

(defmacro treap-cases ((key root treap) &rest clauses)
  (validate-treap-case-clauses clauses)
  (let ((k (gensym "KEY-"))
        (tr (gensym "TREAP-"))
        (r (gensym "ROOT-"))
        (t< (gensym "<-")))
    `(let* ((,k ,key)
            (,tr ,treap)
            (,r ,root)
            (,t< (treap-less-than ,tr)))
       (cond
        ((null ,r) ,@(rest (assoc 'null clauses)))
        ((funcall ,t< ,k (node-key ,r)) ,@(rest (assoc '< clauses)))
        ((funcall ,t< (node-key ,r) ,k) ,@(rest (assoc '> clauses)))
        (t ,@(rest (assoc '= clauses)))))))

I suppose I should also include #'VALIDATE-TREAP-CASE-CLAUSES, too, but it’s what you’d expect:

(defun validate-treap-case-clauses (clauses)
  (let ((all-choices '(null < > =)))
    (flet ((assert-all-choices-valid ()
             (dolist (c clauses)
               (unless (member (first c) all-choices)
                 (error "Unrecognized clause type: ~S" (first c)))))
           (assert-no-duplicates ()
             (dolist (c all-choices)
               (unless (<= (count c clauses :key #'first) 1)
                 (error "Duplicate ~S clause not allowed." c)))))
      (assert-all-choices-valid)
      (assert-no-duplicates))))

Two snippets that I rewrite every three or four months… April 23rd, 2013
Patrick Stein

A friend of mine is taking a cryptography course this semester. One of his current homework problems involves creating n, e, and d values for RSA Encryption. Always one to play the home game when math is involved, I found myself rewriting two snippets of code that I always end up writing any time I do any number-theory related coding.

One of them is simple EXPONENT-MODULO that takes a base b, an exponent e, and a modulus m and returns b^e \pmod m. There are more efficient ways to do this, but this one is much faster than the naïve method and quite easy to rewrite every time that I need it. The idea is that when e is odd, you recursively calculate b^{e-1} \pmod m and return b \cdot b^{e-1} \pmod m. When e is even, you recursively calculate a = b^{e/2} \pmod m and return a^2 \pmod m.

(defun expt-mod (b e m)
  (cond
    ((zerop e) 1)
    ((evenp e) (let ((a (expt-mod b (/ e 2) m)))
                 (mod (* a a) m)))
    (t (mod (* b (expt-mod b (1- e) m)) m))))

The other snippet of code is a good deal trickier. Every time I need it, I spend way more time than I want to futzing with examples, getting lost in subscripts, etc. So, while I could rewrite the above in a moment, what’s below, I hope I have the search-fu to find this post next time I need it.

If you have a number n and some number a which is relatively prime to (shares no factors except for one with) n, then there exists some number b such that a \cdot b \equiv 1 \pmod n. And, in fact, all such numbers b for a given a differ by a multiple of n, so there is a unique such b \in [1,n).

Another way of saying a and n are relatively prime is to say that their greatest common divisor \gcd(a,n) is one. Besides being the biggest number that divides into both n and a, the \gcd(a,n) is also the smallest positive number expressible as x \cdot a + y \cdot n for integers x and y. If you keep track of the steps you use when calculating the greatest common divisor (which in this case is 1) while using the Euclidean algorithm you can (with enough care) backtrack through the steps and determine x and y for your given a and n.

For example, suppose you had n = 40 and a = 31. You want to know what b you’d need for 31b \equiv 1 \pmod{40}.
The steps in the Euclidean algorithm show:

\begin{array}{rcl}40 & = & 31 \cdot 1 + 9 \\31 & = & 9 \cdot 3 + 4 \\9 & = & 4 \cdot 2 + 1\end{array}

Working backwards, you can express 1 as a combination of x_0 \cdot 4 + y_0 \cdot 9. Then, you can express 4 as a combination of x_1 \cdot 9 + y_1 \cdot 31 and so on:

\begin{array}{rcl}1 & = & (-2)\cdot4 + 9\cdot{1} \\  & = & (-2)\cdot\left((-3)\cdot9 + 31\right) + 9\cdot{1} \\  & = & (-2)\cdot31 + 7\cdot9 \\  & = & (-2)\cdot31 + 7\cdot(40 - 31\cdot1) \\  & = & (-9)\cdot31 + 7\cdot(40) \equiv 1 \pmod{40}\end{array}

Wheee… in my example chosen to be easy to backtrack, the answer comes out to be -9 which is equal to 31 modulo 40. So, 31^2 \equiv 1 \pmod{40}. I picked a lucky number that is its own inverse modulo 40. It’s not usually like that. For example, 3 \cdot 27 \equiv 7 \cdot 23 \equiv 1 \pmod{40}.

So, anyhow, here’s a function that takes a and n and returns two values x and y so that x\cdot{}a + y\cdot{}n \equiv \gcd(a,n) \pmod n.

(defun gcd-as-linear-combination (a n)
  (multiple-value-bind (q r) (truncate n a)
    (if (zerop r)
        (values 1 0)
        (multiple-value-bind (x y) (gcd-as-linear-combination r a)
          (values (- y (* q x)) x)))))

Assuming that a and n are relatively prime with a < n, the first value returned is a‘s inverse modulo n.

Faster Primes with Heaps January 28th, 2013
Patrick Stein

I read a nice blog post this morning about implementing the Sieve of Eratosthenes in Lisp. It got me thinking about something that I did a few weeks back.

A few weeks back, I needed to generate all of the primes up to a given number. The number was big enough that I didn’t want to keep an array of bytes (or even bits) big enough to do the sieving. I just had an aversion to keeping around O(n) memory when O(n / \ln n) would suffice. I also had a hankering (probably because of Let Over Lambda) to implement it as a generator—a function that each time you called it gave you the next prime.

My first version amounted to trial-division against the entire list of primes generated so far. For the number of primes that I wanted to generate, this took more than nine minutes. Unacceptable.

I made some tweaks to it to ensure that I was only checking n against the previously generated primes that were less than or equal to (isqrt n). I made some other tweaks to keep my previously generated primes sorted so the most-likely to be a divisor of n were first. I made another tweak to accumulate the first k primes into a product so that I could check all of those against n with a (gcd n product-of-first-k) where k was constrained so that the product stayed below most-positive-fixnum.

Those tweaks got me down to around two minutes. That was still more than twice as long as I wanted it to take.

Now, with the help of CL-HEAP and the Rosetta-Code-approved optimization mentioned in that blog post, I have it down to thirty seconds with no trial division, no square roots, and only one multiplication per prime.

I keep a heap of cons cells. The car is the next number that will be crossed out by this prime and the cdr is how much to increment the car once I’ve crossed out the car.

Ignoring the special-case at the beginning for two, my current algorithm looks like this:

(defun prime-generator ()
  (let ((nixed (make-instance 'cl-heap:fibonacci-heap
                              :key #'first
                              :sort-fun #'<))
        (guess 1))
   
    (lambda ()
      (loop :do (incf guess 2) :until (%prime-p guess nixed))
      (cl-heap:add-to-heap nixed (cons (* guess guess) (* guess 2)))
      guess)))

The %prime-p function pulls cons cells off of the heap until it finds whose car isn’t equal to the current guess. For each of them where the car was equal, it increments the car by its cdr. Then, it puts all of the cons cells it removed back on the heap. The guess was prime if it wasn’t equal to any of the numbers on the heap.

(defun %prime-p (guess nixed)
  (let ((prime t)
        (n nil))
    (loop :do (setf n (cl-heap:pop-heap nixed))
       :while (and n (= (car n) guess))
       :do (progn
             (incf (car n) (cdr n))
             (cl-heap:add-to-heap nixed n)
             (setf prime nil))
       :finally (when n (cl-heap:add-to-heap nixed n)))
    prime))

I could probably save some time and cons-ing by peeking at the top of the heap rather than always gathering or countless other ways, but this more than exceeded my expectations for today.

Updates In Email

Email:

l