Inverse functions with fixed-points July 18th, 2013
Patrick Stein

Introduction

SICP has a few sections devoted to using a general, damped fixed-point iteration to solve square roots and then nth-roots. The Functional Programming In Scala course that I did on Coursera did the same exercise (at least as far as square roots go).

The idea goes like this. Say that I want to find the square root of five. I am looking then for some number s so that s^2 = 5. This means that I’m looking for some number s so that s = \frac{5}{s}. So, if I had a function f(x) = \frac{5}{x} and I could find some point s where s = f(s), I’d be done. Such a point is called a fixed point of f.

There is a general method by which one can find a fixed point of an arbitrary function. If you type some random number into a calculator and hit the “COS” button over and over, your calculator is eventually going to get stuck at 0.739085…. What happens is that you are doing a recurrence where x_{n+1} = \cos(x_n). Eventually, you end up at a point where x_{n+1} = x_{n} (to the limits of your calculator’s precision/display). After that, your stuck. You’ve found a fixed point. No matter how much you iterate, you’re going to be stuck in the same spot.

Now, there are some situations where you might end up in an oscillation where x_{n+1} \ne x_n, but x_{n+1} = x_{n-k} for some k \ge 1. To avoid that, one usually does the iteration x_{n+1} = \mathsf{avg}(x_n,f(x_n)) for some averaging function \mathsf{avg}. This “damps” the oscillation.

The Fixed Point higher-order function

In languages with first-class functions, it is easy to write a higher-order function called fixed-point that takes a function and iterates (with damping) to find a fixed point. In SICP and the Scala course mentioned above, the fixed-point function was written recursively.

(defun fixed-point (fn &optional (initial-guess 1) (tolerance 1e-8))
  (labels ((close-enough? (v1 v2)
             (<= (abs (- v1 v2)) tolerance))
           (average (v1 v2)
             (/ (+ v1 v2) 2))
           (try (guess)
             (let ((next (funcall fn guess)))
               (cond
                ((close-enough? guess next) next)
                (t (try (average guess next)))))))
    (try (* initial-guess 1d0))))

It is easy to express the recursion there iteratively instead if that’s easier for you to see/think about.

(defun fixed-point (fn &optional (initial-guess 1) (tolerance 1e-8))
  (flet ((close-enough? (v1 v2)
            (<= (abs (- v1 v2)) tolerance))
         (average (v1 v2)
            (/ (+ v1 v2) 2)))
    (loop :for guess = (* initial-guess 1d0) :then (average guess next)
          :for next = (funcall fn guess)
          :until (close-enough? guess next)
          :finally (return next))))

Using the Fixed Point function to find k-th roots

Above, we showed that the square root of n is a fixed point of the function f(x) = \frac{n}{x}. Now, we can use that to write our own square root function:

(defun my-sqrt (n)
  (fixed-point (lambda (x) (/ n x)))

By the same argument we used with the square root, we can find the k-th root of 5 by finding the fixed point of f(x) = \frac{5}{x^{k-1}}. We can make a function that returns a function that does k-th roots:

(defun kth-roots (k)
  (lambda (n)
    (fixed-point (lambda (x) (/ n (expt x (1- k)))))))

(setf (symbol-function 'cbrt) (kth-root 3))

Inverting functions

I found myself wanting to find inverses of various complicated functions. All that I knew about the functions was that if you restricted their domain to the unit interval, they were one-to-one and their domain was also the unit interval. What I needed was the inverse of the function.

For some functions (like f(x) = x^2), the inverse is easy enough to calculate. For other functions (like f(x) = 6x^5 - 15x^4 + 10x^3), the inverse seems possible but incredibly tedious to calculate.

Could I use fixed points to find inverses of general functions? We’ve already used them to find inverses for f(x) = x^k. Can we extend it further?

After flailing around Google for quite some time, I found this article by Chen, Lu, Chen, Ruchala, and Olivera about using fixed-point iteration to find inverses for deformation fields.

There, the approach to inverting f(x) was to formulate u(x) = f(x) - x and let v(x) = f^{-1}(x) - x. Then, because

x = f(f^{-1}(x)) =  f^{-1}(x) + u(f^{-1}(x)) = x + v(x) + u(x + v(x))

That leaves the relationship that v(x) = -u(x + v(x)). The goal then is to find a fixed point of v(x).

I messed this up a few times by conflating f and u so I abandoned it in favor of the tinkering that follows in the next section. Here though, is a debugged version based on the cited paper:

(defun pseudo-inverse (fn &optional (tolerance 1d-10))
  (lambda (x)
    (let ((iterant (lambda (v)
                     (flet ((u (x)
                               (- (funcall fn x) x)))
                       (- (u (+ x v)))))))
      (+ x (fixed-point iterant 0d0 tolerance)))))

Now, I can easily check the average variance over some points in the unit interval:

(defun check-pseudo-inverse (fn &optional (steps 100))
  (flet ((sqr (x) (* x x)))
    (/ (loop :with dx = (/ (1- steps))
             :with inverse = (pseudo-inverse fn)
             :repeat steps
             :for x :from 0 :by dx
             :summing (sqr (- (funcall fn (funcall inverse x)) x)))
       steps)))

(check-pseudo-inverse #'identity) => 0.0d0
(check-pseudo-inverse #'sin)      => 2.8820112095939962D-12
(check-pseudo-inverse #'sqrt)     => 2.7957469632748447D-19                                                          
(check-pseudo-inverse (lambda (x) (* x x x (+ (* x (- (* x 6) 15)) 10))))
                                  => 1.3296561385041381D-21

A tinkering attempt when I couldn’t get the previous to work

When I had abandoned the above, I spent some time tinkering on paper. To find f^{-1}(x), I need to find y so that f(y) = x. Multiplying both sides by y and dividing by f(y), I get y = \frac{xy}{f(y)}. So, to find f^{-1}(x), I need to find a y that is a fixed point for \frac{xy}{f(y)}:

(defun pseudo-inverse (fn &optional (tolerance 1d-10))
  (lambda (x)
    (let ((iterant (lambda (y)
                     (/ (* x y) (funcall fn y)))))
      (fixed-point iterant 1 tolerance))))

This version, however, has the disadvantage of using division. Division is more expensive and has obvious problems if you bump into zero on your way to your goal. Getting rid of the division also allows the above algorithms to be generalized for inverting endomorphisms of vector spaces (the \mathsf{avg} function being the only slightly tricky part).

Denouement

I finally found a use of the fixed-point function that goes beyond k-th roots. Wahoo!

Track-Best Library Updated July 8th, 2013
Patrick Stein

I updated my track-best library to allow you to keep all of the the things tied for best. The WITH-TRACK-BEST macro now accepts the :KEEP-TIES keyword parameter.

Here are some examples of using the :KEEP-TIES option. For all of the examples, we will use the same sequence of TRACK calls:

(defun track-numbers ()
  (track :one 1)
  (track :uno 1)
  (track :two 2)
  (track :dos 2)

Here are some calls with :KEEP-TIES as NIL (the default):

(with-track-best (:keep 1 :keep-ties nil) (track-numbers))
=> (values :TWO 2)

(with-track-best (:keep 3 :keep-ties nil) (track-numbers))
=> (values (:TWO :DOS :ONE) (2 2 1))

Here are some calls with :KEEP-TIES as T:

(with-track-best (:keep 1 :keep-ties t) (track-numbers))
=> (values (:TWO :DOS) (2 2))

(with-track-best (:keep 3 :keep-ties t) (track-numbers))
=> (values (:TWO :DOS :ONE :UNO) (2 2 1 1))

Track-Best Library Released May 9th, 2013
Patrick Stein

In doing a problem set from the Internet a few weeks ago, I found myself writing awkward constructions with REDUCE and LOOP to try to find the best one (or two or three) things in a big bag of things based on various criteria: similarity to English, hamming distance from their neighbor, etc.

I wrote a macro that encapsulated the pattern. I’ve reworked that macro into a library for public use.

Acquiring

Using

There are a variety of examples in the README and the tests directory.

Here is one example to pique your interest. Suppose you have some data about the elevations of various cities in various states and you (being a Moxy Fruvous fan) want to know What Is the Lowest Highest Point? Here’s how you might tackle that with the TRACK-BEST library:

(let ((data '(("Alabama"  ("Birmingham" 664)
                          ("Mobile" 218)
                          ("Montegomery" 221))
              ("Alaska"   ("Anchorage" 144)
                          ("Fairbanks" 531))
              ("Arizona"  ("Grand Canyon" 6606)
                          ("Phoenix" 1132)
                          ("Tuscon"  2641)))))
  (with-track-best (:order-by-fn #'<)
    (dolist (state-info data)
      (multiple-value-bind (city altitude)
          (with-track-best ()
            (dolist (city-info (rest state-info))
              (track (first city-info) (second city-info))))
        (track (list (first state-info) city) altitude)))))

With this limited dataset, the end result would be (VALUES '("Alaska" "Fairbanks") 531). The inner WITH-TRACK-BEST finds the highest city in each state. The outer WITH-TRACK-BEST finds the lowest of these.

Building the language up to you May 5th, 2013
Patrick Stein

One of the things that’s often said about Lisp is that it allows you to build up the language to you. Here are two examples that really drive home that point, for me.

The Problem Domain

I read an article about Treaps on reddit yesterday. The article used pretty direct pylisp to present Treaps.

I thought it would be fun to go through the exercise in Common Lisp instead. Pylisp made a number of things awkward that would melt away in Common Lisp.

I began by defining a node structure to encapsulate the information at a given node:

(defstruct node
  (priority (random 1.0d0) :type real :read-only t)
  (key 0 :read-only t)
  (value 0 :read-only t)
  (left nil :type (or null node) :read-only t)
  (right nil :type (or null node) :read-only t))

Then, I defined a top-level structure to hold the root node, track the function used to sort the tree, and track the function used to create a key from a value. I used the convention that two keys are equivalent if neither is less than the other.

(defstruct treap
  (root nil :type (or null node) :read-only t)
  (less-than #'< :read-only t)
  (key #'identity :read-only t))

The First Build Up

The article was using a functional style. As you may have guessed by the liberal use of :read-only t in those DEFSTRUCT clauses, I also used a functional style.

When working with functional data structures, one often needs to copy a whole structure with only slight modifications. Here, Lisp’s keyword arguments made everything simple and clean. I made this functions:

(defun copy-node (node &key (priority (node-priority node))
                            (key (node-key node))
                            (value (node-value node))
                            (left (node-left node))
                            (right (node-right node)))
  (make-node :priority priority
             :key key
             :value value
             :left left
             :right right))

Now, for any node, I could copy it and only have to specify the fields that I wanted to change. Here is a left-rotation using this:

(defun left-rotate (node)
  (let ((left (node-left node)))
    (copy-node left
               :right (copy-node node
                                 :left (node-right left)))))

What other languages let you do anything like that so simply? You could pull it off in Perl if you were willing to have your node be a hash (which, admittedly, you probably are if you’re writing Perl). What other language lets you do anything like that? Even other languages with named arguments don’t let you have the defaults based on other arguments.

The Second Build Up

As with any binary tree, you find yourself having to deal with four specific cases over and over again with Treaps.

  • You ran out of tree
  • Your key is less than the current node’s key
  • Your key is greater than the current node’s key
  • Your key is equivalent to the current node’s key

I wrote myself a TREAP-CASES macro that streamlines all of these checks. It also validates to make sure you don’t have duplicate cases or cases other than these four. It makes sure that no matter which order you write the cases (or even if you leave some out), they end up organized in the order listed above. The four cases are mutually exclusive so the order doesn’t matter except that you want to make sure you haven’t run out of tree before doing comparisons and that once you’re beyond the less-than and greater-than cases you already know you’re in the equivalence case.

With this macro, my TREAP-FIND function looks like this:

(defun treap-find (key treap)
  (check-type treap treap)
  (labels ((treap-node-find (root)
             (treap-cases (key root treap)
               (null (values nil nil))
               (< (treap-node-find (node-left root)))
               (> (treap-node-find (node-right root)))
               (= (values (node-value root) t)))))
    (treap-node-find (treap-root treap))))

The little DSL makes writing and reading the code so much easier. All of the mess of comparing the key to the node’s key is hidden away.

With years of practice and days of debugging, you might be able to pull off some quasi-readable control construct like this using C++ templates. With enough therapy, you could convince yourself you can get a close-enough effect with C-preprocessor macros. In Lisp, it’s the work of minutes (without lying to yourself).

Here is the TREAP-CASES macro for reference/completeness.

(defmacro treap-cases ((key root treap) &rest clauses)
  (validate-treap-case-clauses clauses)
  (let ((k (gensym "KEY-"))
        (tr (gensym "TREAP-"))
        (r (gensym "ROOT-"))
        (t< (gensym "<-")))
    `(let* ((,k ,key)
            (,tr ,treap)
            (,r ,root)
            (,t< (treap-less-than ,tr)))
       (cond
        ((null ,r) ,@(rest (assoc 'null clauses)))
        ((funcall ,t< ,k (node-key ,r)) ,@(rest (assoc '< clauses)))
        ((funcall ,t< (node-key ,r) ,k) ,@(rest (assoc '> clauses)))
        (t ,@(rest (assoc '= clauses)))))))

I suppose I should also include #'VALIDATE-TREAP-CASE-CLAUSES, too, but it’s what you’d expect:

(defun validate-treap-case-clauses (clauses)
  (let ((all-choices '(null < > =)))
    (flet ((assert-all-choices-valid ()
             (dolist (c clauses)
               (unless (member (first c) all-choices)
                 (error "Unrecognized clause type: ~S" (first c)))))
           (assert-no-duplicates ()
             (dolist (c all-choices)
               (unless (<= (count c clauses :key #'first) 1)
                 (error "Duplicate ~S clause not allowed." c)))))
      (assert-all-choices-valid)
      (assert-no-duplicates))))

Two snippets that I rewrite every three or four months… April 23rd, 2013
Patrick Stein

A friend of mine is taking a cryptography course this semester. One of his current homework problems involves creating n, e, and d values for RSA Encryption. Always one to play the home game when math is involved, I found myself rewriting two snippets of code that I always end up writing any time I do any number-theory related coding.

One of them is simple EXPONENT-MODULO that takes a base b, an exponent e, and a modulus m and returns b^e \pmod m. There are more efficient ways to do this, but this one is much faster than the naïve method and quite easy to rewrite every time that I need it. The idea is that when e is odd, you recursively calculate b^{e-1} \pmod m and return b \cdot b^{e-1} \pmod m. When e is even, you recursively calculate a = b^{e/2} \pmod m and return a^2 \pmod m.

(defun expt-mod (b e m)
  (cond
    ((zerop e) 1)
    ((evenp e) (let ((a (expt-mod b (/ e 2) m)))
                 (mod (* a a) m)))
    (t (mod (* b (expt-mod b (1- e) m)) m))))

The other snippet of code is a good deal trickier. Every time I need it, I spend way more time than I want to futzing with examples, getting lost in subscripts, etc. So, while I could rewrite the above in a moment, what’s below, I hope I have the search-fu to find this post next time I need it.

If you have a number n and some number a which is relatively prime to (shares no factors except for one with) n, then there exists some number b such that a \cdot b \equiv 1 \pmod n. And, in fact, all such numbers b for a given a differ by a multiple of n, so there is a unique such b \in [1,n).

Another way of saying a and n are relatively prime is to say that their greatest common divisor \gcd(a,n) is one. Besides being the biggest number that divides into both n and a, the \gcd(a,n) is also the smallest positive number expressible as x \cdot a + y \cdot n for integers x and y. If you keep track of the steps you use when calculating the greatest common divisor (which in this case is 1) while using the Euclidean algorithm you can (with enough care) backtrack through the steps and determine x and y for your given a and n.

For example, suppose you had n = 40 and a = 31. You want to know what b you’d need for 31b \equiv 1 \pmod{40}.
The steps in the Euclidean algorithm show:

\begin{array}{rcl}40 & = & 31 \cdot 1 + 9 \\31 & = & 9 \cdot 3 + 4 \\9 & = & 4 \cdot 2 + 1\end{array}

Working backwards, you can express 1 as a combination of x_0 \cdot 4 + y_0 \cdot 9. Then, you can express 4 as a combination of x_1 \cdot 9 + y_1 \cdot 31 and so on:

\begin{array}{rcl}1 & = & (-2)\cdot4 + 9\cdot{1} \\  & = & (-2)\cdot\left((-3)\cdot9 + 31\right) + 9\cdot{1} \\  & = & (-2)\cdot31 + 7\cdot9 \\  & = & (-2)\cdot31 + 7\cdot(40 - 31\cdot1) \\  & = & (-9)\cdot31 + 7\cdot(40) \equiv 1 \pmod{40}\end{array}

Wheee… in my example chosen to be easy to backtrack, the answer comes out to be -9 which is equal to 31 modulo 40. So, 31^2 \equiv 1 \pmod{40}. I picked a lucky number that is its own inverse modulo 40. It’s not usually like that. For example, 3 \cdot 27 \equiv 7 \cdot 23 \equiv 1 \pmod{40}.

So, anyhow, here’s a function that takes a and n and returns two values x and y so that x\cdot{}a + y\cdot{}n \equiv \gcd(a,n) \pmod n.

(defun gcd-as-linear-combination (a n)
  (multiple-value-bind (q r) (truncate n a)
    (if (zerop r)
        (values 1 0)
        (multiple-value-bind (x y) (gcd-as-linear-combination r a)
          (values (- y (* q x)) x)))))

Assuming that a and n are relatively prime with a < n, the first value returned is a‘s inverse modulo n.

Updates In Email

Email:

l