Lines Are Big Circles September 13th, 2013
Patrick Stein

In previous posts here, I described using Clifford algebras for representing points and rotations. I was never very satisfied with this because the translations were still tacked on rather than incorporated in the algebra. To represent geometric objects, you need to track two Clifford multivectors: one for the orientation and another for the offset.

About two weeks ago, I found David Hestenes’s paper Old Wine in New Bottles: A new algebraic framework for computational geometry. In that paper, he describes a way to use Clifford Algebras to unify the representation of points, lines, planes, hyperplanes, circles, spheres, hyperspheres, etc. This was a big bonus. By using a projective basis, we can unify the orientation and offset. By using a null basis, we can bring in lines, planes, hyperplanes, circles, spheres, and hyperspheres.

The null basis ends up giving you a point at infinity. Every line goes through the point at infinity. None of the circles do. But, if you think of a line as a really, really big circle that goes through infinity, now you have unified lines and circles. Circles and lines are both defined by three points in the plane. (Technically, you can define a line with any three collinear points, but then you need to craft a point collinear to the other two. The point at infinity is collinear with every line. Further, such things could be seen as flattened circles having finite extent (diameter equal to the distance between the furthest apart of the three points) rather than an infinite line.)

So, I need to use Clifford algebras with a projective and null basis. All of the playing I previously did with Clifford algebras was using an orthonormal basis.

What is a basis?

To make a Clifford algebra, one starts with a vector space. A vector space has a field of scalars (real numbers, usually) and vectors. You can multiply any vector by a scalar to get another vector. And, if \alpha and \beta are scalars and \textbf{v} is a vector, then \alpha (\beta \textbf{v}) = (\alpha \beta)\textbf{v}. And, of course, we want 1\textbf{v} = \textbf{v} (in fact, even if 1\textbf{v} weren’t \textbf{v} exactly, we’re always going to be multiplying by at least 1, so we can recast our thinking to think about 1\textbf{v} any place we write \textbf{v}).

You can add together any two vectors to get another vector. Further, this addition is completely compatible with the scalar multiplication so that \alpha\textbf{v} + \beta\textbf{v} = (\alpha + \beta)\textbf{v} and \alpha(\textbf{v} + \textbf{w}) = \alpha\textbf{v} + \alpha\textbf{w}. This of course means that every vector has a negative vector. Further 0\textbf{v} = 0\textbf{w} for all vectors \textbf{v} and \textbf{w}. This is the distinguished vector called the zero vector. The sum of any vector \textbf{v} and the zero vector is just the vector \textbf{v}.

Every vector space has a basis (though some have an infinite basis). A basis is a minimal subset of the vectors such that every vector vector can be written as the sum of multiples of the basis vectors. So, if the whole basis is \textbf{v} and \textbf{w}, then every vector can be written as \alpha\textbf{v} + \beta\textbf{w}. A basis is a minimal subset in that no basis element can be written as the sum of multiples of the other elements. Equivalently, this means that the only way to express the zero vector with the basis is by having every basis element multiplied by the scalar zero.

Okay, but what is an orthonormal basis?

You need more than just a vector space to make a Clifford algebra. You need either a quadratic form or a dot-product defined on the vectors.

A quadratic form on a vector is a function Q that takes in a vector and outputs a scalar. Further, Q(\alpha\textbf{v}) = \alpha^2 Q(\textbf{v}) for all scalars \alpha and all vectors \textbf{v}.

A dot product is a function \langle\cdot{},\cdot{}\rangle that takes two vectors and outputs a scalar. A dot product must be symmetric so that \langle\textbf{v},\textbf{w}\rangle = \langle\textbf{w},\textbf{v}\rangle for all vectors \textbf{v} and \textbf{w}. Furthermore, the dot product must be linear in either term. (Since it’s symmetric, it suffices to require it be linear in either term.) This means that for all scalars \alpha and \beta and all vectors \textbf{v}, \textbf{w}, and \textbf{x} then \langle\alpha\textbf{v}+\beta\textbf{w},\textbf{x}\rangle = \alpha\langle\textbf{v},\textbf{x}\rangle + \beta\langle\textbf{w},\textbf{x}\rangle.

From any dot product, you can make a quadratic form by saying Q(\textbf{x}) = \langle\textbf{x},\textbf{x}\rangle \textbf{}. And, so long as you’re working with scalars where one can divide by two (aka, almost always), you can make a dot product from a quadratic form by saying \langle\textbf{x},\textbf{y}\rangle = \frac{1}{2}(Q(\textbf{x}+\textbf{y}) - Q(\textbf{x}) - Q(\textbf{y}). So, it doesn’t really matter which you have. I’m going to freely switch back and forth between them here for whichever is most convenient for the task at hand. I’ll assume that I have both.

So, let’s say we have a dot product on our vector space. What happens when we take the dot product on pairs of our basis vectors? If \textbf{v} and \textbf{w} are distinct elements of our basis with \langle\textbf{v},\textbf{w}\rangle = 0 \textbf{}, then \textbf{v} and \textbf{w} are said to be orthogonal (basis elements). If every element of the basis is orthogonal to every other basis element, then we have an orthogonal basis.

We say a basis element \textbf{v} is normalized if \langle\textbf{v},\textbf{v}\rangle = \pm 1. If all of the basis vectors are normalized, we have a normal basis.

An orthonormal basis is a basis that’s both an orthogonal basis and a normal basis.

You can represent any dot product as a symmetric matrix A. To find \langle\textbf{v},\textbf{w}\rangle, you multiply \textbf{v}^TA\textbf{w}. Further, you can always decompose a scalar matrix into the form A = S D S^T where D is a diagonal matrix (a matrix where all of the elements off of the diagonal are zero) and S^T = S^{-1}. Because of that, you can always find an orthogonal basis for a vector space. So, with just a little bit of rotating around your original choice of basis set, you can come up with a different basis that is orthogonal.

If your orthogonal basis is not normalized, you can (almost always) normalize the basis vectors where Q(\textbf{v}) \ne 0 by dividing it by the square root of Q(\textbf{v}). If any of the elements on the diagonal in the diagonal matrix are zero, then you didn’t have a minimal set for a basis.

So, as long as you can divide by square roots in whatever numbers system you chose for your scalars, then you can find an orthonormal basis. That means that Q(\textbf{v}) is either +1 or -1 for every basis vector \textbf{v}. It also means (going back to dot product) that \langle\textbf{v},\textbf{w}\rangle = 0 for distinct basis vectors \textbf{v} and \textbf{w}.

You can also re-order your basis set so that all of the Q(\textbf{v}) = +1 vectors come first and all of the Q(\textbf{v}) = -1 vectors are last. So, much of the literature on Clifford algebras (and all of the stuff that I had done before with them) uses such a basis. If the field of scalars is the real numbers \mathbb{R}, then we abbreviate the Clifford algebra as \mathcal{C}\ell_{p,q} when there are p basis vectors where Q(\textbf{v}) = +1 and q basis vectors where Q(\textbf{v}) = -1.

What about Projective and Null?

I mentioned at the outset that the Hestenes paper uses a projective basis that’s also a null basis. If you’ve done any geometry in computers before you have probably bumped into projective coordinates. If you have a 3d-vector [x,y,z]^T then you turn it into a projective coordinate by making it a 4d-vector that ends with 1 giving you [x,y,z,1]^T. Now, you take your three by three rotation matrices and extend them to four by four matrices. This lets you incorporate rotations and translations into the same matrix instead of having to track a rotation and an offset.

What about a null basis though? With a null basis, Q(\textbf{v}) = 0 for each (some?) of the basis vectors. The key point for me here is that the matrix representing the dot product isn’t diagonal. As an easy example, if we have a basis with two basis vectors \textbf{v} and \textbf{w}, then we can represent any vector \textbf{x} as \alpha\textbf{v} + \beta\textbf{w}. If we have Q(x) = \alpha^2 - \beta^2, then that is an orthonormal basis (with Q(\textbf{v}) = +1 and Q(\textbf{w}) = -1). If we picked two different basis vectors \textbf{v}^\prime and \textbf{w}^\prime then we would represent \textbf{x} as \alpha^\prime\textbf{v}^\prime + \beta^\prime\textbf{w}^\prime. We could pick them so that Q(\textbf{x}) = 2\alpha^\prime\beta^\prime. This would be a null basis because Q(\textbf{v}^\prime) = Q(\textbf{w}^\prime) = 0.

Clifford Algebras with an Orthonormal Basis

Once you have a vector space and a quadratic form or dot product, then you make a Clifford algebra by defining a way to multiply vectors together. For Clifford algebras, we insist that when we multiply a vector \textbf{v} by itself, the result is exactly Q(\textbf{v}). Then, we go about building the biggest algebra we can with this restriction.

Let’s look at what happens when we have two vectors \textbf{v} and \textbf{w}. Our Clifford restriction means that (\textbf{v}+\textbf{w})^2 = Q(\textbf{v}+\textbf{w}). We want multiplication to distribute with addition just like it does in algebra so the left hand side there should be: \textbf{v}^2 + \textbf{vw} + \textbf{wv} + \textbf{w}^2. Note: we haven’t yet assumed that our multiplication has to be commutative, so we can’t reduce that to \textbf{v}^2 + 2 \textbf{vw} + \textbf{w}^2.

Remember, now, the connection between the quadratic form Q and the dot product \langle\cdot,\cdot\rangle. We have, for the right hand side, that Q(\textbf{v}+\textbf{w}) = \langle\textbf{v}+\textbf{w},\textbf{v}+\textbf{w}\rangle. Now, we use the fact that the dot product is linear in both terms to say that \langle\textbf{v}+\textbf{w},\textbf{v}+\textbf{w}\rangle = \langle\textbf{v},\textbf{v}\rangle + \langle\textbf{v},\textbf{w}\rangle + \langle\textbf{w},\textbf{v}\rangle + \langle\textbf{w},\textbf{w}\rangle. Using the connection to the quadratic form again and the fact that the dot product is symmetric, we can simplify that to Q(\textbf{v}) + 2\langle\textbf{v},\textbf{w}\rangle + Q(\textbf{w}).

Because v^2 = Q(\textbf{v}) and w^2 = Q(\textbf{w}), we can simplify our original equation (\textbf{v}+\textbf{w})^2 = Q(\textbf{v}+\textbf{w}) to be \textbf{vw} + \textbf{wv} = 2\langle\textbf{v},\textbf{w}\rangle.

If \textbf{v} = \textbf{w}, then the above reduces to the definitional \textbf{v}^2 = Q(v). If \textbf{v} and \textbf{w} are distinct basis vectors in our orthogonal basis, then 2\langle\textbf{v},\textbf{w}\rangle = 0. This means that \textbf{wv} = -\textbf{vw}. So, our multiplication of distinct basis vectors anticommutes!

Now, given an arbitrary vector, we can express it as a sum of multiples of the basis vectors: \textbf{v} = \alpha_1\textbf{e}_1 + \alpha_2\textbf{e}_2 + ... where the \alpha_i are all scalars and the \textbf{e}_i are all basis vectors in our orthogonal basis. Given two such vectors we can do all of the usual algebraic expansion to express the product of the two vectors as a sum of multiples of products of pairs of basis vectors. Any place where we end up with \textbf{e}_i\textbf{e}_i we can replace it with the scalar number Q(\textbf{e}_i). Any place we end up with \textbf{e}_i\textbf{e}_j with i < j, we can leave it as it is. Any place we end up with \textbf{e}_j\textbf{e}_i with i < j, we can replace it with -\textbf{e}_i\textbf{e}_j. Then, we can gather up like terms.

So, suppose there were two vectors in our orthonormal basis \textbf{e}_1 and \textbf{e}_2. And, assume Q(\textbf{e}_1) = +1 and Q(\textbf{e}_2) = -1. Then (a\textbf{e}_1 + b\textbf{e}_2)(c\textbf{e}_1 + d\textbf{e}_2) expands out to ac\textbf{e}_1^2 + ad\textbf{e}_1\textbf{e}_2 + bc\textbf{e}_2\textbf{e}_1 + bd\textbf{e}_2^2. We can then manipulate that as outlined in the previous paragraph to whittle it down to (ac-bd) + (ad-bc)\textbf{e}_1\textbf{e}_2.

We still don’t know what \textbf{e}_1\textbf{e}_2 is exactly, but we’re building a big-tent algebra here. We don’t have a restriction that says it has to be something, so it gets to be its own thing unless our restrictions hold it back. How big is our tent going to be? Well, let’s see what happens if we multiply \textbf{e}_1\textbf{e}_2 by other things we already know about.

What happens if we multiply \textbf{e}_1(\textbf{e}_1\textbf{e}_2)? We want our multiplication to be associative. So, \textbf{e}_1(\textbf{e}_1\textbf{e}_2) = \textbf{e}_1^2\textbf{e}_2 and because \textbf{e}_1^2 = 1, this is just \textbf{e}_1(\textbf{e}_1\textbf{e}_2) = \textbf{e}_2. Well, what if we had multiplied in the other order? (\textbf{e}_1\textbf{e}_2)\textbf{e}_1 = -(\textbf{e}_2\textbf{e}_1)\textbf{e}_1 = -\textbf{e}_2\textbf{e}_1^2 = -\textbf{e}_2. Interesting. By similar reasoning, \textbf{e}_2(\textbf{e}_1\textbf{e}_2) = -\textbf{e}_2^2\textbf{e}_1 = \textbf{e}_1 and (\textbf{e}_1\textbf{e}_2)\textbf{e}_2 = \textbf{e}_1\textbf{e}_2^2 = -\textbf{e}_1.

What happens if we multiply (\textbf{e}_1\textbf{e}_2)(\textbf{e}_1\textbf{e}_2). This is just a combination of the above sorts of things and we find that

(\textbf{e}_1\textbf{e}_2)^2 = -(\textbf{e}_2\textbf{e}_1)(\textbf{e}_1\textbf{e}_2) = -\textbf{e}_2(\textbf{e}_1^2)\textbf{e}_2 = -\textbf{e}_2^2 = 1

So, that’s as big as our tent is going to get with only two vectors in our orthonormal basis.

Our Clifford algebra then has elements composed of some multiple of the scalar 1 plus some multiple of \textbf{e}_1 plus some multiple of \textbf{e}_2 plus some multiple of \textbf{e}_1\textbf{e}_2. If we had added a third basis vector \textbf{e}_3, then we also get \textbf{e}_1\textbf{e}_3, \textbf{e}_2\textbf{e}_3, and \textbf{e}_1\textbf{e}_2\textbf{e}_3. In general, if you have n vectors in the basis of the vector space, then there will be 2^n basis elements in the corresponding Clifford algebra.

You can rework any term \alpha\textbf{e}_i\textbf{e}_j\textbf{e}_k... so that the subscripts of the basis vectors are monotonically increasing by swapping adjacent basis vectors with differing subscripts changing the sign on \alpha at the same time. When you have two \textbf{e}_i side-by-side with the same subscript, annihilate them and multiply the coefficient \alpha by Q(e_i) (which was either +1 or -1). Then, you have a reduced term \pm\alpha\textbf{e}_i\textbf{e}_j... where the subscripts are strictly increasing.

What Happens When You Don’t Have An Orthonormal Basis?

The Hestenes paper doesn’t use an orthonormal basis. I’d never played with Clifford algebras outside of one. It took me about two weeks of scrounging through text books and information about something called the contraction product and the definitions of Clifford algebras in terms of dot-products plus something called the outer product (which gives geometric meaning to our new things like \textbf{e}_1\textbf{e}_2).

I learned a great deal about how to multiply vectors, but I didn’t feel that much closer to being able to multiply \textbf{e}_1\textbf{e}_4\textbf{e}_3\textbf{e}_1 unless the basis was orthonormal. I felt like I’d have to know things like the dot product of a vector with something like \textbf{e}_4\textbf{e}_3\textbf{e}_1 and then somehow mystically mix in the contraction product and extend by linearity.

There’s a whole lot of extending by linearity in math. In some cases, I feel like extending by linearity leaves me running in circles. (To go with the theme, sometimes I’m even running in a really big circle through the point at infinity.) We did a bit of extending by linearity above when we went from \textbf{v}^2 = Q(\textbf{v}) into what (\textbf{v}+\textbf{w})^2 must be based on the linearity in the dot product.

Finally, something clicked for me enough to figure out how to multiply \textbf{e}_1\textbf{e}_4\textbf{e}_3\textbf{e}_1 and express it as a sum of terms in which the basis vectors in each term had increasing subscripts. Now that it has clicked, I see how I should have gone back to one of our very first equations: \textbf{vw} + \textbf{wv} = 2\langle\textbf{v},\textbf{w}\rangle. With our orthogonal basis, \langle\textbf{v},\textbf{w}\rangle was always zero for distinct basis vectors.

If we don’t have an orthogonal basis, then the best we can do is \textbf{wv} = 2\langle\textbf{v},\textbf{w}\rangle - \textbf{vw}. That is good enough. Suppose then we want to figure out \textbf{e}_1\textbf{e}_4\textbf{e}_3\textbf{e}_1 so that none of the terms have subscripts out of order. For brevity, let me write d_{i,j} to mean \langle\textbf{e}_i,\textbf{e}_j\rangle. The first things we see out of order are \textbf{e}_4 and \textbf{e}_3. To swap those, we have to replace \textbf{e}_4\textbf{e}_3 with d_{3,4} - \textbf{e}_3\textbf{e}_4. Now, we have \textbf{e}_1 ( 2d_{3,4} - \textbf{e}_3\textbf{e}_4 ) \textbf{e}_1. With a little bit of algebra, this becomes 2d_{3,4}\textbf{e}_1^2 - \textbf{e}_1\textbf{e}_3\textbf{e}_4\textbf{e}_1 = 2d_{3,4}d_{1,1} - \textbf{e}_1\textbf{e}_3\textbf{e}_4\textbf{e}_1. That last term is still not in order, so we still have more to do.

\begin{array}{c}2d_{3,4}d_{1,1} - \textbf{e}_1\textbf{e}_3\textbf{e}_4\textbf{e}_1 \\2d_{3,4}d_{1,1} - \textbf{e}_1\textbf{e}_3(2d_{1,4} - \textbf{e}_1\textbf{e}_4) \\2d_{3,4}d_{1,1} - 2d_{1,4}\textbf{e}_1\textbf{e}_3 + \textbf{e}_1\textbf{e}_3\textbf{e}_1\textbf{e}_4 \\2d_{3,4}d_{1,1} - 2d_{1,4}\textbf{e}_1\textbf{e}_3 + \textbf{e}_1(2d_{1,3} - \textbf{e}_1\textbf{e}_3)\textbf{e}_4 \\2d_{3,4}d_{1,1} - 2d_{1,4}\textbf{e}_1\textbf{e}_3 + 2d_{1,3}\textbf{e}_1\textbf{e}_4 - \textbf{e}_1^2\textbf{e}_3\textbf{e}_4 \\2d_{3,4}d_{1,1} - 2d_{1,4}\textbf{e}_1\textbf{e}_3 + 2d_{1,3}\textbf{e}_1\textbf{e}_4 - d_{1,1}\textbf{e}_3\textbf{e}_4\end{array}

Whew. Now, to get that into code.

26 SLOC

In the end, I ended up with this 26 SLOC function that takes in a matrix dots to represent the dot product and some number of ordered lists of subscripts and returns a list of scalars where the scalar in spot i represents the coefficient in front of the ordered term where the k-th basis vector is involved if the (k-1)-th bit of i is set. So, for the example we just did with the call (basis-multiply dots '(1 4) '(3) '(1)), the zeroth term in the result would be 2d_{3,4}d_{1,1}. The fifth term ((2^2 + 2^0)-th term) would be -2d_{1,4}. The ninth term would be 2d_{1,3}. The twelfth term would be -d{1,1}. The rest of the terms would be zero.

From this, I will be able to build a function that multiplies arbitrary elements of the Clifford algebra. Getting to this point was the hard part for me. It is 26 SLOC that took me several weeks of study to figure out how to do on paper and about six hours of thinking to figure out how to do in code.

(defun basis-multiply (dots &rest xs)
  (let ((len (expt 2 (array-dimension dots 0))))
    (labels ((mul (&rest xs)
               (let ((xs (combine-adjacent (remove nil xs))))
                 (cond
                   ((null xs)
                    (vec len 0))

                   ((null (rest xs))
                    (vec len (from-bits (first xs))))

                   (t
                    (destructuring-bind (x y . xs) xs
                      (let ((a (first (last x)))
                            (b (first y))
                            (x (butlast x))
                            (y (rest y)))
                        (if (= a b)
                            (combine-like x a y xs)
                          (swap-ab x a b y xs))))))))

             (dot (a b)
               (aref dots (1- a) (1- b)))

             (combine-like (x a y xs)
               ;; X e1 e1 Y ... = (e1.e1) X Y ...
               (vs (dot a a) (apply #'mul x y xs)))

             (swap-ab (x a b y xs)
               ;; X e2 e1 Y ... = 2(e1.e2) X Y ... - X e1 e2 Y ...
               (v- (vs (* 2 (dot a b)) (apply #'mul x xs))
                   (apply #'mul x (list b a) y xs))))
      (apply #'mul xs))))

I had one false start on the above code where I accidentally confounded the lists that I was using as input with the lists that I was generating as output. I had to step back and get my code to push all of the work down the call stack while rolling out the recursion and only creating new vectors during the base cases of the recursion and only doing vector subtractions while unrolling the recursion.

There are also 31 SLOC involved in the combine-adjacent function (which takes a list like ((1 2 3) nil (4 5) (3)) and removes the nils then concatenates consecutive parts that are in order already to get ((1 2 3 4 5) (3))), the vec function (which makes a vector of a given length with a non-zero coefficient in a given location), the from-bits function (which turns a list of integers into a number with the bit k set if k+1 is in the list) and the little functions like vs and v- (which, respectively, scale a list of numbers by a factor and element-wise subtract two lists).

The 31 supporting SLOC were easy though. The 26 SLOC shown above represent the largest thought-to-code ratio of any code that I’ve ever written.

WO0T!!!t!1! or something!

Now, to Zig this SLOC for Great Justice!

Making Fun Algebra Problems Funner October 29th, 2010
Patrick Stein

A month ago, a friend posted the following problem on Facebook. I just noticed it this week.
Drawing of a circle with radius r sitting on a line with two squares wedged against it.  One square has side-length 1, the other side-length 2.  There are six units between the squares.
The goal is to find the exact length of the radius r.

I love this kind of math problem.  It has a bunch of features that make it a great, toy math problem.

  • It looks like it’s going to be easy, but at the same time seems at a glance to not have enough information
  • It looks like a geometry problem but only requires that you know:
    • All radii of a circle have the same length
    • A radius to a point where a tangent line touches the circle is perpendicular to that tangent line
  • It requires only very basic algebra:
    • Pythagorean theorem
    • Solving quadratics
  • The numbers in the problem are small, non-zero integers

I spent the next 25 minutes and six pieces of paper working the problem. About 20% of the time that I spent was rechecking my work. Why did I bother rechecking my work on a toy problem?

Warning: Spoilers ahead. If you like this kind of problem, stop reading now and play with it first.


The problem with the problem

This problem failed to satisfy one of my other criterion for great, toy puzzle problems.

  • The answer is a small natural number (or the current year)

I am notorious for messing up signs when doing large algebra calculations. I had to check and recheck all of my work to make sure that I hadn’t done 5x - (3x - 2) and gotten 2x - 2 somewhere. If I had come up with an integer value for r, I’d have been done. Instead, the answer involved a half-integer plus a multiple of \sqrt{74}.

What? \sqrt{74}?!? Raise your hand if you’ve ever seen that written anywhere before this problem.

That’s what I thought.

So, I spent the next hour looking for a different set of numbers to put in for 1, 2, and 6 so that the radius r would come out to an integer. My approach was Brownian, at best. I threw Pythagorean triples at a wall until one stuck.

The sticky triple left me with [50,200,360] to put in place of [1,2,6]. So, I put the problem down for awhile.

The next time that I was in the car, I realized that the radius for the sticky triple was less than 200. That meant that the larger box was touching the upper half of the circle instead of the lower half. The circle had to overlap the box.

So, I was back to the drawing board. Of course, I’d have been back to the drawing board at some point anyway because that sticky triple violated both of my small numbers criteria.

Warning: If you want to play around with finding a set of numbers that work, do it before you read the following. There are even more spoilers ahead.

Infinitely many combinations to get an integer radius

When I next sat down to play (pronounced: /OBSESS/) with this problem, I quickly hit upon the following way to take any Pythagorean triple (a, b, c) and make a version of the above puzzle where the radius r is an integer.
Same sort of diagram as above except radius is c, boxes are height c-b and c-a, and some (a,b,c) triangles are drawn in

In fact, using a 3-4-5 triangle, it is obvious that had the distance between the blocks in the original problem been 7 instead of 6, the radius would have been a small natural number.

Now. Now, I had infinitely many ways to construct this problem so that the radius was an integer. But, did I have all of the ways?

All your configuration are belong to us

When looking at that last diagram, the first question that comes to mind (for me) is: Do I really need to use the same Pythagorean triple on the left and right triangles? The answer, of course, is No.

If I have two Pythagorean triples (a_1,b_1,c_1) and (a_2,b_2,c_2) and d is any (positive) divisor of both c_1 and c_2, then I can start with an a_1-b_1-c_1 triangle on the left and an a_2-b_2-c_2 triangle on the right. I can then scale up the left triangle by c_2/d and scale the right triangle up by c_1/d. Now, both triangles have the same hypotenuse.

This means that if the left block has height c_2 (c_1 - b_1)/d, the right block has height c_1(c_2 - b_2)/d, the distance between the two blocks is (c_2 a_1 + c_1 a_2)/d, and the resulting circle has radius c_1c_2/d.
A more general version of the previous diagrams

Does this cover all of the solutions? Could I possibly have non-integer a_1 and a_2 such that (c_2a_1 + c_1a_2)/d is still integer?

Well, for the problem to be formulated entirely with integers, we certainly need the hypotenuse of the right triangles to be an integer. Further, to make the blocks integer heights then both of the legs of the right triangles along the vertical radius must also be integer. So, if the triangle on the left had hypotenuse r, vertical leg b and horizontal leg a, then we know that a = \sqrt{r^2 - b^2} where both r and b are integers. Thus, a must be the square root of an integer.

It is easy to see that if a were a non-integer rational number, then its square would also be a non-integer rational number. So, either a is integer or it is the irrational square root of an integer.

So, can a be an irrational number? If a were an irrational number, then the horizontal leg of the other triangle would have to be some integer n minus the irrational a for the distance between the two blocks to be an integer. Let’s say the vertical leg of the triangle on the right is \beta. Just like b in the first triangle, \beta must be an integer. We also know that (n-a)^2 = r^2 - \beta^2. This, in turn, means that a = (\beta^2 + n^2 + a^2 - r^2) / (2n). The right hand side is clearly rational, so a could not have been irrational.

So, there you have it: all of the numbers that make this problem have all integers on the drawing and an integer answer.

Finding the Perfect Hyperbola February 15th, 2010
Patrick Stein

For an application that I’m working on, I needed a way to scale quantities that range (theoretically) over the real numbers (though practically are probably between plus and minus three) into positive numbers. I wanted the function to be everywhere increasing, I wanted f(0) = 1, and I wanted control of the derivative at x = 0.

The easy choice is: f(x) = e^{\alpha x}. This is monotonically increasing. f(0) = 1 and f^\prime(0) = \alpha.

I needed to scale three such quantities and mush them together. I thought it’d be spiffy then to have three different functions that satisfy my criteria. The next logical choice was f(x) = 1 + \mathrm{tanh} (\alpha x). It is everywhere positive and increasing. And, it has f^\prime(0) = \alpha.

Now, I needed third function that was always positive, always increasing, had f(0) = 1 and f^\prime(0) = \alpha. One choice was: f(x) = e^{e^{\alpha x} - 1}. But, that seemed like overkill. It also meant that I really had to keep my \alpha tiny if I didn’t want to scale things into the stratosphere.

Playing with hyperbolas

So, I thought… why don’t I make a hyperbola, rotate it, and shift it so that the apex of one side of the hyperbola is at (0,1). And, I can adjust the parameters of the hyperbola so that f'(0) = \alpha. After a variety of false starts where I tried to keep the hyperbola general until the very end (\frac{x^2}{a^2} - \frac{y^2}{b^2} = r^2, rotated by \theta degrees, and shifted by \beta), I quickly got bogged down in six or seven incredibly ugly equations in eight or nine variables.

So, it was time to start trying to make it easy from the beginning. I noticed that if was going to rotate it by an angle \theta in the clockwise direction, then I needed \phi = \frac{\pi}{2} - \theta to be such that \tan \phi = \alpha if my slope was going to work out right in the end. So, I’m looking at the basic triangle on the right then to determine all of my sines and cosines and such.

Based on that triangle, it was also obvious that the asymptote for my starting hyperbola had to have \frac{b}{a} = \frac{1}{\alpha}. I played around a bit then with making a = \alpha and b = 1. In the end, I found things simplified sooner if I started with a = 1 and b = \frac{1}{\alpha}.

I also needed r to be such that the point (-r,0) rotate up so that its y-coordinate was 1. This meant that r \sin \theta = 1 or r = \frac{1}{\sin \theta} = \sqrt{1 + \alpha^2}.

So, my starting hyperbola then was: x^2 - \alpha^2 y^2 = 1 + \alpha^2.

From there, I had to rotate the x and y by \theta in the clockwise direction. This gave me:

(x\cos\theta - y\sin\theta)^2 - \alpha^2 (x\sin\theta + y\cos\theta)^2 = 1 + \alpha^2

A little multiplying out leads to:

(x^2\cos^2\theta - 2xy\sin\theta\cos\theta + y^2\sin^2\theta) \\ - \alpha^2(x^2\sin^2\theta + 2xy\sin\theta\cos\theta + y^2\cos^2\theta) = 1 + \alpha^2

From there, using cos^2\theta = \frac{\alpha^2}{1 + \alpha^2}, sin^2\theta = \frac{1}{1 + \alpha^2}, and \sin\theta\cos\theta = \frac{\alpha}{1 + \alpha^2}, we come to:

\frac{-2\alpha(1 + \alpha^2)xy + ( 1 - \alpha^4 )y^2}{1 + \alpha^2} = -2\alpha{}xy +  (1 - \alpha^2)y^2 = 1 + \alpha^2

The only step remaining was to shift it all over so that when y = 1, we end up with x = 0. Plugging in y = 1, we see that -2x\alpha + 1 - \alpha^2 = 1 + \alpha. That equation balances when x = -\alpha. We want it to balance when x = 0, so we’re going to substitute (x - \alpha) in place of the x in the above equation to get:

-2\alpha(x - \alpha)y + (1 - \alpha^2)y^2 = 1 + \alpha^2

We can easily verify that the point (0,1) is on the curve. And, we can implicitly differentiate the above to get:

-2\alpha(x - \alpha)\frac{dy}{dx} - 2\alpha{}y + 2(1 - \alpha^2)y\frac{dy}{dx} = 0

Plopping x = 0 and y = 1 in there, we find that \frac{dy}{dx} = \alpha.

This is pretty good as it goes. The only step is to complete the square to find a nicer expression for y in terms of x. We start by adding \left[ \frac{\alpha}{\sqrt{1 - \alpha^2}} (x - \alpha) \right]^2 to both sides to get:

\left[ \sqrt(1 - \alpha^2)y - \frac{\alpha}{\sqrt{1 - \alpha^2}(x - \alpha)} \right]^2 = 1 + \alpha^2 + \frac{\alpha}{1 - \alpha^2} (x-\alpha)^2

This is easy enough to solve for y by taking the square root of both sides and shuffling some things about:

y = \frac{\alpha}{1 - \alpha^2}(x - \alpha) + \frac{1}{\sqrt{1 - \alpha^2}} \sqrt{1 + \alpha^2 + \frac{\alpha^2}{1 - \alpha^2}(x - \alpha)^2}

Here are all three curves with \alpha = \frac{1}{4}. The exponential is in black, the hyperbolic tangent is in red, and the hyperbola is in blue:

The first image on the page here was made with Zach’s Vecto library with some post-processing in the GIMP. (Here is the source file: hyperbola.lisp.) The second image was made entirely within the GIMP. And, the last image was made using Foo Plot and the GIMP.

Clifford Algebras for the Non-Mathie May 17th, 2009
Patrick Stein

I was at a party last night. I mentioned to someone that I was a math geek.

She asked, What kind of math are you into now?

I said, I really want to learn about Clifford Algebras.

She replied, What are they like?

Me, I did the deer in headlights thing. I had no idea of her math level. I had my doubts that she’d ever done Calculus. I would guess that the quadratic equation was the defining aspect of what she thought of as Algebra. I didn’t know where to start.

In thinking back now, I could have at least said something constructive.

What I Could Have Said

So, the algebra you learned about in high school was just the tip of a huge body of mathematics. If you take away the idea that you have to plug a number in for x and just look at the what you can do with the x‘s still in there, there is a whole structure going on. At more advanced levels, mathematicians work with those structures and other structures like them.

Do you know what a vector is? One of the easiest ways to think about it is this. Suppose you’ve got a number on the number line. You can kinda think of that number as a one-dimensional vector. It tells you which direction to go (positive or negative) from zero and how far to go. Now, if you take something with more dimensions than a line, like a a two-dimensional surface or three-dimensional space, you can still have the idea of what direction to go from zero and how far to go. You just have to broaden your idea of which direction to go.

So, the algebra that you did through high school is all centered on having x represent a number (a one-dimensional vector). But, things get a lot hairier if you let x represent a three-dimensional or an eight-dimensional vector. People generally agree about how to add vectors. Now, you’ve got to pick some way to multiply two vectors together.

Clifford Algebras are one system for multiplying vectors together.

If No One is Hyperventilating, Continue…

You can multiply a number by a vector to get another vector. You can also multiply a vector by a vector to get a bivector. You can multiply a vector by a bivector to get a trivector. Etc. Actually, the etc. is misleading there. It doesn’t go on forever. With Clifford Algebras, you have to pick how many dimensions your vector has. You can’t multiply a three-dimensional vector by a four-dimensional vector.

Plus, if you started with a two-dimensional vector, then bivectors are as big as you get. If you multiply a two-dimensional bivector by a two-dimensional vector, you get a two-dimensional vector. If you multiply a two-dimensional bivector by another two-dimensional bivector, you just get a number.

But, all of this is probably more than you wanted to know…. at a party….

Updates In Email

Email:

l