Public Domain Fourier Transform Library for Common Lisp October 7th, 2009
Patrick Stein

In response to my recent post about genetically selecting cosine waves for image approximation, several reddit commentors said that I should just have taken the Fourier transform, kept the largest 100 coefficients and did the inverse Fourier transform. [I haven’t run the RMS calculation on the result yet, but visually, it looks pretty nice with my test image. More details on that in a later post.]

The Lisp code that I used in that article didn’t actually use Fast Fourier Transforms. To test the commentors’ claims, I needed an FFT library in Lisp. I searched around a bit and found an FFI (Foreign Function Interface) wrapper around the FFTW library. After fiddling with that library and the wrapper for about two hours, I bailed on it and wrote my own Fast Fourier Transform library entirely in Lisp.

More information as well as links to getting the code are here: nklein / Software / CL-FFT.

Image Approximation with Genetically Selected Cosines October 2nd, 2009
Patrick Stein

input imageI started with the target image here on the right. I then used a genetic algorithm to try to find an approximation of this image as a sum of cosine waves.

Each cosine wave has the following form

f_i(x,y) = A_i \cos \left( 2 \pi \left( x \cos \theta_i + y \cos \theta_i \right) \omega_i + 2 \pi \phi_i \right)
The genetic algorithm can tweak the amplitude A_i, the orientation in the plane \theta_i, the frequency \omega_i, and the phase \phi_i of each wave.

out-0442 The image at the right here, is from the 443rd generation of a population of 200 where each gene represents 100 cosine waves. For the video below, I used a population of 100 genes, each representing the sum of 50 cosine waves.

Image Approximation with Genetically Selected Cosines from Patrick Stein on Vimeo.

Nitty Gritty Details

The genes are represented internally as an array of fixed-point numbers. Each number is five-bits integer, ten-bits fractional, and a sign bit. I scale the amplitudes down by a factor of 50 and scale the frequency and offset up by 50% when I convert them back to floating point from the fixed point representation just to get them into a more useful ballpark. During each generation, the fittest two quartiles of the population are cross-bred (in randomly selected pairs with two-point crossover) to replace the third quartile of the population, and the fourth quartile is replaced by mutations of randomly selected members of the top quartile. I also run through and remove any duplicate genes and replace them with randomly generated ones so that I don’t breed myself into a total corner.

I toyed around with two different fitness functions: RMS and entropy. So, I would calculate the image represented by the gene and subtract that from the original image. Then, I would calculate the RMS (\frac{\sum_{i=1}^{n} v_i^2}{n}) or the sum of the entropies of the possible values (\sum_{v=0}^{255} - p_v \log p_v, where p_v is the proportion of the pixels in the resulting image that had value v).

I bothered with entropy because I was hoping that if I could encode a gene in just a hundred or two hundred bytes that I might be able to compress the remainder (losslessly) and save bytes over just compressing the original image. The big problem with the entropy calculation though is that I get the same answer when every pixel value is off by 100 as I get when every pixel value is right on target. So, for a time, I was multiplying the RMS and the entropy for the fitness function. For the video and images above, I just used the RMS since it reached very close to the same solution as the multiplying method, and I saved myself fifty histograms per generation.

Source Code

Here is the Common Lisp source code for the genetic algorithm. It depends on the CL-PNG library for image input and output. As it appears here, it will create an output image from the fittest member of each generation, an image of the fittest member of the current generation, an image of the original with the fittest member of the current generation subtracted, a Lisp file containing the fittest gene of the current generation, and the current state of the random number generator when the run ends.

For the video above, I quickly put together some code to demonstrate the different parameters that the genetic algorithm can tweak and to show incrementally adding in more of the cosine waves. Here is some general utility code that is display-independent. And, here is the OpenGL display code which depends on the CL-OpenGL libraries. I included the source code here because I had trouble finding an example on the net of using Lisp-created textures with CL-OpenGL. I render the cosine waves into a buffer, copy that buffer into an OpenGL texture, and draw a quad with that texture. Note: it looks a little sluggish when run, but that is because I put a big (SLEEP …) in there to keep things from flying by too quickly.

Fourier Transform of Swarm Data with HUD September 25th, 2009
Patrick Stein

I added a heads-up display to the code from the previous article. This display shows you the coordinates of each of the bees.

The leader-bee is thick white string with green beads. The follower-bees are the thin black strings with red beads. Going clockwise from the top, the beads on each string represent amplitude, x-frequency, y-frequency, x-phase, and y-phase. The outside of the HUD represents positive infinity. The inner circle of the HUD represents negative infinity. The light yellow circle between the infinities represents zero. Note: the scaling is extreme. Two is almost 90% of the way from zero to infinity on this scale.

Fourier Swarm with Bee Display from Patrick Stein on Vimeo.

Technical details of the HUD scaling

Say you have a number m that you want to map into the interval [-1,1]. Draw the upper half of the unit circle centered at the origin. Start moving from left to right along the circle until you find the spot with a tangent of slope m. Use the x-coordinate of this point. To map that interval onto my HUD, subtract your calculated x-coordinate from two and use the result as the radius from the center of the HUD.

Here is the Lisp source code. It depends on Zach’s Vecto library.

Edit: Oops… there is a bug in the HUD code. The leader and first follower are not shown. The second follower is the thick white line. Ooops. Okay… I have replaced the video and the source code now.

Inverse Fourier Transform of Swarm Data September 24th, 2009
Patrick Stein

Inspired by Neil Baylis‘s Blob Thing program that I talked about previously, I made some Fourier Transform art of my own.

Take a leader-bee in five-dimensional space. Give him a place to go in five-dimensional space. When he gets close to his destination, move his target spot. Have ten follower-bees try to follow him. The follower-bees can fly faster than the leader bee. They are also super eager. They always accelerate at their maximum possible acceleration.

Now, for each follower-bee, consider how far away it is along each axis from the leader-bee. Interpret those deltas as an amplitude, a horizontal frequency, a vertical frequency, a horizontal phase angle, and a vertical phase angle. Compute the inverse 2-D Fourier transform of those components from the follower-bees.

Color code the output by the height. Add in some specular highlights. Animate through time.

Fourier Swarm from Patrick Stein on Vimeo.

I have used a similar technique in the past to generate ambient music. There, I made twenty follower-bees in thirty-two dimensions and used the average distances along each axis to the leader-bee as the volume of the note (each axis had a different pitch). Here, the results are visual instead of aural. I also cannot generate them in realtime except for very tiny sizes. *shrug*

Here is the Lisp source code. It depends on Zach’s ZPNG library for output.

Fourier Transforms in JavaScript September 2nd, 2009
Patrick Stein

In preparation for a study group on Numeric Photography, I started toying with what language to use for the coding projects.

Feature shopping

My inclination, of course, was to go with Lisp. Part of the goal of the assignments, however, was to make interactive web applications for each assignment and a gallery at the end of the completed assignments. I experimented a bit with Armed Bear Common Lisp. The hurdles that I would need a user to go through to use my application safely were too great. The user would have to edit their own java.policy file to give my applet permission. If I wanted to let the user do that with a single click or something, I would have to spring for a $200/year code-signing cert.

My next choice then was stand-alone Lisp applications. Rather than online, interactive applications, I would have downloadable ones. Not ideal, but doable. The big hurdle here is crafting a GUI to run the application. The GUI wouldn’t be too terrible thanks to an interesting concept that Dirk Gerrits pointed out: Immediate Mode GUIs. I have already implemented the buttons from this tutorial in OpenGL under SBCL.

Late one night, I remembered Parenscript. It is a Lisp-like language that compiles to Javascript. Parenscript was a little bit more its own thing and a little bit less Lisp than I had wanted. But, it got me thinking. What can you do in Javascript these days?

I searched for image processing javascript. The first hit was Pixastic. It is a library of basic image processing functions that you can run on any image on your page. At its heart, it uses the Canvas element that is available on Safari, Firefox, and Opera. So, I figured, why not? If I can get an FFT working in Javascript, then there is nothing keeping me from implementing all of the assignments in Javascript. Yes, Javascript is awful, but it’s a dream compared to Java. The new Javascript debugging features in Safari are quite useful, too.

Success

Below is a simple example of what I have working. When you first load the page, you see an image of my son and my dad at the Pittsburgh Zoo. This image is actually loaded in a hidden image and copied into a visible canvas. You are seeing the canvas.

If you press the FFT button and wait for it, you will see a representation of the 2-D Discrete Fourier Transform of the image. Really, the DFT output is complex numbers. The representation is just the magnitude of the complex numbers. The actual complex numbers are kept around in a variable so that when you hit the IFFT button, it has enough information to actually reconstruct the image. So, the FFT button works from the pixels in the canvas. The IFFT button works from the data generated with the last FFT. So, unfortunately, you cannot meaningfully FFT-FFT-IFFT-IFFT here. It is interesting, but not meaningful.

Also, I was trying to add a text box here so that you could use your own image, but I am not getting it right. You’re going to have to wait until I finish the whole assignment.




You need a browser that supports the Canvas tag.
Firefox, Safari, and Opera should all work.



Updates In Email

Email:

l