Presentation by Josh Marchán to the Twin Cities Lisp Users Group. The presentation is an introduction to the Sheeple Prototype-based Object Oriented Programming Framework. The presentation was recorded on 2009-10-22.
Sheeple Presentation from TC Lispers Meeting
October 25th, 2009
Patrick Stein
Common Lisp vs. C/ASM: Battle FFT
October 16th, 2009
Patrick Stein
Dr. David McClain of Refined Audiometrics sent me his code for interfacing with the Apple vDSP library. Here it is along with a Makefile that I put together for it: vdsp-wrapper.tar.gz. I have not actually run the code myself. I started to convert his Lispworks FLI code to CFFI code, but bailed on it in favor of making a minimal CFFI interface to FFTW.
Dr. McClain says that Apple’s vDSP library usually gains him 20% or so versus FFTW. Further, there may well be plenty of alignment issues that should be taken into account in allocating the foreign buffers. Regardless, here is my very basic wrapper around FFTW’s fftw_plan_dft_1d() function: fftw.lisp.
To get this working with SBCL on my Mac, I needed to get FFTW compiling 32-bit shared libraries. By default, it compiles 64-bit static libraries on my system.
% CC="gcc -arch i386" ./configure --prefix=/usr/local --enable-shared
Before I recompiled for 32-bit though, I manually moved the libfftw3.* files in /usr/local/lib into libfftw3_64.* files (of course, moving both the symlink libfftw3.dylib and its target).
Doing the FFI to FFTW does a 1,048,576-sample buffer in 0.202 seconds under SBCL 1.0.30 versus 0.72 seconds for my all-Lisp version. Under Clozure-64, it takes 0.91 seconds versus nearly 30 seconds for the all-Lisp version. I should check to make sure FFTW performs as well when my buffer is randomly filled instead of filled with a constant value. But, yep. There you have it.
So, certainly, if you need high-performance FFTs, you’ll want to FFI wrap something rather than use my all-Lisp version. If you don’t need such high-performance, I give you no-hassle portability.
Adventures in Package Dependencies
October 14th, 2009
Patrick Stein
A variety of people have asked how the speed of my Lisp Fourier Transform Library compares to an FFI call to the FFTW Library. I downloaded the FFTW Library, built it, and installed it. Then, I tried to get someone else’s FFI wrappers for the FFTW Library going. I worked on this for two hours several weeks ago. Today, I put almost an hour into it. If it’s going to get done, I’m just going to have to write my own wrapper.
Here’s the problem…. package dependencies. This wouldn’t be so much of a problem if all of the packages were in the Cliki and thus accessible to ASDF-Install. Most of them are not.
To install fftw-ffi, I needed to install ch-image and clem. One of those depended on ch-util. That (and fftw-ffi) depended on ch-asdf. That depended on smarkup. Something there depended on cl-typesetting which depends on cl-pdf. The ch-asdf also depended on cl-bibtex and ch-bib (who knows why? or where to find it?). And, something in the above depended on cl-fad. CL-FAD was the only one in that list that ASDF-Install could find on its own. Several of the typesetting and PDF libraries along with the ch-image library did not build properly. There was much reassuring ASDF-Install that everything was going to be okay even if there were no PGP signatures for half of this or if the PGP signatures didn’t jive with the owner of the website. There was also much reassuring ASDF-Install that I wanted it to try to forge ahead even if it couldn’t build cl-bibtex for me.
I couldn’t find a list of projects supported by clbuild, so I went to build darcs to fetch clbuild. Did you know darcs needs the GNU Multi-precision Library? I’m assuming it uses its own hashing functions that want big integers or something, but I wasn’t really expecting all of that. So, after a half hour getting clbuild, it knows nothing about fftw-ffi.
Alright, maybe I managed to get enough stuff installed that I can just load fftw-ffi now? No dice. The fftw-ffi seems to have been generated with some tool that spits out an XML file and a corresponding C file. Then, it tries to use special ASDF tricks to get the C file compiled. Alas, these blow up on me.
don't recognize component type FFTW-FFI-GCC-XML-C-SOURCE-FILE
Erf. Defeated.
Lisp Fourier Transform Library Faster Yet (v1.3)
October 13th, 2009
Patrick Stein
I released version 1.3 of my Common Lisp Fourier Transform library today. It is significantly faster than yesterday’s version. On my MacBook Pro in SBCL 1.0.30 with a 512 by 512 by 16 array, version 1.3 takes 3.74 seconds where version 1.2 took 9.77 seconds and version 1.0 took 25.31 seconds. For a breakdown of performance on various array sizes with various Lisp implementations, see the performance section of the library page.
Most of the speed improvement in this version came from memory improvements. Version 1.3 doesn’t cons at all in SBCL during the processing of each row. Version 1.2 inadvertently consed three rows worth of complex numbers for every row transformed.
My library is still about 25% slower than Bordeaux FFT for one-dimensional arrays. My library, however, has full support for multi-dimensional arrays where Bordeaux-FFT does not.
I also included some validation test cases using NST to this release. To try them, go to the library directory, hop in your Lisp, and load regression.lisp
.
Lisp Fourier Transform Library Faster
October 12th, 2009
Patrick Stein
I released version 1.2 of my Common Lisp Fourier Transform library today. It is significantly faster than the old version. On my MacBook Pro in SBCL 1.0.30 with a 512 by 512 by 16 array, version 1.2 takes 9.77 seconds where version 1.0 took 25.31 seconds. Version 1.2 consed 2.4M while version 1.0 consed 7.4M. Neither version should have to cons nearly that much, so there’s still work to be done.
This release is significantly faster under Clozure64 1.3-r11936, too. For a 512 by 512 array, version 1.2 takes 2.72 seconds (single-threaded) and 2.07 seconds (multi-threaded) where version 1.0 took 4.45 seconds (single-threaded) and 3.06 seconds (multi-threaded).
My library is still about 1/2 to 1/3 the speed of Bordeaux FFT and still allocates memory in situations that Bordeaux does not. Hopefully, I can close those gaps most of the way soon.