Faster Primes with Heaps January 28th, 2013
Patrick Stein

I read a nice blog post this morning about implementing the Sieve of Eratosthenes in Lisp. It got me thinking about something that I did a few weeks back.

A few weeks back, I needed to generate all of the primes up to a given number. The number was big enough that I didn’t want to keep an array of bytes (or even bits) big enough to do the sieving. I just had an aversion to keeping around O(n) memory when O(n / \ln n) would suffice. I also had a hankering (probably because of Let Over Lambda) to implement it as a generator—a function that each time you called it gave you the next prime.

My first version amounted to trial-division against the entire list of primes generated so far. For the number of primes that I wanted to generate, this took more than nine minutes. Unacceptable.

I made some tweaks to it to ensure that I was only checking n against the previously generated primes that were less than or equal to (isqrt n). I made some other tweaks to keep my previously generated primes sorted so the most-likely to be a divisor of n were first. I made another tweak to accumulate the first k primes into a product so that I could check all of those against n with a (gcd n product-of-first-k) where k was constrained so that the product stayed below most-positive-fixnum.

Those tweaks got me down to around two minutes. That was still more than twice as long as I wanted it to take.

Now, with the help of CL-HEAP and the Rosetta-Code-approved optimization mentioned in that blog post, I have it down to thirty seconds with no trial division, no square roots, and only one multiplication per prime.

I keep a heap of cons cells. The car is the next number that will be crossed out by this prime and the cdr is how much to increment the car once I’ve crossed out the car.

Ignoring the special-case at the beginning for two, my current algorithm looks like this:

(defun prime-generator ()
  (let ((nixed (make-instance 'cl-heap:fibonacci-heap
                              :key #'first
                              :sort-fun #'<))
        (guess 1))
   
    (lambda ()
      (loop :do (incf guess 2) :until (%prime-p guess nixed))
      (cl-heap:add-to-heap nixed (cons (* guess guess) (* guess 2)))
      guess)))

The %prime-p function pulls cons cells off of the heap until it finds whose car isn’t equal to the current guess. For each of them where the car was equal, it increments the car by its cdr. Then, it puts all of the cons cells it removed back on the heap. The guess was prime if it wasn’t equal to any of the numbers on the heap.

(defun %prime-p (guess nixed)
  (let ((prime t)
        (n nil))
    (loop :do (setf n (cl-heap:pop-heap nixed))
       :while (and n (= (car n) guess))
       :do (progn
             (incf (car n) (cdr n))
             (cl-heap:add-to-heap nixed n)
             (setf prime nil))
       :finally (when n (cl-heap:add-to-heap nixed n)))
    prime))

I could probably save some time and cons-ing by peeking at the top of the heap rather than always gathering or countless other ways, but this more than exceeded my expectations for today.

Whittling Away with Repetition January 15th, 2013
Patrick Stein

I’ve now done the Bowling Game Kata from the previous video seven more times imperatively as I did in the video and two more times with a functional style.

Each time I did it imperatively, I decreased the overall size of the code. The functional code is smaller still and got smaller on my second try. The previous video was bowling-20130102.lisp and the functional versions end in -f.lisp.

% lisp_count *.lisp
70 bowling-20121230.lisp
70 bowling-20130102.lisp
64 bowling-20130106.lisp
62 bowling-20130107.lisp
64 bowling-20130109.lisp
66 bowling-20130110.lisp
62 bowling-20130113.lisp
54 bowling-20130113b-f.lisp
57 bowling-20130114.lisp
57 bowling-20130114b.lisp
51 bowling-20130114c-f.lisp
Total:
677

These SLOC numbers were generated using David A. Wheeler’s ‘SLOCCount’. Straight up byte, word, and line counts follow the same trend.

Soon, I will record another run through the imperative style and a run through the functional style.

Here are the most recent versions of each style:

The Bowling Game Kata in Common Lisp January 3rd, 2013
Patrick Stein

The Bowling Game Kata in Common Lisp from Patrick Stein on Vimeo.

Code Kata are repetitive coding tasks designed to help one internalize certain patterns, methodologies, or tools. In this video, I go through Uncle Bob Martin’s The Bowling Game Kata. The Kata exercises test-driven development. Uncle Bob’s presentation of the Kata is in Java. This is in Common Lisp.

Nick Levine’s cl-log rocks… December 10th, 2012
Patrick Stein

Whenever I need logging in my Common Lisp applications, I use Nick Levine’s cl-log library. For many applications, spending the time to run CL:FORMAT during runtime is unreasonable. It is often much better to log something in a raw form and do any formatting/presentation offline rather than online. The cl-log library is the only one that I know of which supports binary logging readily.

Additionally, Nick is incredibly responsive. I found an oversight in a corner-case of cl-log last Friday at 1am. I emailed him about it at 1:19am. In under 5 hours, I had email back pointing me to a new release with that case corrected.

I hope to do a more complete comparison of the existing log libraries at some point. At this point, know that I’ve tried all of the ones on the Cliki which had obvious documentation or examples, and I’ll be using cl-log.

Coder’s Asceticism October 14th, 2012
Patrick Stein

A month or two ago, I watched a bunch of videos of Uncle Bob speaking at various conferences. He’s thought a lot about what makes code testable and maintainable, and he’s an excellent speaker. I’ve even paid to watch some of his hour-ish videos at his Clean Coders site.

In Clean Code, Episode 3 – Functions, he makes the case that we should be striving to keep each function under five lines long. So, over the past month or so, I’ve been trying this out in all of my Scala and Lisp endeavors.

Once you get each function down to four lines or fewer, you start running into symbol-proliferation. You pretty much throw out (LABELS ...) and (FLET ...) unless you need the closure (and then try to say that when you can’t that it’s good enough to just keep the subfunctions under five lines and the body under five, too). You create bunches of functions that should never be called anywhere except the one place it’s called now.

Enter Package Proliferation!

Then, I remembered some Lisp coding style guidelines that I read early this year sometime. Those guidelines advocated having one package for each Lisp file and explicitly making peer files have to import or qualify symbols used from other peer files. This will help me with the symbol-proliferation! Let’s do it!

Enter File Proliferation!

After two weeks of that, I woke up yesterday morning realizing that I wasn’t getting the full benefit from this if I was exporting more than one symbol from each package. If keeping each function tiny is Good and limiting a package to one file is Good, then it is certainly Bad to have one file’s package exporting eight different functions.

So, today, I reworked all of the UNet that I wrote in the last couple of weeks to have one package per file and as few exported symbols per package as possible.

In the interest of testability and modularity, I had broken out a subpackage of UNet called UNet-Sockets that is to be the interface the UNet library uses to interact with sockets. I had realized that I had a file called methods.lisp with four different publicly available (DEFMETHOD ...) forms in it. Now, each is in its own file and package. I did the same for various other declarations.

Now, there is a top-level package which uses Tim Bradshaw’s Conduit Packages to collect all of the symbols meant to be used externally from these packages. If there is an exported function in a package, that is the only exported symbol in that package. If there is an exported generic function declaration in a package, that is the only exported symbol from that package. If there is an exported method implementation in a package, there aren’t any exported symbols from that package. If there is an exported condition in a package, that condition and its accessors are the only exported symbols from that package. If there is an exported class in a package, that class and its accessors are the only exported symbols from that package.

Example

The top-level package of the UNET-SOCKETS system, that defines the interface that UNet will use to access its sockets, consists of just a (DEFPACKAGE ...) form. You can see the full package on github in the unet-sockets.asd and the sockets/ directory.

(org.tfeb.conduit-packages:defpackage :unet-sockets
  (:use :cl)
  (:extends/including :unet-sockets-interface
                      #:sockets-interface)
 
  (:extends/including :unet-sockets-base-socket
                      #:base-socket)
 
  (:extends/including :unet-sockets-hostname-not-found-error
                      #:hostname-not-found-error
                      #:hostname-not-found-error-hostname)

  (:extends/including :unet-sockets-get-address
                      #:get-address)

  (:extends/including :unet-sockets-address+port-not-available-error
                      #:address+port-not-available-error
                      #:address+port-not-available-error-address
                      #:address+port-not-available-error-port)

  (:extends/including :unet-sockets-create-datagram-socket
                      #:create-datagram-socket)
 
  (:extends/including :unet-sockets-send-datagram
                      #:send-datagram)
 
  (:extends/including :unet-sockets-poll-datagram
                      #:poll-datagram))

Conclusion?

The jury’s still out on this for me. On the one hand, the self-flagellation is an interesting exercise. On the other hand, if I’m going to have to export everything useful and import it in multiple places then I’m taking away some of the fun. I feel like I’m maintaining C++ header files to some extent.

I think I’ll keep it up for the UNet project, at least. It makes good documentation to have each exported symbol in a package and file named for that exported symbol. You can open up any file and read the three-to-four line exported function in it. You can see, right there with it, any supporting function it uses that no other files need to use. You can see from the (DEFPACKAGE ...) form at the top where any other supporting functions came from. And nothing else is cluttering up the landscape.

We’ll see if I develop religion on this topic. At the moment, I think it’s eventually going to fall into the moderation in all things bucket. Time will tell.

Updates In Email

Email:

l