Say What You Mean September 19th, 2015
Patrick Stein

There is a scene in the movie The Birdcage where the son tells his father that he (the son) has met a girl and is going to get married. The father begins gulping down the glass of wine that he has in hand. The son asks, “Are you upset?” The father finishes the glass of wine and says, “Let me tell you why.”

Here is a function that I wrote several years ago.

(sheeple:defreply mouse-move ((item =draggable=) xx yy)
  (let ((dragging (dragging item)))
    (when dragging
      (let ((dx (- xx (car dragging)))
            (dy (- yy (cdr dragging))))
        (incf (offset-x item) dx)
        (incf (offset-y item) dy))
      (let ((pp (parent item)))
        (when pp
          (when (< (width pp) (+ (offset-x item) (width item)))
            (setf (offset-x item) (- (width pp) (width item))))
          (when (< (height pp) (+ (offset-y item) (height item)))
            (setf (offset-y item) (- (height pp) (height item))))))
      (when (< (offset-x item) 0) (setf (offset-x item) 0))
      (when (< (offset-y item) 0) (setf (offset-y item) 0))
      t)))

This is awful! Am I upset? Let me tell you why.

Is it the Single Responsibility Principle (SRP)? No.

Is it Don’t Repeat Yourself (DRY)? No.

Is it Mixing Levels of Abstraction? Closer, but not quite.

Those are all clearly violated by this code. But, that’s not really the problem. The problem is Why. Nothing about this code tells you why it is here or what it is doing.

There is no way to glance at that function and have any idea what’s going on. You have to read it carefully. You have to understand things that aren’t even in this source file to make head or tail of it. Once you understand the second LET block, you will have nine more lines of code without the least inkling of why there should be nine more lines of code. Anyone care to hazard a guess as to why this function returns T (only) when we’re dragging?

Encapsulation

Two years ago, a colleague and I were tasked with providing docstrings for every function in all of the code we’d written in the last year. We’d done well on providing docstrings to the outward-facing functions, but now we had to do the rest. He started at one end of the directory (in alphabetical order), and I started at the other end. This gave me a good opportunity to look closely at a boat-load of code he’d written that I’d never really delved into before.

He was absolutely religious about encapsulating containers. If he had a hash-table or a p-list or a flat list in a DEFVAR, there was one and only one function that retrieved items from it and at most one function that added items to it. Those functions were one or two lines each (two if they needed a mutex). Those functions were named after what the collection was storing, not after the mechanism used to store it.
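A minimal sketch of that style, assuming a made-up registry of players. The names *PLAYERS-BY-NAME*, FIND-PLAYER, and REGISTER-PLAYER are invented for illustration and are not taken from any real code base.

```lisp
;;; Hypothetical example: every name below is invented for illustration.
(defvar *players-by-name* (make-hash-table :test #'equal)
  "Maps a player's name (a string) to that player's record.")

(defun find-player (name)
  "Return the player registered under NAME, or NIL if there is none."
  (values (gethash name *players-by-name*)))

(defun register-player (name player)
  "Remember PLAYER under NAME and return PLAYER."
  (setf (gethash name *players-by-name*) player))
```

Callers never see GETHASH, and a break-point on REGISTER-PLAYER catches every insertion.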

A lot of times when people talk about the value of encapsulating, they talk about shielding the rest of the code from the implementation details so that if you need to replace how it’s actually implemented on the back end you can do it without breaking any existing code. You are protecting your precious implementation from how people will use it so that you can someday replace the implementation with an even more precious implementation next year (when your language finally gets first-class functions).

I’ve been coding for a good long time now. I’m going to let you in on a little secret. Code almost never gets replaced. When code does get replaced, it almost never continues to adhere to the old API (there was always a semantic leak). If there is a business justification strong enough to let you replace the code, it’s because the old code has become an unmaintainable mess of people subverting the interface or the code as it is didn’t scale and now synchronous things need to happen asynchronously or local things have to happen remotely and hiding that under your old API isn’t going to relieve the bottlenecks.

Trying to insulate your code so that it’s easy to replace is looking down the wrong end of the telescope. The real benefit of encapsulation is that the people who read your code later can be half-asleep and still get everything—your code will scream its meaning. The real benefit of encapsulation is that the person debugging your code can set a break-point in a place that means something—not in the seventeen places the state might have changed but in the only place it could change.

Making It Better

Any ideas what the body in this function does?

(sheeple:defreply mouse-move ((item =draggable=) xx yy)
  (if (being-dragged-p item)
      (handle-event ()
         (let ((dx (- xx (drag-starting-x item)))
               (dy (- yy (drag-starting-y item))))
           (translate-widget item dx dy)
           (keep-widget-inside-parent item)))
      (ignore-event ())))

The new functions BEING-DRAGGED-P, DRAG-STARTING-X, and DRAG-STARTING-Y are just wrappers around what had been explicitly treated as an (OR NULL (CONS INTEGER INTEGER)).

(defun being-dragged-p (item)
  (dragging item))

(defun drag-starting-x (item)
  (car (dragging item)))
(defun drag-starting-y (item)
  (cdr (dragging item)))

It is still an (OR NULL (CONS INTEGER INTEGER)) but nobody ever has to care. Nobody ever has to try to remember what the integers mean. Sure, you could replace it with a structure or a complex number, but why would you ever bother? Why would you ever look at it again?

The new macros HANDLE-EVENT and IGNORE-EVENT encapsulate the return value of this function into something with meaning.

(defmacro handle-event (() &body body)
  `(prog1
       (values t)
     ,@body))

(defmacro ignore-event (() &body body)
  `(prog1
       (values nil)
     ,@body))

It might still be too easy to write an event-handler with a path which doesn’t end in one of these two macros, but it is way better than that dangling T was. It looks like it’s really supposed to be there, and it looks like what it means rather than what it is.

The TRANSLATE-WIDGET and KEEP-WIDGET-INSIDE-PARENT functions can benefit greatly from some further helper functions (and analogous functions for top and bottom):

(defun left (item)
  (offset-x item))
(defun (setf left) (x item)
  (setf (offset-x item) x))
(defun right (item)
  (+ (left item) (width item)))
(defun (setf right) (x item)
  (setf (offset-x item) (- x (width item))))
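With those helpers in hand, the two functions might look something like the sketch below. It assumes a plain WIDGET structure as a stand-in for the Sheeple objects, and it spells out TOP and BOTTOM by analogy with LEFT and RIGHT. Treat it as one plausible shape, not the actual implementation.

```lisp
;;; Stand-in for the Sheeple objects: a plain structure (an assumption).
(defstruct (widget (:conc-name nil))
  (offset-x 0) (offset-y 0) (width 0) (height 0) (parent nil))

;;; LEFT and RIGHT are repeated here so that this sketch loads on its own.
(defun left (item) (offset-x item))
(defun (setf left) (x item) (setf (offset-x item) x))
(defun right (item) (+ (left item) (width item)))
(defun (setf right) (x item) (setf (offset-x item) (- x (width item))))

;;; The analogous helpers for the vertical direction.
(defun top (item) (offset-y item))
(defun (setf top) (y item) (setf (offset-y item) y))
(defun bottom (item) (+ (top item) (height item)))
(defun (setf bottom) (y item) (setf (offset-y item) (- y (height item))))

(defun translate-widget (item dx dy)
  "Move ITEM by DX horizontally and DY vertically."
  (incf (left item) dx)
  (incf (top item) dy))

(defun keep-widget-inside-parent (item)
  "Clamp ITEM to its parent's bounds; the top-left corner wins if ITEM is too big."
  (let ((pp (parent item)))
    (when pp
      (when (< (width pp) (right item))
        (setf (right item) (width pp)))
      (when (< (height pp) (bottom item))
        (setf (bottom item) (height pp)))))
  (when (< (left item) 0) (setf (left item) 0))
  (when (< (top item) 0) (setf (top item) 0)))
```

The clamping order matches the original function: the right and bottom edges are pulled in first, then the left and top edges are pinned at zero.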

Some Rules of Thumb

If you find that when you want to check (PRED1 ...) you instead have to check:

(and (PRED0 ...)
     (PRED1 ...))

Then you should consider making a function that does them both. Consider the difference between these two blocks of code:

(when (and (connectedp (player1 g))
           (connectedp (player2 g))
           (not (pausedp g)))
  ...)

(when (game-active-p g)
  ...)
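The predicate itself is nothing more than the three checks given a name. Here is a sketch, assuming hypothetical PLAYER and GAME structures whose accessors match the calls above:

```lisp
;;; Hypothetical structures standing in for the real game objects.
(defstruct (player (:conc-name nil))
  (connectedp nil))

(defstruct (game (:conc-name nil))
  player1
  player2
  (pausedp nil))

(defun game-active-p (g)
  "True when both players are connected and the game is not paused."
  (and (connectedp (player1 g))
       (connectedp (player2 g))
       (not (pausedp g))))
```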

If you find that you are depending on the NULL-ness or positiveness or some other property of some number of state variables to decide which course of action to take, then you should consider making predicates named after your state. In many OO scenarios, you may even want to explicitly track (or calculate) which state you are in at all times.

(defmacro state-case (g &body clauses)
  `(ecase (calculate-or-fetch-state-of ,g)
     ,@clauses))

(state-case g
 (:pause-screen-showing
  ...)
 (:settings-menu-showing
  ...))

In more imperative languages, it may even be beneficial to keep a STATE member variable in your class. When doing that, make sure that there is one and only one function which actually mutates the value of that STATE member. This will let you:

  1. Log all state transitions without having to hunt for all of them.
  2. Quickly hunt for all of them if you want to do that.
  3. Set a break point on all state changes.
  4. Enforce the validity of transitions (or at least scream loudly when something transitions from STOPPED to PAUSED without having passed through PLAYING first).
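The same discipline can be sketched in CLOS. Everything here is hypothetical: the MEDIA-PLAYER class, the *VALID-TRANSITIONS* table, and the TRANSITION-TO function are invented for illustration. The slot has a reader but no writer, so TRANSITION-TO is the one place the state changes.

```lisp
;;; Hypothetical media-player example; every name here is invented.
(defvar *valid-transitions*
  '((:stopped :playing)
    (:playing :paused :stopped)
    (:paused  :playing :stopped))
  "Each entry is a state followed by the states that may legally follow it.")

(defclass media-player ()
  ;; A reader only: nothing outside TRANSITION-TO gets a writer.
  ((state :initform :stopped :reader state)))

(defun transition-to (player new-state)
  "The one and only function that changes PLAYER's state."
  (let ((old-state (state player)))
    ;; Scream loudly on an invalid transition.
    (unless (member new-state (cdr (assoc old-state *valid-transitions*)))
      (error "Invalid transition from ~S to ~S" old-state new-state))
    ;; Log every transition in one place.
    (format *trace-output* "~&state: ~S -> ~S~%" old-state new-state)
    (setf (slot-value player 'state) new-state)))
```

Going from :STOPPED straight to :PAUSED signals an error here, and a single break-point on TRANSITION-TO sees every state change.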

If you have to check whether some resource is being used by some instance, don’t ask it which resource it is using; ask it whether it is using the one you want.

;;; Common: Reader is forced to know each player has one socket and
;;;    that sockets are comparable with #'=
(loop :for player :in all-networked-players
      :until (= socket-with-something-happening
                (player-socket player))
      :finally (return player))

;;; Better: All I wanted to know is, "Is this yours?"
(loop :for player :in all-networked-players
      :until (player-using-socket-p player
                                    socket-with-something-happening)
      :finally (return player))
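The predicate can be tiny. This sketch assumes a hypothetical NETWORKED-PLAYER structure with a single socket slot and assumes that sockets can be compared with EQL:

```lisp
;;; Hypothetical stand-in for the real player objects.
(defstruct networked-player
  name
  socket)   ; assumed to be something EQL-comparable, e.g. an integer

(defun player-using-socket-p (player socket)
  "True when PLAYER is the one using SOCKET."
  (eql socket (networked-player-socket player)))
```

If a player someday holds two sockets, or sockets stop being directly comparable, this is the only function that has to know.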

Encapsulation is about protecting the person who has to read your code. It’s not about protecting your code.

Syntactic Corn Syrup June 16th, 2015
Patrick Stein

I’ve been bouncing around between Java and C++ and C and loads of JNI cruft in between. At some point today, I accidentally used a semicolon to separate parameters in my C function declaration:

void JNI_myJNIMethod( int paramA; int paramB; int paramC )
{
  ...
}

It looked wrong to me. But, I had one of those brain-lock moments where I couldn’t tell if it was wrong. I was pretty sure that it was wrong by the time my brain locked on pre-ANSI K&R:

void
JNI_myJNIMethod(paramA, paramB, paramC)
  int paramA;
  int paramB;
  int paramC;
{
  ...
}

Regardless, it got me thinking about the programming maxims: “Deleted code has no bugs” and “Deleted code is debugged code.”

I never have this kind of brain-lock in Lisp. Some of that is because my Emacs configuration has been molded to my Lisp habits better than to my C/C++/Java habits. Most of it, though, is that Lisp understands the difference between syntactic sugar and syntactic cruft.

Lisp decided long ago that writing code should be easy even if it makes writing the compiler tougher. C and C++ and Java all decided that LALR(1) was more important than me. As if that weren’t bad enough, C++ and Java have thrown the lexers and parsers under the bus now, too. No one gets a free ride.

Who Won? May 28th, 2015
Patrick Stein

The web comic XKCD recently published the following tournament bracket featuring match-ups like ORSON WELLES vs. H.G. WELLS and VAN HALEN vs. VAN MORRISON vs. VAN WILDER.

[Image: XKCD tournament bracket]

So, who won? It seems probable to me that XKCD’s author Randall Munroe has in mind some way to decide these matches. If he published how the matches were to be decided, I missed it. However, based on some previous XKCD comics like Geohashing and Externalities, I think it’s a safe guess that it involves hashing.

So, how could we decide this? We could take the hash of each name and then in each match, the largest hash value wins. That, however, has the unfortunate side effect that the winner of the tournament would be the same regardless of the organization of the brackets.

I opted to decide the match between OSCAR DE LA RENTA and OSCAR DE LA HOYA by taking the SHA3-512 hash of the strings “OSCAR DE LA RENTA vs. OSCAR DE LA HOYA” and “OSCAR DE LA HOYA vs. OSCAR DE LA RENTA”. The winner is the one whose name appeared first in the string with the smallest hash value. For three- and four-person contests, I used all permutations of the players involved (separated by “ vs. ”).
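Here is a self-contained sketch of that rule. To avoid needing any libraries, it swaps in a 32-bit FNV-1a hash as a stand-in for SHA3-512 and generates the permutations by hand, so the winners it picks will not match the real tournament. All of the function names are invented for this sketch.

```lisp
;;; Sketch only. FNV-1a is a stand-in for SHA3-512 so that the example
;;; needs no libraries; all function names here are invented.
(defun stand-in-hash (string)
  "32-bit FNV-1a computed over the character codes of STRING."
  (let ((hash 2166136261))
    (loop :for ch :across string
          :do (setf hash (logand (* (logxor hash (char-code ch)) 16777619)
                                 #xFFFFFFFF)))
    hash))

(defun permutations (items)
  "Return a list of every ordering of ITEMS."
  (if (null (rest items))
      (list items)
      (loop :for item :in items
            :nconc (mapcar (lambda (tail) (cons item tail))
                           (permutations (remove item items :count 1))))))

(defun match-string (players)
  "Join PLAYERS into one string with \" vs. \" between the names."
  (format nil "~{~A~^ vs. ~}" players))

(defun decide-match (players)
  "Return the player listed first in the ordering with the smallest hash."
  (let ((best nil)
        (best-hash nil))
    (dolist (ordering (permutations players) best)
      (let ((hash (stand-in-hash (match-string ordering))))
        (when (or (null best-hash) (< hash best-hash))
          (setf best (first ordering)
                best-hash hash))))))
```

Because every ordering of the contestants gets hashed, the outcome of a match does not depend on the order in which the names were handed to DECIDE-MATCH, but it does depend on who is in the match, so rearranging the bracket can change the champion.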

The winner? RYAN ADAMS beat out BILL PAXTON in the final.

The code for this project was a breeze thanks to #'ALEXANDRIA:MAP-PERMUTATIONS and my TRACK-BEST library. Here is the meat of the whole thing, which uses an evaluation function (here, it’s SHA3-512 of the vs.-separated player list) and a way to compare the evaluations (here, a simple #'ARRAY-LESSP) and runs through all of the players.

(labels ((rank-one-match (players depth)
           (track-best:track (first players)
                             (funcall eval-permutation players depth)))

         (find-winner (players depth)
           (track-best:with-track-best (:order-by-fn compare-permutations)
             (alexandria:map-permutations (lambda (players)
                                            (rank-one-match players depth))
                                          players))))
  ...)

Here is the full source file for the tournament: tourney.lisp. Here is a text description of the whole tournament. And, here is a graphic with the outcomes of all of the matches.

[Image: final bracket with the outcomes of all of the matches]

Struggling to Keep the Faith April 11th, 2015
Patrick Stein

Six years ago, I wrote about how the choice of programming language dramatically affects the way in which I write code. In that post, I said:

In Lisp, I will pull something out into a separate function if it makes the current function more self-contained, more one idea. In C++, I will only pull something out into a separate function if I need the same functionality in multiple places.

At the moment, I’m getting paid to write in C++ for the first time in two years (technically, I suppose I did write some small C++ sample code in my previous job, too). I am struggling at the moment to make my C++ functions as small as I would have made them if I were writing in Lisp.

There are obvious barriers to this imposed by C++. While one of the projects at my work is converting a large swath of code to C++11, my project is still stuck on C++98. This means that I can’t use lambda functions, auto, or for-each style for loops and that I can’t get away with letting the compiler figure out my template types in almost any situation.

For instance, I have one function that is about twenty lines of comments and ten lines of code comprising exactly three statements. Three statements took ten lines of code because when you want to use a pointer to a member function in a std::count_if call, you need to jump through about a hundred characters of declarations and syntax to turn the pointer to a member function of one argument into a regular function of two arguments (using std::mem_fun1_t) and another pile of characters to turn it into a regular function of one parameter again (using std::bind1st). And, I spent nearly an hour trying to glark the type I’d have had to declare in one of the intermediate steps to turn those three statements into four instead. I gave up.

bool MyClass::attemptToFrob( item_t& item )
{
  return item.frob( _frobRetryCount, _frobTimeoutValue );
}

bool MyClass::frobAllOfMyItems()
{
  return performItemOperationOnAllItems( &MyClass::attemptToFrob );
}

bool MyClass::performItemOperationOnAllItems(
                bool (MyClass::*op)(item_t&)
              )
{
  std::mem_fun1_t<bool,MyClass,item_t> binOp(op);
  unsigned int successes = std::count_if( _items.begin(),
                                          _items.end(),
                                          std::bind1st( binOp, this ) );
  return ( _items.size() == successes );
}

In C++11, I’d have written the function as two statements, only one of which had to span more than one line to stay under 80 columns. In C++0X, I’d at least have been able to auto an intermediate statement to hold the result of std::bind1st that I couldn’t manage to satisfy myself.

I really believe that trying to keep the functions down to three or four lines with at most one conditional is a goal that goes a long way toward readability. But, man, it sucks for writability when you’re stuck in a strongly-typed language with a compiler that isn’t at all interested in helping you out with that.

And, a pet peeve here while I’m already ranting about C++…. Am I right in believing that none of C++98, C++0X, C++11, or C++14 has any way to iterate over the keys of a std::map or to wrap a function of one argument so that it takes a pair and just uses the first half of the pair to invoke the original function? Something like this:

template <class R, class T, class U>
class call1st {
public:
  call1st( R (*fn)(T) ) : _fn(fn) {};
  R operator()( std::pair<T,U>& p ) { return (_fn)(p.first); };
private:
  R (*_fn)(T);
};

Most of why I would want this would be more clear with lambdas instead. But, if there is still going to be crap in the language like std::mem_fun1_ref_t and such, why is there still no functional way (in the #include <functional> sense) to get to the members of a pair? Or, am I just missing something obvious?

Trying Clojure again? March 26th, 2015
Patrick Stein

EDIT: Indeed, as people on Reddit pointed out, installing Lein is simply downloading a script and running it. Installing CIDER from MELPA was also easy. The two, however, aren’t completely compatible at the moment because CIDER from MELPA wants nREPL 0.2.7 at least and Lein only pulls in 0.2.6 (even though, I believe the current is 0.2.10).

It has been five years since I last tried Clojure. I feel like I should try it again.

I don’t want to beat my head against Leiningen for even ten minutes this time. Is there some way to reasonably use Emacs + Clojure without having to install and configure CLASSPATHS and Mavens and Ants and JDKs?

It seems SWANK-CLOJURE has been deprecated in favor of CIDER. The CIDER doc says how to configure Leiningen or Boot for use with CIDER. Is there some way that I can avoid Leiningen and Boot? Or some way that I can click one ‘Install’ button and have Leiningen and Boot work?
