Comments on: Trying to Unconfound Lisp Speeds

By: Faheem Mitha

Faheem Mitha — Wed, 06 Jun 2012 00:01:30 +0000

I misunderstood your point. You used a local variable for (aref ret jj) to avoid recalculating it multiple times, so as to improve performance. symbol-macrolet would behave exactly as if the form had been written out multiple times, so it would not help there.
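To make the distinction concrete, here is a minimal sketch. The function names bump-with-let and bump-with-symbol-macrolet are hypothetical and used only for illustration; ret and jj mirror the names used in this thread. Both functions leave the same value in the array, but they differ in how often the array element is accessed.

```lisp
;; Hypothetical illustration, not code from the original article.

(defun bump-with-let (ret jj)
  ;; LET copies the element into a local variable: the array is read once
  ;; here and written once by the explicit SETF at the end.
  (let ((acc (aref ret jj)))
    (incf acc 1.0)
    (incf acc 2.0)
    (setf (aref ret jj) acc))
  ret)

(defun bump-with-symbol-macrolet (ret jj)
  ;; SYMBOL-MACROLET makes ACC an alias for the form (aref ret jj).
  ;; Each INCF below expands into its own read and write of the element,
  ;; just as if (aref ret jj) had been typed out each time.
  (symbol-macrolet ((acc (aref ret jj)))
    (incf acc 1.0)
    (incf acc 2.0))
  ret)
```

Whether a given compiler turns the repeated accesses into something as cheap as the local variable is implementation-dependent; the sketch shows only the semantics.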

By: pat

pat — Tue, 05 Jun 2012 11:54:25 +0000

Do you have reason to believe that symbol-macrolet would result in something the compiler felt safe optimizing in a way it wouldn’t if I had manually written the form multiple times?

By: Faheem Mitha

Faheem Mitha — Tue, 05 Jun 2012 06:42:33 +0000

You could use symbol-macrolet instead of using “a local variable to avoid doing (aref ret jj) multiple times”.

Regards, Faheem

By: pat

pat — Tue, 30 Jun 2009 18:24:06 +0000

Indeed, that is a great deal faster, and without allocating. Thank you. I think I will write the numbers up in a separate article. Reworking it with (incf ...) instead, and using a local variable to avoid evaluating (aref ret jj) multiple times, resulted in 0.40 seconds for SBCL and 0.72 seconds for Allegro.
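For readers following along, a local-variable variant of the 3x4 matrix-times-vector routine discussed in this thread might look roughly like the sketch below. This is an assumption about the general shape, not the code that was actually timed: the name mvl*-local and the variable acc are hypothetical, and the sketch keeps the default safety level rather than (safety 0).

```lisp
;; Hypothetical sketch: accumulate each output element in a local
;; single-float and store it into RET once per row.
(defun mvl*-local (matrix vec ret)
  (declare (type (simple-array single-float (12)) matrix)
           (type (simple-array single-float (3)) vec ret)
           (optimize (speed 3)))
  (dotimes (jj 3 ret)
    (let* ((offset (* jj 4))
           ;; Start from the fourth entry of the row (the translation term).
           (acc (aref matrix (+ offset 3))))
      (declare (type fixnum offset)
               (type single-float acc))
      (dotimes (ii 3)
        (incf acc (* (aref vec ii)
                     (aref matrix (+ offset ii)))))
      ;; Single write back to the result array.
      (setf (aref ret jj) acc))))
```

A call such as (mvl*-local matrix vec ret) fills and returns ret, provided all three arguments are single-float simple arrays of the declared sizes.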

By: Jason Cornez

Jason Cornez — Tue, 30 Jun 2009 14:11:02 +0000

In Allegro 8.1, try the following formulation instead. It avoids the boxing and results in no extra memory allocation. Hence it is quite a bit faster. Unless I’ve made a silly mistake, it should compute the same result…

-Jason

(defun mvl*-acl (matrix vec ret)
  (declare (type (simple-array single-float (12)) matrix)
           (type (simple-array single-float (3)) vec)
           (type (simple-array single-float (3)) ret)
           (optimize (speed 3) (safety 0)))
  (loop for jj fixnum from 0 below 3
        do (let ((offset (* jj 4)))
             (declare (type fixnum offset))
             (setf (aref ret jj) (aref matrix (+ offset 3)))
             (loop for ii fixnum from 0 below 3
                   for kk fixnum from offset below (+ offset 3)
                   do (incf (aref ret jj) (* (aref vec ii)
                                             (aref matrix kk))))))
  ret)

By: pat

pat — Sun, 28 Jun 2009 17:16:12 +0000

It does seem likely that it is boxing and unboxing floats that is causing the allocations. I would have hoped some of that could be done on the stack instead of in the heap, but….

If I get some time in the near future, I may explore the respective documentation to see how one is “supposed to” do such things with minimal memory thrashing.

Thanks….

By: Nathan

Nathan — Sat, 27 Jun 2009 18:24:59 +0000

The consing likely comes from allocating boxed single floats in the generic AREF routines and/or the arithmetic routines. SBCL and CMUCL know how to inline AREF on single-float arrays and the arithmetic routines so they don’t have to cons. Allegro and LispWorks should be able to do that too; LispWorks might require (FLOAT 0) or similar. I should think Clozure CL can do that too, although the 32-bit version obviously doesn’t. I do know that Clozure has different boxed representations for single-floats in 32-bit vs. 64-bit (the 64-bit boxed representation doesn’t require allocating any extra memory), so maybe it’s not inlining at all, but just relying on the generic version, which happens not to cons on 64-bit implementations.