## Even More OptimizationJune 23rd, 2009 Patrick Stein

In the previous article, I showed a chunk of code for multiplying a vector by a matrix. Nikodemus Silvola, on the sbcl-help mailing list, recommended that I change the matrix to a one-dimensional array instead of a two dimensional array.

Doing that takes me from ten million iterations in 1.038 seconds to ten million iterations in 0.645 seconds. That brings me an overall improvement of 14x from the original, untyped code.

Thanks, all!

For completeness, here is the final code:

(declaim (ftype (function ((simple-array single-float (12))
(simple-array single-float (3)))
(simple-array single-float (3))) mvl*))
(defun mvl* (matrix vec)
(declare (optimize (speed 3) (safety 0)))
(let ((ret (make-array 3 :initial-element 0.0f0
:element-type 'single-float)))
(loop for jj from 0 below 3
do (let ((offset (* jj 4)))
(setf (aref ret jj)
(+ (aref matrix (+ offset 3))
(loop for ii from 0 below 3
for kk from offset below (+ offset 3)
summing (* (aref vec ii)
(aref matrix kk))
single-float)))))
ret))

To test this code, I made a ring of 4096 random matrices and a ring of 4095 random vectors. Then, I cycled through each ring as I iterated.

(let ((matrixes (make-ring-of-matrixes '(12) 4096))
(vectors  (make-ring-of-matrixes '(3) 4095)))
(time (loop for jj from 1 to 10000000
for mm in matrixes
for vv in vectors
do (mvl* mm vv))))

Making a random matrix is simple enough.

(defun random-matrix (dims)
(let ((ret (make-array dims :element-type 'single-float)))
(loop for jj from 0 below (array-total-size ret)
do (setf (row-major-aref ret jj) (random 1.0f0)))
ret))

Making a bunch of them into a ring takes slightly more effort (in the form of tail recursion), but not much.

(defun make-ring-of-matrixes (dims count &optional ring first)
(cond
((zerop count) (setf (cdr first) ring))
(t (let ((mm (cons (random-matrix dims) ring)))
(make-ring-of-matrixes dims (1- count) mm (or first mm))))))