Lewis Carroll's determinant method
====Warren D. Smith===March 2014====

http://www.maa.org/sites/default/files/pdf/Mathhorizons/pdfs/nov_2006_pp.12-...
http://mathtourist.blogspot.com/2007/03/lewis-carroll-and-his-telescoping.ht...

Lewis Carroll (Rev. Charles L. Dodgson) in 1866 gave the following method to
evaluate det(M) where M is an nXn matrix. At each step, a kXk matrix is
"condensed" to a (k-1)X(k-1) matrix. Finally a 1X1 matrix, i.e. a single
number, is obtained, which is the desired determinant.

ALGORITHM:
We begin with M0 = M.
Initial step (k=n): M1 = the 2x2 determinants of adjacent entries in M0.
  Note M1 is (n-1)X(n-1).
for(j=2 .. n-1){
  M[j] = the 2x2 determinants of adjacent entries in M[j-1];
    Note M[j] is (n-j)X(n-j). This is not yet the final form of M[j].
  for(a=1..n-j){
    for(b=1..n-j){
      Divide entry a,b of M[j] by entry a+1,b+1 of M[j-2].
    }
  }
  That yields the final form of M[j], the j-condensed version of M.
}
The sole entry of M[n-1] is now the desired determinant.
END OF ALGORITHM.

This process requires O(n^3) arithmetic operations (+, -, *, /) and fails if
at some point it divides by zero. [However, by performing some random row
operations before beginning, we usually can eliminate internal zeros so that
the process will not fail. Alternatively, you could replace all 0s by
epsilons, continue on symbolically, and at the end take the limit
epsilon-->0. Both approaches are a pain.] Of course we only need to store a
constant number of the M[j] matrices at any moment, not all of them, hence
the total storage needed is O(n^2) entries, not order n^3.
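For concreteness, here is a minimal Python sketch of the condensation above
(the function name dodgson_det is mine, not standard). Exact rational
arithmetic via fractions.Fraction keeps all-integer input exact, and the
sketch simply raises ZeroDivisionError if an internal zero appears. Note it
keeps only two consecutive condensed matrices at a time, matching the O(n^2)
storage remark.

from fractions import Fraction

def dodgson_det(M):
    # Carroll/Dodgson condensation for det(M), M an nXn matrix.
    # Assumes no interior zero divisor arises along the way.
    n = len(M)
    if n == 1:
        return Fraction(M[0][0])
    prev = [[Fraction(x) for x in row] for row in M]        # plays M[j-2]
    cur = [[prev[a][b] * prev[a+1][b+1] - prev[a][b+1] * prev[a+1][b]
            for b in range(n - 1)] for a in range(n - 1)]   # plays M[j-1]
    for j in range(2, n):                                   # build M[j]
        size = n - j
        nxt = [[(cur[a][b] * cur[a+1][b+1] - cur[a][b+1] * cur[a+1][b])
                / prev[a+1][b+1]
                for b in range(size)] for a in range(size)]
        prev, cur = cur, nxt
    return cur[0][0]

# Example: dodgson_det([[1,2,3],[4,5,6],[7,8,10]]) == -3.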
WHY SHOULD ANYBODY CARE?

1. I point out (a remark I have not seen made before) that Carroll's
condensation process preserves certain special matrix forms. If M is
Toeplitz, Hankel, diagonal, or banded with bandwidth=w, then so are all the
condensed matrices. If the original matrix was all-integer, then so are all
the condensed matrices. As a result, it seems to me that this yields an
O(n^2)-step algorithm for computing the determinants of Toeplitz and Hankel
matrices, an O(n^2 * w)-step algorithm for matrices with bandwidth=w, and an
O(n)-step algorithm for diagonal matrices. (A sketch of the Toeplitz case is
given after the SUMMARY below.) This Toeplitz result seems quite nice,
although probably asymptotically faster methods are possible (?).

2. The RESULTANT of two polynomials is the determinant of their Sylvester
matrix, http://en.wikipedia.org/wiki/Sylvester_matrix, which is made by
gluing together two rectangular chunks of Toeplitz matrices. The Carroll
j-condensed forms of Sylvester matrices also have Sylvester form, except for
j extra rows inserted between the two Toeplitz rectangles. Does this lead to
a quadratic-time resultant algorithm? Unfortunately, I am not seeing any
obvious reason why this process can be made to run in only quadratic time
for resultants.

3. David Bressoud and James Propp have remarked that Carroll's method is
highly parallelizable. It also exhibits excellent memory locality. On n^2
processors, each condensation step will run in O(1) time, causing the whole
determinant computation to take O(n) time.

4. There are also other condensation processes similar to (but different
from) Carroll's, including Chio's pivotal condensation (M. F. Chio 1853,
http://mathworld.wolfram.com/ChioPivotalCondensation.html) and methods which
take 3x3 rather than 2x2 determinants of adjacent entries, shrinking by 2
rather than 1 at each step. Chio's process seems better than Carroll's in
the respect that it is easier to avoid dividing by zero.

5. If you solve an nXn linear system using Cramer's rule, then each entry of
the solution vector is obtained in O(n) time using n^2 processors, i.e.
O(n^3) total work. The full solution vector would then naively require
O(n^4) work. However, it is possible to share work and thus reduce the total
back to O(n^3). This is described for a 3x3 version of Chio's condensation
process in:
  Ken Habgood & Itamar Arel: A condensation-based application of Cramer's
  rule for solving large-scale linear systems, J. Discrete Algorithms 10
  (January 2012) 98-109.
  http://web.eecs.utk.edu/~itamar/Papers/JDA2011.pdf
(A naive no-sharing baseline is sketched at the end of this note.) Their
work-sharing scheme is a bit of a pain, although they succeeded in
programming and testing it. Its underlying idea is to regard the matrix as a
binary tree (split by vertical cuts) and to note that any two Cramer
determinants share a lot of their computation trees, so by memoizing subtree
results you avoid redoing those subtrees. There is a picture in their paper
which makes it clearer how they do that. It seems to me a similar scheme
would also work for Carroll's 2x2 process described here, although I have
not carefully checked. In any case, it seems to me that the primary benefit
of Habgood & Arel's approach is one they did not mention(!!): it is highly
parallelizable. It will solve an nXn linear system in O(n) time with n^2
parallel processors. Habgood & Arel encountered some numerical problems.
There is an analogue of "pivoting" which could be used to improve the
numerics of the Chio-based determinant computation, but if you use it, then
it wrecks the Habgood-Arel work-sharing scheme. But a different
linear-system-solving method which also works in O(n) time with n^2 parallel
processors is the conjugate gradient (CG) method. CG is considerably simpler
to program and uses only O(n) storage beyond the original matrix. Further,
CG can be run less than all the way to completion and will still often
compute good approximate answers.

6. That suggests seeking a task that condensation with work-sharing can do
which CG cannot. How about computing the INVERSE of an nXn matrix, using its
explicit cofactor formula in terms of the n^2 minors of the matrix, each
(n-1)X(n-1), plus the determinant of the original matrix? (Recall: entry b,a
of the inverse is (-1)^(a+b) times the determinant of M with row a and
column b deleted, divided by det(M).) The naive implementation of this would
compute the inverse in order n^5 total work. However, if we can share enough
work, this would be reduced to O(n^3) work in O(n) time with n^2 processors.
(A new record?) I believe this can actually be done, using a treelike scheme
resembling Habgood & Arel's, but based on a "quadtree" (both horizontal &
vertical splitting) rather than a binary horiz-splits-only tree.

7. Certain matrices whose a,b entries are given by formulas in terms of a
and b yield condensed forms also given by (other) formulas, which in some
cases enables you to compute the determinant in closed form.

8. Condensation also might be useful for thinking about determinants of
"random" matrices.

SUMMARY: New(?) results are an O(n)-time n-processor parallel det-algorithm
for Toeplitz and Hankel matrices (but if there is a similar way to do
resultants in this time bound, I am not seeing it), and an O(n)-time
parallel algorithm on n^2 processors for matrix inversion.
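To make the Toeplitz claim in point 1 concrete, here is a sequential Python
sketch (the names toeplitz_dodgson_det, col, row are mine). Since every
condensed matrix is again Toeplitz, it is determined by its O(n) diagonal
values, so each condensation step costs O(n) work and the whole run costs
O(n^2), again assuming no zero divisor turns up along the way.

from fractions import Fraction

def toeplitz_dodgson_det(col, row):
    # col = first column, row = first row (col[0] == row[0]) of an nXn
    # Toeplitz matrix whose entry (a,b) is t[a-b].  Each condensed
    # matrix is represented by its diagonal values alone.
    n = len(col)
    t = {d: Fraction(col[d]) for d in range(n)}
    t.update({-d: Fraction(row[d]) for d in range(1, n)})
    if n == 1:
        return t[0]
    prev = t                                        # plays M[j-2]
    cur = {d: t[d] * t[d] - t[d - 1] * t[d + 1]     # plays M[j-1]
           for d in range(-(n - 2), n - 1)}
    for j in range(2, n):
        size = n - j                                # M[j] is size X size
        nxt = {d: (cur[d] * cur[d] - cur[d - 1] * cur[d + 1]) / prev[d]
               for d in range(-(size - 1), size)}
        prev, cur = cur, nxt
    return cur[0]

# Example: toeplitz_dodgson_det([2,1,0], [2,1,0]) == 4
# (the tridiagonal matrix [[2,1,0],[1,2,1],[0,1,2]]).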
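Similarly, to make the work counts in point 5 concrete, here is the naive
no-sharing Cramer baseline (the name cramer_solve is mine; this is NOT the
Habgood-Arel scheme). It calls dodgson_det from the first sketch n+1 times,
i.e. O(n^4) total work, which is the figure their memoized tree scheme cuts
back to O(n^3).

def cramer_solve(A, b):
    # Solve A x = b by Cramer's rule: x[k] = det(A_k) / det(A), where
    # A_k is A with its k-th column replaced by b.  The n+1 condensations
    # are fully independent here; no subcomputations are shared.
    n = len(A)
    d = dodgson_det(A)
    return [dodgson_det([r[:k] + [b[a]] + r[k+1:]
                         for a, r in enumerate(A)]) / d
            for k in range(n)]

# Example: cramer_solve([[2,1],[1,3]], [3,5]) == [Fraction(4,5), Fraction(7,5)].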