I had never really heard of this result, sometimes called the Matrix Determinant Lemma, but it came up in the process of answering a relatively simple question. Suppose I have an $n$-dimensional jointly Gaussian vector $X \sim \mathcal{N}(0, K)$ with covariance matrix $K$. The differential entropy of $X$ is $h(X) = \frac{1}{2} \log\left( (2\pi e)^n \det K \right)$. Suppose now I consider some rank-1 perturbation $K + uu^T$, where $u$ is a unit vector. What choice of $u$ maximizes the differential entropy?
On the face of it, this seems intuitively easy — diagonalize $K$ and then pick $u$ to be the eigenvector corresponding to the smallest singular value of $K$. But is there a simple way to see this analytically?
Matrix Determinant Lemma. Let $A$ be an $n \times n$ positive definite matrix and $U$ and $V$ be two $n \times m$ matrices. Then

$\det(A + UV^T) = \det(A) \det(I_m + V^T A^{-1} U)$.
To see this, note that

$\begin{pmatrix} I & 0 \\ V^T A^{-1} & I \end{pmatrix} \begin{pmatrix} A + UV^T & U \\ 0 & I_m \end{pmatrix} \begin{pmatrix} I & 0 \\ -V^T & I \end{pmatrix} = \begin{pmatrix} A & U \\ 0 & I_m + V^T A^{-1} U \end{pmatrix}$,

and take determinants on both sides: the two outer factors are triangular with unit diagonal, so their determinants are 1.
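For a quick sanity check, here is a small numerical verification of the lemma on random matrices (a sketch using NumPy; the sizes and the way $A$ is made positive definite are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 2

# A positive definite n x n matrix A, and two random n x m matrices U, V.
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)
U = rng.standard_normal((n, m))
V = rng.standard_normal((n, m))

# det(A + U V^T) versus det(A) det(I_m + V^T A^{-1} U).
lhs = np.linalg.det(A + U @ V.T)
rhs = np.linalg.det(A) * np.linalg.det(np.eye(m) + V.T @ np.linalg.inv(A) @ U)

print(np.isclose(lhs, rhs))  # True
```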
So now applying this to our problem,

$\det(K + uu^T) = \det(K) \left( 1 + u^T K^{-1} u \right).$

But the right side is clearly maximized by choosing $u$ to be the eigenvector corresponding to the largest singular value of $K^{-1}$, which in this case is the smallest singular value of $K$. Ta-da!
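As a sanity check on the conclusion, one can compare the eigenvector of the smallest eigenvalue of $K$ against a brute-force search over random unit vectors (a sketch, assuming the unit-norm constraint on $u$):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# A random positive definite covariance K.
B = rng.standard_normal((n, n))
K = B @ B.T + np.eye(n)

# Since det(K + uu^T) = det(K) (1 + u^T K^{-1} u), maximizing the entropy
# amounts to maximizing u^T K^{-1} u over unit vectors u.
eigvals, eigvecs = np.linalg.eigh(K)  # eigenvalues in ascending order
u_best = eigvecs[:, 0]                # eigenvector of the smallest eigenvalue

best = np.linalg.det(K + np.outer(u_best, u_best))

# No random unit vector should do better.
for _ in range(1000):
    u = rng.standard_normal(n)
    u /= np.linalg.norm(u)
    assert np.linalg.det(K + np.outer(u, u)) <= best + 1e-9

print("eigenvector of the smallest eigenvalue wins")
```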
Someone, who will go unnamed, quizzed me on this in late 2002. Old times!
Trying to understand here: Via writing $U = A A^{-1} U$, the lemma is equivalent to the “special case $A=1$”, associated with Sylvester. In turn, the special case seems to be an exponentiated form of the easy (and more well-known?) identity $Tr(AB) = Tr(BA)$.
Er, not quite sure I see how commuting under the trace implies the determinant result? Maybe I am slow. You are the real mathematician after all 😉
It’s more that the reverse implication is true: in a horrible notation because things are multivariate here, d/dx det(1+x) = Tr(x), so the determinant result implies the trace result. Perhaps I could have said “integrated” or “antidifferentiated” instead of “exponentiated”, but in some sense they’re the same on a Lie group.
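A quick finite-difference illustration of the relation mentioned here, $\frac{d}{dx} \det(I + xB)\big|_{x=0} = \mathrm{Tr}(B)$ (a sketch; the step size is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
B = rng.standard_normal((n, n))

# Finite-difference derivative of det(I + xB) at x = 0,
# using det(I) = 1.
eps = 1e-6
deriv = (np.linalg.det(np.eye(n) + eps * B) - 1.0) / eps

print(np.isclose(deriv, np.trace(B), atol=1e-3))  # True
```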
Thanks Jay. For a moment, I was as surprised as Anant.