One of the simple example graphs I’ve used in some of my research on gossip algorithms has been the 2-dimensional torus with $n$ vertices, which looks like a $\sqrt{n} \times \sqrt{n}$ grid with the top and left edges wrapped around to connect with the bottom and right edges. Every vertex has 4 neighbors. Now imagine a very lazy random walk on this graph in which a random walker moves from a vertex to each of its neighbors with probability $1/n$ (and otherwise stays put). It’s “well known” that this random walk takes around $n^2$ steps to mix. That is, if $P$ is the matrix of transition probabilities then
$$\frac{1}{1 - \lambda_2} = \Theta(n^2).$$
Here $\lambda_2$ is the second largest eigenvalue of $P$ and the relaxation time $(1-\lambda_2)^{-1}$ is the inverse of the *spectral gap* of the matrix. One way of characterizing the spectral gap for reversible Markov chains is via the *Dirichlet form*. For a function $f$ on the states of the chain define the Dirichlet form by
$$\mathcal{E}(f) = \frac{1}{2} \sum_{x,y} \pi(x) P(x,y) \left( f(x) - f(y) \right)^2.$$
In our example the stationary distribution is uniform, $\pi(x) = 1/n$, and $P(x,y) = 1/n$ for all edges $(x,y)$ in the graph. We write $f \perp \pi$ if
$$\sum_x \pi(x) f(x) = 0.$$
Define a norm associated with $\pi$ via
$$\|f\|_\pi^2 = \sum_x \pi(x) f(x)^2.$$
Then the characterization is
$$1 - \lambda_2 = \min_{f \perp \pi,\ f \not\equiv 0} \frac{\mathcal{E}(f)}{\|f\|_\pi^2}.$$
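As a numerical sanity check, this characterization can be verified directly on a small torus. Here is a sketch in numpy; the helper names `lazy_torus_walk`, `spectral_gap`, and `dirichlet_ratio` are my own, not from any library:

```python
import numpy as np

def lazy_torus_walk(m):
    """Transition matrix of the lazy walk on the m x m torus:
    move to each of the 4 neighbors w.p. 1/n, stay w.p. 1 - 4/n."""
    n = m * m
    P = np.zeros((n, n))
    idx = lambda i, j: (i % m) * m + (j % m)
    for i in range(m):
        for j in range(m):
            x = idx(i, j)
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                P[x, idx(i + di, j + dj)] += 1.0 / n
            P[x, x] = 1.0 - 4.0 / n
    return P

def spectral_gap(P):
    """1 - lambda_2; this P is symmetric, so eigvalsh applies."""
    eigs = np.sort(np.linalg.eigvalsh(P))
    return 1.0 - eigs[-2]

def dirichlet_ratio(P, f):
    """E(f) / ||f||_pi^2 for the uniform stationary distribution."""
    n = P.shape[0]
    diff2 = (f[:, None] - f[None, :]) ** 2
    energy = 0.5 * np.sum(P * diff2) / n   # pi(x) = 1/n
    return energy / (np.sum(f ** 2) / n)
```

For any mean-zero $f$ the ratio $\mathcal{E}(f)/\|f\|_\pi^2$ is at least the spectral gap, with equality at the minimizing eigenfunction.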
One question I asked myself today was whether it was “easy” to see what $f$ you should choose in the grid example to get the scaling of $n^2$. Here’s one choice that gives the correct scaling. We’ll set $f$ to be constant on each column. Assume without loss of generality that $\sqrt{n}$ is divisible by 4 and set $m = \sqrt{n}$. The values for $f$ on the columns will be like two triangles:
$$0, 1, 2, \ldots, \tfrac{m}{4}, \ldots, 2, 1, 0, -1, -2, \ldots, -\tfrac{m}{4}, \ldots, -2, -1.$$
Now we can evaluate the norm, noting that there are $\sqrt{n}$ vertices per column:

$$\|f\|_\pi^2 = \frac{1}{n} \cdot \sqrt{n} \cdot \sum_{j=0}^{m-1} f_j^2 \approx \frac{\sqrt{n}}{n} \cdot \frac{m^3}{48} = \frac{n}{48} = \Theta(n).$$
This is because the sum of the first $k$ squares scales like $k^3$, and here $k = m/4 = \Theta(\sqrt{n})$. Now turning to the Dirichlet form, note that each difference between adjacent columns is at most $1$ and there are fewer than $2n$ (ordered) pairs of neighbors for which $f(x) \neq f(y)$. Thus:

$$\mathcal{E}(f) \le \frac{1}{2} \cdot 2n \cdot \frac{1}{n} \cdot \frac{1}{n} \cdot 1^2 = \frac{1}{n}.$$
Taking the ratio gives the lower bound of $\Omega(n^2)$ on the relaxation time, as desired.
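To double-check the arithmetic, the ratio for this two-triangle $f$ can be evaluated exactly with a short numpy sketch (the function name `tent_ratio` is my own):

```python
import numpy as np

def tent_ratio(m):
    """Ratio E(f)/||f||_pi^2 on the m x m torus (m divisible by 4)
    for the two-triangle function that is constant on columns."""
    assert m % 4 == 0
    n, q = m * m, m // 4
    # Column values 0,1,...,q,...,1,0,-1,...,-q,...,-1 (two triangles).
    cols = np.concatenate([np.arange(0, q), np.arange(q, 0, -1),
                           -np.arange(0, q), -np.arange(q, 0, -1)])
    assert cols.sum() == 0                      # f is orthogonal to pi
    # Every horizontal edge has |f(x)-f(y)| = 1; there are n of them,
    # each contributing pi(x) P(x,y) (f(x)-f(y))^2 = 1/n^2 to E(f).
    dirichlet = n * (1.0 / n**2)
    norm_sq = (m / n) * np.sum(cols.astype(float) ** 2)
    return dirichlet / norm_sq
```

The product `tent_ratio(m) * m**4` tends to $48$ as $m$ grows, so the ratio really scales like $\Theta(1/n^2)$.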

The first $f$ I tried was just equal to $+1$ on the first half of the columns and $-1$ on the second half of the columns. This ends up giving a suboptimal bound of $\Omega(n^{3/2})$, because the norm $\|f\|_\pi^2 = 1$ but in the denominator we get $2\sqrt{n}$ positive terms of size $4/n^2$ each, so $\mathcal{E}(f) = \Theta(n^{-3/2})$. The key is to make all the differences in the denominator small while keeping the average of $f^2$ large enough. Even though you sum over $\Theta(n)$ small differences in the denominator, it stays small enough ($1/n$) to pay for the $\Theta(n)$ in the numerator.
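The gap between the two choices is easy to see numerically. Here is a sketch (again with my own helper name, under the same lazy-walk setup) that computes the ratio for any $f$ that is constant on columns:

```python
import numpy as np

def column_ratio(m, cols):
    """E(f)/||f||_pi^2 on the m x m torus for f constant on columns,
    with column values `cols`. Only horizontal edges contribute."""
    n = m * m
    cols = np.asarray(cols, dtype=float)
    # m unordered edges across each of the m column boundaries;
    # np.append adds the wraparound difference back to column 0.
    diffs = np.diff(np.append(cols, cols[0]))
    dirichlet = m * np.sum(diffs ** 2) / n**2
    norm_sq = (m / n) * np.sum(cols ** 2)
    return dirichlet / norm_sq
```

On a $16 \times 16$ torus the $\pm 1$ step function gives a ratio of $8/n^{3/2} = 1/512$, while the two-triangle function is already smaller at this tiny size, and its advantage grows like $\sqrt{n}$ thereafter.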

While doing this calculation, I noticed that the book Markov Chains and Mixing Times is also online — it makes a handy reference and is a little easier to use than my old go-to, the Aldous-Fill book.

Ah. So in this random walk, you stay in place with probability 1-4/n?

Yep, the reason for that (from the application) is that you select a location uniformly at random, so there’s only a 1/n chance of taking any edge. This makes the walk particularly lazy, and is what leads to a perceived inefficiency of asynchronous consensus algorithms versus synchronous ones. They use the same number of transmissions, but synchronous algorithms are measured in rounds, which loses a factor of n.