Lazy random walks on the torus via Dirichlet forms

One of the simple example graphs I’ve used in some of my research on gossip algorithms has been the 2-dimensional torus with n vertices, which looks like a \sqrt{n} \times \sqrt{n} grid with the top and left edges wrapped around to connect with the bottom and right edges. Every vertex has 4 neighbors. Now imagine a very lazy random walk on this graph in which a random walker moves from vertex u to one of its neighbors with probability 1/n. It’s “well known” that this random walk takes around n^2 steps to mix. That is, if P is the matrix of transition probabilities then

T_{r} = \frac{1}{1 - \lambda_2(P)} \approx n^2

Here \lambda_2(P) is the second largest eigenvalue of P and the relaxation time T_{r} is the inverse of the spectral gap of the matrix. One way of characterizing T_{r} for reversible Markov chains is via the Dirichlet form. For a function f : V \to \mathbb{R} on the states of the chain define the Dirichlet form D(f,f) by

D(f,f) = \frac{1}{2} \sum_{u,v \in V} \pi_u P_{uv} (f(u) - f(v))^2

In our example the stationary distribution pi_u = 1/n and P_{uv} = 1/n for all edges in the graph. We write f \perp \pi if

\sum_{v \in V} \pi_v f(v) = 0

Define a norm associated with \pi via

\|f\|^2 = \sum_{v \in V} \pi_i f(v)^2

Then the characterization is

T_{r} = \sup_{f \perp \pi, f \ne 0} \frac{ \|f\|^2 }{ D(f,f) }

One question I asked myself today was whether it was “easy” to see what f you should choose in the grid example to get the scaling of n^2. Here’s one choice that gives the correct scaling. We’ll set f to be constant on each column. Assume without loss of generality that \sqrt{n} is divisible by 4 and set m = \sqrt{n}/4. The values for f on the columns will be like two triangles:

\{\frac{1}{n}, \frac{2}{n}, \ldots, \frac{m}{n},  \frac{m}{n}, \frac{m-1}{n}, \ldots, \frac{1}{n}, \frac{-1}{n}, \frac{-2}{n}, \ldots, \frac{-m}{n}, \frac{-m}{n}, \frac{-m+1}{n}, \ldots, \frac{-1}{n} \}

Now we can evaluate the norm, noting that there are \sqrt{n} vertices per column:

\|f\|^2 = 4 \sum_{i =1}^{m} \frac{1}{n} \sqrt{n} \frac{i^2}{n^2} = c \frac{1}{n}

This is because the sum of the first m squares scales like m^3 and m = \sqrt{n}/4. Now turning to the Dirichlet form, note that each difference between columns is at most 2/n and there are fewer than n edges for which f(u) \ne f(v). Thus:

D(f,f) \le \frac{1}{2} n \frac{1}{n} \frac{1}{n} \frac{4}{n^2} = c' \frac{1}{n^3}

Taking the ratio gives the lower bound of T_r \ge c''/n^2 as desired.

The first f I tried was just equal to +1 on the first half of the columns and -1 on the second half of the columns. This ends up giving a suboptimal bound, because the norm \|f\|^2 = 1 but in the denominator we get 2 \sqrt{n} positive terms. The key is to make all the differences (f(u) - f(v))^2 in the denominator small while keeping the average of \|f\|^2 large enough. Even though you sum over n small differences in the denominator, it stays small enough to pay for the \|f\|^2 = c/n in the numerator.

While doing this calculation, I noticed that the book Markov Chains and Mixing Times is also online — it makes a handy reference and is a little easier to use than my old go-to, the Aldous-Fill book.