In the Hieroglyphica of Valerian, the Ass is the symbol of the Scholar, humbly chewing his dry diet of texts, laboring mightily for Learning. The lacquered eyeballs moist, intelligent maybe, but a little cocked, not easy to make them both look in the same direction. Left side the wackier one, mad and errant; right side patient and mild.
– Daemonomania, by John Crowley
Avoidance Coupling
I took a read through this fun paper that appeared on arXiv a few months ago:
Avoidance Coupling
Omer Angel, Alexander E. Holroyd, James Martin, David B. Wilson, Peter Winkler
arXiv:1112.3304v1 [math.PR]
Typically when you think of coupling arguments, you want to create two (or more) copies of a Markov chain such that they meet up quickly. A coupling of Markov chains with transition matrix $P$ is a sequence of pairs $(X_t, Y_t)$ such that $(X_t)$ and $(Y_t)$ are each Markov chains with transition matrix $P$:

$\mathbb{P}(X_{t+1} = x' \mid X_t = x) = P(x, x') \quad \text{and} \quad \mathbb{P}(Y_{t+1} = y' \mid Y_t = y) = P(y, y').$
Note that the two chains need not be independent! The idea is that we start $X_0$ and $Y_0$ at different initial positions, and when they “meet” we make them move together. Once they meet, the decisions from random mappings are the same for both chains. The coupling time is $\tau_{\mathrm{couple}} = \min\{t : X_t = Y_t\}$. Coupling is used to show fast mixing of Markov chains via theorems like the following, which says that the difference between the distribution of the chain started at $x$ and the chain started at $y$, measured at time $t$, is upper bounded by the probability that the chains have not yet coupled:
Theorem. Let $(X_t, Y_t)$ be a coupling with $X_0 = x$ and $Y_0 = y$. Then

$\| P^t(x, \cdot) - P^t(y, \cdot) \|_{\mathrm{TV}} \le \mathbb{P}_{x,y}(\tau_{\mathrm{couple}} > t).$
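To make the bound concrete, here is a toy check (my own code, not anything from the paper): two lazy random walks on a 10-cycle, run independently until they first meet and moved in lockstep afterwards. The exact total variation distance, computed from powers of the transition matrix, should sit below the simulated estimate of $\mathbb{P}(\tau_{\mathrm{couple}} > t)$:

    import numpy as np

    # Toy check of the coupling bound: two lazy random walks on a cycle of
    # length n, coupled by running them independently until they meet and
    # moving them together afterwards.  All names here are mine.
    n = 10
    x0, y0 = 0, 5
    rng = np.random.default_rng(0)

    def lazy_step(v):
        """One step of the lazy walk: stay w.p. 1/2, else move to a uniform neighbor."""
        u = rng.random()
        if u < 0.5:
            return v
        return (v + (1 if u < 0.75 else -1)) % n

    def coupling_time(x, y, t_max=10_000):
        """Steps until the two independent copies first occupy the same vertex."""
        for t in range(1, t_max + 1):
            x, y = lazy_step(x), lazy_step(y)
            if x == y:  # from here on they would move in lockstep
                return t
        return t_max

    taus = np.array([coupling_time(x0, y0) for _ in range(20_000)])

    # Exact t-step distributions from the transition matrix P.
    P = np.zeros((n, n))
    for i in range(n):
        P[i, i] = 0.5
        P[i, (i + 1) % n] += 0.25
        P[i, (i - 1) % n] += 0.25

    Pt = np.eye(n)
    for t in range(1, 31):
        Pt = Pt @ P
        tv = 0.5 * np.abs(Pt[x0] - Pt[y0]).sum()
        print(f"t={t:2d}  TV={tv:.4f}  P(tau>t)~{(taus > t).mean():.4f}")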
This paper takes an entirely different tack — sometimes you want to start off a bunch of copies of the chain such that they never meet up. This could happen if you are trying to cover more of the space. So you want to arrange a coupling such that the coupling time is huge. There’s a bit of a definitional issue regarding when you declare that two walkers collide (what if they swap places) but they just say “multiple walkers are assumed to take turns in a fixed cyclic order, and again, a collision is deemed to occur exactly if a walker moves to a vertex currently occupied by another. We call a coupling that forbids collisions an avoidance coupling.”
There are a number of results in the paper, but the simplest one to describe is for two walkers on the complete graph $K_n$ (or the complete graph with loops, $K_n^*$).
Theorem 4.1. For any composite $n = ab$, where $a, b \ge 2$, there exist Markovian avoidance couplings for two walkers on $K_n$ and on $K_n^*$.
How do they do this? They partition the $n$ vertices into $b$ clusters of size $a$. Let’s call the walkers Alice and Bob. Suppose Alice and Bob are in the same cluster and it’s Bob’s turn to move. Then he chooses uniformly among the $n - a$ vertices outside his cluster. If they are in different clusters, he moves uniformly to a vertex in his own cluster (other than his own). Now when it’s Alice’s turn to move, she is always in a different cluster than Bob. She picks a vertex uniformly in Bob’s cluster (other than Bob’s) with probability $(n-a)/(n-1)$, and a vertex in her own cluster (other than her own) with probability $(a-1)/(n-1)$.
So let’s look at Bob’s distribution. The chance that he moves to a particular vertex outside his current cluster is the chance that Alice moved into his cluster times the uniform probability of choosing a particular vertex outside his cluster:

$\frac{n-a}{n-1} \cdot \frac{1}{n-a} = \frac{1}{n-1}.$

The chance that he moves to a particular vertex inside his own cluster is likewise

$\frac{a-1}{n-1} \cdot \frac{1}{a-1} = \frac{1}{n-1}.$

So the marginal transitions of Bob are the same as a uniform random walk on $K_n$.
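Here is a quick simulation sketch of the construction (my own code and naming, not the paper’s; I assign vertex $v$ to cluster $\lfloor v/a \rfloor$ and start the walkers in different clusters). It checks that the walkers never collide and that Bob’s empirical transition frequencies out of a fixed vertex are flat:

    import numpy as np
    from collections import Counter

    # Sketch of the two-walker avoidance coupling on K_n with n = a*b.
    # Vertex v belongs to cluster v // a.  Variable names are mine.
    a, b = 4, 5
    n = a * b
    p = (n - a) / (n - 1)  # prob. that Alice jumps into Bob's cluster
    rng = np.random.default_rng(1)

    def cluster(v):
        return v // a

    def members(c):
        return range(c * a, (c + 1) * a)

    def bob_move(bob, alice):
        if cluster(bob) == cluster(alice):
            # Alice shares his cluster: leave it, uniformly over outside vertices.
            return rng.choice([v for v in range(n) if cluster(v) != cluster(bob)])
        # Otherwise shuffle within his own cluster, avoiding his current vertex.
        return rng.choice([v for v in members(cluster(bob)) if v != bob])

    def alice_move(alice, bob):
        if rng.random() < p:
            # Jump into Bob's cluster, avoiding Bob's vertex.
            return rng.choice([v for v in members(cluster(bob)) if v != bob])
        # Stay in her own cluster, avoiding her current vertex.
        return rng.choice([v for v in members(cluster(alice)) if v != alice])

    bob, alice = 0, a  # start in different clusters
    counts = Counter()
    for _ in range(200_000):
        new_bob = bob_move(bob, alice)
        assert new_bob != alice  # the avoidance property
        counts[(bob, new_bob)] += 1
        bob = new_bob
        alice = alice_move(alice, bob)
        assert alice != bob

    # Bob's transitions out of vertex 0 should each occur w.p. ~ 1/(n-1).
    total = sum(c for (u, _), c in counts.items() if u == 0)
    print([round(counts[(0, v)] / total, 3) for v in range(1, n)])

With $a = 4$ and $b = 5$ (so $n = 20$), each printed frequency should be close to $1/19 \approx 0.053$.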
For Alice’s distribution we look at the time reversal of the chain. In this case, Alice’s reverse distribution is Bob’s forward distribution and vice versa, so Alice also looks like a uniform random walk.
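To unpack that a little (my gloss, not the paper’s exposition): for a chain run in stationarity, the reversed transition law is $\hat{P}(x, y) = \pi(y) P(y, x) / \pi(x)$, and the relevant stationary distributions here are uniform, so $\hat{P}(x, y) = P(y, x)$. Watching the alternating Bob-then-Alice sequence backwards swaps the two roles, so reversed Alice moves by Bob’s rule. Since Bob’s forward marginal is a uniform walk, and the uniform walk is reversible with respect to the uniform distribution, Alice’s forward marginal is a uniform walk as well.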
There are a number of additional results in the paper, such as:
Theorem 7.1. There exists a Markovian avoidance coupling of $k$ walkers on $K_n^*$ for any $k \le n/(8 \log_2 n)$, and on $K_n$ for any $k \le n/(56 \log_2 n)$.
Theorem 8.1. No avoidance coupling is possible for $n - 1$ walkers on $K_n$, for $n \ge 4$.
In addition there are a ton of open questions at the end which are quite interesting. I didn’t mention it here, but there are also interesting questions of the entropy of the coupling — lower entropy implies easier simulation, in a sense.
Readings
How to Live Safely in a Science Fictional Universe (Charles Yu) – Not the book you think it is. This was a fantastic read, a meditation on loss and time and memory and connections. Sure, there is time travel, but it’s more about what that means psychologically than technologically. Highly recommended.
Carter Beats the Devil (Glen David Gold) – A fictionalized biography of the stage magician Charles Joseph Carter, this is a real page-turner, especially if you like the attention to historical detail. I quite enjoyed it, despite the rather abrupt ending.
Daemonomania (John Crowley) – Book Three of the Aegypt cycle. This one was slow going and hard to get into, but it really picks up in the middle. The costume party towards the end of the book makes it all worth it, as does the mission to rescue Rosie’s daughter Sam from the cultish Powerhouse. Crowley may be accused of injecting the mystic into the mundane in an artificial way, but I think what this book did was show how desperate circumstances amplify and distort our perceptions to make events seem mysterious or magical.
Tango (Slawomir Mrozek) – A somewhat absurdist allegory about reactionary tendencies. The setting is a rather bohemian family living in a house in disarray, with the father Stomil putting on ridiculous art shows (“experiments”) and ignoring the fact that his wife is cheating on him with a roustabout, Eddie. All of this is very frustrating to Arthur, his son, who longs for a return to the old classical ways. He gets his way by holding the family to higher standards at gunpoint, but then realizes the futility of it all. I imagine it’s funnier performed, with more dramaturgical context. It takes place between the wars, which clearly has something to do with it, but my play-reading chops are not up to snuff to say anything more insightful.
Complexity and Information (J. F. Traub and A. G. Werschulz) – A gentle, if scattered, introduction to information-based complexity, a field I had heard of but didn’t really know much about. It somehow feels “old-fashioned” to me (perhaps that’s the machine learning kool-aid speaking), with comparisons to Turing machines and so on. But the central question of how to appropriately estimate integrals from samples is pretty interesting, given my recent forays into using MCMC.