# Bach Collegium San Diego Bleg

The Bach Collegium San Diego, a group with whom I sang on occasion has a Kickstarter going to fund a tour. Please consider helping them out!

The Bach St John Passion is truly coming full circle for the BCSD, as it was our debut concert in 2003. This year marks our second annual performance of this work. We are seeking to establish an annual tradition of performing a Bach Passion (and other Passion Music) near Holy Week and Easter leading to an eventual Easter Festival.

In order to help bring this annual tradition to reality, we’re asking that you consider helping to sponsor the 16 singers who will form the dynamic vocal ensemble that will propel this dramatic work. The size of the donation is not as important as the interaction and participation of our those who believe in our mission and work. We thank you in advance for your generosity and we look forward to seeing you at the performances in April!

# Facts, “facts,” and ficts

After reading and hearing about The Lifespan of a Fact, I was primed to learn from TPM that the This American Life story involving Mike Daisey’s “trip” to see Apple’s products being made at Foxconn was retracted due to significant fabrications by Daisey, resulting in a episode entitled Retraction going into it:

When the original 39-minute excerpt was broadcast on This American Life on January 6, 2012, Marketplace China Correspondent Rob Schmitz wondered about its truth. Marketplace had done a lot of reporting on Foxconn and Apple’s supply chain in China in the past, and Schmitz had first-hand knowledge of the issues. He located and interviewed Daisey’s Chinese interpreter Li Guifen (who goes by the name Cathy Lee professionally with westerners). She disputed much of what Daisey has been telling theater audiences since 2010 and much of what he said on the radio.

# Univariate distributions at a glance

My father sent me this paper (author’s version here) a little while ago on “Univariate Distribution Relationships.” It contains the following stunning chart:

Univariate Distributions in a Nutshell

All I can say is : wow. Pretty amazing, no?

As simplified version of the chart has been put up by John Cook.

# Bellairs Workshop 2012

The beach at Bellairs

I am spending the week at the Bellairs Research Institute in Holetown, Barbados. McGill owns this facility and faculty organize informal workshops throughout the year on various topics. There are two going on right now — one on control theory approaches in computer animation, and out workshop on signal processing in networks. The McGill side is Mark Coates and Mike Rabbat and a number of their students, both masters and PhD. Anna Scaglione and Angelia Nedich arrived recently.

The format of the workshop has been a mix of tutorials and student presentations and plenty of time for discussion and some problem formation. And of course, for the beach, which is just behind the research facility. Holetown is on the west coast of Barbados, and the Caribbean is warm and inviting. I’m having a great time, even though I am falling behind on my other projects and email and whatnot.

People from Barbados call themselves Bajans (‎/ˈbeɪdʒənz/), so one should be careful not to discuss p-values or t-tests around them.

# Quote of the day : the Scholar

In the Hieroglyphica of Valerian, the Ass is the symbol of the Scholar, humbly chewing his dry diet of texts, laboring mightily for Learning. The lacquered eyeballs moist, intelligent maybe, but a little cocked, not easy to make them both look in the same direction. Left side the wackier one, mad and errant; right side patient and mild.

Daemonomania, by John Crowley

# Avoidance Coupling

I took a read through this fun paper that appeared on ArXiV a few months ago:

Avoidance Coupling
Omer Angel, Alexander E. Holroyd, James Martin, David B. Wilson, Peter Winkler
arXiv:1112.3304v1 [math.PR]

Typically when you think of coupling arguments, you want to create two (or more) copies of a Markov chain such that they meet up quickly. A coupling of Markov chains with transition matrix $P$ is a sequence of pairs of $\{(U_t, V_t)\}$ such that $\{U_t\}$ and $\{V_t\}$ are Markov chains with transition matrix $P$:

$\mathbb{P}( U_{t+1} = j | U_t = i, V_t = i') = P(i,j)$
$\mathbb{P}( V_{t+1} = j' | U_t = i, V_t = i') = P(i', j')$

Note that the two chains need not be independent! The idea is that we start $\{U_t\}$ and $\{V_t\}$ at different initial positions and when they “meet” we make them move together. Once they meet, the decisions from random mappings are the same for both chains. The coupling time is $T_c = \min \{t : U_t = V_t \}$. Coupling is used to show fast mixing of Markov chains via theorems like the following, which says that the difference between the distribution of the chain started at time $i$ and the chain started at time $i'$ at time $t$ is upper bounded by the probability of coupling:

Theorem. Let $\{ (U_t,V_t) \}$ be a coupling with $U_0 = i$ and $V_0 = i'$. Then

$\| P^t(i,\cdot) - P^t(i',\cdot) \|_{\mathrm{TV}} \le \mathbb{P}_{(U,V)}\left( T_c > t \right)$

This paper takes an entirely different tack — sometimes you want to start off a bunch of copies of the chain such that they never meet up. This could happen if you are trying to cover more of the space. So you want to arrange a coupling such that the coupling time is huge. There’s a bit of a definitional issue regarding when you declare that two walkers collide (what if they swap places) but they just say “multiple walkers are assumed to take turns in a fixed cyclic order, and again, a collision is deemed to occur exactly if a walker moves to a vertex currently occupied by another. We call a coupling that forbids collisions an avoidance coupling.”

There are a number of results in the paper, but the simplest one to describe is for two walkers on a complete graph $K_n$ (or the complete graph with loops $K_n^{\ast}$.

Theorem 4.1. For any composite $n = ab$, where $a,b > 1$, there exist Markovian avoidance couplings for two walkers on $K_n$ and on $K_n^{\ast}$.

How do they do this? They partition $n$ into $b$ clusters $\{S_i\}$ of size $a$. Let’s call the walkers Alice and Bob. Suppose Alice and Bob are in the same cluster and it’s Bob’s turn to move. Then he chooses uniformly among the vertices in another cluster. If they are in different clusters, he moves uniformly to a vertex in his own cluster (other than his own). Now when it’s Alice’s turn to move, she is always in a different cluster than Bob. She picks a vertex uniformly in Bob’s cluster (other than Bob’s) with probability $\frac{a(b-1)}{ab - 1}$ and a vertex in her own cluster (other than her own) with probability $\frac{a - 1}{ab - 1}$.

So let’s look at Bob’s distribution. The chance that he moves to a particular vertex outside his current cluster is the chance that Alice moved into his cluster times the uniform probability of choosing something outside his cluster:
$\frac{a(b-1)}{ab - 1} \times \frac{a (b-1)} = \frac{1}{ab - 1}$
The chance that he moves to a vertex inside his own cluster is likewise
$\frac{a - 1}{ab - 1} \times \frac{1}{a-1} = \frac{1}{ab - 1}$
So the marginal transitions of Bob are the same as a uniform random walk.

For Alice’s distribution we look at the time reversal of the chain. In this case, Alice’s reverse distribution is Bob’s forward distribution and vice versa, so Alice also looks like a uniform random walk.

There are a number of additional results in the paper, such as:

Theorem 7.1. There exists a Markovian avoidance coupling of $k$ walkers on $K_n^{\ast}$ for any $k \le n/(8 log_2 n)$, and on $K_n$ for any $k \le$n/(56 log_2 n)$. Theorem 8.1. No avoidance coupling is possible for $n - 1$ walkers on $K_n^{\ast}$ , for $n \ge 4$. In addition there are a ton of open questions at the end which are quite interesting. I didn’t mention it here, but there are also interesting questions of the entropy of the coupling — lower entropy implies easier simulation, in a sense. # Readings How to Survive in a Science Fictional Universe (Charles Yu) – Not the book you think it is. This was a fantastic read, a meditation on loss and time and memory and connections. Sure there is a time travel, but it’s more about what that means psychologically than technologically. Highly recommended. Carter Beats The Devil (Glen David Gold) – A fictionalized biography of stage magician Charles Joseph Carter, this is a real page-turner, especially if you like the attention to historical detail. I quite enjoyed it, despite the rather abrupt ending. Daemonomania (John Crowley) – Book Three of the Aegypt cycle. This one was slow going and hard to get into, but it really picks up in the middle. The costume party towards the end of the book is all worth it, as is the mission to rescue Rosie’s daughter Sam from the cultish Powerhouse. Crowley may be accused of injecting the mystic into the mundane in an artificial way, but I think what this book did was show how desperate circumstances amplify and distort our perceptions to make events seem mysterious or magical. Tango (Slawomir Mrozek) – a somewhat absurdist allegory about reactionary tendencies. The setting is a rather bohemian family living in a house in disarray, with the father Stromil putting on ridiculous art shows (“experiments”) and ignoring the fact that his wife is cheating on him with a roustabout, Eddie. All of this is very frustrating to Arthur, his son, who longs for a return to the old classical ways. He gets his way by holding them to higher standards at gunpoint, but then realizes the futility of it all. I imagine it’s funnier performed with more dramaturgical context. It takes place between the wars, so that clearly has something to do with it, but my play-reading chops are not up to snuff to say anything insightful. Complexity and Information (J. F. Traub and A. G. Werschulz) – a gentle, if scattered, introduction to information based complexity, which I had heard about but didn’t really know too much about. It somehow feels “old fashioned” to me (perhaps that’s the machine learning kool-aid speaking), with comparisons to Turing machines and so on. But the central question of how to appropriately estimate integrals from samples is pretty interesting, given my recent forays into using MCMC. # He Who (Theatre Zarko) On Saturday, I saw He Who, a Kickstarter-funded production by puppet symbolist theatre group Theatre Zarko playing at the Steppenwolf Garage Rep. A dream-like meditation on contemporary politics, motherhood, responsibility, and depression, He Who at times feels too serious, but I think the group manages to find the humorous moments inside the pathos. The main character is a giant infant, mostly a globe-like puppet head made of wire and partially skinned in rags, tilted askew with a single eyeball and and mouth that can open to plead for food or attention or issue commands. It’s clear that this is not necessarily a literal baby, but more of an infantile being — at times a baby, and at times something else that is demanding and incapable of taking care of itself (our political system?) The baby is cared for by a mother played by 4 women in parallel, representing perhaps 4 different aspects of the same mother. She is tired of caring for this huge baby and tries to escape into fantasy, only to be dragged back into line and interrogated by an authoritarian lady in red. One of the most affecting moments of puppetry / mask for me was when one of the mothers dons a coat, hat, and mask holding a cigarette and dances. The physicality was so expressive it almost made the mask seem to move. An absent father figure, also masked, appears and disappears, adding to the dreamlike quality of the piece. The synopsis claims that the play is about “an old man’s dying few seconds, [in which] he experiences the distorted and painful dreams of his most influential acts and their consequences.” I think this gives far too much credit to the baby, and more or less turns the mother(s) into scraps of his memory, rather than their own characters with agency (whatever that means in this context). I think this play is really about the women, or rather, that it should be about them. It is definitely a strange work, but I highly recommend checking it out. It will most likely be unlike anything you’ve seen before. # Products of random stochastic matrices As I mentioned, Behrouz Touri gave a really great presentation at ITA on some of his work with Angelia Nedić on products of stochastic matrices, some of which was in this paper on ArXiV. The setup of the paper is relatively straightforward — we have a sequence of independent random matrices $\{W(k)\}$, each of which is row-stochastic almost surely, and we want to know when the product $\lim_{k \to \infty} W(k) W(k-1) \cdots W(t_0)$ converges almost surely. The main result is that if the chain is balanced and strongly aperiodic, then the limit is a random stochastic matrix such that the rows in the same connected component of the infinite flow graph are equal. Recall that for a chain $W$ a lazy version of the chain is $\alpha W + (1 - \alpha) I$. Laziness helps avoid periodicity by letting the chain be “stuck” with probability $(1 - \alpha)$. Strongly aperiodic means that there is a $\gamma \in (0,1]$ such that $\mathbb{E}[ W_{ii}(k) W_{ij}(k) ] \ge \gamma \mathbb{E}[ W_{ij}(k) ]$. Basically this is a sort of “expected laziness condition” which says that there is enough self-transition probability to avoid some sort of weak periodicity in the chain. Consider a cut of the chain into a set of states $S$ and a set $\bar{S}$. A chain is balanced if there is an $\alpha > 0$ such that for all cuts $(S,\bar{S})$, we have $\mathbb{E}[ W_{S \bar{S}}(k) ] \ge \alpha \mathbb{E}[ W_{\bar{S} S}(k) ]$. So this is saying that at each time, the flow out of $S$ is commensurate with the flow into $S$. The infinite flow graph has the same vertex set as the original chain, but with edge $(i,j)$ existing only if $\sum_{k} W_{ij}(k) + W_{ji}(k) = \infty$. That is, the edge has to exist infinitely often in the graph. So to add it all up, the well-behaved independent products of stochastic matrices that converge are those which don’t behave badly over time — for each time $k$, they don’t shift around the mass too much and they are not too periodic. Basically, they don’t become volatile and cyclical, which sounds perfectly reasonable. So how does he prove it? The first part is to connect the problem with a dynamic system whose state $x(k+1) = W(k+1) x(k)$ and look to see if the state converges in the limit. The next part uses a connection to absolute probability processes, which were used by Kolmogorov in 1936. A random vector process $\pi(k)$ is an absolute probability process for $\{W(k)\}$ if it is adapted to the same filtration as the chain, it’s stochastic almost surely, and $\mathbb{E}[ \pi(k+1) W(k+1) | \mathcal{F}_k ] = \pi(k)$ Kolmogorov showed that for a deterministic sequence of stochastic matrices there is always an deterministic absolute probability process, so for independent chains we can always find a random absolute probability process. Using this, Touri and Nedić define a class of comparison functions which are super-martingales for each$\latex x(0)\$ and have a limit. By choosing a particular comparison function they can get a version of the main result. It’s a nicely written paper and worth a skim if you’re interested in these things (as I am).

It’s been a busy week, deadline-wise, but I did see a few cool things on the interwebs which seemed worth sharing:

Tarantulas molting, courtesy of my high school biology teacher and ExploraVision coach extraordinare, Mr. Stone (his blog is cool too).

Keeping with the nature theme, find the cuttlefish!. The octopus video is cool too. Thanks to my commute being a bit longer, I listen to Science Friday podcasts as well as Story Collider, which is a pretty cool Moth-meets-science storytelling podcast.

Sometimes papers use pretty strong words in their titles (see for more context). On that note, some letters from John Nash (see also) were recently declassified by the NSA wherein he seems to predict fundamentals of cryptography and computational complexity. In more Rivest news, he coded up the cryptosystem.

In sadder news (also not so recent now), De Bruijn passed away. I’ve started a bioinformatics project recently (maybe more like “started”) and DeBruijn graphs are a pretty useful tool for making sense of data from next-generation sequencing technologies. Here are some animations describing how Illumina and 454 sequencing work.

Maybe when it gets warmer I will put together a worm bin — I miss the curbside composting of Berkeley.

I get a lot of positive comments about this shirt, but Topatoco are discontinuing it. Speaking of potatoes, Lav has a nice post with some links to papers on the importance and history of potatoes.