Linkage

Via Brandy, Kenji breaks down perfect hard boiled eggs. See also sauceome.

Bret Victor talks about Inventing on Principle — the first half is mostly demos of some pretty amazing applications of his major driving principle, which is that creators should be able to see what they are creating in real time. He sometimes waxes a little TED-like, but overall, quite inspiring.

My high school history teacher, Chris Butler, has turned his award-winning lecture notes and flowcharts into an iPad app which is available on the App Store.

Queen, live at Wembley. (via MeFi)

Some pretty cool visualizations of sorting. (via logistic aggression)

Upper bounds for causal adversaries

Bikash Dey, Mike Langberg, Sid Jaggi, and I submitted an extended version of our (now accepted) ISIT paper on new upper bounds for binary channels with causal adversaries to the IT Transactions. The model is pretty straightforward: Alice (the encoder) transmits a message to Bob (the receiver) encoded over n uses of a binary-input, binary-output channel. The channel is controlled by Calvin (an adversary) who sequentially looks at each bit and can decide whether or not to flip it, up to pn total bit flips. That is, Calvin is causal: the decision to flip bit i is based on knowledge of bits 1, 2, \ldots, i. What we show is a new upper bound on the capacity of this channel. Let \alpha(p,\bar{p}) = 1-4(p-\bar{p}). Then

C \le \min_{\bar{p} \in [0,p]} \left[ \alpha(p,\bar{p})\left(1-H\left(\frac{\bar{p}}{\alpha(p,\bar{p})}\right)\right) \right]

This is what it looks like:

[Figure: New Upper Bound for Causal Adversaries (plot of the new upper bound).]
So clearly causal adversaries are worse than i.i.d. noise (the 1 - H(p) bound).
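
Here is a small Python snippet of my own (not from the paper) that evaluates the bound numerically for a few values of p and compares it with 1 - H(p); the grid size is an arbitrary choice.

import numpy as np

def H(x):
    # binary entropy in bits, with the convention 0 log 0 = 0
    x = np.clip(x, 1e-12, 1 - 1e-12)
    return -x * np.log2(x) - (1 - x) * np.log2(1 - x)

def causal_upper_bound(p, grid=2000):
    # min over p_bar in [0, p] of alpha * (1 - H(p_bar / alpha)),
    # with alpha = 1 - 4(p - p_bar). For p >= 1/4 the bound is already 0
    # at p_bar = 0; for p < 1/4 we have alpha > 0 and p_bar/alpha < 1/2.
    if p >= 0.25:
        return 0.0
    p_bar = np.linspace(0.0, p, grid)
    alpha = 1 - 4 * (p - p_bar)
    return float(np.min(alpha * (1 - H(p_bar / alpha))))

for p in [0.05, 0.10, 0.15, 0.20, 0.25]:
    print(f"p = {p:.2f}: new bound <= {causal_upper_bound(p):.3f}, 1 - H(p) = {1 - H(p):.3f}")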

To show such a bound we have to propose a new attack for the adversary. We call our attack “babble and push.” It operates in two phases. The first phase is of length \ell channel uses and the second of length n - \ell. Let \mathbf{x}(m) be the codeword for message m.

  1. (Babble) Calvin chooses a random subset \Gamma of \bar{p} n indices uniformly from all (\bar{p} n)-subsets of \{1, 2, \ldots, \ell\} and flips bit i for each i \in \Gamma.
  2. (Push) Calvin finds all possible codewords which are consistent with what Bob has received in the first phase:

    B_{\mathbf{y}^{\ell}} = \{ u : d_H(\mathbf{y}^{\ell}, \mathbf{x}^{\ell}(u))=\bar{p}n \},

    and selects an element \hat{u} \in B_{\mathbf{y}^{\ell}} uniformly at random. For the second phase, Calvin selectively pushes the received codeword towards \mathbf{x}(\hat{u}): in each position, if the transmitted bit and the corresponding bit of \mathbf{x}(\hat{u}) match, he does nothing, and if they do not match he flips the transmitted bit with probability 1/2. (A toy simulation sketch of the attack follows below.)
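
Here is a rough simulation sketch of the babble-and-push idea. This is my own toy version, not the construction from the paper: the random codebook, block length, parameter values, and budget bookkeeping are all illustrative simplifications.

import numpy as np

rng = np.random.default_rng(0)

n, ell = 64, 32        # block length and babble-phase length (illustrative)
p, p_bar = 0.15, 0.05  # total budget p*n, babble-phase budget p_bar*n
M = 16                 # number of messages; random codebook just for illustration
codebook = rng.integers(0, 2, size=(M, n))

m = rng.integers(M)    # Alice's message
x = codebook[m]
budget = int(p * n)

# Babble: flip a uniformly random set Gamma of p_bar*n positions among the first ell.
k = int(p_bar * n)
gamma = rng.choice(ell, size=k, replace=False)
y = x.copy()
y[gamma] ^= 1
budget -= k

# Push: find messages consistent with Bob's first-phase observation, pick one
# uniformly at random, and push the rest of the word towards its codeword.
consistent = [u for u in range(M)
              if np.sum(y[:ell] != codebook[u, :ell]) == k]
u_hat = rng.choice(consistent)

for i in range(ell, n):
    if budget == 0:
        break
    if x[i] != codebook[u_hat, i] and rng.random() < 0.5:
        y[i] ^= 1      # flip toward the selected codeword
        budget -= 1

print("true message:", m, "Calvin's target:", u_hat, "flips used:", int(p * n) - budget)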

Analyzing this scheme amounts to showing that Calvin can render the channel “symmetric.” This is a common condition in arbitrarily varying channels (AVCs), a topic near and dear to my heart. Basically, Bob can’t tell the difference between the transmitted codeword and the one Calvin selected, because under Calvin’s attack, the chance that Alice chose u and Calvin chose \hat{u} is the same as the chance that Alice chose \hat{u} and Calvin chose u. Establishing this symmetry condition requires some technical excursions which are less fun to blog about, but were fun to figure out.

It’s relatively clear that this approach would extend to more general AVCs, which we may work on in the future. What is neat to me is that this shows how much value Calvin can derive from knowing the current input bit: by forcing additional uncertainty on Bob during the babble phase, Calvin buys time to use his bit-flipping budget more efficiently in the second phase.

If at first you don’t succeed, normalize, normalize again

My ex-groupmate and fellow Uni High graduate Galen Reeves told me about a paper a few weeks ago when I visited him at Stanford:

Successive Normalization of Rectangular Arrays
Richard A. Olshen and Bala Rajaratnam
The Annals of Statistics, 38(3), pp. 1638–1664, 2010

Apparently, however, the arguments in the paper are not quite correct [1], and they recently uploaded a correction to the arXiv.

This paper looks at the effect of a very common preprocessing step used to transform an n \times k data matrix \mathbf{X} into a form acceptable for statistical or machine learning algorithms that assume things like zero-mean or bounded vectors. Here n may represent the number of individuals and k the number of features, for example. Or the data may come from a DNA microarray (their motivating example). This preprocessing is often done without much theoretical justification — the mathematical equivalent of tossing spilled salt over your shoulder. The paper studies the limiting process of standardizing rows and then columns and then rows and then columns, again and again. They also require n, k \ge 3. “Readers will see that the process and perhaps especially the mathematics that underlies it are not as simple as we had hoped they would be.”

So what exactly is the preprocessing? I am going to describe things in pseudocode (too lazy to do real code, sorry). Given a data matrix X[i,j] they look at

for i = 1:n { X[i,1:k] = X[i,1:k] - mean(X[i,1:k]) }
for j = 1:k { X[1:n,j] = X[1:n,j] - mean(X[1:n,j]) }

They call the first a “row mean polish” and the second a “column mean polish.” They show this converges in one step.
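
A quick sanity check of my own (the matrix size is arbitrary) that one pass of row-then-column mean polish is already a fixed point:

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))

def mean_polish(X):
    X = X - X.mean(axis=1, keepdims=True)     # row mean polish
    return X - X.mean(axis=0, keepdims=True)  # column mean polish

once = mean_polish(X)
twice = mean_polish(once)
print(np.allclose(once, twice))   # True: a second pass changes nothing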

But what about standardizing? The more complicated polishing procedure looks like this:

for i = 1:n {
  mu = mean(X[i,1:k])                               # row mean
  sigma = sqrt( sum( (X[i,1:k] - mu)^2 ) / (k-1) )  # row standard deviation
  X[i,1:k] = (X[i,1:k] - mu)/sigma
}
for j = 1:k {
  mu = mean(X[1:n,j])                               # column mean
  sigma = sqrt( sum( (X[1:n,j] - mu)^2 ) / (n-1) )  # column standard deviation
  X[1:n,j] = (X[1:n,j] - mu)/sigma
}

This standardizes rows first, and then columns (or “lather” and “rinse,” since we are going to “repeat”). They call this Efron’s algorithm because he told them about it. So what happens if we repeat these two steps over and over again on a matrix with i.i.d. entries from some continuous distribution?

Theorem 4.1 Efron’s algorithm converges for all X outside a Borel set of Lebesgue measure 0.

So what does it look like in practice? How fast is this convergence? Empirically, it looks exponential, and they have some theoretical guarantees in the paper, kind of hidden in the discussion. The proofs are not terribly ornate but are tricky, and I don’t quite get all the details myself, but I figured readers of this blog would certainly be interested in this cute result.
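
To play with the convergence yourself, here is a small script of mine (not from the paper; the matrix size and iteration count are arbitrary) that iterates the row/column standardization on an i.i.d. Gaussian matrix and prints the maximum entrywise change per iteration:

import numpy as np

rng = np.random.default_rng(1)
n, k = 20, 10                       # both at least 3, as the paper requires
X = rng.standard_normal((n, k))

def standardize_rows_then_cols(X):
    # standardize each row to mean 0 and unit standard deviation, then each column
    X = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, ddof=1, keepdims=True)
    X = (X - X.mean(axis=0, keepdims=True)) / X.std(axis=0, ddof=1, keepdims=True)
    return X

for t in range(1, 16):
    X_new = standardize_rows_then_cols(X)
    print(f"iteration {t:2d}: max entrywise change = {np.max(np.abs(X_new - X)):.3e}")
    X = X_new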

[1] A fun quote from the correction: “Theorem 4.1 of [2] is false. A claimed backwards martingale is NOT. Fortunately, all that seems damaged by the mistake is pride. Much is true.” I really liked this.

Linkage

More attacks on anonymity in DNA databases.

A letter from Kurt Vonnegut to the head of a school board in North Dakota who burned copies of Slaughterhouse Five. (via MeFi)

An interview with Olympic bronze medalist John Carlos on giving a black power salute at the 1968 Olympics.

Tomorrow is National Grilled Cheese Day.

I really enjoyed this exhibit of Tagore’s paintings at The Art Institute of Chicago; although my favorite drawing is not online, this bird was pretty cool.

The history of the martingale

The other day I found myself wondering “so where does the word martingale come from?” A short time on Google later, I came across this paper from the Journal Electronique d’Histoire des Probabilités et de la Statistique, which had a special issue on The Splendors and Miseries of Martingales (Splendeurs et misères des martingales):

The Origins of the Word “Martingale”
Roger Mansuy
(earlier version: “Histoire de martingales” in Mathématiques & Sciences Humaines / Mathematical Social Sciences, 43rd year, no. 169, 2005(1), pp. 105–113.)

It’s 10 pages and worth a read just for fun. Some of the fun facts:

  • Doob is the one who really made the name popular (in addition to proving many fundamental results). He got the name from a thesis by Ville.
  • A martingale is the name for a Y-shaped strap used in a harness — it runs along the horse’s chest and then splits up the middle to join the saddle.
  • A martingale is a name for a betting strategy (usually we think of doubling bets) but it’s not clear which one from the historical record.
  • “To play the martingale is to always bet all that was lost” (dictionary of the Académie Française, 4th ed.) — there are earlier dictionary definitions too, going back to 1750.
  • “A very slim trail seems to indicate a derivation of the word from the Provençal expression jouga a la martegalo, which means ‘to play in an absurd and incomprehensible way’.” Apparently Provençal is also the origin of Baccarat.
  • So what is martegalo? It might refer to a place called Martigues, whose residents are supposedly a bit naïve.
  • “Martingale pants” are from Martigues, and have, according to Rabelais, “a drawbridge on the ass that makes excretion easier.”
  • There’s a woman in the 17th century who called herself La Martingale and who made a number of prophetic predictions.
  • There were sailors called martégaux who gave their name to a rope called a martegalo used on sailboats. Perhaps this is where the horse connection comes in?
  • Apparently “martingale” is also vernacular for “prostitute,” but the etymology for that usage is not well-documented.

All in all, perhaps this essay ends up raising more questions than it answers, but I certainly had no idea that there was this much to be unearthed behind a simple word.

ICITS Workshop Deadline Extension

(Via Adam Smith) The deadline for submitting workshop papers to the 6th International Conference on Information Theoretic Security (ICITS) has been extended from today to Monday, April 23 (at 3pm Eastern) due to holidays. It’s in Montreal in August, so if you have some recent results that you would like to present in a workshop setting, please send them in. “Papers which broaden the applicability of information-theoretic techniques (say, to areas such as data privacy or anonymity) would also be very welcome.”

ICITS will have two tracks this year: a more CS-style conference with published proceedings, and a workshop (think ITA) track without proceedings where you can present older stuff.

Jácaras and Ensaladas

I am singing with the UChicago Early Music Ensemble, a somewhat relaxed group led by David Douglass and Ellen Hargis of The Newberry Consort. I started rehearsing a bit late, so I’ve been playing catch-up. This year the repertoire is all music from Spain and the Spanish colonies, and today we worked on two of the harder pieces in the program: an ensalada called La Bomba, by Mateo Flecha “El Viejo,” and a jácara by Juan Gutiérrez de Padilla. Too many words!

In looking for some recordings to get a better sense of the pieces, I came across this charmingly old King’s Singers TV special (check out those sweater vests!) acting out La Bomba in what appears to be the house from Clue:

Lucky them, they get multiple takes, which makes it a bit easier to manage the crazy transitions in the piece. There’s also a multitracked recording which is pretty good:

Unfortunately we are doing it up a third from there, much to the chagrin of my passaggio.

We spent a bit of time trying to get the jácara up to speed, but singing Spanish that fast is hard! When I heard how fast this version went I almost lost it:

It looks like I have my work cut out for me, especially if I want to roll my r’s like that.

New books on the foundations of signal processing

For those who are interested in signal processing, the classic Wavelets and Subband Coding by Martin Vetterli and Jelena Kovačević is a must-have. But perhaps you were unaware that the two of them, together with Vivek Goyal, have written a pair of books:

Signal Processing: Foundations
Signal Processing: Fourier and Wavelet Representations

They should give you a solid grounding in the fundamentals of signal processing from a more modern perspective. Preliminary versions are available for download from the books’ website.