Linkage

Links to videos and a special chair.

James Baldwin debates William F. Buckley, Jr. I’ve only seen part of it so far, but it’s pretty interesting (via Ta-Nehisi Coates).

I’ve heard quite a bit about the treatment of agricultural workers in Florida, particularly in tomato farming, but this video with a representative of the Coalition of Immokalee Workers is a good introduction to what is going on there (via Serious Eats). The book Tomatoland is on my reading list.

I didn’t know the origin of the term swizzle-stick until now.

I’m a big fan of Cowboy Bebop, and Shinichiro Watanabe has a new show out called Sakamichi no Apollon (via MeFi). I watched the first episode, and the Art Blakey album Moanin’ features prominently, so I think I’m going to like this show quite a bit. It’s being streamed in an ad-heavy format on Crunchyroll.

That’s a lot of pendulums. That’s right, pendulums.

Why don’t you relax a little in the bear chair?

Concert Bleg : Chicago Chorale performs Vierne, Bach, and Schoenberg

I will probably repost this later, but I am singing with the Chicago Chorale in a few weeks. Luckily for me, I sang the two shorter pieces on the program in San Diego, but the centerpiece of this concert is the Vierne Messe Solennelle, which is a real showcase for the organist.

Voices Aloft

Sunday, May 13, 2012. 3:00 p.m.
Thomas Weisflog, Organ
Rockefeller Memorial Chapel, 5850 S. Woodlawn Avenue

Ticket Pricing
Reserved $35, General Admission $25, Student $20
Reserved and GA tickets will be $5 more at the door

The centerpiece of the concert is Louis Vierne’s Messe Solennelle, for choir and organ, composed in 1899. One of the greatest organists of his time, Vierne played and composed for the great Parisian organs of St. Sulpice and Notre Dame. As the Messe is one of the grandest compositions of the Golden Age of French organ composition, no organ in Chicago is more suited to this repertoire than the recently restored E.M. Skinner organ at The University of Chicago’s Rockefeller Memorial Chapel, now the largest organ in Chicago. Nor is any organist more suited to perform the work with Chorale than the Chapel organist, Thomas Weisflog. A heartfelt and sincere work, it also utilizes all of the sonic fireworks that the instrument and the choir are capable of, entirely filling the Chapel with sound.

This concert will also feature two ethereally beautiful a cappella works: J.S. Bach’s double choir motet, Komm, Jesu, komm, and Arnold Schoenberg’s Friede auf Erden, utilizing the extraordinary acoustic properties of the chapel’s choir loft.

A.D. Sarwate’s Own Tangelo Bitters

A couple of weeks ago I started a batch of Tangelo bitters, using a couple of recipes I cobbled together from the web as guidelines. To be honest, I can’t even remember which recipes I used, but the closest one is the Serious Eats version. I had not yet obtained the book Bitters, but I figured it would be a fun experiment and I could always foist off the resulting stuff on my friends. The recipe uses two infusions — spices and peel into clear liquor and bittering agents into rye.


Juking the stats in academic publishing

I heard recently of a case where someone got a paper back with revisions requested, and a deadline for said revisions. They ended up asking for a week’s extension, but then the journal said they would have to do a fresh submission and redo the whole review cycle. I found this baffling — but then that person pointed out that the journal has built a reputation on fast turnaround times, and so to keep their “sub-to-pub” numbers low, they don’t want to give any extensions to authors. It’s better for their numbers to force a resubmission than to let the same “paper ID” linger in the system.

This is a classic example of juking the stats.

I just got a rejection from KDD 2012 which smacks of the same ominous reasoning:

We try to notify authors once a decision on a submission is concretely made, and hope that the early notifications can reduce the average review turn-over time.

But the real kicker is that “due to technical constraints” they can’t give us the reviews until May 4th. So I’m not really sure what I am supposed to do with this information — I can’t really start on revisions without the reviews, so this “early notification” thing is really just to make them feel better about themselves, it seems. Or perhaps they can then report that the reviewing was “more efficient.”

In any case, no harm is done, per se. But optimizing metrics like “sub-to-pub” seems to be as misguided as teaching to the test. What do we really want out of our peer review process? Or should we abandon it?

Manuscript Central is annoying

The IEEE Transactions on Information Theory recently transitioned to Manuscript Central from the old Pareja system, so now all of the IEEE journals for which I review seem to be handled by the same external system. As a reviewer and author, I have a lot of complaints (small and large) about Manuscript Central:

  • Why oh why do I need to disable my popup blocker for your site to work?
  • Why can login information not be shared across different IEEE publications? I have a separate account for each journal, with a separate password. Thank goodness I have LastPass, but even that program gets confused sometimes.
  • What is the deal with the mandatory subject classifications for papers? One of the “topics” I could pick was “IEEE Transactions on Information Theory.” Really? That’s a topic?
  • Why must papers for review be emblazoned with that stupid pale blue “For Peer Review Only” running diagonally across each page? This causes PDF annotations such as highlighting to barf, making paperless reviewing of papers significantly more annoying than it needs to be.

The worst part is that I am sure IEEE could implement a significantly cheaper and just-as-effective system itself, but now each Society is forking over money to Manuscript Central, which, as far as I can tell, is a shoddy product that offers significantly more annoyances for authors and reviewers. Perhaps it’s significantly better for editors (I imagine it is), but it seems like a bad deal overall.

Of course, now I sound curmudgeonly. Get off my lawn!

Do other people like MC? Or do you have other pet peeves?

In The Family

On Saturday evening I saw In The Family at the Asian American Showcase. It’s a film by Patrick Wang, who I may have last seen in a production of Grand Hotel at MIT when I was just starting college. It’s a film that is definitely worth seeing — an affecting and truthful story, it may make you tear up at times. It will also make you believe that a deposition can be the most important moment in a person’s life.

The trailer for the movie is here:

The synopsis says:

In the town of Martin, Tennessee, Chip Hines, a precocious six year old, has only known life with his two dads, Cody and Joey. And a good life it is. When Cody dies suddenly in a car accident, Joey and Chip struggle to find their footing again. Just as they begin to, Cody’s will reveals that he named his sister as Chip’s guardian. The years of Joey’s acceptance into the family unravel as Chip is taken away from him. In his now solitary home life, Joey searches for a solution. The law is not on his side, but friends are. Armed with their comfort and inspired by memories of Cody, Joey finds a path to peace with the family and closer to his son.

The trailer starts almost towards the end of the film, and I don’t think it really shows the things that are most beautiful about it. There is a scene after Cody’s funeral when Joey and Chip return to the house, shocked. Joey sits at the kitchen table, and Chip (where do they get these child actors — the kid is amazing!) has a long silent scene in which he gets the mail, climbs on the step stool, gets a glass, gets the Coke from the fridge, pours himself some, gets his dad a beer, opens the beer with some effort, then clinks the bottle and glass for cheers, and that is what snaps Joey out of it and he starts sorting the mail. This is what I mean by a truthful scene — in the face of trauma and loss, at some point we go on, as Beckett might say. Watching those moments is important.

So the film is almost three hours long. But it’s worth it, because it shows you that kind of truth. Moment by moment. You get to understand what is at stake in this story, why Cody and Chip mean so much to Joey. It’s a beautiful debut film; it was rejected from a number of festivals, but Wang is self-distributing it, and hopefully it will appear soon in a venue near you. Do try to see it — it will move you.

Cover’s test for the irrationality of a coin

Someone hands you a coin which has a probability p of coming up heads. You can flip the coin as many times as you like (or more precisely, you can flip the coin an infinite number of times). Let S = \{r_i : i = 1, 2, \ldots\} be the set of rational numbers in [0,1]. After each flip, you have to guess one of the following hypotheses: that p = r_i for a particular i, or p is irrational. Furthermore, you can only make a finite number of errors for any p \in [0,1] - N_0, where N_0 is a set of irrationals of Lebesgue measure 0. Can you do it? If so, how?

This is the topic addressed by a pair of papers that Avrim Blum mentioned in Yoav Freund’s remembrances of Tom Cover:

Cover, T. M. (1973). “On determining the irrationality of the mean of a random variable.” The Annals of Statistics, 1, 862–871.
Cover, T. M. and Hirschler (1975). “A finite memory test of the irrationality of the parameter of a coin.” The Annals of Statistics, 3, 939–946.

I’ll talk about the first paper in this post.

The algorithm is not too complicated — you basically go in stages. For each time j = 1, 2, \ldots you have a sample size n(j); think of n(j) as piecewise constant and nondecreasing, so the test only updates its decision when n(j) jumps. There are two sequences: a threshold k_{n(j)}, and an interval width \delta_{n(j)}.

  1. Take the sample mean \hat{x}_{n(j)} and look at an interval of width 2 \delta_{n(j)} centered on it. Note that this makes the same decision for each j until n(j) changes.
  2. Given an enumeration of the set S, find the smallest i such that r_i \in [\hat{x} - \delta_{n(j)}, \hat{x} + \delta_{n(j)}].
  3. If there is such an i and it satisfies i < k_{n(j)}, then declare p = r_i; otherwise declare p \notin S.

The last thing to do is pick all of these scalings. This is done in the paper (I won’t put it here), but the key thing to use is the law of the iterated logarithm (LIL), which I never really had a proper appreciation for prior to this. For \epsilon > 0,

| \hat{x}_n - p | \le (1 + \epsilon) \sqrt{ \frac{2 p (1 - p) \log \log n}{n} }

for all but finitely many values of n. This gets used to set the interval width \delta_{n(j)}.
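
To make the staged structure concrete, here is a minimal Python sketch. The enumeration of S, the stage schedule n(j), the threshold k_n, and the constant in the LIL-flavored width \delta_n are all placeholder choices of mine for illustration, not the ones from the paper.

import math
import random
from fractions import Fraction
from itertools import islice

def rationals():
    # enumerate the rationals in [0,1] by increasing denominator
    seen = set()
    q = 1
    while True:
        for num in range(q + 1):
            r = Fraction(num, q)
            if r not in seen:
                seen.add(r)
                yield r
        q += 1

def decide(flips, k, delta):
    # one stage of the test: declare the first of the first k rationals
    # lying within delta of the sample mean, or else declare p irrational
    xbar = sum(flips) / len(flips)
    for r in islice(rationals(), k):
        if abs(float(r) - xbar) <= delta:
            return r
    return None  # None means "p is irrational"

random.seed(0)
p = 1 / 3  # the coin is secretly rational
flips = []
for stage in range(1, 6):
    n = 10 ** stage                                    # placeholder schedule n(j)
    flips += [random.random() < p for _ in range(n - len(flips))]
    delta = 2 * math.sqrt(math.log(math.log(n)) / n)   # LIL-flavored width
    k = 100 * stage                                    # placeholder threshold k_n
    print(n, decide(flips, k, delta))

At small n the test cheerfully declares wrong rationals, but that is allowed: only finitely many errors are permitted, and as \delta_n shrinks the declarations settle on 1/3.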

The cool thing to me about this paper is that it’s an example of “letting the hypothesis class grow with the data.” We’re trying to guess if the coin parameter p is rational and if so, which rational. But we can only apprehend a set of hypotheses commensurate with the data we have, so the threshold k_{n(j)} limits the “complexity” of the hypotheses we are willing to consider at time j. The LIL sets the threshold for us so that we don’t make too many errors.

There are lots of little extensions and discussions about the rationality of physical constants, testing for rationality by revealing digits one by one, and other fun ideas. It’s worth a skim for some of the readers of this blog, I’m sure. A miscellaneous last point : Blackwell suggested a Bayesian method for doing this (mentioned in the paper) using martingale arguments.

Linkage

Via Brandy, Kenji breaks down perfect hard boiled eggs. See also sauceome.

Bret Victor talks about Inventing on Principle — the first half is a series of demos of some pretty amazing applications of his major driving principle, which is that creators should be able to see what they are creating in real time. He sometimes waxes a little TED-like, but overall, quite inspiring.

My high school history teacher, Chris Butler, has turned his award-winning lecture notes and flowcharts into an iPad app which is available on the App Store.

Queen, live at Wembley. (via MeFi)

Some pretty cool visualizations of sorting. (via logistic aggression)

Upper bounds for causal adversaries

Bikash Dey, Mike Langberg, Sid Jaggi, and I submitted an extended version of our (now accepted) ISIT paper on new upper bounds for binary channels with causal adversaries to the IT Transactions. The model is pretty straightforward : Alice (the encoder) transmits a message to Bob (the receiver) encoded over n uses of a binary input, binary output channel. The channel is controlled by Calvin (an adversary) who sequentially looks at each bit and can decide whether or not to flip it, up to pn total bit flips. That is, Calvin is causal : the decision to flip bit i is based on the knowledge of bits 1, 2, \ldots, i. What we show is a new upper bound on the capacity of this channel. Let \alpha(p,\bar{p}) = 1-4(p-\bar{p}). Then

C \le \min_{\bar{p} \in [0,p]} \left[ \alpha(p,\bar{p})\left(1-H\left(\frac{\bar{p}}{\alpha(p,\bar{p})}\right)\right) \right]
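
For concreteness, here is a quick grid evaluation of this bound (a throwaway NumPy script of mine, not code from the paper):

import numpy as np

def H(x):
    # binary entropy in bits, with 0 log 0 = 0
    x = np.clip(x, 1e-12, 1 - 1e-12)
    return -x * np.log2(x) - (1 - x) * np.log2(1 - x)

def causal_upper_bound(p, npts=20001):
    # grid-minimize alpha * (1 - H(pbar / alpha)) over pbar in [0, p],
    # where alpha = 1 - 4 (p - pbar)
    pbar = np.linspace(0.0, p, npts)
    alpha = 1.0 - 4.0 * (p - pbar)
    ok = alpha > pbar  # keeps alpha positive and the entropy argument in [0, 1]
    vals = alpha[ok] * (1.0 - H(pbar[ok] / alpha[ok]))
    return vals.min()

for p in [0.05, 0.10, 0.15, 0.20]:
    print(f"p = {p:.2f}: C <= {causal_upper_bound(p):.4f}")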

This is what it looks like:

[Figure: plot of the new upper bound for causal adversaries]


So clearly causal adversaries are worse than i.i.d. noise (the 1 - H(p) bound).

To show such a bound we have to propose a new attack for the adversary. We call our attack “babble and push.” It operates in two phases. The first phase is of length \ell channel uses and the second of length n - \ell. Let \mathbf{x}(m) be the codeword for message m.

  1. (Babble) Calvin chooses a random subset \Gamma of \bar{p} n indices uniformly from all (\bar{p} n)-subsets of \{1, 2, \ldots, \ell\} and flips bit i for each i \in \Gamma.
  2. (Push) Calvin finds all possible codewords which are consistent with what Bob has received in the first phase:

    B_{\mathbf{y}^{\ell}} = \{ u : d_H(\mathbf{y}^{\ell}, \mathbf{x}^{\ell}(u))=\bar{p}n \},

    and selects an element \hat{u} \in B_{\mathbf{y}^{\ell}} uniformly at random. For the second phase, Calvin selectively pushes the received codeword towards \mathbf{x}(\hat{u}) — if the current bits of the transmitted codeword and the selected codeword match, he does nothing, and if they do not match he flips the bit with probability 1/2.

Analyzing this scheme amounts to showing that Calvin can render the channel “symmetric.” This is a common condition in arbitrarily varying channels (AVCs), a topic near and dear to my heart. Basically Bob can’t tell the difference between the transmitted codeword and the one Calvin selected, because under Calvin’s attack, the chance that Alice chose u and Calvin chose \hat{u} is the same as the chance that Alice chose \hat{u} and Calvin chose u. To establish this symmetry condition requires some technical excursions which are less fun to blog about, but were fun to figure out.
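
To make the two phases concrete, here is a toy Python simulation of the attack. The random codebook, the block length, the split \ell = n/2, and the hard cutoff on the flip budget in the push phase are all simplifications of mine; the paper chooses these much more carefully.

import random

random.seed(1)
n, p, pbar = 60, 0.15, 0.10          # toy block length and flip rates
ell, nflip = n // 2, int(pbar * n)   # phase split and number of babble flips
M = 2 ** 16                          # toy random codebook (not a real code)
code = [random.getrandbits(n) for _ in range(M)]

def dH(a, b, bits):
    # Hamming distance on the low-order `bits` bits (bit i = channel use i)
    return bin((a ^ b) & ((1 << bits) - 1)).count("1")

u = random.randrange(M)  # Alice's message
y = code[u]              # Calvin edits this causally

# babble: flip a uniformly random nflip-subset of the first ell bits
for i in random.sample(range(ell), nflip):
    y ^= 1 << i

# push: pick uhat uniformly from the codewords consistent with Bob's view
# of the first phase (the true u is always in this set)
consistent = [v for v in range(M) if dH(y, code[v], ell) == nflip]
uhat = random.choice(consistent)

# where x(u) and x(uhat) disagree, flip with probability 1/2, stopping
# when the total budget of p*n flips runs out
budget = int(p * n) - nflip
for i in range(ell, n):
    bit_u = (code[u] >> i) & 1
    bit_uhat = (code[uhat] >> i) & 1
    if bit_u != bit_uhat and budget > 0 and random.random() < 0.5:
        y ^= 1 << i
        budget -= 1

print("d(y, x(u)) =", dH(y, code[u], n), " d(y, x(uhat)) =", dH(y, code[uhat], n))

In the real analysis the two distances printed at the end are exchangeable, which is exactly the symmetry Bob cannot break; this toy only gestures at that.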

It’s relatively clear that this approach would extend to more general AVCs, which we could work on in the future. What is neat to me is that this shows how much value Calvin can derive from knowing the current input bit — by creating additional uncertainty for Bob during the babble phase, Calvin buys time to use his bit-flipping budget more efficiently in the second phase.

If at first you don’t succeed, normalize, normalize again

My ex-groupmate and fellow Uni High graduate Galen Reeves told me about a paper a few weeks ago when I visited him at Stanford:

Successive Normalization of Rectangular Arrays
Richard A. Olshen and Bala Rajaratnam
The Annals of Statistics 38(3), pp. 1638–1664, 2010

Apparently, however, the arguments in the paper are not quite correct [1], and they recently uploaded a correction to the arXiv.

This paper looks at the effect of a very common preprocessing step used to transform an n \times k data matrix \mathbf{X} into a form acceptable for statistical or machine learning algorithms that assume things like zero-mean or bounded vectors. Here n may represent the number of individuals, and k the number of features, for example. Or the data may come from a DNA microarray (their motivating example). This preprocessing is often done without much theoretical justification — the mathematical equivalent of tossing spilled salt over your shoulder. The paper studies the limiting process of standardizing rows, then columns, then rows, then columns, again and again. They further need that n, k \ge 3. “Readers will see that the process and perhaps especially the mathematics that underlies it are not as simple as we had hoped they would be.”

So what exactly is the preprocessing? I am going to describe things in pseudocode (too lazy to do real code, sorry). Given a data matrix X[i,j] they look at

for i = 1:n { X[i,1:k] = X[i,1:k] - mean(X[i,1:k]) }
for j = 1:k { X[1:n,j] = X[1:n,j] - mean(X[1:n,j]) }

They call the first a “row mean polish” and the second a “column mean polish.” They show this converges in one step: after the row polish every row has mean zero, and since the overall mean is then zero, the column polish zeroes the column means without disturbing the row means.

But what about standardizing? The more complicated polishing procedure looks like this:

for i = 1:n {
mu = mean(X[i,1:k])
sigma = sqrt( mean( (X[i,1:k] - mu)^2 ) )
X[i,1:k] = (X[i,1:k] - mu)/sigma
}
for j = 1:k {
mu = mean(X[1:n,j])
sigma = sqrt( mean( (X[1:n,j] - mu)^2 ) )
X[1:n,j] = (X[1:n,j] - mu)/sigma
}

This standardizes rows first, and then columns (or “lather” and “rinse,” since we are going to “repeat”). They call this Efron’s algorithm because he told them about it. So what happens if we repeat these two steps over and over again on a matrix with i.i.d. entries from some continuous distribution?

Theorem 4.1 Efron’s algorithm converges almost surely for X on a Borel set of entries with complement a set of Lebesgue measure 0.

So what does it look like in practice? How fast is this convergence? Empirically, it looks exponential, and they have some theoretical guarantees in the paper, kind of hidden in the discussion. The proofs are not terribly ornate but are tricky, and I don’t quite get all the details myself, but I figured readers of this blog would certainly be interested in this cute result.
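
Here is a quick numerical look at that convergence (my own NumPy toy; I standardize using the population standard deviation, which differs from the sum-of-squares normalization above only by constant factors):

import numpy as np

def polish(X):
    # one round: standardize each row to mean 0 / sd 1, then each column
    X = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
    X = (X - X.mean(axis=0, keepdims=True)) / X.std(axis=0, keepdims=True)
    return X

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))  # n = 20, k = 5, both at least 3
for t in range(1, 31):
    Xnew = polish(X)
    change = np.abs(Xnew - X).max()
    X = Xnew
    if t % 5 == 0:
        print(t, change)

For me the maximum entrywise change shrinks rapidly with each round, consistent with the exponential-looking convergence described above.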

[1] A fun quote from the paper “Theorem 4.1 of [2] is false. A claimed backwards martingale is NOT. Fortunately, all that seems damaged by the mistake is pride. Much is true.” I really liked this.