# Tracks : Wiping the mat

A mix for Celeste LeCompte. I always come back to some tracks, but this has more new songs than old, I’d say.

1. Rain on a tin roof
2. The Evil One – James Blood Ulmer
3. No One Gonna Honor Kill My Baby (But Me) – The Kominas
4. Chaal Baby – Red Baraat
5. American Dreamin’ – Jay-Z & Music Without Borders
6. Desafinado – Ryuichi Sakamoto and Paula Morelenbaum/Jobim
7. Survive It – Ghostpoet
8. The Part You Throw Away – Tom Waits
9. The Green Pastures – William Tyler
10. 1445 Blue Lead Fences – Loch Lomond
11. If I Had A Million Dollars – Miss Erika
12. Cornbread And Butterbeans (Album Version) – Carolina Chocolate Drops
13. Just squeeze me – Ella Fitzgerald
14. All Night Long – Pert Near Sandstone
15. Kithkin – Ampersand
16. You Go Running – Deep Sea Diver

I got an email today from Elsevier:

It is our pleasure to inform you that your publication has been cited in a journal published by Elsevier.

Through this unique service we hope we can offer you valuable information, and make you aware of publications in your research area.

The service is called CiteAlert. It sends you an email every time you’re cited!

Clearly, it’s little touches like this that justify the price gouging and subscription bundling. Kind of like the little chocolate on your pillow at the expensive hotel.

# The Early History of the SVD

I recently read G.W. Stewart’s little paper On the Early History of the Singular Value Decomposition (free tech report version is at UMD). It talks about how Beltrami, Jordan, Sylvester, Schmidt, and Weyl all had different approaches to finding/proving the SVD. It’s worth a quick skim, because goodness knows it appears everywhere under all sorts of names. Part of the problem is characterizing the SVD, and the other is calculating it. Since numerical analysis was never part of my training, I don’t have as much sophisticated appreciation for the algorithmic aspects, but I certainly benefit from having efficient solvers.

One point Stewart makes is that we really shouldn’t call the approximation theorem for the SVD the Eckart-Young Theorem, since Schmidt was really the one who showed it much earlier in the context of “integral equations, one of the hot topics of the first decades of our [the 20th] century.” I’ve been guilty of this in the past, so it’s time for me to make amends. I suppose I’d better start saying Cauchy-Bunyakovsky-Schwarz too.
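For reference, here is the approximation theorem in its usual modern matrix form. This is a sketch in today's notation, not Schmidt's integral-equation setting, and the symbols $A$, $B$, $\sigma_i$, $u_i$, $v_i$, and $A_k$ are introduced only for this statement. Suppose $A = \sum_{i=1}^{r} \sigma_i u_i v_i^T$ is an SVD with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$, and let $A_k = \sum_{i=1}^{k} \sigma_i u_i v_i^T$ keep only the top $k$ terms. Then the truncation is a best rank-$k$ approximation in the Frobenius norm $\| \cdot \|_F$:

$\min_{\mathrm{rank}(B) \le k} \| A - B \|_F = \| A - A_k \|_F = \sqrt{ \sigma_{k+1}^2 + \cdots + \sigma_r^2 }$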

What was weird to me, as an (erstwhile?) signal processor, is that there was not much mention of the Karhunen–Loève transform, even in the little paragraphs on “principal components.”

Links to videos and a special chair.

James Baldwin debates William F. Buckley, Jr. I’ve only seen part of it so far, but it’s pretty interesting (via Ta-Nehisi Coates).

I’ve heard quite a bit about the treatment of agricultural workers in Florida, particularly in tomato farming, but this video with a representative of the Coalition of Immokalee Workers is a good introduction to what is going on there (via Serious Eats). The book Tomatoland is on my reading list.

I didn’t know the origin of the term swizzle-stick until now.

I’m a big fan of Cowboy Bebop, and Shinichiro Watanabe has a new show out called Sakamichi no Apollon (via MeFi). I watched the first episode, and the Art Blakey album Moanin’ features prominently, so I think I’m going to like this show quite a bit. It’s being streamed in an ad-heavy format on Crunchyroll.

That’s a lot of pendulums. That’s right, pendulums.

Why don’t you relax a little in the bear chair?

# A.D. Sarwate’s Own Tangelo Bitters

A couple of weeks ago I started a batch of Tangelo bitters, using a couple of recipes I cobbled together from the web as guidelines. To be honest, I can’t even remember which recipes I used, but the closest one is the Serious Eats version. I had not yet obtained the book Bitters, but I figured it would be a fun experiment and I could always foist off the resulting stuff on my friends. The recipe uses two infusions — spices and peel into clear liquor and bittering agents into rye.

# Juking the stats in academic publishing

I heard recently of a case where someone got a paper back with revisions requested, and a deadline for said revisions. They ended up asking for a week extension, but then the journal said they would have to do a fresh submission and redo the whole review cycle. I found this baffling — but then that person pointed out that the journal has built a reputation on fast turnaround times, and so to keep their “sub-to-pub” numbers low, they don’t want to give any extensions to the authors. It’s better to do a resubmission than to continue with the same “paper ID” in the system.

This is a classic example of juking the stats:

I just got a rejection from KDD 2012 which smacks of the same ominous reasoning:

We try to notify authors once a decision on a submission is concretely made, and hope that the early notifications can reduce the average review turn-over time.

But the real kicker is that “due to technical constraints” they can’t give us the reviews until May 4th. So I’m not really sure what I am supposed to do with this information — I can’t really start on revisions without the reviews, so this “early notification” thing is really just to make them feel better about themselves, it seems. Or perhaps they can then report that the reviewing was “more efficient.”

In any case, no harm is done, per se. But optimizing metrics like “sub-to-pub” seems to be as misguided as teaching to the test. What do we really want out of our peer review process? Or should we abandon it?

# Manuscript Central is annoying

The IEEE Transactions on Information Theory recently transitioned to using Manuscript Central from the old Pareja system, so now all of the IEEE journals for which I review seem to be managed by the same external management system. As a reviewer/author, I have a lot of complaints (small and large) about Manuscript Central:

• Why oh why do I need to disable my popup blocker for your site to work?
• Why can login information not be shared across different IEEE publications? I have a separate account for each journal, with a separate password. Thank goodness I have LastPass, but even that program gets confused sometimes.
• What is the deal with the mandatory subject classifications for papers? One of the “topics” I could pick was “IEEE Transactions on Information Theory.” Really? That’s a topic?
• Why must papers for review be emblazoned with that stupid pale blue “For Peer Review Only” running diagonally across each page? This causes PDF annotations such as highlighting to barf, making paperless reviewing of papers significantly more annoying than it needs to be.

The worst part is that I am sure IEEE could implement a significantly cheaper and just-as-effective system itself, but now each Society is forking over money to Manuscript Central, which as far as I can tell, offers significantly more annoyances for authors and reviewers and is a shoddy product. Perhaps as an editor it’s significantly better (I imagine it is), but it seems like a bad deal overall.

Of course, now I sound curmudgeonly. Get off my lawn!

Do other people like MC? Or do you have other pet peeves?

# In The Family

On Saturday evening I saw In The Family at the Asian American Showcase. It’s a film by Patrick Wang, who I may have last seen in a production of Grand Hotel at MIT when I was just starting college. It’s a film that is definitely worth seeing — an affecting and truthful story, it may make you tear up at times. It will also make you believe that a deposition can be the most important moment in a person’s life.

The trailer for the movie is here:

The synopsis says

In the town of Martin, Tennessee, Chip Hines, a precocious six year old, has only known life with his two dads, Cody and Joey. And a good life it is. When Cody dies suddenly in a car accident, Joey and Chip struggle to find their footing again. Just as they begin to, Cody’s will reveals that he named his sister as Chip’s guardian. The years of Joey’s acceptance into the family unravel as Chip is taken away from him. In his now solitary home life, Joey searches for a solution. The law is not on his side, but friends are. Armed with their comfort and inspired by memories of Cody, Joey finds a path to peace with the family and closer to his son.

The trailer starts almost towards the end of the film, and I think it doesn’t really show the things which are the most beautiful about it. There is a scene after Cody’s funeral when Joey and Chip return to the house, shocked. Joey sits at the kitchen table, and Chip (where do they get these child actors? The kid is amazing!) has a long silent scene in which he gets the mail, climbs on the step stool, gets a glass, gets the Coke from the fridge, pours himself some, gets his dad a beer, opens the beer with some effort, then clinks the bottle and glass for cheers. That is what snaps Joey out of it, and he starts sorting the mail. This is what I mean by a truthful scene: in the face of trauma and loss, at some point we go on, as Beckett might say. Watching those moments is important.

The film is almost 3 hours long, but it’s worth it, because it shows you that kind of truth. Moment by moment. You get to understand what is at stake in this story, why Cody and Chip mean so much to Joey. It’s a beautiful debut film. It was rejected from a number of festivals, but they are self-distributing it, and hopefully it’s going to appear soon in a venue near you. Do try to see it; it will move you.

# Cover’s test for the irrationality of a coin

Someone hands you a coin which has a probability $p$ of coming up heads. You can flip the coin as many times as you like (or more precisely, you can flip the coin an infinite number of times). Let $S = \{r_i : i = 1, 2, \ldots\}$ be the set of rational numbers in $[0,1]$. After each flip, you have to guess one of the following hypotheses: that $p = r_i$ for a particular $i$, or $p$ is irrational. Furthermore, you can only make a finite number of errors for any $p \in [0,1] - N_0$, where $N_0$ is a set of irrationals of Lebesgue measure 0. Can you do it? If so, how?

This is the topic addressed by a pair of papers that Avrim Blum mentioned in Yoav Freund’s remembrances of Tom Cover:

COVER, THOMAS M (1973). On determining the irrationality of the mean of a random variable. Ann. Math. Statist. 1, 862-871.
COVER & HIRSCHLER (1975). A finite memory test of the irrationality of the parameter of a coin. Annals of Statistics, 939-946

I’ll talk about the first paper in this post.

The algorithm is not too complicated — you basically go in stages. For each time $j = 1, 2, \ldots$ you have a function $n(j)$. Think of $n(j)$ as piecewise constant. There are two sequences: a threshold $k_{n(j)}$, and an interval width $\delta_{n(j)}$.

1. Take the sample mean $\hat{x}_{n(j)}$ and look at an interval of width $2 \delta_{n(j)}$ centered on it. Note that this makes the same decision for each $j$ until $n(j)$ changes.
2. Given an enumeration of the set $S$, find the smallest $i$ such that $r_i \in [\hat{x}_{n(j)} - \delta_{n(j)}, \hat{x}_{n(j)} + \delta_{n(j)}]$.
3. If this smallest $i$ satisfies $i < k_{n(j)}$, then declare $p = r_i$; otherwise declare $p \notin S$.
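To make the stages concrete, here is a minimal Python sketch of the procedure. Everything specific in it is my own assumption for illustration and not the choice made in the paper: the enumeration of $S$ by increasing denominator, the checkpoints at powers of two (so $n(j)$ is constant between them), the width $\delta_n = \sqrt{\log \log n / n}$, and the threshold $k_n = \lfloor \log_2 n \rfloor$. The function names are hypothetical too.

```python
import math
import random
from fractions import Fraction
from itertools import islice


def rationals_in_unit_interval():
    """Enumerate the rationals in [0, 1]: 0, 1, 1/2, 1/3, 2/3, 1/4, 3/4, ...

    This ordering (by increasing denominator) is an assumed enumeration of S.
    """
    yield Fraction(0)
    yield Fraction(1)
    q = 2
    while True:
        for a in range(1, q):
            if math.gcd(a, q) == 1:
                yield Fraction(a, q)
        q += 1


def guess(sample_mean, delta, k):
    """Return the first r_i with i < k (1-indexed) within delta of the
    sample mean, or None to declare that p is irrational."""
    for r in islice(rationals_in_unit_interval(), max(k - 1, 0)):
        if abs(r - sample_mean) <= delta:
            return r
    return None


def run_test(p, num_flips, seed=0):
    """Flip a p-coin num_flips times and record the guess at each checkpoint.

    Checkpoints, delta, and k below are illustrative assumptions only.
    """
    rng = random.Random(seed)
    heads = 0
    decisions = []
    next_checkpoint = 16
    for n in range(1, num_flips + 1):
        heads += rng.random() < p
        if n == next_checkpoint:
            delta = math.sqrt(math.log(math.log(n)) / n)  # assumed width
            k = int(math.log2(n))                         # assumed threshold
            decisions.append((n, guess(heads / n, delta, k)))
            next_checkpoint *= 2
    return decisions
```

Calling `run_test(0.5, 100000)` returns a list of `(n, guess)` pairs, one per checkpoint, where a guess of `None` stands for "irrational." With scalings this crude there is no guarantee of finitely many errors; the sketch only shows how the threshold and the interval width interact.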

The last thing to do is pick all of these scalings. This is done in the paper (I won’t put it here), but the key thing to use is the law of the iterated logarithm (LIL), which I never really had a proper appreciation for prior to this. For $\epsilon > 0$,

$| \hat{x}_n - p | \le (1 + \epsilon) \sqrt{ \frac{2 p (1 - p) \log \log n}{n} }$

for all but finitely many values of $n$, with probability one. This gets used to set the interval width $\delta_{n(j)}$.

The cool thing to me about this paper is that it’s an example of “letting the hypothesis class grow with the data.” We’re trying to guess if the coin parameter $p$ is rational and if so, which rational. But we can only apprehend a set of hypotheses commensurate with the data we have, so the threshold $k_{n(j)}$ limits the “complexity” of the hypotheses we are willing to consider at time $j$. The LIL sets the threshold for us so that we don’t make too many errors.

There are lots of little extensions and discussions about the rationality of physical constants, testing for rationality by revealing digits one by one, and other fun ideas. It’s worth a skim for some of the readers of this blog, I’m sure. A miscellaneous last point: Blackwell suggested a Bayesian method for doing this (mentioned in the paper) using martingale arguments.