Not really the digital divide

I started my new job here at TTI Chicago this fall and I’ve been enjoying the fact that TTI is partnered up with the University of Chicago: I get access to the library, a slightly better rate at the gym (still got to get on that), and some other perks. However, U of C doesn’t have an engineering school, so the library has a pretty minimal subscription to IEEE Xplore. That leaves me in a bit of a predicament. I’m a member of some of the IEEE societies, so I can get access to those Transactions, but otherwise I have to work a bit harder to get access to some papers. So far it hasn’t proved to be a problem, but I think I might run into a situation like the one recently mentioned by David Eppstein.

A creepy but prescient quote

… statistical research accompanies the individual through his entire earthly existence. It takes account of his birth, his baptism, his vaccination, his schooling and the success thereof, his diligence, his leave of school, his subsequent education and development; and, once he becomes a man, his physique and his ability to bear arms. It also accompanies the subsequent steps of his walk through life; it takes note of his chosen occupation, where he sets up his household and his management of the same; if he saved from the abundance of his youth for his old age, if and when and at what age he marries and who he chooses as his wife — statistics looks after him when things go well for him and when they go awry. Should he suffer a shipwreck in his life, undergo material, moral or spiritual ruin, statistics take note of the same. Statistics leaves a man only after his death — after it has ascertained the precise age of his death and noted the causes that brought about his end.

Ernst Engel, 1862

The basketball strike and some confusing lingo

Via Deadspin I saw this AP article on the latest twist in the NBA labor dispute and this tweet from columnist Adrian Wojnarowski : “The chances of losing the entire 2011-12 season has suddenly become the likelihood.” Assuming we correct to “likelihood,” what does this mean from a statistical standpoint? Is this frequentist analysis of a Bayesian procedure? Help me out folks…

HGR maximal correlation and the ratio of mutual informations

From one of the presentations by Zhao and Chia at Allerton this year, I was made aware of a paper by Elza Erkip and Tom Cover on “The efficiency of investment information” that uses one of my favorite quantities, the Hirschfeld–Gebelein–Rényi maximal correlation; I first discovered it in this gem of a paper by Witsenhausen.

The Hirschfeld–Gebelein–Rényi maximal correlation \rho_m(X,Y) between two random variables X and Y is

\sup_{f \in \mathcal{F}_X, g \in \mathcal{G}_Y} \mathbb{E}[ f(X) g(Y) ]

where \mathcal{F}_X is the set of all real-valued functions such that \mathbb{E}[ f(X) ] = 0 and \mathbb{E}[ f(X)^2 ] = 1, and \mathcal{G}_Y is the set of all real-valued functions such that \mathbb{E}[ g(Y) ] = 0 and \mathbb{E}[ g(Y)^2 ] = 1. It’s a cool measure of dependence that covers discrete and continuous variables, since they all get passed through these “normalizing” f and g functions.
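For finite alphabets the supremum can be computed directly. A standard fact (it is in the Witsenhausen paper) is that \rho_m(X,Y) is the second-largest singular value of the matrix with entries P(x,y) / \sqrt{P(x) P(y)}; the largest singular value is always 1 and corresponds to constant f and g. Here is a minimal sketch of that computation, assuming NumPy is available. The function name `hgr_maximal_correlation` and the example distributions are mine, purely for illustration.

```python
import numpy as np

def hgr_maximal_correlation(joint):
    """HGR maximal correlation of a finite joint pmf.

    `joint` is a 2-D array with joint[x, y] = P(X = x, Y = y).
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()            # normalize defensively
    px = joint.sum(axis=1)                 # marginal of X
    py = joint.sum(axis=0)                 # marginal of Y
    # Drop symbols with zero probability so the scaling below is well defined.
    joint = joint[px > 0][:, py > 0]
    px, py = px[px > 0], py[py > 0]
    # Q[x, y] = P(x, y) / sqrt(P(x) P(y)); its top singular value is 1.
    q = joint / np.sqrt(np.outer(px, py))
    singular_values = np.linalg.svd(q, compute_uv=False)  # sorted descending
    return float(singular_values[1]) if singular_values.size > 1 else 0.0

# Fair bit Y, and X equal to Y flipped with probability 0.1.
dsbs = np.array([[0.45, 0.05],
                 [0.05, 0.45]])
# Independent X and Y: the joint pmf is an outer product of marginals.
independent = np.outer([0.3, 0.7], [0.2, 0.5, 0.3])

print(hgr_maximal_correlation(dsbs))         # approximately 0.8 = 1 - 2 * 0.1
print(hgr_maximal_correlation(independent))  # approximately 0
```

For the binary symmetric example the answer matches the ordinary correlation coefficient, 1 - 2p, and for independent variables it is zero up to floating-point error.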

The fact in the Erkip-Cover paper is this one:

\sup_{ P(z|y) : Z \to Y \to X } \frac{I(Z ; X)}{I(Z ; Y)} = \rho_m(X,Y)^2.

That is, the square of the HGR maximal correlation is the best (or worst, depending on your perspective) ratio of the two sides in the Data Processing Inequality:

I(Z ; Y) \ge I(Z ; X).

It’s a bit surprising to me that this fact is not better known. Perhaps it’s because the “data processing” is happening at the front end here (by choosing P(z|y)) and not in the actual data processing Y \to X, which is given to you.
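As a sanity check on one easy case, take Y to be a fair bit, X to be Y passed through a binary symmetric channel with crossover p, and Z to be Y passed through another binary symmetric channel with crossover q. Then \rho_m(X,Y) = 1 - 2p, I(Z ; Y) = 1 - h(q), and I(Z ; X) = 1 - h(p(1-q) + q(1-p)), where h is the binary entropy. The sketch below (function names are mine, and it only explores this one family of P(z|y), so it says nothing about the general statement) shows the ratio creeping up toward \rho_m^2 as Z gets noisier.

```python
import math

def binary_entropy(p):
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def mi_ratio(p, q):
    """I(Z;X) / I(Z;Y) for fair Y, X = BSC_p(Y), Z = BSC_q(Y).

    Requires q != 0.5, since otherwise I(Z;Y) = 0 and the ratio is undefined.
    """
    # Z -> Y -> X is a cascade of two BSCs, itself a BSC with this crossover.
    crossover_zx = p * (1 - q) + q * (1 - p)
    i_zy = 1.0 - binary_entropy(q)
    i_zx = 1.0 - binary_entropy(crossover_zx)
    return i_zx / i_zy

p = 0.1
rho_squared = (1 - 2 * p) ** 2   # squared maximal correlation for this pair
for q in (0.0, 0.25, 0.4, 0.49, 0.499):
    print(f"q = {q:5.3f}   ratio = {mi_ratio(p, q):.6f}   rho^2 = {rho_squared:.6f}")
```

In this example the ratio stays below \rho_m^2 and approaches it only as q tends to 1/2, that is, when Z carries almost no information about Y.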

Readings

Tiassa, by Steven Brust. As Cosma puts it, mind candy, and only worth reading if you’ve read the other 10 books in the series. Quite enjoyable, however.

Kraken, by China Miéville. A rollicking adventure involving a giant squid, horrific monsters and gruesome deaths, a dark underbelly of London, the end of the world, and… a ghost piggie. Among other things. I enjoyed it.

Hindoo Holiday, by J.R. Ackerley. A travelogue of a gay Englishman who becomes an attaché to a gay Raja in a princely state in the early 20th century. Often full of colonial condescension (though in a light tone) about things Indian. Most of us are tragically sad or buffoonish. The homosexuality is not overt but explicit enough that the book was censored when published. Still, it’s an interesting historical read, just because it is so weird.

The Lost Promise of Civil Rights, by Risa Goluboff. A really fascinating book about the history of civil rights litigation in the US from Lochner to Brown. The term “civil rights” was in a state of flux during that era, transitioning from a labor-based understanding to a discrimination-based one. The main players were the Justice Department’s Civil Rights Section and the NAACP. By choosing which cases to pursue and which arguments to advance, they explored different visions of what civil rights could mean and why they were rights in the first place. In particular, the NAACP did not take on many labor cases because they were actively pursuing a litigation agenda that culminated in Brown. The decision in Brown and subsequent decisions shaped our modern understanding of civil rights as grounded in stopping state-sanctioned discrimination. However, the “lost promise” in the title refers to what was lost in this strategy: the state-sponsored parts of Jim Crow were taken down, but the social institutions that entrench inequality were left.

The Devil in the White City, by Erik Larson. I had to read this since I just moved to Chicago and I work right near Jackson Park. This was a very engaging read (Larson just has that “style”) but a bit creepy in that “watched too many episodes of Dexter” way. I enjoyed it a little less than Thunderstruck, but I had more professional attachment to that one.

Interesting facts about Rényi

Perhaps I should say “facts” since I got them from his Wikipedia page (corroborated by more official accounts):

  • “He was prevented from enrolling in university in 1939 due to the anti-Jewish laws then in force, but enrolled at the University of Budapest in 1940 and finished his studies in 1944. At this point he was drafted to forced labour service, escaped, and completed his Ph.D. in 1947 at the University of Szeged, under the advisement of Frigyes Riesz.” Hardcore!
  • Rényi said “A mathematician is a device for turning coffee into theorems,” not Erdős.
  • He passed away at the age of 48!

Banff blog

I figured I would blog about this week’s workshop at Banff in a more timely fashion. Due to the scheduling of flights out of Calgary, I will have to miss the last day of talks. The topics of people’s presentations varied rather widely, and many were not about a Good-Turing-style estimator setup. Sometimes it was a bit hard to see how the problems or approaches were related (not that they had to be directly), and given that the crowd had widely varying backgrounds, presenters had a hard time because the audience had to take in a new set of notation or a new approach for every talk. The advantage is that there were lots of questions; the disadvantage is that people insisted on “finishing” their presentations. By mid-week my brain was over-full, and a Wednesday afternoon hike up Sulphur Mountain was the perfect solution.

The view from Sulphur Mountain

Banfffffffffffffff

I’ve just arrived in chilly but beautiful Banff for a workshop on Information theory and statistics for large alphabets. I’m looking forward to it, although I will have to miss the last day due to the timing of flights out of Calgary that get me to Chicago before midnight. My itineraries there and back seem especially perverse: ORD-SEA-YYC and YYC-SFO-ORD. However, thanks to the new gig I have a new laptop with a functional battery, so I am doing a bit more busy-work and less New Yorker reading on the plane. I might try to write a bit more about the topics in the workshop. Although the topic seems focused, there is a wide range of approaches and angles to take on the problem of estimating probabilities/prevalences in situations where you may not get to see each outcome once. Certainly I hope I can get the journal version of a paper from last year’s Allerton squared away.