ITA Workshop 2012 : More Talks

There are a few more talks to blog about, probably, but I am getting lazy. One I wanted to mention was Max’s, but he has already blogged a lot of it himself. I still don’t get what the “Herbst argument” is, though.

Vinod Prabhakaran gave a talk about indirect decoding in the 3-receiver broadcast channel. In indirect decoding, there is a “semi-private” message that is not explicitly decoded by the third receiver. However, Vinod argued that this receiver can decode it anyway, so the indirectness is not needed, somehow. At least, that’s how I understood the talk.

Lalitha Sankar talked about two different privacy problems that could arise in “smart grid” or power monitoring situations. The first is a model of independent system operators (ISOs) and how to view the sharing of load information — there was a model of K different “sources” or states being observed through a channel that looked like an AWGN fading interference channel, where the fading represents the relative influence of the source (or load on the network) on the receiver (or ISO). She didn’t quite have time to go into the second model, which was more at the level of individual homes, where short-time-scale monitoring of loads can reveal pretty much all the details of what’s going on in a house. The talk was a summary of some recent papers available on her website.

Negar Kiyavash talked about timing side-channel attacks — an adversary can ping your router and, from the delays in the round-trip times, learn pretty much which websites you are surfing. Depending on the queueing policy, the adversary can learn more or less about you. Negar showed that first-come first-served (FCFS) is terrible in this regard, and there is a bit of a tradeoff wherein policies with higher delay offer more privacy. This seemed reminiscent of the work Parv did on Chaum mixing…

Lav Varshney talked about security in RFID — the presence of an eavesdropper actually detunes the RFID circuit, so it may be possible for the encoder and decoder to detect if there is an eavesdropper. The main challenge is that nobody knows the transfer function, so it has to be estimated (using a periodogram energy detector). Lav proposed a protocol in which the transmitter sends a key and the receiver tries to detect whether there is an eavesdropper; if none is detected, the transmitter then sends the message.

Tsachy Weissman talked about how to estimate directed mutual information from data. He proposed a number of estimators of increasing complexity and showed that they were consistent. The basic idea was to leverage all of the results on universal probability estimation for finite alphabets. It’s unclear to me how to extend some of these results to the continuous setting, but this is an active area of research. I saw a talk recently by John Lafferty on forest density estimation, and this paper on estimating mutual information also seems relevant.
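A toy version of the finite-alphabet plug-in idea (my own sketch for ordinary mutual information, not the directed-information estimators from the talk): estimate the joint pmf from empirical counts and evaluate the mutual information formula on it. For finite alphabets this plug-in estimate is consistent.

```python
import numpy as np

def plugin_mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in bits from paired samples over
    finite alphabets: estimate the joint pmf by empirical frequencies
    and evaluate the mutual information formula on it."""
    x = np.asarray(x)
    y = np.asarray(y)
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)   # empirical joint counts
    joint /= joint.sum()              # empirical joint pmf
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    mask = joint > 0                  # 0 log 0 = 0 convention
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (px @ py)[mask])))
```

For perfectly correlated uniform binary samples this returns 1 bit; for independent ones it returns 0.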

Linkage

It’s been a while since I’ve posted, and I am going to try to post more regularly now, but as usual, things start out slowly, so here are some links. I’ve been working on massaging the schedule for the 2012 ITA Workshop (registration is open!) as well as some submissions for KDD (a first for me) and ISIT (since I skipped last year), so things are a bit hectic.

Chicago Restaurant Week listings are out, for the small number of you readers who are in Chicago. Some history on the Chicago activities of CORE in the 40s.

Via Andrew Gelman, a new statistics blog.

A paper on something called Avoidance Coupling, which I want to read sometime when I have time again.

Our team, Too Big To Fail, finished second in the 2012 MIT Mystery Hunt. There were some great puzzles in there. In particular, Picture An Acorn was awesome (though I barely looked at it), and Slash Fiction was a lot of fun (and nostalgia-inducing. Ah, Paris!). Erin has a much more exhaustive rundown.

Readings

I anticipate I will be doing a fair bit more reading in the future, due to the new job and personal circumstances. However, I probably won’t write more detailed notes on the books. This blog should be a rapidly mixing random walk, after all.

Embassytown (China Miéville) : a truly bizarre novel set on an alien world on which humans have an Embassy but can only communicate with the local aliens in a language which defies easy description. Ambassadors come in pairs, as twins — to speak with the Ariekei they must both simultaneously speak (in “cut” and “turn”). The Ariekei’s language does not allow lying, and they have contests in which they try to speak falsehoods. However, events trigger a deadly change (I don’t want to give it away). Philosophically, the book revolves a lot around how language structures thought and perception, and it’s fascinating if you like to think about those things.

Chop Suey: A Cultural History of Chinese Food in the United States (Andrew Coe) : a short but engaging read about how Chinese food came to the US. The book really starts with Americans in China and their observations on Chinese elite banquets. A particular horror was that the meat came already chopped up — no huge roasts to carve. Chapter by chapter, Coe takes us from the railroad era through the ’20s, the mass-marketing of Chinese food and the rise of La Choy, through Nixon going to China. The book is full of fun tidbits and made my flights to and from Seattle go by quickly.

The Thousand Autumns of Jacob de Zoet: A Novel (David Mitchell) : I really love David Mitchell’s writing, but this novel was not my favorite of his. It was definitely worth reading — I devoured it — but the subject matter is hard. Jacob de Zoet is a clerk in Dejima, a Dutch East Indies trading post in 19th century Japan. There are many layers to the story, and more than a hint of the grotesque and horrific, but Mitchell has an attention to detail and a mastery with perspective that really makes the place and story come alive.

Air (Geoff Ryman) : a story about technological change, issues of the digital divide, economic development, and ethnic politics, set in a village in fictional Karzistan (looks like Kazakhstan). Air is like having mandatory Internet in your brain, and is set to be deployed globally. During a test run in the village, Chung Mae, a “fashion expert,” ends up deep into Air and realizes that the technology is going to change their lives. She goes about trying (in a desperate, almost mad way) to tell her village and bring them into the future before it overwhelms them. There’s a lot to unpack here, especially in how technology is brought to rural communities in developing nations, how global capital and the “crafts” market impact local peoples, and the dynamics of village social orders. It’s science fiction, but not really.

The Theory That Would Not Die: How Bayes’ Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy (Sharon Bertsch McGrayne) : an engaging read about the history of Bayesian ideas in statistics. It reads a bit like an us vs. them, the underdog story of how Bayesian methods have overcome terrible odds (prior beliefs?) to win the day. I’m not sure I can give it as enthusiastic a review as Christian Robert, but I do recommend it as an engaging popular nonfiction read on this slice in the history of modern statistics. In particular, it should be entertaining to a general audience.

Dangerous Frames: How Ideas about Race and Gender Shape Public Opinion (Nicholas J.G. Winter) : the title says most of it, except it’s mostly about how ideas about race and gender shape white public opinion. The basic theoretical structure is that we carry schemas that help us interpret issues, like a race schema or a gender schema. Then there are frames or narratives in which issues are put. If the schema is “active” and an issue is framed in a way that is concordant with the schema, then people’s opinions follow the schema, even if the issue is not “about” race or gender. This is because people reason analogically, so they apply the schema if it matches. To back up the theory, Winter has some experiments, both of the undergrads-doing-psych-studies type and analyses of survey data, to show that by reframing certain issues, people’s “natural” beliefs can be skewed by the schema that they apply. The schemas he discusses are mostly those of white Americans, so the book feels like a bit of an uncomfortable read because he doesn’t really interrogate the somewhat baldly racist schemas. The statistics, as with many psychological studies, leave something to be desired — I take the effects he notices at a qualitative level (as does he, sometimes).

Notes on stable distributions

After attending a recent talk at TTI about dimension reduction by Moses Charikar, in which he mentioned the special role stable distributions play, I made a note to freshen up my own scattershot knowledge of facts about stable distributions. Of course, things got too busy and the note ended up on my sub-list of to-do items that get infinitely postponed. However, I’ve been saved by a recent post to ArXiV by Svante Janson, who does all sorts of interesting work on these cool objects called graphons (the limits of sequences of dense graphs):

Stable Distributions
Svante Janson

We give some explicit calculations for stable distributions and convergence to them, mainly based on less explicit results in Feller (1971). The main purpose is to provide ourselves with easy reference to explicit formulas. (There are no new results.)

All (or at least most) of the facts I wanted in one place! Hooray!

He starts with infinitely divisible distributions (e.g. Gaussian, Poisson, Gamma) and then talks about \alpha-stable distributions and the uniqueness of the corresponding measures for \alpha \in (0,2] (the case \alpha = 2 gives the Gaussian). I’m still reading it (bits at a time), but it’s great to have little surveys like this — broadens the mind, builds character, &c.
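As a quick numerical sanity check of the stability property (my own, not from Janson’s note): the standard Cauchy distribution is the symmetric \alpha = 1 stable law, so the average of two iid standard Cauchy variables is again standard Cauchy.

```python
import numpy as np
from scipy import stats

# Stability for alpha = 1: if X1, X2 are iid standard Cauchy,
# then (X1 + X2)/2 is again standard Cauchy.
rng = np.random.default_rng(0)
n = 100_000
x1 = rng.standard_cauchy(n)
x2 = rng.standard_cauchy(n)
avg = (x1 + x2) / 2

# Kolmogorov-Smirnov distance between the empirical cdf of the
# averages and the standard Cauchy cdf should be on the order of
# 1/sqrt(n), i.e. tiny.
ks = stats.kstest(avg, stats.cauchy.cdf).statistic
print(ks)
```

Running the same check with Gaussians (\alpha = 2) requires rescaling by \sqrt{2} instead of 2, which is exactly the \alpha-dependence in the definition of stability.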

The basketball strike and some confusing lingo

Via Deadspin I saw this AP article on the latest twist in the NBA labor dispute and this tweet from columnist Adrian Wojnarowski : “The chances of losing the entire 2011-12 season has suddenly become the likelihood.” Assuming we correct to “likelihood,” what does this mean from a statistical standpoint? Is this frequentist analysis of a Bayesian procedure? Help me out folks…

HGR maximal correlation and the ratio of mutual informations

From one of the presentations by Zhao and Chia at Allerton this year, I was made aware of a paper by Elza Erkip and Tom Cover on “The efficiency of investment information” that uses one of my favorite quantities, the Hirschfeld–Gebelein–Rényi maximal correlation; I first discovered it in this gem of a paper by Witsenhausen.

The Hirschfeld–Gebelein–Rényi maximal correlation \rho_m(X,Y) between two random variables X and Y is

\sup_{f \in \mathcal{F}_X, g \in \mathcal{G}_Y} \mathbb{E}[ f(X) g(Y) ]

where \mathcal{F}_X is the set of all real-valued functions such that \mathbb{E}[ f(X) ] = 0 and \mathbb{E}[ f(X)^2 ] = 1, and \mathcal{G}_Y is the set of all real-valued functions such that \mathbb{E}[ g(Y) ] = 0 and \mathbb{E}[ g(Y)^2 ] = 1. It’s a cool measure of dependence that covers discrete and continuous variables, since they all get passed through these “normalizing” f and g functions.
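For finite alphabets there is a handy way to compute \rho_m (a standard linear-algebra characterization, not spelled out above): it is the second-largest singular value of the matrix with entries P(x,y) / \sqrt{P(x) P(y)}. A small sketch:

```python
import numpy as np

def hgr_maximal_correlation(joint):
    """HGR maximal correlation for a finite joint pmf `joint`
    (rows indexed by x, columns by y), via the SVD characterization:
    rho_m is the second-largest singular value of
    Q[x, y] = P(x, y) / sqrt(P(x) P(y))."""
    joint = np.asarray(joint, dtype=float)
    px = joint.sum(axis=1)
    py = joint.sum(axis=0)
    q = joint / np.sqrt(np.outer(px, py))
    s = np.linalg.svd(q, compute_uv=False)  # s[0] = 1 always
    return s[1]

# Doubly symmetric binary source: Y uniform, X = Y flipped with
# probability p, for which rho_m = 1 - 2p.
p = 0.1
joint = np.array([[(1 - p) / 2, p / 2],
                  [p / 2, (1 - p) / 2]])
print(hgr_maximal_correlation(joint))  # ~0.8
```

The top singular value of Q is always 1 (with singular vectors \sqrt{P(x)} and \sqrt{P(y)}), which is exactly why the definition needs the zero-mean constraint on f and g.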

The fact in the Erkip-Cover paper is this one:

\sup_{ P(z|y) : Z \to Y \to X } \frac{I(Z ; X)}{I(Z ; Y)} = \rho_m(X,Y)^2.

That is, the square of the HGR maximal correlation is the best (or worst, depending on your perspective) ratio of the two sides in the Data Processing Inequality:

I(Z ; Y) \ge I(Z ; X).

It’s a bit surprising to me that this fact is not as well known. Perhaps it’s because the “data processing” is happening at the front end here (by choosing P(z|y)) and not the actual data processing Y \to X which is given to you.
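A small numerical check of the fact (my own toy example, not from the Erkip-Cover paper): take Y uniform binary, X the output of a BSC(p) with input Y, and Z the output of a very noisy BSC(1/2 - \delta) with input Y. As \delta \to 0, the ratio I(Z;X)/I(Z;Y) approaches (1 - 2p)^2, which is \rho_m(X,Y)^2 for this source.

```python
import numpy as np

def h2(q):
    """Binary entropy in bits."""
    return -q * np.log2(q) - (1 - q) * np.log2(1 - q)

# Markov chain Z - Y - X: Y ~ Bernoulli(1/2), X = Y through a BSC(p),
# Z = Y through a very noisy BSC(1/2 - delta). Then
# I(Z;Y) = 1 - h2(1/2 - delta), and I(Z;X) = 1 - h2(q) where q is the
# crossover probability of the concatenated BSC from Y-noise and p.
p = 0.1
delta = 0.01
a = 0.5 - delta                   # crossover of the channel from Y to Z
q = a * (1 - p) + (1 - a) * p     # crossover of the channel from X to Z
ratio = (1 - h2(q)) / (1 - h2(a))

print(ratio)            # approaches (1 - 2p)^2 as delta -> 0
print((1 - 2 * p)**2)   # rho_m(X,Y)^2 = 0.64 for this source
```

So the noisier the test channel P(z|y), the closer the data processing inequality comes to being tight at the rate \rho_m^2.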

Readings

Tiassa, by Steven Brust. As Cosma puts it, mind candy, and only worth reading if you’ve read the other 10 books in the series. Quite enjoyable, however.

Kraken, by China Miéville. A rollicking adventure involving a giant squid, horrific monsters and gruesome deaths, a dark underbelly of London, the end of the world, and… a ghost piggie. Among other things. I enjoyed it.

Hindoo Holiday, by J.R. Ackerley. A travelogue of a gay Englishman who becomes an attaché to a gay Raja in a princely state in the early 20th century. Often full of colonial condescension (though in a light tone) about things Indian. Most of the people in it are tragically sad or buffoonish. The homosexuality is not overt but explicit enough that the book was censored when published. Still, it’s an interesting historical read, just because it is so weird.

The Lost Promise of Civil Rights, by Risa Goluboff. A really fascinating book about the history of civil rights litigation in the US from Lochner to Brown. The term “civil rights” was in a state of flux during that era, transitioning from a labor-based understanding to a discrimination-based one. The main players were the Justice Department’s Civil Rights Section and the NAACP. By choosing which cases to pursue and which arguments to advance, they explored different visions of what civil rights could mean and why they were rights in the first place. In particular, the NAACP did not take on many labor cases because they were actively pursuing a litigation agenda that culminated in Brown. The decision in Brown and subsequent decisions shaped our modern understanding of civil rights as grounded in stopping state-sanctioned discrimination. However, the “lost promise” in the title shows what was lost in this strategy — the state-sponsored parts of Jim Crow were taken down, but the social institutions that entrench inequality were left standing.

The Devil in the White City, by Erik Larson. I had to read this since I just moved to Chicago and I work right near Jackson Park. This was a very engaging read (Larson just has that “style”) but a bit creepy in that “watched too many episodes of Dexter” way. I enjoyed it a little less than Thunderstruck, but I had more professional attachment to that one.

Banff blog

I figured I would blog about this week’s workshop at Banff in a more timely fashion. Due to the scheduling of flights out of Calgary, I will have to miss the last day of talks. The topics of people’s presentations varied rather widely, and many were not about the sort of Good-Turing estimator setup. Sometimes it was a bit hard to see how the problems or approaches were related (not that they had to be directly), but given that the crowd had widely varying backgrounds, presenters had a hard time because the audience had to absorb a new set of notation or approach for every talk. The advantage is that there were lots of questions — the disadvantage is that people insisted on “finishing” their presentations. By mid-week my brain was over-full, and a Wednesday afternoon hike up Sulphur Mountain was the perfect solution.

The view from Sulphur Mountain

Banfffffffffffffff

I’ve just arrived in chilly but beautiful Banff for a workshop on Information theory and statistics for large alphabets. I’m looking forward to it, although I will have to miss the last day due to the timing of flights out of Calgary that get me to Chicago before midnight. My itineraries there and back seem especially perverse: ORD-SEA-YYC and YYC-SFO-ORD. However, thanks to the new gig I have a new laptop with a functional battery, so I am doing a bit more busy-work and less New Yorker reading on the plane. I might try to write a bit more about the topics in the workshop — although the topic seems focused, there is a wide range of approaches and angles to take on the problem of estimating probabilities/prevalences in situations where you may not get to see each outcome even once. Certainly I hope I can get the journal version of a paper from last year’s Allerton squared away.

Bayesianism in philosophy

In an effort to get myself more philosophically informed with regard to probability and statistics, I’ve been reading about various notions and their discontents, such as symmetry, or Bayesianism, or p-values. I was delighted to find this recent pair of papers (part I, part II) by fellow Berkeley-ite and occasional puzzle-partner Kenny Easwaran (now a prof at USC) on Bayesianism in Philosophy Compass. In the first paper he goes through basic tenets of Bayesian approaches to probability in terms of subjective belief, and their philosophical justification via rational actions or “Dutch book” arguments and representation theorems. What’s also interesting from a scientific view (somewhat off-topic from the article) is the angle being advanced (some might say “pushed”) by some cognitive scientists that people are actually doing some kind of Bayesian conditionalization in certain tasks (here’s a plug for my buddy Pradeep‘s work). The second article talks about the difficulties in developing a consistent and quantitative “confirmation theory” in Bayesianism. In different fields there are different questions of how to do this, and as Kenny points out, the anti-Bayesians in different fields are different — the null position is not necessarily frequentism.

They’re a relatively quick read, and I think provide some different perspectives for those of us who usually see these concepts in our little fiefdoms.