The NSF should run a special program on modernizing FastLane. It could be an interdisciplinary proposal between the HCI, sociology, graphics, and database communities.
Moo
“Happy cows” may come from California, but the cow in Figure 1 of this paper can’t be too happy. It’s unclear, however, whether this cow is from the East Bay or UIUC… I’m guessing the latter.
Notes on stable distributions
After attending a recent talk at TTI about dimension reduction by Moses Charikar in which he mentioned the special role stable distributions play, I made a note to freshen up my own scattershot knowledge of facts about stable distributions. Of course, things got too busy and the note ended up on my sub-list of to-do items that get infinitely postponed. However, I’ve been saved by a recent post to arXiv by Svante Janson, who does all sorts of interesting work on these cool objects called graphons (the limits of infinite graph processes):
Stable Distributions
Svante Janson
We give some explicit calculations for stable distributions and convergence to them, mainly based on less explicit results in Feller (1971). The main purpose is to provide ourselves with easy reference to explicit formulas. (There are no new results.)
All (or at least most) of the facts I wanted in one place! Hooray!
He starts with infinitely divisible distributions (e.g. Gaussian, Poisson, Gamma) and then talks about $\alpha$-stable distributions and the uniqueness of the corresponding measures for $0 < \alpha \le 2$ (the case $\alpha = 2$ gives the Gaussian). I’m still reading it (bits at a time), but it’s great to have little surveys like this: broadens the mind, builds character, &c.
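For anyone who wants to poke at these objects numerically, here is a minimal sketch (mine, not from Janson's note) that samples symmetric $\alpha$-stable variables with the Chambers–Mallows–Stuck transform and compares quantiles to eyeball the defining stability property: if $X_1, X_2$ are i.i.d. copies of $X$, then $X_1 + X_2$ has the same distribution as $2^{1/\alpha} X$. The function name `sample_symmetric_stable` and the normalization (characteristic function $e^{-|t|^\alpha}$, so $\alpha = 2$ is a Gaussian with variance 2) are my own choices for illustration.

```python
import numpy as np

def sample_symmetric_stable(alpha, size, rng):
    """Chambers-Mallows-Stuck sampler for a symmetric alpha-stable law
    with characteristic function exp(-|t|**alpha), 0 < alpha <= 2."""
    v = rng.uniform(-np.pi / 2, np.pi / 2, size)  # uniform angle
    w = rng.exponential(1.0, size)                # unit-mean exponential
    return (np.sin(alpha * v) / np.cos(v) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha))

rng = np.random.default_rng(0)
alpha = 1.5
n = 200_000
x = sample_symmetric_stable(alpha, n, rng)
x1 = sample_symmetric_stable(alpha, n, rng)
x2 = sample_symmetric_stable(alpha, n, rng)

# Stability: (X1 + X2) / 2**(1/alpha) should look like X in distribution.
rescaled_sum = (x1 + x2) / 2 ** (1.0 / alpha)
levels = [0.1, 0.25, 0.5, 0.75, 0.9]
print(np.quantile(x, levels))
print(np.quantile(rescaled_sum, levels))
```

The two printed rows of quantiles should roughly agree, up to sampling noise. Setting `alpha = 2.0` recovers a Gaussian and `alpha = 1.0` a Cauchy.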
T-Rex learns information theory
Namely, that continuous variables have infinite entropy and common randomness can be very useful : Unlimited data transmission with only TWO NUMBERS, baby!
The linguistic diversity of mustard seeds
From Thangam Philip’s book Modern Cookery:
Mustard seeds (Brassica nigra) :
Hindi – rai
Tamil – kadugu
Telugu – avalu
Kannada – sasuve
Oriya – sorisa
Marathi – mohori
Bengali – sorse
Gujarati – rai
Malayalam – kadugu
Kashmiri – aasur
A recent discussion with Lalitha Sankar and Prasad Santhanam brought up this linguistic diversity. Clearly sorse/sasuve/sorisa come from the same root as sarson, which are mustard greens. Maybe aasur is derived from that as well, but where do the others come from?
It turns out that the Farsi word is خردل, or khardal (thanks to Amin Mobasher for the help), which is probably the source for the Tamil/Malayalam.
But, much to my chagrin as a Maharashtrian, I do not know the origins of mohori, nor do I have any in my kitchen right now (soon to be rectified by a trip to Devon)!
Not really the digital divide
I started my new job here at TTI Chicago this fall and I’ve been enjoying the fact that TTI is partnered up with the University of Chicago: I get access to the library, a slightly better rate at the gym (still got to get on that), and some other perks. However, U of C doesn’t have an engineering school, so the library has a pretty minimal subscription to IEEExplore. That leaves me in a bit of a predicament: I’m a member of some of the IEEE societies, so I can get access to those Transactions, but otherwise I have to work a bit harder to get access to some papers. So far it hasn’t proved to be a problem, but I think I might run into a situation like the one recently mentioned by David Eppstein.
A creepy but prescient quote
… statistical research accompanies the individual through his entire earthly existence. It takes account of his birth, his baptism, his vaccination, his schooling and the success thereof, his diligence, his leave of school, his subsequent education and development; and, once he becomes a man, his physique and his ability to bear arms. It also accompanies the subsequent steps of his walk through life; it takes note of his chosen occupation, where he sets up his household and his management of the same; if he saved from the abundance of his youth for his old age, if and when and at what age he marries and who he chooses as his wife — statistics looks after him when things go well for him and when they go awry. Should he suffer a shipwreck in his life, undergo material, moral or spiritual ruin, statistics take note of the same. Statistics leaves a man only after his death — after it has ascertained the precise age of his death and noted the causes that brought about his end.
Ernst Engel, 1862
“You are unfit to ensure the safety of students at UC Davis”
An open letter to Linda Katehi.
She was pretty good at ensuring the admission of children of the wealthy while at UIUC though.
The basketball strike and some confusing lingo
Via Deadspin I saw this AP article on the latest twist in the NBA labor dispute and this tweet from columnist Adrian Wojnarowski : “The chances of losing the entire 2011-12 season has suddenly become the likelihood.” Assuming we correct to “likelihood,” what does this mean from a statistical standpoint? Is this frequentist analysis of a Bayesian procedure? Help me out folks…
HGR maximal correlation and the ratio of mutual informations
From one of the presentations by Zhao and Chia at Allerton this year, I was made aware of a paper by Elza Erkip and Tom Cover on “The efficiency of investment information” that uses one of my favorite quantities, the Hirschfeld–Gebelein–Rényi maximal correlation; I first discovered it in this gem of a paper by Witsenhausen.
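The Hirschfeld–Gebelein–Rényi maximal correlation between two random variables $X$ and $Y$ is

$$\rho_m(X;Y) = \sup_{f \in \mathcal{F}_X,\ g \in \mathcal{G}_Y} \mathbb{E}[f(X) g(Y)],$$

where $\mathcal{F}_X$ is all real-valued functions such that $\mathbb{E}[f(X)] = 0$ and $\mathbb{E}[f(X)^2] = 1$, and $\mathcal{G}_Y$ is all real-valued functions such that $\mathbb{E}[g(Y)] = 0$ and $\mathbb{E}[g(Y)^2] = 1$. It’s a cool measure of dependence that covers discrete and continuous variables, since they all get passed through these “normalizing” $f$ and $g$ functions.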
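For finite alphabets the supremum can be computed directly: it is the second-largest singular value of the matrix with entries $P(x,y)/\sqrt{P(x)P(y)}$ (the largest is always 1, attained by constant functions). Here is a minimal sketch of that computation; the function name `hgr_maximal_correlation` is mine, and the example joint distribution (a uniform bit sent through a binary symmetric channel with crossover 0.1) is just an illustration.

```python
import numpy as np

def hgr_maximal_correlation(pxy):
    """HGR maximal correlation of a finite joint pmf given as a 2-D array
    (rows index x, columns index y)."""
    pxy = np.asarray(pxy, dtype=float)
    pxy = pxy / pxy.sum()                 # normalize, in case of rounding
    px = pxy.sum(axis=1)
    py = pxy.sum(axis=0)
    keep_x, keep_y = px > 0, py > 0       # drop zero-probability symbols
    pxy = pxy[np.ix_(keep_x, keep_y)]
    px, py = px[keep_x], py[keep_y]
    q = pxy / np.sqrt(np.outer(px, py))   # entries P(x,y) / sqrt(P(x) P(y))
    s = np.linalg.svd(q, compute_uv=False)  # singular values, descending
    return float(s[1]) if len(s) > 1 else 0.0

# Uniform bit through a binary symmetric channel with crossover 0.1.
joint = np.array([[0.45, 0.05],
                  [0.05, 0.45]])
print(hgr_maximal_correlation(joint))
```

For independent variables the matrix has rank one, so the second singular value (and hence the maximal correlation) is zero.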
The fact in the Erkip-Cover paper is this one:

$$\sup_{p(u \mid x)} \frac{I(U;Y)}{I(U;X)} = \rho_m(X;Y)^2.$$

That is, the square of the HGR maximal correlation is the best (or worst, depending on your perspective) ratio of the two sides in the Data Processing Inequality:

$$I(U;Y) \le I(U;X).$$

It’s a bit surprising to me that this fact is not as well known. Perhaps it’s because the “data processing” is happening at the front end here (by choosing $p(u \mid x)$) and not the actual data processing $p(y \mid x)$ which is given to you.
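To get a feel for the ratio of mutual informations, here is a small sanity check in one binary example (a sketch under my own assumptions, not anything from the Erkip-Cover paper). Take $X$ to be a uniform bit, $Y$ the output of a binary symmetric channel with crossover $p$ applied to $X$, and $U$ the output of another binary symmetric channel with crossover $q$ applied to $X$. In this case the maximal correlation of $(X,Y)$ is $1 - 2p$, and both mutual informations have closed forms in terms of the binary entropy function. The helper names `h2` and `mi_ratio` are mine.

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def mi_ratio(p, q):
    """I(U;Y) / I(U;X) for X a uniform bit, Y = X through BSC(p),
    U = X through BSC(q), with 0 <= q < 1/2."""
    pq = p * (1 - q) + q * (1 - p)  # crossover of the cascade U - X - Y
    return (1 - h2(pq)) / (1 - h2(q))

p = 0.1
rho_squared = (1 - 2 * p) ** 2      # squared maximal correlation of (X, Y)
for q in (0.0, 0.1, 0.3, 0.45, 0.49):
    print(f"q={q:.2f}  ratio={mi_ratio(p, q):.5f}  rho^2={rho_squared:.5f}")
```

In this family the ratio stays below $(1-2p)^2$ and creeps up toward it as $q \to 1/2$, that is, as $U$ becomes nearly independent of $X$. This only probes one family of choices of $p(u \mid x)$ for one source, so it illustrates the statement rather than verifying it.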