December 2009


I came across this tidbit about anti-miscegenation laws in the US as applied to groups other than blacks in Randall Kennedy’s Interracial Intimacies. In these states, the groups not allowed to marry whites were:

  • Arizona : Mongolians, Malayans, Hindus, Indians
  • California : Mongolians, Malayans
  • Georgia : Japanese, Chinese, Malayans, Asiatic Indians
  • Mississippi : Mongolians
  • Montana : Chinese, Japanese
  • Nebraska : Chinese, Japanese
  • Nevada : Ethiopians, Malays, Mongolians
  • Wyoming : Malayans, Mongolians

As a postdoc at a school with a gigantic biosciences program and surrounded by other biomedical research institutes (Scripps, Burnham, etc.), I find that a lot of the professional development workshops offered here are not specifically helpful to me. For example, I went to a workshop on writing grants, but it was almost entirely focused on NIH grants; the speaker said he had never applied to the NSF for a grant. Still, I did pick up general tips and strategies about the process of writing a grant. In the same vein, I read an article in The Scientist (registration required) about improving scientific writing which offered ideas applicable to technical writing in general. One that stuck out for me was:

Write daily for 15 to 30 minutes
During your daily writing sessions, don’t think about your final manuscript. Just write journal entries, says Tara Gray, director of the teaching academy that provides training and support to New Mexico State University professors. “People think there’s two phases of a research project—doing the research and writing it up,” she says. Rather than setting aside large chunks of time for each activity, combine them to improve your writing and your research. The first time Gray encouraged a group of faculty members at New Mexico State to adhere to this schedule for three months, they wrote about twice as much as their normal output.

I think I’ll try doing this. I often complain that I live an “interrupt-driven” lifestyle, but sometimes flailing on some very involved epsilonics at the last minute to get something to work results in errors, tension, and woe.

MANGOSTEENS!

From Bobak I saw that US drones in Iraq have been hacked because “the remotely flown planes have an unprotected communications link” but “there was no evidence that they [insurgents] were able to jam electronic signals from the aircraft.”

This illustrates nicely the difference between eavesdropping and jamming. However, a nice by-product of anti-jamming codes using shared encryption keys (here they can be easily agreed upon before the drone takes off) is that sometimes you can get both eavesdropping and jamming protection at the same time.

One of the fun things about graphical models is that arguments can be done by looking at diagrams (kind of like a diagram chase in algebraic topology). One such trick is from R.D. Shachter’s paper in UAI called “Bayes-Ball: The Rational Pastime (for Determining Irrelevance and Requisite Information in Belief Networks and Influence Diagrams)” (see it here, for example). This is a handy method for figuring out conditional independence relations, and is a good short-cut for figuring out when certain conditional mutual information quantities are equal to 0. The diagram below shows the different rules for when the ball can pass through a node or when it bounces off. Gray means that the variable is observed (or is in the conditioning). I tend to forget the rules, so I made this little chart summary to help myself out.
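For readers who prefer code to charts, here is a minimal Python sketch of a Bayes-Ball style reachability check. It encodes the standard d-separation rules (an observed node blocks a chain or a fork, while a collider passes the ball only if it or one of its descendants is observed) rather than transcribing Shachter's paper line by line. The graph encoding (a dict mapping each node to its list of parents) and the function names are assumptions made for illustration.

```python
from collections import deque


def reachable(parents, source, observed):
    """Return the unobserved nodes the ball can reach from `source`.

    `parents` maps every node to a list of its parents (hypothetical encoding).
    """
    observed = set(observed)

    # Build child lists from the parent lists.
    children = {v: [] for v in parents}
    for v, ps in parents.items():
        for p in ps:
            children[p].append(v)

    # Observed nodes together with all of their ancestors.
    anc = set()
    stack = list(observed)
    while stack:
        v = stack.pop()
        if v not in anc:
            anc.add(v)
            stack.extend(parents[v])

    # Ball state: (node, direction). "up" means the ball arrived from a
    # child, "down" means it arrived from a parent.
    queue = deque([(source, "up")])
    visited = set()
    result = set()
    while queue:
        node, direction = queue.popleft()
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        if node not in observed:
            result.add(node)
        if direction == "up" and node not in observed:
            # Unobserved node reached from a child: pass to parents and children.
            queue.extend((p, "up") for p in parents[node])
            queue.extend((c, "down") for c in children[node])
        elif direction == "down":
            if node not in observed:
                # Unobserved node reached from a parent: pass on to children.
                queue.extend((c, "down") for c in children[node])
            if node in anc:
                # Collider that is observed or has an observed descendant:
                # the ball bounces back up to the parents.
                queue.extend((p, "up") for p in parents[node])
    return result


def d_separated(parents, x, y, observed=()):
    """True if x and y are conditionally independent given `observed`
    in every distribution that factorizes over the graph."""
    return y not in reachable(parents, x, observed)


# Collider A -> C <- B with a descendant C -> D.
collider = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}
print(d_separated(collider, "A", "B"))         # nothing observed
print(d_separated(collider, "A", "B", {"D"}))  # descendant of the collider observed
```

When `d_separated` returns True, the corresponding conditional mutual information is 0, which is the short-cut mentioned above.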

Ehsan Ardestanizadeh posed a little problem in Young-Han Kim’s group meeting before break and Paolo Minero presented a cute solution which I thought might be of interest. It’s related to ideas my advisor, Michael Gastpar, worked on in his thesis.

Suppose I have a Gaussian variable S that is \mathcal{N}(0,1), so zero-mean and unit variance. If I observe Y = S + Z where Z is also \mathcal{N}(0,1) and independent of S, the minimum mean-square error (MMSE) estimate of S given Y is just

\mathsf{MMSE}( S | Y) = \frac{1}{2} Y

and the MSE is \mathbb{E}[ (S - \hat{S})^2 ] = 1/2. The question is this: can we somehow get an MSE of less than 1/2 by “encoding” S? Suppose now we can take any function f and let X = f(S), and Y = X + Z but with the restriction that \mathsf{Var}(X) \le 1. That is, the encoding cannot take any more power.

The answer is no, and comes via an “information theoretic” argument. Consider the vector version of the problem where you have n iid unit variance Gaussians S^n and you want to estimate S^n from Y^n = f(S^n) + Z^n where Z^n is iid unit variance Gaussian as well. The goal is to minimize the average per-letter distortion:

d(\hat{S}^n, S^n) = \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}[ (S_i - \hat{S}_i)^2 ]

This is just the problem of joint source-channel coding of a Gaussian source over an AWGN channel with quadratic distortion, where the encoding function f must satisfy a unit power constraint. For this problem the rate distortion function is

R(D) = \frac{1}{2} \log \frac{1}{D}

and the capacity of the channel is \frac{1}{2} \log(1 + 1) = \frac{1}{2}, taking logarithms base 2. Since separate source coding (compression) followed by channel coding (error control) is optimal, achieving distortion D requires R(D) \le 1/2, and so D \ge 1/2. Furthermore, this is achievable with no coding at all by just setting f(S^n) = S^n.

Now if there were a scheme for the single-letter case which got MSE less than 1/2, we could apply it to each letter separately to get a vector scheme with distortion less than 1/2. But since D \ge 1/2 in the optimal code, we get a contradiction. Thus encoding does not help in the single-letter case either. If S isn’t Gaussian the whole story changes though.
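As a sanity check on the numbers, here is a small Python sketch under the same assumptions as above (unit-variance Gaussian source, unit-variance Gaussian noise, unit power). It estimates the MSE of the uncoded scheme with the estimate Y/2 by Monte Carlo, and computes the lower bound on D obtained from R(D) \le C. The function names, sample size, and seed are arbitrary choices for illustration.

```python
import math
import random


def uncoded_mse(num_samples=200_000, seed=0):
    """Monte Carlo estimate of E[(S - Y/2)^2] for Y = S + Z."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        s = rng.gauss(0.0, 1.0)  # source S ~ N(0, 1)
        z = rng.gauss(0.0, 1.0)  # noise Z ~ N(0, 1), independent of S
        y = s + z                # uncoded transmission, X = S
        s_hat = 0.5 * y          # MMSE estimate of S given Y
        total += (s - s_hat) ** 2
    return total / num_samples


def distortion_lower_bound(power=1.0, noise_var=1.0):
    """Smallest D allowed by R(D) <= C for a unit-variance Gaussian source."""
    capacity = 0.5 * math.log2(1.0 + power / noise_var)  # bits per channel use
    # R(D) = 0.5 * log2(1 / D) <= C  is equivalent to  D >= 2 ** (-2 * C).
    return 2.0 ** (-2.0 * capacity)


print(uncoded_mse())             # should be close to 0.5
print(distortion_lower_bound())  # the bound D >= 1/2
```

The simulated MSE of the uncoded scheme should land near the bound, which is the sense in which no encoder f can do better here.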

I’m singing again!

The Messiah

Orchestra Nova
Conducted by Jung-Ho Pak

Bach Collegium San Diego
Ruben Valenzuela, Music Director

Guest Artists:
Virginia Sublett, Soprano
Katherine Lundeen, Alto
Robert MacNeil, Tenor
John Polhamus, Bass

St. Paul’s Cathedral, San Diego
Thursday, December 10, 7:30 p.m.

St. James by-the-Sea Episcopal Church, La Jolla
Friday, December 11, 7:30 p.m.

Solana Beach Presbyterian Church, Solana Beach
Saturday, December 12, 7:30 p.m.

This season’s Masterpiece Messiah is an encore presentation of our dramatic video experience of the great masterpieces of art complementing the most famous of all oratorios, George Frideric Handel’s Messiah. Joined again by Bach Collegium San Diego, our interpretation has become well-known for its original 18th-century period approach, creating an unforgettable emotional experience that goes beyond most traditional performances.

New this year and by popular demand, we’re adding a performance at St. Paul’s Cathedral, Downtown San Diego. Don’t miss out on one of the hottest tickets in town during December. Buy tickets online or call 858-350-0290.

I went to a workshop at the end of the summer at the American Institute of Mathematics on Permanents and modeling probability distributions. The main questions we looked at were how to estimate a probability distribution from samples when you don’t know e.g. how many possible values there are. A simple version of this which is often mentioned is estimating the number of different butterfly species from a sample containing many unique species. As C.B. Williams wrote:

About 1940, a closer cooperation in a new field was started by a letter from Dr Steven Corbet, who had collected butterflies for many years in Malaya. On studying his collections after returning to England, he found that he had approximately 9000 specimens which included 316 species. Of these 118 were represented by only a single individual; 74 by 2 individuals, 44 by 3 and so on – the greater the number of individuals per species, the fewer were the number of species at that level. From this skew distribution it followed that the 37% rarer species accounted for only 1.3% of the total sample.

R.A. Fisher, Corbet and Williams wrote a paper in the Journal of Animal Ecology in 1943 to estimate the number of species (“The relation between the number of species and the number of individuals in a random sample of an animal population”). Another famous example is estimating the number of words Shakespeare knew, as in the work of Thisted and Efron (popularized a bit more by the application to determining if a new poem is by Shakespeare or not).
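The percentages in Williams's account are easy to check from the counts he gives. The short Python sketch below does the arithmetic, taking "approximately 9000" as exactly 9000. It also notes that the second ratio (singletons over sample size) coincides with the Good-Turing estimate of the probability that the next specimen belongs to an unseen species. Good-Turing is a standard estimator that is not mentioned in the text above, and is brought in here only as an illustration; the function name is made up.

```python
def singleton_fractions(num_specimens, num_species, num_singletons):
    """Fraction of species that are singletons, and the share of the
    sample those singleton specimens make up."""
    frac_of_species = num_singletons / num_species
    frac_of_sample = num_singletons / num_specimens
    return frac_of_species, frac_of_sample


# Corbet's collection as quoted: about 9000 specimens, 316 species,
# 118 species seen exactly once.
frac_of_species, frac_of_sample = singleton_fractions(9000, 316, 118)
print(f"{frac_of_species:.1%} of species are singletons")
print(f"{frac_of_sample:.1%} of the sample consists of singletons")

# Good-Turing missing-mass estimate (an added illustration, not from the
# post): the same ratio, singletons / sample size.
good_turing_missing_mass = 118 / 9000
```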

At the workshop I learned from Susan Holmes about this collection of essays by S.L. Zabell called Symmetry and its Discontents, which I have been enjoying tremendously. Alon Orlitsky gave a short presentation orienting the workshop participants, and started out with a “simpler” problem of inductive inference — given that you saw a trial succeed p times and fail q times, what is the bias of the coin? Bayes thought about this a lot, as did Laplace, who wrote this infamous example (excerpted from Zabell’s book):

Thus we find that an event having occurred successively any number of times, the probability that it will happen again the next time is equal to this number increased by unity, divided by the same number, increased by two units. Placing the most ancient epoch of history at five thousand years ago, or at least 1826213 days, and the sun having risen constantly in the interval at each revolution of twenty-four hours, it is a bet of 1826214 to one that it will rise again tomorrow. [Essai philosophique p. xvii]

Laplace gets a lot of flak for this estimate, and it’s an infamous example of the “excesses of probabilistic reasoning.” But as I learned this morning, Laplace immediately went on to say:

But this number is incomparably greater for him who, recognizing in the totality of the phenomena the regulatory principle of days and seasons, sees that nothing at the present moment can arrest the course of it.

Laplace was advocating this method of calculation formally, as the correct way to compute a probability based only on the information from the sample (e.g. the number of repeated successes/events). As Persi Diaconis pointed out at the workshop, Laplace would be “turning over in his grave” at some of the things people have accused him of mathematically.
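Laplace's calculation is short enough to reproduce. The sketch below implements the rule as he states it in the quoted passage, extended to p successes and q failures in the usual way, (p + 1) / (p + q + 2), and uses exact fractions so the odds come out as integers. The function names are illustrative only.

```python
from fractions import Fraction


def rule_of_succession(successes, failures=0):
    """Laplace's estimate of the probability that the next trial succeeds."""
    return Fraction(successes + 1, successes + failures + 2)


def odds_in_favor(probability):
    """Convert a probability into odds in favor, p / (1 - p)."""
    return probability / (1 - probability)


days = 1826213  # Laplace's count of consecutive sunrises
p_sunrise = rule_of_succession(days)
print(p_sunrise)                 # 1826214/1826215
print(odds_in_favor(p_sunrise))  # 1826214, i.e. "a bet of 1826214 to one"
```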

And that is your historical nugget for the day.
