Starting up, and some thoughts on admissions

It’s been a busy January — I finished up a family vacation, moved into a new apartment, helped run the MIT Mystery Hunt, started teaching at Rutgers, and had two conference deadlines back to back. One of my goals for the year is to blog a bit more regularly — I owe some follow-up to my discussion of the MAP perturbation work, which I will be talking about at ITA.

In the meantime, however, one of the big tasks in January is graduate admissions. I helped out with admissions at Berkeley for 4 years, so I’m familiar with reviewing the (mostly international) transcripts, but the level of detail in transcript reporting varies widely. The same is true for letters of recommendation. I’m sure this is culturally mediated, but some recommenders write 1-2 sentences, and some write paeans. This makes calibrating across institutions very difficult. While the tails of the distribution are easy to assess, decisions about the middle are a bit tougher.

Rutgers, like many engineering school across the country, has a large Masters program. Such programs serve as a gateway for foreign engineers to enter the US workforce — it’s much easier to get hired if you’re already here. It’s also makes money for the university, since most students pay their own way. In that regards, Rutgers is a pretty good deal, being a state school. However, it also means making admissions decisions about the middle of the distribution. What one wants is to estimate the probability an applicant will succeed in their Masters level classes.

It’s a challenging problem — without being able to get the same level of detail about the candidates, their schools, and how their recommenders feel about their chances, one is left with a kind of robust estimation problem with a woefully underspecified likelihood. I’ve heard some people (at other schools) discuss GPA cutoffs, but those aren’t calibrated either. More detail about a particular individual doesn’t really help. I think it’s a systemic problem with how graduate applications work in larger programs; our model now appears better suited to small departments with moderate cohort sizes.

Linkage

I’m in the process of moving to New Jersey for my new gig at Rutgers. Before I start teaching I have to go help run the the Mystery Hunt, so I am a little frazzled and unable to write “real” blog posts. Maybe later. In the meantime, here are some links.

The folks at Puzzazz have put out a bevy of links for the 200th anniversary of the crossword puzzle.

The UK has issued a pardon to Alan Turing, for, you know, more or less killing him. It’s a pretty weasely piece of writing though.

An important essay on women’s work: “…women are not devalued in the job market because women’s work is seen to have little value. Women’s work is devalued in the job market because women are seen to have little value.”. (h/t AW)

Of late we seem to be learning quite a bit about early hominins and hominids (I had no idea that hominini was a thing, nor that chimps are in the panini tribe, nor that “tribe” is between subfamily and genus). For example,
they have sequenced some old bones in Spain. Extracting sequenceable mitochondrial DNA is pretty tough — I am sure there are some interesting statistical questions in terms of detection and contamination. We’ve also learned that some neanderthals were pretty inbred.

Kenji searches for the perfect chocolate chip cookie recipe.

My current job

I’ve had to do a lot of explaining about my current position and institution since moving here, especially when I go visit ECE departments. So I figured I might use the blog to give a quick rundown of the job. I’m a Research Assistant Professor at the Toyota Technological Institute at Chicago, a philanthropically endowed academic computer science institute located on the University of Chicago campus.

  • The Toyota Technological Institute at Chicago is a branch of the Toyota Technological Institute in Nagoya, Japan. Their website is a little slow to load, but the Wikipedia entry has more quick facts. TTI-Japan was founded through an endowment from the Toyota Motor Corporation in 1981 (so it’s younger than me). The Toyota Motor Corporation is not my employer, although some executives are on the board of the school.
  • I do not work for Toyota. My research has nothing to do with cars. At least not intentionally.
  • TTI-Chicago is basically a stand-alone computer science department and was started in 2003. It only has graduate students and grants its own degrees. It happens to be located on the University of Chicago campus — we rent two floors of a building which also contains the IT services. Classes at TTI are cross-listed with the University of Chicago — students at TTI take classes at UChicago and students at UChicago take classes at TTI.
  • I get an “affiliate” card for UChicago which lets me use the library and stuff. It’s great to have a library there, but since UChicago has no engineering, my access to IEEExplore is a bit limited.
  • The research at TTI-Chicago is mostly in machine learning, computer vision, speech processing, computational biology, and CS theory. This makes me a bit of an odd-one-out, but I have been doing more machine learning lately. It’s fun learning new perspectives on things and new problems.
  • The Research Assistant Professor position at TTI-Chicago is a 3-year position (some people have stayed for 4) which pays a 9 month salary (out of general institute funds) and gives a yearly budget for research expenses like travel/conferences and experimental costs (e.g. for Mechanical Turk or Amazon EC2). It’s not a “soft money” position but people are free to raise their summer salary through grants (like I did) or by taking a visiting position elsewhere for part of the year. I do not have to teach but can offer to teach classes or help teach classes
  • There are tenure-track faculty at TTI, and it’s the same tenure deal as elsewhere. Their teaching load is one quarter per year (that should make people jealous).
  • There are graduate students here, but not a whole lot of them. I can’t directly supervise graduate students but I can work with them on research projects. I’m starting to work with one student here and I’m pretty excited about our project.

Upper bounds for causal adversaries

Bikash Dey, Mike Langberg, Sid Jaggi, and I submitted an extended version of our (now accepted) ISIT paper on new upper bounds for binary channels with causal adversaries to the IT Transactions. The model is pretty straightforward : Alice (the encoder) transmits a message to Bob (the receiver) encoded over n uses of a binary input, binary output channel. The channel is controlled by Calvin (an adversary) who sequentially looks at each bit and can decide whether or not to flip it, up to pn total bit flips. That is, Calvin is causal : the decision to flip bit i is based on the knowledge of bits 1, 2, \ldots, i. What we show is a new upper bound on the capacity of this channel. Let \alpha(p,\bar{p}) = 1-4(p-\bar{p}). Then

C \le \min_{\bar{p} \in [0,p]} \left[ \alpha(p,\bar{p})\left(1-H\left(\frac{\bar{p}}{\alpha(p,\bar{p})}\right)\right) \right]

This is what it looks like:

New Upper Bound for Causal Adversaries

Plot of the new upper bound


So clearly causal adversaries are worse than i.i.d. noise (the 1 - H(p) bound).

To show such a bound we have to propose a new attack for the adversary. We call our attack “babble and push.” It operates in two phases. The first phase is of length \ell channel uses and the second of length n - \ell. Let \mathbf{x}(m) be the codeword for message m.

  1. (Babble) Calvin chooses a random subset \bar{p} n indices uniformly from all (\bar{p} n)-subsets of \{1, 2, \ldots, \ell\} and flips bit i for i \in \Gamma.
  2. (Push) Calvin finds all possible codewords which are consistent with what Bob has received in the first phase:

    B_{\mathbf{y}^{\ell}} = \{ u : d_H(\mathbf{y}^{\ell}, \mathbf{x}^{\ell}(u))=\bar{p}n \},

    and selects an element \hat{u} \in B_{\mathbf{y}_1} uniformly at random. For the second phase, Calvin selectively pushes the received codeword towards \mathbf{x}(\hat{u}) — if the transmitted codeword and the selected codeword match, he does nothing, and if they do not match he flips the bit with probability 1/2.

Analyzing this scheme amounts to showing that Calvin can render the channel “symmetric.” This is a common condition in arbitrarily varying channels (AVCs), a topic near and dear to my heart. Basically Bob can’t tell the difference between the real codeword and the transmitted codeword, because under Calvin’s attack, the chance that Alice chose u and Calvin chose \hat{u} is the same as the chance Alice chose \hat{u} and Calvin chose {u}. To establish this symmetry condition requires some technical excursions which are less fun to blog about, but were fun to figure out.

It’s relatively clear that this approach would extend to more general AVCs, which we could work on for the future. What is neat to me is that this shows how much value Calvin can derive by knowing the current input bit — by forcing additional uncertainty to Bob during the babble phase, Calvin can buy some time to more efficiently use his bit flipping budget in the second phase.

Jácaras and Ensaladas

I am singing with the UChicago Early Music Ensemble, a somewhat relaxed group led by David Douglass and Ellen Hargis of The Newbury Consort. I started rehearsing a bit late, so I’ve been playing catch-up. This year the repertoire is all music from Spain and Spanish colonies, and today we worked on two of the harder pieces in the program : an ensalada called La Bomba, by Mateo Flecha “El Viejo”, and a jacara by Juan Gutiérrez de Padilla. Too many words!

In looking for some recordings to get a better sense of the pieces, I came across this charmingly old King’s Singers TV special (check out those sweater vests!) acting out La Bomba in what appears to be the house from Clue:

Lucky them, they get multiple takes which makes it a bit easier to manage the crazy transitions in the piece. There’s also a multitracked recording on which is pretty good:

Unfortunately we are doing it up a third from there, much to the chagrin of my passagio.

We spent a bit of time trying to get the jácara up to speed, but singing Spanish that fast is hard! When I heard how fast this version went I almost lost it:

It looks like I have my work cut out for me, especially if I want to roll my r’s like that.

Linkage

Congratulations to my fellow Beast Amitha Knight on being a co-winner of the 2012 PEN New Enlgand Susan P. Bloom Children’s Book Discovery Award!

Speaking of children’s books, some people who saw The Hunger Games movie are upset that Rue is black. Unsurprising but sad.

And speaking of friends, my friend Amber is slumming it in Antarctica and is writing some fascinating blog posts from down there.

Can Ellen Do More Push-Ups Than Michelle Obama? They both seem to be able to do more pushups than me. Time to hit the gym I think.

I’ve been eating this spicy peanut noodle salad for lunch this week and boy is it delicious.