The card-cyclic to random shuffle

I heard about this cool-sounding seminar at the Berkeley Statistics department:

Mixing time of the card-cyclic to random shuffle
Ben Morris

We analyze the following method for shuffling n cards. First, remove card 1 (i.e., the card with label 1) and then re-insert it randomly into the deck. Then repeat with cards 2, 3,…, n. Call this a round. R. Pinsky showed, somewhat surprisingly, that the mixing time is greater than one round. We show that in fact the mixing time is on the order of \log n rounds.

The talk is based on a paper with Weiyang Ning (a student at UW) and Yuval Peres. The description the results is somewhat different because it’s \log n rounds of n moves, or n \log n moves. From the intro to the paper:

To prove the lower bound we introduce the concept of a barrier between two parts of the deck that moves along with the cards as the shuffling is performed. Then we show that the trajectory of this barrier can be well-approximated by a deterministic function f… we relate the mixing rate of the chain to the rate at which f converges to a constant.

The key is to use path coupling, a technique pioneered by Bubley and Dyer. It’s a cute result and would be a fun paper to read for a class project or something, I bet.

Five career-oriented things I wish I had done more of in grad school

As I’ve started my nascent academic career, I’ve faced a number of new and difficult challenges, and on more than one occasion I’ve felt hoisted with my own petard. Looking back on my graduate school experience, I realized there were a number of things which I could have done more that would have made things easier now.

  1. Write more grants. I came in to grad school with a NDSEG fellowship and then my advisor was well-funded. I had a blesséd life (especially compared to those in other departments). I should have gotten a bit more on the ball helping to write grants, even if it was just for the experience. Reading a proposal (or many) really gives a sense of what the document is about. More importantly, it’s a grant in your field and that’s something that you can’t really get out of one of those panels on grantsmanship that are pitched to a broad audience. Learning the process early has two benefits : it demystifies the strategy space in terms of writing, and it gives a good sense of what kind of things make sense and are “fundable.” I wrote little dribs and drabs of proposals here and there but it wasn’t until my postdoc that I really got to see and work on a full proposal.
  2. Do an internship. : It’s sad to say, but I never did an internship while in grad school. This was a real mistake. As an engineer it’s important to learn what is going on in “real” engineering but more importantly, as a engineering theorist it’s important to understand what’s going to be important in the future. Some of that you can do by watching the news and trends, but having a visceral sense of the challenges in making a real object is an important perspective. On the more practical side, it gives you more contacts in industry and the more people you know, the more ideas you can integrate.
  3. Constantly tell stories. : Part of this is for grant writing, but it’s a more general skill that is important for the job market and interacting with your colleagues, 99% of whom will be outside your area. This is a skill that is hard to develop in graduate school because we spend our time in our labs talking to other graduate students who are not that far away from us in terms of intellectual background. This kind of story-telling is often cast as “be able to talk to the general public” or “explain your research to a crowd that includes information theorists and device physicists.” Part of being able to tell a story about your research is being able to tell it in someone else’s language, which means understanding a bit about how people in other fields think. How would you explain your research to an anthropologist? I did practice explaining to other people, but I wish I had practiced more.
  4. Find a community. : I do a lot of work that falls between fields or on the corners of fields, so I’ve bounced around a bit. The last time I applied for jobs I had interviews wearing a networks hat, a signal processing hat, and an information theorist hat. Finding a community can have a positive or negative connotation. On the one hand, it’s dumb that people care so much about labels : “oh, he’s a machine learning guy,” or “oh, he’s a signal processing guy.” On the other hand, cultivating professional relationships with a research community is valuable because those are the people who will remember you when your name comes up.
  5. Write the journal version first. People always say this. I think Lalitha told me this in 2005 at my first conference. Of course it’s a good idea. And it’s a slippery slope if you let it slide for one paper… 5 years later you realize you have 4 nascent journal papers with nice full stories but not enough time to do the last bit of research or flesh out the results a bit more. Not every conference paper is (or should be) a journal-worthy idea. But the goal is to tell bigger stories than 4 page ICASSP paper, and it’s important to keep that bigger picture in mind. I wish I had done more work contextualizing some of the things I worked on so I could come back to them and expand on them later. Instead I have a lot of weird half-baked ideas in old PDFs sitting around on my hard drive.

On conference schwag

When I moved to Chicago I realized I had a ton of conference bags. I was lucky enough to have a large closet in San Diego, and they kind of piled up in a back corner, from various ISITs and so on. I gave a few to Goodwill, but I have no idea if they will get used. I was recently talking to folks who work in public health and they were shocked that we get these “nice” bags at conferences — they get water bottles. Computer science conferences give out shirts. Why do we get bags? It’s kind of a pet peeve of mine, for a number of reasons.

How much do these bags cost? I should find this out, but I imagine a full messenger bag with laptop sleeve (c.f. ISIT 2012) isn’t cheap, and that cost gets directly transferred to registration fees. It probably only works out to 20 bucks but still…

Why don’t we get pens and paper? Most of the time you get a bag, a USB stick with the proceedings, the program booklet… and that’s it. I was shocked at SPCOM in Bangalore that they gave us a spiral notebook (handy!) and a pen (and really nice one at that) on *top* of the bag (which is pretty nice, as backpacks go).

Don’t people already have bags? You’re traveling halfway around the world with your laptop. You must have some sort of bag that you use already — why do you need another one? How are you going to fit it into your luggage?

Of course, after thinking of these complaints it turns out my ISIT 2012 bag has been a lifesaver — my apartment was robbed and they took all of my computers and electronics and carried them off in my bag, so having a ready replacement sure was handy. Furthermore, I use the tote bags (ISIT 2007-2009) all the time for groceries or going to the gym or the beach, so those are great.

What kind of schwag would you prefer? I use my IPSN 2005 mug all the time at work for tea…

Readings

Did I mention I love the Chicago Public Library? It can be frustrating at times, but the main branch is right on my way to and from work.

The Magician King (Lev Grossman) — The sequel to The Magicians, sometimes described as Hipsters in Narnia. This book is actually darker, if such a thing as possible. I think it’s interesting to look at it plotted in terms of the lives of likely readers. The first book is for college kids. The second is for post-college working kids who have nice jobs and realize that their lives feel a bit empty.

Odd and The Frost Giants (Neil Gaiman) — I’ve been on a children’s book kick. A lovely little tale set in Viking mythos.

The Alchemyst, The Magician, The Sorceress (Michael Scott) — Children’s/YA fantasy series. I had mixed feelings about it but it featured John Dee as a villain, and having read so much of Crowley’s Aegypt Cycle, I was interested in Dee as a character. Very different here — he’s a supervillain.

The Poisoner’s Handbook (Deborah Blum) — A fascinating tale about the rise of the medical examiner’s office and forensic medicine. The descriptions of how to detect various poisons in the tissues of the deceased is not for the squeamish!

The Immortal Life of Henrietta Lacks (Rebecca Skloot) — This book tells the story of the cell line HeLa, taken from Henrietta Lacks, an African-American patient in the 50s who died of cancer. Her cells were able to multiply on their own in the lab and ushered in a new era of research, but the way she and her family were treated epitomizes the ethical void at the heart of many scientists’ view of human subjects research. Despite this being an important story to tell, Skloot manages to make a lot of the story about herself — there’s a rather vigorous critique here.

1Q84 (Haruki Murakami) — Pretty classic Murakami, but a little more focused in content if expansive in scope. Investigates in fictional form some of the cult phenomena that seem to have captured his imagination lately. Critical opinion has been pretty divided, but I’d recommend it if you like Murakami, but not as an intro to his oeuvre.

The Atrocity Archive (Charles Stross) — sysadmins battle Cthulhu-eqsue horrors from the beyond. This is the first book in the Laundry series, and while the narrator is entertaining, I’ll probably give the rest of the series a pass.

Halting State (Charles Stross) — a near-future in which massive fraud/theft in an online game threatens to undermine the real economy. Takes gold farming and selling of WoW stuff on eBay to its extreme and then looks at what happens. Stross is good at extrapolating economic scenarios, and this was certainly more fun to read.

Postdoc at Cornell in smart grid, learning, optimization and control

Applicants are sought for postdoctoral scholar position(s) at Cornell University in the areas of smart grid, learning, optimization, and control. Topics include, but are not limited to,

  1. the economics and operation of power systems with significant penetration of intermittent renewables.
  2. stochastic optimization, learning, game theory, mechanism design, and their applications.
  3. inference and control involving heavy tail distributions.

Successful candidates will participate in research activities led by Professors Lang Tong and/or Eilyan Bitar.

To apply, please send your CV, two recent papers, and 2-3 names references to Lang Tong (ltong@ece.cornell.edu) and Eilyan Bitar (eyb5@cornell.edu).

Postdoc job openings

Some people have told me about postdoctoral position openings that are opening up, and I figured I’d repost some of them here as they come along. Of course, there are other places to post announcements, but I find that postdoc opportunities are a bit harder to advertise/hear about. I think a lot of systems EE people applying for academic positions right out of grad school tend to put off applying for postdocs until they hear back about their faculty interviews — I’d tend to say this is a mistake:

  • If your graduation date may be a little flexible, pinging someone early on (e.g. in the fall) about possible postdoc opportunities can be a good plan. NSF grant deadlines are in the fall, and so they could write a postdoc position into a current proposal.
  • Of course you’re going to apply for faculty positions, and the people you’re talking to about postdoc positions know that. However, if you get to May and haven’t talked to anyone about postdoc options, you may find that those positions have filled up.
  • Don’t think of a postdoc as a “fallback plan” (akin to a “safety school”) — it’s an opportunity and a chance to make a strategic decision. Do you want to switch areas or learn about something new? Do you want to dig deeper into things you’ve already been working on? Do you want a springboard to get a job in a specific country? Do you want to build closer ties to industry? Do you want closer mentorship?

I went to a panel at Allerton once on “whether you should do a postdoc” starring (among other people) Aaron Wagner and Todd Coleman, I believe. Everyone was very enthusiastic about doing a postdoc. Everyone on the panel had faculty positions lined up for after their postdoc and deferred their start date to do that postdoc. This is the best of all possible worlds but is pretty unusual, so don’t count on it.

This is all dodging the issue of whether or not you should even do a postdoc. That might be a topic for a different post (or a debate for the comments) — I know people have strong feelings on both sides. I tend to think our system is broken or veering into brokenness.

However, more information is more power, so if you have a postdoc announcement (details are helpful) and want me to post it here, please do send it my way. You can also try to post to the IT Society website.

Linkage

Via Amber, a collection of grassroots feminist political posters from India.

Via John, some fun investigations on how 355/113 is an amazingly good approximation to \pi. Also related are the Stern-Brocot trees, which can give continued fraction expansions.

I had missed this speech by a 10 year old on gay marriage when it happened (I was in India), but it’s pretty heartwarming. For more background on how the principal originally deemed the story “inappropriate.”

What is a Bayesian?

Unrelatedly, ML Hipster — tight bounds and tight jeans.

Le charme discret de l’entropie?

TTI Chicago is located on the University of Chicago campus. I am a UChicago affiliate, which means I get to use the libraries here. Unfortunately, the University is almost opposed to the idea of engineering, so things like institutional subscriptions to IEEExplore are out. Compounding this is a kind of old-fashionedness about the university which makes a lot of their collection… dated. However, since information theory is math-like, the university does have a fair number of information theory books and resources, including a number of textbooks and monographs which I had not seen before. One such is Stanford Goldman’s book, whose first chapter is entitled:

Le charme discret de l'entropie?

Le charme discret de l’entropie?

There is also a cute figure:

Moo

Moo!

ArXiV notes : July 2012 part 2

Differentially Private Filtering and Differentially Private Kalman Filtering
Jerome Le Ny, George J. Pappas
These papers apply the differential privacy model to classical signal processing and look at the effect of adding noise before filtering and summing versus adding noise after filtering and summing. The key is that the operations have to be causal, I think — we usually think about differential privacy as operating offline or online in very restricted settings, but here the signals are coming in as time series.

Finite sample posterior concentration in high-dimensional regression
Nate Strawn, Artin Armagan, Rayan Saab, Lawrence Carin, David Dunson
They study ultra high-dimensional linear regression (cue guitars and reverb) in the “large p, small n” setting. The goal is to get a Bernstein-von Mises type theorem — e.g. assuming the data comes from a linear model \beta_0, then using Bayesian inference the posterior should concentrate around \beta_0. They of course need a sparsity assumption, and the prior must assign reasonable mass around the true parameter and assign low probability to non-sparse signals. The methods use some ideas from compressed sensing (the Dantzig selector) and should be of interest to people working in that area.

Identifying Users From Their Rating Patterns
José Bento, Nadia Fawaz, Andrea Montanari, and Stratis Ioannidis
This is a paper about recommender systems as part of the 2011 CAMRa challenge. They look at the problem of re-identification of users in this data and show that looking at the time stamps of movie ratings is much more useful than looking at the rating values. This suggests to me that people should use a masking system like Anant and Parv’s “Incentivizing Anonymous ‘Peer-to-Peer’ Reviews” (P. Venkitasubramaniam and A. Sahai, Allerton 2008) paper.

<a href="http://arxiv.org/abs/1207.4084v1"Mechanism Design in Large Games: Incentives and Privacy
Michael Kearns, Mallesh M. Pai, Aaron Roth, Jonathan Ullman
This proposes a variant of differential privacy which they call joint differential privacy and look at mechanism designs that satisfy privacy and are incentive compatible. At first glance, these should be incompatible, since the latter implies “revealing the truth.” The model is one in which each agent has a finite set of actions but its own payoff/value is private. This is somewhat out of my area, so I can’t really get the nuances of what’s going on here, but a crucial assumption here is that there are a large number of agents. Joint differential privacy seems to be a form of (\epsilon,\delta) differential privacy on the utility functions of the users.