Linkage

An animation of integer factorizations. Goes well with music. (h/t BK).

Graphics from the Chicago L (via Chicagoist)

Tony Kushner is kind of a tool. I find this unfortunate. But I still want to see Lincoln.

Aaron Roth reports that the DIMACS tutorial videos have been posted. A perfect time to brush up on your differential privacy!

An analysis of the Thai government’s menu served to President Obama.

A Choose Your Own Adventure version of Hamlet, from the creator of Dinosaur Comics.

The ACME Catalog, for your roadrunner-catching needs.

The things we know we don’t know

As a theoretical engineer, I find myself getting lulled into the trap of what I am now starting to call “lazy generalization.” It’s a form of bland motivation that you often find at the beginning of papers:

Sensor networks are large distributed collections of low-power nodes with wireless radios and limited battery power.

Really? All sensor networks are like this? I think not. Lots of sensor networks are wired (think of the power grid) but still communicate wirelessly. Others communicate through wires. This is the kind of ontological statement that metastasizes into the research equivalent of a meme — 3 years after Smart Dust appears, suddenly all papers are about dust-like networks, ignoring the vast range of other interesting problems that arise in other kinds of “sensor networks.”

Another good example is “it is well known that most [REAL WORLD THING] follows a power law,” which bugs Cosma to no end. We then get lots of papers which start with something about power laws and then proceed to analyze some algorithms which work well on graphs which have power law degree distributions. And then later we get statements like “all natural graphs follow power laws, so here’s a theory for those graphs, which tells us all about nature.”

Yet another example of this is sparsity. Sparsity is interesting! It lets you do a lot of cool stuff, like compressed sensing. And it’s true that some real world signals are approximately sparse in some basis. However, turn the crank and we get papers which make crazy statements approximately equal to “all interesting signals are sparse.” This is trivially true if you take the signal itself as a basis element, but in the way it’s meant (e.g. “in some standard basis”), it is patently false.
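To make the "trivially true" part concrete, here is a minimal sketch in plain Python. It takes a random dense vector and expresses it in an orthonormal basis whose first element is the normalized signal itself. The basis is built from a Householder reflection, which is just one convenient choice for illustration; the function names are made up for this example.

```python
import math
import random


def coeffs_in_signal_basis(x):
    """Coefficients of x in an orthonormal basis whose first element is x/||x||.

    The basis vectors are the columns of the Householder reflection
    H = I - 2 v v^T / (v^T v) with v = x - ||x|| e_1. H is symmetric and
    orthogonal, so the coefficient vector is H^T x = H x.
    """
    norm = math.sqrt(sum(xi * xi for xi in x))
    v = list(x)
    v[0] -= norm
    vv = sum(vi * vi for vi in v)
    if vv == 0.0:
        # x is already a nonnegative multiple of e_1; nothing to reflect.
        return list(x)
    scale = 2.0 * sum(vi * xi for vi, xi in zip(v, x)) / vv
    return [xi - scale * vi for xi, vi in zip(x, v)]


def count_nonzero(c, tol=1e-9):
    """Number of entries whose magnitude exceeds a small tolerance."""
    return sum(1 for ci in c if abs(ci) > tol)


random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(16)]
coeffs = coeffs_in_signal_basis(x)

print("nonzeros in the standard basis:", count_nonzero(x))
print("nonzeros in the signal's own basis:", count_nonzero(coeffs))
```

The signal is dense in the standard basis and 1-sparse in a basis constructed from the signal, which says nothing useful about the signal. The interesting claim is always about sparsity in a basis chosen before seeing the data.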

So why are these lazy generalizations? It’s a kind of fallacy which goes something like:

  1. Topic A is really useful.
  2. By assuming some Structure B about Topic A, we can do lots of cool/fun math.
  3. All useful problems have Structure B.

Pattern matching, we get A = [sensor networks, the web, signal acquisition], and B = [low power/wireless, power laws, sparsity].

This post may sound like I’m griping about these topics being “hot” — I’m not. Of course, when a topic gets hot, you get a lot of (probably incremental) papers all over the place. That’s the nature of “progress.” What I’m talking about is the third point. When we go back to our alabaster spire of theory on top of the ivory tower, we should not fall into the same trap of saying that “by characterizing the limits of Structure B I have fundamentally characterized Topic A.” Maybe that’s good marketing, but it’s not very good science, I think. Like I said, it’s a trap that I’m sure I’m guilty of stepping into on occasion, but it seems to be creeping into a number of things I’ve been reading lately.

Readings

Snuff [Terry Pratchett] : this was standard Discworld stuff, but I found it a little below-average. I’m finicky that way though.

Snakes Can’t Run [Ed Lin] : a follow-up to This Is A Bust, this book is about human smuggling in New York’s Chinatown in the 70s. Recommended if you like things Asian and mysterious.

Shark’s Fin and Sichuan Pepper: A Sweet-Sour Memoir of Eating in China [Fuchsia Dunlop] : I found this memoir to be engaging but it’s definitely got that feel of “Western person’s observations about China.” Dunlop is more aware of her situation as outsider/observer, but sometimes it’s hard to shake that narrative vibe. That being said, you should definitely read this if you want to know more about Chinese cuisines.

Among Others [Jo Walton] : it won a Nebula and a Hugo and I could see why. This is a really sharply observed and narrated coming-of-age story about a high school girl who is not part of the “main crowd” and finds her solace in voraciously reading all of the SciFi/Fantasy novels she can get her hands on. Really lovely writing.

The Lost Soul of Higher Education [Ellen Schrecker] : a pretty sobering read with a lot of historical background on the state of academic freedom, the corporatization of the university system, and possible ramifications for the future of the US. It was a bit depressing but well worth reading.

One Day in the Life of Ivan Denisovich [Aleksandr Solzhenitsyn] : this is Solzhenitsyn’s first book, a first-person narrative of one person’s life in a Stalinist labor camp. It really brings the grimness of the place alive — Cool Hand Luke’s prison camp had nothing on this. They’re worth comparing, I think.

Linkage : Black Friday edition

This is an amazing video that makes me miss the Bay Area. (via Bobak Nazer)

Also via Bobak, we’re number 8 and 10!

Since it’s holiday season, I figured it’s time to link to some profanity-laden humor about the holidays. For the new, The Hater’s Guide to the Williams-Sonoma Catalog, and the classic It’s Decorative Gourd Season….

A Game of Food Trucks. (via MetaFilter)

Larry Wasserman takes on the Bayesian/Frequentist debate.

LCD Soundsystem + Miles Davis YouTube mashup.

My friend Erik, who started the Mystery Brewing Company, has a blog called Top Fermented. He is now starting a podcast, which also has an RSS feed.

i’m in ur protocolz, jammin ur cellphonez

Krish Eswaran sent me a story about a group at Virginia Tech that described how LTE networks are susceptible to a certain kind of jamming strategy:

“An example strategy would be to target specific control or synchronization signals, in order to increase the geographic range of the jammer and better avoid detection,” the Wireless @ Virginia Tech research group said in a filing (PDF) submitted to the National Telecommunications and Information Administration. “The availability of low-cost and easy to use software-defined radios makes this threat even more realistic.”

Color me unsurprised! For my PhD, I studied arbitrarily varying channels (AVCs), which are information-theoretic models for communication against adversarial interference. There are a couple of design insights one can distill from considering the AVC model:

  • Separating protocol and payload makes schemes susceptible to spoofing.
  • Lack of synchronization/coordination between sender and receiver can be a real problem in adversarial settings.

Here we have a case where the protocol is easy to spoof/disrupt, essentially because the control information is unprotected.

This separation between control information and payload is often suboptimal in other senses. See, for example, Tchamkerten, Chandar and Wornell.

Linkage

New(ish) policies at the NSF — read up if you are planning on writing some grants! h/t to Helena, who sent this in aaaaages ago.

I’m not sure I agree that these are the 10 must-listen, but it’s something at least.

This article on Jonah Lehrer is quite interesting. I think there are some things to be learned here for academic writers as well…

I forgot to add a link to Suhas Mathur’s blog, sorry!

“bibimbap is a tool to import BibTeX entries from various sources. It runs in the console and is designed to be simple and fast. bibimbap is simply the easiest way to manage your BibTeX bibliographies. Be advised that it still won’t read the papers for you, though.” Looks like it could be awesome. h/t to Manu.

Problems in modeling Illumina sequencing

About a year ago I started collaborating with a friend in the Armbrust Lab at the University of Washington on some bioinformatics problems, and as a part of that I am trying to give myself a primer on sequencing technologies and how they work. I came across this video recently, and despite its atrocious music and jargonized description, I actually found it quite helpful in thinking about how this particular sequencing technology works:

  • Acoustic waves shatter the DNA.
  • Things (ligases) get attached to the end.
  • The fragments get washed over a “lawn” and the ligases stick the sequences to the lawn.
  • The strands get amplified into larger spots.
  • Single nucleotides with phosphorescent tags are washed on, they are hit with a laser to reveal the color, the tag is sheared, and then the next nucleotide is washed on.

A simple model for the data we get is to say that a position in the genome is selected uniformly at random, and then the read is the sequence of size L, starting from that position. Just a brief glance at the physical process above shows how simple that model is. For the purposes of statistics, it may be enough, but here are some complications that I can see, from knowing almost no physics and biology:

  • The places at which the DNA fragments break are not uniformly distributed; in fact, they should be sequence-dependent.
  • The ligases may have some preferential attachment characteristics. Ditto for the oligos on the lawn in the flowcell.
  • The amplification may be variable, spot by spot. This will affect the brightness of the flash and therefore the reliability of the read assessment.
  • The ability of single nucleotides to bind will vary as more and more bases are read, so the gain in the optical signal (or noise) will vary as the read goes on.

Some of these effects are easier to model than others, but what is clear from the real data is that these variations in the technology can cause noticeable deviations from the simple model. More fun work to do!
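As a rough illustration, the simple model (uniform start position, read of length L) and one of the complications above can be sketched in a few lines of Python. Everything here is an assumption made for the example: the toy random genome, the function names, and especially the `gc_weight` parameter, which is a made-up stand-in for sequence-dependent fragmentation and not a measured property of any sequencer.

```python
import random


def sample_reads_uniform(genome, read_len, num_reads, rng):
    """Simple model: pick a start uniformly at random, read read_len bases."""
    max_start = len(genome) - read_len
    reads = []
    for _ in range(num_reads):
        start = rng.randint(0, max_start)  # inclusive on both ends
        reads.append((start, genome[start:start + read_len]))
    return reads


def sample_reads_biased(genome, read_len, num_reads, rng, gc_weight=3.0):
    """Hypothetical complication: starts at G or C are gc_weight times as likely."""
    max_start = len(genome) - read_len
    positions = range(max_start + 1)
    weights = [gc_weight if genome[s] in "GC" else 1.0 for s in positions]
    starts = rng.choices(positions, weights=weights, k=num_reads)
    return [(s, genome[s:s + read_len]) for s in starts]


def gc_start_fraction(reads):
    """Fraction of reads whose first base is G or C."""
    return sum(1 for _, seq in reads if seq[0] in "GC") / len(reads)


rng = random.Random(0)
genome = "".join(rng.choice("ACGT") for _ in range(2000))  # toy genome

uniform_reads = sample_reads_uniform(genome, read_len=50, num_reads=5000, rng=rng)
biased_reads = sample_reads_biased(genome, read_len=50, num_reads=5000, rng=rng)

print("GC-start fraction, uniform model:", gc_start_fraction(uniform_reads))
print("GC-start fraction, biased model:", gc_start_fraction(biased_reads))
```

Even this crude bias shifts a summary statistic of the reads away from what the uniform model predicts, which is the kind of deviation that shows up in real data. The other complications (variable amplification, signal decay along the read) would need a noise model on the base calls rather than on the start positions.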

DIMACS Workshop on Information-Theoretic Network Security

At DIMACS, I got a notice about a workshop here that is coming up in November, with a registration deadline of November 5: the DIMACS Workshop on Information-Theoretic Network Security, organized by Yingbin Liang and Prakash Narayan. Should be worth checking out; they have a nice slate of talks.

If you do come though, don’t stay at the Holiday Inn. Go for The Heldrich or a Hyatt or something that is within walking distance of restaurants. I think I almost got run over going to Walgreens yesterday in this land of strip malls…

Postdoc / Visiting Scholar positions at Harvard’s Center for Research on Computation and Society

The Harvard Center for Research on Computation and Society (CRCS) solicits applications for its Postdoctoral Fellows and Visiting Scholars Programs for the 2013-2014 academic year. Postdoctoral Fellows are given an annual salary of approximately $60,000 for one year (with the possibility of renewal) to engage in a program of original research, and are provided with additional funds for travel and research support. Visiting Scholars often come with their own support, but CRCS can occasionally offer supplemental funding.

We seek researchers who wish to interact with both computer scientists and colleagues from other disciplines, and have a demonstrated interest in connecting their research agenda with societal issues.  We are particularly interested in candidates with interests in Economics and Computer Science, Health Care Informatics, Privacy & Security, and/or Technology & Accessibility, and those who may be interested in engaging in one of our ongoing/upcoming projects:

  • Intelligent, Adaptive Systems for Health Care Informatics
  • Language-Based Security
  • Personalized Accessibility
  • Privacy and Security in Targeted Advertising
  • Privacy Tools for Sharing Research Data
  • Trustworthy Crowdsourcing

Harvard University is an Affirmative Action/Equal Opportunity Employer. We are particularly interested in attracting women and underrepresented groups to participate in CRCS.  For further information about the Center and its activities, see http://crcs.seas.harvard.edu/.

Application Procedure

A cover letter, CV, research statement, copies of up to three research papers, and up to three letters of reference should be sent to:

Postdoctoral Fellows and Visiting Scholars Programs
Center for Research on Computation and Society
crcs-apply@seas.harvard.edu

References for postdoctoral fellows should send their letters directly, and Visiting Scholar applicants may provide a list of references rather than having letters sent. The application deadline for full consideration is December 16, 2012.