Posting a hodgepodge of links after a rather wonderful time hiking and camping, solving puzzles, and the semester starting all together too soon for my taste.

[Trigger warning] More details on Walter Lewin’s actions.

The unbearable maleness of Wikipedia.

Hanna Wallach’s talk at the NIPS Workshop on fairness.

Reframing Science’s Diversity Challenge by trying to move beyond the pipeline metaphor.

An essay by Daniel Solove on privacy (I’d recommend reading his books too but this is shorter). He takes on the “nothing to hide” argument against privacy.

I don’t like IPAs that much, but this lawsuit about lettering seems like a big deal for the craft beer movement.

I’ve always been a little skeptical of Humans of New York, but never was sure why. I think this critique has something to it. Not sure I fully agree but it does capture some of my discomfort.

Judith Butler gave a nice interview where she talks a bit about why “All Lives Matter,” while true, is not an appropriate rhetorical strategy: “If we jump too quickly to the universal formulation, ‘all lives matter,’ then we miss the fact that black people have not yet been included in the idea of ‘all lives.’ That said, it is true that all lives matter (we can then debate about when life begins or ends). But to make that universal formulation concrete, to make that into a living formulation, one that truly extends to all people, we have to foreground those lives that are not mattering now, to mark that exclusion, and militate against it.”

A nice essay on morality and progress with respect to Silicon Valley. Techno-utopianism running amok leads to bad results: “Silicon Valley’s amorality problem arises from the implicit and explicit narrative of progress companies use for marketing and that people use to find meaning in their work. By accepting this narrative of progress uncritically, imagining that technological change equals historic human betterment, many in Silicon Valley excuse themselves from moral reflection.”

Data: what is it good for? (Absolutely Something): the first few weeks

So Waheed Bajwa and I have been teaching this Byrne Seminar on “data science.” At Allerton some people asked me how it was going and what we were covering in the class. These seminars are meant to be more discussion-based. This is a bit tough for us in particular:

  • engineering classes are generally NOT discussion-based, neither in the US nor in Pakistan
  • it’s been more than a decade since we were undergraduates, let alone 18
  • the students in our class are fresh out of high school and also haven’t had discussion-based classes

My one experience in leading discussion was covering for a theater class approximately 10 years ago, but that was junior-level elective as I recall, and the dynamics were quite a bit different. So getting a discussion going and getting all of the students to participate is, on top of being tough in general, particularly challenging for us. What has helped is that a number of the students in the class are pretty engaged with the ideas and material, and we do in the end get to collectively think about the technologies around us and the role that data plays a bit differently.

What I wanted to talk about in this post was what we’ve covered in the first few weeks. If we offer this class again it would be good to revisit some of the decisions we’ve made along the way, as this is as much a learning process for us as it is for them. A Byrne Seminar meets for 10 times during the semester, so that it will end well before finals. We had some overflow from one topic to the next, but roughly speaking the class went in the following order:

  • Introduction: what is data?
  • Potentials and perils of data science
  • The importance of modeling
  • Statistical considerations
  • Machine learning and algorithms
  • Data and society: ethics and privacy
  • Data visualizaion
  • Project Presentations

I’ll talk a bit more on the blog about this class, what we covered, what readings/videos we ended up choosing, and how it went. I think it would be fun to offer this course again, assuming our evaluations pass muster. But in the meantime, the class is still on, so it’s a bit hard to pass retrospective judgement.

CFP: JSTSP Special Issue on Privacy

IEEE Signal Processing Society
IEEE Journal of Selected Topics in Signal Processing
Special Issue on Signal and Information Processing for Privacy

Aims and Scope: There has been a remarkable increase in the usage of communications and information technology over the past decade. Currently, in the backend and in the cloud, reside electronic repositories that contain an enormous amount of information and data associated with the world around us. These repositories include databases for data-mining, census, social networking, medical records, etc. It is easy to forecast that our society will become increasingly reliant on applications built upon these data repositories. Unfortunately, the rate of technological advancement associated with building applications that produce and use such data has significantly outpaced the development of mechanisms that ensure the privacy of such data and the systems that process it. As a society we are currently witnessing many privacy-related concerns that have resulted from these technologies—there are now grave concerns about our communications being wiretapped, about our SSL/TLS connections being compromised, about our personal data being shared with entities we have no relationship with, etc. The problems of information exchange, interaction, and access lend themselves to fundamental information processing abstractions and theoretical analysis. The tools of rate-distortion theory, distributed compression algorithms, distributed storage codes, machine learning for feature identification and suppression, and compressive sensing and sampling theory are fundamental and can be applied to precisely formulate and quantify the tradeoff between utility and privacy in a variety of domains. Thus, while rate-distortion theory and information-theoretic privacy can provide fundamental bounds on privacy leakage of distributed data systems, the information and signal processing techniques of compressive sensing, machine learning, and graphical models are the key ingredients necessary to achieve these performance limits in a variety of applications involving streaming data, distributed data storage (cloud), and interactive data applications across a number of platforms. This special issue seeks to provide a venue for ongoing research in information and signal processing for applications where privacy concerns are paramount.

Topics of Interest include (but are not limited to):

  • Signal processing for information-theoretic privacy
  • Signal processing techniques for access control with privacy guarantees in distributed storage systems
  • Distributed inference and estimation with privacy guarantees
  • Location privacy and obfuscation of mobile device positioning
  • Interplay of privacy and other information processing tasks
  • Formalized models for adversaries and threats in applications where consumer and producer privacy is a major concern
  • Techniques to achieve covert or stealthy communication in support of private communications
  • Competitive privacy and game theoretic formulations of privacy and obfuscation

Important Dates:
Manuscript submission due: October 1, 2014
First review completed: December 15, 2014
Revised manuscript due: February 1, 2015
Second review completed: March 15, 2015
Final manuscript due: May 1, 2015
Publication date: October 2015

Prospective authors should visit the JSTSP homepage for information on paper submission. Manuscripts should be submitted using Manuscript Central.


I am traveling all over India at the moment so I’m not really able to write contentful posts. Here are even more links instead, sigh. Maybe later I’ll talk about log-Sobolev inequalities so I can be cool like Max.

Speaking of Max, he posted this hilarious bad lip reading version of Game of Thrones. Probably NSFW. I don’t even like the series but it’s pretty funny.

For those who are fans of Rejected, Don Hertzfeldt’s new film is available on Vimeo.

Those who were at Berkeley may remember seeing Ed Reed perform at the Cheeseboard. His album (which I helped fund via indiegogo, was named a Downbeat Editors’ Pick. It’s a great album.

In light of the Snowden leaks, some doubt has been cast on NIST’s crypto standards.

I’m super late to this, but I endorse Andrew’s endorsement of Sergio‘s interview with Robert Fano in the IT Newsletter. Here’s just the article, if you want that.


David McAllester, my department chair at TTI, has a started a new blog.

I thought it was pretty well known that people are fairly unique by ZIP code, but Forbes has an article about it now (h/t Raj). Of course, stores can also ping a smartphone’s WiFi to get more accurate location information about your activity within the store — when you check out they can tag your the MAC address of your device to all the other information about you. Creeptastic!

Bradley Efron’s perspective on the impact of Bayes’ Theorem from Science (h/t Kevin).

Some discussion on what makes a popular philosophy book. I wonder what, if anything, transfers over to a popular mathematical book?

Some thoughts from Larry Laudan on the mathematization of the presumption of innocence.

CFP : GlobalSIP 2013 Deadline Extended to June 15

I’m on the program committee for the Cyber-Security and Privacy symposium, so I figured I would post this here to make more work for myself.

GlobalSIP 2013 – Call for Papers
IEEE Global Conference on Signal and Information Processing
December 3-5, 2013 | Austin, Texas, U.S.A.

GlobalSIP: IEEE Global Conference on Signal and Information Processing is a new flagship IEEE Signal Processing Society conference. The focus of this conference is on signal and information processing and up-and-coming signal processing themes.

GlobalSIP is composed of symposia selected based on responses to the call-for-symposia proposals. GlobalSIP is composed of symposia on hot topics related to signal and information processing.

The selected symposia are:

Paper submission will be online only through the GlobalSIP 2013 website Papers should be in IEEE two-column format. The maximum length varies among the symposia; be sure to check each symposium’s information page for details. Authors of Signal Processing Letters papers will be given the opportunity to present their work at GlobalSIP 2013, subject to space availability and approval by the Technical Program Chairs of GlobalSIP 2013. The authors need to specify in which symposium they wish to present their paper. Please check conference webpage for details.

Important Dates:
*New* Paper Submission Deadline – June 15, 2013
Review Results Announce – July 30, 2013
Camera-Ready Papers Due – September 7, 2013
*New* SPL request for presentation – September 7, 2013

Linkage : science edition

Learning from transcriptomes can be cheaper for organisms which have never been sequenced.

A fancy Nature article on mobility privacy, in case you weren’t convinced by other studies on mobility privacy.

Bad statistics in neuroscience. Color me unsurprised.

I bet faked results happen a lot in pharmaceutical trials, given the money involved. Perhaps we should jail people for faking data as a disincentive?

The Atheist shoe company did a study to see if the USPS was discriminating against them.