I’m at the Institute for Pure and Applied Mathematics at a workshop on Statistical and Learning-Theoretic Challenges in Data Privacy. It’s a rather diverse group of computer scientists, statisticians, and medical informatics and policy researchers, and I feel a bit like I’m the only electrical engineer here. It’s been pretty educational in the “learning about new problems” way, but I think by the time Friday morning rolls around I’ll be suffering from information overload. The nice thing is that the slides are posted online, so I can refresh my memory when I get back. There’s also a poster session for some more recent results.
Most of the speakers have been talking about one of three things: (a) applications of the differential privacy model to some problems (e.g. data release, function computation, classification, PAC-learning, auctions, Google’s MapReduce, the Census Bureau’s OnTheMap, and PINQ), (b) areas in which privacy is a real problem (hospital discharge data and the dangers of re-identification, genome-wide association studies (GWAS), and query logs from search engines), or (c) bridges between fields and their privacy definitions and models.
I’ve just started working in this area, so I’m still processing things (the talks range from high level to technical, and I often lack the background to understand fully what’s going on). I might blog a bit more about it as things come up.