The 2008 measles outbreak in San Diego

A pediatrician friend of mine pointed out this bit of news in Pediatrics on the January 2008 outbreak of measles in San Diego:

The outbreak began in January 2008 when a 7-year-old boy whose parents refused to vaccinate him returned to the U.S. from Switzerland. Before symptoms appeared, he infected his 3-year-old brother and 9-year-old sister. Neither was vaccinated.

Neither were 11% of the boy’s classmates, whose parents shared similar beliefs that a healthy lifestyle protected against disease while vaccines were riskier than the illnesses they prevented.

In the end, 839 people were exposed to measles. Eleven were infected, and 48 exposed kids too young to be vaccinated were quarantined — forbidden to leave their homes — for 21 days. Jane Seward, MBBS, MPH, was the CDC’s senior investigator for the outbreak.

Despite the extraordinary efforts of health workers, what really ended the San Diego outbreak wasn’t quarantine or post-exposure vaccination. It was the high vaccination rate in the rest of the community that kept the outbreak from becoming an epidemic.

This is the summary of the study:

The importation resulted in 839 exposed persons, 11 additional cases (all in unvaccinated children), and the hospitalization of an infant too young to be vaccinated. Two-dose vaccination coverage of 95%, absence of vaccine failure, and a vigorous outbreak response halted spread beyond the third generation, at a net public-sector cost of $10376 per case. Although 75% of the cases were of persons who were intentionally unvaccinated, 48 children too young to be vaccinated were quarantined, at an average family cost of $775 per child. Substantial rates of intentional undervaccination occurred in public charter and private schools, as well as public schools in upper-socioeconomic areas. Vaccine refusal clustered geographically and the overall rate seemed to be rising. In discussion groups and survey responses, the majority of parents who declined vaccination for their children were concerned with vaccine adverse events.

CONCLUSIONS Despite high community vaccination coverage, measles outbreaks can occur among clusters of intentionally undervaccinated children, at major cost to public health agencies, medical systems, and families. Rising rates of intentional undervaccination can undermine measles elimination.

The medical and public health community needs to really get going on this. The article ends by saying the researchers met parents with “real fears” about the risk of autism from vaccines. I’m sure their fears are real, but how on earth do you convince them otherwise?

Talk at USC Wednesday

In case you’re at USC or in the area, I’m giving a talk there tomorrow on some of the work I’ve been doing with Kamalika Chaudhuri (whose website seems to have moved) and Claire Monteleoni on privacy-preserving machine learning.

Learning from sensitive data – balancing accuracy and privacy

Wednesday, March 24th, 2010
2:00pm-3:00pm
EEB 248

The advent of electronic databases has made it possible to perform data mining and statistical analyses of populations with applications from public health to infrastructure planning. However, the analysis of individuals’ data, even for aggregate statistics, raises questions of privacy which in turn require formal mathematical analysis. A recent measure called differential privacy provides a rigorous statistical privacy guarantee to every individual in the database. We develop privacy-preserving support vector machines (SVMs) that give an improved tradeoff between misclassification error and the privacy level. Our techniques are an application of a more general method for ensuring privacy in convex optimization problems.

Joint work with Kamalika Chaudhuri (UCSD) and Claire Monteleoni (Columbia)

Privacy and Google Web History

Posted on arXiv last night: Private Information Disclosure from Web Searches. (The case of Google Web History), by Claude Castelluccia, Emiliano De Cristofaro, Daniele Perito.

Our report was sent to Google on February 23rd, 2010. Google is investigating the problem and has decided to temporarily suspend search suggestions from Search History. Furthermore, Google Web History page is now offered over HTTPS only. Updated information about this project is available at: this http URL

The link above has some more details of their back and forth with Google on the matter, and at least it looks like Google’s on the losing end of it.

Search histories have a lot of information in them, since searches correlate with local events, such as disease spread (related and interesting is Twitter’s tracking of earthquakes). Since user sessions can be compromised by someone hijacking the cookies that maintain the session, Google requires HTTPS for many services, like Gmail, but not for the “automatic suggestion” feature for searches. The authors implemented an attack called The Historiographer:

The Historiographer uses the fact that users signed in any Google service receive personalized suggestions for their search queries based on previously-searched keywords. Since Google Web Search transmits authentication cookies in clear, the Historiographer monitoring the network can capture such a cookie and exploit the search suggestions to reconstruct a user’s search history.

This attack is not looking at a short time-window of browsing history, but essentially the entire search history as stored by Google. They did real experiments, and found:

Results show that almost one third of monitored users were signed in their Google accounts and, among them, a half had Web History enabled, thus being vulnerable to our attack. Finally, we show that data from several other Google services can be collected with a simple session hijacking attack.

So how does it work? The program hijacks the user’s SID cookie by eavesdropping, and then issues prefixes to the suggestion service; that is, it simulates a user typing in the first few letters of a search query. Prefixes have to be at least 2-3 letters to trigger the suggestion, and only the top 3 completions are returned. Of course 26^3 = 17,576 is a lot of prefixes to try, so the system has to sample effectively. The system just queries the top 10% of most frequent 3-letter prefixes (based on the statistics of English), which amounts to 121 queries to the system. If a particular 2-letter prefix (e.g. “pr”) is a prefix for many 3-letter prefixes (e.g. “pre”, “pra”, “pro”) that each return the full 3 completions, the system proceeds greedily to longer prefixes in that direction. Note that this is the same principle behind Dasher (or arithmetic coding, really).
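To make the greedy step concrete, here is a minimal Python sketch of the idea as I understand it, not the authors’ code. The suggestion service is a toy function over a made-up history, and the names make_suggester and reconstruct, the seed prefixes, and the max_len cutoff are all my own assumptions for illustration. The only details taken from the description above are the limit of 3 completions and the rule of digging deeper where a prefix returns a full list.

```python
import string
from collections import deque

MAX_SUGGESTIONS = 3  # per the description: only the top 3 completions come back


def make_suggester(history):
    """Toy stand-in for the suggestion service (hypothetical, not Google's API)."""
    ordered = sorted(history)

    def suggest(prefix):
        # Return at most MAX_SUGGESTIONS past queries that start with the prefix.
        return [q for q in ordered if q.startswith(prefix)][:MAX_SUGGESTIONS]

    return suggest


def reconstruct(suggest, seed_prefixes,
                alphabet=string.ascii_lowercase + " ", max_len=8):
    """Greedy prefix expansion: extend a prefix only when its answer is full."""
    found = set()
    queries = 0
    frontier = deque(seed_prefixes)
    while frontier:
        prefix = frontier.popleft()
        completions = suggest(prefix)
        queries += 1
        found.update(completions)
        # A full list hints that more entries may be hidden behind this prefix,
        # so try every one-character extension (max_len is an assumed cutoff).
        if len(completions) == MAX_SUGGESTIONS and len(prefix) < max_len:
            frontier.extend(prefix + ch for ch in alphabet)
    return found, queries


# Made-up search history for illustration only.
history = {
    "privacy paper", "prague hotels", "pretzel recipe",
    "protein shake", "proof of work", "measles symptoms",
}
suggest = make_suggester(history)
recovered, n_queries = reconstruct(suggest, ["pr", "me"])
print(sorted(recovered), n_queries)
```

In this toy run the prefix “pr” returns a full list of 3, so the sketch expands it to “pra”, “prb”, and so on, while “me” returns a single completion and is left alone. A real attack would order the seeds by English prefix frequency, as the paper describes, rather than take them as given.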

Based on this, the system can reconstruct the search history for the hijacked user. By using Google’s personalized results service, they can also get more information about the user’s preferences. A little more worrying is this observation:

In fact, a malicious entity could set up a Tor exit node to hijack cookies and reconstruct search histories. The security design underlying the Tor network guarantees that the malicious Tor exit node, although potentially able to access unencrypted traffic, is not able to learn the origin of such traffic. However, it may take the malicious node just one Google SID cookie to reconstruct a user’s search history, the searched locations, the default location, etc., thus significantly increasing the probability of identifying a user.

It’s an interesting paper, and worth a read if you are interested in these issues.

self (the remix)

Last weekend I had a chance to see Mo’olelo Performing Arts Company’s production (they also have a blog) of Robert Farid Karimi’s self (the remix), featuring Karimi and DJ D Double:

Storyteller/performance artist, def poetry jam performer, national poetry slam champion robert farid karimi — supported by an amazing soundscape spun live by Chicago DJ and Violator All-Star DJ D Double — mixes together stories, movement, and music to tell the tale of a first generation child of Iranian and Guatemalan immigrants learning how to survive the cultural imperialism of the United States on his quest to find wholeness in the fractured atmosphere of the 70s and 80s.

It’s a coming-of-age story that seems to have a new relevance given the current tensions between the US and Iran and the heated rhetoric around immigration. I usually enjoy solo performance, and although this is technically a dual performance, the “style” is similar to other narrative solo performances (cf. Josh Kornbluth). What was particularly effective is the way in which DJ D Double weaves the soundtrack and effects into the narrative. It’s rapid-changing and pulls samples, beats, and songs from every direction, providing a structure to support Karimi’s performance while commenting and, in effect, becoming its own character. In terms of “solo performance,” it’s some of the best use of sound I’ve seen.

The show only has a few more performances, starting tonight and going through this weekend. If you’re in San Diego and reading this (probably 5 people total), then go check it out!