A recent article was passed along to me:

Jane Bambauer, Krishnamurty Muralidhar, and Rathindra Sarathy

Fool’s Gold: An Illustrated Critique of Differential Privacy

Vanderbilt Journal of Entertainment and Technology Law 16(4):701-755.

The article is aimed at the legal community, which has seen in differential privacy a potential technological solution for data privacy issues. The goal of the article is to throw some cold water on some law scholars’ embrace of differential privacy as a solution concept. I’m not a one-method-fixes-all kind of person, but this article is sort of relentlessly negative about differential privacy based solely on a single mechanism: output perturbation. The authors appear to be laboring under the impression that this is really the only way to provide differential privacy, “an assumption that contorts the rest of [their] analysis,” the charge that they level at one proponent of differential privacy.

In the example with which they open the paper, they claim that “even knowing the distribution of noise that is randomly added to each cell, the internist has no hope of interpreting the response. The true values could be almost anything.” While technically true, it’s quite misleading. Indeed, by knowing the distribution, one can create bounds on the accuracy of the answer — this is, contra the authors’ claims, the “tension between utility and privacy” that differential privacy researchers do “toil” with. They manage to explain the statistics fairly reasonably in the middle of the paper but ignore that in the introduction and conclusion in favor of some acerbic *bons mots*. Now, perhaps to them, privacy should be an almost-sure guarantee. There is a critique in that: differential privacy can only make probabilistic guarantees, and if your legal standard is stricter than that, then it’s probably not a good way to go. But the misleading rhetoric employed here is meant to stir emotions rather than sway the intellect.
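To make the point concrete, here is a minimal sketch (my own illustration, not from the article; the counts and parameters are made up) of how knowing the Laplace noise distribution lets an analyst put interpretable bounds on the true answer:

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_count(true_count, epsilon, sensitivity=1.0):
    """Release a count with Laplace noise calibrated to sensitivity/epsilon."""
    scale = sensitivity / epsilon
    return true_count + rng.laplace(0.0, scale)

# Knowing the noise distribution, the analyst CAN bound the answer:
# a Laplace(0, b) variable lies in [-b*ln(1/alpha), +b*ln(1/alpha)]
# with probability 1 - alpha, so the true count is pinned down to a
# narrow interval around the noisy release -- hardly "almost anything."
epsilon, alpha = 0.5, 0.05
scale = 1.0 / epsilon
noisy = laplace_count(true_count=120, epsilon=epsilon)
half_width = scale * np.log(1.0 / alpha)
print(f"noisy answer: {noisy:.1f}, 95% interval: +/- {half_width:.1f}")
```

With epsilon = 0.5 the 95% interval is only about ±6 around the released value, which is exactly the utility/privacy trade-off the authors claim researchers ignore.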

The language in the article is quite florid: “Differential privacy has been rocking the computer science world for over ten years and is fast becoming a crossover hit among privacy scholars and policymakers.” I suppose this sort of prose may be what constitutes scholarly writing in law, but it lacks the measured tones that one might want in more objective criticism. Perhaps they read academic writing in science and engineering in an equally emotional register. They use some strong language to conclude “differential privacy is either not practicable or not novel.” I find such blanket statements both puzzling and vacuous. If you set up a straw-man of what differential privacy is, I suppose you can derive such dichotomies, but is that the best argument one can make?

One thing that comes out of this reading is that most people don’t really appreciate how technology progresses from academic research to practical solutions. Perhaps some legal scholars have overstated the case for differential privacy based on the state of the technology *now*. But who’s to say how things will look a few years down the line? We’ll have better algorithms, different database structures, and different data sharing mechanisms and platforms. Perhaps differential privacy is not ready for prime time, although Google seems to disagree. The authors’ main point (hidden in the breathless indignation) is that it’s probably not the best solution for every data sharing problem, a statement with which I can completely agree.

In their effort to discredit differential privacy, the authors ignore both the way in which scientific and academic research works and contemporary work that seeks to address the very problems they raise: context-awareness via propose-test-release, methods for setting the privacy parameter in practical scenarios, and dealing with multiple disclosures via stronger composition rules. They further ignore real technical hurdles in realizing “pure” differential privacy in favor of “illustrations” with the goal of painting proponents of differential privacy as ideologues and hucksters. Of course context and judgement are important in designing query mechanisms and privacy-preserving analysis systems. Furthermore, in many cases microdata have to be released for legal reasons. I think few people believe that differential privacy is a panacea, but it at least provides a real quantifiable approach to thinking about these privacy problems that one can build theories and algorithms around. The key is to figure out how to make those work on real data, and there’s a lot more *research* to be done on that front.

I personally thought some of their arguments were quite outdated. For example, they talk about the Dinur-Nissim paper, and make much of the fact that the attack there requires 2^n queries for databases with n individuals. However, later papers have shown much stronger attacks where you only need O(n) queries!

The moral is that it is often good to do a very thorough reading of the literature before writing something this scathing.

In fact, if you simply read a bit further in the Dinur-Nissim paper, they give an efficient version that only requires n*log(n)^2 queries. No need to read the subsequent papers for that. 🙂

Thanks, Anand. I’m glad someone finally took the time to comment on this absurd article. There are so many issues, it’s hard to find a starting point. The first time I saw it I thought it was an elaborate joke. The single most serious misunderstanding is to confound algorithm and definition as you explain. Indeed, they imagine Laplace noise is the only way to achieve differential privacy.

Another almost comical mistake is to add Laplace noise in the wrong places, for example, in answering queries of very high sensitivity without breaking them down into low-sensitivity queries.
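To illustrate that point with a hypothetical example (my own sketch; the data, bound, and parameters are invented, and clipping is just the standard way of producing a low-sensitivity query): a raw sum over unbounded values has unbounded sensitivity, so no finite Laplace noise scale protects it; clipping each record first caps the sensitivity.

```python
import numpy as np

rng = np.random.default_rng(1)
incomes = rng.lognormal(mean=10.0, sigma=1.0, size=1000)  # toy data

epsilon = 1.0

# Wrong place for Laplace noise: the raw sum has unbounded sensitivity,
# since one person's income can be arbitrarily large, so no finite
# noise scale yields epsilon-DP for it.

# The standard decomposition: clip each record to a bound C, which caps
# the per-record contribution at C, then add Laplace(C / epsilon) noise.
C = 200_000.0
clipped_sum = float(np.clip(incomes, 0.0, C).sum())
noisy_sum = clipped_sum + rng.laplace(0.0, C / epsilon)
print(f"clipped sum: {clipped_sum:.0f}, noisy release: {noisy_sum:.0f}")
```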

There is certainly a very interesting discussion to be had of how to apply differential privacy in practice. It’s a challenge indeed, but this article fails to add a single interesting observation to this discussion.

What, you don’t think their prose stylings are “interesting observations?” I found it intriguing that they could get so overworked, to be honest. Maybe I am not in touch with my emotions enough.

There are so many problems with this “critique.” Most egregiously, they conflate differential privacy with the Laplace mechanism — they don’t seem to be aware that we have private mechanisms that are more advanced than adding independent Laplace noise. Their understanding of DP is rudimentary at best — they don’t even seem to know about advanced composition. They only mention (eps,delta)-DP in passing (in the conclusion), where they proceed to incorrectly define it! Their only real observation is that the Laplace mechanism doesn’t work very well if you have many queries and few samples.

The authors have absolutely no idea what they are talking about, yet they write so confidently and stridently. The best line has to be “the proponents of differential privacy do not themselves fully understand the theory.”

Good for a laugh…

I can’t tell if it’s bad scholarship or mendacity, to be honest. It doesn’t help that law reviews are edited by law students, which puts a new spin on “peer review.”

I have to say that I found the paper quite disturbing. Maybe it’s my own naivety, but I will *never* barge into a law professor’s office and claim (s)he doesn’t understand the law. If I have a concrete comment or something to say about her/his work, I’d bring myself to *ask* and try to *understand*. To me, academia is a place to learn, discuss, and propose changes (and hopefully to resolve differences using math 🙂 ). I do not see myself voiding other people’s work, but rather trying to complement it. Yet this paper is written in the same zealous spirit I thought people reserved only for religion, not academia. And so, I don’t see any way to talk, to have a genuine discussion, with people like this. They don’t want to converse, to have a back-and-forth exchange of ideas. They want to shout.

Had they wanted to listen, we could argue to them that there’s no need to perturb a histogram 1,000 times; that they needn’t apply only the Laplace mechanism; that there are numerous post-processing techniques aimed at reducing the error; that while mocking the Turing-is-higher-than-the-average-Lithuanian-woman example they essentially point to the same problem later with the doctor who treats mostly diabetic patients; or even the basic fact that data science, as is (without throwing in differential privacy), requires the data curator to make multiple judgment calls, such as deciding on the range of a variable. Had they wanted to listen, we could tell them that, as Anand phrased it, differential privacy isn’t a panacea but rather one tool out of several, all aimed at protecting the privacy of individuals. (Perhaps the best example is that differential privacy doesn’t really allow us to send a message from A to B without the message being intercepted by the NSA…) Had they wanted to listen, we could argue that differential privacy, like so many other things, comes with a cost, measured in accuracy in our case, and it’s up to you to decide whether or not you are willing to “pay the price”; but, to the best of my knowledge, it is the only privacy-preserving technique where the cost is explicit.

Unfortunately, they don’t want to listen. Instead, what they seem to want is to viciously tear down Cynthia Dwork, who pioneered the field and is probably its strongest advocate. And she does not deserve this. I think her work should be praised; they think the opposite. I’m fine with that; it’s their prerogative to think differently than I do. But the least they could do is give her some credit for trying to do something right and, to quote the cliché, to make the world a better place.

Actually, I’m pretty sympathetic with the article. There is an incredible hype surrounding DP in the CS research community, and it is in my mind mostly unfounded. I personally find the notion of a budget to be highly impractical. This is central to DP, and while I don’t follow the DP literature in great detail, I haven’t seen anything that gets around it. I find the following statement from Moritz to be very telling:

> There is certainly a very interesting discussion to be had of how to apply differential privacy in practice. It’s a challenge indeed, but this article fails to add a single interesting observation to this discussion.

I translate this as “we don’t know how to make DP work in practice”. But indeed if you listen to much of the rhetoric coming from DP researchers, Dwork included, you’d think quite the opposite. The message in Fool’s Gold may be too strident, but I think it is the right message to balance the DP hype.

I think this is a bit of a failure as a general principle — poorly written, rhetorically weak screeds are not a good “message to balance.” One can and should write sober, balanced critiques of the DP framework (e.g. budgeting in practice) and try to delineate where and when the solutions make sense vs. not.

Or, as we do in crypto, give up on perfect security and fall back on computational models that assume the hardness of certain factoring problems. The moral equivalent is what will need to happen in practical systems, I think.

Francis, you misinterpreted my statement. It’s hard to call something “impractical” that’s being used at scale by Google. My point was that there are many possible applications. It’s not a great fit for all of them, but it is for many, and there’s an interesting discussion to be had about which ones those are. See my post for background:

http://blog.mrtz.org/2015/03/13/practicing-differential-privacy.html

Btw, your perception of “DP hype” seems bewildering to me. I’ve always found the DP naysayers to greatly outweigh the DP proponents in number and stubbornness.

@Moritz

The argument that DP is “practical” just because Google is using Differential Privacy (a special distributed version) for a very unimportant task is ridiculous. If Google were to adopt Differential Privacy for the core data mining tasks crucial to its revenue stream (such as advertising), I would happily concede defeat. In reality Google, like all other organizations and individuals, relies on privacy protocols that are rooted in a reasonable approximation of an adversary, not in some theoretical fantasyland that theorists inhabit.

Now, as your comical blog post shows, you refuse to accept any compromise, in spite of falsely pretending to do so. In the blog you ended up destroying the work of researchers who had been working in the field for several years, building tools that were tailored to the tasks and risks at hand. Instead of supporting them, you teamed up with the EFF and presented a false third choice, a choice that had never been shown to work in practice, in an application you had zero background information about. As a result of presenting the false choice, the judge simply chose the safest option. The EFF loves useful idiots when it comes to privacy: during the Sorrell v. IMS Health case, the party they supported were the petitioners, while academic research scientists supported the respondents. Eventually the petitioners lost. Thankfully so; otherwise, if the EFF had its way, we would be signing 100 consent forms daily or relying on China for all future medical progress.

This, this is the reason why there is so much hate around Differential Privacy proponents.

Why don’t you petition Google to use only Differential Privacy for all of its tools? All AdWords/AdSense prediction algorithms ought to be Differentially Private. See how long it takes before they kick the DP zealots out. Until then, maybe check yourself out of debates in areas where you have zero background knowledge.

@asdasdasdas:

Medical researchers haven’t cured cancer yet. Nor have they developed a vaccine for HIV. Does that mean all medical research is folly?

Of course not! Likewise, differential privacy hasn’t solved every problem relating to private data. That doesn’t make it useless.

I actually enjoyed reading the paper. As someone with access to a large amount of sensitive data, and who does “actual” medical research, I found the paper to be illuminating. The issue Aaron, Kamalika, and You don’t understand is that the underlying assumptions made in differential privacy are way too extreme. E.g., with a dataset of 100 Million visits, if an attacker has access to 99,999,999 visits, the ability to divulge information about the remaining patient is meaningless. Such extreme ideas do work out in Communication Privacy (since the social standards are different), but they are utterly useless from the standpoint of Data Privacy. The current state of Data Privacy research closely mirrors the “Statistical Learning Theory” mania of 2003–2013, where if your paper did not prove bounds it was considered unacceptable. This went on until Computer Vision researchers started focusing on actual results (via competitions) instead of who has the most complicated proof, and the rest is history. Differential Privacy researchers (Theoretical Computer Scientists) have brought the same truly terrible culture to Data Privacy research: rather than developing tailored strategies for data release that include input from researchers (Medical, Economists, etc.), we instead have useless conferences with Proofs after Proofs that provide little guidance in application and cannot actually be shown to work. I have yet to find a single paper that shows how the choice of parameters translates into actual results and whether those results are actually acceptable to that field.

This paper shows that the “Emperor has no clothes” and that the current pursuit of “Differential Privacy” is meaningless, is alienating a significant number of applied researchers from the field, and is at the same time increasing risk. None of the current recommendations (if there are any) by differential privacy researchers are likely to be adopted by the medical research community. And the end result is an ongoing self-congratulatory dick- (proof-) length measuring contest between Theory researchers.

I guess I’ll leave aside your blanket generalized statements and bizarre capitalizations (which honestly do not bolster your stated credentials as “A Medical Researcher”).

This paper shows nothing of the sort. I agree that there is a significant amount of theoretical research that may turn up algorithms and approaches that prove to be useless in practice, but this is, in fact, the process of doing research. Presumably in medical research every single experiment turns out to be an amazing new discovery. No? In fact, one goes down dead ends, sometimes for years? Well, I’ll be!

As a matter of fact, I am working with people to try and see how we can implement differential privacy in neuroinformatics analysis pipelines that analyze “real data” from “real subjects.” It’s an open question whether we can get meaningful values for epsilon. But I’m trying.

If you really are an “actual” medical researcher, you certainly have a very particularized and visceral reaction to CS Theory. I’d posit that the amount of money “wasted” on medical research that uses flawed statistics, falsified data, and improperly documented protocols and analyses would pay for all of the theory grants awarded each year many times over.

The issue with epsilon is that it’s a meaningless metric in practice, similar to VC dimension. Sure, a large VC dimension implies the ability to memorize the entire dataset, yet in spite of having near-infinite VC dimension, deep learning networks do learn generalizable representations. Similarly, the “Differential Privacy” researchers have been on a meaningless crusade touting epsilon as the “One and Only true measure” of privacy, as if systems with very high values of epsilon are somehow automatically inferior and should be banished. Now, if this religious dogma were confined to FOCS and STOC, I couldn’t give a shit. But numerous researchers are going around the FTC and HHS proselytizing this fake prophet.

This cult of epsilon is dangerous, and would lead to a multi-year winter in all sorts of data mining research. Especially in Medicine, where lives are at risk, these kinds of issues have to be taken very seriously.

Pingback: On “Pure,” “Applied,” and Real Life Research – Thoughts and Observations

I know Dr. Sarathy personally, and have had many meetings with him and a PhD student discussing DP, and yes, painfully going over the classic Dinur-Nissim paper as well. I have read their papers in total, and that also includes Dwork’s. I recently received a very brief email reply from Dwork and a link to an overview of the basic principles of DP, which can be read here:

https://github.com/frankmcsherry/blog/blob/master/posts/2016-02-06.md

McSherry gets a little aggressive and sarcastic but he makes some excellent points.

In our meetings at OSU we often revisited the meanings of important concepts: the DP definition, statistical distance/indistinguishability, the application of Laplace noise, and what we mean when we say something is statistically private.

Dr. Michael Ekstrand sent me this response to Sarathy et al.:

https://github.com/frankmcsherry/blog/blob/master/posts/2016-02-03.md

I am independently working with Ekstrand on constructing a DP recommender system that is decentralized. It has never been done before and it is unclear if it is possible. This should give you a bit of insight into my goals and background; I am not a pure CS or math academic, but I do have enough of a CS/IS and math background to ask some questions and attempt to answer some of them. Time will tell how well it goes. I have aligned myself with as many competent and superior mathematicians, computer scientists, statisticians, and information systems experts as I can. I have to say both sides of the DP debate have gotten quite aggressive and, to some extent, judgmental. DP does not currently work at scale on large databases or in all types of applications; this we know and do not argue. But Google and Facebook are using a form of DP, and you’d better believe Microsoft, too, has applications of DP that it uses and tests in the real world.

However, the IS/pure math and even the cryptography people, rightly so, have issues with Dwork, Smith, and McSherry’s definitions of privacy, and with how the definition has evolved from pure epsilon to a watered-down epsilon-delta definition. Ligett herself states that when adding delta we run into all sorts of potential privacy issues, including becoming blatantly non-private.

Sarathy’s issue is this: how did Dwork and colleagues make the leap from the early definitions of DP, prior to say 2006–2008, to those of 2010 and thereafter? His view is that this was not adequately explained or tested.

My issue with Dwork’s (and yes, she is a brilliant theoretical computer scientist) series of papers and presentations is that she tries to appeal to cryptographers, statisticians, mathematicians, and various CS people by changing the narrative and the language used; but the cryptographers do not accept her guarantees, which hold only under very narrow conditions, and the statisticians want a more rigorous proof, beyond a bunch of lemmas and a handful of theorems.

I emailed Nissim as well but never received a reply; oh well.

It seems you are concerned about the difference between “pure” epsilon-differential privacy and “approximate” (epsilon,delta)-differential privacy.

The reason the relaxed definition was introduced is that pure differential privacy is unnecessarily restrictive in many settings. For example, when answering counting queries with the Laplace mechanism, we can only answer k ~= epsilon*n queries with pure epsilon-differential privacy. However, we can answer k ~= (epsilon*n)^2/log(1/delta) queries with approximate (epsilon,delta)-differential privacy. [DRV10] That’s a big difference in usefulness, and there are other examples where you can do much more with approximate differential privacy.
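To put rough numbers on that gap, here is a small back-of-the-envelope sketch (constants are suppressed as in the asymptotics above, and the values of n, epsilon, and delta are my own illustrative choices):

```python
import math

def queries_pure(epsilon, n):
    # Laplace mechanism with basic composition: the per-query budget is
    # epsilon/k, so the error stays nontrivial only while k <~ epsilon*n.
    return epsilon * n

def queries_approx(epsilon, delta, n):
    # With advanced composition the budget degrades like sqrt(k), so k
    # can grow to roughly (epsilon*n)^2 / log(1/delta).
    return (epsilon * n) ** 2 / math.log(1.0 / delta)

n, epsilon, delta = 10_000, 0.1, 1e-6
print(queries_pure(epsilon, n))          # on the order of 1e3 queries
print(queries_approx(epsilon, delta, n)) # on the order of 7e4 queries
```

Even at this small scale, the approximate-DP bound allows roughly seventy times as many counting queries.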

You are right that approximate differential privacy is a weaker guarantee than pure differential privacy. Thus we need to justify that approximate differential privacy is a satisfactory privacy definition. One way to think of (epsilon,delta)-differential privacy is that it modifies epsilon-differential privacy by allowing a delta probability of failure. [KS] If the probability of failure is extremely small, say delta = 10^-10, this is considered acceptable by most people.

Bambauer, Muralidhar, and Sarathy’s paper only briefly mentions approximate differential privacy. They give a nonsensical criticism of this definition. Essentially they say that the problem with approximate differential privacy is that it isn’t pure differential privacy and that we need to set the delta parameter. I have not seen any real argument that approximate differential privacy is insufficient as a privacy guarantee.

I cannot speak for Cynthia Dwork, but I don’t think anyone has argued that differential privacy is the one and only definition of privacy. What has been argued is that there needs to be a rigorous and scientific study of data privacy. And, so far, differential privacy is the only definition that fits the bill.

The definition of differential privacy has evolved – from pure epsilon-differential privacy to approximate (epsilon,delta)-differential privacy – and it will likely continue to evolve. That is the nature of research.

Thomas,

thank you for your reply. I agree with most of your statements about pure versus epsilon-delta DP, Sarathy’s misinterpretations, etc. I know Google has RAPPOR and there is the University of Penn project, but until algorithms can better meet the DP definition of privacy, I am more inclined to go with traditional cryptography and to explore homomorphic encryption as well. My first publication was on homomorphic encryption, and I am more inclined to accept its guarantees. Still, DP has shown early signs of practical potential; I just don’t view it as the best definition of privacy that can, to date, be truly implemented.

Yes, research continues to evolve, and this is often a positive thing; however, DP should be a small component of a larger cryptosystem within an information system framework.

I think homomorphic encryption addresses a different problem from the one differential privacy addresses. I don’t think they are comparable.

There is research that combines differential privacy with cryptography. You can look up “computational differential privacy” and “multiparty differential privacy”.

Thomas, I know they do combine DP and cryptography; not only did I look it up in the past, but my research involves various hybrid systems: homomorphic encryption and computational indistinguishability, DP with decentralized server systems, network security, VPNs. And yes, I am familiar with “computational differential privacy” and “multiparty differential privacy”.

My point is simple: DP, while it has some uses, is more constrained and less applicable than it was originally touted to be. DP-based applications are still not engineered on a large scale and 100% in line with any allowable definition, pure or delta… there are even papers out relaxing the constraints even further. I do correspond with a few of the researchers in the field and work with others in CS to figure out what is practical. I think DP can protect medical records in part, but we need people to build a system that can implement the definition. What with that slew of data breaches…

I disagree with your claim that you have not seen evidence that adding delta undermines the privacy guarantee; there is research on this in CS, and Katrina Ligett briefly discussed it in a presentation a couple of years ago. Here is her website with relevant papers and video presentations:

http://users.cms.caltech.edu/~katrina/research.html

Dwork has referenced Ligett as well a few times, just not on that topic as far as I can tell. However, there exists research on the privacy issues of adding delta, and Ligett states there is an issue.

Can you be more specific? Is there a particular paper or video you can point to?

Here is the video: https://www.youtube.com/watch?v=ekIL65D0R3o

The whole presentation is great, but at one point a question is asked after she presents the relaxed constraint of adding delta, and she admits that too large a delta would be problematic and implies it might become an issue. As for the research, you can find it via a literature review; it should only take you a few minutes.

Finally, to date NO application meets the full definition of either pure epsilon or epsilon-delta DP. Yes, I am working on a decentralized recommender system that will hopefully apply some DP definitions via the algorithms we use; thus far that has not been attained by anyone, anywhere, so the odds are against us.

Of course, delta needs to be small – much smaller than epsilon. delta should be thought of as the probability of the privacy protection failing completely. We want delta to be at most about one in a million.

In cryptography, we must choose an appropriate key length. If it is too short, the adversary may guess the secret key easily. If it is too long, encryption and decryption may become too slow to be practical. The choice of delta is similar. If delta is too large, our privacy guarantee is broken; if delta is too small, we must add too much noise.

I’m not sure what you mean by this. There are definitely algorithms that satisfy differential privacy and some of them have been put into practice.

The issue is that delta is far more sensitive to leaking privacy in practice than epsilon, and no, there are no applications that meet the differential privacy definition 100% under various real-world conditions. Hence researchers are still racing to build one.

Delta is far more sensitive than epsilon to data leakage, for one, so choosing how small to keep it is more complex and not completely worked out. Second, no, there does not exist anywhere in the literature, or in any research presentation, a series of algorithms and software that is 100% in line with the DP definition in real-world examples.

In fact, PINQ has been broken (all iterations of it) in terms of what should be differentially private, which McSherry needs to respond to, and Dwork’s latest definitions of privacy have yet to be upheld.