IHP “Nexus” Workshop on Privacy and Security: Days 4-5

Wrapping this up, finally. Maybe conference blogging has to go by the wayside for me… my notes got a bit sketchier so I’ll just do a short rundown of the topics.

Days 4-5 were a series of “short” talks by Moni Naor, Kobbi Nissim, Lalitha Sankar, Sewoong Oh, Delaram Kahrobaie, Joerg Kliewer, Jon Ullman, and Sasho Nikolov on a rather eclectic mix of topics.

Moni’s talk was on secret sharing in an online setting — parties arrive one by one and the qualified sets (who can decode the secret) is revealed by all parties. The shares have to be generated online as well. Since the access structure is evolving, what kinds of systems can we support? As I understood it, the idea is to use something similar to threshold scheme and a “doubling trick”-like argument by dividing the users/parties into generations. It’s a bit out of area for me so I had a hard time keeping up with the connections to other problems. Kobbi talked about reconstruction attacks based on observing traffic from outsourced database systems. A user wants to get the records but the server shouldn’t be able to reconstruct: it knows how many records were returned from a query and knows if the same record was sent on subsequent queries — this is a sort of access pattern leakage. He presented attacks based on this information and also based on just knowing the volume (e.g. total size of response) from the queries.

Lalitha talked about mutual information privacy, which was quite a bit different than the differential privacy models from the CS side, but more in line with Ye Wang’s talk earlier in the week. Although she didn’t get to spend as much time on it, the work on interactive communication and privacy might have been interesting to folks earlier in the workshop studying communication complexity. In general, the connection between communication complexity problems and MPC, for example, are elusive to me (probably from lack of trying).

Sewoong talked about optimal mechanisms for differentially private composition — I had to miss his talk, unfortunately. Delaram talked about cryptosystems based on group theory and I had to try and check back in all the things I learned in 18.701/702 and the graduate algebra class I (mistakenly) took my first year of graduate school. I am not sure I could even do justice to it, but I took a lot of notes. Joerg talked about using polar codes to enable private function computation — initially privacy was measured by equivocation but towards the end he made a connection to differential privacy. Since most folks (myself included) are not experts on polar codes, he gave a rather nice tutorial (I thought) on polar coding. It being the last day of the workshop, the audience had unfortunately thinned out a bit.

Jon spoke about estimating marginal distributions for high-dimensional problems. There were some nice connections to composite hypothesis testing problems that came out of the discussion during the talk — the model seems a bit complex to get into based on my notes, but I think readers who are experts on hypothesis testing might want to check out his work. Sasho rounded off the workshop with a talk about the sensitivity polytope of linear queries on a statistical database and connections to Gaussian widths. The main result was on the sample complexity of answering the queries in time polynomial in the number of individuals, size of the universe, and size of the query set.

IHP “Nexus” Workshop on Privacy and Security: Day 2

Verrrrrry belated blogging on the rest of the workshop, more than a month later. Day 2 had 5 talks instead of the tutorial plus talks, and the topics were a bit more varied (this was partly because of scheduling issues that prevented us from being strictly thematic).

Amos Beimel started out with a talk on secret sharing, which had a very nice tutorial/introduction to the problem, including the connection between Reed-Solomon codes and Shamir’s t-out-of-n scheme. For professional (and perhaps personal) reasons I found myself wondering how much more the connection between secret sharing and coding theory was — after all, this was a workshop about communication between information theory and theoretical CS. Not being a coding theory expert myself, I could only speculate. What I didn’t know about was the more general secret sharing structures and the results of Ito-Saito-Nishizeki scheme (published in Globecom!). Amos also talked about monotone span programs, which were new to me, and how to prove lower bounds. He concluded with more recent work on the related distribution design problem: how can we construct a distribution on n variables given constraints that specify subsets which should have identical marginals and subsets which should have disjoint support? The results appeared in ICTS.

Ye Wang talked about his work on common information and how it appears in privacy and security problems from an information theoretic perspective. In particular he talked about secure sampling, multiparty computation, and data release problems. The MPC and sampling results were pretty technical in terms of notions of completeness of primitives (conditional distributions) and triviality (a way of categorizing sources). For the data release problem he focused on problems where a sanitizer has access to a pair (X,Y) where X is private and Y is “useful” — the goal is to produce a version of the data which reveals less about X (privacy) and more about Y (utility). Since they are correlated, there is a tension. The question he addressed is when having access to Y alone as as good as both X and Y.

Manoj, after giving his part of the tutorial (and covering for Vinod), gave his own talk on what he called “cryptographic complexity,” which is an analogy to computational complexity, but for multiparty functions. This was also a talk about definitions and reductions: if you can build a protocol for securely computing f(\cdot) using a protocol for g(\cdot), then f(\cdot) reduces to g(\cdot). A complete function is one for which everything reduces to it, and a trivial function reduces to everything. So with the concepts you can start to classify and partition out functions like characterizing all complete functions for 2 parties, or finding trivial functions under different security notions. He presented some weird facts, like an n bit XOR doesn’t reduce to an (n-1) bit XOR. It was a pretty interesting talk, and I learned quite a bit!

Elette Boyle gave a great talk on Oblivious RAM, a topic about which I was completely oblivious myself. The basic idea in oblivious RAM is (as I understood it) that an adversary can observe the accesses to a RAM and therefore infer what program is being executed (and the input). To obfuscate that, you introduce a bunch of spurious accesses. So if you have a program $\latex \Pi$ whose access pattern is fixed prior to execution, you can randomize the accesses and gain some security. The overhead is the ratio of the total accesses to the required accesses. After this introduction to the problem, she talked about lower bounds on the overhead (e.g. you need this much overhead) for a case where you have parallel processing. I admit that I didn’t quite understand the arguments, but the problem was pretty interesting.

Hoeteck Wee gave the last (but quite energetic) talk of the afternoon, on what he called “functional encryption.” The ideas is that Alice has (x,M) and Bob has y. They both send messages to a third party, Charlie. There is a 0-1 function (predicate) P(x,y) such that if P(x,y) = 1 then Charlie can decode the message M. Otherwise, they cannot. An example would be the predicate P(x,y) = \mathbf{1}(x = y). In this case, Alice can send h(x) \oplus M and Bob can send h(y) for some 2-wise independent hash function, and then Charlie can recover M if the hashes match. I think there is a question in this scheme about whether Charlie needs to know that they got the right message, but I guess I can read the paper for that. The kinds of questions they want to ask are what kinds of predicates have nice encoding schemes? What is the size of message that Alice and Bob have to send? He made a connection/reduction to a communication complexity problem to get a bound on the message sizes in terms of the communication complexity of computing the predicate P. It really was a very nice talk and pretty understandable even with my own limited background.

Bob Gallager on Shannon’s tips for research

One of the classes I enjoyed the most in undergrad was Bob Gallager’s digital communications class, 6.450. I was reminded of what an engaging lecturer he was yesterday when I attended the Bell Labs Shannon Celebration yesterday. Unfortunately, it being the last week of the semester, I could not attend today’s more technical talks. Gallager gave a nice concise summary of what he learned from Shannon about how to do good theory work:

  1. Simplify the problem
  2. Relate it to other problems
  3. Restate the problem in as many ways as possible
  4. Break the problem into pieces
  5. Avoid getting locked into thinking ruts
  6. Generalize

As he said, “it’s a process of doing research… each one [step] gives you a little insight.” It’s tempting, as a theorist, to claim that at the end of this process you’ve solved the “fundamental” problem, but Gallager admonished us to remember that the first step is to simplify, often dramatically. As Alfred North Whitehead said, we should “seek simplicity and distrust it.”

Multiple Postdoc Openings at USC

Prof. Urbashi Mitra is looking for multiple postdocs. Given that this is the time of year when the future looks murkiest, these are great opportunities!

I am seeking multiple post-doctoral researchers are sought with expertise in one or more areas: Communication Theory, (Statistical) Signal Processing, Controls, Information Theory, and Machine Learning. In particular, the following expertises are of interest: structured inference (sparse approximation, low rank matrix completion, tensor signal processing, graph signal processing); multi-terminal information theory, or information theory at the boundaries of control or signal processing; distributed control, consensus methods and partially observable Markov Decision Process modeling and algorithms; modern optimization methods; or biological communications, signal processing or information theory.

The successful applicants will be expected to perform innovative translational research, mentor PhD students, give oral presentations, write journal papers, and participate in grant writing and project management. There will be significant opportunities for research leadership and interaction with funding agencies.

Ideally, the successful applicants will start in Summer 2016.

Please have your interested graduate students apply using the following portal:


In addition to a cv and research statement, the applicants are requested to have three letters of reference uploaded to the system as well.

Jan Hein van Dierendonck’s Painting of Shannon

Jan Hein van Dierendonck, a science writer and illustrator/cartoonist from Leiden, recently contacted the IT Society about an oil painting he made of Claude Shannon. He has kindly given permission to post it here. It will be used by some of the Shannon Centenary events this year.

Claude Shannon, by Jan Hein van Dierendonck

Claude Shannon, by Jan Hein van Dierendonck

Claude Elwood Shannon (April 30, 1916 – February 24, 2001)

In the Forties a juggling Claude Elwood Shannon rides a unicycle down the endless hallways of Bell Labs, a telecommunications research laboratory south of New York. Perhaps this balancing act puts his brilliant mind in the right state to look at complex problems in an original way and to devise the formulas that initiate the Digital Era.

As a 21-year-old master’s degree student at the Massachusetts Institute of Technology, Shannon wrote his thesis demonstrating that electrical applications of Boolean algebra could construct and resolve any logical, numerical relationship. In 1948 this mathematician, electronic engineer, and cryptographer published a landmark paper that laid the foundation for information theory. From that moment on, information is something computable. Whether you are dealing with images, text or sound: convert everything into zeros and ones and remove all redundant information and noise. This has changed our world completely. Without Shannon’s Information Theory, your phone simply wasn’t smart.

Averse to fame, the professor in electronics preferred tinkering with his amazing magnetic mouse in a maze with memory and his mechanic juggling robots. He also refined his Juggling Theorem: the number of hands (H) multiplied by the total time a ball spends in the air (F) and is held in a hand (D) is in balance with the number of balls (N) multiplied by the total time a hand is empty (V) and holding a ball (D).

On April 30, 2016, he would have been a hundred.

Postdoc positions at UCLA in Coding Theory

I will write more about the IHP workshop! In the meantime, here are some exciting postdoc opportunities with my ex-classmate Lara Dolecek!

I’m writing to let you know that I have 2 postdoc positions available in my research group at UCLA, starting this summer. I am looking for talented students who want to work on one of the following:

  1. Coding theoretic methods and algorithms for emerging memories and modern computing systems
  2. New algorithms and coding-theoretic techniques for data management (data science)

Both projects are interdisciplinary. Postdocs will be working closely with me and a vibrant group of my graduate students, and will have the opportunity to collaborate with other researchers and to interact with our industry sponsors.

Strong background in mathematics and interest in interdisciplinary research are required.

I would greatly appreciate if you can pass this information to interested students.

Prospective students should contact me via email at dolecek@ee.ucla.edu

with subject line

[Prospective postdoc interested in LORIS research]

along with their CV and 3 selected publications. Students should plan to arrange for 2 letters of recommendation to be sent to the email address above.

IHP “Nexus” Workshop on Privacy and Security: Day 1

The view from my office at IHP

The view from my office at IHP

I am attending the Nexus of Information and Computation Theories workshop at the Institut Henri Poincaré in Paris this week. It’s the last week of a 10 week program that brought together researchers from information theory and CS theory in workshops around various themes such as distributed computation, inference, lower bounds, inequalities, and security/privacy. The main organizers were Bobak Nazer, Aslan Tchamkerten, Anup Rao, and Mark Braverman. The last two weeks are on Privacy and Security: I helped organize these two weeks with Prakash Narayan, Salil Vadhan, Aaron Roth, and Vinod Vaikuntanathan.

Due to teaching and ICASSP, I missed last week, but am here for this week, for which the sub-topics are security multiparty computation and differential privacy. I’ll try to blog about the workshop since I failed to blog at all about ITA, CISS, or ICASSP. The structure of the workshop was to have 4 tutorials (two per week) and then a set of hopefully related talks. The first week had tutorials on pseudorandomness and information theoretic secrecy.

The second week of the workshop kicked off with a tutorial from Yuval Ishai and Manoj Prabhakaran on secure multiparty computation (MPC). Yuval gave an abbreviated version/update of his tutorial from the Simons Institute (pt1/pt2) that set up the basic framework and language around MPC: k parties with inputs x_1, x_2, \ldots, x_k want to exchange messages to implement a functionality (evaluate a function) f(x_1, x_2, \ldots, x_k) over secure point-to-point channels such they successfully learn the output of the function but don’t learn anything additional about each others’ inputs. There is a landscape of definitions within this general framework: some parties could collude, behave dishonestly with respect to the protocol, and so on. The guarantees could be exact (in the real/ideal paradigm in which you compare the real system with an simulated system), statistical (the distribution in the real system is close in total variation distance to an ideal evaluation), or computational (some notion of indistinguishability). The example became a bit clearer when he described a 2-party example with a “trusted dealer” who can give parties some correlated random bits and they could use those to randomly shift the truth table/evaluation of f(x_1, x_2) to guarantee correctness and security.

Manoj, on the other hand talked about some notions of reductions between secure computations: given a protocol which evaluates f, can you simulate/compute g using calls to f? How many do you need? this gives a notion of the complexity rate of one function in terms of another. For example, can Alice and Bob simulate a BEC using calls to an oblivious transfer (OT) protocol? What about vice versa? What about using a BSC? These problems seem sort of like toy channel problems (from an information theory perspective) but seem like fundamental building blocks when thinking about secure computation. As I discussed with Hoeteck Wee today, in information theory we often gain some intuition from continuous alphabets or large/general alphabet settings, whereas cryptography arguments/bounds come from considering circuit complexity: these are ideas that we don’t think about too much in IT since we don’t usually care about computational complexity/implementation.

Huijia (Rachel) Lin gave an introduction to zero-knowledge proofs and proof systems: a verifier wants to know if a statement X is true and can ask queries to a prover P which has some evidence w that it wants to keep secret. For example, the statement might be “the number y is a perfect square” and the evidence might be an \alpha such that y = \alpha^2 \mod n. The prover doesn’t want to reveal w = \alpha, but instead should convince the verifier that such an alpha exists. She gave a protocol for this before turning to a more complicated statement like proving that a graph has a Hamiltonian cycle. She then talked about using commitment schemes, at which point I sort of lost the thread of things since I’m not as familiar with these cryptography constructions. I probably should have asked more questions, so it was my loss.

Daniel Wichs discussed two problems he called “multi-key” and “spooky” fully-homomorphic encryption (FHE). The idea in multi-key FHE is that you have N users who encrypt values \{ x_i : i \in [N] \} with their public key and upload them to a server. Someone with access to the server wants to be able to decode only a function f(x_1, x_2, \ldots, x_N) using the combined private keys of all the users. In “spooky” FHE, you have N decoders, each with one of the private keys, but they want to decode values \{y_i : i \in [N]\} which are functions of all of the encoded data. A simple example of this is when y_1 \oplus y_2 = x_1 \wedge x_2: that is, the XOR of the outputs is equal to the AND of the inputs. This generalizes to the XOR of multiple outputs being some function of the inputs, something he called additive function sharing. He then presented schemes for these two problems based on the “learning with errors” from Gentry, Sahai, and Waters, which I would apparently have to read to really understand the scheme. It’s some sort of linear algebra thing over \mathbb{Z}_q. Perhaps there are some connections to linear block codes or network coding to be exploited here.