# Call for Papers: T-SIPN Special Issue on Inference and Learning Over Networks

IEEE Signal Processing Society
IEEE Transactions on Signal and Information Processing over Networks
Special Issue on Inference and Learning Over Networks

Networks are everywhere. They surround us at different levels and scales, whether we are dealing with communications networks, power grids, biological colonies, social networks, sensor networks, or distributed Big Data depositories. Therefore, it is not hard to appreciate the ongoing and steady progression of network science, a prolific research field spreading across many theoretical as well as applicative domains. Regardless of the particular context, the very essence of a network resides in the interaction among its individual constituents, and Nature itself offers beautiful paradigms thereof. Many biological networks and animal groups owe their sophistication to fairly structured patterns of cooperation, which are vital to their successful operation. While each individual agent is not capable of sophisticated behavior on its own, the combined interplay among simpler units and the distributed processing of dispersed pieces of information, enable the agents to solve complex tasks and enhance dramatically their performance. Self-organization, cooperation and adaptation emerge as the essential, combined attributes of a network tasked with distributed information processing, optimization, and inference. Such a network is conveniently described as an ensemble of spatially dispersed (possibly moving) agents, linked together through a (possibly time – varying) connection topology. The agents are allowed to interact locally and to perform in-network processing, in order to accomplish the assigned inferential task. Correspondingly, several problems such as, e.g., network intrusion, community detection, and disease outbreak inference, can be conveniently described by signals on graphs, where the graph typically accounts for the topology of the underlying space and we obtain multivariate observations associated with nodes/edges of the graph. The goal in these problems is to identify/infer/learn patterns of interest, including anomalies, outliers, and existence of latent communities. Unveiling the fundamental principles that govern distributed inference and learning over networks has been the common scope across a variety of disciplines, such as signal processing, machine learning, optimization, control, statistics, physics, economics, biology, computer, and social sciences. In the realm of signal processing, many new challenges have emerged, which stimulate research efforts toward delivering the theories and algorithms necessary to (a) designing networks with sophisticated inferential and learning abilities; (b) promoting truly distributed implementations, endowed with real-time adaptation abilities, needed to face the dynamical scenarios wherein real-world networks operate; and (c) discovering and disclosing significant relationships possibly hidden in the data collected from across networked systems and entities. This call for papers therefore encourages submissions from a broad range of experts that study such fundamental questions, including but not limited to:

• Adaptation and learning over networks.
• Consensus strategies; diffusion strategies.
• Distributed detection, estimation and filtering over networks.
• Distributed dictionary learning.
• Distributed game-theoretic learning.
• Distributed machine learning; online learning.
• Distributed optimization; stochastic approximation.
• Distributed proximal techniques, sub-gradient techniques.
• Learning over graphs; network tomography.
• Multi-agent coordination and processing over networks.
• Signal processing for biological, economic, and social networks.
• Signal processing over graphs.

Prospective authors should visit http://www.signalprocessingsociety.org/publications/periodicals/tsipn/ for information on paper submission. Manuscripts should be submitted via Manuscript Central at http://mc.manuscriptcentral.com/tsipn-ieee.

Important Dates:

• Manuscript submission: February 1, 2016
• First review completed: April 1, 2016
• Revised manuscript due: May 15, 2016
• Second review completed: July 15, 2016
• Final manuscript due: September 15, 2016
• Publication: December 1, 2016

Guest Editors:

# Randomized response, differential privacy, and the elusive biased coin

In giving talks to broader audiences about differential privacy, I’ve learned quickly (thanks to watching talks by other experts) that discussing randomized response first is an easy way to explain the kind of “plausible deniability” guarantee that differentially private algorithms give to individuals. In randomized response, the setup is that of local privacy: the simplest model is that a population of $n$ individuals with data $x_1, x_2, \ldots, x_n \in \{0,1\}$ representing some sensitive quantity are to be surveyed by an untrusted statistician. Concretely, suppose that the individual bits represent whether the person is a drug user or not. The statistician/surveyor wants to know the fraction $p = \frac{1}{n} \sum x_i$ of users in the population. However, individuals don’t trust the surveyor. What to do?

The surveyor can give the individuals a biased coin that comes up heads with probability $q < 1/2$. The individual flips the coin in private. If it comes up heads, they lie and report $y_i = 1 - x_i$. If it comes up tails, they tell the truth $y_i = x_i$. The surveyor doesn’t see the outcome of the coin, but can compute the average of the $\{y_i\}$. What is the expected value of this average?

$\mathbb{E}\left[ \frac{1}{n} \sum_{i=1}^{n} y_i \right] = \frac{1}{n} \sum_{i=1}^{n} (q (1 - x_i) + (1 -q) x_i) = q + (1 - 2q) p$.

So we can invert this to solve for $p$: if we have a reported average $\bar{y} = \frac{1}{n} \sum y_i$ then estimate $p$ by

$\hat{p} = \frac{\bar{y} - q}{ 1 - 2 q }$.

What does this have to do with differential privacy? Each individual got to potentially lie about their drug habits. So if we look at the hypothesis test for a surveyor trying to figure out if someone is a user from their response, we get the likelihood ratio

$\frac{ \mathbb{P}( y_i = 1 | x_i = 1 ) }{ \mathbb{P}( y_i = 1 | x_i = 0 ) } = \frac{1 - q}{q}$

If we set $\epsilon = \log \frac{1 - q}{q}$, we can see that the protocol guarantees differential privacy. This gives a possibly friendlier interpretation of $\epsilon$ in terms of the “lying probability” $q$. We can plot this:

Epsilon versus lying probability

This is a bit pessimistic — it says that to guarantee reasonable “lying probability” we need $\epsilon \ll 1$, but in practice this turns out to be quite difficult. Why so pessimistic? The differential privacy thread model is pretty pessimistic — it’s your plausible deniability given that everyone else in the data set has revealed their data to the surveyor “in the clear.” This is the fundamental tension in thinking about the practical implications of differential privacy — we don’t want to make conditional guarantees (“as long as everyone else is secret too”) but the price of an unconditional guarantee can be high in the worst case.

So how does randomized response work in practice? It seems we would need a biased coin. Maybe one can custom order them from Alibaba? Turns out, the answer is not really. Gelman and Nolan have an article about getting students to try and evaluate the bias of a coin — the physics of flipping would seem to dictate that coins are basically fair. You can load dice, but not coins. I recommend reading through the article — it sounds like a fun activity, even for graduate students. Maybe I’ll try it in my Detection and Estimation course next semester.

Despite the widespread prevalence of “flipping a biased coin” as a construction in probability, randomized algorithms, and information theory, a surprisingly large number of people I have met are completely unaware of the unicorn-like nature of biased coins in the real world. I guess we really are in an ivory tower, eh?