# Call for Papers: T-SIPN Special Issue on Distributed Information Processing in Social Networks

IEEE Signal Processing Society
IEEE Transactions on Signal and Information Processing over Networks
Special Issue on Distributed Information Processing in Social Networks

Over the past few decades, online social networks such as Facebook and Twitter have significantly changed the way people communicate and share information with each other. The opinion and behavior of each individual are heavily influenced through interacting with others. These local interactions lead to many interesting collective phenomena such as herding, consensus, and rumor spreading. At the same time, there is always the danger of mob mentality of following crowds, celebrities, or gurus who might provide misleading or even malicious information. Many efforts have been devoted to investigating the collective behavior in the context of various network topologies and the robustness of social networks in the presence of malicious threats. On the other hand, activities in social networks (clicks, searches, transactions, posts, and tweets) generate a massive amount of decentralized data, which is not only big in size but also complex in terms of its structure. Processing these data requires significant advances in accurate mathematical modeling and computationally efficient algorithm design. Many modern technological systems such as wireless sensor and robot networks are virtually the same as social networks in the sense that the nodes in both networks carry disparate information and communicate with constraints. Thus, investigating social networks will bring insightful principles on the system and algorithmic designs of many engineering networks. An example of such is the implementation of consensus algorithms for coordination and control in robot networks. Additionally, more and more research projects nowadays are data-driven. Social networks are natural sources of massive and diverse big data, which present unique opportunities and challenges to further develop theoretical data processing toolsets and investigate novel applications. This special issue aims to focus on addressing distributed information (signal, data, etc.) processing problems in social networks and also invites submissions from all other related disciplines to present comprehensive and diverse perspectives. Topics of interest include, but are not limited to:

• Dynamic social networks: time varying network topology, edge weights, etc.
• Social learning, distributed decision-making, estimation, and filtering
• Consensus and coordination in multi-agent networks
• Modeling and inference for information diffusion and rumor spreading
• Multi-layered social networks where social interactions take place at different scales or modalities
• Resource allocation, optimization, and control in multi-agent networks
• Modeling and strategic considerations for malicious behavior in networks
• Social media computing and networking
• Data mining, machine learning, and statistical inference frameworks and algorithms for handling big data from social networks
• Data-driven applications: attribution models for marketing and advertising, trend prediction, recommendation systems, crowdsourcing, etc.
• Other topics associated with social networks: graphical modeling, trust, privacy, engineering applications, etc.

Important Dates:

Manuscript submission due: September 15, 2016
First review completed: November 1, 2016
Revised manuscript due: December 15, 2016
Second review completed: February 1, 2017
Final manuscript due: March 15, 2017
Publication: June 1, 2017

Guest Editors:

Zhenliang Zhang, Qualcomm Corporate R&D (zhenlian@qti.qualcomm.com)
Wee Peng Tay, Nanyang Technological University (wptay@ntu.edu.sg)
Moez Draief, Imperial College London (m.draief@imperial.ac.uk)
Xiaodong Wang, Columbia University (xw2008@columbia.edu)
Edwin K. P. Chong, Colorado State University (edwin.chong@colostate.edu)
Alfred O. Hero III, University of Michigan (hero@eecs.umich.edu)

# more not-so-recent hits from ArXiV

arXiv:1403.4696 [math.OC]
Design and Analysis of Distributed Averaging with Quantized Communication
Mahmoud El Chamie, Ji Liu, Tamer Başar

The goal of this paper is to analyze the “performance of a subclass of deterministic distributed averaging algorithms where the information exchange between neighboring nodes (agents) is subject to uniform quantization.” I was interested in the connections to Lavaei and Murray’s TAC paper. Here though, they consider a standard consensus setup with a doubly stochastic weight matrix $W$ and deterministic, rather than randomized, quantization. They consider two types — rounding and truncation (essentially the floor operation). The update rule is $x_i(t+1) = x_i(t) + \sum_{j} W_{ij} (Q(x_j(t)) - Q(x_i(t)))$ where $Q$ is the quantization operation. They shoe that in finite time the agents either reach a consensus on the floor of the average of their initial values, or that the cycle indefinitely in a neighborhood around the average. They they show how to control the size of the neighborhood in a decentralized way. There are a lot of works on quantized consensus that have appeared in the last 5 years, and to be honest I haven’t really kept up on the recent literature, so I’m not sure how to compare this to the other works that have appeared, but perhaps some of the readers of the blog have…

arXiv:1403.4699 [math.OC]
A Proximal Stochastic Gradient Method with Progressive Variance Reduction
Lin Xiao, Tong Zhang

This paper looks at convex optimization problems of the form

$\min_{x \in \mathbb{R}^d} F(x) + R(x)$

where the overall objective $P(x) + R(x)$ is strongly convex, the regularizer $R(x)$ is lower semicontinuous and convex, and the term $F(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x)$ separates into a sum of function $f_i(x)$ which are Lipschitz continuous. The proximal gradient method is an iterative procedure for solving this program that does the following:

$x_{t} = \mathrm{argmin}_{x \in \mathbb{R}^d} \left\{ \nabla F(x_{t})^{\top} x + \frac{1}{2 \eta_t} \| x - x_{t-1} \|^2 + R(x) \right\}$

If we define the $\mathrm{prox}_{R}$ function as

$\mathrm{prox}_R(y) = \mathrm{argmin}_{x \in \mathbb{R}^d} \left\{ \frac{1}{2} \|x - y\|^2 + R(x) \right\}$

then the step looks like:

$x_t = \mathrm{prox}_{\eta_t R}(x_{t-1} - \eta_{t} \nabla F(x_{t-1}))$

A stochastic gradient (SG) version of this is

$x_t = \mathrm{prox}_{\eta_t R}(x_{t-1} - \eta_{t} \nabla f_{i_t}(x_{t-1}))$

where $i_t$ is sampled uniformly from $\{1,2,\ldots, n\}$ at each time. The advantage of the SG variant is that it takes less time to do one iteration, but each iteration is much noisier. The goal of this paper is to adapt a previous method/approach to variance reduction to improve the performance of the Prox-SG algorithm. The approach is one or resampling points according to the Lipschitz constants. This sort of “sampling based adaptivity” was also used by my ex-colleague Samory Kpotufe and collaborators in their NIPS paper from 2012 (a longer version is under review). At least I think they’re related.

arXiv:1403.5341v1 [cs.LG]
An Information-Theoretic Analysis of Thompson Sampling
Daniel Russo, Benjamin Van Roy

In a multi-armed bandit problem we have a set of actions (arms) $A$ and at each time the learner picks an action $a$ and observes an outcome $Y_t(a) \in \mathcal{Y}$ which is assigned a reward by a function $R: \mathcal{Y} \to \mathbb{R}$. The rewards are assumed to be i.i.d. across time for each action with distributions $p(y | a)$ that are unknown to the learner. The goal is to maximize the reward, which is the same as finding the arm with the largest expected reward. This leads to a classical explore/exploit tradeoff where the learner has to decide whether to explore new arms which may have higher expected reward, or continue exploiting the reward offered by the current arm. Thompson sampling is a Bayesian approach where the learner starts with a prior on the best action and then samples actions at each time according to its posterior belief on the best arm. The authors here analyze the regret of such a policy in terms of what they call the information gain of the system. This gain depends on the ratio between two quantities that are functions of the outcome distributions $p(y | a)$. One is what they call the “divergence in mean,” namely the difference in expected reward between arms, and the other is the KL divergence.

# More ArXiV skims

Assumptionless consistency of the Lasso
Sourav Chatterjee
The title says it all. Given $p$-dimensional data points $\{ \mathbf{x}_i : i \in [n] \}$ the Lasso tries to fit the model $\mathbb{E}( y_i | \mathbf{x_i}) = \boldsymbol{\beta} \mathbf{x}_i$ by minimizing the $\ell^1$ penalized squared error
$\sum_{i=1}^{n} (y_i - \boldsymbol{\beta} \mathbf{x}_i)^2 + \lambda \| \boldsymbol{\beta} \|_1$.
The paper analyzes the Lasso in the setting where the data are random, so there are $n$ i.i.d. copies of a pair of random variables $(\mathbf{X},Y)$ so the data is $\{(\mathbf{X}_i, Y_i) : i \in [n] \}$. The assumptions are on the random variables $(\mathbf{X},Y)$ : (1) each coordinate $|X_i| \le M$ is bounded, the variable $Y = (\boldsymbol{\beta}^*)^T \mathbf{X} + \varepsilon$, and $\varepsilon \sim \mathcal{N}(0,\sigma^2)$, where $\boldsymbol{\beta}^*$ and $\sigma$ are unknown constants. Basically that’s all that’s needed — given a bound on $\|\boldsymbol{\beta}\|_1$, he derives a bound on the mean-squared prediction error.

On Learnability, Complexity and Stability
Silvia Villa, Lorenzo Rosasco, Tomaso Poggio
This is a handy survey on the three topics in the title. It’s only 10 pages long, so it’s a nice fast read.

Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression
Francis Bach
A central challenge in stochastic optimization is understanding when the convergence rate of the excess loss, which is usually $O(1/\sqrt{n})$, can be improved to $O(1/n)$. Most often this involves additional assumptions on the loss functions (which can sometimes get a bit baroque and hard to check). This paper considers constant step-size algorithms but where instead they consider the averaged iterate $\latex \bar{\theta}_n = \sum_{k=0}^{n-1} \theta_k$. I’m trying to slot this in with other things I know about stochastic optimization still, but it’s definitely worth a skim if you’re interested in the topic.

On Differentially Private Filtering for Event Streams
Jerome Le Ny
Jerome Le Ny has been putting differential privacy into signal processing and control contexts for the past year, and this is another paper in that line of work. This is important because we’re still trying to understand how time-series data can be handled in the differential privacy setting. This paper looks at “event streams” which are discrete-valued continuous-time signals (think of count processes), and the problem is to design a differentially private filtering system for such signals.

Gossips and Prejudices: Ergodic Randomized Dynamics in Social Networks
Paolo Frasca, Chiara Ravazzi, Roberto Tempo, Hideaki Ishii
This appears to be a gossip version of Acemoglu et al.’s work on “stubborn” agents in the consensus setting. They show similar qualitative behavior — opinions fluctuate but their average over time converges (the process is ergodic). This version of the paper has more of a tutorial feel to it, so the results are a bit easier to parse.

/

# Bellairs Workshop 2012

The beach at Bellairs

I am spending the week at the Bellairs Research Institute in Holetown, Barbados. McGill owns this facility and faculty organize informal workshops throughout the year on various topics. There are two going on right now — one on control theory approaches in computer animation, and out workshop on signal processing in networks. The McGill side is Mark Coates and Mike Rabbat and a number of their students, both masters and PhD. Anna Scaglione and Angelia Nedich arrived recently.

The format of the workshop has been a mix of tutorials and student presentations and plenty of time for discussion and some problem formation. And of course, for the beach, which is just behind the research facility. Holetown is on the west coast of Barbados, and the Caribbean is warm and inviting. I’m having a great time, even though I am falling behind on my other projects and email and whatnot.

People from Barbados call themselves Bajans (‎/ˈbeɪdʒənz/), so one should be careful not to discuss p-values or t-tests around them.

# ITA Workshop 2012 : Talks

The ITA Workshop finished up today, and I know I promised some blogging, but my willpower to take notes kind of deteriorated during the week. For today I’ll put some pointers to talks I saw today which were interesting. I realize I am heavily blogging about Berkeley folks here, but you know, they were interesting talks!

Nadia Fawaz talked about differential privacy for continuous observations : in this model you see $x_1, x_2, x_3, \ldots$ causally and have to estimate the running sum. She had two modifications, one in which you only want a windowed running sum, say for $W$ past values, and one in which the privacy constraint decays and expires after a window of time $W$, so that values $W$ time steps in the past do not have to be protected at all. This yields some differences in the privacy-utility tradeoff in terms of the accuracy of computing the function.

David Tse gave an interesting talk about sequencing DNA via short reads as a communication problem. I had actually had some thoughts along these lines earlier because I am starting to collaborate with my friend Tony Chiang on some informatics problems around next generation sequencing. David wanted to know how many (noiseless) reads $N$ you need to take of a genome of of length $G$ using reads of length $L$. It turns out that the correct scaling in this model is $L/\log G$. Some scaling results were given in a qualitative way, but I guess the quantitative stuff is being written up still.

Michael Jordan talked about the “big data bootstrap” (paper here). You have $n$ data points, where $n$ is huge. The idea is to subsample a set of size $b$ and then do bootstrap estimates of size $n$ on the subsample. I have to read the paper on this but it sounds fascinating.

Anant Sahai talked about how to look at some decentralized linear control problems as implicitly doing some sort of network coding in the deterministic model. One way to view this is to identify unstable modes in the plant as communicating with each other using the controllers as relays in the network. By structurally massaging the control problem into a canonical form, they can make this translation a bit more formal and can translate results about linear stabilization from the 80s into max-flow min-cut type results for network codes. This is mostly work by Se Yong Park, who really ought to have a more complete webpage.

Paolo Minero talked about controlling a linear plant over a rate-limited communication link whose capacity evolves according to a Markov chain. What are the conditions on the rate to ensure stability? He made a connection to Markov jump linear systems that gives the answer in the scalar case, but the necessary and sufficient conditions in the vector case don’t quite match. I always like seeing these sort of communication and control results, even though I don’t work in this area at all. They’re just cool.

There were three talks on consensus in the morning, which I will only touch on briefly. Behrouz Touri gave a talk about part of his thesis work, which was on the Hegselman-Krause opinion dynamics model. It’s not possible to derive a Lyapunov function for this system, but he found a time-varying Lyapunov function, leading to an analysis of the convergence which has some nice connections to products of random stochastic matrices and other topics. Ali Jadbabaie talked about work with Pooya Molavi on non-Bayesian social learning, which combines local Bayesian updating with DeGroot consensus to do distributed learning of a parameter in a network. He had some new sufficient conditions involving disconnected networks that are similar in flavor to his preprint. José Moura talked about distributed Kalman filtering and other consensus meets sensing (consensing?) problems. The algorithms are similar to ones I’ve been looking at lately, so I will have to dig a bit deeper into the upcoming IT Transactions paper.

# Allerton 2011

Despite Yury‘s attempts to get me to “stop blogging,” here is my much-delayed recap of Allerton. Because of moving to TTI-Chicago right after Allerton and then almost immediately shuttling back to UCSD for the iDASH Privacy Workshop, things have been a bit delayed. I could only attend for two days but I wanted to highlight a few interesting talks that I saw. More Allerton blogging was done by Michael Mitzenmacher (part 1, part 2) and a bit by Maxim Raginsky about his talk on causal calculus (since he blogged about it I don’t have to, ha!). The conference has gotten too big and the rooms are too small to hold the audience, so it probably is time to move the thing. We have similar issues at ITA and the 2012 ITA Workshop is moving off campus next year (you heard it here first, folks!)

But here are some interesting talks I saw: