ArXiV notes : July 2012 part 2

Differentially Private Filtering and Differentially Private Kalman Filtering
Jerome Le Ny, George J. Pappas
These papers apply the differential privacy model to classical signal processing and look at the effect of adding noise before filtering and summing versus adding noise after filtering and summing. The key is that the operations have to be causal, I think — we usually think about differential privacy as operating offline or online in very restricted settings, but here the signals are coming in as time series.
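To see why the order of noising and filtering matters, here is a toy numpy sketch (not the papers' mechanism; the moving-average filter and the Laplace noise scale are made up for illustration) comparing input perturbation versus output perturbation for a causal filter:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy private time series and a causal moving-average filter.
x = rng.normal(size=1000)
h = np.ones(8) / 8.0  # filter taps

def filt(sig):
    # Causal FIR filtering: y[n] = sum_k h[k] * sig[n-k].
    return np.convolve(sig, h)[: len(sig)]

scale = 0.5  # Laplace noise scale (a stand-in for an epsilon-calibrated value)

# Input perturbation: add noise to the signal, then filter.
y_in = filt(x + rng.laplace(scale=scale, size=len(x)))

# Output perturbation: filter first, then add noise to the output.
y_out = filt(x) + rng.laplace(scale=scale, size=len(x))

y_clean = filt(x)
mse_in = np.mean((y_in - y_clean) ** 2)
mse_out = np.mean((y_out - y_clean) ** 2)
print(mse_in, mse_out)  # the averaging filter attenuates input noise; output noise passes through untouched
```

The point of the sketch is only that the filter interacts with where the noise enters, which is exactly the kind of tradeoff the papers analyze in a principled way.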

Finite sample posterior concentration in high-dimensional regression
Nate Strawn, Artin Armagan, Rayan Saab, Lawrence Carin, David Dunson
They study ultra high-dimensional linear regression (cue guitars and reverb) in the “large p, small n” setting. The goal is to get a Bernstein-von Mises type theorem — e.g. assuming the data comes from a linear model \beta_0, then using Bayesian inference the posterior should concentrate around \beta_0. They of course need a sparsity assumption, and the prior must assign reasonable mass around the true parameter and assign low probability to non-sparse signals. The methods use some ideas from compressed sensing (the Dantzig selector) and should be of interest to people working in that area.
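The paper is Bayesian, but the flavor of the "large p, small n" recovery problem is easy to simulate with a frequentist stand-in. Here is a toy sketch (the dimensions, noise level, and the ISTA/lasso solver are my own choices, not the paper's method) showing sparse recovery when p is much larger than n:

```python
import numpy as np

rng = np.random.default_rng(1)

# "Large p, small n": 100 observations, 400 coefficients, only 5 nonzero.
n, p, s = 100, 400, 5
A = rng.normal(size=(n, p)) / np.sqrt(n)
beta0 = np.zeros(p)
beta0[:s] = 2.0 * rng.choice([-1.0, 1.0], size=s)
y = A @ beta0 + rng.normal(scale=0.05, size=n)

# Iterative soft-thresholding (ISTA) for the lasso surrogate.
lam = 0.02
step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / squared spectral norm
beta = np.zeros(p)
for _ in range(2000):
    grad = A.T @ (A @ beta - y)
    z = beta - step * grad
    beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)

# With enough samples relative to the sparsity, the estimate
# lands close to the true sparse beta0.
print(np.linalg.norm(beta - beta0) / np.linalg.norm(beta0))
```

The Bayesian story in the paper is about the whole posterior concentrating, not just a point estimate, but the sample-size-versus-sparsity tradeoff has the same shape.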

Identifying Users From Their Rating Patterns
José Bento, Nadia Fawaz, Andrea Montanari, and Stratis Ioannidis
This is a paper about recommender systems as part of the 2011 CAMRa challenge. They look at the problem of re-identification of users in this data and show that looking at the time stamps of movie ratings is much more useful than looking at the rating values. This suggests to me that people should use a masking system like the one in Anant and Parv's "Incentivizing Anonymous 'Peer-to-Peer' Reviews" paper (P. Venkitasubramaniam and A. Sahai, Allerton 2008).

Mechanism Design in Large Games: Incentives and Privacy (arXiv:1207.4084)
Michael Kearns, Mallesh M. Pai, Aaron Roth, Jonathan Ullman
This paper proposes a variant of differential privacy, which they call joint differential privacy, and looks at mechanism designs that satisfy privacy and are incentive compatible. At first glance, these should be incompatible, since the latter implies "revealing the truth." The model is one in which each agent has a finite set of actions but its own payoff/value is private. This is somewhat out of my area, so I can't really get the nuances of what's going on here, but a crucial assumption is that there are a large number of agents. Joint differential privacy seems to be a form of (\epsilon,\delta) differential privacy on the utility functions of the users.

SPCOM 2012 : the talks

Well, SPCOM 2012 is over now — it was a lot of fun and a really nice-sized conference. I missed the first day of tutorials, which I heard were fantastic. Qing Zhao couldn’t make it due to visa issues but gave her tutorial over Skype. Hooray for technology!

Continue reading

CFP : IEEE Signal Processing Magazine Special Issue on Signal Processing for Cyber-security and Privacy

CALL FOR PAPERS
IEEE Signal Processing Society

IEEE SIGNAL PROCESSING MAGAZINE
Special Issue on Signal Processing for Cyber-security and Privacy

Aims and Scope:
Information technology and electronic communications have been rapidly applied to many spheres of human activity, including commerce, medicine and social networking. This has led to the creation of massive electronic repositories for distributed information storage and processing, which enables access by a large number of authorized users. The need for timely access to electronic data makes it imperative to guarantee the security and privacy of this data. Traditionally, electronic data security has been ensured via cryptographic techniques, but these distributed data systems require security and privacy mechanisms at all levels of the system. Thus, providing precise guarantees on the security and privacy of electronic information requires leveraging a range of information processing techniques beyond traditional cryptography to ensure secure distributed storage and access mechanisms. The problems of information exchange, interaction, and access lend themselves to fundamental information processing abstractions and theoretical analysis. The tools of rate-distortion theory, distributed compression algorithms, distributed storage codes, machine learning for feature identification and suppression, and compressive sensing and sampling theory are fundamental and can be applied to precisely formulate and quantify the tradeoff between utility and privacy in a variety of domains. Thus, while rate-distortion theory and information-theoretic security can provide fundamental bounds on privacy and security leakage of distributed data systems, the information and signal processing techniques of compressive sensing, machine learning, and graphical models are the key ingredients necessary to achieve these performance limits in a variety of applications involving streaming data (smart grid, intelligent data collection), distributed data storage (cloud), and interactive data applications across a number of platforms. 
This special issue seeks to provide a venue for ongoing research in information and signal processing for security and privacy applications across a wide variety of domains, including communication media (e.g. ranging from wireless networks at the edge to optical backbones at the core of the Internet), to computer systems (e.g. ranging from traditional computer architectures to distributed systems, including cloud computing).

Topics of Interest include (but are not limited to):

  • Signal processing for information-theoretic security
  • Data mining and analysis for anomaly and intrusion detection
  • Forensic analysis: device identification, recovery of lost/corrupted information
  • Information processing in the encrypted domain
  • Security in distributed storage systems
  • Codes for security in distributed storage and cloud computing
  • Location privacy and obfuscation of mobile device positioning
  • Physical layer security methods: confidentiality and authentication
  • Secure identity management
  • Formalized models for adversaries and threats
  • Techniques to achieve covert or stealthy communication
  • Stochastic models for large data repositories and for streaming data in cyber-physical systems

Submission Process:
Articles submitted to this special issue must contain significant relevance to signal processing and its application to security and privacy. All submissions will be peer reviewed according to the IEEE and Signal Processing Society guidelines for both publications. Submitted articles should not have been published or be under review elsewhere. Manuscripts should be submitted online using the Manuscript Central interface. Submissions to this special issue of the IEEE SIGNAL PROCESSING MAGAZINE should have significant tutorial value. Prospective authors should consult the Magazine site for guidelines and information on paper submission.

Important Dates: Expected publication date for this special issue is September 2013.

Guest Editors:
Lalitha Sankar, Lead GE, Arizona State University, USA, lalithasankar@asu.edu
Vincent H. Poor, Princeton University, USA, poor@princeton.edu
Mérouane Debbah, Supelec, Gif-sur-Yvette, France, merouane.debbah@supelec.fr
Kannan Ramchandran, University of California Berkeley, USA, kannanr@eecs.berkeley.edu
Wade Trappe, Rutgers University, USA, trappe@winlab.rutgers.edu

The Early History of the SVD

I recently read G.W. Stewart's little paper On the Early History of the Singular Value Decomposition (free tech report version is at UMD). It talks about how Beltrami, Jordan, Sylvester, Schmidt, and Weyl all had different approaches to finding/proving the SVD. It's worth a quick skim, because goodness knows it appears everywhere under all sorts of names. Part of the problem is characterizing the SVD, and the other is calculating it. Since numerical analysis was never part of my training, I don't have as much sophisticated appreciation for the algorithmic aspects, but I certainly benefit from having efficient solvers.

One point Stewart makes is that we really shouldn’t call the approximation theorem for the SVD the Eckart-Young Theorem, since Schmidt was really the one who showed it much earlier in the context of “integral equations, one of the hot topics of the first decades of our [the 20th] century.” I’ve been guilty of this in the past, so it’s time for me to make amends. I suppose I better start saying Cauchy-Bunyakovsky-Schwarz too.
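The approximation theorem itself is a one-liner to check numerically. Here is a quick numpy sketch of Schmidt's (a.k.a. Eckart-Young) result that the truncated SVD gives the best rank-k approximation in Frobenius norm, with error equal to the tail of the singular values:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 5))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
# Truncated SVD: keep the k largest singular values and vectors.
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Schmidt's theorem: the Frobenius-norm error of A_k equals the
# l2 norm of the discarded singular values.
err = np.linalg.norm(A - A_k, "fro")
tail = np.sqrt(np.sum(s[k:] ** 2))
print(err, tail)  # these agree
```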

What was weird to me, as an (erstwhile?) signal processor, is that there was not much mention of the Karhunen–Loève transform, even in the little paragraphs on "principal components."

New books on the foundations of signal processing

For those who are interested in signal processing, the classic Wavelets and Subband Coding by Martin Vetterli and Jelena Kovačević is a must-have. But perhaps you were unaware that the two of them, together with Vivek Goyal, have written a pair of books:

Signal Processing: Foundations
Signal Processing: Fourier and Wavelet Representations

They should give you a solid grounding in the fundamentals of signal processing from a more modern perspective. Preliminary versions are available for download from the book's website.

Bellairs Workshop 2012

The beach at Bellairs

I am spending the week at the Bellairs Research Institute in Holetown, Barbados. McGill owns this facility and faculty organize informal workshops throughout the year on various topics. There are two going on right now — one on control theory approaches in computer animation, and our workshop on signal processing in networks. The McGill side is Mark Coates and Mike Rabbat and a number of their students, both masters and PhD. Anna Scaglione and Angelia Nedich arrived recently.

The format of the workshop has been a mix of tutorials and student presentations and plenty of time for discussion and some problem formation. And of course, for the beach, which is just behind the research facility. Holetown is on the west coast of Barbados, and the Caribbean is warm and inviting. I’m having a great time, even though I am falling behind on my other projects and email and whatnot.

People from Barbados call themselves Bajans (‎/ˈbeɪdʒənz/), so one should be careful not to discuss p-values or t-tests around them.

Spread spectrum… in spaaaaaaaace…

I saw on the ArXiV earlier this month a paper on interstellar communication by Berkeley’s own David Messerschmitt. I only met him once, really, at my prelim exam oh so many years ago, but I figured I would give it a read. And here you thought spread spectrum was dead…

Prof. Messerschmitt proposes using spread-spectrum because of its combination of interference robustness and detectability. The fundamental assumption is that the receiver doesn’t know too much about the modulation strategy of the transmitter (this is a case of stochastic encoding but deterministic decoding). The choice of wide-band signaling is novel — SETI-related projects have looked for narrowband signals. The bulk of the paper is on what to do at the transmitter:

The focus of this paper is on the choice of a transmitted signal, which directly parallels the receiver’s challenge of anticipating what type of signal to expect. In this we take the perspective of a transmitter designer, because in the absence of explicit coordination it is the transmitter, and the transmitter alone, that chooses the signal. This is significant because the transmitter designer possesses far less information about the receiver’s environment than the receiver designer, due to both distance (tens to hundreds of light-years) and speed-of-light delay (tens to hundreds of years). While the receiver design can and should take into account all relevant characteristics of its local environs and available resources and technology, in terms of the narrower issue of what type of signal to expect the receiver designer must rely exclusively on the perspective of the transmitter designer.

The rest of the paper centers on designing the coding scheme which is robust to any kind of radio-frequency interference (RFI), without assuming any knowledge at the decoder — specific knowledge of the RFI (say, a statistical description) can only enhance detection, but the goal is to be robust against the modeling issues. To get this robustness, he spends a fair bit of time developing isotropic models for noise and coding (which should be familiar to information theorists of a Gaussian disposition) and then reduces the problem to looking for appropriate time and bandwidth parameters.
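For intuition about why spreading buys interference robustness, here is a toy direct-sequence spread-spectrum sketch (nothing like the paper's actual scheme; the spreading factor, interference model, and detector are all made up). The correlator at the receiver coherently adds up the signal energy across chips while the interference adds incoherently:

```python
import numpy as np

rng = np.random.default_rng(2)

N = 64  # spreading factor (processing gain)
bits = rng.integers(0, 2, size=100) * 2 - 1   # +/-1 data symbols
chips = rng.integers(0, 2, size=N) * 2 - 1    # pseudo-random spreading code

# Spread: each symbol becomes N chips.
tx = np.repeat(bits, N) * np.tile(chips, len(bits))

# Channel: strong sinusoidal interference plus Gaussian noise.
t = np.arange(len(tx))
rx = tx + 3.0 * np.cos(0.1 * t) + rng.normal(scale=1.0, size=len(tx))

# Despread: correlate each symbol-length chunk with the chip sequence.
rx_blocks = rx.reshape(len(bits), N)
decisions = np.sign(rx_blocks @ chips)

ber = np.mean(decisions != bits)
print(ber)  # despreading suppresses interference that is uncorrelated with the chips
```

The interference here has amplitude three times the chip amplitude, yet the bit error rate stays low because the correlator gives a factor-of-N boost to the matched signal.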

This is definitely more of a “communication theory” paper, but I think some of the argument could be made clearer by appeals to some things that are known in information theory. In particular, this communication problem is like coding over an AVC; the connection between spread-spectrum techniques and AVCs has been made before by Hughes and Thomas. However, translating Shannon-theoretic ideas from AVCs to concrete modulation schemes is a bit messy, and some kind of translation is needed. This paper doesn’t quite “translate” but it does bring up an interesting communication scenario : what happens when the decoder only has a vague sense of your coding scheme?

Linkage part deux

Most of these are stolen from MetaFilter.

Welcome back to public blogging, Dan.

All about time zones.

Musical instrument samples. My first UROP at MIT was at the Media Lab, where I helped record instrumentalists as part of a musical instrument identification system. Paris Smaragdis was there at the time, and now he is at UIUC where he has a lot of cool audio demos. There are also some great clips Inside the Music Library at the BBC.

Ridiculous computer interfaces from movies.

In case you are in Austin…

I’m giving a talk on Friday, so come on down! This has been your daily self-promotion effort, thank you for reading.

Consensus in context : leveraging the network to accelerate distributed consensus

October 30, 2009 ENS 637
11:00 am

Gossip algorithms are a class of decentralized solutions to the problem of achieving consensus in a network of agents. They have attracted recent research interest because they are simple and robust — attractive qualities for wireless ad-hoc and sensor networks. Unfortunately, the standard gossip protocol converges very slowly for many popular network models. I will discuss three ways to leverage properties of the network to achieve faster convergence : routing, broadcast, and mobility.

Joint work with Alex G. Dimakis, Tuncer Can Aysal, Mehmet Ercan Yildiz, Martin Wainwright, and Anna Scaglione.
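For readers who haven't seen gossip before, the standard pairwise protocol is only a few lines of numpy. Here is a toy sketch on a ring network (the network, values, and iteration count are arbitrary, and this is the slow baseline protocol, not the accelerated schemes in the talk):

```python
import numpy as np

rng = np.random.default_rng(3)

# Ring network of n nodes, each holding a scalar value.
n = 20
x = rng.normal(size=n)
target = x.mean()  # consensus value: pairwise averaging preserves the sum

# Standard randomized gossip: at each step a random node averages
# its value with a randomly chosen neighbor.
for _ in range(10000):
    i = rng.integers(n)
    j = (i + rng.choice([-1, 1])) % n  # a neighbor on the ring
    avg = (x[i] + x[j]) / 2.0
    x[i] = avg
    x[j] = avg

print(np.max(np.abs(x - target)))  # all values close to the average
```

On the ring this takes on the order of n^3 pairwise exchanges to converge, which is exactly the kind of slowness that routing, broadcast, and mobility are meant to fix.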

how JPEG works?

In this month’s Notices of the AMS, there was an article on how JPEG works [pdf]. These little articles are supposed to be about neat topics in mathematics or applications, and they are often uneven and sometimes contentious, as the Koblitz article [pdf] has recently demonstrated.

Even though I do signal processing research, I hadn't looked under the hood of JPEG too carefully in any of my classes, so it was interesting to me to read about the process of translating RGB to luminance and two chrominances, the DCT coefficient quantization scheme, and so on. The bit at the end on wavelets in JPEG-2000 I already knew about. But I think that for a mathematician the "why" was lost in all of the details of Huffman coding, Cohen-Daubechies wavelets, and so on. All of the engineering choices were made (in terribly contentious standards meetings, I'm sure) for a reason, and the choice of the mathematics is really a choice of model of "natural" images. He gives a good explanation of luminance-chrominance vs. RGB in terms of the visual system, but what about the Fourier transform in terms of capturing edge detail in high frequency coefficients?
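As a toy version of the core of the pipeline, here is an 8x8 DCT-quantize-reconstruct round trip in numpy (the single scalar quantization step stands in for JPEG's full 8x8 quantization table, and there is no zigzag scan or Huffman coding). The point is that for a smooth block, most quantized coefficients come out zero:

```python
import numpy as np

rng = np.random.default_rng(4)

def dct_matrix(N=8):
    # Orthonormal type-II DCT matrix: rows are frequencies, columns samples.
    k = np.arange(N)
    C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * N))
    C[0, :] = np.sqrt(1.0 / N)
    return C

C = dct_matrix()
dct2 = lambda b: C @ b @ C.T       # 2-D DCT via separability
idct2 = lambda c: C.T @ c @ C      # inverse (C is orthonormal)

# A smooth 8x8 "image" block: energy concentrates in low frequencies.
xx, yy = np.meshgrid(np.arange(8), np.arange(8))
block = 128 + 40 * np.cos(np.pi * xx / 8) + rng.normal(scale=2, size=(8, 8))

coeffs = dct2(block - 128)           # level shift, then DCT
step = 16.0                          # a single made-up quantization step
quantized = np.round(coeffs / step)  # most high-frequency coefficients round to zero
recon = idct2(quantized * step) + 128

print(np.mean(quantized == 0), np.max(np.abs(recon - block)))
```

The zeros are what makes the subsequent entropy coding effective: long runs of zero coefficients compress very well.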

Unfortunately for me, the article ended up confirming my pessimistic view that standards look like a sequence of ad-hoc decisions. There's lots of theory behind those implementation choices, though, and I think surfacing it would appeal to more mathematicians.