Juggling, (a)synchrony, and queues

Research often takes twisty little paths; as a result of a recent problem I was working on, I found myself trying to understand the difference between the following two systems with k balls and n (ordered) bins:

  1. Synchronous: take the top ball from each nonempty bin and reassign them all randomly and uniformly to the bottoms of the bins.
  2. Asynchronous: pick a random bin, take the top ball in that bin, and reassign it randomly and uniformly to the bottom of a bin.

These processes sound a bit similar, right? The first one is a batch version of the second one. Sort of. We can think of this as modeling customers (balls) in queues (bins) or balls being juggled by n hands (bins).

Each of these processes can be modeled as a Markov chain on the vector of bin occupation numbers. For example, for 3 balls and 3 bins we have configurations that look like (3,0,0) and its permutations, (2,1,0) and its permutations, and (1,1,1) for a total of 10 states. If you look at the two Markov chains, they are different, and it turns out they even have different stationary distributions. Why is that? The asynchronous chain has symmetric transition probabilities, so it is reversible and its stationary distribution is uniform over the 10 states. The synchronous one is not reversible.

One question is whether there is a limiting sense in which these are similar — can the synchronous batch-recirculating scheme be approximated by the asynchronous version if we let n or k get very large?
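The two chains are small enough to check directly. Here is a quick sketch (my own construction, not from any worked-out analysis) that builds both 10-state transition matrices for k = n = 3 and finds their stationary distributions by power iteration. I am assuming that an asynchronous pick of an empty bin is a no-op, and that reassignment is uniform over all n bins, including the source bin:

```python
import itertools
from math import factorial

n, k = 3, 3  # bins, balls

# all ordered occupancy vectors (compositions of k into n parts): 10 states
states = [s for s in itertools.product(range(k + 1), repeat=n) if sum(s) == k]
index = {s: i for i, s in enumerate(states)}

def sync_row(s):
    """One synchronous step: pop the top ball of every nonempty bin,
    then drop each popped ball into a uniformly random bin."""
    row = [0.0] * len(states)
    m = sum(1 for c in s if c > 0)                # balls in flight
    base = tuple(c - 1 if c > 0 else 0 for c in s)
    for v in itertools.product(range(m + 1), repeat=n):
        if sum(v) != m:
            continue
        p = factorial(m) * (1.0 / n) ** m          # multinomial probability
        for vi in v:
            p /= factorial(vi)
        row[index[tuple(b + vi for b, vi in zip(base, v))]] += p
    return row

def async_row(s):
    """One asynchronous step: pick a uniform bin; if nonempty, move its
    top ball to a uniformly random bin (an empty pick does nothing)."""
    row = [0.0] * len(states)
    for i in range(n):
        if s[i] == 0:
            row[index[s]] += 1.0 / n
            continue
        for j in range(n):
            t = list(s); t[i] -= 1; t[j] += 1
            row[index[tuple(t)]] += 1.0 / n ** 2
    return row

def stationary(rows):
    pi = [1.0 / len(states)] * len(states)
    for _ in range(5000):                          # power iteration
        pi = [sum(pi[i] * rows[i][j] for i in range(len(states)))
              for j in range(len(states))]
    return pi

pi_sync = stationary([sync_row(s) for s in states])
pi_async = stationary([async_row(s) for s in states])
```

Under these assumptions the asynchronous chain comes out uniform over the 10 states (its transition matrix is doubly stochastic), while the synchronous chain shifts mass toward the spread-out configuration (1,1,1).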

Postdoc position (one year) in network information theory

Giuseppe Caire has a one-year postdoc position at the University of Southern California. Candidates should have finished their PhD in some area of network information theory. Interested candidates should contact Prof. Caire at giuseppe.caire [at] gmail.com.

Postdoc Announcement : wireless network information theory

Salman Avestimehr has a postdoc position open in his group at Cornell. He says it is quite flexible (not tied to a very specific topic) and is broadly in the area of wireless network information theory. Current students and postdocs should contact Salman at avestimehr [at] ece.cornell.edu for more information.

What we want are young fresh faces

Via SEK I read about a job ad at Colorado State which specifies that applicants should have received their PhD from 2010 onwards. Sorry 2009-ers, you’re not eligible.

According to the Colorado State jobs site:

Colorado State University does not discriminate on the basis of race, age, color, religion, national origin or ancestry, sex, gender, disability, veteran status, genetic information, sexual orientation, or gender identity or expression. Colorado State University is an equal opportunity/equal access/affirmative action employer fully committed to achieving a diverse workforce and complies with all Federal and Colorado State laws, regulations, and executive orders regarding non-discrimination and affirmative action. The Office of Equal Opportunity is located in 101 Student Services.

Because I was interested, I called the Office of Equal Opportunity at (970) 491-5836 to see if their office had vetted the ad and whether or not requiring a post-2010 PhD constituted age discrimination. This is what they told me, not specifically about this position, but in general about ads of this type:

  • Fort Collins is a nice place to live and they get lots of people who apply to jobs who have more experience (even at the associate level, so they claim). They want to make it very clear that those people are not welcome to apply.
  • It’s a salary thing: they don’t want to pay for the experience of someone with a pre-2010 PhD.
  • They want to be “very clear about what they are looking for” (sounds ominous!)
  • This is a way of cutting down on the number of applications they have to read. Sure they may lose out on some qualified candidates. Implicit in this is that it’s an employer’s market out there.

All in all, this does not paint a rosy picture for job applicants. It would be nice if Michael Bérubé could weigh in on this in his capacity as MLA President.

Linkage

I’m a big fan of 99% Invisible, a podcast about design and architecture and… stuff. They rebroadcast a longer piece about U.N. Plaza in San Francisco, and it’s fascinating. I was living in Berkeley at the time much of this went down, and I was semi-unaware of it.

A post from the new desi blog, Brown Town, about Freddie Mercury.

My cousin Supriya has e-books in Hindi, Marathi, and Gujarati available. You know, for kids!

My very dear (but not so near) friend Charlotte has a new blog : Mary Magdalene Girl in which she will discuss gender and her un-diasporaing (not a word) by moving to Ireland.

Photos of yakuza. Unrelatedly, construction worker fashion in Japan.

Approaches to the research statement

It’s job season again and I am revising my research statement. I was pretty happy with the last iteration, but as things change I might need to find a new story for my life. As I get farther along, it has become harder to cram all of the things I’ve worked on into a single consistent story. Even some grad students I know have worked on several distinct things, and they probably have the same problem. There’s a tension in the research statement between telling a coherent story and accurately representing your work. There are a few generic ways of handling this, it seems.

The omnibus. You can write many mini-stories, one about each of the major projects, and then have a section for “miscellaneous other papers.” This approach eschews the overarching framework idea and instead just goes for local internal consistency.

The proposal. Instead of talking about (or even mentioning) all of your work, you propose one research direction and give short shrift to the rest. This has the advantage of letting you write more fully about one topic and provide sufficient context and exciting new research directions, but then again you’re misrepresenting the actual balance of your research interests.

The tractatus. You develop some principles or philosophical underpinnings for your work and then try to connect everything you’ve done to these and explain your future work ideas as further developing these themes. This approach goes for consistency above all else. The advantage is coherence, and the disadvantage is that some projects may have to get strong-armed into it.

I am sure there are more varieties out there, but on the whole the research statement is a weird document — part description, part proposal. You can’t make it only about your existing work because that’s looking to the past, but you can’t make it a proposal because the reader is actually trying to learn what you are interested in.

Broadcast gossip revisited : companion variables and convergence to consensus

Mike Rabbat pointed out his new preprint on ArXiv:

Broadcast Gossip Algorithms for Consensus on Strongly Connected Digraphs
Wu Shaochuan, Michael G. Rabbat

The basic starting point is the unidirectional Broadcast Gossip algorithm — we have n nodes in a graph and each node i starts with a value x_i(0). At each time, a random node broadcasts its value to its neighbors in the graph and they each update their value with the weighted average of their current value and the received value. Eventually, this process converges almost surely and all nodes i will have a common value \hat{x}. However, in general \hat{x} \ne \bar{x}, where \bar{x} = \frac{1}{n} \sum x_i(0), but \mathbb{E}[\hat{x}] = \bar{x}. So this process achieves consensus but only computes the average in expectation.

The reason broadcast gossip doesn’t compute the average is pretty clear — since the communication and updates are unidirectional, the average of the nodes’ values changes over time. One way around this is to use companion variables \{y_i(t)\} to track the effect of the broadcast. These have been studied before, but in this paper they set the initial values y_i(0) = 0 and perform updates as follows : if node k broadcasts at time t and node j is a neighbor, then

x_j(t+1) = (1 - a_{j,k}) x_j(t) + a_{j,k} x_k(t) + \epsilon d_j^{(k)} y_j(t)
y_j(t+1) = a_{j,k} (x_j(t) - x_k(t)) + (1 - \epsilon d_j^{(k)}) y_j(t) + b_{j,k} y_k(t)
x_k(t+1) = x_k(t)
y_k(t+1) = 0

So what’s going on here? Each time a node transmits, it resets its companion variable to 0. Each time it receives a broadcast, it accumulates an offset in y_j. The actual value estimate at a node is a weighted average of its current and received values, plus some of the local offset.
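As an entirely illustrative sketch of this update rule, here is a simulation on a ring. The choices of graph, uniform weight a, constant d_j^{(k)} = 1, and b_{j,k} = 1/deg(k) are my own assumptions, not the paper’s tuning; the last is picked so that each step preserves sum(x) + sum(y):

```python
import random

# Illustrative simulation of broadcast gossip with companion variables.
# Graph and parameters (ring, uniform a, d = 1, b = 1/deg(k)) are
# assumptions for this sketch, not the tuning from the paper.
random.seed(0)
n = 10
neighbors = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}  # ring graph
x = [random.random() for _ in range(n)]   # x_i(0)
y = [0.0] * n                             # companion variables y_i(0) = 0
xbar = sum(x) / n                         # the average we hope to compute
a, eps, d = 0.5, 0.01, 1.0

for t in range(5000):
    k = random.randrange(n)               # a random node broadcasts
    b = 1.0 / len(neighbors[k])           # so the k-th column of B sums to 1
    for j in neighbors[k]:
        xj, yj = x[j], y[j]               # values before the update
        x[j] = (1 - a) * xj + a * x[k] + eps * d * yj
        y[j] = a * (xj - x[k]) + (1 - eps * d) * yj + b * y[k]
    y[k] = 0.0                            # broadcaster resets its companion

# With b = 1/deg(k), each step distributes y_k to the neighbors' companion
# variables before zeroing it, so sum(x) + sum(y) is invariant; if y -> 0
# at consensus, the common value must then be xbar itself.
mass = sum(x) + sum(y)
```

The conservation of sum(x) + sum(y) is the whole point of the companion variables: the “mass” that plain broadcast gossip leaks is parked in y and eventually folded back into x.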

The parameters of the algorithm are matrices A = (a_{j,k}) and B = (b_{j,k}) and per-node matrices D_k = (d_j^{(k)}). Let

A_k = A e_k e_k^T
B_k = B e_k e_k^T
L_k = \mathrm{diag}(A_k \mathbf{1}) - A_k
S_k = I - e_k e_k^T + B_k

When node k broadcasts, we can write a matrix

W_k = \left[ \begin{array}{cc} I - L_k & \epsilon D_k \\ L_k & S_k - \epsilon D_k \end{array} \right]

such that

\left[ \begin{array}{c} \mathbf{x}(t+1) \\ \mathbf{y}(t+1) \end{array} \right] = W(t) \left[ \begin{array}{c} \mathbf{x}(t) \\ \mathbf{y}(t) \end{array} \right]

where W(t) = W_k if node k broadcasts at time t.
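To check the bookkeeping, here is a small sketch (pure Python, with arbitrary illustrative values for A, B, and D_k on a ring, all my own choices) that assembles W_k entrywise from the block definition above and confirms that multiplying the stacked state by W_k reproduces the per-node updates:

```python
import random

# Sanity check: W_k = [[I - L_k, eps*D_k], [L_k, S_k - eps*D_k]] applied to
# [x; y] should match the componentwise update rules. All values below are
# arbitrary illustrative choices, not the paper's parameters.
random.seed(1)
n = 5
nbrs = {k: [(k - 1) % n, (k + 1) % n] for k in range(n)}   # ring graph
eps = 0.05
A = [[0.4 if j in nbrs[k] else 0.0 for k in range(n)] for j in range(n)]
B = [[1.0 / len(nbrs[k]) if j in nbrs[k] else 0.0 for k in range(n)]
     for j in range(n)]
D = [[1.0 if j in nbrs[k] else 0.0 for k in range(n)] for j in range(n)]

def make_W(k):
    """Build the 2n x 2n matrix W_k as a list of rows."""
    W = [[0.0] * (2 * n) for _ in range(2 * n)]
    for j in range(n):
        a, d = A[j][k], eps * D[j][k]
        W[j][j] = 1.0 - a            # top-left: I - L_k (diagonal part)
        W[j][k] += a                  # top-left: I - L_k (k-th column)
        W[j][n + j] += d              # top-right: eps * D_k (diagonal)
        W[n + j][j] = a               # bottom-left: L_k
        W[n + j][k] += -a
        if j != k:                    # bottom-right: S_k - eps*D_k,
            W[n + j][n + j] += 1.0    # with S_k = I - e_k e_k^T + B_k
        W[n + j][n + k] += B[j][k]
        W[n + j][n + j] -= d
    return W

x = [random.random() for _ in range(n)]
y = [random.random() for _ in range(n)]
k = 2                                 # suppose node 2 broadcasts
z = x + y                             # stacked state [x; y]
Wz = [sum(row[i] * z[i] for i in range(2 * n)) for row in make_W(k)]

# the same step, written out per node
x2, y2 = x[:], y[:]
for j in nbrs[k]:
    x2[j] = (1 - A[j][k]) * x[j] + A[j][k] * x[k] + eps * D[j][k] * y[j]
    y2[j] = (A[j][k] * (x[j] - x[k]) + (1 - eps * D[j][k]) * y[j]
             + B[j][k] * y[k])
y2[k] = 0.0                           # broadcaster's companion resets
```

Note that row n+k of W_k comes out identically zero (given zero diagonals for A, B, and D_k), which is exactly the y_k(t+1) = 0 reset.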

The main results are:

  1. If we choose \epsilon small enough, then \lim_{t \to \infty} \mathbb{E}[ \mathbf{x}(t) | \mathbf{x}(0) ] = \hat{x} \mathbf{1}, and under additional conditions \hat{x} = \bar{x}.
  2. A sufficient condition for the second moment of the error to go to 0.
  3. Rules on how to pick the parameters, especially \epsilon.

There are also some nice simulations. Broadcast gossip is nice because of its simplicity, and adding a little monitoring/control variable y_j(t) seems to buy a lot in terms of controlling bad sample paths for the process.

The editor-in-chief has a new stick

Here’s a little bit of news which may have escaped notice by some in the information theory community:

“In view of its concerns about excessive reviewing delays in the IT Transactions, the BoG authorizes the EiC in his sole judgment to delay publication of papers by authors who are derelict in their reviewing duties.”

Reviewers may be considered to be derelict if they habitually decline or fail to respond to review requests, or accept but then drop out; or if they habitually submit perfunctory and/or excessively delayed reviews.

I’ve said it before, but I think that transparency is the thing that will make reviews more timely — what is the distribution of review times? What is the relationship between paper length and review length? Plots like these may shock people and give them some perspective on their own behavior. I bet some people who are “part of the problem” don’t even realize that they are part of the problem.