a paper a day : neuron polarity

I’ll keep ’em shorter from now on I think.

A probabilistic model for the establishment of neuron polarity
Konstantin Khanin and Raya Khanin
J. Math. Biol. 42, 26-40 (2001)

This paper seeks to answer a puzzling question in neuron development: why does only one neurite (a growth site on the neuron) out of many grow into an axon? The answer they propose lies in a particular kind of inhomogeneous Markov chain, or random walk, that has also been called a “balls-in-bins process with feedback.” This kind of model has also been used to analyze competitive scenarios in other fields, as well as in preferential attachment models for networks. They model neurites as bins and neurite length as the number of balls in a bin.

The basic model is this: suppose we have k bins which contain balls. Let a(t) be a length-k vector whose entries are the numbers of balls in the bins at time t, and let f(n) be a function that we will call the value of having n balls. At each time step, a ball is added to one of the bins at random. The probability that bin j is chosen is

    P( bin j gets the next ball ) = f(a_j(t)) / [ f(a_1(t)) + f(a_2(t)) + … + f(a_k(t)) ].

That is, the probability is equal to the value of bin j over the sum of the values of all the bins.
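Here is a minimal sketch of a single step of this process in Python (my own illustration, not from the paper; the value function and starting counts are arbitrary, and I assume each bin starts with at least one ball so the weights are well defined):

```python
import random

def step(bins, f):
    """Add one ball: bin j is chosen with probability f(a_j) / sum_i f(a_i)."""
    weights = [f(a) for a in bins]
    j = random.choices(range(len(bins)), weights=weights)[0]
    bins[j] += 1
    return j

# Start k = 3 bins with one ball each and take a few steps with f(n) = n^2.
bins = [1, 1, 1]
for _ in range(10):
    step(bins, f=lambda n: n ** 2)
print(bins)
```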

In this paper they look at value functions of the form

    f(n) = n^p,  p > 0,

and show the following regimes of behavior (a small simulation sketch follows the list):

  • Critical regime (p = 1) : all bins grow linearly with time, but there is one winner.
  • Ballistic growth (p > 1) : after a certain time, only one neurite grows. Put another way, all but finitely many balls end up in a single bin.
  • Subcritical + winner (1/2 < p < 1) : In this case the ratio of the number of balls in any two bins tends to one, so all bins grow linearly. But if b(t) = a(t) – t/k is the fluctuation around the linear growth, then

        b_j(t) / t^p → c_j  as t → ∞  (for each bin j),

    for some constants c_j. That is, the fluctuations are on the order of f(t/k). The bin with the largest c_j eventually takes the lead and keeps it.

  • Subcritical (0 < p <= 1/2) : In this case there is no persistent leader; for each bin there exists an infinite sequence of times at which that bin is the leader. The fluctuations, appropriately normalized, tend to be Gaussian. That is, some sort of central limit behavior kicks in.
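To get a feel for these regimes, here is a rough simulation sketch (again my own illustration; the number of bins, the horizon, and the choices of p are arbitrary) that tracks what fraction of the balls the fullest bin ends up holding:

```python
import random

def leading_share(k, steps, p, seed=0):
    """Run the balls-in-bins process with feedback f(n) = n^p and
    return the final fraction of balls held by the fullest bin."""
    rng = random.Random(seed)
    bins = [1] * k                        # start each bin with one ball
    for _ in range(steps):
        weights = [a ** p for a in bins]
        j = rng.choices(range(k), weights=weights)[0]
        bins[j] += 1
    return max(bins) / sum(bins)

for p in (0.3, 0.8, 1.0, 1.5):            # roughly one value per regime
    print(f"p = {p}: leading bin holds {leading_share(5, 20000, p):.2f} of the balls")
```

Finite runs only hint at the asymptotics, but with p = 1.5 the leading share should be close to 1, while for p < 1 it should hover near 1/k.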

In order to prove these cool and weird results, they embed the arrival process of balls in each bin into a sum of exponential random variables whose rates change over time according to how many balls are in the bin: if a bin has r balls, the rate of the next exponential in the sum is f(r). The bin whose exponential clock rings first gets the next ball, and by the standard property of the minimum of independent exponentials, the probability that the j-th clock rings first is exactly the selection probability above. Thus the probability distribution of the ball counts is identical for the discrete-time model and the continuous-time embedding. Once they have the embedding, it's a bunch of other tricks to get the results in nice form.
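Here is a small sanity check of that min-of-exponentials fact (my own sketch, covering only a single step rather than the full embedding used in the proofs; the ball counts and value function below are arbitrary):

```python
import random
from collections import Counter

def next_ball_via_clocks(bins, f, rng):
    """Each bin j waits an Exponential(f(a_j)) time; the first clock to ring wins."""
    waits = [rng.expovariate(f(a)) for a in bins]
    return min(range(len(bins)), key=lambda j: waits[j])

def next_ball_direct(bins, f, rng):
    """Discrete rule: pick bin j with probability f(a_j) / sum_i f(a_i)."""
    weights = [f(a) for a in bins]
    return rng.choices(range(len(bins)), weights=weights)[0]

rng = random.Random(1)
bins = [3, 1, 2]                         # arbitrary current ball counts
f = lambda n: n ** 1.5                   # arbitrary value function

trials = 100_000
print(Counter(next_ball_via_clocks(bins, f, rng) for _ in range(trials)))
print(Counter(next_ball_direct(bins, f, rng) for _ in range(trials)))
# Both should be close to the split f(3) : f(1) : f(2), normalized.
```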

A more recent paper by Oliveira has a nice extension of these arguments to more general f(n), by approximating the embedding with Brownian motion. As it turns out, the critical things to check are the summability and square summability of 1/f(n).
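A heuristic way to see why the summability of 1/f(n) is the natural quantity (this is my own gloss on the embedding, not a statement taken from either paper): in the continuous-time picture, a bin holding n balls waits an exponential time with rate f(n) for its next ball, so the expected time for a single bin, left to run on its own clock, to accumulate infinitely many balls is

    Σ_{n ≥ 1} 1/f(n).

If this sum converges, a bin can pick up infinitely many balls in finite time, and the first bin to do so starves all the others, which is the monopoly/ballistic picture; if it diverges, no bin can run away in that fashion.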

a paper a day : 1

A new feature! Just to keep myself motivated on research and to dissuade people from reading the blog, I am trying to “read” one research paper a day (-ish) to get the new ideas running around my head. And you guessed it, I’m going to blog the interesting (to me) ideas here.

Denoising and Filtering Under the Probability of Excess Loss Criterion (PDF)
Stephanie Pereira and Tsachy Weissman
Proc. 43rd Allerton Conf. Communication, Control, and Computing (2005)

This paper looks at the discrete denoising problem, which is related to filtering, estimation, and lossy source coding. Very briefly, the idea is that you have an i.i.d. sequence of pairs of discrete random variables taking values in finite alphabets:

    (X_1, Z_1), (X_2, Z_2), …, (X_n, Z_n),

where X is the “clean” source and Z is the “noisy” observation, so that the joint distribution is p(x,z) = p(x) p(z | x), where p(z | x) is some discrete memoryless channel. A denoiser is a set of mappings

    g_i : Z^n → X̂ (the reconstruction alphabet),  i = 1, 2, …, n,
so that g_i(z^n) is the “estimate” of X_i. One can impose many different constraints on these functions g_i. For example, they may be forced to operate only causally on the Z sequence, or may use only a certain subset of the Z's, or only the symbol Z_i; this last case is called a symbol-by-symbol denoiser. The goal is to minimize the time-average of some per-symbol loss function Λ:

    L_n = (1/n) Σ_{i=1}^{n} Λ( X_i, g_i(Z^n) ).
This minimization is usually done on the expectation E[L_n], but this paper chooses to look at the probability of exceeding a certain value, P(L_n > D).
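As a toy illustration of the excess-loss criterion (entirely my own example: a uniform binary source through a binary symmetric channel, the “say what you see” symbol-by-symbol denoiser, Hamming loss, and arbitrary values of n, delta, and D), one can estimate P(L_n > D) by Monte Carlo:

```python
import random

def excess_loss_prob(n, delta, D, trials=10_000, seed=0):
    """Monte Carlo estimate of P(L_n > D): X_i are uniform bits,
    Z_i is X_i flipped with probability delta (a BSC), the denoiser
    is g(z) = z, and the loss is Hamming loss."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        errors = 0
        for _ in range(n):
            x = rng.getrandbits(1)
            z = x ^ (rng.random() < delta)   # noisy observation
            x_hat = z                        # "say what you see"
            errors += (x_hat != x)           # Hamming loss
        exceed += (errors / n > D)
    return exceed / trials

print(excess_loss_prob(n=100, delta=0.1, D=0.15))
```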

The major insight I got from this paper was that you can treat the individual terms of the loss function,

    h_k = Λ( X_k, g_k(Z^n) ),

as outputs of a source with time-varying (arbitrarily varying) statistics. Conditioned on Z^n, each h_k is independent, with a distribution drawn from a finite set of possible distributions. Then, to bound the probability P(L_n > D), they prove a large deviations result on L_n, which is the time-average of this arbitrarily varying source.

Some of the other results in this paper are

  • For a Hamming loss function the optimal denoiser is symbol-by-symbol (a small sketch of such a denoiser follows this list).
  • Among symbol-by-symbol denoisers, time-invariant ones are optimal.
  • An LDP for block denoisers and some analysis of the rate.
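For concreteness, here is what an optimal symbol-by-symbol denoiser looks like under Hamming loss, namely the MAP rule g(z) = argmax_x p(x) p(z|x) (my own sketch with made-up numbers, not code or notation from the paper):

```python
from itertools import product

def map_denoiser(p_x, p_z_given_x):
    """Symbol-by-symbol denoiser g(z) = argmax_x p(x) p(z|x), which
    minimizes expected Hamming loss (the probability of a symbol error).
    p_x: dict x -> prob; p_z_given_x: dict (x, z) -> prob."""
    zs = {z for (_, z) in p_z_given_x}
    return {z: max(p_x, key=lambda x: p_x[x] * p_z_given_x.get((x, z), 0.0))
            for z in zs}

# Example: a biased binary source through a binary symmetric channel.
p_x = {0: 0.7, 1: 0.3}
delta = 0.2
p_z_given_x = {(x, z): (1 - delta) if x == z else delta
               for x, z in product((0, 1), repeat=2)}
print(map_denoiser(p_x, p_z_given_x))   # maps each observed z to the MAP guess of x
```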

Most of the meat of the proofs is in a preprint which seems to still be in flux.