paper a day : approximating high-dimensional pdfs by low-dimensional ones

Asymptotically Optimal Approximation of Multidimensional pdf’s by Lower Dimensional pdf’s
Steven Kay
IEEE Transactions on Signal Processing, V. 55 No. 2, Feb. 2007, p. 725–729

The title kind of says it all. The main idea is that if you have a sufficient statistic, then you can reconstruct the true probability density function (pdf) of the data from the pdf of the sufficient statistic. However, if there is no sufficient statistic, you’re out of luck, and you’d like to build a pdf from a low-dimensional statistic that somehow best captures the features you want from the data. This paper proves that a certain pdf created by a projection operation is optimal in that it minimizes the Kullback-Leibler (KL) divergence to the true pdf. Since the KL divergence governs the error exponents in many hypothesis tests, this projection operation is good in that decisions based on the projected pdf will be close to decisions based on the true pdf.
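To make the projection idea concrete, here is a minimal sketch on a finite alphabet. It assumes the construction has the usual pdf-projection form q(x) = p0(x) p_T(T(x)) / p0_T(T(x)), where p0 is a reference pmf and p_T, p0_T are the pmfs of the statistic T(x) under the true and reference distributions. I have not matched the paper's notation, and the helper names (`marginal`, `project_pmf`, `kl`) and all the numbers are made up for illustration.

```python
import math
from collections import defaultdict

def marginal(pmf, feature):
    """pmf of the statistic T(x) when x is drawn from pmf; feature[x] = T(x)."""
    out = defaultdict(float)
    for px, t in zip(pmf, feature):
        out[t] += px
    return out

def project_pmf(p, p0, feature):
    """Assumed projection form: q(x) = p0(x) * p_T(T(x)) / p0_T(T(x))."""
    p_T, p0_T = marginal(p, feature), marginal(p0, feature)
    return [p0x * p_T[t] / p0_T[t] for p0x, t in zip(p0, feature)]

def kl(p, q):
    """KL divergence D(p || q) in nats."""
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)

feature = [0, 0, 1, 1, 2, 2]               # hypothetical T: six symbols -> three bins
p0 = [0.10, 0.20, 0.10, 0.20, 0.15, 0.25]  # hypothetical reference pmf
p = [0.30, 0.10, 0.05, 0.25, 0.20, 0.10]   # hypothetical "true" pmf; T is not sufficient

q = project_pmf(p, p0, feature)
print("KL(p || p0) =", kl(p, p0))
print("KL(p || q)  =", kl(p, q))  # no larger: p0 itself has the form p0(x) * g(T(x))

# Sufficient case: p_suff / p0 depends on x only through T(x),
# so the projection recovers p_suff exactly (up to rounding).
w = [3.0, 1.0, 2.0]
unnorm = [p0x * w[t] for p0x, t in zip(p0, feature)]
p_suff = [u / sum(unnorm) for u in unnorm]
q_suff = project_pmf(p_suff, p0, feature)
print("max |q_suff - p_suff| =", max(abs(a - b) for a, b in zip(q_suff, p_suff)))
```

The first comparison shows the projected pmf doing at least as well as the reference in KL divergence; the second shows that nothing is lost when the statistic happens to be sufficient.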

This is a correspondence item, so it’s short and sweet: equations are given for the projection, and it is proved to minimize the KL divergence to the true distribution. Examples are given for cases in which sufficient statistics exist and do not exist, and an application to feature selection for discrimination is given. The benefit is that this theorem provides a way of choosing a “good” feature set based on the KL divergence, even when the true pdf is not known. This is done by estimating an expectation from the observed data; the performance then depends on how quickly the empirical mean converges to the true mean, and the probability of a large deviation should decay exponentially in the number of data points.
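Here is a rough sketch of how the feature-selection step might look, again on a finite alphabet and again assuming the projection form q(x) = p0(x) p_T(T(x)) / p0_T(T(x)). Under that assumption, D(p || q) equals E_p[log p(x)/p0(x)] minus E_p[log p_T(T(x))/p0_T(T(x))], and only the second term depends on T, so a smaller KL divergence means a larger second term. The sketch replaces that expectation by a sample mean and plugs in empirical frequencies for the unknown p_T. The function name `empirical_score`, the two candidate statistics, and the data are hypothetical, and this is not necessarily the estimator used in the paper.

```python
import math
import random
from collections import Counter

def empirical_score(samples, feature, p0):
    """Sample mean of log(p_T(T(x_i)) / p0_T(T(x_i))), with p_T replaced by
    the empirical frequencies of the statistic (an assumption of this sketch)."""
    n = len(samples)
    counts = Counter(feature[x] for x in samples)
    p0_T = Counter()
    for x, p0x in enumerate(p0):
        p0_T[feature[x]] += p0x  # pmf of T(x) under the reference
    return sum((c / n) * math.log((c / n) / p0_T[t]) for t, c in counts.items())

random.seed(0)
p = [0.30, 0.10, 0.05, 0.25, 0.20, 0.10]  # hypothetical true pmf, used only to draw data
p0 = [1 / 6] * 6                          # hypothetical uniform reference pmf
samples = random.choices(range(6), weights=p, k=5000)

# Two hypothetical three-bin statistics to choose between.
candidates = {
    "A": [0, 0, 1, 1, 2, 2],
    "B": [0, 1, 1, 0, 2, 2],
}
scores = {name: empirical_score(samples, f, p0) for name, f in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "-> pick", best)
```

Both candidates have the same number of bins on purpose: a finer statistic never has a smaller population score than a coarsening of it, so this score by itself says nothing about the trade-off against dimension.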

The formulas are sometimes messy, but it looks like it could be a useful technique. I have this niggling feeling that a “bigger picture” view would be forthcoming from an information geometry/differential geometry viewpoint, but my fluency in those techniques is lacking at the moment.

Update: My laziness prevented me from putting up the link. Thanks, Cosma, for keeping me honest!

3 thoughts on “paper a day : approximating high-dimensional pdfs by low-dimensional ones”

Cosma says:

on February 20, 2007 at 8:19 am

Links, man, links! It’s http://dx.doi.org/10.1109/TSP.2006.887112 (and volume 55, not 35).

rif says:

on March 7, 2007 at 3:58 pm

The link is dead.

asarwate says:

on March 7, 2007 at 4:08 pm

Fixed!


An Ergodic Walk

a process whose average over time converges to the true average
