Sparsity vs. statistical independence in adaptive signal representations: A case study of the spike process, (with B. Benichou), in Beyond Wavelets (G. V. Welland, ed.), Studies in Computational Mathematics, Vol. 10, Chap. 9, pp. 225-257, Academic Press, 2003.

Abstract

Finding a basis or coordinate system that efficiently represents an input data stream, viewed as realizations of a stochastic process, is of tremendous importance in many fields, including data compression and computational neuroscience. Two popular measures of the efficiency of a basis are sparsity (measured by the expected l^p norm) and statistical independence (measured by the mutual information). A deeper understanding of their intricate relationship, however, remains elusive. We therefore chose to study a simple synthetic stochastic process called the spike process, which puts a unit impulse at a random location in an n-dimensional vector for each realization. For this process, we obtained the following results: 1) The standard basis is the best in terms of both sparsity and statistical independence if n ≥ 5 and the search for a basis is restricted to the orthonormal bases of Rⁿ; 2) If we extend the basis search to all invertible linear transformations of Rⁿ, then the best basis in terms of statistical independence differs from the best basis in terms of sparsity; 3) In either case, the best basis in terms of statistical independence is not unique, and there even exist such bases that make the inputs completely dense; 4) No invertible linear transformation achieves true statistical independence for n > 2.
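To make the spike process and the sparsity measure concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's code: the impulse location is assumed to be uniformly distributed, sparsity is taken as the expected l^p norm of the coefficient vector with p = 1 in the usage below, and the Householder reflection is simply one convenient orthonormal basis that produces dense coefficients. All function names are hypothetical.

```python
import random

def spike(n, rng=random):
    """One realization: a unit impulse at a random position in an n-vector.
    Assumption: the position is drawn uniformly from 0..n-1."""
    x = [0.0] * n
    x[rng.randrange(n)] = 1.0
    return x

def identity(n):
    """Standard basis of R^n (basis vectors are the columns)."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def householder(n):
    """Orthonormal matrix I - 2*u*u^T with u = (1, ..., 1) / sqrt(n).
    Chosen here only as an example of a basis giving dense coefficients."""
    return [[(1.0 if i == j else 0.0) - 2.0 / n for j in range(n)]
            for i in range(n)]

def coefficients(basis, x):
    """Coefficients B^T x of x in an orthonormal basis stored column-wise."""
    n = len(x)
    return [sum(basis[i][j] * x[i] for i in range(n)) for j in range(n)]

def lp_norm(v, p):
    """l^p norm of a coefficient vector."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

def expected_lp_norm(basis, p):
    """Exact expectation under the uniform assumption:
    the average over the n equally likely spike positions."""
    n = len(basis)
    total = 0.0
    for k in range(n):
        x = [0.0] * n
        x[k] = 1.0
        total += lp_norm(coefficients(basis, x), p)
    return total / n
```

For example, `expected_lp_norm(identity(8), 1)` and `expected_lp_norm(householder(8), 1)` can be compared directly: in the standard basis every realization has a single nonzero coefficient, whereas the Householder basis spreads each spike over all n coordinates, so its expected l^1 norm is larger.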

Get the full paper: gzipped PS file or PDF file.

Get the official version via doi:10.1016/S1570-579X(03)80037-X.

Please email me if you have any comments or questions!