High-dimensional pattern recognition using low-dimensional embedding and Earth Mover's Distance (with L. Lieu), submitted for publication, 2008, revised 2009.
Abstract
We propose an algorithm that combines existing techniques in a novel way to
classify datasets consisting of high-dimensional data (e.g., sets of
signals or images).
Furthermore, our algorithm sets up a framework for application of the Earth
Mover's Distance (EMD) [Rubner-Tomasi 1999, Rubner-Tomasi-Guibas 2000] as
a discriminant measure between datasets. We show how to prepare a compact
representation, called a signature, for each dataset so that the EMD between
datasets can be computed efficiently. This signature-construction step requires the
tasks of dimension reduction, automatic determination of the data's intrinsic
dimensionality, out-of-sample extension, and point clustering. We show how
to apply existing methods, including Laplacian eigenmaps
[Belkin-Niyogi 2001, 2003, 2005], the diffusion maps framework
[Coifman-Lafon 2006, Lafon 2004, Lafon-Keller-Coifman 2006], and
elongated K-means [Sanguinetti-Laidler-Lawrence 2008],
to perform these tasks successfully.
We also provide two examples of applications of our proposed algorithm.
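
To make the role of the signatures concrete, the following is a minimal sketch of how the EMD between two signatures could be computed as a small linear program, following the standard transportation formulation of Rubner-Tomasi-Guibas. It is an illustration only, not the implementation used in the paper. It assumes that a signature is a list of cluster centers with nonnegative weights, that the ground distance is Euclidean, and that NumPy and SciPy are available; the function name `emd` and its parameters are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def emd(points_p, weights_p, points_q, weights_q):
    """EMD between two signatures (cluster centers plus weights).

    Hypothetical helper: solves the transportation problem as a linear program.
    """
    P = np.asarray(points_p, dtype=float)   # shape (m, d): centers of signature P
    Q = np.asarray(points_q, dtype=float)   # shape (n, d): centers of signature Q
    w = np.asarray(weights_p, dtype=float)  # shape (m,): weights of signature P
    u = np.asarray(weights_q, dtype=float)  # shape (n,): weights of signature Q
    m, n = len(w), len(u)

    # Ground distance between cluster centers (Euclidean is an assumption here).
    D = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=2)

    # Flow variables f_ij are flattened row-major: index i * n + j.
    # Each source i ships at most w_i: sum_j f_ij <= w_i.
    A_rows = np.kron(np.eye(m), np.ones((1, n)))
    # Each sink j receives at most u_j: sum_i f_ij <= u_j.
    A_cols = np.kron(np.ones((1, m)), np.eye(n))
    A_ub = np.vstack([A_rows, A_cols])
    b_ub = np.concatenate([w, u])

    # Total flow equals the smaller of the two total weights (partial matching).
    total_flow = min(w.sum(), u.sum())
    A_eq = np.ones((1, m * n))
    b_eq = [total_flow]

    res = linprog(D.ravel(), A_ub=A_ub, b_ub=b_ub,
                  A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    if not res.success:
        raise RuntimeError(res.message)

    # EMD is the minimal transport cost normalized by the total flow.
    return res.fun / total_flow
```

Because each signature contains only a handful of cluster centers in the low-dimensional embedding space, the linear program stays small, which is the reason a compact signature makes the EMD computation efficient. In a classification setting one would call `emd` between the signature of a test dataset and the signature of each training dataset and, for example, assign the label of the nearest one.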
Get the full paper (Revised on 07/07/09): PDF file.
Please email me if you have any comments or questions!