Generalized Separation of Style and Content on Nonlinear Manifolds with Application to Human Motion Analysis
NSF CAREER Award number IIS-0546372
PI: Ahmed Elgammal
Graduate Student Investigator: Chan-Su Lee
Duration: January 2006 – December 2010
Project Goals
Separating style from content is an essential task in visual perception
and remains a fundamental open problem in the study of perception. The
problem appears in many computer vision applications. The objective of
this project is to develop general frameworks and algorithms for the nonlinear
separation of content from various style factors. In particular, we address this
problem in the context of human motion analysis.
In general, the visual input is a
function of several conceptually orthogonal factors, and each factor
corresponds to an underlying nonlinear manifold. Each data point therefore
lies on a mixture of manifolds. For example, in human motion we have a body
configuration manifold, a view manifold, a manifold of different people's shape
and appearance, and so on. This variability makes the problem very challenging,
because the data live in the product space of all these factors. However, the
problem becomes approachable if we understand, at least conceptually, the
topology, dimensionality, and properties of each of the individual manifolds
of the orthogonal factors that generated the data.
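As a rough illustration of what separating two factors means, the sketch below uses a linear stand-in: an asymmetric bilinear model fitted with an SVD. This is not the project's nonlinear framework. The names `fit_asymmetric_bilinear`, `style_maps`, and `content`, as well as all sizes and the random data, are assumptions made only for this toy example. The content factor is placed on a circle, a closed one-dimensional manifold, and each style is a different linear map applied to it.

```python
import numpy as np

def fit_asymmetric_bilinear(Y, n_styles, k):
    """Factor stacked observations Y of shape (n_styles * d, n_contents)
    into per-style linear maps and shared content vectors via a rank-k SVD."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    A = U[:, :k] * s[:k]          # stacked style maps, shape (n_styles * d, k)
    B = Vt[:k, :]                 # shared content vectors, shape (k, n_contents)
    d = Y.shape[0] // n_styles
    return A.reshape(n_styles, d, k), B

# Hypothetical toy data: content sampled on a circle, three random styles.
rng = np.random.default_rng(0)
n_styles, n_contents, d, k = 3, 20, 8, 2
t = np.linspace(0.0, 2.0 * np.pi, n_contents, endpoint=False)
content = np.stack([np.cos(t), np.sin(t)])            # (k, n_contents)
style_maps = rng.normal(size=(n_styles, d, k))        # one linear map per style

# Each style renders the same content differently; stack the styles row-wise.
Y = np.concatenate([style_maps[s] @ content for s in range(n_styles)], axis=0)

A, B = fit_asymmetric_bilinear(Y, n_styles, k)
recon = np.concatenate([A[s] @ B for s in range(n_styles)], axis=0)
max_error = float(np.abs(recon - Y).max())
print("max reconstruction error:", max_error)
```

The recovered `B` is shared across all styles, while each `A[s]` carries only the style-specific variation. The factors are recovered only up to an invertible linear transform, so `B` need not equal `content` itself.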
The goal is to establish general
mathematical frameworks for separating multiple factors in the data. In
particular, in the context of human motion, the objective of this project is
to establish a mathematical framework that decouples the intrinsic body
configuration from the other sources of variability that affect the visual
input, and then to exploit such a model in the recovery of body configuration.
To achieve this goal, we investigate
several fundamental problems:
How can we learn a unified, invariant
content manifold representation from multiple data sets that represent style
variations on the same manifold?
How can we learn factorized generative
models for the data, given a representation of one or more of the underlying
manifolds?
Given a representation of the
underlying manifold, how can it be used to select discriminative features in
the visual input?
How can these ideas be applied
to the recovery of intrinsic body configuration?
Recent Achievements:
Modeling Multiple Continuous Manifolds – View and Posture
We consider modeling data lying on
multiple continuous manifolds. In particular, we model the shape manifold of a
person performing a motion observed from different viewpoints along a view
circle at a fixed camera height. We introduce a model that ties together the
body configuration (kinematics) manifold and the visual manifold (observations)
in a way that facilitates tracking the 3D configuration under continuous
relative view variability. The model exploits the low dimensionality of both
the body configuration manifold and the view manifold, each of which is
represented separately.
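To make the idea of two separately represented low-dimensional manifolds concrete, here is a minimal sketch under strong simplifying assumptions. It is not the model from the publication listed below. The body configuration is reduced to a phase on a circle, the view is an angle on the view circle, the observation is a hypothetical bilinear function of both with a random weight matrix `W`, and estimation is a plain grid search rather than a tracker. The function names `embed`, `synthesize`, and `estimate` are illustrative.

```python
import numpy as np

def embed(theta):
    """Embed an angle on a circle as [1, cos, sin]; the constant term keeps
    the pair (body, view) identifiable in the bilinear product below."""
    theta = np.asarray(theta, dtype=float)
    return np.stack([np.ones_like(theta), np.cos(theta), np.sin(theta)], axis=-1)

def synthesize(body_phase, view_angle, W):
    """Toy observation: a linear map W applied to the outer product of the
    body-phase embedding and the view-angle embedding."""
    b = embed(body_phase)
    v = embed(view_angle)
    feat = b[..., :, None] * v[..., None, :]          # (..., 3, 3)
    feat = feat.reshape(feat.shape[:-2] + (9,))       # (..., 9)
    return feat @ W.T                                 # (..., d)

def estimate(y, W, n_grid=72):
    """Grid search over both circles for the pair that best explains y."""
    grid = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
    candidates = synthesize(grid[:, None], grid[None, :], W)   # (n_grid, n_grid, d)
    err = np.sum((candidates - y) ** 2, axis=-1)
    i, j = np.unravel_index(np.argmin(err), err.shape)
    return grid[i], grid[j]

# Hypothetical setup: 16-dimensional observations and random weights.
rng = np.random.default_rng(1)
W = rng.normal(size=(16, 9))
true_body, true_view = np.deg2rad(40.0), np.deg2rad(250.0)
y = synthesize(true_body, true_view, W)

est_body, est_view = estimate(y, W)
print(np.rad2deg(est_body), np.rad2deg(est_view))
```

Because each factor has its own low-dimensional parameterization, the search runs over two circles instead of over the high-dimensional observation space, which is the property the paragraph above relies on.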
Example tracking of a ballet motion from multiple views:
(a) Sampled shapes in different views for ballet.
(b) Embedded kinematics manifold in 2D.
(c) View manifold embedded in the kinematic manifold mapping.
(d) Velocity field value with interpolation.
(e) Test silhouette sequences.
(f) Reconstruction of 3D body pose based on estimated body configuration.
Some demo views are available here [click]
Publications:
C.-S. Lee and A. Elgammal, "Modeling View and Posture Manifold for Tracking," International Conference on Computer Vision (ICCV), 2007.