Generalized Separation of Style and Content on Nonlinear Manifolds with Application to Human Motion Analysis

 

NSF CAREER Award number IIS-0546372

PI: Ahmed Elgammal, Rutgers University

 

Graduate Student Investigator: Chan-Su Lee

Duration: January 2006 – December 2010

 

Project Goals

 

Separating style from content is an essential task in visual perception and remains a fundamental open problem in the study of perception. The problem arises in many computer vision applications. The objective of this project is to develop general frameworks and algorithms for the nonlinear separation of content from various style factors. In particular, we address this problem in the context of human motion analysis.

 

In general, the visual input is a function of several conceptually orthogonal factors, each of which corresponds to an underlying nonlinear manifold, so each data point lies on a mixture of manifolds. For example, in human motion we have a body configuration manifold, a view manifold, a manifold of different people's shapes and appearances, and so on. This variability makes the problem very challenging, since the data live in the product space of all these factors. However, the problem becomes approachable if we understand, at least conceptually, the topology, dimensionality, and properties of each of the individual manifolds of the orthogonal factors that generated the data.

 

The goal is to establish general mathematical frameworks for separating multiple factors in the data. In the context of human motion, the objective of this project is to establish a mathematical framework that decouples the intrinsic body configuration from the other sources of variability that affect the visual input, and then to exploit such a model in the recovery of body configuration.

 

To achieve this goal, we investigate several fundamental problems:

 

How can we learn a unified, invariant content manifold representation from multiple data sets that represent style variations on the same manifold?

 

How can we learn factorized generative models for the data, given a representation of one or more of the underlying manifolds?

 

Given a representation of the underlying manifold, how can it be used to select discriminative features in the visual input?

 

How can these ideas be applied to the recovery of intrinsic body configuration?
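To make the question about factorized generative models more concrete, here is a minimal sketch of one way such a model could be learned. It is an illustration under stated assumptions, not the project's implementation. It assumes that the content manifold is a unit circle (as for a periodic motion), that each style's observations are generated by a Gaussian RBF mapping from that circle, and that style variation can be captured by a truncated SVD of the stacked mapping coefficients. All function names, the RBF width, and the toy data are hypothetical.

```python
import numpy as np

def circle_rbf(theta, n_centers=8, width=0.5):
    """Gaussian RBF features for angles on a unit circle (one row per angle)."""
    theta = np.atleast_1d(theta)
    centers = np.linspace(0.0, 2 * np.pi, n_centers, endpoint=False)
    # Squared chord length between each angle and each RBF center.
    d2 = 2.0 - 2.0 * np.cos(theta[:, None] - centers[None, :])
    return np.exp(-d2 / (2 * width ** 2))

def fit_style_content(sequences, thetas, rank):
    """Fit one content-to-observation mapping per style, then factor out style.

    sequences[s] is an (n_s, d) array observed for style s, and thetas[s]
    holds the content coordinate (angle) of each of its rows.
    """
    coeffs = []
    for Y, theta in zip(sequences, thetas):
        # Least-squares RBF coefficients for this style, shape (k, d).
        B, *_ = np.linalg.lstsq(circle_rbf(theta), Y, rcond=None)
        coeffs.append(B.ravel())
    M = np.stack(coeffs, axis=1)                 # (k * d, n_styles)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    basis = U[:, :rank] * s[:rank]               # style-independent basis
    styles = Vt[:rank].T                         # one style vector per row
    return basis, styles

def synthesize(basis, style, theta, dim):
    """Generate observations for a style vector at the given content angles."""
    B = (basis @ style).reshape(-1, dim)
    return circle_rbf(theta) @ B

# Toy data (assumption): four styles that mix two coefficient matrices.
rng = np.random.default_rng(0)
theta = np.linspace(0.0, 2 * np.pi, 40, endpoint=False)
B0, B1 = rng.normal(size=(8, 3)), rng.normal(size=(8, 3))
sequences = [circle_rbf(theta) @ (B0 + a * B1) for a in (0.0, 0.5, 1.0, 2.0)]

basis, styles = fit_style_content(sequences, [theta] * 4, rank=2)
recon = synthesize(basis, styles[1], theta, dim=3)
```

In this sketch the content coordinate is shared across styles, and each style is summarized by a short vector. Interpolating between rows of `styles` before calling `synthesize` would give an intermediate style on the same content manifold.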

 

 

 

Recent Achievements:

 

Modeling Multiple Continuous Manifolds – View and Posture

 

We consider modeling data that lie on multiple continuous manifolds. In particular, we model the shape manifold of a person performing a motion observed from different viewpoints along a view circle at a fixed camera height. We introduce a model that ties together the body configuration (kinematics) manifold and the visual manifold (observations) in a way that facilitates tracking the 3D configuration under continuous relative view variability. The model exploits the low-dimensional nature of both the body configuration manifold and the view manifold, each of which is represented separately.
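The following minimal sketch illustrates the idea of representing the two manifolds separately and tying them to the observations through a single mapping. It is not the model from the publication listed below. It assumes that both the body configuration and the view can be parameterized by an angle on a circle, that observations are generated by a linear map on the product of per-manifold Gaussian RBF features, and that a new observation is explained by an exhaustive grid search rather than by a tracker. The function names and the `toy_shape` descriptor are hypothetical stand-ins for real silhouette data.

```python
import numpy as np

def circle_rbf(angles, n_centers=8, width=0.5):
    """Gaussian RBF features for angles on a unit circle (one row per angle)."""
    angles = np.atleast_1d(angles)
    centers = np.linspace(0.0, 2 * np.pi, n_centers, endpoint=False)
    # Squared chord length between each angle and each RBF center.
    d2 = 2.0 - 2.0 * np.cos(angles[:, None] - centers[None, :])
    return np.exp(-d2 / (2 * width ** 2))

def joint_features(body, view):
    """Row-wise Kronecker product of view features and body features."""
    Pb, Pv = circle_rbf(body), circle_rbf(view)
    return (Pv[:, :, None] * Pb[:, None, :]).reshape(len(Pb), -1)

def fit_mapping(body, view, shapes):
    """Least-squares map from joint (view, body) features to observations."""
    W, *_ = np.linalg.lstsq(joint_features(body, view), shapes, rcond=None)
    return W

def estimate_body_and_view(W, shape, n_grid=72):
    """Grid search over both circles for the best-reconstructing pair."""
    grid = np.linspace(0.0, 2 * np.pi, n_grid, endpoint=False)
    bb, vv = np.meshgrid(grid, grid, indexing="ij")
    pred = joint_features(bb.ravel(), vv.ravel()) @ W
    best = np.argmin(((pred - shape) ** 2).sum(axis=1))
    return bb.ravel()[best], vv.ravel()[best]

def toy_shape(body, view):
    """Hypothetical smooth descriptor of (body, view); not real silhouettes."""
    b, v = np.atleast_1d(body), np.atleast_1d(view)
    return np.stack([np.cos(b) * np.cos(v), np.sin(b) * np.cos(v),
                     np.cos(b) * np.sin(v), np.sin(b) * np.sin(v),
                     np.cos(b), np.sin(b), np.cos(v), np.sin(v)], axis=1)

# Train on a regular grid of (body, view) pairs, then explain one new shape.
train = np.linspace(0.0, 2 * np.pi, 24, endpoint=False)
tb, tv = np.meshgrid(train, train, indexing="ij")
W = fit_mapping(tb.ravel(), tv.ravel(), toy_shape(tb.ravel(), tv.ravel()))
b_hat, v_hat = estimate_body_and_view(W, toy_shape(1.0, 2.5)[0])
```

Because each manifold has its own low-dimensional coordinate, the search space for a new observation is only two-dimensional in this sketch. A sequential tracker could replace the grid search by propagating estimates of the two coordinates from frame to frame.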
 

 

Example tracking of a ballet motion from multiple views:

 

 

(a) Sampled shapes in different views for ballet

(b) Embedded kinematics manifold in 2D

(c) View manifold embedded in the kinematic manifold mapping

(d) Velocity field value with interpolation

(e) Test silhouette sequences

(f) Reconstruction of 3D body pose based on estimated body configuration

 

 Some demo views are available here [click]

 

 

 

 

 

Publications:

 

C.-S. Lee and A. Elgammal, "Modeling View and Posture Manifolds for Tracking," International Conference on Computer Vision (ICCV), 2007.