Qualifying Exam

Application of meta-learning on Facial Action Unit Detection and learning private-shared latent representation in multimodal VAE


Friday, June 12, 2020, 01:30pm - 02:30pm


Speaker: Mihee Lee

Location: Remote via Webex


Committee:

Prof. Vladimir Pavlovic

Prof. Sungjin Ahn

Prof. Abdeslam Boularias

Prof. Swastik Kopparty

Event Type: Qualifying Exam

Abstract: Detecting facial action units (AUs) is a fundamental step in the automatic recognition of facial expressions of emotion and cognitive states. Although a variety of approaches have been proposed for this task, most models are trained only for specific target AUs and therefore do not adapt easily to recognizing new AUs. In the first work, we propose a deep learning approach for facial AU detection, based on model-agnostic meta-learning, that adapts quickly to a new task (either a new AU or a new target subject) using only a few labeled samples from that task. On two benchmark datasets for facial AU detection, BP4D and DISFA, we show that with only a few labeled examples from the new tasks the model achieves large improvements over the baselines.

Multi-modal generative models are an important family of deep models whose goal is to facilitate representation learning on data with multiple views or modalities. However, current deep multi-modal models focus on inferring shared representations while neglecting the important private aspects of the data within individual modalities. In the second work, we introduce a disentangled multi-modal variational autoencoder (DMVAE) that uses a disentangled VAE strategy to separate the private and shared latent spaces of multiple modalities. We specifically consider the case where the latent factors may be both continuous and discrete, leading to a family of general hybrid DMVAE models. We demonstrate the utility of DMVAE on a semi-supervised learning task. Our experiments on several benchmark datasets, including MNIST, SVHN, and CelebA, indicate the importance of both the private-shared disentanglement and the hybrid latent representation.