
PhD Defense

Generative Modeling for Multimodal Data Fusion



Thursday, September 08, 2022, 01:00pm


Speaker: Mihee Lee

Location: Virtual


Professor Vladimir Pavlovic, Chair

Professor Mubbasir Kapadia

Professor Dimitris Metaxas

Professor Kris Kitani (Carnegie Mellon University)

Event Type: PhD Defense

Abstract: Humans can process multiple perspectives of the world and combine them into compound information to obtain a richer understanding of it. In machine learning, generative models have been widely used to comprehend the world by learning a latent representation of the given data. Representation learning is a key step in data understanding, where the goal is to distill interpretable factors associated with the data. However, representation learning approaches typically focus on data observed in a single modality, such as text, images, or video. In this dissertation, we develop generative models that fuse multimodal data to gain a more comprehensive understanding of the task at hand. First, we introduce a model, based on the VAE framework, that learns a latent representation from multimodal data. Specifically, we disentangle the latent space into modality-common (shared) and modality-specific (private) spaces. By considering the private latent factors alongside the latent factors shared by all modalities, the proposed model achieves more precise cross-modal generation and retrieval from one modality to another. Moreover, by treating one modality as missing, we demonstrate that our model can be used to solve semi-supervised learning and zero-shot learning problems. We further study the importance of perceiving the world from multiple views through trajectory forecasting scenarios. The growing demand for autonomous vehicles is spurring numerous studies of behavior prediction. Existing work underweights context, such as the nearby environment and neighboring agents, when predicting the target's future trajectories. We propose combining such information to generate context-aware future trajectory predictions, conditioned on the past trajectory, based on the Conditional VAE. We show that the proposed model achieves predictions that avoid collisions with the surrounding environment and neighbors on both real and simulated data.


Department of Computer Science

School of Arts & Sciences

Rutgers University

Contact: Professor Vladimir Pavlovic

Zoom Link: https://rutgers.zoom.us/j/95384967037?pwd=eEx4Vk1aSk5zOUorQlRHL3JCL2dodz09