Skip to content Skip to navigation

Pre-Defense: Dynamically Compressed Bayesian Hidden Markov Models using Haar Wavelets

Abstract: 

In bioinformatics, Hidden Markov Models (HMM) are a powerful tool which is used, among other things, to find candidate regions of genomic copy-number variants (CNV). While frequentist inference techniques such as Baum-Welch have come under increased scrutiny in the field, Bayesian CNV detection requires the inference of millions and billions of latent variables, thereby posing a challenge to MCMC-based methods. In this work, we present a Forward-Backward Gibbs sampler which operates on compressed representations of the data at different resolution levels. We integrate Haar wavelet shrinkage with Bayesian HMM to obtain a dynamic compression scheme which greatly improves both speed and convergence. We provide a decision-theoretic justification for such a scheme, and discuss a variety of algorithms and data structures for its implementation. We demonstrate its applicability through large scale simulations as well as finding plausible CNV candidates in whole-genome sequencing data of rat populations divergently selected for tame and aggressive behavior.

Speaker: 
John Wiedenhoeft
Location: 
CoRE B (305)
Event Date: 
07/12/2018 - 10:00am
Committee: 
Dr. rer. nat. Alexander Schliep (Chair), Prof. Martin Farach-Colton, Prof. Kevin C. Chen, Prof. Anders Krogh, PhD (University of Copenhagen)
Event Type: 
Pre-Defense
Organization: 
Dept. of Computer Science