Latent-Variable Modeling: Algorithms, Inference, and Applications

Author: Taeb, Armeen

Year: 2020

Degree: Dissertation (Ph.D.)

Advisor: Chandrasekaran, Venkat

Committee Members: Hassibi, Babak; Stuart, Andrew M.; Pachter, Lior S.; Doyle, John Comstock; Chandrasekaran, Venkat

Option: Electrical Engineering

DOI: 10.7907/YRF1-7W29

Abstract

Many driving factors of physical systems are latent or unobserved. Thus, understanding such systems crucially relies on accounting for the influence of the latent structure. This thesis makes advances in three aspects of latent-variable modeling: inference, algorithms, and applications. Specifically, we develop and explore latent-variable techniques that a) ensure interpretable and statistically significant models, b) can be efficiently optimized to identify the best fit to data, and c) provide useful insights in real-world applications. The specific contributions of this thesis are:

1. We employ a latent-variable graphical modeling technique to develop the first state-wide statistical model of the California reservoir network. With this model, we precisely characterize the system-wide response of the network to hypothetical drought conditions and propose guidelines for more sustainable reservoir management.

2. Motivated by the previous application, we provide a geometric framework to assess the extent to which our latent-variable model has learned true or false discoveries about the relevant physical phenomena. Our approach generalizes the classical notions of true and false discoveries in mathematical statistics, which rely on the discrete structure of the decision space, to settings where the decision space is continuous and more complicated. We highlight the utility of this viewpoint in problems involving subspace selection and low-rank estimation.

3. We propose a convex optimization procedure to fit a latent-variable graphical model for generalized linear models. This framework provides a flexible approach to model non-Gaussian variables including Poisson, Bernoulli, and exponential variables. A particularly novel aspect of our formulation is that it incorporates regularizers that are tailored to the type of latent variables.

4. We describe a computationally efficient framework to learn a latent-variable model with high-dimensional and non-iid data. This framework is based on factorizable precision operators that decouple the component associated with the observational dependencies from the component associated with interdependencies among the variables.

5. We propose a convex optimization technique to provide semantics to latent variables of a factor model. This approach is based on linking auxiliary variables -- chosen based on domain expertise -- to these latent variables.
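
As an illustration of the geometric viewpoint in contribution 2, the following is a minimal sketch of how true and false discoveries might be measured when the object being selected is a subspace rather than a discrete set of variables. The abstract does not give a formula, so the trace-based definitions below are an assumption made for illustration, and the names `projector` and `subspace_discoveries` are hypothetical. The sketch also assumes each basis matrix has full column rank.

```python
import numpy as np


def projector(basis):
    """Orthogonal projector onto the column span of `basis`.

    Assumes `basis` has full column rank (illustrative assumption).
    """
    q, _ = np.linalg.qr(basis)  # reduced QR: columns of q are orthonormal
    return q @ q.T


def subspace_discoveries(est_basis, true_basis):
    """Return (true_discoveries, false_discoveries) for an estimated subspace.

    Assumed definitions, for illustration only:
      true discoveries  = trace(P_est @ P_true)
      false discoveries = trace(P_est @ (I - P_true))
    Their sum equals the dimension of the estimated subspace.
    """
    p_est = projector(est_basis)
    p_true = projector(true_basis)
    identity = np.eye(p_true.shape[0])
    true_disc = float(np.trace(p_est @ p_true))
    false_disc = float(np.trace(p_est @ (identity - p_true)))
    return true_disc, false_disc


# Coordinate-aligned example in R^4: the true subspace is spanned by
# coordinates {0, 1}, the estimate by coordinates {1, 2}.
eye4 = np.eye(4)
true_basis = eye4[:, [0, 1]]
est_basis = eye4[:, [1, 2]]
true_disc, false_disc = subspace_discoveries(est_basis, true_basis)
print(true_disc, false_disc)
```

In this coordinate-aligned example the two quantities reduce to the usual counts from variable selection: one correctly selected coordinate and one falsely selected coordinate. For subspaces that are not aligned with the coordinate axes, the same quantities take non-integer values, which is one way a discrete notion can extend to a continuous decision space.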

Files

Taeb_thesis.pdf (application/pdf)