Bayesian Chain Graph LASSO Models to Learn Sparse Microbial Networks with Predictors

15 Dec 2020  ·  Yunyi Shen, Claudia Solis-Lemus

Microbiome data require statistical models that can simultaneously decode microbes' reactions to the environment and the interactions among microbes. While a multiresponse linear regression model seems like a straightforward solution, we argue that treating it as a graphical model is flawed: the regression coefficient matrix is not the adjacency matrix and therefore does not encode the conditional dependence structure between response and predictor nodes. This observation is especially important in biological settings, where prior knowledge of edges obtained from specific experimental interventions can only be properly encoded under a conditional dependence model. Here, we propose a chain graph model with two sets of nodes (predictors and responses) whose solution yields a graph whose edges do represent conditional dependence and which thus agrees with the experimenter's intuition about the average behavior of nodes under treatment. Sparsity of the solution is induced via the Bayesian LASSO. In addition, we propose an adaptive extension so that different amounts of shrinkage can be applied to different edges to incorporate edge-specific prior knowledge. Our model is computationally inexpensive thanks to an efficient Gibbs sampling algorithm and can account for binary, count, and compositional responses via an appropriate hierarchical structure. We apply our model to human gut and soil microbial compositional datasets and highlight that CG-LASSO can estimate biologically meaningful network structures in the data. The CG-LASSO software is available as an R package at https://github.com/YunyiShen/CAR-LASSO.
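
The gap between the regression coefficient matrix and the conditional dependence structure can be seen in a toy Gaussian example. The sketch below assumes a conditional Gaussian chain graph parameterization, Y | x ~ N(Omega^{-1} B^T x, Omega^{-1}), in which B holds the predictor-to-response edges and Omega is the precision matrix among responses. The matrices `B` and `Omega` are made-up illustrative values, and the code uses plain NumPy, not the API of the R package.

```python
import numpy as np

# Hypothetical toy network: 2 predictors, 3 responses (values are illustrative only).
# B[j, k] != 0 means predictor j and response k are conditionally dependent.
B = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 2.0]])

# Assumed precision matrix among responses: a chain 1 - 2 - 3 (tridiagonal).
Omega = np.array([[ 2.0, -1.0,  0.0],
                  [-1.0,  2.0, -1.0],
                  [ 0.0, -1.0,  2.0]])

# Under the assumed model, E[Y | x] = x^T B Omega^{-1}, so the usual
# multiresponse regression coefficient matrix is B @ Omega^{-1}.
Sigma = np.linalg.inv(Omega)
beta_marginal = B @ Sigma

print("conditional edges (B):\n", B)
print("regression coefficients (B @ inv(Omega)):\n", beta_marginal)
print("nonzeros in B:", np.count_nonzero(B))
print("nonzeros in regression coefficients:",
      np.count_nonzero(np.abs(beta_marginal) > 1e-12))
```

In this toy setting each predictor has a single conditional edge, yet every regression coefficient is nonzero, because a predictor's effect propagates through the response-response interactions. Reading the regression coefficient matrix as an adjacency matrix would therefore report edges that are absent from the conditional dependence graph.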

Categories

Applications, Methodology (MSC 62H10, 62P10)
