Bayesian variable selection in linear regression models with instrumental variables

10 Jan 2019  ·  Sabnis Gautam, Atchadé Yves, Dovonon Prosper

Many papers on high-dimensional statistics have proposed methods for variable selection and inference in linear regression models that rely, explicitly or implicitly, on the assumption that all regressors are exogenous. However, applications abound where endogeneity arises from selection biases, omitted variables, measurement errors, unmeasured confounding and many other challenges common to data collection (Fan et al., 2014). The most common remedy for endogeneity is instrumental variable (IV) inference. The objective of this paper is to present a Bayesian approach to tackling endogeneity in high-dimensional linear IV models. Using a working quasi-likelihood combined with an appropriate sparsity-inducing spike-and-slab prior distribution, we develop a semi-parametric method for variable selection in high-dimensional linear models with endogenous regressors within a quasi-Bayesian framework. We derive conditions under which the quasi-posterior distribution is well defined and puts most of its probability mass around the true value of the parameter as $p \rightarrow \infty$. We demonstrate through empirical work the fine performance of the proposed approach relative to some other alternatives. We also include an empirical application that assesses the return on education by revisiting the work of Angrist and Krueger (1991).
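
To make the endogeneity problem concrete, the following is a minimal sketch that compares ordinary least squares with classical two-stage least squares (2SLS) on simulated data. It is not the quasi-Bayesian spike-and-slab procedure described in the abstract; it only illustrates why instruments are needed when a regressor is correlated with the error term. The data-generating process, the coefficient values, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def simulate_iv_data(n=5000, seed=0):
    """Simulate one endogenous regressor with two valid instruments.

    Hypothetical design: the unobserved confounder u drives both x and y,
    so x is correlated with the error term of the outcome equation.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, 2))                       # instruments
    u = rng.normal(size=n)                            # unobserved confounder
    x = z @ np.array([1.0, -1.0]) + u + rng.normal(size=n)
    y = 2.0 * x + 3.0 * u + rng.normal(size=n)        # true coefficient is 2.0
    return y, x.reshape(-1, 1), z


def ols(y, X):
    """Ordinary least squares without an intercept (all variables are mean zero)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def two_stage_least_squares(y, X, Z):
    """Classical 2SLS: project X onto the instruments, then regress y on the fit."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]  # first stage
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]   # second stage


y, X, Z = simulate_iv_data()
beta_ols = ols(y, X)
beta_iv = two_stage_least_squares(y, X, Z)
print("OLS estimate: ", beta_ols[0])
print("2SLS estimate:", beta_iv[0])
```

Under this assumed design the OLS estimate is pulled upward by the confounder, while the 2SLS estimate should land close to the true coefficient of 2.0. The high-dimensional setting of the paper additionally requires selecting among many regressors, which this sketch does not attempt.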


Categories


Methodology
