Bayesian multiple logistic regression for case-control GWAS
Publishing date: 2019-01-17
Published on: PLOS Genetics
summary: Genetic variants in genome-wide association studies (GWAS) are mostly tested for disease association using simple regression, one variant at a time. Standard approaches to improve power in detecting disease-associated SNPs use multiple regression with Bayesian variable selection, in which a sparsity-enforcing prior on effect sizes is used to avoid overtraining and all effect sizes are integrated out for posterior inference. In this paper, the authors introduce the quasi-Laplace approximation to solve this integral and avoid MCMC sampling. The authors expect the logistic model to perform much better than multiple linear regression except when predicted disease risks are spread closely around 0.5, because only close to its inflection point can the logistic function be well approximated by a linear function.
authors: Saikat Banerjee, Lingyao Zeng, Heribert Schunkert, Johannes Söding
link to paper: 10.1371/journal.pgen.1007856
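
The summary's closing argument, that a linear function approximates the logistic function well only near its inflection point, can be checked numerically. The following is a minimal sketch and is not taken from the paper: the function names and the chosen input values are illustrative assumptions. It compares the logistic function with its first-order Taylor expansion around 0, where the predicted risk is 0.5.

```python
import math


def logistic(x):
    """Standard logistic function; its inflection point is at x = 0, where it equals 0.5."""
    return 1.0 / (1.0 + math.exp(-x))


def linear_approx(x):
    """First-order Taylor expansion of the logistic function around x = 0.

    The slope of the logistic function at 0 is 1/4, so the tangent line is 0.5 + x / 4.
    (Hypothetical helper for illustration, not a function from the paper.)
    """
    return 0.5 + x / 4.0


def approx_error(x):
    """Absolute gap between the logistic function and its tangent line at 0."""
    return abs(logistic(x) - linear_approx(x))


if __name__ == "__main__":
    # Illustrative inputs only: small x gives risks near 0.5, large x gives risks near 1.
    for x in (0.1, 0.5, 1.0, 2.0, 4.0):
        print(
            f"x={x:4.1f}  risk={logistic(x):.4f}  "
            f"linear={linear_approx(x):.4f}  error={approx_error(x):.4f}"
        )
```

The error stays small while the risk is close to 0.5 and grows quickly as the risk moves toward 0 or 1, where the tangent line even leaves the interval [0, 1]. This is consistent with the expectation stated in the summary that a linear model is adequate only when predicted risks cluster around 0.5.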