Mark van de Wiel - How to learn from a lot: Empirical Bayes in Genomics

Mark van de Wiel (Vrije Universiteit, Amsterdam) will give a seminar in the lunch area on the 8th floor of N.H. Abel's House at 14:15 on November 3rd.

Title: How to learn from a lot: Empirical Bayes in Genomics

Abstract: The high-dimensional character of genomics data generally forces statistical inference methods to apply some form of penalization, e.g. multiple testing, penalized regression or sparse gene networks. The other side of the coin, however, is that the dimension of the variable space may also be used to learn across variables. Empirical Bayes (EB) is a powerful principle to do so. In both the Bayesian and frequentist paradigms it comes down to estimation of the a priori distribution of parameter(s) from the data. We briefly review some well-known EB principles and their applications to the analysis of genomics data. However, EB is often not used at its full strength. We extend EB methodology to allow for automatic inclusion of auxiliary information and illustrate the methods in two settings: 1) Prediction of binary response, and 2) Gene network reconstruction. For 1) we demonstrate how auxiliary information in the form of 'co-data', e.g. p-values from an external study or genomic annotation, can be used to improve prediction of a binary response, such as tumor recurrence. We derive empirical Bayes estimates of penalties of groups of variables in a classical logistic ridge regression setting, and show that multiple sources of co-data may be used. For 2) we combine empirical Bayes with computationally efficient variational Bayes approximations of posteriors for the purpose of gene network reconstruction by the use of structural equation models. These models regress each gene on all others, and hence this setting can be regarded as an extension of 1). We show that inclusion of information on a prior network may dramatically improve reproducibility of the estimated network.

Published Oct. 23, 2015 10:09 AM - Last modified May 14, 2019 9:08 AM