Integrative analysis of multiple genomic variables using a hierarchical Bayesian model

Speaker: Martin Schäfer, Heinrich Heine University Düsseldorf, Germany.

Abstract

Genes showing congruent differences in several genomic variables between two biological conditions are crucial for unraveling the causal mechanisms behind phenotypes of interest. Detecting such genes is important in biomedical research, e.g. when identifying genes responsible for cancer development. The small sample sizes common in next-generation sequencing studies are a key challenge, and there are still very few statistical methods for analyzing more than two genomic variables in an integrative, model-based way.

In this talk, a novel bioinformatics approach is presented to detect congruent differences between two biological conditions in a larger number of different measurements such as various epigenetic marks or mRNA transcript levels. A coefficient inspired by previous work on the integration of two genomic variables (Klein et al., 2014; Schäfer et al., 2009) is proposed to quantify the degree to which genes present consistent alterations in multiple (more than two) genomic variables when comparing samples presenting a condition of interest (e.g., cancer) to a reference group. A hierarchical Bayesian model is then employed to assess uncertainty on a gene level, incorporating information on functional relationships between genes via a conditionally autoregressive prior, adapting a convolution model from spatial epidemiology (Besag et al., 1991) to the genomic context. Borrowing information from functionally similar genes facilitates inference in the context of small sample sizes. The approach is demonstrated on experimental data sets containing RNA-seq gene transcription and up to four ChIP-seq histone modification measurements as well as in a simulation study. Both the coefficient-based ranking and the model-based inference lead to a plausible prioritization of candidate genes when analyzing multiple genomic variables.
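
To illustrate how a conditionally autoregressive prior lets a gene borrow information from functionally related genes, the following minimal sketch uses the standard intrinsic CAR form from the convolution model of Besag et al. (1991), in which a gene-level effect is split into a structured part u and an unstructured part v. It is not the model from the talk: the toy gene network, the effect values, and the function names `icar_precision` and `icar_conditional` are hypothetical and chosen only for illustration.

```python
import numpy as np


def icar_precision(adjacency):
    """Precision structure Q = D - W of an intrinsic CAR prior.

    `adjacency` is a symmetric 0/1 matrix with a zero diagonal; entry (i, j)
    is 1 when genes i and j are considered functionally related.
    """
    W = np.asarray(adjacency, dtype=float)
    if not np.allclose(W, W.T):
        raise ValueError("adjacency must be symmetric")
    D = np.diag(W.sum(axis=1))  # number of neighbours per gene
    return D - W


def icar_conditional(u, adjacency, i, sigma_u=1.0):
    """Conditional prior mean and variance of u[i] given all other effects.

    Under the intrinsic CAR prior, u[i] is centred on the average of its
    neighbours, with variance sigma_u**2 divided by the neighbour count.
    """
    W = np.asarray(adjacency, dtype=float)
    n_i = W[i].sum()
    if n_i == 0:
        raise ValueError("gene has no neighbours in the network")
    mean = W[i] @ np.asarray(u, dtype=float) / n_i
    var = sigma_u ** 2 / n_i
    return mean, var


# Hypothetical toy network of 4 genes: 0-1, 0-2 and 2-3 are related.
adjacency = np.array([
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
])
u = np.array([0.0, 2.0, 4.0, -1.0])  # made-up structured effects

Q = icar_precision(adjacency)
mean0, var0 = icar_conditional(u, adjacency, 0, sigma_u=1.0)
print(mean0, var0)
```

In this toy setting the prior for gene 0 is pulled toward the average of genes 1 and 2, and its variance shrinks as the number of neighbours grows, which is the sense in which well-connected genes gain most from borrowed information when samples are few.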

References

Besag, J., York, J., and Mollié, A. (1991). Bayesian image restoration, with two applications in spatial statistics. Annals of the Institute of Statistical Mathematics, 43, 1–59.

Klein, H.-U., Schäfer, M., Porse, B., Hasemann, M., Ickstadt, K., and Dugas, M. (2014). Integrative analysis of histone ChIP-seq and gene expression microarray data using Bayesian mixture models. Bioinformatics, 30(8), 1154–1162.

Schäfer, M., Schwender, H., Merk, S., Haferlach, C., Ickstadt, K., and Dugas, M. (2009). Integrated analysis of copy number alterations and gene expression: a bivariate assessment of equally directed abnormalities. Bioinformatics, 25(24), 3228–3235.

Published Sep. 7, 2016 12:02 PM - Last modified Sep. 8, 2016 3:19 PM